• Open Access

Diagnosing swine flu: the inaccuracy of case definitions during the 2009 pandemic, an attempt at refinement, and the implications for future planning

Authors


Andrew Mahony, Infectious Diseases Department, Austin Health, 145 Studley Road, Heidelberg 3084, Victoria, Australia.
E-mail: andrew.mahony@austin.org.au

Abstract

Background  At the onset of the pandemic H1N1/09 influenza A outbreak in Australia, health authorities devised official clinical case definitions to guide testing and access to antiviral therapy.

Objectives  To assess the diagnostic accuracy of these case definitions and to attempt to improve on them using a scoring system based on clinical findings at presentation.

Patients/Methods  This was a retrospective case–control study conducted at three metropolitan Melbourne hospitals and one associated community-based clinic during the 2009 influenza season. Patients presenting with influenza-like illness who were tested for H1N1/09 influenza A were administered a standard questionnaire of symptomatology, comorbidities, and risk factors. Patients with a positive test were compared to those with a negative test. Logistic regression was performed to examine for correlation of clinical features with disease. A scoring system was devised and compared with case definitions used during the pandemic. The main outcome measures were the positive and negative predictive values of our scoring system, based on real-life data, versus those of the mandated case definitions.

Results  Both the devised scoring system and the case definitions gave similar positive predictive values (38–58% using ascending score groups, against 39–44% using the various case definitions). Negative predictive values were also closely matched (ranging from 94% to 73% in the respective score groups against 83–84% for the case definitions).

Conclusions  Accurate clinical diagnosis of H1N1/09 influenza A was difficult and not improved significantly by a structured scoring system. Investment in more widespread availability of rapid and sensitive diagnostic tests should be considered in future pandemic planning.

Introduction

Pandemic H1N1/09 influenza A virus (“swine flu”) swept across the world in 2009. It triggered widespread public health actions, infected millions, and ultimately led to more than 18449 reported deaths from April 2009 to August 2010.1 Fortunately, H1N1/09 did not prove as virulent as other viruses such as highly pathogenic avian influenza (H5N1), upon which many national influenza pandemic preparation plans were based.

In Victoria, Australia, public health measures were guided by the state government endorsed “Victorian Health Management Plan for Pandemic Influenza,” which in turn was part of the “Australian Health Management Plan for Pandemic Influenza 2008,” the latter subsequently updated during the pandemic.2,3 Different phases of response, based on active surveillance to monitor disease transmission, were used to guide control measures; a working case definition of pandemic influenza, one of the key elements of surveillance, was devised by expert members of the Communicable Diseases Network Australia (CDNA), and this became a gateway to diagnostic testing and early use of antiviral therapy.4,5 Cases were required to meet clinical criteria – fever plus at least one upper respiratory tract symptom – plus have relevant travel or contact histories in order for testing to be authorized; later in the pandemic, when it was clear that local community transmission was predominant, testing was only recommended for those with moderate to severe disease or those in particular high-risk groups: infants, healthcare workers, nursing home residents, and children in special development schools. Clinicians were surprised by the rigidity of the initial criteria, as it quickly became apparent that the vast majority of patients presenting to hospitals had not travelled from countries already recognized to have been affected by the pandemic.4

We sought to assess the validity of the mandated initial case definitions in patients tested for H1N1/09 influenza and in doing so, to inform future planning by public health leaders, laboratories and clinicians.

Methods

We conducted a retrospective case–control study of patients with suspected influenza who presented to the emergency departments or specially designated influenza clinics of any of three Melbourne metropolitan hospitals or one associated community-based center, between April 30, 2009 and July 31, 2009. Patients were included if they had an acute respiratory tract illness where the treating clinician suspected influenza, and had undergone testing for H1N1/09 influenza A nucleic acid by polymerase chain reaction (PCR) via nose, throat, or nasopharyngeal swab/s, using previously published methods, at the Victorian Infectious Diseases Reference Laboratory (VIDRL).6 Cases were those who tested positive for H1N1/09 influenza A, and controls were those who tested negative for influenza. Patients were identified prospectively to study investigators, as well as through searching microbiology laboratory results from the recruiting centers. Patients’ medical records were reviewed and patients were then contacted by telephone and interviewed using a standard questionnaire, with answers generated by a composite of file review and responses from this interview.

The questionnaire covered basic epidemiology (age, sex, and concurrent unwell family members), any symptoms at onset of influenza-like illness (ILI; fever, cough, sore throat, runny nose, myalgia, nasal congestion, diarrhea, and headache), comorbidities (chronic lung, heart or kidney disease, active malignancy, smoking, and immunosuppressed state), and pregnancy.

The case definitions used during the pandemic were first applied to the observed data in order to determine their positive and negative predictive values in our cohort. Two models were constructed to determine whether these case definitions could be improved upon by incorporating different clinical or epidemiological factors. First, logistic regression was performed to examine the association of clinical features, comorbidities, and other risk factors with H1N1/09 influenza, comparing patients who tested positive with those who tested negative for influenza. For this analysis, we excluded patients with tests positive for non-H1N1/09 influenza A strains. Variables were selected for multivariable analysis by backwards stepwise regression with a threshold P-value for rejection of >0·2. Calibration, or the match between the model’s predicted and observed probabilities of H1N1/09 infection, was assessed using the Hosmer–Lemeshow goodness-of-fit test. This test compares the observed data against the model predicted values grouped into deciles, with a P-value < 0·05 suggesting poor fit.7 Discrimination, or the ability of the model to distinguish between patients with H1N1/09 infection and controls, was assessed by measuring the area under the receiver operating characteristic curve (AUROC).8 The AUROC represents the probability that a randomly selected subject with influenza has a higher score than a randomly selected subject without influenza.
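The probabilistic reading of the AUROC given above can be illustrated directly. The following is a minimal sketch, not the Stata procedure used in this study: the function name `auroc` and the example scores are hypothetical, and ties are counted as half a win by the usual convention.

```python
from itertools import product


def auroc(case_scores, control_scores):
    """Probability that a random case outscores a random control.

    Tied pairs count as half a win (an assumed, conventional choice).
    """
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    # Compare every case with every control.
    for case, control in product(case_scores, control_scores):
        if case > control:
            wins += 1.0
        elif case == control:
            wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical scores, invented purely for illustration.
cases = [25, 31, 19, 40]
controls = [6, 19, 12, 28]
print(auroc(cases, controls))
```

A value of 0·5 corresponds to a score that separates cases from controls no better than chance, and 1·0 to perfect separation.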

Second, a summary scoring system was constructed from the regression coefficients using the factors found to be most strongly associated with outcome on multivariable analysis, and calibration and discrimination were assessed similarly. Regression coefficients, which are the natural logarithm of the odds ratio, were used so that an additive score could be constructed. All statistical analyses were performed using Stata, version 10.1 (StataCorp, College Station, TX, USA).
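The reason coefficients, rather than odds ratios, can be summed follows from the form of the logistic model. The notation below is illustrative and does not appear in the original analysis: p is the probability of H1N1/09 infection, x_i is 1 if factor i is present and 0 otherwise, β_i is its regression coefficient, and s_i is the corresponding point value after scaling by ten and rounding.

```latex
\ln\frac{p}{1-p} = \beta_0 + \sum_{i} \beta_i x_i,
\qquad
\mathrm{OR}_i = e^{\beta_i},
\qquad
\text{score} = \sum_{i} s_i x_i, \quad s_i = \operatorname{round}(10\,\beta_i)
```

Odds ratios combine by multiplication, whereas their logarithms combine by addition, so a score built from the coefficients rises and falls with the model's log-odds.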

Ethics approval for this study was gained from the relevant ethical review board of each hospital: Austin Hospital, Northern Health and Alfred Hospital. Verbal consent to collect and analyze data was provided by all patients included in the study.

Results

Across the study centers, 818 patients were recruited. Table 1 lists enrolled patients’ characteristics. Patient disposition, isolation, and antiviral therapy have been described previously.9 A total of 253 patients (31%) were tested after more than 3 days’ duration of symptoms, with 77 (30%) of these positive for H1N1/09 infection. The performance of the CDNA case definitions in our cohort is shown in Table 2. Case definition one gave the highest positive predictive value, but this was only 44%; all three definitions had similar negative predictive values (83–86%). Univariate and multivariable logistic regression analyses are given in Table 3. On univariate analysis, four symptoms – fever, cough, headache, and myalgia – as well as symptom combinations, younger age, and having a family member with ILI appeared to be significantly predictive of H1N1/09 infection, while being immunosuppressed was a negative predictor. On multivariable analysis, combinations of symptoms (e.g. fever and cough) were no better than individual symptoms using a variety of models and so were not included in the final regression; fever and cough were the only significant predictive symptoms, whereas diarrhea was negatively associated. Pregnancy emerged as a significant variable and younger age remained important. Male gender appeared to be significant, but as we did not think this was plausible given the lack of evidence in the literature, this variable was removed from the model; its removal did not have any significant impact on the other variables’ coefficients (data not shown). Goodness of fit of the multivariable logistic regression model was adequate for predicting H1N1/09 infection (Hosmer–Lemeshow test, P = 0·35), and the probability of H1N1/09 infection increased with each decile, from 7% in the lowest decile to 81% in the highest decile. The AUROC was 0·75 (95% confidence interval, 0·71–0·79), indicating fair discrimination of this regression model (Figure 1).

Table 1. Patients’ characteristics

| Variable | FluA negative, n = 500 (%) | H1N1/09 positive, n = 265 (%) | FluA positive, n = 53 (%)* |
| --- | --- | --- | --- |
| Female:male | 272:228 (54:46) | 132:133 (50:50) | 20:33 (38:62) |
| Age 0–9 (n = 129) | 81 (16·2) | 35 (13·2) | 13 (24·5) |
| Age 10–29 (n = 330) | 166 (33·2) | 148 (55·8) | 16 (30·2) |
| Age 30–59 (n = 285) | 197 (39·4) | 72 (27·2) | 16 (30·2) |
| Age 60–89 (n = 74) | 56 (11·2) | 10 (3·8) | 8 (15·1) |
| Chronic lung disease | 133 (26·6) | 68 (25·7) | 19 (35·8) |
| Smoking | 81 (16·2) | 30 (11·3) | 8 (15·1) |
| Heart disease | 54 (10·8) | 21 (7·9) | 7 (13·2) |
| Diabetes mellitus | 26 (5·2) | 13 (4·9) | 2 (3·8) |
| Immunosuppressed | 77 (15·4) | 25 (9·4) | 3 (5·7) |
| Chronic kidney disease | 22 (4·4) | 11 (4·2) | 1 (1·9) |
| Active malignancy | 41 (8·2) | 12 (4·5) | 2 (3·8) |
| Pregnancy | 18 (3·6) | 13 (4·9) | 1 (1·9) |

*H3N2 influenza A was also circulating within Australia during the 2009 pandemic.

Table 2. Australian case definitions used for H1N1/09 influenza and usefulness in the study cohort (with 95% confidence intervals)

| Definition | Symptoms/signs | Epidemiology | Sensitivity | Specificity | Predictive values |
| --- | --- | --- | --- | --- | --- |
| Suspected case 1 | Acute febrile respiratory illness, with at least one of rhinorrhea, nasal congestion, sore throat, or cough | Onset within 7 days of travel to Mexico, USA, Canada (and other countries with evidence of local transmission) | 85% (80–89%) | 43% (39–48%) | PPV 44% (40–48%); NPV 84% (79–88%) |
| Suspected case 2 | As above | Onset within 7 days of close contact with a person who is a confirmed case; close contact of a confirmed case within that case’s infectious period | 89% (85–93%) | 27% (23–31%) | PPV 39% (35–43%); NPV 83% (76–88%) |
| Probable case 3 | As above, for which no other cause is identified | | 93% (89–96%) | 23% (19–27%) | PPV 39% (35–43%); NPV 86% (79–91%) |
| Confirmed case | Positive laboratory test: specific PCR, isolation of virus, viral sequencing | | Gold standard* | Gold standard | |

*While viral culture and sequencing are described in this definition as gold standards of a confirmed case, these techniques are not in clinical use and are of lower sensitivity than PCR.

PPV, positive predictive value; NPV, negative predictive value.

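Table 3. Univariate and multivariable logistic regression, H1N1/09 versus influenza negative: clinical and epidemiological features

| Variable | Regression coefficient | 95% confidence interval | P-value | Adjusted regression coefficient | 95% confidence interval | P-value |
| --- | --- | --- | --- | --- | --- | --- |
| Fever | 1·86 | 1·38 to 2·34 | <0·001 | 1·89 | 1·36 to 2·42 | <0·001 |
| Cough | 0·61 | 0·24 to 0·98 | 0·001 | 0·56 | 0·13 to 0·98 | 0·011 |
| Sore throat | 0·02 | −0·29 to 0·33 | 0·92 | Dropped (P > 0·2) | | |
| Runny nose | 0·03 | −0·28 to 0·34 | 0·84 | Dropped (P > 0·2) | | |
| Nasal congestion | −0·28 | −0·59 to 0·03 | 0·07 | −0·42 | −0·79 to −0·06 | 0·023 |
| Diarrhea | −0·16 | −0·55 to 0·23 | 0·41 | −0·47 | −0·93 to −0·02 | 0·040 |
| Headache | 0·55 | 0·24 to 0·87 | 0·001 | 0·49 | 0·12 to 0·86 | 0·010 |
| Myalgia | 0·50 | 0·15 to 0·85 | 0·005 | Dropped (P > 0·2) | | |
| Fever + cough | 1·22 | 0·89 to 1·56 | <0·001 | Not included | | |
| Fever + headache | 1·04 | 0·72 to 1·36 | <0·001 | Not included | | |
| Fever + myalgia | 1·00 | 0·53 to 1·47 | <0·001 | Not included | | |
| Fever + cough + headache | 1·05 | 0·73 to 1·38 | <0·001 | Not included | | |
| Fever + cough + headache + myalgia | 1·32 | 0·65 to 1·99 | <0·001 | Not included | | |
| Male | 0·18 | −0·11 to 0·48 | 0·23 | Not included | | |
| Chronic lung disease | −0·04 | −0·38 to 0·30 | 0·82 | Dropped (P > 0·2) | | |
| Smoking | −0·43 | −0·88 to 0·01 | 0·06 | Dropped (P > 0·2) | | |
| Heart condition | −0·34 | −0·87 to 0·18 | 0·20 | Dropped (P > 0·2) | | |
| Diabetes mellitus | −0·06 | −0·75 to 0·62 | 0·86 | Dropped (P > 0·2) | | |
| Immunosuppressed | −0·56 | −1·04 to −0·09 | 0·02 | Dropped (P > 0·2) | | |
| Chronic kidney disease | −0·06 | −0·80 to 0·68 | 0·87 | Dropped (P > 0·2) | | |
| Active malignancy | −0·63 | −1·29 to 0·03 | 0·06 | Dropped (P > 0·2) | | |
| Pregnant | 0·32 | −0·41 to 1·05 | 0·39 | 1·24 | 0·38 to 2·10 | 0·005 |
| Healthcare worker | −0·13 | −0·66 to 0·38 | 0·60 | Dropped (P > 0·2) | | |
| Family member with ILI | 0·37 | 0·07 to 0·68 | 0·02 | 0·37 | 0·02 to 0·72 | 0·041 |
| Age 0–4 years | −0·66 | −1·32 to 0·00 | 0·05 | Dropped (P > 0·2) | | |
| Age 5–18 years | 0·89 | 0·56 to 1·22 | <0·001 | 0·88 | 0·50 to 1·26 | <0·001 |
| Age 19–41 years | 0·00 | −0·30 to 0·31 | 0·98 | Dropped (P > 0·2) | | |
| Age 42–65 years | −0·52 | −0·93 to −0·12 | 0·01 | Dropped (P > 0·2) | | |
| Age 66+ years | −1·27 | −2·08 to −0·46 | 0·002 | Dropped (P > 0·2) | | |
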
Figure 1. Receiver operating characteristics for the multivariable logistic regression model and derived scoring system in the diagnosis of H1N1/09 influenza.

The risk factor scoring system is shown in Table 4; for convenience, all regression coefficients were multiplied by ten to formulate each score. Scores were then stratified into groups, with the probability that tested patients had H1N1/09 infection rising with increasing score (Table 5). Shortcomings in discrimination of the scoring system were apparent in the AUROC of 0·73 (95% confidence interval, 0·69–0·76; Figure 1), indicating that an eight-variable scoring system did not give sufficient power to accurately diagnose H1N1/09 infection during a pandemic (P = 0·42 comparing the two models’ AUROC).

Table 4. Scoring system derived from significant variables of multivariable regression

| Factor | Score |
| --- | --- |
| Fever | 19 |
| Cough | 6 |
| Nasal congestion | −4 |
| Diarrhea | −5 |
| Headache | 5 |
| Pregnant | 12 |
| Family member with ILI | 4 |
| Age 5–18 years | 9 |

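To make the use of Table 4 concrete, the sketch below sums the points for the findings present and maps the total to the score strata reported in Table 5. The function and key names are hypothetical choices made for illustration; only the point values and stratum boundaries come from the tables.

```python
# Points from Table 4 (adjusted regression coefficients multiplied by ten).
# The dictionary keys are hypothetical labels chosen for this sketch.
SCORE_POINTS = {
    "fever": 19,
    "cough": 6,
    "nasal_congestion": -4,
    "diarrhea": -5,
    "headache": 5,
    "pregnant": 12,
    "family_member_with_ili": 4,
    "age_5_to_18": 9,
}


def h1n1_score(findings):
    """Sum the Table 4 points for each finding that is present."""
    unknown = set(findings) - set(SCORE_POINTS)
    if unknown:
        raise KeyError(f"unrecognised findings: {sorted(unknown)}")
    # Use a set so that a finding listed twice is only counted once.
    return sum(SCORE_POINTS[name] for name in set(findings))


def score_group(score):
    """Map a total score to the strata of Table 5 (groups 1 to 5)."""
    if score < 6:
        return 1
    if score <= 18:
        return 2
    if score <= 25:
        return 3
    if score <= 32:
        return 4
    return 5


# Example: a 12-year-old presenting with fever and cough.
total = h1n1_score(["fever", "cough", "age_5_to_18"])
print(total, score_group(total))
```

In this example the total is 19 + 6 + 9 = 34, which falls in the highest stratum.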
Table 5. Strata of scores in the study cohort and accuracy in diagnosis of H1N1/09 infection (with 95% confidence intervals)

| Group | Score | Number of patients | Group % with H1N1 | Sensitivity* | Specificity* | Predictive values* |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | <6 | 69 | 5·8% | 100% | 0% | |
| 2 | 6–18 | 143 | 11·9% | 99% (96–100%) | 13% (10–16%) | PPV 38% (34–41%); NPV 94% (86–98%) |
| 3 | 19–25 | 207 | 30·9% | 92% (88–95%) | 38% (34–43%) | PPV 44% (40–48%); NPV 90% (85–94%) |
| 4 | 26–32 | 158 | 44·9% | 68% (62–74%) | 67% (63–71%) | PPV 52% (47–57%); NPV 80% (76–84%) |
| 5 | >32 | 188 | 58·0% | 41% (35–47%) | 84% (81–87%) | PPV 58% (51–65%); NPV 73% (69–77%) |

*If used as cut off for the diagnosis of H1N1/09.

PPV, positive predictive value; NPV, negative predictive value.
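The accuracy figures in Table 5 can be approximately recovered from the stratum sizes. In the sketch below, the number of H1N1/09-positive patients in each group is not reported in the paper; it is reconstructed here by rounding (number of patients × group % with H1N1), which is an assumption, although the reconstructed counts do sum to the 265 positive patients in the cohort. The function name is hypothetical.

```python
# (group, patients, H1N1/09-positive patients).
# Positive counts are reconstructed by rounding patients x group %;
# they are an assumption, not figures reported in the paper.
STRATA = [(1, 69, 4), (2, 143, 17), (3, 207, 64), (4, 158, 71), (5, 188, 109)]


def accuracy_at_cutoff(strata, first_positive_group):
    """Treat this group and all higher groups as 'test positive'."""
    tp = sum(pos for g, n, pos in strata if g >= first_positive_group)
    fp = sum(n - pos for g, n, pos in strata if g >= first_positive_group)
    fn = sum(pos for g, n, pos in strata if g < first_positive_group)
    tn = sum(n - pos for g, n, pos in strata if g < first_positive_group)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        # NPV is undefined when no patient falls below the cut off.
        "npv": tn / (tn + fn) if (tn + fn) else None,
    }


for group in range(1, 6):
    result = accuracy_at_cutoff(STRATA, group)
    print(group, {k: None if v is None else round(v, 2) for k, v in result.items()})
```

Raising the cut off trades sensitivity for specificity, which is why the positive predictive value rises and the negative predictive value falls across the groups.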

Discussion

We found that the application of case definitions, presence of single or multiple symptoms of ILI with or without limited additional epidemiology, and a scoring system derived from a large cohort of patients were of limited utility in accurate diagnosis of H1N1/09 pandemic influenza during the 2009 Australian influenza season.

Case definitions, as devised by the CDNA, were quite sensitive (85–93%, depending on the specific definition) but suffered from poor specificity. These findings suggest that for every 100 patients meeting the various case definitions, between 54 and 61 patients without influenza were treated and/or recommended to remain in isolation, and for every 100 patients tested who did not meet the case definition, between 14 and 17 patients were found to have influenza. In the early phases of a future pandemic, when containment of a highly virulent pathogen may be paramount for maintaining public health, this leakage of missed cases could potentiate further transmission in the community. In retrospect, the policy of using clinical case definitions to control testing and antiviral treatment for H1N1/09 during the early containment phase of the epidemic was fundamentally flawed, despite published opinions to the contrary;10 the definitions were not accurate enough to enable complete active case finding, and nor did their specificities improve over time. Similar evaluations of case definitions employed by a variety of international, national, and local health bodies also found limitations in their negative predictive values, ranging from 66% to 90%;11–13 the clinician assessing a patient with ILI in the containment phase of a pandemic would find any of the definitions unhelpful when trying to conclusively exclude influenza, as the negative predictive value of a clinical case definition needs to approach 100% to minimize ongoing transmission. A “cough plus fever” decision rule has also been shown to have the same performance characteristics as simple clinician judgement during an influenza season.14
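Because predictive values depend on how common infection is among those tested, it can help to see how they follow from sensitivity and specificity. The sketch below applies Bayes' rule to the figures for case definition 1 in Table 2; the prevalence of 265/765 (H1N1/09-positive patients among those analyzed) is an assumption about the denominator used, and the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Assumed prevalence: H1N1/09-positive patients among those analyzed.
prevalence = 265 / 765

# Case definition 1 from Table 2: sensitivity 85%, specificity 43%.
ppv, npv = predictive_values(0.85, 0.43, prevalence)
print(f"PPV {ppv:.2f}, NPV {npv:.2f}")
```

Under these assumptions the result is close to the PPV of 44% and NPV of 84% reported for that definition. The same definition applied where infection is less common would give a higher negative and a lower positive predictive value.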

Our univariate logistic regression suggested four ILI symptoms classic for influenza – fever, cough, headache and myalgia – increased the likelihood of confirmed H1N1/09, with odds ratios >1·5 (regression coefficients >0·4). This is in broad agreement with the one published systematic review of the operating characteristics of signs and symptoms of influenza.15 Combinations of these symptoms, therefore, appeared to be significantly predictive. However, this was not borne out on multivariable analysis, with each individual symptom outperforming combinations in forming a predictive model. It is likely that using combinations of symptoms introduces additional confounding to such models. We were surprised and interested to find that sore throat, runny nose, and myalgia were not associated with H1N1/09 and that nasal congestion and diarrhea were negatively associated. This is in contrast with early reports of the clinical presentation of H1N1/09, where gastrointestinal symptoms were noted in a significant number of cases, but consistent with other Australian case series.16,17
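The correspondence between regression coefficients and odds ratios quoted above is a simple exponentiation. As a minimal sketch, the univariate coefficients for the four symptoms are taken from Table 3; the function and variable names are hypothetical.

```python
import math

# Univariate regression coefficients from Table 3.
UNIVARIATE_COEFFICIENTS = {
    "fever": 1.86,
    "cough": 0.61,
    "headache": 0.55,
    "myalgia": 0.50,
}


def odds_ratio(coefficient):
    """A logistic regression coefficient is the natural log of the odds ratio."""
    return math.exp(coefficient)


# A coefficient of 0.4 corresponds to an odds ratio of roughly 1.5.
print(f"threshold: {odds_ratio(0.4):.2f}")
for symptom, beta in UNIVARIATE_COEFFICIENTS.items():
    print(f"{symptom}: OR {odds_ratio(beta):.2f}")
```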

The strongest risk factors on multivariable regression for H1N1/09 influenza in the study cohort were not otherwise surprising. The unusual age distribution of H1N1/09 influenza – with older adults likely to have had a degree of cross-protection from exposure to previous pandemic virus strains – is reflected in the risk of being a child aged between 5 and 18 years of age in our cohort.18 Pregnancy has also been well described as a risk factor for hospitalization and death.19 In a previous H3 influenza epidemic in Canada, fever and cough were similarly shown to be the only symptoms predictive of a positive test result.20

We developed a scoring system using eight factors in an attempt to improve on the accuracy of the case definitions. While others have devised a simple score incorporating clinical and laboratory data in determining the likelihood of H1N1/09 pneumonia compared to other causes of community-acquired pneumonia, or the need for hospitalization, to the best of our knowledge, this is the first scoring system based entirely on variables immediately available to the clinician.21,22 The scoring system marginally outperformed the case definitions – score group 3, for example, had equivalent or superior predictive values to any of the CDNA definitions – and was of similar discrimination to the logistic regression (AUROC 0·73 versus 0·75). However, the system is relatively cumbersome and has not been externally validated. We do not believe that it offers any significant advantages for clinicians or public health practitioners; rather, it highlights the inability of clinical and epidemiological features to reliably predict H1N1/09 influenza.

Our study has several limitations. Data were collected retrospectively, potentially introducing recall bias although contemporaneous medical records were used where possible to validate the information collected. There was a bias at the referring centers to test only those patients meeting a case definition, as the testing reference laboratory was required to reject other samples in order to meet demand during the pandemic. This would have tended to improve the accuracy of the case definitions, but we were not able to identify patients who either did not have a sample collected or whose sample was not tested, to verify this. Patients were not excluded from our study based on their duration of symptoms prior to being tested. As the sensitivity of PCR-based testing for influenza wanes with time from onset of illness, this could have potentially led to misclassification of some cases into the control group; nonetheless, we found a significant proportion of patients tested beyond 3 days of illness to have H1N1/09 infection, consistent with other studies demonstrating PCR-positivity well beyond this time period.23 Inclusion of overweight, based on Body Mass Index, may have added discrimination to our model as morbid obesity has emerged as a risk factor for severe disease.24 Ideally, we would have used a separate cohort of patients to validate our model but had insufficient numbers to do so; in light of its limited advantages over the case definitions, we have not gone on to do this using patients from subsequent influenza seasons.

We believe that our results have important implications for future pandemic planning. As this and previous studies have shown, clinical signs cannot be used to reliably rule in or rule out newly emerged influenza. The effectiveness of public health measures based on case definitions is likely to be impaired by the lack of sensitivity and specificity of these definitions. Testing for influenza should not be restricted to patients based on the presence or absence of particular symptoms or epidemiological risk factors, particularly where the results of testing may have implications for the management of individual patients. To circumvent the inherent problems in the accuracy of case definitions for pandemic influenza, investment in the more widespread availability of accurate diagnostics, and the ability to rapidly turn around the results of such testing, must be considered a priority. Cost-benefit analyses of the impact of rapid diagnostics on the spread of pandemic influenza, comparing high-risk groups to larger populations, would aid in determining the feasibility of such a strategy.

Acknowledgements

The authors are grateful to the clinical research nursing staff at the study sites who administered questionnaires to study participants.

This project was funded by an Australian Commonwealth Government National Health and Medical Research Council (NHMRC) Strategic Award.

Conflicts of interest

AC is an investigator on a study to examine the adverse events of influenza vaccine, funded by CSL Ltd, the manufacturer of the vaccine. AC has not received any direct funding from any pharmaceutical company and was supported by a National Health and Medical Research Council (NHMRC) training fellowship.
