Active surveillance documents rates of clinical care seeking due to respiratory illness

Abstract Background Respiratory viral infections are a leading cause of disease worldwide. However, the overall community prevalence of infections has not been properly assessed, as standard surveillance is typically acquired passively among individuals seeking clinical care. Methods We conducted a prospective cohort study in which participants provided daily diaries and weekly nasopharyngeal specimens that were tested for respiratory viruses. These data were used to analyze healthcare seeking behavior, compared with cross‐sectional ED data and NYC surveillance reports, and used to evaluate biases of medically attended ILI as signal for population respiratory disease and infection. Results The likelihood of seeking medical attention was virus‐dependent: higher for influenza and metapneumovirus (19%‐20%), lower for coronavirus and RSV (4%), and 71% of individuals with self‐reported ILI did not seek care and half of medically attended symptomatic manifestations did not meet the criteria for ILI. Only 5% of cohort respiratory virus infections and 21% of influenza infections were medically attended and classifiable as ILI. We estimated 1 ILI event per person/year but multiple respiratory infections per year. Conclusion Standard, healthcare‐based respiratory surveillance has multiple limitations. Specifically, ILI is an incomplete metric for quantifying respiratory disease, viral respiratory infection, and influenza infection. The prevalence of respiratory viruses, as reported by standard, healthcare‐based surveillance, is skewed toward viruses producing more severe symptoms. Active, longitudinal studies are a helpful supplement to standard surveillance, can improve understanding of the overall circulation and burden of respiratory viruses, and can aid development of more robust measures for controlling the spread of these pathogens.


1 | INTRODUCTION
Respiratory infections are a leading cause of morbidity and mortality globally and impose a high burden on economic productivity and on medical and public health systems (hospitalizations, visits, therapeutics). A variety of viruses and bacteria regularly cause respiratory infections in humans, and specific pharmacological interventions are limited to vaccines, antivirals, and antibiotics for a small subset of these pathogens. New and improved vaccines and therapeutics for many common respiratory viruses (respiratory syncytial virus [RSV], rhinovirus, human metapneumovirus [HMPV]) are currently under evaluation or development; however, because many infected persons do not seek clinical care, the true burden of each of these viruses is not known. This gap complicates predictive quantification of the cost-effectiveness of each intervention and of its ability to control targeted pathogens in the broader population.
Household, serological and community studies have shown that a consistent percentage of respiratory infections (most frequently influenza infections) are asymptomatic or subclinical. [1][2][3] However, estimates of the asymptomatic ratio are highly heterogeneous 3 and tend to be lower for household studies following a symptomatic index case (10%-30%) than serologic 1 and longitudinal community studies, 2,4 most of which identify the majority of infections as asymptomatic.
Presently, respiratory surveillance in the United States is performed at local scales, is healthcare-based, and is typically syndromic or viral. 5 Syndromic surveillance was established mainly to capture influenza activity through data collection on patients seeking care for respiratory symptoms within select facilities.
Patient complaints classifiable as influenza-like illness (ILI, formally defined in the United States as fever plus sore throat and/or cough) are documented regardless of laboratory diagnosis. ILI is a convenient measure and is routinely used to capture seasonal influenza trends in many countries. However, it has been shown that syndromic diagnosis alone cannot establish the etiology of respiratory manifestations because many respiratory viral infections present with similar symptoms. 6,7 Hence, many influenza surveillance systems and predictive models supplement syndromic surveillance with laboratory-confirmed diagnosis performed on patient specimens, that is, viral surveillance. 5,8 However, the specimens collected for laboratory assay and reported to public health officials make up a very small subset of the total cases, and many collaborating laboratories test exclusively for influenza.
Syndromic and viral surveillance draw only upon medically attended cases and are designed neither to capture mild or asymptomatic respiratory infections nor to represent the large part of the population that chooses not to seek care. Reports on syndromic and viral surveillance have to be interpreted as ratios (eg, the number of patients with classifiable ILI among total visits within reporting facilities; the number of virus-positive specimens among all specimens tested) and thus do not give a broad estimate of the prevalence of disease or infection in the general population. In contrast, serology can be performed to assess rates of antibody production in the broader population against a particular virus. However, serological studies are retrospective, and thus unsuitable for estimating prevalence in a timely manner, and indirect, and thus not optimal for viruses eliciting short-lived immunity. 9

To estimate the total impact of respiratory illness on the population, multiplier models based on telephone and web surveys have been used, such as during the 2009 influenza pandemic. 10,11 Studies have estimated that 17%-30% of people experiencing ILI seek medical attention during a typical flu season 2,12-14; however, across the world, rates of seeking health care for respiratory symptoms are more heterogeneous and range from 4% to 85%. 15,16 In New York City (NYC), a survey conducted by the Department of Health and Mental Hygiene (DOHMH) estimated that each Emergency Department-attended ILI corresponds to roughly 60 illnesses in the (adult) community. 14 Both survey- and web-based approaches have some important limitations. First, they overlook asymptomatic and mild infections, which are important from an epidemiological vantage. Second, despite being nonspecific, self-reported ILI is often inappropriately interpreted as an indicator of influenza infection. Third, healthcare-seeking behavior for respiratory illness is highly variable and depends on healthcare policy, socioeconomic background, severity of symptoms (possibly related to virus type), and the influence of media and community.
Here, we used a longitudinal study approach to estimate the burden of viral respiratory infections at the population level and to evaluate the typical indicators used by surveillance systems. This analysis is part of a broader study intended to document the prevalence and impact of viral respiratory infections on the NYC population. In a previously published analysis, we showed that more than two-thirds of respiratory infections are asymptomatic and that healthy individuals typically experience multiple infections per year, with children and their caretakers experiencing more infections per year than other adults. 15 Here, we pursued two aims. First, we were interested in measuring the viral agents captured by ILI and how well medically attended ILI reflects the burden of viral respiratory infections, of influenza alone, and more generally of respiratory disease in the broader population. Second, using a unique dataset, we endeavored to quantify the prevalence of respiratory viral infections and illnesses among the general population and to capture healthcare-seeking behavior.

KEYWORDS
ILI, medically attended respiratory infections, population-based estimate of respiratory infections, respiratory illness surveillance

2 | METHODS

| Datasets
We used data from multiple datasets: a longitudinal cohort of NYC residents, a cross-sectional sample of patients seeking care at three NYC pediatric hospitals, and respiratory surveillance data published by the NYC DOHMH. The longitudinal data were used to (a) quantify the impact of respiratory infection in the population in terms of number of infections, healthcare seeking behavior, and symptoms and (b) evaluate medically attended cases as an indicator of disease burden. Cross-sectional (pediatric) hospital data were used for comparison with virus prevalence among the children and teenagers in the longitudinal cohort. The DOHMH surveillance data were used for comparison with the syndromic and viral data from the cohort.

| Cohort
We enrolled 214 healthy individuals from multiple locations in the Manhattan borough of NYC. Cohort composition is the same as described in Refs. [4,9] and included children attending two daycares, along with their siblings and parents; teenagers and teachers from a high school; adults working at two emergency departments (a pediatric and an adult hospital); and adults working at a university medical center. The study ran from October 2016 to April 2018, covering two cold and flu seasons, with some individuals enrolled for a single season (October-April) and others for the entire study period. Participants (or their guardians, if minors) provided informed consent after reading a detailed description of the study (CUMC IRB AAAQ4358). Nasopharyngeal swab specimens were collected weekly from each enrolled individual and tested for respiratory viruses. Further, participants completed daily self-reports rating nine respiratory illness-related symptoms (fever, chills, muscle pain, watery eyes, runny nose, sneezing, sore throat, cough, chest pain), which were recorded on a Likert scale (0 = none, 1 = mild, 2 = moderate, 3 = severe). The daily report also asked whether participants had sought medical attention, stayed home, or taken cold/flu-related medications (both over-the-counter drugs and antibiotics, which are not available in NYC without a prescription) as a consequence of their listed symptoms. The longitudinal cohort was obtained using convenience sampling.
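As an illustration of how a daily symptom report can be mapped to the US ILI definition (fever plus sore throat and/or cough), the following is a minimal sketch and not the study's actual code. The `DailyReport` class and both function names are hypothetical. Because fever was self-reported without a threshold, the sketch assumes that any nonzero Likert score counts as the symptom being present, and it classifies single days rather than multi-day episodes.

```python
from dataclasses import dataclass


@dataclass
class DailyReport:
    """One participant-day of self-reported scores (0 = none ... 3 = severe).

    Hypothetical structure; only the fields needed for ILI are included.
    """
    fever: int = 0
    sore_throat: int = 0
    cough: int = 0
    sought_care: bool = False


def is_ili(report: DailyReport) -> bool:
    """US ILI definition: fever plus sore throat and/or cough.

    Assumption: any score above 0 means the symptom is present.
    """
    return report.fever > 0 and (report.sore_throat > 0 or report.cough > 0)


def is_medically_attended_ili(report: DailyReport) -> bool:
    """MA ∩ ILI for a single day: care was sought and symptoms meet ILI."""
    return report.sought_care and is_ili(report)
```

Under this sketch, a day with severe cough and sore throat but no reported fever is not ILI, which is the same pattern that later explains why many medically attended illnesses fall outside the ILI definition.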

| Pediatric emergency departments (EDs)
A total of 761 children and teenagers were enrolled at three New York pediatric EDs from August 2016 to June 2018. Patients arriving at one of the pediatric EDs with respiratory complaints (ie, acute illness, asthma) were offered the opportunity to take part in the study and, upon parental consent, tested on site for respiratory viruses.

| DOHMH
All clinical laboratories that perform influenza testing on NYC residents and a large sample of NYC laboratory facilities licensed to perform influenza testing report results electronically to the DOHMH.
These laboratories provide weekly data on the number of influenza tests requested and on positive results by influenza type (when available), as well as data on RSV. Beginning with the 2017-2018 season, three NYC laboratories also provide the DOHMH with test results for other respiratory viruses: adenovirus, coronavirus, HMPV, rhinovirus/enterovirus, and parainfluenza. 8 All EDs in NYC report the weekly total number of ILI visits. We used DOHMH ILI, viral, and positivity data from the 2016/2017 and 2017/2018 cold/flu seasons for comparison with the cohort data. Sample collection and extraction followed the same protocol as in Refs. [4,9] and are reported in Text S1.

| Statistical analysis
Analysis of longitudinal data was conducted using the total number of positive samples, as well as the number of infection events. We defined an infection (or viral) event as a group of consecutive weekly specimens from a given individual that were positive for the same virus (allowing for a one-week gap to account for false negatives and temporary low shedding). Medically attended illness (MA), sick days (HOME), and medicine uptake (MEDS) were defined as episodes in which the participants reported seeking care, staying home, or taking medicines for any respiratory symptoms, independent of the etiology. Medically attended ILI (MA ∩ ILI) was defined as episodes in which the participants reported seeking care with symptoms compatible with the US ILI definition. Fever was a self-reported symptom, and no threshold was specified. Medically attended illnesses, sick days, and medicine uptake associated with a viral event were identified within −3/+7 days from any positive test date during an event in order to account for incubation time. We performed the analysis twice, including and excluding co-infections, defined as samples testing positive for multiple respiratory viruses.
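The event definition and the −3/+7 day linkage window described above can be sketched as follows. This is an illustrative reading of the text, not the authors' code: the function names are hypothetical, weeks are assumed to be integer indices for one participant and one virus, and a "one-week gap" is taken to mean a single missing weekly positive between two positives.

```python
from datetime import date, timedelta


def group_infection_events(positive_weeks, max_gap_weeks=1):
    """Group week indices with a positive test into infection events.

    Positives for the same person and virus are merged into one event
    when they are consecutive or separated by at most `max_gap_weeks`
    non-positive weeks (to tolerate false negatives or low shedding).
    """
    events = []
    for week in sorted(set(positive_weeks)):
        # A difference of 1 is consecutive; 2 means one skipped week.
        if events and week - events[-1][-1] <= max_gap_weeks + 1:
            events[-1].append(week)
        else:
            events.append([week])
    return events


def is_linked_to_event(report_date, positive_dates, days_before=3, days_after=7):
    """True if a report falls within -3/+7 days of any positive test date."""
    return any(
        pos - timedelta(days=days_before) <= report_date <= pos + timedelta(days=days_after)
        for pos in positive_dates
    )


# Weeks 1, 2, and 4 form one event (week 3 is bridged); week 7 starts another.
events = group_infection_events([1, 2, 4, 7])
linked = is_linked_to_event(date(2017, 1, 10), [date(2017, 1, 12)])
```

The asymmetric window is wider after the positive test than before it, so a care-seeking report a week after a positive swab is still attributed to that event, while one four days before it is not.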
To determine whether the distribution of viral pathogens differs between the cohort (general) population and healthcare-based settings, we compared the relative distributions of viruses within the cohort (restricted to children and teenagers) and the pediatric EDs.
In doing the comparison, we restricted the samples from the ED to match the time of sample collection from the cohort (October 2016 through April 2018). Moreover, we estimated for each virus $v_i$ the conditional probability of seeking medical attention, $P(\mathrm{MA} \mid v_i)$, and we used it to define a scaling factor mapping the prevalence of a specific virus in medical settings, $P(v_i \mid \mathrm{MA})$, to the prevalence in the general population, $P(v_i)$. The scaling was defined using Bayes' theorem:

$$P(v_i) = P(v_i \mid \mathrm{MA}) \, \frac{P(\mathrm{MA})}{P(\mathrm{MA} \mid v_i)}$$

where $P(\mathrm{MA})$ is the probability of a medically attended respiratory illness (viral or otherwise) and $P(\mathrm{MA} \mid v_i)$ is the probability of seeking care given infection with virus $v_i$. In theory, we can use $P(\mathrm{MA})$ and
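A minimal sketch of this rescaling, assuming only relative proportions are needed: because $P(\mathrm{MA})$ is the same for every virus, it cancels once the rescaled shares are normalized to sum to 1. The function name and the input values below are hypothetical and chosen for easy arithmetic; they are not the study's Table 1 estimates.

```python
def rescale_to_population(ed_share, p_ma_given_virus):
    """Map virus shares among medically attended cases to population shares.

    Implements P(v_i) proportional to P(v_i | MA) / P(MA | v_i), then
    normalizes. The common factor P(MA) cancels in the normalization.
    """
    weights = {v: ed_share[v] / p_ma_given_virus[v] for v in ed_share}
    total = sum(weights.values())
    return {v: w / total for v, w in weights.items()}


# Hypothetical inputs: two viruses equally common among ED patients,
# but one prompts care seeking five times as often as the other.
ed_share = {"virus_a": 0.5, "virus_b": 0.5}
p_ma = {"virus_a": 0.20, "virus_b": 0.04}
population_share = rescale_to_population(ed_share, p_ma)
```

In this toy case the virus that rarely prompts care seeking ends up with five-sixths of the rescaled share, illustrating how healthcare-based data can understate viruses that cause milder illness.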

3 | RESULTS
Complete demographic information is presented in Table S1 of the Supplementary Materials. Depending on the virus, 4%-20% of infections resulted in an individual seeking medical attention (MA), 7%-44% were associated with one or more sick days, and 24%-59% were associated with medicine intake within −3/+7 days of testing positive (Table 1 and Table S2). Similar results were obtained when including co-infections (Table S3). Human rhinovirus (HRV) was associated with the most medical visits because of its high prevalence, but influenza and HMPV were the viruses most likely to result in medical care and sick days (chi-squared test comparing influenza and HMPV with other respiratory viruses, P < .01). There were 39 reports of participants taking antibiotics (irrespective of associated RVP results); however, this use of antibiotics was associated with a reported medical consul-

Conversely, influenza and HMPV accounted for 23% and 8% of hospital data, respectively, but only 6% and 4% of the cohort.
After rescaling the distribution of viruses in the Peds-ED using the Bayes' theorem mapping, that is, the virus-specific $P(\mathrm{MA} \mid v_i)$ from Table 1

Each week, between 0% and 12% of cohort participants reported symptoms classifiable as ILI (Figure

with those derived from DOHMH surveillance data in NYC and document similarly timed respiratory viral outbreaks (Figure S5 and Text S2).

4 | DISCUSSION
FIGURE 1 Differences in viral distribution among EDs and the general population. Comparison of the distribution of viruses within patients at pediatric hospitals and among a cohort of children and teenagers tested regularly irrespective of symptoms. The median age associated with specimens was the same for the hospital and cohort (4 y). We restricted the analysis to samples testing positive for a single respiratory virus at PedsED (258) and to samples taken from the children/teenagers cohort (257) within the same time period: October 2016 to April 2018. The pie chart on the right represents data from the pediatric hospitals rescaled by the likelihood of seeking care for a specific virus (Table 1), following the Bayes mapping reported in Methods. We did not consider RSV positivity in either dataset because children in the cohort did not include young infants, who are most subject to severe RSV infections. To estimate the relative proportion of viruses, we can disregard the numerator of the scaling factor and use $P(\mathrm{MA} \mid v_i)$ from Table 1

Medically attended respiratory cases reported to public health systems are only a subset of total respiratory disease. Mild symptoms, 17 difficulty accessing the healthcare system, lack of perceived risk, and opting for alternative medicine 16 are typical reasons for not seeking medical care for respiratory symptoms. Estimates of healthcare seeking behavior are typically obtained via telephone or Internet-based surveys and have been used to develop probabilistic models that transform counts of laboratory confirmed cases to population-level estimates of prevalence. 10,18 Overall survey-estimated rates of physician consultation for respiratory illness vary from 4% to 85%. 15,16 This considerable variability has been associated with socioeconomic heterogeneity, variable healthcare policies (health insurance policy, 8

Recently, the need for developing methods to link epidemiological studies, clinical practice, and standard surveillance has become a priority. 19 Here, to contribute to this goal, we used an active, longitudinal sampling study combining daily self-reported symptoms with weekly laboratory testing for respiratory viruses, and compared these population data to healthcare-acquired data. The longitudinal data not only allowed estimation of the likelihood of seeking medical attention when experiencing ILI (30%), but also the likelihood of seeking medical attention given infection with individual viruses (a much lower and virus-dependent probability).
Medically attended ILI has been widely used as a proxy for respiratory infections and most frequently for influenza infections alone. Here, we showed that medically attended ILI is a noisy indicator for the burden of respiratory viruses and for influenza in isolation. In our longitudinal cohort, the majority of ILI events would have gone undetected as most people simply did not seek medical help. Moreover, during weeks in which participants sought care for ILI, 32% tested negative for respiratory viral infection and 82% tested negative for influenza. Among those 32% testing negative, ILI symptoms could have been due to different pathogens (bacteria or viruses not included in the RVP) or to infections manifesting for too few days to be captured by our weekly testing.
Similar estimates for the percentage of ILI due to influenza (found to be 18% here) have been reported both in the literature 20 and from clinical laboratories reporting to the CDC. 5 More interestingly, if medically attended ILI is used to estimate respiratory viral infection within the broader population, we find that 95% of infections (and 79% of flu cases) would be unobserved. This result was not only due to the preponderance of asymptomatic infections: Among subjects who tested positive for any virus and developed respiratory symptoms classifiable as ILI, 64% still did not seek medical attention.
These observations of limited sensitivity and positive predictive value (PPV) serve as a warning against using ILI in isolation as a proxy index for respiratory virus or influenza prevalence in the population.
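To make the two quantities concrete, the sketch below computes sensitivity and PPV from event counts. The counts are hypothetical, chosen only so that the outputs match the percentages reported above for any respiratory virus (5% of infections captured by medically attended ILI; 68% of medically attended ILI testing positive). They are not the study's actual counts, and the function names are illustrative.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of all infections that were captured as medically attended ILI."""
    return true_pos / (true_pos + false_neg)


def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """Share of medically attended ILI episodes with a confirmed infection."""
    return true_pos / (true_pos + false_pos)


# Hypothetical counts (assumption, for illustration only):
captured = 17      # infections that were medically attended and met ILI
missed = 323       # infections never seen as medically attended ILI
no_virus = 8       # medically attended ILI episodes testing negative

sens = sensitivity(captured, missed)
ppv = positive_predictive_value(captured, no_virus)
unobserved = 1 - sens  # fraction of infections invisible to MA-ILI
```

Read this way, the statement that 95% of infections would be unobserved is simply one minus the sensitivity, while the 32% of medically attended ILI weeks testing negative is one minus the PPV.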
Further, the exclusive use of ILI may not be appropriate for capturing medical visits for respiratory illnesses of any cause. In fact, more than half of the medically attended respiratory illnesses reported within the cohort did not satisfy the definition of ILI, mostly because fever was not recorded in cases presenting with other severe symptoms (like cough, chest pain, and sore throat). This misalignment with ILI criteria may be partially due to self-reporting of symptoms (fever was not necessarily measured) and to the definition of ILI, which was specifically designed to capture the signal of influenza. The predictive power of MA-ILI for influenza was in fact higher than for unspecified respiratory viral infections. We showed

Clearly, healthcare-based surveillance is the primary feasible approach for monitoring the prevalence of respiratory viruses regularly and in real time. It also consistently provides critical data supporting infectious disease forecasting efforts, especially for influenza. 24 However, active longitudinal sampling for respiratory virus infections and care-seeking behaviors could be used to document important supplementary information largely missed by standard surveillance.
This alternate sampling provides a unique picture of respiratory virus prevalence in the community and demonstrates that most respiratory virus infections are not documented as ILI, that rates of seeking clinical care vary by virus, and that many infected individuals seeking care do not meet the definition of ILI. Furthermore, the longitudinal approach can be useful for quantifying potential inappropriate population behavior during respiratory illness (eg, antibiotic uptake without medical consultation).

This work was supported by the Defense Advanced Research Projects Agency contract W911NF-16-2-0035. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

DECLARATION OF INTERESTS
JS and Columbia University disclose partial ownership of SK Analytics. JS also discloses consulting for Merck. All other authors declare no competing interests.