Longitudinal active sampling for respiratory viral infections across age groups

Background: Respiratory viral infections are a major cause of morbidity and mortality worldwide. However, their characterization is incomplete because prevalence estimates are based on syndromic surveillance data. Here, we address this shortcoming through the analysis of infection rates among individuals tested regularly for respiratory viral infections, irrespective of their symptoms.

Methods: We carried out longitudinal sampling and analysis among 214 individuals enrolled at multiple New York City locations from fall 2016 to spring 2018. We combined personal information with weekly nasal swab collection to investigate the prevalence of 18 respiratory viruses among different age groups and to assess risk factors associated with infection susceptibility.

Results: 17.5% of samples were positive for respiratory viruses. Some viruses circulated predominantly during winter, whereas others were found year round. Rhinovirus and coronavirus were most frequently detected. Children registered the highest positivity rates, and adults with daily contacts with children experienced significantly more infections than their counterparts without children.

Conclusion: Respiratory viral infections are widespread among the general population with the majority of individuals presenting multiple infections per year. The observations identify children as the principal source of respiratory infections. These findings motivate further active surveillance and analysis of differences in pathogenicity among respiratory viruses.


| INTRODUCTION
Traditional estimates of influenza-like infection rates are based on specimens collected from patients seeking treatment in medical clinics or emergency departments. 1,2 This surveillance overlooks the large part of the population who experience asymptomatic infections or choose not to see a doctor. As a result, prevalence estimates of different viruses are skewed toward the most pathogenic agents, and overall infection rates are likely underestimated. Here, we supplement traditional surveillance through sampling among healthy individuals who are tested weekly for multiple respiratory viruses irrespective of whether they are experiencing respiratory symptoms. We use this information to quantify the prevalence of common respiratory viruses within the general population, characterize the seasonality of these viruses, and assess the risk factors (age, habits, and health conditions) that increase susceptibility among hosts.

| Cohort composition and survey
We enrolled 214 healthy individuals from different locations in the borough of Manhattan in New York City. The cohort included children attending two day cares, along with their siblings and their parents, teenagers and teachers from a high school, adults working at two emergency departments (a pediatric ER and an adult ER), and adults working at a university medical center. The study ran from October 2016 to April 2018, with some individuals enrolled for a single cold and flu season (October-April) and others for the entire period. At enrollment, individuals were asked to complete a baseline survey and provide two nasopharyngeal swab samples (one from each nostril). Following this preliminary step, two nasopharyngeal samples were collected weekly from each participant irrespective of symptoms. The baseline questionnaire completed at the time of enrollment included information on ethnicity, general health, daily habits, travel history, and household structure.
Parents provided consent and questionnaire answers for children enrolled. Details on the participants are summarized in Table 1.

| Statistical analysis
Analyses were conducted using the total number of positive samples, as well as the number of illness events. We defined an illness event as a group of consecutive weekly swab specimens for a given individual that were positive for the same virus (allowing for a 1-week gap to account for false negatives and temporary low shedding).
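The illness-event definition above can be sketched in code. The following is a minimal illustration, not the authors' implementation: the function name `count_illness_events`, the `(participant, virus, week)` record layout, and the reading of "a 1-week gap" as one negative or missing week between two positive weeks are all assumptions made for this example.

```python
from collections import defaultdict


def count_illness_events(records, max_gap_weeks=1):
    """Group positive weekly swabs into illness events.

    records: iterable of (participant_id, virus, week_index) tuples, one per
    positive result (hypothetical layout). Positives for the same participant
    and virus are merged into a single event when at most `max_gap_weeks`
    weeks without a positive separate them.
    Returns a sorted list of (participant_id, virus, first_week, last_week).
    """
    weeks_by_key = defaultdict(set)
    for participant, virus, week in records:
        weeks_by_key[(participant, virus)].add(week)

    events = []
    for (participant, virus), weeks in weeks_by_key.items():
        ordered = sorted(weeks)
        start = prev = ordered[0]
        for week in ordered[1:]:
            # A jump larger than the allowed gap closes the current event.
            if week - prev > max_gap_weeks + 1:
                events.append((participant, virus, start, prev))
                start = week
            prev = week
        events.append((participant, virus, start, prev))
    return sorted(events)


# Hypothetical positives for one participant.
records = [
    ("p1", "rhinovirus", 1),
    ("p1", "rhinovirus", 2),
    ("p1", "rhinovirus", 4),   # week 3 negative: still the same event
    ("p1", "rhinovirus", 7),   # weeks 5-6 negative: a new event
    ("p1", "coronavirus", 2),
]
events = count_illness_events(records)
```

In this toy input the rhinovirus positives in weeks 1, 2, and 4 collapse into one event, while the week 7 positive starts a second one.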
The impact of population-based variables on virus positivity or illness event rates was assessed using ANOVA and logistic regression. The chi-squared statistic was also used to assess pairwise differences. The participants were divided into four groups as follows: children (under 10 years of age), teenagers, adults with daily contact with children (parents and pediatric ER doctors), and adults without daily contact with children. For this analysis, only the 192 participants who contributed at least six separate pairs of nasopharyngeal samples were included.

| RESULTS
Rhinovirus and coronavirus were the most frequently identified viruses, present in 408 and 188 samples (55% and 25% of the positive samples), respectively, followed by adenovirus (11%), RSV (5%), influenza (5%), parainfluenza virus (4%), and HMPV (3%). Among these viruses, influenza, RSV, coronavirus, and HMPV were most prevalent in the winter months and had no documented incidence during the summer months. In contrast, rhinovirus, adenovirus, and parainfluenza circulated throughout the entire study period (the temporal distribution is shown in Figure 1 and Supplementary Figure S1).
We compared the results among the following four cohort groups: children, teenagers, adults with daily contact with children, and adults without daily contact with children. Children presented a significantly higher number of co-infections than the other groups: 16% of positive results among children were positive for more than one virus vs 0%, 2%, and 6%, respectively, for teenagers, adults without children, and adults with children (significantly higher for the children, P < 0.0001).
The percentage of tests that were positive differed significantly among the groups: 36% for children, 15% for teenagers, 17% for the adults with children, and only 7% for adults without children. The odds of testing positive for children were six times higher than the odds of testing positive for adults without daily contact with children (see Supplementary Figure S2 for raw numbers across the different locations and Table S1 for results of logistic regression). The analysis of viral events also confirmed a significant difference across groups, with children exhibiting the highest number of viral events and adults without children the lowest. Comparison of the number of viral events among the four groups is shown in Figure 2, together with P-values for pairwise comparisons (Table 2).
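To make the relationship between positivity percentages and odds concrete, the sketch below computes crude (unadjusted) odds ratios directly from the rounded group percentages quoted above, using adults without children as the reference. This is an illustration only: the helper names are hypothetical, and a crude ratio built from rounded percentages is an approximation that need not equal the logistic regression estimate reported in Table S1.

```python
def odds(p):
    """Convert a probability (0 < p < 1) into odds."""
    return p / (1.0 - p)


def odds_ratio(p_group, p_reference):
    """Crude odds ratio of a group relative to a reference group."""
    return odds(p_group) / odds(p_reference)


# Rounded positivity rates as quoted in the text.
positivity = {
    "children": 0.36,
    "teenagers": 0.15,
    "adults_with_children": 0.17,
    "adults_without_children": 0.07,
}

reference = positivity["adults_without_children"]
crude_or = {group: odds_ratio(p, reference) for group, p in positivity.items()}

for group, value in crude_or.items():
    print(f"{group}: crude OR = {value:.2f}")
```

Because odds grow faster than probabilities, the ratio of odds between children and adults without children is larger than the simple ratio of the two percentages.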
We tested the effect of several baseline factors on the normalized number of infections. Gender, presence of pre-existing respiratory conditions (any condition, but also separately asthma and allergy), choice of public vs private means of transportation, and self-identification with Hispanic ethnicity did not have a significant association with the number of infections. In contrast, age group, living with other children, and self-identification with American Indian race had a significant effect on the number of infections per 10 tests (P-values of 0, 0.05, and 0.01, respectively). Note that the majority (73%) of people self-identifying as American Indian were children.
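The normalization to infections per 10 tests can be illustrated with a short sketch. The function name, the participant records, and the choice to reuse the six-test inclusion threshold from the group analysis are assumptions made for this example; the counts are invented and are not study data.

```python
def infections_per_10_tests(n_events, n_tests, min_tests=6):
    """Return illness events per 10 tests.

    Returns None when the participant contributed fewer than `min_tests`
    tests (assumed inclusion threshold, hypothetical for this sketch).
    """
    if n_tests < min_tests:
        return None
    return 10.0 * n_events / n_tests


# Hypothetical participants: (illness events, weekly tests contributed).
participants = {
    "child_a": (9, 30),
    "adult_b": (2, 40),
    "adult_c": (1, 4),   # too few tests, excluded
}

rates = {
    pid: infections_per_10_tests(events, tests)
    for pid, (events, tests) in participants.items()
}
```

Normalizing by the number of tests keeps participants with long and short enrollment periods comparable, since someone sampled for two seasons would otherwise accumulate more events simply by being tested more often.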
The distribution of viruses found across the different age groups was similar, with coronavirus and rhinovirus accounting for 70% to ...

These observations suggest children are a principal source of respiratory infection and confirm earlier studies that found day cares to be optimal environments for transmission. 12,13 Self-identification with American Indian or Alaska Native race was also a factor influencing the number of respiratory viral infections. This association was likely due to the non-mixed nature of our population, as nearly all of the participants self-identifying as Alaska Native or American Indian were children or parents from one of the day care settings. Children were also associated with a higher risk for co-infection than adults and teenagers, as has also been shown in earlier studies. 14 A larger variety of viruses was found in children and their close contacts; however, rhinovirus and coronaviruses were the most frequently identified viral respiratory pathogens in all age groups. Together, these two viruses accounted for more than 70% of positive results. The presence of multiple subsequent infections with the same virus in many individuals suggests short-lasting immunity or potential low cross-immunity among multiple co-circulating serotypes of the same pathogen. Previous studies on human rhinovirus (HRV) report up to 20 different rhinovirus types (among more than one hundred known) circulating in a community during one season. Further, the prevailing strains can differ widely between locations and across seasons, and can switch almost completely from year to year. 15

Our estimates of incidence rates differ markedly from those built on syndromic surveillance data. 16,17 Among patients seeking care, some viruses, like influenza, are overrepresented and others, like coronaviruses, are profoundly underrepresented. This asymmetry is likely due to the different pathogenicity of the viruses causing respiratory infections and underscores the importance of using nonsyndromic surveillance data to capture the true overall prevalence of respiratory virus infection within the general population.
A limitation of this study is the low frequency of sampling in late spring/summer months, due to decreased participation of the enrolled individuals. Despite the lower number of samples collected during these months, seasonal and non-seasonal patterns are clearly identifiable. Some viruses (influenza, RSV, coronavirus, and HMPV) had a distinct peak during winter months, whereas others circulated year round. Such assessment of seasonality for different pathogens is important for planning vaccination and control strategies and for understanding the dynamics of transmission.
Future work should involve analyses of differences in pathogenicity among respiratory viruses, as well as the impact of genetic, demographic, and environmental features on pathogenicity. Moreover, longitudinal sampling coupled with information on symptomology should be used to analyze the impact of asymptomatic infections and the role of asymptomatic carriers on transmission dynamics.

This work was supported by the Defense Advanced Research Projects Agency contract W911NF-16-2-0035. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.