Factors affecting the transmission of SARS-CoV-2 in school settings

Background: Several studies have reported SARS-CoV-2 outbreaks in schools, with a wide range of secondary attack rates (SAR; range: 0-100%). We aimed to examine the key risk factors to better understand transmission in school settings.

Methods: We collected records of SARS-CoV-2 school outbreaks globally published from January 2020 to January 2021 and compiled information on hypothesized risk factors. We used a directed acyclic graph (DAG) to conceptualize the risk mechanisms, used logistic regression to examine each risk-factor group, and then built multivariable models based on the marginal analysis. Adjusted odds ratios (aOR) and 95% confidence intervals (CI) were calculated.

Results: From 17 relevant articles, 26 school clusters were included for analysis. The best-fit model showed that the intensity of community transmission (aOR: 1.26; 95% CI: 1.22 - 1.30, for each increase of 10 cases per 100,000 persons per week), social distancing (aOR: 0.26; 95% CI: 0.18 - 0.37), and mask-wearing (aOR: 0.52; 95% CI: 0.35 - 0.78) were associated with the risk of SARS-CoV-2 infection in schools. Compared to students in pre-schools, the aOR was 0.12 (95% CI: 0.07 - 0.19) for students in primary schools and 1.31 (95% CI: 0.93 - 1.87) for students in high schools.

Conclusions: Preventive measures in both schools (e.g., social distancing and mask-wearing) and communities (additionally, vaccination) should be taken to collectively reduce transmission and protect children in schools. Flexible reopening policies may be considered for different levels of schools given their risk differences.


Introduction

Since the early stages of the COVID-19 pandemic, concerns have been raised about the impact of schools on community transmission and the well-being of students and staff, as well as the impact on the schedules of healthcare workers concerning childcare [1]. Out of an abundance of caution and fear that the SARS-CoV-2 virus would spread rapidly in schools much like influenza pandemics [2], countries globally decided to suspend in-person classes and begin online instruction. By April 2020, over 1.5 million students worldwide were affected by school closures in response to the COVID-19 pandemic [3]. In contrast to influenza pandemics where children are the key drivers of transmission, studies have indicated that children are likely less susceptible to SARS-CoV-2 infection, tend to experience less severe disease when infected, and likely have lower transmissibility [4,5]. Given this new evidence, schools in many places have gradually reopened since the summer of 2020, while implementing varying levels of preventive measures (e.g., mask wearing, distancing, limiting the number of students, rotating schedules, and viral testing) to reduce the risk of transmission. Given these circumstances, the risk of SARS-CoV-2 outbreaks in school settings may differ substantially across space and time. Indeed, several studies have examined school outbreaks of COVID-19 and reported secondary attack rates (SAR), i.e., the proportion of infected contacts of an index case out of all contacts of that index case [6], ranging from 0 (i.e., no secondary infections) to 100% (i.e., infections among all contacts). However, this discrepancy is still not fully understood, and a better understanding can inform better preventive measures for future outbreaks not limited to COVID-19 or school settings.
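The SAR definition above reduces to a simple ratio. The following is a minimal sketch in Python; the function name and the example counts are hypothetical and are not taken from any of the included studies.

```python
def secondary_attack_rate(infected_contacts: int, total_contacts: int) -> float:
    """Return the SAR: infected contacts divided by all contacts of an index case."""
    if total_contacts <= 0:
        raise ValueError("total_contacts must be positive")
    if not 0 <= infected_contacts <= total_contacts:
        raise ValueError("infected_contacts must lie between 0 and total_contacts")
    return infected_contacts / total_contacts


# Hypothetical cluster (illustrative numbers only): 3 infected among 40 contacts.
example_sar = secondary_attack_rate(3, 40)
print(f"SAR = {example_sar:.1%}")
```

A SAR of 0 corresponds to no secondary infections and a SAR of 1 (100%) to infection of every contact, which are the two extremes reported in the literature.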
To identify the main factors that determine the transmission of SARS-CoV-2 in schools and inform strategies to prevent future school outbreaks, here we examined the associations between SARS-CoV-2 SAR in children and various potential risk factors. We compiled data from relevant studies in the literature reporting SARS-CoV-2 SAR in schools and for posited risk factors (e.g., incidence in the community) and further used regression models to examine key risk factors of having high SAR in schools. We found that appropriate distancing and mask wearing, school levels, and the magnitude of SARS-CoV-2 transmission in the community are key determinants of SAR in school settings.

Methods

Data sources

Studies were searched for on the "Living Evidence for COVID-19" database [24], which retrieves articles from EMBASE via Ovid, PubMed, BioRxiv, and MedRxiv. Any article within this database was considered, from December 2019 up to February 17, 2021. The search terms used include "transmission AND (school OR schools)" or "transmission AND children." When titles and abstracts were identified as being potentially relevant, the articles were read to determine if the outbreaks took place in a school setting and the number of infections and contacts were reported. In addition, we extracted 11 observations included in a systematic review of evidence regarding the ability of children to transmit SARS-CoV-2 in schools [7]. In total, 17 studies were included in this analysis (see Appendix 1).

Relevant data, as deemed by an initial conceptual analysis using the directed acyclic graph (DAG; see details below), were taken from the articles identified above. [...] and whether masks and social distancing were required. In addition, we compiled additional data for potential risk or confounding factors of SARS-CoV-2 transmission in schools for each identified study as detailed in the next section.
Conceptual analysis and variable coding

The unit of analysis was the individual study; where several school types were covered in one study, the data were stratified by school type. We first conducted a conceptual analysis using the DAG and identified eight key components that may affect SARS-CoV-2 transmission in schools (Fig 1). Below we describe each of the eight components, the rationale for inclusion, and the related variables examined.

1) School types, based on studies indicating differential transmission risk among different age groups [20,22]. Here, we examined this factor as a categorical variable with 4 levels, i.e., pre-school or early childhood education center (ECEC), primary school, high school, and mixed-level school. The first three levels were per reports in the included school studies. For studies that examined several types of school but did not report school-type-specific SARs, we assigned them to a "mixed-level school" category. For example, if a study gave the overall SAR combining a pre-school and a primary school, it was given the value "mixed-level school." SARS-CoV-2 SAR among children in school settings is the number of infected contacts divided by the total number of contacts at each school.

2) Physical school settings such as student density in the classroom and ventilation systems that may affect the intensity of school contact and clearance of air. As it is difficult to obtain information related to ventilation settings, here we included class size in our analysis based on the average number of students per classroom in each country, as reported by the Organisation for Economic Co-operation and Development (OECD) [12].

3) Cultural climates, which "represent independent preferences for one state of affairs over another that distinguish countries (rather than individuals) from each other" [...] in the school studies, including testing only the symptomatic, both symptomatic and some asymptomatic, and all contacts. However, due to the small sample size in "only symptomatic" (n = 3), we dichotomized surveillance to testing "only symptomatic or some asymptomatic" and "all contacts" of an index case in each school cluster.

The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. This version posted June 22, 2021. All rights reserved. No reuse allowed without permission.
[...] transmission of SARS-CoV-2. Here, we used specific humidity (a measure of absolute humidity) to examine the potential impact from disease seasonality, as specific humidity and temperature are highly correlated. Specifically, ground surface temperature and relative humidity for each study location were extracted from the National Oceanic and Atmospheric Administration using the "stationaRy" package [11]. Daily mean specific humidity in g H2O/kg air was then computed based on the temperature and relative humidity data and was averaged over the study period.

8) Indicators of socioeconomic status such as national income that reflect a country's ability to mobilize resources to fight COVID-19. As such, we included measured national income for each study in our analysis; specifically, national income is measured as the gross domestic product (GDP) minus capital depreciation plus net foreign income, using data from the World Inequality Database [13].

Statistical Analyses

Marginal analysis

Due to the low number of observations (n = 26 schools), we conducted an initial analysis to test combinations of the DAG covariates described above. The goal was to examine the relationship between SAR and only one group of variables at a time and then include the most relevant predictors in the final model based on this analysis. For each test, we used a logistic regression model of the following form:

$$\mathrm{logit}(\mathrm{SAR}) \sim X$$

where logit is the log-odds (i.e., $\log\frac{p}{1-p}$, with p as the probability of event) and SAR represents the SAR as reported from each of the 26 schools.
X is one of the combinations of variables we examined as follows:
1) Preventive measures (mask-wearing and social distancing), adjusting for seasonal changes
2) Community transmission (weekly death rate per 100,000 and weekly case rate per 100,000)
3) Seasonal changes
4) Classroom size, adjusting for national income
5) School type
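To make the marginal model concrete, the sketch below fits a logistic regression with an intercept and a single covariate to grouped counts (infected contacts out of all contacts per cluster) by Newton-Raphson. It is a hand-rolled Python illustration, not the authors' code (the paper reports using R's "glm"); the function name `fit_logistic` and all counts are hypothetical, and the adjustments described in the text are not included.

```python
import math


def fit_logistic(x, cases, contacts, iterations=25):
    """Fit logit(p_i) = b0 + b1 * x_i to grouped binomial data by Newton-Raphson.

    x        : covariate value per cluster (e.g., 1 if masks were required)
    cases    : infected contacts per cluster
    contacts : total contacts per cluster
    """
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi, ni in zip(x, cases, contacts):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            resid = yi - ni * p           # score contribution
            weight = ni * p * (1.0 - p)   # Fisher information weight
            g0 += resid
            g1 += resid * xi
            h00 += weight
            h01 += weight * xi
            h11 += weight * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + H^{-1} g, with H the 2x2 information matrix
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical clusters (illustrative only): x = 1 where a measure was in place.
x = [0, 0, 1, 1]
cases = [10, 20, 8, 12]
contacts = [100, 200, 150, 250]

b0, b1 = fit_logistic(x, cases, contacts)
odds_ratio = math.exp(b1)  # odds ratio for x = 1 versus x = 0
print(f"OR = {odds_ratio:.3f}")
```

With a single binary covariate, the fitted odds ratio coincides with the cross-product ratio of the pooled 2x2 table, which offers a quick sanity check on the fit.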
Multi-risk factor analysis

All five variable groups were found to be associated with SAR in the marginal analysis (see Results). We thus tested models including different combinations of these variables. For all models, we included surveillance type to account for potential biases in reporting, including missing asymptomatic infections, which would lead to underestimation of SAR. We also assessed for confounding between our variables of interest and SARS-CoV-2 SAR (see adjustments specified above). This procedure began with including all of the significant variables from the marginal analysis and removing one at a time to see the effect of its removal on model fit. We then evaluated and selected the most parsimonious model with the best fit based on the Akaike information criterion (AIC; Table S1). The best-performing model took the following form:

$$\mathrm{logit}(\mathrm{SAR}) \sim \text{mask-wearing} + \text{social distancing} + \text{weekly case rate} + \text{school type} + \text{surveillance}$$

All statistical analyses were performed in RStudio, a user interface for R (R Foundation for Statistical Computing, Vienna, Austria). All models were fitted using the "glm" function from the built-in "stats" library in R.
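The model-selection step can be sketched as follows, in Python rather than the R used for the actual analysis. The sketch enumerates every combination of two or more of the four risk-factor groups retained for the multi-risk factor analysis (11 candidate models, as described in the caption of Table S1) and ranks fits by AIC = 2k - 2 ln L. The helper names and the log-likelihood values are hypothetical placeholders, not results from the paper.

```python
from itertools import combinations

# The four risk-factor groups carried into the multi-risk factor analysis
# (class size was excluded, per the Table S1 caption).
GROUPS = [
    "preventive measures",
    "community transmission",
    "seasonal changes",
    "school type",
]


def candidate_models(groups, min_size=2):
    """All combinations of at least `min_size` risk-factor groups."""
    return [
        combo
        for size in range(min_size, len(groups) + 1)
        for combo in combinations(groups, size)
    ]


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def select_best(fits):
    """fits maps a model name to (log_likelihood, n_params); return the lowest-AIC name."""
    return min(fits, key=lambda name: aic(*fits[name]))


models = candidate_models(GROUPS)
print(len(models), "candidate models")

# Hypothetical fit summaries, for illustration only.
hypothetical_fits = {
    "model A": (-520.0, 4),
    "model B": (-480.0, 7),
    "model C": (-479.5, 9),
}
best = select_best(hypothetical_fits)
print("lowest AIC:", best)
```

Because AIC penalizes each added parameter by 2, a larger model is preferred only when it raises the log-likelihood by more than one unit per extra parameter.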

Sensitivity analysis

We tested different measures of community transmission to examine the robustness of our models to potential biases due to variations in case ascertainment, mortality risk, and delay in event occurrence (e.g., from infection to death) and reporting. Specifically, we additionally examined two mortality-related measures of the intensity of community transmission: 1) weekly death rate for the study period and 2) weekly death rate for the study period plus one month. The additional month accounts for the time lag between infection and death, as death from COVID-19 may take several weeks [14]. Risk factors identified using these two measures were similar to those in our main analysis (Table S2). As the risk of death due to COVID-19 decreased substantially over time [15], likely due to improved medical treatment and management, such a change may bias the death rates recorded over time for the included studies, which span the initial phase in March 2020 to a later phase in December 2020. For this reason, we used weekly case rate per 100,000 to represent the intensity of community transmission in our main analysis.

Results

Summary statistics

We identified 26 reported SARS-CoV-2 outbreaks in schools, totaling 630 secondary cases in children among 8322 contacts. These outbreaks occurred in 12 countries, spanning 4 WHO regions: the Americas, Western Pacific, European, and Eastern Mediterranean Regions.
Table 1 shows the frequencies and summary statistics for SARS-CoV-2 SAR and other variables included. SARS-CoV-2 SAR and weekly cases per 100,000 population were both right skewed. While the reported SAR ranged from 0 to 100%, the majority of schools reported very low SAR (median: 0.7%, interquartile range: 0 - 17%). The majority of school settings included in these studies required mask-wearing (19/26, or 73%) and distancing (14/26, or 54%). Around half of the studies tested all contacts of the index case (12/26, or 46%). The studies covered roughly even proportions of the different school types: 5 (19%) were pre-schools, 9 (35%) were primary schools, 7 (27%) were high schools, and 5 (19%) were mixed schools.

Marginal analysis

The marginal analysis (Figure 2), adjusting for surveillance, indicated that, for the intensity of community transmission, each 10-unit increase in weekly case rate was associated with an increased risk of contracting SARS-CoV-2 in schools [odds ratio (OR): 1.11; 95% confidence interval (CI): 1.08 - 1.14]; and each 10-unit increase in weekly death rate was associated with an increased risk of contracting SARS-CoV-2 in schools [...] when the model did not adjust for surveillance (Figure 2).

Multi-risk factor analysis

Among all models tested (Table S1), the best-performing model with the lowest AIC included three key groups of risk factors, namely, intensity of community transmission, preventive measures, and school type, adjusting for surveillance. Overall, these risk factors in combination were able to explain 57.7% of the variance in the reported SARS-CoV-2 SAR (McFadden's pseudo R² = 0.577; Fig 3) [21, 23]. Fig 4 and Table 2 show the estimated adjusted ORs (aORs) for each risk factor.
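For reference, McFadden's pseudo R² compares the log-likelihood of the fitted model with that of an intercept-only (null) model: R² = 1 - ln L(model) / ln L(null). The Python sketch below shows the calculation; the function name and the log-likelihood values are hypothetical, chosen only so that the result matches the reported 0.577, and are not the paper's actual likelihoods.

```python
def mcfadden_pseudo_r2(loglik_model: float, loglik_null: float) -> float:
    """McFadden's pseudo R^2 = 1 - lnL(model) / lnL(null)."""
    if loglik_null == 0:
        raise ValueError("null log-likelihood must be non-zero")
    return 1.0 - loglik_model / loglik_null


# Hypothetical log-likelihoods (illustrative only).
example_r2 = mcfadden_pseudo_r2(loglik_model=-423.0, loglik_null=-1000.0)
print(f"McFadden pseudo R^2 = {example_r2:.3f}")
```

A value of 0 means the fitted model is no better than the null model, and values approach 1 as the fitted log-likelihood approaches zero.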
The model identified community transmission as the most significant variable associated with the risk of SARS-CoV-2 transmission in schools. In particular, the multi-risk factor model estimated that, in a community of 100,000 persons, every 10 additional cases per week in the community increases the odds of infection in schools by 26%. This finding highlights the potential community-to-school importation of cases and subsequent risk of outbreak in schools. As such, care must be taken when reopening or operating schools in areas with high levels of community transmission. In addition, reversing some original fears about school-to-community SARS-CoV-2 transmission, it is likely that community transmission drives outbreaks in schools, not the reverse. Further, with the availability of COVID-19 vaccines, predominantly to adults at present, it is paramount that all eligible adults get vaccinated promptly to lower the risk of transmission in the community and in turn to provide indirect protection to children via the increased population immunity.

Our model suggests that distancing policies have the greatest impact on reducing transmission in schools (aOR: 0.26, 95% CI: 0.18 - 0.37). This finding is likely a combined outcome of a reduced number of contacts and reduced short-range transmission when social distancing policies were followed. Maintaining distance has necessitated fewer people in a room at the same time, leading to fewer contacts. In addition, the increased personal space in the classroom enables students, when far apart, to avoid the likely higher viral concentration within short range of the emitter (via aerosols, droplets, or both).
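The 26% figure for community transmission follows from the aOR of 1.26 per 10 additional weekly cases per 100,000. Because a logistic model is linear on the log-odds scale, the odds ratio for a different increment is obtained by exponentiation, as the Python sketch below shows. The helper names and the baseline probability are hypothetical, and extrapolating far beyond the observed range of case rates assumes the log-linear relationship continues to hold.

```python
def rescale_odds_ratio(or_per_step: float, step: float, delta: float) -> float:
    """Odds ratio for an increase of `delta` units, given the OR per `step` units."""
    return or_per_step ** (delta / step)


def apply_odds_ratio(baseline_probability: float, odds_ratio: float) -> float:
    """Convert a baseline probability to the probability implied by an odds ratio."""
    odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# aOR of 1.26 per 10 weekly cases per 100,000 (from the best-fit model).
or_per_10 = 1.26
or_per_50 = rescale_odds_ratio(or_per_10, step=10, delta=50)  # equals 1.26 ** 5

# Hypothetical baseline infection probability among contacts (illustrative only).
baseline = 0.05
shifted = apply_odds_ratio(baseline, or_per_10)
print(f"OR per 50 cases: {or_per_50:.2f}; probability {baseline:.3f} -> {shifted:.3f}")
```

Note that a 26% increase in odds is not a 26% increase in probability; the two are close only when the baseline probability is small.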
Nevertheless, it is important to note that social distancing measures may be more difficult to achieve fully in disadvantaged communities with underfunded and overcrowded schools, further exacerbating the COVID-19 crises in these communities, usually of color [16,17]. Furthermore, racially motivated structural factors prevent these disadvantaged communities from practicing social distancing policies outside of the school. For example, these communities [...]
In comparison to pre-schools, students in both primary and mixed-level schools had a lower risk of SARS-CoV-2 infection whereas those in high schools had a higher risk. This finding is consistent with previous studies indicating the likely lower susceptibility to SARS-CoV-2 infection and transmissibility among young children [23]. It is also likely due in part to the greater ability of older children to follow directions regarding preventive measures, albeit with lower compliance among high school students. The similarity of the estimates between primary schools and mixed-level schools is likely due to the number of primary schools that were folded into the mixed-level school category: three of the five schools that made up this group included primary schools. Overall, these findings support flexible reopening policies for different levels of schools given the risk differences.

When schools test all contacts of the index case, this effectively functions as a control measure, in that more cases will be detected and a school closure is more likely to result. This may explain why the OR estimate for all-contacts surveillance is protective, as the outbreak could be identified and contained before asymptomatic individuals were able to infect others.

This study has several limitations. First, the analyses included a limited number of studies (n = 26), due to the inclusion criteria and the dearth of published reports on this topic to date. Future research with additional relevant studies may enhance the statistical power of inference for the risk factors examined here.
Second, due to a lack of detailed information for each specific school setting, we used proxy measures in the analyses (e.g., class size at the national level was used rather than for each reporting school), which may have limited the ability of the models to identify the association of these factors with SARS-CoV-2 transmission risk. Similarly, due to the lack of data, we were not able to examine other key factors such as ventilation in classrooms, socioeconomic status of individual students and their households, and potential differences in susceptibility and transmissibility by age group. Future work with comprehensive study designs and data collection is warranted to provide further insights into how infections, not limited to SARS-CoV-2, spread in schools and the broad, bi-directional impact of school and community transmission. This would be invaluable for informing better strategies to combat future infectious disease outbreaks.
Table 2. Results of the best-fit multi-risk factor model for the identification of factors associated with SARS-CoV-2 SAR in schools. Adjusted odds ratio (aOR) estimates and 95% confidence intervals are given from the logistic regression model including surveillance to control for differences in testing school clusters, school type due to inconsistent reporting of age groups in the literature, mask-wearing and social distancing for preventative measures, and weekly case rates per 100,000.

[...] The vertical black bar indicates the null value of 1.0. Each variable type is delineated by the shaded regions.

Supplemental Tables

Table S1. Performance of different models. In the marginal analysis, 5 models for 5 groups of risk factors are tested. In the multi-risk factor analysis, 11 models with all possible combinations of 4 groups of risk factors are tested (i.e., excluding models with only one group tested in the marginal analysis). All models adjusted for surveillance. Note that although class size was identified as a significant risk factor in the marginal analysis, further including this variable in the multi-risk models gave impossible values for parameter estimates; we therefore excluded class size in the multi-risk model analysis. The best-performing model with the lowest AIC is bolded.

Table S2. Sensitivity analysis for both cases and deaths, with and without a 1-month extension to the study time period.
[...] the mean odds ratio estimates and horizontal black bars show the 95% confidence intervals. The vertical black bar indicates the null value of 1.0. Each variable type is delineated by the shaded regions.