Are environmental risk factors for current wheeze in the International Study of Asthma and Allergies in Childhood (ISAAC) phase three due to reverse causation?

Abstract
Background: Phase Three of the International Study of Asthma and Allergies in Childhood (ISAAC) measured the global prevalence of symptoms of asthma in children. We undertook comprehensive analyses addressing risk factors for asthma symptoms in combination, at both the individual and the school level, to explore the potential role of reverse causation due to selective avoidance or confounding by indication.
Objective: To explore the role of reverse causation in risk factors of asthma symptoms.
Methods: We compared two sets of multilevel logistic regression analyses, using (a) individual level exposure data and (b) school level average exposure (ie prevalence), in two different age groups. In individual level analyses, reverse causation is a possible concern if individual level exposure statuses were changed as a result of asthma symptoms or diagnosis. School level analyses may suffer from ecologic confounding, but reverse causation is less of a concern because individual changes in exposure status as a result of asthma symptoms would only have a small effect on overall school exposure levels.
Results: There were 131 924 children aged 6‐7 years (2428 schools, 25 countries) with complete exposure, outcome and confounder data. The strongest associations in individual level analyses (fully adjusted) were for current paracetamol use (odds ratio = 2.06; 95% confidence interval 1.97‐2.16), early life antibiotic use (1.65; 1.58‐1.73) and open fire cooking (1.44; 1.26‐1.65). In school level analyses, these risk factors again showed increased risks. There were 238 586 adolescents aged 13‐14 years (2072 schools, 42 countries) with complete exposure, outcome and confounder data. The strongest associations in individual level analyses (fully adjusted) were for current paracetamol use (1.80; 1.75‐1.86), cooking on an open fire (1.32; 1.22‐1.43) and maternal tobacco use (1.23; 1.18‐1.27). In school level analyses, these risk factors again showed increased risks.
Conclusions & clinical relevance: These analyses strengthen the potentially causal interpretation of previously reported individual level findings, by providing evidence against reverse causation.

Waikato Medical Research Foundation, Glaxo Wellcome New Zealand, the NZ Lottery Board and Astra Zeneca New Zealand. Glaxo Wellcome International Medical Affairs supported the Regional Coordination and the ISAAC International Data Centre (IIDC). Without help from all of the above, ISAAC would not have given us all these results from so many countries. The research leading to these results has partially been supported by the United [...]


KEYWORDS
asthma, environment and hygiene hypothesis, epidemiology

| INTRODUCTION
Asthma is becoming increasingly important as a childhood disease on a global basis. 1 The Global Asthma Report 2018 estimated that as many as 339 million people have asthma and that the burden of disability is high. 2 The International Study of Asthma and Allergies in Childhood (ISAAC), using a simple and inexpensive standardized methodology, [3][4][5] has documented a wide variation of asthma prevalence in different parts of the world, 6,7 and a number of papers have been published addressing the findings for individual risk factors, with several associations observed (see "Variables" below). [8][9][10][11][12][13][14][15][16][17][18][19][20] However, these risk factors have not previously been considered together within the same analysis, so it is possible that some of the observed associations may be at least partially due to confounding by other risk factors.
The current paper represents the first comprehensive analyses to address these risk factors together, in order to fill this gap in the current knowledge. We have done this in two ways. Firstly, we have conducted a "standard" analysis using the individual level exposure data for each risk factor (eg maternal smoking). However, for some risk factors the cross-sectional nature of the study means that such analyses may be subject to "reverse causation" if individual level exposure statuses were changed as a result of asthma symptoms or diagnosis. This may occur due to selective avoidance (eg if the child's mother stops smoking because the child has developed asthma) or "confounding by indication" (eg if exposures such as paracetamol or antibiotics are taken in response to symptoms which are related to the subsequent development of asthma).
As schools were the level of sampling in ISAAC, we have therefore conducted a second set of analyses using the school level average reported exposure (ie the prevalence; rather than the reported individual exposure) to each risk factor to attempt to avoid or minimize such biases. School level analyses may suffer from ecologic (community-level) confounding, but reverse causation is perhaps less of a concern because individual changes in exposure status as a result of asthma symptoms would only have a small effect on overall school exposure levels. It is therefore of considerable interest to compare the individual level and school level analyses.
If reverse causation due to confounding by indication was exerting a major influence on the individual level associations, we would expect the associations to be much reduced at the school level. Conversely, if there was reverse causation due to selective avoidance, we would expect a stronger association at the school level, although this could also be due to contextual factors operating at the school level. Consistency of findings at the two levels thus provides indirect evidence against reverse causation and against strong contextual factors.
Biases may differ in different parts of the world. For example, breastfeeding is more strongly associated with socio-economic status in high-income countries than in low- and middle-income countries, 21 so there is greater potential for confounding by socio-economic status in the former. We therefore additionally conducted analyses stratified by country-level affluence to examine the extent to which associations and biases differed.

| Study
ISAAC Phase Three methods have been described in detail elsewhere 4 and will be summarized briefly here. ISAAC Phase Three is a multi-centre, multi-country, cross-sectional study of two age groups of schoolchildren (6-7-year-old children and 13-14-year-old adolescents) chosen from a random sample of schools in a defined geographical area. 3,4 The Phase Three survey took place in 2000-2003 and included two standardized questionnaires. The first obtained data on symptoms of asthma, rhinoconjunctivitis and eczema and was identical to that used in Phase One of ISAAC. 6,22 The second, the environmental questionnaire, obtained data on a range of possible risk factors for the development of asthma and allergic disorders. 8 The questionnaires can be found on the ISAAC website (http://isaac.auckland.ac.nz).

| Variables
We considered the outcome of wheeze in the last 12 months, defined by a positive response to the question "Has your child/have you had wheezing or whistling in the chest in the past 12 months?" In many countries in the world, we find that most asthma (based on symptoms) has not been diagnosed, which is why ISAAC is based on symptoms. The ISAAC symptoms questionnaire validates well against doctor-diagnosed asthma. 23 The environmental questionnaires in the two age groups did not contain identical questions, so it was not possible to examine the same set of potential risk factors in each age group. In addition, we restricted our analyses to the risk factors which had shown associations with wheeze in the last 12 months in previous analyses at the individual level. For the younger age group, we [...]
Most of the above risk factors were parameterised as binary variables from "yes/no" questions in the environmental questionnaire.
The exceptions were as follows: paracetamol use in the past 12 months (at least once per month vs less than once per month), truck traffic (seldom or more frequently vs never), fast food consumption (once per week or more vs less than once per week), television viewing (at least 1 hour per day vs less than 1 hour per day) and birth weight (less than 2.5 kg vs at least 2.5 kg). Full definitions are in Table S1.
Sex was self-reported as male/female, and the highest level of maternal education was recorded as primary, secondary, tertiary or missing/not stated.
Gross National Income (GNI) as of 2002 was obtained from the World Bank website 25 where available, with gaps filled by the CIA World Factbook. 26 Countries were classified as "affluent" or "non-affluent" using a 2001 GNI value of US$9205 per capita as a cut-off, which separates high-income countries from low- and middle-income countries. 27

| Statistical analyses
To be included in the analysis for a particular age group, centres had to include at least 1000 individuals and to have a response rate of >60% for children and >70% for adolescents. Analyses were conducted separately in the two age groups. Within each age group, schools with fewer than 10 individuals were excluded from the analysis.
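The inclusion rules above can be written as a short filter. The following is a minimal sketch only, not the study's actual data-processing code: the function names, the record layout (a dictionary with a `school` key) and the treatment of response rates as proportions are assumptions made for illustration. The text does not state whether school size was counted before or after exclusions for missing data, so the sketch simply counts the records it is given.

```python
from collections import Counter

# Thresholds taken from the text; the names are illustrative.
MIN_CENTRE_SIZE = 1000                              # "at least 1000 individuals"
MIN_RESPONSE_RATE = {"6-7": 0.60, "13-14": 0.70}    # strict ">" in the text
MIN_SCHOOL_SIZE = 10                                # schools with fewer are excluded


def centre_eligible(n_participants, response_rate, age_group):
    """Return True if a centre meets the data quality criteria for an age group."""
    return (n_participants >= MIN_CENTRE_SIZE
            and response_rate > MIN_RESPONSE_RATE[age_group])


def drop_small_schools(records):
    """Keep only records from schools with at least MIN_SCHOOL_SIZE records."""
    sizes = Counter(r["school"] for r in records)
    return [r for r in records if sizes[r["school"]] >= MIN_SCHOOL_SIZE]


# Hypothetical example: one school of 10 is kept, one school of 9 is dropped.
example = [{"school": "A"}] * 10 + [{"school": "B"}] * 9
kept = drop_small_schools(example)
```

Applying the centre-level check first and the school-level check second mirrors the order in which the criteria are described in the text.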
All analyses were conducted using mixed effect (multilevel) logistic regression models. The four-level hierarchical nature of the data (individuals [level 1], schools [level 2], centres [level 3] and countries [level 4]) was acknowledged by allowing random intercepts at levels 2, 3 and 4 in individual level models and by including random intercepts at levels 3 and 4 in school level models. Centres were self-selected, whereas schools were randomly sampled within centres, making school the preferred level of analysis. Sex and maternal education were adjusted for as individual level confounders in all models.
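For orientation, the two model structures described above can be written out as follows. This is a sketch in our own notation, not taken from the original analysis code: the symbols are illustrative, and the assumption of normally distributed random intercepts is the conventional one for such models rather than something stated in the text. Here $y_{ijkl}$ is wheeze for individual $i$ in school $j$, centre $k$ and country $l$; $x_{ijkl}$ is the individual exposure; $\bar{x}_{jkl}$ is the school level prevalence of that exposure; and $z_{ijkl}$ holds sex and maternal education.

```latex
% Individual level model: random intercepts for school, centre and country
\operatorname{logit}\Pr(y_{ijkl}=1)
  = \beta_0 + \beta_1 x_{ijkl} + \gamma^{\top} z_{ijkl} + u_{jkl} + v_{kl} + w_{l}

% School level model: random intercepts for centre and country only
\operatorname{logit}\Pr(y_{ijkl}=1)
  = \beta_0 + \beta_1 \bar{x}_{jkl} + \gamma^{\top} z_{ijkl} + v_{kl} + w_{l}

% Assumed distributions of the random intercepts
u_{jkl} \sim N(0,\sigma_u^2), \quad
v_{kl} \sim N(0,\sigma_v^2), \quad
w_{l} \sim N(0,\sigma_w^2)
```

In both cases the reported odds ratio is $\exp(\beta_1)$.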
Three different modelling approaches were used: (a) individual level, (b) school level and (c) hybrid fixed effects. 28 However, results from the hybrid fixed effect models were very similar to those from the individual level and school level models, so they are not discussed further.
Individual level models related the individual level outcome to each individual level risk factor within schools. School level models related the individual level outcome to the school level average exposure (ie prevalence) of each risk factor. In these models, the estimated OR corresponding to the school level prevalence of the risk factor can be interpreted as the effect on the individual outcome of attending a school where all children are exposed compared to attending a school where no one is exposed.
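As an illustration of how a school level exposure can be derived from individual responses, and of how the resulting OR can be re-expressed, a minimal sketch follows. The function names, the record layout and the example values are hypothetical and are not study data. The rescaling relies only on the exposure entering the model linearly on the log-odds scale, so that an OR for a change in prevalence from 0 to 1 corresponds to an OR raised to the power 0.1 for a 10 percentage point difference.

```python
from collections import defaultdict


def school_prevalence(records, exposure):
    """Proportion of children with a 0/1 exposure in each school."""
    totals = defaultdict(lambda: [0, 0])          # school -> [n exposed, n children]
    for r in records:
        totals[r["school"]][0] += r[exposure]
        totals[r["school"]][1] += 1
    return {school: exposed / n for school, (exposed, n) in totals.items()}


def attach_school_prevalence(records, exposure):
    """Return copies of the records with the school level prevalence added."""
    prev = school_prevalence(records, exposure)
    key = exposure + "_school_prev"
    return [{**r, key: prev[r["school"]]} for r in records]


def rescale_or(or_zero_to_one, delta):
    """OR for a change of `delta` in prevalence, given the OR for a 0 -> 1 change."""
    return or_zero_to_one ** delta


# Hypothetical records (not study data).
records = [
    {"school": "A", "maternal_smoking": 1},
    {"school": "A", "maternal_smoking": 0},
    {"school": "B", "maternal_smoking": 0},
]
rows = attach_school_prevalence(records, "maternal_smoking")
per_10_points = rescale_or(2.0, 0.1)   # illustrative OR of 2.0 for a 0 -> 1 change
```

Re-expressing the OR in this way can help when comparing schools, since few schools are likely to sit at the extremes of 0% or 100% exposure.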
Within each approach, models were fitted for: (a) each exposure of interest using the sub-sample who had data present for wheeze, sex, maternal education and the given exposure (the "maximum sample"), (b) each exposure of interest using the sub-sample who had data present for wheeze, sex, maternal education and all exposures of interest (the "common sample") and (c) each exposure of interest mutually adjusted using the sub-sample who had data present for wheeze, sex, maternal education and all exposures of interest (the "common sample").
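The distinction between the "maximum sample" and the "common sample" can be made concrete with a small sketch. The field names and the use of `None` to mark a missing value are assumptions for illustration. In particular, the text records maternal education with a "missing/not stated" category, and how that category was handled when defining the samples is not specified here.

```python
# Variables required in every model, as listed in the text (names illustrative).
CORE_FIELDS = ("wheeze", "sex", "maternal_education")


def has_all(record, fields):
    """True if every listed field is present (not None) in the record."""
    return all(record.get(f) is not None for f in fields)


def maximum_sample(records, exposure):
    """Records with the core fields and the one exposure being analysed."""
    return [r for r in records if has_all(r, CORE_FIELDS + (exposure,))]


def common_sample(records, exposures):
    """Records with the core fields and every exposure of interest."""
    return [r for r in records if has_all(r, CORE_FIELDS + tuple(exposures))]


# Hypothetical records (not study data): the second lacks one exposure.
records = [
    {"wheeze": 1, "sex": "F", "maternal_education": "secondary",
     "paracetamol": 1, "cat": 0},
    {"wheeze": 0, "sex": "M", "maternal_education": "primary",
     "paracetamol": 0, "cat": None},
]
n_max_paracetamol = len(maximum_sample(records, "paracetamol"))
n_common = len(common_sample(records, ["paracetamol", "cat"]))
```

By construction the common sample is a subset of every exposure-specific maximum sample, which is what allows models (b) and (c) to be compared on identical individuals.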
The extent of collinearity in the mutually adjusted models was examined by comparing the standard errors in the mutually adjusted model and the minimally adjusted model fitted to the same sub-sample. 29 There was no evidence of substantial collinearity.
Additionally, we ran the fully adjusted analyses separately for "affluent" and "non-affluent" countries. We then separately tested for effect modification of each risk factor by country-level affluence.
Analyses were conducted using Stata version 14. 30

| 6-7 year olds
The 6-7-year-old participants included 221 280 children from 75 centres which met the initial data quality criteria (at least 1000 children and a response rate of >60%). Of these, 212 480 children (from 2903 schools, 75 centres, 32 countries) were from schools with at least 10 children and had data present for wheeze, sex, maternal education and at least one of the exposures of interest, so contributed to the analyses for one or more exposures (the "maximum sample"), with 131 924 children (from 2428 schools, 64 centres, 25 countries) having data present for all analysis variables (the "common sample"). See the data flowchart (Figure 1) for further details. Individual- and school level summary statistics are presented in Table S2 for the maximum sample and in Table 1 for the common sample.
Minimally adjusted associations in the common sample were broadly similar to those in the maximum sample (Tables 2 and S3).
The strongest associations in the fully adjusted individual level analyses were for current paracetamol use (OR = 2.06, 95% CI 1.97-2.16), early life antibiotic use (1.65; 1.58-1.73) and open fire cooking (1.44; 1.26-1.65). When using the school level prevalence (Table S5) [...]

| 13-14 year olds
The 13-14-year-old participants included 362 048 adolescents from 122 centres which met the initial data quality criteria (at least 1000 adolescents and a response rate of >70%). Of these, 350 915 adolescents (from 2511 schools, 122 centres, 54 countries) were from schools with at least 10 adolescents and had data present for wheeze, sex, maternal education and at least one of the exposures of interest, so contributed to the analyses for one or more exposures (the "maximum sample"), with 238 586 adolescents (from 2072 schools, 99 centres, 42 countries) having data present for all analysis variables (the "common sample"). See the data flowchart (Figure 2) for further details. Individual- and school level summary statistics are presented in Table S2 for the maximum sample and in Table 1 for the common sample.
Minimally adjusted associations in the common sample were broadly similar to those in the maximum sample (Tables 2 and S3).
The strongest associations in the fully adjusted individual level analyses were for current paracetamol use (1.80; 1.75-1.86), cooking on an open fire (1.32; 1.22-1.43) and maternal tobacco use (1.23; 1.18-1.27).
In the analyses stratified by country-level affluence (Tables S4-S5), there was evidence (P < 0.001) at the individual level that paracetamol use in the last 12 months was more strongly associated with wheeze in affluent countries (1.97; 1.85-2.09) than in non-affluent countries (1.75; 1.69-1.82) (Table S4). There was no evidence of effect modification at the school level (Table S5).

A number of papers have been published describing the association of asthma symptoms with individual level risk factors in ISAAC Phase Three. [8][9][10][11][12][13][14][15][16][17][18][19][20] Here, we present the first comprehensive analyses to address these risk factors together in a multilevel framework and compare the individual level and school level findings to assess the possibility of various types of bias and confounding.
The associations we present here at the individual level (Table 2) generally confirm the results for recent wheeze in published ISAAC papers. However, the ORs do not correspond exactly with previous publications due to the following differences in analytical approach.
Firstly, the ISAAC survey methodology involved cluster sampling (sampling schools, then selecting all children of the appropriate age within each selected school). In previous publications, no adjustment was made for within-school clustering of risk factors. In our multilevel models, inclusion of school as a random intercept adjusts more formally for intra-class correlation of both symptoms and exposures. This is a strength of the multilevel modelling approach.
Secondly, previous ISAAC Phase Three publications have adjusted for sex but not for socio-economic status at the individual level, whereas we included individual level maternal education as a [...] year-olds were excluded from the fully adjusted model due to incomplete risk factor information. However, comparison of results from the maximum sample with those from the common sample shows that findings were generally very similar for the subset of respondents with complete covariate data, suggesting that valid conclusions can be drawn from the "common sample" dataset.
It should also be noted that, whilst early life exposures are less prone to reverse causality than current exposures, recall errors (which may be biased with respect to disease status) are perhaps more likely to have affected early childhood exposures in an interview conducted when the child was 6-7 years old.
An innovative feature of this paper is the presentation of associations of school level prevalence of risk factors with individual level wheeze. This type of population-level analysis is potentially vulnerable to the "ecological fallacy," 31,32 but this concept has several components, of which only one (ecological or population-level confounding) applies in our study. We avoid other forms of ecological fallacy because the population-level exposure (school level prevalence of each risk factor) was derived by aggregating individual level data, so the exposure measure relates directly to the schools actually participating in the study (not, for instance, a city-wide or national average) and to the children for whom questionnaire data were returned (not, for instance, children of a different age or social group in the same area). We regard these as strengths of the multilevel analytical approach.

TABLE 1 Summary statistics for variables and their prevalence in subjects who had data present for wheeze, sex, maternal education and all exposures of interest (the "common sample")
Columns: Prevalence (%); Median prevalence (%); Prevalence IQR (%)
6-7 y, Wheeze in the last 12 mo: 9.8; 9.2; (4.7, 15.3)

The school level associations shown in Table 2 generally maintained their direction on mutual adjustment, but the magnitudes of the ORs (comparing the minimally adjusted and fully adjusted results) were less stable than those of the corresponding individual level associations (also in Table 2). Nevertheless, in the younger age group, significant school level associations were observed in the fully adjusted model with low birthweight, antibiotics in infancy, farm animal exposure in the first year, frequent fast food and television exposure, maternal smoking (but not paternal smoking) and current paracetamol use (but not paracetamol use in first year of life). In the older age group, significant school level associations were also observed with television viewing, maternal smoking and current paracetamol use.
The observed consistency of findings at the two levels provides indirect evidence against reverse causation and against strong contextual factors. Furthermore, since the spectrum of unmeasured confounders is likely to be different at the individual and population levels, consistency of results between the two levels provides additional reassurance against unmeasured confounding. Therefore, on both counts, cross-level consistency strengthens the evidence for a causal relationship at the individual level.
Such cross-level comparisons (Table 2) show a close similarity in ORs at the individual level and school level for current paracetamol exposure and wheeze in each age group. This is of particular interest as a causal interpretation of this association has been disputed, due to the possibility of reverse causation (due to confounding by indication for paracetamol use and wheezing in infancy, or due to aspirin avoidance by older children with asthma or their families).
ISAAC Phase Three findings for paracetamol in the first year of life have also been debated. 33

TABLE 2 Effects of individual- and school level exposures on wheeze in the last 12 months for subjects who had data present for wheeze, sex, maternal education and all exposures of interest (the "common sample"). Mixed logistic regression models with random intercepts at the school, centre and country levels. Additionally adjusted for all other variables in the table.

Another risk factor which might be prone to reverse causation (due to pet avoidance in allergic families) is cat exposure in infancy.
Here, the school level association is somewhat stronger than the individual level association in the minimally adjusted models, as would be predicted from avoidance bias. However, after full adjustment the estimated associations are very similar.

In the older age group, we found associations with paternal tobacco smoking which differed in direction between the individual- and school level analyses. This was a surprising finding which we have been unable to explain satisfactorily.
Finally, stratified analyses identified some risk factors whose effects seemed to differ by country-level affluence (Tables S4-S5). In the younger age group, current paracetamol use was consistently (ie in both individual- and school level analyses) found to be a stronger risk factor for wheeze in affluent countries relative to non-affluent countries. Cat and farm animal exposure in the first year of life were found to be stronger risk factors in non-affluent countries (where there is perhaps less avoidance bias) in the individual level analysis. In the school level analysis, the affluence level-specific associations similarly differed, though there was no statistical evidence for effect modification. In the older age group, current paracetamol use was again found to be a stronger risk factor for wheeze in affluent countries relative to non-affluent countries, though only in the individual level analysis.
In conclusion, these multilevel analyses generally confirm previously reported child-level findings for wheeze in ISAAC but, importantly, they provide additional evidence in favour of direct (rather than reverse) causation. This is the first comprehensive analysis of school level associations, which may be particularly relevant to public health policies that aim to prevent asthma symptoms by modifying environment, lifestyle or medication use among whole communities, rather than individual children or their families.