Quantifying the impact of the modifiable areal unit problem when estimating the health effects of air pollution

Air pollution is a major public health concern, and large numbers of epidemiological studies have been conducted to quantify its impacts. One study design used to quantify these impacts is a spatial areal unit design, which estimates a population‐level association using data on air pollution concentrations and disease incidence that have been spatially aggregated to a set of nonoverlapping areal units. A major criticism of this study design is that the specification of these areal units is arbitrary, and if one changed their boundaries then the aggregated data would change despite the locations of the disease cases and the air pollution surface remaining the same. This is known as the modifiable areal unit problem, and this is the first article to quantify its likely effects in air pollution and health studies. In addition, we derive an aggregate model for these data directly from an idealized individual‐level risk model and show that it provides better estimation than the commonly used ecological model. Our work is motivated by a new study of air pollution and health in Scotland, and we find consistent significant associations between air pollution and respiratory disease but not for circulatory disease.

know whether, within a population, it is the same individuals who exhibit disease and have the highest air pollution exposures.
Spatial ecological studies partition the study region into K nonoverlapping areal units such as electoral wards or census tracts and utilize data on total disease incidence and average air pollution concentrations for the populations living in each unit. Data on air pollution concentrations are available as point-level measurements and/or grid-level modeled concentrations, and average concentrations for each areal unit have been estimated using simple averaging (Lee & Sarran, 2015) or spatial prediction techniques such as block Kriging (Zhu, Carlin, & Gelfand, 2003). A number of different statistical issues have been addressed in the literature in relation to modeling air pollution and its health effects, including different approaches for (i) spatiotemporal pollution prediction (Gilani, Berrocal, & Batterman, 2019;Nicolis, Díaz, Sahu, & Marín, 2019); (ii) estimating individual-level exposures (Clifford et al., 2019); (iii) the impacts of preferential sampling of pollution (Lee, Szpiro, Kim, & Sheppard, 2015); and (iv) allowing for errors in exposure variables (Keller & Peng, 2019;Strand, Sillau, Grunwald, & Rabinovitch, 2015).
However, the specific focus of this article is the spatial structure of the disease data, which are only available as the total number of disease cases for each of the K areal units, and these units have typically been designed to minimize the within-area variation in socioeconomic deprivation, respect physical barriers such as rivers, and have similar population sizes. The specification of these units is crucial for epidemiological air pollution studies, because it is the spatial variation in total disease incidence and average air pollution concentrations between these areal units that drives the estimated pollution-health relationship. In reality however both these variables vary continuously over space, and partitioning the study region into a different set of disjoint areal units, potentially changing the number of units K, will result in a different ecological-level data set for analysis despite the underlying study population remaining unchanged.
This phenomenon is known as the modifiable areal unit problem (MAUP; Openshaw, 1984) and is a major criticism of studies of this type, which reduces the strength of the evidence that such studies provide. The MAUP arises when trying to make inference about spatially continuous processes using data aggregated to a single set of areal units. There has been extensive research into the MAUP in a variety of fields dating back to Openshaw and Taylor (1979), and a review is given by Manley (2014). However, there has been limited research to date on the impact of the MAUP in air pollution and health studies, with Parenteau and Sawada (2011) finding that the estimated impact of NO 2 on respiratory morbidity in Ottawa varied across three different sized spatial scales. This study is rather limited however because it only estimated the impact of the MAUP on a single data set and did not attempt to identify the factors in a data set that lessen or increase the impact of the MAUP. In addition, it focused exclusively on changing the number and sizes of the spatial units, which is known as the scaling effect (Openshaw, 1984). It did not assess the zoning effect, where the number and average size of the units remains unchanged but the geographical specification of these units does change. We note in passing that the MAUP is not the only spatial rescaling problem that researchers have addressed, with Aregay et al. (2018) developing models for multiple scales of small-area disease data.
This article however is the first to fully tackle the MAUP in air pollution and health studies and is motivated by the following two aims. The first is to quantify the impact that the scaling and zoning effects of the MAUP have on air pollution and health effects estimated from areal unit studies, so that the validity of the effects estimated in existing studies can be better understood. The second is to develop a model that reduces the impact of the MAUP, by deriving an appropriate areal unit-level model directly from an idealized individual-level model. These aims are addressed in the context of both constant and varying air pollution and health effects, the latter varying over areal units. We estimate varying effects using varying coefficient models (Hastie & Tibshirani, 1993), which allows the air pollution and health effect to vary depending on external variables known as effect modifiers. These methodological developments are motivated by a new study of air pollution and health in Scotland, and the varying coefficient models are used to quantify whether air pollution disproportionally affects poorer communities, thus widening health inequalities (NHS Health Scotland, 2016) between rich and poor communities. A description of the study data is presented in Section 2, while our methodology is described in Section 3. A simulation study is presented in Section 4, while the result from the motivating study are presented in Section 5. The article concludes with a discussion in Section 6.

MOTIVATING STUDY
Our motivating study is based in Scotland, which while having comparatively low-ambient air pollution concentrations, nonetheless has 38 officially declared Air Quality Management Areas (AQMA, http://www.scottishairquality.co.uk/laqm/ aqma), where pollution levels have or are likely to breach European Union (EU, European Parliament, 2008) pollution limits. Partially as a result of the failure to reduce pollution in these AQMA, the Scottish Government has undertaken a review of its 2015 Cleaner Air for Scotland (CAFS) strategy (http://www.gov.scot/Resource/0048/00488493.pdf), with a view to ensuring that its successor strategy is more evidence based. Providing more recent and more comprehensive epidemiological evidence about the impact of air pollution in Scotland specifically is therefore a necessity to support this policy review. However, there is a lack of recent Scottish-specific research study evidence, with only Lee (2018) and Lee, Robertson, Ramsay, Gillespie, and Napier (2019) presenting studies using data later than 2011 (up to 2016). However, these studies use data over short temporal durations (10 months and 2 years, respectively) and focus exclusively on physical rather than mental health. The study reported here provides a more comprehensive and robust picture of air pollution health effects, by using data on both physical and cognitive health (dementia) for the 7-year period from 2011 to 2017. The study region of mainland Scotland is partitioned into K = 1,249 intermediate zones (IZ), which are a Scottish Government created geography used for the allocation, display and analysis of small-area-level statistics, having an average population of around 4,000 people.

Disease outcomes
In this study we quantify the impact of air pollution on both physical and cognitive health. While the links between physical health and air pollution are well known (see Bellini, Baccini, Biggeri, & Terracini, 2007;Lee et al., 2019), the association with cognitive health is much less well researched (see Chen et al., 2017;Oudin et al., 2016). For physical health we focus on respiratory (ICD-10 codes J00-J99) and circulatory (ICD-10 codes I00-I99) hospitalizations and deaths, because these diseases have been most strongly linked to air pollution by existing studies, and because they are among Scotland's leading causes of deaths (see http://www.gov.scot/Topics/Statistics/Browse/Health/TrendMortalityRates). For cognitive health we focus on dementia (ICD-10 codes F01 and F03) deaths, because the studies listed above have shown evidence of associations with air pollution in other countries. Finally, we also consider total nonaccidental mortality, as this is a common endpoint used in existing epidemiological air pollution studies. For each disease outcome we have data on the total numbers of disease events from the populations living within each IZ over the 7-year study duration, yielding a set of spatially referenced counts. In the case of the hospitalization data they relate to the total number of events (not just first events for an individual) where the primary diagnosis was the disease type in question. The total counts for each health outcome grouping are summarized in the top panel of Table 1. This identifies that hospitalizations are more common than deaths as would be expected, and that dementia is the rarest cause of death.
The observed numbers of disease events (Y ) will depend on the population size and its age-sex structure in each IZ. This is accounted for by computing the expected numbers of disease events (e) using indirect standardization. Specifically, the number of people living in each IZ in each sex-specific 5-year age strata is multiplied by the Scotland-wide disease rate for that strata, before summing over strata to compute the expected number of disease events. Based on these data the exploratory measure of disease risk is the standardized mortality/morbidity ratio (SMR = Y /e), which is summarized in the middle panel of Table 1. The table shows a wide variation in SMRs across the IZs in mainland Scotland, with the IZs that exhibit the maximum SMRs for most diseases being between two and three times higher than the national average (an SMR of 1). The exception to this is dementia mortality that has a maximum SMR of 6.26, which is due to the uneven distribution of cases across Scotland as well as the small numbers of deaths attributed to this disease leading to an unstable SMR. A map of the SMR for respiratory hospitalizations is displayed in the left panel (a) of Figure 1. This shows that areas with elevated risks are mainly in and around the city of Glasgow, a city known to have some of the worst health in the UK, with marked levels of health inequality.

Air pollution concentrations
Measured air pollution concentrations are available from http://www.scottishairquality.scot, but the network of monitors is highly dispersed and is poorly represented at the small area IZ scale used for this study. Therefore in common with existing studies (e.g., Lee & Sarran, 2015) we utilize annual average modeled concentrations from an atmospheric dispersion model. Specifically, we use concentrations estimated by the pollution climate mapping (PCM) model developed for the Department for the Environment, Food and Rural Affairs (DEFRA), which are freely available from https://uk-air. defra.gov.uk/data/pcm-data. This model estimates concentrations on a 1-km 2 square grid, which are spatially misaligned with the irregularly shaped intermediate zones that the disease data relate to. Concentrations of nitrogen dioxide (NO 2 ) and coarse (PM 10 ) and fine (PM 2.5 ) particulate matter are available (measured in μg m −3 ). These pollutants were chosen because they are the ones specified as exceeding existing limit values in all of Scotland's air quality management areas except one. These yearly concentrations are averaged over the 7-year study period, and the distributions of the resulting concentrations across the IZs are summarized in the bottom panel of Table 1. The table shows that the spatial distributions of the pollutants are skewed to the right, which is because most of the 1-km 2 grid squares in Scotland are rural and hence have a very low population density. The spatial distribution of PM 10 is displayed in panel (b) of Figure 1, which shows it is highest in the cities of Glasgow and Edinburgh as well as around the east and southeast coasts, the latter due to transboundary pollution from continental Europe and England, respectively. The three pollutants are highly correlated as expected, with correlations ranging between 0.671 (between NO 2 and PM 10 ) and 0.961 (between PM 2.5 and PM 10 ).

Covariate data
The main confounding factor in spatial studies of disease incidence is socioeconomic deprivation, in part because of its links to smoking prevalence (Kleinschmidt, Hills, & Elliott, 1995). Areas with more deprived populations exhibit worse health on average than more affluent ones. These inequalities in population health are well recognized in Scotland (NHS Health Scotland, 2016). Socioeconomic deprivation is measured in Scotland by the Scottish Index of Multiple Deprivation (SIMD, http://www.gov.scot/Topics/Statistics/SIMD), which is a composite index consisting of deprivation indicators in the domains of access to services, crime, education, employment, health, housing, and income. These are then weighted and combined to create the final index. We have access, at the IZ level, to the seven domain-specific indicators in 2016 as covariates for possible inclusion in our models, although obviously we will not use the health domain as our response variable is health related. The crime indicator has a single IZ with a missing value, which is imputed by computing the average value from geographically neighboring IZs (those sharing a common border). These domain-specific indicators of deprivation are correlated, with the highest pairwise correlations of 0.87 to 0.98 being observed between the income, employment, and education domains. In addition, we also have access to data on the average property price in each IZ, which is used as the effect modifier to assess whether the pollution-health effect depends on the relative affluence of an IZ, thus potentially widening health inequalities.

Exploratory analysis
To assess the extent of the residual spatial autocorrelation and overdispersion in the disease data after covariate adjustment, quasi-Poisson log-linear models were fitted to each disease outcome separately. The expected numbers of disease events was included as an offset term, and for the purposes of this exploratory analysis PM 10 was the only pollutant included in the model. As the grid-level pollution data are spatially misaligned with the IZ-level disease counts, simple averaging was used to spatially realign the pollution data to the IZ level as used by Lee and Sarran (2015). Specifically, each 1-km 2 grid square has an associated centroid (central point), and the estimated pollution concentration for an IZ is computed as the mean of the grid square concentrations whose centroids lie within the IZ. Any IZ that does not contain a grid square centroid is assigned the pollution concentration from the nearest grid square. As the income, employment and education domains of SIMD are highly correlated we only use one of these as a covariate to avoid collinearity problems. Based on the AIC the income domain provided the best fit for five of the six disease outcomes, so for consistency in our modeling we do not consider the education and employment domains any further. The remaining domain-specific SIMD indicators (access to services, crime, housing, and income) were used in each disease-specific model, and the residuals showed substantial overdispersion, with overdispersion parameters ranging between 2.4 (circulatory mortalities) and 16.9 (respiratory hospitalizations). The residuals also showed significant spatial autocorrelation, with p-values based on a Moran's I permutation test (Moran, 1950) all being less than .00001 except for Dementia mortality (p-value of .07).

Study design
The study region is partitioned into K nonoverlapping areal units  = { 1 , … ,  k }, and data on disease incidence and socioeconomic confounders are available for each unit over a fixed time interval. The disease incidence data for a single outcome (i.e., hospitalization or mortality) consist of Y() = (Y ( 1 ), … , Y ( K )) and e() = (e( 1 ), … , e( K )), which respectively denote the observed and expected numbers of disease cases in each areal unit. The latter is computed by indirect standardization and is given by is the number of people living in area  k who are in the jth age-sex strata, while r j is the nationwide risk of the single disease outcome under study for strata j for j = 1, … ,J. The matrix of socioeconomic deprivation confounding variables is denoted by . Modeled pollution data are not available at this areal unit level and are instead available on a much finer 1-km 2 square grid  = ( 1 , … ,  M ). These data are denoted by x() = (x( 1 ), … , x( M )) and are thus spatially misaligned with the aggregated disease and confounder data. Finally, we have modeled population size estimates P() = (P( 1 ), … , P( M )) from Reis et al. (2017) at the same 1-km 2 grid square resolution. In Section 3.2 we outline the standard ecological model that is commonly applied to these data, while in Section 3.3 we outline a more realistic model that should be less prone to the MAUP and ecological bias because it is aggregated from an idealized individual-level risk model.

The standard ecological model
Analysis of these data is typically conducted at the areal unit level { 1 , … ,  k }, because this is the smallest spatial scale at which disease incidence data are available. Thus, one typically aims to estimate the average pollution concentration in each areal unit, namely, x( k ) = ∫  k x(s)ds, where x(s) denotes the pollution concentration at a single-point location s. From the observed grid-level data this is most easily estimated by averaging the grid-level values x( i ) whose centroids where q( k ) denotes the number of grid square centroids that lie within  k . Then using this estimate the commonly used ecological regression model is given by (1) The confounder regression parameters = ( 1 , … , p ) are assigned independent zero-mean weakly informative Gaussian prior distributions with a large variance, to ensure the data play the dominant role in determining their values. The pollution-health effect parameter ( k ) is allowed to vary over areal units  k , and its prior specification is described in Section 3.4. The remaining term in the linear predictor () = ( ( 1 ), … , ( K )) is a vector of random effects, which account for any residual spatial autocorrelation in the disease data after adjusting for the covariates. Random effects models for spatial areal unit data are a well-researched topic, and conditional autoregressive (CAR) models, a special case of Gaussian Markov Random Fields (GMRF), are the most popular specification due to their computational efficiency resulting from their sparse precision matrices. These models are given by () ∼ N(0, 2 Q(W) −1 ), where the precision matrix Q(W) is a function of a K × K neighborhood or adjacency matrix W = (w kr ). This matrix determines the spatial autocorrelation structure of the random effects, and if w kr > 0 then ( ( k ), ( r )) are conditionally correlated, whereas if w kr = 0 then ( ( k ), ( r )) are conditionally independent. A binary specification is most often adopted, where w kr = 1 if areas ( k ,  r ) are spatially close and share a common border, and w kr = 0 otherwise. The models proposed by Besag, York, and Mollié (1991) and Leroux, Lei, and Breslow (2000) are the most popular in practice, and here we use the latter because a study by Lee (2011) showed it was preferable to the commonly used BYM model of Besag et al. (1991). This model is most often written in the form of the following univariate full conditional distributions ( k )| (− k ) (where (− k ) denotes all the random effects except ( k )): ( 2) Here is a spatial dependence parameter defined on the interval [0,1], with = 1 corresponding to strong spatial autocorrelation (the intrinsic CAR model proposed by Besag et al., 1991), while = 0 corresponds to spatial independence (i.e., ( k )| (− k ) ∼ N(0, 2 )). A noninformative uniform prior is assigned to on the unit interval, while a weakly informative conjugate inverse-gamma prior is assigned to 2 .

Aggregation of an individual-level risk model
The ecological regression model outlined above is ubiquitously used for modeling areal unit count data, but Wakefield and Shaddick (2006) have considered what an appropriate areal unit-level model would be if one began with an idealized individual-level risk model and aggregated it to the areal unit level. As this aggregated model is based on an individual-level model not susceptible to the MAUP, it should be less susceptible to the MAUP than the naive ecological model outlined above. Thus, we now derive such an aggregate model, and note that it is not an approximation to the naive ecological model (1), but instead an alternative to it that should be less susceptible to the MAUP. Following Wakefield and Salway (2001), let Y ij ( k ) denote the outcome of a single disease (e.g., respiratory hospitalization) for the ith individual in the jth age and sex strata (i.e., females, 0-4, females 5-9, etc.) in areal unit  k . The hospitalization and mortality outcomes from the motivating study are modeled separately here, just as they are in the standard ecological model outlined above. When modeling mortality outcomes Y ij ( k ) is binary, with Y ij ( k ) = 1 meaning the individual died during the study period while if Y ij ( k ) = 0 they did not. Conversely, when we are separately modeling hospitalization data then Y ij ( k ) denotes the (probably small) number of times that the individual was admitted to hospital during the study period. Then as the response variable has different sample spaces for these two outcome types, appropriate individual-level risk models for Y ij ( k ) will be slightly different. Specifically, appropriate individual-level risk models are: (3) A Bernoulli model is used for mortality outcomes because it is a binary event, while a Poisson model is used for hospitalization outcomes because one could be admitted to hospital multiple times during the study period. In the mortality model̃i j ( k ) is the probability of death and is modeled with a log-link function because mortality is a rare event, while in the hospitalization model̃i j ( k ) represents the estimated number of hospitalizations.
For both disease outcome models [z ij ( k ), x ij ( k )] are respectively individual-level covariates (confounders) and air pollution exposures for the ith individual from the jth age-sex strata living in areal unit  k . In addition, r j denotes the nationwide risk of disease for age-sex strata j to adjust for individual demographics and is used in the computation of e( k ) described above for the ecological model. The parameters [̃,̃( k )] are respectively confounder regression parameters and spatial random effects (to control for areal unit-level unmeasured spatial confounding), which are not necessarily equal to the corresponding parameters [ , ( k )] from the ecological model (1). Similarly, the regression parameter̃( k ) quantifying the effect of air pollution on disease risk is not necessarily equal to the value from the ecological model, that is, ( k ) ≠̃( k ). Furthermore, as the areal units have been designed to be homogeneous in terms of socioeconomic deprivation, we do not allow̃( k ) to further vary across individuals within an areal unit.
These individual-level models cannot be fitted because the only available disease data for each outcome are the areal unit-level total disease counts Y ( k ), which sum the individual disease outcomes over all n j ( k ) individuals in strata j and across all J strata in areal unit  k . Thus while this enforced aggregation returns us to the MAUP, its effects should be reduced if we build a model by aggregating (separately by outcome) these individual-level risk models, instead of using the ecological model outlined in Section 3.2. Therefore as the aggregated disease count for areal unit  k can be written as , its expectation based on either of the individual-level models in (3) (which is the same) is given by where S j ( k ) denotes the set of individuals in strata j and areal unit  k . Thus regardless of whether one models mortalities or hospitalizations, E[Y ( k )] from (4) is the same. However, the only data available for model fitting is , and thus we need to make some assumptions to be able to fit the above model. The first of these is that z ij ( k ) = z( k ) for all individuals in  k , because we only have areal unit-level confounders and not individual-level ones. This assumption simplifies the expected disease count to However, this model still cannot be fitted in its current form, because we do not know the set of individual pollution exposures x ij ( k ) for the ith individual in strata S j in areal unit  k . Therefore we need to approximate , the population averaged risk of pollution for people in strata j, using the gridded pollution and population data we do have. To do this we first assume that this population-level average risk does not differ by strata, that is, for all pairs of strata (j,h). Then we approximate this unknown population-level average risk by the population weighted average risk based on the known grid-level concentrations x( i ). This approach gives a larger weight to a grid square's concentrations if more people live in that square. Specifically, we make the approximation where P( k ∩  l ) is the estimated population in the intersection area  k ∩  l , and  l ∈  k denotes that grid square  l has a nonzero intersection with areal unit  k . We note that in the simplifying case that all the populations P( k ∩  l ) are equal for all l, we return to the model proposed by Wakefield and Shaddick (2006) albeit using modelled rather than monitored concentrations. The observed population data P( l ) are at the grid level, and we estimate the population in a grid square and areal unit intersection by P( k ∩  l ) = P( l ) a( k ∩ l ) a( l ) , where a( k ∩  l ) and a( l ) denote the area of  k ∩  l and  l , respectively. Thus, the intersection population P( k ∩  l ) is estimated as the proportion of the population in grid square  l that lies within  k . This assumes that population density is constant within a grid square, and these assumptions lead to the modified risk model Here e( k ) = ∑ J j=1 n j ( k )r j as defined in Section 3.1 and a( l ) = a( h ) as all grid squares are the same size. This model differs from that of Wakefield and Shaddick (2006) in that they did not have access to population data P( i ), and hence did not weight their pollution concentrations by estimated population density. In addition, they only considered constant pollution effects, that is,̃( k ) =̃. This mean model for Y ( k ) suggests that given the constraints on data availability, an appropriate aggregate model is where the random effects model is identical to that outlined in Section 3.2. A Poisson data likelihood model is assumed here because once spatially aggregated the disease variable Y ( k ) is a count of the number of events (hospitalizations or mortalities) from the population living in areal unit  k . Thus, the main difference between the naive ecological model (1) and the aggregate model (9) is that the former first averages over the gridded pollution concentrations within each areal unit to create a single pollution measure,x( k ), in the risk model, while the aggregate model averages the risks, exp[x l ( l )̃( k )], across all grid squares  l within each areal unit. We undertake a simulation study in the next section to illustrate which of these two competing models better estimates the effects of air pollution on human health.

Constant and varying coefficients models for̃( k )
We consider both constant and varying coefficients for̃( k ), with the former used because it gives a single overall effect size and is commonly used in existing studies. We use the later to quantify whether air pollution disproportionally affects less affluent communities, thus making it a cause of health inequalities. We describe our varying coefficient model in terms of the aggregate model parameter̃( k ), but note that we also use it for the ecological model parameter ( k ). The prior distributions for the constant and varying coefficient models are given below, where in both cases weakly informative Gaussian priors are specified for the unknown parameters.
For the varying coefficient model̃( k ) is assumed to be linearly related to an areal unit-level measure of socioeconomic deprivation or affluence m( k ), called an effect modifier. In initial analysis of the study data we investigated nonlinear relationships with affluence via cubic regression splines, but the results showed no evidence against linearity and hence this simpler model is presented here. This linear model is also consistent with the earlier work of Lee (2012).

Inference
Inference for both models is conducted in a Bayesian framework using Markov chain Monte Carlo (MCMC) simulation. The ecological model (1) and (2) can be implemented using the CARBayes package (Lee, 2013), while software in the form of an R implementation to run the aggregate model (9) with and without the varying coefficient effect is available at https://github.com/duncanplee/Aggregate-spatial-models. Also available in that repository is an R code file that runs a sample analysis using all four models (ecological and aggregate models with constant and varying pollution-health effects), as well as an R code file to run the simulation study presented below. Finally, the repository contains a subset of the data analyzed here, but the observed disease counts have been jittered to preserve confidentiality as they are not publicly available.

SIMULATION STUDY
This section presents a novel simulation study that is the first to answer the following key questions: (i) What impact does the scaling and zoning components of the MAUP have on air pollution and health effects? and (ii) Does the aggregate model (9) perform better than the naive ecological model (1)?

Data generation
Data are generated on a 71 × 71 square grid study region  comprising just over 5,000 grid squares (M = 5,041) and representing a pseudocontinuous spatial surface, and the data generated at this scale are subsequently spatially aggregated to observe the effect of the MAUP. For each simulated data set a population of around 3.5 million individuals are generated across this region, where the grid square population counts are denoted by P( i ). These population counts are generated on the square root scale from a spatially correlated multivariate normal distribution, which ensures that "cities" and "rural areas" are created that comprise geographically adjacent groups of grid squares that respectively have high and low populations. The square root scale is used to ensure that the distribution of population counts is skewed to the right when back transformed by squaring, because cities have much higher population counts than rural areas. The disease counts are generated from Y ( i ) ∼ Poisson(e( i ) ( i )), where the expected counts e( i ) = P( i ) depend on the population size and the disease prevalence denoted by . Grid-level disease risk ( i ) is modeled on the log scale by ln[ ( i )] = x( i ) + ( i ), where the regression parameter is fixed at = .1. The pollution covariate x( i ) is generated as either independent or spatially correlated, so that the impact of the MAUP on these two different types of covariates can be observed. Both pollution covariate types are generated from zero-mean multivariate normal distributions, with the independent covariate having an identity correlation matrix = I. The correlated covariate has a correlation matrix defined by the spatial exponential correlation function = exp(− D), where D is an M × M distance matrix between all pairs of grid squares and = 0.5 is the dependence parameter chosen so that the correlation between adjacent grid squares is 0.6. The random effects ( i ) are included to mimic the presence of spatially correlated unmeasured confounding and are also generated from a zero-mean multivariate normal distribution with an exponential correlation function with = 0.1 (adjacent grid squares have a correlation of 0.9). Once a grid-level data set has been generated it is spatially aggregated to K randomly generated nonoverlapping areal units, which is the type of data available in practice. This is achieved by randomly choosing K grid squares as anchor points, and then assigning the remaining grid squares to the areal unit that has the closest anchor point. The grid level observed and expected disease counts (Y ( i ), e( i )) are summed within areal units to obtain the aggregated counts (Y ( k ), e( k )), while the pollution covariate x( i ) is averaged across each areal unit to obtain x( k ). This spatial aggregation process is repeated Q = 20 times with different spatial aggregations, yielding Q differentially aggregated areal unit data sets for analysis from each grid-level data set. The variation in the regression parameter estimates {̂1, … ,̂Q} from these Q spatially aggregated data sets quantifies the size of the zoning effect component of the MAUP. Data are generated under 48 different scenarios based on all possible combinations of the following five factors, which assesses the effect of the MAUP to how the data were generated.
1. Scaling effect The number of areal units is set to K = 100 (on average 50 grid squares per areal unit), K = 500 (on average 10 grid squares per areal unit), or K = 1,000 (on average five grid squares per areal unit). 2. Disease prevalence The disease is relatively common e( i ) = 0.05P( i ) (one in 20 people have it) or rarer e( i ) = 0.01P( i ) (one in 100 people have it). 3. Pollution covariate correlation The pollution covariate is independent or correlated in space, which is determined by as described above. In each scenario C = 50 grid-level data sets are generated, each of which is then spatially aggregated in Q = 20 different ways, yielding 1,000 spatially aggregated data sets under each scenario. We fit two separate models to each of these 1,000 aggregated data sets in each scenario, the first of which is the naive ecological model (denoted Ecological) given by (1) that is applied to the aggregated data {Y ( k ), e( k ), x( k )}. The second is the aggregate model (denoted Aggregate) given by (9), which is applied to the aggregated disease data {Y ( k ), e( k )} and the grid-level pollution data {x( i )}, the latter mimicking the available grid-level pollution data from the motivating study. Inference for both models is based on 2,000 MCMC samples, which were obtained by running a Markov chain for 300,000 samples, removing the first 100,000 as the burnin period, and thinning the remaining samples by 100 to remove their autocorrelation.

Results
We present the results in two stages: the first quantifies the impact of the MAUP, while the second quantifies the accuracy with which the true pollution-health effect is estimated.

The impact of the MAUP
The impacts of the scaling and zoning components of the MAUP are presented in Table 2, which quantifies the variation in {̂1, … ,̂Q} from the Q = 20 spatially aggregated data sets relating to each grid-level data set. Specifically, the interquartile range (IQR) is computed for these Q = 20 estimates, and then the median of these IQRs is computed for the C = 50 grid-level data sets generated under each scenario. These average IQRs are presented in the table as percentages of the true value of = .1 to make the results interpretable, and the table shows a number of key findings. First, the median IQR is relatively low if you have at least 500 areal units and the pollution covariate is spatially correlated, with median IQRs less than 10% of the true value of in all cases. This suggests that the zoning component of the MAUP should have relatively little impact on the estimates from the motivating study, because there the pollution covariate is correlated and the number of IZs is over 1,000. Second, the impact of the zoning effect reduces as the number of areal units K increases (the scaling effect), which is not surprising because one has more data points with which to estimate and is getting closer to the grid (individual) level. Third, the estimates from the aggregate model almost always exhibit less variation than those from the ecological model, with percentage differences around 1% in most scenarios. Fourth, the remaining factors considered here all influence the magnitude of the zoning effect within the MAUP, with reductions in the zoning effect occurring if (i) the pollution covariate is spatially correlated with a high variation; (ii) the disease is common, and (iii) the level of residual spatial confounding is low. If the pollution covariate is spatially correlated then neighboring grid squares will have similar covariate values, hence different spatial aggregations of these grid squares will have similar sets of areal-level covariate values across the K areal units, resulting in similar estimated risks and thus reducing the zoning effect. As the variation in the pollution covariate increases there are greater contrasts in its areal-level values across the K areal units, leading to more precise estimation of and thus reducing the variation between estimates from different spatial aggregations. Similarly, as the disease prevalence increases there are more disease cases across the study region. This increases the size of, and the variation between, the areal unit-level disease counts, which provides more precise estimates of covariate effects and hence a reduction in the zoning effect between different spatial aggregations. Finally, as the amount of residual spatial confounding reduces the grid-level risks are more homogeneous between neighboring grid squares, meaning that the choice of spatial aggregation has less effect on the set of K areal-level average risks, resulting in more consistent estimation of (reduction of the zoning effect).

The accuracy of
The accuracy with which can be estimated is presented in Table 3, which displays root mean square errors (RMSE) of̂from all 1,000 aggregated data sets for each model and scenario. As before, the results are presented as percentages of the true value of = .1. The table shows that as long as the pollution covariate is spatially correlated and there are more than 500 areal units, as is the case in the motivating study, then estimation accuracy is good, with RMSEs less than 12% of the true value in all cases. Estimates from the aggregate model are almost always better than those from the ecological model, with average reductions in RMSE of around 1%. Increasing the number of areal units increases estimation accuracy (reduces the RMSE) markedly, as does having: (i) a pollution covariate that is spatially correlated with a high variation; (ii) a common disease with larger numbers of disease events, and (iii) a low level of residual spatial confounding.
TA B L E 2 Estimated impacts of the scaling and zoning effects in the MAUP from the simulation study for different scenarios Note: First, the interquartile range of {̂1, … ,̂Q} is computed that arise from the Q = 20 spatially aggregated areal unit data sets from each grid-level data set. Then the average (median) of these IQRs from the C = 50 grid-level data sets is presented in the table above to quantify the zoning effect. In all cases this median value is presented as a percentage of the true value of = .1.

Model description
Both the ecological and aggregate models are fitted to each disease outcome separately, to ensure that the estimated associations are not influenced by any cross correlations between diseases. Only a single pollutant is included in each model run because the three pollutants are highly correlated, which results in 18 = 6 × 3 disease and pollutant combinations. In each case the observed disease count variable, Y ( k ), is modeled by the expected number of disease cases, e( k ), as an offset in the model, one of the three pollutants, a vector of socioeconomic confounders, and the spatially correlated random effects. The socioeconomic confounders include the access to services, crime, housing, and income domains of the SIMD, with the other domains being removed due to collinearity as discussed in Section 2.4. The ecological and aggregate models are fitted first assuming a constant pollution-health effect to provide a comparison with existing studies, before relaxing that assumption and allowing the estimated effect size to vary across the IZs. In the latter case our aim is to estimate the extent to which the pollution-health effect is modified by the level of socioeconomic deprivation in an IZ. Our hypothesis is that there may be larger pollution-health effects in less affluent communities, which will increase the magnitude of health inequalities. To assess this hypothesis we allow the pollution-health effect̃( k ) to depend on the natural log of the average property price in an IZ. Property price was chosen because it was a measure of affluence that was not in the SIMD and hence not in the covariate part of the model. A natural log transformation was taken so that the variable was symmetric and did not contain large outliers that may affect the results. We assume a linear relationship̃( k ) =̃0 +̃1m( k ) between the pollution-health effect in areal unit  k ,̃( k ), and the log of the average property price in area  k , m( k ), because exploratory analyses of these data with nonlinear spline based relationships showed no evidence of nonlinearity. Inference for all models fitted to the study data is based on 6,000 MCMC samples generated from three parallel Markov chains, each of which was burnt in for 100,000 samples (by which point convergence was assessed to have been reached), before being run for a further 400,000 samples which were then thinned by 200 to remove their autocorrelation. The results for the constant pollution-health effects are presented in the next section, while the varying effects are presented in Section 5.3. Note: The relative risks relate to a 3 μg m −3 increase for NO 2 , a 1 μg m −3 increase for PM 2.5 , and a 2 μg m −3 for PM 10 , which are close to their spatial standard deviations. Significant effects are in bold.

Constant pollution-health effects
The estimated pollution-health effects from each model and pollutant-disease combination are presented in Table 4. All results are presented as relative risks for the following increases in each pollutant: NO 2 -3 μg m −3 , PM 2.5 -1 μg m −3 , and PM 10 -2 μg m −3 . These increases are chosen for the pollutants because they are close to their respective spatial standard deviations, thus making them realistic increases in concentrations that might occur. The main message from Table 4 is that air pollution has consistent significant effects on respiratory disease (hospitalizations and mortality), while no effects are observed for any of the other disease outcomes. For respiratory disease 11 out of the 12 relative risks (2 disease outcomes × 2 models × 3 pollutants) have 95% credible intervals wholly greater than 1, with point estimates ranging between a 2% and a 5.8% increased risk for the given increase in the pollutant. By contrast, there is consistently no effect observed for circulatory disease, with the one significantly negative association likely being due to the multiple testing inherent when estimating the 12 pollutant-disease effects. Similarly, there is no consistent association for dementia mortality although some of the point estimates are relatively large, and the lack of significance may be due to the wide uncertainty intervals that arise from having very small disease counts for dementia, a disease which is a relatively rare cause of death compared with respiratory disease. In addition, there is no consistent association with total nonaccidental mortality either across the six pollution-health estimates, with only PM 10 showing significant associations. Finally, there are some differences between the estimates from the aggregate and ecological models, for example, the associations estimated between respiratory hospitalizations and PM 2.5 , and the simulation study suggests that the aggregate model is likely to provide the more accurate results.

Varying pollution-health effects
Varying pollution-health effects were only estimated for the respiratory disease outcomes (hospitalization and mortality), because the remaining outcomes showed no evidence of significant relationships with air pollution from Table 4. The results are displayed in Figure 2, where separate effects are shown for each pollutant (in each row of the figure), disease outcome (hospitalization and mortality) and model (ecological and aggregate). Each figure presents the linear relationship between the relative risk of air pollution and affluence, where the latter is represented by the natural log of the average property price in each IZ. The solid line depicts the posterior mean, while the dashed lines represent 95% credible intervals. The relative risk scales relate to the same increases in each pollutant that were presented in Table 4, namely, NO 2 -3 μg m −3 ; PM 2.5 -1 μg m −3 ; and PM 10 -2μg m −3 . The main finding is that the relative risk of air pollution decreases F I G U R E 2 The estimated relationship between the pollution-disease relative risks for each disease outcome, pollutant and model (ecological and aggregate), and affluence. Affluence is measured by the natural log of the average property price in each IZ as the level of affluence increases, which is consistently observed for all pollutants, disease outcomes, and models. This suggests that air pollution does widen health inequalities, because it has a greater impact on less affluent communities compared with more affluent ones. These differences in relative risk are significant for the mortality outcome for all pollutants, as well as for the PM 10 -hospitalization association, because one cannot draw a horizontal line between the 95% credible intervals.

DISCUSSION
This article has investigated the extent to which the modifiable areal unit problem is likely to affect the population-level air pollution and health associations estimated from spatial areal unit studies, which is a commonly made criticism of this study design. Our main finding is that as long as the pollution concentrations are spatially correlated and that the number of areal units is large, both of which occur in the motivating study, then the MAUP is unlikely to have a sizeable impact on the results. For example, the simulation study showed that the average IQR in the set of effect estimates resulting from different spatial aggregations of the same data (the zoning effect) was less than 10% of the true effect size. This suggests that the estimated associations from the motivating study are of the correct order of magnitude and not affected by the MAUP to any great extent. This would not be the case if the covariate of interest was independent and/or the number of spatial units was low, in which case the simulation study showed that the zoning effect could cause the variation (as measured by the median IQR) in the effects estimated from different spatial aggregations of the same data to be up to 80% of the true effect size, hence making the estimates unreliable. Our second main finding is that the aggregate model derived in Section 3 generally outperforms the ecological model commonly used in studies of this type, both in terms of the impact of the MAUP and estimation accuracy of the pollution-health association. However, the difference in estimation performance between the aggregate and ecological models is not large, with improvements in RMSE of around 1% of the size of the true association in most scenarios considered. Thus, while the aggregate model does not remove the impact of the MAUP it does mitigate it slightly, and is thus the model of choice for spatial ecological association studies.
The main finding from our motivating study is that increased levels of air pollution were consistently associated with increased risks of respiratory disease (both hospitalization and mortality), whereas no consistent associations were observed for circulatory disease. This agrees with the existing epidemiological literature in Scotland such as Willocks et al. (2012), Yap et al. (2012), and Lee et al. (2019), which also shows associations for respiratory but not for circulatory disease using a variety of study designs. Thus, the Scottish evidence is consistent with the majority of the international evidence on the existence of an association between air pollution and respiratory disease. However, the consistent failure to identify evidence of a significant association between air pollution and cardiovascular disease in Scottish studies is at odds with most of the international literature, with a review of the latter provided by World Health Organisation (2013). Therefore, future work is required to understand the reasons for these differences, which could be due to numerous factors including lower average pollutant concentrations in Scotland, or differences in the underlying causes and cofactors determining the Scottish population's vulnerability to circulatory disease. No associations were also observed between any pollutant and mortality attributed to dementia, although as some of the point estimates were relatively large (e.g., with PM 2.5 ) this might be due to the wide uncertainty intervals that result from a lack of power caused by their being small numbers of disease events over the 7-year study period. Our other main finding is that air pollution appears to consistently have a greater impact on less affluent communities compared with more affluent ones, so widening the level of health inequalities, the reduction of which is a current priority for Scottish Government (NHS Health Scotland, 2016).
The major limitation of this and all other epidemiological air pollution studies is that the air pollution exposure for each areal unit has been estimated using population-weighted outdoor residential concentrations. As the derivation of the aggregate model in Section 3 shows, using individual pollution exposures from the populations living in each areal unit would make the model more realistic, likely resulting in more accurate pollution-health associations. However, personal exposure estimates need to allow for the fact that people spend a large proportion of their time indoors and are hence exposed to indoor sources of pollution, as well as accounting for movement patterns away from their residence such as traveling to work, the shops, and so forth on a daily basis. Such personal exposure profiles are not currently available, and their estimation constitutes the next significant challenge in air pollution research.