Area Disparities in Britain: Understanding the Contribution of People vs. Place Through Variance Decompositions

This article considers methods for decomposing wage variation into individual and group specific components. We discuss the merits of these methods, which are applicable to variance decomposition problems generally. The relative magnitudes of the measures depend on the underlying variances and covariances, and we discuss how to interpret them, and how they might relate to structural parameters of interest. We show that a clear‐cut division of variation into area and individual components is impossible. An empirical application to the British labour market demonstrates that labour market area effects contribute very little to the overall variation of wages in Britain.


I. Introduction
…you are under no obligation to analyse variance into its parts if it does not come apart easily, and its unwillingness to do so naturally indicates that one's line of approach is not very fruitful.
R. A. Fisher (1933)
*We thank Phil Murphy, attendees at various seminars, plus two anonymous referees for their contributions. An earlier version of this article circulated as Spatial Economics Research Centre SERC DP0060 with the title 'Wage disparities: people or place'. This work was based on data from the Annual Survey of Hours and Earnings, produced by the Office for National Statistics (ONS) and supplied by the Secure Data Service at the UK Data Archive. The data are Crown Copyright and reproduced with the permission of the controller of HMSO and Queen's Printer for Scotland. The use of the data in this work does not imply the endorsement of ONS or the Secure Data Service at the UK Data Archive in relation to the interpretation or analysis of the data. This work uses research data sets which may not exactly reproduce National Statistics aggregates. JEL Classification numbers: J31, R11.
These warnings of R. A. Fisher notwithstanding, there are surely many situations where the decomposition of variance is potentially informative, even if the variance does not come apart easily. This article considers methods for decomposing the variance of an individual outcome into group components, where this individual outcome is determined by both individual characteristics and the characteristics of the group to which the individual belongs and where individuals can sort across groups. In particular, we are interested in the importance of individual characteristics vs. group membership in explaining variation in individual outcomes. Such decompositions are of interest in a wide variety of contexts.
For example, labour economists have long been concerned with the role that sorting on individual characteristics plays in explaining differences in wages between groups of workers (particularly wage differences across industries). See, for example, Krueger and Summers (1988), Gibbons and Katz (1992) and Abowd, Kramarz and Margolis (1999). Another example occurs in urban economics when considering the extent to which sorting matters for individual and area disparities in wages. See, for example, Duranton and Monastiriotis (2002), Taylor (2006), Dickey (2007), Mion and Naticchioni (2009), Dalmazzo and de Blasio (2007) and Combes, Duranton and Gobillon (2008). Education economics provides another example, where the interest is in finding out the contribution of schools to the variance of pupil test scores (e.g. Kramarz, Machin and Ouazad, 2009). Despite this widespread and growing interest in decompositions by geography, industries, firms, schools and other groups, the alternative methods and their interpretations have not been very clearly articulated in previous literature.
The first part of this article discusses several methods for gauging the contribution of group effects to the dispersion of outcomes across individuals, where these group effects are treated as fixed effects.¹ To make the discussion concrete, we focus on wages as the individual outcome, and consider the role of individual characteristics and a group effect associated with the labour market area of the individual. The methods we discuss are based on decomposing the variance of individual wages into between-group and within-group components in order to estimate the share of the individual variance attributable to group effects. These methods are easily extended to consider the share of group level variation attributable to group, as opposed to individual, effects. We consider the relationships between these different methods and how their results might be interpreted.
The approaches that we consider are not novel, but the issues surrounding their application have not received careful attention in the literature. The second part of the article applies these decompositions to data on employee wages in Britain. The results here illustrate that area effects make only a small contribution to overall wage inequality, and that observed disparities between mean wages in different areas are primarily due to sorting.

II. Variance decompositions
Assume we have data on wages $w_i$ and individual characteristics $x_i$ and that individuals are located in one of J areas.² Consider the (Mincerian) linear regression model for individual log wages:
$$\ln w_i = x_i'\beta + d_i'\gamma + \varepsilon_i \tag{1}$$
where $\varepsilon$ are random errors, d is a J × 1 column vector with jth element equal to one if the worker is in area j and $\gamma$ is a J × 1 vector of parameters. Variable $d'\gamma$ is the 'area effect' for workers in area j. The vector of individual characteristics x could consist of observable variables (e.g. age), or be a vector of individual dummy variables, so that $x'\beta$ is an 'individual effect'.³ Unless needed, we suppress individual i (and time t) subscripts and all constant terms (so variables are in deviation-from-mean form). It simplifies notation to define $\alpha \equiv d'\gamma$ and $\mu \equiv x'\beta$ and rewrite equation (1) as:
$$\ln w = \alpha + \mu + \varepsilon \tag{2}$$
Our aim is to assess the contribution of area effects $\alpha \equiv d'\gamma$ to the variance of ln w. The area effect associated with an area j gives the gain in log wages that a randomly chosen individual could expect from being assigned to area j (relative to some arbitrary baseline area).
¹ By this we mean that we will allow the effects of area and individuals to be correlated. Researchers in some disciplines like to use multi-level/hierarchical models (Goldstein, 2010) to provide variance decompositions, but these estimators assume that the various levels (e.g. areas, individuals) are uncorrelated random effects, which does not seem at all desirable in many contexts.
² We assume that individuals live and work in the same area.

Decomposing the total variance
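Estimation of equation (1) by OLS (or some equivalent method) gives coefficient estimates $\hat\beta$, $\hat\gamma$ and residuals $\hat\varepsilon$. We can decompose the variance of ln w as follows:
$$\mathrm{var}(\ln w) = \mathrm{var}(\hat\alpha) + \mathrm{var}(\hat\mu) + 2\,\mathrm{cov}(\hat\alpha,\hat\mu) + \mathrm{var}(\hat\varepsilon) \tag{3}$$
where var(·) is the sample variance, cov(·) the sample covariance, $\hat\alpha \equiv d'\hat\gamma$, $\hat\mu \equiv x'\hat\beta$ and $\hat\varepsilon \equiv \ln w - d'\hat\gamma - x'\hat\beta$. Dividing through by var(ln w) gives:
$$1 = \frac{\mathrm{var}(\hat\alpha)}{\mathrm{var}(\ln w)} + \frac{\mathrm{var}(\hat\mu)}{\mathrm{var}(\ln w)} + \frac{2\,\mathrm{cov}(\hat\alpha,\hat\mu)}{\mathrm{var}(\ln w)} + \frac{\mathrm{var}(\hat\varepsilon)}{\mathrm{var}(\ln w)} \tag{4}$$
Using $R^2(\ln w; x, d)$ to denote the R-squared from a regression of ln w on x and d we get:
$$R^2(\ln w; x, d) = \frac{\mathrm{var}(\hat\alpha)}{\mathrm{var}(\ln w)} + \frac{\mathrm{var}(\hat\mu)}{\mathrm{var}(\ln w)} + \frac{2\,\mathrm{cov}(\hat\alpha,\hat\mu)}{\mathrm{var}(\ln w)} \tag{5}$$
Note that $R^2(\ln w; x, d)$ provides one potential estimate of the contribution of area effects to the variance in ln w. However, interpretation of this overall R-squared statistic in such a way requires the assumption that all variance in individual characteristics (including within-area variation) is caused by area effects, implying for example that area effects determine all differences in individual skills between areas. In most contexts, this assumption is surely too extreme, because individuals sort across areas through migration, and area differences in skills are at least in part driven by this sorting. The principal advantage of having data on worker characteristics and workplace location, and panel data that follows individuals over time as they move across areas, is, of course, that we can use equation (5) to obtain more informative estimates of the contributions of area vs. individual effects in $R^2(\ln w; x, d)$. The next sections set out the alternative ways of doing this in greater detail.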
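As a rough illustration of the decomposition above, the following Python sketch simulates log wages with positive sorting, runs OLS on one scalar characteristic plus area dummies, and splits the variance of ln w into area, individual, sorting (covariance) and residual pieces. The simulated data, the helper names `cov0` and `variance_pieces`, and every parameter value are assumptions made for illustration only; none of them comes from the article's data.

```python
import numpy as np

def cov0(a, b):
    """Sample covariance with divisor n (ddof=0), matching np.var's default."""
    return float(np.mean((a - a.mean()) * (b - b.mean())))

def variance_pieces(lnw, x, area):
    """OLS of ln w on a scalar x and a full set of area dummies.

    Returns var(alpha_hat), var(mu_hat), 2*cov(alpha_hat, mu_hat), var(eps_hat).
    """
    n = lnw.size
    D = np.zeros((n, area.max() + 1))
    D[np.arange(n), area] = 1.0               # the dummies also absorb the constant
    coef, *_ = np.linalg.lstsq(np.column_stack([x, D]), lnw, rcond=None)
    mu_hat = coef[0] * x                      # individual component
    alpha_hat = D @ coef[1:]                  # area component
    eps_hat = lnw - mu_hat - alpha_hat        # OLS residual
    return (cov0(alpha_hat, alpha_hat), cov0(mu_hat, mu_hat),
            2.0 * cov0(alpha_hat, mu_hat), cov0(eps_hat, eps_hat))

# Hypothetical data: 20 areas, with higher-x workers sorted into better areas.
rng = np.random.default_rng(0)
n, J = 5000, 20
area = rng.integers(0, J, size=n)
quality = np.linspace(-1.0, 1.0, J)           # assumed exogenous area quality
x = quality[area] + rng.normal(size=n)        # characteristic correlated with area
lnw = 1.0 * x + 0.3 * quality[area] + rng.normal(scale=0.5, size=n)

v_alpha, v_mu, two_cov, v_eps = variance_pieces(lnw, x, area)
total = cov0(lnw, lnw)
shares = {"area": v_alpha / total, "individual": v_mu / total,
          "sorting": two_cov / total, "residual": v_eps / total}
print(shares)
```

Because OLS residuals are orthogonal to every regressor, the four shares sum to one, and the first three sum to the regression R-squared.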

The raw variance share
A more plausible upper estimate of area share is obtained if we assume area causes any differences in average individual characteristics in each area. Then, the contribution of area is:
$$RVS = \frac{\mathrm{var}(\tilde\alpha)}{\mathrm{var}(\ln w)} = R^2(\ln w; d) \tag{6}$$
where $\tilde\alpha \equiv d'\tilde\gamma$, $\tilde\gamma$ is the coefficient vector and $R^2(\ln w; d)$ the R-squared from the regression of ln w on area dummies d, with no control for x [contrast $\hat\alpha \equiv d'\hat\gamma$ from equation (1)]. We refer to this as the raw variance share (RVS). RVS is simply the between-group variance in log wages divided by the total variance in log wages. If average individual characteristics vary by area and are correlated with, but not caused by, area effects then $\tilde\gamma$ is an upward biased estimate of $\gamma$ and RVS provides an over-estimate of the area share of variance.
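The statement that RVS is the between-group variance divided by the total variance can be checked numerically. The sketch below computes it twice, once from area means and once as the R-squared from a regression on area dummies alone; the function names and the simulated wages are hypothetical and serve only as an illustration.

```python
import numpy as np

def raw_variance_share(lnw, area):
    """Between-area variance of ln w divided by the total variance of ln w."""
    J = area.max() + 1
    counts = np.bincount(area, minlength=J)
    means = np.bincount(area, weights=lnw, minlength=J) / counts
    fitted = means[area]                      # each worker gets their area's mean
    return float(fitted.var() / lnw.var())

def r2_on_dummies(lnw, area):
    """R-squared from OLS of ln w on area dummies only (no control for x)."""
    n = lnw.size
    D = np.zeros((n, area.max() + 1))
    D[np.arange(n), area] = 1.0
    coef, *_ = np.linalg.lstsq(D, lnw, rcond=None)
    resid = lnw - D @ coef
    return float(1.0 - resid.var() / lnw.var())

# Hypothetical log wages for 2000 workers in 10 areas.
rng = np.random.default_rng(1)
area = rng.integers(0, 10, size=2000)
lnw = 0.2 * area + rng.normal(size=2000)

rvs = raw_variance_share(lnw, area)
rvs_check = r2_on_dummies(lnw, area)
print(rvs, rvs_check)
```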

The correlated variance share
A second possibility, based on the decomposition of total variance in equation (4) is:
$$CVS = \frac{\mathrm{var}(\hat\alpha)}{\mathrm{var}(\ln w)} \tag{7}$$
where $\hat\alpha \equiv d'\hat\gamma$ comes from a regression of ln w on area dummies d and individual characteristics x [i.e. OLS estimation of equation (1)]. That is, the area share is the ratio of the variance of area effects to the variance of log wage. This seems like an intuitive measure of the contribution of area effects. Note, however, that interpretation is complicated because even though equation (1) controls for individual x, area effects $\hat\alpha$ will still be correlated with x and $\hat\mu$ if there is assortative sorting across areas. This means that $\mathrm{var}(\hat\alpha)/\mathrm{var}(\ln w)$ excludes the direct contribution of sorting [which enters through the covariance term in equation (5)], but captures any indirect effect that sorting may have on the variance of the area effects themselves. That is, it captures the contribution of area effects including those induced by area composition but not the effects, if any, that area has in determining individual characteristics. We show this in more detail below in the section on Interpretations.
For this reason, we refer to this measure as the correlated variance share (CVS). The ratio $\mathrm{var}(\hat\mu)/\mathrm{var}(\ln w)$ provides an analogous CVS estimate of the variance share attributable to individual characteristics, conditional on the area effects.

Uncorrelated variance share
We can also estimate the contribution of the components of area effects that are uncorrelated with individual characteristics. There are a number of equivalent methods for doing this that give a statistic that is often referred to as the semi-partial R-squared [and by Borcard and Legendre (2002) as 'Fraction a' partitioning]. It is also related to the partial R-squared, as shown in Appendix B. We refer to this as the uncorrelated variance share (UVS). Analogous methods can be used to estimate the corresponding UVS attributable to individual characteristics, but we describe only the process for area effects below.
The UVS measures the amount of variation in log wages that is explained by the part of the area effect that is uncorrelated with the individual characteristics included in x. Three methods of calculating the UVS are set out below, and Appendix C shows that they are equivalent. Appendix D also provides a decomposition of total variance similar to equation (4) but based on the UVS rather than the CVS.
The three equivalent methods for estimating UVS are:

R-squared method
This method involves three steps: (a) regress ln w on x and d to get $R^2(\ln w; x, d)$, which estimates the share of overall variance explained by both individual characteristics and area effects; (b) regress ln w on x alone to get $R^2(\ln w; x)$, which is the share of overall variance explained by just the individual characteristics; (c) calculate $UVS = R^2(\ln w; x, d) - R^2(\ln w; x)$ to obtain the share attributable to the components of ln w that are correlated with area effects but uncorrelated with individual characteristics.

ANOVA (partial sum of squares) method
An alternative is to compute the Regression Sum of Squares and Total Sum of Squares, which are routinely computed by ANOVA software, and calculate:
$$UVS = \frac{RSS(x, d) - RSS(x)}{TSS} \tag{8}$$
where RSS is the regression sum of squares and TSS the total sum of squares, so the numerator is the partial sum of squares of d.

Partitioned regression method
(a) Regress ln w on x and d and obtain the regression predictions of area effects $\hat\alpha \equiv d'\hat\gamma$ and individual effects $\hat\mu \equiv x'\hat\beta$; (b) regress $\hat\alpha$ on $\hat\mu$ and obtain a regression coefficient estimate $\hat\theta$ and the uncorrelated residual area components $\hat\xi = \hat\alpha - \hat\theta\hat\mu$; (c) regress ln w on the residual from (b) and calculate $UVS = R^2(\ln w; \hat\xi)$.
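The three routes to the UVS described above can be sketched as follows. To keep the equivalence exact in a few lines, this example assumes a single scalar characteristic x, so that regressing the area component on the fitted individual component is the same as regressing it on x. The function names and the simulated data are hypothetical.

```python
import numpy as np

def ols_fit(X, y):
    """Fitted values and coefficients from OLS of y on X (X spans the constant)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return X @ coef, coef

def uvs_three_ways(lnw, x, area):
    n = lnw.size
    D = np.zeros((n, area.max() + 1))
    D[np.arange(n), area] = 1.0
    ones = np.ones(n)
    tss = np.sum((lnw - lnw.mean()) ** 2)

    fit_xd, coef = ols_fit(np.column_stack([x, D]), lnw)
    fit_x, _ = ols_fit(np.column_stack([ones, x]), lnw)
    rss_xd = np.sum((fit_xd - lnw.mean()) ** 2)   # regression SS, x and d
    rss_x = np.sum((fit_x - lnw.mean()) ** 2)     # regression SS, x only

    # (1) R-squared method: R2(ln w; x, d) - R2(ln w; x)
    uvs_r2 = rss_xd / tss - rss_x / tss
    # (2) ANOVA method: partial sum of squares of d over TSS
    uvs_anova = (rss_xd - rss_x) / tss
    # (3) Partitioned regression: residualise alpha_hat on mu_hat, then
    #     take the R-squared of ln w on that residual.
    mu_hat = coef[0] * x
    alpha_hat = D @ coef[1:]
    fit_am, _ = ols_fit(np.column_stack([ones, mu_hat]), alpha_hat)
    resid_area = alpha_hat - fit_am
    fit_w, _ = ols_fit(np.column_stack([ones, resid_area]), lnw)
    uvs_part = np.sum((fit_w - lnw.mean()) ** 2) / tss
    return float(uvs_r2), float(uvs_anova), float(uvs_part)

# Hypothetical data with positive sorting across 12 areas.
rng = np.random.default_rng(2)
n, J = 3000, 12
area = rng.integers(0, J, size=n)
quality = np.linspace(-1.0, 1.0, J)
x = 0.8 * quality[area] + rng.normal(size=n)
lnw = 0.9 * x + 0.4 * quality[area] + rng.normal(scale=0.5, size=n)

u1, u2, u3 = uvs_three_ways(lnw, x, area)
print(u1, u2, u3)
```

With a vector of characteristics, step (3) as written residualises on the single index of fitted individual effects rather than on each column of x, so exact agreement with the first two methods should not be taken for granted from this sketch.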

Balanced variance share
To consider the explanatory power of individual characteristics and group membership, Abowd et al. (1999) suggest reporting the standard deviation (across workers) of the effect of a variable and the correlation of the variable with wages as yet another variance analysis. The effect of a variable is calculated by multiplying the coefficient on that variable by the value of that variable for each observation. The variable is said to have large explanatory power when the effect has a large standard deviation and is highly correlated with wages. Combes et al. (2008) perform such an analysis to consider the importance of individual vs. area in explaining area wage disparities. These two components (the standard deviation of the effect and the correlation with wages) can be used to construct another variance share measure. Using $\hat\alpha \equiv d'\hat\gamma$, $\hat\mu \equiv x'\hat\beta$, s(·) to denote sample standard deviation and r(·) the Pearson correlation coefficient, note that:
$$r(\hat\alpha, \ln w) = \frac{\mathrm{cov}(\hat\alpha, \ln w)}{s(\hat\alpha)\,s(\ln w)} = \frac{\mathrm{var}(\hat\alpha) + \mathrm{cov}(\hat\alpha,\hat\mu)}{s(\hat\alpha)\,s(\ln w)} \tag{9}$$
Hence:
$$BVS = \frac{s(\hat\alpha)}{s(\ln w)}\,r(\hat\alpha, \ln w) = \frac{\mathrm{var}(\hat\alpha) + \mathrm{cov}(\hat\alpha,\hat\mu)}{\mathrm{var}(\ln w)} \tag{10}$$
provides this other measure of the area share of the variance of ln w. An analogous equation to equation (10) would yield a measure of the individual share in the variance of ln w. The decomposition simply apportions the sorting component, $2\,\mathrm{cov}(\hat\alpha,\hat\mu)$ in equation (4), equally to area effects and to individual characteristics, so we call this the balanced variance share (BVS). Note that there is no obvious justification for splitting the covariance in this way, apart from guaranteeing that the area and individual variance shares add up to the total R-squared $R^2(\ln w; x, d)$.
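A short numerical check may help here. The sketch below builds the BVS from the standard deviation of an estimated effect and its correlation with log wages, then confirms that the area and individual BVS add up to the regression R-squared. The data and the function name `balanced_variance_share` are assumptions for illustration.

```python
import numpy as np

def balanced_variance_share(effect, lnw):
    """s(effect) / s(ln w) times the Pearson correlation r(effect, ln w)."""
    r = np.corrcoef(effect, lnw)[0, 1]
    return float(effect.std() / lnw.std() * r)

# Hypothetical data: 15 areas with positive sorting on a scalar characteristic.
rng = np.random.default_rng(3)
n, J = 4000, 15
area = rng.integers(0, J, size=n)
quality = np.linspace(-1.0, 1.0, J)
x = quality[area] + rng.normal(size=n)
lnw = 0.8 * x + 0.4 * quality[area] + rng.normal(scale=0.6, size=n)

# OLS of ln w on x and area dummies (the dummies absorb the constant).
D = np.zeros((n, J))
D[np.arange(n), area] = 1.0
coef, *_ = np.linalg.lstsq(np.column_stack([x, D]), lnw, rcond=None)
mu_hat, alpha_hat = coef[0] * x, D @ coef[1:]
resid = lnw - mu_hat - alpha_hat

bvs_area = balanced_variance_share(alpha_hat, lnw)
bvs_ind = balanced_variance_share(mu_hat, lnw)
r2 = float(1.0 - resid.var() / lnw.var())
cov_am = float(np.mean((alpha_hat - alpha_hat.mean()) * (mu_hat - mu_hat.mean())))
print(bvs_area, bvs_ind, r2)
```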

Contribution of area effects to area disparities
So far, we have focused on deriving the contribution of area effects to overall wage disparities. We can, however, use the RVS, CVS, UVS or BVS to back out the contribution of area effects to area, rather than overall individual, disparities. The ratio CVS/RVS, or $\mathrm{var}(\hat\alpha)/\mathrm{var}(\tilde\alpha)$ from equations (6) and (7), gives the contribution of area excluding the direct contribution of sorting. The ratio UVS/RVS, or $[R^2(\ln w; x, d) - R^2(\ln w; x)]/R^2(\ln w; d)$, gives the contribution of area excluding both the direct and indirect contribution of sorting. BVS/RVS provides a final measure which adds in an additional contribution attributable to the covariance. The interpretations follow directly from the properties of CVS, UVS and BVS.

III. Comparisons and interpretation
We have outlined several methods for calculating the individual and group shares of variance. In this section, we consider the relative magnitudes of the measures and discuss how their interpretation depends on the underlying determinants of individual differences in outcomes.

Magnitudes
In Appendix C [equation (C4)], we show that $R^2(\ln w; x) = (1 + 2\hat\theta + \hat\theta^2)\,\mathrm{var}(\hat\mu)/\mathrm{var}(\ln w)$, where $\hat\theta$ is the coefficient from the regression of $\hat\alpha \equiv d'\hat\gamma$ on $\hat\mu \equiv x'\hat\beta$. An analogous derivation gives:
$$RVS = R^2(\ln w; d) = (1 + 2\hat\lambda + \hat\lambda^2)\,\frac{\mathrm{var}(\hat\alpha)}{\mathrm{var}(\ln w)} \tag{11}$$
where $\hat\lambda$ is the coefficient from a regression of $\hat\mu \equiv x'\hat\beta$ on $\hat\alpha \equiv d'\hat\gamma$. Comparing with equation (7), it is immediate that $RVS = (1 + 2\hat\lambda + \hat\lambda^2)\,CVS$. Note that RVS > CVS unless $-2 \le \hat\lambda \le 0$ or $\mathrm{var}(\hat\alpha) = 0$. That is, when individual characteristics are positively correlated with area effects (i.e. $\hat\lambda > 0$), RVS is greater than CVS. In other words, the R-squared from a regression of ln w on area dummies will attribute to area effects some of the contribution of individual characteristics. The same is true when sorting is negative, and strong relative to the variance of area effects [i.e. $\hat\lambda = \mathrm{cov}(\hat\alpha,\hat\mu)/\mathrm{var}(\hat\alpha) < -2$]. In the range $-2 < \hat\lambda < 0$, negative sorting tends to cancel out the contribution of area effects when estimated using RVS. When $\hat\lambda = -1$, negative sorting exactly offsets area effects, so regression of ln w on area dummies would yield an RVS of zero, whereas CVS would be positive.
Appendix C [equation (C5)] shows that the uncorrelated variance share is:
$$UVS = \frac{\mathrm{var}(\hat\alpha) - \hat\theta^2\,\mathrm{var}(\hat\mu)}{\mathrm{var}(\ln w)} \tag{12}$$
where $\hat\theta$ is the coefficient from the sample regression of $\hat\alpha \equiv d'\hat\gamma$ on $\hat\mu \equiv x'\hat\beta$. Comparing the numerators in the UVS in equation (12) and CVS in equation (7):
$$\mathrm{var}(\hat\alpha) - \hat\theta^2\,\mathrm{var}(\hat\mu) \le \mathrm{var}(\hat\alpha) \tag{13}$$
with equality if $\hat\alpha$ is uncorrelated with $\hat\mu$. Therefore the CVS gives an area share that is at least as large as that obtained by the UVS. As we said before, the BVS adds $\mathrm{cov}(\hat\alpha,\hat\mu)$ to the numerator of the CVS. Hence, with positive covariance between $\hat\alpha$ and $\hat\mu$, the BVS for area share is bigger than the CVS. From equation (11) it is also clear that the RVS for area share is bigger than BVS when there is positive sorting (covariance). With negative sorting the BVS for area share is smaller than CVS, but it may be larger or smaller than UVS or RVS. Finally, all methods give the same result if $\hat\alpha$ and $\hat\mu$ are uncorrelated.
Naturally, these comparisons follow through to the decomposition of the variance of ln w into the components attributed to area plus observable characteristics, vs. unobservable factors. For example, with positive sorting, adding up the BVS for area [$BVS(\hat\alpha)$] and the BVS for individual characteristics [$BVS(\hat\mu)$] minimises the share of the variance attributed to unobservables (i.e. it is the standard R-squared) and attributes as much as possible to $\hat\alpha$ and $\hat\mu$. At the other extreme, adding $UVS(\hat\alpha)$ to $UVS(\hat\mu)$ provides a more conservative estimate of the share attributable to area and individual characteristics based on the orthogonal components of $\hat\alpha$ and $\hat\mu$ alone, and so assigns a greater share of ln w to unknown components.
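The ranking of the four measures under positive sorting can be illustrated with simulated data. The sketch below computes all four area shares for one scalar characteristic, once with strong positive sorting and once with none. The data-generating process, the `sorting` parameter and the function names are hypothetical choices for illustration, not the article's specification.

```python
import numpy as np

def four_shares(lnw, x, area):
    """RVS, BVS, CVS and UVS for the area component (scalar x, area dummies)."""
    n = lnw.size
    D = np.zeros((n, area.max() + 1))
    D[np.arange(n), area] = 1.0
    coef, *_ = np.linalg.lstsq(np.column_stack([x, D]), lnw, rcond=None)
    mu_hat, alpha_hat = coef[0] * x, D @ coef[1:]
    v_w = lnw.var()
    c = np.mean((alpha_hat - alpha_hat.mean()) * (mu_hat - mu_hat.mean()))
    raw, *_ = np.linalg.lstsq(D, lnw, rcond=None)     # dummies only: area means
    rvs = (D @ raw).var() / v_w
    cvs = alpha_hat.var() / v_w
    bvs = (alpha_hat.var() + c) / v_w
    theta = c / mu_hat.var()                          # slope of alpha_hat on mu_hat
    uvs = (alpha_hat - theta * mu_hat).var() / v_w
    return {"RVS": float(rvs), "BVS": float(bvs), "CVS": float(cvs), "UVS": float(uvs)}

def simulate(sorting, seed):
    """Hypothetical wages; `sorting` scales how strongly x tracks area quality."""
    rng = np.random.default_rng(seed)
    n, J = 5000, 20
    area = rng.integers(0, J, size=n)
    quality = np.linspace(-1.0, 1.0, J)
    x = sorting * quality[area] + rng.normal(size=n)
    lnw = 1.0 * x + 0.3 * quality[area] + rng.normal(scale=0.5, size=n)
    return lnw, x, area

positive = four_shares(*simulate(sorting=1.0, seed=5))
none = four_shares(*simulate(sorting=0.0, seed=5))
print(positive)
print(none)
```

With positive sorting the shares in this sketch should be ordered UVS ≤ CVS ≤ BVS ≤ RVS, and without sorting they should differ only through sampling noise.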
Given that the methods give different answers when there is assortative sorting of individuals across groups, it is natural to ask how we should interpret the different calculations and whether we should prefer any specific method. This depends on the underlying factors determining individual and area outcomes, as we now show.

Interpretations
In order to gain some understanding of what contributes to each of these variance shares, it is helpful to fully specify a system of linear equations which determine individual wages, area effects and individual characteristics. Continuing to use $\alpha \equiv d'\gamma$ and $\mu \equiv x'\beta$:
Individual wages:
$$\ln w = \alpha + \mu + \varepsilon \tag{14}$$
Area effects:
$$\alpha = \theta\bar\mu + \xi \tag{15}$$
Individual characteristics:
$$\mu = \lambda\alpha + v \tag{16}$$
The equation for individual wages is identical to that used earlier in Section II. As shown in equation (15) we allow area effects to be determined by average area composition $\bar\mu$ (where the mean of $\mu$ is taken across individuals, within areas) and by exogenous variation $\xi$ (e.g. climate). Equation (16) allows individual characteristics to be determined by area and by exogenous individual specific variation v (e.g. innate ability). Specifying equation (16) in terms of $\mu$ rather than x simplifies later notation.
Averaging equation (16) at the area level gives area-mean individual characteristics $\bar\mu = \lambda\alpha + \bar v$, and it is the correlation between these and area effects that biases estimates of the relative contributions of individual and area effects. Part of this correlation arises because of the interdependency between $\alpha$ and $\mu$ explicit in equations (15) and (16) when $\theta \neq 0$ and/or $\lambda \neq 0$. However, there is an additional correlation, present even when $\theta = 0$ and $\lambda = 0$, if $\mathrm{cov}(\xi, \bar v) \neq 0$. This occurs if exogenous area effects ($\xi$) and area-mean exogenous individual characteristics ($\bar v$) are correlated through sorting processes that are otherwise unrelated to wages (e.g. if higher skill people prefer warmer climates).
Note that all this assumes that the underlying sources of variance across areas are the exogenous additive components ($\xi$, $\bar v$) and rules out multiple equilibria that arise in nonlinear theoretical models. Even in this very restricted world, it turns out to be difficult to provide firm statements about the share of area effects or whether RVS, BVS, CVS or UVS provides the most appropriate empirical measure.
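In Appendix E, we show that equations (14), (15) and (16) can be used to get expressions for the variance of area effects, mean individual characteristics and their covariance in terms of the underlying structural errors and parameters:
$$\mathrm{var}(\alpha) = \frac{\mathrm{var}(\xi) + \theta^2\,\mathrm{var}(\bar v) + 2\theta\,\mathrm{cov}(\xi,\bar v)}{(1-\theta\lambda)^2} \tag{17}$$
$$\mathrm{var}(\bar\mu) = \frac{\lambda^2\,\mathrm{var}(\xi) + \mathrm{var}(\bar v) + 2\lambda\,\mathrm{cov}(\xi,\bar v)}{(1-\theta\lambda)^2} \tag{18}$$
$$\mathrm{cov}(\bar\mu,\alpha) = \frac{\lambda\,\mathrm{var}(\xi) + \theta\,\mathrm{var}(\bar v) + (1+\theta\lambda)\,\mathrm{cov}(\xi,\bar v)}{(1-\theta\lambda)^2} \tag{19}$$
If area affects individual characteristics ($\lambda \neq 0$), equations (18) and (19) show that the exogenous component of area effects ($\xi$) enters into $\mathrm{var}(\bar\mu)$ and $\mathrm{cov}(\bar\mu,\alpha)$. Similarly, if area composition determines area effects ($\theta \neq 0$), equations (17) and (19) show that the exogenous component of individual characteristics ($\bar v$) enters into $\mathrm{var}(\alpha)$ and $\mathrm{cov}(\bar\mu,\alpha)$. (Positive) sorting, $\mathrm{cov}(\xi,\bar v) > 0$, increases $\mathrm{var}(\alpha)$ if area effects are increasing in $\bar\mu$ ($\theta > 0$) and increases $\mathrm{var}(\bar\mu)$ if individual characteristics are increasing in $\alpha$ ($\lambda > 0$). Positive sorting also increases $\mathrm{cov}(\bar\mu,\alpha)$, with the increase greater if both $\theta > 0$ and $\lambda > 0$. These results regarding $\bar\mu$ carry through to the distribution of individual characteristics $\mu$ because $\mathrm{cov}(\mu,\alpha) = \mathrm{cov}(\bar\mu,\alpha)$ and $\mathrm{var}(\mu) = \mathrm{var}(\bar\mu) + \mathrm{var}(\omega)$, where $\omega$ represents all the within-area individual components of wages. In short, $\mathrm{var}(\bar\mu)$, $\mathrm{var}(\alpha)$ and $\mathrm{cov}(\bar\mu,\alpha)$, which are observable, cannot be attributed to one or other of the potentially unobservable exogenous components ($\xi$, $\bar v$) that are the underlying sources of variance across areas in the simple system represented in equations (14), (15) and (16).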
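One way to see where expressions of this kind come from is to solve the area-level system for its reduced form. The sketch below assumes the linear forms $\alpha = \theta\bar\mu + \xi$ for the area-effects equation and $\bar\mu = \lambda\alpha + \bar v$ for the area average of the individual-characteristics equation, with $\theta\lambda \neq 1$. The symbols $\theta$, $\lambda$, $\xi$ and $\bar v$ are labels adopted here for illustration and may differ from the notation of Appendix E.

```latex
\begin{align*}
\alpha &= \theta(\lambda\alpha + \bar v) + \xi
  &&\Longrightarrow\quad
  \alpha = \frac{\xi + \theta\,\bar v}{1-\theta\lambda},\\[4pt]
\bar\mu &= \lambda(\theta\bar\mu + \xi) + \bar v
  &&\Longrightarrow\quad
  \bar\mu = \frac{\lambda\,\xi + \bar v}{1-\theta\lambda}.
\end{align*}
% Special cases that follow from taking variances of the reduced forms:
\begin{align*}
\theta = 0 &:\quad \operatorname{var}(\alpha) = \operatorname{var}(\xi),\\
\lambda = 0 &:\quad \operatorname{var}(\bar\mu) = \operatorname{var}(\bar v),\\
\theta = \lambda = 0 &:\quad \operatorname{cov}(\bar\mu,\alpha) = \operatorname{cov}(\xi,\bar v).
\end{align*}
```

Both observable quantities are mixtures of the two exogenous components whenever $\theta$ or $\lambda$ is non-zero, which is why the observable moments cannot in general be assigned to one source or the other.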
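Turning to the four variance shares, it is useful to repeat the formulas:⁴
$$RVS = (1+\hat\lambda)^2\,\frac{\mathrm{var}(\hat\alpha)}{\mathrm{var}(\ln w)}, \qquad BVS = \frac{\mathrm{var}(\hat\alpha) + \mathrm{cov}(\hat\alpha,\hat\mu)}{\mathrm{var}(\ln w)},$$
$$CVS = \frac{\mathrm{var}(\hat\alpha)}{\mathrm{var}(\ln w)}, \qquad UVS = \frac{\mathrm{var}(\hat\alpha) - \hat\theta^2\,\mathrm{var}(\hat\mu)}{\mathrm{var}(\ln w)}.$$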
It can be seen that equations (17)-(19) can, in principle, be used to express the four variance shares in terms of the underlying structural components. It should be clear, however, that even in this very basic linear world there is no simple relationship between these variance shares and the underlying structural components. Which variance share is most appropriate depends on a complex interplay between: (a) the underlying structural component in which we are interested; (b) the magnitudes and signs of the coefficients determining area effects and individual characteristics and (c) sorting.
One simple case occurs when composition does not cause area effects ($\theta = 0$), in which case CVS gives the variance share of the area shocks $\xi$. Correspondingly, when area does not cause changes in individual characteristics ($\lambda = 0$), $\mathrm{var}(\hat\mu)/\mathrm{var}(\ln w)$ gives the share of individual characteristics in log wages. Note too that when $\theta = \lambda = 0$, $\mathrm{cov}(\alpha,\mu)$ provides an unbiased estimate of the degree of sorting but otherwise combines the covariance caused by sorting and by the causal influence of area on characteristics (and vice-versa). In all other contexts, pinning down the contribution of specific sources of area-level variations is not possible, but as discussed above RVS, BVS, CVS and UVS can at least be used to provide a bounding exercise to guide us as to the likely contributions.

IV. Application: wage disparities in Britain
In this section, we use the four variance shares to look at the contribution of individual and area effects to wage variation in Britain. We improve on existing UK work on regional inequalities (Duranton and Monastiriotis, 2002; Taylor, 2006; Dickey, 2007) by working with functional labour market areas (rather than administrative boundaries) at a smaller spatial scale and by using panel data to control for unobserved individual characteristics. Outside of the UK, a number of existing studies have already used individual data to study spatial sorting and agglomeration economies. See, for example, Mion and Naticchioni (2009), Dalmazzo and de Blasio (2007) and Combes et al. (2008). This article is most closely related to the last of these but pays more attention to the issue of the share of disparities attributable to area effects using the insights developed above.
⁴ In these expressions, remember that $\hat\lambda$ is the coefficient from the regression of $\hat\mu$ on $\hat\alpha$ [i.e. $\mathrm{cov}(\hat\alpha,\hat\mu)/\mathrm{var}(\hat\alpha)$, not the 'structural' coefficient in equation (16)] and $\hat\theta$ is the coefficient from the regression of $\hat\alpha$ on $\hat\mu$ [i.e. $\mathrm{cov}(\hat\alpha,\hat\mu)/\mathrm{var}(\hat\mu)$, not the 'structural' coefficient in equation (15)]. We ignore this complication in the text, but note that the problems of linking variance shares to structural components persist even in the unlikely case that sample estimates equal their population values.
We use data for 1998-2008 from the Annual Survey of Hours and Earnings (ASHE) and its predecessor the New Earnings Survey (NES). NES/ASHE are constructed by the Office for National Statistics (ONS) based on a 1% sample of employees on the Inland Revenue PAYE register for February and April (Office for National Statistics, 2011). The sample is of employees whose National Insurance numbers end with two specific digits and workers are observed for multiple years (up to 11 years given the time span of our sample). The sample is replenished as workers leave the PAYE system (e.g. to retirement) and new workers enter it (e.g. from school). NES/ASHE include information on occupation (Standard Occupation Classification, SOC), industry (Standard Industrial Classification, SIC), whether the job is private or public sector, the worker's age and gender and detailed information on earnings. We use basic hourly earnings as our measure of wages. NES/ASHE do not provide data on education but information on occupation works as a good proxy for our purposes. NES/ASHE provide national sample weights but, as we are focused on sub-national travel to work area (TTWA) data, we do not use them in the results we report below.
ASHE provides information on individuals including their home and work postcodes, while the NES only reports work postcodes.⁵ We use the National Statistics Postcode Directory (NSPD) to assign workers to TTWA using their work postcode, allowing us to use the whole NES/ASHE sample over our study period. Given the way TTWA are constructed (so that 80% of the resident population also work within the area) the work TTWA will also be the home TTWA for the majority of workers. To ensure reasonable sample sizes, our analysis divides Britain into 157 'labour market areas' of which 79 are single 'urban' TTWA and 78 are 'rural areas' created by combining TTWAs.⁶ We derive estimates of RVS, CVS, UVS and BVS, from regression analysis of individual wages, based on equation (1) above, but adapted to include individual fixed effects and time dummies to take advantage of the panel dimension of our data:
$$\ln w_{ijt} = x_{it}'\beta + d_{ijt}'\gamma + \eta_i + \tau_t + \varepsilon_{it} \tag{20}$$
where j indexes areas, $\eta_i$ are individual fixed effects, $\tau_t$ are time dummies, $x_{it}$ is a vector of time varying individual characteristics, and everything else is as in equation (14). $\beta$ captures the impact of individual characteristics controlling for area. Likewise, $\gamma$ captures the impact of area controlling for both time invariant unobserved individual characteristics ($\eta_i$) and time varying observed individual characteristics ($x_{it}$). These parameter estimates allow us to obtain estimates of $\hat\alpha = d'\hat\gamma$ and $\hat\mu = x'\hat\beta$ from which we can calculate RVS, CVS, UVS and BVS.
⁵ The NES data goes back to 1970. However, work postcodes are only available since 1997 and are required in order to provide a consistent series of labour market areas, because the TTWA definitions provided in the NES/ASHE data change over time. At the time of the analysis the 1997 work postcode data was unreliable, which is the reason our panel starts in 1998, and we focus only on recent patterns of area disparities.
⁶ We reached this classification in three steps: (a) we identified the primary urban TTWAs as TTWA centred around, or intersecting, urban footprints with populations of 100,000 plus; (b) we identified TTWA with an annual average NES/ASHE sample size greater than 200 as stand-alone non-primary-urban TTWA (e.g. Inverness); and (c) we grouped remaining TTWA (with sample sizes below 200) into contiguous units (e.g. North Scotland).
Including individual fixed effects means that identification of the area fixed effects comes from movers across areas. As usual, we cannot rule out the possibility that shocks to $\varepsilon_{it}$ are correlated with the decision to move so the area effect estimates need to be interpreted with some caution. In the absence of random allocation (or a policy change that as good as randomly assigns people) tracking individuals and observing the change in wages experienced when they move between areas is the best we can do to identify causal area effects.⁷ Estimating the area effects from movers in our data means we need to assume fixed-across-time area effects, given the frequency of moves and the length of the panel, and the number of TTWAs.⁸ We shall show, however, that dropping individual fixed effects and controlling only for observables provides area effects estimates that are very stable over time. Given this, and as individual unobserved characteristics are important for explaining wages, our preferred specification incorporates individual effects but assumes area effects are fixed across time. Some basic descriptive statistics are provided in Appendix F.
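To illustrate why movers identify the area effects once individual fixed effects are included, the sketch below simulates a small panel in which unobserved ability is sorted across areas, then compares raw area means with estimates from a regression on individual and area dummies. Everything here is an assumption made for illustration: the panel dimensions, the 30% chance of working away from the home area, the omission of time dummies and time-varying characteristics, and the dummy-variable estimator itself, which is only one of several equivalent ways to fit such a model.

```python
import numpy as np

rng = np.random.default_rng(4)
n_ind, T, J = 300, 4, 5
true_area = np.array([0.0, 0.1, 0.2, 0.3, 0.4])   # assumed area effects
ability = rng.normal(size=n_ind)                  # unobserved and time-invariant

# Sorting: the home area is the worker's ability quintile. In each year there
# is a 30% chance of working in a randomly drawn area instead (the movers).
edges = np.quantile(ability, [0.2, 0.4, 0.6, 0.8])
home = np.searchsorted(edges, ability)
N = n_ind * T
ind = np.repeat(np.arange(n_ind), T)
away = rng.random(N) < 0.3
area = np.where(away, rng.integers(0, J, size=N), home[ind])
lnw = ability[ind] + true_area[area] + rng.normal(scale=0.1, size=N)

# Naive estimate: raw area means of ln w, relative to area 0.
naive = np.array([lnw[area == j].mean() for j in range(J)])
naive_rel = naive - naive[0]

# Fixed-effects estimate: OLS on individual dummies plus area dummies.
# The design is rank deficient by construction, but differences between
# area coefficients are identified as long as movers connect the areas.
X = np.zeros((N, n_ind + J))
X[np.arange(N), ind] = 1.0
X[np.arange(N), n_ind + area] = 1.0
coef, *_ = np.linalg.lstsq(X, lnw, rcond=None)
fe_rel = coef[n_ind:] - coef[n_ind]               # relative to area 0

print("true :", true_area)
print("naive:", np.round(naive_rel, 2))
print("FE   :", np.round(fe_rel, 2))
```

In this simulated setting the naive gaps mostly reflect who works where, while the within-individual comparison recovers something close to the assumed area effects.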
We start by estimating equation (20) with an increasing set of control variables to show how allowing for sorting across areas affects the magnitude of estimated area effects. To summarize the distribution of effects, we first report the percentage change in wages when moving between different parts of the area effects distribution: the minimum to maximum and to mean, mean to maximum, the 10th to the 90th percentile and the 25th to the 75th percentile. Table 1 reports results based on several different specifications. The first row reports results from equation (20) when only including time dummies (which gives the upper bound estimates of area effects). The second row reports results when the observable variables are a set of age dummies, a gender dummy and a set of one digit occupation dummies. The third row uses a set of age dummies, a gender dummy, two digit occupation dummies, industrial dummies (three digit SIC) and dummies for public sector workers, part time workers and whether the worker is part of a collective agreement. Results using individual fixed effects are reported in rows four and five. In row four, we simply include year dummies and individual effects. Row five uses individual effects, year dummies, age dummies and one digit occupation dummies.
⁷ Note that we track individuals in the panel for up to 10 years, so the wage changes we observe for individuals moving between areas are not just the one-off changes when the move occurs, but include longer-run changes (e.g. due to differences in wage growth rates between areas).
⁸ Theoretically, we could still allow for such year on year changes in TTWA specific effects, but estimating them requires movers in and out of all areas in every year which turns out to be too demanding given our sample sizes.
If we ignore the role of sorting (i.e. do not control for $\eta_i$ and $x_{it}$) then the differences in area average wages look quite large. Moving from the worst to the best area, average wages increase by just over 60%; from the minimum to the mean by a little under 20%; and from the mean to the maximum by just over 35%. Of course the minimum and maximum represent extremes of the distribution. The move from the 10th to the 90th percentile sees average wages increase by 22%, from the 25th to 75th percentile by just over 10%. Introducing a limited set of observable characteristics (row 2) to control for sorting substantially reduces estimated area differences. A larger set of individual characteristics (row 3) reduces the estimated differences further as does allowing for individual fixed effects, with or without additional observable characteristics (rows 4 and 5). Interpreting observed area differences as area effects considerably overstates the impact on wages that occurs as individuals move from bad to good areas. Specifically, across the comparisons we report, ignoring the role of sorting overstates area effects by a factor of three. It is worth noting that the comparison with the figures reported in Combes et al. (2008) suggests that sorting is even more important in driving area disparities in Britain than it is in France.
Their nearest equivalent to Table 1 above suggests that the max/min difference without any controls for individual effects is 74% (their table 3, p. 730) falling to 38% when sorting of workers across areas is taken into account. Our figures show max/min differences of 61.7% falling to 17.3% once we account for sorting. Similarly, their 90/10 differences start at 21% and fall to 14%-16% after controlling for sorting, while in Britain the corresponding figures from Table 1 are 22% falling to just 3.8%. 9 Evidently, sorting matters a lot in Britain, with a considerably larger share of the general disparity in wages between areas explained by sorting than in France. As discussed above, the contribution of area effects to overall (and area) wage disparities depends not only on the size of specific effects but also on the overall distribution of area effects and on the distribution of individuals across those areas. We suggested four alternative variance decompositions for measuring this contribution. Table 2 reports results for the contribution of area differences to overall and area wage disparities using these four measures. We calculate them from regression specifications using the same individual characteristics as in Table 1. For comparison, the second panel of the table reports the contribution of the individual characteristics using the different measures while the third panel reports the contribution of area effects to area disparities. In each panel, we also report the standard deviation of the estimated area effects (ˆ ) or individual effects (ˆ ) and their correlation with ln wages, to mimic the presentation given in Combes et al. (2008) Table 2 for France. As discussed above in equation (10), the BVS can be derived by dividing the standard deviation of the estimated effect by the standard deviation of ln wages and multiplying the result by the correlation between the estimated effect and ln wages. 
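The verbal recipe for the BVS can be written compactly as follows. The symbol $\hat{\theta}$ is a placeholder introduced here for whichever estimated effect (area or individual) is being considered; it is not the article's own notation. The first equality assumes that the BVS is the covariance share, which is what the recipe implies once the definition of a correlation coefficient is substituted in.

```latex
\mathrm{BVS}(\hat{\theta})
  = \frac{\operatorname{Cov}(\hat{\theta}, \ln w)}{\operatorname{Var}(\ln w)}
  = \operatorname{Corr}(\hat{\theta}, \ln w)\,
    \frac{\operatorname{sd}(\hat{\theta})}{\operatorname{sd}(\ln w)}
```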
The top panel of Table 2, column 1 shows that, even if we ignore the effects of sorting, and simply consider raw area differences in mean log wages, labour market areas only explain 6% of the overall variation in wages across individuals (remember that in the absence of controls for any individual characteristics the variance share of area effects is the same whichever way it is calculated). Moving across the columns in Table 2, panel 1, we progressively add controls for individual characteristics, culminating in a specification with individual fixed effects in columns 4 and 5. Looking down rows within the top panel shows how the variance share changes according to the way we estimate it: BVS, CVS and UVS. As discussed in section III, the correlated variance share provides an upper bound to the combined contribution of exogenous area effects plus those induced by area composition (i.e. the area mean individual characteristics). The uncorrelated variance share provides a lower bound to the contribution of exogenous area effects alone. The balanced variance share could be larger or smaller than the correlated variance share, depending on whether there is positive or negative assortative matching of individuals to areas. It is evident from the fact that BVS > CVS > UVS in Table 2, that we have positive sorting across areas. On any of these measures, the contribution of area effects is around 3% or less when controlling for basic observable characteristics (columns 2 and 3) and around 1% or less when controlling for unobservable individual characteristics (columns 4 and 5). In short, the most striking finding from Table 2 is that area effects only play a small role in explaining overall individual wage disparities.
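The ordering of the measures under positive sorting can be illustrated on simulated data. The sketch below rests on several assumptions that should be checked against section III: RVS is taken to be the R-squared from area dummies alone, BVS the covariance of the fitted area component with ln wages over the variance of ln wages, CVS the variance of the fitted area component over the variance of ln wages, and UVS the increase in R-squared when area dummies are added to a regression that already contains the individual characteristics. The function name `variance_shares`, the single characteristic `x` and all parameter values are hypothetical, and the data are synthetic rather than drawn from ASHE.

```python
import numpy as np

def variance_shares(y, x, area):
    """Area variance shares from OLS of y on x and area dummies (assumed definitions)."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(x, dtype=float).reshape(len(y), -1)
    _, idx = np.unique(np.asarray(area).ravel(), return_inverse=True)
    D = np.eye(idx.max() + 1)[idx]          # one dummy per area, no separate intercept
    ones = np.ones((len(y), 1))

    def fit(Z):
        coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
        r2 = 1.0 - np.var(y - Z @ coef) / np.var(y)
        return coef, r2

    _, r2_area = fit(D)                      # area dummies only
    _, r2_x = fit(np.hstack([ones, X]))      # individual characteristics only
    coef, r2_full = fit(np.hstack([X, D]))   # both together
    area_part = D @ coef[X.shape[1]:]        # fitted area effect for each individual
    return {
        "RVS": r2_area,
        "BVS": np.cov(area_part, y, bias=True)[0, 1] / np.var(y),
        "CVS": np.var(area_part) / np.var(y),
        "UVS": r2_full - r2_x,
    }

# Synthetic data with positive sorting: x is higher on average in high-effect areas.
rng = np.random.default_rng(0)
n_areas, n_obs = 50, 20000
gamma = rng.normal(0.0, 0.05, n_areas)       # true area effects
area = rng.integers(0, n_areas, n_obs)
x = 2.0 * gamma[area] + rng.normal(0.0, 0.4, n_obs)
y = x + gamma[area] + rng.normal(0.0, 0.2, n_obs)
shares = variance_shares(y, x, area)
```

Under these assumed definitions, UVS can never exceed CVS, and BVS exceeds CVS exactly when the fitted area component covaries positively with the fitted individual component, which is the pattern described above for Table 2.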
We have shown (in section III) that it is not, in general, possible to determine from these variance shares the exact contribution of sorting vs. the influence of area composition on area effects vs. the influence of area effects on area composition. However, comparison of the CVS and UVS measures for the estimated area effects is partly revealing. The gap between the UVS and the CVS in our preferred specifications with individual fixed effects (columns 4 and 5 of panel 1) suggests that area effects arising from area composition account for over half of everything that can be attributed to area effects. To see this, note that the difference between the CVS and the UVS for area effects is more than 50% of the CVS (or even of the BVS) in columns 4 and 5. Even so, this is still a trivial share of the overall individual disparity in wages: the difference between the CVS and the UVS is roughly 0.5% in our preferred specification in column 5.
In contrast to these results on area effects, the contribution of individual characteristics shown in the second panel of Table 2 is very large. Age, gender and occupation variables alone account for 54-58% of the individual variation in wages. Adding in individual fixed effects drives this share up to between 85% and 88% in the last column, implying that the contribution of individual characteristics is between 100 times (comparing BVS) and 850 times (comparing UVS) bigger than that of area effects. Comparison of the CVS and UVS for individual effects implies that area effects also contribute very little to the variance of individual effects on wages.
Comparing with the results in Combes et al. (2008) Table 2, it is again evident that more of the wage dispersion in Britain is attributable to individual characteristics, and less to area effects than is the case in France. Both the standard deviation of area effects (0.14 in France, 0.041 in Britain), and the correlation of these with ln wages (0.34 in France, 0.12 in Britain) are substantially higher in France. Conversely, both the standard deviation of individual effects (0.29 in France, 0.49 in Britain) and the correlation with ln wages (0.80 in France, 0.94 in Britain) are higher in Britain. The BVS for area effects from their work (derived from their table 2) is 13%, whereas the BVS for individual effects is 64%.
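As a rough consistency check, the France figures quoted above can be plugged into the relationship between the BVS, the standard deviation of the effect and its correlation with ln wages. Rearranging gives the standard deviation of ln wages implied by each set of figures. The helper `implied_sd_log_wage` is a hypothetical name, and the inputs are the rounded values quoted in the text, so only approximate agreement should be expected.

```python
def implied_sd_log_wage(sd_effect, corr, bvs):
    """sd(ln w) implied by BVS = corr * sd_effect / sd(ln w)."""
    return sd_effect * corr / bvs

# France, area effects: sd 0.14, correlation 0.34, BVS 13%.
sd_from_area = implied_sd_log_wage(0.14, 0.34, 0.13)
# France, individual effects: sd 0.29, correlation 0.80, BVS 64%.
sd_from_individual = implied_sd_log_wage(0.29, 0.80, 0.64)
```

Both calculations imply a standard deviation of ln wages of about 0.36, so the two quoted BVS figures for France are mutually consistent up to rounding.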
The final panel of the table shows that area effects as defined here play a somewhat more important role in explaining the area disparities captured by the simple between-area variance share (the RVS in Table 2). When controlling for basic characteristics, sorting accounts for a little over half of the observed area disparities, leaving area effects to account for around 48%. Once we control for individual fixed effects, the upper bound estimate of the contribution of area effects to area disparities is considerably smaller, at a little over 10%, with the lower bound estimate around 1%. Even so, the dominant contribution to area disparities (one minus the share reported in the final panel of Table 2) must come from individual characteristics and the way these are distributed across areas.
Table 2 has shown these decompositions based on pooling all years in our data set. Table 3 investigates the stability of the contribution of area effects over time and shows that this contribution has been stable. For each year, the table reports the four area variance shares, controlling for the fullest possible set of individual characteristics. For comparison, the first row reports results when pooling across years (taken from Table 2). Repeating other results from Table 2 by year gives figures that are similarly stable across years. As discussed above, it is this stability of area effects across time, combined with the importance of individual effects, which leads us to view the specifications with individual fixed effects and constant-across-time area effects, reported in the final two columns of Table 2, as our preferred specifications.

V. Conclusions
This article has reviewed a number of ways to decompose the variance of wages into the contributions of individual and area-specific effects. The article collates the methods present in the literature, discusses how different variance decompositions are related to each other and how they should be interpreted and, finally, highlights that a clear-cut division of variation into components is, strictly speaking, impossible: whatever method a researcher chooses, assumptions and caveats will remain. In particular, we have discussed decompositions that we named RVS, BVS, CVS and UVS. We have shown why disentangling the contribution of area effects to the wage distribution is complex, and why an approach based on bounds such as RVS, BVS, CVS and UVS can only provide guidance on the relative importance of area effects.
The RVS is the simple between-area share of the total variance and provides an overall upper bound for the contribution of area effects to total wage variation. It counts the average differences in worker characteristics across areas as part of the area effect. For Britain, this measure has persistently accounted for roughly 6% of the overall wage variation for over a decade. The CVS excludes the direct contribution of sorting, but captures the effect that sorting may have on the variance of the area effect. We interpreted the CVS as an upper bound to the combined contribution of exogenous area effects plus interactions-based spillovers that are linked to the mean characteristics of individuals in each area. The CVS in Britain has been around 2%, but less than 1% if individual fixed unobservable effects are controlled for. The alternative BVS decomposition that has appeared in previous work (e.g. Abowd et al., 1999; Combes et al., 2008) yields a slightly higher estimate of the contribution of area effects in our setting, because there is positive assortative matching of individuals to areas (i.e. individual wage effects are correlated with area wage effects). However, in general, this measure is quite hard to interpret, as it mixes up the variance attributable to areas with the covariance between area effects and area-mean individual effects. We argue that the CVS is a more natural indicator. Finally, compared to the CVS, the UVS further excludes the effects that sorting may have on the variation of the area effects, and as such provides a lower bound to the contribution of exogenous area effects. With this measure, the area effects explain 1.5% of wage variation in Britain. If individual fixed effects are included, the effect shrinks to less than 0.1%, and this should be considered as the lower bound of the effect of area on the variation of wages. These figures are small relative to those found in France using comparable estimates in Combes et al. (2008), where (according to the BVS indicator) area disparities account for 13% of the variance in log wages.
Whichever estimate we use, our general finding is that most of the observed regional inequality in average wages in Britain is explained by 'sorting' or 'people' rather than 'places'. Our preferred estimates, which include the individual fixed effects, suggest that the contribution of individual characteristics to variation in wages is between 100 and 850 times larger than the contribution of area effects.
where the first line follows from the full variance decomposition equation (4) and so the R-squared and partitioned regression methods are equivalent.