Counterfactual decomposition of changes in wage distributions using quantile regression



We propose a method to decompose the changes in the wage distribution over a period of time into several factors contributing to those changes. The method is based on the estimation of marginal wage distributions consistent with a conditional distribution estimated by quantile regression as well as with any hypothesized distribution for the covariates. Comparing the marginal distributions implied by different distributions for the covariates, one is then able to perform counterfactual exercises. The proposed methodology enables the identification of the sources of the increased wage inequality observed in most countries. Specifically, it decomposes the changes in the wage distribution over a period of time into several factors contributing to those changes, namely by discriminating between changes in the characteristics of the working population and changes in the returns to these characteristics. We apply this methodology to Portuguese data for the period 1986–1995, and find that the observed increase in educational levels contributed decisively towards greater wage inequality. Copyright © 2005 John Wiley & Sons, Ltd.


The last decade has witnessed an increased interest by economists in the analysis of wage inequality. To a large extent, such interest arises from the fact that the tendency towards a reduction in wage inequality that had prevailed during the previous decades experienced a reversal during the 1980s. This is so not only for unconditional inequality, but also for wage inequality within groups defined based on education and experience (e.g. Juhn et al., 1993). Although the original concern was documented for the United States (Levy and Murnane, 1992), increasing inequality is now well documented for almost all of the industrialized countries (Gottschalk and Smeeding, 1997).

A leading explanation for these changes in the structure of pay asserts that increases in wage inequality were caused by shifts in labour demand favouring high-skilled labour at the expense of low-skilled labour, primarily caused by changes in technology, notably by the use of computers (Juhn et al., 1993; Bound and Johnson, 1992; Autor et al., 1998). The evidence supporting this hypothesis is rather indirect, and is based on the observation that despite the steady increase over time of the relative supply of high-skilled labour, conventional Mincerian wage equations indicate a rise in the returns to schooling. Ceteris paribus, increases in the number of college graduates would decrease the relative wage of college graduates, thereby decreasing inequality. To reconcile the increase in the skills of the labour force with increasing returns to education and wage inequality, one is naturally led to conclude that labour demand must have shifted to more than compensate for the shifts in supply. The skill-biased technical change explanation was recently detailed by Bresnahan (1999), who stresses the organizational changes induced by the information and communication technologies and their direct impact on the relative demand for certain skills, namely ‘people’ and managerial skills.

The usual supply and demand framework typically employed to analyse changes in the structure of wages implicitly assumes that to each type of labour (say, skilled and unskilled), and after controlling for measured characteristics, there corresponds a single wage, as opposed to a distribution of wages. Perhaps more correctly, the conventional supply and demand analysis focuses only on the average wage. This is consistent with the conventional empirical approach of estimating Mincerian wage equations by least squares methods, which provide estimates of the effect of education (along with the effect of other attributes) upon the mean of the conditional wage distribution. Reasoning in the context of this framework, Topel (1997, p. 72) was led to assert that ‘[a] reason for interest in supply effects stems from the hope that human capital investment will mitigate future inequality’, while Johnson (1997, p. 53) suggested that ‘a long-term commitment to increasing greatly the fraction of individuals who go to college is the appropriate public policy response to the phenomenon of increasing inequality’.

However, recent research employing statistical methods other than the least squares regression (namely, quantile regression) has revealed that education has a greater effect upon the wages of individuals at the top of the wage distribution than upon wages of individuals at the bottom of that distribution.1 In other words, more educated individuals experience more unequal wage distributions, and this seems to have been exacerbated during the 1980s. These results suggest that education may have a second effect on wage inequality. By increasing the number of educated workers, a pressure is certainly exerted that decreases the wages of these workers. But, if more educated individuals experience greater wage spreads, increased educational levels may also contribute to an increase in wage inequality. The net result of these two effects is certainly an empirical question and constitutes the major motivation for this paper.

A full understanding of the changes that have occurred in wage schedules requires the disentanglement of the effect of changes in the stock of human capital in the working population from the effects of changes in returns to the components of human capital. We propose a method that extends the traditional Oaxaca decomposition of effects on mean wages (Oaxaca, 1973) to the entire wage distribution. The method is based on the estimation of the marginal density function of wages in a given year implied by counterfactual distributions of some or all the observed attributes. This methodology is applied to the analysis of the changes in the distribution of wages in Portugal from 1986 to 1995. For instance, we will estimate the wage density that would have prevailed in 1995 if education had been distributed as in 1986 and the other covariates as in 1995. By comparing it with the actual marginal distribution in 1995, we can isolate the contribution of changes in education to the observed changes in the distribution of wages.

The counterfactual nature of the exercise requires the estimation of the wage distribution conditional on the variables of interest. We accomplish this first step by means of quantile regressions, that is, by estimating models for the quantiles of the conditional wage distribution. Quantile regressions capture the impact of changes in covariates upon a conditional wage distribution, in very much the same way that mean regression measures the impact of changes in covariates upon the mean of the conditional wage distribution.

However, we wish to go beyond a mere conditional model. Indeed, a conditional distribution does not reflect the variability of the covariates in the population. In other words, it is the distribution that would prevail if all workers had the same observed characteristics. The second step of our approach, and its major methodological contribution, is thus to marginalize the conditional distribution estimated in the previous step using different scenarios for the distribution of workers' attributes. The basic gist of our approach resembles DiNardo et al. (1996) in that their methodology also estimates counterfactual densities and yields a decomposition of the factors that explain the changes in the marginal distribution of wages. However, as we shall see in Section 2 below, not only are the specifics of the two approaches quite distinct, but they also provide different insights into the factors behind observed changes in the distribution of wages.

The paper proceeds as follows. We present the methodology used to analyse the changes in wages in Section 2. Section 3 provides an application of the methodology to Portuguese wage data [Section 3.1 overviews the changes in the Portuguese labour market for the period 1986–1995, focusing on the changes in the wage distribution and in the characteristics of the labour force; we then proceed to present and discuss our empirical findings in Section 3.2]. Section 4 concludes.


2.1. Conditional Wage Distributions

Let Qθ(ω|z) for θ ∈ (0, 1) denote the θth quantile of the distribution of the (log) wage, ω, given a vector, z, of covariates. As is customary in the empirical analysis of earnings functions, z includes a constant, a gender dummy, the level of schooling, age, age squared, tenure and tenure squared. We model these conditional quantiles by

$$Q_\theta(\omega \mid z) = z'\beta(\theta) \qquad (1)$$

where β(θ) is a vector of coefficients, the quantile regression (QR) coefficients.

For given θ∈(0, 1), β(θ) can be estimated by minimizing in β (Koenker and Bassett, 1978)

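$$\frac{1}{n}\sum_{i=1}^{n} \rho_\theta\left(\omega_i - z_i'\beta\right)$$

with

$$\rho_\theta(u) = \begin{cases} \theta u & \text{for } u \ge 0 \\ (\theta - 1)\,u & \text{for } u < 0 \end{cases}$$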
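As an illustration of the check-function objective of Koenker and Bassett (1978), the following minimal sketch evaluates it for a model containing only a constant, in which case the minimizer is a sample quantile. The function names and the data are hypothetical and are not taken from the paper; a model with covariates would require a linear programming routine instead of the simple search used here.

```python
def check_loss(u, theta):
    """Check function: theta * u if u >= 0, (theta - 1) * u otherwise."""
    return theta * u if u >= 0 else (theta - 1.0) * u


def qr_objective(b, wages, theta):
    """Average check loss of the residuals from a constant-only model."""
    return sum(check_loss(w - b, theta) for w in wages) / len(wages)


def fit_constant_quantile(wages, theta):
    """Minimize the objective over the observed wages.

    With a constant as the only regressor, a minimizer always exists among
    the sample points, so searching over them suffices for this sketch.
    """
    return min(sorted(wages), key=lambda b: qr_objective(b, wages, theta))


# Made-up values (not the paper's data).
log_wages = [3, 1, 4, 1, 5, 9, 2, 6, 5]
median_fit = fit_constant_quantile(log_wages, 0.5)
upper_fit = fit_constant_quantile(log_wages, 0.9)
print(median_fit, upper_fit)
```

The asymmetric weights are what move the fit away from the median: for θ = 0.9, negative residuals are penalized lightly, so the minimizer sits high in the sample.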

For details on the asymptotic inference procedures about the coefficients β(θ) used in the paper, see Koenker and Bassett (1978, 1982) and Hendricks and Koenker (1992).

From the point of view of our study, the most important aspect to emphasize is that the conditional quantile process—i.e. Qθ(ω|z) as a function of θ∈(0, 1)—provides a full characterization of the conditional distribution of wages in much the same way as ordinary sample quantiles characterize a marginal distribution (Bassett and Koenker, 1982, 1986). But, of course, the estimated QR coefficients are also quite interesting as they can be interpreted as rates of return (or ‘prices’) of the labour market skills at different points of the conditional wage distribution.

An important maintained assumption is the linearity of the quantile regression model. For instance, it will hold exactly in a conditional location–scale model where both location and scale depend linearly on the covariates. In more general settings (1) should be regarded as a ‘reasonable’ approximation. Additional flexibility may easily be achieved, without sacrificing the semiparametric nature of the approach, by assuming that the θ conditional quantile is a given transformation (which may depend on θ) of an affine function of the covariates (see Buchinsky, 1995 or Machado and Mata, 2000).

2.2. Marginal Densities of Wages

The second step of our approach involves estimating the marginal density function of wages. The difficulty lies in estimating a marginal density that is consistent with the conditional distribution defined by (1). To see the point more clearly, notice that it would be possible to estimate a marginal wage density directly from the data on wages. That density, however, would not necessarily conform to the conditional distribution modelled by (1) and, consequently, would not allow us to perform a counterfactual analysis.

The basic idea underlying our estimation of the wage density is the well-known probability integral transformation theorem from elementary statistics: if U is a uniform random variable on [0,1], then F−1(U) has distribution F. Thus, if θ1, θ2, …, θm are drawn from a uniform (0,1) distribution, the corresponding m estimates of the conditional quantiles of wages at z, $\{z'\hat{\beta}(\theta_i)\}_{i=1}^{m}$, constitute a random sample from the (estimated) conditional distribution of wages given z.
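The probability integral transformation argument can be sketched in a case where the conditional quantile function is known in closed form. The example assumes a Gaussian location-scale model in which both location and scale are linear in z, so that Qθ(ω|z) = z′γ + (z′δ)Φ⁻¹(θ) is exactly linear in z. The coefficient vectors `gamma` and `delta` and the covariate values are hypothetical stand-ins for estimated QR coefficients.

```python
import random
from statistics import NormalDist, mean, stdev


def conditional_quantile(z, gamma, delta, theta):
    """Q_theta(w|z) = z'gamma + (z'delta) * Phi^{-1}(theta).

    Equivalent to z'beta(theta) with beta(theta) = gamma + delta * Phi^{-1}(theta).
    """
    location = sum(zj * gj for zj, gj in zip(z, gamma))
    scale = sum(zj * dj for zj, dj in zip(z, delta))
    return location + scale * NormalDist().inv_cdf(theta)


def simulate_conditional(z, gamma, delta, m, rng):
    """Draw m wages from the conditional distribution at z via uniform draws."""
    draws = []
    while len(draws) < m:
        theta = rng.random()
        if theta > 0.0:  # inv_cdf requires theta strictly inside (0, 1)
            draws.append(conditional_quantile(z, gamma, delta, theta))
    return draws


rng = random.Random(0)
z = [1.0, 6.0]        # constant and years of schooling (hypothetical worker)
gamma = [5.5, 0.08]   # assumed location coefficients
delta = [0.3, 0.02]   # assumed scale coefficients
sample = simulate_conditional(z, gamma, delta, 20000, rng)
print(round(mean(sample), 2), round(stdev(sample), 2))
```

The simulated sample should have mean close to z′γ = 5.98 and standard deviation close to z′δ = 0.42, which is what a draw from the conditional distribution at this z implies.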

The theoretical underpinnings of this procedure are quite simple. On the one hand, as we have mentioned, the probability integral transformation theorem implies that one is simulating a sample from the (estimated) conditional distribution at the given z. On the other hand, the results in Bassett and Koenker (1982, 1986) establish that, under regularity conditions, the estimated conditional quantile function is a strongly consistent estimator of the population quantile function, uniformly in θ on a compact interval in (0, 1), that is, $\sup_{\theta \in [\xi,\, 1-\xi]} |z'\hat{\beta}(\theta) - Q_\theta(\omega \mid z)| \to 0$ almost surely, for some ξ > 0.

To ‘integrate z out’ and get a sample from the marginal, instead of keeping z fixed at a given value, we draw a random sample of the covariates from an appropriate distribution.2 To motivate this procedure, consider the following example where, for simplicity, we take the wages to be discrete and consider only one covariate, say the gender indicator. Let Y represent the outcome of the following random experiment: first, draw a value for Z from the probability function g(z) and denote it by z0; second, draw a wage ω from the conditional distribution corresponding to the value of Z previously drawn, fω|Z(ω|z0). Then, the probability function of Y is P(Y = y) = fω|Z(y|0)g(0)+ fω|Z(y|1)g(1) which, obviously, equals fω(y), the marginal probability of the wage being y.
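The two-stage experiment just described can be checked numerically. In this minimal sketch the probabilities in `g` and `f_cond` are made up for illustration: the code first computes the mixture fω|Z(y|0)g(0) + fω|Z(y|1)g(1) exactly and then compares it with the relative frequencies obtained by drawing Z first and the wage second.

```python
import random
from collections import Counter

# Hypothetical probability functions: g for gender, f_cond for wages given gender.
g = {0: 0.6, 1: 0.4}
f_cond = {
    0: {1: 0.2, 2: 0.5, 3: 0.3},
    1: {1: 0.5, 2: 0.4, 3: 0.1},
}


def exact_marginal(g, f_cond):
    """Marginal wage probabilities: sum over z of f(y|z) * g(z)."""
    marginal = Counter()
    for z, pz in g.items():
        for y, py in f_cond[z].items():
            marginal[y] += pz * py
    return dict(marginal)


def draw(pmf, rng):
    """Draw one value from a discrete probability function."""
    values = list(pmf)
    return rng.choices(values, weights=[pmf[v] for v in values], k=1)[0]


def simulate_marginal(g, f_cond, n, rng):
    """Two-stage experiment: draw z0 from g, then a wage from f(.|z0)."""
    counts = Counter()
    for _ in range(n):
        z0 = draw(g, rng)
        counts[draw(f_cond[z0], rng)] += 1
    return {y: c / n for y, c in counts.items()}


exact = exact_marginal(g, f_cond)
simulated = simulate_marginal(g, f_cond, 50000, random.Random(0))
print(exact, simulated)
```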

The next two subsections present in detail the procedures both for the case of the estimated marginal in a given year and for the counterfactual marginals.

Marginal Densities Implied by the Conditional Model

Let ω(t), z(t), t = 0, 1 denote wages and the (k) covariates at date t (t = 0 corresponds to 1986 and t = 1 to 1995). Denote by g(z;t) the joint density of the covariates at time t. We wish to generate a random sample from the wage density that would prevail in t if model (1) were true and the covariates were distributed as g(z;t). Our approach may be described as follows:

  1. Generate a random sample of size m from a U[0, 1]: u1, …, um.
  2. For the data set at time t (denoted by Z(t), an nt × k matrix of data on the covariates) and each {ui} estimate
     $$Q_{u_i}(\omega \mid z; t)$$
     yielding m estimates of the QR coefficients $\hat{\beta}^t(u_i)$.
  3. Generate a random sample of size m with replacement from the rows of Z(t), denoted by $\{z_i^*(t)\}_{i=1}^{m}$.
  4. Finally
     $$\{\omega_i^*(t) \equiv z_i^*(t)'\hat{\beta}^t(u_i)\}_{i=1}^{m}$$
     is a random sample of size m from the desired distribution.
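The four steps above can be sketched for a deliberately simple special case. The example assumes a single binary covariate (a gender dummy) plus a constant, so the model is saturated and the QR fit at each θ reduces to within-group sample quantiles; with the full covariate vector a proper quantile regression routine would take the place of `fit_qr_gender_dummy`. All names and data are hypothetical.

```python
import math
import random


def sample_quantile(values, theta):
    """Order statistic at which the empirical c.d.f. first reaches theta."""
    ordered = sorted(values)
    k = min(len(ordered) - 1, max(0, math.ceil(theta * len(ordered)) - 1))
    return ordered[k]


def fit_qr_gender_dummy(wages, female, theta):
    """QR coefficients (intercept, dummy effect) for z = (1, female).

    Assumption: saturated model, so fitted quantiles are group quantiles.
    """
    q_men = sample_quantile([w for w, d in zip(wages, female) if d == 0], theta)
    q_women = sample_quantile([w for w, d in zip(wages, female) if d == 1], theta)
    return q_men, q_women - q_men


def simulate_marginal(wages, female, covariate_rows, m, rng):
    """Steps 1 to 4: draw m wages from the marginal implied by the model."""
    simulated = []
    for _ in range(m):
        u = rng.random()                                  # step 1: u_i ~ U[0, 1]
        b0, b1 = fit_qr_gender_dummy(wages, female, u)    # step 2: QR at u_i
        z_star = rng.choice(covariate_rows)               # step 3: resample a row
        simulated.append(b0 + b1 * z_star)                # step 4: z*' beta(u_i)
    return simulated


# Made-up data for one year (not the paper's data).
wages = [5.8, 6.0, 6.4, 6.9, 5.6, 5.9, 6.1]
female = [0, 0, 0, 0, 1, 1, 1]
draws = simulate_marginal(wages, female, female, 500, random.Random(1))
print(len(draws), min(draws), max(draws))
```

Passing the covariate rows of a different year as `covariate_rows` while keeping `wages` and `female` from the current year gives the corresponding counterfactual sample.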

Counterfactual Densities

We are interested in two types of counterfactual exercises. On the one hand, we want to estimate the density function of wages in 1995, corresponding to the 1986 distribution of covariates. On the other hand, we also need the density of wages in 1995 if only one covariate was distributed as in 1986.

To generate a random sample from the (marginal) wage distribution that would have prevailed in 1995 (t = 1) if all covariates had been distributed as in 1986 (t = 0) (and, of course, workers had been paid according to the 1995 wage schedule), just follow the algorithm above but drawing the bootstrap sample of the third step from the rows of Z(0).

Let y(t) (say, gender) denote one particular covariate of interest at time t. We now show how to simulate the (marginal) wage distribution that would have prevailed in 1995 if y had been distributed as in 1986 and the other covariates as in 1995. It is convenient to partition the space of y(t) into J classes, C1(t), …, CJ(t). For gender or education these partitions are obvious: just take the two genders or the educational levels. For the continuous random variables, age and tenure, the classes may be the bins of a histogram of y(t). Denote by fj(t), j = 1, …, J the relative frequency of class Cj(t). Herein, we have used, for tenure and age, the interdecile ranges so that fj(t) = 0.1, j = 1, …, 10. The procedure runs as follows (the parenthetical remarks illustrate the procedure for gender):

  1. Follow steps 1 to 4 as above to generate $\{\omega_i^*(1)\}_{i=1}^{m}$, a random sample of size m of the wage density at t = 1.
  2. Take one class, say C1(1). (Consider men.)
     (a) Let I1 = {i = 1, …, m | yi(1) ∈ C1(1)}; select the subset of the random sample generated in step 1 corresponding to I1, i.e. $\{\omega_i^*(1)\}_{i \in I_1}$. (Select the subsample of men from the 1995 estimated marginal density.)
     (b) Generate a random sample of size m × f1(0) with replacement from $\{\omega_i^*(1)\}_{i \in I_1}$. (Generate a random sample of size equal to the number of men in 1986 from the subsample of men in 1995.)
  3. Repeat step 2 for j = 2, …, J. (Repeat step 2 for women.)

The basic idea of the procedure is quite simple. Indeed, the distribution of wages that would have prevailed in 1995 if gender (say) had been distributed as in 1986 is

$$\int f_1(\omega \mid y)\, dF_0(y)$$

with f1(ω|y) denoting the distribution of wages in 1995 for gender y and F0(y) the 1986 distribution of gender. The 1995 marginal distribution of wages is a mixture of the wage distribution for men and the wage distribution for women, with weights equal to the proportions of men and women in the 1995 workforce. Our procedure thus amounts to a change of weights in that mixture, namely to the use of the proportions in the 1986 workforce.
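The change of mixture weights can be sketched as a class-by-class resampling of an already simulated 1995 sample. The function name, the class labels and the numbers below are hypothetical. Rounding m × fj(0) to the nearest integer is an assumption of this sketch, since the text does not say how non-integer sizes are handled.

```python
import random


def reweight_one_covariate(sim_wages, sim_classes, freq_1986, rng):
    """Resample simulated 1995 wages class by class using 1986 class shares.

    sim_wages[i] is a simulated 1995 wage and sim_classes[i] the class of the
    covariate of interest for that draw; freq_1986 maps class -> f_j(0).
    """
    m = len(sim_wages)
    counterfactual = []
    for cls, share in freq_1986.items():
        pool = [w for w, c in zip(sim_wages, sim_classes) if c == cls]
        size = round(m * share)  # m x f_j(0), rounded (assumption)
        if pool and size > 0:
            # Draw with replacement from the 1995 subsample of this class.
            counterfactual.extend(rng.choices(pool, k=size))
    return counterfactual


# Made-up simulated 1995 sample: 60% men, 40% women.
sim_wages = [6.0] * 60 + [5.5] * 40
sim_classes = ["men"] * 60 + ["women"] * 40
freq_1986 = {"men": 0.66, "women": 0.34}   # 1986 shares, as in Table I
cf = reweight_one_covariate(sim_wages, sim_classes, freq_1986, random.Random(0))
print(cf.count(6.0), cf.count(5.5))
```

Within each class the wage distribution is left untouched; only the relative size of the classes changes, which is exactly the reweighting of the mixture.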

2.3. Decomposing the Changes in the Wage Density

Denote by f(ω(t)) an estimator of the marginal density of ω (the log wages) at t based on the observed sample {ωi(t)} and by f*(ω(t)) an estimator of the density of ω at t based on the generated sample $\{\omega_i^*(t)\}$, that is, what we have been calling the marginal implied by the model. The counterfactual densities will be denoted by f*(ω(1);Z(0)), for the density that would result in t = 1 (1995) if all covariates had their t = 0 distributions, and f*(ω(1);y(0)), for the wage density in t = 1 if only factor y were distributed as in t = 0.

We analyse the changes from f(ω(0)) to f(ω(1)) by comparing

  • f*(ω(1);Z(0)) with f*(ω(0)): the contribution of the QR coefficients to the overall change; and

  • f*(ω(1)) with f*(ω(1);Z(0)): the contribution of the covariates to the changes in the wage density.

To gauge these changes we resort to the usual summary statistics. Let α(·) be one such statistic (for instance, quantile, scale measure or concentration index). We then have the decomposition of the changes in α:

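$$\alpha(f(\omega(1))) - \alpha(f(\omega(0))) = \underbrace{\alpha(f^*(\omega(1); Z(0))) - \alpha(f^*(\omega(0)))}_{\text{coefficients}} + \underbrace{\alpha(f^*(\omega(1))) - \alpha(f^*(\omega(1); Z(0)))}_{\text{covariates}} + \text{residual}$$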

In the same way, we may measure the contribution of an individual covariate by looking at indicators such as

$$\alpha(f^*(\omega(1))) - \alpha(f^*(\omega(1); y(0)))$$
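The decomposition of a summary statistic can be sketched as follows, treating each density as represented by a sample (observed or simulated). The function names are hypothetical, the interdecile range is only one possible choice of α, and the samples are made up. The residual is defined here as whatever part of the observed change is not captured by the two model-based terms.

```python
from statistics import quantiles


def interdecile_range(sample):
    """One possible alpha: ninth decile minus first decile."""
    deciles = quantiles(sample, n=10)
    return deciles[8] - deciles[0]


def decompose(alpha, obs0, obs1, sim0, sim1, sim1_z0):
    """Split alpha(f(w(1))) - alpha(f(w(0))) into three parts.

    obs0, obs1: observed samples; sim0, sim1: samples from the marginals
    implied by the model; sim1_z0: sample from f*(w(1); Z(0)).
    """
    total = alpha(obs1) - alpha(obs0)
    coefficients = alpha(sim1_z0) - alpha(sim0)
    covariates = alpha(sim1) - alpha(sim1_z0)
    return {
        "total": total,
        "coefficients": coefficients,
        "covariates": covariates,
        "residual": total - coefficients - covariates,
    }


def covariate_contribution(alpha, sim1, sim1_y0):
    """Contribution of one covariate y: alpha(f*(w(1))) - alpha(f*(w(1); y(0)))."""
    return alpha(sim1) - alpha(sim1_y0)


# Made-up samples, using the sample range as a simple alpha.
sample_range = lambda s: max(s) - min(s)
parts = decompose(sample_range, [1, 2, 3], [1, 3, 6], [1, 2, 3], [1, 3, 5], [1, 2, 4])
print(parts)
```

By construction the three components add up to the total change, so the size of the residual indicates how well the model-implied marginals track the observed ones.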

The analysis thus far assumes that changes from 1986 to 1995 took place in a given order. Of course, this is quite arbitrary. For instance, one might wish to compare the wage distribution in 1995 with the one that would have prevailed in 1986 if all the covariates had been distributed as in 1995 (i.e., f*(ω(1)) with f*(ω(0);Z(1))) or, still, this latter density with the 1986 marginal (f*(ω(0);Z(1)) with f*(ω(0))). These two comparisons would provide alternative measures of, respectively, the contribution of the changes in the QR coefficients and the contribution of the changes in the distribution of covariates. One must ensure, therefore, that, qualitatively, the conclusions drawn with one decomposition are resilient to changes in the order of that decomposition.3

2.4. Alternative Methodologies

Blau and Kahn (1996) also resort to the estimation of counterfactual wage densities to perform international comparisons of male wage inequality. Their analysis is carried out in the context of the usual regression model, ω = z′β + σε, where σ is a positive constant, ε an unobserved random variable with mean 0 and variance 1 and statistically independent of z, and all the other symbols are as above. The wage distribution that would prevail in country j if workers were paid according to the estimated US wage schedule and the US residual standard deviation can, then, be constructed by using the covariates and the estimated ε's for country j in conjunction with the parameters estimated for the USA. If the conditional distribution of (log) wages given the chosen covariates belongs to a translation family the estimation of several conditional quantiles is redundant because they will all be parallel (i.e., they will yield the same estimates for the coefficients on the regressors). In such a case, the linear conditional location framework used in Blau and Kahn is entirely valid, and ours introduces unnecessary complication. However, the translation family hypothesis is very restrictive. Whenever it is deemed inappropriate—either because there is heteroscedasticity (i.e., σ should depend on z) or because other attributes of the wage distribution also depend on the covariates—the construction of counterfactual densities requires more general methodologies and the flexibility provided by QR may prove quite useful.

As we have mentioned in the Introduction, our empirical strategy bears resemblance to that of DiNardo et al. (1996). Their approach is essentially based on nonparametric weighted-kernel methods. We depart from their methodology on two counts: the cornerstone of our method is a parametric model for the quantiles of the conditional distribution; and we resort to resampling procedures to obtain a marginal distribution consistent with both the conditional model and the covariate densities. We claim no global superiority of our approach. Certainly, resorting to a parametric model is necessarily restrictive. Yet, this weakness buys some additional information. Indeed, it enables the identification, in the changes in the wage density that are not explained by the changes in the distribution of the covariates (coined ‘unexplained’ in DiNardo et al., 1996), of the part that is due to the changes in the quantile regression coefficients; that is, in the rates of return to human capital. Consequently, our approach decomposes the change in distribution of wages that is explained by the statistical model in part due to changes in the distribution of workers' attributes and, also, due to changes in returns to those attributes.

Recently, there has been a surge of methodologies that also yield counterfactual comparisons and share the semiparametric nature with ours. One such alternative is the estimator of conditional distributions proposed by Donald et al. (2000) where covariates are introduced using a proportional-hazards model and the regressor coefficients are allowed to vary discretely with wages. Although the flexible formulation of the regressors eases the strong restrictions of the proportional-hazards model, it certainly does not eliminate them. Also, the implementation of the method may be difficult as it requires the choice of the right partition of the endogenous variable space.

Fortin and Lemieux (2000) present still another way of performing counterfactual analysis and apply it to the changes of the gender wage gap. They model the (marginal) probability that the wage exceeded prespecified cutoff values as an ordered probit. The method is ingenious and easy to use, and delivers insightful decompositions of the observed changes in the wage distributions. To make the comparison with ours clearer, one may think of their approach as replacing the assumption of linear conditional quantile functions by the assumption that a transformation (the inverse standard normal c.d.f., to be specific) of the rank of a given wage (in the empirical wage distribution) is a linear function of the covariates. As Koenker and Hallock (2000) remark, the conditional quantile assumption appears more natural from a statistical point of view, if only because it nests the classic linear location shift model.

Like ours, the approach of Gosling et al. (2000) uses quantile regressions to model the conditional distribution. However, the two differ in the method used to construct the unconditional distributions implied by the conditional model. Having estimated the conditional quantiles, they derive the unconditional quantiles in three steps: first, the conditional quantile function is inverted to produce the conditional c.d.f.; second, the conditional c.d.f. is averaged with respect to the empirical distribution of the covariates to yield the unconditional distribution function; finally, the unconditional c.d.f. is inverted again to produce the respective quantiles. This represents, in fact, a feasible alternative to our method of estimating unconditional wage distributions from a given conditional quantiles function. For instance, if in the second step above one takes a distribution other than the empirical, the resulting marginal could be used for the counterfactual comparisons constructed in this paper. We feel, however, that our approach is simpler.

One criticism that is often raised against the quantile regression-based approach is the potential lack of monotonicity of estimated conditional quantile functions (see e.g. Donald et al., 2000, p. 616). When evaluated at the sample mean of the covariates, the conditional quantile functions can be shown to be nondecreasing on (0, 1) (Bassett and Koenker, 1982, theorem 2.1). For other values of the covariates it is, indeed, true that this property need not hold and crossings may occur.4 However, since the estimated quantile function is consistent for its population counterpart (Bassett and Koenker, 1982, theorem 3.2; 1986, theorem 3.1), for any two values θ and θ′, θ < θ′, for which, in the population, Qθ(ω|z) < Qθ′(ω|z), it must necessarily be the case that, for a sufficiently large sample size, the empirical quantile functions satisfy Q̂θ(ω|z) < Q̂θ′(ω|z) (with probability one). The theory, therefore, predicts that the potential violations of monotonicity will be smaller the larger the sample size and the less dense the set of θ's in (0, 1) used to evaluate the conditional quantiles.

To evaluate the potential impact of this problem in our data set, we have estimated quantiles at 200 equally spaced points in (0, 1), conditional on six ‘extreme’ covariate combinations.5 In this experiment, the violations of monotonicity were relatively infrequent (never more than 3%). Moreover, the impact of those crossings on the conclusions is likely to be minor because the differences in the corresponding quantiles of the log wage distribution were quite small, always smaller than 0.01 for a variable whose average magnitude is about 6 with a standard deviation of roughly 0.5.
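A check of this kind can be sketched as follows. The function takes the fitted conditional quantiles at an increasing grid of θ values for one fixed covariate combination and reports the share of adjacent pairs that violate monotonicity together with the largest drop. Counting violations over adjacent grid points is an assumption of this sketch; the text does not spell out how the reported frequencies were computed, and the input values below are made up.

```python
def crossing_summary(quantile_estimates):
    """Summarize monotonicity violations along a grid of increasing theta.

    quantile_estimates[k] is the fitted conditional quantile at the k-th grid
    point for a fixed z. Returns (share of adjacent pairs that decrease,
    largest decrease).
    """
    drops = [
        lower - upper
        for lower, upper in zip(quantile_estimates, quantile_estimates[1:])
        if upper < lower
    ]
    n_pairs = len(quantile_estimates) - 1
    share = len(drops) / n_pairs if n_pairs > 0 else 0.0
    return share, max(drops, default=0.0)


# Made-up fitted quantiles with one crossing between the third and fourth points.
share, largest_drop = crossing_summary([1.0, 2.0, 3.0, 2.5, 4.0])
print(share, largest_drop)
```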


3.1. Work Force Characteristics

Our data consist of two samples of about 5000 full-time wage earners employed by firms located in mainland Portugal containing information on hourly wages as well as on workers' attributes, such as gender, education, age and tenure. The samples were randomly selected from the raw files of Quadros de Pessoal, a survey conducted by the Portuguese Ministry of Employment, which covers the work force of all firms employing paid labour in Portugal, totalling over 2 million people every year.

The data indicate that the aforementioned tendency towards increased wage inequality can also be observed in Portugal during the period under examination. However, there are significant differences between Portugal and the USA in the nature of wage inequality. While in the USA wage inequality increases were produced with stable real wages at the top deciles of the distribution and shrinking real wages at the bottom (see e.g. Topel, 1997), wage inequality increased in Portugal, while real wages have been growing across the whole distribution. From 1986 to 1995 the average hourly real wage in the economy increased by 27%, which shifted the whole distribution to the right (see Figure 1 for kernel estimates of the densities). However, wage growth was considerably higher at the top of the wage distribution (38% and 27% at the ninth decile and at the third quartile) than at the bottom (19% at the first decile and first quartile). This unequal growth led to increased wage inequality, which may be attributed mainly to increases in inequality in the upper tail of the wage distribution. For example, while the ratio between the wage at the ninth decile and that of the third quartile increased from about 1.55 in 1986 to 1.73 in 1995, the corresponding ratio comparing wages at the first quartile and first decile remained practically constant at 1.19.6

Figure 1. (Log) wage densities for 1986 and 1995 (1995 dotted)

The year 1986, the starting point of our analysis, coincides with the date of Portuguese membership in the European Union. EU membership is likely to have had a significant impact upon the Portuguese labour market, in particular upon the factors commonly identified with the increase in wage inequality. In the first place, EU membership meant increased exposure to foreign competition. In particular, the patterns of trade changed, with traditional low-skill products having their weight reduced in exports and increased in imports. This is the type of change that reduces demand for low-skilled labour and is consistent with the explanations advanced by Borjas and Ramey (1995) for changes in wage inequality in the USA.

At the same time, very substantial resources were devoted to policies designed to modernize the industrial structure, namely by subsidizing investment in modern technologies. These changes contributed to an increase in the demand for skilled labour at the expense of unskilled labour and thus to an increase in wage inequality. On the other hand, largely financed with EU funds, widespread training programmes were also created with the intent of increasing the supply of skilled labour. Moreover, as elsewhere in the world, the supply of skilled labour also increased as a consequence of the aging and retirement of older workers, and their replacement by younger and better-educated individuals.

Indeed, the educational level of the labour force increased considerably during this period. From 1986 to 1995, the average number of years of schooling rose from five to over six. These changes are visible in the bottom of Table I, which shows that from 1986 to 1995, there is a marked increase in the proportion of people with six or more years of education and a decline of those with four years or fewer.

Table I. Covariates: descriptive statistics

                          1986             1995
Wages                     6.06 (0.54)      6.33 (0.58)
Gender (% women)          34               40
Age                       35.22 (12.04)    35.51 (11.55)
Tenure                    9.42 (8.37)      7.93 (8.50)
Education                 5.03 (2.87)      6.39 (3.32)
Years of education (%)

Note: Sample averages; sample standard errors in parentheses.

This increase in educational levels means that people enter the labour market later in 1995 than they used to in 1986. Figure 2 shows that the proportion of teenagers in the working population is lower in 1995 than in 1986. At the same time, however, the proportion of the elderly in the labour force was also reduced, the net consequence being that the average age of individuals in the sample remained essentially constant at 35 years.

Figure 2. Density functions of the continuous covariates (1995 dotted)

Increases in the age of entry into the labour market also lead directly to a reduction in the average tenure in the economy. And, indeed, this was the case in our samples. The probability of having a tenure of less than 10 years, which was 58% in 1986, rose to 72% in 1995. The average of 9.4 years of tenure in 1986 decreased to less than 8 years in 1995. However, later entry in the job market is not the whole story where tenure is concerned, as Figure 2 clearly illustrates.

On the whole, the distribution of tenure in 1995 is much more skewed than in 1986, with a greater concentration of very recent jobs. Moreover, the 1986 distribution of tenure was clearly bimodal, the second mode being at about 12 years of tenure. This peak is directly driven by one of the idiosyncratic features of the Portuguese labour market, corresponding to individuals who were hired in 1973 and 1974, and gained permanent jobs as a result of the April 1974 revolution. The importance of this cohort of workers diminishes gradually over time, and only a tiny peak is still detected at about 21 years of tenure in 1995.

Finally, women represent an increasing proportion of the labour force, a trend that can also be observed elsewhere in the world. Starting from a level of 34% in 1986, they make up about 40% of the total working population in 1995.

3.2. Results

Changes in the ‘Prices’ of Labour Market Skills

As a first step in our empirical analysis, we analyse the QR results displayed in Figure 3. The plots show the coefficient estimates, $\hat{\beta}(\theta)$ for θ∈(0, 1), and the associated confidence bands (represented by the dots). For each variable, the plots provide information on the coefficient estimates for 1986 (left column), 1995 (centre column) and changes between 1986 and 1995 (right column). The dots represent the 95% (heterogeneity-consistent) confidence interval for the regression deciles, obtained by the method in Hendricks and Koenker (1992). For comparison purposes, the coefficients estimated by mean regression (OLS) are reported as a solid horizontal line in the first two columns. The information in this figure can be summarized to give the impact of each covariate upon wage inequality. Indeed, as the dependent variable is in logs, the difference in the estimated coefficients at two different quantiles provides a measure of the impact of that covariate upon the (log of the) ratio between wages at these quantiles. A simple way of estimating this effect without having to choose two particular quantiles is to estimate the slope of a straight line fitted to the point estimates of each QR coefficient, which is done in Table II.
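The slope summary can be sketched with a least absolute deviations (LAD) line fitted to a small set of coefficient estimates. The brute-force search below is an assumption of this sketch, not the routine used in the paper: it relies on the fact that some LAD-optimal line passes through two of the data points, so it tries every pair. The coefficient values are made up.

```python
from itertools import combinations


def lad_line(x, y):
    """Fit y = a + b * x by least absolute deviations; returns (a, b).

    Searches over all lines through two data points, which contains an
    optimal LAD fit. Suitable only for a handful of points.
    """
    best = None
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        if x1 == x2:
            continue
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        loss = sum(abs(yi - intercept - slope * xi) for xi, yi in zip(x, y))
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    if best is None:
        raise ValueError("need at least two distinct x values")
    return best[1], best[2]


# Made-up QR coefficient estimates at the nine deciles, with one outlier.
thetas = [i / 10 for i in range(1, 10)]
coefs = [0.02 + 0.05 * t for t in thetas]
coefs[4] += 0.5
intercept, slope = lad_line(thetas, coefs)
print(round(intercept, 4), round(slope, 4))
```

A positive slope means the coefficient rises with θ, so the covariate widens the spread of wages; the LAD fit keeps a single erratic estimate from driving that summary.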

Figure 3. Quantile regression coefficients. The points represent 95% confidence intervals for the deciles; the solid horizontal line represents the least squares (conditional mean) estimate

Table II. Effects on wage inequality. Slopes of a LAD fit to the QR coefficient estimates in Figure 3
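The slope summary reported in Table II can be illustrated with a minimal sketch. It assumes a plain least-absolute-deviations (LAD) line fitted to the decile coefficient estimates; the coefficient values below are hypothetical and are not the paper's estimates, and the function name `lad_line` is ours.

```python
from itertools import combinations

def lad_line(xs, ys):
    """Least-absolute-deviations line y = a + b*x.

    For a simple regression an optimal LAD line passes through two data
    points, so an exhaustive search over pairs is exact (fine for ~10 points).
    """
    best = None
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        if x1 == x2:
            continue  # vertical line, not a valid fit
        b = (y2 - y1) / (x2 - x1)
        a = y1 - b * x1
        loss = sum(abs(y - (a + b * x)) for x, y in zip(xs, ys))
        if best is None or loss < best[0]:
            best = (loss, a, b)
    _, a, b = best
    return a, b

# Hypothetical education coefficients at the nine deciles (NOT the paper's estimates).
thetas = [i / 10 for i in range(1, 10)]
coefs = [0.050 + 0.040 * t for t in thetas]
coefs[4] += 0.010  # one noisy estimate, to show the robustness of the LAD fit

intercept, slope = lad_line(thetas, coefs)
# With log wages, slope * (0.9 - 0.1) approximates the covariate's effect
# on the log of the 90/10 wage ratio.
print(f"slope = {slope:.3f}, implied 90-10 effect = {slope * 0.8:.3f}")
```

A positive slope means the covariate's coefficient rises along the wage distribution, so the covariate widens wage dispersion; a negative slope means it compresses it.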

The plots corresponding to the gender variable in Figure 3 show that females earn less than males (coefficients are negative), and that this gender gap increases as we move up through the wage distribution. This effect implies that the wage distribution for women is less dispersed than that for men. Indeed, the negative sign associated with gender in Table II indicates that a larger proportion of women in the working population contributes towards reduced wage inequality. Moreover, unlike what has occurred in the USA (see, for example, Bound and Johnson, 1992), the gender gap seems to have increased in Portugal over the decade. This pattern is particularly acute for the top of the distribution, as the plot on the right-hand side illustrates. The differences between estimates are negative and, from the 40th percentile upwards, they are significantly different from zero, as the confidence band does not cross zero (the dashed horizontal line). Because this evolution is particularly important at the top of the wage distribution, it reveals that male wage inequality increased more than did female wage inequality, which is similar to what has been documented for several other countries by Gottschalk and Smeeding (1997).7

As expected, wages increase with education and this is true across the whole distribution. Furthermore, this effect is more important at the highest quantiles of the distribution than at the lowest, implying that education increases wage dispersion. In other words, samples with more educated individuals show a higher wage dispersion than samples of less educated people. These results are qualitatively similar to comparable findings for the USA (Chamberlain, 1994; Buchinsky, 1994), Germany (Fitzenberger and Kurz, 1997), Uruguay (González and Miles, 2001) or Zambia (Nielsen and Rosholm, 2001) although the results seem somewhat stronger in Portugal.8

The average return to education increased from 1986 to 1995. Over this period, returns increased at the top of the distribution as well, but remained roughly constant at the lowest quantiles (see the right panel of Figure 3). Consequently, the inequality increasing effect of education strengthened during that decade (see Table II).9

Age and tenure entered the regressions both with linear and quadratic terms; the plots represent their effects evaluated at the variables' means. Older workers earn more, especially so at relatively high-paid jobs. The returns to age decreased from 1986 to 1995, in particular at the bottom of the wage distribution. As a consequence, its effect upon inequality increased over the period, although it has remained at very low levels (Table II). In contrast, the effect of tenure upon inequality is negligible, and it did not change much.

Counterfactual Analysis

In order to decompose the changes in the wage distribution into changes attributable to changes in the coefficients (remuneration of those attributes) and changes in the covariates (individual workers' attributes), we follow the procedures described earlier (with the number of replications, m, set to 4500). The results are summarized in Figure 4, which plots the differences between pairs of distributions of interest.10 The first plot compares the densities in 1995 and in 1986 (f(ω(1)) − f(ω(0))). It is clearly visible that the density in 1995 has a lower mass in the left tail and a greater mass in the right tail, meaning that the wage distribution has shifted to the right, as was already known from the discussion in Section 3.1. The next two plots show that both the effect of the covariates—(f*(ω(1)) − f*(ω(1);Z(0)))—and of the coefficients—(f*(ω(1);Z(0)) − f*(ω(0)))—exert an impact in the same direction, that is, both contribute to the observed shift of the wage distribution to the right. (These results are unchallenged when the effects are estimated in reverse order, as the last two plots in Figure 4 show.)
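The decomposition into covariate and coefficient effects can be sketched as follows. This is a stylized illustration, not the authors' code: it assumes a resampling scheme in which each simulated log wage combines a uniformly drawn quantile index with a covariate row drawn with replacement, and it replaces the estimated QR coefficients with hypothetical closed-form coefficient functions. All numbers and names (`beta_1986`, `z_1986`, and so on) are made up for the example.

```python
import random

def quantile(values, p):
    """Empirical quantile with linear interpolation between order statistics."""
    xs = sorted(values)
    pos = p * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])

def simulate_marginal(beta, covariates, m, rng):
    """Draw m log wages: theta ~ U(0,1), a covariate row drawn with
    replacement, log wage = x'beta(theta)."""
    draws = []
    for _ in range(m):
        theta = rng.random()
        x = rng.choice(covariates)
        draws.append(sum(xj * bj for xj, bj in zip(x, beta(theta))))
    return draws

# Hypothetical coefficient processes (intercept, return to schooling).
# In an application these would be QR estimates for each year.
def beta_1986(theta):
    return (5.0 + 0.5 * theta, 0.05 + 0.05 * theta)

def beta_1995(theta):
    return (5.1 + 0.5 * theta, 0.05 + 0.07 * theta)

# Hypothetical covariate samples: rows are (1, years of schooling).
z_1986 = [(1, 4)] * 70 + [(1, 9)] * 20 + [(1, 16)] * 10
z_1995 = [(1, 4)] * 50 + [(1, 9)] * 30 + [(1, 16)] * 20

rng = random.Random(0)
m = 4500
w_1986 = simulate_marginal(beta_1986, z_1986, m, rng)
w_1995 = simulate_marginal(beta_1995, z_1995, m, rng)
w_cf = simulate_marginal(beta_1995, z_1986, m, rng)  # 1995 coefficients, 1986 attributes

effects = {}
for p in (0.10, 0.50, 0.90):
    total = quantile(w_1995, p) - quantile(w_1986, p)
    cov_effect = quantile(w_1995, p) - quantile(w_cf, p)    # change in attributes
    coef_effect = quantile(w_cf, p) - quantile(w_1986, p)   # change in returns
    effects[p] = (total, cov_effect, coef_effect)
    print(f"q{int(100 * p)}: total {total:.3f} = "
          f"covariates {cov_effect:.3f} + coefficients {coef_effect:.3f}")
```

Because the same counterfactual sample enters both terms, the two effects add up to the change between the two simulated marginals by construction; the residual discussed in the text arises only when the simulated marginals are compared with the empirical ones.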

Figure 4. Differences between (log) wage densities. The first frame refers to the estimated marginal densities. Frame 2 (by rows) compares the 1995 counterfactual density if all attributes were as in 1986 with the 1986 density. Frames 3 to 6 estimate the difference between the 1995 density and the density that would have occurred in 1995 if the indicated factors had been as in 1986. Frames 7 and 8 are analogous to frames 2 and 3 except that the order of the decomposition is reversed
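A density difference of the kind plotted in Figure 4 can be computed along the following lines. This is a sketch under stated assumptions: it uses a Gaussian kernel evaluated on a common grid, as footnote 10 describes, and it assumes that the window width quoted there spans four kernel standard deviations, so the kernel standard deviation is taken as 1.06 n^(−1/5) min(σ, R/1.34). The samples are simulated and hypothetical.

```python
import math
import random
import statistics

def rule_of_thumb_sd(sample):
    """Kernel standard deviation 1.06 * n^(-1/5) * min(sigma, R/1.34).

    Assumption: this is footnote 10's width (4.24 ...) divided by four.
    """
    q1, _, q3 = statistics.quantiles(sample, n=4)
    spread = min(statistics.stdev(sample), (q3 - q1) / 1.34)
    return 1.06 * len(sample) ** (-0.2) * spread

def gaussian_kde(sample, grid, h):
    """Gaussian-kernel density estimate of `sample` at each grid point."""
    norm = 1.0 / (len(sample) * h * math.sqrt(2.0 * math.pi))
    return [norm * sum(math.exp(-0.5 * ((g - x) / h) ** 2) for x in sample)
            for g in grid]

def density_difference(sample_a, sample_b, points=200):
    """Return a common grid and f_a - f_b evaluated on it."""
    h_a, h_b = rule_of_thumb_sd(sample_a), rule_of_thumb_sd(sample_b)
    pad = 4.0 * max(h_a, h_b)  # extend the grid so both tails are covered
    lo = min(min(sample_a), min(sample_b)) - pad
    hi = max(max(sample_a), max(sample_b)) + pad
    step = (hi - lo) / (points - 1)
    grid = [lo + i * step for i in range(points)]
    f_a = gaussian_kde(sample_a, grid, h_a)
    f_b = gaussian_kde(sample_b, grid, h_b)
    return grid, [a - b for a, b in zip(f_a, f_b)]

rng = random.Random(1)
w0 = [rng.gauss(6.0, 0.5) for _ in range(400)]  # hypothetical 1986 log wages
w1 = [rng.gauss(6.2, 0.6) for _ in range(400)]  # hypothetical 1995 log wages
grid, diff = density_difference(w1, w0)
```

Since both densities integrate to one, the difference integrates to approximately zero over the grid: negative regions show where mass was lost and positive regions where it was gained.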

The decomposition of the overall effect of the covariates into its constituents reveals that education is the only covariate which contributes unequivocally towards the observed change in the wage distribution. The effect of gender is to shift the distribution towards the left, as more women are included in the sample and women earn relatively lower wages. The contributions of age and tenure are opposite, but less clear. The effect of age is to increase wages, perhaps due to the later entry of individuals into the labour force, but the reduction in the weight of the elderly is a counteracting force. With respect to tenure, the plot reflects the increase in the proportion of individuals with very low tenure but, at the same time, the disappearance of the second mode at the middle of the distribution of tenure.

These results are presented in greater detail in Table III. The first two columns of this table refer to the marginal wage distributions in 1986 and 1995, while the third column presents the estimates of the overall changes which occurred during the period. In addition to a point estimate, this column also presents the 95% bootstrap confidence interval for this estimate, obtained using the 2.5th and 97.5th percentiles of the bootstrap distribution of the relevant summary statistic.11 The next three columns decompose total changes into changes due to the covariates (the 1995 estimated marginal versus the counterfactual 1995 marginal density if all attributes were distributed as in 1986); changes due to changes in the coefficients (the counterfactual 1995 marginal density if all attributes were distributed as in 1986 versus the 1986 estimated marginal); and residual changes, that is, changes unaccounted for by the estimation method (estimated as the difference between the estimates of the total changes provided by using the empirical wage density and by using the estimated marginal densities). The final four columns (‘individual covariates’) decompose the changes in the wage distribution due to the changes in covariates into the changes that can be attributed to the different variables.12 The last line in each cell gives the percentage of total change ‘explained’ by the corresponding factor, that is, it expresses the first line in a cell as a proportion of the total change in the corresponding statistic.
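The percentile interval described above can be sketched as follows. This is a simplified illustration: it resamples a single sample of (hypothetical, simulated) log wages and recomputes one statistic, whereas the intervals in Table III refer to statistics of the estimated and counterfactual distributions. The function name `percentile_ci` and its parameters are ours.

```python
import random
import statistics

def percentile_ci(sample, statistic, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap interval: resample with replacement, recompute
    the statistic, and read off the lower and upper tail percentiles."""
    rng = random.Random(seed)
    reps = sorted(statistic(rng.choices(sample, k=len(sample)))
                  for _ in range(n_boot))
    tail = (1.0 - level) / 2.0
    lower = reps[int(tail * (n_boot - 1))]
    upper = reps[int(round((1.0 - tail) * (n_boot - 1)))]
    return lower, upper

rng = random.Random(42)
log_wages = [rng.gauss(6.0, 0.5) for _ in range(300)]  # hypothetical data

point = statistics.median(log_wages)
lower, upper = percentile_ci(log_wages, statistics.median)
print(f"median = {point:.3f}, 95% interval = ({lower:.3f}; {upper:.3f})")
```

An effect is read as significantly different from zero when the resulting interval excludes zero, which is how the entries of Table III are interpreted in the text.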

Table III. Decomposition of the changes in the wage distribution. The first entry in each cell is the point estimate of the change in the attribute of the density explained by the indicated factor; the second entry is the 95% bootstrap confidence interval for that change; the third entry is the proportion of the total change explained by the indicated factor. The quantile changes were computed using log wages, whereas the remaining indicators use natural units. Scale = (Q(0.75) − Q(0.25))/Q(0.50), Skewness = (Q(0.75) + Q(0.25) − 2Q(0.50))/(Q(0.75) − Q(0.25)), Kurtosis = (Q(0.90) − Q(0.10))/(Q(0.75) − Q(0.25)) (see Oja, 1981 and Ruppert, 1987 for a discussion of these statistics)
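| | Marginals: 1986 | Marginals: 1995 | Marginals: change | Aggregate contributions: covariates | Aggregate contributions: coefficients | Aggregate contributions: residual | Individual covariates | | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 10th quant. | 5.535 | 5.721 | 0.186 | 0.049 | 0.187 | −0.050 | −0.021 | 0.027 | 0.015 | −0.027 |
| | | | 0.167; 0.196 | 0.026; 0.070 | 0.158; 0.212 | | −0.035; 0.011 | 0.011; 0.056 | 0.007; 0.045 | −0.040; 0.001 |
| 25th quant. | 5.705 | 5.894 | 0.189 | 0.049 | 0.148 | −0.008 | −0.014 | 0.035 | 0.013 | −0.015 |
| | | | 0.168; 0.207 | 0.026; 0.071 | 0.125; 0.170 | | −0.037; 0.004 | 0.015; 0.058 | −0.011; 0.031 | −0.035; 0.005 |
| | | | 0.221; 0.263 | 0.037; 0.096 | 0.133; 0.184 | | −0.042; 0.009 | 0.024; 0.087 | −0.011; 0.040 | −0.049; 0.010 |
| 75th quant. | 6.350 | 6.621 | 0.271 | 0.092 | 0.197 | −0.018 | −0.023 | 0.092 | 0.024 | −0.015 |
| | | | 0.234; 0.312 | 0.054; 0.129 | 0.161; 0.234 | | −0.060; 0.011 | 0.053; 0.124 | −0.016; 0.063 | −0.052; 0.022 |
| 90th quant. | 6.786 | 7.168 | 0.382 | 0.093 | 0.238 | 0.051 | −0.022 | 0.092 | 0.017 | −0.053 |
| | | | 0.315; 0.445 | 0.039; 0.149 | 0.199; 0.288 | | −0.076; 0.041 | 0.044; 0.158 | −0.055; 0.072 | −0.099; 0.005 |
| | | | 0.036; 0.142 | −0.000; 0.100 | 0.016; 0.109 | | −0.064; 0.034 | 0.015; 0.122 | −0.037; 0.064 | −0.044; 0.061 |
| | | | −0.86; 0.012 | −0.049; 0.070 | −0.019; 0.091 | | −0.057; 0.046 | −0.039; 0.075 | −0.041; 0.058 | −0.043; 0.071 |
| | | | 0.037; 0.436 | −0.178; 0.106 | −0.096; 0.181 | | −0.140; 0.161 | −0.193; 0.105 | −0.238; 0.103 | −0.257; 0.071 |
| Gini coeff. | 0.325 | 0.360 | 0.035 | 0.016 | 0.014 | 0.005 | 0.014 | 0.025 | −0.011 | −0.008 |
| | | | 0.018; 0.053 | 0.002; 0.030 | 0.001; 0.029 | | 0.001; 0.027 | 0.010; 0.040 | −0.027; 0.005 | −0.024; 0.008 |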
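The summary measures reported in Table III can be computed from a sample of log wages along the following lines. This is a sketch under stated assumptions: quantiles use linear interpolation between order statistics, skewness is taken in the standard quartile-based form (Q(0.75) + Q(0.25) − 2Q(0.50))/(Q(0.75) − Q(0.25)), and the Gini coefficient uses the usual rank-based formula, since the paper does not spell out its estimator. Function names are ours.

```python
import math

def quantile(values, p):
    """Empirical quantile with linear interpolation between order statistics."""
    xs = sorted(values)
    pos = p * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])

def gini(values):
    """Rank-based Gini coefficient for non-negative values."""
    xs = sorted(values)
    n = len(xs)
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * sum(xs))

def summary_measures(log_wages):
    """Quantile-based scale, skewness and kurtosis, plus the Gini
    coefficient, all computed on wages in natural units."""
    wages = [math.exp(w) for w in log_wages]
    q10, q25, q50, q75, q90 = (quantile(wages, p)
                               for p in (0.10, 0.25, 0.50, 0.75, 0.90))
    return {
        "scale": (q75 - q25) / q50,
        "skewness": (q75 + q25 - 2.0 * q50) / (q75 - q25),
        "kurtosis": (q90 - q10) / (q75 - q25),
        "gini": gini(wages),
    }

# Hypothetical example: wages 1, 2, ..., 101 (symmetric, so skewness is zero).
example = summary_measures([math.log(k) for k in range(1, 102)])
print({name: round(value, 3) for name, value in example.items()})
```

Applying `summary_measures` to the simulated 1986, 1995 and counterfactual samples and differencing the results gives entries of the same kind as the lower panels of the table.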

The first (horizontal) panel of Table III presents the estimates for selected quantiles of the distribution. It confirms that wages increased throughout the whole distribution from 1986 to 1995, the proportional growth being larger at the highest quantiles. Both covariates and coefficients contribute to the actual evolution of the location estimates and their effect is significantly different from zero (the confidence intervals do not include zero) in all of the estimated quantiles. The effect of coefficients is quantitatively more important than the effect of covariates at each of the estimated points. Overall, the models work fairly well, as the residuals account for a relatively small portion of the total change.

The subsequent panels display the evolution of several summary measures of the distribution. They reveal that the distribution became more dispersed, less skewed and with fatter tails, although the result for skewness is not statistically significant. Furthermore, the results for the Gini coefficient indicate that the wage distribution has, indeed, become more unequal. As to the sources of such increase in wage inequality, the analysis reveals that both the changes in the covariates and in the coefficients contribute to the observed changes in wage inequality—measured by the interquantile range or the scale statistic or, still, by the Gini coefficient—in the same direction, and in roughly equal proportions (although the effect of the covariates upon scale is only marginally significant).13

As to the contribution of individual covariates, the last four columns reveal that education is the only attribute which has an unequivocally positive impact upon wages across the whole distribution. Not only is the estimated effect of education positive in all of the estimated quantiles, but the confidence intervals never include zero. The inequality increasing effect of education is directly driven by the monotonic increase of the estimated effects for education for our selected quantiles. Focusing, for ease of interpretation, on the 75–25 and 90–10 interquantile ranges, we see that inequality rose by 8.2 p.p. and by 19.6 p.p., respectively. Had the distribution of education remained as it was in 1986, we estimate that the 90–10 range would have increased by roughly 13 p.p., while the interquartile range would have increased by only 2.5 p.p. In summary, our results indicate that even in the absence of changes in the returns to education, the observed increases in the level of schooling in Portugal during the late 1980s and early 1990s would have increased wages across the board, but would have increased wage inequality at the same time.
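The interquantile-range figures quoted above follow directly from the point estimates in Table III. The short calculation below reproduces them, assuming that the education contributions are the entries 0.027, 0.035, 0.092 and 0.092 at the 10th, 25th, 75th and 90th quantiles (the only individual-covariate column consistent with the figures in the text).

```python
# Changes in log-wage quantiles, 1986 to 1995 (point estimates from Table III).
total_change = {"q10": 0.186, "q25": 0.189, "q75": 0.271, "q90": 0.382}
# Part of each change attributed to the change in the distribution of education.
education = {"q10": 0.027, "q25": 0.035, "q75": 0.092, "q90": 0.092}

def range_change(effects, upper, lower):
    """Change in an interquantile range = change at the upper quantile
    minus change at the lower quantile."""
    return effects[upper] - effects[lower]

results = {}
for name, upper, lower in (("75-25", "q75", "q25"), ("90-10", "q90", "q10")):
    observed = range_change(total_change, upper, lower)
    due_to_education = range_change(education, upper, lower)
    without_education = observed - due_to_education
    results[name] = (observed, due_to_education, without_education)
    print(f"{name}: observed {100 * observed:.1f} p.p., "
          f"education {100 * due_to_education:.1f} p.p., "
          f"without the change in education {100 * without_education:.1f} p.p.")
```

The output matches the text: the 75–25 range rose by 8.2 p.p., of which 5.7 p.p. is attributed to education, leaving 2.5 p.p.; the 90–10 range rose by 19.6 p.p., of which 6.5 p.p. is attributed to education, leaving roughly 13 p.p.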

Age, gender and tenure did not have a noticeable contribution to the way the wage distribution evolved. In the first place, although the estimates are consistently signed across the distribution in all of our variables, they are smaller in magnitude and somewhat less precise, the zero belonging almost always to the confidence intervals. Moreover, the coefficients for these variables display a somewhat erratic pattern across the distribution, the consequence being that no clear effect upon inequality shows up.

Overall, although female participation and tenure are considerably different in the beginning and the end of the period under analysis, education is the only measured attribute whose evolution contributes unequivocally towards greater inequality.


Before concluding, it is important to see how the results of our study square with the conventional wisdom about the effect of education upon wage inequality. To keep things simple, think of an economy in which workers can be split into two types, low- and high-skilled, skills being acquired by school attendance (which we normalize to be one year). There can be substantial heterogeneity among workers of each type but, on average, low-skilled workers are paid wL while high-skilled ones are paid wH. The school premium may be measured by wH/wL, for which the slope of a least squares regression of log wages on education is a good approximation.

Increasing the proportion of skilled workers in the economy leads to a reduction in wage inequality via two effects. The first is a price effect. Skilled workers, earning relatively higher wages, become relatively more abundant and, consequently, see their relative wage shrink. The second is a composition effect. A greater proportion of individuals is now in the skilled (highest wage) group, which also contributes to reducing inequality. These two effects combined lead to the prediction that shifts to the right in both demand and supply of skilled workers that would leave price unchanged would result in less wage inequality.

This would be the end of the story if each type of worker was a homogeneous group, or if the degree of heterogeneity within types was the same. However, if wages are more spread out in the high-skills group, as indicated by our quantile regression results, an increase in the level of schooling reduces the weight of the low-spread group, contributing to an increased overall wage dispersion and greater inequality. Judging from the available evidence presented by Buchinsky (1994) for the USA, Fitzenberger and Kurz (1997) for Germany and from our own results for Portugal, the existence of this inequality increasing effect of education seems clear.14 The extent to which it can offset the aforementioned composition and price effects depends on their relative magnitude, and is an empirical question. For our data, we found that if the wage structure had remained constant, the observed increase in education would have led to increased wage inequality (the estimated figure for the interquartile range is an increase of 5.7 p.p.).
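The argument can be made concrete with a small two-group calculation. This is only an illustration under assumptions that are not in the paper: dispersion is measured by the variance of log wages, group means and spreads are held fixed (so there is no price effect), and all numbers are hypothetical. By the law of total variance, overall variance is the sum of a within-group and a between-group term.

```python
def mixture_variance(p_high, mean_low, mean_high, sd_low, sd_high):
    """Variance of log wages in a two-type economy, split into the
    within-group and between-group components."""
    within = (1.0 - p_high) * sd_low ** 2 + p_high * sd_high ** 2
    between = p_high * (1.0 - p_high) * (mean_high - mean_low) ** 2
    return within, between

def total_variance(p_high, sd_low, sd_high, mean_low=6.0, mean_high=6.5):
    within, between = mixture_variance(p_high, mean_low, mean_high,
                                       sd_low, sd_high)
    return within + between

# Hypothetical values: the share of high-skilled workers rises from 60% to 80%.
for label, sd_low, sd_high in (("equal spreads", 0.3, 0.3),
                               ("high-skill group more dispersed", 0.2, 0.6)):
    before = total_variance(0.6, sd_low, sd_high)
    after = total_variance(0.8, sd_low, sd_high)
    print(f"{label}: variance {before:.3f} -> {after:.3f}")
```

With equal spreads, moving workers into the majority high-skilled group lowers the between-group term and overall dispersion falls, as conventional wisdom predicts. When the high-skilled group is more dispersed, the within-group term grows by more than the between-group term shrinks, and overall dispersion rises.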

Comparing the magnitude and the evolution of the dispersion-increasing effect of education across countries is, however, somewhat more difficult than identifying its existence, as the empirical specifications and time periods differ from study to study. Our estimates for the difference in returns to education at the 10th and at the 90th percentiles are 4.8% and 6.8% in 1986 and 1995, respectively. There is almost no overlap in the period but, for the last 10 years in his sample (1978 to 1987), Buchinsky (1994) estimates this difference in the returns to education to have increased from 0.73% to 2.32%.15 Fitzenberger and Kurz (1997) do not report returns on education based on a continuous education variable for Germany. Rather, they report a college premium vis à vis high school of magnitude 32% at the 10th percentile and 41% at the 90th. These estimates were obtained using pooled data for the period 1984–1994, and the authors report nonsignificant changes in the estimates over the period. In contrast, the estimates obtained for Portugal using a comparable specification give figures roughly in the same range in 1986 (35%–55%), but very different estimates in 1995 (43%–91%). Results reported by González and Miles (2001) for Uruguay are again based on a different specification. They report the returns to an additional year of schooling at different quantiles and levels of education. Their estimates for primary education at the 10th and the 90th percentiles are 5%–7% in 1986 and 6% in 1997, while the corresponding estimates for university education are 7.5%–10% and 11%–14%, respectively. These estimates are very much in line with the equivalent estimates for Portugal, which are 5%–9% and 2%–8% in 1986 and 8%–9% and 11%–14% in 1995 (Machado and Mata, 2001).

The overall picture we get from these comparisons is, therefore, that the inequality-increasing effect of education is possibly higher in Portugal than in the USA or Germany. Our analysis for Portugal reveals that the composition effect (i.e. a higher proportion of more educated individuals) is, globally, inequality increasing. There are no comparable results for other countries, and given that this effect depends on the dispersion of returns to education across the wage distribution, and Portugal stands as the country where dispersion is the highest, we cannot proceed much further.

Our analysis of the changes in the distribution of the measured attributes of the workforce reveals that, even if the wage structure had remained constant over the decade, inequality would have increased. Prices, however, did not remain constant. On the contrary, the average price of skilled labour increased, contributing to further increases in wage inequality. The most interesting fact about the evolution of the price of skilled labour is perhaps the remarkable widening in the returns to schooling observed in Portugal and in the USA. This widening in returns implies that wage inequality would have increased even if the composition of the labour force and the average return to schooling had remained constant. The extent to which the observed increase in inequality is due to changes in the average return to skill and to the widening in returns across the wage distribution is an open question. Nevertheless, we observe that the widening in returns did not occur in Germany, one of the few industrial countries where wage inequality has not increased recently (Gottschalk and Smeeding, 1997).

The difference between the evolution of the spreads of the returns to schooling in Portugal and in the USA, on the one hand, and in Germany, on the other, remains to be explained. One may speculate that the answer lies in the supply of skilled labour rather than in changes in demand, as changes in demand should have provoked similar changes in all of the countries under consideration.


In this paper we propose a method to decompose the changes in the wage distribution over a period of time into several factors contributing to those changes. The basic building block is the estimation of the conditional distribution by quantile regressions; then, by resorting to resampling procedures, one estimates marginal distributions consistent with the estimated conditional model as well as with any hypothesized distribution for the covariates. Comparing the marginal distributions implied by different distributions for the covariates one is then able to perform counterfactual exercises. The methodology is quite general and applicable to all problems where the researcher is interested in the evolution of distributions.

We applied this methodology to Portuguese data for the period 1986–1995, a period during which an increase in wage inequality was observed in Portugal. Our estimates suggest that changes both in individuals' attributes and in the returns to these attributes contributed in the same direction to the observed increase in wage inequality. The contributions of both changes are roughly of the same magnitude.

Education emerges in our analysis as being at the centre of the observed increase in wage inequality. There are two reasons for this. First, over the period under scrutiny, the distribution of the returns to education became more spread out: the returns at the top of the wage distribution increased, while staying roughly constant at the bottom. This effect alone would increase inequality even with an identically educated work force. Second, the observed increase in the level of education in Portugal also contributed towards a less equal wage distribution. Returns on education tend to be higher on the right than on the left tail of the wage distribution. At any given moment, a more educated work force typically displays a higher wage dispersion than its less educated counterparts. Consequently, even if the returns on education had remained constant during the period, the sharp shift to the right of the distribution of education of the Portuguese work force would imply an increased wage inequality. This factor alone is responsible for an increase of 5.7 p.p. in the interquartile range of wages.

Conventional wisdom asserts that rising education leads to lesser wage inequality. Our analysis indicates that, contrary to this wisdom, increases in educational levels do not necessarily translate into a more equal wage distribution. Our work thus raises questions which have not been given much attention in the previous literature, suggesting alternative points of concern to the analysis of wage inequality. The literature has so far been concerned with explaining the evolution of the average return to an additional year of education. From the standpoint of explaining inequality, it may be equally important to explain why the returns to education have evolved so differently for individuals at the top and at the bottom of the wage distribution. We offer no explanation for this pattern in this work, but this certainly deserves further investigation.


This work was started while José Mata was at the Bank of Portugal. We are grateful to Roger Koenker, Pedro Portugal, Maria Tannuri and three anonymous referees for useful comments and suggestions. Support from FCT under programme POCTI, partially funded by FEDER, is gratefully acknowledged. The usual disclaimer applies.

  • 1

    The evidence for this comes from a number of different countries such as the USA (Buchinsky, 1994), Germany (Fitzenberger and Kurz, 1997), Uruguay (González and Miles, 2001) and Zambia (Nielsen and Rosholm, 2001).

  • 2

    We are grateful to Roger Koenker for having suggested this approach.

  • 3

    It should be clear, however, that the method only provides accounting decompositions conditional on a given model. Thus, changes in unobserved ability or in the labour market institutions, for example, would also be reflected in coefficient changes.

  • 4

    He (1997) proposes a restricted version of quantile regression that avoids the occurrence of crossing. However, the quantile model considered by He (1997) for the semiparametric set-up we are working in is less general than model (1) for it assumes linear heteroscedasticity implying that the components of β(θ) are monotone in θ.

  • 5

    Specifically, we have considered the subpopulation of men with the following attributes:


    where L (H) stands for ‘low’ (‘high’); that is, the sample average of the attribute minus (plus) one standard deviation. The table does not exhaust all the possible combinations but rather retains only those considered reasonable. The column ‘Crossings’ refers to the number of θ's (out of 200) for which a break of monotonicity was found, i.e. for which Qθ(ω|x) < max{Qt(ω|x):t < θ}.

  • 6

    For more on increases in wage inequality in Portugal, see Cardoso (1998).

  • 7

    The important changes in the share of women in the labour force may raise the issue of sample selection. This question cannot be tackled with our data set as it includes only employed individuals. Martins (2001) estimates wage equations for women with correction for sample selection using Portuguese data; the impact of the correction on the mean return to schooling was found to be minor (a reduction from 10% to 9%). The general issue of sample selection in quantile regression has been dealt with by Buchinsky (1998).

  • 8

    The results are less clear cut in the case of Spain. Abadie (1997) reports a constant effect across quantiles while Garcia et al. (2001) find that an increasing effect holds for men but not for women.

  • 9

    Chay and Lee (2000), using quite a different econometric methodology, estimate that the return to unobserved ability (correlated with education) has risen in the USA during the 1980s, and that may explain part of the observed increase in the college–high school premium. Unobserved ability may also help explain the variation of returns to education across quantiles; a rise over time in the ability premium would also generate an exacerbation of the inequality increasing effect of education. A simple model, adapted from Mata and Machado (1996), may clarify the point. Suppose wages may be decomposed into an education (s) and an ability effect (a), w = s + a, where s and a may take two values each, s = s0, s1, s0 < s1 and a = aL, aH, aL < aH; the correlation between s and a is introduced by the assumption that Pr[a = aL|s = s0] = 1 and Pr[a = aL|s = s1] = λ < 1. So, Qw(θ|s0) = s0 + aL for all θ, and Qw(θ|s1) = s1 + aL, if θ ≤ λ or Qw(θ|s1) = s1 + aH, if θ > λ. Consequently, the measured school premium (β(θ)) will be β(θ) = (s1 − s0) for ‘low’ quantiles (lower than λ) and, for ‘high’ quantiles, β(θ) = (s1 − s0) + (aH − aL). Therefore, in this simple model, the wage distribution is more spread out for more educated groups and this spread difference increases with the ability premium.

  • 10

    Estimation of the densities was performed using the S-PLUS function density (Statistical Sciences, 1995). The estimation procedure was a kernel-density smoother with a Gaussian kernel and bandwidth 4.24 n^(−1/5) min(σ, R/1.34), where n denotes the sample size, σ the standard deviation and R the interquartile range (Silverman, 1986). To estimate the differences between two log wage densities, each of the densities was evaluated at the same points in a range that contains both ranges.

  • 11

    The number of bootstrap samples was 1000. For more on inference about inequality measures see e.g. Mills and Zandvakili (1997).

  • 12

    For instance, the ‘contribution of education’ is estimated from the comparison of the 1995 estimated marginal vs. counterfactual 1995 marginal density if only education was distributed as in 1986.

  • 13

    Available from the authors upon request, there is a similar analysis with the order of the decomposition reversed. Specifically, the ‘contribution of the covariates’ results from comparing the counterfactual 1986 marginal density if all attributes were distributed as in 1995 with the 1986 estimated marginal, and the ‘contribution of the coefficients’ results from comparing the 1995 estimated marginal with the counterfactual 1986 marginal density if all attributes were distributed as in 1995. The point estimates of those contributions are quantitatively different (especially when measured by the Gini coefficient). Nevertheless, either decomposition conveys the same broad picture that both covariates and coefficients have contributed to observed changes in wage inequality.

  • 14

    Using a nondirectly comparable methodology, Gosling et al. (2000) report that, within each educational level, inequality has increased over successive cohorts of workers in the UK.

  • 15

    These figures are based on the estimated returns at these quantiles of 5.4%–10.2% and 4.6%–11.4% for Portugal in 1986 and 1995, respectively and 6.9%–7.6% and 8.7%–11.1% for the USA. The same figures for the USA in 1971, for example, were 5.9%–8.4%.