2.1 The Model
In this section I present a model that allows me to estimate the evolution of wages throughout an individual's career. I start by specifying a simple wage equation, similar to those used in Abraham and Farber (1987), Altonji and Shakotko (1987), Topel (1991) and Altonji and Williams (1997). The wage equation of an individual can be written in the following way:
where W_{it} denotes the observed (log) wage of individual i in year t, E_{it} is the individual's labour market experience at time t, T_{it} represents seniority (tenure with the same employer) and X_{it} is a set of other variables that affect current wages. The parameters of interest, β_{1} and β_{2}, represent the returns to an additional year of experience and seniority, respectively. Note that there is a direct connection between mobility decisions and seniority: each time an individual changes jobs, his seniority starts again from zero. In other words, the value of seniority in each period equals the number of consecutive past periods in which the individual did not change jobs. The main problem encountered when estimating the return to seniority is that seniority is endogenous, in the sense that both seniority and the error term, e_{it}, may depend on a common set of unobservables.
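For reference, a wage equation consistent with these definitions can be sketched as follows. The intercept β_{0} and the coefficient vector γ on X_{it} are notational assumptions (the text names only β_{1} and β_{2}). The second line writes the seniority reset as a recursion, under the assumed timing convention that M_{it} = 1 marks a job change in year t.

```latex
\begin{align*}
W_{it} &= \beta_{0} + \beta_{1} E_{it} + \beta_{2} T_{it} + X_{it}'\gamma + e_{it}, \\
T_{it} &= (1 - M_{it})\,(T_{i,t-1} + 1).
\end{align*}
```

Under this recursion, a worker who starts a job with T = 0 and stays for three years has T = 1, 2, 3, and a move in the fourth year resets T to 0.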
In this model I specify a separate equation to approximate the mobility decisions that led to the observed job changes, and I model the latent utility of a job switch jointly with the observed wages, as described below. One could follow the same logic to argue that experience is endogenous as well. However, I do not specify an additional equation for the labour market participation decision because the analysis is restricted to individuals who work in every period (discussed in Section 2.2).
Consider an individual i who decides in each year, t, either to change jobs or to keep the existing job. The individual changes jobs only if the utility from changing is higher than the utility from keeping the existing job. I denote this difference in utility levels as M*_{it}. Following Buchinsky et al. (2008), I model the mobility–wage process for working individuals in the following way:
 (1)
where the two sets of covariates contain the individual-specific characteristics that affect mobility and wages, respectively, each equation includes an individual-specific random effect, and each has its own contemporaneous error term. The researcher does not observe the latent utility, M*_{it}, but only the decision to change jobs, M_{it} = 1(M*_{it} > 0), where 1(·) is the usual indicator function. The model in (1) is a selection model, where the first equation represents the individual's choice and the second equation represents the observed outcome in the form of wages.
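One way to write system (1) that is consistent with this description is sketched here. All symbols other than M* and μ are assumed notation rather than the original's: X^{M}_{it} and X^{W}_{it} for the two covariate sets, β^{M} and β^{W} for their coefficients, μ^{M}_{i} and μ^{W}_{i} for the random effects, and u_{it} and v_{it} for the contemporaneous errors.

```latex
\begin{align*}
M^{*}_{it} &= X^{M\prime}_{it}\beta^{M} + \mu^{M}_{i} + u_{it},
\qquad M_{it} = \mathbf{1}(M^{*}_{it} > 0), \\
W_{it} &= X^{W\prime}_{it}\beta^{W} + \mu^{W}_{i} + v_{it}.
\end{align*}
```

Written this way, the selection structure is visible directly: the first line is a probit-type choice equation and the second is the wage outcome, linked through the correlation of the random effects and of the error terms.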
The variables that affect the mobility decision include the information available to the individual at the time of the job change, such as lagged experience and seniority, last year's mobility and a set of other variables discussed in Section 2.3. To ensure that the model is semiparametrically identified (Heckman, 1990), I include a set of variables that affect the mobility decision but not wages, such as family composition and other family income.^{1} In the wage equation, I include variables that affect wage growth both within a given job and across jobs. Variables affecting within-job wage growth are the current levels of experience and seniority, while wage growth across jobs depends on a set of variables that affect initial wages at a new job. Other variables included in the wage equation are discussed in Section 2.3.
In the year an individual changes jobs, wages can change discontinuously because wage changes across jobs follow a different trajectory from wage changes within a job. To account for these shifts in the wage function, I incorporate a set of variables that capture the individual's labour market history. This specification allows me to account for wage changes at different levels of seniority and experience, as well as for the number of times an individual changed jobs in the past, and helps me control for the quality of the job match and for job shopping behaviour. The decision to change jobs is affected by both labour demand and the current human capital composition. Antonelli et al. (2010) provide some recent evidence of the effects of labour demand on human capital formation.
To complete the likelihood specification I make the following distributional assumptions:
where Γ is a full covariance matrix, Σ is restricted as follows
and N_{2}(a, B) denotes the bivariate normal distribution with mean vector a and covariance matrix B. The variance of the error term in the mobility equation is restricted to one for the usual identification reasons in a probit model.
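Given the priors on ρ and σ^{2} introduced later in this section, these assumptions can plausibly be sketched as below. The symbols μ^{M}_{i}, μ^{W}_{i}, u_{it} and v_{it} are assumed notation for the random effects and errors of the mobility and wage equations, and the form of Σ is an inference from the unit-variance restriction and the (ρ, σ^{2}) parameterization, not a transcription of the original display.

```latex
\begin{align*}
\mu_{i} = \begin{pmatrix} \mu^{M}_{i} \\ \mu^{W}_{i} \end{pmatrix} \sim N_{2}(0, \Gamma),
\qquad
\begin{pmatrix} u_{it} \\ v_{it} \end{pmatrix} \sim N_{2}(0, \Sigma),
\qquad
\Sigma = \begin{pmatrix} 1 & \rho\sigma \\ \rho\sigma & \sigma^{2} \end{pmatrix}.
\end{align*}
```

In this form ρ is the correlation between the two contemporaneous errors and σ^{2} is the variance of the wage-equation error, while Γ is left unrestricted.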
One can write down the likelihood function of the labour transitions of individual i for periods 2 through T conditional on the observed data in year 1 in the following way:
 (2)
where
 (3)
and .
The likelihood function is conditional on the initial values, M_{i1}. Alternatively, one could specify an initial conditions equation to describe the mobility process in the first year (e.g. Heckman, 1981). This is the route taken in Buchinsky et al. (2008). However, here I use the likelihood specification conditioning on the observed initial values, similar to Lancaster (2000). The conditional likelihood specification allows me to simulate the predictive distributions, described below, in a more efficient way.
The posterior distribution is proportional to the product of the likelihood function and the prior. The elements of the parameter vector, θ, are assigned conjugate prior distributions: β ∼ N(b_{0}, V_{0}), Γ ∼ IW(v_{0}, R_{0}v_{0}), ρ ∼ N(0, c)1[−1, 1] and σ^{2} ∼ IG(a_{1}, a_{2}).^{2} As detailed below, I use the conditional posterior densities to obtain the marginal distributions of the parameters. The values of the hyperparameters, (b_{0}, V_{0}, R_{0}, v_{0}, c, a_{1}, a_{2}), are set to reflect the prior knowledge about the parameters. I work with rather dispersed prior distributions, thereby allowing the data to dictate the distribution of the parameters.
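To make the prior specification concrete, the following is a minimal sketch of drawing one parameter vector from priors of this form using NumPy only. The hyperparameter values are illustrative assumptions, not the values used in the paper, and the sketch assumes that IW(v_{0}, R_{0}v_{0}) has v_{0} degrees of freedom and scale matrix v_{0}R_{0}, and that IG(a_{1}, a_{2}) is parameterized by shape and scale. The function name `draw_from_priors` is hypothetical.

```python
import numpy as np

def draw_from_priors(rng, k, v0=4, c=1.0, a1=2.0, a2=1.0):
    """Draw (beta, Gamma, rho, sigma2) once; all hyperparameter values are illustrative."""
    b0 = np.zeros(k)            # assumed prior mean of beta
    V0 = 100.0 * np.eye(k)      # assumed dispersed prior covariance of beta
    R0 = np.eye(2)              # assumed prior location for Gamma

    beta = rng.multivariate_normal(b0, V0)

    # Inverse Wishart with integer df v0 and scale S = v0 * R0:
    # if W ~ Wishart(v0, S^{-1}), then W^{-1} ~ IW(v0, S).
    S_inv = np.linalg.inv(v0 * R0)
    Z = rng.multivariate_normal(np.zeros(2), S_inv, size=v0)
    Gamma = np.linalg.inv(Z.T @ Z)

    # N(0, c) truncated to [-1, 1], by simple rejection sampling.
    rho = rng.normal(0.0, np.sqrt(c))
    while abs(rho) > 1.0:
        rho = rng.normal(0.0, np.sqrt(c))

    # Inverse gamma: reciprocal of a Gamma(shape=a1, rate=a2) draw.
    sigma2 = 1.0 / rng.gamma(shape=a1, scale=1.0 / a2)
    return beta, Gamma, rho, sigma2

rng = np.random.default_rng(0)
beta, Gamma, rho, sigma2 = draw_from_priors(rng, k=3)
```

Rejection sampling is adequate for the truncated normal here because a large share of the N(0, c) mass lies inside [−1, 1] for moderate c; for very large c an inverse-CDF draw would be more efficient.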
This model differs from the specification used in Abraham and Farber (1987), Altonji and Shakotko (1987), Topel (1991) and Altonji and Williams (1997). The models used in these studies specify a single wage equation that includes experience and seniority. They allow for unobserved heterogeneity to affect both wages and seniority and try to control for it using various methods. Abraham and Farber (1987) argue that including completed job duration in the earnings function eliminates the correlation of tenure with the error component. Altonji and Shakotko (1987) present an instrumental variable solution, using within-job variation in tenure as an instrument. Topel (1991), on the other hand, uses a two-step estimation procedure that yields an estimate of the lower bound of the return to tenure that is much higher than the results of the other two studies mentioned above. In this model, the correlation between seniority and the unobserved heterogeneity component is controlled for by explicitly modelling the mobility decision and allowing the random effects to be correlated across the mobility and wage equations.
A second difference is that, unlike Topel (1991), who assumes starting wages to be a function of past experience only, I allow starting wages to vary with the level of past seniority as well. The reason for doing so is that while experience measures the amount of general human capital that is completely transferable across jobs, seniority measures not only job-specific human capital, but also human capital that may be transferable to some jobs but not others (e.g. industry-specific human capital). Neal (1995) investigates the existence of industry-specific human capital using the Displaced Workers Survey. He argues that a complete explanation of the relationship between wages and seniority must involve not only factors that are specific to a particular job, but also industry- or occupation-specific skills. This specification allows me to measure both the returns to seniority within a job and the effect of accumulated industry-specific human capital on the initial wages at a new job.
2.2 Data
Following previous studies of the returns to seniority mentioned above (Abraham and Farber, 1987; Altonji and Shakotko, 1987; Topel, 1991; Altonji and Williams, 1997; Buchinsky et al., 2008), I use the data from the Panel Study of Income Dynamics (PSID). This data set is particularly attractive for longitudinal research of labour transitions for a number of reasons.^{3} First, individuals are followed for very long periods of time; particular efforts were made to follow individuals even when they change place of residence. Second, information is collected on a wide range of topics, including information about an individual's jobs and family composition. Third, the families are followed across generations; when children of original sample members leave home, they remain in the sample.
I use 18 waves of the PSID, spanning the years 1975 through 1992. The sample used in this study is restricted to household heads who are between the ages of 18 and 60, are in the sample for at least three consecutive years, and participate in the labour force in every year they appear in the sample. All observations from the poverty subsample are excluded. In addition, high school dropouts are excluded from the estimation, because the requirement that individuals work in each year they appear in the sample reduces the size of this group by more than 50%. The resulting data set includes 2741 individuals with 12 or more years of education.
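The restrictions above can be sketched as a simple per-person filter. This is an illustration under stated assumptions, not the extraction code used for the paper: the record field names (`year`, `is_head`, `age`, `worked`, `poverty_subsample`, `education`) are hypothetical, and the sketch assumes that the head and age restrictions select person-years while the remaining restrictions apply to the person as a whole.

```python
def longest_run(years):
    """Length of the longest run of consecutive calendar years."""
    best = run = 0
    prev = None
    for y in sorted(set(years)):
        run = run + 1 if prev is not None and y == prev + 1 else 1
        best = max(best, run)
        prev = y
    return best

def keep_individual(records):
    """records: list of dicts, one per year the person is observed (hypothetical layout)."""
    # Person-years in the 1975-1992 waves, as a household head aged 18 to 60.
    rows = [r for r in records
            if 1975 <= r["year"] <= 1992 and r["is_head"] and 18 <= r["age"] <= 60]
    if not rows or any(r["poverty_subsample"] for r in rows):
        return False
    works_every_year = all(r["worked"] for r in rows)
    has_12_years = all(r["education"] >= 12 for r in rows)
    return works_every_year and has_12_years and longest_run([r["year"] for r in rows]) >= 3
```

For example, a person observed in 1980, 1982 and 1983 is dropped under this reading because the longest run of consecutive years is two.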
Particular effort has been made to cross-check and validate the constructed variables related to experience and seniority. Topel (1991) gives a detailed explanation of the problematic features of the data, especially the tenure data. For example, tenure variables are sometimes recorded in wide intervals, are missing, or are inconsistent with participation and mobility data. Consequently, the data were thoroughly checked.^{4}
I estimate the model separately for two education groups (high school and college graduates) to allow the human capital accumulation process to vary with education level. The dependent variables are the logarithm of real annual income, W_{it}, expressed in 1987 US dollars and adjusted for full-time work (earnings are adjusted if the person worked fewer than 50 weeks), and an indicator of labour mobility, M_{it}. The indicator equals one (a job change is recorded in year t) if the person changed employers in year t − 1 but worked in the new job for less than half of the time he was employed in that year, or if he changed employers in the first six months (for full-time work) of year t.
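The timing rule for the mobility indicator can be written as a small function. This is one reading of the rule as stated, with hypothetical argument names, and it measures time in months as an assumption; it ignores the full-time qualification, whose exact role is not spelled out.

```python
def mobility_indicator(changed_in_prev_year, months_in_new_job_prev_year,
                       months_employed_prev_year, changed_in_first_half):
    """Return 1 if a job change is attributed to year t, else 0 (illustrative reading)."""
    # A change made late in year t-1: the new job covers less than half of the
    # time the person was employed in t-1, so the change is assigned to year t.
    late_change_prev_year = (
        changed_in_prev_year
        and months_in_new_job_prev_year < 0.5 * months_employed_prev_year
    )
    # A change made in the first six months of year t is also assigned to year t.
    return int(late_change_prev_year or changed_in_first_half)
```

The rule assigns a job change to the year in which the new job accounts for most of the reported annual income, which keeps the mobility indicator aligned with the wage observation.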
Summary statistics are presented in Table 1. Several changes over time can be observed, such as an increase in both sample size and annual income. Despite a modest amount of sample attrition, the sample size increases over the years for two reasons. First, children of the originally surveyed families became household heads and were added to the sample. Second, new families were added to the sample. Annual income increases over time, a fact that can be explained by the sample restriction rule. While average experience and seniority increase, the education level does not change, which is consistent with following individuals over time. The proportion of individuals who change jobs each year ranges from 8% to 20% and decreases over time.
Table 1. Summary Statistics for the PSID Extract for Selected Years, 1976–1992.

                                                        Year
    Variable                        1976        1980        1986        1992

Individual and family characteristics
    Observations                    1028        1384        1735        2066
 1. Mobility                      0.0759      0.1828      0.1055      0.0774
                                 (0.2649)    (0.3866)    (0.3073)    (0.2674)
 2. Log wages                    10.1452     10.1527     10.2482     10.2631
                                 (0.6353)    (0.6433)    (0.7190)    (0.7228)
 3. Education                    14.6255     14.4588     14.7499     14.7231
                                 (1.9880)    (1.9416)    (1.8326)    (1.7845)
 4. Experience                   13.5136     14.3559     16.9879     20.5765
                                 (8.9736)    (8.9566)    (9.3086)    (9.5802)
 5. Seniority                     4.4839      4.1220      6.3074      8.2615
                                 (5.2849)    (5.3954)    (5.9456)    (7.2034)
 6. Black                         0.1654      0.2038      0.2035      0.1805
                                 (0.3717)    (0.4029)    (0.4027)    (0.3847)
 7. Hispanic                      0.0360      0.0332      0.0340      0.0542
                                 (0.1864)    (0.1793)    (0.1813)    (0.2265)
 8. Married                       0.8434      0.7847      0.8052      0.8103
                                 (0.3636)    (0.4112)    (0.3962)    (0.3922)
 9. Family other income           0.8640      1.7059      2.9469      4.8468
                                 (4.8345)    (7.0220)   (10.1179)   (24.3498)
10. Northeast                     0.1420      0.0853      0.0594      0.0779
                                 (0.8576)    (1.0580)    (1.1634)    (1.0365)
11. North central                 0.2023      0.1467      0.0968      0.1162
                                 (0.8803)    (1.0800)    (1.1769)    (1.0513)
12. South                         0.2597      0.2413      0.2213      0.2817
                                 (0.8977)    (1.1065)    (1.2124)    (1.0977)
13. Living in SMSA                0.7529      0.7392      0.6219      0.6234
                                 (0.4315)    (0.4393)    (0.4851)    (0.4846)
14. County unempl. rate           7.6733      6.8274      6.3164      6.6595
                                 (2.9995)    (2.4642)    (2.5440)    (2.0723)
15. Age 15 or less in 1975        0.0447      0.2045      0.3873      0.5426
                                 (0.2068)    (0.4035)    (0.4873)    (0.4983)
16. Age 16 to 25 in 1975          0.2646      0.3100      0.2755      0.2144
                                 (0.4413)    (0.4626)    (0.4469)    (0.4105)
17. Age 26 to 35 in 1975          0.4446      0.3194      0.2282      0.1684
                                 (0.4972)    (0.4664)    (0.4198)    (0.3743)
18. Age 36 to 45 in 1975          0.1508      0.1062      0.0726      0.0513
                                 (0.3580)    (0.3082)    (0.2596)    (0.2207)

The data set used in Topel (1991) was based on the earlier waves of the PSID, 1968–1983. Compared with the individuals in his data, individuals in my sample have, on average, more years of education: 14.5 versus 12.6 years. This difference can be explained by the exclusion of high school dropouts. At the same time, average experience and seniority are lower in my sample (17 and 6 years, respectively, versus 20 and 10).
Looking at the demographic variables, we can see that about 20% of the sample are African-American, while only 4% are Hispanic. The fraction of Hispanics in the sample increases between 1988 and 1990 as a result of efforts made by the PSID to collect information for individuals who left the sample in the previous years. The proportion of small children in the sample decreases over time, while the number of children remains roughly constant. Geographic location variables indicate a certain degree of geographic mobility. The proportion of older individuals decreases over time, while the proportion of younger individuals increases, as is evident from the cohort variables.
2.3 Estimation Procedure
The goal in estimating the model is to summarize the marginal posterior distribution of the model parameters. The complexity of the model does not allow me to analytically derive marginal posterior distributions of the elements of the parameter vector, θ. Hence, I estimate the posterior distribution of the parameters of the model using the Markov chain Monte Carlo simulation method, specifically the Gibbs sampler (Gelfand and Smith, 1990; Casella and George, 1992) and the Metropolis–Hastings algorithm (Hastings, 1970; Chib and Greenberg, 1995).
The Gibbs sampler allows me to obtain a sample of draws from the marginal posterior distributions of the parameters by sequentially sampling from the posterior distribution of each block of parameters, conditional on the latest draws of the other parameters. The derivation of the conditional densities is further complicated by the fact that both M*_{it} and μ_{i} are unobserved. I follow Chib and Greenberg (1998) and augment the parameter vector, θ, to include both the latent utility, M*, and the random effects, μ, so that θ = {β, Σ, Γ, M*, μ}.
The Gibbs sampler was run for 7000 iterations, and the first 2000 iterations were discarded. The length of the chains is chosen to ensure convergence and to leave a sufficient number of draws from the post-burn-in run. Convergence and mixing of the chains are discussed in Appendix A.
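The data-augmentation idea can be illustrated with a much simpler sampler than the one used in this paper. The sketch below is the standard data-augmentation Gibbs sampler for a single probit equation for the job-change indicator: it has no random effects, no wage equation and no Σ or Γ blocks, and it assumes a N(0, prior_var · I) prior on β. The function names and the prior variance are assumptions made for illustration; only the default chain length and burn-in mirror the text.

```python
import numpy as np
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for cdf and inverse cdf

def draw_latent(mean, moved, rng):
    """Draw M* ~ N(mean, 1), truncated to (0, inf) if moved == 1 and to (-inf, 0] otherwise."""
    u = rng.uniform(size=mean.shape[0])
    out = np.empty_like(mean)
    for i, (m, d) in enumerate(zip(mean, moved)):
        p0 = _STD.cdf(-m)                      # P(M* <= 0) given the current mean
        q = p0 + u[i] * (1.0 - p0) if d else u[i] * p0
        q = min(max(q, 1e-12), 1.0 - 1e-12)    # crude guard against inv_cdf(0) or inv_cdf(1)
        out[i] = m + _STD.inv_cdf(q)
    return out

def gibbs_probit(X, moved, n_iter=7000, burn_in=2000, prior_var=100.0, seed=0):
    """Return post-burn-in draws of beta for a probit model with latent-utility augmentation."""
    rng = np.random.default_rng(seed)
    n, k = X.shape
    # With unit error variance, beta | M* is normal with covariance V and mean V X'M*.
    V = np.linalg.inv(X.T @ X + np.eye(k) / prior_var)
    L = np.linalg.cholesky(V)
    beta = np.zeros(k)
    draws = np.empty((n_iter - burn_in, k))
    for it in range(n_iter):
        m_star = draw_latent(X @ beta, moved, rng)                  # block 1: latent utilities
        beta = V @ (X.T @ m_star) + L @ rng.standard_normal(k)      # block 2: coefficients
        if it >= burn_in:
            draws[it - burn_in] = beta
    return draws
```

Each pass alternates between the latent utilities and the coefficients, which is the same conditioning logic as in the full sampler; the full model adds conditional draws for the random effects and for the two covariance matrices.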
One of the goals of this paper is to learn to what extent the amount of schooling affects wage growth. To answer this question the estimation was carried out for two education groups, referred to as ‘high school graduates’ and ‘college graduates’, having 12 to 16 years of education and more than 16 years of education, respectively.
The mobility equation includes a constant, education, lagged experience and its square, lagged seniority and its square, other family income, the county unemployment rate, the number of children in the family, the number of children younger than 2 years old and between 2 and 5 years old, three region dummies, an SMSA dummy, dummies for African-American and Hispanic, a dummy for married, cohort effects measured using age categories in 1975 (15 or less, 16 to 25, 26 to 35 and 36 to 45) and calendar year dummies.^{5}
The wage equation includes a constant, education, experience and its square, seniority and its square, a set of variables to allow for discontinuous wage changes associated with labour mobility,^{6} the county unemployment rate, three region dummies, an SMSA dummy, dummies for African-American and Hispanic, cohort effects measured using age categories in 1975 (15 or less, 16 to 25, 26 to 35 and 36 to 45) and calendar year dummies.
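The two variable lists imply a set of exclusion restrictions, which the following sketch makes explicit by listing the mobility regressors that have no counterpart, current or lagged, in the wage equation. The variable names are hypothetical labels chosen for illustration, and groups of dummies are represented by a single label.

```python
MOBILITY_VARS = [
    "constant", "education",
    "lag_experience", "lag_experience_sq", "lag_seniority", "lag_seniority_sq",
    "other_family_income", "county_unemployment_rate",
    "n_children", "n_children_under_2", "n_children_2_to_5",
    "region_dummies", "smsa", "african_american", "hispanic", "married",
    "cohort_dummies", "year_dummies",
]
WAGE_VARS = [
    "constant", "education",
    "experience", "experience_sq", "seniority", "seniority_sq",
    "job_change_history", "county_unemployment_rate",
    "region_dummies", "smsa", "african_american", "hispanic",
    "cohort_dummies", "year_dummies",
]

def excluded_from_wages(mobility_vars, wage_vars):
    """Mobility regressors with no current or lagged counterpart in the wage equation."""
    wage = set(wage_vars)
    return [v for v in mobility_vars
            if v not in wage and v.replace("lag_", "", 1) not in wage]

print(excluded_from_wages(MOBILITY_VARS, WAGE_VARS))
```

Under these labels the excluded variables are other family income, the three children variables and the marriage dummy, which matches the family composition and family income variables named in Section 2.1 as affecting mobility but not wages.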