Abstract
 Top of page
 Abstract
 1. INTRODUCTION
 2. OAXACA-BLINDER DECOMPOSITION
 3. REGRESSION DECOMPOSITION OF A DIFFERENCE IN RATES
 4. DETAILED DECOMPOSITION
 5. VARIABILITY IN DECOMPOSITION ESTIMATES
 6. EXAMPLE
 7. DISCUSSION
 REFERENCES
We develop a regression decomposition technique for hazard rate models, in which the difference in observed rates is decomposed into components attributable to group differences in characteristics and group differences in effects. The baseline hazard is specified using a piecewise constant exponential model, which leads to convenient estimation based on a Poisson regression model fit to person-period, or split-episode, data. This specification allows for a flexible representation of the baseline hazard and provides a straightforward way to introduce time-varying covariates and time-varying effects. We provide computational details underlying the method and apply the technique to the decomposition of the black-white difference in first premarital birth rates into components reflecting characteristics and effect contributions of several predictors, as well as the effect contribution attributable to race differences in the baseline hazard.
1. INTRODUCTION
Hazard rate models have been used by social researchers to study fertility, mortality, job mobility, and other processes involving transitions from one state to another over time. Interest generally focuses on how rates respond to changes in individual and structural characteristics or how these factors shape differences in rates across social groups. Understanding the sources of group differences in rates can inform policymakers and scholars alike about the impact of compositional differences across groups and the effects of group differences in returns-to-risk associated with certain individual-level and structural characteristics. Multivariate decomposition analysis is an appropriate tool for this purpose.
Multivariate decomposition is widely used in social research to quantify the contributions to group differences in average predictions from multivariate models. The technique utilizes the output from regression models to parcel out components of a group difference in a statistic, such as a mean or proportion, which can be attributed to compositional differences between groups (i.e., differences in characteristics or endowments) and to differences in the effects of characteristics (i.e., differences in the returns, coefficients, or behavioral responses). These techniques are equally applicable for partitioning change over time into components attributable to changing effects and changing composition.
Decomposition techniques for linear regression models have been used for many decades in sociological research. This heterogeneous collection of techniques is more generally referred to as regression standardization (Althauser and Wigler 1972; Duncan 1969; Duncan, Featherman, and Duncan 1968; Coleman and Blum 1971; Coleman, Berry, and Blum 1971; Winsborough and Dickinson 1971). Demographic standardization and decomposition techniques, generally referred to as component analysis, have a much longer history, and were formally developed by Kitagawa (1955) and generalized by Das Gupta (1993). This technique is also known as “shift-share” analysis, and has been used to decompose differences in rates and inequality measures (e.g., see Shorrocks 1980, 1982; Williams 1991). Unlike regression-based approaches that rely on individual-level observational data, component and shift-share analysis utilize aggregate data, often in the form of published tables. Oaxaca (1973) and Blinder (1973) are usually credited with introducing regression decomposition in the econometric literature in the early 1970s. Although their methods are formally identical to those developed by sociological methodologists and demographers, the technique has become more commonly known as Oaxaca-Blinder, Oaxaca, or Blinder-Oaxaca decomposition.
Regression decomposition has been extended to nonlinear models including probit (Gomulka and Stern 1990), logit (Even and Macpherson 1993; Fairlie 2005; Nielsen 1998; Yun 2005a), and count models (e.g., see Bauer et al. 2007; Heitmueller 2004; Park and Lohr 2008). For linear regression, logit, and count models, the observed difference in group means, proportions, or counts (i.e., a difference in the “first moment”) is additively decomposed into a characteristics (or endowments) component and a coefficient (or effects) component. It should be noted that in any given application a researcher may be interested in one or the other of these components—for example, in the portion of the total differential that could be attributed to compositional differences between groups, or to the change in characteristics over time for a single group (e.g., see Even and Macpherson 1993; Nielsen 1998).^{1}
The rationale for extending multivariate decomposition to rate models comes from considering parallels with traditional approaches along with the conveniences achieved by adopting the more widely used regression-based decomposition techniques. The traditional demographic approach of component analysis is a form of decomposition that seeks to partition a difference in rates into components due to compositional differences between groups and to group differences in rates (Kitagawa 1955). Aggregate data are required for traditional component analysis, which has an advantage insofar as analysis can be carried out based on published data tables (e.g., see Smith, Morgan, and Koropeckyj-Cox 1996). However, the method becomes increasingly complex when extended to more than a few variables, which is a disadvantage. Given the limitations of the traditional approach and the advantages of carrying out analysis using individual-level observational data, we develop a convenient regression-based method for decomposing differences in rates utilizing results from multivariate models. This approach provides a link between the traditional demographic approach of multiple component analysis for differences in rates and recent regression-based decomposition approaches.
Given the widespread use of hazard rate models in applied research in a variety of disciplines, as well as longstanding interests in understanding the sources of group disparities and changes over time, extending the regression decomposition to hazard rate models is warranted. This paper develops a multivariate decomposition technique for proportional hazard rate models that are specified with a piecewise constant baseline hazard. This approach is flexible in that it can accommodate arbitrary forms of time dependence in the baseline hazard as well as nonproportional covariate effects. The decomposition is based on a generalized linear model of the same form as the logit, probit, and loglinear models for which software has been developed and extensions may be easily implemented. However, complexities are introduced that are not present in other regression decomposition methods.
In this paper we discuss refinements to the Oaxaca-Blinder decomposition method that lead to a practical approach for multivariate decomposition of a difference in rates. Section 2 reviews the standard Oaxaca-Blinder decomposition. Section 3 discusses the specification of the hazard rate model and the setup for the multivariate decomposition of rates. Section 4 discusses the detailed (covariate by covariate) decomposition, and Section 5 discusses sampling variability of the estimates. Section 6 provides an illustrative example, and Section 7 provides a discussion of extensions and limitations of the technique.
2. OAXACA-BLINDER DECOMPOSITION
The Oaxaca-Blinder technique is the most familiar and widely used decomposition technique for linear models. The approach has been applied in research on wage differentials with the goal of understanding the relative roles played by group differences in levels of certain characteristics and group differences in the effects of those characteristics on wage differentials. For example, it is often argued that the portion of the wage differential that cannot be accounted for by group differences in characteristics is the result of labor market discrimination or differences in the returns to human capital factors such as education or job experience, and differences in unmeasured factors.
Oaxaca-Blinder regression decomposition begins with a linear model estimated separately for two groups, or for one group at two time points, indexed by j,
 \hat{y}_{ij} = x_{ij}^{\prime} b_{j}, \qquad (1)
where \hat{y}_{ij} denotes the fitted value of y for the ith individual in the jth group, x_{ij} is a collection of measured characteristics for that individual (a K × 1 vector that includes a constant term), and b_{j} is the set of estimated regression coefficients (a K × 1 vector). The difference in average predictions can be partitioned into the sum of two components as,^{2}
 \bar{\hat{y}}_{1} - \bar{\hat{y}}_{2} = (\bar{x}_{1} - \bar{x}_{2})^{\prime} b_{1} + \bar{x}_{2}^{\prime} (b_{1} - b_{2}) = E + C. \qquad (2)
The first component reflects the contribution to the total differential due to group differences in the mean values of x, holding the effects constant at group 1 levels. This component is called the explained component, or endowment or characteristics effect, which is generally denoted by E. The second component reflects the portion of the differential due to group differences in b, holding the mean value of characteristics constant at group 2 levels, which is generally denoted by C. This component is called the unexplained component or coefficients effect. An equivalent decomposition, albeit with a change of sign, results from switching the roles of the comparison (group 1) and the reference group (group 2). In practice, both sets of results are reported or the results from the two separate decompositions are averaged. It is also possible to base the decomposition on results from various forms of pooled regression models (e.g., see Jann 2008 for a review).
In addition to a decomposition of the overall difference, we are often interested in the unique contribution of each covariate to the overall difference, or the detailed decomposition. For example, if groups differ on levels of education and returns to education, it would be desirable to isolate the distinct contributions to the total differential attributable to differences in levels of, and returns to, education along with the unique contributions of the other predictors in the model.
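The linear decomposition described above can be sketched numerically. The following is a minimal illustration, assuming made-up group means and coefficients (the variable names and all numbers are hypothetical, not taken from any data in this paper). It computes the characteristics component E with effects held at group 1 levels, the coefficients component C with means held at group 2 levels, and the covariate-by-covariate contributions, which in the linear case are simply the individual summands.

```python
# Illustrative two-group linear decomposition (all numbers are made up).

def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

# Mean characteristics (first element is the constant term) and coefficients.
xbar_1 = [1.0, 12.0, 8.0]   # group 1 means: constant, schooling, experience
xbar_2 = [1.0, 10.5, 9.0]   # group 2 means
b_1 = [1.20, 0.10, 0.03]    # group 1 coefficients
b_2 = [1.00, 0.08, 0.02]    # group 2 coefficients

# With a constant in the model, the mean fitted value equals xbar' b.
ybar_1 = dot(xbar_1, b_1)
ybar_2 = dot(xbar_2, b_2)

# E: group differences in means, effects held at group 1 levels.
E = dot([a - c for a, c in zip(xbar_1, xbar_2)], b_1)
# C: group differences in coefficients, means held at group 2 levels.
C = dot(xbar_2, [a - c for a, c in zip(b_1, b_2)])

# Detailed decomposition: in the linear case each covariate's share is its summand.
E_k = [(a - c) * b for a, c, b in zip(xbar_1, xbar_2, b_1)]
C_k = [x * (a - c) for x, a, c in zip(xbar_2, b_1, b_2)]

print("difference:", ybar_1 - ybar_2, "E:", E, "C:", C)
```

Because the model is linear, E and C add up to the raw difference exactly and the detailed terms add up to E and C without any weighting scheme; the nonlinear case discussed next does not share this property.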
The Oaxaca technique is applicable when group differences in sample means or changes in sample means over time are the focus of inference. However, many sociodemographic outcomes involve differences in predicted rates or proportions estimated from nonlinear response models. It is well known that the usual Oaxaca method of mean/coefficient substitution is not strictly applicable to nonlinear response models, hence the recent interest in extending the method to this class of models. Moreover, for nonlinear response models, the results from the detailed decomposition are sensitive to the order in which variables enter the decomposition. Various methods have been proposed to overcome this dependency, including averaging over all possible orders of covariate replacement (Fairlie 2005) and by determining the relative contribution of each variable to each component using a set of appropriately constructed weights (Even and Macpherson 1993; Nielsen 1998; Yun 2005a).
This paper builds on previous research by Even and Macpherson (1993) and Nielsen (1998), who extend the Oaxaca-Blinder approach to binary response models. These methods, as well as several innovative extensions, have been developed in a more systematic way by Yun (2004), who addressed several weaknesses in past approaches to multivariate decomposition of nonlinear response models (e.g., see Fairlie 2005). Yun's estimator is simple to calculate and its sampling distribution can be obtained using asymptotic theory (e.g., see Yun 2005a).
3. REGRESSION DECOMPOSITION OF A DIFFERENCE IN RATES
We follow the logic used in previous research on multivariate decomposition of binary response models, introducing modifications for rate models. As an illustrative example, we decompose the observed difference in premarital birth rates for non-Hispanic blacks and whites using data from the 1979 cohort of the National Longitudinal Survey of Youth (NLSY). We define the empirical rate in the conventional way as the number of events divided by the total amount of exposure to risk. Let d_{i} be a binary variable coded 1 if an event occurs for individual i at age t_{i}, and 0 otherwise (i.e., t_{i} is right censored). The observed rate can be expressed as r = \sum_{i} d_{i} / \sum_{i} t_{i}. The black-white difference in rates is expressed as
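 r_{B} - r_{W} = \bar{\Lambda}_{B} - \bar{\Lambda}_{W}, \qquad (3)
where the indices B and W denote the higher-risk (non-Hispanic black) and lower-risk (non-Hispanic white) group, respectively, and \bar{\Lambda}_{j}, j ∈ {B, W}, is computed as
 \bar{\Lambda}_{j} = \frac{\sum_{i} \sum_{l} \Lambda_{ilj}}{\sum_{i} \sum_{l} \Delta t_{ilj}}, \qquad (4)
where
 \Lambda_{ilj} = \int_{0}^{\Delta t_{ilj}} \lambda_{ilj} \, du \qquad (5)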
is the estimated cumulative, or integrated, hazard associated with the ith individual in the lth time interval from a piecewise constant exponential hazard rate model. We can view Λ_{ilj} as the expected number of events experienced by the ith individual in the lth interval of exposure to risk, assuming a time-homogeneous Poisson process with rate λ_{ilj} that is observed until either a first event occurs or the subinterval of time has elapsed without the event occurring (e.g., see Aitkin and Clayton 1980; Barlow and Proschan 1975; Holford 1980). For the (piecewise constant) exponential model, the total number of events in group j equals the sum of the estimated integrated hazards. That is,
 \sum_{i} d_{ij} = \sum_{i} \sum_{l} \Lambda_{ilj}. \qquad (6)
Each individual contributes Δt_{il} units of exposure to the lth time interval. It follows that an individual's total exposure equals the individual's event or censoring age, t_{i} = \sum_{l} Δt_{il}. The number and the widths of the age intervals are chosen exogenously and must be the same for each group. Specifically, we define a set of cutpoints, [τ_{0}, τ_{1}), [τ_{1}, τ_{2}), … , [τ_{l}, ∞), or pieces, for the piecewise constant model that are common to both groups. Then, given an individual's event or censoring time t_{i}, we determine an individual's exposure in the lth interval as
 \Delta t_{il} = \begin{cases} \tau_{l} - \tau_{l-1} & \text{if } t_{i} \ge \tau_{l}, \\ t_{i} - \tau_{l-1} & \text{if } \tau_{l-1} \le t_{i} < \tau_{l}, \\ 0 & \text{if } t_{i} < \tau_{l-1}. \end{cases} \qquad (7)
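This results in n_{i} subepisodes of risk for individual i. Note that the sum of the subinterval exposures over all individuals necessarily equals the total exposure in the sample, i.e., \sum_{i} \sum_{l} Δt_{il} = \sum_{i} t_{i}. Combining this with equation (6), we can show the equivalence between the exposure-averaged predicted event counts and the observed rates,
 \frac{\sum_{i} \sum_{l} \Lambda_{ilj}}{\sum_{i} \sum_{l} \Delta t_{ilj}} = \frac{\sum_{i} d_{ij}}{\sum_{i} t_{ij}} = r_{j}. \qquad (8)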
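The bookkeeping behind the exposure rule and the equivalence between exposure-averaged predictions and observed rates can be checked on a toy sample. The sketch below is illustrative only: the ages and event indicators are made up, and it assumes the first interval starts at τ_{0} = 0 so that interval exposures sum to the event or censoring age (the text does not state the origin explicitly). To keep the example closed-form it uses a baseline-only model, for which the maximum likelihood estimate of each piecewise constant rate is events divided by exposure in that interval.

```python
# Piecewise constant exposure bookkeeping on a tiny made-up sample.
# Assumption: the first interval starts at tau_0 = 0 so exposures sum to age t.
CUTS = [0.0, 16.0, 18.0, 20.0, 22.0, 24.0]  # lower bounds; last interval is [24, inf)

def exposures(t, cuts=CUTS):
    """Exposure contributed to each interval by event/censoring age t."""
    out = []
    for l, lower in enumerate(cuts):
        upper = cuts[l + 1] if l + 1 < len(cuts) else float("inf")
        out.append(max(0.0, min(t, upper) - lower))
    return out

# (age at event or censoring, event indicator d); hypothetical values
sample = [(15.5, 1), (17.2, 1), (19.0, 0), (21.4, 1), (23.0, 0), (26.5, 1), (30.0, 0)]

events = [0.0] * len(CUTS)
exposure = [0.0] * len(CUTS)
for t, d in sample:
    dt = exposures(t)
    last = max(l for l, e in enumerate(dt) if e > 0)  # interval containing t
    events[last] += d
    for l, e in enumerate(dt):
        exposure[l] += e

# With interval dummies only, the Poisson/exponential MLE of each rate
# is events / exposure within the interval.
rates = [ev / ex for ev, ex in zip(events, exposure)]

# Integrated hazards summed over all person-periods reproduce the event count,
# so the exposure-averaged prediction equals the observed rate.
predicted_events = sum(r * ex for r, ex in zip(rates, exposure))
observed_rate = sum(d for _, d in sample) / sum(t for t, _ in sample)
predicted_rate = predicted_events / sum(exposure)

print(observed_rate, predicted_rate)
```

With covariates in the model the same identity holds at the maximum likelihood estimates whenever the interval dummies are included, because the Poisson score equations for the dummies force fitted and observed event counts to agree.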
For our illustrative example, we adopt a proportional hazards model with piecewise constant hazards over six age intervals: [12,16), [16,18), [18,20), [20,22), [22,24), and [24+). This allows time dependence in the baseline hazard by age and is similar to the partitioning used by Wu and Martinson (1993), Powers (2001), and others. It is convenient to parameterize the baseline log hazard separately from the structural part of the model by excluding the conventional intercept and including a set of dummy variables for the six age intervals, D_{i1j}, … , D_{i6j} and a corresponding set of parameters for the log baseline hazard, a_{1j}, … , a_{6j}, which results in the following model specification for the rate:
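 \lambda_{ilj} = \exp\left( a_{1j} D_{i1j} + \cdots + a_{6j} D_{i6j} + z_{ij}^{\prime} \gamma_{j} \right), \qquad (9)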
where z denotes the vector of predictors and γ denotes the corresponding vector of coefficients.
This is a proportional hazards model that is semiparametric in the sense of a Cox (1972) proportional hazards model as the number of time intervals increases.^{3} Assuming a constant exponential hazard for each piece, the integrated hazard in equation (5) can be written as Λ_{ilj}=Δt_{ilj}λ_{ilj}. We exploit the similarity between the loglinear model for counts and the exponential model for rates by including the logged exposure to risk in the lth interval (logΔt_{il}) as an “offset” term in a Poisson regression model. It is well known that this approach yields a piecewise constant exponential hazard rate model (e.g., see Holford 1976, 1980; Laird and Oliver 1981). Equation (5) can now be written as
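 \Lambda_{ilj} = \exp\left( a_{1j} D_{i1j} + \cdots + a_{6j} D_{i6j} + z_{ij}^{\prime} \gamma_{j} + \log \Delta t_{il} \right). \qquad (10)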
It should be noted that while modeling is based on equation (10) using Poisson regression, we actually decompose the difference in the rates given in equation (9). That is, x = (D_{1}, … , D_{6}, z) and b = (a_{1}, … , a_{6}, γ). There are advantages to this formulation apart from the fact that standard programs can be used to estimate the models (i.e., Stata glm, R glm, and SAS proc genmod). Nonproportional covariate effects can be introduced by replacing γ_{j} in equation (10) with γ_{lj}, that is, by including interactions of covariates (z) and the dummy variables (D) for the age intervals. Similarly, time-varying covariates can be included in the model, with possibly different values of z for each interval. The calculations above are facilitated by arranging the input data in the form of a split-episode data structure, with n_{i} periods of risk (i.e., person-periods or stacked data) allocated to individual i. In this case the double summations in equation (4) are replaced by single summations over the person-period data (e.g., see Allison 1982).
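As a rough illustration of the split-episode layout, the sketch below expands one individual's record into person-period rows containing the interval dummies, the covariates, the event indicator, and the logged exposure used as the offset. The function name `person_periods`, the dictionary keys, the cutpoints starting at 0, and the covariate values are all assumptions made for this example. The optional `nonproportional` flag builds the dummy-by-covariate interaction columns that correspond to interval-specific effects.

```python
import math

# Assumption: the first interval starts at 0; cutpoints are lower bounds.
CUTS = [0.0, 16.0, 18.0, 20.0, 22.0, 24.0]

def person_periods(person_id, t, d, z, cuts=CUTS, nonproportional=False):
    """Expand one individual into person-period rows for a Poisson model with offset."""
    rows = []
    n_int = len(cuts)
    for l, lower in enumerate(cuts):
        upper = cuts[l + 1] if l + 1 < n_int else float("inf")
        dt = min(t, upper) - lower
        if dt <= 0:
            break  # no exposure in this or any later interval
        dummies = [1.0 if m == l else 0.0 for m in range(n_int)]
        if nonproportional:
            # interval-specific covariate effects: D_l * z_k for every l and k
            covs = [dm * zk for dm in dummies for zk in z]
        else:
            covs = list(z)
        # the event, if any, is recorded in the interval that contains t
        event = 1 if (d == 1 and t <= upper) else 0
        rows.append({"id": person_id, "interval": l, "event": event,
                     "x": dummies + covs, "offset": math.log(dt)})
    return rows

# One hypothetical woman with an event at age 19 and two covariates.
rows = person_periods(1, 19.0, 1, [0.25, 11.0])
rows_np = person_periods(1, 19.0, 1, [0.25, 11.0], nonproportional=True)
for r in rows:
    print(r["interval"], r["event"], round(math.exp(r["offset"]), 2))
```

Rows stacked over all individuals can then be passed to any Poisson regression routine that accepts an offset, with `event` as the response and `x` as the design row (no separate intercept). Time-varying covariates would simply supply a different `z` for each interval.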
We would like to decompose the overall difference in equation (3) into components that reflect compositional differences between groups and differences in the effects of those characteristics between groups similar to what was done in equation (2). We can rewrite equation (3) as^{4}
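 r_{B} - r_{W} = \underbrace{\left[ \bar{\Lambda}(x_{B}; b_{B}) - \bar{\Lambda}(x_{W}; b_{B}) \right]}_{E} + \underbrace{\left[ \bar{\Lambda}(x_{W}; b_{B}) - \bar{\Lambda}(x_{W}; b_{W}) \right]}_{C}, \qquad (11)
where \bar{\Lambda}(x_{j}; b_{k}) denotes the exposure-averaged prediction evaluated at group j's characteristics and group k's coefficients.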
The E component appearing in equation (11) is the portion of the differential attributed to compositional differences or differences in “endowments,” which is the predicted premarital birth rate for blacks minus the predicted rate if whites experienced the same returns to risk, or behavioral responses, to characteristics as blacks. This component reflects the contribution to the difference that would have occurred if the two groups differed with respect to characteristics alone. The C component in equation (11) is the portion of the black-white gap attributable to differences in the coefficients, and it reflects the contribution to the difference that would prevail if only the covariate effects differed across groups. Both groups’ characteristics are held fixed at white levels to assess this component.
In the expressions above, the coefficients for the black sample are used as weights in the composition (E) component and the white covariate values are used as weights in the coefficient (C) component, making blacks the comparison group and whites the reference group in this case. The same differential (with a change in sign) can be obtained from an alternative decomposition that switches the roles of the reference and comparison groups. This is referred to as the “indexing” problem (Neumark 1988; Oaxaca and Ransom 1988, 1994).
By fixing the coefficients in the composition component to black levels, we assess the contribution to the black-white gap that would have occurred if the returns to risk associated with the covariates in the model were fixed to the values in the black sample. By fixing characteristics to white levels in the coefficient component, we assess the contribution to the differential that is due to the black-white difference in effects. An equivalent decomposition would reverse this procedure. That is, we could perform a different decomposition by weighting the composition component by the white coefficient values while using the observed characteristics of blacks as weights in the coefficient component. Sometimes the average of the results of the two specifications is reported.
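The aggregate decomposition and its reversed counterpart can be sketched on toy person-period data. This is a minimal illustration: the records, the coefficient vectors, and the helper name `mean_rate` are hypothetical. Because the coefficients here are made up rather than estimated, the total is a difference in predicted rates, not an observed rate difference. Both orderings below keep the black-minus-white direction so that they can be averaged directly.

```python
import math

def mean_rate(records, b):
    """Exposure-weighted average predicted rate: sum(exp(x'b) * dt) / sum(dt)."""
    num = sum(math.exp(sum(xk * bk for xk, bk in zip(x, b))) * dt for x, dt in records)
    den = sum(dt for _, dt in records)
    return num / den

# Person-period records (x, dt) with x = [D_1, D_2, z]; all values are made up.
pp_B = [([1, 0, 0.6], 4.0), ([0, 1, 0.6], 1.5), ([1, 0, 0.2], 4.0),
        ([0, 1, 0.2], 2.0), ([1, 0, 0.9], 3.0)]
pp_W = [([1, 0, 0.1], 4.0), ([0, 1, 0.1], 2.0), ([1, 0, 0.3], 4.0),
        ([0, 1, 0.3], 2.0), ([1, 0, 0.0], 4.0), ([0, 1, 0.0], 1.0)]
b_B = [-4.0, -2.5, 0.8]   # hypothetical coefficients for the black sample
b_W = [-5.0, -3.5, 0.5]   # hypothetical coefficients for the white sample

total = mean_rate(pp_B, b_B) - mean_rate(pp_W, b_W)

# Black coefficients weight E; white characteristics weight C.
E = mean_rate(pp_B, b_B) - mean_rate(pp_W, b_B)
C = mean_rate(pp_W, b_B) - mean_rate(pp_W, b_W)

# Reversed weighting: white coefficients in E, black characteristics in C.
E_alt = mean_rate(pp_B, b_W) - mean_rate(pp_W, b_W)
C_alt = mean_rate(pp_B, b_B) - mean_rate(pp_B, b_W)

# Average of the two specifications.
E_avg = (E + E_alt) / 2
C_avg = (C + C_alt) / 2

print(total, E, C, E_avg, C_avg)
```

Each specification adds up to the same total, but the split between E and C generally differs, which is why the choice of reference group matters and why averaged results are sometimes reported.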
4. DETAILED DECOMPOSITION
The decomposition thus far has been described at the aggregate level. To understand the unique contribution of each predictor to each component of the difference requires a detailed decomposition. That is, we wish to partition E and C into portions, E_{k} and C_{k}(k = 1, … , K) that represent the unique contribution of the kth covariate to E and C, respectively. Unlike the decomposition for a linear model, a nonlinear decomposition is sensitive to the order in which the independent variables are entered into the decomposition. This problem is referred to as “path dependence” (e.g., see Yun 2004). The two approaches to detailed decomposition outlined below provide remedies to this problem.
Fairlie (2005) adopts a multistep procedure for a decomposition based on a logit model, focusing on the characteristics component, E. The procedure requires that we perform a one-to-one matching of comparison-group and reference-group observations based on the ranking of their respective within-group predicted response probabilities. The independent contribution of a variable to E is determined by evaluating a decomposition in which one covariate value from the reference group (e.g., z_{1W}) is swapped with one from the comparison group (e.g., z_{1B}). Thus, the contribution of each variable to E is equal to the difference in the average prediction when the reference group's distribution on a variable is replaced with the comparison group's distribution on that variable while holding the distributions of the other variables constant.
This method is straightforward when the sample sizes are equal. Since this is seldom the case, modifications to the matching procedure are required. The following steps are suggested by Fairlie: (1) draw a random sample from the larger group equal in size to that of the smaller group, (2) rank each group by their respective predicted response probabilities, (3) match observations from the two samples according to their respective rankings on the predicted responses, and (4) evaluate the average group difference in the response probabilities using the sequential covariate swapping approach outlined earlier. This approach does not solve the path dependence problem unless it is accompanied by randomizing the variable swapping order in step (4). In practice, it is necessary to carry out these steps on a large number of random samples from the larger group. The results are then averaged over all the random samples.
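The four steps above, together with a randomized swapping order, can be sketched as follows. This is a simplified illustration and not Fairlie's implementation: it assumes a logit response, uses a single coefficient vector `b` both for ranking and for evaluating predictions, and works on made-up covariate rows. The function name `fairlie_E` and its arguments are hypothetical.

```python
import math
import random

def predict(x, b):
    """Logit response probability for covariate row x and coefficients b."""
    return 1.0 / (1.0 + math.exp(-sum(xk * bk for xk, bk in zip(x, b))))

def fairlie_E(X_B, X_W, b, n_reps=200, seed=0):
    """Average covariate contributions to E over random subsamples and swap orders."""
    rng = random.Random(seed)
    K = len(b)
    b_is_small = len(X_B) <= len(X_W)
    small, large = (X_B, X_W) if b_is_small else (X_W, X_B)
    totals = [0.0] * K
    for _ in range(n_reps):
        sub = rng.sample(large, len(small))                   # (1) subsample larger group
        s_sorted = sorted(small, key=lambda x: predict(x, b)) # (2) rank each group
        l_sorted = sorted(sub, key=lambda x: predict(x, b))
        # (3) match by rank; each pair is (comparison row, reference row)
        pairs = list(zip(s_sorted, l_sorted)) if b_is_small else list(zip(l_sorted, s_sorted))
        order = list(range(K))
        rng.shuffle(order)                                    # randomized swap order
        current = [list(xw) for _, xw in pairs]               # start from reference values
        prev = sum(predict(x, b) for x in current) / len(pairs)
        for k in order:                                       # (4) swap one covariate at a time
            for row, (xb, _) in zip(current, pairs):
                row[k] = xb[k]
            new = sum(predict(x, b) for x in current) / len(pairs)
            totals[k] += new - prev
            prev = new
    return [tot / n_reps for tot in totals]

# Hypothetical rows: [constant, z1, z2]
X_B = [[1, 0.6, 2.0], [1, 0.2, 3.0], [1, 0.9, 1.0], [1, 0.4, 4.0]]
X_W = [[1, 0.1, 1.0], [1, 0.3, 2.0], [1, 0.0, 0.0], [1, 0.2, 1.0], [1, 0.5, 3.0], [1, 0.1, 2.0]]
b = [-2.0, 1.5, 0.3]
contrib = fairlie_E(X_B, X_W, b)
print(contrib)
```

Within each replication the contributions telescope to the difference in average predictions between the matched samples, so the averaged contributions sum to the averaged E. The price is the repeated sampling, which is what the weighting approach discussed next avoids.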
Even and Macpherson (1993), Nielsen (1998), and Yun (2004) have suggested simpler methods for detailed decomposition using weights derived from a linearization of the decomposition equation. The detailed decompositions obtained in this way are invariant to the order that variables enter the decomposition, thus providing a solution to path dependency. It should be noted that Even and Macpherson (1993) focus on the endowment component only, whereas Nielsen (1998) focuses only on the coefficient component.
Thus, the composition weights reflect the contribution of the kth covariate to the Taylor approximation of E (E_{T}) as determined by the magnitude of the group difference in means weighted by the reference group's effect. Similarly, the coefficient weights reflect covariate k's contribution to C_{T} as determined by the magnitude of the group difference in the effects weighted by the comparison group's mean. The weights are invariant to change in the scale of the covariates.
The raw difference can now be expressed in terms of the overall components as a total of weighted sums of the unique contributions:
 (16)
This weighting method gives results that are nearly identical to the sampling and randomization procedures outlined earlier as long as enough samples are drawn.
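A small sketch of the weighting idea follows. It assumes the weights take the linearized form described in the text, namely each covariate's share of the sum of mean differences times coefficients (composition) or of means times coefficient differences (coefficients). Which group's coefficients and means serve as the fixed values is left to the caller, and all numbers, including the aggregate E and C, are hypothetical.

```python
def composition_weights(xbar_a, xbar_b, coef):
    """Share of the linearized E due to each covariate: (xbar_a_k - xbar_b_k) * coef_k, normalized."""
    terms = [(xa - xb) * ck for xa, xb, ck in zip(xbar_a, xbar_b, coef)]
    total = sum(terms)
    return [t / total for t in terms]

def coefficient_weights(xbar, coef_a, coef_b):
    """Share of the linearized C due to each covariate: xbar_k * (coef_a_k - coef_b_k), normalized."""
    terms = [xk * (ca - cb) for xk, ca, cb in zip(xbar, coef_a, coef_b)]
    total = sum(terms)
    return [t / total for t in terms]

# Hypothetical means and coefficients for three covariates.
xbar_B = [0.30, 0.60, 11.0]
xbar_W = [0.10, 0.50, 12.0]
b_B = [0.10, 0.15, -0.06]
b_W = [0.70, 0.30, -0.13]

w_x = composition_weights(xbar_B, xbar_W, b_B)   # weights for E
w_b = coefficient_weights(xbar_W, b_B, b_W)      # weights for C

# Hypothetical aggregate components from a rate decomposition.
E, C = 0.004, 0.012
E_k = [w * E for w in w_x]
C_k = [w * C for w in w_b]

print(w_x, w_b)
```

Each set of weights sums to one, so the detailed contributions add up to E and C regardless of the order in which covariates are listed. Rescaling a covariate by a constant and its coefficient by the reciprocal leaves the weights unchanged, which is the scale invariance noted above.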
5. VARIABILITY IN DECOMPOSITION ESTIMATES
Many applications ignore the sampling variability of the decomposition components (e.g., see Borooah and Iyer 2005; Sweeney and Phillips 2004; Van Hook, Brown, and Kwenda 2004). The characteristics and effects components do not provide information about the precision of the contributions to group differences per se. For this reason, it is important to gauge the sampling variability of E and C in substantive applications. Because the components used in the decomposition are functions of maximum likelihood estimates, the delta method described by Rao (1973: 321–23) can be used to derive asymptotic standard errors of the detailed contributions. Interval estimation and significance testing can be done in the usual way (e.g., see Yun 2005a). This approach utilizes expressions for the first derivatives (i.e., gradients) of the detailed components with respect to the estimates, in addition to the variance/covariance matrix of the estimates from each group, as we show next.
E and C, along with the detailed contributions, E_{k} and C_{k}, are nonlinear functions of the maximum likelihood estimates b. The derivatives of E_{k} and C_{k} with respect to b, together with the variance/covariance matrix of b, are used to obtain the asymptotic variance/covariance matrix of the detailed components. We begin by expressing the endowment component as a weighted sum of the individual contributions, E_{k},
 (17)
The kth element of the gradient vector is given by
 (18)
where
For nonlinear models in general,, j ∈{B, W}. For the models considered here
 (19)
which has a convenient form owing to the assumption of Poisson sampling.^{6}
Letting var(b_{B}) denote the variance/covariance matrix of b_{B} and E denote the K × K matrix with E_{1}, … , E_{K} on the main diagonal and zeros elsewhere, the asymptotic (co)variance matrix of the detailed characteristics component is
 (20)
Following the same logic, the coefficient component can be written as the sum of individual contributions as
 (21)
Each covariate's contribution to the overall coefficient component depends on the parameter vectors, b_{B} and b_{W}. The kth elements of the respective gradients are
 (22)
and
 (23)
where
 (24)
When j = W, this quantity has the opposite sign. Let var(b_{B}) and var(b_{W}) denote the covariance matrices of the estimates from the black and white models, respectively, and let C be a K × K matrix with C_{1}, … , C_{K} on the main diagonal and zeros elsewhere. The large-sample (co)variance matrix of the detailed coefficient components is
 (25)
Significance tests on individual components, blocks of components, or the decomposition as a whole can be carried out using Wald tests by redefining E and C to include a subset of the original set of terms along with the corresponding submatrices of var(b_{B}) and var(b_{W}). The variance estimates derived above assume that the independent variables are fixed and that groups are independent. They will underestimate the true variances if this is not the case.
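The delta-method logic can be illustrated for the aggregate components. The sketch below is a simplified version that treats E and C as scalar functions of the coefficient vectors, not the detailed component-by-component expressions: the variance of a smooth function g(b) is approximated by the quadratic form of its gradient with var(b), and the two groups' contributions are added under the independence assumption. The toy person-period data, the covariance matrices, and all function names are hypothetical.

```python
import math

def mean_rate(records, b):
    """Exposure-weighted average predicted rate."""
    num = sum(math.exp(sum(xk * bk for xk, bk in zip(x, b))) * dt for x, dt in records)
    return num / sum(dt for _, dt in records)

def grad_mean_rate(records, b):
    """Analytic gradient of mean_rate with respect to b."""
    den = sum(dt for _, dt in records)
    g = [0.0] * len(b)
    for x, dt in records:
        lam = math.exp(sum(xk * bk for xk, bk in zip(x, b)))
        for k, xk in enumerate(x):
            g[k] += lam * dt * xk / den
    return g

def quad_form(g, V):
    """g' V g for a gradient vector g and covariance matrix V."""
    n = len(g)
    return sum(g[i] * V[i][j] * g[j] for i in range(n) for j in range(n))

# Made-up person-period records (x, dt), coefficients, and covariance matrices.
pp_B = [([1, 0, 0.6], 4.0), ([0, 1, 0.6], 1.5), ([1, 0, 0.2], 4.0), ([0, 1, 0.9], 2.0)]
pp_W = [([1, 0, 0.1], 4.0), ([0, 1, 0.1], 2.0), ([1, 0, 0.3], 4.0), ([0, 1, 0.0], 1.0)]
b_B, b_W = [-4.0, -2.5, 0.8], [-5.0, -3.5, 0.5]
V_B = [[0.04, 0.0, 0.0], [0.0, 0.09, 0.0], [0.0, 0.0, 0.01]]
V_W = [[0.05, 0.0, 0.0], [0.0, 0.08, 0.0], [0.0, 0.0, 0.02]]

def E_of(b):
    """Aggregate E as a function of the black coefficient vector only."""
    return mean_rate(pp_B, b) - mean_rate(pp_W, b)

E = E_of(b_B)
C = mean_rate(pp_W, b_B) - mean_rate(pp_W, b_W)

# E depends on b_B only; C depends on both b_B and b_W.
grad_E = [gb - gw for gb, gw in zip(grad_mean_rate(pp_B, b_B), grad_mean_rate(pp_W, b_B))]
grad_C_B = grad_mean_rate(pp_W, b_B)
grad_C_W = [-g for g in grad_mean_rate(pp_W, b_W)]

var_E = quad_form(grad_E, V_B)
var_C = quad_form(grad_C_B, V_B) + quad_form(grad_C_W, V_W)  # groups assumed independent

z_E = E / math.sqrt(var_E)  # Wald-type statistics
z_C = C / math.sqrt(var_C)
print(z_E, z_C)
```

The same pattern extends to each detailed component once its gradient is available; only the gradient vectors change, while the quadratic forms with var(b_{B}) and var(b_{W}) stay the same.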
It would also be possible to obtain a bootstrapped distribution of the components by applying a repeated modeling approach. Alternatively, a Bayesian approach can be used to obtain the posterior distribution of each component using a Markov Chain Monte Carlo (MCMC) method as outlined by Radchenko and Yun (2003). An alternative to bootstrapping or a full Bayesian approach is to simulate the distributions of each component by drawing M parameter vectors for each group, carrying out the decomposition on the simulated parameter vectors, and obtaining means and variances of the resulting distributions of the decomposition components. Specifically, let
 b_{j}^{(m)}, \qquad m = 1, \ldots, M \qquad (26)
denote the mth simulated parameter vector from the jth group, which is assumed to follow a multivariate normal distribution centered around the MLEs, with variance/covariance Σ_{b}. With no loss of generality, Σ_{b} could be drawn from an inverse-Wishart distribution to allow for sampling variation in the covariances. Under this approach, the decomposition is carried out M times, resulting in a posterior predictive distribution for each quantity in the decomposition (e.g., see Lynch and Western 2004). Statistical inference can be carried out on the quantities from the resulting distributions.
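The simulation approach can be sketched with the standard library alone. The example below is illustrative: it draws M coefficient vectors per group from a multivariate normal distribution centered on hypothetical point estimates (using a hand-written Cholesky factor), repeats the aggregate decomposition for each draw, and summarizes the resulting distributions. The data, covariance matrices, seed, and M = 500 are assumptions, and the covariance matrices are held fixed, not drawn from an inverse-Wishart distribution.

```python
import math
import random
import statistics

def mean_rate(records, b):
    """Exposure-weighted average predicted rate."""
    num = sum(math.exp(sum(xk * bk for xk, bk in zip(x, b))) * dt for x, dt in records)
    return num / sum(dt for _, dt in records)

def cholesky(A):
    """Lower-triangular L with L L' = A for a symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(A[i][i] - s) if i == j else (A[i][j] - s) / L[j][j]
    return L

def draw_mvn(mean, chol, rng):
    """One draw from N(mean, chol chol')."""
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(chol[i][k] * z[k] for k in range(i + 1)) for i, m in enumerate(mean)]

# Made-up person-period records (x, dt), point estimates, and covariance matrices.
pp_B = [([1, 0, 0.6], 4.0), ([0, 1, 0.6], 1.5), ([1, 0, 0.2], 4.0), ([0, 1, 0.9], 2.0)]
pp_W = [([1, 0, 0.1], 4.0), ([0, 1, 0.1], 2.0), ([1, 0, 0.3], 4.0), ([0, 1, 0.0], 1.0)]
b_B, b_W = [-4.0, -2.5, 0.8], [-5.0, -3.5, 0.5]
Sigma_B = [[0.04, 0.01, 0.0], [0.01, 0.09, 0.0], [0.0, 0.0, 0.01]]
Sigma_W = [[0.05, 0.0, 0.0], [0.0, 0.08, 0.0], [0.0, 0.0, 0.02]]

rng = random.Random(2024)
L_B, L_W = cholesky(Sigma_B), cholesky(Sigma_W)
M = 500
E_draws, C_draws, totals = [], [], []
for _ in range(M):
    bB = draw_mvn(b_B, L_B, rng)
    bW = draw_mvn(b_W, L_W, rng)
    E_draws.append(mean_rate(pp_B, bB) - mean_rate(pp_W, bB))
    C_draws.append(mean_rate(pp_W, bB) - mean_rate(pp_W, bW))
    totals.append(mean_rate(pp_B, bB) - mean_rate(pp_W, bW))

print("E: mean", statistics.mean(E_draws), "sd", statistics.stdev(E_draws))
print("C: mean", statistics.mean(C_draws), "sd", statistics.stdev(C_draws))
```

Percentile intervals can be read directly from the sorted draws, which avoids deriving gradients at the cost of M repeated decompositions.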
6. EXAMPLE
Race/ethnic differences in the risk of out-of-wedlock childbearing are routinely examined using group-specific hazard rate models or models in which race/ethnicity is included as a risk factor. Although this approach yields insight into the relative importance of key predictors of nonmarital fertility for different race/ethnic groups, it cannot answer questions about the relative contributions of race differences in characteristics and effects to the absolute race/ethnic differences in rates. In particular, to what extent is the racial difference in rates attributable to compositional differences in predictors (such as what might be reflected by group differences in socioeconomic resources and family structure) and to differences in the effects of these predictors (i.e., the group differences in behavioral responses to these characteristics)?
We decompose the observed black-white difference in premarital birth rates into compositional and return-to-risk components. The decomposition is carried out at the aggregate and detailed levels, thus allowing an assessment of the contribution of each model predictor to the racial gap. For research on first nonmarital fertility transitions, this type of analysis provides a way to assess the contributions of socioeconomic background and family structure, whose effects and distributions differ by race.
Data from the 1979 National Longitudinal Survey of Youth (NLSY79) are used to model first nonmarital fertility transitions (i.e., first premarital birth) for blacks and whites using proportional hazards models (Center for Human Resource Research 1979). We adopt a parsimonious model specification using covariates that have been widely used in past research including (1) family background characteristics (mother's education, adjusted family income,^{7} and number of older siblings) and (2) family structure characteristics (mother's age at respondent's birth, proportion of years living in single mother family, and number of family changes up to the time of the event or before age 18, whichever occurs first). The latter two variables are computed using the 18-year living arrangement histories in the NLSY (e.g., see Wu and Thomson 2000; Powers 2005).
The estimated black-white difference in the crude rates of first premarital birth is 0.02048 (r_{B} − r_{W} = 0.02530 − 0.00482 = 0.02048). To facilitate the presentation of results, we express this difference as 20.48 births (per year of age) per 1000 women. Table 1 presents covariate means and model estimates for each group as well as the crude rates per 1000 and race differences in rates. Table 2 provides the detailed decomposition obtained by averaging the results of separate decompositions with interchanged reference and comparison groups.^{8} The contributions have been multiplied by 1000 to reflect increases or decreases in the gap in terms of numbers of births per year of age per 1000 women. Under the current model, compositional differences between blacks and whites (i.e., differences in levels of resources and family structure) contribute 5.16 births per 1000 (25.2%) to the overall gap, whereas black-white differences in covariate effects (i.e., the returns-to-risk of these characteristics) contribute 15.32 births per 1000 (74.8%) to the estimated difference.
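The arithmetic in the paragraph above can be verified directly from the reported figures. The short check below uses only numbers stated in the text; the variable names are chosen for this illustration.

```python
# Reported crude rates of first premarital birth (per year of age).
r_B, r_W = 0.02530, 0.00482

gap = r_B - r_W              # difference in crude rates
gap_per_1000 = 1000 * gap    # births per year of age per 1000 women

# Reported aggregate components (per 1000).
E, C = 5.16, 15.32
share_E = 100 * E / (E + C)  # percentage due to characteristics
share_C = 100 * C / (E + C)  # percentage due to coefficients

print(round(gap, 5), round(gap_per_1000, 2), round(share_E, 1), round(share_C, 1))
```

The two components sum to the reported gap of 20.48 per 1000, and the shares agree with the 25.2% and 74.8% quoted in the text.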
Table 1. Means, Effects (Hazard Ratios and Baseline Hazards), and Event Percentages

| Independent Variables | Blacks: Means | Blacks: e^{b} | Blacks: Z* | Whites: Means | Whites: e^{b} | Whites: Z* |
|---|---|---|---|---|---|---|
| Percentage of years in single mother family | 0.24 | 1.117 | 1.02 | 0.06 | 2.166 | 2.41 |
| Number of family changes | 0.62 | 1.162 | 3.82 | 0.49 | 1.336 | 5.87 |
| Mother's schooling | 10.76 | 0.937 | −4.38 | 11.87 | 0.874 | −5.69 |
| Adjusted family income × 10,000 | 0.55 | 0.570 | −5.15 | 1.00 | 0.658 | −3.09 |
| Number of older siblings | 2.72 | 1.053 | 3.01 | 1.90 | 1.182 | 4.61 |
| Mother's age at R's birth | 24.91 | 0.971 | −4.30 | 25.48 | 0.949 | −4.09 |
| Baseline Hazard Age Intervals | Blacks: Percentage of Events | Blacks: e^{b} | Blacks: Z* | Whites: Percentage of Events | Whites: e^{b} | Whites: Z* |
| [12, 16) | 5.60 | 0.014 | −16.99 | 0.74 | 0.006 | −11.39 |
| [16, 18) | 15.03 | 0.367 | −4.24 | 2.45 | 0.176 | −4.30 |
| [18, 20) | 14.00 | 0.455 | −3.30 | 2.58 | 0.244 | −3.44 |
| [20, 22) | 9.87 | 0.473 | −3.06 | 2.06 | 0.276 | −3.07 |
| [22, 24) | 5.16 | 0.348 | −4.06 | 0.87 | 0.165 | −3.97 |
| [24+) | 6.26 | 0.194 | −6.42 | 2.49 | 0.245 | −3.39 |
| Event percentage | 55.93 | | | 11.19 | | |
| Crude rates × 1000 | 25.30 | | | 4.82 | | |
| Black-white difference in rates = 20.48 | | | | | | |
| N | 1,357 | | | 2,287 | | |

Table 2. Decomposition into Characteristics (E) and Coefficients (C) Components

| Independent Variables | E (×1000) | 95% CI Lower | 95% CI Upper | Percentage of Total | C (×1000) | 95% CI Lower | 95% CI Upper | Percentage of Total |
|---|---|---|---|---|---|---|---|---|
| Percentage of years in single mother family | 0.89 | −0.77 | 2.54 | 4.33 | −1.11 | −1.82 | −0.39 | −5.42 |
| Number of family changes | 0.46 | 0.39 | 0.52 | 2.23 | −0.81 | −1.24 | −0.37 | −3.95 |
| Mother's schooling | 1.68 | −0.71 | 4.07 | 8.19 | 8.11 | 5.18 | 11.11 | 39.60 |
| Adjusted family income × 10,000 | 4.37 | 2.26 | 6.48 | 21.32 | −1.09 | −2.81 | 0.63 | −5.33 |
| Number of older siblings | 1.19 | 0.28 | 2.11 | 5.83 | −2.82 | −4.17 | −1.46 | −13.74 |
| Mother's age at R's birth | 0.36 | −1.16 | 1.89 | 1.77 | 6.00 | 3.60 | 8.39 | 29.28 |
| Baseline Hazard Age Intervals | | | | | | | | |
| [12, 16) | −5.02 | −6.25 | −3.79 | −24.51 | 2.75 | 1.36 | 4.14 | 13.43 |
| [16, 18) | −0.44 | −0.83 | −0.04 | −2.13 | 1.97 | 0.79 | 3.14 | 9.61 |
| [18, 20) | 0.38 | −0.30 | 1.05 | 1.83 | 1.19 | 0.26 | 2.11 | 5.80 |
| [20, 22) | 0.36 | −0.34 | 1.06 | 1.76 | 0.66 | 0.02 | 1.30 | 3.22 |
| [22, 24) | 0.49 | −0.02 | 1.00 | 2.39 | 0.60 | 0.14 | 1.06 | 2.93 |
| [24+) | 0.45 | 0.09 | 0.81 | 2.18 | −0.13 | −0.44 | 0.18 | −0.64 |
| Overall contributions | | 3.43 | 6.89 | 25.21 | | 12.88 | 17.77 | 74.79 |

We first discuss the contributions of the substantive predictors to the overall premarital birth rate gap. We discuss the baseline hazard contribution later. Table 2 shows the detailed decomposition for the family background and family structure variables. A positive characteristics effect, E_{k}, indicates the amount by which the black-white gap would decrease if the group difference in variable k were to disappear. Based on the results from the proportional hazards models (Table 1), each change in family structure (or family transition) increases the risk of a first premarital birth by 16% for blacks and 34% for whites. However, whites experience fewer of these transitions on average than blacks, with means of 0.49 and 0.62 transitions, respectively. The results in Table 2 show that with respect to the (white) reference group, this compositional disadvantage for black women contributes 0.46 births per 1000, or about 2.2%, to the overall difference. Turning to the income effect, we see from Table 1 that a $10,000 increase in adjusted family income is associated with a 34% and 43% decrease in the risk of premarital birth for whites and blacks, respectively. Despite similar returns to income, average income in black families in the NLSY is 55% that of white families. From Table 2 we see that the difference in family income by race accounts for 4.37 births per 1000 women, which comprises over 21% of the overall racial difference in rates. Among the compositional factors considered here, making family incomes and number of older siblings in the comparison population (blacks) equal to those of the reference population (whites) would produce the largest reductions in the racial gap in the premarital birth rate.
A similar interpretation applies to the effects component, C_{k}. A negative coefficient indicates the expected increase in the black-white gap if blacks experienced the same returns-to-risk as whites. For example, if we consider the “number of older siblings” effects reported in Table 1, each additional older sibling is expected to increase a woman's risk of a premarital birth by 18.2% and 5.3% for white and black women, respectively. From Table 2, we find that the overall black-white gap would be expected to increase by 2.82 births per 1000 (13.7%) if black women were penalized by the number of older siblings to the same extent as white women. Similarly, a positive C coefficient reflects the expected decrease in the black-white gap due to equalizing an effect to the white level. For example, whites and blacks experience different returns to maternal education. Based on the results from the proportional hazards model in Table 1, each additional year of mother's schooling reduces the risk of premarital birth by 12.6% for whites and 6.3% for blacks. The decomposition results in Table 2 show that if blacks benefitted from higher levels of maternal education to the same degree as whites, then we would expect the black-white gap in the premarital birth rate to decrease by 8.11 births per 1000, or 39.6% of the overall gap. Differences in returns to maternal education, as well as differences in the effects of maternal age at respondent's birth, are the largest contributors to the overall gap.
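The percentage changes in risk quoted in the two preceding paragraphs follow from the hazard ratios in Table 1 through the usual conversion 100 × (e^{b} − 1). The sketch below applies that conversion to a few of the reported ratios; the function name and dictionary layout are illustrative only.

```python
def percent_change(hazard_ratio):
    """Percentage change in the hazard implied by a hazard ratio e^b."""
    return (hazard_ratio - 1.0) * 100.0


# Hazard ratios copied from Table 1 as (blacks, whites).
hazard_ratios = {
    "Number of family changes": (1.162, 1.336),
    "Mother's schooling": (0.937, 0.874),
    "Adjusted family income x 10,000": (0.570, 0.658),
    "Number of older siblings": (1.053, 1.182),
}

for name, (hr_black, hr_white) in hazard_ratios.items():
    print(
        f"{name}: blacks {percent_change(hr_black):+.1f}%, "
        f"whites {percent_change(hr_white):+.1f}%"
    )
```

A ratio above 1 yields a positive change (higher risk), and a ratio below 1 yields a negative change (lower risk).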
If we were to consider a hypothetical policy designed to reduce the black-white gap in the premarital birth rate, then equalizing socioeconomic resources across groups would lead to a larger decrease in the compositional portion of the gap than would making groups more similar in terms of family structure (number of family transitions, proportion of years spent in a single mother family, and number of older siblings). However, a greater share of the total differential can be attributed to differences in the effects of maternal education and mother's age at respondent's birth, so equalizing these effects across groups would yield the greatest reduction in the black-white premarital birth rate gap. It is probably safe to say that changing behavioral responses presents a more challenging task from a policy perspective than equalizing socioeconomic resources across groups.
6.1. Baseline Hazard Components
Compositional components involving the dummy variables for the age intervals that define the baseline hazard play the same role as the constant term in a standard multivariate decomposition. The piecewise constant hazard model effectively partitions the constant term into several pieces, with individuals differing on the number of pieces they contribute. In standard models, the mean value of the constant is always 1 and the difference in means across groups is always 0. For the decomposition of the piecewise constant hazard model, the characteristics effects associated with the pieces of the baseline hazard reflect race differences in the distribution of exposure, which in turn is a function of race differences in the age distribution of events and censoring. The fact that the characteristics effects of the baseline hazard reported in the first panel of Table 2 are at first negative and then positive reflects the fact that the age distribution of events is centered at a younger age for black women and at an older age for white women.
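To make the link between exposure and the age-interval dummies concrete, the following minimal sketch splits episodes at the age cutpoints used in Table 1 and tabulates the share of total exposure falling in each interval. The episodes are invented toy data, the function names are hypothetical, and an event occurring exactly at a cutpoint is assigned to the earlier interval by assumption; this is not the authors' code.

```python
# Lower bounds of the age intervals in Table 1; the last interval is open-ended.
CUTPOINTS = [12, 16, 18, 20, 22, 24]


def split_episode(exit_age, event, cutpoints=CUTPOINTS):
    """Split one episode (age 12 to exit_age) into person-period records."""
    records = []
    for i, start in enumerate(cutpoints):
        if exit_age <= start:
            break  # no exposure in this or any later interval
        end = cutpoints[i + 1] if i + 1 < len(cutpoints) else float("inf")
        stop = min(exit_age, end)
        is_last = exit_age <= end  # the episode ends inside this interval
        records.append({
            "interval": i,
            "exposure": stop - start,
            "event": int(event and is_last),
        })
    return records


def exposure_shares(episodes):
    """Share of total exposure time contributed to each age interval."""
    totals = [0.0] * len(CUTPOINTS)
    for exit_age, event in episodes:
        for rec in split_episode(exit_age, event):
            totals[rec["interval"]] += rec["exposure"]
    grand_total = sum(totals)
    return [t / grand_total for t in totals]


# Toy episodes as (exit age, event indicator); purely hypothetical.
early_events = [(15.0, True), (17.0, True), (19.5, True), (26.0, False)]
late_events = [(21.0, True), (27.0, False), (28.0, False), (25.0, False)]

print(split_episode(17.5, True))
print([round(s, 3) for s in exposure_shares(early_events)])
print([round(s, 3) for s in exposure_shares(late_events)])
```

In the toy data, the group with earlier events concentrates more of its exposure in the youngest intervals, which is the kind of distributional difference the characteristics effects of the baseline hazard pick up.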
The coefficient effects for the baseline hazard are informative about the contribution of racial differences in the age-specific baseline hazard rates to the racial gap in premarital birth rates. Taken together, group differences in the logged baseline hazards (i.e., the coefficients pertaining to the age interval of the event from the model) account for about 7 births per 1000, or 37%, of the racial gap (see Table 2). This is the expected reduction in the gap if blacks were to experience the same age-specific baseline rates as whites. We find that the largest contributions to the differential come from differences in the baseline hazards pertaining to the first three age intervals, which reflect the race differences in the underlying rates for teenagers.
6.2. Nonproportional Hazards
As mentioned earlier, it is possible to incorporate nonproportional effects via interactions with the age-interval dummy variables. For example, we could introduce a D_{l} × z interaction into equation (9). Adding these types of interactions presents no additional difficulty in the decomposition procedure per se. However, the characteristics effects for an interaction term will reflect differences in exposure in age interval l in addition to differences in characteristics for those at risk in age interval l. Thus the characteristics effects associated with age-interval interactions are somewhat ambiguous. However, the coefficient effects have a straightforward interpretation.
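The ambiguity can be illustrated with a small sketch of how D_{l} × z columns might be built for person-period records. The record layout, the function names, and the use of unweighted means over records are assumptions made for illustration; the toy values are invented.

```python
N_INTERVALS = 6  # number of age intervals defining the baseline hazard


def design_row(interval, z, groups):
    """Age-interval dummies followed by one z interaction column per group.

    interval: index of the age interval for this person-period record
    z:        covariate value (for example, family income)
    groups:   list of sets of interval indices; each set gets its own z effect
    """
    dummies = [1.0 if l == interval else 0.0 for l in range(N_INTERVALS)]
    interactions = [z if interval in g else 0.0 for g in groups]
    return dummies + interactions


def column_mean(rows, col):
    """Unweighted mean of one design-matrix column over all records."""
    return sum(r[col] for r in rows) / len(rows)


# One z effect for the first five intervals and one for the last interval.
groups = [{0, 1, 2, 3, 4}, {5}]

# Toy person-period records as (interval index, z); purely hypothetical.
records = [(0, 0.4), (0, 0.8), (1, 0.6), (5, 1.2)]
rows = [design_row(interval, z, groups) for interval, z in records]

# The mean of an interaction column factors into two pieces:
# the share of records at risk in the group times the mean z among them.
at_risk = [z for interval, z in records if interval in groups[0]]
share_at_risk = len(at_risk) / len(records)
mean_z_at_risk = sum(at_risk) / len(at_risk)

print(column_mean(rows, N_INTERVALS))
print(share_at_risk * mean_z_at_risk)
```

Because the column mean is a product of an exposure share and a conditional mean, a group difference in that mean cannot be attributed to either piece alone.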
In these data we find evidence of an age-varying effect of family income in the sample of non-Hispanic white women, with different effects on the risk for those aged 24 and younger and for those older than 24. We fit a nonproportional effects model to both groups that includes one family income effect on the risk in the 12–24 age interval and one income effect on the risk beyond age 24. We refer to this as Model 2. Table 3 provides the relevant income effects for both groups as well as the decomposition results from Model 2 and from the original model (Model 1). We find that for white women, a $10,000 increase in family income yields a 62% reduction in the risk of a premarital birth in the 12–24 age interval, whereas the same increase in family income raises the risk of a premarital birth by 63% at older ages. For black women, increased income has no significant effect on the risk of premarital births at older ages.
Table 3. Proportional and Nonproportional Effects of Family Income and Decomposition Components

| | Blacks e^{b} | Whites e^{b} | E | Percentage of Total | C | Percentage of Total |
|---|---|---|---|---|---|---|
| Model 1: Proportional effect of family income | | | | | | |
| Family income | 0.570* | 0.658* | 4.37* | 21.32 | −1.09 | −5.33 |
| Model 2: Nonproportional effect of family income | | | | | | |
| Family income × age [12, 24) | 0.553* | 0.385* | 4.78* | 21.25 | 2.51* | 12.23 |
| Family income × age [24+) | 0.699 | 1.629* | 0.12 | 0.45 | −0.40* | −1.95 |

The decomposition results in Table 3 show that compositional differences in income levels for those at risk of events in the 12–24 age interval account for 23% of the total racial gap according to Model 2. This is similar to the contribution of 21% in Model 1, which pertains to all ages. Differences in the effects of income at younger ages account for 12% of the total gap in Model 2. Therefore, equalizing the returns to income for younger women would be expected to reduce the black-white gap by 2.5 births per 1000 as a result of the larger race difference in age 12–24 income effects in Model 2. This is in sharp contrast to Model 1, where the effects of income are similar by race and where the racial difference in income effects comprises a negligible portion of the black-white difference in the premarital birth rate. Race differences in effects of income at older ages account for a small portion of the total gap, and equalizing income effects for older women would be expected to increase the gap in the premarital birth rate by less than 1 birth per 1000 women.
Under Model 2, differences in characteristics account for 27.4% of the racial gap in premarital birth rates while differences in effects account for 72.6%. These results are similar to those from Model 1 (25.2% and 74.8%, respectively). While the overall contributions are similar, it should be noted that the income × age interaction in Model 2 necessarily impacts the baseline hazard, thus blurring the distinction between the baseline hazard and the structural part of the model to some extent. We find that race differences in the baseline hazard account for 13% of the overall gap in Model 2 compared with 37% of the gap in Model 1.