The Scandinavian Fantasy: The Sources of Intergenerational Mobility in Denmark and the US

Abstract This paper examines the sources of differences in social mobility between the US and Denmark. Measured by income mobility, Denmark is a more mobile society, but not when measured by educational mobility. There are pronounced non‐linearities in income and educational mobility in both countries. Greater Danish income mobility is largely a consequence of redistributional tax, transfer, and wage compression policies. While Danish social policies for children produce more favorable cognitive test scores for disadvantaged children, they do not translate into more favorable educational outcomes, partly because of disincentives to acquire education arising from the redistributional policies that increase income mobility.


I. Introduction
Policy analysts around the world point to Scandinavia as a model for reducing inequality and promoting intergenerational mobility (see, e.g., Baily, 2016). By conventional measures, social mobility by income is much higher there than in the US.
In this paper, we use rich Danish data to explore the sources of these differences in social mobility. By all accounts, Denmark is a prototypical Scandinavian welfare state. Lessons learned from Danish data apply to Scandinavia more generally.
Our investigation reveals some surprises and apparent contradictions. The literature on Danish social mobility by income is surprisingly sparse and uses only a limited number of measures of income. One contribution of this paper is to demonstrate that the choice of the measure of income used matters greatly in determining the relative social mobility of the US and Denmark.
The standard measure of intergenerational mobility is based on the intergenerational elasticity (IGE): a regression coefficient showing the percentage change in a child's income associated with a percentage change in parental income. We show that estimated IGEs depend greatly on the measure of income used and that estimated IGEs vary with the level of income. US social mobility is low (absolutely and compared to Denmark) for children from high-income families.
Popular discussions of the benefits of the Scandinavian welfare state point to its generous support of childcare and education relative to the US as major determinants of its greater social mobility. In Denmark, college tuition is free, there is ready access to childcare, pregnancy-leave policy is generous, and there is virtually universal free pre-kindergarten. Yet, despite these stark policy differences, the influence of family background on educational attainment is surprisingly similar in the two countries. Levels of intergenerational educational mobility are about the same. At higher levels of family income, educational mobility is lower in both countries.
In both countries, cognitive and non-cognitive skills acquired by age 15 are more important for predicting educational attainment than parental income. The more child-generous Danish welfare state produces a much more favorable distribution of cognitive skills for disadvantaged Danish children compared to their counterparts in the US. The similarity of the influence of family background on educational attainment in the two countries, despite the more favorable distribution of test scores for Danish disadvantaged children, arises in part from the compression of the wage scale and the generous levels of social benefits that discourage Danish children from pursuing further schooling. In addition, the generosity of the Danish welfare state does not prevent sorting of children into neighborhoods and schools on the basis of family background, which appears to benefit the more advantaged. Scandinavia invests heavily in child development and boosts the test scores of the disadvantaged. It then undoes these beneficial effects by providing weak labor market incentives. Better incentives to acquire skills would boost Danish educational mobility. Stated differently, the greater incentives to acquire education in the US labor market tend to offset its less favorable investments in the cognitive skills of disadvantaged children. In addition, while the Danish welfare state promotes equality of opportunity compared to the US, many barriers remain. There are large skill gaps between the children of the advantaged and the children of the disadvantaged, during early and late childhood. Residential sorting across neighborhoods and schools is strong.
This paper proceeds in the following way. In Section II, we analyze income mobility in Denmark and the US. We examine the sensitivity of estimated income IGEs to alternative measures of income. We examine the sources of differences in income mobility. We also report non-parametric estimates of income mobility.
In Section III, we examine the relationship between schooling attainment, measures of family financial resources, cognitive and non-cognitive skills of children at age 15, family background (education and home environment), and measures of schooling quality. We report surprisingly similar effects of family influence on educational attainment in both societies. We show a link between welfare benefits and educational attainment in Denmark. We discuss the role of neighborhood sorting on child educational attainment. In Section IV, we qualify our analysis. We conclude in Section V.

II. Income Mobility
In this section, we explore alternative measures of intergenerational income mobility. Different measures of income convey very different impressions of social mobility. We show how the levels of transfers, the mapping of education to income, the levels and progressivity of taxation, and income inequality differ between the US and Denmark. All four factors affect estimates of income mobility.
We report estimates of non-linear (NL) IGEs for both countries. We find different patterns depending on which income measure we consider.
Differences favoring Denmark appear at the lowest and the highest levels of income.
Data
US Data. We use two US data sources. We use Panel Study of Income Dynamics (PSID) data for our main analysis of intergenerational income mobility. We measure parental income using a nine-year average from the child's 7th to 15th year. 1 Child income is measured at ages ranging from 34-41 for the 1972 birth cohort down to 30-35 for the 1978 birth cohort. In our main analysis, we only consider individuals with positive incomes. See Section F in the Online Appendix (https://cehd.uchicago.edu/scandinavian-appendix) for more details.
As the sample size for the PSID data is small (relative to the Danish data), we use the March Current Population Survey (CPS;1968 from the Integrated Public Use Microdata Series (IPUMS) 2 when we analyze US income distributions. The sample consists of civilian, non-institutionalized citizens. We use parents in 1987 and individuals aged 36-38 in 2011.
Danish Data. 3 For Denmark, we use the full population register data on the entire cohorts born in 1973-1975. We discard individuals who migrate (or whose parents migrate), individuals for whom we have no identification of the father or mother (around 2 percent), and individuals with negative incomes (averaged over the period where we measure income). Parental income is measured as a nine-year average from when the child is 7-15 years of age, and the child's income is measured at ages 35-37, 36-38, and 37-39 for the 1975, 1974, and 1973 cohorts, respectively. The full sample size is 166,359, and once we restrict to positive incomes the sample is reduced to 149,190 individual parent-child matches. 4
In the Online Appendix, Table A23 provides the definitions of the various income measures we consider, Table A1 summarizes income levels for the US and Denmark by different quantiles and income measures, and Figure A1 depicts the distributions. Table A1 and Figure A1 in the Online Appendix show that incomes in Denmark are more compressed than incomes in the US. There is a large low-income group in the US that virtually does not exist in Denmark (Forslund and Krueger, 1997; Aaberge et al., 2002; Corak, 2013). 5 In the next section, we show that cross-sectional differences in income distributions between Denmark and the US are an important source of higher income mobility in Denmark than in the US.

Linear Intergenerational Income Elasticities
There is a large literature investigating the association between parents' and children's income. 6 The modal statistic used to study income mobility is the IGE of income, $\beta^{IGE}$:

$$\ln(Y^{C}) = \alpha + \beta^{IGE} \ln(Y^{P}). \qquad (1)$$

The father/son or parent/child IGE is generally found to be much higher in the US than in Denmark. Estimates generally lie between 0.3 and 0.5 in the US and around 0.1 to 0.2 in Denmark (Björklund and Jäntti, 2011; Blanden, 2013; Solon, 2002). There is a similar range for rank-rank associations. Boserup et al. (2013) and Chetty et al. (2014) estimate this to be 0.18 in Denmark and 0.34 in the US, respectively. Based on these estimates of the income IGE, Scandinavia is portrayed as a land of opportunity. 7
Cross-country differences in estimated IGEs of income can arise for a multitude of reasons that we attempt to capture using different income measures. One measure proxies transmission of total individual income potential with wage earnings, capital income, and profits. Another proxies transmission of total income including public transfers (but not the impact of in-kind transfers). A third measure introduces the effects of the progressivity of taxation on income mobility. A fourth measure, wage earnings, proxies intergenerational transmission of earnings potential rewarded in the labor market; differences arise, in part, from differences in returns to education.
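As a minimal sketch of how the elasticity in equation (1) is obtained, the following Python snippet regresses log child income on log parent income by ordinary least squares. The data are simulated and every number in it (sample size, means, dispersions, the "true" elasticity of 0.3) is a hypothetical choice for illustration, not a value from the PSID or the Danish registers. The function name `estimate_ige` is likewise an assumption. Observations with non-positive income are dropped, mirroring the restriction to positive incomes described in the Data section.

```python
import numpy as np

def estimate_ige(child_income, parent_income):
    """Return (alpha, beta) from OLS of ln(child income) on ln(parent income).

    Observations with non-positive income in either generation are dropped,
    because the logarithm is undefined there.
    """
    child = np.asarray(child_income, dtype=float)
    parent = np.asarray(parent_income, dtype=float)
    keep = (child > 0) & (parent > 0)
    x = np.log(parent[keep])
    y = np.log(child[keep])
    beta, alpha = np.polyfit(x, y, 1)  # polyfit returns [slope, intercept]
    return alpha, beta

# Simulated (hypothetical) data: log incomes with a known elasticity.
rng = np.random.default_rng(0)
n = 50_000
true_beta = 0.3
log_parent = rng.normal(10.5, 0.6, n)
log_child = 7.0 + true_beta * log_parent + rng.normal(0.0, 0.5, n)

alpha_hat, beta_hat = estimate_ige(np.exp(log_child), np.exp(log_parent))
print(f"estimated IGE: {beta_hat:.3f}")
```

Swapping in a different income concept (for example, income with or without transfers) only changes the two input arrays; the estimator itself is unchanged, which is why the choice of income measure can move the estimate so much.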
A further source of differences in estimated IGEs arises from differences in levels and trends in cross-sectional income inequality. 8 We put this issue aside for now, and investigate it in the following subsection.

5 See Section B in the Online Appendix. Freeman et al. (2010) discuss a broad range of likely causes and consequences of wage compression for the Swedish welfare state. See also Aaberge et al. (2000), Pedersen and Smith (2000), and Tranaes (2006), who provide similar evidence from Denmark.
6 See, for example, Blanden (2013), Corak (2006), and Solon (2002) for reviews of the literature.
7 See Table A1 in the Appendix to this paper for a summary of previous IGE estimates for Denmark (comprehensive) and the US (selected).
8 The previous literature investigating social mobility has long addressed some of the issues. One early example is Solon (1992).

Table 1 shows estimated intergenerational income elasticities for similar income measures in Denmark and the US. The odd-numbered columns report estimates for Denmark. The even-numbered columns report the corresponding estimates for the US.
Column 1 shows that the estimated IGE based on gross income, excluding public transfers, is 0.352 for Denmark. This estimate is much higher than estimates reported in the literature, which use wage earnings, earnings, or income including public transfers. The corresponding estimate for the US is 0.312. The difference between the two estimates is not statistically significant. The third and fourth columns show that the estimated IGE for Denmark drops by around 20 percent to 0.271 when public transfers are included in the measure of income. This decrease illustrates the important role of redistribution in Denmark. For the US, the corresponding estimate jumps to 0.446, bringing us close to the estimates reported in Solon (1992) and Chetty et al. (2014) (see Table A1 in the Appendix). Comparing the estimate in column 3 in Table 1 to that of column 9 in the same table, we see that adding taxation reduces the Danish IGE estimate further. Unfortunately, we do not have the data required to estimate the corresponding IGE for the US. 9 When we focus on wage earnings alone in columns 5 and 6, the estimated IGE for Denmark drops dramatically to 0.083, while the corresponding US estimate is 0.289. Finally, adding public transfers to wage earnings results in an even larger gap between the two countries. For wage earnings plus public transfers, the Danish IGE is 0.063 while the US estimate is 0.419. 10
Our estimates for Denmark do not contradict the findings of the previous literature. Rather, they enrich our understanding of them. Measured by income potential (columns 1 and 2), we find that intergenerational mobility in Denmark is not significantly different from intergenerational mobility in the US. When we account for public transfers, estimates for the two countries diverge. Income mobility by this measure is substantially higher in Denmark than in the US. When we consider wage earnings alone or wage earnings inclusive of public transfers, we obtain estimates for Denmark reported in the previous literature with estimated IGEs around 0.1.
One should interpret cross-country differences with great caution. There is no single best measure of the IGE. We do not claim that we have shown that levels of income mobility in the US and Denmark are alike or different. The conclusion from this analysis is that by accounting for transfers, wage compression, returns to education, and progressive income taxation, we can explain a substantial portion of the Denmark-US difference in associations between children's and parents' income.

9 Table A3 in the Online Appendix shows IGE estimates while controlling for a child's highest completed grade. The table shows that controlling for own education reduces the IGE estimates by approximately one-third relative to the unadjusted estimates presented in Table 1. Yet, the qualitative differences between income measures and countries remain unchanged. IGE estimates are similar for Denmark and the US for gross income excluding transfers, but diverge for other income measures. In addition, it is evident from the table that the coefficients for a child's highest completed grade on income are larger in the US than in Denmark. Furthermore, the coefficients for a child's highest completed grade for Denmark decrease substantially when we consider income measures including transfers or post-tax income, whereas they are unaffected by the inclusion of transfers for the US.
10 In Table A4 in the Online Appendix, we show the corresponding IGE estimates while controlling for parents' education. The estimated elasticities decrease by 25-30 percent, but we find no sign of any patterns or cross-country differences that are not present for the unadjusted IGE estimates in Table 1.

Notes to Table 1: The table reports coefficients and standard errors from regressions of child log income on parent log income for Denmark and the US. For Denmark, we use full population register data for children born in the period 1973-1975, and for the US we use PSID data for children born in the period 1972-1978. For Denmark, parental income is measured as a nine-year average from the child's 7th to 15th year, and the child's income is measured at ages 35-37, 36-38, and 37-39 for the 1975, 1974, and 1973 cohorts, respectively. For the US, parental income is measured as a nine-year average from the child's 7th to 15th year, and the child's income is measured as last-year income at ages 34-41, 33-40, 32-39, 31-38, 30-37, 30-36, and 30-35 for the 1972, 1973, 1974, 1975, 1976, 1977, and 1978 cohorts, respectively.
The columns are based on the following:
column 1, for Denmark, all taxable income including wage earnings, profits from own business, capital income, and foreign income excluding all public transfers (both taxable and non-taxable);
column 2, for the US, all taxable income including earnings (payroll income from all sources, farm income, and the labor portion of business income), asset income (such as rent income, dividends, interest, income from trust and royalties, and asset income from business), and private transfers (such as income from alimony, child support, and help from relatives and others);
column 3, for Denmark, all taxable income including wage earnings, public transfers, profits from own business, capital income, and foreign income;
column 4, for the US, all taxable income including earnings, asset income, private transfers and public transfers (such as social security income, SSI, TANF, ETC, other welfare income, retirement, pension, unemployment, and workers compensation);
column 5, for Denmark, taxable wage earnings and fringes, labor portion of business income, and non-taxable earnings, severance pay, and stock-options;
column 6, for the US, payroll income from all sources (such as wages and salaries, bonus, overtime income, tips, commissions, professional practice, market gardening, additional job income, and other labor income), farm income, and labor portion of business income;
column 7, for Denmark, taxable wage earnings and fringes, labor portion of business income, and non-taxable earnings, severance pay, and stock-options, plus taxable and non-taxable public transfers (social assistance, unemployment benefits, labor market leave, sick leave assistance, labor market activation, child benefits, education grants, housing support, early retirement pension, disability pension, and retirement pension);
column 8, for the US, payroll income from all sources, farm income, labor portion of business income, and public transfers;
column 9, for Denmark, total gross income minus all final income taxes paid in given year (note we do not have information on individual net-of-tax income from the PSID).
In addition, several measurement problems discussed in the previous literature (see, e.g., Solon, 2004) might also affect estimated IGEs. Imputing zeros with an arbitrary value affects estimates. Censoring might also produce biased results, for example, by leaving out the long-term unemployed from the analysis. 11 Table A5 in the Online Appendix reports the estimates corresponding to Table 1 when imputing zero incomes with $1,000. The table shows that estimated IGEs change for income categories that include many zeros (gross income excluding transfers and wage earnings). 12 Nevertheless, the overall patterns from Table 1 remain unchanged for Denmark. For the US, however, the PSID data are much more sensitive to the inclusion of zero and non-reported incomes. In order to obviate the problems with zero income, analyses estimating relationships between children's and parents' ranks in their respective income distributions 13 have recently been used (see Dahl and DeLeire, 2008; Chetty et al., 2014). We do not report results for rank-rank estimates in the main text and we refer readers to the Online Appendix. 14,15

11 Additional measurement problems include life-cycle bias and measurement error from year-to-year variation in income. We attempt to avoid these potential biases by considering parental and child income measured as averages over several years (permanent income) and by measuring children's income when they are in their late 30s.
12 For Denmark, estimated IGEs increase the smaller the number used for the imputation. When we use $1,000, the estimated IGE for gross income excluding transfers is 0.49, and when we use $1 it increases to around 0.6.
13 ... where $R(\cdot)$ denotes children's and parents' ranks in their respective distributions. While $\beta^{RR}$ is scale-invariant in income, $\beta^{IGE}$ is not. The link between the two measures depends on the underlying distributions (see Trivedi and Zimmer, 2007).
14 Table A6 in the Online Appendix replicates this analysis for rank-rank regressions. The findings are qualitatively similar. For total gross income excluding transfers, the Danish estimates are close to the US levels reported by Chetty et al. (2014).
15 Rank-rank analyses do not solve the issues that the researcher faces when using log income. We refer the reader to Section C in the Online Appendix for a discussion. Section C in the Online Appendix presents further results on two of the additional issues often discussed in the previous literature on income mobility. For a recent review, see Black and Devereux (2011). The first issue is life-cycle bias (i.e., that associations between children's and parents' income will be understated if children's income is measured early in their working career, where yearly earnings do not reflect lifetime earnings). We show that the rank-rank slopes for Denmark do not stabilize until the child's income is measured during their late 30s; see Figure A42 in the Online Appendix and also see Nybom and Stuhler (2015) for similar evidence from Sweden. We also illustrate that measuring parental income earlier in the child's life reduces the rank-rank slopes. (In Denmark, measuring children's income during their early 20s actually results in negative coefficients.) The second issue is attenuation bias (measurement error bias) that stems from the noise arising from including too few years of income data (Solon, 1992). In the Danish data, this can be much larger than the levels reported in Chetty et al. (2014). When we measure parental income when the child is below 10 years of age and add income data from subsequent years to the analysis, the differences in rank-rank slopes based on one and five years of data, respectively, range from 12 to 32 percent, depending on which income measure is used (see Figure A43). However, when we use income measured during the child's late teens and add data from preceding years, the corresponding one- to five-year differences are around 0-3 percent, in accord with the analysis of Chetty et al. (2014).

The Role of Inequality in Shaping the IGE
The cross-country correlation between income mobility and income inequality has received a lot of attention in the past decade (Corak, 2006). Krueger (2012) calls this the "Great Gatsby curve". In this subsection, we examine the mechanical relationship between estimated IGE and changes in inequality across generations. It follows from the definition of the IGE that an increase in inequality from one generation to the next amplifies the estimate without affecting mobility measured by correlation coefficients. Hence, differences in inequality between generations and countries might generate differences in perceived income mobility. 16
Table 2 shows how differences in variances drive estimates. The table shows the regression coefficients from Table 1 together with the correlation and intergenerational ratio of standard deviations below each coefficient. The table shows that, although not statistically significantly different, the intergenerational correlation for gross income excluding transfers in the US is above its Danish counterpart. It is the ratio of standard deviations that drives the Danish IGE to levels above the US. When public transfers are included in gross income, the correlation and ratio increase in the US, while in Denmark the ratio decreases and the correlation is roughly unchanged. These results also emphasize that transfers are more progressive and constitute a larger fraction of income in Denmark compared to the US. Furthermore, the table shows that the large increase in the estimated IGE for the US when public transfers are included partly arises because transfers reduce inequality in parents' income while inequality in children's income is largely unaffected. 17 When we focus on wage earnings alone, the correlation in Denmark drops from a level of 0.214 to 0.081, whereas in the US the intergenerational correlation remains unchanged. 18
Table A5 in the Online Appendix presents a corresponding analysis imputing zero incomes with $1,000. The main difference for Denmark is that intergenerational correlations for gross income excluding transfers and wage earnings increase to 0.246 and 0.118, respectively, while the correlations for the remaining income measures remain largely unaffected. Hence, including individuals with zero incomes, there is a substantial reduction in the intergenerational correlation when we add transfers to gross income in Denmark.

16 In a similar vein, one might question how differential trends in educational inequality affect comparisons across countries with high rates of high school and college degrees in earlier generations, as in the US, and countries where high school and college degrees become modal only over the past 30-50 years, as in Denmark and Norway. This remains an open question.
From this analysis, we see that IGE estimates are sensitive not only to the income measures used, but also to inequality levels and changes. It is not meaningful to compare IGE estimates when arbitrarily large or small levels of inequality drive the estimates. In order to investigate this issue in greater depth, we conduct a further analysis showing the sensitivity of IGE estimates to adjustments for inequality.
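The mechanical link discussed in this subsection follows from the OLS identity that the slope equals the correlation times the ratio of standard deviations, beta = rho * sd(ln Y_C) / sd(ln Y_P). The Python sketch below verifies that identity on simulated data. The data-generating numbers and the function name `ige_decomposition` are hypothetical and are not taken from Table 2.

```python
import numpy as np

def ige_decomposition(log_child, log_parent):
    """Return (beta, rho, sd_ratio) with beta = rho * sd_ratio.

    beta     : OLS slope of log child income on log parent income
    rho      : intergenerational correlation of log incomes
    sd_ratio : sd(log child income) / sd(log parent income)
    """
    x = np.asarray(log_parent, dtype=float)
    y = np.asarray(log_child, dtype=float)
    beta = np.cov(x, y, ddof=1)[0, 1] / x.var(ddof=1)
    rho = np.corrcoef(x, y)[0, 1]
    sd_ratio = y.std(ddof=1) / x.std(ddof=1)
    return beta, rho, sd_ratio

# Simulated (hypothetical) log incomes for one country.
rng = np.random.default_rng(2)
n = 10_000
log_parent = rng.normal(10.5, 0.4, n)
log_child = 5.0 + 0.35 * log_parent + rng.normal(0.0, 0.6, n)

beta, rho, sd_ratio = ige_decomposition(log_child, log_parent)
print(f"IGE = {beta:.3f}, correlation = {rho:.3f}, sd ratio = {sd_ratio:.3f}")
print(f"correlation * sd ratio = {rho * sd_ratio:.3f}")
```

Two countries with the same correlation can therefore show different IGEs purely because dispersion grew more between generations in one of them.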
We present regressions where we transform the different income distributions for Denmark to the corresponding income distributions for the US, holding income ranks fixed. Then, we do the reverse and map the US distributions into the Danish distributions. In the upper panel of Table 3, we present IGE estimates where quantiles of the Danish distributions, for parents (reported in the rows) and for children (reported in the columns), are mapped into the equivalent income measures for the US for children born in the period 1973-1975 in 2011 with parents in the 1987 March CPS data. Figure A2(a) in the Online Appendix illustrates the transformation for wage earnings distributions. The child or parent with the nth total gross income rank in Denmark is assigned the total gross income level associated with the nth rank of total gross income for the US child or parent distribution. A similar transformation is used for total income excluding public transfers, total net-of-tax income, wage earnings, and wage earnings plus public transfers.

17 This is also shown in Figure A24 in the Online Appendix, where we use CPS data for the US and register data for Denmark to plot average wage earnings and wage earnings plus transfers in the two countries across different educational levels for the cohorts born in the period 1947-1978.
18 In Tables A8-A12 in the Online Appendix, we report the intergenerational correlations and standard deviations of all major income components for Denmark. The tables show that the increased ratio of standard deviations from wage earnings to gross income stems from capital income and profits from own businesses. The ratio of standard deviations changes drastically for gross income because the covariance between wage earnings and profits is negative for parents and zero for children, thus reducing the overall variance of parents' income relative to children's income.
Using this method, we illustrate what the Danish IGE would be for the different income measures, if Denmark had the same levels of inequality within generations as those found in the US.
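The rank-preserving transformation described above can be sketched as a quantile mapping: each observation keeps its rank in the source country but is assigned the income found at the same rank in the target country. The Python snippet below is only an illustration under stated assumptions. The function name `map_to_target_distribution`, the midpoint percentile convention, the use of linear interpolation between target quantiles, and the simulated log-normal incomes are all hypothetical choices, not the procedure documented in the Online Appendix.

```python
import numpy as np

def map_to_target_distribution(source, target):
    """Assign each source observation the target income at the same rank.

    Ranks in `source` are converted to percentiles (midpoint convention),
    and the corresponding quantiles of `target` are returned in the
    original order of `source`.
    """
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    ranks = np.argsort(np.argsort(source))      # 0 .. n-1, lowest income first
    percentiles = (ranks + 0.5) / len(source)   # strictly inside (0, 1)
    return np.quantile(target, percentiles)

# Simulated (hypothetical) incomes: a compressed and a dispersed distribution.
rng = np.random.default_rng(3)
dk_income = np.exp(rng.normal(10.6, 0.35, 5_000))   # stands in for Denmark
us_income = np.exp(rng.normal(10.6, 0.80, 3_000))   # stands in for the US

mapped = map_to_target_distribution(dk_income, us_income)
print("sd of log income, source:", round(float(np.log(dk_income).std()), 3))
print("sd of log income, mapped:", round(float(np.log(mapped).std()), 3))
```

Re-estimating equation (1) on the mapped incomes of children, parents, or both then gives counterfactual elasticities of the kind reported in Table 3.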
In the lower panel of Table 3, we do the opposite, which is illustrated in Figure A2. The columns and rows labeled "Baseline" for both parents and children show the actual IGE coefficients from Table 1. The first line in the upper panel shows that if Danes born between 1973 and 1975 had the same income distribution as the corresponding US age cohorts, the Danish IGE estimates for gross income excluding transfers, including transfers, and net-of-tax would increase by 50-100 percent, whereas it would be unchanged for wage earnings. In the next thought experiment, we examine the consequences of giving Danish parents the same income distribution as US parents. Estimated IGEs would decrease. Transforming children's and parents' income distribution reduces IGEs by 30-50 percent when we consider the gross income measures, and increases IGEs by roughly 50-100 percent when we consider wage earnings and wage earnings plus transfers.
When we perform the equivalent exercise for the US, we naturally reach the opposite conclusion. Changing the income distribution of children while holding parents' income distributions fixed results in large reductions in the IGE, whereas changing income distributions for parents while holding children's income distributions fixed results in large increases in the IGE. Finally, by transforming both generations' income distributions, the IGE by gross income excluding transfers increases, the IGE for income measures including transfers decreases, and the IGE for wage earnings is unchanged. Table 3 shows that IGEs in Denmark and the US are quite different when wage earnings and wage earnings plus transfers are used as measures of income. Estimated IGEs based on these two income measures are robust to the changes in inequality that we observe for both countries. For the remaining measures of income, the similarities between the IGEs in Denmark and the US are substantial and depend strongly on trends and levels in inequality.
The analyses presented in this section emphasize that levels of estimated income (im)mobility depend on the subjective evaluation of the reader. Not only do estimates vary by income measures, they are also affected by whether changing inequality is linked to mobility. For example, with a fixed correlation between children's and parents' income, doubling income inequality from one generation to the next clearly increases differences in income levels and the consumption possibilities between children from high-income and low-income families. Should the chosen measure of income mobility capture such change? Without specifying a social welfare function and a normative definition of fairness, this question does not have a clear answer.
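The thought experiment above (a fixed correlation with doubled inequality in the child generation) can be made concrete with a small simulation. In the Python sketch below, "doubling inequality" is interpreted as doubling the spread of log child income around its mean, which is an assumption made for illustration. Under that assumption the IGE doubles, while the rank-rank slope does not move because ranks are unchanged. All numbers and helper names (`slope`, `pct_rank`) are hypothetical.

```python
import numpy as np

def slope(x, y):
    """OLS slope of y on x."""
    return np.polyfit(x, y, 1)[0]

def pct_rank(values):
    """Percentile rank in [0, 1]; assumes no ties (continuous data)."""
    values = np.asarray(values, dtype=float)
    return np.argsort(np.argsort(values)) / (len(values) - 1)

# Simulated (hypothetical) log incomes.
rng = np.random.default_rng(1)
n = 20_000
log_parent = rng.normal(10.5, 0.5, n)
log_child = 6.0 + 0.4 * log_parent + rng.normal(0.0, 0.5, n)

# Double the dispersion of log child income around its mean.
# This is a monotone transformation, so every child keeps the same rank.
log_child_wide = log_child.mean() + 2.0 * (log_child - log_child.mean())

ige_base = slope(log_parent, log_child)
ige_wide = slope(log_parent, log_child_wide)
rr_base = slope(pct_rank(log_parent), pct_rank(log_child))
rr_wide = slope(pct_rank(log_parent), pct_rank(log_child_wide))

print(f"IGE:       {ige_base:.3f} -> {ige_wide:.3f}")
print(f"rank-rank: {rr_base:.3f} -> {rr_wide:.3f}")
```

Which of the two statistics is the more relevant summary depends on whether the widening of income gaps is itself regarded as part of (im)mobility, which is the normative question raised in the text.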

Non-Linear Intergenerational Income Elasticities
It is likely that any benefits from the Scandinavian welfare states accrue to the least advantaged. This is a feature that linear models of the IGE might fail to adequately capture. Thus, it is particularly interesting to analyze non-linearities in the IGEs. Wage compression and the high level of redistribution via taxes and transfers only add weight to the relevance of considering possible non-linearities.
Yet, few previous studies consider non-linearities. Bratsberg et al. (2007) report that the relationship between the logarithm of child and parent income is convex in Denmark (and in Scandinavia more generally) and concave in the US for measures of wage earnings. 19 They attribute this finding to higher mobility for individuals from low-income families in Denmark than in the US. We replicate these findings in Figure A3 in the Online Appendix. However, as previously emphasized, results differ according to which income measure is used. In Denmark, for example, wage earnings of children and parents display a convex relationship, while for total gross income excluding public transfers the relationship is linear, or perhaps even concave.
We account for non-linearities by estimating the NL-IGE with local linear regressions. It is feasible to estimate the NL-IGEs using absolute income, thereby obviating the problem that ln(0) does not exist. 21 However, estimating the NL-IGE using absolute income substantially reduces precision at high income levels because of the long right tail of the income distribution. In order to be able to compare estimates from $10,000 to $125,000 and not just from $30,000 to $60,000 of parental incomes, we therefore report estimates using log income here. The corresponding point estimates using absolute income are very similar to the results shown in the main text and are reported in Figures A5 and A6 in the Online Appendix.
Figures 1 and 2 show plots of NL-IGE estimates of log income, weighted with absolute income, without imputation for zero income for Denmark and the US. The vertical lines in the figures mark the 5th and 95th percentiles of the income distributions in the Danish and the US data. It should be noted that these estimates only allow us to infer local conclusions about mobility. A zero IGE estimate at a low level of income does not imply that going from rags to riches is likely. It only shows that a marginal movement up (or down) in income levels relative to parental income is just as likely as the status quo. The figures present estimates for the income ranges where the data allow us to make meaningful estimates (because there is very limited support for high incomes in the PSID).
Figure 1 shows NL-IGE estimates for gross income excluding public transfers and gross income including public transfers for Denmark and the US. For Denmark (Figures 1(a) and 1(b)), the elasticity goes from levels around 0.25 to almost 0.4 when parental income increases from $0 to $100,000.
Thereafter, the estimates slowly decline and reach a level of around 0.1-0.2 at the 99th percentile of parental income. The corresponding results for the US (in Figures 1(c) and 1(d)) show that elasticities at low income levels closely correspond to those in Denmark, although they are imprecisely estimated. In the US, elasticities increase monotonically with parental income. At the 95th percentiles of parental gross income excluding and including public transfers, US intergenerational income elasticities are well above 0.5. Figure 2 graphs NL-IGE estimates for wage earnings and wage earnings plus public transfers for Denmark (panels a and b) and the US (panels c and d), and net-of-tax total gross (disposable) income for Denmark.
[...] is an Epanechnikov kernel. It is important to distinguish between a kernel with absolute income, $K_{h_\lambda}(Y^P_0, Y^P_i)$, and a kernel with log income. The former assigns symmetric weight around $Y^P_0$, while the latter weights observations above $Y^P_0$ more heavily because of the logarithmic transformation. Finally, the imputation of zero incomes inflates estimates of NL-IGEs at the low to medium parental income range (the parental income ranges where zero incomes for children are most prevalent).
In order to obtain a more precise view of the cross-country differences in NL-IGEs, Figure 3 plots the differences between the US and the Danish elasticities across levels of parental income. From Figure 3(a), we see that income mobility in gross income excluding public transfers is roughly similar for family incomes up to $100,000. From this point onward, a gap emerges that, albeit imprecisely estimated, continues to increase. Income mobility by gross income excluding transfers is much lower in the US than in Denmark for top quartile family incomes, but not for families with low income.
When transfers are added to income, as shown in Figure 3(b), the elasticities in the US are persistently above the Danish elasticities with a widening gap at high incomes. When we only consider wage earnings in Figure 3(c), the Danish IGE is around 0.2 lower than the US IGE across all parental income levels. This result also dovetails nicely with our argument about the importance of wage compression in Denmark, as opposed to the increasing return to education in the US, being key mechanisms behind the observed income mobility differences. Finally, Figure 3(d) shows that for wage earnings plus transfers, intergenerational income elasticities in Denmark are consistently below US levels. Here, the Danish IGE is around 0.35 lower than the US IGE at low income levels and 0.25 lower at high income levels. Hence, the largest difference between income mobility in Denmark and the US is now for the lowest family incomes.
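The local linear estimator behind the NL-IGE figures can be illustrated with a minimal sketch. It assumes an Epanechnikov kernel applied to absolute parental income and a weighted least-squares fit of log child income on log parental income at each evaluation point. The function names, the bandwidth, and the synthetic data are hypothetical and are not taken from the paper's estimation code.

```python
import math


def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) for |u| <= 1, else 0."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def local_ige(parent_income, child_income, y0, bandwidth):
    """Local linear slope of ln(child income) on ln(parent income) at y0.

    Kernel weights are computed on absolute parental income (an assumption
    mirroring "log income, weighted with absolute income" in the text).
    Incomes must be strictly positive, i.e. no imputation for zeros.
    """
    sw = sx = sy = sxx = sxy = 0.0
    for yp, yc in zip(parent_income, child_income):
        w = epanechnikov((yp - y0) / bandwidth)
        if w == 0.0:
            continue
        x = math.log(yp) - math.log(y0)  # centre the regressor at y0
        y = math.log(yc)
        sw += w
        sx += w * x
        sy += w * y
        sxx += w * x * x
        sxy += w * x * y
    denom = sw * sxx - sx * sx
    if denom <= 0.0:
        raise ValueError("too few observations inside the bandwidth")
    # Weighted least-squares slope = local elasticity at y0
    return (sw * sxy - sx * sy) / denom


# Hypothetical data: the elasticity rises with parental income by construction.
parents = [float(p) for p in range(10_000, 150_001, 500)]
children = [math.exp(5.0 + 0.1 * math.log(p) + 0.01 * math.log(p) ** 2)
            for p in parents]

ige_low = local_ige(parents, children, y0=20_000.0, bandwidth=10_000.0)
ige_high = local_ige(parents, children, y0=120_000.0, bandwidth=10_000.0)
print(round(ige_low, 3), round(ige_high, 3))
```

Evaluating `local_ige` on a grid of parental incomes for each country and subtracting the two curves gives the kind of cross-country difference plotted in Figure 3. Confidence bands would require an additional step, such as a bootstrap, that is not sketched here.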

III. Educational Mobility by Family Background
In the previous section, we studied intergenerational income mobility across the two countries and showed that wage compression and tax/transfer policies are major determinants of cross-country differences in mobility. Although the reward for education might be lower in Denmark, its generous support of education, support of childcare, and early education initiatives promote skill formation as measured by test scores among the disadvantaged. Many point to the more generous educational and childcare policies in place in Denmark as a source of its greater social mobility (e.g., Sanders, 2013). We examine this claim and show that average educational mobility is remarkably similar across the two countries. We start by demonstrating the near-universal participation in such programs in Denmark coupled with a lack of educational and income gaps compared to the US. 22 Figures 4(a) and 4(b) show the fraction of children enrolled in preschool programs at the age of 4 in the US and Denmark from 1995 to 2005. The figures show the overall rates, together with the rates for children for whom both parents have fewer than 12 years of schooling (less than high school) and for children for whom both parents have at least 15 years of schooling (college or more). The figures show that preschool enrollment rates at age 5 were, on average, similar in the two countries in 1995. Since then, rates of participation have stagnated in the US and increased to a level close to full uptake in Denmark. Importantly, the figures also show large gaps in enrollment rates by parental education in the US, whereas there are no differences in Denmark. Figures 4(c) and 4(d) show rates of daycare/preschool use at ages 2, 3, and 4 by parental wage income rank in 2005 in the two countries. Enrollment rates are lower in the US, trends in participation are flatter, and family income gradients for participation in the programs are steeper.
22 Throughout this section we use a variety of data sources. We discuss these briefly in the main text. Section F in the Online Appendix describes them in detail.
23 See Tables A13 and A14 in the Online Appendix for an overview of the expenditures on education systems in the US, Denmark, and the rest of Scandinavia. They demonstrate the greater generosity of the Danish system measured in a variety of ways. Expenditures on preschools are especially generous. Currie (2001) and Simonsen (2010) give detailed descriptions of expenditures and pricing schemes in early education in the US and Denmark, respectively.
24 The Scandinavian daycare and preschool system is rooted in a social pedagogy tradition as opposed to many other European countries and the US, which focus more on an educational approach (OECD, 2001, 2006). "The English-speaking world has adopted a 'readiness for school' approach, which although defined broadly focuses in practice on cognitive development in the early years, and the acquisition of a range of knowledge, skills, and dispositions. A disadvantage inherent in this approach is the use of programmes and approaches that are poorly suited to the psychology and natural learning strategies of young children. In countries inheriting a social pedagogy tradition (Nordic and Central European countries), the kindergarten years are seen as a broad preparation for life and the foundation stage of lifelong learning." (OECD, 2006, p. 2). The Nordic approach is summarized as follows. "The core of the curriculum is the dialogue between adult and child and creative activities, discussions and reflections. The curriculum sets goals for early education, but is flexible so that it can be adapted to local and individual needs." (Taguma et al., 2013, Table 2.1). In recent years, however, early childhood care in Denmark has increased its focus on education as well (see Jensen et al., 2010, for a discussion).
A few studies present causal evidence linking access to universal public childcare to improvements in skills in a Scandinavian context. Havnes and Mogstad (2011b) study the effects of a large expansion of childcare in Norway on long-run outcomes. They find that daycare enrollment improves educational attainment and earnings, especially for children from low-resource families. 25 Datta Gupta and Simonsen (2010, 2012) investigate the effects of home care, non-parental/related family care (i.e., in a child-minder's home), and public daycare in Denmark on socio-emotional skills. Datta Gupta and Simonsen (2010) find that public daycare relative to family care increases socio-emotional skills at age 7, while Datta Gupta and Simonsen (2012) suggest that the effects might fade at later ages.
There is an active body of literature in which the effectiveness of early childhood interventions in the US is investigated. 26 The evidence from many US programs might not be relevant to the current discussion, as the programs are often very intensive and target specific groups of children. Cascio (2009) reports that large-scale, publicly funded childcare programs in the US are less effective than their Scandinavian counterparts. She suggests that low-intensity programs crowd out other programs (e.g., Head Start) and divert funding from other public skill formation initiatives.
While the cited studies only investigate policy changes within a given country, they support the claim that increased early childhood investments, through universal public childcare, improve the skills of the least advantaged children and thus intergenerational skill mobility. The evidence for their effectiveness is supported by Figure 5, which shows distributions of Program for International Student Assessment (PISA) mathematics and reading scores in Denmark and the US in 2003. 27 The figure shows stark differences in the lower tails of PISA test scores. The lowest quartile in the US performs much worse than the lowest quartile in Denmark.
Yet, despite the greater provision of early childhood education to low-resource families in Denmark, the lack of any pecuniary costs of education in Denmark, and the compressed skill distributions, the association between educational attainment levels from one generation to the next is remarkably similar in Denmark and the US. Figure 6 shows the fraction of those aged 20-34 in (or with) a tertiary education, by parental educational attainment in Denmark, the US, and Norway. The figure shows that only 6-8 percent of individuals aged 20-34 who are enrolled in or have completed a tertiary education come from homes where both parents have not graduated from an upper secondary education. Generally, there are few differences in these percentages across the three countries. Figure A13 and Section B.2 (Figure A30) in the Online Appendix corroborate this evidence. Figure A13 shows that educational mobility in Denmark is not higher than in the US if we instead consider regression-based coefficients relating children's and parents' educational attainment as reported in Hertz et al. (2008). Educational transitions across generations are very similar in the two countries for more recent cohorts, as we show in Figure A30 in the Online Appendix.
25 Havnes and Mogstad (2011a,b) show that most of the uptake in publicly provided childcare comes from children who were in informal care arrangements. We return to this point in our conclusion.
26 Currie and Thomas (2000) and Elango et al. (2016) are examples.
27 Figures A7 and A8 in the Online Appendix provide similar results for adults using data from the Program for the International Assessment of Adult Competencies (PIAAC) and from the International Adult Literacy Skills Survey (IALS).
In the rest of this section, we elucidate these findings and investigate the reasons why seemingly similar levels of educational mobility arise. First, we briefly describe the data used in our analyses. Then, we examine educational attainment by parental resources and which factors help to explain the relationship between children's education and parents' resources. We also consider explanations that link the findings from our different analyses.

Data
US Data: CNLSY. We restrict the sample to cohorts born in 1991 or earlier. 28 In addition to information on own characteristics, we include information on mother's characteristics from the original National Longitudinal Survey of Youth (NLSY) data. We restrict the sample to individuals for whom we observe at least one test score for both cognitive and noncognitive skills (see below), along with parental income. This leaves us with a sample of 3,268 individuals. We lose 15 percent because of missing observations, and 28 percent of the remaining sample are born in 1987 or later. See Table A19 in the Online Appendix for sources of loss of sample information.
Danish Data: 1987 Cohort. We use the entire cohort of children born in Denmark in 1987. Using a unique individual identifier, we link information on demographic characteristics to schooling outcomes and exam grades in ninth grade. 29 The data also include a unique parental identifier, which allows us to link the information on the children to parental income and wealth, demographic characteristics, and mother's educational attainment. We restrict the sample to children whose parents have non-negative household wage income in 2002. This results in a sample of 39,539 children. 30
Comparability of Samples. There are two fundamental differences between the Danish and US samples. First, while the latter come from survey data, the former come from a full population register based on information reported from relevant institutions and authorities. Second, cohorts vary in their characteristics in the two countries. Danish data are centered around a child's birth year. The data from the Children of the National Longitudinal Survey of Youth (CNLSY) are centered around parents' birth year, as these data are based on children born from five cohorts of parents. In the CNLSY, we record information on multiple cohorts of children (before and after 1987) and only five cohorts of mothers. For the Danish data, we consider only one cohort of children born in 1987 and numerous cohorts of parents. We do not censor the data to align parents' and children's cohorts between the two countries, as this would induce heavy selection in terms of mother's age at childbirth. As female fertility patterns are different between the US and Denmark, such selection imposes arbitrary differences between the two countries and could consequently invalidate the analysis. 31
Measuring Income and Wealth. In the CNLSY data, we measure parental income using the sum of the mother's and the spouse's self-reported wage earnings.
In the Danish data, we measure parental income as the sum of the mother's and father's wage earnings. 32 For both countries, we measure income as average income between the child's 3rd and 15th years. The two income concepts are similar in content.
29 Primary school grades are not binding for the child's further educational trajectory for this cohort.
30 See Table A19 in the Online Appendix for sources of loss of sample information.
31 Table A15 in the Online Appendix presents regression coefficients of parental permanent wage income and wealth on children's high school completion and college attendance where we sample cohorts in the Danish register data with the same distribution as observed in the CNLSY data, and where we sample the number of observations in each cohort as observed in the CNLSY data. The results do not differ significantly or qualitatively from our main results, which we present in Table 4.
32 Results are robust to using gross income including Unemployment Insurance benefits (UI) and welfare transfers.
For the US, we measure assets by reported net assets in the CNLSY. 33 For Denmark, assets are measured by net assets (excluding pension savings) from income and wealth data reported to tax authorities. 34 In both countries, we measure assets at age 15 of the child. While the data again differ in terms of source, net assets are highly dependent on housing wealth. 35 Thus, intra-country differences in wealth can capture differences in market luck in the housing sector, family endowments, and lifetime income. 36
33 These include the value of major owned durables (e.g., housing), as well as debts, but not pension assets.
34 These include valuations of major owned durables.
35 See Browning et al. (2013) for a discussion of Danish data.
36 Home equity comprises a larger share of households' net wealth in Denmark than in the US. In 2010, home equity was estimated as approximately 24 percent of households' net wealth in the US (Gottschalck et al., 2013). In Denmark in 2014, this was 37.5 percent (Statistics Denmark, 2016). Reported shares for both countries include pension savings in total net wealth.
Measuring Education. In the US data, high school completion is defined using questions on whether or not the child has a high school diploma/General Educational Development certificate (GED). 37 We define college attendance as a report of either full- or part-time enrollment in college. In the Danish data, we define high school completion as having completed an education that requires at least 12 years of schooling, which includes both academic and vocational high school graduates, and college as having been enrolled in an education that requires at least 15 years of schooling. 38
37 Cameron and Heckman (1993) and [...] show that these two concepts are not equivalent. However, omitting the GED from the definition of high school completion would likely reduce the similarities of Denmark and the US, as the Danish measure of high school completion also includes a version of the GED (the Højere Forberedelseseksamen (HF) or Higher Preparatory Examination). The HF is a substitute high school degree designed for those who dropped out of high school or earlier educational levels before completing them. While grade point averages (GPAs) from the HF provide access to further education and university just as regular academic high school (Gymnasium) does, HF graduates have lower average levels of completed schooling and lower adult income (likely because HF completion instead of Gymnasium completion, for a given GPA, proxies fewer skills on other dimensions; Heckman and Rubinstein, 2001). Students often take the HF at older ages than normal high school students. Some are high school dropouts while others have not enrolled in high school, but dropped out of education after the compulsory years and have spent 5-10 years out of the educational system.
38 The Danish educational system is rooted in a Northern European tradition and is not directly comparable to the US system, while secondary and tertiary educations in Denmark are highly comparable to those in countries such as Germany and Norway. Our definitions of "high school" and "college" bring the US and Danish systems closer, both qualitatively and in population means. However, this simplification of the Danish educational ladder reduces comparability to other Scandinavian schooling systems, unless similar simplifications are made there as well. Figure A52 in the data section of the Online Appendix illustrates how the two schooling outcomes are affected by our definitions and age restrictions in Denmark.
Cross-country institutional differences are a potential confounder. While we have chosen our definitions of high school completion and college attendance to maintain comparability, we do not (and cannot) control for all cross-country institutional differences. Two potentially problematic issues are social promotion 39 and the minimum school leaving age, 40 which might distort the levels of human capital associated with equal levels of schooling in Denmark and the US.
39 Social promotion reduces the academic material needed to pass this level. Social promotion can, in a more complex form, result in reducing the academic levels needed to complete a given education, and thus inflate graduation rates, thereby invalidating cross-country comparisons of the educational levels in question. The phenomenon exists in both countries, evidenced by the substantial attention it has received in the public debate. For Denmark, see Berlingske Tidende (2015); for the US, see Clinton (1998) and United States Department of Education (1999). However, there are no data available that allow us to test for differences and/or similarities between social promotion in Denmark and the US.
40 In the US, the law dictates that children should attend school until they turn 16-18 (depending upon the state). For most states, this includes the first year of high school. In Denmark, there is no minimum school-leaving age, but instead a minimum number of years of schooling. Children are not allowed to leave school before they have completed ninth grade. As a consequence, most US children have to begin high school even though they are not forced to complete it. This might induce some to graduate high school who would not have done so in the Danish setting, and thus might increase (perceived) educational mobility in the US.
Measuring Skills. For the US, we use the Peabody Individual Achievement Test (PIAT) scores to measure cognitive skills. The CNLSY features three sets of PIAT scores: reading recognition, reading comprehension, and mathematics. For non-cognitive skills, we use the antisocial, headstrong, and hyperactivity subscales from the Behavior Problem Index (BPI). The measures of cognitive and non-cognitive skills are in accordance with those of, for example, Cunha and Heckman (2008) and Heckman et al. (2006). For Denmark, we measure skills using grades from the ninth grade (the final year of compulsory schooling, i.e., before they begin high school). The cognitive skill measures are residualized on the non-cognitive measures, because exam grades (even cognitive ones) are highly dependent on non-cognitive skills (Borghans et al., 2016). We measure cognitive skills using final mathematics exam grades (written), mathematics mid-term grades (written), and final physics exam grades, and non-cognitive skills using orderliness/organization/neatness grades from the Danish written exam, Danish written mid-term, and mathematics written exam. 41 As test scores and grades are highly associated with non-cognitive skills (Borghans et al., 2011a,b, 2016), we use residuals from the cognitive measures regressed on the non-cognitive measures in the measurement system to estimate cognitive skills.
Figures A14(a) and (b) in the Online Appendix show that, in both countries, rates of high school completion increase in parental income and wealth. In the US data, the relationship has its greatest curvature at low levels of income and wealth, while a gradient is evident across the entire range of parental wealth and income in the Danish data. In both countries, 90 percent of children at the top of the income and wealth distribution complete high school, whereas for low levels of income and wealth, approximately 65-70 percent complete high school in the two countries. Broadening the income and wealth ranges to all levels of support beyond the ranges where we have an overlap between the two countries, Figure A15 in the Online Appendix shows that individuals whose parents are at the lower end of the distributions are more likely to complete high school in the US than in Denmark. For parents with low levels of income and wealth, 60 and 45 percent of children in the US and Denmark, respectively, complete high school.
Figures A14(c) and (d) show that college attendance rates also increase with parental wealth and income. The gradient, with respect to wealth, is larger in the US than in Denmark at the bottom of the wealth distribution. Parental income is only strongly associated with increasing rates of college attendance for families with above-median wealth in the US. In contrast, in Denmark, the income gradient is largest for families with below-median wealth. 43 Finally, Figures A19(a) and (b) in the Online Appendix show level differences of the surfaces displayed in the previous figures for the areas of income and wealth where we have common support in the Danish and US data. The figures show that levels of high school completion are higher in the US than in Denmark for children from low-income/low-wealth families, while this group's college attendance rates are substantially higher in Denmark than in the US.
41 Our measures of non-cognitive skills in the two countries are clearly not equivalent. The Danish measure of non-cognitive skills is more related to an orderliness/effort measure while the US measure is related to behavioral problems. Another concern when using grades is that our measures of non-cognitive skills are more closely related to academic achievement than to socio-emotional skills. We do not consider this to be an issue in the present case. When we estimate factor loadings and perform variance decompositions from the two factors on the outcomes driving under the influence (DUI) and psychiatric admissions, these outcomes are significantly more associated with non-cognitive (socio-emotional) skills than cognitive skills. The factor for non-cognitive skills explains around three to five times as much of the variance in DUI and mental disorders compared to the factor for cognitive skills.
42 Figures A16(a) [...]
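The residualization step described under Measuring Skills (cognitive measures regressed on non-cognitive measures, with the residuals retained) can be illustrated with a minimal sketch. It assumes a single non-cognitive regressor and ordinary least squares, whereas the paper residualizes within a larger measurement system. The function name and the grade values are hypothetical.

```python
def residualize(outcome, regressor):
    """Return residuals from an OLS regression of outcome on regressor.

    A single regressor plus intercept is used for brevity; the paper
    residualizes within a larger measurement system (assumption here).
    """
    n = len(outcome)
    if n != len(regressor) or n < 2:
        raise ValueError("need two equally long series with n >= 2")
    mean_x = sum(regressor) / n
    mean_y = sum(outcome) / n
    sxx = sum((x - mean_x) ** 2 for x in regressor)
    if sxx == 0.0:
        raise ValueError("regressor has no variation")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(regressor, outcome))
    slope = sxy / sxx
    # Residual = part of the outcome not linearly predicted by the regressor
    return [(y - mean_y) - slope * (x - mean_x)
            for x, y in zip(regressor, outcome)]


# Hypothetical grades for six pupils (illustrative values, not real data).
math_exam = [7.0, 10.0, 4.0, 12.0, 7.0, 2.0]    # cognitive measure
orderliness = [7.0, 10.0, 7.0, 10.0, 4.0, 4.0]  # non-cognitive measure

math_resid = residualize(math_exam, orderliness)
```

By construction the residuals are uncorrelated with the orderliness grade, so a cognitive factor built from them does not simply pick up the effort or neatness already captured by the non-cognitive measure.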

Education and Family Background
The figures just described illustrate that levels of social mobility in Denmark do not always exceed social mobility in the US. One result in Figure A14 might suggest that mobility is higher in Denmark while another suggests the opposite. We next investigate which mediating factors explain mobility (or lack thereof) in Denmark and the US, and whether the role of these factors differs.

Controlling for Skills Formed in Early Adolescence, Family Characteristics, and Sorting of Children into Schools by Parental Income
In this subsection, we adjust the figures discussed in the previous subsection by controlling for cognitive and non-cognitive skills, measures of family background, and measures of school characteristics. Doing so significantly reduces the income and wealth differentials, with early adolescent measures of cognitive and non-cognitive skills playing a major role. Table 4 presents estimates from linear regressions of children's high school completion and college attendance on parental log income and wealth for the US and Denmark. The estimates can thus be interpreted as elasticities. 44 In the upper panel of the table, we present estimates with high school [...] Figure A14 in the Online Appendix, and in the subsequent three columns we gradually increase the conditioning set.
43 Figures A17 and A18 in the Online Appendix show rates of high school completion and college attendance by wage income and wage income plus public benefits in levels and ranks, respectively. The figures show that there is little or no difference in the relationship between the two income measures for both levels of schooling in the US and for college attendance in Denmark. For rates of high school completion in Denmark, however, we see that the inclusion of public benefits results in a steeper negative slope at the very lowest levels of incomes. The reason for this anomaly is that a small share of those with close to zero wage earnings and public transfers in Denmark live from capital income or profits from businesses instead. The educational outcomes of this group are very different from those experienced by children from the remaining low-income families.
44 Table A16 in the Online Appendix shows the corresponding regression coefficients for parental income and wealth ranks for children's high school completion and college attendance controlling for child skills at age 15, family background, and school quality.
45 From column 1 of the table, we see that parental income and wealth are strongly associated with children's high school completion and college attendance in the US and in Denmark. As shown in Figure A14, parental income and wealth gradients for children's high school completion are significantly higher in Denmark, while only the gradient for wealth differs for college attendance.
The second column presents associations controlling for child level of cognitive and non-cognitive skills measured at age 15-16. In comparison to the estimates from the first column, the income and wealth gradients for high school completion and college attendance are substantially reduced. Thus, the relationship between parental resources and child education is to a large degree mediated by levels of cognitive and non-cognitive skills in the adolescent years. While the upper panel shows that the coefficients for income and wealth still differ between Denmark and the US for high school completion, the estimates in the second column of the lower panel show that there are no significant differences between Denmark and the US for either income or wealth gradients in college attendance.
Even though cognitive and non-cognitive skills are highly predictive of educational attainment in both Denmark and the US, cross-country differences in these skills do not explain the entire relationship between parental resources and educational attainment. When we extend the analysis in the third column by adding measures of parental background (education/family status) to the measures of child skills, the relationship between parental financial resources and child education weakens further. 46 Again, we find that associations between parental income and wealth, on the one hand, and children's probability of high school completion, on the other, are stronger in Denmark compared to the US. We find no cross-country differences in the estimated relationships for college attendance.
Cognitive and non-cognitive skills and parental/family background play similar roles in mediating the relationship between parental financial resources and children's educational outcomes in both countries. However, other differences remain in comparing educational income and wealth gradients in the two countries. For example, Denmark and the US differ in the variability of school quality. Differences between the quality of public and private schooling likely depend on overall resources devoted to public schools, a major difference between the two countries. Denmark spends a far greater fraction of its GDP on public education than the US. 47 Yet, school resources and peer characteristics still vary by parental resources in Denmark, suggesting similar relationships between measures of schooling quality and family characteristics across the two countries. We present a preliminary exploration of these relationships for Denmark later in this section, where we establish that a school quality gradient also exists in Denmark. However, because of the lack of data, we are unable to test for differences in the distributions of school quality.
In the final column of Table 4, we show estimates of the association between children's education and parental income and wealth conditioning on the child's level of skills at ages 15-16, family background measures, and school characteristics measured in the primary school years. The gradients in parental income and wealth are substantially reduced because quality measures for primary school predict later educational attainment. 48 For Denmark, there is no remaining statistically significant relationship between parental resources and children's education, while for the US, a small relationship remains. Moreover, we only find one statistically significant cross-country difference at a 5 percent level between the gradients of children's rates of high school completion and college attendance as functions of parental financial resources, and generally none of the estimates differ qualitatively between Denmark and the US.
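The column-by-column logic of Table 4, in which the parental income gradient shrinks as mediating controls are added, can be illustrated with a minimal sketch on simulated data. The data-generating process, the coefficient values, and the `ols` helper are hypothetical. The outcome is continuous here for simplicity, whereas the outcomes in the paper are binary, and the sketch illustrates mediation only, not the paper's estimates.

```python
import random


def ols(regressors, y):
    """OLS coefficients [intercept, b1, ...] via the normal equations."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in regressors] for i in range(n)]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [u - f * v for u, v in zip(a[r], a[c])]
    return [a[i][k] / a[i][i] for i in range(k)]


rng = random.Random(0)
n = 5000
log_income = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Hypothetical data-generating process: income raises skills, and skills
# (much more than income itself) raise the schooling outcome.
skill = [0.8 * x + rng.gauss(0.0, 1.0) for x in log_income]
schooling = [0.05 * x + 0.5 * s + rng.gauss(0.0, 1.0)
             for x, s in zip(log_income, skill)]

raw_gradient = ols([log_income], schooling)[1]             # no controls
adjusted_gradient = ols([log_income, skill], schooling)[1]  # skill control
print(round(raw_gradient, 3), round(adjusted_gradient, 3))
```

In this simulation the raw gradient combines the direct income effect with the part that runs through skills, so conditioning on skills leaves only the small direct effect. As with the estimates in the text, the attenuation is descriptive and does not by itself establish a causal effect of the added control.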

Non-Linear Elasticities between Children's Education and Parents' Income
The results reported in Table 4 are average estimates for the two populations in question. As we have argued for income mobility, it is likely that any benefits from the Scandinavian welfare states accrue to the least advantaged. To allow for non-linearities in the association between children's education and parents' gross income including transfers, we repeat the analysis from Section II and estimate local linear regressions of children's high school completion/college graduation on parents' log income. Using the same data and estimation strategy as used in Section II, Figures 7(a)-(d) examine the non-linearities in the elasticities between children's high school/college completion and parental gross income including transfers. Figures 7(e) and 7(f) show the cross-country differences between the estimated elasticities. 49 The figures show strong non-linearities within countries and across educational levels. Elasticities for high school completion vary between 0 and 0.3 in Denmark, and between 0 and 0.12 in the US.
47 See OECD (2014) for an overview of public and private resources devoted to primary, secondary, and tertiary education, and Table A13 in the Online Appendix.
48 As data sources from the two countries differ (Denmark has register data and the US has survey data), we proxy school characteristics by aspects that are not directly comparable. For the US, we use parents' ratings of their child's school; for Denmark, we use average characteristics of earlier cohorts in a given school. The results are used to illustrate that substantial sorting takes place in both Denmark and the US and that this sorting coincides with parental resources. Hence, once we also condition on our school characteristics measures, the income and wealth gradients in children's schooling are reduced further. Importantly, the results are not causal, nor do they identify the impact of school quality on later education.
For college graduation, the non-linear relationship with parental income is even more apparent. In both countries, elasticities vary between approximately 0.10-0.15 for lowincome families and 0.40-0.45 for families with an average annual income of around $125,000. Yet, as shown in Figures 7(e) and (f), there is no substantial difference in educational mobility between Denmark and the US. When differences arise, they often do not favor Denmark. Moreover, the shape of the cross-country differences in educational mobility across parents' total gross income does not show any strong non-linear pattern favoring the least advantaged in Denmark relative to the least advantaged in the US.
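As a rough illustration of the local linear regressions described above, the following Python sketch estimates the local slope of an education indicator on parental log income at a few evaluation points. It is a minimal sketch under stated assumptions, not the estimation code behind Figure 7: the Gaussian kernel, the bandwidth of 20,000, the function name `local_linear_slope`, and the simulated data are all hypothetical. Following the figure notes, the kernel weights are computed on income in levels while the regressor is log income.

```python
import numpy as np

def local_linear_slope(income, outcome, income0, bandwidth):
    """Slope of outcome on log(income) from a kernel-weighted linear fit around income0."""
    x = np.log(income) - np.log(income0)      # regressor: log income, centered at the evaluation point
    u = (income - income0) / bandwidth        # kernel distance measured in income levels
    w = np.exp(-0.5 * u**2)                   # Gaussian kernel (an assumption)
    sw = np.sqrt(w)
    X = np.column_stack([np.ones_like(x), x])
    # Weighted least squares via ordinary least squares on sqrt-weighted data
    coef, *_ = np.linalg.lstsq(X * sw[:, None], outcome * sw, rcond=None)
    return coef[1]

# Simulated data, for illustration only
rng = np.random.default_rng(0)
income = rng.lognormal(mean=11.0, sigma=0.6, size=5000)
p_complete = np.clip(0.2 + 0.15 * (np.log(income) - 10.0), 0.0, 1.0)
completed = (rng.random(5000) < p_complete).astype(float)

grid = np.quantile(income, [0.1, 0.5, 0.9])
slopes = [local_linear_slope(income, completed, g, bandwidth=20000.0) for g in grid]
```

Evaluating the slope over a grid of parental income levels traces out a non-linear profile of the kind plotted in the figures; bootstrap resampling of the sample would be one way to attach standard errors to each point.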
These results also shed light on the likely relationship between credit constraints in the adolescent years and educational attainment, which is investigated by a large body of literature.50 This literature is often inconclusive as it does not control for the other parental characteristics associated with income. A related strand of literature investigates the effects of tuition and restrictions to funding of education.51 Even though we do not explicitly address this issue, our results are consistent with the evidence that it is not income during the adolescent years that matters, but investments as crystallized in cognitive and non-cognitive skills and longer-term family background factors that drive these associations.52 Our analysis shows that most of the association between high school completion and parental resources, and around half of the association between college enrollment and parental resources, is accounted for by differences in children's cognitive and non-cognitive skills and family background in early adolescence.53 Although the US and Denmark constitute two opposite poles in terms of tuition costs, the income and wealth gradients in high school completion and college enrollment do not differ substantially between the two countries.
49 Figure A21 in the Online Appendix shows the equivalent results using wage earnings and not gross income including transfers. Figures A22 and A23 show non-linear estimates of children's education (high school completion, college graduation, master degree, highest grade completed) on parents' average highest grade completed. The figures show non-linearities across different levels of parental education and differences between the various measures of children's education. However, the figures show no patterns that favor educational mobility in Denmark over that in the US.
50 See the summary of the literature in Heckman and Mosso (2014) and Lochner and Monge-Naranjo (2016).
51 Cameron and Taber (2004), Carneiro and Heckman (2002), and Keane and Wolpin (2001) find little evidence of this relationship as opposed to Bailey and Dynarski (2011), Belley and Lochner (2007), and Lochner and Monge-Naranjo (2012), who report stronger evidence of credit constraints to, for example, college enrollment.
[Fig. 7 notes, continued] ... show US-Denmark differences in local intergenerational elasticities between children's education and parental log gross income including transfers. High school completion is defined as highest completed grade ≥ 12, and college graduation as highest completed grade ≥ 15. Local linear regressions are weighted using kernels of absolute income. Standard errors are constructed from 50 and 1,000 bootstraps, respectively. The vertical lines mark the 5th and 95th percentiles in the data.
In conclusion, despite the higher cognitive scores for the disadvantaged and the lower pecuniary costs of education in Denmark, our analysis in this section together with several other supplementary data sources (see Sections A.2 and B.4 in the Online Appendix; OECD, 2014; Hertz et al., 2008) all point in the same direction. There are few noteworthy differences in educational mobility between the US and Denmark, and certainly nothing that can explain the differences in income and wage earnings mobility reported in Section II. This analysis raises the following important question. Are there factors embedded in the Scandinavian welfare state that reduce incentives to pursue education and thus educational mobility? We discuss and investigate this in the next subsection.

Welfare Levels and Educational Incentives
It is well established that the economic returns to education are substantially lower in Denmark and the other Scandinavian countries than in the US (e.g., Harmon et al., 2003; Fredriksson and Topel, 2010). Two mechanisms leading to this difference are wage compression and the high levels of welfare benefits observed in Scandinavia. As noted in Edin and Topel (1997) and Fredriksson and Topel (2010), incentives to pursue education diminish as returns to education decrease and welfare benefits increase. In this subsection, we establish an empirical relationship between educational attainment and potential public benefits in Denmark. We refer the reader to Section B in the Online Appendix, and to Edin and Topel (1997), Fredriksson and Topel (2010), Freeman et al. (2010), Rosen (1997), and Tranaes (2006) for discussions and descriptive evidence of the differences between the income and employment prospects of unskilled or low-skilled individuals in Denmark and the US, and the relationship between public sector employment, public benefits, and the wage floor in Denmark. In Section B in the Online Appendix, we further show that incomes are compressed in the tails of the educational and income distributions in Denmark. Hence, the lower returns to education in Denmark compared to the US do not stem from cross-country differences in educational tracks, which could cloud the relationship between years of schooling and income.
Figure 8 presents evidence on the issue at hand. The figure shows mean pre-tax wage earnings measured in 2010-2012 for the cohorts born in the period 1973-1975 in Denmark, by level of highest completed education,54 together with horizontal lines indicating the 2011 maximum unemployment insurance benefits and the social assistance levels in Denmark.55 From the figure, it is evident that for individuals with the lowest levels of education, average wage earnings barely exceed maximum social assistance levels. Even as one climbs the educational ladder, it is not until college completion that wage earnings are twice the size of earnings from social assistance. The progressivity of the Danish tax system only makes this pattern more pronounced.
While the relationship between returns to education, public benefits, and educational attainment has been discussed in the literature we have cited, there is little causal evidence. Figure 9 provides the first evidence of such a causal relationship by illustrating the response to two reforms of social assistance levels for youths passed in Denmark in 1991 and 1992/1993, respectively, which increased the incentive to be enrolled in education relative to dropping out.56 The first reform raised the minimum age of eligibility for full social assistance (SA) from age 20 to age 21 and the second reform raised the minimum age for receipt from 21 to 25. Below the minimum age, individuals were only entitled to "youth assistance" (ungdomsydelse), which was substantially lower than the full SA level, and they had an increased obligation to participate in employment-focussed activation programs.57 Figure 9 shows rates of enrollment in any level of education measured on a weekly basis from age 19 to age 26 for the cohorts born in the period 1969-1974. In Figure 9(a), we plot enrollment rates for individuals who were 20 and 21 years old at the time of the 1991 reform that raised the minimum age from 20 to 21. The figure shows that enrollment rates were similar at younger ages, but at the exact time of the reform the two groups diverged and enrollment rates became approximately 2-3 percentage points higher for the affected group who were 20 years old at the time of the reform, relative to the unaffected group who were 21 years old. Figure 9(b) shows a similar response around the time of the 1992/1993 reform that raised the minimum age from age 21 to 25. The figure shows enrollment rates for individuals who were 20-23 years old at the time of the reform. We see that the groups had similar trajectories prior to the change but diverged once the minimum age was raised. Those affected at age 20 broke away from the three remaining cohorts at age 20. Those affected at age 21 broke away from the two remaining cohorts at age 21. Those affected at age 22 diverged at this exact age from those who were affected at age 23.
54 In this analysis we use the data that were introduced in the Data subsection of Section II.
55 Eligibility for unemployment insurance benefits is based on previous employment and membership of an unemployment insurance fund (insurance is tax-deductible and benefits are subsidized by the public sector). The figure reports the maximum level available. Below this level, unemployment insurance benefits replace wage earnings at a rate of 90 percent. Social assistance is means-tested (at the household level) such that income earned is deducted 1:1 from benefits. Levels differ by whether recipients have children or not.
56 See https://www.retsinformation.dk/Forms/R0710.aspx?id=53834, https://www.retsinformation.dk/Forms/R0710.aspx?id=53848, and https://www.retsinformation.dk/Forms/R0710.aspx?id=53886 (accessed April 1, 2016) for the specific legislation in question.
57 Jonassen (2013) studies behavioral responses around age 25. He finds that take-up of social assistance increases substantially once benefits increase to full level at age 25. Furthermore, he finds no evidence of substitution between social assistance and public education grant at age 25. This contrast with our results suggests that the link between social assistance and education is mainly present at lower-tier educations (i.e., for low-skilled individuals), which would have been completed at age 25 in any case.
[Fig. 8 notes, continued] ... 1973-1975 in Denmark. The figure also shows maximum unemployment insurance benefits, social assistance level for individuals with children (extra benefit for second child applies), and social assistance level for individuals without children. The horizontal lines are the raw benefits and do not include means-tested daycare slots and other types of benefits. Average wage earnings are estimated from the full sample and are not conditional on employment. Education: "Below high school" is years of schooling <12; "Gymnasium" is defined as 12 years of schooling and a gymnasium or HF degree (see discussion of HF in footnote 36); "Vocational/Some college" is defined as 12< years of schooling <15, or 12≤ years of schooling <15 and a vocational training degree; "College" is defined as 15≤ years of schooling <17; "Master or above" is defined as years of schooling ≥17, which corresponds to at least a master's degree from a university. Wage earnings include taxable wage earnings and fringes, labor portion of business income, and non-taxable earnings, severance pay, and stock options.
Fig. 9. Fraction enrolled in education by age around the timing of two reforms in 1991 and 1992/1993 that raised the minimum age for eligibility for full social assistance levels, in Denmark
Notes: This figure shows the fraction enrolled in education by age (measured weekly) from age 19 until age 26 for the cohorts born in 1969 and 1970, and between 1971 and 1974, respectively. The figures are constructed using full population register data with exact enrollment and exit dates from all educational levels (except first to seventh grades) in Denmark, merged with demographic registers with exact information of birth dates. Both panels show enrollment rates around reforms where the minimum age for full social assistance eligibility was raised. In 1991, it was raised from age 20 to age 21. In 1992/1993, it was raised further from age 21 to age 25. Below this age of eligibility, the level of social assistance was substantially reduced and increased activation obligations applied. Panel (a) shows enrollment rates for individuals who were 20 and 21 when the age for full social assistance eligibility was raised from 20 to 21. Panel (b) shows enrollment rates for individuals who were between 20 and 23 when the age for full social assistance eligibility was raised from 21 to 25.
The results presented here establish a negative relationship between educational enrollment and the level of public benefits, albeit with two caveats. First, it is beyond the scope of this paper to estimate the underlying behavioral parameters; we strongly encourage future research to explore this relationship. Second, we have precise estimates neither of the potential gains from the greater equality in childhood investments and lower pecuniary costs of education in Denmark than in the US, nor of the disincentives for educational attainment that wage compression and public benefits constitute. Hence, we cannot determine whether the similarities in educational mobility in the two countries occur because the effects offset each other, although we find this to be a plausible explanation given the evidence.
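The cohort comparison behind Figure 9 can be read as a before/after comparison of the enrollment gap between an affected and an unaffected group. The Python sketch below illustrates that logic only; the paper presents the evidence graphically and does not report such an estimate. The function name `diff_in_diff`, the weekly index, and all numbers are hypothetical, with the 2.5 percentage point jump chosen merely to fall inside the 2-3 percentage point range described in the text.

```python
import numpy as np

def diff_in_diff(affected, unaffected, post):
    """Change in the affected-minus-unaffected enrollment gap from the pre to the post period."""
    affected = np.asarray(affected, dtype=float)
    unaffected = np.asarray(unaffected, dtype=float)
    post = np.asarray(post, dtype=bool)
    gap = affected - unaffected
    return gap[post].mean() - gap[~post].mean()

# Hypothetical weekly enrollment rates, indexed by weeks relative to the reform
weeks = np.arange(-52, 53)
post = weeks >= 0
unaffected = 0.40 + 0.0005 * weeks        # made-up common trend
affected = unaffected + 0.025 * post      # made-up 2.5 percentage point divergence at the reform
effect = diff_in_diff(affected, unaffected, post)
```

Such a comparison attributes the divergence to the reform only if the two groups would otherwise have followed similar trajectories, which is what the similar pre-reform enrollment paths in the figure are meant to suggest.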

Neighborhood Sorting of Children by Family Background
Neighborhood sorting by family socio-economic status is prevalent in each country. Public schooling in Denmark is universal and attempts to offer all children equal amounts of high-quality schooling. This policy might be disequalizing because children with early advantages accumulate skills at a higher rate while in school.58 High levels of equal investments in schooling for all children amplify initial gaps between advantaged and disadvantaged children. This is a consequence of static complementarity between investments and child skill levels at each age, reinforced by increasing complementarity between investments and skill levels as children age.59 Figures 10 and 11 show that, in Denmark, different measures of parental resources correlate with the school and peer quality of public schools,60 and thus investments in children through the public schools tend to increase with parental income.61
58 Heckman and Mosso (2014) present evidence supporting disequalization.
59 Cunha and Heckman (2007), Carneiro et al. (2013), and Heckman and Mosso (2014) discuss static and dynamic complementarity. Static complementarity is described by the "Matthew effect": to those who have, more is given. Dynamic complementarity is the effect of investments at one age on building complementarity at later ages.
60 Danish schools receive higher rates per student for special needs education, but not based on overall resource level of catchment area. See the Public School Law (https://www.retsinformation.dk/Forms/r0710.aspx?id=163970).
61 A vast literature has described the significant impacts of school and peer quality on educational outcomes and income. See Sacerdote (2011) for examples.
Notes: This figure shows school "leave-one-out" means of predicted cognitive and non-cognitive skills, from the estimated measurement system, using the cohort born in 1987 in Denmark. Means of peers' birth endowments in preschool are calculated using the 1995 birth cohort. The dashed lines show 95 percent confidence intervals. The skills are anchored to P(high school completion) and the y-axis can be interpreted as such. Hence, a difference of 0.02 from the log(income) of 10 to log(income) of 11 for non-cognitive skills implies that the mean level difference in non-cognitive skills for peers of children whose parents' log(income) equals 10 and 11, respectively, is associated with a 2 percentage point difference in the likelihood of completing high school. We use birth weight, gestational length, and length at birth to estimate birth endowments. We use exam grades in mathematics and physics to estimate cognitive skills and grades on organization/neatness to estimate non-cognitive skills.
In a similar vein, Figures A1(a) and (b) show variation in average high school completion and college attendance rates across schools. In some schools, only 50 percent of students complete high school and 10 percent attend college, while in other schools, all students complete high school and 80 percent attend college. The figures also show large differences in average parental gross income across schools. The differences correlate strongly with the later educational attainment of students. Figures A1(c) and (d) plot the average high school completion and college attendance rates against the school mean peer parental gross income and highest grade completed. The figures show that the average educational attainment of a ninth grade student is strongly positively correlated with peer family income and education.
Notes: This figure shows school "leave-one-out" means of predicted birth endowments and cognitive and non-cognitive skills, from the estimated measurement system, using the cohort born in 1987 in Denmark. The dashed lines show 95 percent confidence intervals. The skills are anchored to P(high school completion) and the y-axis can be interpreted as such. Property value is measured as mean valuation of owned property (from Statistics Denmark and Danish national tax authorities) in a given catchment area. Hence, a difference of 0.02 from the 1st to the 100th percentile for cognitive skills implies that the mean level of cognitive skills for peers of children whose parents own property in the most expensive school catchment area is associated with a 2 percentage point higher likelihood of completing high school. We use birth weight, gestational length, and length at birth to estimate birth endowments. We use exam grades in mathematics and physics to estimate cognitive skills and grades on organization/neatness to estimate non-cognitive skills.
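The school "leave-one-out" means mentioned in the figure notes average a characteristic over a child's schoolmates while excluding the child's own value. A minimal Python sketch of that calculation follows; the function name `leave_one_out_means` and the toy data are hypothetical, and the predicted skill measures from the paper's measurement system are not reproduced here.

```python
import numpy as np

def leave_one_out_means(values, groups):
    """For each observation, the mean of `values` over the other members of its group."""
    values = np.asarray(values, dtype=float)
    _, idx = np.unique(np.asarray(groups), return_inverse=True)
    sums = np.bincount(idx, weights=values)   # group totals
    counts = np.bincount(idx)                 # group sizes
    n = counts[idx]
    loo = np.full(values.shape, np.nan)       # undefined for groups with a single member
    ok = n > 1
    loo[ok] = (sums[idx][ok] - values[ok]) / (n[ok] - 1)
    return loo

# Toy example: five students in two schools
skills = [0.2, 0.4, 0.9, 0.5, 0.7]
school = ["A", "A", "A", "B", "B"]
peers = leave_one_out_means(skills, school)
```

Excluding the child's own value keeps the peer measure from being mechanically correlated with the child's own skills and family background.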
Catchment areas for public institutions, which restrict children's peers to certain income groups, combined with equal public investments, tend to favor children from high-income families.62 The strong relationship in Denmark between educational attainment and family background could arise solely as a result of neighborhood sorting on the basis of family income and wealth.
We lack comparable information for the US. If, in fact, sorting is equally strong in the two countries, this factor helps to explain the near equality of educational IGEs in both countries. Residential sorting might help to undo the benefits of the Scandinavian welfare state. We leave this topic for future research.
62 See Black and Machin (2011) for a review of this literature. Tiebout (1956) and Black (1999) show that housing prices are affected by school quality. More related to the Danish case, Machin and Salvanes (2016) provide recent evidence on this issue from Norway.
Certainly, full equality of opportunity is not present in Denmark. Figure A2 shows socio-emotional ratings measured at ages 7 and 12, and cognitive and language test scores measured at age 12, by parental permanent gross income for Danish children. For all three measures, the average gaps between the most disadvantaged children and the most advantaged children are around 0.5 of a standard deviation. Thus, as evidenced by this figure and our analysis earlier in this section, substantial skill gaps throughout childhood and adolescence remain in Denmark. While the Scandinavian welfare state invests heavily in children throughout childhood and redistributes income (consumption) during adulthood, it has not eradicated the strong influence of parents and early childhood environments. As a consequence of the complementarity between skills and investments, later life universal schooling investments during childhood or adolescence might be ineffective in reducing gaps between advantaged and disadvantaged children.

IV. Limitations, Future Directions, and Open Questions
Before concluding, we discuss some limitations of this study. First, like much of the empirical literature on social mobility, we report empirical relationships (and not necessarily causal relationships) across generations. Our discussion emphasizes the need for a clearer theoretical framework to disentangle the effects of different income sources (wage earnings, profits, capital, public transfers, and taxation) and mechanisms through which they are related across generations (the dynamics of parental and public investments/human, monetary, and physical capital transmission).
Second, our analysis measures parental income as permanent income during a child's primary and secondary schooling ages. Permanent family income over the life cycle of children has been shown to account for most of the variation in the relationship between family income and children's schooling (Carneiro and Heckman, 2002). Yet, it might be that the pivotal differences between the US and Denmark materialize at early ages.63 Low-income parents might be constrained in making early lifetime investments in the US but not in Denmark.64
Third, any attempt to capture a country's level of intergenerational mobility and the relationship between parental and child outcomes by a few point estimates is bound to be unsatisfactory. While the estimation of NL-IGEs is a step in the right direction, other empirical strategies might be used. One strategy estimates local rank regressions, where the IGE is found by minimizing the product of ranked residuals, thus putting less weight on extreme observations and more weight on mid-rank observations. Our empirical analysis of rank regressions is consistent with our analysis of NL-IGEs. There is curvature in the estimated relationships at the top and at the bottom of the income distribution.65 Another method is copulas, which might be particularly useful in the present case of describing the dependence between parental and child income because tail dependence in income distributions is notoriously difficult to determine.66
There are a number of aspects of inequality that we have not analyzed. We have not addressed issues pertaining to in-kind transfers, health, and access to health care, but only to inequality in terms of skill formation, educational attainment, and income. A more comprehensive analysis would be desirable.
63 In a similar vein, in our analysis of income mobility, we have chosen an age range to measure income that should proxy lifetime income closely. Yet we still only provide a snapshot of income mobility under the assumption of homogeneous time preferences. Educational attainment and time preference correlate (Oreopoulos and Salvanes, 2011). As a consequence, intergenerational income mobility estimates reported here and in the remaining literature might be biased if income at older ages should receive less weight than income at early ages for individuals without college degrees, and vice versa. One possible next step would be to create a generalized intergenerational income elasticity, which evaluates the entire stream of lifetime earnings for parents and children, allowing for differential time preferences. Even better would be to form the value functions of lifetime earnings (e.g., Hai and Heckman, 2016).
64 See the discussion in Lochner and Monge-Naranjo (2016).
65 See Section D of the Online Appendix for a brief outline.
66 See Section E of the Online Appendix for a brief introduction and examples.

V. Conclusion
Academics and policymakers around the world point with admiration to Scandinavia as a model for reducing inequality and promoting social mobility without sacrificing economic efficiency or growth. This paper takes a first step towards investigating in what dimensions and for what reasons Scandinavia is more effective in promoting social mobility. Despite Denmark's far more generous welfare state, its extensive system of preschools, and its free college tuition, the family influence/child education relationship is very similar to that of the US. In both countries, much of the average association between parental resources and the educational attainment of children can be explained by factors set in place by age 15, including child skills. However, distributions of cognitive test scores of disadvantaged Danish children are much better than those of their counterparts in the US.
The failure to promote greater educational mobility in spite of providing generous social services is most likely rooted in the welfare state. Our findings point to wage compression and the higher levels of welfare benefits as being counterproductive in providing incentives to pursue education. The low returns to education observed in Denmark help to explain the disconnect between the egalitarian childhood policies in Denmark and the roughly equal levels of educational mobility in Denmark and the US. The sorting of families into neighborhoods and schools by levels of parental advantage is likely to be another contributing factor. While the Danish welfare state might mitigate some childhood inequalities, substantial skill gaps still remain.
While patterns of educational attainment are similar across the two countries, the relationships linking skills and income differ greatly. The IGE estimates of income mobility, used as evidence for Scandinavia's high social mobility, are very sensitive to the choice of income measure analyzed. Using total income potential excluding public transfers as a measure of income, there are fewer differences between estimated IGEs for Denmark and the US than previously portrayed. Considering wage earnings or wage earnings plus public transfers, average income mobility is higher in Denmark than in the US. We find evidence of strong non-linearities in measures of intergenerational income mobility. Differences in Danish-US income mobility favor Denmark (i.e., produce lower local IGEs) at higher levels of parental income and at very low levels of parental income. The education-family background gradients are also non-linear in both countries but do not favor Denmark at either tail of the parental income distribution. This paper sends a cautionary note to the many enthusiasts endorsing the Scandinavian welfare state. We make no statements about the optimality and fairness of the US and Danish systems from a philosophical or social choice point of view. The Danish welfare state clearly boosts the cognitive test scores of disadvantaged children compared to their US counterparts. However, test scores are not the whole story, or even the main story of child success, despite the emphasis on them in popular discussions. Moreover, substantial gaps in test scores remain across social groups within Denmark. Differences in income mobility between Denmark and the US also arise from wage compression in the Danish labor market, the progressivity of the Danish tax-transfer system, and the increasing college premium in the US and the rise in inequality there. These factors drive the higher population average income mobility in Denmark and equalize post-tax consumption possibilities. They also discourage educational attainment in Denmark. Along with neighborhood sorting, they explain the similarity in the influence of family background on educational attainment in the two countries.
The US excels in incentivizing educational attainment. The Danish welfare state promotes cognitive skills for disadvantaged children. Policies that combine the best features of each system would appear to have the greatest benefit for promoting intergenerational mobility in terms of both income and educational attainment.
Fig. A2. Test score gaps in Denmark at age 7 and age 12, by parental permanent gross income
Notes: The figure shows deviations of SDQ scores (in panels (a) and (b)), CHIPS scores (cognitive test in panel (c)), and a language test score (in panel (d)) relative to the sample mean by $4,000 bins of parental permanent gross income including transfers. Scores have been standardized to mean 0 and standard deviation 1. A higher score of SDQ implies greater socio-emotional difficulties. A higher score in the CHIPS and language test implies better cognitive and language skills.
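The construction described in the notes to Fig. A2 (standardize scores to mean 0 and standard deviation 1, then average them within $4,000 bins of parental income) can be sketched in Python as follows. This is an illustrative sketch rather than the code behind the figure: the function name `binned_score_gaps`, the use of the population standard deviation, the labeling of bins by their lower edge, and the toy data are all assumptions.

```python
import numpy as np

def binned_score_gaps(income, score, bin_width=4000.0):
    """Mean standardized score by income bin, keyed by the lower edge of each bin."""
    income = np.asarray(income, dtype=float)
    score = np.asarray(score, dtype=float)
    z = (score - score.mean()) / score.std()   # mean 0, SD 1 (population SD is an assumption)
    bins = np.floor(income / bin_width).astype(int)
    return {float(b * bin_width): float(z[bins == b].mean()) for b in np.unique(bins)}

# Toy data: four children, two income bins
income = [1000.0, 3000.0, 5000.0, 7000.0]
score = [1.0, 2.0, 3.0, 4.0]
gaps = binned_score_gaps(income, score)
```

Because the scores are standardized, each bin mean is read directly in standard deviation units, so a difference of 0.5 between the lowest and highest bins corresponds to the gap of about 0.5 of a standard deviation reported in the text.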