The intergenerational transmission of language skill

Abstract This paper examines the relationship between parents’ and children's language skills for a nationally representative birth cohort born in the United Kingdom—the Millennium Cohort Study (MCS). We investigate both socioeconomic and ethnic differentials in children's vocabulary scores and the role of differences in parents’ vocabulary scores in accounting for these. We find large vocabulary gaps between highly educated and less educated parents, and between ethnic groups. Nevertheless, socioeconomic and ethnic gaps in vocabulary scores are far wider among the parents than among their children. Parental vocabulary is a powerful mediator of inequalities in offspring's vocabulary scores at age 14, and also a powerful driver of change in language skills between the ages of five and 14. Once we account for parental vocabulary, no ethnic minority group of young people has a negative “vocabulary gap” compared to whites.

Measures of linguistic attainment have rarely been used in empirical operationalizations of Bourdieu's theory of cultural reproduction (Sullivan, 2001). Bernstein also placed substantial weight on the differences in language use by middle and working-class children, arguing that this affected their ability to succeed at school (Bernstein, 1971, 1973, 1975). Language knowledge is clearly an important prerequisite for school learning, and language difficulties have been linked to a range of adverse outcomes (Law, Rush, Schoon, & Parsons, 2009). However, most empirical studies have neglected the role of parental language skills. This means we do not know to what extent parental language skills are transmitted to the child, or how important parental language skills are in explaining socioeconomic gaps in children's language skills.
Perhaps the most widely cited empirical source on the question of the relationship between parents' and children's vocabularies is Hart and Risley's (1995) observational study. Hart and Risley's research on 42 families in one U.S. college town found strong social class and black-white differences in the range of vocabulary used by parents when talking to their children. Their headline finding that "professional class" children had been exposed to 30 million more words than "welfare children" had by age three (Hart & Risley, 2003) has been enormously influential, despite the drawback of a small and unrepresentative sample. (To be clear, the 30 million figure does not refer to unique words, but the total barrage of speech to which the children were exposed, including repetition). The study is contested (Golinkoff, Hoff, Rowe, Tamis-LeMonda, & Hirsh-Pasek, 2018; Sperry, Sperry, & Miller, 2018), but important, given the lack of evidence in this field. We are aware of only one previous large quantitative study which assesses the role of parental vocabulary in shaping class and race differences in children's vocabularies (Farkas & Beron, 2004).

| The current study
We exploit unique new data on the vocabulary scores of both parents and children in a nationally representative UK birth cohort study. A distinctive feature of the study is that mothers, partners, and children took an equivalent vocabulary test when the children were aged 14. This allows us to build on the existing evidence base in a number of important ways. First, we are able to establish the vocabulary gaps that exist for both parents and children according to social class and ethnic group using a nationally representative birth cohort study. Second, we address the extent to which socioeconomic and ethnic gaps in children's scores at age 14 are driven by differences in the parents' scores. Third, we assess the extent to which the roles of the home environment and the child's own cultural capital are reduced once parental vocabulary is taken into account. Finally, given that language scores on school entry are a strong predictor of later language acquisition (Duncan et al., 2007), we assess whether parental vocabulary is associated with a growing language gap for children between the ages of five and 14.
The vast majority of the literature examining the relationship between parental and child vocabulary is focused on the early years, possibly because it is during this period that the greatest challenges are met (Cartmill et al., 2013; Fernald, Marchman, & Weisleder, 2013; Rowe, 2012). Yet social class differentials in vocabulary continue to grow during adolescence and even into mid-life (Sullivan & Brown, 2015a, 2015b).

| What explains socioeconomic language gaps?
From the perspective of cultural reproduction theory, the ability to understand and use "educated" language is a vital part of the advantage that is transmitted by high-status parents (Bourdieu, 1977). In this sense, language can be seen as part of the cultural capital that is transmitted within the home. But there is an ambiguity within Bourdieu's work regarding the role of language: is language a fundamental building block of learning, or is it simply a signal of class membership, which is arbitrarily rewarded by the education system? Of course, these are not mutually exclusive. It may be the case both that building vocabulary is vital for learning across subjects, and that educators sometimes arbitrarily reward styles of expression associated with elite groups. However, our focus is on language as a tool for communication and learning, and hence on quantitative differentials in vocabulary rather than on qualitative differences in styles of expression.
The concept of cultural capital has been operationalized in diverse ways. A useful distinction has been drawn between "status-seeking" and "information processing" forms of cultural capital (Ganzeboom, 1982). Information processing cultural capital leads to the development of knowledge and skills which are rewarded in the education system (Sullivan, 2002). Status-seeking cultural capital is rewarded via teacher bias rather than improved skills (Farkas, Grobe, Sheehan, & Shuan, 1990; Jaeger & Møllegaard, 2017). We prefer the terms "literary cultural capital" and "non-literary cultural capital" to refer to cultural activities that relate to books and reading and those that do not. Studies that have separated the two have found that literary cultural capital has more influence on educational attainment (De Graaf, De Graaf, & Kraaykamp, 2000; Jaeger, 2011; Sullivan, 2001).
Psychologists have also stressed the importance of the home literacy environment to children's language learning (Melhuish et al., 2008; Waldfogel & Washbrook, 2011). Three aspects of parenting have been highlighted as central to children's early language and learning (Rodriguez et al., 2009): (a) frequency of children's participation in routine learning activities (e.g., shared book reading, storytelling); (b) the quality of caregiver-child engagements (e.g., parents' cognitive stimulation and sensitivity/responsiveness); and (c) the provision of age-appropriate learning materials (e.g., books and toys). Studies have found substantial socioeconomic differentials in these parental inputs (Bassok, Finch, Lee, Reardon, & Waldfogel, 2016). However, studies assessing the role of the home literacy environment have not accounted for the role of parental language skills. This is important because it is likely both that parents who have strong language skills will be most comfortable engaging in activities such as shared reading, and also that they may be more effective at engaging their children in these activities than, for example, a parent who struggles with basic literacy skills (Sullivan et al., 2013). Evidence suggests that assortative mating according to verbal cognitive scores is notably higher than for non-verbal scores (Plomin & Deary, 2015).
Regardless of the theoretical perspective applied, empirical findings across the disciplines highlight the importance of the home literary climate. Whether books in the home are termed "embodied cultural capital" or "learning materials," they remain a powerful predictor of educational outcomes (Marks, Cresswell, & Ainley, 2006).
Both theoretical perspectives have merit, as books in the home are learning materials, but also reflect the value placed upon books and learning within the family, and represent a cultural display, signaling that the owner is a cultured person. Parental reading to children (Bus, Van Ijzendoorn, & Pellegrini, 1995) and children's own reading (Stanovich & Cunningham, 1998; Sullivan & Brown, 2015a) are powerful predictors of both language learning and wider educational outcomes.

| Ethnic differentials
Findings regarding ethnic gaps in both cognitive and educational attainment vary widely according to the minority groups considered and the particular national context (Alba & Waters, 2011). Much of the sociological literature relates to the U.S. context, where black-white test score gaps form in early childhood and widen during the school years (Jencks & Phillips, 1998; Quinn, 2015b). It is, therefore, important to be clear about the differences between the US and UK contexts. Waters et al. (2013) provide a useful overview of the main immigrant groups and differences in the United Kingdom and United States experience. Key points of difference are that black people in the United Kingdom, especially the black Caribbean group, have high levels of intermarriage and residential integration with whites compared to black people in the United States. In contrast, while Pakistani and Bangladeshi immigration to the United Kingdom began in the 1960s and 1970s, the widespread practice of transnational arranged cousin marriage maintains an ongoing "first generation" for many families, with associated lower tendency to speak English at home (Sullivan, 2010). In the United Kingdom, children of Pakistani and Bangladeshi ethnic backgrounds start at a disadvantage, but make greater progress during schooling than whites (Hoffmann, 2018; Strand, 2011; Sullivan et al., 2013), and white and black Caribbean working-class students are generally the lowest achievers at school (Strand, 2014). UK ethnic minorities as a whole are more likely than whites to gain a university degree (Modood, 2004), and all ethnic minority groups in the United Kingdom are more likely to enter university than their white peers with similar prior attainment (Belsky, Barnes, & Melhuish, 2007). Theories regarding social class differences in the transmission of educational advantages and disadvantages cannot simply be mapped onto ethnic differences. Particularly in the case of immigrant groups, the educational fates of children may not be as strongly tied to the parents' current status as class-based theories would lead us to predict.

| RESEARCH QUESTIONS
While researchers from a range of theoretical and disciplinary perspectives have emphasized the importance of inequalities in language development, a number of important empirical questions remain unanswered.
1. How large are vocabulary gaps according to childhood socioeconomic circumstances, ethnic group, and other factors?
2. What is the role of the home literacy culture in predicting child vocabulary and explaining SES gaps? We hypothesize that the home literacy culture predicts child vocabulary, net of socioeconomic background.
3. What is the role of the child's own cultural capital in predicting vocabulary and explaining SES gaps? We assess the roles of reading for pleasure (literary cultural capital) and playing an instrument (non-literary cultural capital). We hypothesise that reading for pleasure is an important predictor of vocabulary, whereas playing an instrument is less important.
4. How important are the mother's and partner's vocabulary in predicting the child's vocabulary, and does this substantially mediate SES and other differentials in the model? We hypothesise that parental vocabulary is of primary importance as a predictor of offspring's vocabulary, and mediates SES differences.
The initial response rate was 72% of all families with eligible children living at nine months in the sampled wards (Plewis, Calderwood, Hawkes, Hughes, & Joshi, 2007). There have been six waves of data collection, at ages 9 months and 3, 5, 7, 11, and 14 years. The seventh, age 17 wave, is in the field at the time of writing. The study is multi-disciplinary and contains rich repeat measures of childhood socioeconomic circumstances, child development, and child health. The MCS datasets are freely available to researchers internationally via the UK Data Service (http://ukdataservice.ac.uk). The CLS website provides detailed information and documentation on the study (http://www.cls.ioe.ac.uk/mcs).
A total of 11,714 households responded at the sixth wave of data collection (MCS6). This represents a response rate of 76% of those issued to the field at sweep 6 and just under 61% of the initial sample. Of these, cohort members in 10,781 households completed the vocabulary test. This group is our analytical sample. We use records for only one child per family (singletons and the first-born twin or triplets) to avoid having to account for the clustering of children within families. We exploit data provided by the "main respondent" parent (this is typically the mother, and we refer henceforth in the text to mothers rather than main respondents), the spouse or cohabiting partner (where applicable), and the child themselves, up to age 14. More information on data collection and attrition is available in the MCS6 technical report (Ipsos Mori, 2017).
We exploit data from birth to age 14, and, as in any longitudinal analysis, the problem of missing data must be addressed (Mostafa & Wiggins, 2015). It is well known that list-wise deletion/complete case analysis returns biased estimates, so we use multiple imputation with chained equations (25 imputed datasets) to "fill-in" values of any missing items in the variables selected for our analysis adopting Schafer's data augmentation approach (Schafer, 1997) under the assumption of "missing at random" (MAR). In order to maximise the plausibility of the MAR assumption we also include a set of auxiliary variables in our imputation model. In this instance, MAR implies that our estimates are valid if missingness is due to variables (auxiliary or substantive) that were included in our models (Little & Rubin, 2002). In addition, to take account of disproportionate, stratified clustering in the MCS sample design and attrition, models are adjusted for non-response and the MCS survey design. The combination of multiple imputation and non-response weighting restores the sample to be nationally representative of the UK population born in 2000-2001 (Fitzsimons, 2017). We impute the full sample, but delete cases for which the outcome is missing (Hippel & Paul, 2007).
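Estimates from multiply imputed datasets are conventionally combined using Rubin's rules. The text does not describe the pooling step, so the following is only a minimal sketch of that standard procedure for a single coefficient. The function name `pool_rubin` and the toy estimates are hypothetical and are not taken from the study.

```python
import math
from statistics import mean, variance

def pool_rubin(estimates, std_errors):
    """Combine one coefficient across m imputed datasets using Rubin's rules."""
    m = len(estimates)
    q_bar = mean(estimates)                      # pooled point estimate
    within = mean(se ** 2 for se in std_errors)  # average within-imputation variance
    between = variance(estimates)                # between-imputation variance (n - 1 denominator)
    total = within + (1 + 1 / m) * between       # total variance of the pooled estimate
    return q_bar, math.sqrt(total)

# Hypothetical coefficient estimates from five imputed datasets
# (the study used 25; these numbers are invented for illustration).
est = [0.20, 0.22, 0.18, 0.21, 0.19]
se = [0.05, 0.05, 0.05, 0.05, 0.05]
pooled_est, pooled_se = pool_rubin(est, se)
```

The pooled standard error exceeds the average within-imputation standard error whenever the estimates differ across imputations, which is how the uncertainty due to missing data enters the final inference.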

| Language skill
The mother, partner, and child's vocabulary scores were assessed when the cohort member was aged 14. Vocabulary is strongly associated with other dimensions of verbal ability (Baddeley, Logie, Nimmo-Smith, & Brereton, 1985).
The vocabulary scores were derived from a shortened version of the Applied Psychology Unit (APU) Vocabulary Test, a standardised test produced by the University of Edinburgh (Closs & Hutchings, 1976), and used in previous studies including the 1970 British Cohort Study (BCS70). The APU Vocabulary Test directly examines vocabulary knowledge, through multiple-choice items in which a stimulus word has to be matched to a synonym from five alternatives. At the start of the test the stimulus words are very easy (for example, "begin") and become progressively more difficult (for example, "pusillanimous"). The final score is the sum of the correct answers, from a total of 20 multiple-choice items. The APU test has previously been shown to have good psychometric properties and is highly correlated with other tests of verbal intelligence (Levy & Goldstein, 1984). We provide information on the internal reliability and distribution of the vocabulary scores for each respondent in the Appendix. Although the test was developed for teenagers, within our sample, the internal reliability scores are higher for the adult respondents.
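The scoring rule described above, and one common index of internal reliability, can be sketched as follows. The text does not name the reliability statistic reported in the Appendix; Cronbach's alpha is used here purely as an assumption. The function names and the four-item toy responses are hypothetical.

```python
from statistics import pvariance

def total_score(item_correct):
    """Sum of correct answers (1 = correct, 0 = incorrect) across the items."""
    return sum(item_correct)

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents' 0/1 item vectors."""
    k = len(responses[0])  # number of items
    item_vars = [pvariance([r[i] for r in responses]) for i in range(k)]
    total_var = pvariance([total_score(r) for r in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Invented responses to a four-item test (the real test has 20 items).
toy_responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
toy_scores = [total_score(r) for r in toy_responses]
toy_alpha = cronbach_alpha(toy_responses)
```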

| Socioeconomic and demographic factors
The socioeconomic and demographic information in our models includes the age (in months), sex and ethnic group of the child, and the region of the United Kingdom that the family lives in. Parents' education is the highest [...] (Goldthorpe & McKnight, 2006), home ownership, and log equivalised family income. The number of older and younger siblings is included, as older siblings have been shown to be advantaged both in terms of vocabulary and general cognitive outcomes (Black, Devereux, & Salvanes, 2005; Hoff-Ginsberg, 1998; Nisbet, 1953). Whether English is the main language spoken at home at wave 1 of the survey (or wave 2 where unavailable at wave 1) is included, as this may be related to both parental and child vocabulary scores. We include the ages of the mother, partner, and child in the model, as vocabulary is expected to increase with age, especially in the case of the child. In addition, the models control for single-parent household status at wave 6 (in 2015), and, if the mother's partner was present, whether they completed the vocabulary test or not.
TABLE 1 Mean vocabulary scores for young people, mothers, and partners (imputed and weighted)

| Home literary climate
The home literary climate is captured by books in the home and the frequency of parental reading to the child at age three.

| Child's cultural capital
The child's cultural capital is captured by the child's own reading frequency and playing a musical instrument at age 11 (self-reported).

| Early cognitive scores
Cognitive abilities at age five were measured using three subscales of the British Ability Scales Second Edition (BAS II): naming vocabulary, picture similarities, and pattern construction. The three subscales capture core aspects of verbal and pictorial reasoning, and spatial abilities (Elliott, Murray, & Pearson, 1978; Elliott, Smith, & McCulloch, 1996; Hill, 2005; Jones & Schoon, 2008).

| RESULTS
We begin by describing mean parental and child vocabulary scores according to the other variables to be used in our models and assessing the correlations between cognitive measures. This is followed by a series of linear regression models, with child vocabulary at age 14 as the outcome. Finally, we present a path analysis as a formal test of the mediation of the effect of parental education on child vocabulary by parental vocabulary and other factors.
Table 1 presents raw (imputed and weighted) mean scores out of 20 in the vocabulary assessment, by respondent type (young person, mother, and partner). Young people achieved a mean score of seven out of 20, while mothers and partners gained substantially higher scores (10 and 11, respectively). The standard deviation is also higher for the parents than for the child, reflecting a wider spread of scores.

| Descriptive results
We observe stark ethnic differences (based on the young person's ethnic identification) in adult vocabulary scores. The parents of white and ethnically mixed young people had the highest mean scores (between 10.6 and 12.2), around two and a half times those of the parents of ethnically Bangladeshi young people, who received the lowest mean scores (between 4.4 and 4.5). These differentials to some extent reflect the prevalence of first-generation immigrants among each ethnic group; for example, over 90% of Bangladeshi mothers were not born in the United Kingdom (Sullivan, 2010). As such, we would emphasize that minority parents' lower vocabulary scores are likely to reflect a lack of English fluency, rather than wider ability or attainment. The ethnic gaps in the young people's scores are far more modest, ranging between a mean score of 6.2 for Pakistanis and 7 for ethnically mixed and white young people. This means that ethnic minority youth (excepting the mixed group) tend to have vocabulary scores that are relatively close to those of their parents, and in the case of Bangladeshis and Pakistanis, they achieve higher average scores than their parents. Regional differences in adult vocabulary scores are also apparent, with those living in London scoring lowest, reflecting the city's ethnically diverse population.
There are strong gradients in parental vocabulary scores according to parental education, social class, and income. Among households where at least one partner had a higher university degree, mothers scored an average of 15 out of 20, compared to 6.5 for households where neither parent had any formal educational qualification.
The education gaps are less marked for the offspring than for the parents. The children of university graduates scored 8.6 versus 5.8 for children in households with no qualifications. For families with no parental qualifications, the mean score for children (5.8) is only around one correct answer less than for mothers (6.5). Children of parents with a higher degree scored an average of 8.6, compared to 15 for their mothers. A similar pattern is observed for social class, household income, and home ownership: parental socioeconomic vocabulary gaps are larger than those for young people.
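The contrast between the two generations can be made concrete with the means quoted above. This is a simple worked calculation on the reported figures; the variable names are illustrative only.

```python
# Mean scores (out of 20) quoted in the text, by parental education.
mothers = {"higher_degree": 15.0, "no_qualifications": 6.5}
children = {"higher_degree": 8.6, "no_qualifications": 5.8}

# Education gap within each generation, in raw test items.
mother_gap = mothers["higher_degree"] - mothers["no_qualifications"]
child_gap = children["higher_degree"] - children["no_qualifications"]

# How many times larger the mothers' gap is than the children's gap.
gap_ratio = mother_gap / child_gap

print(round(mother_gap, 1), round(child_gap, 1), round(gap_ratio, 1))
# prints: 8.5 2.8 3.0
```

On these figures the education gap among mothers is roughly three times the gap among their children.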
Not surprisingly, adult English vocabulary is considerably lower among those whose home language is mixed or non-English, compared to English only (6.7 and 10.9, respectively for mothers), but the difference among young people is negligible (6.6 vs. 6.9).
Turning to indicators of cultural resources, we see that both parents' and children's vocabulary scores are higher in households with more books at home and more frequent reading to the child. Young people who read frequently have relatively high mean vocabulary scores, whereas their parents' scores are less strongly differentiated according to this measure. The vocabulary gap between young people who play a musical instrument and those who do not is small (7.4 vs. 6.6).
Table 2 shows a correlation matrix of the young person, mother and partner vocabulary scores, and the child's early cognitive scores. The mother and partner scores are highly correlated, at around 0.5. This is in line with previous estimates of assortative mating for verbal intelligence (Plomin & Deary, 2015). Correlations of around 0.3 are observed between the young person and mother/partner. We also see higher correlations between earlier verbal cognition and age 14 vocabulary (0.36) than between early measures of spatial and pictorial reasoning and later vocabulary (0.25 and 0.20, respectively).
Table 3 shows a series of models predicting vocabulary scores at age 14. The outcome variable, parental vocabulary scores, and the child's prior cognitive scores are all standardized z-scores.
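Standardization to z-scores expresses each score as a distance from the mean in standard deviation units, so that coefficients on different tests are comparable. A minimal sketch follows. The use of the population standard deviation and the unweighted mean are assumptions, since the text does not specify them, and the toy scores are invented.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Rescale values to mean 0 and standard deviation 1."""
    mu = mean(values)
    sd = pstdev(values)  # population SD; an assumption, not stated in the text
    return [(v - mu) / sd for v in values]

# Invented raw vocabulary scores out of 20.
raw_scores = [5, 7, 9]
standardized = z_scores(raw_scores)
```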

| Regression results
Model 1 includes socioeconomic and demographic information and provides an indication of the magnitude of the associations between these variables and the child's vocabulary scores before any potential mediating factors have been accounted for. Parental education is strongly linked to the child's vocabulary. Having an undergraduate (bachelor's) university degree or a higher (postgraduate) degree (compared to no qualifications) provides roughly three times the advantage associated with having a parent with a higher managerial or professional occupation (compared to a routine occupation) when both are included in the same model. Income and home ownership are not significantly associated with vocabulary, taking the other factors in the model into account. Young people who are identified as ethnically Indian, Pakistani, black Caribbean, black African or "other" have lower scores than whites. There is no difference between boys and girls. As expected, young people with older siblings have a significant disadvantage in vocabulary scores, while the existence and number of younger siblings make little difference. The country of the United Kingdom is included, with London split from the rest of England, as educational policy and practice vary across these regions, and ethnic compositions vary widely. Living in Scotland or Wales is associated with a disadvantage, and London with an advantage, in vocabulary once individual characteristics are controlled. Both maternal age (in years) and child age (in months) are positively associated with the child's vocabulary score. The age range of the MCS births extended over a full calendar year. The coefficient associated with a month in age (0.01) can, therefore, usefully be compared to other coefficients in the model. For example, having a parent with an undergraduate degree is associated with five times the vocabulary advantage associated with one year in age and a parent with a postgraduate degree is associated with over six times the advantage of a year in age.
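The age benchmark can be unpacked arithmetically. The sketch below converts the reported monthly coefficient to a yearly one and derives the approximate coefficient sizes implied by the "five times" and "over six times" comparisons. These implied values are back-of-envelope figures inferred from the rounded numbers in the text, not coefficients read from Table 3, and the variable names are illustrative.

```python
# Reported coefficient for one month of child age, in standard deviation units.
coef_per_month = 0.01

# A year in age is twelve monthly increments.
coef_per_year = coef_per_month * 12

# Approximate sizes implied by the comparisons in the text (assumed, rounded).
implied_undergraduate = 5 * coef_per_year    # "five times" a year in age
implied_postgraduate_min = 6 * coef_per_year  # lower bound for "over six times"
```

On this reading, a year in age corresponds to about 0.12 standard deviations, and a parental undergraduate degree to roughly 0.6 standard deviations in Model 1.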

TABLE 2 Correlation matrix: Young person's, main and partner's vocabulary and young person's early cognition
Model 2 introduces family cultural resources, in the form of books in the home and reading to the child at age three. Both of these variables are positive predictors of the young person's vocabulary. In this model, having over 500 books in the home is associated with a similar vocabulary advantage to having a parent with a postgraduate degree, and equates to around five times the advantage attributable to a year in age. The introduction of family cultural resources into the model mediates the parental education and social class effects to some extent. Model 3 includes the child's own reading for pleasure and playing a musical instrument. Playing a musical instrument is associated with a positive difference in vocabulary equivalent to ten months in age. Reading for pleasure most days is associated with a differential over three times greater than the differential attributable to one year in age. However, these child activities do little to reduce the effects of parental education, social class, and cultural capital.
Note to TABLE 3: Multiple imputation was applied to all missing data, including absent partners in single family households at MCS6. ***p < .001; **p < .01; *p < .05; +p < .10.
In model 4, we introduce the mother's and partner's vocabulary scores. Both, especially the mother's, are strongly independently associated with the child's score. Including parental vocabulary reduces the apparent influence of parental education, reducing the coefficients for a degree and higher degree by about half, and reducing lower levels of education to statistical insignificance. Social class also becomes statistically non-significant in this model. The coefficients for the home literary climate are also substantially reduced, but the association with the child's own cultural activities is unaffected. While we treat parental vocabulary as a mediator of parental education, we acknowledge that a large portion of the gap in parental vocabulary is likely to be in place prior to parents gaining their highest educational qualification, and the fact that our measure of parental vocabulary is not time varying means that we cannot unpack the reciprocal relationship between parental vocabulary and educational attainment over time in this paper. Our interpretation of parental vocabulary as a mediator of parental educational attainment simply means that part of the association between parental education and child vocabulary is explained by the higher vocabulary scores of more educated parents.
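The logic of this interpretation can be illustrated with a toy example: the total association between parental education and child vocabulary is compared with the association that remains once parental vocabulary is added to the model. This is a deliberately simplified stand-in, not the model estimated in the paper. The data are invented, the function names are hypothetical, and a single binary education indicator replaces the full set of covariates.

```python
from statistics import mean

def _cross(u, v):
    """Sum of cross-products of deviations from the means of u and v."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))

def slope(x, y):
    """OLS slope from regressing y on a single predictor x."""
    return _cross(x, y) / _cross(x, x)

def slopes_two_predictors(x, m, y):
    """OLS slopes for x and m from regressing y on both predictors."""
    sxx, smm, sxm = _cross(x, x), _cross(m, m), _cross(x, m)
    sxy, smy = _cross(x, y), _cross(m, y)
    det = sxx * smm - sxm ** 2
    return (smm * sxy - sxm * smy) / det, (sxx * smy - sxm * sxy) / det

# Invented data: education (0/1), parental and child vocabulary (z-scores).
education = [0, 0, 0, 0, 1, 1, 1, 1]
parent_vocab = [-1, 0, 0, 1, 0, 1, 1, 2]
child_vocab = [-0.4, -0.1, -0.1, 0.6, 0.3, 0.6, 0.6, 1.3]

total = slope(education, child_vocab)    # association without the mediator
a = slope(education, parent_vocab)       # education -> parental vocabulary
direct, b = slopes_two_predictors(education, parent_vocab, child_vocab)
indirect = a * b                         # part carried by parental vocabulary
share_mediated = indirect / total
```

In ordinary least squares the indirect component equals the drop in the education coefficient when the mediator is added, so `indirect` and `total - direct` coincide in this example.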
All ethnic differences become small and non-significant in this model, with the exception that the Bangladeshi and Pakistani groups have a substantial advantage over whites once parental vocabulary is controlled. This [...] to the child at age three become non-significant, their effect being fully captured by the child's cognitive scores at age five. The associations with books in the home and the child's own reading for pleasure are only slightly reduced, as these variables remain powerfully predictive of vocabulary at 14 conditioning on earlier cognitive scores.
As an indication of effect size, we converted the coefficients in this final model into raw test score units.
Notable coefficients are as follows: a one standard deviation increase in verbal cognition at age five is associated with a 0.5 word increase in mean vocabulary scores (out of 20); a one standard deviation increase in maternal vocabulary is associated with an advantage of 0.4 words; a one standard deviation increase in partner's vocabulary equates to 0.3 words; more than 500 books in the home equates to 0.7 words; Bangladeshi ethnicity equates to 0.7 words; a non-English language at home equates to 0.6 words, and reading for pleasure most days at age 11 equates to 0.8 words.
In Table 4 and Figure 1, we provide a formal mediation analysis, based on a simplified path analysis version of the penultimate model (model 4), carried out in MPlus (Muthén & Muthén, 1998

| CONCLUSIONS
Our central result is that parental vocabulary scores mediate a substantial share of the socioeconomic gradient in children's vocabulary at age 14. The importance of parental vocabulary is not surprising, but suggests that both the "cultural capital" and "home learning environment" literature have neglected a fundamental element of the learning resources that children have available at home: their parents' own knowledge.
The paper advances the field of research into socioeconomic differentials in young people's language skills by providing evidence on the "word-gap" based on large-scale, nationally representative data on both parents' and children's vocabulary scores. The raw inequalities that we find in parental vocabulary are startling. For example, parents with an undergraduate degree knew twice as many words on the assessment as parents with no qualifications. Though of course not directly comparable with Hart and Risley's (1995) small-scale study, which was carried out in another place and time, and using different methods, this difference is in line with the order of magnitude of the "word gap" found in Hart and Risley's work.
However, whereas Hart and Risley found similar social class differences in vocabulary for children as for their parents, we do not. The socioeconomic differentials that we found for young people at age 14 were marked, but substantially more modest than those found among their parents. Similarly, vocabulary gaps between ethnic groups were substantial in the parents' generation, but slight for the children. This gives some grounds for optimism, in that socioeconomic differentials in vocabulary are not transmitted wholesale from parents to children. Children are exposed to vocabulary, not just from their parents, but from a range of sources including friends, teachers, books, TV, and the internet. Some of these wider exposures may mitigate the relationship between parental and child vocabulary. In particular, it is likely that schooling plays a role (Quinn, 2015a). However, it is of course possible that vocabulary gaps will widen substantially during the cohort members' life course.
Our models of children's vocabulary at age 14 show that parental education appears to be a more consistent driver than other aspects of socioeconomic position of differentials in children's vocabulary scores. The differentials due to parental education were somewhat reduced by accounting for the home literary climate. In contrast, the child's own cultural activities, particularly reading, matter but do not mediate the differential due to parental education. This challenges the traditional cultural reproduction framework, to the extent that the child's own cultural participation appears to have little to do with the reproduction of socioeconomic differentials in attainment.
We have shown that parental vocabulary is a vital mediator of differentials in children's vocabulary according to parental education, and parental vocabulary also partly explains the apparent link between the home literary environment and children's vocabularies. This suggests that the omission of parental vocabulary from most previous models of children's language development, and indeed of their educational development more generally, may have led to a skewed and incomplete understanding of inequalities in children's outcomes, exaggerating the role of parental resources and behaviors which proxy parental language competencies (and may well also proxy other associated cognitive abilities). Furthermore, parental vocabulary strongly predicts language at 14, conditioning on cognitive scores at age five, suggesting that its influence is not restricted to early childhood development.
There is an extensive international literature on ethnic differentials in vocabulary and other cognitive test scores (Belsky et al., 2007), and our findings on the role of parental vocabulary in accounting for ethnic differentials in children's vocabulary provide a novel insight. We found that some groups of ethnic minority parents had substantially lower vocabulary scores than whites, and Pakistani and Bangladeshi parents had lower English vocabulary scores than their children. Ethnic gaps among the children's generation were smaller, but, controlling for socioeconomic and demographic factors, the Indian, Pakistani, black Caribbean, black African, and "other" ethnic groups remained at a disadvantage in their vocabulary scores compared to their white peers. These remaining negative differentials in young people's vocabulary between some minority ethnic groups and whites were fully explained by differences in parental vocabulary. Our analysis also suggests that speaking a language other than English in the home is generally positive, once other factors are controlled (Marian & Shook, 2012; Portes & Rumbaut, 2001). There is great diversity among families with English as an additional language, and elaborating on this would demand an analysis focused primarily on that question. Nevertheless, we can conclude that poor English language skills among parents present an obstacle for children, but this does not imply that the presence of an additional language in the home is detrimental in itself.
Despite the several strengths of our study, we acknowledge some limitations. First, attrition from the study since the baseline is just under 40%. We use multiple imputation to address this. Whilst it is difficult to know the extent to which residual unobserved factors may affect attrition, controlling for an extensive range of observables is likely to mitigate the issue.
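For readers unfamiliar with multiple imputation, the sketch below shows the standard way of combining estimates across imputed datasets (Rubin's rules): the pooled estimate is the mean of the per-dataset estimates, and its variance adds the average within-imputation variance to an inflated between-imputation variance. The text does not state how pooling was done here, so this is an assumption, and the five coefficients and variances are hypothetical numbers rather than results from this study.

```python
from math import sqrt


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets using Rubin's rules.

    estimates: the coefficient estimated in each imputed dataset
    variances: the squared standard error from each imputed dataset
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    pooled = sum(estimates) / m
    within = sum(variances) / m
    between = sum((q - pooled) ** 2 for q in estimates) / (m - 1)
    total_variance = within + (1 + 1 / m) * between
    return pooled, sqrt(total_variance)


# Hypothetical coefficients from five imputed datasets (illustration only).
coefs = [0.30, 0.34, 0.32, 0.28, 0.36]
ses_squared = [0.0025] * 5

estimate, se = pool_rubin(coefs, ses_squared)
print(f"pooled estimate = {estimate:.3f}, pooled SE = {se:.3f}")
```

The between-imputation term is what carries the extra uncertainty due to missing data: the more the estimates disagree across imputed datasets, the larger the pooled standard error.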
A second limitation is that we only have parental vocabulary measured at one time-point, when the young person is aged 14, and for some parents, particularly those from immigrant groups, this may not accurately reflect their vocabulary earlier in the child's life.
A third limitation of this paper is that we are only able to report on the intergenerational transmission of vocabulary. A full assessment of the role of language in the process of "cultural reproduction" would require an assessment of later educational attainment and occupational outcomes. We intend to investigate these in future work.
A fourth limitation is that we are unable to address genetic heritability (Plomin & Deary, 2015). Evidence from the Dunedin cohort (Belsky et al., 2016) suggests that children born into socially disadvantaged families tend to have slightly below average polygenic scores for educational attainment and that these scores predict cognitive, educational, and socioeconomic attainment. This is beyond the scope of the current study, but future studies will be able to exploit the fact that the age 14 wave of MCS collected saliva from children and parents for subsequent DNA extraction.
From a theoretical point of view, our findings support the view that language skills are an important part of the resources that more privileged parents possess and are able, to some degree, to transmit to their children. This offers partial support for a Bourdieusian "cultural reproduction" perspective, yet the process is far from deterministic. We also find some support for Modood's (2004) notion of "ethnic capital" overcoming a lack of traditional cultural capital, notably in the case of ethnically Bangladeshi and Pakistani children, whose English vocabulary scores are higher than would be expected given their parents' low scores.
In policy terms, our findings should temper the tendency of some political commentators to blame socioeconomic differences in learning on deficits in working-class parenting behaviors (Field, 2010; Telegraph, 2010). Our results suggest that children whose parents are less educated, and those from particular ethnic minority groups, may require additional input at school to support the development of a rich vocabulary. Encouraging independent reading is likely to be a useful tool in this regard, though we acknowledge that more research is needed to unpack the direction of causality between reading and cognitive, behavioral, and emotional development. In the case of immigrant parents who lack English fluency, support for their English language development is likely to benefit their children, and this issue has increased in salience given the dramatic rise in the proportion of births to non-UK women since the millennium.
Finally, our findings emphasize the value of including measures of parental cognitive skills, including language skills, in birth cohort studies and other datasets, both as an important explanatory variable and as a vital control variable.
More research is needed internationally to examine whether the role of parental vocabulary varies across national contexts.