SES differences in language processing skill and vocabulary are evident at 18 months


Address for correspondence: Anne Fernald, Department of Psychology, Stanford University, Stanford, CA 94305, USA; e-mail:


This research revealed both similarities and striking differences in early language proficiency among infants from a broad range of advantaged and disadvantaged families. English-learning infants (n = 48) were followed longitudinally from 18 to 24 months, using real-time measures of spoken language processing. The first goal was to track developmental changes in processing efficiency in relation to vocabulary learning in this diverse sample. The second goal was to examine differences in these crucial aspects of early language development in relation to family socioeconomic status (SES). The most important findings were that significant disparities in vocabulary and language processing efficiency were already evident at 18 months between infants from higher- and lower-SES families, and by 24 months there was a 6-month gap between SES groups in processing skills critical to language development.


There are striking differences among children in patterns of early language growth. Some infants start speaking before their first birthday, while others don't produce words until the end of the second year (Fenson, Marchman, Thal, Dale, Reznick, & Bates, 2007). Although some late talkers catch up in vocabulary a few months later, others continue to show slower trajectories of language growth and achieve lower levels of language proficiency (Bates, Dale, & Thal, 1995; Fernald & Marchman, 2012). Differences in socioeconomic status (SES) are strongly associated with variation in language outcomes. By the time they enter kindergarten, children from disadvantaged backgrounds differ substantially from their more advantaged peers in verbal and other cognitive abilities (Ramey & Ramey, 2004), disparities that are predictive of later academic success or failure (Lee & Burkam, 2002). In adults as well, SES differences in language proficiency are robust (Pakulak & Neville, 2010), reflecting the cumulative influence of a wide range of endogenous and environmental factors over a lifetime.

Despite such evidence for significant differences among children in early language learning, research on acquisition has tended to focus much more on elucidating common patterns of language growth than on understanding the causes and consequences of variability. This emphasis has been driven by several factors: First, the search for similarities rather than differences among children is grounded in a philosophy of science that underlies psychological research more broadly – one that gives priority to processes assumed to be universal rather than to endogenous and experiential factors that can lead to variability (Arnett, 2008). Second, the use of controlled experimental methods in research on early language development favors between-group comparisons of infants at different ages, with limited attention to variability within age groups (Fernald, 2010). Third, the vast majority of developmental studies in the US rely on ‘convenience samples’ of children from higher-SES families that are unrepresentative of the larger population and thus are inherently restricted in variability (Henrich, Heine, & Norenzayan, 2010). Fourth, although educational researchers have documented robust differences in verbal abilities among school-age children varying in SES (e.g. Dickinson & Tabors, 2001; Lee & Burkam, 2002), this literature is often viewed as ‘applied’ research with limited relevance to ‘basic’ research on language development. We argue here that understanding the extent and origins of variability among children in the emergence of early language proficiency should be central to any developmental theory that acknowledges, at whatever level, the influence of children's early experience on language growth.

This perspective motivates the current study of differences as well as similarities in early language proficiency among children from higher- and lower-SES families. In experimental studies using looking-time measures, we have shown that infants develop speed and efficiency in interpreting spoken language in real time (Fernald, Pinto, Swingley, Weinberg, & McRoberts, 1998) and that individual differences in early processing efficiency are strongly linked to variation in children's later language outcomes (e.g. Fernald, Perfors, & Marchman, 2006; Marchman & Fernald, 2008). However, in these previous studies, as in many other university-based studies with English-learning children, most participants came from highly-educated and affluent families. The goal of the present study was to examine the development of language processing efficiency in relation to vocabulary learning in English-learning infants from families varying in SES. Using real-time processing measures, we followed children longitudinally from 18 to 24 months, focusing on two sets of questions: First, to what extent do infants across this broader SES range show parallel gains in processing efficiency and vocabulary between 18 and 24 months? And second, is there evidence that SES-related differences in processing skills critical to language development are already present in infancy?

SES differences in verbal abilities and their long-term consequences

The finding that children from disadvantaged families start kindergarten with lower language and cognitive skills than those from more advantaged families is old news, emerging repeatedly in studies since the 1950s (e.g. Bereiter & Englemann, 1966; Deutsch, Katz, & Jensen, 1968). The robustness of such differences is confirmed in more recent research such as the Early Childhood Longitudinal Study, Kindergarten Cohort (ECLS-K), a comprehensive analysis of young children's achievement scores in literacy and mathematics based on a large and nationally representative sample (Lee & Burkam, 2002). Even before they entered kindergarten, children in the highest SES-quintile group had scores that were 60% above those in the lowest group. In terms of effect size, children in the highest SES-quintile scored .7 standard deviation (SD) units above middle-SES children in reading achievement, while children in the lowest SES-quintile scored almost .5 SD units below the middle-SES mean. Moreover, the disparities in children's cognitive performance at kindergarten entry that were attributable to SES differences were significantly greater than those associated with race/ethnicity. Another recent study found that 65% of low-SES preschoolers in Head Start programs had clinically significant language delays (Nelson, Welsh, Vance Trup, & Greenberg, 2011). This research revealed a systematic relation between degree of language delay and other weaknesses in academic and socio-emotional skills that were well established by 4 years of age. Socioeconomic gradients in language proficiency are also found within populations living in extreme poverty (L. Fernald, Weber, Galasso, & Ratsifandrihamanana, 2011).

A challenging and controversial question: when do SES differences begin to emerge?

Results showing that SES differences in verbal abilities are already evident in the preschool years suggest that these disparities must start to develop in the first years of life, setting children on particular trajectories with far-reaching consequences for later academic success. How early do such differences begin to emerge? Research on this important developmental question has been limited for a variety of reasons – ranging from methodological challenges in evaluating language proficiency in young children, to the complexities of engaging in debate about politically sensitive issues related to social stratification. The methodological problem is easy to characterize: Until recently, measures available for assessing language and cognitive proficiency in children younger than 3 years have not been high in predictive validity, limiting their effectiveness in linking characteristics in infancy to long-term outcomes. But with the refinement of more sensitive methods for evaluating early language, recent studies have revealed considerable variability in verbal skills among very young children – to be reviewed in the following section. Another set of issues that has discouraged research on early origins of cognitive differences among children from different backgrounds is more difficult to characterize. The legacy of a prolonged and bitter debate about the nature of racial and SES differences in the US has reinforced the reluctance of researchers to pursue the question of early origins of SES-related disparities in cognitive skills that are relevant to school success.

A brief history of this complex debate is relevant to the issues raised in the current study. The scientific consensus in the early 20th century was that cognitive abilities were entirely genetically determined, a view that changed gradually with mounting evidence that experiential factors were also influential (see Fernald & Weisleder, 2011). By the 1960s, when the Civil Rights movement focused national attention on inequities in educational opportunities for Black children, there was intense interest in eliminating achievement gaps that could no longer be ignored. Riessman (1962) argued that SES disparities in school success resulted from cultural differences in minority children's early experience with parents in the home, rather than from immutable genetic differences. This ‘cultural deprivation’ argument appeared to offer hope for solutions through appropriate intervention, although characterizing the home environment of minority children as deficient in cognitive stimulation clearly had negative implications. While this idea rallied political support for new programs such as Operation Head Start, what came to be known as the ‘deficit model’ also generated intense controversy among educators who objected that parents should not be blamed for their children's difficulties in school. By the 1970s, politically motivated backlash to the deficit model converged with the rise of nativist theories of language development, which focused on modal patterns of development presumed to be universal rather than on differences among children. Fernald and Weisleder (2011) argue that this convergence was influential in curtailing debate on questions that had generated extensive research over the previous two decades – namely, whether SES differences in children's verbal abilities are rooted to some extent in differences in their early language experience at home, and if so, whether these experiential differences contribute to the substantial disparities observed among children in their later academic success.

Although interest in variability in language learning had declined substantially by the 1980s, a few researchers began to explore in greater depth the potential contributions of early parent–child interaction to differences in language development (e.g. Hart & Risley, 1995; Hoff-Ginsberg, 1998; Huttenlocher, Haight, Bryk, & Seltzer, 1991). Based on detailed analyses of mothers’ speech to infants at home, these studies used longitudinal designs to identify features of maternal speech that predict language outcome measures. Hart and Risley found that by 36 months, the higher-SES children in their sample spoke twice as many words as the lower-SES children. But their most remarkable finding was the extreme variation in amounts of child-directed speech among families at different SES levels, differences that were correlated with children's early vocabulary and were also predictive of later school performance (Walker, Greenwood, Hart, & Carta, 1994). According to Hoff (2003), it was the quality of infants' early language environment that actually mediated the link between SES and children's vocabulary knowledge.

Assessing differences in language proficiency in very young children

These studies of variability in early language environments with small samples of families laid the foundation for research exploring the early emergence of cognitive disparities in much larger and more diverse samples of advantaged and disadvantaged children. Farkas and Beron (2004) examined the monthly growth trajectory of oral vocabulary knowledge in Black and White children from 36 months to 13 years of age, using a large, representative national data set. Their most striking finding was that most of the inequality in vocabulary growth attributable to race and SES differences developed prior to 36 months. Moreover, the magnitude of the Black–White vocabulary gap that was already evident by the age of school-entry remained unchanged through the age of 13 years. These authors concluded that by 36 months, SES differences in children's language experience have already led to significant vocabulary disparities, which then widen further in the preschool years and remain constant thereafter. Data from the NICHD Early Child Care Research Network also revealed that a substantial achievement gap between low-income Black and White children was already evident by 3 years, and that family as well as school characteristics contributed to maintaining this gap through elementary school (Burchinal, McCartney, Steinberg, Crosnoe, Friedman, McLoyd, Pianta, & NICHD Early Child Care Research Network, 2011). A third recent study with a large, representative sample from the Early Childhood Longitudinal Study, Birth Cohort (ECLS-B) showed that disparities between lower- and higher-SES infants on language and cognitive measures began to emerge by 9 months, and that by 24 months there was a mean difference of .5 SD units between SES groups on the Bayley Cognitive Assessment (Halle, Forry, Hair, Perper, Wandner, Wessel, & Vick, 2009).

These large-sample studies of SES disparities in cognitive skills emerging early in life have all been based on standardized assessments of language abilities, using measures which require the child to follow instructions and execute an unambiguous response by speaking or pointing. But given these task demands, such assessments cannot be used effectively with toddlers younger than 2 years. While parent reports of a child's vocabulary can yield valuable data on early language development (Fenson et al., 2007), they do not provide a direct measure of the child's response. Until recently, these methodological limitations made it difficult to investigate the origins of individual differences in language proficiency in infants younger than 24 months. However, refinements in experimental techniques now allow researchers to monitor the time course of language comprehension by very young language learners, providing direct measures of early efficiency in language processing in real time.

Recent experimental studies on language processing in the second and third years have used real-time measures to assess how efficiently children identify the referent of a familiar word in real-time comprehension. In the looking-while-listening (LWL) procedure (Fernald, Zangl, Portillo, & Marchman, 2008), children see pictures of two familiar objects as they listen to speech naming one of the objects, and their responses are coded with millisecond-level precision. Cross-sectional studies of both English- and Spanish-learning infants show dramatic gains in the speed and accuracy of language understanding across the second year (Fernald et al., 1998; Hurtado, Marchman, & Fernald, 2007). Moreover, young children, like adults, are able to interpret incoming language incrementally, directing their attention to the appropriate picture as the speech signal unfolds in time (Fernald, Swingley, & Pinto, 2001; Swingley, Pinto, & Fernald, 1999). In a longitudinal study with English-learning toddlers from 15 to 24 months, these online processing measures were found to be stable over time, and processing speed at 24 months was robustly correlated with vocabulary growth over this period (Fernald et al., 2006). Moreover, a follow-up study with the same children 6 years later showed strong links between processing efficiency in infancy and performance on standardized tests of language and cognitive skills in elementary school (Marchman & Fernald, 2008). These real-time processing measures have revealed consistent concurrent and predictive relations to language outcomes across studies of typically developing children. They are also high in predictive validity in research with late-talkers, children at increased risk for persistent language delays (Fernald & Marchman, 2012). For these reasons, the LWL task is well suited for investigating both similarities and differences in early language processing skill among infants from different socioeconomic backgrounds.

Research questions

The main goals in this research were to examine the early development of language processing efficiency in relation to vocabulary learning in English-learning infants from families across a broad demographic range, and to determine whether SES differences in processing efficiency are already evident in infancy, at a younger age than has been reported in previous research. Our previous studies with English-learning children were all conducted at a university laboratory in a prosperous urban area, where almost all the families who volunteer to participate in research are affluent and highly educated (Site 1). To extend beyond this convenience sample of high-SES families, we needed to establish an additional research site in an area where it is possible to recruit equivalent numbers of lower- and middle-SES English-speaking families. Site 2 is located in an urban area comparable in population size to Site 1. However, because these two areas differ substantially in terms of median family income, cost-of-living, and percentage of children living in poverty, as shown in Table 1, we are able to include a much more diverse sample of English-learning children at Site 2 than is possible at the university lab.

Table 1. Demographic information on population, median income, cost-of-living index, and poverty rate in the two research sites

Measure                                               Site 1     Site 2
Total population (a)                                  90,200     90,500
% non-Hispanic white (a)                              66%        83%
Median per capita income (a)                          $69,000    $23,900
Cost-of-living index (b)                              157.9      92.9
% children living below federal poverty level (a,c)   5.3%       22.9%

(a) US Census 2010 for the catchment area from which participants are recruited.
(b) Cost-of-living index as compared to US average of 100 (Source:
(c) 2010 Federal poverty level = $22,050 for family of four (US Department of Health and Human Services).



Participants were 48 English-learning children (26 females), recruited through birth records and day care centers at Site 1 (n = 20) and Site 2 (n = 28). Exclusionary criteria at time of recruitment included preterm birth, birth complications, hearing/visual impairments, medical issues, or a known developmental disorder. Reported ethnicity of participants was non-Hispanic White (66%), Asian (13%), Alaskan Native/American Indian (10%), Native Hawaiian/Pacific Islander (6%), or African American (4%). After receiving a brochure describing the project, interested parents contacted us by phone, website, or reply card. Parents were then interviewed by phone about their child's language background, health history, and family history of language disorders. Qualifying families were invited to join the study if the child was not regularly exposed to a language other than English. Six additional participants were excluded from final analyses because the families could not attend the 24-month testing session or did not complete both language questionnaires.

Socioeconomic status

Although participants were all typically developing infants from monolingual English-speaking families, they were diverse in socioeconomic background, as shown in Table 2. The mothers in these families had about three years of post-high school education, on average, yet spanned a broad range of educational levels: 21% did not finish or were still attending high school, or did not continue their education past high school, 19% had some college, 33% completed a BA degree, and another 27% also received some post-BA training. Table 2 also shows scores on the Hollingshead Four Factor Index of Socioeconomic Status (HI; Hollingshead, 1975). This widely used index of family SES is based on a weighted average of both parents' education and occupation, with possible scores ranging from 8 to 66. The HI is divisible into five ‘strata’ of social status: unskilled worker, semi-skilled worker, skilled worker, semi-professional, and major professional. In this sample, parents' occupations spanned the full range from unskilled worker to major professional. For some analyses, families were divided into Lower- (HI ≤ 45, n = 23) and Higher-SES (HI ≥ 47, n = 25) sub-groups based on a median split of HI scores, as shown in Table 2. Both groups included at least one mother with only a high school education, as well as several mothers who had attended college. Nevertheless, the distributions of maternal education levels were substantially different in the two groups. Nearly 90% of the mothers in the Higher-SES group had at least a 4-year college degree, with more than half completing master's or doctoral degrees, while only 30% of the mothers in the Lower-SES group had completed college and one had a master's degree. Of the children from families in the Higher-SES group, 19 were recruited at Site 1 and six at Site 2. Of those from families in the Lower-SES group, one was recruited at Site 1 and 22 at Site 2.

Table 2. Mean (SD) and range for maternal education and Hollingshead Index for full sample and lower-SES and higher-SES sub-groups

                   All participants        Lower SES               Higher SES
Measure            Mean (SD)     Range     Mean (SD)     Range     Mean (SD)     Range
Maternal Ed (a)    15.3 (2.4)    10–18     13.7 (2.2)    10–18     16.7 (1.6)    12–18
HI (b)             46.6 (15.1)   14–66     33.9 (10.1)   14–45     58.3 (7.3)    47–66

(a) Reported years of maternal education defined as high school = 12 years; college = 13–16 years; post-baccalaureate = 17–18 years.
(b) Hollingshead Four Factor Index of Social Status (HI; Hollingshead, 1975). Possible scores range from 8 to 66. SES sub-groups were based on a median split of HI.

Offline measures of vocabulary

Reported expressive vocabulary

At 18 and 24 months, parents completed the MacArthur-Bates Communicative Development Inventory: Words & Sentences (CDI; Fenson et al., 2007). This parent-report instrument asks parents to indicate on a checklist (680 items) which words their child ‘understands and says’. All parents were told to substitute words on the checklists with variants of those words specific to their family (e.g. nana for grandmother).

Procedure for assessing real-time language understanding

Children's real-time comprehension of familiar words was assessed at 18 and 24 months using the looking-while-listening (LWL) procedure (Fernald et al., 2008). The testing apparatus, recording procedures, and verbal and visual stimuli were identical at Sites 1 and 2, and the same two experimenters conducted test sessions at both sites. On each trial, participants viewed two pictures of familiar objects while listening to speech naming one of the pictures. Visual stimuli were colorful pictures (36 × 50 cm) of the target and distracter objects on gray backgrounds, aligned horizontally on a video display. Children sat on their caregiver's lap during the 5-min session, and caregivers wore darkened sunglasses to restrict their view of the images. Each stimulus sentence consisted of a carrier phrase with the target word in final position, followed by an attention-getter (e.g. Where's the car? Do you like it?). The child's face was video-recorded for later frame-by-frame coding. On each trial, the two pictures were shown simultaneously for 2 s prior to speech onset, remaining on the screen during the auditory stimulus until 1 s after sound offset. Between trials, the screen was blank for approximately 1 s. Each trial lasted approximately 7 s.

Verbal stimuli

A female native speaker of English recorded several tokens of each sentence. Candidate stimuli were acoustically analyzed; final stimulus sentences were selected to be comparable in naturalness and pitch contour and edited so that carrier frames and target words were matched for duration. At 18 months, the mean duration of the target noun was 614 ms (range = 604–623 ms). At 24 months, the mean noun duration was 640 ms (range = 565–769 ms).

At 18 months, the target nouns were baby, doggy, birdie, kitty, ball, shoe, book, and car, object labels likely to be familiar to English-speaking children at this age (Dale & Fenson, 1996). Each object was presented four times as target and four times as distracter, yielding 32 experimental trials. Interspersed among the critical trials were four filler trials (e.g. Do you like those pictures?). At 24 months, children heard sentences containing the familiar target nouns baby, doggy, birdie, kitty, cookie, book, car, and juice each presented twice as target and twice as distracter, a total of 16 experimental trials. These familiar word trials were interspersed with fillers (four trials) and trials in which the target word was placed in a carrier frame with an adjective (16 trials) or a semantically related verb (eight trials). These trials are not analyzed here. Trials on which the parent reported that the child did not understand the target word were excluded from analyses on a child-by-child basis.

Visual stimuli

Pictures corresponding to target words were presented in fixed pairs matched for visual salience, with each object serving equally often as target and distracter. All tokens were judged to represent objects typically familiar to young children. Position of target picture was counterbalanced across trials. Trials were presented in a pseudo-random order such that the same target word never occurred on adjacent trials, and the target picture did not appear on the same side more than two trials in a row.
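The two ordering constraints described above can be illustrated with a short sketch. This is not the authors' stimulus-generation procedure, which the text does not describe. The function names, the use of rejection sampling, and the choice to shuffle target words and target sides independently are all assumptions made for illustration.

```python
import random

# Target nouns used at 18 months, as listed in the text.
NOUNS_18 = ["baby", "doggy", "birdie", "kitty", "ball", "shoe", "book", "car"]

def no_adjacent_repeats(words):
    """True if the same target word never occurs on adjacent trials."""
    return all(a != b for a, b in zip(words, words[1:]))

def max_run(items):
    """Length of the longest run of identical consecutive items."""
    longest = run = 1
    for prev, cur in zip(items, items[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest

def shuffle_until(items, is_valid, rng, max_tries=100_000):
    """Rejection sampling: reshuffle until the constraint holds."""
    items = list(items)
    for _ in range(max_tries):
        rng.shuffle(items)
        if is_valid(items):
            return items
    raise RuntimeError("no valid order found")

def make_order(nouns, reps, seed=0):
    """Return (target word, target side) pairs for len(nouns) * reps trials."""
    rng = random.Random(seed)
    words = shuffle_until(nouns * reps, no_adjacent_repeats, rng)
    # Assumption: sides are balanced across the session, not within each word.
    sides = shuffle_until(["L", "R"] * (len(words) // 2),
                          lambda s: max_run(s) <= 2, rng)
    return list(zip(words, sides))

order = make_order(NOUNS_18, reps=4)  # 32 experimental trials at 18 months
```

Filler trials and the fixed picture pairings are left out of this sketch; a fuller version would also assign the distracter from each target's yoked pair.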


Video records of children's gaze patterns were analyzed frame-by-frame by highly trained coders blind to target side and condition. All coding was conducted at Site 1 by coders who were not involved in running the sessions and were blind to testing site. On each frame, coders indicated whether the child was looking at the left picture, right picture, in between the two pictures or away from both. This yielded a high-resolution record of eye movements for each 33-ms interval as the stimulus sentence unfolded, aligned with the onset of the target noun. Trials were later classified as target- or distracter-initial, depending on which picture the child was fixating at target-noun onset. To determine reliability, 25% of sessions were independently re-coded, with inter-observer agreement computed in two ways. First, the mean proportion of frames on which coders agreed on gaze location averaged 98%. Second, the mean proportion of shifts in gaze on which coders agreed within one frame was also calculated, a more conservative measure which also yielded high reliability (97%).
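A minimal sketch of the two agreement measures follows, assuming each coder's record is a list of per-frame gaze codes of equal length. The code labels, the function names, and the decision to score shift agreement relative to the first coder's shifts are hypothetical; the text does not specify these details.

```python
def frame_agreement(coder_a, coder_b):
    """Proportion of frames on which two coders assign the same gaze code."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("records must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)

def shift_frames(codes):
    """Indices of frames where the gaze code differs from the previous frame."""
    return [i for i in range(1, len(codes)) if codes[i] != codes[i - 1]]

def shift_agreement(coder_a, coder_b, tolerance=1):
    """Proportion of coder A's shifts that coder B places within `tolerance` frames."""
    shifts_a = shift_frames(coder_a)
    shifts_b = shift_frames(coder_b)
    if not shifts_a:
        return 1.0 if not shifts_b else 0.0
    agreed = sum(any(abs(s - t) <= tolerance for t in shifts_b) for s in shifts_a)
    return agreed / len(shifts_a)

# Hypothetical 10-frame record: L = left, R = right, A = away or in between.
a = list("LLLRRRRAAA")
b = list("LLLLRRRAAA")
frame_score = frame_agreement(a, b)   # coders differ on one frame
shift_score = shift_agreement(a, b)   # both shifts fall within one frame
```

Shift-based agreement ignores the long stretches of steady fixation on which coders trivially agree, which is why the text calls it the more conservative measure.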

Calculation of accuracy and RT

Two measures of efficiency in real-time speech processing were calculated for each child. First, accuracy was computed as the mean proportion of looking to the named picture on target- and distracter-initial trials, averaged over 300–1800 ms from noun onset. Mean accuracy was based on an average of 22.9 trials (SD = 5.3) per child at 18 months and 12.2 trials (SD = 2.9) at 24 months. Second, reaction time (RT) was computed on only those trials on which the child was looking at the distracter picture at the onset of the target word and shifted to the target picture within 300–1800 ms from target word onset. Trials on which the child shifted either within the first 300 ms or later than 1800 ms from target word onset were excluded, since these early and late shifts were less likely to be in response to the stimulus sentence (Fernald et al., 2008). Mean RTs were based on an average of 8.8 trials (SD = 3.6) at 18 months and 5.0 trials (SD = 2.1) at 24 months.
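The two measures can be sketched for a single trial as follows. This is an illustration under stated assumptions, not the authors' analysis code: each trial is taken to be a list of per-frame codes aligned to noun onset ("T" = target, "D" = distracter, "A" = away or in between), frames are 33 ms long, away frames are dropped from the accuracy denominator, and RT is the latency of the first frame on which gaze leaves the distracter. All names are hypothetical.

```python
FRAME_MS = 33          # one video frame, as in the text
WINDOW = (300, 1800)   # analysis window in ms from noun onset

def accuracy(trial):
    """Proportion of target looks among target + distracter frames in the window."""
    lo, hi = WINDOW
    frames = [code for i, code in enumerate(trial) if lo <= i * FRAME_MS <= hi]
    target = frames.count("T")
    distracter = frames.count("D")
    if target + distracter == 0:
        return None                       # child never looked at either picture
    return target / (target + distracter)

def reaction_time(trial):
    """Latency (ms) of the first shift from distracter to target, else None."""
    if not trial or trial[0] != "D":
        return None                       # RT uses distracter-initial trials only
    for i, code in enumerate(trial):
        if code != "D":
            # The first picture fixated after leaving the distracter must be the target.
            landing = next((c for c in trial[i:] if c != "A"), None)
            rt = i * FRAME_MS
            if landing == "T" and WINDOW[0] <= rt <= WINDOW[1]:
                return rt
            return None                   # too early, too late, or not to target
    return None                           # child never left the distracter

# Hypothetical trial: 15 distracter frames, one transition frame, then target.
trial = ["D"] * 15 + ["A"] + ["T"] * 45
trial_accuracy = accuracy(trial)
trial_rt = reaction_time(trial)
```

Per-child scores would then be the mean of these per-trial values over the trials that meet the inclusion criteria described above.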


Focusing on two crucial aspects of early language proficiency – the development of expressive vocabulary and skill in real-time spoken language processing – this study examined differences and similarities in patterns of developmental change from 18 to 24 months in a diverse group of English-learning children. A central question was how variability in lexical development and real-time processing efficiency would relate to variability in family SES. The scatterplots in Figure 1 show that SES differences were significantly correlated with vocabulary as well as with accuracy and reaction time, our two measures of processing efficiency: 18-month-olds growing up in families with higher HI scores were more advanced in vocabulary, r(48) = .34, p < .02, and were also more accurate, r(48) = .52, p < .001, and faster, r(47) = −.50, p < .001, in spoken word recognition in the LWL task. Correlations between SES and these three language measures were also significant at 24 months: vocabulary, r(48) = .29, p < .05; accuracy, r(48) = .30, p < .05; RT, r(48) = −.45, p < .001. For the next analyses, we divided participants into two SES groups based on a median split of HI scores (see Table 2), to compare children from Lower- and Higher-SES families in their patterns of change with age in vocabulary and processing efficiency.

Figure 1.

Scatter plots of Vocabulary, Accuracy and RT at 18 months with SES (HI). Dashed vertical line indicates median split of HI values.

Change in vocabulary from 18 to 24 months in lower- and higher-SES children

Mean expressive vocabulary scores at 18 and 24 months for Lower- and Higher-SES children are shown in Table 3 and Figure 2. In a 2 × 2 mixed analysis of variance (ANOVA), with SES group as a between-Ss factor and age as a within-Ss factor, the main effect of age was significant, F(1, 46) = 163.5, p < .001, ηp² = .78, reflecting larger vocabulary scores at 24 months than at 18 months across all children. On average, children's vocabulary size increased by about 225 words over this period. The main effect of SES group was also significant, F(1, 46) = 8.6, p < .01, ηp² = .16, confirming that children in the Higher-SES group were significantly more advanced in vocabulary than those in the Lower-SES group. Indeed, at 18 months, about half the children in the Lower-SES group (n = 12) had fewer than 50 words in their reported vocabulary, while only eight children in the Higher-SES group had scores of 50 words or less. A similar trend was evident at 24 months: Children from Higher-SES families produced nearly 450 words, on average, while children from Lower-SES families produced about 150 fewer words, consistent with previous reports of SES differences in reported vocabulary in this age range (e.g. Arriaga, Fenson, Cronan & Pethick, 1998).
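The effect sizes reported with these F tests can be recovered from the F statistic and its degrees of freedom, using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The short sketch below applies it to the two main effects above; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared computed from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

age_effect = partial_eta_squared(163.5, 1, 46)   # main effect of age
ses_effect = partial_eta_squared(8.6, 1, 46)     # main effect of SES group
```

Rounded to two decimals, these values agree with the effect sizes reported in the text (.78 and .16).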

Table 3. Mean (SD) and range of expressive vocabulary (a) at 18 and 24 months for all participants and by SES sub-group (b)

             All participants          Lower SES                 Higher SES
Age          Mean (SD)       Range     Mean (SD)       Range     Mean (SD)       Range
18 months    141.9 (123.0)   5–503     107.0 (114.2)   5–503     174.0 (124.3)   16–471
24 months    367.9 (180.2)   4–665     287.9 (163.3)   4–573     441.5 (165.4)   59–665

(a) Number of words produced on the MacArthur-Bates CDI: Words & Sentences (Fenson et al., 2007).
(b) SES groups based on a median split of HI scores.

Figure 2.

Mean number of spoken words reported on the MacArthur/Bates CDI by age and SES (HI). Error bars represent SE of the mean over participants.

An even more striking result was that the pattern of developmental change in vocabulary differed as a function of SES, reflected in a significant age by SES group interaction, F(1, 46) = 6.1, p < .02, ηp² = .12. As illustrated in Figure 2, a group difference in vocabulary between children from Lower- vs. Higher-SES backgrounds was clearly evident at 18 months, and by 24 months the between-group difference was even larger. Children in the Higher-SES group made significantly greater gains (M = 268 words, SD = 116) over this period than did children in the Lower-SES group (M = 180 words, SD = 127), t(46) = 2.5, p < .02.
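As a check on the arithmetic, the group comparison of vocabulary gains can be approximated from the summary statistics alone. The sketch below assumes a pooled-variance independent-samples t test and takes the group sizes (25 Higher-SES, 23 Lower-SES) from the description of the SES sub-groups. The function name is hypothetical, and the result is only as precise as the rounded means and SDs allow.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic and df from group summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df

# Vocabulary gains from 18 to 24 months: Higher-SES vs. Lower-SES.
t_value, df = pooled_t(268, 116, 25, 180, 127, 23)
```

With these inputs the statistic comes out close to the reported t(46) = 2.5.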

Changes in processing efficiency from 18 to 24 months in higher- and lower-SES children

Next we compared children at both ages in the two SES groups on two measures of processing efficiency – mean accuracy and mean RT (see Table 4) – using 2 (age) × 2 (SES group) mixed ANOVAs.

Table 4. Mean (SD) of accuracy and reaction time (RT) in the looking-while-listening task at 18 and 24 months for all participants and the lower- and higher-SES sub-groups

| Measure | Age | All participants | Lower SES | Higher SES |
| --- | --- | --- | --- | --- |
| Accuracyᵇ | 18 months | .64 (.09)ᵃ | .59 (.08)ᵃ | .69 (.07)ᵃ |
| Accuracyᵇ | 24 months | .73 (.10)ᵃ | .69 (.11)ᵃ | .77 (.08)ᵃ |
| RT (ms)ᶜ | 18 months | 841 (185) | 947 (151) | 746 (162) |
| RT (ms)ᶜ | 24 months | 738 (162) | 802 (166) | 666 (108) |

ᵃ Comparisons to chance (.50) are significant, all p < .001.
ᵇ Mean proportion looking to the target computed over 300 to 1800 ms from noun onset, including all target-initial and distracter-initial trials on which the parent reported the child understood the target word.
ᶜ Mean latency (ms) to initiate a shift from the distracter to the target picture within 300 to 1800 ms from noun onset including only those trials on which the parent reported the child understood the target word.
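The two measures defined in the notes to Table 4 can be illustrated with a minimal sketch. It assumes each trial is coded as one gaze label per 33-ms frame from noun onset ("T" for target, "D" for distracter, anything else for looks away), that accuracy is the proportion of target frames among target and distracter frames in the 300–1800 ms window, and that RT is the time of the first target frame on distracter-initial trials. The coding scheme and function names are hypothetical simplifications, not the authors' actual pipeline.

```python
FRAME_MS = 33                # sampling interval, as described for the time-course plots
WINDOW_MS = (300, 1800)      # analysis window from noun onset


def accuracy(trial):
    """Proportion of target frames among target/distracter frames inside the window."""
    looks = [
        gaze
        for index, gaze in enumerate(trial)
        if WINDOW_MS[0] <= index * FRAME_MS <= WINDOW_MS[1] and gaze in ("T", "D")
    ]
    if not looks:
        return None  # no usable frames in the window
    return looks.count("T") / len(looks)


def reaction_time(trial):
    """Latency (ms) of the first look to the target on a distracter-initial trial."""
    if not trial or trial[0] != "D":
        return None  # RT is only defined for distracter-initial trials
    for index, gaze in enumerate(trial):
        if gaze == "T":
            latency = index * FRAME_MS
            in_window = WINDOW_MS[0] <= latency <= WINDOW_MS[1]
            return latency if in_window else None
    return None  # the child never shifted to the target


# Hypothetical trial: 20 frames on the distracter, then 40 frames on the target.
example_trial = ["D"] * 20 + ["T"] * 40
print(accuracy(example_trial), reaction_time(example_trial))
```

In this scheme, per-child scores would be the mean over that child's trials, and the group values in Table 4 would be means over children.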


Across SES groups, 24-month-olds spent a greater proportion of time looking at the correct picture than did 18-month-olds, F(1, 46) = 31.2, p < .001, ηp² = .40. There were also significant between-group differences in accuracy: Higher-SES children were more accurate overall than the Lower-SES children, F(1, 46) = 22.8, p < .001, ηp² = .33. The age × SES interaction was not reliable, p = .69, ηp² = .003, reflecting comparable relative gains in accuracy from 18 to 24 months for infants in both groups.

The main effect of age is illustrated in Figure 3, which shows the time course of looking to the target picture in the LWL task for children at 18 and 24 months. This graph plots change over time in the mean proportion of trials on which children overall fixated the target picture, averaged over participants at each 33-ms interval as the sentence unfolds. The proportion of looking to the target picture remained near chance until at least halfway through the target noun, when acoustic information potentially enabling identification of the correct referent first became available. After this point, the mean proportion of correct looking began to increase, continuing to rise after the offset of the target noun. Between 18 and 24 months, children increased their proficiency in looking to the named target before the offset of the target noun, reaching a higher level of accuracy at 24 months than 6 months earlier. It is also important to note that the proportion of looking to the named target picture was significantly above the chance level of .50 at 18 months, t(47) = 11.2, p < .0001, and at 24 months, t(47) = 15.6, p < .0001, indicating that children overall could correctly identify the referents of familiar object names at both ages.

Figure 3. Mean proportion looking to the target picture as a function of time in ms from noun onset at 18 and 24 months. Error bars represent SE of the mean over participants. The vertical dashed line marks the acoustic offset of the target word.

Although accuracy improved with age for children in both SES groups, there was also a strong and early influence of SES. Figure 4 plots the time course of looking to the correct target picture at 18 and 24 months for the Lower- and Higher-SES groups. The Higher-SES children responded by looking to the named target sooner in the stimulus sentence, and achieved substantially higher levels of accuracy than those in the Lower-SES group. But what is most remarkable about Figure 4 is that the curve for the Lower-SES children at 24 months essentially overlaps with the curve for the Higher-SES children at 18 months. Indeed the mean accuracy for Lower-SES children at 24 months (M = .69) was identical to that for Higher-SES children at 18 months (M = .69), indicating that 24-month-olds in the Lower-SES sample were performing at the same level overall as Higher-SES children who were 6 months younger.

Figure 4. Mean proportion of looking to the target as a function of time in ms from noun onset for Lower-SES and Higher-SES learners. Open squares/circles represent the time course of correct looking at 18 months; filled squares/circles represent the time course of looking in the same children at 24 months. Error bars represent SE of the mean over participants.

Reaction time

Similar patterns of developmental change were found in analyses of processing speed, shown in Figure 5. At 24 months, children were about 100 ms faster to initiate a shift from distracter to target picture, on average, than they were at 18 months, a significant main effect of age, F(1, 45) = 15.2, p < .001, ηp² = .25. The main effect of SES on RT was also significant, F(1, 45) = 27.5, p < .001, ηp² = .38, confirming that children in the Higher-SES group were significantly faster overall in familiar word recognition than children in the Lower-SES group. There was no significant age × SES group interaction, p = .27, ηp² = .03, reflecting parallel gains in response speed with increasing age in both groups of children. However, consistent with the findings for accuracy, the absolute differences in processing speed between the two groups at each age were substantial: the mean RT for Lower-SES children at 24 months was comparable to the mean RT for 18-month-olds in the Higher-SES group.

Figure 5. Mean RT to initiate a shift from the distracter to the target picture at 18 and 24 months for the Higher-SES and Lower-SES learners. Error bars represent SE of the mean over participants.

Relations between online processing skill and vocabulary in a diverse sample of children

The final analysis explored whether variability in online processing skills aligned with vocabulary knowledge in this diverse sample. First-order correlations between RT and accuracy in real-time comprehension and vocabulary scores at 18 and 24 months are shown in Table 5. As in previous studies with more homogeneous samples of English-learning children from advantaged families, we found reliable links between performance in the LWL task and expressive vocabulary size at both 18 and 24 months, although links were stronger and more consistent at the later time point. At 24 months, accuracy and RT were correlated with both earlier and concurrent vocabulary scores, accounting for 15–23% of the variance. These results echo the recurring finding that those children who are faster and more accurate in real-time interpretation of familiar words tend to be those who are also reported to produce more words (Fernald et al., 2006; Fernald & Marchman, 2012; Hurtado et al., 2007).

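Table 5. First-order correlations (r) between processing efficiency and vocabulary at 18 and 24 months

| Vocabulary | Accuracy, 18 months | RT, 18 months | Accuracy, 24 months | RT, 24 months |
| --- | --- | --- | --- | --- |
| 18 months | .35ᵇ | −.25ᵃ | .43ᶜ | −.42ᶜ |
| 24 months | .43ᶜ | −.18 | .48ᶜ | −.47ᶜ |

ᵃ p < .07; ᵇ p < .05; ᶜ p < .01.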


This research revealed similarities but also striking differences in early language proficiency among infants from advantaged families and from less advantaged families. Our first goal was to track developmental changes in language processing efficiency in relation to vocabulary learning in this diverse sample of English-learning children. Our second goal was to examine SES differences in these crucial aspects of early language development. The most important finding was that significant disparities in language proficiency between infants from higher- and lower-SES families were already evident at 18 months of age, and by 24 months there was a 6-month gap between the two groups.

Similarities and differences among children in early processing efficiency and vocabulary

Although participants in this study came from very different backgrounds, they showed common patterns of change in the efficiency of real-time language processing from 18 to 24 months. Older children were more likely than younger children to interpret the incoming speech signal incrementally, fixating the target picture as soon as they had enough information to identify the referent. We also found reliable links between skill in early spoken language processing and vocabulary development, replicating results previously shown in children from affluent, highly educated families (Fernald et al., 2006; Fernald & Marchman, 2012), but never before in English-learning children from a broader SES range. These results provide further evidence that real-time language processing is aligned with early vocabulary development.

Extending earlier results showing consistent relations between early processing efficiency and vocabulary size to a more diverse group of English-learning children was an important starting point. However, the more surprising outcome of this study was that by the age of 18 months, there were already substantial differences among children as a function of SES. Children from lower-SES families had significantly lower vocabulary scores than children from higher-SES families at the same age, and they were also less efficient in real-time processing. As seen in Table 4, mean accuracy for the lower-SES children increased from .59 to .69 between the ages of 18 and 24 months; however, mean accuracy for the higher-SES children was already .69 at 18 months, increasing to .77 by 24 months. Measures of processing speed showed a similar pattern: in the lower-SES children, the mean RT at 24 months (M = 802 ms) was still not as fast as the mean RT at 18 months in the higher-SES children (M = 746 ms). These differences were equivalent to a 6-month disparity between the higher- and lower-SES children, in vocabulary size and in both measures of language processing efficiency.

Exploring sources of variability in young children's early language proficiency

Where do these substantial differences come from? Variability among individuals in verbal abilities is influenced to some extent by genetic factors (Oliver & Plomin, 2007), but the contributions of early experience to differences in language proficiency are also substantial. Research on language problems in twins has also shown that environmental factors are more powerful than genetic factors in accounting for similarities in language development in children in the same family (Oliver, Dale, & Plomin, 2004). Other studies suggest that the contribution of environmental factors to variability in IQ has been underestimated in behavioral genetics studies, which tend to focus on children in middle-class families (Rowe, Jacobson, & Van den Oord, 1999; Turkheimer, Haley, Waldron, D'Onofrio, & Gottesman, 2003). In a study of twins from families diverse in SES, Turkheimer et al. (2003) found that 60% of the variance in cognitive abilities was accounted for by shared environmental factors among children living in poverty, with the genetic contribution close to zero; however, for children in higher-SES families, the opposite pattern of findings emerged. While the power of SES to moderate the heritability of verbal and other cognitive abilities is under debate (Hanscombe, Trzaskowski, Haworth, Davis, Dale, & Plomin, 2012), there is consensus that infants' genetic potentials in these domains can only be realized with appropriate environmental support. In families where adequate resources and support are consistently available, children are more likely to be buffered from adverse circumstances than are children in impoverished families, and so are more likely to be able to achieve their developmental potential.

There are many different experiential factors associated with living in poverty that could contribute to variability in language learning. For example, the physical conditions of everyday life related to safety, sanitation, noise level, and exposure to toxins and dangerous conditions differ dramatically for children in lower- and higher-SES families, as does the access to crucial resources such as adequate nutrition and medical care (Bradley & Corwyn, 2002). Conditions of social and psychological support vary as well, with higher levels of stress and instability in disadvantaged families (Evans, Gonnella, Marcynyszyn, Gentile & Salpekar, 2005). All of these environmental factors are known to have consequences for cognitive and social outcomes in young children (e.g. Evans, 2004). There are also well-known differences in the quality of parent–child interaction among families differing in SES related to these circumstantial factors. For example, parents under greater stress tend to respond less sensitively to their children (Mesman, van IJzendoorn, & Bakermans-Kranenburg, 2011), and provide less adequate social and cognitive stimulation. This is likely to be one important factor contributing to the well-documented SES differences in the amount and quality of child-directed speech (Hoff, 2003, 2006). Hart and Risley (1995) estimated that by 36 months, the children they observed from advantaged families had heard 30 million more words directed to them than those growing up in poverty, a stunning difference that predicted important long-term outcomes (Walker et al., 1994).

Could variation in early language experience also contribute to individual differences in infants' real-time processing efficiency, as well as in vocabulary learning? This question was explored in longitudinal research with Spanish-speaking families, examining links between maternal talk, children's processing efficiency, and lexical development (Hurtado, Marchman, & Fernald, 2008). Those infants whose mothers talked with them more at 18 months were those who learned more vocabulary by 24 months. But the most noteworthy finding was that those infants who experienced more and richer language were also more efficient in real-time language processing 6 months later, compared to those who heard less maternal talk. One interpretation of these findings is that having the opportunity for rich and varied engagement with language from an attentive caretaker provides the infant not only with models for language learning, but also with valuable practice in interpreting language in real time. Thus, child-directed talk sharpens the processing skills used in online comprehension, enabling faster learning of new vocabulary.

Long-term consequences of early differences in language skills

How would an advantage in processing efficiency facilitate vocabulary learning? Studies with adults show that faster processing speed can free additional cognitive resources (e.g. Salthouse, 1996), which may be particularly beneficial in the early stages of language learning. The infant who can interpret a familiar word more rapidly has more resources available for attending to subsequent words, with advantages for learning new words that come later in the sentence. A slight initial edge in the efficiency of familiar word interpretation could be strengthened through positive-feedback processes, leading to faster growth in vocabulary that in turn leads to further increases in receptive language competence. If rapid lexical access of familiar words facilitates learning new words, then greater efficiency in language processing at 18 and 24 months could have cascading advantages that result in further vocabulary growth.

Results from several studies support the idea that variability in both processing speed and vocabulary could have long-term consequences. In research with adults and children, mean RT across various tasks predicted success on cognitive assessments at every age (Kail & Salthouse, 1994). Because mean RT in adults correlates so consistently with measures of memory, reasoning, language, and fluid intelligence, Salthouse (1996) has argued that gradual increases in processing speed account fundamentally for developmental change with age in cognitive and language functioning. This association has been characterized as a developmental cascade by Fry and Hale (1996), who proposed that increasing processing speed strengthens working memory, and that stronger working memory then leads to greater cognitive competence. Since vocabulary size also predicts IQ in both adults and children (Matarazzo, 1972; Vance, West, & Kutsick, 1989), an early advantage in lexical development could have cascading benefits for other aspects of language learning as well (Bates, Bretherton, & Snyder, 1988). Vocabulary knowledge also serves as a foundation for later literacy (Lonigan, Burgess, & Anthony, 2000), and language proficiency in preschool is predictive of academic success (Alexander, Entwisle, & Horsey, 1997). It is clear from these findings that the early emerging differences we found in language proficiency between children from different SES backgrounds have serious implications for their long-term developmental trajectories.


In this research we found significant differences in both vocabulary learning and language processing efficiency that were already present by 18 months, with a 6-month gap emerging between higher- and lower-SES toddlers by 24 months. These results mirror findings from new analyses of the ECLS-B data set, which used more global measures to show that reliable differences in cognitive performance between children in lower- and higher-SES families were present by 24 months (Halle et al., 2009; Tucker-Drob, Rhemtulla, Harden, Turkheimer, & Fask, 2011). What our findings add is the first evidence that SES-related disparities in language skills emerge at an even earlier age. Using high-precision measures of infants' real-time responses to familiar words, we found that it was not until 24 months that the less advantaged children reached the same levels of speed and accuracy achieved by more advantaged children at 18 months, a 6-month gap in the development of processing efficiency. Such a large disparity cannot simply be dismissed as a transitory delay, given that differences among children in trajectories of language growth established by 3 years of age tend to persist and are predictive of later school success or failure (Burchinal et al., 2011; Farkas & Beron, 2004).

Because this difference can be characterized as a lag in early processing efficiency with potentially important long-term consequences, it is important to frame this finding in light of scientific discoveries that reveal the weaknesses of the controversial ‘deficit model’ of the 1960s. The view that children from disadvantaged homes were inherently ‘culturally deprived’ (Riessman, 1962) was based on a vague notion of culture as embodied in middle-class practices, institutions, and values. At that time, little was known about the actual activities and practices of parents in different families, with even less scientific evidence on trajectories of cognitive and language development from infancy through childhood. Thus the term ‘deficit’ was used as a global indictment of parenting styles in impoverished families that were simply different from middle-class families – a well-intended but misguided attempt to help teachers understand the difficulties minority children were experiencing in the recently desegregated school system.

There was obfuscating vagueness on both sides of the debate. Advocates of the deficit model proposed a causal account of the effects of children's early life experience on later cognitive development in which both predictor and outcome variables were poorly specified. While many critics of the deficit model raised valid points urging greater respect for different cultural practices (e.g. Heath, 1983), others countered with proposals that were simplistic and counterproductive, often reflecting a political agenda. These proposals ranged from calling a halt to research on parenting practices in minority families because it was inherently paternalistic and racist, to focusing on eliminating poverty rather than on ‘blaming the victim’ (Ryan, 1971). The deficit model was incoherent at the time, and the continuing debate on this construct has not led to greater precision or insight (Gorski, 2006).

In an effort to reframe this argument, we end with an example from nutrition, where cognitive consequences can be linked to particular deficits without evoking the reflexive opposition associated with deficit models in social science. Children with iron deficiency anemia (IDA) are typically low in energy and have cognitive difficulties. For many years, the prevailing explanation for these symptoms was that parents treated lethargic children with IDA as if they were younger, which supposedly retarded their cognitive development (Pollitt, 1993). Thus differences among children in global measures of cognitive ability were attributed to ill-defined problems in parenting behavior. However, recent research on IDA has led to a much more precise specification of both causes and consequences. Studies with animal models show that iron deficiency in pre- and postnatal development disrupts the optimal course of myelination, which then reduces efficiency of neural transmission (Beard, Wiesinger & Connor, 2003). And longitudinal research measuring brain responses to auditory and visual stimuli shows that children with IDA have slower neural transmission, which is very likely to affect the efficiency of cognitive processing (Algarín, Peirano, Garrido, Pizarro, & Lozoff, 2003).

Resting on a foundation of research showing solid relations between a specific causal factor and specific consequences, these discoveries of links between iron deficiency and long-term cognitive difficulties become valuable and highly relevant as public health information. If a mother was told that her child had a ‘cultural deficit in nutrition’, such a broad, vague claim could only be perceived as a perplexing insult. However, if she heard about new research showing that iron is absolutely critical for optimal brain development in infancy, and that healthy brain development is vital to her child's success in school and in later life, she might be more interested in learning about new ways to provide more iron in her child's diet.

While recent research on nutrition focuses on biological factors that influence early cognitive development, there is increasing scientific evidence that experiential factors also play a critical role in infants' early language development – by nurturing vocabulary learning (Hart & Risley, 1995; Hoff, 2006) as well as strengthening skill in real-time language processing (Hurtado et al., 2008; Weisleder & Fernald, under review). Although the present study was not designed to explore causes of the variability we found among children, our results add to this literature by showing the potential benefits of early processing efficiency for vocabulary growth, and also revealing the potential cost to children with less efficient processing skills, in terms of missed opportunities for learning. From the perspective of basic research and theory in language acquisition, it is essential to investigate not only the typical developmental trajectories of children from privileged families, but also the wide range of variability that becomes apparent when children from more diverse backgrounds are included. We address this goal here by documenting substantial differences between infants from lower- and higher-SES backgrounds that are already evident in the second year of life, using sensitive measures of early language proficiency known to be predictive of later outcomes. The next step is to explore the powerful sources of variability in early experience that contribute to such differences in infants' emerging language proficiency, and to examine the nature and timing of their influence in larger and more diverse samples of children. From a policy perspective, the ultimate challenge is to frame these discoveries as a public health message (Knudsen, Heckman, Cameron, & Shonkoff, 2006), with the goal of helping caregivers understand the crucial role they can play in enabling infants to build and strengthen skills essential for optimal development.


This research was supported by grants from the National Institutes of Health (HD42235 and DC008838). We are grateful to the children and parents who participated, and to our community partners in Northern California who enabled us to conduct this study. Special thanks to Krisa Bruemer, Jillian Maes, Viviana Limón, Lucia Martínez, Nereyda Hurtado, Poornima Bhat, Ricardo Hoffmann Bion, Kyle McDonald, Katherine Adams, Mofeda Dababo, and the staff of the Center for Infant Studies at Stanford University.