Causal Supports for Early Word Learning


  • This research was supported by NSF Grant BCS-0445871 and NIH Grant RO3 HD048759. We are grateful to the families who participated, as well as to Katherine Bauernfreund and Lena Sadowitz for their assistance.

Correspondence concerning this article should be addressed to Amy E. Booth, Roxelyn and Richard Pepper Department of Communication Sciences and Disorders, Northwestern University, 2240 Campus Dr., Evanston, IL 60208. Electronic mail may be sent to


What factors determine whether a young child will learn a new word? Although there are surely numerous contributors, the current investigation highlights the role of causal information. Three-year-old children (N = 36) were taught 6 new words for unfamiliar objects or animals. Items were described in terms of their causal or noncausal properties. When children were tested only minutes after training, no significant differences between the conditions were evident. However, when tested several days after training, children performed better on words trained in the causal condition. These results demonstrate that the well-documented effect of causal information on learning and categorization extends to word learning in young children.

Young children are excellent word learners. We have made great progress toward understanding how and why this is so. In particular, we know a lot about when and how children (a) isolate words from ongoing speech, (b) map words to their intended referents, and (c) extend words appropriately. Questions remain, however, regarding the factors that determine whether, and how rapidly, a particular word becomes a lasting component of a child’s lexicon.

Clearly, young children learn some words before others (e.g., Fenson et al., 1994; Nelson, 1973). Both the frequency and schedule of exposure help determine order of acquisition (e.g., Childers & Tomasello, 2002; Hollich et al., 2000). The perceptual and/or conceptual accessibility of referents also probably contributes (e.g., Gentner, 1982; Rosch, Mervis, Johnson, & Boyes-Braem, 1976). Another potential factor is the type of knowledge a child has about the items being labeled. We propose that words applied to referents with known conceptual properties will be more readily acquired than will words applied to referents for which conceptual properties are unspecified.

Well-documented principles of memory suggest two reasons to believe that this should be so. First, focused attention facilitates memory (e.g., Craik, Govoni, Naveh-Benjamin, & Anderson, 1996; Uncapher & Rugg, 2005). Conceptual information appears to be of particular interest to young learners and might therefore attract considerable attention. Second, meaningful elaboration leads to more robust memories (e.g., Brown, 1975; Craik & Tulving, 1975; Levin, 1988). Because conceptual knowledge can articulate causal and theory-based relations among elements of a concept, it is particularly well suited for providing a coherent framework for semantic representations.

Our hypothesis is also consistent with the tight relation between words and conceptual knowledge observed throughout development (e.g., Booth & Waxman, 2002; Gelman & Markman, 1987; Gopnik & Meltzoff, 1987; Graham, Kilbreath, & Welder, 2004; Waxman, 1999). Words are instrumental in early object individuation (Xu, Cote, & Baker, 2005), categorization (Balaban & Waxman, 1997; Booth & Waxman, 2002; Waxman & Markow, 1995) and inductive generalization of conceptual properties of kinds (e.g., Gelman & Coley, 1990). Toddlers also often preferentially utilize object function (a conceptually rich construct) over perceptual similarity in guiding their extension of novel words (e.g., Diesendruck & Bloom, 2003; Kemler Nelson, Russell, Duke, & Jones, 2000). Finally, conceptual knowledge of ontological kinds and causal powers guides novel word extension in young children (e.g., Booth, Waxman, & Huang, 2005; Gopnik & Sobel, 2000; Lavin & Hall, 2001).

Given what we know generally about memory, and specifically about early and bidirectional ties between words and conceptual knowledge, it follows that introducing novel words to young learners along with conceptual information about their referents should benefit acquisition. However, the attentional learning account (ALA, e.g., Colunga & Smith, 2008; Smith, 1999) suggests otherwise. Although this account rests squarely on basic attention and memory processes, it has consistently eschewed any special role for conceptual information in early word learning. Indeed, according to recent formulations, conceptual knowledge is representationally indistinguishable from other types of knowledge, as well as from the processes that created it. As a result, it cannot exert a unique influence on early word learning.

To test these competing hypotheses, we taught 3-year-old children novel words for novel objects that were described in terms of either their causal or noncausal properties. We focused specifically on causal information because it is consistently described as “conceptual” in the literature and has been regularly tied theoretically and empirically to the coherence and stability of concepts in both adults and children (e.g., Barrett, Abdi, Murphy, & Gallagher, 1993; Gopnik & Nazzi, 2003; Murphy & Medin, 1985; Rehder, 2003; Sloman, 2005). Moreover, an abundant literature documents young children’s sensitivity to causal information (e.g., Leslie & Keeble, 1987; Oakes & Cohen, 1995; Schulz & Bonawitz, 2007; Sobel & Kirkham, 2007).

Because the nature of causal relations differs across domains, we include both artifacts and animals in this investigation (Ahn, 1998; Gelman, 2003; Greif, Kemler Nelson, Keil, & Gutierrez, 2006; Keil, 1994). For artifacts, causally relevant properties clearly center around function (see Bloom, 1996). For animate kinds, causally relevant properties are more varied, including eating habits, aspects of inheritance and growth, habitat, and social behaviors. We target the survival functions of animal parts because their causal structure could most easily be communicated in our brief training sessions. Fortunately for our purposes, evidence suggests that preschool children are sensitive to, and interested in, this information (e.g., Ahn, 1998; Gelman, 2003; Greif et al., 2006; Keil, 1992, 1994; Kelemen, Widdowson, Posner, Brown, & Casler, 2003).



Thirty-six 3-year-olds (19 female) participated (M = 43.11 months, range = 38.78–47.57). All were living in Evanston, Illinois or surrounding communities and were acquiring English as their native language. Most (77%) participants were Caucasian. However, 11% were African American, 6% Asian, and 6% Hispanic. An additional 16 participants were excluded from analyses due to (a) inattentiveness (n = 2), (b) technical difficulties (n = 1), (c) experimenter error (n = 2), (d) language delay (n = 1), or (e) failure to return for follow-up (n = 10). There were no notable differences between children who returned for the follow-up and those who did not. All participants were given a book after their first session and either another book or Pokémon® cards after their final session.


Six color photographs of unfamiliar artifacts and six color drawings of Pokémon® creatures (see Figure 1) were individually laminated onto 15.3 × 15.3 cm training cards. (Although photographs of unfamiliar animals would have best matched the artifacts, piloting revealed that children regularly labeled even seemingly bizarre animals with known labels [e.g., aye-aye = “monkey”].) Each also appeared in linear combination with two other images from the same domain on two different 55.8 × 15.3 cm test cards. In total, 12 training and 12 test cards were constructed. An additional two cards pictured a dog and a car.

Figure 1.

 Pictures and labels of novel artifacts and animals (Pokémon characters Hitmonlee, Breloom, Chinchou, Vibrava, Kabutops, and Venonat ©Pokémon USA, Inc. Reprinted with permission).


Children sat across from the experimenter. Half saw artifact stimuli; half saw animate.

Property training.  The experimenter introduced each card individually, and in the same order for every child. She described a nonobvious property for each. Two descriptions focused on causal properties (e.g., these are used to grind up food). Two focused on noncausal properties (e.g., these have a part inside that is made of gold). (Some of the “noncausal” information could be construed as causally relevant if one inferred, or imagined, its potential influence on functionality. If 3-year-olds are inclined to make such inferences, this could reduce the strength of our predicted effect; see Table 1.) The remaining two provided no specific information (e.g., these are really great things) and served as a baseline. Assignment of images to condition followed one of six repeating sequential patterns (e.g., causal, noncausal, baseline, causal, noncausal, baseline) and was counterbalanced across children. Each type of description occurred in first, second, or third position an equal number of times across children. The causal and noncausal descriptions were matched as closely as possible in terms of both their plausibility and distinctiveness (see Hunt & McDaniel, 1993; Schmidt, 1991, for evidence that these factors are important to learning and memory). Adult ratings (N = 12) of the plausibility of the noncausal (M = 3.57, SE = 0.11) and causal (M = 3.86, SE = 0.21) descriptions on a 5-point scale did not differ. Adults also generated a comparable number of items that correctly fit the noncausal (M = 0.43, SE = 0.13) and causal (M = 0.68, SE = 0.16) descriptions.
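The counterbalancing scheme described above can be illustrated with a short sketch. It assumes, as the text does not state explicitly, that the six repeating patterns are the six possible orderings of the three description types, each repeated twice across the six items; the function names are hypothetical.

```python
from itertools import permutations

# The three description types used during property training.
CONDITIONS = ("causal", "noncausal", "baseline")


def condition_patterns(n_items=6):
    """Build one repeating pattern per ordering of the three conditions.

    Assumption: the six patterns are the 3! = 6 orderings of the
    conditions, each cycled over the items in presentation order.
    """
    return [
        [order[i % len(order)] for i in range(n_items)]
        for order in permutations(CONDITIONS)
    ]


def pattern_for_child(child_index, patterns):
    """Rotate through the patterns so they are used equally often."""
    return patterns[child_index % len(patterns)]


patterns = condition_patterns()
for pattern in patterns:
    print(pattern)
```

Under this assumption, every description type falls in the first, second, and third position equally often across the six patterns, which matches the balance the procedure reports.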

Table 1. 
Summary of the Information Provided During Training for Each Experimental Condition
Domain: Artifact
Causal | Noncausal
Used to scrape dirt off of shoes | Always kept on the ground right outside of houses
Have a soft pad that spins around to make cars shiny | Have a soft pad that you can take off easily
Have needles inside that sew buttons on clothes | Have tiny needles inside that are sharp at the tip
Spin around to measure how fast wind is blowing | Always kept on top of buildings or on boats
Used to paint lines on the ground | Have a handle that you can twist to make longer
Have a crank that you can turn to grind up food | Have a gold part you can see when you take off crank

Domain: Animate
Causal | Noncausal
Stretch their legs to knock coconuts out of trees | Have legs that are really stretchy and slimy
Loudly hit head with tail to scare other animals | Have tails that make a rattling noise as they walk
Have yellow bulbs to light their way under water | Have yellow bulbs that turn blue underwater
Wrap wings around body to stay warm | Have wings with orange stars on the back
Use their head like a shovel to dig caves | Have heads that feel rough like sandpaper
Have antennae that help them hear far away sounds | Have antennae that they can hide in their fur

Free play 1.  The experimenter played with the child and an unrelated toy for 3 min.

Label training.  The experimenter reintroduced the images individually and in the same order as in property training, but this time she labeled each image four times along with a reminder of the prior description (e.g., “This is a gulla. Wow, look at this gulla. Remember, gullas are used to grind up food. It’s a gulla.”). Assignment of words to items was fixed (see Figure 1).

Free play 2.  The experimenter played with the child for another 3 min.

Comprehension testing.  Test cards were presented in a fixed order across participants, and labels were tested in the same order as they were introduced during training. Recall that each test card included three images. A number of constraints were imposed on the composition and ordering of these cards to optimally balance the conditions. Test cards were presented such that the correct target appeared on the left, right, and in the center an equal number of times. Also, any single picture never appeared for more than two consecutive trials, and when it appeared twice in a row, its position was changed. Finally, the target was pitted against images assigned to each description type (i.e., causal, noncausal, baseline) an equal number of times for each child.
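The ordering constraints on the test cards can be expressed as simple checks. The sketch below is illustrative only: the trial representation (a tuple of three picture identifiers in left, center, right order, plus the target) and the function names are assumptions, and the example sequence is invented rather than taken from the study materials.

```python
from collections import Counter


def target_positions_balanced(trials):
    """True if the target sits left, center, and right equally often."""
    counts = Counter(cards.index(target) for cards, target in trials)
    return len(counts) == 3 and len(set(counts.values())) == 1


def consecutive_constraints_ok(trials):
    """Check the two constraints on pictures repeated across trials.

    A picture may not appear on more than two consecutive trials, and
    when it appears twice in a row its position must change.
    """
    for i in range(1, len(trials)):
        prev_cards, curr_cards = trials[i - 1][0], trials[i][0]
        for pic in set(prev_cards) & set(curr_cards):
            if prev_cards.index(pic) == curr_cards.index(pic):
                return False  # same position on back-to-back trials
            if i >= 2 and pic in trials[i - 2][0]:
                return False  # third consecutive appearance
    return True


# Hypothetical three-trial sequence: (left, center, right), target.
example = [
    (("A", "B", "C"), "A"),
    (("B", "D", "A"), "D"),
    (("D", "E", "F"), "F"),
]
print(target_positions_balanced(example), consecutive_constraints_ok(example))
```

A check of this kind does not capture the final constraint (pitting each target equally often against distractors from each description type), which depends on each child's condition assignment.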

Using a “curious” stuffed animal for pretence, the experimenter introduced the first test card. She asked, “Froggy wants to know where the gulla is. Can you point to the gulla?” After the child responded, the experimenter tested each of the remaining words with a different card.

Production testing.  The experimenter next pretended that Froggy wanted “to know what all of these things are called.” She first introduced a picture of a familiar object (i.e., a car) and asked the child what it was. She repeated this query for each of the newly trained stimuli with a break halfway through to test a second highly familiar item (i.e., a dog). Familiar items were intended to build confidence and encourage responding.

Delayed testing.  Participants returned 6–15 days later (M = 9) to repeat the comprehension and production testing procedures. Finally, children were asked, for each item, “Do you remember anything special about this one?” This “special property” probe was intended to elicit memories for the causal or noncausal information provided during training.


A primary coder recorded the choices made by each participant on each trial. They did so by viewing the test phase only of the recorded experimental sessions with the sound removed. A secondary coder independently recorded the choices made by 25% of the participants. Agreement was 100%. The primary coder also transcribed verbal responses to production and “special property” probes. Children received a production score of 1 for each word for which they correctly produced at least half of the constituent phonemes. Children received a “special property” score of 1 for each causal or noncausal description that they correctly produced in full. They received a score of .5 for correct, but incomplete, descriptions.
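The scoring rules for production and “special property” responses amount to two small functions, sketched below. The function names and the string labels for coder judgments are hypothetical, and a score of 0 for responses that meet neither criterion is an assumption, because the text specifies only the credited cases.

```python
def production_score(n_correct_phonemes, n_total_phonemes):
    """Score 1 if at least half of the word's phonemes were produced.

    Assumption: any response below the one-half criterion scores 0.
    """
    if n_total_phonemes <= 0:
        raise ValueError("a word must contain at least one phoneme")
    return 1 if 2 * n_correct_phonemes >= n_total_phonemes else 0


def special_property_score(judgment):
    """Map a coder's judgment of a recalled description to a score.

    'full'    -> 1.0 (description correctly produced in full)
    'partial' -> 0.5 (correct but incomplete)
    'none'    -> 0.0 (assumed; not stated explicitly in the text)
    """
    scores = {"full": 1.0, "partial": 0.5, "none": 0.0}
    return scores[judgment]


# A four-phoneme word with two phonemes correct meets the criterion.
print(production_score(2, 4), special_property_score("partial"))
```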


Despite uniformly correct labeling of the familiar objects (e.g., car), successful production levels for the novel words collapsed across condition were near floor (8.6%) and therefore were uninformative. Analyses of the comprehension data were more illuminating. We first calculated the proportion of trials on which each child correctly identified the referent in each treatment condition (see Figure 2). As predicted, performance in the baseline condition (M = 0.41, SE = 0.05) did not differ from chance, t(35) = 1.66, ns. This result held for performance at the first (M = 0.43, SE = 0.06) and second testing session (M = 0.39, SE = 0.06). We next conducted a repeated measures analysis of variance including experimental condition (noncausal vs. causal) and testing session (first vs. second) as within-subject factors and domain (artifact vs. animate) as a between-subjects factor. No main effect of domain was observed. However, a main effect of testing session was evident, F(1, 34) = 9.04, p = .005, as was a trend toward an effect of condition, F(1, 34) = 2.98, p = .09. These main effects were qualified by a significant interaction between them, F(1, 34) = 11.85, p = .002.

Figure 2.

 Mean proportion of trials on which the correct referent was chosen in each condition at initial and delayed testing.

At the time of the first testing session (i.e., 3-min delay), the causal (M = 0.35, SE = 0.06) and noncausal (M = 0.39, SE = 0.06) conditions did not differ from each other, t(35) = 0.62, ns. At the time of the second testing session (i.e., 1- to 2-week delay), however, performance in the causal condition (M = 0.64, SE = 0.06) outstripped that observed in the noncausal condition (M = 0.40, SE = 0.05), t(35) = 3.35, p = .002, d = .75. This analysis was confirmed by a nonparametric Wilcoxon signed-rank test, Z = 2.93, p = .003.

To assess whether children benefited significantly from training, we compared performance levels to chance (.33). At Test 1, performance was at chance in both conditions. At Test 2, performance remained at chance in the noncausal condition, t(35) = 1.40, ns, but rose above chance in the causal condition, t(35) = 5.62, p < .001. This analysis was confirmed by a nonparametric test on the distribution of children getting 0, 1, or 2 items correct (n = 4, 18, and 14, respectively) in the causal condition, χ²(2, N = 36) = 12.87, p = .002.
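The comparison to chance is a one-sample t test, which can be approximated from the reported summary statistics as t = (M − chance) / SE. The sketch below is an illustration under that standard formula, not a reconstruction of the original analysis; the function name is hypothetical. Because the reported means and standard errors are rounded to two decimals, values computed this way can differ from the published t statistics.

```python
def t_vs_chance(mean, se, chance=0.33):
    """One-sample t statistic computed from a mean and its standard error.

    chance defaults to .33, the guessing rate with three pictures per card.
    """
    return (mean - chance) / se


# Test 2 summary statistics as reported in the text (rounded values).
t_noncausal = t_vs_chance(0.40, 0.05)
t_causal = t_vs_chance(0.64, 0.06)
print(round(t_noncausal, 2), round(t_causal, 2))
```

With df = 35, the two-tailed critical value at α = .05 is roughly 2.03, so the noncausal value falls below it and the causal value falls well above it, in line with the pattern reported.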

Finally, to evaluate children’s memory for the “special properties” of items at Test 2, we compared average response scores across experimental conditions. As was the case for word learning, performance in the causal condition (M = 0.67, SE = 0.12) outstripped that observed in the noncausal condition (M = 0.22, SE = 0.37), t(35) = 4.09, p < .001. This result was echoed in a nonparametric analysis, Z = 3.30, p = .001. Importantly, memory for neither words nor “special properties” was related to the precise number of days that had passed since training in any condition (or overall); rs ranged from −.29 to .12, ns.


In this investigation, we considered whether the ability of preschoolers to learn new words varied with the type of information provided to them about the items being labeled. Our prediction that children would be more likely to learn labels for items that were described in terms of their causal, rather than their noncausal, properties was confirmed. Although children performed equally poorly (i.e., at chance) in all conditions when initially tested, children’s performance rose above chance in the causal condition whereas it remained unchanged in the noncausal (and baseline) conditions when tested after a longer term delay. Careful matching procedures mitigate explanations based on the perceptual availability, distinctiveness, or plausibility of the information provided.

These results are consistent with research demonstrating an early emerging and intimate relation between words and conceptual knowledge. This relation appears to be bidirectional. Words guide early individuation, categorization, and inductive generalization, and conceptual information guides word extension (e.g., Balaban & Waxman, 1997; Diesendruck & Bloom, 2003; Gelman & Coley, 1990; Graham et al., 2004; Kemler Nelson et al., 2000; Xu et al., 2005). The current work adds to this literature by demonstrating that conceptual knowledge facilitates the acquisition of new words in preschoolers. It should be noted that there are actually several levels at which a word might be said to be “acquired.” The current investigation demonstrates comprehension and retention over a lengthy delay. It does not demonstrate extension or production. Future research will be necessary to detail the scope of the reported effect.

Recent research by Kemler Nelson, O’Neill, and Asher (2008) also showed that providing functional information about novel artifacts facilitates acquisition of their names in preschoolers. In that work, children were taught novel words for four novel artifacts. Children were either told the function of each object (e.g., “To hit balls into the air”) or were told a fact that was irrelevant to category membership (e.g., “My brother gave this to me”). Children in the function condition outperformed those in the fact condition in tests of word learning. The current research extends this finding to the domain of animate kinds and demonstrates the effect of causal information relative to other category-relevant (rather than category-irrelevant) information that was matched for both distinctiveness and plausibility.

Together this research challenges the notion that early word learning is immune to the influence of conceptual knowledge (ALA; e.g., Colunga & Smith, 2008; Smith, 1999). More broadly, it is relevant to ongoing and heated debate regarding whether and how causal information is distinctly represented and/or processed (see Glymour, 2002; Gopnik & Schulz, 2007, for reviews and/or collections of relevant articles; Shanks, Holyoak, & Medin, 1996). This study demonstrates that early word learning processes are not only sensitive to differences between conceptual and nonconceptual information, but derive considerable benefit from the unique influence of the former. It should be noted, however, that work remains to be carried out toward specifying the range of conceptual information that provides this benefit. Although we suggest that the important distinction is between causal and noncausal information, other cuts are plausible. For example, it is possible that functions, but not other types of causal information (e.g., animal diet or habitat), play this special facilitative role in early word learning.

In advance of a final answer to these important questions, we can consider what is currently well established about human cognition that might account for the powerful influence of causal information on early word learning. Because the effects demonstrated here were not language specific (i.e., both names and “special properties” of target items were better remembered in the causal than the noncausal condition), it is likely that domain-general processes played a fundamental role in their expression. Two such processes seem of particular relevance.

First, focused attention at the time of learning facilitates memory (e.g., Craik et al., 1996; Uncapher & Rugg, 2005). Causal information might be particularly effective at attracting attention for several reasons including its distinctiveness, its complexity, and its explanatory force (Gopnik, 2000). Indeed, young children’s questions about novel objects overwhelmingly focus on properties that are causally relevant to their broader domain membership (i.e., artifacts vs. animate kinds; Greif et al., 2006). Children in the current investigation might therefore have paid more attention to those labeling episodes that were infused with domain-specific causal information. Although we cannot be sure from the current data, we think this is unlikely. In general, children were highly engaged in this task, with no indication that their interest levels rose and fell with the type of information offered. In fact, at test, when children responded incorrectly, they were equally likely to choose items that had previously been described in terms of their causal or noncausal properties (34% vs. 36%). Moreover, when, after completion of the study, we asked several participants which of the pictures they liked best, we found no relation between their selections and their word learning performance.

A second domain-general process that might be relevant concerns the positive influence of semantic elaboration on memory (e.g., Anderson, 1983; Craik & Tulving, 1975; Eysenck, 1979; Kirchhoff, Schapiro, & Buckner, 2005). Memories constructed around meaningful or enabling relations appear to be particularly robust in both adults and children (e.g., Barr & Hayne, 1996; Bauer & Fivush, 1992; Bradshaw & Anderson, 1982; Copple & Coon, 1977). By its very nature, causal information links other semantic knowledge (i.e., causes and effects) together in a meaningful way, thereby potentially enhancing memory.

Before offering a final summary and conclusion, we must address the delayed nature of the reported effect. This finding resonates with a growing literature highlighting the importance of consolidation, often achieved during sleep, for learning and memory (e.g., Fenn, Nusbaum, & Margoliash, 2003; Stickgold & Walker, 2005; Wilson & McNaughton, 1994). Recent evidence suggests that sleep might be particularly useful in enhancing processing of the abstract, relational properties of experience (e.g., Gomez, Bootzin, & Nadel, 2006; Stickgold & Walker, 2004; Wagner, Gais, Haider, Verleger, & Born, 2004), perhaps akin to the causal training provided here.

Still, this finding is somewhat surprising in light of studies documenting word learning in young children only minutes (or less) after training (see Bloom, 2000; Golinkoff et al., 2000). Indeed, Kemler Nelson et al. (2008) specifically demonstrated an effect of functional information on word learning after a 2-min delay. Even in our own pilot work (N = 13), we found evidence of word learning in a causal condition, very much like that utilized here, after only a 10-min delay. Perhaps key to reconciling these findings is the fact that all previous studies involved teaching children four or fewer words whereas the current experiment involved teaching six (along with additional novel information). Children might have simply been overwhelmed by this intensive training, thereby requiring more time thereafter to refocus their attention on the task. Clearly, further research is required to better specify the role of consolidation, as opposed to other processes like fatigue or forgetting, in the effects observed here.

In sum, the current investigation represents one of but a very few to consider the factors that determine how readily young children learn particular words (here, count nouns). We offer three principal conclusions. First, providing causal information about either artifacts or imaginary animals increases the likelihood that 3-year-olds will learn labels applied to them. Second, providing equally distinctive and nonobvious information that is not explicitly causally relevant does not facilitate word learning. Third, the learning promoted by causal information is not necessarily immediately obvious; a period of recovery from training and/or consolidation may be necessary. These conclusions are consistent with both evidence and theory tying causal information to the coherence and stability of concepts, as well as to the words that reference them. It is important to note, however, that the current findings do not imply that words cannot be learned on the basis of purely perceptual and/or causally tangential information. The demands of the task presented to children in this study were high. Under less demanding circumstances, children could surely learn words introduced without causal information. The current results, however, do indicate that when information processing resources are stretched, words for objects with known causal properties are the ones most likely to be learned (see also Bauer, 1992; Bradshaw & Anderson, 1982). We would argue that the everyday lives of 3-year-olds regularly present this sort of strain on cognitive resources.