# The Idea of an Exact Number: Children's Understanding of Cardinality and Equinumerosity

## Abstract

Understanding what numbers are means knowing several things. It means knowing how counting relates to numbers (called the cardinal principle or cardinality); it means knowing that each number is generated by adding one to the previous number (called the successor function or succession); and it means knowing that all and only sets whose members can be placed in one-to-one correspondence have the same number of items (called exact equality or equinumerosity). A previous study (Sarnecka & Carey, 2008) linked children's understanding of cardinality to their understanding of succession for the numbers five and six. This study investigates the link between cardinality and equinumerosity for these numbers, finding that children either understand both cardinality and equinumerosity or understand neither. This suggests that cardinality and equinumerosity (along with succession) are interrelated facets of the concepts five and six, the acquisition of which is an important conceptual achievement of early childhood.

What does it mean to say that a child understands numbers? There are many early milestones in number learning, and parents sometimes say that a toddler who can count to five or ten “knows” those numbers. Similarly, young children in literate environments learn to identify the written digits 0–9 along with letters of the alphabet and thus, in a sense, “know” the numbers. But what does it mean to understand numbers, in some important conceptual way? One operational definition comes from Piaget (1952). In the Piagetian tradition, children understand numbers when they pass the conservation-of-number task, around age 5 or 6 years. For Piaget, the key number concept is equinumerosity (sometimes called exact equality)—the idea that two sets have the same number of items, if and only if their members can be placed in perfect one-to-one correspondence (Frege, 1980 [1884]). The child's understanding of equinumerosity as an abstract principle is what the conservation task is supposed to measure.

A different operational definition of number knowledge arises in more recent work (e.g., Carey, 2009; Hurford, 1987; Klahr & Wallace, 1976). In this newer literature, children are said to understand numbers when they apply the cardinality principle of counting (Gelman & Gallistel, 1978) on the Give-N task (e.g., Condry & Spelke, 2008; Le Corre, Van de Walle, Brannon, & Carey, 2006; Sarnecka & Lee, 2009; Wynn, 1990, 1992). The cardinality principle states that the last word uttered in a (correct, rule-governed) count expresses the number of items in the whole set. It is the cardinality principle that gives number words their meanings, by making the cardinal meaning of any number word knowable from that word's ordinal position in the counting list. (For example, readers who do not speak Japanese—but do understand cardinality—can easily guess the meaning of the Japanese number word nijuuichi if they are told that it is the twenty-first word in the Japanese counting list.)

One current proposal about number development (Carey, 2009) is that children who understand cardinality (as measured by the Give-N task) also understand the key numerical concept of succession (often called the successor principle or successor function): the idea that each number is generated by adding one to the previous number (Dedekind, 1901 [1872/1888]). One empirical study (Sarnecka & Carey, 2008) supports this claim for the numbers 5 and 6, although another study (Davidson, Eng, & Barner, 2012) finds that this early understanding is less robust for higher numbers, such as 25.

Integrating the older and newer notions of what it means to “understand” numbers, Izard and colleagues identified equinumerosity and succession as “two key concepts on the path toward understanding exact numbers” (Izard, Pica, Spelke, & Dehaene, 2008). But how do these concepts interact in development? Our proposal in this study is that children's understanding of cardinality (as measured by the Give-N task) predicts their understanding of equinumerosity (at least for the numbers five and six).

Note that this connection is not obvious. The traditional litmus test for understanding equinumerosity is Piaget's conservation-of-number task, which children pass at age 5 or 6. As Muldoon, Lewis, and Freeman (2009) noted,

> The developmental puzzle is that up to the age of six, even some 2 years after they have mastered procedural counting, many children have yet to grasp that two sets with the same cardinal number must, by virtue of logical necessity, be equivalent, and that sets with different cardinals must by the same logic be numerically different. (pp. 203–204)

We will argue that Piaget's classic conservation-of-number task underestimated children's knowledge because it asked children about the abstract entity number, rather than about particular numbers such as five and six. (In other words, Piaget asked questions such as “Are there the same number of flowers and vases?” rather than “There are five flowers. Are there five vases, or six?”)

Piaget, of course, asked the question this way because he was interested in abstract and explicit knowledge that the child could articulate. But more recent theories of number-concept development (e.g., Carey, 2009) hold that children first learn about particular number words, and only later generalize their knowledge to the superordinate category of numbers. The counting list (one, two, three, etc.) is learned as a placeholder structure (something like the chant eenie, meenie, minie, mo), with little or no numerical content. The child acquires the deep numerical concepts (e.g., cardinality, equinumerosity, succession) during the process of assigning (or constructing, or discovering, depending on your theoretical bent) meanings for those number words. This is the process known as conceptual-role bootstrapping (Carey, 2009; Block, 1986; see also Quine, 1960). If children first learn about equinumerosity in the context of particular number words such as five and six, then measuring equinumerosity knowledge outside the context of particular number words (e.g., using only the word “number” as Piaget did) may obscure the early development of this knowledge.

There are hints of this in the findings reported by Sarnecka and Gelman (2004). That study investigated children's understanding of the specificity of number words. This is the idea that every number word picks out a specific, unique numerosity (Wynn, 1990, 1992). Some proposals had claimed that children did not understand this property of number words until they mastered the cardinality principle of counting (as measured by the Give-N task). Sarnecka and Gelman developed three new tasks to measure children's knowledge of specificity, two of which children passed before understanding the cardinality principle. Thus, Sarnecka and Gelman concluded that children understood specificity before cardinality.

This article revisits the third task—the one children failed until they understood cardinality. This was the “Compare-Sets” task. In it, children were shown two pictures, representing the snacks given to a pair of animals. The pictures were either identical or differed by one item. The children were told how many items one set had, and then were asked about the other set (e.g., “Frog has five peaches. Does Lion have five, or six?”).

The authors intended the Compare-Sets task to measure the child's knowledge that number words are specific. When non-CP-knowers (children who do not yet understand cardinality as measured by the Give-N task) passed two other “specificity” tasks but failed Compare-Sets, the authors concluded that the task was simply too difficult, and predicted that if the procedural demands could be reduced, the performance gap between CP-knowers and non-CP-knowers would disappear.

The present work tests that prediction and concludes that it was wrong. A new, simplified version of the Compare-Sets task actually makes the performance gap between CP-knowers and non-CP-knowers even more obvious. In light of this finding, we revisit the question of what the Compare-Sets task actually measures and argue for an answer that was not considered in the 2004 study: that the task does not primarily measure the child's knowledge of specificity, but of equinumerosity. So while children may indeed see number words as specific (or simply as being about quantity, another possibility consistent with the 2004 results), they do not understand equinumerosity (i.e., that all and only sets labeled by the same number word can be placed in one-to-one correspondence with each other) until they become CP-knowers.

Earlier studies have reported findings that are consistent with this possibility, although none have tested it directly. For example, Sophian (1988) presented 3- and 4-year-olds with two sets of objects (e.g., a group of jars arranged in a circle, with a pile of spoons in the middle). In half the trials, children were told the number of each set, and then asked about their correspondence. For example, “There are n jars. There are m spoons. Can every jar have its own spoon?” In the other trials, children were told about the correspondence and given the number of one set, and then were asked about the number of the other set. For example, “Every jar has its own spoon. There are n jars. Are there n spoons?” Sophian reported that about 30–40% of 3-year-olds, and 70–75% of 4-year-olds, succeeded on both types of trial. These are approximately the proportions of Sophian's (relatively high-SES) sample that we would expect to be CP-knowers if they were tested on the Give-N task.

Frydman and Bryant (1988) reported a similar finding. In that study, 4-year-olds were asked to divide (“share out”) a set fairly, and to count one of the resulting portions. Having done that, many of the 4-year-olds were able to infer the number of another, uncounted portion. (See Izard et al., 2008 for another related finding.)

This study revisits the Compare-Sets task and tests Sarnecka and Gelman's (2004) explanation for the performance gap between CP-knowers and other children (i.e., that the task was too procedurally difficult). A simplified version of the task greatly reduces the burden on attention and memory by leaving the sets visible the whole time (see Note 1). But contrary to Sarnecka and Gelman's prediction, simplifying the task does not eliminate the performance gap between CP-knowers and other children.

In light of these findings, we reconsider how this task should be interpreted. We suggest that the gap in performance between CP-knowers and non-CP-knowers may reflect CP-knowers’ understanding of equinumerosity—and that equinumerosity itself may be (along with understanding of the cardinality principle and the successor function) a manifestation of a broad conceptual achievement: the “exact numbers” idea—or at least the idea of the exact numbers five and six.

## 1 Method

### 1.1 Participants

Participants included 51 children (30 girls, 21 boys). Their ages ranged from 2 years 7 months to 4 years 1 month (mean age = 3;4, in years;months). All children were monolingual speakers of English. Children were recruited by mail and phone using public birth records in the greater Boston area, and were tested at a university child development laboratory in Cambridge, Massachusetts. Parents who brought their children in for testing received reimbursement for their travel expenses and a token gift for their child. No questions were asked about socio-economic status, race, or ethnicity, but participants were presumably representative of the upper middle SES, predominantly white and Asian communities in which they lived.

### 1.2 Procedure

#### 1.2.1 Give-N task

The purpose of this task was to determine the child's number-knower level (i.e., to determine which exact number word meanings the child knew) and specifically to determine whether the child understood the cardinality principle of counting. Materials included a stuffed animal (e.g., a bunny, ~20 cm high), a plastic plate (~11 cm in diameter), and a set of 15 plastic counters (e.g., apples, each ~3 cm in diameter). The experimenter began the task by placing the animal on the table and saying, for example, “In this game, we give things to the bunny, like this…” (here the experimenter mimed placing something on the plate and sliding the plate across the table to the animal). The experimenter then placed a bowl of 15 apples on the table in front of the child and said, “Can you give the bunny one?” After the child put one or more apples on the plate and slid the plate over to the animal, the experimenter asked a single follow-up question, which repeated the original number word asked for (e.g., “Is that one?”). If the child said “yes,” then the experimenter said, “Thank you!” and placed the apple(s) back in the bowl. If the child said “no,” then the experimenter repeated the original request.

The child was always asked for 1 on the first trial, and for 3 on the second trial. If the child succeeded on both of those trials, the third request was for 5. Otherwise, the third request was for 2. Further requests depended on the child's answers: If a child succeeded at giving some number N, the next request was for N + 1; the highest number requested was 6. If the child failed at giving N, the next request was for N − 1; the lowest number requested was 1. The task ended when the child had at least 67% successes (with a minimum of two trials) at a given number N, and at least 67% failures (with a minimum of two trials) at N + 1. This pattern was the basis for sorting into number-knower levels: Children who succeeded at 1 (but failed at 2 and higher) were called one-knowers; children who succeeded at 1 and 2 (but failed at 3 and higher) were called two-knowers, and so forth. Children who were able to generate all set sizes up to and including 6 were called cardinality-principle-knowers. Failures were counted against both numbers involved. For example, if a child gave four apples when asked for “two,” that was counted as failure on both “two” and “four.” (This replicates the diagnostic criteria used by Sarnecka & Gelman, 2004; as well as the criteria used by Le Corre et al., 2006; Le Corre & Carey, 2007; Lee & Sarnecka, 2010; Sarnecka & Carey, 2008; Sarnecka & Lee, 2009; and Wynn, 1990, 1992.)
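The request sequence and scoring rules just described can be sketched in code. This is a minimal illustration, not the authors' scoring script: the function names are hypothetical, and the classification rule used here (a child's level is the highest number with at least two trials and at least two-thirds successes) is a simplifying assumption derived from the stopping criterion.

```python
from collections import defaultdict

MAX_REQUEST = 6  # highest number requested in the task
LABELS = ["pre-number-knower", "one-knower", "two-knower",
          "three-knower", "four-knower", "five-knower"]


def next_request(history):
    """Return the next number to ask for.

    history is a list of (requested, succeeded) pairs, oldest first.
    """
    if len(history) == 0:
        return 1                      # first trial always asks for 1
    if len(history) == 1:
        return 3                      # second trial always asks for 3
    if len(history) == 2:
        both_ok = all(ok for _, ok in history)
        return 5 if both_ok else 2    # third trial depends on the first two
    requested, ok = history[-1]
    if ok:
        return min(requested + 1, MAX_REQUEST)
    return max(requested - 1, 1)


def tally(trials):
    """Count successes and failures per number from (requested, given) pairs."""
    successes = defaultdict(int)
    failures = defaultdict(int)
    for requested, given in trials:
        if given == requested:
            successes[requested] += 1
        else:
            failures[requested] += 1
            if 1 <= given <= MAX_REQUEST:
                failures[given] += 1  # a wrong set also counts against its own size
    return successes, failures


def knower_level(trials):
    """Assumed rule: highest N with >= 2 trials and >= 2/3 successes."""
    successes, failures = tally(trials)
    level = 0
    for n in range(1, MAX_REQUEST + 1):
        total = successes[n] + failures[n]
        if total >= 2 and 3 * successes[n] >= 2 * total:
            level = n
    return "CP-knower" if level == MAX_REQUEST else LABELS[level]
```

For instance, under this sketch a child whose trials are `[(1, 1), (3, 5), (2, 2), (3, 4), (2, 2), (3, 2)]` (requested, given) would be classified as a two-knower: the last trial counts as a failure on both “three” and “two,” which still leaves two successes out of three trials at “two.”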

#### 1.2.2 Compare-Sets task

The purpose of this task was to test whether children could extend a number word from one set to another on the basis of one-to-one correspondence between the sets. It is based on the task used by Sarnecka and Gelman (2004). Materials for this task included two stuffed animals (a frog and a lion) and eight pairs of picture cards, depicting the animals’ snacks. Each card showed a homogeneous row of five or six food items (e.g., peaches). Each pair of cards was either identical (e.g., five peaches and five peaches) or differed by one item (e.g., six cupcakes and five cupcakes). When the sets were different, there was an empty circle at one end of the row, highlighting the place where one item was missing.

The experimenter introduced the task in the following way. “This is a story about when Frog and Lion came to my house, and I gave them some snacks. I tried to make their snacks just the same, because they like their snacks to be the same. But sometimes I made a mistake, and their snacks were not the same. The first snack I gave them was peaches…”

Here, the experimenter placed the first pair of cards on the table, one in front of each animal, and asked the first control question, “Are their snacks just the same, or did I make a mistake?” Trials where children answered this question incorrectly were excluded from the analysis. The vast majority of “incorrect” answers occurred on the first trial where the sets differed, because the child often said that the snacks were “the same,” meaning that they were the same kind of food (e.g., the child often said something like “Yes, they both got peaches”). In this case, the experimenter emphasized the discrepancy by saying, for example, “Well, they both got peaches but… oh no! Look! I forgot to put a peach there! Doh! (slapping forehead) That's not right! I made a mistake! I'm so silly!”

After the child's attention had been drawn to the sameness or difference of the two sets, the experimenter either removed the cards (“hidden” trials) or left them sitting in full view (“visible” trials) and asked the test question, which gave the child the number of one set and asked about the other. In the “hidden” trials, the question was of the form, “Frog had five cupcakes. Did Lion have five, or six?” In the “visible” trials, the question was of the form, “This (pointing to Frog's snack) is six peaches. Is this (pointing to Lion's snack) five, or six?”

Children kept their hands in their lap and did not count the items on the cards. (In the “hidden” trials, the cards were removed from view before the test question, so there was nothing to count anyway.) On “visible” trials, if a child made any move to count (e.g., by pointing to an item), the experimenter removed the cards from view and said, “This is not a counting game. You can just guess” and then waited for the child to return hands to lap before laying the cards back on the table. Such exchanges were rare, because children rarely made any attempt to count the items.

On the “hidden” trials, the test question was followed by a final control question, “And were their snacks just the same, or did I make a mistake?” Trials where children failed this final control question were also excluded from the analysis.

Each child completed a block of four “visible” trials and a block of four “hidden” trials, for a total of eight trials. Within each block, the set sizes given to Frog/Lion were 5/5, 6/6, 5/6, and 6/5. Order of blocks, and order of trials within each block, was counterbalanced across subjects.

#### 1.2.3 Data analysis

Responses were binary (correct/incorrect) and each child could contribute up to eight valid responses, one for each trial. A common way to analyze such data is to collapse across the levels of some factors (e.g., to add up each child's responses, creating a score of 0–4 for visible trials and a score of 0–4 for hidden trials). However, in this case, we chose not to collapse across different trial types because we did not want to assume that any factors were unimportant. Instead, we analyzed these data by fitting generalized linear mixed-effects models (McCulloch, 2003) with a logit link function. In the interest of clarity, our presentation of these results will only include the details of the fitting process where those details are critical to understanding or evaluating the results. The actual fits were done using the lme4 (Bates, 2005) package in R (version 2.10.1; R Development Core Team, 2006).

## 2 Results

### 2.1 Give-N task

Based on their performance in the Give-N task, children were identified as either cardinality-principle-knowers (CP-knowers, n = 22) or non-cardinality-principle-knowers (non-CP-knowers, n = 29). Among the non-CP-knowers, there were five pre-number-knowers, five one-knowers, nine two-knowers, seven three-knowers, and three four-knowers. There was a strong relationship between age and knower level, r(51) = .66, p < .001, reflecting the fact that older children knew more than younger children. There was no evidence of differences in the knower levels of boys versus girls, F(1,49) = 0.713. Except where noted, the non-CP-knowers were collapsed into a single group in the analyses reported below.

### 2.2 Compare-Sets task

The initial control question, “Are their snacks just the same, or did I make a mistake?” was asked before the test question on all trials. On the “hidden” trials only, the control question was repeated after the test question, to check that the child still remembered whether the sets had been identical. As described in the method section above, children often misunderstood the control question at first, taking it to mean “same type of food” rather than “same amount of food.” Altogether, one or both control questions were answered incorrectly on 32% of trials. There was a statistically reliable tendency for the same children to miss control questions in both the hidden and visible conditions, r(51) = .311, p = .027. There were just three instances in which a child answered the control question correctly and then refused to answer the test question. All subsequent analyses excluded trials on which the child either failed to answer one or both control questions correctly, or did not answer the test question (e.g., trials that were not completed because the child decided to quit playing).

These exclusions eliminated one child from the data set: a female two-knower, age 3;1. We also considered dropping the data from four other children who, after the exclusions, had no valid responses on any of the four trials in one block (two children in the “hidden” block and two in the “visible” block). However, the data from these four children were retained after we determined that doing so did not qualitatively alter any of our conclusions. After these exclusions, there were 273 responses in the data.

The analysis of the remaining data from the Frog-and-Lion task focused on the effects of five factors. “Participants” was the one random factor. There was one two-level, between-participant, fixed effect: “CP-knower status” (CP-knower/non-CP-knower). And there were three two-level, within-participant, fixed effects: “visibility” (hidden/visible), “N-first” (5/6 objects in the first set presented), and “same-different” (identical sets/discrepant sets). The model based on these factors that best fit the data was one that included the main effect of CP-knower status (z = 4.033, p < .001), and the main effect of visibility (z = 2.044, p = .041), and that allowed the variance of the random effect of participants to differ across the levels of the visibility factor. The interaction of these two factors was not statistically reliable (z = 1.587, p = .113). Collapsing over the visibility factor, children in the non-CP-knower group performed better than chance (z = 6.149, p < .001).
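As a notational sketch, one reading of the best-fitting model described in words above can be written out as follows. The symbols and the 0/1 coding are assumptions introduced here for illustration: $y_{ij}$ is 1 if child $i$ answered trial $j$ correctly, $c_i$ is 1 for CP-knowers, $v_{ij}$ is 1 for visible trials, and $u$ is the participant random effect, whose variance depends on visibility.

$$
\operatorname{logit}\,\Pr(y_{ij}=1) = \beta_0 + \beta_{\mathrm{CP}}\,c_i + \beta_{\mathrm{vis}}\,v_{ij} + u_{i,\,v_{ij}},
\qquad
u_{i,0} \sim \mathcal{N}\!\left(0,\sigma_{\mathrm{hidden}}^{2}\right),\quad
u_{i,1} \sim \mathcal{N}\!\left(0,\sigma_{\mathrm{visible}}^{2}\right)
$$

Whether the two participant effects were allowed to correlate is not stated in the text, so that detail is left open in this sketch.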

However, being a CP-knower increased the average probability of responding correctly from .59 to .85. Collapsing over CP-knower status, having the sets visible rather than hidden increased the probability of responding correctly from .67 to .80. This final model was simpler because none of the potential terms involving either of the presentation variables (N-first or same-different) did much to improve the fit of the model. In other words, it made little difference whether the first set size presented was five or six; nor did it matter much whether the sets were identical or different. Of these terms, the most significant had z = −1.367, p = .172, and most were substantially less important.

We also re-ran the analysis to include the 70 trials on which the child missed the first control question, because Sarnecka and Gelman (2004) did include such trials in their analysis. (“Hidden” trials on which the child missed the second control question were still excluded.) Results were similar to the first analysis, showing main effects of CP-knower status (z = 4.962, p < .001) and visibility (z = 2.350, p = .019). However, in this analysis the interaction of these two factors was also statistically significant (z = 2.046, p = .041), meaning that having the sets remain visible was more helpful to CP-knowers than to non-CP-knowers.

#### 2.2.1 Analysis of age effects

Before accepting that this model provided an appropriate summary of these data, we felt it was important to explore two plausible alternatives. The first is that CP-knower status and success in this task both may reflect a developing maturity that can be indexed by age. Certainly, as reported in the previous section, number-knower level in these data was strongly correlated with age (see Fig. 1a). However, when the model was expanded to include linear, quadratic, and cubic age terms, the fit was improved no more than if these had been random predictors (χ²(3) = 2.097, p = .553), and the size of the coefficient in the model associated with CP-knower status was attenuated by less than 1% and remained statistically significant (z = 3.770, p < .001). In other words, despite the correlation between knower level and age, it was CP-knower status (and not age) that predicted success on the task (see Fig. 1b).

#### 2.2.2 Analysis of differences among the non-CP-knower levels

The second alternative explored whether differences in number-knower level, other than the distinction of CP-knower/non-CP-knower, explained any variation in performance. Including number-knower level (i.e., pre-knower, one-knower, two-knower, three-knower, or four-knower) instead of CP-knower status (i.e., CP-knower or non-CP-knower) as a factor in the model did not substantially improve the model fit (χ²(4) = 6.765, p = .149). As it happened, the only non-CP-knower level with performance that differed reliably from the overall average was three-knowers, who did significantly worse than the average (z = −2.827, p = .005). Lacking any principled explanation for this anomaly, we assume that it was a fluke. In the model that included number-knower level as a factor, there was still a reliable difference between CP-knowers and the overall average, z = 3.866, p < .001 (see Fig. 2).

## 3 Discussion

These results suggest a number of things. First, Sarnecka and Gelman's (2004) interpretation of their Compare-Sets results was incorrect. In that paper, the task was seen as a way of measuring the child's knowledge that each number word picks out a specific, unique numerosity. But if that is what the task measures, then why should CP-knowers (i.e., children who have already figured out the cardinality principle of counting) perform so much better than non-CP-knowers on the Compare-Sets task?

The data present two patterns that seem to require explanation: First, the non-CP-knowers (as a group) performed slightly better than chance. Second and more striking, the CP-knowers performed much better than the non-CP-knowers not only at the group level but also at the individual level. In fact, every single CP-knower outperformed every single non-CP-knower (see Figs. 1b and 2).

Any explanation for these patterns is necessarily speculative, but the lack of overlap in performance by CP-knowers and non-CP-knowers seems consistent with the possibility that the two groups are using different strategies on the task.

For example, the non-CP-knowers could perform slightly better than chance because some implicit pragmatic bias led them to repeat the same word when the sets were termed “the same,” and to choose the other word when the sets were termed “not the same.” Such a rule need not be specific to number words and could apply in the absence of any conceptual understanding of what makes two sets numerically “the same” (that is, without the child having any understanding of equinumerosity). Of course, a child who represented this rule explicitly, and applied it consistently, would get every trial correct. Because no non-CP-knower even approached perfect performance, it might make more sense to think of such a pragmatic constraint as operating subtly and implicitly on children's behavior.

The more interesting question is, why do CP-knowers (and only CP-knowers) perform so well on this task? What number knowledge do they have, that non-CP-knowers lack? Obviously, CP-knowers (by definition) understand how counting works. But children were not allowed to count the items in the Compare-Sets task, so counting skill alone cannot directly explain the CP-knowers’ success.

It has previously been argued that only CP-knowers understand how the successor function generates one set size to go with the number word five, and another set size to go with the word six (Sarnecka & Carey, 2008). But this does not explain the present results, because the Compare-Sets task does not directly measure understanding of succession (e.g., it does not measure the child's knowledge that the set is increased by exactly one item with each word in the list).

What the task does measure is the child's knowledge of equinumerosity. This may be (like cardinality and succession) a concept that CP-knowers have—at least for the numbers five and six—and non-CP-knowers lack. If so, then the conceptual achievement that has long been called the cardinality-principle induction might be better termed the cardinality-principle-successor-function-equinumerosity induction, an unwieldy term indeed. Following Izard et al. (2008), we prefer to use a simpler term: the “exact numbers” idea.

There is another possibility to consider. CP-knowers might succeed on the Compare-Sets task, not because they understand equinumerosity, but because they have mapped the number words “five” and “six” to quantity representations in the innate approximate number system. Le Corre and Carey (2007) showed that non-CP-knowers do not have such mappings, and that children construct them several months after making the CP-induction. If the CP-knowers (and only the CP-knowers) were able to estimate five and six items without counting, then they might be able to answer the question “Does Lion have five, or six?” simply by looking at (or remembering) Lion's snack and estimating how many items were in it. This might also explain why keeping the sets visible was more helpful to CP-knowers than to non-CP-knowers.

However, there are two problems with this explanation. First, it would require that all the CP-knowers in our study had mapped “five” and “six” to the approximate number system (ANS), because all the CP-knowers performed quite well on the task. But Le Corre and Carey (2007) found that children do not construct such mappings until some 6 months after making the CP-induction. This would lead us to expect that at least some of the CP-knowers (i.e., the ones who had not been CP-knowers for very long) would still be what Le Corre and Carey called “non-mappers.” The fact that in our study, every CP-knower performed well suggests that their performance was tied more directly to the CP-induction itself. The other problem with the ANS-mapping explanation is that the CP-knowers' performance is just more accurate than would be expected under an estimation account. Even if all the CP-knowers in our sample had constructed ANS mappings for the number words, previous research suggests that these mappings would not be precise enough to allow the children to correctly discriminate 5 from 6 some 85% of the time (Halberda & Feigenson, 2008; Piazza et al., 2010). For these reasons, we tend to favor the idea that the present results reflect CP-knowers’ greater understanding of equinumerosity (relative to non-CP-knowers) rather than differences in the ANS acuity of the two groups, or in their mapping of number words to ANS representations.
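The second objection can be made concrete with a common simplification of the approximate number system, in which a set of n items evokes a Gaussian estimate with mean n and standard deviation w·n, where w is the Weber fraction. The sketch below computes expected accuracy for telling 5 from 6 under that assumed model. It is only a stand-in: the model is not fitted to the present data, it treats the task as a comparison of two noisy magnitudes rather than as number-word labeling, and the function names and example values of w are purely illustrative.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ans_accuracy(n1, n2, w):
    """Expected proportion correct when comparing two noisy magnitudes.

    Assumes each numerosity n is represented as Normal(n, (w * n) ** 2),
    so the difference has standard deviation w * sqrt(n1**2 + n2**2).
    """
    if w <= 0:
        raise ValueError("Weber fraction must be positive")
    return normal_cdf(abs(n1 - n2) / (w * math.hypot(n1, n2)))


def weber_fraction_needed(n1, n2, target, lo=1e-6, hi=5.0):
    """Bisect for the Weber fraction that yields the target accuracy."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if ans_accuracy(n1, n2, mid) > target:
            lo = mid   # still more accurate than needed: noise can grow
        else:
            hi = mid
    return (lo + hi) / 2.0


for w in (0.2, 0.4, 0.6):  # illustrative values only
    print(f"w = {w:.1f}: accuracy for 5 vs. 6 = {ans_accuracy(5, 6, w):.2f}")
print(f"w needed for 85% correct: {weber_fraction_needed(5, 6, 0.85):.2f}")
```

Under these assumptions, 85% accuracy on 5 versus 6 requires a Weber fraction of roughly 0.12, which gives a sense of how precise the representations would have to be for an estimation account to work.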

To us, the most important implication of the present work is educational. One of the main goals of pre-kindergarten math education should be to make sure that all children understand exact numbers. That is, children need to understand the principles of cardinality, equinumerosity, and succession (for some portion of the counting list, say up to 10) before they start kindergarten. Children who lack these concepts really do not know what numbers are. Without that understanding, they cannot make sense of number operations, greater than/less than comparisons, or other content in the early elementary math curriculum. We hope that the present work will help to clarify the importance of the “exact numbers” idea as a conceptual-development milestone of the preschool years.

## Acknowledgments

This study is based on work supported by the National Institutes of Health under NICHD R03HD054654 to the first author, and by the National Science Foundation under DRL 0953521 to the first author. Data collection was supported by the National Science Foundation under REC 0337055 to Elizabeth Spelke and Susan Carey. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors, and do not necessarily reflect the views of the National Institutes of Health or the National Science Foundation. We thank the children who participated in this study and their families; thanks also to research assistants Alexandra Cerutti and Jyothi Ramakrishnan for their help with data collection, and to Prof. Jeremy Heis for his very helpful insights.

### Note

1. Thanks to Prof. Kirsten Condry for suggesting this way of simplifying the task.