Connecting neural coding to number cognition: a computational account


Richard W. Prather, 1101 E 10th St, Bloomington, IN 47405, USA; e-mail:


The current study presents a series of computational simulations that demonstrate how the neural coding of numerical magnitude may influence number cognition and development. This includes behavioral phenomena cataloged in cognitive literature such as the development of numerical estimation and operational momentum. Though neural research has begun to describe neural coding of number, it is unclear how specific characteristics of the neural coding may relate to the expansive list of behavioral phenomena in the development of number cognition. The following study considers several possibilities.


Number cognition, broadly speaking, includes numerical estimation, simple arithmetic operations, magnitude judgments, and counting amongst other skills. There is a long history of research on number cognition, including the cognitive and neural processes involving numerical magnitude. Research includes behavioral studies of number development (e.g. Gelman & Gallistel, 1978; Piaget, 1954, amongst others) and more recently a large number of neural studies relevant to number cognition (e.g. Ansari & Dhital, 2006; Ansari, Garcia, Lucas, Hamon & Dhital, 2005; Cantlon, Brannon, Carter & Pelphrey, 2006; Cantlon, Libertus, Pinel, Dehaene, Brannon & Pelphrey, 2008; Cohen Kadosh & Walsh, 2009; Dehaene, Piazza, Pinel & Cohen, 2003; Göbel, Calabria, Farnè & Rossetti, 2006; Pesenti, Thioux, Samson, Bruyer & Seron, 2000; Walsh, 2003; Whalen, McCloskey, Lesser & Gordon, 1997). This increasingly large literature involving humans has been supplemented by research with non-human primates (e.g. Brannon & Terrace, 1998; Nieder & Miller, 2003; Roitman, Brannon & Platt, 2007) and by computational methods that incorporate neural principles (e.g. Ahmad, Casey & Bale, 2002; Dehaene, 2007; Dehaene & Changeux, 1993; Verguts & Fias, 2004; Zorzi, Stoianov & Umiltà, 2004).

Significant contributions to cognitive development have been made through computational modeling that connects neural and behavioral data in areas such as language learning (Elman, 1993), motor development (e.g. Spencer, Simmering, Schutte & Schöner, 2007), and visual development (e.g. Mareschal & Johnson, 2002). For example, Spencer and colleagues use neurocomputational modeling to provide evidence for a novel interpretation of the classic A-not-B error developmental phenomenon (Piaget, 1954). By modeling visual-motor neural processes, Spencer and colleagues conclude that the A-not-B phenomenon is an example of a broader class of errors that occur in development. The current study presents a series of simulations based on recent advances in the study of the neural coding of numerical magnitude that offer new insights into behavioral phenomena described in the developmental literature.

Neural coding of number

A variety of investigations with both humans and non-human primates have characterized the neural activity related to the perception of number. First, research has focused on the localization of neural activity specific to number. There has been convergence on the intraparietal sulcus and areas of prefrontal cortex (e.g. middle frontal gyrus) from both humans (e.g. Ansari & Dhital, 2006; Ansari et al., 2005; Cantlon et al., 2006; Cantlon et al., 2008; Dehaene et al., 2003) and non-human primates (Nieder, Freedman & Miller, 2002; Nieder & Miller, 2003; Nieder & Merten, 2007; Sawamura, Shima & Tanji, 2002). Numerical coding activity has been recorded in both intraparietal sulcus and prefrontal cortex; two areas that have been found to be functionally connected (Cavada & Goldman-Rakic, 1989; Chafee & Goldman-Rakic, 2000; Quintana, Fuster & Yajeya, 1989). Neural activity in these areas has been recorded in tasks such as number magnitude comparison, arithmetic operations and even the perception of a digit. The basic result has been replicated across a variety of presentation formats, such as dot displays and written digits (Eger, Sterzer, Russ, Giraud & Kleinschmidt, 2003) and cultures (Tang, Zhang, Chen, Feng, Ji, Shen, Reiman & Liu, 2006).

Second, studies have described in detail neural responses to number with the use of direct neural recording. Two types of neural coding have been described: number selective coding and summation coding. Summation, or monotonic coding, of number includes graded coding that increases as the perceived number magnitude increases (Roitman et al., 2007). This type of coding is consistent with the accumulator model of number representation, in which number is represented by accumulating a fixed number of pulses produced serially by some pacemaker (Meck & Church, 1983). There is also evidence of number specific activity in that the spiking rate of a given set of neurons is correlated maximally to a particular value N, and less so for N + 1, N − 1 and so on (Nieder et al., 2002; Nieder & Miller, 2003; Nieder & Merten, 2007; Sawamura et al., 2002). This holds across presentation format (e.g. dot displays, written digits) of the numerical values. This type of coding creates a Gaussian-like neural tuning function (see Figure 1). Each number magnitude is not coded exactly, but in a manner that is consistent with Weber-Fechner’s law (Fechner, 1966 [1860]), which holds that noticeable differences between perceptual stimuli are a function of the proportional difference. As the magnitude of the number increases the neural tuning function width increases proportionally. For example, the width of the tuning function for the magnitude 5 is half that of the magnitude 10, which is half that of 20. Thus differences in the perceived value are a function of the proportional stimulus differences, as with Weber-Fechner’s law.

Figure 1.

Example neural tuning functions. Values 10 (black) and 20 (grey) are shown on both linear (top graph) and log (bottom graph) scales.

Theories of how number sensitive neural activity develops have been supported by computational models (e.g. Ahmad et al., 2002; Dehaene, 2007; Dehaene & Changeux, 1993; Miller & Kenyon, 2007; Pearson, Roitman, Brannon, Platt & Raghavachari, 2010; Verguts & Fias, 2004). These studies demonstrate the development of number selective activity from other inputs, such as perceptual object tracking, or accumulator-like summation coding (Miller & Kenyon, 2007; Verguts & Fias, 2004). Computational results show number selective activity coded with tuning functions that are proportional to the number magnitude, skewed on the linear scale and symmetric on the log scale, similar to the neural data (Dehaene, 2007).

The current simulations

The current simulations are in part based on prior neural and computational work. General aspects of the model such as Gaussian tuning curves for number values have been illustrated in prior neural (e.g. Nieder & Miller, 2003) and computational work (Dehaene, 2007; Verguts & Fias, 2004). The current model posits these basic aspects and focuses on developmental change in both the neural activity and behavior. Prior computational work has not provided a clear mechanism of how the neural coding of number may influence developmental behavioral phenomena, such as the apparent log to linear shift in number line estimations; ‘what triggers the conceptual shift from logarithmic to linear in children remains unknown’ (Dehaene, 2007, p. 557). The current focus on how changes in neural activity may influence behavioral changes provides possible answers to this and other questions of numerical development.

The current model focuses on two aspects of the neural tuning curves. First, the width of the function depends on the magnitude of the value being coded. Thus the tuning function for the value 10 is narrower than the function for the value 30, on a linear scale. The functions are proportionally similar, and thus similar on a log scale (Nieder & Miller, 2003; see Figure 1). Second, the tuning functions, though resembling Gaussian distributions, are positively skewed on a linear scale. The positive skew also results from the transformation from a logarithmic scale to a linear scale; if the tuning function is symmetric on a log scale it will be positively skewed on a linear scale. In their studies of non-human primates, Nieder and Miller (2003) reported that neural responses are positively skewed on a linear scale. In addition, Nieder and Merten (2007) found that in the coding of values 1–30, smaller values are clearly positively skewed, and larger values are not skewed as much. Computational accounts (Dehaene, 2007; Verguts & Fias, 2004) have shown positive skew in number coding that arises through unsupervised learning with number magnitudes. Thus these two properties – the logarithmic scale and the positive skew – may be fundamental aspects of the human number system. Although both positive skew and proportional tuning functions have been reported in the literature, their role in number cognition has not been well studied.
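The two properties can be made concrete with a short derivation. As a sketch only, assume a tuning function that is exactly Gaussian in log10(n), with preferred value T and width parameter s; this functional form and the symbols w, n_+ and n_- are illustrative assumptions, not quantities taken from the neural data.

```latex
% Assumed tuning function for preferred value T (symmetric on a log scale)
f(n) = \exp\!\left( -\frac{(\log_{10} n - \log_{10} T)^2}{2 s^2} \right)

% Half-maximum points: f(n_\pm) = 1/2
\log_{10} n_\pm - \log_{10} T = \pm w, \qquad w = s \sqrt{2 \ln 2}
\quad\Longrightarrow\quad n_\pm = T \cdot 10^{\pm w}

% Property 1: width on the linear scale is proportional to T
n_+ - n_- = T \left( 10^{w} - 10^{-w} \right)

% Property 2: the upper flank is longer than the lower flank (positive skew),
% because 10^{w} + 10^{-w} > 2 for every w > 0
n_+ - T = T \left( 10^{w} - 1 \right) \;>\; T \left( 1 - 10^{-w} \right) = T - n_-
```

Under this assumption the full width at half maximum doubles whenever T doubles, and on a linear scale the curve always extends further above T than below it.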

The current study includes a series of computational simulations that explore how the properties of the neural coding of number may contribute to the development of number cognition. More specifically, the simulations provide a likely neural mechanism for several phenomena previously only described behaviorally. The tasks used in the simulations reflect the tasks used in behavioral investigations of number line estimation and operational momentum. Within the simulations, for a given set of numerical values there is a corresponding set of neural tuning functions that resemble Gaussian distributions, with peak activity corresponding to the number being coded (see Figure 2). The simulations specifically examine the relation in coding between the positive skew and the varying width of the tuning function. Building on the neural evidence (Nieder & Miller, 2003), it is assumed that the narrower distributions that characterize small number values are more skewed than the wider distributions that represent larger numbers. Thus, the tuning functions resemble a Poisson distribution in that both display positive skew that attenuates as the mean increases. Poisson distributions have a history of use in neurocomputational work in describing neural spike trains (Ashby & Valentin, 2007; Boccaletti, Latora, Moreno, Chavez & Hwang, 2006; Song, Miller & Abbott, 2000). The tuning curves presented in prior work (Nieder & Miller, 2002) are arranged to show one particular neural population’s relative activation to varied numerical stimuli. The tuning curves used in the current work represent the relative activation of a range of neural populations in response to one specific numerical stimulus. The shape and characteristics of the neural tuning curves, if viewed this way, retain the identical shape of a positively skewed Gaussian curve.

Figure 2.

Example tuning functions used in the current simulations (s = 0.5) for number magnitudes 5, 20, 50 (black, grey, dotted lines). Linear (top) and log (bottom) scales shown.

Prior research has also reported that when behavioral errors occur, the neural activity for the preferred quantity is significantly reduced compared to correct trials (Nieder et al., 2002; Nieder, Diester & Tudusciuc, 2006; Nieder & Miller, 2004; Nieder & Merten, 2007). Errors in neural coding of number were linked to errors in the behavioral task. This is key to the current framework. Errors or lack of precision in neural coding may occur and give rise to these same properties in numerical judgments.

Number estimation

By a variety of measures, young children are poor estimators of numerical values and relative quantities in comparison to adults (e.g. Siegler & Booth, 2004; Opfer & Siegler, 2007). One task that has been used to investigate the development of number estimation is the mapping of number values to spatial representations such as a number line (e.g. Baroody, 1999; Booth & Siegler, 2006; Opfer & Siegler, 2007; Siegler & Booth, 2004; Siegler & Opfer, 2003). Older children’s and adults’ estimates are linear, but preschool (and young school age) children produce estimates that are overall logarithmic. Researchers have interpreted this developmental change as a change in children’s cognitive ‘representation’ of number, from one that is initially solely logarithmic to one that also includes linear (e.g. Siegler & Booth, 2004; for an alternative view see Moeller, Pixner, Kaufmann & Nuerk, 2009). In brief, by this account, younger children rely on representations of number on a log scale while older children are able to use multiple representations, including linear. Though the behavioral phenomenon is quite robust, it is unclear what precipitates the change toward linear estimation other than increased experience with numbers, nor is it clear why young children initially have a logarithmic representation. Just what might be changing as a function of experience with numbers?

The advances in understanding the neural coding of discrete quantities offer a potential account. The assumption is that cognitive-level representations may reflect underlying properties of the neural code. As pointed out by many (e.g. Nieder & Miller, 2003; Johnson, Hsiao & Yoshioka, 2002), studying behavior limits conclusions to the realm of cognitive representations; however, what we know about the neural code suggests a clear hypothesis about the transition from logarithmic to linear mapping of numbers to a number line. Children’s difficulty in the number estimation task may arise because the widths of the tuning functions increase proportionally with the magnitude of the number, whereas the spatial representation of number on a number line is not proportionally scaled. Although this is true for adults as well as children, mapping from a proportional representational system to a linear one may be more difficult for young children than adults if the tuning functions change in certain ways with age. This is the question investigated in the simulations.

The present approach is consistent with findings suggesting that children and adults often use the same neural networks for a task, and that differences in performance are largely a matter of magnitude, timing, or extent of activation (Brown, Lugar, Coalson, Miezin, Petersen & Schlaggar, 2005; Casey, Galvan & Hare, 2005; Casey, Giedd & Thomas, 2000; Durston, Davidson, Tottenham, Galvan, Spicer, Fossella & Casey, 2006; Gaillard, Hertz-Pannier, Mott, Barnett, LeBihan & Theodore, 2000; Rubia, Overmeyer, Taylor, Brammer, Williams, Simmons, Andrew & Bullmore, 2000; Schlaggar, Brown, Lugar, Visscher, Miezin & Petersen, 2002). That is, children may show quantitatively poorer or qualitatively different patterns of performance because their networks are noisy and are less able to drive activation of parts of the network at the appropriate moment or to the optimal degree. This has been illustrated in computational work in which narrowing of tuning functions of neurons contributes to modeling developmental changes in cognition (e.g. Simmering, Schutte & Spencer, 2008; Schutte, Spencer & Schöner, 2003). Narrow tuning curves have been shown to be necessary for accurate coding of number (Diester & Nieder, 2008). In addition, behavioral work shows that the Weber fraction, the smallest proportional difference that can be differentiated, changes with age (Halberda, Mazzocco & Feigenson, 2008), which may indicate a change in these underlying tuning functions.

The following series of simulations shows (1) that the combination of positive linear skew and broad neural tuning functions leads to estimation errors that are overall logarithmic; and (2) that the log to linear development in number estimation is facilitated by the neural coding of number and its development, specifically that the narrowing of neural tuning curves with development results in the log to linear shift seen in the behavioral literature.

Model specifications

The following simulations use vectors to represent neural tuning functions. Each item in a vector represents the relative activation level (spiking rate) for a group of neurons that respond selectively to a particular number magnitude. Each simulation included one vector for each of the number magnitudes to be estimated. For example, the value A was represented by the vector A(n1, n2, …, n150), where nx is the activation for the neurons selective for the number magnitude x. Vectors for values A = 1 through 100 were calculated and each vector contained 150 activation values. For example, the activation value at index 5 corresponds to the average activation for all neurons which respond maximally to the number magnitude 5. Activation values represent the relative activation levels for that specific vector only and do not correspond to specific spiking rates. Research suggests that the maximum spiking rate for large numbers is actually lower than for smaller numbers (e.g. Nieder & Dehaene, 2009), thus here relative spiking rates are used for ease of comparison. Activation values for each vector were calculated using a modified Gaussian distribution function. This is a general function that defines a variety of Gaussian distributions. Similar equations have been used in prior computational work (Dehaene, 2007).


The values of h and m are set as constants for all simulations: h, the maximum value of the function, is set to 1, and m, the mean of the distribution, is set to zero. The value of s determines the width of the curve and varies across model instantiations. The value of x is defined by the logarithmic difference between the target number and the vector item index. For example, if the target number is A = 6 and s = 1, then for A(n6), x = log10(6) − log10(6) = 0. The remaining equation variables are constants other than s, which for this example is equal to 1. The equation result is A(n6) = 1; thus when the vector index is equal to the target number the relative activation equals 1. Then, for A(n4), x = log10(6) − log10(4) = 0.176, and A(n4) = 0.984. Thus, for index 4 the relative activation is slightly reduced. Defining x by logarithmic differences results in Gaussian functions that are symmetric on a log scale and of identical width (see Figure 2). On a linear scale the functions vary in width and positive skew (skew here merely refers to the fact that the function is not symmetric about the mean). Smaller values are both narrower and more skewed. Again, this is simply the consequence of transforming a Gaussian curve that is symmetric on a log scale to a linear scale.
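The worked example can be checked numerically. The sketch below assumes the standard Gaussian form h·exp(−(x − m)²/(2s²)); this form reproduces the quoted values for s = 1, but whether the width parameter enters as s or s² is an assumption, since the two coincide at s = 1. The function names are illustrative and are not taken from the original MATLAB code.

```python
import math


def tuning_activation(target, index, s=1.0, h=1.0, m=0.0):
    """Relative activation of the population preferring `index` when the
    stimulus is `target`, using an ASSUMED Gaussian in log10 space."""
    x = math.log10(target) - math.log10(index)  # logarithmic difference
    return h * math.exp(-((x - m) ** 2) / (2.0 * s ** 2))


def tuning_vector(target, length=150, s=1.0):
    """Vector A(n1, ..., n_length); list position k holds index k + 1."""
    return [tuning_activation(target, i, s) for i in range(1, length + 1)]


vec = tuning_vector(6)   # target number A = 6, s = 1
print(vec[5])            # A(n6): index equals target, so activation is 1.0
print(vec[3])            # A(n4): about 0.9846, close to the 0.984 in the text
```

Because x depends only on the ratio of target to index, every target yields the same curve on a log axis, which is why only the linear-scale width and skew vary with magnitude.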


All simulations were evaluated using MATLAB (Mathworks) software. A series of simulations were evaluated, including, as a point of comparison, both symmetric and positively skewed coding of varying tuning function widths. In each case coding vectors were calculated for target numbers 1 through 100. The initial vectors can be interpreted as idealized activation patterns to which some activation noise is added to determine the model output vectors. If the maximum value of a model output vector has the same index as the target number, then the model correctly estimated that number value. Noise is calculated as a change in the vector values by some percent taken from a random distribution whose mean is zero. Thus some vector values increased, others decreased, and the mean amount of noise was zero. After the application of noise the vector output values were calculated, where the index of the maximum value of the vector equaled the output. For example, prior to noise the maximum value for the vector representing ‘5’ is A = 1 at index 5. After the application of noise this value may have been reduced to some value, say 0.79, while the value at index 6 was increased to 0.81. The vector has now, due to noise, overestimated the value 5 as 6 for its output. The use of noise in neural models is well established (Schutte et al., 2003) and is a more accurate representation of neural coding than static coding. The entire process of applying random noise to the set of tuning functions was repeated 200 times, for 200 simulated ‘subjects’ per coding condition.
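The noise-and-read-out procedure can be sketched as follows. The text does not specify the noise distribution or its range, so the uniform distribution and the `noise` parameter here are assumptions, as is the Gaussian-in-log10 tuning form; all function names are illustrative.

```python
import math
import random


def tuning_vector(target, length=150, s=1.0):
    """ASSUMED log-Gaussian tuning vector (h = 1, m = 0), indices 1..length."""
    return [
        math.exp(-((math.log10(target) - math.log10(i)) ** 2) / (2.0 * s ** 2))
        for i in range(1, length + 1)
    ]


def noisy_estimate(target, s=1.0, noise=0.2, length=150, rng=random):
    """Perturb each activation by a zero-mean random percentage, then read
    out the index of the maximum activation as the estimate."""
    clean = tuning_vector(target, length, s)
    # ASSUMPTION: percent change drawn uniformly from [-noise, +noise]
    noisy = [a * (1.0 + rng.uniform(-noise, noise)) for a in clean]
    return noisy.index(max(noisy)) + 1  # list position -> number magnitude


def simulate_subject(s, noise, rng, targets=range(1, 101)):
    """One simulated 'subject': one noisy estimate per target number."""
    return [noisy_estimate(t, s, noise, rng=rng) for t in targets]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# The text uses 200 simulated subjects; 20 keeps this illustration quick.
subjects = [simulate_subject(s=2.0, noise=0.2, rng=rng) for _ in range(20)]
mean_estimates = [sum(col) / len(col) for col in zip(*subjects)]
```

With `noise=0.0` the read-out always returns the target itself, so every estimation error in this sketch comes from the perturbation step.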

As previously noted, prior work has shown that when behavioral errors occur the neural activity for the preferred quantity is significantly reduced compared to correct trials (Nieder et al., 2002, 2006; Nieder & Miller, 2004; Nieder & Merten, 2007). The hypothesis here is that the pattern of errors in the neural tuning functions influences the pattern of errors in behavioral output. Thus for these simulations an incorrect index of the maximum activation value is interpreted as an incorrect number estimate.

Results and discussion

For each instantiation the simulation-produced estimates were plotted against the target numbers and best fit lines were calculated. Variances of 0.5, 1, 2, and 3 were examined for both symmetric and positive skew coding. R² values were calculated for both linear and logarithmic best fit lines, which will be referred to as linear R² and log R² values. For positive skew coding, linear R² values decreased as variance increased (0.99, 0.95, 0.79, 0.70), while log R² values generally increased (0.81, 0.89, 0.97, 0.94) (see Figure 3). For symmetric coding, linear R² values were similar as variance increased (0.99, 0.99, 0.99, 0.99), as were log R² values (0.80, 0.80, 0.83, 0.81). Thus, symmetric coding was overall quite accurate in estimation and did not resemble the log function curve shown by young children. For positive skew coding, small variance values, which have narrow tuning functions, produce higher linear R² values than log R² values, similar to older children; larger variance values, which have broader tuning functions, produce higher log R² values than linear R² values, similar to younger children. Thus with the positive skew coding there is a shift from more logarithmic estimates to linear estimates as the tuning function narrows.
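The linear and log R² comparison can be illustrated with a small least-squares sketch. It assumes that each best fit line is an ordinary least-squares fit of the estimates on either the target or the logarithm of the target; the text does not state the fitting procedure, and the function names and toy data are illustrative only.

```python
import math


def r_squared(xs, ys):
    """R^2 of the ordinary least-squares line ys ~ a + b * xs
    (for a single predictor this equals the squared correlation)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


def linear_and_log_r2(targets, estimates):
    """Return (linear R^2, log R^2) for estimates plotted against targets.
    The base of the logarithm does not affect R^2."""
    linear = r_squared(targets, estimates)
    log = r_squared([math.log(t) for t in targets], estimates)
    return linear, log


targets = list(range(1, 101))
perfect = [float(t) for t in targets]                # idealized linear pattern
compressed = [50 * math.log10(t) for t in targets]   # idealized log pattern

print(linear_and_log_r2(targets, perfect))
print(linear_and_log_r2(targets, compressed))
```

Applied to mean simulated estimates, a higher linear R² corresponds to the older-child pattern and a higher log R² to the younger-child pattern described above.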

Figure 3.

Simulation estimates for selected variance parameters. Variances of 2 and 3 produced estimates best described by a log function (left panel). Estimates from the model simulation with positive skew and a broad tuning function (s = 2), compared to behavioral data from kindergarten-aged children (Booth & Siegler, 2006) and the target values (right panel).

Further comparisons between behavioral data and simulations were completed. A direct comparison was made between prior behavioral data and the current model results. Behavioral data taken from Booth and Siegler (2006, Figure 1) included 37 data points, which were matched to corresponding simulation data points. Of the current simulations, positive skew with a broad tuning function (s = 2) fits these data most closely (see Figure 3). Simulation data points were highly correlated with the behavioral data points, R = 0.94.

Only with both an overly broad tuning function and positive linear skew does the model produce estimates similar to those of very young children. A narrowing of the tuning curve produces data similar to those of developmentally advanced children and adults. Younger children tend to be overall less accurate in their estimates and tend to overestimate smaller numbers in the number line task. The simulation matches this pattern due to several factors. As the width of the tuning function increases, the potential for large misestimates increases; thus wide neural tuning functions are less precise than narrow tuning functions. In addition, the positive linear skew of the tuning function makes any misestimate likely to be an overestimation. The greater the skew, the more likely an error is to be an overestimation as opposed to an underestimation. As the magnitude of the estimated value increases, positive skew decreases and misestimates tend to average towards zero, because over- and underestimations are nearly equally likely. Together these factors contribute to the simulation’s production of a logarithmic estimation pattern, closely mirroring behavioral data.

The neural coding of number is unlikely to be the only influence on children’s performance on estimation tasks. There certainly must be a ‘read-out’ process to go from a neural coding to behavioral output. This process could add noise to the outcome or include influence from top-down control. Children have been shown to change their estimation performance based on structured feedback (Siegler & Booth, 2004; Opfer & Siegler, 2007) and this may reflect top-down influences on estimations. In the two studies children who had previously shown logarithmic estimation patterns were given an additional specific landmark on the number line. After this additional feedback children adjusted their estimation to a more linear pattern. While the neural coding may provide a starting point and present limitations in accuracy, this may be mitigated by explicit feedback, particularly with older children.

In both the behavioral task and the current simulations, output estimations are limited to a particular range. Neither child participants nor model simulations can provide an estimation more than the top value of 100. This does have some consequences in both cases; by limiting estimations neither can overestimate values as greater than 100. In the simulations, removal of this barrier does slightly reduce the fit of the log function. It is unclear how child participants would perform in such a situation. The current model predicts constant proportional variance from the target number.

Prior work has also reported correlations between number line estimation and other number tasks (Booth & Siegler, 2006). Children’s score on a standardized math achievement test was significantly positively correlated with the linear R² value of their given estimates. It was, however, not significantly correlated with mean absolute error of estimates. This suggests that producing linear estimation functions is correlated with superior performance in related math tasks. This is unsurprising given the current account. Participants who produce logarithmic estimations due to broad neural tuning functions will also show errors in simple computation, while participants who produce linear estimations due to narrow tuning functions may show fewer errors in computation.

Operational momentum

Another relevant aspect of number cognition is the development of knowledge of arithmetic operations. Research on simple arithmetic includes participants from 5-month-olds (Wynn, 1992), to older children (e.g. Barth, Beckmann & Spelke, 2008; Prather & Alibali, 2011) to adults (e.g. Barth, Mont, Lipton, Dehaene, Kanwisher & Spelke, 2006; Robinson & Ninowski, 2003). In one such avenue of research several studies have described a phenomenon termed operational momentum (Knops, Viarouge & Dehaene, 2009; Lindemann & Tira, 2011; McCrink, Dehaene & Dehaene-Lambertz, 2007; McCrink & Wynn, 2009). In short, for addition (A + B = C) participants tended to overestimate the value of C, while for subtraction (A – B = C) participants tended to underestimate. The basic phenomenon has been shown with participants ranging from 9 months to adults. A high-level representational account of the phenomenon was envisaged: humans are able to cognitively represent numbers spatially and thus addition and subtraction involve moving along the mental number line. For both addition and subtraction the participants overshoot the value of C, leading to overestimation in addition and underestimation in subtraction. Arithmetic errors are a result of movement along the mental number line where the correct answer is overshot; perhaps a similar mechanism to representational momentum (Hubbard, 2005). The original work describing operational momentum (McCrink et al., 2007) also suggested that the effect may reflect properties of the neural coding of number and does so in terms of arithmetic operations as movement along a mental number line.

Given the prior work on the use of mental number lines (e.g. Dehaene, Bossini & Giraux, 1993), this appears to be a plausible behavioral description of the phenomenon. The current simulation examined whether and how the neural coding of number may contribute to this behavioral phenomenon. The current simulations illustrate that for the operational momentum effect, the mental number line explanation is unnecessary once the neural coding of number is taken into account. Again, the simulations examined how two key tuning function characteristics, positive skew on a linear scale and proportional scaling, contribute to the patterns of performance reported in the operational momentum literature: a tendency to overestimate addition and underestimate subtraction.

Model specifications

Model specifications were identical to those of the prior simulations, with the exception of the range of number values, the length of the vectors, and the tuning function widths considered. In the following simulations values 1 to 30 were used in a variety of arithmetic equations. Each value was represented by a vector containing 50 items. Variance parameters 1, 1.5, and 2 were evaluated.


All simulations were evaluated using MATLAB (Mathworks) software. Two separate simulations were carried out: symmetric Gaussian coding and positively skewed Gaussian coding (on a linear scale). In each case coding vectors were calculated for target numbers 1 through 30. Random noise was then added to each vector value, whereby the activation level was altered by a percent calculated from a random distribution. The amount of noise applied was random and independent for each vector value.

After the application of noise the vector output values were calculated, where the index of the maximum value of the vector equaled the output. For example, prior to noise the maximum value for the vector representing ‘5’ is A = 1 at index 5. After the application of noise this value may have been reduced to some value, say 0.79, while the value at index 6 was increased to 0.81. The vector has now, due to noise, overestimated the value 5 as 6 for its output. The output values for all vectors were then used to calculate the simulated results of the full set of addition and subtraction equations. For example, for the equation 7 + 3, random noise is applied to the vectors representing 7 and 3, and the resulting outputs, e.g. 7 and 4, are then combined to determine the model estimate of the addition equation, in this case 7 + 3 = 11. Again, this paradigm is based on prior work reporting correlations between neural coding errors and behavioral errors (Nieder et al., 2002, 2006; Nieder & Miller, 2004; Nieder & Merten, 2007). The entire process for the set of equations was repeated 200 times, for 200 simulated ‘subjects’ per coding condition.
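The arithmetic step can be sketched in the same way: each operand is decoded independently from its noisy vector, and the decoded values are then added or subtracted. The Gaussian-in-log10 tuning form, the uniform noise distribution and all names below are assumptions made for illustration, not details given in the text.

```python
import math
import random


def noisy_estimate(value, s, noise, rng, length=50):
    """Decode one operand: index of the largest noisy activation in a
    50-item vector (ASSUMED log-Gaussian tuning, uniform percent noise)."""
    best_index, best_activation = 1, -1.0
    for i in range(1, length + 1):
        x = math.log10(value) - math.log10(i)
        activation = math.exp(-(x ** 2) / (2.0 * s ** 2))
        activation *= 1.0 + rng.uniform(-noise, noise)  # independent per item
        if activation > best_activation:
            best_index, best_activation = i, activation
    return best_index


def simulate_equation(a, b, op, s, noise, rng):
    """Model estimate of a + b or a - b from two noisy operand read-outs."""
    est_a = noisy_estimate(a, s, noise, rng)
    est_b = noisy_estimate(b, s, noise, rng)
    return est_a + est_b if op == "+" else est_a - est_b


rng = random.Random(1)  # fixed seed so the sketch is repeatable
print(simulate_equation(7, 3, "+", s=1.5, noise=0.2, rng=rng))
print(simulate_equation(7, 3, "-", s=1.5, noise=0.2, rng=rng))
```

In this sketch the operation itself is exact, so any deviation in the result is inherited entirely from the operand read-outs.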

Results and discussion

Simulation results were analyzed separately by coding style and equation operation. For addition and subtraction there were 435 equations evaluated each (all combinations of 1–30). The percent deviation between the target result and the simulated result was calculated for each equation. For positive skew coding, tuning function widths (s = 2, 1.5, 1) tended to produce average overestimate deviations for addition and underestimate deviations for subtraction: 72% and -39%, 56% and -19%, and 37% and 2%, respectively. For symmetric coding, all tuning function widths (s = 2, 1.5, 1) produced small average deviations for addition and subtraction: 0.28% and -0.32%, 0.21% and 0.14%, and 0.01% and 0.03%, respectively. Thus the operational momentum effect is more pronounced for relatively broad tuning functions.
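The equation count and the deviation measure can be made explicit with a brief sketch. It assumes that "all combinations of 1–30" means unordered pairs of distinct values, which gives the stated 435, that subtraction takes the larger value minus the smaller, and that percent deviation is the signed difference relative to the target; none of these details is spelled out in the text.

```python
from itertools import combinations

# ASSUMPTION: unordered pairs of distinct values from 1..30
pairs = list(combinations(range(1, 31), 2))

# (operand, operand, target result) for each operation
addition = [(a, b, a + b) for a, b in pairs]
subtraction = [(b, a, b - a) for a, b in pairs]  # larger minus smaller


def percent_deviation(simulated, target):
    """Signed deviation of the simulated result from the target, in percent.
    Positive values are overestimates, negative values underestimates."""
    return 100.0 * (simulated - target) / target


def mean_percent_deviation(results):
    """Average percent deviation over (simulated, target) pairs."""
    deviations = [percent_deviation(sim, tgt) for sim, tgt in results]
    return sum(deviations) / len(deviations)


print(len(addition), len(subtraction))  # 435 equations per operation
print(percent_deviation(11, 10))        # 7 + 3 estimated as 11 -> 10.0
```

Because subtraction targets are small relative to their operands, the same absolute read-out error produces a larger percent deviation for subtraction than for addition under this measure.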

Performance curves for addition and subtraction were calculated, similarly to that reported in prior behavioral work regarding operational momentum (McCrink et al., 2007). For each equation the difference between the simulated result and the target result was calculated as a percentage difference (see Figure 4). The performance curve conveys the frequency of over- and underestimation errors for both addition and subtraction. The behavioral data show that overestimates are more frequent for addition while underestimates are more frequent for subtraction. The current simulation results show that for the positive skew broad tuning function condition addition equation results are more frequently overestimated than subtraction. Symmetric coding shows equal frequency of over- and underestimation for both addition and subtraction. Thus, the simulated data with positive skew and broad tuning function show the same cross-over between addition and subtraction as the behavioral work, while symmetric coding does not.

Figure 4.

Performance curves showing the relative deviation from the target value for both addition and subtraction equations. Behavioral data from McCrink et al. (2007) are also shown.
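One way to tabulate such a performance curve from a list of percent deviations is sketched below. The 10% bin width, the function names, and the sample values are assumptions for illustration; the sample values are made up and are not data from the study.

```python
from collections import Counter

def performance_curve(deviations, bin_width=10):
    """Proportion of results per percent-deviation bin, keyed by the bin's lower edge."""
    counts = Counter(int(d // bin_width) * bin_width for d in deviations)
    total = len(deviations)
    return {edge: counts[edge] / total for edge in sorted(counts)}

def error_direction(deviations):
    """Proportions of overestimates, underestimates, and exact results."""
    total = len(deviations)
    over = sum(d > 0 for d in deviations) / total
    under = sum(d < 0 for d in deviations) / total
    return {"over": over, "under": under, "exact": 1.0 - over - under}

# Made-up deviations (in percent) for illustration only; not data from the study.
addition = [12.0, 5.0, 0.0, -3.0, 20.0, 8.0]
print(performance_curve(addition))
print(error_direction(addition))
```

Computing the two summaries separately for addition and subtraction gives the quantities that are compared for the cross-over pattern.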

The data reported here suggest that a positive linear skewed neural coding of number (Nieder & Merten, 2007; Nieder & Miller, 2002; Verguts & Fias, 2004) results in arithmetic errors that are consistent with the reported behavioral phenomenon termed operational momentum. That is, addition operations tend to be overestimated, while subtraction is underestimated. This occurs because both the chance of a misestimate and the type of misestimate vary by magnitude. While smaller numbers with sharper tuning functions tend to have both less frequent and smaller errors, the errors that do occur are much more often overestimations than underestimations. That relatively small values are typically overestimated is consistent with prior work on the development of numerical estimation in children (Booth & Siegler, 2006; Huntley-Fenner, 2001; Opfer & Siegler, 2007) and numerical estimations in non-human animals (e.g. Brannon & Roitman, 2003; Platt & Johnson, 1971). Given the relative magnitude of numbers in addition and subtraction equations, this particular tendency of misestimation accounts for both overestimation of addition and underestimation of subtraction.
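The logic of this argument can be made explicit with a short decomposition. The symbols below (the readouts \hat{a} and \hat{b} and their signed errors \varepsilon_a and \varepsilon_b) are introduced here for illustration and are not part of the original analysis.

```latex
\begin{align*}
\hat{a} &= a + \varepsilon_a, \qquad \hat{b} = b + \varepsilon_b, \qquad a > b, \\
(\hat{a} + \hat{b}) - (a + b) &= \varepsilon_a + \varepsilon_b, \\
(\hat{a} - \hat{b}) - (a - b) &= \varepsilon_a - \varepsilon_b .
\end{align*}
```

Under this reading, addition is overestimated on average whenever the two mean errors sum to a positive value, and subtraction is underestimated on average whenever the mean error on the smaller operand b exceeds the mean error on the larger operand a.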

Cognitive accounts of operational momentum (McCrink et al., 2007; Knops et al., 2009) such as spatial associations with number (Dehaene et al., 1993; Knops et al., 2009; Santens & Gevers, 2008) are not necessarily inconsistent with the current account. A variety of cognitive representations, including number-spatial associations, could exacerbate the behavioral pattern. However, the current account requires a priori only the experimentally established neural coding of number. Prior research has illustrated how number selective neurons can come about through unsupervised learning (Verguts & Fias, 2004), and neural data illustrate the positive skew and relative width of the neural tuning functions used in the current simulations (e.g. Nieder & Miller, 2003). The effect can be described as a ‘natural result’ of the neural coding.

There were several differences of note between the current model and typical behavioral methodology. The behavioral methodology (McCrink et al., 2007; Knops et al., 2009) has typically included a verification task in which participants evaluated presented arithmetic results, whereas the current simulations produced the results of arithmetic equations. In addition, the behavioral methodology has typically used a limited set of arithmetic equations, due to experimental constraints, whereas the current simulations evaluated all relevant arithmetic equations, resulting in a more comprehensive data set.

General discussion

The current work

The simulation results presented here illustrate how the neural coding of number magnitude directly contributes to several known behavioral phenomena in numerical cognition. While there has certainly been discussion regarding connections between neural coding and numerical development, the current account presents an unprecedented level of detail regarding the influences of neural coding patterns on specific behavioral patterns: the developmental change in number line estimation and the operational momentum effect. Research in both these areas includes a variety of explanations such as log to linear representational shifts (Siegler & Opfer, 2003) and spatial representations of number (McCrink et al., 2007). However, the degree to which the neural coding of number and its change during development can account for these behavioral phenomena should mediate the need for additional cognitive-level explanations.

In brief, the present work contributes to current understanding of developmental changes in number cognition by offering a framework for understanding both the age-invariant aspects of number reasoning and developmental change. The simulations show how the neural coding of number may influence several behavioral phenomena in the number cognition literature. The current simulation combines known characteristics of the neural coding of number with other neurocognitive principles, such as activation noise and response function sharpening. The sharpening of the simulated neural tuning functions led to changes in the modeled behavior that closely mirrored several developmental phenomena. More critically, the current simulations suggest how both quantitative and qualitative changes in number judgments with age and experience may be understood in terms of the fundamental properties of how number magnitude is represented and in changes in the tuning functions of those properties. Although the present study does not make a direct link between experience and changes in these tuning functions, a large literature on perceptual learning both at the behavioral and neural levels is consistent with the idea of a narrowing in tuning functions with increasing experience (e.g. Goldstone, 1998; Luce, Green & Weber, 1976; Recanzone, Schreiner & Merzenich, 1993; Saarinen & Levi, 1995; Simmering et al., 2008; Schutte et al., 2003). Moreover, several studies of the development of number concepts and mathematical reasoning have pointed to precision of encoding as contributing to the better performance of older children; precision, in turn, may be related to experience-dependent aspects of these tuning functions, especially the breadth of the tuning function. Developmental research in other domains (e.g. visual perception) has pointed towards similar ideas; developmental effects may be caused by increased representational acuity of the underlying neural mechanisms. The Representational Acuity Hypothesis (Westermann & Mareschal, 2004) posits that infants’ visual development is driven in part by the narrowing of receptive fields for visual cortex neurons. Thus, it may be the case that the same general mechanism of the sharpening of neural tuning functions can account for developmental phenomena in a variety of domains.

The simulations were specifically based on multiple non-human primate neural studies and computational accounts that have reported both proportional scaling and positive linear skew in the neural coding of number magnitude. Given these characteristics of the neural coding, a simple model of neural activity shows that patterns of errors mirroring the behavioral data emerge as a result of the neural coding. This is not to say that neural coding of number is the only influence on behavior in these or any other tasks, nor that these are the only properties of tuning functions that might be relevant. Indeed, dynamic aspects of this coding – rise times, fall times, and potentially forms of inhibition of return (e.g. Spencer, Thomas & McClelland, 2009) – may also be relevant as more sluggish representations rather than more temporally defined ones may lead to difficulties in serial behaviors such as counting and perhaps some forms of calculation. The properties of the coding and representation of number are foundational to number cognition and the present simulations are a first step to understanding their potential relevance to the development of number cognition.

The current account is a parsimonious account of several phenomena in the number cognition literature. For both number line estimation and operational momentum prior accounts posit cognitive representations such as mapping number to space and arithmetic as movement along a line. For estimation the current simulations show how changes in neural coding account for known developmental patterns. For operational momentum the data predict a possible developmental trajectory. Though prior work (Dehaene, 2007) has discussed the possible influences of neural coding on cognitive development, no data have been reported regarding either estimation development or operational momentum. The simulations illustrate how the observed behavioral data could emerge as a direct result of neural coding of number. While the current account does not necessarily contradict cognitive-level explanations, the inclusion of the influence of neural coding is a significant enhancement and provides a framework for further exploration of limitations and influences on early number cognitive development.

Remaining theoretical issues

In the current simulations developmental changes are approximated through a narrowing of the neural tuning functions. Though this developmental change has some support in the literature (e.g. Simmering et al., 2008; Schutte et al., 2003), it is unclear whether other factors may influence neural tuning functions of number magnitude. Prior computational work (Verguts & Fias, 2004) has suggested that in number cognition the use of specific symbols to refer to magnitudes leads to narrower tuning functions for those magnitudes. If this is the case, children’s experience with the symbolic number system may be a factor in making more linear estimations (Dehaene, 2007). If symbolic representations lead to narrower tuning functions, then one would expect a close relation within individual children between their number knowledge, operational momentum, and ability to map numbers to a number line, as well as the Weber fraction for discrimination. Moreover, one might expect more linear mappings of numbers to a number line given tasks that encourage symbolic representations versus those that do not. On the other hand, symbols per se may not be the critical experience in changing these tuning functions; rather, discrimination of discrete magnitudes (with or without symbols) may be. In domains outside of number, perceptual tuning functions have been shown to sharpen with experience in making finer discriminations (Yang & Maunsell, 2004).

The current simulations used neural coding of number placed on a linear scale. Much has been written regarding whether neural and cognitive representations of number are best described as linearly or logarithmically scaled. Nieder and Miller (2003) put forth the most comprehensive argument regarding linear versus non-linear coding and concluded that non-linear coding best described both neural and behavioral data. Given that number is coded in a non-linear fashion, number representation is essentially proportional, consistent with findings regarding perceptual magnitude representation (e.g. Billock & Tsou, 2011; Stevens, 1957; Stevens & Marks, 1980). However, behavior regarding number and symbolic number systems is frequently performed on a linear scale. In the broadly adopted Hindu-Arabic base-10 number system, number increases linearly. Thus, examining the neural coding with respect to a linear scale is relevant to mathematical reasoning and to number concepts. Of course, linear and logarithmic representations are transformations of each other, and so the present approach might be viewed simply as taking a transformation of the neural coding system that makes the relevance of that system to common number tasks more clear.
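The relation between the two scales can be illustrated with a short sketch. It assumes, purely for illustration, a tuning curve that is a symmetric Gaussian on a logarithmic axis; the function name `log_gaussian`, the width of 0.3 and the preferred value of 8 are hypothetical. Read at equal steps on the linear axis, such a curve is higher above its peak than below it, which is a positive skew.

```python
import math

def log_gaussian(x, preferred, sigma=0.3):
    """Tuning curve that is a symmetric Gaussian in log(x); peak of 1 at x == preferred."""
    return math.exp(-((math.log(x) - math.log(preferred)) ** 2) / (2.0 * sigma ** 2))

preferred = 8
for k in (1, 2, 4):
    below = log_gaussian(preferred - k, preferred)
    above = log_gaussian(preferred + k, preferred)
    # Equal steps on the linear axis: the flank above the peak stays higher.
    print(k, round(below, 3), round(above, 3))
```

The asymmetry follows because (n + k)/n is always smaller than n/(n - k), so the point above the peak is closer to it in log distance than the point below.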

Further directions

Recent research suggests a relationship between the ‘primitive sense of number’ and math ability (Libertus, Feigenson & Halberda, 2011). Children’s acuity with non-symbolic number magnitudes (dot patterns) is associated with later performance in symbolic mathematics, while controlling for other factors. The current results are consistent with the idea that numerical acuity can have a direct influence on math performance. It may be the case that numerical acuity influences math performance, and that acuity is in turn based on the neural tuning functions. This relationship raises the possibility that sharpening tuning functions to improve numerical acuity may also be a way to improve symbolic math performance. Future work will address what influences individual differences in number acuity and what experiences may lead to the sharpening of neural tuning functions. The answer to these questions could lead to design of interventions for children’s math performance.

There is a wide array of behavioral phenomena in the number and mathematical literature, including early non-symbolic arithmetic (Barth et al., 2006; Wynn, 1992), symbolic system acquisition (McNeil & Alibali, 2004; Uttal, Scudder & Deloache, 1997), multimodal presentations of number, such as auditory or tactile (Jordan & Brannon, 2006), and relations to other forms of magnitude (Cohen Kadosh & Henik, 2006; Lourenco & Longo, 2010; Walsh, 2003). Though there is evidence regarding neural coding of ‘pure number’, we need more neural and behavioral data regarding number in multiple modalities and representations, and in comparison to other perceptual magnitudes, which may share some similarities with discrete number and thus may be relevant to some aspects of early number judgments (Clearfield & Mix, 1999). In addition, next steps require linking hypothesized changes in these properties of neural codes to number judgments in individual children across a variety of tasks that should be dependent on the properties of this coding, as well as examining how, and what kinds of, experiences may play a role in these tuning functions. Adding this perspective to the developmental study of number cognition offers a unifying framework for the rapidly advancing knowledge about early number concepts, about the influence of learning symbolic representations of number on the number system, and about the patterns of errors (and difficulties) that characterize young school age children’s mathematical learning.


I gratefully acknowledge all who have commented on earlier versions of this manuscript. This research was supported by the National Institutes of Health (T32HD007475).