Formalizing theories of child development: Introduction to the special section

Here we introduce a Special Section of Child Development entitled “Formalizing Theories of Child Development.” This Special Section features five papers that use mathematical models to advance our understanding of central questions in the study of child development. This landmark collection is timely: it signifies growing awareness that rigorous empirical bricks are not enough; we need solid theory to build the house. By stating theory in mathematical terms, formal models make concepts, assumptions, and reasoning more explicit than verbal theory does. This increases falsifiability, promotes cumulative science, and enables integration with mathematical theory in allied disciplines. The Special Section contributions cover a range of topics: the developmental origins of counting, interactions between mathematics and language development, visual exploration and word learning in infancy, referent identification by toddlers, and the emergence of typical and atypical development. All are written in an accessible manner and for a broad audience.

them or unable to increase clarity. Other ambiguities are deliberate and designed to offer wiggle room (Eisenberg, 1984). For instance, authors might 'stretch' theories to encompass data outside of their original scope (Rvachew, 2021) or 'shrink' theories to align more precisely with observed data. Both theory stretching and post hoc precision make the empirical evidence for a theoretical claim appear stronger than it actually is (Frankenhuis et al., 2023).
Unless transparently noted, theoretical ambiguity hurts scientific progress (Frankenhuis et al., 2023; Rohrer, 2021): verbal theories include vague concepts, implicit assumptions, and predictions based on intuitive reasoning rather than logical deduction. Such ambiguities leave ample room for disagreement about which predictions a theory makes, which data are well suited to testing a theory, and which patterns of data provide support for and against a theory (Fried, 2020). For these and other reasons, ambiguity hinders the accumulation of knowledge. It also impedes falsifiability by enabling post hoc explanations to rescue a theory or prediction from inconvenient data. Moreover, ambiguity creates obstacles to building bridges with mathematical theory in allied fields, such as biology, economics, and anthropology, in which theory is often mathematically specified (Frankenhuis & Walasek, 2020).

THE GOAL OF THIS SECTION
This Special Section showcases what mathematical and simulation-based modeling can accomplish for central questions in the study of child development. By mathematical modeling, we mean that a theory is expressed in terms of equations that characterize systems or processes. In some cases, such equations can be solved analytically (i.e., it is possible to calculate an exact solution); in others, they cannot. By simulation-based modeling, we mean the algorithmic generation of data based on the assumed underlying structure of systems or processes. These two approaches are not mutually exclusive and are often used in tandem, but they represent distinct styles of theoretical work: analytic approaches are more useful for providing principled solutions to well-defined problems, typically involving a small number of variables, whereas simulation-based approaches are better suited to modeling larger systems that involve feedback among many different variables (so that analytic solutions are impossible or simply cumbersome to construct).
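The contrast between analytic and simulation-based work can be illustrated with a minimal sketch. It assumes a logistic growth equation, dx/dt = r x (1 - x/K), chosen here purely for illustration and not taken from any paper in this section; the function names, parameter values, and the forward-Euler scheme are likewise assumptions. The equation has a known closed-form solution, so the exact answer and a step-by-step simulation can be compared.

```python
import math


def logistic_analytic(t, x0, r, K):
    """Closed-form solution of dx/dt = r * x * (1 - x / K) at time t."""
    return K / (1.0 + ((K - x0) / x0) * math.exp(-r * t))


def logistic_simulated(t_end, x0, r, K, dt=0.001):
    """Forward-Euler simulation of the same equation (illustrative scheme)."""
    x = x0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        x += dt * r * x * (1.0 - x / K)
    return x


# Hypothetical parameter values, chosen only for the demonstration.
exact = logistic_analytic(5.0, x0=0.1, r=1.0, K=1.0)
approx = logistic_simulated(5.0, x0=0.1, r=1.0, K=1.0)
print(f"analytic: {exact:.4f}, simulated: {approx:.4f}")
```

For a single equation such as this one, the simulation adds little beyond the exact solution. Its value grows once several such equations are coupled and no closed form is available.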
Although mathematical and simulation-based models can feature probabilistic processes and be statistical in that sense, they are not models designed to statistically analyze empirical data. Theoretical models describe coherent sets of statements about the world designed to explain robust features we want to understand (phenomena). Statistical models describe associations between variables in an empirically derived dataset (data). Mathematical and simulation-based models might be informed by empirical datasets (e.g., models of intelligence development; van der Maas et al., 2006), but need not be. Statistical models might be informed by theory (e.g., factor analyses or network analyses of empirical data; Haslbeck et al., 2022), but need not be. All papers in this section include one or more mathematical or simulation-based models. Some papers additionally include statistical models to empirically test predictions derived from formal models or to calibrate the parameters of formal models to the data. However, this section does not include papers that discuss only statistical models without also presenting a mathematical or simulation-based model.
By formalizing theories, mathematical and simulation-based models make all concepts, assumptions, and reasoning explicit. Because a formal theory is stated in mathematical terms, it is decoupled from any particular person's mind; it is a public resource for anyone to evaluate. Clarifying and sharing theories in this way fits with SRCD's core values of integrity, transparency, and openness, and with the broader effort to support Open Science (Guest & Martin, 2021). However, valuing formal theory does not imply devaluing nonformal theory (Nettle, 2021; Scheel et al., 2021). Theory construction is a process of gradual maturation (Borsboom et al., 2021; Smaldino, 2020). Early developmental stages in the life of a theory may involve approximate or preliminary sketches, which are well handled in verbal terms; through the process of maturation, concepts, assumptions, relations, and predictions are refined and made precise (Loehle, 1987). However, once hypotheses are claimed to 'follow from' or be 'derived from' the theory, the logical chain from assumptions to predictions should be fully identified (Harris, 1976). This allows others to verify and reproduce the theoretical analysis.

THE ORIGINS OF THE SECTION
We invited authors to submit, by February 1, 2022, Letters of Intent for papers that use mathematical models to advance understanding of a central question in the study of child development. The Call stated that the papers proposed for this Special Section via Letter of Intent must do the modeling work, rather than being idea pieces that stop at thoughtful reflection. We also stated that, ideally, proposed submissions would also compare insights gained from the modeling studies with empirical data. Further, we noted that proposed submissions may test quantitative (parametric) or qualitative (directional) predictions in a dataset, or through systematic meta-analysis or review of the existing empirical record. Though not required, we encouraged authors to preregister their predictions, as part of our broader vision: transparency from theory to test. The Call encouraged submissions from all fields and sub-fields focused on research on child development. We envisioned that the section would consist of 4-5 papers.
Our main criterion for evaluation was whether the proposed modeling advances a central question in the study of child development. As such, the Letters of Intent needed to motivate a research question about development, explain the chosen modeling approach, and justify why this approach is well-suited to studying the question. Our evaluation did not depend on the extent to which the empirical data were consistent with the predictions of the model(s). In this sense, testing mathematical models is subject to the same ethical guidelines as testing predictions not derived from formal models: we should not suppress null or mixed results. Some models might teach us that a theory is not able to explain the data that it purports to explain (van Rooij & Baggio, 2021). Other models might make new predictions about a well-established phenomenon, which are later tested in an empirical dataset. We requested that all theoretical and empirical analyses be computationally reproducible or explain why this is not possible: "Formal models must be reproducible, with well-documented code and equations, specification of software (including links to the sources where it can be obtained), and where appropriate, data files of simulation results reported in the paper. The paper must include a link to a permanent repository (or repositories) where all this information is stored." Thus, we asked authors to ensure that the code is findable, accessible, and intelligible.
The proposals we received were strong. We could encourage only six papers (<30%) for submission to Child Development as a full manuscript for consideration as part of the Special Section. One of these papers was later withdrawn, so we continued with five papers. Our encouragement at this stage was no guarantee of an eventual acceptance of the work described in the Letters of Intent. The action editor assigned to a given submission could choose to desk reject it or send it for full review. Manuscripts sent for full review could also be rejected.
In our invitation to submit a full manuscript, we asked that papers: (1) be written in a manner accessible to the readership of Child Development, some of whom have limited or no experience with mathematical modeling; (2) explain the modeling approach, and justify why this approach is well-suited to studying the question (possibly in a subsection that provides a tutorial on the modeling approach); and (3) formalize developmental change (even if the paper includes empirical data on only one age group). After all, the goal of this section is for the readership of Child Development to understand how mathematical modeling is used and adds value.
We encouraged authors to be transparent about the limitations of their approach. As with empirical papers, modelers may be incentivized to hide or downplay limitations. Some of these limitations pertain to the model itself. For instance, a model might include an assumption that does not align very well with the phenomenon under study. Other limitations pertain to model analysis. For instance, the authors may explore a narrower range of parameter values than arguably would be ideal. Such choices need to be explicit, well justified, and their consequences for conclusions stated (e.g., 'Our model shows that processes A and B are in principle capable of producing phenomenon C; however, as we did not explore the full range of possible parameter values, we draw no conclusions about the empirical plausibility of our model as an explanation of the phenomenon'). Finally, the empirical data may not align with model predictions. As long as the model is clearly described, makes reasonable assumptions, and is rigorous in analysis, its conclusions are informative; this is true just as much of models as of empirical studies. It is fine if modelers adjust the features of their model to better fit the data if done transparently. Ideally, they would then test the adjusted version of their model on a new empirical dataset. Models are tools to promote understanding and prediction. As Iris van Rooij (2022) puts it: "Just as microscopes and telescopes help scientists to see better, models help scientists to think better" (p. 127).
We now turn to the five papers in this Special Section. First, we briefly summarize each paper, describing its approach, insights, and synergies with data. Then we discuss a few common themes across the papers. Finally, we explain why formal theory, despite its advantages, is no panacea, thus providing tools to help readers be critical consumers of modeling papers.

THE PAPERS IN THE SECTION
In the first paper, de Ron et al. (2023) address an apparent paradox in patterns of child development. On the one hand, different cognitive abilities are positively correlated in the population at large, both across individuals and across time (the positive manifold). On the other hand, atypical development is often characterized by deficits in some abilities coupled with normal or even superior performance in others (uneven cognitive profiles). Their modeling approach assumes, as previous models have, that different abilities interact in a network in which they can enhance each other's growth. At the same time, those abilities also compete for the overall supply of available developmental resources. de Ron et al. (2023) set out the formal foundations of their approach using differential equations derived from ecological models of species interactions. They then simulate several scenarios using these equations. The simulations recover a number of well-known developmental phenomena: the positive manifold; uneven cognitive development in developmental disorders; developmental delay; and more. The paper shows how a single, fairly simple set of theorized relationships can give rise to a number of different patterns depending on choices of parameter values, obviating the need, for example, for separate accounts of typical and atypical development.
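To give a feel for this family of models, here is a toy sketch of two abilities that both help each other grow and compete for a shared resource. It is a generic Lotka-Volterra-style system written for illustration only; it is not the set of equations used by de Ron et al. (2023), and the function name, the functional form, and all parameter values are assumptions.

```python
def simulate_abilities(mutualism, competition, a=1.0, K=1.0,
                       x0=0.05, dt=0.01, steps=10_000):
    """Forward-Euler simulation of two coupled abilities x and y.

    Each ability grows logistically toward a stand-alone ceiling K.
    The other ability adds to growth via `mutualism` and subtracts
    from it via `competition` (all names and forms are hypothetical).
    """
    x, y = x0, x0
    net = mutualism - competition  # net effect of one ability on the other
    for _ in range(steps):
        dx = a * x * (1.0 - x / K + net * y / K)
        dy = a * y * (1.0 - y / K + net * x / K)
        x, y = x + dt * dx, y + dt * dy
    return x, y


# Same equations, different parameter values, different outcomes.
helped, _ = simulate_abilities(mutualism=0.3, competition=0.1)
hindered, _ = simulate_abilities(mutualism=0.1, competition=0.3)
print(f"mutualism dominates:   final level {helped:.3f}")
print(f"competition dominates: final level {hindered:.3f}")
```

In this toy system the symmetric equilibrium is K / (1 - net), so abilities settle above their stand-alone ceiling when mutualism outweighs competition and below it otherwise. The point of the sketch is only that one set of equations can yield qualitatively different developmental outcomes as parameter values change.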
In the second paper, Driver and Tomasik (2023) tackle the difficult question of how to bridge the gap between informally specified theoretical models and statistical models that can be fitted to data. Using the interaction between the development of language and mathematics skills as a case in point, Driver and Tomasik (2023) implement different theoretical views on how these skills are related: (a) the thinking function hypothesis, which views both language and mathematics development as being influenced by a common factor, analogous to the g-factor of intelligence; (b) the medium function hypothesis, which suggests that improvements in language skills facilitate the development of mathematics skills; and (c) the specialization hypothesis, which specifies a trade-off whereby greater investment in one skill negatively affects the other (similar to the resource competition model of de Ron et al., 2023). Applying the framework of continuous time structural equation modeling to a large developmental dataset, Driver and Tomasik (2023) show how the different hypotheses may be represented in a common modeling framework. The paper is valuable both as a substantive contribution to the literature and as a tutorial on how to translate theories into statistical models that can be used in the analysis of empirical data.
In the third paper, Piantadosi (2023) models how children learn number words. When learning numbers, children transition from understanding only a few number words to possessing a rich system of numerical concepts (Piantadosi et al., 2012). This transition involves taking finite data and going far beyond what children directly observe. There is a longstanding debate about the capacities that children use to solve this so-called 'inductive' challenge, how these capacities develop, and the role of experience. Indeed, these questions have been central to nativist and empiricist positions since the early days of psychology. After setting the stage, Piantadosi (2023) provides a tutorial on how we can build a computational model of the task of learning number words. This tutorial explains Bayesian updating, which involves combining "prior" estimates with observed data to arrive at "posterior" estimates. The paper then outlines a formal model that acquires a counting procedure using observations of sets and words. This model is an updated version of an earlier model by Piantadosi et al. (2012), which has been criticized; the updated model includes new features to accommodate the criticisms. Next, the paper presents results from the updated model. These results show patterns similar to those of the original model, and Piantadosi (2023) therefore argues that the model's findings are robust. The paper concludes with seven predictions for future empirical research and a discussion of the limitations of the chosen modeling approach.
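Bayesian updating itself can be shown in a few lines. The sketch below is not Piantadosi's model; it assumes just two made-up hypotheses about what the word "two" means, equal priors, and an invented likelihood in which the broader hypothesis spreads its probability evenly over set sizes 2 to 5. All names and numbers are hypothetical.

```python
def bayes_update(prior, likelihood, data):
    """Sequentially apply Bayes' rule: posterior is proportional to prior times likelihood."""
    posterior = dict(prior)
    for datum in data:
        unnormalized = {h: posterior[h] * likelihood[h](datum) for h in posterior}
        total = sum(unnormalized.values())
        posterior = {h: p / total for h, p in unnormalized.items()}
    return posterior


# Hypothetical hypotheses about the meaning of the word "two".
prior = {"exactly two": 0.5, "two or more": 0.5}

# Assumed likelihoods: probability that a set of size n is labeled "two".
likelihood = {
    "exactly two": lambda n: 1.0 if n == 2 else 0.0,
    "two or more": lambda n: 0.25 if 2 <= n <= 5 else 0.0,
}

# Invented observations: three sets labeled "two", each containing 2 items.
observed_set_sizes = [2, 2, 2]
posterior = bayes_update(prior, likelihood, observed_set_sizes)
for hypothesis, probability in posterior.items():
    print(f"{hypothesis}: {probability:.3f}")
```

Each observation is compatible with both hypotheses, yet the narrower hypothesis gains ground because it predicted the observed set sizes with higher probability. Under these assumptions, after three observations its posterior is 64/65.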
In the fourth paper, Bhat et al. (2023) examine the joint development of visual and auditory processing in young children. They develop a computational modeling approach based on dynamical field theory (a model of information processing based on the behavior of populations of neurons in the brain) to show how the interaction between auditory and visual stimulus processing may develop over time. In doing so, they provide a mechanism through which a number of effects observed in prior research (e.g., changes in the degree to which an auditory stimulus hampers visual processing) could be accounted for. This process modeling approach involves a highly detailed implementation of experimental manipulations in the theoretical model; in essence, Bhat et al. (2023) create a simulacrum of an infant and subject it to precisely the same experimental manipulations that were used in the research that produced the phenomena of interest.
In the fifth paper, Mihaela and Plunkett (2023) address an issue in cognitive development: how the child develops the ability to associate a particular image (of, say, a dog) with its corresponding phonological pattern and its corresponding semantics (being an animal, barking, having a wet nose, and so on). The modeling approach, a neural network model, is philosophically different from some of the other models in the section. Some modeling approaches are based on causal and epistemic transparency without process realism; for example, one might write down explicit equations supposed to represent the key hypothesized quantities and their functional relationships. This would give insight into the predicted consequences of those kinds of functional relationships, but does not say anything about how those quantities and relationships are instantiated in the brain or body. By contrast, neural network models are based on a kind of rough process realism: they are inspired by how networks of neurons actually work. A neural network model produces patterns of behavior on a particular task or stimulus set that may or may not correspond to the performance patterns of real children (and, as Mihaela and Plunkett do here, the performance of the model and of real children on exactly the same task can be directly compared). It does not, however, give researchers an explicit symbolic representation of how the system achieves this performance. Rather, it shows that the performance will reliably emerge from a brain-like model system given certain assumptions and inputs. Thus, in effect, a neural network model can provide an answer to the question: what is the minimal cognitive architecture required to robustly generate something like what children actually do in a cognitive task? Mihaela and Plunkett's paper is a useful primer on how to compare model-generated and empirically generated data in a detailed way.
From a helicopter perspective, all five papers comment on the current status of theory in psychology. One theme is the need for clearer and more precise theory. Piantadosi (2023, p. xxx) writes: "Often developmental theories are informal, meaning that they are stated in language rather than mathematics or with computational implementations. These informal theories are absolutely critical to computational modeling because they tell us what to implement computationally. However, the informality of most theories often leaves ambiguity about how mechanically the pieces are put together. This is one of the most important reasons to implement theories: doing so functions as a kind of consistency check, showing where you make unstated assumptions or where the informal theory is imprecise, or sometimes even inconsistent." Another theme is that more advanced statistics do not solve this problem. Even with the heavy armature of advanced statistical modeling and big data, it is still difficult to get clarity on what is actually going on in the world. The lack of explicitness of available theories contributes to this. As Driver and Tomasik (2023, p. xxx) note: "If we had a complete, formalized theory of competence development, a reasonable starting point for modeling would simply be to instantiate that theory and work from there. Instead, we are faced with what may be the more typical scenario, wherein we have some vaguely specified theories, and want to see to what extent each may have something to offer." Indeed, that seems an adequate description of the conundrum that modelers in psychology often face: there is no shortage of theories, but these theories are insufficiently articulated to clearly direct the modeling work, and as such leave important questions open. Formal models are uniquely able to expose such gaps in the theory. This also means that they offer directions for future research that can address these open issues. As Bhat et al. (2023, p. xxx) put it, this should ideally lead to "a future where models and data can have more of a dialog back and forth; where models can be readily applied to multiple data sets and the model results are used to inform the next experiments." Indeed, we hope that the present Special Section brings this future a bit closer.
Having summarized the papers in this section, we now turn to reasons why formal theory, despite its advantages, is no panacea. Our aim in the next section is to provide tools that can support readers in being critical consumers of the papers. We will discuss model assumptions, interpretation, and scope, respectively.

MODELING IS NO PANACEA
Like verbal theory, formal models might make questionable assumptions. The benefit of formalization is that assumptions are explicit and therefore easier to evaluate. However, clarity alone is not enough: the assumptions need to be appropriate to the phenomena of interest. Clarity is compelling and seductive, but it should not be the end point of our inquiries (Nguyen, 2021). We should evaluate all assumptions before computing their consequences. If assumptions do not match phenomena, we achieve a mere illusion of understanding. On the positive side, reasonable assumptions can serve as springs in a trampoline we can use to jump higher and broaden our horizons, launching our minds into new and exciting directions. Thus, while reading this section, it would be good to reflect on the extent to which the papers provide justification for model assumptions.
Like empirical researchers, modelers are prone to confirmation bias, that is, the tendency to search for, favor, interpret, and remember information in a way that supports one's prior beliefs (Nickerson, 1998). First, authors may tune model parameters to produce behavior consistent with their hypothesis, rather than also explore a broader range of parameter values, including ones that do not produce behavior consistent with their hypothesis. Second, even if authors do explore a broad range of parameter values, they might focus their discussion of model results too narrowly on those parameter values that support their hypothesis. Finally, when testing the fit between model predictions and empirical data, authors might be too generous in interpreting the extent to which observations align with model predictions, just like empirical researchers do. Thus, although modelers develop theory more transparently, they are not immune to bias.
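One practical guard against the first two forms of bias is to sweep a parameter over a pre-specified grid and report every outcome, not only the favorable ones. The sketch below assumes a toy model of two mutually coupled abilities and a hypothetical hypothesis ("coupled abilities end above their stand-alone ceiling K"); the model, the grid, and the 1% threshold are illustrative assumptions and are not drawn from any paper in this section.

```python
def final_level(net_coupling, a=1.0, K=1.0, x0=0.05, dt=0.01, steps=10_000):
    """Final level of one of two symmetric, coupled abilities (toy model)."""
    x = y = x0
    for _ in range(steps):
        dx = a * x * (1.0 - x / K + net_coupling * y / K)
        dy = a * y * (1.0 - y / K + net_coupling * x / K)
        x, y = x + dt * dx, y + dt * dy
    return x


# Pre-specified grid that deliberately includes values expected to
# contradict the hypothesis (negative and zero coupling).
grid = [-0.4, -0.2, 0.0, 0.2, 0.4]
results = {c: final_level(c) for c in grid}

# Hypothesis check: does the ability end more than 1% above K = 1.0?
supports = [c for c, level in results.items() if level > 1.01]

# Report the full sweep, not just the supporting cases.
for c, level in results.items():
    verdict = "supports" if c in supports else "does not support"
    print(f"net coupling {c:+.1f}: final level {level:.3f} ({verdict})")
print(f"{len(supports)} of {len(grid)} parameter values support the hypothesis")
```

Reporting the whole table makes it visible that, in this toy case, the hypothesized outcome holds only for part of the explored range.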
Finally, modelers might exaggerate the scope of their work, as empirical researchers sometimes do. Being human, modelers are also motivated to have an impact and thereby help themselves succeed (Frankenhuis et al., 2023). Intentionally or not, modelers might make claims that exceed the scope of their work to increase impact and thus advance their personal goals. For instance, if a model explores the development of an ability in a physical environment, it may be wise to refrain from making claims about the development of this ability in a social environment, because the nature and dynamics of inputs will be different. Thus, readers may wish to attend to the extent to which the papers in this section clearly distinguish between conclusions within versus beyond the scope of their model. Of course, there is nothing inherently wrong with speculation, when delineated as such.

CONCLUSION
You might think: "I am no modeler, and never will be." Nevertheless, you may help to improve the state of psychological theory in the area of child development. As an author, you can strive for clarity (Frankenhuis et al., 2023). Concretely, you might do this by interrogating your own ideas and inviting skeptics or modelers to poke holes in them. As a reviewer, you may encourage theoretical transparency. The worst we can do is penalize authors for being explicit about ambiguities in theories. Doing so incentivizes authors to create a Potemkin village: a façade of clarity that collapses upon closer inspection (Nguyen, 2021). It is a form of intellectual humility to be transparent about ambiguity (Hoekstra & Vazire, 2021). As an editor, you may encourage authors to formalize theory and to be transparent about ambiguities (Jamieson & Pexman, 2020; van Rooij, 2022). In all three of these roles (as author, reviewer, and editor), you may benefit from using Table 1. This table lists six questions that serve to help evaluate theoretical transparency.
Our larger message is that developmental scientists should strive for rigorous theory. By promoting clarity and precision, formal modeling provides one avenue to improving the state of psychological theory. However, in some cases, a well-articulated verbal theory may be sufficient. Moreover, a clear theory will not settle every dispute. For instance, there might remain debate over which empirical unit (estimated from observed data), and which measurement instruments, best capture a given theoretical unit (Lundberg et al., 2021; Rohrer, 2021). Nevertheless, such debates will be more fruitful when scientists operate within a shared framework of transparent ideas and logic, rather than a Wild West world of ambiguous language (Frankenhuis et al., 2023). We hope this Special Section will inspire more developmental scientists to contribute to this prospect.