Diversity and representation in infant research: Barriers and bridges toward a globalized science of infant development

Psychological researchers have become increasingly concerned with generalized accounts of human behavior based on narrow participant representation. This concern is particularly germane to infant research, as findings from infant studies are often invoked to theorize broadly about the origins of human behavior. In this article, we examined participant diversity and representation in research published on infant development in four journals over the past decade. Sociodemographic data were coded for all articles reporting infant data published in Child Development, Developmental Science, Developmental Psychology, and Infancy between 2011 and 2022. Analyses of 1682 empirical articles, sampling approximately 1 million participants, revealed consistent under-reporting of sociodemographic information. For studies that reported sociodemographic characteristics, there was an unwavering skew toward White infants from North America/Western Europe. To address the lack of diversity in infant studies and its scientific impact, a set of principles and practices is proposed to advance toward a more globally representative science.


| INTRODUCTION
Henrich et al. (2010) published a highly influential paper on the scientific costs of sampling practices in psychological research. This paper famously described how our understanding of psychological processes has been disproportionately informed by particular demographic groups. More specifically, individuals from what Henrich et al. termed "Western, Educated, Industrialized, Rich, and Democratic" (WEIRD) societies were shown to be over-represented in individual samples. At the same time, the work of Henrich et al. (2010) suggests that individuals from these over-represented groups are often outliers in their psychological responses (see also Arnett, 2008; Graham, 1992; Rozin, 2006). This article ignited intense discussion about the limits on the generalizability of studies of human behavior, given the narrow sampling practices that have prevailed in the field. Developmental psychology is no exception to this practice: Research on child development remains firmly grounded in the Western, industrialized world but is often broadly extrapolated beyond these settings (Legare, 2017; Moriguchi, 2021; Nielsen et al., 2017).
Although participant diversity is a central concern in many areas of psychological research, these issues are particularly relevant to the study of infancy. On theoretical grounds, infant research is often viewed as a lens through which the origins of human behavior are revealed. Findings from infant research have served as a foundation for theory-building about innateness, human nature, universal principles of development, and the initial state of psychological organization (Simion & Butterworth, 1999). We do not oppose these scientific goals. However, we raise them against the backdrop of sampling practices that do not represent the full diversity of environments in which infants are raised.
Even within widely represented settings, infant laboratories often do not draw randomly from the population, but tend to over-select majority-culture families from high socioeconomic strata (Fernald, 2010). In addition to skewed sample composition, infant research typically involves small sample sizes (Oakes, 2017). The combination of these normalized research practices in infant studies (small sample sizes and limited sociodemographic representation) would ordinarily point to the presumption of low generalizability. In spite of these factors, evidence from infant studies is often marshalled to theorize at the highest level of generalizability, driving overarching theories of human nature (Berent, 2021).

| METHOD AND RESULTS
In this paper, we discuss three core issues relating to diversity and representation in infant research. First, we review sampling practices in infant research over the last 12 years with a focus on reported race, ethnicity, and geographic origin of participants. The goal of this analysis is to examine (i) the distribution of participants across author-reported racial and ethnic categories, (ii) the distribution of research studies across geographical regions of the world, and (iii) changing trends in sociodemographic representation of participants over time. Second, we propose specific principles to address sampling bias in infant research. Third, in light of the fact that sampling practices have not changed in spite of significant consciousness-raising efforts, we suggest strategies for actionable behavioral change within the research community toward a more globally representative science of infant development.

| The state of the field: Participant diversity in infant research
First, we examined author-reported racial, ethnic, and national origins of participants and authors of published papers reporting psychological research on infants between 2011 and 2022 (inclusive) in the wake of Henrich et al. (2010). We relied on author-reported categorization of participants. At the outset, we acknowledge the significant variation that exists within the racial/ethnic categories and geographical regions reported by authors. We do not presume that individuals within these groups are homogenous, nor do we assign similar behavioral traits to these large and diverse groupings. Rather, we make use of these categories to chart time-trends in author-reported measures of participant diversity.

| Journal selection and data analysis
We selected four journals for this analysis: Infancy, Child Development, Developmental Science, and Developmental Psychology. We selected these journals because they represent generalist outlets with stated broad foci. All data are aggregated across the four journals. We included articles that met three criteria. First, we included articles with participants that ranged from birth to 30 months. For articles with multiple studies, some of which tested participants below 30 months and others of which tested participants above 30 months, we included samples below 30 months. If samples were indivisible (e.g., "Participants ranged in age from 2 to 5 years"), the article was excluded. Second, we focused on articles investigating typically developing infants. We therefore excluded articles where children were reported to have known developmental delays or disabilities. Third, we only included empirical research articles. We excluded commentaries, theoretical position papers, review articles, and meta-analyses. Based on these criteria, we analyzed sociodemographic representation in 1682 articles. Figure 1 displays the process of manuscript selection in a PRISMA flowchart.

| Representation of participants' race/ethnicity
We examined the full text of each article for demographic information to chart race and ethnicity and geographical representation of participants. We relied on the authors' own reporting for all demographic information. Hereafter, "sample" refers to unique groups of infants being tested within a study. "Study" refers to the entire data set within an article, which may contain more than one type of sample. "Article" refers to the published manuscript. Coding methods for all demographic variables are detailed in Supporting Information S1. All data are available at https://osf.io/e6z27/?view_only=ee1ed98829274619ad4725263606a1b9.
Within each study, we classified each sample by racial and ethnic categories as reported by authors. The reported racial categories of individual samples within studies were coded and the proportion of studies that corresponded to each racial category was computed by year of publication (see Figure 2). This figure reflects data aggregated across 987,486 participants distributed over 1682 studies. As can be seen in Figure 2, there was consistent under-reporting of race and/or ethnicity, a pattern that remained stable over the 12-year period with no observable attenuation. Of the subset of studies that reported the race of samples (47% of total studies), 82% reported data from White or predominantly White samples, a pattern that also remained stable over the 12-year period. Representation of Black, Latinx, Asian, and Indigenous samples combined amounted to less than 4% of all studies that reported participant race.
For a more granular view of participant representation, we manually calculated the proportion of participants belonging to each racial group across the period sampled. This analysis collapsed across samples within studies. Based on these calculations, the proportion of participants reported to belong to each racial/ethnic category was derived and aggregated across the 12-year time period (see Figure 3). The total number of participants included in this analysis was 987,687 distributed over 1682 studies.
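The distinction between the study-level and participant-level analyses can be made concrete with a minimal sketch. The records, field names, and function names below are hypothetical and purely illustrative; they are not the coded data set or the coding protocol described in Supporting Information S1, and the sketch assumes one category label and one total sample size per study.

```python
from collections import Counter, defaultdict

# Hypothetical coded records, one per study (illustrative values only,
# not the article's data).
STUDIES = [
    {"year": 2011, "race": "White", "n": 24},
    {"year": 2011, "race": "Not reported", "n": 40},
    {"year": 2012, "race": "White", "n": 16},
    {"year": 2012, "race": "Black", "n": 20},
    {"year": 2012, "race": "White", "n": 900},  # stands in for a large cohort study
]


def study_level_by_year(studies):
    """Proportion of studies in each category, per publication year."""
    counts = defaultdict(Counter)
    for s in studies:
        counts[s["year"]][s["race"]] += 1
    return {
        year: {race: c / sum(cats.values()) for race, c in cats.items()}
        for year, cats in counts.items()
    }


def participant_level(studies):
    """Proportion of participants in each category, pooled over all years."""
    totals = Counter()
    for s in studies:
        totals[s["race"]] += s["n"]
    grand_total = sum(totals.values())
    return {race: n / grand_total for race, n in totals.items()}


if __name__ == "__main__":
    print(study_level_by_year(STUDIES))
    print(participant_level(STUDIES))
```

In this toy example, each study counts once at the study level, whereas a single large study dominates the participant-level proportions. This is one reason the two levels of analysis can show different profiles.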
In combination, Figures 2 and 3 bring into sharp relief a consistent pattern of under-reporting of race/ethnicity. Of equal significance, however, is that among studies that reported race/ethnicity, there remains a clear predominance of White samples with little fluctuation over the time period sampled. What has also remained decidedly stable is the low representation of Asian, Latinx, Black, Indigenous, and multiracial samples. Figure 3 exemplifies an even more pronounced pattern at the participant level, revealing a very strong skew toward non-reporting. Similar to study-level analyses, when race/ethnicity information was reported, there was disproportionately high representation of White participants. It should be noted that at the participant level, there were several studies (5.1%) that drew from large-scale birth cohorts. Sample sizes in these cohorts were very high (the mean sample size per article for these studies was 8372 participants) compared with typical laboratory studies. In general, these large-scale studies reported race/ethnicity in greater measure. As such, the pattern evident in

| Representation of participants' places of origin
Next, we examined the geographical distribution of regions from which samples were drawn. Studies varied in how authors reported location of data. Studies were categorized into those that explicitly mentioned location of testing or those that made no mention of location or that indirectly stated location (see Supporting Information S1 for the detailed coding protocol). Over the time period sampled, nearly half the studies (43%) had no location information stated. As can be seen in Figure 4a, there has been a gradual increase over time in the proportion of studies that stated the location of data collection and a corresponding decrease in studies for which no location information was provided.
Of studies where location was explicitly stated, each study was classified by the world region where the study took place (Africa, Asia, Australia and New Zealand, Central Asia, Eastern Europe, North America, South America, Western Europe, Oceania) or as being multi-region (data was collected in more than one of these regions). This analysis assessed the site of data collection for 889,290 participants, distributed over 968 studies where location was directly reported (see Figure 4b). As displayed in Figure 4b, over the 12-year period sampled, there has been an enduring reliance on studies conducted in North America and Western Europe. As a result, 84% of studies relied exclusively on data from geographical regions inhabited by less than 7% of the world population. There is no clear evidence of geographical diversification in sample representation outside of North America and Western Europe over time.
Figure 5 reflects the overall distribution of samples across different world regions aggregated across 2011-2022. As can be seen, representation of infant samples across world regions remains uneven, with high concentrations of samples in North America and regions of Western Europe. There is very little representation in the Global South, notably Africa and Latin America. In addition, representation from many Asian countries is absent or minimal. To chart change over time, Figure 6 reflects change in sample representation between two successive time periods (2011-2016 and 2017-2022). This figure reflects relative increases or decreases in sample representation between these two time windows proportionate to the number of papers published during each time window. As can be seen, there is an overall decrease in some widely represented regions (e.g., North America) and correspondingly, an increase in representation in some regions of the Global South (e.g., India, Argentina). However, overall, many regions of the Global South remain vastly under-represented relative to the Global North (see Supporting Information S1 for full data).
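One way to express a change in representation that is proportionate to the number of papers in each window is to compare each region's share of studies across the two windows. The sketch below assumes this simple difference-in-shares measure, which may not be the exact computation behind Figure 6; the records and function names are hypothetical and the values are illustrative only.

```python
from collections import Counter

# Hypothetical (year, region) pairs, one per study (illustrative values only).
STUDIES = [
    (2011, "North America"), (2012, "North America"), (2013, "North America"),
    (2014, "Western Europe"),
    (2018, "North America"), (2019, "North America"),
    (2020, "Western Europe"), (2021, "Asia"), (2022, "South America"),
]


def regional_shares(studies, start, end):
    """Share of studies per region within [start, end], inclusive."""
    in_window = [region for year, region in studies if start <= year <= end]
    total = len(in_window)
    if total == 0:
        return {}
    return {region: count / total for region, count in Counter(in_window).items()}


def change_in_share(studies, early=(2011, 2016), late=(2017, 2022)):
    """Assumed measure: late-window share minus early-window share, per region."""
    before = regional_shares(studies, *early)
    after = regional_shares(studies, *late)
    regions = set(before) | set(after)
    return {r: after.get(r, 0.0) - before.get(r, 0.0) for r in regions}


if __name__ == "__main__":
    for region, delta in sorted(change_in_share(STUDIES).items()):
        print(f"{region}: {delta:+.2f}")
```

Dividing by the number of studies in each window keeps a region from appearing to grow simply because more papers were published in the later period.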

| Summary of data analysis
We discuss changes first in demographic reporting and second in demographic representation. First, in terms of reporting, much demographic data remain largely absent from the research record. This is particularly pronounced in reporting of race/ethnicity, but it is also apparent in reporting of location. Reporting of race/ethnicity is complex in some regions, such as areas of Western Europe, which may account for low reporting from these regions. Among studies conducted in Western Europe, 72% did not report race/ethnicity data. In contrast, 36% of studies conducted in North America did not report race/ethnicity data. The incidence of unreported race and/or ethnicity in all other regions is as follows: 58% in Asia, 52% in Australia, 92% in Latin America, 80% in Eastern Europe, and 100% in Africa. This variation speaks in part to the viability of collecting race/ethnicity data in some world regions, but it may also speak to the suitability of race/ethnicity, often categorized using US classifications, as a proxy for characterizing participants' community of descent. We return to this point in our recommended practices. Second, with respect to demographic representation, amongst studies where race/ethnicity was reported, infancy research remains dominated by White infants in North America and Western Europe.
In interpreting our findings, we acknowledge that our analyses are constrained by the four journals selected and recognize that outlets with different foci, originating from other world regions, and/or written in different languages may reveal a different profile of demographic reporting and different patterns of sample representation. However, these journals were selected because they represent broad areas of research in developmental science, are widely read, and have had a strong impact on the study of infant development.

Figure 4. (a) Reporting patterns for location of data collection across time for studies aggregated across journals by whether location was stated or not stated. (b) Reported location of data collection aggregated across journals for studies where location was directly reported. Percentages in legend refer to average representation across the time period.

| DISCUSSION
Next, we outline a set of principles to provide a framework for addressing fundamental issues of demographic reporting and representation in our field. We then suggest practices, aligned with these principles, to redress the current imbalance in participant representation. Several of our suggested practices make contact with psychological research more generally, which as a field is in need of greater participant diversity, while others are specific to infant research.

| Principle 1. Expand to explain: Diversifying data to formulate generalizable theories
As documented in the previous section, many densely populated areas of the world, most notably regions of the Global South, are largely absent from the journals sampled. In addition, data from non-White infants remain vastly under-represented. We acknowledge that these factors may be related and that the geographical imbalance observed is at least partially due to the imbalance in research capacity across different world regions. However, in combination with characteristically small and socio-economically narrow samples within individual populations (Fernald, 2010; Oakes, 2017), sampling practices so clearly evident in infant research are better aligned with the presumption of low generalizability than high generalizability.
We believe that one contributing factor to the presumption of high generalizability from infant data is the longstanding normalization (and arguably, incentivization) of such practices in the publication process. In high impact outlets, infant researchers have often argued for cross-cultural invariance in various developmental processes based on small samples of infants from limited data collection sites. For example, researchers have indicated universality in infants' physical reasoning (Baillargeon, 2008), in infants' social evaluation (Hamlin et al., 2007), in children's use of geometry (Shusterman et al., 2008), in infants' learning of cultural knowledge (Csibra & Gergely, 2009), and in infants' early sensitivity to sounds (Werker & Tees, 1984), based on small samples from Western societies. This manner of extrapolation extends from empirical work to theoretical contributions: Several influential theories of infant cognition have presupposed universality in infant development. We discuss two seminal theories of infant development below in relation to the diversity of the evidence basis.

Two universalist theories of infant development: Core knowledge and perceptual narrowing
Here, we discuss core knowledge and perceptual narrowing as two examples of seminal theories of infant development with a universalist orientation. Regarding core knowledge, it has been proposed that specific domains of information are available to all infants (i.e., objects, actions, number, space), while in other domains, core knowledge is presumed to be uniformly absent across populations (i.e., food or non-object artifacts; Shutts et al., 2009). For example, evidence in support of core knowledge has been characterized as follows: "These findings provide evidence that a single system, with signature limits, underlies infants' reasoning about the inanimate world" (Spelke & Kinzler, 2007, p. 90). Similarly, studies on infants' core knowledge of actions are summarized in generalizable terms: "Together, these findings provide evidence for a core system of agent representation that is evolutionarily ancient and that persists over human development" (Spelke & Kinzler, 2007, p. 90). There is certainly acknowledgment of cultural variation in the expression of core knowledge as well as cross-cultural confirmatory evidence to provide some support for these statements. However, if universality is construed as applying to all learners, the representational skew in empirical evidence must be recognized and acknowledged in these theoretical positions. In particular, sameness across a limited set of different cultural contexts does not equate to universality (see Kline et al., 2018).
Perceptual narrowing provides a similar example of a universalist theory grounded in comparatively narrow evidence. Perceptual narrowing theory posits that infants begin their lives as universal listeners and gradually align sensitivities to their native language(s). Empirical studies of perceptual narrowing have been drawn overwhelmingly from monolingual infants from North America (Singh, Rajendra, & Mazuka, 2022). Linguistic representation has been similarly narrow: The specific sounds used in studies on perceptual narrowing of speech are not the most common sounds across languages of the world (Everett, 2018; Singh, Rajendra, & Mazuka, 2022). Yet, perceptual narrowing is conveyed as a broad theory, applying across locations, learners, and languages.
Recent efforts to diversify samples have challenged some of the basic tenets of perceptual narrowing (i.e., that young infants are universal listeners and that there is age-related decline in non-native sensitivity to sound). For example, research with Japanese-learning infants demonstrates that not all phonetic distinctions are available to infants early in development, even those that are clearly distinguished in the input (e.g., Bion et al., 2013; Sato et al., 2010, 2012). Research with Dutch-learning infants argues against an age-related decline in non-native speech discrimination when more naturalistic stimuli are used (e.g., de Klerk et al., 2019), challenging another central tenet of the theory. Moreover, diverse socio-economic sampling has suggested that infants from lower socio-economic status (SES) families do not demonstrate typical patterns of perceptual narrowing for native sounds, demonstrated in Singaporean infants (see Singh, Cheng, & Yeung, 2022) or for non-native sounds, demonstrated in French infants (Gonzalez-Gomez et al., 2020). In both studies, only high-SES samples exhibited the typical textbook pattern of perceptual narrowing that forms the backbone of the theory. Importantly, low-SES infants do not exist at the margins of society: In the United States alone, an  (Hair et al., 2015). Globally, approximately half of the world's population lives in poverty (Watkins & Quattri, 2016). As a result, research findings evidenced in high-SES participants may not represent the underlying population.
To be clear, perceptual narrowing and core knowledge theories have been (and will continue to be) immensely useful for understanding development in several domains and we do not discount their value. Rather, we posit that as theories evolve, the scale of inference of theories should be brought into alignment with cumulative evidence. We further suggest that claims of universality (if operationalized as applying to all learners) be presented with reference to the representation of learners and cultural settings that make up the underlying evidence basis.

Early responsiveness to sociocultural experience
Broad generalization from infant studies may be based on common intuitions about infants. Infants are often perceived to be inert, passive, and inexperienced compared with older populations (Berent, 2021). However, many studies have demonstrated specific instances in which very young infants are responsive to environmental experience. For example, shortly after birth, in the auditory domain, a sample of French and German newborn infants (n = 60) tested in France and Germany demonstrated variation in their crying behavior that aligned with the intonational properties of French and German respectively (Mampe et al., 2009). Similarly, newborn infants tested in Canada whose mothers spoke English and Tagalog during pregnancy (n = 15) could distinguish both languages yet expressed an equal preference for English and Tagalog (Byers-Heinlein et al., 2010). In the olfactory domain, newborn infants tested by French investigators (location of testing not identified; n = 24) preferred the odor of their mother over others (Schaal et al., 2000). In the visual domain, newborn infants tested by U.S. researchers (location of testing and other demographic details not provided; n = 48) preferred their mother's face over other faces (Field et al., 1984). In each of these instances, variation in early behaviors has been attributed to experiences either acquired directly or indirectly (e.g., maternal face preferences may be due to associated vocal/olfactory signals), reflecting early responsiveness to the environment.
The examples above focus on how experience can shape early development at the level of the individual. However, experience can also shape genetic expression at an individual and/or population level (Kline et al., 2018; Richerson & Boyd, 2005). As a result, phenotypic expression can vary across contexts via epigenetic processes (Meaney, 2010). For example, maternal nutrition during the perinatal period (gestation and lactation) predicts infant cognitive outcomes via gene-environment interactions (Morales et al., 2011). Similarly, influences of socio-economic status on infants' cognitive abilities are modulated by gene-environment interactions, conferring greater environmental influence on socio-economically disadvantaged infants (Tucker-Drob et al., 2011). This makes it challenging to precisely identify and differentiate the biological and environmental origins of behavior.
Thus far, we have focused on theories around universality and innateness because these types of theories are often invoked in infant studies. However, we acknowledge that there are other influential theories, such as dynamical systems theory (Smith & Gasser, 2005) and developmental cascades (Oakes & Rakison, 2019), that foreground learning and adaptation, thereby emphasizing flexibility of the human psychological repertoire. While our field therefore does have theoretical diversity beyond nativist accounts, a lack of sampling diversity in our field constrains any theory of infant development, new or traditional.

| Principle 2. Integrate to interpret: Weaving demographic data into analysis and reporting
Human behavior inevitably reflects sociocultural experience. Integrating demographic information about participants and context is therefore critical to an informed interpretation of data (Rogoff et al., 2018). The absence of interpretation of experimental data relative to context and/or demographic characteristics can overstate the uniformity of behavior across contexts and lead to a presumption of normativity. Of concern, however, is that presumptions of normativity are often not uniformly applied to psychological research: They are often selectively over-applied to well-studied populations. One way in which this has been described is in terms of a "Western centrality assumption" (Kline et al., 2018). A salient expression of this assumption lies in how research findings are reported in publications. For example, articles from the Global North are much less likely to include geographical references in their titles and much more likely to characterize studies in generic terms (e.g., 24-month-olds outperform 18-month-olds) in titles, abstracts, and highlights relative to studies from the Global South (Castro Torres & Alburez-Gutierrez, 2022). This has downstream consequences for how research is interpreted, as the use of generic language fuels readers' perceptions of normativity and high generalizability (DeJesus et al., 2019).

Consistent integration of demographic factors across cultural settings
Contextual and demographic details provide valuable information for all research studies but are conspicuously absent or ambiguous in research articles from widely studied samples (Draper et al., 2022). Draper et al. provide examples of how research from North America is often contextualized in terms that can be opaque to readers uninitiated to North American contexts (e.g., "The study took place in a large East Coast town"). What exactly this means about the participants sampled is unclear unless a reader is aware of sociocultural associations with East Coast towns in the U.S. In general, the absence and/or ambiguity of demographic data and cultural context from over-represented participants can feed attributions of typicality or normality to these groups. As further noted by Draper et al. (2022), contextual details are often over-solicited from authors working with under-represented populations (see Causadias et al., 2018; Roberts & Mortenson, 2022, for analogous findings with racially-majoritized vs. -minoritized groups).
The construction of a generalized narrative for over-represented groups impacts how evidence accrues within an area of research (Stanley, 2007). Studies that confirm this narrative, even though they are based on an over-representation of small samples, may be well received as they align with a narrative deemed both representative and truthful. At the same time, studies are more likely to affirm an existing narrative if they draw from the same cultural context as prior studies, conferring potential advantage upon those that conduct research in widely represented settings. Data from under-represented settings may be more likely to deviate from findings derived from widely studied populations due to underlying population-level variance. This research can be more challenging to publish because it disaffirms an established narrative. Indeed, developmental scholars from under-represented regions report difficulty publishing their work if it does not align with research from widely represented communities (e.g., Draper et al., 2022; Moriguchi, 2021). As infant researchers each working with under-represented populations, we have faced the same barriers to publication.

Describing behavior in context: A need for greater precision
In order to integrate to interpret, demographic characteristics should be woven into reports of behavior with sufficient precision. Precision is sometimes thwarted by the use of labels to capture large swaths of the world's population, such as the WEIRD label. The use of contrastive labels (e.g., WEIRD/non-WEIRD) can result in research studies conducted across socio-cultural extremes in order to determine cultural invariance and/or universality versus specificity. For example, researchers have sometimes compared small samples from a large-scale industrialized society (e.g., a university town in North America) with samples from a small-scale, agrarian society to conclude universality of behavior. Differences in responses across culturally distal settings are often interpreted as evidence of cultural dependence (e.g., Gordon, 2004; McClay et al., 2022; Pica et al., 2004). Sameness in responses is construed as evidence for universality and/or innateness (e.g., Aknin et al., 2015; Barrett et al., 2019; Callaghan et al., 2011; Dehaene et al., 2006; Hernik & Broesch, 2019). As noted by Kline et al. (2018), sameness does not necessitate universality just as difference does not necessitate specificity (for a positive example of context-grounded interpretations of cross-cultural comparisons between a small-scale rural society and a US city, see Elmlinger et al., 2022; also see Harwood et al., 2007 for a positive example of characterizing within-culture variation).
An additional positive example of the value of nuanced cultural description within a population comes from the infant attachment literature. Heavily influenced by the foundational work of Bowlby and Ainsworth et al. (Bowlby, 1979; see also Main & Solomon, 1986), attachment theory has portrayed particular types of care and parental responsiveness as normative. These normative interactions are strongly aligned with caregiver practices in middle-class, North American homes. The Ainsworth model, in particular, remains a staple of attachment theory, often with little qualification as to its cultural specificity (see van Ijzendoorn & Sagi-Schwartz, 2008).
Within the context of this model, Mesman et al. (2018) documented patterns of care in three indigenous communities. Careful description of dyads in the indigenous samples reveals important differences in the ways in which infants are soothed and calmed in these communities versus in more widely documented North American samples. For example, in the indigenous communities, there was a greater reliance on "quiet soothing" (e.g., breastfeeding or tactile soothing) in anticipation of distress. This stands in contrast to distraction via object-mediated activities (e.g., calling the child's attention to an object to divert attention away from their state of distress) that is generally more characteristic of North American dyads.
Mesman's interpretation of caregiver behavior, in the context of the physical arrangement of child and caregiver, renders each form of soothing both responsive and adaptive to context. For example, in societies where infants remain in close physical contact with caregivers, via the use of slings, physical responses may be more easily sensed and therefore, may provide better social signaling than verbal responses. This leads to quieter, anticipatory forms of soothing. When infants are physically distant from their caregivers, verbal communication may be a more salient signal (see also Chapin, 2013; Gaskins et al., 2017) and prompt more reactive soothing. In this way, the physical arrangements between child and caregiver can regulate the structure of infant-caregiver interactions in each context. Without a careful and rich description of context, the behaviors exhibited by Mesman's sample would appear concerning based on the Ainsworth model (physical soothing is less visible, quieter, and easily mistaken for a lack of responsiveness or sensitivity on the part of caregivers). A decontextualized account of mother-infant behaviors can therefore misrepresent infants' experiences in relation to attainment of developmental goals. Furthermore, this approach can introduce and perpetuate cultural deficit models of behavior (Kline et al., 2018). In this way, demographic variables and cultural context should be woven into research reports using necessary detail and through a culturally-attuned lens.

| Principle 3. Innovate to include: Driving methodological innovation to broaden participation
Diversification of our discipline requires us to examine our methodological toolkit and its appropriateness to diverse populations. Psychological research has traditionally been a methodologically conservative discipline, originating from its early reliance on principles of the natural sciences and logical positivism (Bridgman, 1927). Although methods and approaches have evolved and dramatically changed since the inception of experimental psychology more than a century ago, experimental psychology has maintained a longstanding commitment to quantitative data obtained through laboratory-based experimental methods. These methods were developed by and for laboratory researchers within Western industrialized societies. Today, there remains the dominant perception that with the careful operationalization of abstract constructs into numeric data, precise experimental manipulation, and rigorously controlled testing conditions lie objectivity, generalizability, and neutrality (Breen & Darlaston-Jones, 2010). Infant research has followed in the wake of traditional experimental psychology: Laboratory-based looking time experiments remain the primary means by which infant psychological processes are studied (Csibra et al., 2016).

Adapting methods to context
Prevailing experimental approaches in infant research likely befit the contexts for which they were developed, but may not be well suited to other contexts in the same form in which they are instantiated in Western settings. Rather than adopting methods and practices from well-represented settings, it is preferable to develop methods and practices with culturally grounded expertise. We provide a positive example of how infant research has benefited from this approach from the domain of motor development. Characterized by physical changes, many basic aspects of motor development have long been assumed to be impervious to cultural variation. This is exemplified by a set of universal guidelines and percentiles for motor development published by the World Health Organization. However, too often, these compilations were conducted using Western samples, with children reared in Western traditions where childrearing practices are child-centered, the environment is object-abundant for play, and infants are free to move and encouraged to explore. This approach neglects cultural variation in infants' everyday experiences created by childrearing practices, such as how caregivers handle, position, and dress their infants which may offer unique opportunities for movement and exploration (Adolph et al., 2010; Adolph & Hoch, 2019; Adolph & Robinson, 2015; Karasik et al., 2022, 2023).
Exemplifying this principle, researchers examined infant motor development in the context of childrearing practices in Tajikistan, where caregivers use a gahvora cradle that restrains infants' movements (Karasik et al., 2019). To investigate the effects of restriction on motor development, researchers borrowed methods from laboratory and field studies, incorporating qualitative and quantitative accounts of behavior. Local informants helped with developing tools to capture practices and interpret findings. Researchers observed lags in onset ages of motor skills relative to Western norms, but seemingly without longer-term consequences. This is likely due to the restricted experience in infancy being countered with greater opportunities for exploration later in childhood. By 2-3 years of age, children travel freely from inside and outside contexts of their homes, have access to a variety of surfaces and elevations, and travel around their village with minimal adult supervision. Tajik 3-year-olds roam, climb high ladders, use heavy farm tools, and wield sharp objects. These highly specialized skills do not appear on any standardized test. If they did, it is likely that Tajik children would be advanced compared to Western norms. In this way, naturalistic observations revealed culturally relevant experiences that nurture and support infant development in context. Approaches such as these do not involve departing from the empirical rigor that defines our field, but instead, they involve refining empirical approaches to accommodate the full range of ways in which developmental progress can be empirically measured in diverse settings (see Weber et al., 2021). This example illustrates how broadening our methodological repertoire can enrich our understanding of infant development.

| Practices for effecting true change in the field of infant research
The principles described above provide a framework for shifting toward a globalized science of infant development. To accompany the three principles articulated in the previous section, we suggest specific practices aimed at effecting change in our field. We first suggest changes in researcher practices and then in evaluator practices.

Interpreting infant behaviors in the context of demographic characteristics
A first step toward a diversified science of infancy, clearly evident from our empirical analysis, is the consistent collection and provision of sociodemographic data as legally and ethically permissible (Oakes, 2021). As stated in the first principle ("Expand to Explain"), there is high theoretical significance in grounding infant behaviors in the context within which they are recorded. Several journals, such as Child Development, Developmental Science, Infant and Child Development, and Developmental Psychology, have recently begun to mandate that all articles address the sociodemographic origins of participants and the geographical context of the study to the extent possible. Such reporting is therefore becoming a necessity in our field.
As noted in the second principle ("Integrate to Interpret"), we suggest authors' efforts to provide cultural context should not be limited to simply reporting sociodemographic data, but that demographic information should be interwoven into the research narrative (Rogoff et al., 2018). Furthermore, to engage in faithful extrapolation of research findings relative to sample composition, research reports should also be accompanied by claims of generalizability that derive from sample characteristics (Hoekstra & Vazire, 2021; Roisman, 2021; Simons et al., 2017). In particular, when samples are small and narrow, starting with the presumption of low generalizability and seeking positive evidence for expanding the scope of generalization based on cumulative evidence across laboratories, populations, and settings would be more parsimonious than starting with the de facto presumption of high generalizability and contracting the scope of generalization in the face of negative evidence (which may be less likely to surface in published literature).

Reporting demographic factors in describing past research
In addition to incorporating demographic factors in their own research, authors can also intentionally commit to a diversified research narrative in how they situate their work relative to that of others. This relates to the first ("Expand to Explain") and second ("Integrate to Interpret") principles. For example, citation practices tend to grant selective recognition to studies from widely-studied groups (Kwon, 2022). As a result, unexamined citation practices can reinforce the notion that knowledge acquired from widely studied (e.g., Western) populations is foundational and that research from non-Western populations provides cultural commentary on this foundational knowledge, leading to visibility gaps in research based on the origin of participants (Dutra, 2021). In particular, addressing intersectional invisibility (e.g., the interaction of various factors that marginalize particular groups of authors and/or participants) is an important step to counteracting inequities in visibility of researchers and participants (Syed et al., 2018). Via an intentional commitment to inclusive citation practices, researchers can integrate findings from under-represented populations into the mainstream research record. However, it is critical not to imply "otherness" of such populations when citing research on them. That is, citing research from over-represented (mostly White, Western) samples without reference to demographics as a default, while citing research from under-represented samples as a point of contrast to this default (e.g., in a "but see…" addendum), undermines the effort and can reinforce notions of normativity associated with over-represented samples.

Developing culturally inclusive research
As discussed in the preceding section, psychological research is largely driven by methods developed by and for Western populations. This is integral to the third principle, "Innovate to Include." The process of transferring these methods to under-studied populations is inherently complex. In doing so, researchers often confront two incompatible goals: (1) to make use of methods with known validity and reliability estimates often from Western societies and (2) to maximize ecological validity where methods are adapted to the local context (Fernald et al., 2017).
Reconciling this conflict is not easy; there are costs and benefits to each approach. The straightforward adoption of presumptively valid tools from Western contexts to non-Western contexts can lead to the perception of cultural deficits and/or to cultural misattribution. An example of this comes from the direct adoption of infant attachment measures widely used in Western societies, with accompanying assumptions, to non-Western settings, which has mischaracterized attachment behaviors in the latter (Keller, 2012; Keller et al., 2018).
The other extreme, assembly, entails devising methods on the ground that are aligned with local norms and practices. However, the resulting methods often necessarily contain a high proportion of culture-specific items, limiting direct comparison with studies in the wider literature. For example, an accurate assessment of Tajik motor development requires evaluating behaviors that would rarely be observed in a North American sample (e.g., ladder climbing by 3-year-old children; a new walker navigating rugged terrain). Such approaches require researchers to commit to best practices in incorporating local expertise in developing new methods. This also requires reviewers of such work to recognize that established Western methods may be a poor fit for under-represented settings and to appreciate the value of new, culturally appropriate methods.
In some cases, a middle ground may be possible: researchers could use the same task but adapt the stimuli to the local context. For example, in their study of delay of gratification in Japanese and US preschoolers, Yanaoka et al. (2022) used the traditional marshmallow task alongside a version of the same task with wrapped gifts as prizes instead of the marshmallow. Although Japanese children waited to eat the marshmallow, US children more easily waited to open a wrapped gift. Yanaoka et al. argued that this difference reflected the fact that waiting to eat is commonplace in Japanese culture, whereas children in the US more commonly wait for wrapped gifts. This provides a positive example of how the structure of a widely-used task can be adapted to context.
No matter the approach taken, local informants and collaborators are critical to research in under-represented settings (see Morelli et al., 2018). In addition to informing methodological choices, grounded expertise can be instrumental in facilitating cross-cultural collaborations between researchers from different backgrounds. In the next section, we discuss some considerations for cross-cultural collaborations in infant research and their promise in addressing demographic gaps in infant development.

Large-scale, diverse collaborations in infant research: The role of big team science
In infant research, there have been several initiatives to collect "big data" as a complement to the more traditional model of solo science. Examples of these models in infant research include the ManyBabies consortium (ManyBabies Consortium, 2020), an effort to bring together large numbers of laboratories to collaborate on a single project; data repositories from diverse settings, such as CHILDES (MacWhinney, 2014), a repository for language input and output samples; WordBank (Frank et al., 2017), a repository for a specific language inventory; and Databrary (Gilmore et al., 2016), a repository of video data. However, we note that across all of these sources, the provision of sociodemographic data about participants is inconsistent.
On its own, big data does not always provide insight into sociocultural context (see Forscher et al., 2023). With increased scale, there can be decreased insight into the particulars of any given culture within a study (Apicella et al., 2020). Although big data certainly affords the opportunity for investigating variability in behavior based on sociodemographic factors, the true promise of big data in diversifying infant research depends crucially on collecting and examining demographic variables and harnessing the diversity of large cross-national samples in interpreting research findings. As these valuable resources grow and expand, collecting and incorporating participant demographic information will be a key priority.

Equity and inclusion in large-scale international collaborations
Efforts to collect big data often require international collaboration across diverse groups of researchers. The structure of these collaborations, notably the power dynamics, merits careful consideration to avoid the perpetuation of Western centrality and dominance. For example, it is not uncommon for researchers from well-represented countries to serve as the Principal Investigator (PI) and thought leader, obtaining grant funding and inviting others to collect data in exchange for subsidiary funding (e.g., sub-awards). This structure can create dependency at the outset. In large part, these dynamics can mimic post-colonial power structures, whereby individuals from under-represented communities are dependent on and subordinate to those from well-represented communities (see Bhatia, 2020). With this type of vertical structure, individuals from well-represented countries may assume a disproportionately large role in decision-making and leadership. As a consequence, local voices can be sidelined or even absent from the research process, presenting a significant scientific cost to knowledge production (see Singh, 2022, for a discussion of these issues).
Inequities in international collaborations may be more persistent in studies involving researchers from low/middle-income countries (LMICs) and high-income countries (HICs). In particular, researchers from under-represented (local) contexts can be relegated to the role of field workers following the directions of HIC researchers (Parker & Kingori, 2016; Sinha, 1990). To prevent these inequities from taking hold, we suggest that international research groups continuously examine the power structures within their teams. Elevating researchers from both well-represented and under-represented communities to leadership roles (e.g., equal co-PIs; diverse governing boards) is one way to equalize perspectives and experiences (see Aravena-Bravo et al., 2023, for a positive example of intentional practices to promote diversity and equity in international collaborations).

| Change in evaluative practices
Increasing participant diversity and representation involves the commitment of gatekeepers within our field, namely our scientific journals, which are in a very strong position to (i) incentivize diversity and (ii) de-incentivize a lack of diversity (Neblett, 2019). This is relevant to each of our three preceding principles. We first focus on considerations around journal requirements for demographic reporting and then turn to the issue of reviewer bias.

Are journal reporting requirements associated with an increase in sociodemographic description?
Several developmental journals, such as Child Development, Developmental Science, Infant and Child Development, and Developmental Psychology, have recently begun to mandate that all articles provide sociodemographic data. Current reporting requirements vary across journals. Of the journals sampled, Child Development requires inclusion of "the theoretically relevant characteristics of the particular sample studied, for example, but not limited to: race/ethnicity, socioeconomic status, language, sexual orientation, gender identity (inclusive of non-binary options), religion, generation, family characteristics; and (3) the place(s) from which that sample was drawn, including country, region, city, neighborhood, school, etc. and all other context variables that are relevant to the focus of the publication." Developmental Psychology states, "Major demographic characteristics should be reported, such as sex, age, socioeconomic status, race/ethnicity, and, when possible and appropriate, disability status and sexual orientation." Developmental Science requires the provision of ethnicity/race and sex/gender.
We compared three journals that instituted requirements to provide demographic data on different dates: Developmental Psychology, Developmental Science, and Child Development. Developmental Psychology and Developmental Science both instituted these requirements for papers submitted mid-year in 2022 (Developmental Science) or at the end of 2022 (Developmental Psychology). As confirmed with the Editors-in-Chief of these journals, the results of reporting requirements are unlikely to be apparent as of this writing (P. Quinn [6/8/2022] and K. Pérez-Edgar [11/7/2022], personal communication). However, Child Development instituted a requirement for all papers submitted as of January 2021. The results of this requirement are likely to first become evident in papers published in early 2022 (G. Roisman, personal communication, 11/11/2022). Comparing provision of demographic data across these three outlets in 2022 therefore provides an indicator as to how impactful journal policies can be in increasing sociodemographic reporting. As can be seen below (see Figure 7a,b), reporting of race/ethnicity and of country of data collection has seen an uptick in Child Development relative to the other journals.
For both reporting of race/ethnicity and of location, we computed whether the reporting rates for each journal in 2022 deviated from the preceding years. To statistically test whether provision of data for each journal in 2022 exceeded random fluctuations observed between 2011 and 2021, we computed for each data set (race/ethnicity and location of testing) the mean and standard deviation of the reporting rate across 2011-2021. This allowed us to generate a 95% confidence interval for the distribution of these data and to convert the 2022 rate into a Z-score. Based on these calculations, for author-reported race/ethnicity, the values for Child Development were greater than 2 SD from the mean (and outside the 95% confidence intervals) from previous years. This suggests that reporting for 2022 represents a statistical outlier and is therefore an increase that is incommensurate with the random fluctuation seen in prior years. For Developmental Science and Developmental Psychology, the values for 2022 author reports of race/ethnicity remained within 2 SD and within the 95% confidence intervals for the preceding period and were therefore aligned with the overall trajectory.
The same calculations were performed for location of testing, revealing that 2022 values fell within expectations, based on data from 2011 to 2021, for all three journals. In sum, these data suggest that reporting requirements may be effective for demographic factors, such as race/ethnicity, for which reporting rates are particularly low.
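The outlier check described above can be illustrated with a minimal sketch. It assumes a sample standard deviation and an interval of the mean plus or minus 1.96 SD; the text does not specify either choice. The function name `outlier_check` and all yearly rates below are hypothetical placeholders, not values from our dataset.

```python
from statistics import mean, stdev


def outlier_check(baseline_rates, new_rate, z_crit=1.96):
    """Compare one year's reporting rate against a baseline period.

    baseline_rates: yearly reporting rates (proportions) for the baseline years.
    new_rate: the reporting rate for the year being tested.
    z_crit: multiplier for the interval (1.96 is an assumed choice).
    """
    m = mean(baseline_rates)
    sd = stdev(baseline_rates)  # sample SD (n - 1 denominator); an assumption
    z = (new_rate - m) / sd     # Z-score of the tested year
    lower, upper = m - z_crit * sd, m + z_crit * sd
    return {
        "mean": m,
        "sd": sd,
        "z": z,
        "interval": (lower, upper),
        "beyond_2_sd": abs(z) > 2,
        "outside_interval": not (lower <= new_rate <= upper),
    }


# Hypothetical reporting rates for 2011-2021 (eleven years), for illustration only.
baseline = [0.40, 0.42, 0.38, 0.41, 0.39, 0.43, 0.40, 0.37, 0.42, 0.41, 0.39]

jump = outlier_check(baseline, 0.60)    # a hypothetical sharp rise in 2022
steady = outlier_check(baseline, 0.40)  # a hypothetical unchanged rate in 2022

print(jump["beyond_2_sd"], steady["beyond_2_sd"])
```

With only eleven baseline years, the estimated standard deviation is itself noisy, so a check of this kind is best read as descriptive rather than as a formal test.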

Measurement of impact of reporting requirements
Although the introduction of reporting requirements is undoubtedly a welcome development for our field, we add the caveat that well-intentioned policies to increase demographic reporting, and eventually visibility of research from under-represented groups, can have negative effects, further marginalizing under-represented groups (Cheon et al., 2020). For example, in a recent study, readers of scientific articles were more likely to view an article as relevant if the country of testing was the United States than if it was not (Kahalon et al., 2021). As further evidence of this, studies that included country information in the title had fewer citations than those that did not include this information, but only if the study was conducted outside of the U.S. (Kahalon et al., 2021). Systematic measurement of impact (positive and negative) of reporting requirements is therefore critical to ensuring that these requirements serve their intended purpose and do not have the unintended negative consequence of further sidelining research from under-represented areas/populations.
[…] demographic details. However, this presumes that there is a well-established theoretical base linking demographic factors to human behavior in the area of study. Because demographic factors are often absent from reporting and analysis of data, the opportunity to identify theoretically relevant factors is limited. Demographic variables can have indirect and non-obvious links to infant behavior that may inform future theory, but that cannot be identified a priori as theoretically relevant. We illustrate this point with recent research from infant language development. Infant language exposure has been shown to influence basic psychological processes in non-linguistic tasks. In particular, infants who hear two languages versus one differ in various visual processes, such as visual categorization (Brito & Barr, 2013), visual habituation (Singh et al., 2015), and visual attention in the face of competing visual stimuli (D'Souza et al., 2020). These visual behaviors are not obviously linked to language experience. However, they are impactful as measurement of visual processing (e.g., habituation, attention, categorization) is standardly used to operationalize latent perceptual and cognitive processes in infancy (Csibra et al., 2016). In this way, infant language experience, although rarely reported (Kidd & Garcia, 2022), can have far-reaching effects on basic behaviors that form the methodological backbone of the field (see Byers-Heinlein et al., 2019, for a discussion of this issue). While the relationship between language exposure and visual processing may inform developmental theory (Singh, 2021), the initial discovery often results from collecting and analyzing demographic data in relation to behavior.
Instituting reporting requirements that are equally suited to diverse settings can also be complex. One example comes from reporting race and ethnicity. On one hand, children's racial and ethnic origins are highly relevant to their development as race and ethnicity intersect with many aspects of lived experiences (e.g., caregiving practices, healthcare practices, family structure, SES, and biological development) (Garcia-Coll, 1990). These factors also reveal information about participants' community of descent (Hollinger, 1998) and influence participants' interactions with others (Feliciano, 2016; Markus, 2008). On the other hand, collecting this information in some contexts is ill-advised and can even be illegal (Léonard, 2014). In other world regions, even when these data can be collected, they tend to be collected using US-based racial classifications, which at best map imprecisely, and at worst underspecify underlying racial and ethnic variation within a population (e.g., the use of "Asian" to classify individuals originating from a diverse and heterogeneous continent). Finally, in some contexts, race and ethnicity may not be the best proxies of community of descent and/or cultural identity (e.g., in many countries, Tribal Affiliation or membership in an ethnolinguistic group, but not race, is an important proxy for community of descent). We recommend that journals require instead the provision of relevant demographic information that best defines participants' identities and lived experience within the community being sampled rather than recommending categorization systems that were developed by and for widely sampled communities.
Broadening the range of acceptable measures that can be used to capture demographic variation may generate a more complete demographic picture, particularly in regions where race/ethnicity is not feasible to collect or not well suited to context. Examples of such measures include adherence to cultural or religious routines and practices, language use, migration history, and other behavioral measures of enculturation (Juang & Syed, 2010; Kim & Abreu, 2001; Parameshwaran & Engzell, 2015). This suggestion is not intended to circumvent or undermine the critical importance of race and ethnicity, but rather to provide enriched descriptions of participant identity that are adapted to context.

Standardizing demographic reporting in infant research
There are efforts underway to establish norms for demographic reporting for infant research that tackle the question of what to report. In particular, ManyBabies Demographics aims to provide a generalizable and adaptable framework for demographic reporting in infant research (Singh, Barakova et al., 2022). This effort aims to provide the "lowest common denominator" of demographic information that merits inclusion when collecting infant data. The project is a collaboration between 24 researchers from diverse cultural settings and countries and aims to provide a culturally-adapted framework for capturing demographic variables. Such tools may support the standardized collection of demographic data in a way that facilitates both data aggregation and post-hoc harmonization of demographic data.

Reviewer bias in evaluation of research from demographically diverse populations
In addition to reporting demographic characteristics, the interpretation and use of this information in the review process merits careful monitoring. In particular, Western centrality biases often permeate the peer review process in a way that may limit representation. We provide three common examples here of the ways in which reviewer biases surface in our field. First, when evaluating research from an under-represented (and unfamiliar) society, reviewers of research articles may instinctively interpret findings through the lens of what they know and accept about Western participants. For instance, reviewers sometimes ask for companion data from a Western "control group," as though such data provide the necessary comparator to interpret data from a non-Western sample. In addition to potentially delegitimizing data from the non-Western sample, this expectation elevates evidentiary standards for researchers from under-represented contexts relative to those from over-represented contexts (who are not generally required to ground their arguments in comparison data from under-represented regions).
Second, at a more basic level, reviewers may query the very relevance of research drawn from an under-represented region, even if that region is populated by a large proportion of the world's population. This conveys that research from under-represented countries is of narrower interest. For example, Draper et al. (2022) described the response received from an Editor for a study conducted in South Africa, "While your research is important and the South African context interesting, I do not see [journal name] as an ideal fit for this paper. I would suggest looking to a journal focused exclusively on international research." Editorial decisions that dismiss the relevance of empirical work based on its place of origin problematize the publication process at the outset for those working in under-represented settings. Moreover, the frequent use of "international" to refer to "non-US" disregards the reality that this label encompasses 95% of the world's population. Alternative terms, such as the "Majority World," reflect population-level statistics and can serve as a reminder of the vast number of people that inhabit under-studied areas.
Third, authors reporting data from under-represented communities are often advised to explicitly discuss representativeness as a limitation; such statements are rare in papers sampling over-represented populations where generalizability is frequently presumed (and unstated). Similarly, authors from under-represented settings are sometimes told that a particular study within their population is not of added value given that other prior studies have examined the same phenomenon in the same or a geographically/culturally proximate population, a criticism rarely leveled at widely-studied samples. Described as reviewer "microaggressions" (National Academies of Sciences, Engineering, and Medicine, 2022), these types of comments can marginalize authors from under-represented communities. Journal editors are in a strong position to be vigilant about microaggressions and other sources of bias in the review process and to mitigate their effects.

Editorial contributions to a global science
Diversifying editorial boards is a critical step towards diversifying representation. For example, in scholarship around race, diversifying editorial boards can lead to diversification of submitting authors and of the types of participants represented in submitted research (Auelua-Toomey & Roberts, 2022). In addition to editorial diversity, Editors can actively prioritize submissions from under-represented settings. For example, in a recent initiative, the American Psychological Society disseminated a plan to modify the review criteria for its journals to prioritize populations that are currently under-represented in the submission process and to ensure that authors are globally representative. These efforts reflect not just a rhetorical commitment to diversification, but tangible editorial practices that are aligned with those commitments.
At the time that all articles were coded, the demographics of the Editors and Associate Editors across all four journals reflected uneven demographic representation. Across all journals, 78% of Editors and Associate Editors had institutional affiliations in North America, 19% in Western Europe, and 3% in Asia. Variation across journals existed largely in terms of relative representation of Editors/Associate Editors from North America versus Western Europe. In no journal was there more than 6% representation from Asia. And in no journal was there any representation from a Latin American or African country. While this is in part a reflection of the representation of infant researchers in each of these countries, it does constrain the global diversity of perspectives in research evaluation (Moriguchi, 2021).

Inclusion of evaluators from LMIC
The majority of infants around the world are raised in an low-to-middle income countries (LMIC), making the inclusion of such children crucial to a globally representative narrative of infant development As can be seen from our data, however, LMIC representation in infant research is very limited.In addition, research from LMICs in our dataset was often conducted with the goal of confirming theories originating from the U.S. or Western Europe.However, the specific topic areas, research methods, and scientific foci relevant to LMICs differ markedly from research from HICs (Bornstein et al., 2012).Areas of research such as the impact of nutrition, chronic stress, poverty, and/or political instability may be more critical determinants of infant thriving in these settings and as such, the basic questions may differ from HICs (see Tomlinson & Morgan, 2015).As such, broadening the lens through which infant development is interrogated via increased LMIC participation would contribute to a more global science.Engagement of LMIC scholars in the editorial process is a key priority in broadening representation in this way.
Diversification of editorial boards to include greater LMIC representation can be complex. In many instances, scholars in LMICs face increased time pressure and may have higher commitments to teaching or service delivery. Moreover, the incentive structures of academic appointments in LMICs may not reward editorial participation in the way that U.S./E.U. institutions do. Finally, financial incentives to editorial participation may be more heavily weighted in LMIC than HIC settings. Identifying and addressing these barriers to participation is critical to broader inclusion. One approach to greater LMIC participation is for Editors to support LMIC authors in the publication process. For example, for an upcoming Special Issue on infant research in Latin America at Infant Behavior and Development, the Guest Co-Editors hosted a webinar in Spanish and Portuguese to provide Latin American authors with information on manuscript writing and review criteria and to offer strategies for submission. Another approach may be to support capacity-building efforts, such as developing mentorship schemes to facilitate a transition to editorial roles in under-represented regions (Pike et al., 2017). Overall, a comprehensive examination of barriers to participation that solicits the direct perspective of LMIC members (rather than U.S. spokespeople for LMIC scholars) is critical to broadening participation.

The role of U.S. researchers in global diversification efforts
Even with clear support for a global science of infant development, there are resource limitations to scaling up individual research programs to survey infants across international settings. However, researchers from well-represented environments are in a strong position to engage in diversification initiatives without assuming costs that are not feasible for them. These researchers may be ideal change agents for shifting under-represented research into the mainstream, given their relatively high visibility in mainstream publications, prominence in editorial positions, and expertise in navigating U.S.-based publications. As a start, researchers from well-represented environments can help to expand the scientific narrative by integrating findings from diverse populations into the dominant narrative on infant development present in textbooks, summaries, reviews, and synopses. Weaving data from diverse populations into mainstream accounts of development is preferable to relegating these data to separate commentaries on the role of culture in infant development (i.e., the "Culture as Chapter 13" approach; Syed & Kathawalla, 2022).
In addition, infant researchers within well-represented settings can explore effects of demographic variation within their local settings to address core issues of generalizability in widely-sampled contexts. Several infant researchers have engaged in these initiatives, generating findings that have modified the prevailing narrative in impactful ways. We provide some positive examples here. Gaither et al. (2012) demonstrated that the widely documented exploration of, and preference for, own-race faces differ for U.S. infants born into mixed-race homes, compared with those born into monoracial homes. Lopera-Perez et al. (2022) found that patterns of neural responding to infant-directed speech do not generalize to lower-SES infants in the U.S. Clearfield and Jedd (2012) reported that widely documented effects of stimulus complexity on visual attention are not observed in lower-SES infants. These are a few of many examples of how increasing sample diversity within widely represented settings has substantially revised the empirical record and modified established theories of infant development.

| CONCLUSIONS
In this article, we sought to examine the state of the field with respect to participant diversity in infant research. Since 2011, a watershed year in psychological science in which concerns around sampling bias were re-centered in academic discourse, participant samples in infant research have shown little sign of diversification. Data from approximately 1 million participants reveal a significant skew toward White participants from North America and Western Europe, as well as a strong practice of not reporting demographic details. Given the attention brought to this issue in recent years, there is clearly no information deficit about the lack of diversity and its impact on developmental research, yet we have made very limited progress in diversifying our science.
In this article, we have identified deeply rooted, structural factors that have limited progress in diversifying our field, along with principles and practices to dismantle barriers in an effort to narrow this gap. In addition to individual practices, we suggest that a lack of diversity within the research community and reduced participation and visibility of under-represented researchers are organizing factors in our field. Genuine efforts to diversify research output take purposeful, intentional, and decisive action on the part of relevant stakeholders; careful self-examination of inequities in our field, some of which are subtle and represent hidden barriers to diversification; and the establishment of precise and measurable targets. Changes at each tier of the research process (knowledge construction; research evaluation; knowledge dissemination) are critical to advancing toward robust and generalizable theories of infant development.

FIGURE 1 A PRISMA flowchart of article selection.
15327078, 2023, 4, Downloaded from https://onlinelibrary.wiley.com/doi/10.1111/infa.12545 by Ecole Normale Supérieure de Paris, Wiley Online Library on [19/12/2023]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License

Figure 3
Figure 3 actually over-represents participants from a minority of studies (cohort studies) and may thus over-estimate overall reporting of participant race/ethnicity.

Percentage of studies by race aggregated across journals. The three largest constituencies are labeled in the graph. Percentages in legend refer to average representation across the time period.

FIGURE 5 A geographical distribution of samples for studies published between 2011 and 2022 aggregated across journals. The legend refers to the range of samples within each color band.

FIGURE 6 A geographical depiction of change in samples of publications in 2011-2016 compared to 2017-2022 aggregated across journals. Dark blue shading indicates an increase in sample representation over the two time periods proportionate to the number of papers published in each time window. Light blue shading indicates a decrease in sample representation over the two time periods proportionate to the number of papers published in each time window.
estimated 51% of children come from low-income environments

Challenges in reporting demographic factors
Researchers often struggle with which demographic details to report. Some societies and journals, such as the Society for Research in Child Development, urge authors to provide "theoretically relevant"

FIGURE 7 (a) Proportion of studies reporting race/ethnicity in Child Development, Developmental Psychology, and Developmental Science. Red box indicates window of analysis for significant deviation from 2011 to 2021. (b) Proportion of studies explicitly reporting site of data collection in Child Development, Developmental Psychology, and Developmental Science. Red box indicates window of analysis for significant deviation from 2011 to 2021.