Infant-directed communication: Examining the many dimensions of everyday caregiver-infant interactions

Everyday caregiver-infant interactions are dynamic and multidimensional. However, existing research underestimates the dimensionality of infants’ experiences, often focusing on one or two communicative signals (e.g., speech alone, or speech and gesture together). Here, we introduce “infant-directed communication” (IDC): the suite of communicative signals from caregivers to infants including speech, action, gesture, emotion, and touch. We recorded 10 min of at-home play between 44 caregivers and their 18- to 24-month-old infants from predominantly white, middle-class, English-speaking families in the United States. Interactions were coded for five dimensions of IDC as well as infants’ gestures and vocalizations. Most caregivers used all five dimensions of IDC throughout the interaction, and these dimensions frequently overlapped. For example, over 60% of the speech that infants heard was accompanied by one or more non-verbal communicative cues. However, we saw marked variation across caregivers in their use of IDC, likely reflecting tailored communication to the behaviors and abilities of their infant. Moreover, caregivers systematically increased the dimensionality of IDC, using more overlapping cues in response to infant gestures and vocalizations, and more IDC with infants who had smaller vocabularies. Understanding how and when caregivers use all five signals—together and separately—in interactions with infants has the potential to redefine how developmental scientists conceive of infants’ communicative environments, and enhance our understanding of the relations between caregiver input and early learning.

• Over 60% of the speech that infants encounter during at-home, free play interactions overlaps with one or more of a variety of non-speech communicative cues.
• The multidimensionality of caregivers' communicative cues increases in response to infants' gestures and vocalizations, providing new information about how infants' own behaviors shape their input.
• These findings emphasize the importance of understanding how caregivers use a diverse set of communicative behaviors-both separately and together-during everyday interactions with infants.

INTRODUCTION
Everyday caregiver-infant interactions are dynamic and multimodal (e.g., Frank et al., 2013; Holler & Levinson, 2019; Perniss, 2018; Schatz et al., 2022; Suarez-Rivera et al., 2022; Yu & Smith, 2012). A variety of studies have demonstrated that caregivers use and modify a diverse set of communicative behaviors-including speech, action, gesture, emotion, and touch-while interacting with infants (e.g., Brand et al., 2002; Chong et al., 2003; Fernald et al., 1989; Iverson et al., 1999; Stack & Muir, 1990), and these infant-directed modifications are linked to infants' attention and learning (e.g., Abu-Zhaya et al., 2017; Dimitrova & Moro, 2013; Gogate et al., 2000; Koterba & Iverson, 2009; Lew-Williams et al., 2019; Moses et al., 2001; O'Neill et al., 2005; Rowe, 2000; Seidl et al., 2015; Stack & Muir, 1992; Tamis-LeMonda et al., 2008; Tincoff et al., 2019; Williamson & Brand, 2014; Wu et al., 2021). However, these behaviors are frequently treated as separable components of communication, and most research focuses on one or two dimensions at a time (e.g., speech alone, or speech and gesture together), which results in an oversimplification of how communication unfolds during natural caregiver-infant interactions. Caregiver-infant communication is highly multidimensional-it is easy to imagine a caregiver saying "Wow, look!" while pointing at an object and showing surprise on their face. In this case, speech, gesture, and emotion would all be occurring simultaneously and working together to create meaning. A richer characterization of this natural multidimensionality is necessary for the field of developmental science to construct a comprehensive framework for understanding how communicative cues are used and integrated during everyday caregiver-infant interactions.
The goal of the current study is to address this limitation by investigating "infant-directed communication" (IDC): the suite of communicative signals from caregivers to infants including speech, action, gesture, emotion, and touch. We first review existing literature on how each of these dimensions, independently, is used and modified in caregiver-infant interactions. Next, we examine 10-min, at-home free play sessions to characterize and quantify caregivers' use of this suite of multidimensional cues. With this comprehensive characterization of caregiver-infant communication, we take a new approach to studying turn-taking to better understand not only whether caregivers respond to infant vocalizations and gestures, but also the extent to which they increase the dimensionality of their communication (i.e., providing combinations of cues across multiple modalities) at key moments in dynamic interactions. Finally, we explore how caregivers' use of multidimensional communication is modulated by infants' growing language skills. Understanding how and when caregivers use infant-directed communication provides a more complete picture of infants' real-world learning environment, deepens our understanding of caregiver input and early learning, and enables the creation of ecologically-valid theories of infants' learning from natural, everyday interactions.

Infant-directedness across modalities
Infant-directed speech. It has been well-established that many caregivers (and older children; e.g., Shatz & Gelman, 1973) modify their speech when interacting with infants (e.g., Ferguson, 1964; Fernald et al., 1989; Hilton et al., 2022; Kuhl et al., 1997; Piazza et al., 2017; Snow & Ferguson, 1977). Infant-directed speech (IDS) features higher and more variable pitch, shorter utterances, increased repetition, and simplified vocabulary, among other characteristics that make it distinct from the type of speech that is typically directed to adults (Fernald et al., 1989). These adaptations to IDS have a number of benefits for young learners. For example, there is robust evidence that infants prefer to listen to infant- over adult-directed speech (ManyBabies Consortium, 2020; Cooper & Aslin, 1990; Fernald, 1985), which is likely driven by optimized neural entrainment to its moment-to-moment dynamics (Nencheva & Lew-Williams, 2022). Additionally, the characteristics of IDS appear to promote early word learning by, for example, helping infants discriminate speech sounds and segment words out of continuous speech (Golinkoff et al., 2015; Graf Estes & Hurley, 2013; Ma et al., 2011; Soderstrom, 2007; Thiessen et al., 2005; Trainor & Desjardins, 2002).
IDS is clearly a key component of infant-directed communication, but speech rarely occurs in isolation. For each of the remaining dimensions (action, gesture, emotion, and touch), there is independent evidence that caregivers modify their behaviors when interacting with infants and that these modifications promote successful caregiver-infant communication. In many cases, these signals may be overlapping, simultaneous, or even synchronous or redundant with one another.
Below, we describe each of these signals and their benefits for early development.
Infant-directed action. When interacting with infants, caregivers act on objects as they talk about them (e.g., Karmazyn-Raz & Smith, 2023; Meyer et al., 2011; Schatz et al., 2022; Suanda et al., 2016; Suarez-Rivera et al., 2022), and there is evidence that caregivers modify these object-directed actions in ways that are analogous to the modifications observed in IDS. Action demonstrations directed to infants (vs. adults) are more enthusiastic, repetitive, and simplified, use a larger range of motion, and are performed closer to the infants' space (Brand et al., 2002). Infants prefer to look at infant-directed over adult-directed action, and they are more likely to imitate and explore objects that have been demonstrated using these infant-directed features (Brand & Shallcross, 2008; Koterba & Iverson, 2009; Meyer et al., 2022; Williamson & Brand, 2014). Furthermore, caregivers frequently manipulate objects as they simultaneously make a verbal reference to that same object (Gogate et al., 2000; Messer, 1978), suggesting that caregivers may use action to reinforce their speech. Additionally, just as IDS seems to help infants segment words from speech, recent research suggests that infant-directed action may help infants segment individual action units from unfolding activity streams (Kosie & Baldwin, 2023). Action, too, thus appears to be an important feature of everyday caregiver-infant communication.
Infant-directed gesture. Gesture is another common and widely investigated feature of caregiver-infant interactions (e.g., Goldin-Meadow, 2005; Morganstern, 2024; Rowe et al., 2008; Schmidt, 1996; Vigliocco et al., 2019). Like speech and action, there is evidence that caregivers modify their gestures when interacting with infants and that caregivers' use of gestures changes with infant development. For example, gestures to infants are initially simple and concrete (e.g., frequent pointing and reference to the immediate environment), but both the amount and complexity of gestures increase with infants' age and abilities (Dimitrova & Moro, 2013; Gogate et al., 2000; Iverson et al., 1999; O'Neill et al., 2005; Shatz, 1982). As with infant-directed speech and action, caregivers' gesture use impacts infants' learning and behavior. For example, caregiver gesture directs infants' attention to relevant features of the environment and has been linked to more accurate word learning (Booth et al., 2008; Vogt & Kauschke, 2017; Zukow, 1991). Additionally, caregivers' gesture use positively predicts infants' gesture use, which in turn predicts their later vocabulary size (Rowe, 2000; Rowe & Goldin-Meadow, 2009). Given this evidence, any comprehensive conceptualization of communication should necessarily include gesture.
Infant-directed emotion. Emotion, too, is a primary source of information that infants use in everyday communication with caregivers (for a review, see Wu et al., 2021). While emotion can be conveyed in multiple modalities including facial expression, prosody, and movement dynamics, we chose to operationalize "emotion" as facial expressions, which caregivers often use and modify (e.g., through exaggeration) to express emotional content. For example, Chong et al. (2003) described the form and meaning of three facial expressions that are frequently used and exaggerated during caregiver-infant interactions to express love/concern, joy, and surprise. Caregivers use facial expressions to communicate emotional content, and infants use them to gather information that guides their exploration and learning. When conditions are uncertain, such as activities involving interaction with novel toys or traversing seemingly hazardous paths (e.g., a "visual cliff"), infants use their caregivers' facial expressions as one source of information to guide their own actions. Specifically, infants are more likely to avoid novel toys or situations when their caregiver shows negative emotions and to approach the toys/situations when the caregiver shows neutral or positive emotions (Moses et al., 2001; Tamis-LeMonda et al., 2008). In addition to speech, action, and gesture, facial expressions conveying emotion are yet another way that caregivers provide communicative signals to infants.
Infant-directed touch. Caregivers' use of touch plays an important role in everyday interactions with infants as well (Anisfeld et al., 1990; Feldman et al., 2010; Ferber et al., 2008; Franco et al., 1996; Hertenstein, 2002; Jean et al., 2009; Stack & Arnold, 1998; Stack & Muir, 1990). Existing research on the role of touch in early language and auditory learning suggests that caregivers' touch can enhance infants' ability both to find words in speech and to learn new words (Abu-Zhaya et al., 2017; Lew-Williams et al., 2019; Seidl et al., 2015; Seidl et al., 2024; Stack & Muir, 1990; Tincoff et al., 2019). For example, Lew-Williams et al. (2019) found that combining touch and auditory information promoted infants' learning of auditory patterns, while infants failed to learn these same patterns in the absence of redundant touch cues. Touch, too, is a common source of information that promotes successful communication between caregivers and their infants.
While the research summarized above provides important information about caregivers' use of multiple communicative cues during interactions with infants, these studies often focus on a single dimension at a time (or sometimes a single non-speech dimension alongside speech). To our knowledge, prior work has not aimed to provide a comprehensive view of communication with infants, yet an integration across domains is important for understanding the real multidimensionality of everyday interactions. As a result, the field is currently working from an impoverished picture of how communication naturally unfolds between caregivers and infants. By examining multiple communicative behaviors in parallel, we aim to characterize the full richness of infants' everyday social environments, and in doing so, improve frameworks for understanding the varied contexts of early learning.

Multimodal signals support learning
Redundancy across the senses (i.e., intersensory redundancy) serves to direct infants' attention to features of a situation or stimulus that have been redundantly specified and to attenuate infants' attention to features that are not redundant (Bahrick et al., 2004; Bahrick & Lickliter, 2000). More generally, the use of multimodal cues has been shown to aid learning in a variety of structured in-lab tasks across dimensions that include perception and discrimination of rhythm, tempo, and affect, learning of abstract rules, and mapping words or sounds to objects (Bahrick et al., 2002; Booth et al., 2008; Flom & Bahrick, 2007; Frank et al., 2009; Gogate et al., 2000; Gogate & Bahrick, 1998).

Multimodality promotes attention and learning in more natural interactions as well. For example, mothers' multimodal engagement with objects (e.g., simultaneously touching and talking about an object) during play extends both the duration of play bouts and infants' attention to relevant objects (in contrast to instances that do not involve multimodality; Schatz et al., 2022; Suarez-Rivera et al., 2019; Suarez-Rivera et al., 2022). Caregivers also use multimodal cues to initiate bouts of joint attention and may tailor their use of these cues to the needs of their child (Depowski et al., 2015; Gabouer et al., 2018, 2020).
For example, while there is substantial variation in how caregivers use multimodal cues with both hearing and deaf children, caregivers of deaf children may be more likely to accommodate their child's needs by incorporating more tactile cues overall (Abu-Zhaya et al., 2019; Gabouer et al., 2020). However, less is known about how caregivers use and adapt a broad suite of multimodal cues in response to infants as dynamic interactions unfold across time.

Infants' role in shaping early input
As interactions unfold, caregivers and infants engage in reciprocal turn-taking in which one individual initiates a behavior and the other responds contingently (Bornstein et al., 1992, 2015; Gratier et al., 2015; Holler et al., 2015; Levinson, 2006; Sacks et al., 1978; Stern, 1985). Infants use a variety of behaviors to elicit responses from caregivers (e.g., Bates et al., 1975; Begus & Southgate, 2012; Gros-Louis et al., 2006; Kovács et al., 2014; Lucca & Wilbourn, 2019). For example, vocalizations directed at objects or speech-like babbling by infants are behaviors likely to receive a caregiver response (Albert et al., 2018; Elmlinger et al., 2019; Goldstein et al., 2010), and infants use gestures, like pointing, to request information from others (Kovács et al., 2014; Lucca & Wilbourn, 2019). Caregivers' contingent responses to infants have been linked to the development of attachment, cognitive skills, vocal behavior (i.e., the maturity of infant vocalizations), and language across infancy and early childhood (Baumwell et al., 1997; Goldstein et al., 2003; Goldstein & Schwade, 2008; Jaffe et al., 2001; Nicely et al., 1999; Rollins, 2003; Tamis-LeMonda et al., 2001). Contingent caregiver responses include speech and vocalizations as well as non-verbal behaviors such as facial expressions, gestures, actions on objects, and touch (e.g., Deak, 2018; Gros-Louis et al., 2006, 2014; Nicely et al., 1999). While we might expect caregivers to also employ more multimodal cues in response to infants' vocalizations and gestures, the coding of caregivers' responses is frequently binary (e.g., simply whether they responded or not) and focused on a single communicative dimension at a time. It is currently unknown whether caregivers increase the dimensionality of their communicative acts in response to their infant, which may be ideal for infant engagement and learning.

Characterizing infant-directed communication (IDC)
Existing evidence suggests that: (1) caregivers use a variety of multimodal cues in communicative interactions with infants, (2) multimodal cues support early learning, and (3) multimodal cues may be tailored to infants' behavior. However, the majority of research on the dynamics of caregiver-infant interaction has investigated each multimodal dimension independently, and much of our understanding of how multimodal cues support learning comes from structured, in-lab tasks. We know of no work to date that investigates how caregivers dynamically use and modulate all five multimodal cues-both independently and as they overlap with one another-during dynamic interactions with their infants. Thus, the aim of the current study is to expand beyond a dominant focus on IDS to more fully capture the range of communicative behaviors in natural caregiver-infant interactions.
To better understand how IDC occurs in everyday caregiver-infant interactions, we video-recorded caregivers and their 18- to 24-month-old infants during free play at home (via Zoom). Interactions were coded continuously for caregivers' use of each of the five dimensions of IDC-speech, actions on objects, gesture, emotion (operationalized as facial expressions), and touch on the infants' body-as well as infants' vocalizations and gestures. Because of the degree of complexity introduced by our highly detailed coding scheme, we were limited in the amount of data that was feasible to code and opted for brief, 10-min recordings. Though these recordings represent only a tiny sample of infants' lives, there is existing evidence that brief, at-home recordings like these do capture moments within infants' everyday experiences that closely match periods of high interaction with caregivers (Bergelson et al., 2019; Manning et al., 2020; Tamis-LeMonda et al., 2017). These data were analyzed with the following objectives:
1. First, we sought to characterize the extent to which caregivers used IDC during 10-min interactions with their infants. We examined how each dimension of IDC was used independently as well as how frequently dimensions overlapped. We predicted that caregivers would use dimensions of IDC throughout the entire interaction, but that their use of IDC would vary, both across dimensions and across caregivers. We also expected that speech would rarely occur in isolation and that-even when speech was not occurring-caregivers would still communicate via other non-speech dimensions.
2. We then asked how caregivers modulate their use of IDC in response to infant behaviors, namely, vocalizations and gestures. Specifically, we tested whether the dimensionality of communication (operationalized as the number of cues used simultaneously by caregivers) increased in the period surrounding an infant vocalization or gesture.

If our goal as developmental scientists is to understand how infants learn in the natural, everyday context in which this learning occurs, we need to better understand and characterize how multidimensional, dynamic, communicative caregiver-infant interactions unfold in infants' everyday, real-world experience.

Participants
A total of 44 caregivers (41 mothers, 3 fathers) and their 18- to 24-month-old infants (M age = 20.48, SD age = 1.87; 21 female) were recruited from Central New Jersey and the surrounding area. This sample size was determined based on the number of families we were able to recruit during one summer and is similar to other recent studies examining measures of caregiver-infant communication (which typically include approximately 40 caregiver-infant dyads; e.g., Schatz et al., 2022; Suarez-Rivera et al., 2022). All infants were monolingual (i.e., exposed to at least 80% English in their daily lives), full-term (born after 37 weeks gestation), and had no reported developmental delays. Infants were predominantly white (n = 34 White; n = 2 Asian; and n = 8 more than one race) and came from higher-SES households. All mothers had graduated high school, and most had completed college (n = 14; 32%) or graduate school (n = 27; 61%).
Given these participant characteristics, we ask readers to keep in mind two points. First, as with many studies of early development, we are using a restricted convenience sample (see Kidd & Garcia, 2022; Singh et al., 2023). All of our participants come from relatively higher-SES, predominantly white, English-speaking families in the Northeast of the United States. Thus, while some of our findings may reflect characteristics of caregiver-infant interactions that apply across communities and cultures, we do not intend to make broad generalizations beyond this particular group. Second, we strive to consider and appreciate the substantial variation that exists even across families in this restricted sample. Our goal in describing variation is not to place value on families who fall on either end of the spectrum. Instead, we encourage appreciation of the natural variation that exists even in this relatively homogenous sample.

Figure 1. Example image depicting the Zoom recording setup.

Procedure
Caregivers and infants were recorded during 10 min of free play over Zoom. Before joining the session, caregivers were asked to identify a place in their home where they could play with their infant alone, gather some toys to play with, and turn off the TV, music, or other distractions. Before joining the Zoom chat, caregivers were asked to adjust their camera based on a sample image depicting the recording setup (i.e., both caregiver and infant in view, caregiver facing the camera, and infant seated nearby). Caregivers also completed the MCDI vocabulary questionnaire via WebCDI (deMayo et al., 2021) prior to their appointment. At the start of the session, the experimenter joined the Zoom chat, greeted the caregiver, and explained the procedure (instructing the caregiver to play with their infant just as they normally would). She also helped the caregiver adjust their camera so that both the caregiver and the infant were fully in view and the camera setup was as similar as possible across dyads (see Figure 1). The experimenter informed the caregiver that she would turn off her camera and audio to minimize distractions and would not be watching or listening as the caregiver and infant played together. After answering any questions from the caregiver, the experimenter turned off her video and audio and minimized and muted the Zoom chat. Approximately 10 min later, the experimenter returned to the Zoom chat to let the caregiver know that the play session was finished and thanked the family for their time. After participating, families were emailed a $10 Amazon gift card.

Coding infant-directed communication
Trained researchers coded all videos for caregivers' use of each of the dimensions of IDC (speech, action, gesture, emotion, and touch) as well as infants' vocalizations and gestures. Videos were coded continuously using ELAN, an open-source software designed for annotating video and audio files (Wittenburg et al., 2006). Researchers coded one dimension at a time, watching the video in its entirety and identifying the onset and offset of each occurrence of speech, action, gesture, emotion, or touch as well as infants' vocalizations and gestures. Our comprehensive coding manual and all related materials are available on the OSF, and select session videos are available (for the 21 dyads from which we received caregivers' permission to share) to authorized Databrary users (https://nyu.databrary.org/volume/1589). We defined the five dimensions of IDC as follows:
1. Speech: speech directed from the caregiver to the infant, including communicative utterances such as "hmmm" or animal noises. The beginning and ending of each utterance were identified following existing practices for the annotation and transcription of child-directed speech (i.e., using cues such as prosodic contour, turn-taking, and syntactic structure; MacWhinney, 2014; Soderstrom et al., 2021), and only periods between the onset and offset of utterances counted as caregiver speech, ignoring brief pauses that occurred between utterances.
2. Action: interacting with an object to change its state, location, or spatial orientation. While previous studies have coded object-directed actions whenever the caregiver's hand was touching an object (e.g., Karmazyn-Raz & Smith, 2023; Schatz et al., 2022; Suanda et al., 2016; Suarez-Rivera et al., 2022), our definition of action additionally required the presence of movement. Simply holding or touching an object was not considered action, but activities like stacking blocks, rolling a truck across the floor, or pretending to feed a doll with a toy spoon all involve movement and thus were considered actions.
3. Gesture: a movement, usually of the body or limbs, that has a communicative intent. Our gesture coding was adapted from Iverson et al. (1999) and included deictic (e.g., pointing), conventional (e.g., shaking the head "no"), representational (e.g., extending and retracting the index finger for "worm"), and emphatic gestures (e.g., bobbing side to side while singing). To differentiate gestures from actions, gestures did not involve interactions with objects.
4. Emotion: caregiver production of an exaggerated facial expression depicting emotion. These were most frequently the emotions defined by Chong et al. (2003)-joy, surprise, and care/concern-but other exaggerated communicative facial expressions (e.g., furrowing the brow in an exaggerated way to indicate confusion) were included as well.
5. Touch: non-accidental touches that the caregiver provides on any part of the infants' body (following Abu-Zhaya et al., 2017). We included the use of hands or objects to directly touch the child (e.g., using a stuffed animal to pretend to "kiss" the child on their cheek).
Because each dimension was coded independently, researchers did not have to make decisions about which one of the five dimensions each event depicted, or whether a single event should be classified as one, two, or more dimensions of IDC. That is, a researcher might identify an event as an action when doing the action coding, and the same event as an example of touch when doing the touch coding. For example, a caregiver "driving" a car up and down an infant's leg would be coded as action (moving the car toy to change its state or location) and touch (touching the infant's leg with an object). Additionally, just as we separated utterances in our coding of speech, each individual occurrence of a non-speech behavior (i.e., action, gesture, emotion, and touch) was segmented separately. For example, if a caregiver repeated a pointing gesture in rapid succession, these would be coded as two different gesture events. Similarly, if the caregiver's gesture turned into action (e.g., pointing at a toy car and then picking it up with the same hand), the gesture event would end the moment the caregiver began moving their arm to pick up the car, and an action event would begin (though both gesture and action could occur at the same time if the caregiver pointed with one hand while simultaneously rolling the car with the other hand). Finally, because only a single caregiver and infant were present during the recording, any speech, action, gesture, emotion, or touch by the caregiver was considered infant-directed (e.g., we did not focus solely on moments that were highly exaggerated and/or repetitive).
In addition to coding the five dimensions of IDC, researchers also coded the occurrence of infant vocalizations and gestures. This was again conducted in separate passes through the video. Infant vocalizations included any communicative vocalization the infant made (including babbles, laughs, or squeals, but not vegetative sounds like coughing or burping). As with caregiver gestures, infant gestures were defined as any movement of the infant's body or limbs that had a communicative intent (e.g., pointing or nodding).
Recording via Zoom enabled us to use only one camera angle, which is a limitation of this method of remote data collection, and it was sometimes the case that caregivers or infants were not visible. They occasionally went off-screen, turned away from the camera, or were otherwise blocked from the camera's view. Thus, for each dimension, researchers coded "NC" (not codable) for any segment of time in which it was not possible to see whether the caregiver (for non-speech dimensions of IDC) or the infant (for infant gesture) was or was not exhibiting each behavior. To avoid over-estimating the amount of each behavior present in the interactions, a behavior was deemed "not codable" if there was any question at all about whether the behavior could be properly coded. On average, 9% of seconds in each recording met this criterion. However, this varied across the five dimensions, with the most not-codable seconds occurring in the Emotion dimension (M = 21%, SD = 18.36%), as caregivers simply had to turn their head for the emotion dimension to be considered "not codable".
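The interval annotations produced by this coding scheme can be discretized into per-second indicators, from which dimensionality (the number of cues active at once) and speech/non-speech overlap fall out directly. The following is an illustrative sketch only, not the study's analysis pipeline; the annotation data and names are hypothetical:

```python
# Sketch: turn interval annotations (onset, offset in seconds) into
# per-second indicators and count overlapping IDC dimensions.
# Hypothetical toy data; not the study's actual coding output.

DURATION = 600  # a 10-min interaction

def to_seconds(intervals, duration=DURATION):
    """Boolean list: is the behavior active during each second?"""
    active = [False] * duration
    for onset, offset in intervals:
        for t in range(int(onset), min(int(offset) + 1, duration)):
            active[t] = True
    return active

annotations = {  # (onset, offset) pairs per dimension, in seconds
    "speech":  [(0, 3), (5, 8), (20, 25)],
    "action":  [(1, 6)],
    "gesture": [(21, 23)],
    "emotion": [],
    "touch":   [(7, 9)],
}

tracks = {dim: to_seconds(ivls) for dim, ivls in annotations.items()}

# Dimensionality: number of cues active during each second
dimensionality = [sum(tracks[d][t] for d in tracks) for t in range(DURATION)]

# Proportion of speech seconds accompanied by >= 1 non-speech cue
speech_secs = [t for t in range(DURATION) if tracks["speech"][t]]
overlap = sum(dimensionality[t] > 1 for t in speech_secs) / len(speech_secs)
```

With real ELAN exports, the same logic would apply after parsing each tier's onset and offset times.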

Assessing reliability
Approximately 20% of the total number of videos (n = 9) were fully coded by two independent researchers to assess inter-observer reliability (i.e., whether two independent coders observed the same behavior at the same point during the interaction). We first assessed reliability for our coding of caregivers' use of IDC (speech, action, gesture, emotion, and touch) using both Cohen's kappa (which corrects for agreements expected by random chance; Cohen, 1960) and percent agreement (i.e., the percent of total seconds for which our two coders agreed that an event had or had not occurred). These reliability metrics were computed overall as well as separately for each dimension and each video. Across all five dimensions, we found a Cohen's kappa of 0.88 (% agreement = 95.21%). When examined separately, Cohen's kappa was high, at or above 0.77 (e.g., Bakeman & Quera, 2011; Fleiss, 1981), and percent agreement was at or above 91.49% for all dimensions (Speech: κ = 0.89, % agreement = 94.39%; Action: κ = 0.84, % agreement = 91.49%; Gesture: κ = 0.79, % agreement = 95.41%; Emotion: κ = 0.81, % agreement = 96.87%; Touch: κ = 0.77, % agreement = 98.19%) and all subjects included in the reliability analysis (min: κ = 0.85, % agreement = 93.57%; max: κ = 0.92, % agreement = 96.59%). We also calculated an intraclass correlation (ICC) to assess the extent to which the two coders agreed on the proportion of the interaction during which each dimension occurred and found that this reliability metric was high as well (ICC = 0.99, p < 0.001).

This process was repeated for our coding of infant behaviors (i.e., vocalizations and gestures). The overall Cohen's kappa was 0.88 (% agreement = 96.32%). When examined separately, Cohen's kappa was high, at or above 0.70, and percent agreement was at or above 94.19% for both infant vocalizations (κ = 0.86, % agreement = 94.19%) and infant gestures (κ = 0.70, % agreement = 98.71%). Again, we found strong agreement in our coding of the proportion of the interaction during which each dimension occurred (ICC = 0.99, p < 0.001).
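Per-dimension reliability statistics of this kind can be computed from two coders' second-by-second binary codes. As a minimal sketch (toy data, not the study's coding records), Cohen's kappa discounts raw percent agreement by the agreement expected if both coders marked seconds at random given their base rates:

```python
# Sketch: second-by-second inter-observer agreement for one dimension.
# 1 = behavior coded as present during that second, 0 = absent.
# Toy codes for illustration; not the study's data.

def kappa_and_agreement(coder_a, coder_b):
    n = len(coder_a)
    agree = sum(a == b for a, b in zip(coder_a, coder_b))
    p_obs = agree / n  # raw percent agreement
    # Chance agreement from each coder's marginal base rate
    p_a = sum(coder_a) / n
    p_b = sum(coder_b) / n
    p_chance = p_a * p_b + (1 - p_a) * (1 - p_b)
    # Cohen's (1960) kappa: agreement beyond chance, rescaled
    return (p_obs - p_chance) / (1 - p_chance), p_obs

coder_a = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
coder_b = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0]
k, pct = kappa_and_agreement(coder_a, coder_b)
```

The ICC reported above is a different comparison: it operates on each video's summary proportions rather than on second-level codes.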
Mixed-effects analyses were performed using the lme4 package (Bates et al., 2015), and p-values were calculated using the lmerTest package (Kuznetsova et al., 2017). Though it is not straightforward to estimate standardized effect sizes for mixed-effects models (e.g., Brysbaert & Stevens, 2018), we used the effectsize package (Ben-Shachar et al., 2020) to approximate partial eta-squared and associated confidence intervals for each overall model. Pairwise comparisons and their associated effect sizes (approximated Cohen's d) were calculated using the emmeans package (Lenth, 2018), which uses a Tukey adjustment for multiple comparisons (Tukey, 1949).

Characterizing caregivers' use of IDC
Variation in caregivers' use of IDC. The proportion of the interaction during which speech, action, gesture, emotion, and touch occurred varied across caregivers and across dimensions. For example, as indicated by the individual data points in Figure 2, the caregiver who spoke the most did so for 84.80% of the interaction (or 8.48 min) while the caregiver who spoke least did so for 20.50% of the interaction (or 2.05 min). The caregiver who gestured the most did so for 36.83% of the interaction (or 3.68 min) while the caregiver who gestured least did so for 0.83% of the interaction (or 0.08 min). Additionally, a linear mixed-effects model (predicting the proportion of the interaction during which caregivers used each dimension from a fixed effect of dimension and a random intercept for subjects) confirmed that caregivers' use of IDC varied across dimensions, F(4, 172) = 223.12, p < 0.001, η²p = 0.84, 95% CI: [0.81, 1.00] (see Figure 2). While the proportion of the interaction during which caregivers were using emotion and gesture did not differ from one another (p = 0.429), differences between all other pairs were significant (ps ≤ 0.007).
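Concretely, the outcome variable in this model, i.e. each caregiver's per-dimension proportion of the interaction, can be derived from second-by-second presence codes. A minimal sketch under that assumption (dimension names follow the paper; the helper is ours):

```python
def dimension_proportions(codes):
    """Proportion of the interaction occupied by each IDC dimension.
    codes: dict mapping a dimension name to a list of 0/1 values, one per second."""
    return {dim: sum(seconds) / len(seconds) for dim, seconds in codes.items()}
```

For a 4-second toy interaction with codes {"speech": [1, 1, 0, 1], "touch": [0, 0, 0, 1]}, this returns proportions of 0.75 for speech and 0.25 for touch; one such value per caregiver per dimension feeds the mixed-effects model.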
Caregivers who used one dimension frequently did not necessarily use all other dimensions frequently, but we did find correlations between the use of some dimensions of IDC. In particular, as caregivers' use of speech increased, so did their use of emotion (r = 0.32, p = 0.035) and gesture (r = 0.54, p < 0.001). However, caregivers tended to gesture less if they used more action (r = −0.38, p = 0.010), possibly because both gestures and actions frequently (though not always) involve movement of the hands. No other correlations were significant (rs ≤ 0.19, ps ≥ 0.26); see Section S1 for additional details about each pairwise correlation.

Caregivers use multiple, overlapping dimensions of IDC. While the previous analyses focused on how much caregivers used each dimension of IDC, this approach does not capture an important feature of infants' everyday experience: caregiver-infant interactions are dynamic and unfold across time (see Figure 3). For example, as interactions unfold, caregivers often use multiple multimodal cues simultaneously. In our next analysis, we explored how frequently this occurred.
How often does caregivers' use of IDC involve the use of multiple dimensions at the same time?
Underscoring the importance of examining IDC, we found that caregivers' speech overlapped with other cues (e.g., action, gesture, emotion, and/or touch) 65% of the time on average (Figure 4a). However, even when caregivers were not speaking, non-verbal communicative cues were used 49% of the time (Figure 4b). To examine whether non-verbal cues occurred more often with speech or in silence, we ran a linear mixed-effects model predicting the frequency of multimodal cues from whether or not speech was also occurring. This model included a random intercept for subjects. Multimodal cues occurred frequently both with speech and in periods of silence, although multimodal cues accompanied speech (M = 64.90%, SD = 11.10%) significantly more often than they occurred in its absence (M = 48.90%, SD = 15.10%), β = 0.08 (SE = 0.01), p < 0.001, η²p = 0.69, 95% CI: [0.56, 1.00] (Figure 4c). While prior research has focused predominantly on speech as the driving force of communication (see Perniss (2018) for a review), the present results reveal that rich, multidimensional communication occurs about half the time even in the absence of speech. This means that communication is occurring even at moments when some might not consider "communication" to be happening (i.e., in the absence of speech). Multimodal, multidimensional cues are key features of infants' everyday communicative interactions.
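The overlap quantities above can be sketched as follows, assuming second-by-second 0/1 codes for speech and for each non-verbal dimension (a hypothetical helper, not the authors' pipeline):

```python
def overlap_with_speech(speech, nonverbal):
    """Proportion of speech seconds, and of silence seconds, that carry at least
    one non-verbal cue.
    speech: list of 0/1 per second.
    nonverbal: list of per-dimension lists of 0/1 per second (action, gesture, ...)."""
    # collapse the non-verbal dimensions: is any cue active at second t?
    any_nv = [any(dim[t] for dim in nonverbal) for t in range(len(speech))]
    with_speech = [nv for s, nv in zip(speech, any_nv) if s]
    in_silence = [nv for s, nv in zip(speech, any_nv) if not s]
    p_speech = sum(with_speech) / len(with_speech) if with_speech else 0.0
    p_silence = sum(in_silence) / len(in_silence) if in_silence else 0.0
    return p_speech, p_silence
```

For a toy 4-second clip with speech = [1, 1, 0, 0] and two non-verbal dimensions coded [1, 0, 0, 1] and [0, 0, 0, 0], half of the speech seconds and half of the silence seconds overlap with a non-verbal cue.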

Caregivers' use of multimodal cues is dispersed throughout the interaction.
We next asked whether caregivers' use of multimodal communicative cues was clustered in time versus spread throughout the interaction. The example time series plots in Figure 3 suggest that multimodal cues occur throughout the recording for all dyads. However, to quantify these visually observed temporal patterns, we used Deviation of Proportions (DP), which is a measure of the dispersion of an event (e.g., Gries, 2008).
To generate DP values, the recording session is first broken into a pre-specified number of parts (in our case, each part was 5 s of the recording). Then, the "observed frequency" is calculated (i.e., how often an event actually occurs in each part of the recording). Next, an "expected frequency" is calculated. This value indicates the expected number of occurrences in each part of the recording if the event was distributed equally throughout. These values are then normalized based on the total number of times the event occurs in the recording and, for each recording part, the absolute value of the difference between the expected and observed frequency is calculated.
The resulting DP statistic is generated by summing these difference values and dividing by two. DP values range from 0 to 1, with lower values (closer to DP = 0) indicating that the event's occurrence is dispersed relatively equally across the session (e.g., the "expected" and "observed" frequencies do not differ much), and higher values (closer to DP = 1) indicating that the event occurs more sporadically (e.g., the "expected" and "observed" frequencies differ substantially).
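The DP computation described above fits in a few lines; this is a hedged illustration assuming per-part event counts have already been tallied (function name ours):

```python
def deviation_of_proportions(part_counts):
    """DP (Gries, 2008): half the summed absolute differences between each part's
    observed share of events and the share expected under a perfectly even spread."""
    total = sum(part_counts)
    expected = 1 / len(part_counts)  # equal-sized parts -> equal expected share
    return sum(abs(count / total - expected) for count in part_counts) / 2
```

A perfectly even spread (e.g., counts of [1, 1, 1, 1] across four parts) yields DP = 0, while concentrating all events in one of four parts yields DP = 0.75, approaching the sporadic end of the scale.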
To calculate the DP value for each dyad, we split the 10-min recordings into 120 5-s parts¹. Within each 5-s part, we summed the number of seconds during which two or more communicative behaviors occurred simultaneously (resulting in a value of 1 through 5) and followed the steps above to calculate the dispersion of multimodal cues. Because the resulting values were closer to 0 than to 1, caregivers' use of multimodal cues appears to have been dispersed throughout the 10-min interaction rather than clustered in time. For complementary analyses on burstiness (e.g., Abney et al., 2018; Slone et al., 2023), see Section S3.

Caregivers' use of IDC increases in response to infant gestures and vocalizations
We have so far focused on caregivers' behavior during play. However, the caregiver is not acting alone, and their infant may play an important role in shaping caregivers' use of IDC (e.g., Albert et al., 2018; Elmlinger et al., 2019; Goldstein et al., 2010; Kovács et al., 2014; Lucca & Wilbourn, 2019). In our next set of analyses, we asked whether caregivers increased their use of IDC, specifically the number of overlapping communicative cues, when infants gestured or vocalized.
We first examined caregivers' use of IDC surrounding infant gestures. On average, infants gestured 7.64 times during the 10-min interaction (range = 0 to 28). We quantified how many overlapping multimodal cues caregivers used across three spans of time: (1) the three-second period before the onset of an infant gesture (pre-gesture region), (2) the three-second period after the onset of an infant gesture (post-gesture region), and (3) outside of these three-second regions (outside gesture region). When these regions overlapped, the segment was counted as post-gesture (though results were similar when the segment was removed entirely or counted as pre-gesture; see Section S4). We ran a linear mixed-effects model predicting the number of overlapping cues used by caregivers from the gesture region, as defined above. This model also included random intercepts and slopes for subjects. As can be seen in Figure 5a, caregivers used multimodal cues differently across the three regions.

Next, we focused on caregivers' responses to infant vocalizations. Infant vocalizations were more frequent than gestures; on average, infants vocalized 62.86 times (range = 16 to 105). As with infant gestures, we asked how many overlapping multimodal cues caregivers were using across three spans of time: (1) the three-second period before the onset of an infant vocalization (pre-vocalization region), (2) the three-second period after the onset of an infant vocalization (post-vocalization region), and (3) outside of these three-second regions (outside vocalization region). As with infant gestures, when these regions overlapped, the segment was counted as post-vocalization (but see Section S4 for alternative classification methods). A linear mixed-effects model predicting the number of overlapping multimodal cues used by the caregiver from vocalization region (and including random intercepts and slopes for subjects) found a significant effect of vocalization region, F(2, 43.43) = 5.39, p = 0.008, η²p = 0.20 (95% CI lower bound: 0.04; Figure 5b). Thus, as with infant gestures, caregivers increased the dimensionality of their communication in response to infant vocalizations. Note that, in contrast to gestures, caregivers seemed to use fewer dimensions of IDC at the moment that an infant vocalized. This may reflect turn-taking: caregivers may pause in order to encourage and/or listen to an infant vocalization.
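The region-classification rule described above (three-second pre and post windows, with overlaps counted as post) can be sketched as follows, assuming second-resolution codes and onset times in seconds; `label_regions` is a hypothetical helper, not the authors' code:

```python
def label_regions(n_seconds, onsets, window=3):
    """Label each second of an interaction as 'pre', 'post', or 'outside' relative
    to infant behavior onsets; where pre and post windows overlap, 'post' wins."""
    labels = ["outside"] * n_seconds
    for t0 in onsets:
        # pre-behavior window: the `window` seconds before onset,
        # but never overwrite a second already claimed by a post window
        for t in range(max(0, t0 - window), t0):
            if labels[t] != "post":
                labels[t] = "pre"
        # post-behavior window starts at onset and always takes precedence
        for t in range(t0, min(n_seconds, t0 + window)):
            labels[t] = "post"
    return labels
```

For a 10-second clip with onsets at seconds 3 and 5, the overlapping windows resolve to 3 pre seconds, 5 post seconds, and 2 outside seconds; caregivers' overlapping-cue counts can then be averaged within each label.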

Relations between caregivers' use of IDC and infants' language skill
We performed two final analyses to investigate how caregivers' use of IDC relates to infants' language, specifically their scores on the MacArthur-Bates Communicative Development Inventory (MCDI; Fenson et al., 2007). The MCDI is a parent-report measure of young children's language development.
In the first analysis, we compared the frequency with which caregivers used each dimension of IDC to their infants' scores on the MCDI at two time points: the time at which the play session took place, and 10 months later. For each time point, we compared two linear regressions (using the anova() function in R): (1) a "speech-only" model predicting infants' vocabulary size from the percentage of the 10-min interaction during which caregivers were speaking, and (2) a "full-idc" model predicting infants' vocabulary size from the percentage of the 10-min interaction during which caregivers were using speech, action, gesture, emotion, and touch (each entered separately). In all models, infant vocabulary size was standardized (z-scored) within each time point, with 0 indicating the average vocabulary size across participants. This model comparison approach allowed us to ask whether a full model that included all IDC dimensions (i.e., speech, action, gesture, emotion, and touch) improved model fit over a model predicting vocabulary size from caregivers' use of speech alone while, at the same time, enabling us to look at the influence of each dimension separately. These models were constructed as follows:

speech-only model: vocab_size ~ speech
full-idc model: vocab_size ~ speech + action + gesture + emotion + touch

Speech alone was not a significant predictor of infants' vocabulary size, neither when vocabulary was assessed at the time of the play session, β = −0.85 (SE = 1.06), p = 0.427, η²p = 0.02, 95% CI: [0.00, 1.00], nor 10 months later, β = −1.48 (SE = 1.11), p = 0.188, η²p = 0.05, 95% CI: [0.00, 1.00]. However, the full-idc model (including all five dimensions) significantly improved model fit over the speech-only model (see Tables 1 and 2). This effect was primarily driven by caregivers' use of emotion during the play session, both when vocabulary was measured concurrently (β = −7.10 (SE = 1.74), p < 0.001, η²p = 0.30, 95% CI: [0.12, 1.00]) and 10 months later (β = −6.38 (SE = 1.84), p = 0.001, η²p = 0.27, 95% CI: [0.08, 1.00]). The effect was also driven by caregivers' use of touch, which was related to vocabulary at the time of the play session (β = −6.04 (SE = 2.80), p = 0.037, η²p = 0.11, 95% CI: [0.00, 1.00]) but not 10 months later (β = −0.53 (SE = 3.19), p = 0.868, η²p = 0.001, 95% CI: [0.00, 1.00]). That is, caregivers generally expressed more emotion and used more touch with infants who had lower vocabulary scores (see Tables 1 and 2 and Figure 6), perhaps in a way that scaffolds their attention and learning (Nencheva & Lew-Williams, 2022).

Finally, we asked whether the temporal dynamics of caregivers' use of IDC (specifically, the extent to which overlapping communicative cues were clustered vs. spread out over time) were predictive of infants' vocabulary at either time point. We ran a linear regression predicting infants' vocabulary size from the DP value characterizing the dispersion of overlapping cues during the interaction with their caregiver. We found that dispersion was not predictive of vocabulary size assessed at the time of the play session, β = 0.014 (SE = 0.016), p = 0.383, η²p = 0.02, 95% CI: [0.00, 1.00]. However, infants who experienced more clustered input (i.e., had higher DP values) had larger vocabulary sizes 10 months after the play session, β = 0.039 (SE = 0.017), p = 0.022, η²p = 0.13, 95% CI: [0.01, 1.00] (see Figure 7).

TABLE 1 Coefficient estimates from the "speech-only" model predicting infants' vocabulary size at the time of the 10-min play session from speech only and the "full-idc" model predicting infants' vocabulary size at the time of the 10-min play session from speech, action, gesture, emotion, and touch.
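The anova() comparison of nested regressions used above boils down to an F test on the two models' residual sums of squares. A minimal pure-Python sketch of that statistic (function name and the numbers in the example are ours, for illustration only):

```python
def nested_model_f(rss_reduced, rss_full, df_reduced, df_full):
    """F statistic for comparing nested linear regressions, as R's anova() computes:
    the drop in residual sum of squares per extra parameter in the full model,
    scaled by the full model's residual variance. df_* are residual degrees of
    freedom, so df_reduced > df_full when the full model adds predictors."""
    numerator = (rss_reduced - rss_full) / (df_reduced - df_full)
    denominator = rss_full / df_full
    return numerator / denominator
```

For hypothetical residual sums of squares of 100 (speech-only) and 50 (full-idc) with residual degrees of freedom 42 and 38, the four added predictors yield F(4, 38) = 9.5; the improvement in fit is then judged against the F distribution with those degrees of freedom.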

DISCUSSION
By examining the dynamics of caregiver-infant interactions, we highlight the importance of characterizing the natural complexity of infants' everyday learning environments, involving the coordination of communicative cues across numerous modalities (e.g., Holler & Levinson, 2019; Perniss, 2018). That is, communication with infants is multidimensional, and focusing on only one or two dimensions of IDC undercuts the full richness of infants' real-world learning input. Our findings, scrutinizing caregiver-infant play, highlight that the majority of speech that infants hear is accompanied by one or more multimodal cues and that, even in the absence of speech, caregivers frequently communicate in parallel using non-verbal means. Furthermore, we found that these cues are used systematically in response to infants' own behaviors and tuned to their level of language proficiency, suggesting that the structure and timing of infant-directed communication may support early learning. These findings call attention to the stark differences between infants' real, everyday learning environments and the ways that learning is typically tested in controlled, often lab-based, studies.
While controlled studies provide insights that are crucial for elucidating potential mechanisms of early learning, research characterizing infants' real-world, everyday input is necessary for understanding how these mechanisms are actually instantiated in the environments in which infants are actively engaged.
We also found that IDC varied in multiple ways: across dimensions, between caregiver-infant dyads, and from moment to moment. Across dimensions, we found that caregivers use speech and action during the majority of the interaction, gesture and emotion throughout approximately one-quarter of the interaction, and touch more rarely. However, within individual dimensions, there was remarkable variability across caregivers and infants. For example, while on average caregivers gestured throughout 15% of the interaction, the frequency of gestures across individual dyads ranged from less than 1% to nearly 37%. This degree of variation is especially notable given that our sample consisted primarily of white, English-speaking, higher-income families in the United States. Thus, rather than focusing on invariance across individuals, we highlight the importance of centering variation (as opposed to averages) in the interpretation of our results and in considerations about how to extend the current research.
One prominent source of variation in IDC likely comes from caregivers' tailoring of communicative cues to their infant. Across a variety of domains, it has been demonstrated that caregivers vary their input in response to infants' behaviors and abilities (e.g., Elmlinger et al., 2019; Fukuyama et al., 2015; Gogate et al., 2000; Roy et al., 2009, 2015; Smith & Trainor, 2008). We also observed links between caregivers' use of IDC and infants' vocabulary (as measured by the MCDI).
Specifically, caregivers used more facial expressions conveying emotion with infants who had lower vocabulary scores, and the infants of caregivers whose use of multidimensional cues was more clustered in time (perhaps suggesting a more systematic or tailored use of multimodal behaviors) had larger vocabularies 10 months later. These analyses were limited, though, in that they focused only on the total amount of IDC that parents used during the interaction. More sensitive analyses would consider links between caregivers' use of multiple dimensions of IDC from moment to moment as infants encounter familiar and novel objects in their world and as infants' language skills develop across the first years of their lives. For example, when infants are on the cusp of learning a new word, caregivers tend to reduce the length of utterances containing that word (Roy et al., 2009, 2015), and caregivers increase the exaggeration of action sequences that their infants have the fine-motor skills to perform but are not yet performing independently (Fukuyama et al., 2015). Just as we found that caregivers increase the number of communicative cues they are using in response to infants' gestures and vocalizations, we might additionally predict that caregivers will use IDC systematically during periods of an interaction in which infants are engaging with novel objects or encountering a new word. This tailoring of IDC may help create optimal moments for learning. Research is currently underway to better understand how caregivers' use of IDC is fine-tuned to their infants' unique needs and abilities.
However, there are many more reasons why we might expect infants' everyday input to vary in ways that are beyond the scope of the current study.First, this interaction represented only 10 minutes of each infant's life during one specific context (i.e., play with objects).
Because context influences infants' input at a variety of levels, from micro contexts like the identity of interaction partners or the immediate situational context (e.g., daily routines) to macro contexts including political and economic systems in the area in which the child resides (e.g., Rowe & Weisleder, 2020), it will be important to extend beyond this short snapshot of communicative experience. Additionally, much of the investigation of variation in infants' everyday input across families and communities has focused on speech (e.g., Casillas & Cristia, 2019; Cristia, 2023; Tamis-LeMonda et al., 2019), and minimal work with diverse populations has examined non-speech dimensions of variation in infants' everyday communicative input (but see Abu-Zhaya et al., 2019; Depowski et al., 2015; Gabouer et al., 2020; Tamis-LeMonda et al., 2012). Examining the sources and correlates of variation in infants' everyday experience, across communities and cultures, in a variety of everyday activity contexts, and from moment to moment, can provide important insight into how infants learn in the natural, everyday environments in which they are actually doing this learning (Adolph, 2020; Casillas, 2023; de Barbaro, 2019; de Barbaro & Fausey, 2022; Sullivan et al., 2021).
When making inferences about infants' natural experience, it is also important to consider the strengths and limitations that arise from recording brief 10-min interactions over Zoom. One limitation of remote data collection is that we were only able to record from one, immovable camera angle, resulting in moments of the video that were not codable and restricting the interaction features that we were able to reliably detect (e.g., because infants moved around much more than caregivers, their facial expressions would have been extremely challenging to code from these videos).
Furthermore, while we have previously highlighted the homogenous demographics of the participants included here, the requirements for participation in remote studies (i.e., a reliable internet connection) likely narrowed the characteristics of our sample even more. On the other hand, our use of at-home Zoom recordings enabled us to see families interacting in their everyday environment and with objects that are familiar to them, which is a major strength of this method. Additionally, Zoom-based methods enable asynchronous recording of caregiver-infant interactions (which we have employed in ongoing research) and thus may both reduce the effects of having an experimenter present during the recording session while also enabling families to participate at times during which they would naturally engage in behaviors of interest to researchers (e.g., playing, feeding, or reading). In sum, any method of data collection has both strengths and challenges, and it is important to consider how to best measure behaviors of interest while acknowledging methodological limitations.
Finally, to enable further study of variation in caregivers' use of IDC, it is likely that our coding scheme will need to be continuously examined and revised. For example, as mentioned previously, one decision we made was to consider all behaviors "infant-directed," given that they were performed in a one-on-one interaction between a caregiver and their infant. An alternative approach may be to only consider behaviors that are "stereotypically" infant-directed (e.g., exaggerated, repetitive, etc.), as these may be the cues that are most likely to engage infants' attention and promote learning. However, caregivers vary in how they use infant-directed behaviors and in what meets the threshold for an exaggerated action or gesture. Additionally, infants appear to adapt to their caregivers' use of infant-directedness (at least in the domain of speech; Outters et al., 2020), and this may be yet another way that caregivers and infants tune in to each other and adapt their own behaviors over time. A second decision in the current coding scheme was to code speech, action, gesture, emotion, and touch as separate categories. However, it is unlikely that these dimensions are truly separable; a caregiver may say "Wow!" while making an exaggerated expression of surprise and pointing at an object. While our coding scheme would consider this caregiver to be using three overlapping communicative cues, it is more likely that these cues work together to produce meaning in context. Furthermore, it may be that the absence of a communicative cue is meaningful as well. For example, the meaning of "Wow" can vary depending on whether it is said with an exaggerated smile or with a neutral facial expression. Thus, future work should attempt to reconcile these issues to better represent the cross-caregiver, cross-community dynamics of infants' communicative experience.

CONCLUSION
Infants' everyday communicative environments are rich and multidimensional. Our investigation of IDC during natural, at-home interactions revealed that caregivers frequently use multiple communicative cues at the same time and that they increase the dimensionality of IDC in response to infants' gestures and vocalizations. This increase in dimensionality may be one way that caregivers implicitly adapt their behaviors to reinforce infants' engagement with their environment and generate optimal moments for learning. Attempts to isolate learning mechanisms by removing the complexity of the environment may be counter-productive to understanding how infants learn in the everyday interactions in which the majority of this learning occurs. Harnessing variability in multidimensional caregiver input provides important new insights into the true richness of infants' everyday social interactions and allows for the development of comprehensive, ecologically valid theories of early learning.
14677687, 2024, 5, Downloaded from https://onlinelibrary.wiley.com/doi/10.1111/desc.13515 by Princeton University, Wiley Online Library on [19/08/2024]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License
2. Next, we asked whether caregivers used more overlapping communicative cues in the regions surrounding an infant vocalization or gesture, relative to regions not surrounding an infant vocalization or gesture. If infants respond to higher dimensionality in caregivers' communication by vocalizing or gesturing, then we might expect the number of overlapping cues used by caregivers to be higher in the seconds before an infant behavior. In contrast, if caregivers respond to infants by increasing the dimensionality of their communication, then the number of overlapping dimensions would be higher in the seconds after an infant vocalization or gesture.
3. Finally, we engaged in exploratory analyses to investigate relations between caregivers' use of IDC and infants' language skill, as measured by the MacArthur-Bates Communicative Development Inventory (MCDI; Fenson et al., 2007).

FIGURE 2 Proportion of the 10-min interactions during which caregivers used each of the five dimensions of IDC. Averages are represented by larger, solid circles, and individual data points are represented by smaller, open circles. The solid lines span the range of scores for each dimension.

FIGURE 3 Time-series plots depicting three example caregivers' use of IDC across the 10-min interaction. The x-axis represents seconds as the interaction unfolds and each dimension of IDC is plotted in a different color.

FIGURE 4 Plots A and B show the proportion of speech (a) and silence (b) that overlaps with non-speech dimensions of IDC for each individual caregiver-infant dyad. Plot C combines these two plots to illustrate the average amount of non-speech cues that occur in silence (light gray bar) and with speech (dark purple bar); open circles indicate individual data points.

FIGURE 6 Relation between the proportion of the interaction during which each dimension occurs and infants' vocabulary size (as measured by the MCDI) at the time of the 10-min interaction (top row, solid lines, and circles) and 10 months later (bottom row, dashed lines, and open circles).
FIGURE 7 Relation between dispersion and infants' vocabulary size at the time of the play session (left panel) and 10 months later (right panel).
While there is existing evidence that short play sessions like these mirror peak interactions in longer, more natural contexts (e.g., Tamis-LeMonda et al., 2017) and that there are similarities between in-lab and at-home samples of behavior (e.g., Manning et al., 2020), it is unlikely that these brief interactions are fully representative of infants' natural experience. For example, our analysis of brief periods during which caregivers are aware of being recorded likely overestimates the amount of IDC that occurs as caregivers and infants go about their day. Though coding these behaviors requires substantial time and resources even in 10-min recordings, more densely sampled, longer recordings are needed to assess the true nature of IDC in infants' everyday experience and how it might vary over time.
Our characterization of infants' natural communicative environment as highly dynamic and multidimensional calls into question whether the learning mechanisms that we see evidence for in tightly controlled experiments truly extend to the everyday interactions in which infants are actually learning, and how these learning mechanisms may vary across contexts. Tightly controlled experiments intentionally remove the richness of true caregiver-infant interactions. For example, multimodality is often treated as "noise," and caregivers are usually removed from the equation to the extent possible (e.g., by wearing headphones and/or being prevented from seeing visual stimuli). This is problematic when the results of controlled experiments are taken as evidence for mechanisms of early learning that generalize beyond a single restricted task. Controlled experiments provide important insight into how infants can learn, but until we have improved characterizations of "learning in vivo" (Casillas, 2023) and better understand how learning mechanisms are instantiated in the real world, it may be problematic to generalize the results of controlled experiments to how infants actually do learn in their real, everyday learning environments.
TABLE 2 Coefficient estimates from the "speech-only" model predicting infants' vocabulary size 10 months after the play session from speech only and the "full-idc" model predicting infants' vocabulary size 10 months after the play session from speech, action, gesture, emotion, and touch.