Seven‐year‐olds' references to internal states when playing with toy figures and a video game

Abstract
References to internal states (e.g., thoughts, feelings, and desires) indicate children's appreciation of people's inner worlds. Many children spend time playing video games; however, the nature of children's speech when doing so has received little attention. We investigated the use of internal state language (ISL) as 251 seven‐year‐olds played with toy figures and a video game designed for the study. Although children used ISL more when playing with toy figures, children used ISL in both contexts, highlighting video game play as a context where children demonstrate their appreciation of inner worlds. Children's speech in the two contexts differed in how ISL was used: references to children's own internal states were more common when playing the video game, and the characters' internal states more common when playing with the toy figures. These findings are discussed with reference to the format of the play activities affording different opportunities to discuss internal states.
Highlights
In traditional play children refer to internal states; however, it is unclear whether this occurs when they play video games.
Children referred to internal states when playing with toy figures and a video game, but did so more with the toys.
Children's video game play can be used as a new context for the study of children's social understanding.

types of games played and characteristics of the children themselves (see Halbrook, O'Donnell, & Msetfi, 2019). Evidence is mixed regarding the influence of violent video game content on behaviour (Anderson & Bushman, 2018), although a recent meta-analysis did not demonstrate links between violent video games and long-term youth aggression (Drummond, Sauer, & Ferguson, 2020). Other evidence suggests children's aggressive choices in video game play can be predicted by well-known risk factors for aggressive conduct problems and angry aggressiveness in infancy (Hay et al., 2017), and engaging with video games prosocially is positively associated with empathy and prosocial behaviour (Harrington & O'Connell, 2016). However, there remain few studies that focus on the positive features of children's engagement with video game play (Bormann & Greitemeyer, 2015; Singer & Singer, 2013).
Children talk to themselves, that is, use private speech, while playing video games, as they do during free play (Søndergaard, 2013), and it is likely they would refer to internal states of characters in their speech while playing video games. Children themselves regard playing video games as an imaginative activity, similar to playing with toys (Downey, Hayes, & O'Neill, 2007). In common with traditional play, video game play takes place within fictional worlds (Lillard, 2013), and can evoke emotional reactions (Cairns, Cox, & Nordin, 2014), which reflect engagement, absorption, or immersion within the game's fictional world (Brown & Cairns, 2004; Harris, 2000; Liao & Gendler, 2011). Qualitative research highlights children's imagination as critical for becoming immersed in a virtual world (Søndergaard, 2013). Indeed, 6- to 8-year-olds make references to the cognitive states of computers (e.g., what computers "think" and "know"; Turkle, 1997). Children also refer to electronic toys, such as Tamagotchi or Furby, as possessing internal states (e.g., "he's sad" and "he wants it"; Francis & Mishra, 2009) and treat such toys as if they are living entities, more so than traditional toys (Plowman & Luckin, 2004; Subrahmanyam, Kraut, Greenfield, & Gross, 2001).
Although children may produce ISL in both contexts of play, video games and play with toys have different inherent properties and demands that may affect the nature of children's ISL. Differences in play materials and task structure are known to affect children's play (Trawick-Smith, Russell, & Swaminathan, 2010), including elements of their speech (Paine et al., 2019b; Krafft & Berk, 1998). For example, props such as toys and dress-up clothes prompt social interaction and imagination more so than more structured toys and activities, such as maths games and puzzles (Trawick-Smith et al., 2010).
We hypothesized that these inherent differences in the two types of play may influence the amount of ISL, the category of internal state that is referred to (e.g., cognitions, emotions, desires, etc.), and/or the referent (e.g., the child, the character, etc.) of the internal state being mentioned (Longobardi et al., 2014). Free play with toys is "open-ended," with children unconstrained in what they do. In contrast, video games are often "close-ended," analogous to games-with-rules, where players are often given a series of challenges to complete the game. Furthermore, the two types of play offer the child opportunities to take on different roles. In video games, children often experience the game from one perspective, that of a "virtual self" or avatar (Klimmt, Hefner, & Vorderer, 2009). In contrast, in traditional play with toy figures, children can engage in play as an actor, narrator, or manager of the play (Scarlett & Wolf, 1979). When acting, children are in the role of a character, whereas when narrating or managing (i.e., setting up the toys), children adopt a perspective that is "out" of the play (Giffin, 1984; Scarlett & Wolf, 1979). These different ways of playing are associated with pretend play and different uses of ISL.
Playing with toys in expected ways is positively associated with referring to cognitions and beliefs, and playing with toys in creative ways is associated with using these terms as well as pretence in play (Howe et al., 2014; Howe & Bruno, 2010), whereas simply setting up objects and toys is associated with less pretence in play and fewer references to preferences and beliefs (Howe et al., 2014; Howe & Bruno, 2010).
Despite expecting differences in children's speech across the two types of play, we also hypothesized there would be some consistency in the use of ISL when playing with toy figures and when playing video games. Children's use of ISL has been shown to be consistent over time (Carr, Slade, Yuill, Sullivan, & Ruffman, 2018) and across different contexts (Longobardi et al., 2014). However, such consistency could be explained by other child-related factors associated with children's ability to understand the minds of others. For example, it is well established that children's age and verbal ability are related to markers of children's social understanding, including their ISL in conversation (de Rosnay & Hughes, 2006; Slade & Ruffman, 2005). Children's higher-order processes that control thought and behaviour (i.e., executive functioning, such as working memory) share a similar developmental timetable to children's developing social understanding, and are related to social understanding skills in early to middle childhood (Paine et al., 2019a; Bock, Gallaway, & Hund, 2015; Carlson & Moses, 2001). The family environment is also important. Children in a family at a socioeconomic disadvantage may have less opportunity for the types of discourse (e.g., reflective discourse) that promote the development of social understanding (Cole & Mitchell, 2000). In addition, children's early conversational environment, in terms of their caregivers' references to mental states, is associated with children's social understanding (Ensor, Devine, Marks, & Hughes, 2014; Meins et al., 2002). It is also possible that children's verbal engagement with each context of play, as well as their existing preferences and experience with different types of toys and technology, may influence their propensity to reflect on inner states.
Therefore, we tested whether these characteristics of the child and family were associated with the use of ISL in virtual environments and traditional toy play, and asked whether these other factors explained any individual differences in the use of ISL across the two contexts.

| AIMS OF THE STUDY
In middle childhood, children engage with and enjoy play with toy figures and video games (Case-Smith & Kuhaneck, 2008; Ofcom, 2019). Although it is well-established that free play is a rich context for ISL in middle childhood (Paine et al., 2019a), the nature of children's speech as they engage with video games has received little attention. As such, in the context of a prospective longitudinal study, we investigated seven-year-olds' propensity to refer to internal states during solitary play with toy figures and as they played a video game. We aimed to: (1) establish whether children used ISL in both types of play; (2) examine the total use of ISL in the two contexts; (3) explore the different categories and referents of ISL in children's speech in the two types of play for descriptive purposes; and (4) investigate whether children's overall propensity to use ISL was consistent across both types of play while controlling for factors known to be associated with social understanding, including working memory (Bock et al., 2015; Carlson & Moses, 2001), verbal ability (de Rosnay & Hughes, 2006; Slade & Ruffman, 2005), risk for socioeconomic disadvantage (Cole & Mitchell, 2000), and their mothers' propensity to refer to internal states (Ensor et al., 2014). Based on the literature reviewed, we hypothesized that there would be some consistency in the use of ISL between the two contexts (Carr et al., 2018; Longobardi et al., 2014), but we also expected subtle differences to be present in the use of ISL due to the different properties of the two contexts (Howe et al., 2014; Howe & Bruno, 2010; Trawick-Smith et al., 2010).
the Gwent Healthcare Trust, United Kingdom. Midwifery teams also granted the recruitment team access to specialist prenatal clinics for medical problems and outreach services for vulnerably housed pregnant women, which enhanced the representativeness of the sample.
No exclusion criteria were set, either for the recruitment during the pregnancy or after the baby was born, except in the case of miscarriage, the infant's death, or the infant's experience of health problems so severe that it would not be possible to participate in the study. Translators were employed for families whose native language was not English or Welsh, or for participants who had impaired hearing.
Data collection took place in pregnancy (Wave 1) and at a mean of 6, 12, 21, and 33 months postpartum (Waves 2-5, respectively). The final assessment (Wave 6) is the focus of the present paper, and took place when the children were between 6.5 and 7.5 years of age (mean 83 months). The sociodemographic characteristics of the sample recruited into the CCDS are displayed in Table 1. The CCDS was found to be nationally representative when compared to a subsample of firstborn children in the Millennium Cohort Study, the most recent national birth cohort study in the United Kingdom (Kiernan, personal communication, 2009).

| Participants
The middle childhood assessment (Wave 6) took place at a target age of 7 years (M = 6.96, SD = 0.38). Of the original 332 families recruited, 22 families had withdrawn from the study and one family had never been traced, leaving 309 (93% of those recruited) remaining in the sample. Of those, 287 (93%) provided data at the middle childhood assessment, 272 being seen in the home and 15 only completing questionnaires. The present analyses focus on 251 children (92% of those seen in the home; M = 6.95 years, SD = 0.38) who had completed a free play activity with Playmobil figures and played a novel video game designed for the study; 111 girls (44%) and 140 boys (56%) did so (this gender composition reflects the fact that more boys [57%] than girls [43%] are in the full [N = 332] CCDS sample). Six children refused to complete at least one of the activities; one did not play the game due to time restrictions; one did not complete any child testing; and one could not be assessed on these tasks due to a severe developmental delay. One family withdrew their data after the assessment, and one session took place in a language other than English or Welsh for which no translation was available. In 10 cases, technical problems resulted in data being unavailable for at least one of the tasks. The 251 children in the present sample were not significantly different from the original N = 332 with respect to sociodemographic adversity scores (p > .05).
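The sample flow reported in this paragraph can be checked with a short tally. The sketch below is illustrative only: the variable names and dictionary keys are hypothetical labels for the counts given in the text, and it assumes that the listed reasons for missing data are mutually exclusive.

```python
# Counts as reported in the Participants paragraph; names are illustrative labels.
recruited = 332
withdrawn = 22
never_traced = 1
remaining = recruited - withdrawn - never_traced  # families still in the sample

seen_at_home = 272
questionnaires_only = 15
provided_data = seen_at_home + questionnaires_only

# Reasons children seen at home are absent from the analysis sample
# (assumed here to be mutually exclusive).
exclusions = {
    "refused at least one activity": 6,
    "time restrictions": 1,
    "no child testing completed": 1,
    "severe developmental delay": 1,
    "family withdrew data": 1,
    "no translation available": 1,
    "technical problems": 10,
}
analysed = seen_at_home - sum(exclusions.values())


def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


print(remaining, pct(remaining, recruited))
print(provided_data, pct(provided_data, remaining))
print(analysed, pct(analysed, seen_at_home))
```

Under that assumption, the exclusions account for every child seen at home who is not in the analysis sample.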

| Procedure
At the middle childhood assessment, families were visited at home for two 2-hr assessment sessions. The primary caregiver (97% mothers) completed interviews with one researcher while the firstborn child completed a battery of cognitive, social, and emotional tasks with a second researcher. These tasks included a battery of age-appropriate social understanding tasks that used Playmobil figures: second-order false belief, social information processing, Machiavellian intelligence, and simple deception tasks (Paine et al., 2018; Christie & Geis, 1970; Leekam & Prior, 1994; Quiggle, Garber, Panak, & Dodge, 1992). These tasks were followed by a free play activity with Playmobil figures and then the video game, which were the focus of the present study. If necessary, a third researcher kept any younger siblings occupied to prevent disruption of the interviews or assessments. At the end of the final session, the focal child, caregiver, and any siblings present then took part in a series of interaction tasks. At the end of the session, the child was given a £10 book voucher, and the caregiver was given a £20 gift voucher.

| Materials
Free play with Playmobil figures. In the first home visit, after completing the social understanding tasks, the children were given an opportunity to play with Playmobil figures in any way they would like on their own for 3 minutes (see
a first-person perspective game that was modified from the commercially available game The Elder Scrolls V: Skyrim (Bethesda, 2011) using freely available modification tools. The commercial game was modified to create a narrative that consisted of 11 "scenes" portraying the child on a school trip to a castle with their teacher and classmates, identified by the red school sweatshirts they wore, and encountering children wearing blue sweatshirts from a rival school (see Figure 1). This modification allowed us to present children with the same series of emotional challenges that might provoke prosocial behaviour, fear-related behaviours, or aggressive responses using a mallet that their character had been given at the start of the game (for further details of these specific challenges, see the supplementary materials for a detailed narrative of the game; a video demonstration of the CAMGame can be found at https://youtu.be/SpixvsHypg8).
Children also completed the CAMGame during the first middle childhood home visit. Children were told that in the game they would take part in a school trip to the castle; they had to stop and listen to the characters in the game to find out where to go and what to do. The children's speech and faces were video-recorded using the webcam on an Alienware™ laptop. Children played using an Xbox™ controller with the right trigger coloured purple and the left analogue stick coloured white to correspond with instructions given by the game. The researcher explained how to use the controller before the child began playing the game, with reminders given as a part of the narrative. Children varied considerably in how long they took to complete the game, taking on average 19 min and 5 s (SD = 5 min and 45 s, range = 8 min and 30 s to 41 min and 45 s).

| Measures
Children's talkativeness. Video recordings of children's speech while playing with the Playmobil figures and the CAMGame were transcribed into 5-sec segments. Trained translators transcribed tasks that took place in Welsh. Time segments in which children were not playing the game due to a technical error, were repeating a part of the game that they had already completed, or were not engaged in either of the tasks, were excluded from all analyses, including calculations of talkativeness scores and task length. A proportional measure of the child's talkativeness was generated by dividing the number of 5-sec segments in which the child spoke by the total number of 5-sec segments of the length of the task, yielding a score between 0 and 1. Any instances of non-word vocalizations that were not sound effects were excluded from these calculations. This time sampling measure of talkativeness has been validated as a measure of how much someone talks using Audacity voice analysis software on a subsample of video records, which yielded a measure of the precise duration of speech (Roberts et al., 2013).
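The proportional talkativeness score described above can be sketched as follows. This is a minimal illustration, not the study's actual scoring script: the function name, the boolean per-segment flags, and the example data are all hypothetical.

```python
def talkativeness(spoke_flags, valid_flags=None):
    """Proportion of valid 5-sec segments in which the child spoke (0 to 1).

    spoke_flags: one boolean per 5-sec segment, True if the child spoke.
    valid_flags: one boolean per segment, False for segments excluded from
        analysis (e.g., technical error, replayed section, off-task time).
    """
    if valid_flags is None:
        valid_flags = [True] * len(spoke_flags)
    if len(spoke_flags) != len(valid_flags):
        raise ValueError("spoke_flags and valid_flags must have equal length")
    # Excluded segments are dropped from both the numerator and the denominator.
    kept = [spoke for spoke, ok in zip(spoke_flags, valid_flags) if ok]
    if not kept:
        raise ValueError("no valid segments to score")
    return sum(kept) / len(kept)


# Hypothetical 40-second stretch of play: eight 5-sec segments,
# with the fourth segment excluded.
spoke = [True, False, True, True, False, False, True, False]
valid = [True, True, True, False, True, True, True, True]
score = talkativeness(spoke, valid)
print(round(score, 3))
```

Because excluded segments leave both the numerator and the denominator, the score stays comparable across children whose tasks differed in length.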
Children's references to internal states. The transcripts of children's speech as they played with the Playmobil and played the CAMGame were coded for the use of ISL using a coding scheme previously designed for the study (Paine et al., 2019a). This coding scheme was derived from Roberts and colleagues' (2013) scheme, which was based on Bartsch and Wellman's (1995) belief-desire categorization of the theory of mind. Each 5-sec segment was considered in terms of whether it contained an utterance that included the use of ISL. The coding scheme categorizes speech that includes references to the following internal state categories: perception, physiology, preference, intention, desire, emotion, and cognition. The referent of each use of ISL was coded as referring to an inner state of the self, the characters (game characters or Playmobil figures), or others (e.g., the experimenter) (see Table 2).
TABLE 2
Perception. Comments made about the perception of an object using one of five senses, such as "see," "hear," "feel," "taste," "smell." Examples: "That one looks like…"; "I didn't see it"; "She heard it stop again"; "And the teacher sees"; "Can you see it in the camera?"; "I'm looking at the floor"; "What are you looking at me for?"; "Did you see that guy?"
Physiology. Comments made about physical states and sensations, including "sleepy," "pain," "hot/cold (as in temperature)," "sick," "comfy." Examples: "It really hurts"; "They're feeling tired"; no instances of physiology to other occurred; "Ooh I've got pins and needles"; "Am I actually hurting my own people?"; "Did you die in this one"
Preference. Comments made about positive or negative judgements of an object, action or experience. Preference terms include "like," "hate," "love," "favourite," "enjoy," "interest." Examples: "My favourite colour is pink and blue"; "I like this"; in play voice: "Kate's the best"; "Do you like that thing that's there?"; "I don't like this
Desire. Comments made about longing for an object, action or experience. Desire terms include "want," "wish," "hope," "fancy," "rather," "need (as in want)." Examples: "I wish you can buy everything for free"; "I want that one"; in play voice: "Actually I don't want to…"; "And he wants to sit there"
Emotion. Comments made about feeling states, including basic emotions "happy," "sad," "surprised," "disgusted" and variations like "fed up," "bored," "glad," "excited." Examples: "That was disgusting"; "Mum and teacher are happy"; "They are kind of sad"; no instances of emotion to other occurred; "I'm a bit scared"; "He's scared, he's a scaredy cat"; "Would you be scared if you was only a school person going here?"
Cognition. Comments made about beliefs and knowledge. Also include general terms indicating other cognitive activity, such as "remember," "imagine," "pretend," "understand." Examples: "I'm gonna pretend these chairs are here"; "I think that goes here"; "Kate does not know where it is"; "Then they thought"; "And imagine she buyed another"; "I probably know
All the items contributed to a single component which explained approximately 77% of the shared variance, with positive scores indicating a higher than average level of adversity at the time of the child's birth.
Children's early conversational environment. At the early infancy home assessment (Wave 2), mothers and the focal child (M age = 6.64 months, SD = 0.88) had been given a topic sharing task using an activity board: a plastic toy with folding flaps and images of cartoon animals. Mothers were asked to show the toy to the infant for 2 min, and this interaction was video recorded, transcribed, and coded for mothers' references to ISL using the same coding scheme described above (see Table 2; Paine et al., 2019a). Inter-rater reliability was established between two independent coders on 65 (31.4%) of mothers' references to ISL in early infancy (median ICC = .99). Data were available for 222/251 (88%) families in the present sample. Fourteen families did not take part in this assessment, and data were not available for 15 of the families for this task.
Caregivers' reports of children's play with toy figures and video games. As part of a general interview about the child during the middle childhood assessment, caregivers were asked to identify "what kind of things does [child's name] like to do?" from a list of 18 activities. Two variables were derived from the caregivers' reports: (1) a composite variable measuring liking to play with toy figures (including Playmobil figures, action figures, toy vehicles, and dolls) and (2) liking to play video/computer games. Both variables were dichotomous, where 0 indicated not liking to do the activity and 1 liking to do the activity. Data on these measures were available for all children.
Verbal ability. In session 1 of the middle childhood assessment, children's receptive vocabulary was assessed using the British Picture Vocabulary Scale (BPVS; Dunn & Dunn, 2009). BPVS data were available for 247 (98%) of the subsample for this paper: three children did not complete the task and one child was not testable.
Working memory. Also in session 1 of the middle childhood assessment, children completed the Visuo-Spatial Sequencing (VSS) task from the Amsterdam Neuropsychological Tasks (ANT; de Sonneville, 1999). Children were shown nine circles arranged in a square matrix on the computer screen. Following a beep, an animated hand on the screen pointed to a sequence of the circles that gradually increased in the number of targets and the complexity of the sequence. The child was asked to replicate this sequence of circles; the total number of correctly identified targets in the correct order measured their working memory (Paine et al., 2019a). Data on this task were available for 237 children: two children refused to participate in the task and two were not testable; seven did not complete the task due to time constraints or incomplete visits; and, for three, data were unavailable due to technical errors.

| Data analysis
We first describe features of children's speech in the two contexts of play in terms of the overall use of ISL, and the categories and referent of internal states. We have used data that reflect the proportion of the use of ISL based on the task length, not absolute frequencies, to account for differences in how long children played with the toys and the video game.
However, descriptive data regarding absolute frequencies of the use of ISL are presented in Table 3. Because the data were not normally distributed, nonparametric analyses (Wilcoxon signed-rank tests, Spearman's Rho) were used for analyses involving the ISL variables and child characteristics. Significant associations (p < .05) were carried forward and used as control variables for subsequent regression analyses. The proportion scores for children's ISL in the CAMGame were transformed for use as the outcome variable in the regression using the logit transformation function in R (R Core Team, 2018), as this is the preferred method of improving the normality of proportional data (Warton & Hui, 2011).
TABLE 4 Means, standard deviations, and intercorrelations of overall internal state language variables and relevant child characteristics
Note: * p < .05; ** p < .01. Means and standard deviations for ISL variables are based on proportions of task length. All correlations are Spearman's rho.
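The logit transformation of proportion scores mentioned in the data analysis can be sketched in Python. The source states only that a logit function in R was used, so the details here are assumptions: the function name is hypothetical, the proportions are invented for illustration, and the small constant `eps` added to numerator and denominator is one common way of handling proportions of exactly 0 or 1, not necessarily the adjustment the authors applied.

```python
import math


def logit(p, eps=0.0):
    """Logit of a proportion: log((p + eps) / (1 - p + eps)).

    eps is an optional small constant (an assumption of this sketch) that
    keeps the transform finite when p is exactly 0 or 1.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    numerator = p + eps
    denominator = 1.0 - p + eps
    if numerator <= 0.0 or denominator <= 0.0:
        raise ValueError("p of 0 or 1 requires eps > 0")
    return math.log(numerator / denominator)


# Hypothetical ISL proportion scores (share of 5-sec segments containing ISL).
proportions = [0.0, 0.02, 0.05, 0.10]

# One possible choice of eps: the smallest non-zero proportion in the data.
eps = min(p for p in proportions if p > 0)
transformed = [logit(p, eps) for p in proportions]
print([round(value, 2) for value in transformed])
```

The transform stretches values near 0 and 1, which is why it is often preferred for skewed proportion data before linear regression.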
| Children's use of ISL in play with toys and the video game
ISL categories. Figure 2 displays the relative proportions of ISL categories in the two play contexts; descriptive statistics for the use of ISL according to the category are presented in Table 3. When playing with Playmobil and when playing the CAMGame, children referred to cognitions more than any other internal state category (M = .03, SD = 0.04 for the Playmobil free play activity; and M = .03, SD = 0.02 for the CAMGame). The descriptive statistics presented in Figure 2 and Table 3 indicate subtle differences in the number of references to ISL categories between the two contexts; however, due to the low proportion scores in certain ISL categories, further analyses were not conducted.
The referent of children's ISL. Table 3
Table 4 presents the means, standard deviations, and intercorrelations between the total use of ISL (as a proportion of the length of the activity) in each context and child and family characteristics. Two hundred and six (82.1%) of the children were reported as liking to play with toy figures, and 220 (87.6%) were reported as liking to play with video/computer games, confirming that the two types of play studied were relatively popular in this sample. As these variables were at ceiling, they were not included in the present analyses. Children's use of ISL was not significantly associated with family adversity or the early conversational environment. Nor was ISL associated with the child's age, gender, or verbal ability. Children's working memory scores were associated with their references to internal states during the CAMGame, but working memory was not correlated with ISL during play with the toy figures.
FIGURE 2 Relative proportion of the use of internal state language categories in the two contexts of play

| Consistency in the use of ISL in play with toys and the video game
Children's total references to internal states when playing with toys and while playing the CAMGame were positively correlated. Children's overall talkativeness was consistent across contexts, and was positively associated with ISL in both contexts (all ps < .05, see Table 4). Therefore, a mean talkativeness score was computed to indicate children's propensity to talk, and this score was used in subsequent analyses.
To investigate whether children's overall propensity to use ISL was consistent across both types of play, we entered children's ISL during free play as a predictor in a regression analysis, with children's ISL in the CAMGame as the outcome variable; covariates of children's ISL (mean talkativeness and working memory) were controlled. When these variables were entered into the analysis, only children's talkativeness (β = .72, p < .001, 95% CI = [1.99, 2.70]) significantly predicted children's ISL in the CAMGame in the final model (see Table 5). Children's use of ISL while playing with Playmobil figures did not account for significantly more variance than the child factors alone, as it did not represent a significant step in the model, F(1, 229) = .07, p = .067, ΔR² = .01.
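The step test reported above compares a model containing only the covariates with a model that adds ISL during free play. A generic sketch of the standard F-change computation for nested regression models follows. The function name and all input values are hypothetical; in particular, the R² values are invented, and n = 233 is chosen only so that the degrees of freedom match the (1, 229) reported in the text, so the output does not reproduce the paper's result.

```python
def f_change(r2_reduced, r2_full, n, k_full, k_added):
    """F statistic for the R-squared change between nested linear models.

    r2_reduced: R-squared of the model without the added predictors.
    r2_full: R-squared of the model with them.
    n: number of observations.
    k_full: number of predictors in the full model (excluding the intercept).
    k_added: number of predictors added in the step.
    """
    df1 = k_added
    df2 = n - k_full - 1
    if df2 <= 0:
        raise ValueError("not enough observations for the full model")
    if not 0.0 <= r2_reduced <= r2_full < 1.0:
        raise ValueError("expected 0 <= r2_reduced <= r2_full < 1")
    f_stat = ((r2_full - r2_reduced) / df1) / ((1.0 - r2_full) / df2)
    return f_stat, df1, df2


# Hypothetical inputs: three predictors in the full model
# (talkativeness, working memory, ISL during free play), one added in the step.
f_stat, df1, df2 = f_change(r2_reduced=0.50, r2_full=0.51, n=233, k_full=3, k_added=1)
print(round(f_stat, 2), df1, df2)
```

A small F relative to its degrees of freedom indicates that the added predictor explains little variance beyond the covariates, which is the pattern the text describes.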
With respect to the referent of the internal state attribution, children's references to their own, r_s(251) = .26, p < .001, the characters', r_s(251) = .18, p = .005, and other people's, r_s(251) = .14, p = .033, internal states were positively associated across the two contexts. Because the association between children's total use of ISL in the two contexts was explained by children's talkativeness scores, these univariate analyses were not explored further.
context for ISL (Leach et al., 2017) and extend this to include play in virtual environments. Our findings corroborate earlier qualitative research demonstrating that children narrate their video game play (Søndergaard, 2013), and show they refer to the inner worlds of purely virtual characters when doing so. Our findings support earlier claims that video games can be considered a form of play that children engage with in a similar way to more traditional play with toys (Lillard, 2014; Singer & Singer, 2005).

| Children's use of ISL when playing with toy figures and video games
Children referred to more internal states when playing with Playmobil figures than the video game, and there were also differences in how ISL was used in each context. In line with research on mothers' use of ISL in different contexts (Carr et al., 2018; Howe, Rinaldi, & Recchia, 2010), and studies of other elements of children's speech (Paine et al., 2019b; Krafft & Berk, 1998), our findings indicate that differences in the structure and presentation of play activities afford different opportunities to discuss internal states. Children referred to characters' internal states more often when playing with toy figures, and referred to their own internal states more often during the video game. The open-ended format of the play with toys afforded the children the opportunity to engage in different types of play, including telling stories that mentioned characters' internal states (e.g., "They want to visit the animals"; "Alex wants the ball"). They might also enact character roles using the figures (e.g., "[In play voice] I hate bedtime").
Our findings indicate that there also may be differences in the categories of ISL used in the different contexts of play. The unscripted free play session may provide children more opportunities to discuss their own desires, particularly in terms of the way in which they played and preferred to set up the toys (e.g., "I wanna put these on the washing line"). In contrast, when playing the video game, children were restricted in their possible actions by the constraints and rules of the game (Valkenburg, 2001), and so may have been less likely to express desires that could not be met. However, children may have referred to their emotions and perceptions more often when playing the video game than when using the Playmobil figures for several reasons: because the presentation of the game was perceptually engaging (e.g., "I hear someone from the blue school"); because some of the challenges built into the game involved searching for items (e.g., "I see the magic statue!"); or because the graphics and narrative of the game elicited emotional responses (e.g., "I'm a bit scared what's come, cause I've got loads of monsters in here").
The pre-scripted nature of the virtual characters in the video game may have limited opportunities for children to speak about characters' internal states; however, the first-person perspective and challenging nature of the CAMGame may have promoted private speech (Fernyhough & Fradley, 2005), particularly in the children's commentary on their ongoing activity (e.g., "I think I see something"; "I think I'm just gonna go"). Given the story-based nature of the game, much of this speech may also reflect children's interactions with the characters (e.g., "Blue boys they're stupid, stupid, I'm going to get ready to hit you"; "Where do you think you're going? I'm scared."). Evidence from studies exploring associations between stories and social cognition (both from storybooks and other narrative media; Mar, Tackett, & Moore, 2010; Mar, 2018) suggests that fictional narratives afford opportunities to "abstract and simulate the social world" (Mar & Oatley, 2008, p. 185). Indeed, evidence suggests that in-game storytelling is positively associated with social understanding; this association, it is suggested, may occur via players' opportunities to engage in virtual social interaction (Bormann & Greitemeyer, 2015; Mar, 2018). These processes and potential links with social understanding, however, require further investigation.
It is notable that, when coding children's speech while playing the video game, we used a conservative method to distinguish between the attributions of internal states to the self versus the avatar controlled by the child. We could only reliably code references to one internal state for the avatar, the avatar's physiology (e.g., "That might hurt me"), as this was the only category of ISL in which there was no possibility that the child meant to attribute the internal state to the self. Such speech could be interpreted as children integrating themselves with the avatar, resulting in the referent being difficult to separate in their use of language (Klimmt et al., 2009). Therefore, it is possible that some of the references made to children's own internal states were in fact attributions to the avatar.

Limitations of the study
Our study has limitations. The experimenter's presence during both forms of play could have affected children's use of ISL. However, the evidence is mixed regarding whether the presence of adults leads to differences in the frequency and quality of children's speech during play, depending on their level of involvement (Howe & Bruno, 2010; Krafft & Berk, 1998). In our study, experimenters were advised to engage with the play only at the child's request; although adherence to this guidance may have differed across experimenters, our analyses indicated no differences between experimenters in their speech or use of ISL. Nevertheless, future research could make use of unobtrusive data recording methods to record children's speech during their play in the absence of an experimenter or observer.
Furthermore, we only investigated children's use of ISL during solitary play. ISL is most often studied in the context of social interaction. Solitary imaginative play activities have been argued to be social in that they are performances to real, or imagined, others (Piaget, 1962). In the present study, the experimenter's presence may have provided an audience for the child's performance. Additionally, our findings corroborate previous work on solitary play that found similar qualities in children's private speech when playing alone (Davis et al., 2013; Krafft & Berk, 1998). Our work extends these findings to include both playing with toys and with video games. Future research could investigate children's use of ISL when engaging in and negotiating to play video games with a peer, as opposed to more commonly studied social play with toys (Howe et al., 2014; Leach et al., 2015).
Finally, order effects may have contributed to the differences found in children's use of ISL between the Playmobil free play activity and the video game. The social understanding tasks, the free play activity, and the CAMGame were presented in a fixed order within the task battery, and it is possible that the stories told in the social understanding tasks may have primed patterns of ISL in the free play activity, which in turn may have prompted the patterns of ISL present in the CAMGame. Although counter-balancing the order of presentation of these tasks would have been ideal, these activities formed part of a larger battery of tasks, including the covariates included in the present analyses, for the time-intensive home visit. The order of tasks was devised to ensure that data were collected efficiently and that children's interest was sustained throughout the visit. However, the possibility of order effects calls for caution in the interpretation of the present findings and warrants future research that can address this issue.

CONCLUSIONS
In summary, we have demonstrated that children's video game play can be used as a new context for the study of children's references to internal states. Children used ISL in both virtual and traditional play contexts, but their speech about internal states in the two contexts differed in terms of the nature and referent of the inner states. Our results have implications for parents, teachers, and researchers seeking to foster children's developing social understanding by encouraging conversations about mental states (Bianco, Lecce, & Banerjee, 2016). We have found that children do indeed refer to internal states when playing video games, and therefore this popular activity could be targeted to support children's social understanding by promoting their use of ISL. At a time when concerns are being expressed by parents and policymakers about children's screen-based activities, our findings show that children demonstrate their social understanding and imaginative skills when playing video games, just as they do when they are engaged in more traditional forms of play with toys.