Longer looking to agent with false belief at 7 but not 6 months of age

Abstract Theory of mind refers to the ability to reason about others' beliefs and to understand others' behaviour in terms of those beliefs. A large body of previous research has examined theory of mind reasoning in young children using false belief tasks, but tasks to examine this capacity during infancy have only been developed more recently. This research used stimuli developed by Kovács, et al., Science, 2010, 330, 1830–1834, to measure looking time to an agent with a false belief in infants aged 6 and 7 months. Using an eye‐tracking procedure, we found looking behaviour consistent with 7‐month‐olds distinguishing an agent who has a false belief from one who has a true belief, consistent with the results reported by Kovács et al. We did not find evidence of this looking preference among 6‐month‐olds.


| INTRODUCTION
The ability to reason about others' beliefs and mental states is termed theory of mind (Premack & Woodruff, 1978).
An understanding of others' mental states, including beliefs, desires, and goals, is believed to allow one to understand and to predict others' behaviour (Baron-Cohen, Leslie, & Frith, 1985).
In the last 35 years, the development of theory of mind was measured via explicit responses to false belief (FB) tasks (Baron-Cohen et al., 1985;Premack & Woodruff, 1978;Wimmer & Perner, 1983). FB tasks typically require children to predict the behaviour of a character who has a FB ( Baron-Cohen et al., 1985;Wimmer & Perner, 1983) or to predict the belief of a character who has only partial knowledge of a situation (Perner, Frith, Leslie, & Leekam, 1989). Given such tasks that require an explicit response to a verbal query, children typically show evidence of theory of mind development by 4 years of age (see Wellman, Cross, & Watson, 2001 for a meta-analysis).
More recently, implicit FB tasks that rely on looking time as a dependent variable have shown evidence of theory of mind reasoning in younger children and infants. Onishi and Baillargeon (2005) measured 15-month-old infants' looking time using a violation-of-expectation (VOE) task. In VOE tasks, there are typically two phases: first, the infant is familiarized with the scene, and second, the infant sees one of two outcomes, either surprising or unsurprising given the infants' expectations. Infants' looking behaviour is used as the outcome measure in these tasks. In Onishi & Baillargeon's, 2005 study, using three familiarization trials, an experimenter familiarized an infant with the scenario by reaching for a toy in a number of locations. Then, during the test trial, the infant saw either a true belief (TB) scenario or a FB scenario: in TB test trials, the experimenter knew the location of the toy; in FB test trials, the toy had been moved while the experimenter was absent. The experimenter reached either into the box containing the toy or into the empty box. Looking time data differed significantly between TB and FB test trials: infants looked longer when (a) the experimenter reached to the incorrect location (the empty box) in the TB trials and (b) when the experimenter reached to the correct location (the box containing the toy) in the FB trials. These results suggested that infants tracked the experimenter's belief and expected her to act accordingly (Onishi and Baillargeon (2005).
Over the next decade, tests measuring looking direction or other nonverbal responses continued to produce the evidence of early theory of mind development. One study revealed that 25-month-olds show anticipatory looking to locations in which another person falsely believes that there are toys (Southgate, Senju, & Csibra, 2007). Using a violation-ofexpectation paradigm, Surian, Caldi, and Sperber (2007) found evidence that 13-month-olds could use the height of an occluder to determine whether an agent had the visual access necessary to see a preferred food item, and thus form an expectation about which location the agent would approach. Seventeen-month-olds were found to use an experimenter's (true or false) belief to track which of two hidden objects an experimenter asked for (Southgate, Chevallier, & Csibra, 2010). Finally, eighteen-month-olds were shown to help an adult experimenter achieve her preferred goal when the infant understood that the experimenter had a FB about an object's identity (Buttelmann, Suhrke, & Buttelmann, 2015).
To our knowledge, only one behavioural study has found evidence of FB reasoning with infants as young as 7 months of age (but see Hyde, Simon, Ting, &Nikolaeva, 2018 andVernetti, 2014 for relevant neuroimaging studies). Kovács, Téglás, and Endress (2010) showed 7-month-old infants 18.4-second animated displays in which a blue Smurf-like character watched a ball roll across a table and behind an occluder. In two familiarization trials, infants viewed the character watching the ball roll behind the occluder and then saw the occluder drop to reveal the ball behind it. Infants then viewed both a TB and FB test trial. In each test trial, the agent walked off-screen and returned. In TB test trials, the agent watched the ball roll outside the scene to an unseen location. In FB test trials, the ball rolled behind the occluder in the agent's presence and then to an unseen location during the agent's absence. In all test trials, when the occluder was lowered, infants saw the final outcome of no ball being visible. Infants looked significantly longer to FB trials than to TB trials, suggesting that they tracked the agent's belief regarding the presence or absence of the ball behind the occluder.
One limitation of Kovács et al.'s study is the possibility of retroactive interference. Heyes (2014) argues that infants may have looked longer to the FB trials than to the TB trials not because of belief tracking, but because the return of the agent in FB trials after the ball moved off-screen interfered with infants' formation of a memory of the ball moving off-screen. Because this memory of the ball's movement was impaired, FB trials were more similar than TB trials to the familiarization trials, since during familiarization trials, the final outcome revealed the ball when the occluder dropped. From this perspective, the ultimate absence of the ball was surprising, but only in FB trials.
One approach to addressing Heyes's concern is to use eye tracking to measure whether the infant's gaze is directed at the agent or at the area that is revealed once the occluder has dropped at the end of a FB trial. If attention is drawn to the ball (or its absence), then the infant would look longer at the expected location of the ball, the area that had been occluded. Similarly, if the infant is surprised on behalf of the agent, gaze would be not to the agent but to the newly revealed evidence: the ball or its absence. In contrast, only if the surprise that results from the FB of the agent is of interest, would the infant's gaze be directed to the agent. Evidence from adult participants suggests that a model's gaze captures attention more if that model has a surprised facial expression (Bayless, Glover, Taylor, & Itier, 2011;Neath, Nilsen, Gittsovich, & Itier, 2013). We reasoned that if an infant watches as an agent is confronted with evidence challenging a FB, the agent's resulting surprise could attract the attention of the infant.
Kovács and colleagues reported that in studies with 7-month-olds, the presence of a character with an apparent FB can impact infants' looking behaviour. Still, it is an open question whether there is any FB reasoning at 6 months of age. There is evidence that 6-month-olds view reaching and grasping as goal-directed (Woodward, 1998). There is even some evidence of social referencing as early as 6 months of age (Mireault et al., 2014). Neuroimaging studies have reported neural responses consistent with an appreciation for FB at 6 months (Hyde et al., 2018;Southgate & Vernetti, 2014). Thus, some components of theory of mind are already developing at this age. However, we know of no behavioural evidence that 6-month-olds understand FB displays. This study measures 6-and 7-month-old infants' looking behaviour when a character has a FB versus a TB.
1.1 | The current study Kovács et al. (2010) reported greater looking time to displays in which an agent had a FB compared with trials in which an agent had a TB. The purpose of this study was to build on previous findings regarding FB reasoning in 7-month-olds by testing whether the infants' looking behaviour was directed toward the agent or to expected location of the ball. In other words, do they attend to different aspects of the scene? We also extended previous findings by testing whether 6-month-olds also respond differently to FB and TB trials.
To this end, we used the stimuli provided by Kovács et al. (2010) and made the following changes to their design: (a) in answer to criticism by Heyes (2014) that FB trials were more similar to the familiarization trials than TB were, we did not present infants with familiarization trials. Rather, all infants viewed only eight test trials (4 TB, 4 FB), (b); each infant viewed all TB and FB trials. The final outcomes (ball or no ball) were always congruent with infants' beliefs, but in FB trials, outcomes were incongruent with the agent's beliefs, and (c) trial length was kept constant (rather than infant-controlled) so that all infants viewed the final outcome for 5 s after the occluder dropped.
The use of eye-tracking technology allowed us to track gaze direction during the final 5 s of trials, and to measure whether infants spend more time looking at the agent who might be surprised or at the surprising evidence: the expected location of the ball. We reasoned that if infants were tracking the character's beliefs, they would look more to the character (expecting the character's surprise) in FB trials than in TB trials. We tested both 6-and 7-month-old infants to see whether these two age groups would visually explore these scenarios in a similar way.

| Participants
Thirty-four infants were included in analyses (Caucasian = 30, Other/mixed race = 4). This included two age groups with seventeen 6-month-old infants (mean age = 6 months, 8 days; age range = 179-195 days; 8 males and 9 females) and seventeen 7-month-old infants (mean age = 7 months, 6 days; age range = 208 days-230 days; 8 males and 9 females). We planned to recruit approximately 20 participants per age group, because we wanted to have a similar sample size to the original study (Kovács et al., 2010; n = 14 per age group), while allowing for some loss for fussiness or equipment failure. In order to find an effect of the size that Kovács et al. reported (d' = 1.44), we would need to compare two groups with a minimum sample size of 7 per age group (or 14 total). In fact, we were successful in recruiting 43 participants to the laboratory. Nine infants were tested and excluded for fussiness (n = 5), technological malfunction (n = 2), or autism spectrum disorder (ASD) diagnosis of a close relative (n = 2). The latter were excluded since theory of mind development is known to be atypical in children with ASD ( Baron-Cohen et al., 1985). No analyses were started before data collection was complete.
Infants and parents were recruited through hospital visits to local maternity wards, where providing contact information for the purpose of research participation is completely voluntary. Participants were paid $15 for their participation. Informed consent was obtained, and an explanation of experimental procedure was provided prior to testing.

| Apparatus and data recording
A remote eye tracker (Tobii T60XL) with infrared corneal reflection technology embedded into the monitor was used to measure eye movements during stimuli presentation. The display monitor was a 24-in. flat screen set to the resolution of 1,024 Â 768 pixels. The Tobii T60XL recorded data at 60 Hz and 0.5 visual angle accuracy. A web camera was placed atop the monitor to record infants' faces in a full-frontal view. Stimuli were presented with Tobii Studio software using a Dell desktop computer with a Windows 7 operating system.

| Stimuli
Four trial types were used. All trials were 24 s. All trials depicted an animated blue character (agent) watching a ball rolling across a table and behind an occluder, and included the agent leaving and returning to the scene. In FB trials, when the agent walked off-screen, the ball either rolled to a different unseen location after the agent had viewed the ball rolling behind an occluder (no ball trials), or the ball rolled back behind the occluder after the agent had viewed the ball rolling off-screen (ball trials) in the agent's absence. In TB trials, when the agent walked off-screen, the ball remained in the same location during the agent's absence. Either the agent viewed the ball rolling to a different unseen location, and the ball remained off-screen in the agent's absence (no ball trials), or the agent viewed the ball rolling behind the occluder and the ball remained there in the agent's absence (ball trials). In all trials, the agent returned to the scene and the occluder dropped down to reveal the presence or absence of the ball. The trial ended 5 s later. Note that in the original experiment (Kovács et al., 2010), each trial ended with no visible ball, but in this study, half of the trial end with a visible ball, and half the trials end with no visible ball, so that we can test whether looking behaviour is explained by the presence of a ball.

| Design
In the current design, age was a between-subject factor, with 7-month-old and 6-month-old groups. The Smurf-like agent's true versus FB was a within-subject factor since every infant saw four trials in which the agent was present when the ball rolled out of the scene and four trials in which the agent was absent when the ball rolled out of the scene. The infant's TB was constant across these eight trails, as the presence or absence of the ball behind the occluder could always be anticipated by the infant. The final outcome, whether the ball was behind the occluder or not, was also a within-subject factor. All infants viewed a total of eight 24-second stimuli trials with agent's belief (4 FB, 4 TB) crossed with the final outcome (4 Ball, 4 No Ball). Trial order was randomized for each participant during the session: at each trail, a trial type was selected randomly without replacement by the eye tracker and displayed as the next trial. Table 1 illustrates this design.

| Procedure
Infants were seated in a car seat (n = 28) or parent's lap (n = 6) with eyes 65 cm away from the eye-tracking monitor.
The testing room was dimly lit to reduce distractions. Parents sitting with their infants were asked to close their eyes to avoid looking at the monitor. The Tobii Studio 5-point infant calibration with animated stimuli was used to obtain reliable eye movement data.

| Data coding
Total looking time and total number of fixations to two areas of interest (AOIs) in the final 5 s (occluder down) of each trial were coded. These AOIs isolate the agent or the area that is revealed when the occluder turns down. In order to avoid false positives, the Agent AOI was drawn around the outline of the agent, rather than by creating an oval-shaped AOI (see Figure 1 and Appendix A). Fixation data were defined using the Tobii fixation filter with a velocity threshold of 35 pixels and a distance threshold of 35 pixels.

| Preference for looking to false belief over true belief as a proportion of total looking
To examine looking behaviour of infants between age groups, the proportion of the total looking time that occurred during FB trials was calculated. This index was calculated for each participant as follows: proportion-to-FB index ¼ Sum of Looking to FB trials Sum of Looking to FB trials þ Sum of Looking to TB trials This index was calculated for (a) total looking time summed across trials by trial type and for (b) total number of fixations summed across trials by trial type with respect to each of the two AOIs: (a) the agent AOI and (b) the occluder area AOI. This looking index allowed us to account for individual variance in baseline looking behaviour F I G U R E 1 Stimuli used in this study. Final frame of TB and FB trials (5 s each) shown with red areas of interest (AOIs) depicted. Infants viewed four trials total: each trial type (TB, FB) had one trial of each outcome (ball present, ball absent) between infants (Csibra, Hernik, Mascaro, Tatone, & Lengyel, 2016), and similar indices are used widely across infant literature (Hirshkowitz & Wilcox, 2013;Rutherford, Walsh, & Lee, 2015;Young, Merin, Rogers, & Ozonoff, 2009).
An Omnibus repeated-measures ANOVA was conducted for the proportion-to-FB index using total looking time, and another ANOVA was conducted for the proportion-to-FB index using number of fixations. First, using proportion-to-FB index of number of fixations as a dependent variable, a 2 (AOI: Agent vs. Expected Ball Location) by 2 (Age: 6 vs. 7 months) repeated-measures ANOVA was conducted. The interaction between the AOI and age was not significant (F[1,32] = 1.805, p = .19), nor were the main effects of the AOI (F[1,32] = 1.516, p = .23). The main effect of age was significant for the proportion-to-FB index of the number of fixations (F [1,32] = 4.087, p = .05). Next, using the proportion-to-FB index of total looking time as a dependent variable, a 2 (AOI: Agent vs. Expected Ball Location) by 2 (Age: 6 vs. 7 months) repeated-measures ANOVA was conducted. The interaction between AOI and age was not significant (F[1,32] = 1.169, p = .29), nor were the main effects of AOI (F[1,32] = 1.61, p = .21) or age (F[1,32] = 2.68, p = .11). These relationships are illustrated in

| Do infants show a preference for false or true belief agents?
In addition to testing whether there was development between our 6-month-old and 7-month-old samples, we could ask whether each group showed a preference for looking to TB or FB displays. We tested whether preference for looking toward FB displays was above chance for each age group. Using the proportion-to-FB index for the number F I G U R E 2 Preference for looking toward false belief trials over true belief trials of fixations on the agent AOI, 6-month-olds were significantly less likely than chance to prefer the FB display (t

| DISCUSSION
The current study explored infants' looking behaviour across trials that did and did not include an agent with a FB by using an eye-gaze-based implicit measure and stimuli adapted from Kovács et al. (2010). Similar to the Kovács et al.
study, we found evidence that infants as young as 7 months of age show a different behavioural response to a FB display compared to a TB display, indicating that these infants respond to the agent's true or FBs. While the 7-month-olds in the present study looked more to the agent in the final 5 s of FB trials than TB trials, the 6-montholds showed no such preference. Similar results were found using the total looking duration and the total number of fixations. Furthermore, the evidence for elevated interest in FB trials shown in Figure 2 relies on looking toward the agent, regardless of looking toward the more informative location of the ball, suggesting that the infant has linked the FB to the agent who holds the FB. Six-month-old participants did not show this tell-tale pattern of looking times.
Notice that in contrast to other false-belief tasks, this task does not target the content of the belief, but rather the belief-holder. We used eye-tracking technology to measure looking to specific AOSs, the agent or the occluder, rather than looking to the entire display as was measured by Kovács et al. (2010), because we predicted that the agent with a FB would attract the attention of the infant. Seven-month-olds fixated the agent AOI in FB trials more than they did in TB trials. This is evidence that 7-month-olds will attend to the agent if that agent holds a FB more than if the agent holds a TB. To date, we know of no other evidence that infants bind beliefs or other mental states to the corresponding agent. Heyes (2014) argued that in Kovács et al.'s original study, results could be explained by the return of the agent interfering with the infant's memory for the movement of the ball to an unseen location. This explanation fails to account for two of the novel findings in the current study. First, it does not explain why looking time specifically to the agent, rather than to the display as a whole was greater for FB trials. Second, it does not offer a straightforward explanation for why the pattern of looking time across FB and TB trials changed between 6 and 7 months of age.
The current paradigm reveals a measurable response to an agent with a FB as early as 7 months. Prior to Kovács et al. (2010), research examining young infants' understanding of FBs had used experimental tasks embedded with preference information (He, Bolz & Baillargeon, 2012;Luo, 2011). For example, Luo (2011) found that 10-month-old infants looked longer when agents grasped a new object over a previously preferred object only when the agent appeared to have knowledge that there were two objects present. In contrast, when the agent knew there was only one object present (TB) or falsely believed there was only one object present (FB), the infants did not expect the agent to have a preference. Similarly, He, Bolz and Baillargeon (2012) found that 10-11-month-old infants looked longer when an agent grasped a short container with a preferred toy collapsed and moved in the agent's absence (FB) than when an agent grasped a tall container without the preferred, long toy. In both cases, the results of these studies suggested infants' sensitivity to FBs involving agent preferences. The stimuli in the current study differ from those used in preference paradigms. In the current study, the agent does not necessarily have a clear goal in mind; rather, he just observes, and one cannot ascribe a clear preference to the agent. The only clues infants might have about the agent's disposition involve his gaze.
In the current study, looking patterns differ between 6-and 7-month-olds. The 7-month-olds looked to the character in the FB trials more than in the TB trials, but the 6-month-olds did not. One possible explanation for this difference is that the ability to understand cues to FBs develops between the ages of 6 and 7 months. However, there are other explanations that could account for these findings. Technically, this is not a FB task as the infant is not required to predict the actor's behaviour, so not looking at the agent who has a FB does not necessarily mean that the infant does not know that the agent has a FB. It is possible that 6-month-olds do not focus on agents preferentially to the rest of the scene despite understanding that the agent has a FB. The gaze pattern of these 6-month-olds does not necessarily mean that they have not detected the difference between TB and FB, but could reflect the fact that they do not consider the agent as interesting. Our dependent variable relies on the agent as the region of interest, and a difference in interest in agents generally between 6 and 7 months of age could account for this difference.
It is also possible that 6-month-olds understood that the character had a FB but did not have a coherent plan for looking for consequences of the FBs. These results are still evidence of socio-cognitive development, but the current paradigm cannot distinguish between an appreciation of beliefs versus other psychological processes that might be part of such development. Previous research indicates that infants as young as 3-5 months are surprised by objects disappearing (Luo, Kaufman, & Baillargeon, 2009;Wang, Baillargeon, & Paterson, 2005), suggesting the age differences found in this study are not based upon task difficulty related to infants' ability to track the events from their own perspective.
It would be imprudent to conclude that the pattern of results seen in the 6-month-old group was clear evidence of immaturity with respect to false-belief reasoning. Extant research with 6-month-olds suggests that they may be sensitive to FBs. Hyde et al. (2018) used near-infrared spectroscopy (NIRS) to examine infants' functional brain responses in temporal, parietal, and frontal regions, including the temporal-parietal junction (TPJ), an area well known for theory of mind reasoning (Aichhorn et al., 2009;Braukmann et al., 2018;Lloyd-Fox et al., 2018;Saxe & Kanwisher, 2003;Saxe & Wexler, 2005;Young, Dodell-Feder, & Saxe, 2010). They found that 6-month-olds and 7month-olds show evidence of FB reasoning using a FB task. Researchers found that neural activity in 7-month-old infants differentiated only in the TPJ to TB and FB scenarios; the same pattern of results was found in the adult literature. Southgate and Vernetti (2014) used EEG to measure 6-month-olds' motor cortex activation by alpha suppression. They found that infants' motor cortex showed activation only when an agent had a FB about a ball being in a box (expecting the agent to reach for the ball), but not when an agent had a FB about the ball not present in the box.
This study involves not only tracking belief but predicting behaviour. Neither imaging study reports corroborating eye-tracking or behavioural data, so it is possible that young infants' processing of FBs is only measurable with neural measurements. Future research could examine both neural and eye-tracking data together to see whether this is the case.
It is now clear that early estimates of the age of theory of mind development based on the child's explicit report were too conservative. Children do show evidence of theory of mind reasoning much earlier than the often cited 4 years of age, possibly as early as the first year of life. The development of methods that are sensitive enough to measure earlier theory of mind reasoning has some theoretical and potentially some clinical implications. The earliest age of theory of mind reasoning is relevant to any theory that posits such reasoning as a precursor of some other behaviour. For example, Leslie posits that pretend play is evidence of theory of mind development (Leslie, 1992), but this suggestion is troubled by persistent claims that theory of mind abilities develop only at 4 years of age since pretend play is evident as early as 18 months of age. If pretend play depends on theory of mind development and is evident at 18 months of age, then theory of mind cannot develop as late as 4 years of age.
In summary, this study shows evidence of theory of mind reasoning at 7 months of age, and a lack of such evidence at 6 months of age. The fact that 7-month-olds looked longer at the observing character who holds a FB when this character had not conveyed any goal or preference suggests that infants has made a link between the FB and the character who holds that FB. Together, these results provide evidence of an appreciation of FB at 7 months and suggest some development between 6 and 7 months.