Face engagement during infancy predicts later face recognition ability in younger siblings of children with autism

Face recognition difficulties are frequently documented in children with autism spectrum disorders (ASD). It has been hypothesized that these difficulties result from a reduced interest in faces early in life, leading to decreased cortical specialization and atypical development of the neural circuitry for face processing. However, a recent study by our lab demonstrated that infants at increased familial risk for ASD, irrespective of their diagnostic status at 3 years, exhibit a clear orienting response to faces. The present study was conducted as a follow-up on the same cohort to investigate how measures of early engagement with faces relate to face-processing abilities later in life. We also investigated whether face recognition difficulties are specifically related to an ASD diagnosis, or whether they are present at a higher rate in all those at familial risk. At 3 years we found a reduced ability to recognize unfamiliar faces in the high-risk group that was not specific to those children who received an ASD diagnosis, consistent with face recognition difficulties being an endophenotype of the disorder. Furthermore, we found that longer looking at faces at 7 months was associated with poorer performance on the face recognition task at 3 years in the high-risk group. These findings suggest that longer looking at faces in infants at risk for ASD might reflect early face-processing difficulties and predict difficulties with recognizing faces later in life.


Introduction
The ability to recognize and process information from the faces of the people around us is crucial for functioning in our highly social world. There is a large body of research showing that individuals with autism spectrum disorder (ASD), which is characterized by impairments in social interaction and communication, have difficulties with processing faces (for a review see Sasson, 2006). Given the importance of faces in conveying social and emotional information, some have proposed that these face-processing difficulties are central to the disorder (Dawson, Carver, Meltzoff, Panagiotides, McPartland & Webb, 2002;Grelotti, Gauthier & Schultz, 2002;Schultz, 2005). The difficulties individuals with ASD experience with the recognition of unfamiliar faces have been taken as support for this idea, but the evidence for face recognition difficulties in ASD is mixed. Several studies investigating face recognition in children with ASD aged between 2 and 16 years have demonstrated that they have difficulties with the delayed recognition of recently viewed faces (Boucher & Lewis, 1992;Chawarska & Shic, 2009;de Gelder, Vroomen & van der Heide, 1991;Hauck, Fein, Maltby, Waterhouse & Feinstein, 1998;Klin, Sparrow, de Bildt, Cicchetti, Cohen & Volkmar, 1999). For example, Klin et al. (1999) found that 7-year-old children with ASD had pronounced face recognition difficulties compared to children with pervasive developmental disorder not otherwise specified and typically developing control groups, even when the groups were matched on nonverbal and verbal mental age. This study also demonstrated that the recognition of unfamiliar faces was more vulnerable to changes in facial expression in children with ASD (Klin et al., 1999). 
However, other studies with toddlers (Chawarska & Volkmar, 2007) and adults (Barton, Cherkasova, Hefter, Cox, O'Connor & Manoach, 2004) failed to find consistent face recognition problems in individuals with ASD, or suggested that face recognition difficulties are the result of general perceptual atypicalities that are not specific to faces (Davies, Bishop, Manstead & Tantam, 1994). These conflicting findings with regard to performance on face recognition tasks are likely to result from differences in experimental tasks, participant ages, and control group criteria (use of chronological age, mental age, verbal or non-verbal IQ), especially as difficulties appear to be more evident in younger children (Sasson, 2006). A recent review by Weigelt, Koldewyn and Kanwisher (2012) suggests that individuals with ASD mainly experience difficulties with face recognition tasks that have a memory demand. Even a very minimal increase in memory demand, for example by presenting stimuli sequentially rather than simultaneously, seems to result in problems with face discrimination.
In recent years it has been shown that not only individuals with ASD, but also their first-degree relatives, demonstrate face-processing difficulties albeit to a lesser extent (Adolphs, Spezio, Parlier & Piven, 2008;Dalton, Nacewicz, Alexander & Davidson, 2007;Dawson, Webb, Wijsman, Schellenberg, Estes, Munson & Faja, 2005;Merin, Young, Ozonoff & Rogers, 2006;Wallace, Sebastian, Pellicano, Parr & Bailey, 2010). For example, parents of individuals with ASD demonstrate difficulties with recognizing faces relative to their verbal and visual spatial abilities (Dawson et al., 2005) and electrophysiological and fMRI studies with parents and infant siblings of children with ASD have shown that genetic relatives demonstrate atypical cortical responses to faces (Dalton et al., 2007;Dawson et al., 2005;Key & Stone, 2012;McCleery, Akshoomoff, Dobkins & Carver, 2009) which mirror the responses observed in individuals with ASD. This phenomenon, where the genetic relatives of individuals with ASD who do not have a diagnosis themselves possess certain behavioural and neural characteristics associated with the disorder, has been described as part of the broader autism phenotype (BAP) (Pickles, Starr, Kazak, Bolton, Papanikolaou, Bailey, Goodman & Rutter, 2000). The presence of face recognition problems in relatives of individuals with ASD raises the question whether these difficulties are specifically related to (sub-clinical) characteristics of the disorder or whether they are present in all those at familial risk, something not many studies of family members have tested.
Another unanswered question concerns the causal factors underlying the development of face-processing problems in ASD. In typically developing infants, the development of specialized face-processing mechanisms is thought to be mediated by exposure to, and experience with faces (Nelson, 2001;Morton & Johnson, 1991). This has led researchers to suggest that in infants with ASD, differences in the formation or processing of the core face network (e.g. fusiform gyrus; Sasson, 2006) or the pulvinar (Johnson, 2005) result in a failure to preferentially orient to this kind of stimulus. Others have suggested that a more general reduced level of social motivation is the primary factor causing face-processing difficulties in children with ASD (Dawson et al., 2002;Dawson et al., 2005;Grelotti et al., 2002) possibly as a result of amygdala abnormalities (Grelotti et al., 2002;Kleinhans, Richards, Sterling, Stegbauer, Mahurin, Johnson, Greenson, Dawson & Aylward, 2008;Schultz, 2005). Regardless of the precise mechanisms, all these accounts predict that infants who go on to develop ASD spend less time looking at faces and that this lack of experience with faces has detrimental effects on their face-processing abilities later in life.
The studies investigating this hypothesis have been mainly retrospective, using home video analyses of unstructured settings (e.g. birthday parties) or parent reports. The results of these studies demonstrate that children later diagnosed with ASD orient to, and look at, social stimuli less than their typically developing peers in the first 2 years of life (for a review see Saint-Georges, Cassel, Cohen, Chetouani, Laznik, Maestro & Muratori, 2010). However, parent reports are likely to be influenced by recollection bias and home videos vary greatly in context and lack experimental control. Experimentally controlled prospective studies have shown very few differences between infants at familial risk for ASD (due to having an older sibling with a diagnosis of ASD) and low-risk controls in the orienting towards, and scanning of, faces when they interact with their caregiver (Young, Merin, Rogers & Ozonoff, 2009) or an experimenter (Bryson, Zwaigenbaum, Brian, Roberts, Szatmari, Rombough & McDermott, 2007;Ozonoff, Iosif, Baguio, Cook, Hill, Hutman, Rogers, Rozga, Sangha, Sigman, Steinfeld & Young, 2010) during the first 6 months of life. However, a recent eye-tracking study by Chawarska, Macari and Shic (2013) demonstrated that 6-month-old infants who later received an ASD diagnosis attended less to social scenes and spent less time scanning the face of the person in the scene. In contrast, results from two other studies suggest that children with, or at risk for, ASD instead looked longer at faces. Webb, Jones, Merkle, Namkung, Toth, Greenson, Murias and Dawson (2010) demonstrated that toddlers with ASD and their unaffected siblings took significantly longer to habituate to faces than children in the control group, and that this slower face learning was correlated with poorer social-communicative skills.
Elsabbagh, Gliga, Hudry, Charman, Johnson and the BASIS team (2013a) demonstrated that 7-month-old high-risk infants spent proportionally more time looking at faces relative to other objects than low-risk controls. Although these findings appear to contradict the hypothesis that face-processing difficulties in ASD result from a lack of engagement with faces in infancy, prospective studies, such as the present one that was conducted as a follow-up to the Elsabbagh et al. (2013a) study, are needed to validate in what way abnormalities in early engagement with faces are related to face-processing abilities later in life.
The younger siblings of children diagnosed with ASD have an increased risk of developing ASD themselves; combined data over several studies indicate that their risk for ASD is increased to 18.7% (Ozonoff, Young, Carter, Messinger, Yirmiya, Zwaigenbaum, Bryson, Carver, Constantino, Dobkins, Hutman, Iverson, Landa, Rogers, Sigman & Stone, 2011) compared to around 1% in the general population (Baird, Simonoff, Pickles, Chandler, Loucas, Meldrum & Charman, 2006). Therefore, research with high-risk siblings is a promising new approach to identify the processes through which symptoms emerge over time, and to investigate how differences in early development influence the resulting phenotype (Elsabbagh & Johnson, 2010). In the current study we implemented this approach to investigate the face recognition abilities of 3-year-old children at increased risk for ASD, due to having an older sibling with ASD, compared to a low-risk control group. In addition, we investigated how measures of engagement with faces in infancy related to their face recognition abilities in toddlerhood.
To test the children's face recognition abilities, a touchscreen task was administered during which the children had to recognize newly viewed faces after a short delay. Based on previous studies that demonstrated that memory load (Weigelt et al., 2012) and changes in facial expressions (Klin et al., 1999) affect face recognition abilities in individuals with ASD, we used a delay to increase the memory demands and included items with a local feature change between the familiarization and recognition phase to maximize the chance that we would be able to differentiate between the groups. We hypothesized that if face recognition difficulties are correlated with the presence of ASD characteristics, only the children who received an ASD diagnosis and possibly those that manifest sub-clinical atypicalities, but not typically developing high-risk siblings, would have difficulties with the task. Alternatively, the high-risk group as a whole may show face recognition problems, suggesting that face-processing difficulties represent an endophenotype (a genetically mediated risk factor) of the disorder that is present in relatives of individuals with ASD at a higher rate than in the general population (Gottesman & Gould, 2003).
The second aim of the study was to investigate how face recognition abilities in toddlerhood related to experimental measures of engagement with faces in infancy by correlating performance on the face recognition task with looking behaviour during the 'pop-out' task that had been administered when the participants in the present study were around 7 months old (Elsabbagh et al., 2013a). In this eye-tracker task infants were presented with visual arrays containing faces amongst multiple distracters. Just like typically developing 6-month-old infants (Gliga, Elsabbagh, Andravizou & Johnson, 2009), infants at risk for ASD, irrespective of their diagnostic status at 3 years, direct their first saccade toward faces more frequently than expected by chance, despite the presence of competing objects (Elsabbagh et al., 2013a). Interestingly, over the course of the experiment the high-risk infants spent proportionally more time looking at faces relative to other objects than the low-risk controls. We used this measure of the proportion of time infants spent looking at the faces during the pop-out task at 7 months as a measure of early engagement with faces. Based on models of the development of specialized face-processing mechanisms in typically developing infants (Nelson, 2001;Morton & Johnson, 1991), we initially expected higher face engagement values at 7 months to be associated with better performance on the face recognition task at 3 years in the low-risk controls and high-risk siblings. However, based on our earlier findings from this same cohort (Elsabbagh et al., 2013a), an alternative hypothesis, that in the high-risk group higher face engagement at 7 months instead reflects face-processing difficulties and would therefore be associated with poorer performance on the face recognition task at 3 years, was favoured.
This study allowed us to investigate whether face recognition difficulties are related to ASD diagnosis or whether they represent an endophenotype of the disorder that is present at a higher rate in those at increased familial risk. In addition, we investigated for the first time how early differences in engagement with faces in infancy are related to face-processing abilities in toddlerhood. Supplementary analyses were performed to investigate whether group differences might be explained by face-scanning patterns.

Participant characteristics
The participants in this study were part of a larger group of children (N = 104) recruited for an ongoing longitudinal project facilitating research with siblings of children with ASD (the British Autism Study of Infant Siblings (BASIS)). Ethical approval for the current study was made available through BASIS (NHS NRES London REC 08/H0718/76). Parents gave informed consent. Of the 104 participants (39 males and 65 females), 54 had an older sibling with ASD (high-risk siblings) and 50 had a typically developing older sibling (low-risk controls). Participants in either group were excluded from participating in the study if they were born preterm, had low birth weight, medical or neurological conditions, or sensory or motor problems. The pop-out task was administered when the infants came for their first lab visit around 7 months of age (mean = 238.3 days, SD = 37.2) (see Elsabbagh et al., 2013a, for more information about the task and participant characteristics). Subsequently, 53 (out of 54) of those at high risk for ASD and 48 (out of 50) low-risk controls were seen for an assessment around their third birthday (mean = 37.7 months, SD = 3.0 days). During this visit, a battery of clinical research measures was administered including the Autism Diagnostic Interview-Revised (ADI-R; Lord, Rutter & Couteur, 1994), the Autism Diagnostic Observation Schedule-Generic (ADOS-G; Lord, Risi, Lambrecht, Cook, Leventhal, DiLavore, Pickles & Rutter, 2000), and the Mullen Scales of Early Learning (MSEL; Mullen, 1995). Consensus ICD-10 criteria were used to ascertain diagnosis in the high-risk siblings using all available information from all visits by experienced researchers (TC, KH, SC, GP). The supplementary materials present detailed participant characteristics, including confirmation of risk status, background measures, and outcome classification.
The children in the high-risk group were classified as having ASD (sib-ASD), as having other developmental concerns (sib-Other), or as typically developing (sib-TD).
Forty-four high-risk siblings and 40 low-risk controls provided valid data for the face recognition task at 3 years. Participants' characteristics (age, gender, IQ) are presented in Table 1. There were missing data for seven high-risk and six low-risk participants because: participant did not take part in the 3-year visit (1 high-risk, 2 low-risk), assessment took place during a home visit (2 high-risk), participant was more than 1 year older than the group average (1 high-risk, 2 low-risk), participant did not do the face recognition task (3 high-risk, 2 low-risk). Additional participants were excluded from analyses because of: no response in more than three trials in one of the difficulty levels (1 high-risk, 1 low-risk), task compliance or parental interference (2 high-risk, 3 low-risk).

Design and materials
Face recognition task at 3 years
E-prime software was used to present the children with 12 face recognition items on a 21-inch touchscreen. The stimuli consisted of still images of human faces selected from the MacBrain Face Stimulus Set (see Supplementary Materials for more information). In half of the items, male faces were used as targets and distracters. The images were equated in colour and luminosity. When viewed from a distance of 60 cm the images covered an approximate area of 8.9° × 9.4°. During the touchscreen task the children sat on their caregiver's lap or on a chair on their own at approximately 50-60 cm from the touchscreen. The face recognition procedure consisted of a familiarization and a recognition phase. A central target face with a neutral facial expression or a closed-mouth smile replaced a small fixation stimulus when the child attended the screen. The experimenter encouraged the child to pay attention to the face. After 5000 ms the target face was replaced with an image of a house and the experimenter asked: 'Where did he/she go? He/she went into the house!' After a 3000 ms delay the sound of a doorbell was played and the house was shown with the familiar and a novel face displayed on the location of the windows. The experimenter then prompted the child to touch the correct face by saying: 'Where did he/she go? Can you find him/her?' (see Figure 1). When the child answered correctly a smiley appeared on the screen; when the child gave the wrong answer or did not respond no feedback was provided. When no response was given the images timed out after 7000 ms. The task started with two easy example trials in which the novel faces were of the opposite sex. The 12 test items were of two difficulty levels. In the six easy items the target faces presented during the familiarization and recognition phase were identical whereas in the six difficult items the facial expression changed (either from a closed-mouth smile, during familiarization, to an open-mouth smile during recognition or from neutral to a closed-mouth smile). In the recognition phase, the novel face always had the same facial expression as the target face. The order of the items was randomized for each participant and the side of the correct face was counterbalanced between trials. If the child got distracted the experimenter redirected the child's attention to the task.
Note to Table 1: Superscripts indicate differences between low and high risk (* p < .05; ** p < .01) and between the Sib-ASD or Sib-Other groups (which did not differ from each other on any measures) and Sib-TD (+ p < .05; ++ p < .01; Bonferroni correction). (a) Group mean, standard deviation, and range of MSEL ELC scores, Mean = 100, SD = 15. (b) Group mean, standard deviation, and range of MSEL subscales. NB: No data were collected for the Gross motor subscale. (c) Group mean, standard deviation, and range for the Social and Communication algorithm of the ADOS. One missing.
Measures. We calculated the percentage of trials in which the child gave the correct answer and the mean reaction times (RTs) for the correct responses. When calculating the percentage correct, no-response trials were counted as incorrect. To remove bias from impulsive responses a total of six trials with RTs lower than two standard deviations below the mean were excluded (Mean RT = 3254.52, SD = 1329.99, RTs < 594.54 ms excluded). The mean number of valid trials per condition and group is presented in Table 2. Participants with more than three no-response trials per difficulty level were excluded. There were no group differences in the number of valid trials, the number of no-response trials, or impulsive responses, all ps > .095.
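The scoring rules above can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names and the trial encoding are hypothetical, and the assumption that the impulsive-response cutoff applies to the RT measure (rather than also removing trials from the accuracy denominator) is ours.

```python
from statistics import mean


def percent_correct(outcomes):
    """Percentage correct for a list of 'correct', 'incorrect' or 'none' outcomes.

    No-response trials ('none') stay in the denominator, so they count as incorrect.
    """
    return 100.0 * sum(o == "correct" for o in outcomes) / len(outcomes)


def impulsive_cutoff(mean_rt, sd_rt, n_sd=2.0):
    """RT (ms) below which a response is treated as impulsive."""
    return mean_rt - n_sd * sd_rt


def mean_correct_rt(trials, cutoff):
    """Mean RT over correct trials at or above the cutoff.

    trials: list of (outcome, rt_ms) pairs (hypothetical encoding).
    Returns None when no trial qualifies.
    """
    rts = [rt for outcome, rt in trials if outcome == "correct" and rt >= cutoff]
    return mean(rts) if rts else None


# Reported sample values: Mean RT = 3254.52 ms, SD = 1329.99 ms.
cutoff = impulsive_cutoff(3254.52, 1329.99)
print(round(cutoff, 2))  # 594.54
```

With six trials per difficulty level, four correct responses plus one incorrect and one no-response trial give `percent_correct` of about 67%.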

Face pop-out task at 7 months
The infants' looking behaviour was recorded at 50 Hz using a Tobii eye tracker. The infants were seated on their caregiver's lap, at 50-55 cm from the Tobii screen. First a 5-point calibration sequence was run and only when at least four points were marked as being properly calibrated for each eye was recording started. During the task 14 different slides with five images, one face and four distracters were presented (see Figure 2). Colour images of seven male and seven female faces with direct gaze were used as the targets. Different exemplars of mobile phones, birds, cars, and face visual noise images (Halit, Csibra, Volein & Johnson, 2004) were used as distracters. For more information about the stimuli see Elsabbagh et al. (2013a). Stimuli were presented using Tobii Studio software and each slide presentation lasted 15 seconds. Before each slide a small animation was presented in the centre of the screen to ensure that the infant's gaze was directed to the centre. To maintain the infant's attention, the visual presentation was accompanied by unrelated music. If the infant stopped looking at the slide one of the experimenters prompted the infant to look at the screen again. If the infant looked away for more than 5 seconds the slide presentation was terminated.
Figure 1 An example of the face recognition stimulus display for a difficult trial. Children were presented with a fixation stimulus, followed by the familiarization phase. After 5000 ms a picture of a house replaced the target face. After a delay of 3000 ms the target face was presented together with the novel face and children were instructed to touch the familiar face. In the easy items the target face was identical at familiarization and recognition, whereas in the difficult items the facial expression changed (either from a closed-mouth smile to an open-mouth smile or from neutral to closed-mouth smile).
Measures. Rectangular areas of interest (AOIs) were defined around each object image and the centre of the screen using Tobii Studio software. In the current study we used the proportion of time the infants spent looking at the face AOI relative to all target AOIs in the array, which we will refer to as 'face engagement'. Trials were considered valid if the infant was looking at the slide for at least 3 seconds. Infants with less than three valid trials were excluded from the analyses.
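As a rough illustration of the face engagement measure and the validity rules, the following Python sketch computes a per-infant value from per-slide looking times. The data layout and function names are hypothetical, and two details are assumptions rather than reported facts: that the denominator is the summed looking time over the five object AOIs (excluding the centre AOI), and that per-slide proportions are averaged across valid slides.

```python
def face_engagement(looking_ms):
    """Proportion of object-AOI looking time spent on the face for one slide.

    looking_ms: dict mapping AOI name -> looking time in ms; must contain 'face'.
    Returns None if the infant did not look at any object AOI.
    """
    total = sum(looking_ms.values())
    if total == 0:
        return None
    return looking_ms["face"] / total


def infant_face_engagement(trials, min_slide_ms=3000, min_valid_trials=3):
    """Mean face engagement across valid slides for one infant.

    trials: list of (time_on_slide_ms, looking_ms) pairs.
    A slide is valid if the infant looked at it for at least min_slide_ms.
    Returns None (infant excluded) with fewer than min_valid_trials valid slides.
    """
    values = []
    for time_on_slide_ms, looking_ms in trials:
        if time_on_slide_ms < min_slide_ms:
            continue  # too little looking at the slide: invalid trial
        value = face_engagement(looking_ms)
        if value is not None:
            values.append(value)
    if len(values) < min_valid_trials:
        return None
    return sum(values) / len(values)
```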

Statistical analyses
We investigated our main hypotheses using a repeated-measures analysis of variance with trial type (easy, difficult) as within-subjects factor and outcome group (low-risk controls, sib-TD, sib-Other, sib-ASD) as between-subjects factor. The MSEL early learning composite (ELC) standard score at 3 years was entered as a covariate to account for any group differences in general intelligence. We found no RT differences between any of the groups, therefore only accuracy analyses are reported in the results section. We investigated the presence of group differences further with planned comparisons. We first compared the low-risk controls and high-risk siblings to examine overall differences based on risk status. Hereafter the effect of clinical outcome was examined. The relationship between face engagement at 7 months and performance on the face recognition task at 3 years was investigated with a repeated-measures ANOVA with percentage correct for the two trial types (easy, difficult) as within-subjects variable, group (high-risk siblings, low-risk controls) as between-subjects factor, and MSEL ELC and face engagement as covariates. Levene's tests demonstrated that the error variances of the dependent variables (percentage correct on easy and difficult items) were equal across the groups in all the analyses, all p values > .235.

Face recognition
The repeated-measures ANOVA with percentage correct for the two trial types as within-subjects variable, outcome group (low-risk controls, sib-TD, sib-Other, sib-ASD) as between-subjects factor, and MSEL ELC as a covariate demonstrated that there was a marginally significant effect of MSEL ELC, F(1, 78) = 3.88, p = .053, but no effect of trial type, F(1, 78) = .05, p = .819, and no group by trial type interaction, F(3, 78) = .71, p = .550. There were no significant differences between the four outcome groups, F(3, 78) = 1.48, p = .228 (see Figure 3a). However, planned comparisons demonstrated that there was a significant difference between the high-risk siblings and low-risk controls, F(1, 78) = 4.40, p = .039, Cohen's d = .453 (see Figure 3b). One-sample t-tests on the means corrected for MSEL ELC score demonstrated that the low-risk controls performed above chance level for both difficulty levels, p < .001, while the high-risk siblings performed above chance on the easy items, p < .001, but at chance level on the difficult items, p = .665 (see Figure 3a). Additional planned comparisons were performed to ascertain that the risk group difference was not driven by one of the outcome groups. We found that none of the three high-risk outcome groups (sib-TD, sib-Other and sib-ASD) was significantly different from the low-risk control group, and that the outcome groups did not differ from each other, all uncorrected p values > .105. When we omitted the sib-ASD group and compared the low-risk control group to the two other high-risk outcome groups taken together (the sib-TD and sib-Other groups combined), the risk group effect was still marginally significant, F(1, 78) = 3.44, p = .067, Cohen's d = .426.
Finally, a Pearson's correlation demonstrated that there was no significant relationship between ASD-like characteristics (as measured by the ADOS-G Social and Communication algorithm) and performance on the face recognition task in the high-risk group, r = −.229, p = .134 (partial correlation controlling for MSEL ELC, r = −.121, p = .444).
Summarizing, these results demonstrate that it was familial risk for ASD and not an ASD diagnostic outcome (sib-ASD) or the presence of sub-clinical ASD-like social and communication characteristics that was driving the high-risk toddlers' face recognition difficulties. Although general intelligence seems to have influenced performance on the task, group differences in MSEL ELC did not account for the difference in face recognition performance between high-risk siblings and low-risk controls.
We also tested whether face recognition difficulties in the high-risk siblings might be the result of atypical face-scanning patterns. As there were no eye tracking data recorded during the face recognition task we used gaze data recorded when the children observed the still image of a face during a different task that was administered during the same testing session (see Supplementary Materials). Analyses of face scanning during this task demonstrated that although there was a group difference (low-risk vs. high-risk) in the proportion of switches between facial features, this measure was not related to performance on the face recognition task.
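The comparison against chance reported in this section can be sketched as a one-sample t statistic. This is a simplified illustration only: it works on raw percentage-correct scores rather than the MSEL-corrected means used in the paper, the chance level of 50% is an assumption based on the two-alternative choice between the familiar and the novel face, and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def t_vs_chance(scores, chance=50.0):
    """One-sample t statistic and degrees of freedom for scores against chance.

    scores: percentage-correct values, one per child (needs at least two
    values with non-zero variance). chance=50.0 is an assumed chance level.
    """
    n = len(scores)
    standard_error = stdev(scores) / sqrt(n)  # sample SD (n - 1 denominator)
    return (mean(scores) - chance) / standard_error, n - 1
```

The resulting statistic would then be compared against a t distribution with n - 1 degrees of freedom to obtain a p value.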

Relationship between face pop-out and face recognition task
To test the hypothesis that engagement with faces in infancy is associated with face-processing abilities in toddlerhood, we investigated how the children's looking behaviour during the face pop-out task at 7 months was related to their performance on the face recognition task. A repeated-measures ANOVA with percentage correct for the two trial types (easy, difficult) as within-subjects variable, group (high-risk siblings, low-risk controls) as between-subjects factor, and MSEL ELC at 3 years and face engagement as covariates demonstrated that there was no main effect of face engagement, F(1, 72) = .001, p = .969, and no interaction between face engagement and trial type, F(1, 72) = 1.07, p = .305. However, we did find a significant interaction between group, face engagement and trial type, F(1, 72) = 4.33, p = .041. To follow up on this interaction we investigated the relationship between face engagement and performance on the face recognition task (while controlling for MSEL ELC) in the high-risk siblings and low-risk controls separately (see Table 3). There was no relationship between face engagement and performance on the face recognition task in the low-risk controls (see Figure 4a). In the high-risk sibling group, however, there was a significant negative relationship between face engagement at 7 months and performance on the difficult items of the face recognition task at 3 years, r = −.316, p = .050 (see Figure 4b). In this group, longer looking at faces in infancy was associated with poorer face recognition abilities in toddlerhood. Because these findings are compatible with face engagement in the high-risk infants reflecting face-processing difficulties, we investigated whether this measure was related to any other measures of atypical face processing during infancy (e.g. atypical face scanning).
As the images used in the pop-out task were too small to define separate AOIs around the facial features we investigated the face-scanning patterns of the same group of infants when they observed the still image of a face during a different task that was administered during the same testing session (see Supplementary Materials). However, we found no significant relationship between face-scanning patterns during this task and the amount of face engagement during the pop-out task or performance on the difficult items of the face recognition task. We did find a trend towards a relationship between the proportion of time spent looking at the eyes during the face-scanning task and face engagement in the pop-out task in the high-risk siblings, r = .303, p = .082. There also was a trend towards a negative relationship between this measure and performance on the difficult items of the face recognition task in this group, r = −.260, p = .131.
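The partial correlations in this section (face engagement and face recognition accuracy, controlling for MSEL ELC) can be illustrated with a residual-based sketch in plain Python. It is one standard way to compute a partial correlation with a single covariate, not necessarily the procedure of the statistics package the authors used, and all names are hypothetical.

```python
from math import sqrt


def residuals(y, x):
    """Residuals of y after removing a least-squares straight-line fit on x."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    return [yi - mean_y - slope * (xi - mean_x) for xi, yi in zip(x, y)]


def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / sqrt(var_a * var_b)


def partial_r(x, y, covariate):
    """Correlation between x and y after regressing the covariate out of both."""
    return pearson_r(residuals(x, covariate), residuals(y, covariate))
```

Here `x` would hold face engagement at 7 months, `y` the percentage correct on one trial type at 3 years, and `covariate` the MSEL ELC score, one entry per child.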

Discussion
In the current study we investigated the face recognition abilities of toddlers at increased familial risk for ASD and a low-risk control group. By studying children at risk we were able to relate face recognition performance to a broad spectrum of phenotypic outcomes. In addition, the longitudinal design of our study allowed us to investigate for the first time how these face recognition abilities in toddlerhood relate to measures of visual engagement with faces during infancy. We found a reduced ability to recognize unfamiliar faces at 3 years in the high-risk group that was not driven only by those children who received an ASD diagnosis nor by those manifesting subclinical ASD-like social and communicative characteristics. This adds to the previous studies that found that first-degree relatives of individuals with ASD can exhibit face-processing difficulties (Adolphs et al., 2008;Dalton et al., 2007;Dawson et al., 2005;Merin et al., 2006) but that did not test the relationship with the ASD phenotype. We found that general intelligence was a marginally significant predictor of performance on the face recognition task, suggesting that the children's overall cognitive functioning and their ability to understand and follow instructions influenced their task performance. However, the risk group effect was still significant when controlling for general intelligence, demonstrating that group differences in cognitive functioning did not account for the difference in face recognition performance between high-risk siblings and low-risk controls. Taken together, our findings suggest that face-processing difficulties are an endophenotype of ASD that is present in those at familial risk.
Table 3 Partial correlations for the relationship between face orienting at 7 months and performance on the face recognition task at 3 years controlling for MSEL ELC.
Even though the risk group differences in face recognition abilities are subtle, they are likely to be amplified in the real world by suboptimal viewing conditions such as limited exposure time, inconsistent light levels, or dynamic changes in facial expressions. It should be noted that performance in the low-risk control group was not as high as might have been expected (MSEL-corrected average 66% correct). As the task was part of a long battery of experimental assessments, it is possible that the performance of the low-risk control group was suboptimal because of fatigue. Alternatively, as the face recognition task was not a standardized assessment and we adjusted the difficulty level to avoid ceiling effects, task demands might have been high even for some of the low-risk controls. Finally, as there were only six trials per difficulty level, a relatively small number of incorrect responses (e.g. 2 out of 6) already resulted in a rather low accuracy score of 67%. We are currently in the process of testing a larger cohort of children at risk on the same task (with extra trials added) in an attempt to replicate our findings and validate our face recognition measure.
What might underlie the face recognition difficulties of the high-risk siblings? Although there was no significant interaction between group and trial type, the high-risk siblings only performed at chance level on the difficult items in which the facial expression changed between the familiarization and recognition phase. This finding is consistent with a study by Klin et al. (1999) that demonstrated that the recognition of unfamiliar faces was more vulnerable to changes in expression in children with ASD. Because the change in expression was mainly happening at the level of the mouth (changing from neutral to smiling, for example), participants had to rely on other invariant features, such as the eyes, for recognition. Researchers have proposed that atypical face-processing strategies such as decreased looking at the eyes (Dalton et al., 2005; Klin, Jones, Schultz, Volkmar & Cohen, 2002), a focus on the mouth (Joseph & Tanaka, 2003; Langdell, 1978; Neumann, Spezio, Piven & Adolphs, 2006), and an unusual reliance on featural relative to configural face information (Davies et al., 1994; Falck-Ytter, 2008; Hobson, Ouston & Lee, 1988) are among the possible mechanisms underlying the face recognition difficulties in individuals with ASD. However, analyses of the face-scanning patterns of the same high-risk siblings and low-risk controls during a different eye-tracking task at the same age did not show any relationship with performance on the face recognition task in the present study (see Supplementary Materials). Although the absence of a relationship between face scanning and face recognition performance is puzzling, other studies have reported similar findings. For example, in a study on emotion recognition in Huntington's disease, van Asselen, Júlio, Januário, Bobrowicz Campos, Almeida, Cavaco and Castelo-Branco (2012) found that the difficulties in this patient group could not be explained by atypical face-scanning patterns. Together these studies suggest that the absence of atypicalities in face-scanning behaviour does not preclude processing difficulties.
The second goal of the present study was to investigate whether engagement with faces in infancy is related to face processing later in life. In the low-risk group, engagement with faces at 7 months was not related to performance on the face recognition task. This suggests that early experience with faces might not be as strongly related to later face-processing abilities as previously hypothesized (Nelson, 2001;Morton & Johnson, 1991). Although some level of interest in faces early in life is likely to be necessary to develop competency in processing faces, the relationship between these two variables need not be a linear one. Instead it is probable that the amount of time spent looking at faces early in life is merely one of many factors influencing later face recognition performance. For example, a recent twin study found an important contribution of heritable factors to face-processing abilities (Zhu, Song, Hu, Li, Tian, Zhen, Dong, Kanwisher & Liu, 2010).
In the high-risk group, engagement with faces at 7 months was negatively correlated with performance on the difficult items of the face recognition task at 3 years. Because the face engagement measure was relative to other objects (Elsabbagh et al., 2013a), the increased engagement with the face cannot simply be explained by differences in domain-general perceptual processing. As previous studies have associated longer looking time with processing difficulties (e.g. Colombo, Mitchell, Coldren & Freeseman, 1991), longer proportional looking at the face in the high-risk group may reflect early face-processing difficulties. Although it is unclear at this point what aspect of face processing may be affected, one possibility is that our results were driven by atypicalities in the way the high-risk infants scanned the faces. Typically developing infants are sensitive to configural information in faces from at least 4 months of age (Quinn & Tanaka, 2009; Schwarzer, Zauner & Jovanovic, 2007). Possibly, the high-risk infants who demonstrate face recognition difficulties later in life may have had an atypical bias to attend to individual features of the face at 7 months, increasing the time needed to process the whole face. Alternatively, a focus on irrelevant features such as the hairline and ears might reduce visual exposure to internal features of the face and contribute to long-term face recognition difficulties (Golarai, Grill-Spector & Reiss, 2006). It may therefore be that it is not simply early visual experience with faces, but rather a particular kind of visual experience with faces (i.e. exposure to the most informative internal features of the face and their configuration) that is needed to develop long-term face recognition competency.
However, when we obtained face-scanning measures from a different eye-tracking task that was also administered at 7 months, we found no significant relationship between face scanning and the amount of time the high-risk infants spent engaging with the face during the pop-out task or with their performance on the face recognition task (see Supplementary Materials). We did find a trend towards a relationship between the proportion of time spent looking at the eyes during the face-scanning task and both face engagement and performance on the difficult items of the face recognition task in the high-risk siblings. Possibly those high-risk infants who look proportionately longer at the face during the pop-out task at 7 months and who perform poorly on the difficult items of the face recognition task at 3 years have a problem with processing information from the eyes (which are the main invariant features of the face in the difficult items). However, considering that this study was not originally designed to investigate the relationship between early face-scanning strategies and face recognition performance in toddlerhood, this finding needs to be replicated in a separate study using a bigger sample.
It has been suggested that the development of face-processing abilities is supported by an experience-expectant process, whereby exposure to faces during a sensitive period of development leads to perceptual and cortical specialization (Nelson, 2001). However, it is currently unknown how long this sensitive period lasts. It is therefore possible that a stronger relationship between face engagement and face recognition performance might have emerged if we had measured infants' face engagement at an earlier age. Future studies should investigate this possibility by obtaining measures of high-risk infants' visual engagement with faces before 7 months and relating these to their face-processing abilities later in life.

Conclusions
Together with a previous study on the same cohort (Elsabbagh et al., 2013a), our findings contradict the prevailing idea that face-processing difficulties in ASD result from the absence of a bias to look at faces. We conclude that infants at risk for ASD do not lack an attraction to or actively avoid faces, but rather seem to experience difficulties with processing faces (and possibly specifically the eyes) from early in life resulting in problems in face recognition memory that are evident in toddlerhood.

Risk status, behavioural assessment, and outcome groups
High-risk siblings were recruited through BASIS and had an older brother or sister (proband) with a clinical diagnosis of an ASD from a UK clinician based on ICD-10 criteria (World Health Organization, 1992). Low-risk controls were recruited from a volunteer database at the Birkbeck Centre for Brain and Cognitive Development and had at least one typically developing older brother or sister. Infants were only included in the control group if they did not have a first- or second-degree relative with ASD as confirmed by a parent interview about the family medical history. Two expert clinicians confirmed the proband diagnosis using the Development and Wellbeing Assessment (DAWBA) and the parent-report Social Communication Questionnaire (SCQ). Most probands of the 44 high-risk participants who were included in the face recognition task met criteria for ASD on both the DAWBA and SCQ (n = 40). Even though a small number of probands scored below threshold on the SCQ (n = 4), no exclusions were made as these probands scored above threshold on the DAWBA and their diagnosis was confirmed by expert opinion. Parent-reported family medical histories were examined for significant medical conditions in the proband or extended family members, but no exclusions were made on this basis.
An independent team at the Centre for Research in Autism and Education, Institute of Education performed the behavioural assessments during the 3-year visit. Children were assessed on the Mullen Scales of Early Learning (MSEL; Mullen, 1995) to obtain a measure of general intelligence. The Autism Diagnostic Observation Schedule Generic (ADOS-G; Lord et al., 2000), a semistructured, standardized observation assessment, was administered in both groups to obtain information about the children's social behaviour, use of vocalizations/speech and gesture in social situations, and play and interests. The ADOS-G sessions were double coded and the experimenters agreed consensus codes. Parents of the high-risk siblings also completed the Autism Diagnostic Interview-Revised (ADI-R; Lord et al., 1994). Children were included in the Sib-ASD group if they met ICD-10 (World Health Organization, 1992) criteria for ASD. Given the young age of the children, and in line with the proposed changes to DSM-V, no attempt was made to assign specific sub-categories of PDD/ASD diagnosis.¹ Children from the high-risk group were considered typically developing if they (i) did not meet ICD-10 criteria for an ASD; (ii) did not score above the ASD cut-off on the ADOS or ADI; (iii) scored within 1.5 SD of the population mean on the MSEL ELC standard score (>77.5) and RL and EL subscale T scores (>35). High-risk siblings were considered to have other developmental concerns if they did not fall into either of the above groups. That is, they either scored above the ADOS or ADI cut-off for ASD or scored more than 1.5 SD below the population mean on the Mullen ELC or RL and EL, but did not meet ICD-10 criteria for an ASD. Of the 44 high-risk participants who were included in the face recognition task, 14 were classified as sib-ASD, 19 were sib-TD and 11 were in the sib-Other concerns group (9 scoring above the ADOS ASD cut-off, 1 scoring above the ADOS ASD cut-off and below the 1.5 SD Mullen ELC cut-off, and 1 scoring below the 1.5 SD Mullen ELC cut-off).

¹ Clinical judgement is considered more accurate than instrument thresholds (even on so-called 'gold standard' measures), in particular for young children (Charman & Baird, 2002). Our approach to diagnosis at 3 years is consistent with other published studies on high-risk siblings (Ozonoff et al., 2011; Zwaigenbaum et al., 2009) and is in line with the recommendations of the Baby Sibs Research Consortium of which we are members.

It should be noted that the recurrence rate in the current study (31.8%) is higher than the 18.7% reported in a large consortium paper recently published by Ozonoff et al. (2011). This is likely to be the result of the modest size of the high-risk sample in the current study (N = 44). Whilst recurrence rates approaching 30% have been found in other moderate size samples (e.g., Landa, Holman & Garrett-Mayer, 2007; Paul, Fuerst, Ramsay, Chawarska & Klin, 2011), these rates are sample specific and will likely not be generalizable, as autism recurrence rates from larger samples converge between 10% and 20% (Constantino et al., 2010; Ozonoff et al., 2011). However, the present study used similar procedures to other familial risk studies by combining all information from standard diagnostic measures and clinical observation and arriving at a 'clinical best estimate' ICD-10 diagnosis.
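The outcome-group rules and the recurrence rate described above can be summarized in a short sketch. This is only an illustration of the stated criteria, not the procedure the study team used: the actual ICD-10 decision was a clinical best estimate, which is represented here by a single boolean. The function name, argument names and group labels are hypothetical.

```python
# Illustrative sketch of the outcome-group rules stated in the text.
# All names here are hypothetical; the ICD-10 decision itself was a
# clinical best estimate and is passed in as a boolean.

ELC_CUTOFF = 77.5      # MSEL ELC standard score, 1.5 SD below the mean
SUBSCALE_CUTOFF = 35   # RL and EL subscale T scores, 1.5 SD below the mean


def classify_high_risk(meets_icd10_asd, above_ados_or_adi_cutoff,
                       elc_standard, rl_t, el_t):
    """Return the outcome group for one high-risk sibling."""
    if meets_icd10_asd:
        return "sib-ASD"
    # Criterion (iii): ELC, RL and EL all within 1.5 SD of the mean
    mullen_in_range = (elc_standard > ELC_CUTOFF
                       and rl_t > SUBSCALE_CUTOFF
                       and el_t > SUBSCALE_CUTOFF)
    if not above_ados_or_adi_cutoff and mullen_in_range:
        return "sib-TD"
    # Everyone who falls into neither group above
    return "sib-Other"


def recurrence_rate(n_asd, n_total):
    """Percentage of high-risk siblings with an ASD outcome."""
    return 100.0 * n_asd / n_total


# Group sizes reported in the text: 14 sib-ASD, 19 sib-TD, 11 sib-Other
print(round(recurrence_rate(14, 14 + 19 + 11), 1))  # 31.8
```

The last line reproduces the reported recurrence rate from the group sizes given in the text (14 of 44 high-risk siblings).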

Face scanning at 3 years
The differences in face recognition performance between the high-risk siblings and low-risk controls seem to be mainly driven by differences in performance on the difficult items of the face recognition task in which the mouth changed between the familiarization and recognition phase (from a closed-mouth smile to an open-mouth smile or from neutral to a closed-mouth smile). Therefore, if the high-risk siblings focused more on the mouth area or failed to look at invariant features such as the eyes during the familiarization phase, this might have impaired their ability to recognize the face during the recognition phase. There is research that suggests that individuals with ASD indeed look less at the eyes (Dalton et al., 2005; Klin et al., 2002) and have an atypical focus on lower parts of the face (i.e. the mouth) when identifying faces (Joseph & Tanaka, 2003; Langdell, 1978). Additionally, individuals with ASD have been shown to fixate irrelevant features of the face such as the hairline and ears more, and core facial features such as the eyes and mouth less, than typically developing controls (Pelphrey et al., 2002). We therefore investigated whether the high-risk siblings differed from the low-risk controls with respect to the scanning of internal features of the face. Other studies have shown that individuals with ASD rely more on featural relative to configural face information compared to typically developing controls (Davies et al., 1994; Falck-Ytter, 2008; Hobson et al., 1988; but also see Weigelt et al., 2012). It has been suggested that because faces are generally quite similar, feature-based processing is not sufficient for the recognition of unfamiliar faces (Behrmann, Thomas & Humphreys, 2006). If the high-risk siblings indeed used a more featural instead of a configural processing style, this could explain their face recognition deficits in the current task.
Therefore we also investigated whether there were any differences between high-risk siblings and low-risk controls in the number of switches between facial features. We consider this measure a prerequisite to configural processing, as it has been suggested that eye movements between facial features are functional in obtaining information about the configuration of these features (Henderson, Williams & Falk, 2005). As no eye-tracking data were recorded during the face recognition task, we used the gaze data recorded when the children observed the still image of a face during a different task that was administered during the same testing session. We used these data to investigate differences in face-scanning patterns between the high-risk siblings and low-risk controls and their relationship with face recognition performance (for more information about the eye-tracking task see Elsabbagh et al., 2013b).

AOIs and Measures
Children were presented with four still images of two different female faces. The images were presented for 5 seconds during which the children freely scanned the faces. Areas of Interest (AOIs) were defined around the eyes, the nose, and the mouth, and around the whole face (see Figure A1). Children were only included in the analyses if they had at least 800 ms of total accumulated gaze data during the presentation of the four still faces; 36 siblings and 30 controls were included in the analyses. Several measures were calculated to investigate whether differences in face scanning underlie the high-risk siblings' face recognition difficulties. Only fixations with a minimum duration of 80 ms were included. To look at differences in scanning of the mouth and eye area we calculated an eye index (total looking time to the eyes / total looking time to the face) and a mouth index (total looking time to the mouth / total looking time to the face). To look at differences in the scanning of internal features we calculated a feature index (total looking time to the facial features, i.e. eyes, nose and mouth / total looking time to the face). To investigate differences in featural vs. configural face-scanning strategies we calculated a feature switch index (number of immediate switches between the facial features, i.e. eyes, nose and mouth / total looking time to the features). We only used proportional measures to ensure that potential differences in data quality did not influence our findings.
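The four indices defined above can be illustrated with a minimal sketch. It is not the analysis code used in the study and rests on several assumptions that the text does not specify: each fixation has already been assigned to one AOI label, every fixation in the list falls within the whole-face AOI, an "immediate switch" is a pair of consecutive retained fixations on two different features, and the switch index is expressed per second of feature looking. The class, function and label names are hypothetical.

```python
from dataclasses import dataclass

MIN_FIXATION_MS = 80                   # minimum fixation duration from the text
FEATURES = {"eyes", "nose", "mouth"}   # internal facial features


@dataclass
class Fixation:
    aoi: str            # "eyes", "nose", "mouth" or "face_other" (hypothetical labels)
    duration_ms: float


def scanning_indices(fixations):
    """Compute the eye, mouth, feature and feature switch indices.

    Assumes every fixation lies within the whole-face AOI.
    Returns None if no fixation survives the duration filter.
    """
    valid = [f for f in fixations if f.duration_ms >= MIN_FIXATION_MS]
    face_time = sum(f.duration_ms for f in valid)
    if face_time == 0:
        return None

    def time_on(aois):
        return sum(f.duration_ms for f in valid if f.aoi in aois)

    feature_time = time_on(FEATURES)
    # Immediate switch: consecutive fixations on two different features
    switches = sum(
        1 for a, b in zip(valid, valid[1:])
        if a.aoi in FEATURES and b.aoi in FEATURES and a.aoi != b.aoi
    )
    return {
        "eye_index": time_on({"eyes"}) / face_time,
        "mouth_index": time_on({"mouth"}) / face_time,
        "feature_index": feature_time / face_time,
        # Assumption: switches per second of looking at the features
        "feature_switch_index": (
            switches / (feature_time / 1000.0) if feature_time else None
        ),
    }


# Made-up fixation sequence for illustration only
example = [
    Fixation("eyes", 200), Fixation("mouth", 100), Fixation("eyes", 300),
    Fixation("face_other", 400), Fixation("nose", 50),  # 50 ms: excluded
]
print(scanning_indices(example))
```

Because all four measures are ratios, a child with less usable gaze data overall is not penalized relative to a child with more, which is the rationale the text gives for using proportional measures.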

Results
ANOVAs with the face-scanning measures as dependent variables, group (high-risk siblings and low-risk controls) as factor, and Mullen ELC as covariate demonstrated that the feature switch measure was the only variable with a significant group difference (see Table A1). The low-risk controls had a significantly higher proportional rate of switching between the facial features than the high-risk siblings. However, partial correlations between the feature switch index and performance on the face recognition task (controlling for MSEL ELC) demonstrated that this measure was not significantly related to the ability to recognize unfamiliar faces (see Table A1). None of the other face-scanning measures was significantly related to performance on the face recognition task either (see Table A1).
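For readers unfamiliar with partial correlations, the following is a minimal sketch of a first-order partial correlation, i.e. the correlation between two measures after removing the linear contribution of one covariate (here standing in for MSEL ELC). It uses the standard textbook formula and made-up numbers; it is not the software or data used in the study, and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(x, y, covariate):
    """Correlation between x and y controlling for one covariate.

    Standard first-order formula:
    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
    """
    r_xy = pearson(x, y)
    r_xz = pearson(x, covariate)
    r_yz = pearson(y, covariate)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Made-up values for illustration only (not study data)
switch_index = [1.0, 1.0, -1.0, -1.0]
recognition = [1.0, 2.0, -1.0, -2.0]
elc = [1.0, -1.0, 1.0, -1.0]   # uncorrelated with both measures here
print(partial_correlation(switch_index, recognition, elc))
```

In this toy example the covariate is uncorrelated with both measures, so the partial correlation equals the ordinary Pearson correlation; when the covariate is related to either measure, the two values diverge.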

Face scanning at 7 months
We found a negative correlation between proportion of time spent looking at the face during the pop-out task at 7 months and performance on difficult items of the face recognition task at 3 years in the high-risk siblings. We hypothesize that this increased engagement with the face might be the result of atypical face-processing strategies, which may be reflected in scanning differences. Possibly, the high-risk infants who engage with the face for longer have an atypical bias to attend to individual features of the face, which increases the processing time needed. Alternatively, early abnormalities in gaze behaviour of the high-risk children (e.g. a focus on irrelevant features of the face such as the hairline and ears) might reduce exposure to internal face features (Golarai et al., 2006) while also increasing the total amount of looking. As the images used in the pop-out task were too small to define separate AOIs around the facial features we investigated the face-scanning patterns of the same group of infants using gaze data that was recorded when they observed the still image of a face during a different task administered during the same testing session (for more information about the task see Elsabbagh et al., 2013b).

AOIs and Measures
Face-scanning measures were derived from gaze data recorded during the same task as was described above (see Figure A1 for the stimuli and AOIs). Infants were only included in the analyses if they had at least 800 ms of total accumulated gaze data during the presentation of the four still faces; 43 siblings and 49 controls were included in the analyses. Several measures were calculated to investigate whether there were any group differences in the face-scanning patterns and whether face scanning was related to the amount of face engagement during the pop-out task and performance on the difficult items of the face recognition task in the high-risk group. An eye index, mouth index, switch index and feature index were calculated as described above (see Face scanning at 3 years). Again, only fixations with a minimum duration of 80 ms were included.

Results
ANOVAs with face-scanning measures as dependent variables and group (high-risk siblings and low-risk controls) as a factor demonstrated that there were no significant group differences in the way the children scanned the faces. Additionally, correlation analyses demonstrated that none of the face-scanning measures was significantly related to the amount of time spent orienting towards the face during the pop-out task or to performance on the difficult items of the face recognition task in the high-risk siblings (see Table A2). We found a trend towards a relationship between the proportion of time spent looking at the eyes during the face-scanning task and face engagement in the pop-out task in the high-risk siblings, r = .303, p = .082. We also found a trend towards a negative relationship between this measure and performance on the difficult items of the face recognition task in this group, r = −.260, p = .131.

Table A1 Analyses of group differences in face scanning measures at 3 years and its relationship with total percentage correct on the face recognition task (controlling for Mullen ELC).
Columns: Measure; Group differences; Correlation with Face Recognition task (High-risk, Low-risk)

Table A2 Analyses of group differences in face scanning measures at 7 months and its relationship with face engagement and performance on the difficult items of the face recognition task in the high-risk group.
Columns: Measure; Group differences; Correlations high-risk group