Does Gaze Direction Modulate Facial Expression Processing in Children With Autism Spectrum Disorder?

We would like to acknowledge all the children, their parents, and the teachers of Musashino Higashi Gakuen. We thank Rie Fukumoto and the other staff members for their assistance in data collection and thank Koki Ikeda, Kikue Sakaguchi, and all other members of Hasegawa Lab for their support and helpful discussions. This study was supported by the Japan Society for the Promotion of Science (JSPS), 21st Century COE Program J05, “Center for Evolutionary Cognitive Sciences at the University of Tokyo,” and JSPS KAKENHI (19330210). A.S. was also supported by the ESRC Research Fellowship (RES-063-27-0207).

Correspondence concerning this article should be addressed to Atsushi Senju, School of Psychology, Centre for Brain and Cognitive Development, Birkbeck, University of London, Malet Street, London WC1E 7HX, United Kingdom. Electronic mail may be sent to a.senju@bbk.ac.uk.

Abstract

Two experiments investigated whether children with autism spectrum disorder (ASD) integrate relevant communicative signals, such as gaze direction, when decoding a facial expression. In Experiment 1, typically developing children (9–14 years old; n = 14) were faster at detecting a facial expression accompanying a gaze direction with a congruent motivational tendency (i.e., an avoidant facial expression with averted eye gaze) than those with an incongruent motivational tendency. Children with ASD (9–14 years old; n = 14) were not affected by the gaze direction of facial stimuli. This finding was replicated in Experiment 2, which presented only the eye region of the face to typically developing children (n = 10) and children with ASD (n = 10). These results demonstrated that children with ASD do not encode and/or integrate multiple communicative signals based on their affective or motivational tendency.

Facial expression conveys information about another’s current emotional state, which is among the essential signals for decoding others’ mental states. Humans are sensitive to others’ facial expressions very early in their ontogeny. Even newborns imitate adults’ facial expressions (Field, Woodson, Greenberg, & Cohen, 1982; Field et al., 1983) and infants aged 4–7 months can discriminate a number of facial expressions (Caron, Caron, & MacLean, 1988; Nelson, Morse, & Leavitt, 1979; Serrano, Iglesias, & Loeches, 1992). The performance of facial expression recognition and labeling increases with age through the preschool and grade school years (Camras & Allison, 1985; Harrigan, 1984; Odom & Lemond, 1972), and this increase continues in adolescence (see Herba & Phillips, 2004, for a review).

In addition to the investigation of typical development, scientists have been exploring the atypical development of facial expression recognition in various developmental disorders in order to understand the nature of the difficulties in social interaction and communication that are associated with these disorders. In particular, studies have focused on whether individuals with autism spectrum disorder (ASD), who suffer from severe challenges in social interaction and communication, are impaired at processing facial expressions. Although some studies have demonstrated that individuals with ASD have difficulty labeling others’ facial expressions of emotion and matching facial expressions with vocal expressions (Hobson, 1986a, 1986b; Tantam, Monaghan, Nicholson, & Stirling, 1989), other studies have found no such differences when individuals with ASD were matched with typically developing (TD) individuals on verbal IQ (Braverman, Fein, Lucci, & Waterhouse, 1989; Ozonoff, Pennington, & Rogers, 1990; Prior, Dahlstrom, & Squires, 1990) or mental age (Blair, 2003). Thus, it is still unclear whether children with ASD have a deficit in facial expression recognition.

Another line of research has revealed that there may be qualitative differences between individuals with ASD and TD individuals (matched by verbal IQ) in processing facial expressions, even when they can perform simple labeling or matching tasks. For example, Weeks and Hobson (1987) asked children with and without ASD to freely sort pictures of human faces, and they found that TD children spontaneously sorted faces according to their facial expressions. Children with ASD, on the other hand, sorted faces according to accessories such as hats. Grossman, Klin, Carter, and Volkmar (2000) reported that even though children with ASD did not show impairments on a task of simple facial expression recognition compared with TD children, the presence of emotionally incongruent affective words interfered with the labeling of facial expressions in children with ASD but not in TD children. This result suggests that children with ASD may use a verbally mediated strategy to recognize facial expressions. In addition, McIntosh, Reichmann-Decker, Winkielman, and Wilbarger (2006) demonstrated that children with ASD do not spontaneously mimic others’ facial expressions, even though they can imitate them when instructed. This result may reflect a lack of spontaneous processing of facial expressions, because children with ASD do show spontaneous mimicry of facial expressions when they are instructed to process the sex of the face (Magnée, de Gelder, van Engeland, & Kemner, 2007). These studies imply that the way individuals with ASD process facial expressions may differ from that of TD individuals.

Recently, Adams and Kleck (2003) reported that in TD adults, facial expression is processed in conjunction with eye gaze direction. Their study found that facially communicated approach-oriented expressions (e.g., anger and joy) are recognized more quickly when they are coupled with a direct gaze. In contrast, facially communicated avoidance-oriented expressions (e.g., fear and sadness) are decoded faster when they are coupled with an averted eye gaze. This study suggests that facial expression is processed based on motivational tendency in conjunction with other communicative signals such as eye gaze. Similarly, Ganel, Goshen-Gottstein, and Goodale (2005) reported that the latency to discriminate a facial expression was affected by task-irrelevant changes in eye gaze direction in typically developed adults. Brain imaging studies with adults also examined the neural substrates for the combined processing of facial expression and eye gaze, and revealed that the activation of the amygdala is correlated with the combined affective or motivational information of facial expression and eye gaze (Adams, Gordon, Baird, Ambady, & Kleck, 2003; Sato, Yoshikawa, Kochiyama, & Matsumura, 2004).

To date, little is known about the developmental trajectory of such integration of gaze direction and facial expression. It may occur very early in ontogeny because even in 4-month-old infants, gaze direction influenced event-related potentials in response to an angry facial expression (Striano, Kopp, Grossmann, & Reid, 2006). Barnes, Kaplan, and Vaidya (2007) also reported that facial expression affected gaze processing in 10- to 13-year-old children. However, it is not clear whether such integration found in children (Barnes et al., 2007) and infants (Striano et al., 2006) is based on the motivational tendency seen in adults. It is also possible that infants and young children have not yet developed specialized social brain networks that process different aspects of facial information in separate cortical regions (e.g., Farroni & Senju, in press; Grossmann & Johnson, 2007), and thus both gaze direction and facial expression activate widespread cortical regions that have not yet specialized for specific aspects of face information. Thus, it is critical to examine the mechanism underlying the integration of different aspects of facial information in children in order to understand the functional development of the social brain network (Brothers, 1990; Grossmann & Johnson, 2007) that efficiently integrates different aspects of face perception processed in the different cortical and subcortical structures in adults (Haxby, Hoffman, & Gobbini, 2002; Johnson, 2005).

Impairment in the use of eye contact for nonverbal communication is among the clinical symptoms of autism (American Psychiatric Association, 1994); however, it may not originate from the basic impairment in encoding gaze direction. Recent studies have demonstrated that individuals with ASD and TD individuals are equally adept at discriminating eye gaze direction (Baron-Cohen, Campbell, Karmiloff-Smith, Grant, & Walker, 1995; Leekam, Baron-Cohen, Perrett, Milders, & Brown, 1997). Individuals with ASD even shift their attention reflexively to the direction of the eye gaze of others (Kemner, Schuller, & van Engeland, 2006; Kylliäinen & Hietanen, 2004; Okada, Sato, Murai, Kubota, & Toichi, 2003; Senju, Tojo, Dairoku, & Hasegawa, 2004; Swettenham, Condie, Campbell, Milne, & Coleman, 2003; Vlamings, Stauder, van Son, & Mottron, 2005; see also Johnson et al., 2005; Ristic et al., 2005). Moreover, Kylliäinen and Hietanen (2006) demonstrated that children with ASD showed higher autonomic arousal in response to looming faces with direct gaze than those with averted gaze. This study suggests that individuals with ASD respond effectively to others’ gaze direction in some contexts, although it is possible that such sensitivity to direct gaze may be based on atypical cognitive mechanisms (Senju, Kikuchi, Hasegawa, Tojo, & Osanai, 2008). However, it is not clear whether individuals with ASD spontaneously integrate gaze direction with social and communicative contexts. For example, Pelphrey, Morris, and McCarthy (2005) revealed that individuals with ASD do not encode the congruence between the direction of others’ eye gaze and the location of the object being looked at. Baron-Cohen, Baldwin, and Crowson (1997) also demonstrated that children with ASD did not refer to others’ gaze direction to map a name to a new object. These studies suggest that individuals with ASD do not encode the referential property of others’ eye gaze.

In this study, two experiments were conducted to investigate whether children with ASD and TD children integrate facial expressions and eye gaze direction based on an affective or motivational tendency. Following Adams and Kleck (2003), we used facial stimuli with an approach-oriented expression (anger) and an avoidance-oriented expression (fear), both with either an averted or direct gaze. Experiment 1 used the whole face as stimuli, while Experiment 2 used only the eye region. Participants were then asked to discriminate facial expressions and press the corresponding key as soon as possible. We did not predict an overall group difference in the accuracy or reaction time (RT) between groups because previous studies found no group differences when they were properly matched (Braverman et al., 1989; Ozonoff et al., 1990; Prior et al., 1990). However, we predicted that gaze direction should not affect the performance of facial expression discrimination in children with ASD for the following reasons: First, as reviewed earlier, they may not encode facial expressions or gaze direction in terms of an affective or motivational tendency. Second, the amygdala, the region reportedly responsible for the integration of facial expression and gaze direction (Adams et al., 2003; Sato et al., 2004), is atypical in individuals with ASD, both in structure (Aylward et al., 1999; Howard et al., 2000; Schumann et al., 2004; Sparks et al., 2002) and function (Baron-Cohen et al., 1999; Critchley et al., 2000; Pierce, Müller, Ambrose, Allen, & Courchesne, 2001). On the other hand, it was predicted that TD children would display the expected interaction between facial expression and gaze direction that was reported in TD adults (Adams & Kleck, 2003) because previous studies have suggested that children (Barnes et al., 2007) and even infants (Striano et al., 2006) can integrate facial expression and gaze direction in other tasks.

Experiment 1

In Experiment 1, children with and without ASD were asked to discriminate the facial expression (anger or fear) of the facial stimuli presented on the computer screen and press the corresponding key as soon as possible. To examine the effect of congruency between gaze direction and facial expression, we used face stimuli with a direct or averted gaze.

Method

Participants.  Fourteen children with ASD (4 females) and 14 TD children (4 females) participated in Experiment 1. Children with ASD included in the final analyses had been diagnosed with Autistic Disorder (10) or Pervasive Developmental Disorder–Not Otherwise Specified (PDD–NOS; 4) by at least one child psychiatrist, clinical psychologist, or pediatrician. To confirm their clinical manifestation, the participants’ parents all completed the Japanese version of the Autism Screening Questionnaire (ASQ–J; Berument, Rutter, Lord, Pickles, & Bailey, 1999; Dairoku, Senju, Hayashi, Tojo, & Ichikawa, 2004). An abbreviated version of the Japanese Wechsler Intelligence Scale for Children–Third Edition (WISC–III; Wechsler, 1992; Japanese WISC–III Publication Committee, 1998) was also administered to measure IQ. The demographic background of the participants is presented in Table 1. All the children had normal or corrected-to-normal visual acuity. Written informed consent was obtained from all the children and their parents, and this study was approved by the Research Ethics Committee of the University of Tokyo.

Table 1.
Means, Standard Deviations, and Ranges of Chronological Age, Full Intelligence Quotient (FIQ), Verbal Scaled Scores (VSS), Performance Scaled Scores (PSS), and Scores on the Japanese Version of the Autism Screening Questionnaire (ASQ–J) of Participants in Experiments 1 and 2
                Age (years)   FIQ            VSS          PSS          ASQ–J
Experiment 1
 ASD (n = 14)
  M (SD)        12.1 (2.0)    98.9 (16.1)    10.5 (3.1)   9.1 (3.0)    18.4 (5.8)
  Range         9.2–14.8      70–127         5–15         5–14         10–27
 TD (n = 14)
  M (SD)        11.9 (1.9)    101.3 (12.8)   11.4 (2.4)   9.1 (2.9)    3.4 (3.8)
  Range         9.0–14.9      85–124         8–16         5–14         0–12
Experiment 2
 ASD (n = 10)
  M (SD)        12.4 (2.2)    98.8 (20.6)    10.2 (4.3)   9.4 (3.1)    20.5 (5.6)
  Range         9.7–15.7      61–127         3–16         4–14         10–28
 TD (n = 10)
  M (SD)        11.3 (1.7)    106.9 (15.5)   11.1 (3.2)   11.2 (3.1)   1.7 (2.0)
  Range         9.8–14.1      91–142         8–19         8–17         0–5
Note. ASD = autism spectrum disorder; TD = typically developing.

Apparatus and stimuli.  Stimulus presentation and data collection were controlled on a PC with a 17-in. color monitor using E-Prime and the PST Serial Response Box (Psychology Software Tools, Pittsburgh, PA). The participants were seated approximately 60 cm from the monitor. The fixation point consisting of a central cross that subtended 1.5° appeared on the screen and the children were instructed to fixate on it before the experiment began. Facial photographs of six Caucasian models (3 males and 3 females from the Pictures of Facial Affect; Ekman & Friesen, 1976) and six Japanese models (3 males and 3 females from the Facial Information Norm Database [FIND]; Watanabe et al., 2007) were used to create the stimuli as follows: First, two photographs of each model, one with an angry and the other with a fearful expression, were selected from the database. Note that all the photographs were with direct gaze. Then, the direction of eye gaze was manipulated with Adobe Photoshop to create stimuli with leftward and rightward gaze directions for the same images. All the photographs were in grayscale and were cut into ovals (Figure 1a; 8.8° high and 6.6° wide; eyes 0.6° high and 1.3° wide; pupils 0.6°).

Figure 1.

 Example of stimuli used in Experiments 1 and 2. Top left: angry expression with direct gaze, Top right: angry expression with averted gaze, Bottom left: fearful expression with direct gaze, Bottom right: fearful expression with averted gaze.

In order to match the number of trials for each gaze condition (direct or averted), the stimuli with direct gaze were presented twice. The stimuli with averted gaze were presented once for each, which yielded two trials for the averted gaze condition (one with leftward gaze and one with rightward gaze). In total, the experiment consisted of 96 trials.
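The trial count described above can be checked with a short sketch. The Python below is illustrative only: the names (`N_MODELS`, `GAZE_PRESENTATIONS`, `build_trials`) are hypothetical, and integer indices stand in for the actual photographs. It assumes 12 models, two expressions per model, two presentations of each direct-gaze photograph, and one presentation each of the leftward and rightward versions.

```python
from itertools import product

N_MODELS = 12  # 6 Caucasian + 6 Japanese models (indices are placeholders)
EXPRESSIONS = ("anger", "fear")

# Each direct-gaze photograph is shown twice; each averted version
# (leftward, rightward) is shown once, so both conditions get 2 trials.
GAZE_PRESENTATIONS = (
    ("direct", "straight"),
    ("direct", "straight"),
    ("averted", "left"),
    ("averted", "right"),
)


def build_trials():
    """Return one dict per trial for the full (unshuffled) trial list."""
    trials = []
    for model, expression, (gaze, direction) in product(
        range(N_MODELS), EXPRESSIONS, GAZE_PRESENTATIONS
    ):
        trials.append(
            {
                "model": model,
                "expression": expression,
                "gaze": gaze,
                "direction": direction,
            }
        )
    return trials


trials = build_trials()
n_direct = sum(t["gaze"] == "direct" for t in trials)
n_averted = sum(t["gaze"] == "averted" for t in trials)
print(len(trials), n_direct, n_averted)
```

Under these assumptions the list contains 96 trials, split evenly between the direct and averted conditions.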

Procedure.  Each trial commenced with the central presentation of the fixation point. After 750 ms, the fixation point was replaced with a facial stimulus, which was presented on the center of the monitor with the middle of the nose presented at the same location as the fixation point. The participants were instructed to discriminate whether the facial expression was one of anger or fear and to press the corresponding key (left or right) as soon as possible. The correspondence between the particular facial expression and key was counterbalanced between participants. The facial stimulus was presented until the initiation of the participant’s response or for 5 s. Then, a blank screen was presented for 1 s before the beginning of the next trial (Figure 2). The testing consisted of one practice block and four test blocks. The practice block consisted of 8 trials, and each test block consisted of 24 trials. The eight facial stimuli used in the practice block were randomly selected from the 96 facial stimuli. The presentation order of each trial was randomized across blocks and participants.

Figure 2.

 An example of the stimulus sequence in Experiment 1. This figure depicts the “fear” and “averted” condition.
Note. ITI = intertrial interval.

Design.  The experiment consisted of three factors: group (ASD or TD), emotion (anger or fear), and gaze direction (direct or averted). Group was a between-subject factor, and emotion and gaze direction were within-subject factors.

Results

There was no significant group difference in full intelligence quotient (FIQ; t = 0.43, p = .671), verbal scaled scores (VSS; t = 0.82, p = .420), performance scaled scores (PSS; t = 0.06, p = .950), chronological age (t = 0.29, p = .777), or sex ratio (χ2 = 0, p = 1). There was a significant group difference in the ASQ–J score (t = 8.11, p < .001).

The mean RTs and error rates were compared by a three-way mixed analysis of variance (ANOVA) with group (ASD or TD) as the between-subject factor and emotion (anger or fear) and gaze direction (direct or averted) as within-subject factors. The trials with RTs of more than 2 SD above or below the mean of each individual (4.17%) and trials resulting in incorrect responses (18.82%) were excluded from the RT analysis.
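The exclusion rule for the RT analysis can be sketched as follows. This is a minimal illustration, not the authors' analysis script: the function name `trim_rts` is hypothetical, and two details the text does not specify are assumed here, namely that the mean and SD are computed over all of a participant's trials and that the sample (n − 1) standard deviation is used.

```python
from statistics import mean, stdev


def trim_rts(trials, n_sd=2.0):
    """Keep RTs from correct trials lying within n_sd SDs of the
    participant's own mean RT.

    trials: list of (rt_ms, correct) tuples for ONE participant.
    Assumption: mean and SD are taken over all of that participant's trials.
    """
    rts = [rt for rt, _ in trials]
    m, sd = mean(rts), stdev(rts)
    lower, upper = m - n_sd * sd, m + n_sd * sd
    return [rt for rt, correct in trials if correct and lower <= rt <= upper]


# Made-up data: one incorrect trial (960 ms) and one very slow trial (4000 ms).
example = [
    (900, True), (920, True), (940, True), (960, False), (980, True),
    (1000, True), (1020, True), (1040, True), (1060, True), (4000, True),
]
kept = trim_rts(example)
print(kept)
```

In this made-up example the 4,000-ms trial falls more than 2 SD above the participant's mean and is dropped, along with the incorrect trial.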

The mean RTs are presented in Figure 3a. The main effect of emotion was significant, F(1, 26) = 5.51, p = .027, ηp2 = .17, because fear was recognized more slowly than anger. A three-way interaction between group, emotion, and gaze direction was significant, F(1, 26) = 5.60, p = .026, ηp2 = .18. As predicted, post hoc simple effect analysis revealed that there was a significant interaction between emotion and gaze direction in the TD group, F(1, 26) = 4.89, p = .036, ηp2 = .16, but not in the ASD group, F(1, 26) = 1.29, p = .267, ηp2 = .05. Although the interaction between emotion and gaze direction in the TD group was significant, the simple effect of gaze was not significant for either anger, F(1, 52) = 2.29, p = .137, ηp2 = .04, or fear, F(1, 52) = 1.67, p = .202, ηp2 = .03. For error rates, no main effects or interactions reached significance (all Fs < 4.20, all ps > .05).
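The partial eta squared values reported with each F test can be recovered from the F statistic and its degrees of freedom, using the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it to three of the values reported above; the function name is hypothetical, and small discrepancies are possible because the published F values are rounded.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values and degrees of freedom as reported for Experiment 1.
emotion_main = round(partial_eta_squared(5.51, 1, 26), 2)
three_way = round(partial_eta_squared(5.60, 1, 26), 2)
td_interaction = round(partial_eta_squared(4.89, 1, 26), 2)
print(emotion_main, three_way, td_interaction)
```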

Figure 3.

 Mean reaction times for correct responses (a) and congruency effect (b) in Experiment 1. Error bar: standard errors.
Note. ASD = autism spectrum disorder; TD = typically developing.
*p < .05.

To further examine the individual differences in the integration of facial expression and gaze direction, the “congruency effect” was calculated for each participant by subtracting RTs for stimuli with congruent motivational tendency (angry expression with direct gaze and fearful expression with averted gaze) from RTs for stimuli with incongruent motivational tendency (angry expression with averted gaze and fearful expression with direct gaze). The congruency effect was significantly larger in the TD group than in the ASD group (Figure 3b, t = 2.37, p = .026, d = 0.89). The congruency effect in the TD group was significantly higher than zero (t = 2.78; p = .016, d = 1.54) whereas in the ASD group the congruency effect did not differ from zero (t = 0.97; p = .449, d = 0.54). The congruency effect was not significantly correlated with the ASQ–J scores in the ASD group (r = −.06, p = .844) or in the TD group (r = −.36, p = .204). In addition, the correlation between the congruency effect and verbal skills did not reach significance in the ASD group (r = −.18, p = .531) or in the TD group (r = .14, p = .634).
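The congruency effect defined above is a simple difference score, sketched below. The function name and the dictionary layout are hypothetical, and the sketch assumes that the two cells in each category are weighted equally. In the study the score was computed per participant; here it is applied to the Experiment 1 group means from Table 2 purely as an illustration, so the outputs are not the per-participant statistics reported in the text.

```python
def congruency_effect(rt):
    """Incongruent minus congruent mean RT (ms).

    rt maps (expression, gaze) -> mean RT in ms.
    Congruent:   anger + direct gaze, fear + averted gaze.
    Incongruent: anger + averted gaze, fear + direct gaze.
    Assumption: the two cells in each category are weighted equally.
    """
    incongruent = (rt[("anger", "averted")] + rt[("fear", "direct")]) / 2
    congruent = (rt[("anger", "direct")] + rt[("fear", "averted")]) / 2
    return incongruent - congruent


# Group mean RTs (ms) for Experiment 1, taken from Table 2.
td_means = {
    ("anger", "direct"): 1005, ("anger", "averted"): 1071,
    ("fear", "direct"): 1161, ("fear", "averted"): 1105,
}
asd_means = {
    ("anger", "direct"): 956, ("anger", "averted"): 940,
    ("fear", "direct"): 1005, ("fear", "averted"): 1052,
}
td_effect = congruency_effect(td_means)
asd_effect = congruency_effect(asd_means)
print(td_effect, asd_effect)
```

A positive score means slower responses to incongruent than to congruent stimuli, which is the pattern the group means show for the TD group but not for the ASD group.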

Discussion

In the TD group, gaze direction modulated the speed of recognition of facial expressions, which replicated the results by Adams and Kleck (2003). On the other hand, in the ASD group, gaze direction did not affect the recognition latency of a facial expression. This result cannot be attributed to impairment in recognizing facial expressions per se because the overall accuracy and RTs did not differ between groups. It is also very unlikely that the children with ASD who participated in the current study cannot encode eye gaze direction per se because previous studies have demonstrated that children with ASD have no impairments in discriminating gaze direction (Baron-Cohen et al., 1995; Leekam et al., 1997). Thus, the current results appear to suggest that children with ASD do not integrate affective or motivational tendency communicated by facial expression and gaze direction in the same way that TD children do.

However, there is still another possible explanation for why gaze direction did not affect facial expression recognition in children with ASD. Recent eye-tracking studies report that individuals with ASD spend less time fixating on the eye region than TD individuals during the observation of a face (Dalton et al., 2005; Klin, Jones, Schultz, Volkmar, & Cohen, 2002; Neumann, Spezio, Piven, & Adolphs, 2006; Pelphrey et al., 2002; Spezio, Adolphs, Hurley, & Piven, 2007; see also van der Geest, Kemner, Verbaten, & van Engeland, 2002). In addition, Dalton et al. (2005) reported that in individuals with ASD, individual differences in the amygdala activation in response to a face were positively correlated with the duration of fixation on the eyes. These results suggest that the outcome in the current experiment may simply reflect fewer fixations on the eye region rather than a reduced capacity to integrate facial expression and gaze direction. Thus, in Experiment 2, we presented the eye regions of the face stimuli rather than the whole face, to help children turn their attention to the eye region.

Experiment 2

Experiment 2 presented the eye regions of the face stimuli rather than the whole face used in Experiment 1 in order to control the fixation of each participant. Although it has been suggested in some previous studies that individuals with ASD have difficulty encoding others’ emotional and mental states from the expression of the eye region alone (Baron-Cohen, Baldwin, & Crowson, 1997; Baron-Cohen, Wheelwright, Hill, Raste, & Plumb, 2001; Senju, Tojo, Konno, Dairoku, & Hasegawa, 2002), other studies suggest that individuals with ASD use information from the eye region like TD individuals to process facial expressions (e.g., Back, Ropar, & Mitchell, 2007). Thus, we did not make a specific prediction about any overall group differences in the recognition of expression from the eyes.

If the results of Experiment 1 can be fully explained by fewer fixations on the eye region in children with ASD, their performance in facial expression recognition should be affected by the gaze direction of the stimuli in Experiment 2. On the other hand, if children with ASD are impaired at integrating the affective or motivational tendency from a facial expression and gaze direction, their performance should, as in Experiment 1, show no interaction between facial expression and gaze direction.

Method

Participants.  Ten children with ASD (3 females) and 10 TD children (3 females) participated in Experiment 2. Children with ASD included in the final analyses had been diagnosed with Autistic Disorder (7) or PDD–NOS (3) by at least one child psychiatrist, clinical psychologist, or pediatrician. Similar to Experiment 1, the parents of all the participants completed the ASQ–J, and an abbreviated version of the Japanese WISC–III was administered to measure IQ. The demographic background of the participants is presented in Table 1. All of the children had normal or corrected-to-normal visual acuity. As in Experiment 1, written informed consent was obtained from all the children and their parents.

Table 2.
Means and Standard Deviations of Reaction Times (ms) and Error Rates (%E) for Experiments 1 and 2
                Anger                       Fear
                Direct        Averted       Direct        Averted
Experiment 1
 ASD (n = 14)
  M (SD)        956 (272)     940 (270)     1,005 (265)   1,052 (334)
  %E            18            21            24            21
 TD (n = 14)
  M (SD)        1,005 (229)   1,071 (256)   1,161 (385)   1,105 (369)
  %E            15            21            15            16
Experiment 2
 ASD (n = 10)
  M (SD)        942 (238)     943 (200)     972 (199)     1,010 (217)
  %E            8             9             29            24
 TD (n = 10)
  M (SD)        858 (165)     899 (288)     1,020 (242)   904 (147)
  %E            17            24            17            18
Note. ASD = autism spectrum disorder; TD = typically developing.

Apparatus and stimuli.  The apparatus and stimuli were similar to Experiment 1 except that the eye regions of the faces, instead of whole faces, were presented as stimuli. The eye regions were cut into a rectangle (5.0° × 10.0°; eyes 0.8° high and 1.7° wide; pupils 0.8°) from the grayscale facial photographs of 10 Caucasian models (5 males and 5 females from the Pictures of Facial Affect; Ekman & Friesen, 1976) to create the stimuli. As in Experiment 1, two photographs for each model, one with an angry expression and the other with a fearful expression, were selected from the database. These images were then edited with Adobe Photoshop to create different gaze directions and presented twice in the averted gaze direction (once with leftward gaze and once with rightward gaze) and twice in the direct gaze direction. Experiment 2 thus consisted of 80 trials (10 models × 2 expressions × 4 presentations).

Procedure and design.  The procedure and design were similar to those in Experiment 1 except that the test block consisted of two blocks, and each block was composed of 40 trials.

Results

There was no significant group difference in FIQ score (t = 1.00, p = .334), VSS (t = 0.53, p = .600), PSS (t = 1.30, p = .209), chronological age (t = 1.26, p = .224), or sex ratio (χ2 = 1.81, p = .178). There was a significant group difference in the ASQ–J score (t = 9.99, p < .001).

As in Experiment 1, the mean RTs and error rates were compared by a three-way mixed ANOVA with group (ASD or TD) as the between-subject factor and emotion (anger or fear) and gaze direction (direct or averted) as within-subject factors. Trials with RTs of more than 2 SD above or below the mean of each individual (4.19%) and trials resulting in incorrect responses (17.38%) were excluded from the RT analysis.

The mean RTs are presented in Figure 4a. No main effects, including group differences, reached significance (all Fs < 3.14, all ps > .05). As predicted, the three-way interaction between group, emotion, and gaze direction was significant, F(1, 18) = 5.18, p = .035, ηp2 = .22. As in Experiment 1, post hoc simple effect analyses revealed that there was a significant interaction between emotion and gaze direction in the TD group, F(1, 18) = 6.82, p = .018, ηp2 = .27, but not in the ASD group, F(1, 18) = 0.37, p = .551, ηp2 = .02. In the TD group, the simple main effect of eye gaze was significant for fear, F(1, 36) = 10.62, p = .002, ηp2 = .23, but not for anger, F(1, 36) = 1.36, p = .252, ηp2 = .04.

Figure 4.

 Mean reaction times for correct responses (a) and congruency effect (b) in Experiment 2. Error bar: standard errors.
Note. ASD = autism spectrum disorder; TD = typically developing.
*p < .05.

For error rates, the main effect of emotion was significant, F(1, 18) = 6.77, p = .018, ηp2 = .27. The interaction between group and emotion was also significant, F(1, 18) = 14.21, p = .0014, ηp2 = .44. Post hoc simple effect analysis revealed that children with ASD made more errors in recognizing fear (M = 0.263, SD = 0.144) than anger (M = 0.085, SD = 0.075), F(1, 18) = 20.30, p = .0003, ηp2 = .53, whereas TD children were equally accurate in recognizing angry and fearful facial expressions, F(1, 18) = 0.68, p = .420, ηp2 = .04.

The “congruency effect” was calculated for each participant as in Experiment 1 by subtracting RTs for stimuli with congruent motivational tendency (angry expression with direct gaze and fearful expression with averted gaze) from RTs for stimuli with incongruent motivational tendency (angry expression with averted gaze and fearful expression with direct gaze). The congruency effect was significantly larger in the TD group than in the ASD group (Figure 4b, t = 2.28, p = .035, d = 1.02). The congruency effect in the TD group was marginally higher than zero (t = 2.20; p = .055, d = 1.47), but that in the ASD group was not significantly different from zero (t = 0.79; p = .449, d = 0.53). In addition, the congruency effect in the ASD group was negatively correlated with the ASQ–J scores (r = −.68, p = .032) but not in the TD group (r = −.38, p = .278). The congruency effect was not significantly correlated with VSS in the ASD group (r = .12, p = .748) or in the TD group (r = −.11, p = .766).
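As a consistency check on the effect sizes above, the sketch below converts the reported t values to Cohen's d. The conversions used here are an inference from the reported numbers, not formulas stated in the article: the one-sample values match d = 2t / √df, and the between-group value matches d = t × √(1/n1 + 1/n2). Both function names are hypothetical.

```python
from math import sqrt


def d_from_one_sample_t(t_value, n):
    """Assumed conversion for the tests against zero: d = 2t / sqrt(df), df = n - 1."""
    return 2 * t_value / sqrt(n - 1)


def d_from_two_group_t(t_value, n1, n2):
    """Assumed conversion for the group comparison: d = t * sqrt(1/n1 + 1/n2)."""
    return t_value * sqrt(1 / n1 + 1 / n2)


# Experiment 2: t values as reported in the text, n = 10 per group.
d_td = round(d_from_one_sample_t(2.20, 10), 2)
d_asd = round(d_from_one_sample_t(0.79, 10), 2)
d_between = round(d_from_two_group_t(2.28, 10, 10), 2)
print(d_td, d_asd, d_between)
```

Under these assumed conversions the sketch gives 1.47, 0.53, and 1.02, the same as the d values reported for Experiment 2.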

Discussion

In Experiment 2, the children judged the expressions from the eye region of the facial stimuli. The results replicated Experiment 1 in that the recognition latency of an emotional expression was modulated by gaze direction in TD children. In contrast, children with ASD were not affected by the affective or motivational congruency between gaze direction and emotional expression. As a result, the TD group showed a larger congruency effect than the ASD group.

As in Experiment 1, because we did not find any overall group differences in the accuracy or RTs of emotional expression recognition, it is highly unlikely that these effects are merely derived from difficulty in decoding emotional expressions from the eye region in individuals with ASD. In addition, since only the eye region was presented, it was impossible for the participants to rely on facial parts other than the eye region to decode emotional expression. Thus, the current results are inconsistent with the possibility that fewer fixations on the eyes fully explain the results of Experiment 1. Together with the results in Experiment 1, the current results suggest that children with ASD do not integrate the affective or motivational tendency of facial expressions and gaze direction to process the emotion of a facial expression.

Interestingly, the congruency effect, that is, the effect of affective or motivational congruency between facial expression and gaze direction, was negatively correlated with the ASQ–J scores in the ASD group. This result suggests that the children’s capacity to integrate facial expression and gaze direction based on their affective or motivational congruency may be related to the manifestation of autistic symptoms. However, we need to be cautious about interpreting this correlation because of the small sample size (n = 10) and the lack of a negative correlation in Experiment 1. Further studies will be required to examine the relation between autistic symptoms and the congruency effect in individuals with ASD.

Note that participants in Experiment 2 were only partially overlapping with those who participated in Experiment 1 (5 of 10 children with ASD and 2 of 10 TD children participated in both experiments), and Experiment 2 was conducted 1 year after Experiment 1. Thus it is still possible that individual differences within children with ASD as well as TD children might have affected the discrepancies between the two experiments. Further study, ideally adopting a complete within-subject design to test the relation between the performance in facial expression detection and eye contact detection, will be beneficial.

General Discussion

The current study is the first to demonstrate that during the process of recognizing the facial expression that corresponds to an emotion, TD children integrate gaze direction and facial expression based on motivational tendency, but children with ASD do not. In Experiment 1, TD children exhibited the interaction between emotional expression and gaze direction in RTs that replicated the results of an adult study by Adams and Kleck (2003). In contrast, children with ASD were not affected by the gaze direction of the stimuli during the recognition of a facial expression. In Experiment 2, only the eye region of the face stimuli was presented, and TD children showed the predicted interaction but children with ASD did not, which again replicated the findings from Experiment 1. The results suggest that the absence of the interaction between emotional expression and gaze direction in children with ASD cannot be explained by a smaller number of fixations on the eyes. Note that in both experiments, we did not find any group differences in overall accuracy or RTs, which suggests that the current finding cannot be attributed to a general difficulty in decoding facial expression in children with ASD. At the same time, it is also highly unlikely that children with ASD did not encode the gaze direction per se because previous studies suggest that they have an intact ability to discriminate different eye gaze directions (Baron-Cohen et al., 1995; Leekam et al., 1997). Moreover, in Experiment 2, the individual differences in the congruency effect in the ASD group showed a significant negative correlation with ASQ–J scores, a retrospective measurement of autistic symptoms around 4–5 years of age: The higher the ASQ–J score of an individual with ASD, the smaller the congruency effect he or she showed. This result also suggests that the lack of, or at least weaker, integration between facial expression and gaze direction may be related to the manifestation of autistic symptoms such as difficulties in social interaction and communication.

The current results suggest that 9- to 14-year-old TD children, like adults (Adams & Kleck, 2003), integrate gaze direction and facial expression based on an approach-oriented or avoidance-oriented motivational tendency. This conclusion is congruent with previous studies demonstrating that children around the same age range integrate face identity and gaze direction in the same manner as adults (Hood, Macrae, Cole-Davies, & Dias, 2003; Smith, Hood, & Hector, 2006). Thus, it is possible that such adult-like integration of different aspects of facial information develops by the age range tested in the current study. Further studies that include younger children and infants will be required to investigate the development of the mechanism underlying such integration. It will also be necessary to adopt neuroimaging techniques in order to examine the neural basis of such integration of different aspects of facial information and its developmental trajectory.

There are at least two possible explanations for why children with ASD do not show the interaction between facial expression and gaze direction. The first possibility is that they do not encode affective or motivational tendency per se from communicative signals such as facial expression or eye gaze. This is consistent with the previous studies that suggested an atypical processing of facial expression in individuals with ASD. For example, Kamio, Wolf, and Fein (2006) demonstrated that the affective valence of facial expression did not prime the affective evaluation of the following stimuli in individuals with ASD. In addition, Grossman et al. (2000) demonstrated that the simultaneous presentation of mismatching emotional words interfered with facial expression recognition in individuals with ASD, which appears to suggest that they are using a semantic or verbally mediated strategy to encode facial expressions rather than encoding them based on their affective tendency. It is known that the amygdala is involved in the evaluation of both physical and social environments (Rolls, 1999), and weaker activation of the amygdala was reported in individuals with ASD (Baron-Cohen et al., 1999; Critchley et al., 2000; Pierce et al., 2001). However, since other studies reported apparently intact capacity for encoding affective or motivational tendency of gaze (Kylliäinen & Hietanen, 2006) and facial expression (Ashwin, Wheelwright, & Baron-Cohen, 2006), as well as apparently typical activation (Pierce et al., 2001; Piggot et al., 2004) or hyperactivation (Dalton et al., 2005) of the amygdala in individuals with ASD, it is also possible that individuals with ASD lack the spontaneous encoding of affective or motivational tendency rather than the capacity to encode this information.

The other possibility is that individuals with ASD do not spontaneously integrate information from facial expression and gaze direction. In addition to the amygdala, the superior temporal sulcus, fusiform face area (FFA), and medial prefrontal region are also involved in social processing, and these regions form the “social brain” (Adolphs, 2003; Brothers, 1990; Johnson, 2005; Johnson et al., 2005). Thus, it is also possible that reduced connectivity between the amygdala and these social brain components, rather than an impairment in the amygdala itself, affects the integration of different social signals in individuals with ASD (e.g., Frith, 2001). For example, Kleinhans et al. (2008) investigated functional connectivity between cortical and subcortical structures while participants were performing a facial identification task, and they found increased functional connectivity between the FFA and the amygdala in the TD group compared with the ASD group. Interestingly, they also found that greater social impairment in the ASD group was associated with reduced connectivity between the FFA and the amygdala and increased connectivity between the FFA and the inferior frontal gyrus. This study suggests that the amygdala does not modulate the activity of the social brain network in individuals with ASD to the same extent as in TD individuals.

Alternatively, the lack of spontaneous integration could be based on a perceptual or cognitive style such as weak central coherence (e.g., Happé, 1999), which is not specific to the social domain. For example, Jemel, Mottron, and Dawson (2006) claimed that perceptual characteristics can lead to difficulty in face processing tasks. However, given the specific involvement of the amygdala in the integration of facial expression and gaze direction in TD individuals (Adams et al., 2003; Sato et al., 2004), it is more likely that the difficulty in integration is based on the atypical development of the social brain network.

Note that these two possibilities are not mutually exclusive. In typical development, Adams and Kleck (2003) documented that gaze direction and facial expression “interact meaningfully in the perceptual processing of emotionally relevant facial information” (p. 646). To achieve this function, the brain must (a) detect the emotional relevance of each social cue and (b) perceptually integrate these social cues according to their emotional relevance. Thus, impairment in either of these two processes can lead to impairment in integrating multiple social signals based on their affective or motivational tendencies. Moreover, impairment in either of these two mechanisms (encoding or integration) could hamper the emergence of the other over the course of development. For example, interactive specialization theory (e.g., Johnson, 2005) argues that infants are born with an innate bias, which is subserved by subcortical regions including the amygdala and relatively unspecified cortical structures. In the course of development, the input from the social environment, which is filtered by the innate bias of the former system, interacts with the architectural bias of cortical structures to form a typical social brain network specialized for social processing. According to this theory, any impairment in the amygdala, atypical architecture of cortical structures, or reduced connectivity between these subcortical and cortical structures can cause atypical specialization of the social brain network, which leads to impairment in the effective integration of multiple social signals. Schultz (2005) also hypothesizes that the amygdala plays a key role in the development of brain regions associated with social deficits in ASD.

The current study has demonstrated that the discrepancies between TD children and children with ASD in the integration of facial expression and gaze direction can be traced back to childhood. Further studies are needed to investigate the earlier development of the ability to encode communicative signals and integrate them in children with ASD, which could reveal the developmental origin of the difficulty in social interaction and communication in ASD.

To summarize, the current results corroborate previous neuroimaging studies (Adams et al., 2003; Sato et al., 2004; Wicker, Perrett, Baron-Cohen, & Decety, 2003) and suggest that the social brain network plays a critical role in the integration of multiple communicative signals based on their affective or communicative relevance. The impairment in the encoding or integration of multiple communicative signals in ASD, probably due to the atypical function and development of the social brain network, could hamper the development and functioning of efficient communication in daily social interactions. Further studies are needed to explore the neural and developmental bases of impaired encoding and/or integration of communicative signals, as well as their relations with difficulties in social interaction and communication in ASD.
