This work was supported by the Sofja Kovalevskaja Award granted by the Alexander von Humboldt Foundation to T. Striano. We would like to thank Birgit Elsner, Thomas Gunter, and Herbert Roeyers for helpful comments on earlier versions of this manuscript.
The importance of eye gaze as a means of communication is indisputable. However, there is debate about whether a dedicated neural module functions as an eye gaze detector and about when infants become able to use eye gaze cues in a referential way. The application of neuroscience methodologies to developmental psychology has provided new insights into early social cognitive development. This review integrates findings on the development of eye gaze processing with research on the neural mechanisms underlying infant and adult social cognition. This research shows how a cognitive neuroscience approach can improve our understanding of social development and autism spectrum disorder.
Humans communicate with each other using a variety of means such as emotional expressions in face and voice, pointing gestures, body postures, and vocal language. Compared with these complex forms of expression, eye gaze may appear rather rudimentary as it involves the movement of only one body part and results in simple and easy to capture perceptual information (the current review will not cover the communication of complex emotions such as shame through the eyes; see Baron-Cohen, Wheelwright, Hill, Raste, & Plumb, 2001). The human eye constitutes a special visual stimulus because of its morphology, including a strong contrast between sclera and iris, which is unique in humans (Kobayashi & Koshima, 1997). This permits the computation of gaze direction based on relatively simple processing mechanisms in the human brain (Anstis, Mayhew, & Morley, 1969). But how much information can be transported by either direct or averted gaze?
A closer look at human communication reveals that eye gaze plays an essential role especially early in ontogeny. In dyadic face-to-face interactions, mutual gaze establishes first social interactions and may be the driving force behind infants’ interest in human faces (Gliga & Csibra, 2007). In triadic person–object–person interactions, eye gaze indicates another person’s focus of attention. As such, it guides learning even in the first months after birth (see Striano & Reid, 2006, for a review on dyadic and triadic interactions in the 1st year). Further, in combination with emotional expressions, eye gaze constitutes an effective means to communicate and detect threat in the environment (Adams & Kleck, 2003).
Information of this rich and pivotal kind may be derived from eye gaze long before the acquisition of vocal language. Moreover, eye gaze may be detected and used appropriately without conscious cognitive effort as it results in automatic shifts of attention (see Langton, Watt, & Bruce, 2000, for a review), which have been observed in infants by 3 months of age (e.g., Hood, Willen, & Driver, 1998) and even in newborns (Farroni, Massaccesi, Pividori, & Johnson, 2004). There is evidence that information from the eyes cannot be processed as readily in children with autism spectrum disorder (ASD; e.g., Johnson et al., 2005). We will discuss important factors that affect ASD children’s performance in tasks involving gaze perception and implications for social development in autism.
The aim of this review article is to integrate findings on the development of eye gaze processing in infants with research on the neural correlates of social cognition in adults on the basis of the directed attention model (DAM) of infant social cognition. A more detailed account of the attention cueing effects of eye gaze shifts can be found in a recent review by Frischen, Bayliss, and Tipper (2007). See Striano and Reid (2006) and Grossmann and Johnson (2007) for broader reviews on social cognitive functions in infancy and Grossmann and Farroni (2009) for an empirical review on the neural networks underlying gaze perception. The phylogeny of eye gaze processing is reviewed by Emery (2000).
The following aspects of gaze processing will be discussed in terms of their underlying neural correlates and social cognitive functions in this review article:
1. The detection of the presence of eyes with direct gaze, which allows for the establishment of mutual gaze with another person in a dyadic face-to-face interaction.
2. Gaze detection, which will be referred to as determining the direction of another person’s eye gaze.
3. Gaze following, that is, following another person’s gaze direction to an external target in a triadic person–object–person interaction.
4. Processing eye gaze in conjunction with emotional facial expressions to detect the referent of the emotional expression.
It is important to note that head and body posture are processed in conjunction with gaze direction when determining another person’s focus of attention (Langton et al., 2000). As this review focuses on the processing of information from the eyes, it considers only studies using a front view of the face with diverging eye gaze directions. Whenever additional information from head motion was given, for instance, in live interaction studies with infants, this will be specified accordingly.
Developmental research has been stimulated by the application of methods originating from neuroscience, for example, the measurement of event-related potentials (ERPs). ERPs reflect the summated responses of synchronously firing neurons in the human cortex that are time-locked to a certain event. Different components of the ERP can be associated with perceptual and/or cognitive processes. Neuroscience methodology thus allows for the investigation of cognitive processes with an excellent temporal resolution and provides new insights into infant cognitive development.
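The time-locked averaging that underlies an ERP can be illustrated with a minimal sketch. It assumes a single-channel recording stored as a plain list and stimulus onsets given as sample indices; the function name, the parameter names, and the toy data are hypothetical and are not taken from any of the studies reviewed here.

```python
from statistics import fmean


def average_erp(eeg, event_samples, pre, post):
    """Average fixed-length epochs cut around each event onset.

    eeg: voltage samples from one channel (hypothetical data).
    event_samples: sample indices at which the stimulus appeared.
    pre, post: number of samples kept before and after each onset.
    """
    epochs = []
    for onset in event_samples:
        start, stop = onset - pre, onset + post
        if start < 0 or stop > len(eeg):
            continue  # skip events too close to the recording edges
        epoch = eeg[start:stop]
        # Baseline-correct: subtract the mean of the pre-stimulus interval.
        baseline = fmean(epoch[:pre]) if pre > 0 else 0.0
        epochs.append([v - baseline for v in epoch])
    if not epochs:
        raise ValueError("no complete epochs available")
    # Average across trials, sample by sample.
    return [fmean(samples) for samples in zip(*epochs)]


# Toy recording with two stimulus onsets (at samples 2 and 8).
signal = [0, 0, 1, 3, 1, 0, 0, 0, 2, 4, 2, 0]
erp = average_erp(signal, event_samples=[2, 8], pre=2, post=3)
print(erp)  # [0.0, 0.0, 1.5, 3.5, 1.5]
```

Activity that is not time-locked to the event tends to cancel out across trials, which is why the averaged waveform can be related to event-specific processing.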
In the following section, the DAM of infant social cognition will be introduced with special attention to its implications in eye gaze processing. On the basis of this information-processing framework, we will review empirical findings on early gaze processing in dyadic and triadic interactions, in combination with emotional expressions and in autism. Finally, the neural system involved in gaze processing will be recapped and integrated with the conception of the DAM. We will conclude with an outlook on open questions and potentially exciting directions for future research.
The DAM of Infant Social Cognition
One issue that is difficult to resolve is the developing infant’s remarkable capacity to process and interpret social-cognitive information, even within the first few postnatal months. These capacities are particularly surprising given the multimodal, dynamic, and continuous nature of human behavior. It has even been suggested that it is the overall complexity of social interaction that causes aspects of social cognition to be uniquely human (Saxe, 2006). In the 1st year, the human infant displays remarkable astuteness in social situations, from determining the referent of eye gaze (Woodward, 2003) through to predicting the goal of actions (Gergely, Nádasdy, Csibra, & Bíró, 1995; Reid, Csibra, Belsky, & Johnson, 2007; Reid et al., 2009). Infants are capable of these skills despite their very limited working memory capacities in the 1st year. For example, Ross-Sheehy, Oakes, and Luck (2003) found that 10- to 13-month-olds detect changes within streams of objects when presented with two to three items; however, 4- and 6-month-olds detect differences involving only one stream of information. Similarly, Káldy and Leslie (2005) found that 6.5-month-old infants could recall the location of no more than one object within a spatial area. Performance at this level was only possible without delay or distraction. When these cognitive capacities are compared with the complexity of the social world, there is clearly some form of accelerated performance in tasks involving cognitive processes indexed in social situations. For instance, 3-month-old infants process relations between themselves, an object and another person (Striano & Stahl, 2005).
To account for these marked differences between social and cognitive processing, Reid and Striano (2007, 2008) developed the DAM of social-cognitive performance, inspired by neurocognitive models of social cognition (e.g., Satpute & Lieberman, 2006). This information processing model describes the perceptual stages of processing social information that are required to respond appropriately to a social partner. According to the DAM, the infant uses each successive aspect of the social world to filter the overall amount of available information such that limited cognitive capacities are capable of producing socially competent responses. Thereby, the perceptual input is parsed into manageable components and social information is highlighted. The DAM suggests that key groupings of cognitive tasks must occur in a set sequence in order for the infant to successfully react to the social situation. The five stages represent an information processing sequence and are not assumed to develop successively during ontogeny. The following processing stages need to be accomplished at any age to enable appropriate social responses:
1. Detecting a person.
2. Identifying the individual.
3. Assessing the individual’s attention toward the self.
4. Locating the individual’s focus of attention toward external objects/events.
5. Inferring an observed goal/responding appropriately.
The first stage involves the detection of humans. At a purely mechanistic level, to produce social responses, the infant must detect those components of the environment that are relevant for social interactions. Nelson (2008) suggests that for the DAM to operate efficiently, early stages of the model must incorporate perceptual processing biases such as Johnson’s conspec model of face processing (Johnson, 2005; Johnson & Morton, 1991). However, the first stage of the DAM is broader, insofar as it also incorporates the detection of biological motion. This allows the infant to fix attention on the social components of the environment, removing the need to process extraneous information that is not relevant to the task of social communication. Newborn infants’ disposition to orient toward biological motion has been demonstrated recently (Simion, Regolin, & Bulf, 2008). These initial biases (i.e., attention to faces and biological motion) lead to accelerated experience with social compared with nonsocial information and to enhanced processing in brain areas involved in social information processing (see also Johnson, 2005; Johnson et al., 2005).
Following the detection of a conspecific, the second stage of the DAM suggests that the infant individuates the social partner. The beginning of this skill is intact at a very early age. For example, Bushnell, Sai, and Mullin (1989) found that even a few hours after birth, newborn infants discriminate the maternal face from those of other adults. Once these two steps have been taken, the infant can assess the individual’s locus of attention in relation to the self (third stage) and in relation to outside objects and other persons (fourth stage). Provided that these four components of the social situation are successfully appraised, at the fifth stage the infant can infer the observed goal, and/or, prepare an appropriate response (e.g., establish contact and offer response).
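The sequential character of the five stages can be made concrete with a toy sketch. It is only an illustration of the idea that each stage gates the next, not an implementation of the DAM; the dictionary keys and the function name are hypothetical.

```python
def run_dam(scene):
    """Return the DAM stages completed for a scene, stopping at the first
    stage whose required information is missing (toy illustration only)."""
    stages = [
        ("1 detect person", lambda s: s.get("person_present", False)),
        ("2 identify individual", lambda s: s.get("identity") is not None),
        ("3 assess attention toward self", lambda s: s.get("attends_to_self") is not None),
        ("4 locate external focus of attention", lambda s: s.get("external_target") is not None),
        ("5 infer goal / respond", lambda s: s.get("goal") is not None),
    ]
    completed = []
    for name, check in stages:
        if not check(scene):
            break  # later stages depend on earlier ones
        completed.append(name)
    return completed


# A dyadic situation: a known person looks at the infant, no object involved.
dyadic = {"person_present": True, "identity": "caregiver", "attends_to_self": True}
print(run_dam(dyadic))
```

In this sketch a dyadic face-to-face situation completes Stages 1 to 3 only, whereas a triadic situation that also specifies an external target and a goal would run through all five stages.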
Research investigating effects of eye gaze cueing and gaze following provides compelling support for the DAM (Nelson, 2008). For example, newborn infants preferentially attend to faces with direct eye contact (Farroni, Csibra, Simion, & Johnson, 2002). This attention bias leads to accelerated processing of social relative to nonsocial information and to advantages in the neural processing of social information when compared with nonsocial stimuli. When combined, eye gaze research within dyadic face-to-face interactions and triadic person–object–person situations demonstrates that infants use others as tools with which to reduce the amount of information available for them to process in the surrounding world. This fits well with the notion that infants direct their limited attentional resources and memory capacity to components of the world that are important for social relations. Our approach is in line with previous accounts of an “interactive specialization” of the social brain network (Johnson et al., 2005) and experience-expectant development of face processing (Nelson, 2001) in that it assumes functional development of involved brain areas to be shaped by relevant social inputs. However, the DAM extends these approaches by suggesting a more specific function to these mechanisms (i.e., reduction of cognitive load when processing social relative to nonsocial input) and by specifying an information-processing sequence of socially relevant input. The direction of attention toward social information may be seen as one aspect that allows human infants not only to engage in episodes of “pedagogy,” that is, learning from social partners through teaching (Csibra & Gergely, 2006), but also to establish affective relationships. In the following sections we review research on eye gaze processing in dyadic and triadic interactions that provides support for this notion.
Eye Contact—Entering the World of Social Interaction
Mutual gaze is an essential component of dyadic face-to-face interactions and is often used to establish social interactions (Rutter, 1984). Neonates prefer a photograph depicting a woman with open eyes relative to closed eyes (Batki, Baron-Cohen, Wheelwright, Connellan, & Ahluwalia, 2000). Moreover, newborn infants prefer to look at a face with eye gaze directed at them compared to eye gaze averted to the side (Farroni et al., 2002). This effect is not found in the context of an inverted face, suggesting that it reflects a special aspect of face processing rather than an effect of low-level perceptual differences between stimuli (Farroni, Menon, & Johnson, 2006), though it might be accomplished by subcortical structures (Johnson, 2005).
In the DAM, the detection of a human face with direct eye gaze corresponds to the first stage an infant needs to accomplish to successfully interact with another person. Importantly, mere detection of eyes with direct gaze as observed in newborns (Farroni et al., 2002) may be achieved by a subcortical face processing route involving the amygdala (Johnson, 2005) and may not necessarily imply the understanding of mutual gaze as a communicative signal directed at the self (Stage 3 of the DAM).
The concept of an innate bias toward eyes with direct gaze concords with the notion of an innate eye direction detector (Baron-Cohen, 1994). It is an unresolved issue which neural structures underlie newborns’ attentional biases (see Johnson, 2005). However, as gaze detection is not independent from facial configuration even in newborns (Farroni et al., 2006), this mechanism is probably not accomplished by an isolated structure but may rather be embedded in functions of a general face processing system. In the following section, we would therefore like to focus on the development of brain structures involved in the processing of direct gaze in older infants, children and adults and refrain from the discussion of possibly innately dedicated neural structures.
Interestingly, amplitude of the N170 to eyes alone is equal to or even larger than that to intact faces, which has led to the suggestion that the N170 might reflect the operation of an eye processor (Bentin et al., 1996). However, amplitude of the N170 does not differ in response to intact faces and faces with the eye region removed (Eimer, 1998). Thus, the N170 does not merely reflect activity of neurons sensitive to the presence of eyes but rather may be induced by different populations of neurons that are either specifically sensitive to faces or specifically sensitive to eyes (Itier et al., 2006).
Is the N170 sensitive to eye gaze direction? Using animated stimuli of facial movements, Puce, Smith, and Allison (2000) found enhanced N170 responses for eyes that averted gaze away from the perceiver. This effect, however, was not restricted to eye movements but was found also for movement of the mouth. In another study using dynamic stimuli, in contrast, the N170 was larger for direct relative to averted gaze (Conty, N’Diaye, Tijus, & George, 2007). Using static stimuli, Taylor, Itier, Allison, and Edmonds (2001) found no effect of gaze direction on amplitude of the N170. Thus, there is only limited evidence for the N170 as an eye (direction) detector. On the basis of current empirical data, the N170 response may rather reflect structural encoding of faces (Eimer, 1998) and is probably only in part driven by specific neurons sensitive to the presence of eyes (Itier et al., 2006).
However, there is evidence that the role of the N170 in gaze processing may be different at earlier stages of development. Taylor, Edmonds, McCarthy, and Allison (2001) found that the N170 in children from 4 to 15 years is much larger in amplitude and shorter in latency in response to eyes alone compared with intact upright faces. Further, while the N170 to faces develops until adulthood, the N170 response to isolated eyes has a mature morphology already by 11 years of age.
Recently, a likely developmental precursor to the N170 component has been investigated in infancy (see de Haan, Johnson, & Halit, 2003, for a review). The N290 has been observed in response to static faces in infants from 3 months of age and is comparable to the adult N170 with respect to topography and polarity, though it is a little delayed, more medially distributed and attenuated in amplitude (see de Haan et al., 2003). Cortical sources of the N290 are lateral occipital cortex, fusiform gyrus and STS (Johnson et al., 2005). Moreover, the N290 shares some functional similarities with the N170 in that it shows sensitivity to human faces (Halit, de Haan, & Johnson, 2003).
Importantly, the infant N290 is enhanced by direct gaze in static faces compared with averted gaze in 4-month-old infants (Farroni et al., 2002), which is not the case for the N170 component in adults (Taylor, Itier, et al., 2001). It is possible that this effect reflects enhanced face processing in the presence of direct gaze in infants (see also Farroni, Massaccesi, Menon, & Johnson, 2007). However, a similar effect on N170 amplitude would then be expected in adults (but see Conty et al., 2007, for an alternative view on discrepant findings regarding the N170 and gaze detection). The infant component may reflect activation of a neural system involved in the detection of eyes with direct gaze which is functional by at least 4 months of age. It is currently not known whether the same system is involved in gaze detection from birth. It is possible that detection of eyes with direct gaze is initially accomplished by subcortical structures that then increasingly interact with cortical structures, such as the STS, during development (Johnson, 2005).
Detection of eyes with direct gaze is part of the first stage of the DAM: the detection of social agents. Special attention to eyes with direct gaze may initially drive infants’ interest in faces and help establish the first social interactions of a newborn as suggested by Gliga and Csibra (2007). With increasing age and face expertise (see Gauthier & Nelson, 2001), this function may become less important during the first years of life. During childhood, eye-sensitive neurons may still be the dominant source of the N170 (Taylor, Edmonds, et al., 2001), but in adults these neurons may be suppressed by neurons that are processing facial configurations during processing of canonical but not inverted faces (Itier et al., 2006).
To conclude, there is evidence that the N290 component in infants represents activity of eye-sensitive neural structures. However, these structures may become a less dominant source of the N170 component in children and adults. Further research is required to examine whether the infant N290 is more sensitive to the presence or absence of eyes in a face than is the case for the adult component (Eimer, 1998) and whether the N290 in response to eyes alone is enhanced when compared with intact faces. Furthermore, it would be useful to investigate the assumed transition from the infant N290 to the adult N170 ERP component in longitudinal studies and the interaction between cortical and subcortical structures during detection of eyes with direct gaze in early infancy.
What Are You Looking at? Direction of Attention Through Gaze Cues in Triadic Interactions
By 3 months, infants can follow the direction of an adult’s eye gaze and head motion (D’Entremont, Hains, & Muir, 1997) and are faster to orient to targets cued by eye gaze shifts (Hood et al., 1998), an effect that has even been observed in newborns (Farroni et al., 2004). Further studies on eye gaze cueing and object processing indicate that there is an early capacity for discerning a relation between a person and an object. In one study (Reid & Striano, 2005), 4-month-old infants watched a video presentation of an adult gazing toward one of two objects. When presented with the same objects a second time, infants gazed toward the uncued object significantly more, suggesting that they found it more novel. In support of these findings, uncued objects elicited enhanced neural processing compared to cued objects in a related ERP study with 4-month-old infants (Reid, Striano, Kaufman, & Johnson, 2004; see also Theuring, Gredebäck, & Hauf, 2007). These findings suggest that 4-month-old infants acquired information about the object that was the focus of the adult’s attention and that information regarding the relation between an adult and an object is rapidly determined, even in early infancy.
Recently, Triesch, Teuscher, Deak, and Carlson (2006; see also Nagai, Hosoda, Morita, & Asada, 2003, for the implementation of a similar model in a robot) proposed a computational model to explain the emergence of gaze following based on learning mechanisms and the presence of a “basic set” of prerequisites. The authors argue that infants learn to follow their caregiver’s gaze because they discover that it helps them to predict the occurrence of interesting visual events, which act as a reinforcement to gaze following behavior. Corkum and Moore (1998) investigated whether learning mechanisms and reinforcement can sufficiently explain gaze following in infancy. They demonstrated that infants at the age of 8 months can be trained to follow their caregiver’s gaze if interesting objects appear in the direction of the adult’s head turn. In a second experiment, however, they found it much more difficult to condition infants to look in the direction opposite to an adult’s head turn. Learning mechanisms may thus be relevant but are not sufficient to explain infants’ development of gaze following. In particular, the attention directing effects of gaze shifts implicated in Stage 4 of the DAM need to be considered.
How do infants process the relation between a looker and the target of his or her gaze? Hoehl, Reid, Mooney, and Striano (2008) presented 4-month-old infants with photographs depicting an adult looking either toward or away from an object. The resulting ERPs indicated that the infants were faster to process object directed gaze than nonobject directed gaze, as measured by the latency to the peak of the negative component (Nc). Further, amplitude of the Nc was larger when eye gaze was averted from the object. The Nc is a negative deflection typically occurring between 400 and 600 ms after stimulus onset on frontal and central channels and has consistently been related to attentional orienting toward salient stimuli (Courchesne, Ganz, & Norcia, 1981; Nelson, 1994). Its amplitude is closely associated with attention as measured by heart rate deceleration (Richards, 2003). This suggests that additional processing demand is required when eye gaze is averted from a target object, as evidenced by an enhanced and delayed Nc response.
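The two Nc measures discussed here, peak amplitude and peak latency, can be sketched as a search for the most negative sample within the 400 to 600 ms window. The sketch assumes an averaged waveform in microvolts whose first sample coincides with stimulus onset; the function name, the coarse 10 Hz sampling rate, and the waveforms are hypothetical and serve only to keep the example readable.

```python
def nc_peak(erp_uv, sampling_rate_hz, window_ms=(400, 600)):
    """Return (amplitude, latency_ms) of the most negative sample in the window.

    erp_uv: averaged waveform in microvolts, sample 0 = stimulus onset.
    """
    ms_per_sample = 1000.0 / sampling_rate_hz
    start = int(window_ms[0] / ms_per_sample)
    stop = int(window_ms[1] / ms_per_sample) + 1  # include the window end
    window = erp_uv[start:stop]
    if not window:
        raise ValueError("waveform is shorter than the search window")
    # Index of the most negative value inside the window.
    offset = min(range(len(window)), key=window.__getitem__)
    return window[offset], (start + offset) * ms_per_sample


# Invented waveforms sampled every 100 ms (0 to 700 ms after onset).
object_directed = [0, -1, -2, -3, -6, -9, -5, -2]
averted = [0, -1, -2, -3, -5, -8, -12, -4]
print(nc_peak(object_directed, 10))  # (-9, 500.0)
print(nc_peak(averted, 10))          # (-12, 600.0)
```

In these invented data the averted gaze waveform yields a more negative and later peak, which mirrors the direction of the reported effect without reproducing any actual values.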
Recently, we tested a sample of 2- to 3-month-old infants and a sample of 5- to 6-month-old infants using the same stimuli and procedures as Hoehl, Reid, et al. (2008). Whereas in 5- to 6-month-old infants (n = 17, 10 males; average age = 5 months 18 days) the effect of an enhanced Nc amplitude (on Fz and Cz channels) in the averted gaze condition could be replicated, t(16) = 2.2, p = .046, no such effect was observed in 2- to 3-month-olds (n = 17, 8 males; average age = 2 months 23 days), t(16) = −0.327, p = .714. These results suggest that processing the relation between another person’s gaze and an outside object is not mature from birth but rather develops between 3 and 4 months of age. This is supported by behavioral studies showing development of sensitivity to triadic interactions between 6 weeks and 3 months of age (Striano & Stahl, 2005; Striano, Stahl, Cleveland, & Hoehl, 2007).
Evidence is accumulating that by 4 months infants use gaze to learn. A series of behavioral studies investigated this phenomenon in ecologically valid live-interaction settings (Cleveland, Schug, & Striano, 2007; Cleveland & Striano, 2007; Striano, Chen, Cleveland, & Bradshaw, 2006). Infants showed a stronger novelty preference after being familiarized with objects in a triadic interaction, that is, when the adult alternated gaze and head direction between infant and object, including phases of mutual gaze. In a control condition in which the adult did not engage in eye contact with the infant during the familiarization phase, the novelty preference was significantly decreased, indicating more thorough object encoding in the joint attention context. This pattern was consistently found in infants at 7 and 9 months of age but not in a sample of 12-month-old infants using the same paradigm and stimuli. This can likely be accounted for by a ceiling effect of the novelty preference in the oldest age group. We expect that infants at 12 months of age will also benefit from social cues in more complex learning situations.
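The form of the reported statistics can be illustrated with a short sketch of a paired-samples t statistic. With n = 17 infants, the 16 degrees of freedom are consistent with a within-subject comparison of the two gaze conditions; that the original analysis was a paired t test is an assumption made here, and the amplitude values below are invented for illustration only.

```python
from math import sqrt
from statistics import fmean, stdev


def paired_t(condition_a, condition_b):
    """Return (t, df) for a paired-samples comparison of two conditions."""
    if len(condition_a) != len(condition_b):
        raise ValueError("both conditions need one value per participant")
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two participants are required")
    # t = mean difference divided by its standard error.
    t = fmean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical absolute Nc amplitudes (microvolts) for 17 infants.
averted_gaze = [float(i % 5) + 1.0 for i in range(17)]
object_directed_gaze = [1.0] * 17
t, df = paired_t(averted_gaze, object_directed_gaze)
print(df)  # 16
```

The statistic divides the mean within-infant difference by its standard error, so the degrees of freedom equal the number of infants minus one.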
Another study investigated infants’ sensitivity to aspects of triadic interactions such as eye contact, movement cues, and vocalizations, at 3, 6, and 9 months of age (Striano & Stahl, 2005). Infants of all tested age groups decreased smiling and looking at the experimenter in conditions in which the adult disrupted the normal triadic interaction. Even when the adult broke eye contact only briefly before turning to an object with her gaze and head direction (to impair the referential meaning of the gaze shift), infants’ smiling and looking decreased significantly (see also Farroni, Mansfield, Lai, & Johnson, 2003, for an earlier demonstration of the relevance of mutual gaze in a triadic interaction). Six-week-old infants, however, do not discriminate between triadic interactions with or without breaks in eye contact (Striano et al., 2007).
In an ERP study using a live-interaction paradigm, infants at 9 months of age allocated significantly more attentional resources, as indicated by an increased Nc component, to objects in a joint attention context, involving eye contact and head and eye gaze turns toward objects on a computer screen, relative to a nonjoint attention situation without mutual gaze and movement cues (Striano, Reid, & Hoehl, 2006).
These studies investigating the early development of eye gaze and object processing suggest that complex information about the social world is processed and interpreted by young infants. In accordance with the DAM, infants use eye gaze to guide their limited attentional resources and facilitate learning in triadic interactions. However, it is conceivable that these skills rest on rather automatic mechanisms and do not necessarily involve awareness of a joint focus of attention with another person. Indeed, empirical evidence has put into question whether infants younger than 12 months of age effectively encode the relationship between looker and object, even though they perform overt gaze following (Woodward, 2003). However, in a more recent study using an experimental design similar to the one used by Woodward (2003), 9-month-olds encoded the relation between a person and the target of her gaze and head turn if they were given additional information regarding the person’s focus of attention and goal directedness (Johnson, Ok, & Luo, 2007). Csibra and Volein (2008) found similar results in a behavioral paradigm where the object that was gazed toward (including head turn) was occluded. When the occlusion was removed, infants at 8 and 12 months looked longer at an empty location that had previously been the target of an adult’s gaze, suggesting that infants expected to find the target of the adult’s gaze shifts at this location.
Together, the empirical evidence reviewed in this section strongly supports the notion that infants utilize eye gaze cues to parse important information from their perceptual input. Thus, eye gaze plays an important role for the establishment of triadic interactions. The next question that will be addressed is how emotional expressions are processed in conjunction with eye gaze direction.
Eye Gaze and Emotional Expressions
Stages 3 and 4 of the DAM involve the detection of another person’s focus of attention with respect to the self and external entities. The previous section has demonstrated that infants rely on eye gaze cues to discern others’ focus of attention. However, eye gaze cues are often processed together with emotional facial expressions and function as important indicators of threat. We propose that even at early stages of social information processing (Stages 3 and 4), emotional facial expressions are processed in conjunction with eye gaze. In a series of experiments, Adams and Kleck (2003) demonstrated that direct gaze speeds up recognition of faces expressing approach related emotions (anger and happiness) whereas averted gaze speeds up recognition of avoidance related emotional expressions (fear and sadness).
Little is known about how the processing of emotional expressions is affected by eye gaze during early development. Developmental research in this context has mostly focused on social referencing behavior in infants from 10 months of age onward. Social referencing is the ability to detect emotional signals from others and to adjust behavior accordingly (Feinman, Roberts, Hsieh, Sawyer, & Swanson, 1992). This ability was thought to arise only by the end of the 1st postnatal year (e.g., Mumme & Fernald, 2003). In most studies on social referencing, infants’ behavior in ambiguous situations has been measured as a function of an adult’s emotional signals. However, given that by 7 months infants discriminate and categorize most basic emotional expressions (see Leppänen & Nelson, 2006, for a review), it is reasonable that the infant’s attention system may be affected by emotional threat signals even before they are able to respond accordingly on a behavioral level. The measurement of electrophysiological brain responses is an adequate means to test for this possibility.
In a recent ERP study, we investigated the neural processing of threat-related emotional expressions in combination with eye gaze in a sample of 7-month-old infants (Hoehl & Striano, 2008). Infants saw static faces expressing fear or anger and either directing eye gaze toward the infant or to the side. Amplitude of the resulting Nc component showed an interaction between emotion and gaze direction. Infants responded with a substantially enhanced Nc component when presented with angry faces with direct compared with averted eye gaze. No such effect was found for fearful faces with direct versus averted gaze. This finding indicates that infants allocated increased attentional resources toward the most threatening stimuli: angry faces with direct gaze. Only when combined with direct eye gaze, angry expressions may have been perceived as indicators of threat. Concordant evidence for enhanced neural processing of angry faces with direct relative to averted eye gaze comes from a study with 4-month-old infants (Striano, Kopp, Grossmann, & Reid, 2006). Infants showed an enhanced positive slow wave response for direct relative to averted eye gaze in the context of an angry face but not when the adult face displayed a neutral or happy expression.
The studies by Hoehl and Striano (2008) and Striano, Kopp, et al. (2006) demonstrate young infants’ sensitivity to referential social cues directed to the self when processing emotional expressions (Stage 3 of the DAM). New evidence from another ERP study shows that a fearful expression, together with eye gaze direction, enhances attention toward unfamiliar objects even by 3 months of age (Hoehl, Wiese, & Striano, 2008). This suggests that an emotional expression can be linked to an external referent on a neural level even before overt social referencing behavior can be observed. Importantly, infants were sensitive to the adult’s eye gaze direction and did not associate the object with the emotional expression if eye gaze was averted from the object, suggesting that Stage 4 mechanisms operate at this early age. It remains to be examined whether processing of emotional expressions that are not related to the detection of threat (e.g., happiness and sadness) is also affected by eye gaze cues.
Eye Gaze Processing in ASD
The criteria for ASD are defined by three major symptoms: qualitative impairment in social interaction, impaired communication, and the occurrence of restricted and repetitive behavior (American Psychiatric Association, 2000). More specifically, one of the deficits in the social domain is the inability to share enjoyment, interests, or achievements with other people through nonverbal behaviors. This deficit has been described by examining aspects of social behavior such as joint attention skills and, at a more basic level, orienting to and following the eye gaze of another person. As this is an extensive topic, this review can provide only limited insight into this complex matter.
Earlier studies reported that children with ASD engage less in mutual gaze than controls (e.g., Volkmar & Mayes, 1990) and are relatively insensitive to a speaker’s gaze direction and unable to use referential gaze to infer the referent of a novel word (Baron-Cohen, Baldwin, & Crowson, 1997). However, although young children with ASD may have difficulties with spontaneously following others’ gaze direction in a natural and complex context, at least some children with ASD can follow eye gaze in a structured experimental situation (Leekam, Hunnisett, & Moore, 1998), discern where people are looking (Leekam, Baron-Cohen, Perrett, Milders, & Brown, 1997), or learn to follow eye gaze (Leekam, Lopez, & Moore, 2000). Leekam et al. (2000) concluded that the observed difficulties in spontaneous gaze following in young children with ASD might be related to a failure in understanding eye gaze as a communicative cue that is linked to another person’s intentions.
Home video observations of small children later diagnosed with ASD demonstrate reduced interest in faces during the 1st year of life (Osterling & Dawson, 1994) and an increasing amount of deviant behavior during the 2nd year of life, such as eye gaze avoidance and absence of emotional expressions (Adrien et al., 1993). Although home video observations have methodological constraints, they demonstrate that early signs of differential social behavior are already present during the 1st and 2nd years of life. Interestingly, although the capabilities of both responding to and initiating joint attention are impaired early in life (e.g., Osterling, Dawson, & Munson, 2002), older children with ASD often learn to respond to joint attention, whereas the deficit in initiating joint attention remains (Mundy, Sigman, & Kasari, 1994). Between the 2nd and 4th years of life, children with ASD are often able to compensate for some of their deficits when overall joint attention behavior is measured (Naber et al., 2007).
In general, it is important to note the high degree of variability in social impairment between children with ASD that is not only determined by chronological but also by mental age as both are associated with gaze following and joint attention abilities (Leekam et al., 2000). While some literature reports that ASD children are able to engage in eye contact as much as children with mental retardation (Sigman, Mundy, Sherman, & Ungerer, 1986), Leekam et al. (2000) found that ASD children’s responses to attention bids were below the level of control children with mental retardation. Mundy et al. (1994) observed that ASD children with mental ages between 8 and 22 months displayed syndrome specific deficits on both initiating joint attention and a measure involving gaze following. However, ASD children with mental ages estimated between 23 and 39 months only displayed impairments on initiating joint attention compared with typical and mentally retarded controls. Other studies suggest that individual differences in gaze following or responding to joint attention are related to language development in autism (Carpenter, Pennington, & Rogers, 2002) and even improvements in language after intervention (Bono, Daley, & Sigman, 2004).
Another aspect related to the perception of social cues is the reflexive orienting of attention in the direction of gaze cues. The majority of high-functioning children with ASD show normal abilities in reflexive orienting of attention (e.g., Kylliainen & Hietanen, 2004), and even 2-year-old ASD children seem to be able to use eye gaze as a directional cue for a target (Chawarska, Klin, & Volkmar, 2003; but see Johnson et al., 2005, for a different interpretation of these data). However, not all children with ASD show the preferential effect for gaze cues relative to nonsocial stimuli such as arrows that is found in typically developing controls (Senju, Tojo, Dairoku, & Hasegawa, 2004). It has been suggested that children with ASD may learn to use gaze cues based on mechanisms that are also applied when processing nonsocial cues (see Nation & Penny, 2008, for a review of the apparently contradictory findings on gaze detection and gaze following in autism).
Chronological age, mental abilities, and situational context (complex natural environment vs. well-structured experimental setup) play an important role in gaze following and, more generally, in joint attention skills. Not all children at all stages fail to detect eye gaze or lack effects of eye gaze cueing. Future research should focus on the individual differences in gaze detection, gaze following, and, in particular, initiating joint attention. The identification of subgroups with specific patterns of deficits within the spectrum may provide the opportunity to offer individually adjusted training and intervention.
Neuroscience methods have been increasingly utilized to enhance our understanding of the neural correlates underlying the deficits in overt behavior in ASD. Two recent studies have investigated ERP responses to direct and averted gaze in ASD children. Grice et al. (2005) presented faces with direct or averted eye gaze to 3- to 7-year-old typically developing controls and an ASD group. In the ASD group, but not in the control group, a “mid-line N170” component was increased for faces with direct relative to averted gaze. Grice et al. concluded that children with ASD may be delayed in the processing of eye gaze direction, as a similar effect in response to direct gaze has been observed in infants (Farroni et al., 2002), but not in adults (Taylor, Itier, et al., 2001). However, using a task that required active detection of specific gaze directions, Senju, Tojo, Yaguchi, and Hasegawa (2005) found an increased posterior negativity during detection of direct gaze in typically developing children but found no difference between direct and averted gaze for the ASD group. Both results speak to differential neural processing of gaze direction in ASD. However, the divergence of the direction of this difference can likely be accounted for by methodological differences, namely, an active oddball paradigm versus a passive viewing task, different recording systems, orientation of the facial stimuli, and chronological age of the subjects.
Various approaches have been taken to explain neural dysfunction or differential brain development. It has been proposed that the initiation of joint attention relies on a dorsal “social brain” system, including the dorsal medial prefrontal cortex (mPFC) and anterior cingulate, while responding to joint attention is accomplished by a posterior attention system associated with gaze following and the flexible shifting of attention (Mundy, Card, & Fox, 2000; see Mundy, 2003; Vaughan Van Hecke & Mundy, 2007, for extensive reviews). Mundy (2003) suggested that deficits in joint attention skills, especially the initiation of joint attention interactions, are associated with a disturbed function of the dorsal medial-frontal cortex and anterior cingulate. A delay in the development and specialization of social cognitive processing circuits is another possibility that is not mutually exclusive with this assumption (Grice et al., 2005) and may relate to the repeatedly proposed lack of motivation to orient toward normally salient social stimuli in autism (e.g., Dawson, Meltzoff, Osterling, Rinaldi, & Brown, 1998). Thus, it seems that the direction of attention toward social stimuli as proposed in the DAM does not operate in the same way in children with ASD as in typically developing children. Even the first stages of the DAM may be impaired in autism, leading to a failure to respond to others’ joint attention bids and, later in development, to the employment of alternative cognitive strategies to track others’ focus of attention. Yet individual differences, developmental changes, and the need for more empirical evidence discourage drawing early conclusions on the matter. The DAM aims to inspire future research on the processing of social information in ASD.
Both behavioral and neurophysiological methods can provide significant insights on ASD describing variability between individuals in terms of eye gaze detection, gaze following, and joint attention skills. Although young children with ASD often do not spontaneously use eye gaze as a social cue in communication, they are able to learn how to use gaze cues to guide attention, possibly in the same manner as nonsocial cues such as arrows. The development of general cognitive abilities might support this learning process and contribute to individual differences. Given the reported difficulties in initiating joint attention in children with ASD, more studies investigating top-down processes instead of perceptual bottom-up processes are necessary. The complexity of the topic calls for more studies comparing different stages of development and children with different mental abilities, using both behavioral and neuroscience methods.
Neural Underpinnings of Gaze Processing in Light of the DAM
This section addresses the brain structures involved in eye gaze processing (see Figure 1) and abnormalities that can be found in these structures in autism (see Mundy, 2003; Pelphrey, Adolphs, & Morris, 2004, for reviews on neural correlates of social cognitive impairments in autism). The structures will be discussed according to the subsequent perceptual stages of the DAM (Stages 1–4).
The first stage of the DAM requires detection of other people to engage in social interactions. It has been suggested that a subcortical pathway involving the amygdala may underlie newborns’ preference for faces and especially faces with direct eye gaze (Johnson, 2005). The amygdala is also sensitive to gaze direction in adults (Kawashima et al., 1999) and has been proposed as one of the key structures to account for social cognitive abnormalities in autism, for example, impairments in making mental inferences from the eyes (see Baron-Cohen et al., 2000, for a review).
The STS is important to detect social agents on a basic level as it processes biological motion (Puce & Perrett, 2003; Puce et al., 1998) and is one of the sources of the face-sensitive adult N170/infant N290 ERP component (Itier & Taylor, 2004; Johnson et al., 2005). This structure may contain neurons that are specifically sensitive to the presence of eyes with direct gaze, and its activity is enhanced by direct eye contact in a social communicative context (Pelphrey, Adolphs, et al., 2004). Four-month-olds’ enhanced neural processing of faces with direct eye gaze suggests that these neurons are active already in early infancy (Farroni et al., 2002). During ontogeny, humans increasingly process faces in a configural manner (Cashon & Cohen, 2003). In adults, eye-sensitive neurons may then be suppressed by neurons specialized in processing facial configuration (Itier et al., 2006). While eye (gaze) sensitive neurons may be the more powerful source of the infant N290 compared with neurons engaged in face processing, increasing experience with faces may lead to the specialization of the N170 in configural face processing and suppression of eye sensitive neurons.
The second stage of the DAM involves identification of social partners. Even at this stage, eye gaze plays an important role for infants, children, and adults. Hood, Macrae, Cole-Davies, and Dias (2003) showed that children’s and adults’ recognition of faces is modulated by eye gaze direction during both encoding and retrieval stages of a face recognition task. Eye contact facilitates gender judgment and incidental recognition memory of faces (Vuilleumier, Gorge, Lister, Armoni, & Driver, 2005) and leads to increased activation of cortical areas involved in face processing (fusiform gyrus; George et al., 2001). Farroni et al. (2007) tested 4-month-olds’ recognition of faces depending on eye gaze direction during a habituation phase and a preferential looking test phase. Infants showed a novelty preference for unfamiliar faces only in the direct gaze condition. It can be concluded that direct gaze enhances the processing and encoding of faces already in early infancy. The neural correlates of this phenomenon remain to be examined, but the fusiform gyrus, an important structure for encoding the identity of a face, is likely involved (George et al., 2001). Several studies have found reduced involvement of the fusiform gyrus in face processing in autism (e.g., Schultz et al., 2000). However, when fixation on the eye region was ensured by superimposing a fixation cross on the face, persons with ASD showed normal activation patterns of the fusiform gyrus during face processing (Hadjikhani et al., 2004).
The detection of another person’s focus of attention in relation to the self is the third stage of the DAM. The detection of a face with direct eye gaze as observed in newborns may function without this step when based on automatic processes. Neuroimaging studies with adults show that processing the self-relevance of social cues and the detection of others’ intention to communicate, for instance, engaging in mutual gaze or hearing someone call one’s name, activates the dorsal mPFC (Kampe, Frith, & Frith, 2003; Schilbach et al., 2006). This structure is implicated in reasoning about others’ mental states (mentalizing) and joint attention (see Saxe, 2006, for a review). Persons with autism show decreased activation in this area during mentalizing tasks (Castelli, Frith, Happe, & Frith, 2002). Mundy (2003) has suggested that the dorsal mPFC may be implicated in the basic disturbance in social orienting in autism, including the ability to relate oneself to others. New findings investigating gamma oscillatory brain responses to direct and averted eye gaze suggest that the same structure may be implicated in typically developing 4-month-olds’ neural processing of direct eye gaze (Grossmann, Johnson, Farroni, & Csibra, 2007). Although further research is required, these results suggest that an important region of the social brain network may function already in early infancy. However, a functional MRI study with adult participants found increased activity in the mPFC for averted relative to direct eye gaze (Calder et al., 2002). The authors interpret this finding as evidence that “mentalizing” processes are automatically involved in eye gaze detection and that the computation of another person’s goal may require more resources in the case of averted gaze. Thus, the exact function of the mPFC in processing information from the eyes is not established unequivocally and may also extend to functions of the fourth stage of the DAM.
The fourth stage of the DAM involves the detection of another person’s attention in relation to external objects or other people. Calder et al. (2007) found that left and right gaze directions are coded by distinct neuronal populations in the right anterior STS. Although they detected a similar dissociation in the right inferior parietal lobule, they attributed it to that region’s role in attentional orienting rather than to coding the direction of gaze per se. In previous studies, the perception of averted eye gaze also activated regions of the parietal cortex (intraparietal sulcus [IPS]), which are implicated in spatial attention (Hoffman & Haxby, 2000). The IPS is consistently activated in visual orienting tasks and belongs to a general frontoparietal spatial attention network (Corbetta, 1998). It has been suggested that activation of this region during gaze detection reflects covert shifts of attention that are elicited by averted gaze (Haxby et al., 2000). Oscillatory brain activity in a similar region was found in response to averted gaze in 4-month-old infants (Grossmann et al., 2007). However, neural responses in this study were distributed over a broader area, which may indicate a less fine-tuned neural system in infancy than in adulthood.
The STS plays an important role not only in the detection of biological motion but also in the processing of intentions that underlie gaze shifts (Pelphrey et al., 2003). When eye gaze is unexpectedly averted from a visual cue, this area shows increased activation in typically developing children and adults but not in persons with autism (Mosconi, Mack, McCarthy, & Pelphrey, 2005; Pelphrey, Morris, & McCarthy, 2005; Pelphrey et al., 2003). Results from an ERP study with infants indicate that 4-month-olds also expect eye gaze to be directed toward a target (Hoehl, Reid, et al., 2008). However, this effect was found on fronto-central channels and is presumably related to attentional mechanisms. In a similar study with 9-month-old infants and adults, object-incongruent gaze enhanced amplitude over the occipito-temporal area in both age groups (in infants the N290 component; Senju, Johnson, & Csibra, 2006). Presumably, more frontal regions involved in the allocation of attention process referential eye gaze in early ontogeny. Later in development, posterior regions that initially process eye gaze in dyadic situations may then be recruited for the more complex task of processing gaze in relation to objects and the encoding of intentions. The fourth stage of the DAM may thus recruit neural structures similar to those in adults already by 9 months.
To summarize, the initial stage of the DAM, namely, the detection of humans, requires activation of a subcortical face processing pathway including the amygdala and, by at least 4 months of age, the STS region, which encodes biological motion. Then information is fed into structures that process the identity of a face (fusiform gyrus; Stage 2) and the intention to communicate (mPFC; Stage 3). As suggested by Pelphrey, Viola, et al. (2004), feedback mechanisms may then (re)engage the STS region if further processing of the social context or intention of the observed biological motion is required (Stage 4).
Conclusion and Outlook
The reviewed empirical evidence highlights the relevance of eye gaze in early social cognitive development. From early infancy, eye gaze is used to shift attention and facilitate learning, and it may be involved in early threat detection mechanisms. The eyes may thus be a central social cue that enables infants to filter social information from their perceptual input. The observation that eyes are a critical stimulus from early infancy on has raised the question of whether a specific gaze processing module may exist in the human brain (e.g., Baron-Cohen, 1994). In fact, some evidence supports the notion that the neural systems involved in gaze processing meet some of the criteria of a module as defined by Fodor (1983). For instance, eye gaze is processed quickly and automatically, and a characteristic neural architecture seems to underlie this mechanism. However, there are also some caveats to this view. The lack of neuroscientific longitudinal studies of gaze processing leaves open the question of whether the development of the involved structures exhibits a characteristic pace and sequencing. The encapsulated nature of information processing with regard to eye gaze is questioned by findings regarding the interaction between gaze and emotional expression processing even in young infants (Hoehl & Striano, 2008; Hoehl, Wiese, et al., 2008). Finally, the structures implicated in gaze processing as reviewed in this article are, at the level accessible with current methodology, not confined to the processing of eye gaze. For instance, the posterior STS subserves a variety of functions in adults and undergoes considerable development, which is not confined to the fine-tuning of gaze-specific functions. We therefore put forward a model in which different stages of processing may be accomplished by different neural structures at different ages, and one structure (e.g., the STS) may be involved in several stages of the DAM.
In a dyadic face-to-face context, direct eye gaze is detected already by newborn infants (Farroni et al., 2002). We propose that this mechanism is different from gaze detection in 4-month-old infants as it may be resolved by a subcortical pathway as suggested by Johnson (2005) and corresponds to Stage 1 of the DAM, while in 4-month-old infants the STS region and the mPFC are activated by direct gaze, suggesting Stage 3 mechanisms at this age (Farroni et al., 2002; Grossmann et al., 2007). This indicates that the social relevance and self-directedness of mutual gaze is encoded by similar neural systems as in adults by at least 4 months of age. However, we reviewed evidence that the neural system underlying face and eye gaze processing undergoes important developmental changes until early adulthood (Taylor, Edmonds, et al., 2001). While eye gaze is the social cue that engages the cortical sources of the N290/N170 the most during infancy and childhood, in adults, the N170 reflects mainly structural encoding of facial configurations.
When examining gaze detection and gaze following in ASD, important distinctions must be made between attention cueing effects (often unimpaired in ASD; Chawarska et al., 2003), gaze following in natural interactions (often impaired in young children with ASD, but trainable; Leekam et al., 2000), and the initiation of joint attention experiences (relatively consistently impaired in ASD throughout development; Mundy, 2003; Mundy et al., 1994). However, age, mental age, and experimental settings are only some of the relevant factors to consider when studying gaze detection and joint attention in autism. It will be the challenge of future research to examine the exact nature of the repeatedly observed individual differences in gaze following in ASD. We expect that this will help to identify subgroups with particular interventional needs within the spectrum and that this will enhance the success of individually adapted training. One important area of research concerns the development of diagnostic tools that may enable earlier detection of relevant symptoms. Although the diagnostic sensitivity and specificity of neural abnormalities in autism remain to be explored, research as reviewed here is promising and will provide new insights into both typical and atypical development.
We propose that the DAM provides a useful framework for the study of gaze processing in typical and atypical development because it assembles relevant aspects of gaze processing (e.g., detection of eyes with direct gaze, detection of self-relevance of gaze, and detection of external referents of gaze) in an information-processing sequence that can be associated with relevant underlying neural structures. Future studies should determine whether specific predictions based on the DAM prove valid. For instance, under conditions of high cognitive load, mere detection of the presence of a person should be most accurate and rapid, followed by structural face processing, the detection of attention in relation to the self, and then the detection of attention in relation to outside entities. A further prediction is that a familiar person will facilitate rapid identification (Stage 2), which in turn will enhance the speed of processing of object-directed gaze (Stage 4) when compared with a stranger. Consequently, learning about objects will be more effective with a familiar relative to an unfamiliar social partner. Furthermore, the addition of biological motion (i.e., a slight head turn) to eye gaze direction will facilitate processing of the gazed location, even when psychophysical characteristics of the turn are factored out by shifting the head such that the eyes are constantly in the same observed location and in the presence of background noise. This is because the processing of first-stage information (i.e., biological motion) will reduce the salience of other components of the complex visual scene and act as an information filter.
The reviewed findings emphasize that social skills and their underlying neural structures follow dynamic developmental trajectories. Longitudinal studies involving physiological and behavioral measures of social cognitive skills stand to reveal much about the relations between different aspects of social cognition and between overt behavior and underlying neural mechanisms.
Research should also focus more on investigating eye gaze in the context of other social cues. Real social interactions are characterized by dynamic flows of complex information from several changing sources. Infants are remarkably capable of handling this complexity from an early age. For instance, the reviewed data show that infants process emotional expressions depending on gaze cues by 3 months of age (Hoehl & Striano, 2008; Hoehl, Wiese, et al., 2008; Striano, Kopp, et al., 2006). It will be the challenge of future research to investigate how the human infant brain simultaneously processes different and dynamic aspects of social interactions, for instance, information from eye gaze, facial expression, and voice, when these are both congruent and incongruent (see Grossmann, Striano, & Friederici, 2006).
To conclude, research on neural systems involved in face and eye gaze processing benefits from developmental research and vice versa. On the one hand, a developmental perspective sheds light on the puzzle of whether the N170 component reflects activation of an eye detector by revealing the differential roles this component plays at different steps of ontogeny. On the other hand, the measurement of neural indicators of attention can reveal infants’ remarkable social skills in triadic interactions that have been underestimated in studies that examined overt behavioral responses such as gaze following. The application and improvement of neuroscience methods will deepen our understanding of eye gaze processing and social cognition in typical and atypical development.