Four themes from 20 years of research on infant perception and cognition



This paper reviews progress over the past 20 years in four areas of research on infant perception and cognition. Work on perception of dynamic events has identified perceptual constraints on perception of object unity and object trajectory continuity that have led to a perceptual account of early development that supplements Nativist accounts. Work on face processing has charted developmental changes that clarify the way innate systems are modified by experience. Research on perception of goal-directed action and animacy has made significant progress in uncovering the roots of social cognition from 6 months onwards. New methods such as eye tracking and measures of brain activity have done much to confirm and clarify conclusions arising from more conventional looking preference methods. It is likely that future progress in theory and understanding will be made increasingly as a result of triangulation between data arising from conventional and newer methods. Copyright © 2011 John Wiley & Sons, Ltd.

Infant and Child Development began its life as Early Development & Parenting, a title that arose from separate proposals for new journals, one I prepared on early development and one from Brian Hopkins and Willem Koops on parenting. At the time, my reason for proposing a journal on early development arose from the feeling that we needed more journal space to publish the large and ever growing volume of work on infant development. In the 20 years since, this work has continued, and it would be impossible to do justice to what has happened in the last 20 years in a short paper without being very selective. Part of this selectivity was predetermined: the editors had asked me to write about progress in my own area. But that still left a large body of work. For some time I puzzled over a way to home in further on a manageable literature, and finally arrived at a solution that suited me very well if not my audience: I would discuss the work that had particularly interested me and stimulated my thinking. My chosen topics are perception of dynamic events, perception of people, and knowledge of mind. I will also include a section that cuts across these topics, concerning development of infant neuroscience and developing methodologies.

Perception of Dynamic Events

When the journal was formed, work supporting a Nativist account of infant knowledge was thriving (for instance, see Baillargeon et al., 1992, in volume 1 of the journal) and indeed has continued to thrive since. However, over the past 20 years and more there has been a growing prevalence of accounts that make fewer claims and assumptions regarding infant knowledge and that rely instead on explaining early development in terms of the infant's growing ability to process complex events. This work, which has been one of my own major research foci, has its empirical roots in a key paper by Kellman and Spelke (1983), whereas the theoretical basis is largely rooted in information processing.

The focus of Kellman and Spelke's work concerned the infant's ability to ‘fill in’ gaps in perception. Specifically, looking at the top part of Figure 1, were infants able to perceive the rod behind the box as complete despite the fact that its centre was occluded? They tackled this question through an ingenious adaptation of the habituation-recovery technique: infants were habituated to the rod and box display, with the rod moving back and forth behind the box, and were then tested successively on two test displays with the box removed and consisting either of a complete rod or just the parts they had seen during habituation. The rationale was that if they perceptually completed the rod during habituation, the whole rod test would be familiar and the parts novel, whereas if they did not, the rod parts would be familiar and the whole rod novel. Four-month-olds treated the test display consisting of the rod parts as novel, and it was concluded that they perceived object unity.
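The inferential logic of this design can be made concrete with a short sketch. It is an illustration only and not the analysis reported by Kellman and Spelke: the function names, the use of 0.5 as the chance level, and the example looking times are all hypothetical, and a real study would test group data statistically rather than classify a single score.

```python
def novelty_preference(look_complete_rod: float, look_rod_parts: float) -> float:
    """Proportion of total test looking directed at the rod-parts display."""
    total = look_complete_rod + look_rod_parts
    if total <= 0:
        raise ValueError("total looking time must be positive")
    return look_rod_parts / total


def interpret(score: float, chance: float = 0.5) -> str:
    """Map a preference score onto the two possible readings of habituation."""
    if score > chance:
        return "rod parts novel: consistent with perceiving a complete rod"
    if score < chance:
        return "complete rod novel: consistent with perceiving separate parts"
    return "no preference: the test does not discriminate the two readings"


# Hypothetical looking times in seconds, for illustration only.
score = novelty_preference(look_complete_rod=4.0, look_rod_parts=12.0)
print(score, "->", interpret(score))
```

The point of the sketch is that the two hypotheses about what was perceived during habituation make opposite predictions for the same pair of test displays, so the direction of the preference is informative whichever way it falls.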

Figure 1.

Habituation and test displays used by Kellman and Spelke (1983) to investigate infants' perception of object unity.

Progress in the 90s was on two fronts. First, unlike many lower level aspects of visual perception shown to be present at birth, it was demonstrated that object unity developed over the first 4 months, being absent at birth (Slater, Johnson, Brown, & Badenoch, 1996) and present in fragile form at 2 months (Johnson & Aslin, 1995). Second, systematic investigation revealed a good deal more about the nature of object unity, including the perceptual constraints that determined its presence or absence (Johnson & Aslin, 1995; Johnson & Náñez, 1995). For instance, the need for rod motion is apparently to ensure progressive deletion and accretion of background texture, because the phenomenon is lost if there is no background texture. Additionally, although alignment of rod parts is important in the simple rod and box display (Johnson & Aslin, 1996), 4-month-olds are capable of filling in invisible parts of more complex figures such as crosses and circles (Johnson, Bremner, Slater, & Mason, 2000).

The same experimental logic has been used to investigate infants' perception of trajectory continuity (Figure 2). In this case, infants are habituated to an event in which a ball moves back and forth, disappearing temporarily behind a central occluder. The question here is whether they fill in the invisible parts of the object's trajectory. Again, the test for this involves presenting infants with two posthabituation displays, both with the occluder absent, one showing the object travelling continuously and the other showing it travelling discontinuously. If infants processed a continuous trajectory during habituation, they should treat the discontinuous display as novel, whereas if they processed a discontinuous trajectory, the opposite preference should be obtained.

Figure 2.

Displays used by Johnson, Bremner et al. (2003) to investigate infants' perception of trajectory continuity. A. Habituation display, B. Discontinuous test display, C. Continuous test display.

In this case, a similar developmental story emerges, though lagging that for object unity by about 2 months. Six-month-olds display robust perception of trajectory continuity (Johnson, Bremner et al., 2003), whereas 4-month-olds only do so when the time or distance out of sight is short (Bremner et al., 2005), and 2-month-olds show no evidence of this ability (Johnson, Bremner et al., 2003). Additionally, the ability of 4-month-olds is limited to linear trajectories and they have particular problems processing oblique linear trajectories (Bremner et al., 2007).

This work has potential for integration with theory emerging from the literature on intersensory perception. At an empirical level, it emerges that from at least 2 months infants are sensitive to the dynamic co-location of a moving visual object and an associated sound (Bremner et al., 2011) and the addition of dynamic auditory information for object movement supports perception of trajectory continuity (Bremner, Slater, Johnson, Mason, & Spring, submitted). Theoretically, both bodies of literature are underpinned by the claim that perceptual learning is the key process in early development. An exciting principle arising from work on intersensory perception is that the presence of intersensory redundancy (presentation of the same information across two or more senses) recruits infant attention and facilitates learning (Bahrick, Flom, & Lickliter, 2002; Bahrick & Lickliter, 2000; Bahrick, Lickliter, & Flom, 2004). This principle is proving valuable in explaining the development of selective attention and learning about the structure of the animate and inanimate world.

Some would view perceptual learning approaches of this sort as alternatives to nativism. My own view is that their primary strength is in providing approaches complementary to Nativist findings, and my hope is that an important area of progress in the future will be the integration of what might appear to be opposed theoretical views, leading to a developmental account in which perceptual learning processes lead to the formation of the perceptual structures that set constraints on the later structure of infants' knowledge. In the process, however, there may need to be some dilution of the stronger Nativist principles regarding the origins of knowledge.

Perception of People: Face Perception

In the early 90s a good deal was known about infant face perception, and there was strong evidence for at least a crude system for recognizing facial configurations at birth (Goren, Sarty, & Wu, 1975; Johnson, Dziurawiec, Ellis, & Morton, 1991). What was controversial then, and remains so now, is the form of the early face recognition system. Johnson et al. (1991) proposed a model in which an early crude recognition system was replaced after a few months by a more refined system capable of forming discriminations between people. Although this account was well based in developmental neuroscience, there was evidence from a range of studies (for instance, Walton, Bower, & Bower, 1992) that newborns rapidly learned to discriminate between mother and female stranger, and disagreement continues regarding the form of the early face recognition system.

In the early 90s we also knew that infants preferred faces that adults rated as more attractive (Langlois et al., 1987; Langlois & Roggman, 1990). This rather peculiar finding took on greater theoretical significance once it was demonstrated that if faces were averaged to produce a prototype the outcome was judged more attractive than any of the individual faces. It seemed likely that infants preferred attractive faces because they approximated more closely to a prototype. More recently, it has been demonstrated that attractiveness preferences exist at birth (Slater et al., 1998) and are based on the internal configuration rather than external features (Slater et al., 2000).

To my mind, the important progress of the past 20 years has been to chart out developmental change in face recognition and hence to clarify the way innate systems are modified through experience. One of the main principles to emerge from recent research is that face perception is initially very general but progressively becomes more specific to the faces that the infant is exposed to, a phenomenon often known as perceptual narrowing. The generality of the system is demonstrated by the fact that 6-month-olds are able to discriminate Barbary macaque monkey faces, whereas 9-month-olds and adults have lost this ability (Pascalis, de Haan, & Nelson, 2002). And in the human face domain, 3-month-olds can discriminate faces of other races, but by 9 months they have lost this ability, only discriminating faces within the race they are exposed to (Kelly et al., 2009). Another finding that probably provides further evidence for effects of experience is that a general preference for female over male faces emerges within the early months (Quinn, Yahr, Kuhn, Slater, & Pascalis, 2002), and is limited to same race faces (Quinn et al., 2008). The favoured interpretation is that this preference emerges as a result of accumulated experience in which infants' caregivers are predominantly female, and support for this interpretation is provided by the finding that infants whose primary caregiver is male show a preference for male faces (Quinn et al., 2002).

The picture that emerges is a face recognition system that is initially very general to the extent of serving to identify and discriminate between faces of other species as well as other races. However, specific experience results in perceptual narrowing such that discriminative abilities and preferences become progressively more specific to the race the infant experiences. One of the exciting things about a model of this sort is that it raises a range of quite specific questions for future research. For instance, are there advantages to perceptual narrowing in terms of greater specialization resulting in greater accuracy of discrimination? Also, what happens to the narrowing process, specialization, and accuracy in the case of infants brought up by two parents of different race? Finally, are very early abilities explained by an innate system that detects a very general innate prototype template, or are very young infants' preferences directed to the average of the faces they have experienced? In terms of testable possibilities and alternatives, the future of infant face perception research certainly looks bright.

It is worth noting that in establishing perceptual narrowing effects, face perception research is relatively late on the scene: such effects were established long ago in the case of speech perception (Werker & Tees, 1984) and evidence of this sort continues to accumulate (Kuhl, Williams, Lacerda, Stevens, & Lindblom, 1992; Mattock & Burnham, 2006). But it may be that some of the specific questions arising from face perception research could also be applied productively in the domain of speech.

Perception of Goal-directed Action and Animacy

The third substantive area I have selected also concerns person perception but at the higher level of interpreting human actions. In this respect, the work complements research with older participants on theory of mind and can be seen as establishing the early roots of an ability that has been the focus of considerable research effort for well over 20 years now.

One of the earlier studies on this was reported by Meltzoff (1995), who used an imitation task to investigate infants' understanding of goal-directed action. Eighteen-month-olds were exposed to adult acts that failed to achieve their goal. Subsequently, they imitated the intended act rather than the failed one that they had actually witnessed. This seems strong evidence that they understood the goal of the act. Interestingly, no such effect occurred when a mechanical device modelled the failed act, and the conclusion was that only animate acts are interpreted in terms of goals and intentions. Although Huang, Heyes, and Charman (2002) have suggested that infants were responding at a lower level to factors such as the causal structure or affordances of the objects involved, this does not appear to explain the lack of an effect in the case of a mechanical device.

Itakura et al. (2008) made ingenious use of robotics to identify more closely what conditions were necessary for an action to be perceived as social. They demonstrated that 24- to 26-month-olds imitate the ‘intended’ act when faced with a failed act executed by a robot, but only when the robot engaged in eye-contact before and after the action. They thus conclude that it is human-like behaviour rather than human-like morphology that is key to infants' ability to read animate intention. There is a suggestive link between this finding and a recent attempt to explain the A not B search error. Topál, Gergely, Miklósi, Erdőhegyi, and Csibra (2008) demonstrated that infants were more likely to make the error if the hider engaged them in eye contact and interaction. Their conclusion is that the error results from a misinterpretation of the adult's social cues, a misinterpretation in the sense that they take socially marked hiding at A to indicate that this is the place to search for objects. It is likely that a focus of future research will be the whole issue of how infants of different ages interpret social cues, both inside and outside the experimental setting.

Woodward (1998, 1999) adopted a rather different approach to the same issue. In her 1998 paper she habituated 6- to 9-month-old infants to an event in which the investigator reached and grasped an object. Following this, they were tested on events in which either the same act was directed to a different object or the same grasp was executed on the original object, following a different path of movement of the reach. She found that there was more recovery of looking to the former than to the latter. Interestingly, and in parallel with Meltzoff's finding, she obtained no such effect when a robot arm executed the acts. Her conclusion is that in the case of human actions infants focus on the relationship between the action and its goal, that is, the specific object acted on. The action used to achieve the goal does not matter, such that they do not show recovery of looking to a change in the trajectory of the action. But they do show recovery of looking when the goal of the act changes.

Woodward (1999) habituated 5- to 10-month-olds to touching or grasping actions. In the touch case, the hand appeared to fall on the object and thus did not have the appearance of being goal directed. Again, test trials involved either the same action applied to a different object or the same action on the same object but through a different path of motion. In the case of the grasping action, 9-month-olds showed more recovery of looking to the same action on a new object than to a different path of motion to the same object, and there was a marginally significant trend in the same direction with 5-month-olds. In the case of the touching action, however, there was no such effect at any age tested. Thus, it appears that infants concentrated on the goal only in the case of intentional acts; when the act seemed unintentional a change in the goal produced no more recovery of looking than a change in the path of the action. Interestingly, using the same methodology as her grasping/touching studies, Woodward (2003) subsequently demonstrated that the same phenomenon emerged at a somewhat later age for the case of looking rather than grasping; although 7- and 9-month-olds did not respond to a change in the relation between looker and object, 12-month-olds did.

Although the evidence discussed so far suggests that only action with clearly animate characteristics is perceived as intentional and goal directed, there is now evidence that infants will apply a similar interpretation to movements of inanimate objects even when their movement is not explicitly biomechanical (that is, not structured according to biological motion of inter-related body parts). Csibra, Biro, Koos, and Gergely (2003) habituated 9- and 12-month-olds to a dynamic display in which a large ball pursued a small ball. Part of the way along its trajectory, the small ball passed through a narrow gap in a barrier—a gap too narrow for the large ball to pass through. The large ball thus made a detour round the barrier before returning to its pursuit path. On test trials, the gap in the barrier was made wide enough for the large ball to pass through. On one test trial, the large ball followed the small ball through the gap without detour, and in the other test trial it made the same detour as before. The rationale was that the direct pursuit test trial corresponded to the large ball being an agent carrying out rational action to obtain a goal (capture of the small ball), whereas the detour test trial was not rational since it contained an unnecessary detour. Twelve-month-olds looked longer at the detour test trial, as if they had detected that the detour was not a rational act. In contrast, 9-month-olds looked equally at both test trials.

Csibra et al. (2003) also habituated infants to an object moving from left to right along a surface, jumping at a point in the middle of its path, before reaching a goal object at the other side of the stage. A screen hid the area above the surface at the jump point but did not hide the jump. After habituation, the screen was removed to reveal either an empty space or a cube that made the jump necessary. Again, 12-month-olds, but not 9-month-olds, looked longer at the no obstacle jump. The assumption is that older infants perceive the jump as a rational act when there is an obstruction on the way to the object's goal, but not when there is no obstruction. Csibra et al. (2003) thus argue that 12-month-olds interpret the movements of inanimate objects as if they were rational actions aimed at reaching a goal. This is not to say that they are mis-attributing rationality to inanimate objects, but that they detect the structure of rational action in quite abstract movement sequences.

Maybe there is no contradiction between the findings of Csibra et al. and those suggesting that biomechanical and social action are key to perception of animate action. There are different levels at which animacy could be perceived: a finely detailed level at which inter- and intra-limb components of action are analysed, and a less fine-grained level at which the relationship between two gross movements is analysed. Thought of that way, simple pursuit may be perceived as a social act just as a reach and grasp is.

In summary, there is a rich literature emerging in this area, which is beginning to build quite a complex picture of the factors that lead infants, from 6 months onwards, to perceive animate, goal-directed action. The initial appearance that the act had to be executed by a human being has been modified to suggest that it is the structure of the act that matters rather than the appearance of the actor. What is more, the structure that is sufficient for such perception can be fine grained, in the detail of a reaching movement or the creation of eye contact, or it can exist at a gross level in terms of the way two bodies move relative to one another.

Methodological Advances

Although much of the research on infant perception and cognition continues to use visual attention in the form of overall looking time to the stimulus as the response measure, there is growing concern that over-reliance on such a crude measure is unsafe. Although infants' tendency to look longer at novel stimuli is well established, there are circumstances, particularly when habituation has not been achieved, under which they will look longer at familiar stimuli (Hunter & Ames, 1988). This concern arises more markedly in violation of expectancy work in which habituation is often not carried out to a criterion, and it is assumed that longer looking indicates recognition that an event violates the infant's expectations about how the world works. In much of this body of work, controls to rule out lower level perceptual explanations are based on the assumption that the infant will look more at perceptually novel events. Thus, experiments are often designed so that the event that violates a physical rule is perceptually more familiar than the lawful event. A good example is Wynn's (1992) work on infants' knowledge of addition and subtraction. In the subtraction case, the infant initially sees two objects, which are then screened, whereupon a hand removes one. The screen then comes down to reveal either one (the correct numerical outcome but perceptually novel) or two (the incorrect outcome but perceptually familiar). Five-month-old infants look longer at the incorrect outcome and from this and a corresponding result in the case of addition, Wynn concludes that young infants have a basic knowledge of subtraction and addition. However, as Cohen and Marks (2002) point out, the findings may be based on a familiarity preference following incomplete habituation and thus may provide no information about the infant's numerical understanding.
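The distinction between habituating to a criterion and presenting a fixed number of familiarization trials can be illustrated with a minimal sketch. It assumes one commonly used form of criterion, namely that mean looking over the most recent three trials falls below half the mean over the first three; this rule, the function name, and the looking times are assumptions for illustration and are not taken from any of the studies cited above.

```python
from statistics import mean


def reached_criterion(looking_times, window=3, proportion=0.5):
    """Return True once mean looking over the last `window` trials has fallen
    below `proportion` of the mean over the first `window` trials."""
    if len(looking_times) < 2 * window:
        return False  # too few trials for two non-overlapping windows
    baseline = mean(looking_times[:window])
    recent = mean(looking_times[-window:])
    return recent < proportion * baseline


# Hypothetical looking times in seconds, for illustration only.
habituated = [20.0, 18.0, 16.0, 9.0, 8.0, 7.0]       # looking has roughly halved
fixed_trials = [20.0, 18.0, 16.0, 15.0, 14.0, 13.0]  # same trial count, little decline

print(reached_criterion(habituated))    # True
print(reached_criterion(fixed_trials))  # False
```

In the second case the infant has seen the same number of trials but has not met the criterion, which is the situation in which, on Hunter and Ames's (1988) account, longer looking at test cannot safely be read as a novelty response.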

This example points up the need for other measures to supplement basic measures of looking duration. And, increasingly, researchers are using other measures, either to supplement or to replace basic looking time. In particular, the development of sophisticated eye-tracking systems that do not involve any head mounted equipment has led to an increase in the number of studies that measure not just how long infants look at an event but precisely where they look.

Eye tracking has been used to test different interpretations of Wynn's results. Slater, Bremner, Johnson, and Hayes (2010) measured where infants looked in addition and subtraction events. In the subtraction event, two objects were initially present, and once screened a hand removed the left-hand object. In the incorrect outcome of two objects, infants looked significantly longer at the left-hand object that should no longer be there (the fact that no such effect appeared in the case of addition is to be expected given that it is unusual for young infants to look at empty spaces). This result appears to rule out the perceptual familiarity interpretation of Wynn's results, since this does not predict any difference in looking at the two objects.

A good example also exists in the case of trajectory perception, covered in the first section of this paper. Johnson, Amso, and Slemmer (2003) presented the habituation display in Figure 2 to 4- and 6-month-olds, but instead of measuring looking time to the whole event, measured the number of anticipative saccades to the point of re-emergence, once the object had gone out of sight. They found that 4-month-olds made few anticipations, whereas 6-month-olds made more. This parallels the habituation-novelty finding for this occluder width, in which 4-month-olds perceived the trajectory as discontinuous whereas 6-month-olds perceived it as continuous, and agrees with other work suggesting a gradual improvement in the ability to keep track of the hidden components of object movement (Gredebäck & von Hofsten, 2004; Rosander & von Hofsten, 2004). Additionally, Bertenthal, Longo, and Kenny (2007) found that predictive tracking was more frequent when the object disappeared by progressive deletion at the occluder boundary than when it disappeared instantaneously or by implosion. This suggests that progressive deletion is an important cue to perception or representation of object continuity.
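What it means to count anticipations from eye-tracking data can be sketched as follows. This is a simplified illustration and not the scoring procedure of any study cited here: the data structure, the field names, and the rule that gaze must reach the re-emergence region while the object is still hidden are all assumptions, and published studies typically add further criteria such as allowances for saccadic reaction time.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class OcclusionPass:
    """One pass of the object behind the occluder (all times in seconds)."""
    disappearance: float           # object becomes fully hidden
    reemergence: float             # object starts to reappear
    gaze_arrival: Optional[float]  # gaze first enters the re-emergence region; None if never


def is_anticipation(p: OcclusionPass) -> bool:
    """True if gaze reaches the re-emergence region while the object is hidden."""
    if p.gaze_arrival is None:
        return False
    return p.disappearance <= p.gaze_arrival < p.reemergence


def anticipation_rate(passes: Sequence[OcclusionPass]) -> float:
    """Proportion of passes on which an anticipation occurred."""
    if not passes:
        raise ValueError("at least one pass is required")
    return sum(is_anticipation(p) for p in passes) / len(passes)


# Hypothetical data for one infant, for illustration only.
passes = [
    OcclusionPass(disappearance=1.0, reemergence=1.6, gaze_arrival=1.4),     # anticipation
    OcclusionPass(disappearance=5.0, reemergence=5.6, gaze_arrival=5.9),     # reactive
    OcclusionPass(disappearance=9.0, reemergence=9.6, gaze_arrival=None),    # never looked
    OcclusionPass(disappearance=13.0, reemergence=13.6, gaze_arrival=13.5),  # anticipation
]
print(anticipation_rate(passes))  # 0.5
```

A rate computed in this way describes where and when the infant looked; it does not by itself say whether the looks reflect a representation of the hidden trajectory or a learned contingency between events on either side of the occluder.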

Developmental neuroscience is another growth area in infancy research. In addition to providing an alternative theoretical framework in which to consider behavioural findings in infancy, this approach brings with it alternative measures that have potential to clarify the processes underlying behaviour. The most common technique is to measure brain activity using EEG. Although this work is technically challenging, there are now many examples in the literature. I shall provide just one that bears on the trajectory perception literature and the whole issue of object permanence. In a series of experiments, Kaufman, Csibra, and Johnson (2003) presented 6-month-old infants with an event in which a moving object temporarily disappeared behind an occluder, while using EEG to measure gamma-band activity, a form of activity that occurs in adults when they hold an object in mind (Tallon-Baudry, Bertrand, Perronnet, & Pernier, 1998). They detected enhanced gamma-band activity in the right temporal lobe while the object was out of sight. What is more, gamma-band activity was further enhanced if, while the object was out of sight, the occluder was lifted to reveal no object. They conclude that gamma-band activity is the neural basis for representation of occluded objects, and that the further increase in activity when the occluder rises to reveal no object reflects the effort to maintain a representation of an object that should be present, in the face of its visible absence. In a follow-up study, Kaufman, Csibra, and Johnson (2005) identified one of the perceptual factors that appears to trigger the absent object representation process. Increased gamma activity was obtained when a stationary object disappeared in the normal way behind a moving occluder, by progressive deletion at the occluder boundary. But no such increase in activity was obtained if the object disintegrated through random deletion of pixels. This fits well with behavioural data suggesting that progressive deletion is a key factor supporting object representation (Bertenthal et al., 2007).

There are reasons to believe that these relatively new methods will supplement rather than replace more conventional measures of gross looking time. Although precise eye tracking delivers fine-grained information regarding fixation patterns, this greater detail does not on its own necessarily lead to answers to the psychological questions that the work aims to address. For instance, although it is tempting to assume that anticipatory saccades towards the point of re-emergence of an object from behind an occluder are evidence for representation of the object's invisible path behind the occluder, it was pointed out many years ago that such a fixation pattern could be the product of learning a contingency in which two separate events occurred in a predictable sequence to left and right of an occluder (Goldberg, 1976). In contrast, the habituation-novelty work is structured logically to make explicit predictions based on whether infants perceive or represent the object's invisible trajectory. Thus, I would argue that eye-tracking evidence is valuable here in providing an alternative measure to confirm and refine conclusions from habituation-novelty work.

In a rather different way, brain activity measures are insufficient on their own. Although they are direct measures of brain function, at times quite big interpretative leaps have to be made regarding their meaning, particularly when, as is typical, extrapolation takes place between known adult brain function and the function of equivalent areas of the infant brain. Additionally, if we agree that psychology is the science of behaviour, they are indirect measures of behaviour. Thus, again it may be argued that these measures should supplement rather than replace behavioural measures, and the best developmental neuropsychology should continue to measure both brain activity and behaviour. Considering the nature of evidence in this way, in the future we may expect to see significant theoretical advances as a result of data triangulation in multi-measure studies, in which looking preference work will maintain an important role.