Effect of racial bias on composite construction

Summary
We investigated how prior bias about a face's racial characteristics can affect its encoding and resultant facial composite construction. In total, 61 participants (24 Europeans, 18 Indians living in India and 19 Indians living in Europe) saw a racially ambiguous unfamiliar face and were led to believe it was either European or Indian. They created a composite of this face, using EFIT6. Two groups of independent raters (one Indian, the other European) then assessed the apparent race of each composite. A different two groups (one Indian, one European) assessed each composite's degree of resemblance to the target face, to determine whether this was influenced by the constructors' initial categorisation of the target face as “own-race” or “other-race.” Composites appeared significantly more “Asian” or “European” according to the bias induced in their creators, but there was no evidence of any own-race bias in the resemblance ratings for the composites.


| INTRODUCTION
A facial composite is a representation of the face of an unknown person, usually a crime suspect. It is created by an eye witness for whom the face is unfamiliar, and publicised by the police in the hope that it will be recognised by someone who knows the individual concerned.
"Pure" research on face processing has tended to focus on the nature of the information that is extracted from a face in order to achieve recognition: there are now hundreds of research articles on the nature of "configural" processing, for example (entering the keywords "configural processing faces" into PsycINFO yielded 581 results at the time of writing; see reviews of the concept in Piepers & Robbins, 2012 and Burton, Schweinberger, Jenkins, & Kaufmann, 2015). However, people are not merely passive information-processing systems: how they process a face is affected by their previous experience with faces, and more generally, by their socio-cultural experiences with the people to whom the faces belong.
Except in the case of studies of own-group biases such as the "other race effect" 1 (see below), most research on unfamiliar face recognition has deliberately presented participants with facial images that have little significance for them. Studies of this kind have produced a large amount of insight into how we recognise faces but fail to do justice to the fact that, outside of the psychology laboratory, face encoding may be heavily influenced by an individual's experiences, expectations, prejudices and schemas.
Relatively few studies have investigated how a viewer's attributions about a face might influence how they encode and remember it. One of the earliest was by Shepherd, Ellis, McMurran and Davies (1978). Female participants saw an adult male face and constructed a Photofit of it from memory immediately afterwards. Half were told the man was a lifeboat captain, and half that he was a murderer. A group of independent judges then rated the Photofits on nine attributes, such as attractiveness, intelligence, kindness and sociability. This study is sometimes misreported as showing that the composites' appearance was affected by whether the constructors thought the photograph showed a lifeboat captain or a murderer. In fact, while the ratings tended to be more positive for the "lifeboat captain" than for the "murderer," this manipulation significantly affected only two of the constructors' ratings of the face (for intelligence and attractiveness), and none of the judges' ratings; and if a correction for the use of multiple tests is applied, even these two differences are non-significant. However, this study is important in drawing attention to the possibility that witnesses' stereotypes might systematically influence how they perceive a face, which in turn might affect any composite they constructed subsequently. Similar conclusions can be drawn from a study by Davies and Oldman (1999), which found some indications that the appearance of E-Fit composites of familiar faces (celebrities) was influenced by whether the constructors liked or disliked the celebrities concerned.
A few researchers have investigated how racial categorisation of a face can affect its appearance, though not in the context of composite construction. Malpass (2001, 2003) demonstrated that the appearance of a racially ambiguous face could be markedly biased by presenting it with a racially stereotypical hairstyle (either Hispanic or African-American). An ambiguous-race face that had a stereotypically African-American hairstyle was seen as darker in complexion and was considered to possess more African-American features than the same face with a stereotypically Hispanic hairstyle. Hairstyle also affected memory for the faces: Hispanic participants were less likely to recognise faces with an African-American hairstyle than a "Hispanic" one. Levin and Banaji (2006) found that White faces were judged to be lighter than Black faces, even when they were matched for reflectance. They also found that a racially ambiguous face (produced by morphing an African-American face with a European-American face) was perceived to be lighter or darker, depending on the racial label ("White" or "Black") with which it was shown. Hilliar and Kemp (2008) presented European-Australian and East Asian-Australian participants with a series of racially ambiguous male faces (produced by morphing together an East Asian face and a European face). These morphs were rated as looking significantly more "European" if they were paired with typically European names than if they were presented with typically Asian names. Both races of participant showed similar effects. MacLin and Malpass (2003) suggested that "categorization of race plays a substantial role in the perception and representation of faces. When a key feature acting as a racial marker is present, it causes a face to be categorized as one race or another, thereby altering perception of the face as consistent with other exemplars from the categorized race. Furthermore, categorization appears to alter the storage and representation of individual characteristics that enable the face to be subsequently recognized" (p. 252).
The present study tests this hypothesis. We used a similar method to Shepherd, Ellis, McMurran and Davies (1978), but with two important modifications. Firstly, we used a more sophisticated composite system, EFIT6 (Visionmetric Ltd.), which allows for virtually limitless modifications of the composite during its construction. Perhaps one of the problems with Shepherd et al.'s study was that even if the composite constructors in the "lifeboat captain" and "murderer" conditions did perceive the test face differently, Photofit's technological limitations may have prevented any subtle differences from being represented in the final composites. This would explain why the composite constructors, but not the composite raters, showed a trend towards being influenced by the experimenters' manipulations (as evidenced by the constructors' ratings of the target photograph).
The second modification was in the nature of the biasing information that was used. We tried to influence participants' perception of the race of the target face. In Shepherd et al.'s study, participants are unlikely to have possessed well-developed schemas for "lifeboatman" and "murderer." In contrast, participants should have much better-developed schemas of what a typical "European" or "Indian" face looks like. In addition, as one of our reviewers pointed out, it is well-established that race has pronounced effects on face perception: we are better at recognising faces belonging to our own racial group than those of another race, the so-called "other race effect" or "own-race bias" (review in Meissner & Brigham, 2001).
We therefore used a variant of Hilliar and Kemp's (2008) technique. We employed three different groups of composite constructors: Caucasians living in Europe (henceforth referred to as "European Caucasians"); Indians living in India (henceforth referred to as "Native Indians"); and Indians living in Europe (henceforth referred to as "Overseas Indians"). Thus, we had groups of Indian participants with relatively limited or extensive contact with Caucasians, respectively.
Half of the participants in each group were led to believe that the target face was "Asian" and half were led to believe it was "European." Participants saw a racially ambiguous face (a morph between an Asian face and a European face) after being presented with racially stereotypical names to encourage them to assume this face was either Asian or European. Each participant constructed a composite from memory, using EFIT6. These composites were then independently rated for their racial appearance and, separately, for their degree of resemblance to the original face.
On the basis of previous research, we predicted that the racial appearance of the composites would be significantly influenced by whether the constructors thought the face was "Asian" or "European." As well as investigating whether participants' racial categorisation of the target face affected the racial appearance of the composites they produced of it, we were also interested in whether it would affect the degree of resemblance between the composites and the target face.
Theories about the cause of the other-race effect fall into two broad categories, according to whether they emphasise the role of perceptual or social psychological factors. "Perceptual expertise" explanations suggest that the other-race effect occurs as the result of differential experience with own- and other-race faces. Prolonged experience with our own race leads to face-processing systems becoming better tuned for the types of faces with which we have greater experience. There are various suggestions about precisely what this "expertise" might involve. According to Valentine's (1991) Multidimensional Face Space model, expertise lies in a better representation of the facial dimensions necessary for individuating own-race faces, compared to those needed to distinguish between out-group faces. Others have proposed that expertise consists of an increased ability to extract the configural information from faces which supposedly underlies efficient face recognition (e.g., Rhodes, Tan, Brake, & Taylor, 1989; Sangrigoli, Pallier, Argenti, Ventureyra, & de Schonen, 2005). Hills and Pake (2013) found that people scan own-race faces in a characteristic way. This strategy is optimised for extracting information that is particularly useful for distinguishing between own-race exemplars. Hills and Pake argue that part of the reason for the other-race effect is that these scanning strategies do not work well with other-race faces, but people persist in using them nevertheless.
Perceptual expertise explanations suggest that the other-race effect occurs because the same type of processing is used for all faces (regardless of whether or not it is optimised for that particular class of face). In contrast, "social cognitive" models of face processing, such as Sporer's (2001) "Ingroup-Outgroup Model (IOM)" or the "Categorisation-Individuation Model (CIM)" (Bernstein, Young, & Hugenberg, 2007; Hugenberg, Wilson, See, & Young, 2013; Hugenberg, Young, Bernstein, & Sacco, 2010) suggest that differences in the effectiveness of recognising own- and other-race faces arise because the two types of face are processed in different ways, following their categorisation as "in-group" or "out-group" members. The other-race effect is conceived of as being a special instance of differences between in-group and out-group facial processing. In-group faces are processed in a more individuated way, with a focus on facial characteristics that would serve to identify a particular face. Out-group faces are subject to "cognitive disregard" (Rodin, 1987) and are merely categorised as "out-group" in an undifferentiated manner.
There is some evidence that merely categorising faces as "in-group" or "out-group" is sufficient to produce differences in their memorability, even when all of the faces are drawn from a homogenous set (e.g., Bernstein et al., 2007).
Perceptual expertise and social cognitive explanations of the other-race effect make different predictions about how well own-race and other-race composites are likely to resemble the target face in the present study. Because perceptual expertise explanations suggest that the same kind of processing is used for all faces, they predict that composite resemblance ratings should be unaffected by whether the constructor believes the target face is European or Asian. The quality of the composites should be determined solely by the perceptual information that is extracted from the target face by whatever processes are habitually used with own-race faces.
In contrast, social cognitive explanations predict that participants who categorise the face as being the same race as themselves should produce more individuated (and hence more recognisable) composites than participants who categorise the face as "other-race." Hence European participants who believe the target face is European should produce more recognisable composites than European participants who think it is Indian. A complementary own-group advantage should apply in the case of the two groups of Indian participants.

| Design
There were two independent variables. The first was whether facial composite constructors were biased towards thinking a racially ambiguous face of a "criminal" was European or Indian. This was achieved by showing the face accompanied by the names of "accomplices" with either typically Indian or European names. The second IV was the race of the composite constructors themselves. Constructors were either native Indians (residing in India), overseas Indians (residing in Europe) or European Caucasians (Caucasians living in Europe).
The combinations of these two IVs thus gave rise to six groups of participants:
1 Native Indians who constructed a composite of an "Indian" face;
2 Native Indians who constructed a composite of a "European" face;
3 Overseas Indians who constructed a composite of an "Indian" face;
4 Overseas Indians who constructed a composite of a "European" face;
5 European Caucasians who constructed a composite of an "Indian" face;
6 European Caucasians who constructed a composite of a "European" face.
The primary dependent variable was the apparent race of each composite, as measured by two separate groups of independent raters (one Indian and one European) using a seven-point scale ranging from "1" (wholly European in appearance) to "7" (wholly Asian).
A secondary dependent variable was the rated similarity of each composite to the original target face, as assessed by another two separate groups of independent raters (one Indian, the other European), using an 11-point scale ranging from "0" (no resemblance) to "10" (perfect resemblance). This research was approved by the University of Sussex Cross-Schools Research Ethics Committee.

| Participants
There were four separate sets of participants. 2
1 Eighty-four participants (53 Indian [28 male], mean age 25 years; 32 European [17 male], mean age 23 years) were involved in the preliminary process of rating the apparent ethnicity of face images, in order to obtain a racially ambiguous target face for use in the composite construction phase.
2 Sixty-one participants (mean age 24 years, 35 males in total) acted as composite constructors. Eighteen were Indians living in India ("Native Indians"), 19 were Indians living in Europe ("Overseas Indians") and 24 were European Caucasians.
3 One hundred twenty participants (75 Indians, mean age 26 years; 45 Europeans, mean age 25 years; 58 males in total) rated the racial appearance of the composites.
4 One hundred ninety-five participants (65 Indians, mean age 29 years, 20 male; 130 Europeans, mean age 32 years, 54 male) rated the resemblance of the composites to the target face.

| Apparatus
A racially ambiguous face was used as the stimulus for the main study. This was produced as follows. Firstly, three pairs of faces were obtained. These were sourced from the internet and from the Stirling Face Database. Each pair contained one "typically Asian" man and one "typically European" man, as judged by the experimenters, one of whom is Asian and the other European.
The individual depicted in each picture was a young male with a neutral facial expression, and no beard, piercings, distinguishing marks or jewellery. The images were colour, full-face images, measuring 413 × 531 pixels. A set of 25 morphs were made between each pair of faces, using "Smartmorph" (Meesoft Ltd.). The morphs varied, in approximately 4% increments, from a blend of 100% Asian/0% European (i.e., wholly Asian in appearance) to 0% Asian/100% European (i.e., wholly European in appearance).
The middle nine images (ranging from approximately 66% Asian/33% European, to 33% Asian/66% European) were then selected from each of these sets, to produce three sets of faces which varied in apparent ethnicity.
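The arithmetic behind the morph sequence can be sketched as follows. This is a minimal illustration that assumes the 25 morphs were evenly spaced between the two endpoint faces (the text only says "approximately 4% increments"); the variable names are hypothetical and do not come from the morphing software.

```python
# Blend levels for a 25-step morph sequence, as (percent Asian, percent European).
# Assumption: steps are evenly spaced between the two endpoint faces.
N_MORPHS = 25  # number of morphs per face pair, as stated in the text

step = 100 / (N_MORPHS - 1)  # about 4.17 percentage points per step
levels = [(100 - i * step, i * step) for i in range(N_MORPHS)]

# Select the middle nine images of the sequence.
start = (N_MORPHS - 9) // 2
middle_nine = levels[start:start + 9]

for asian, european in middle_nine:
    print(f"{asian:5.1f}% Asian / {european:5.1f}% European")
```

Under this even-spacing assumption, the middle nine morphs run from roughly two-thirds Asian to roughly two-thirds European, which matches the range described above.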
Each of 84 participants was presented with all three sets of faces. For each set, they were asked to rate each face on a scale ranging from "1 (very European looking)" to "7 (very Asian looking)." These ratings were averaged across participants, and the morph level that was closest to the scale's midpoint in each case was identified. We then randomly chose one of these three midpoint morphs to use as our target face 3 (see Figure 1).
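The midpoint-selection step described above can be illustrated with a short sketch. The ratings below are invented purely for illustration (they are not the study's data), and the function name and morph labels are hypothetical.

```python
from statistics import mean

SCALE_MIDPOINT = 4  # midpoint of the 1-7 rating scale


def closest_to_midpoint(ratings_by_morph):
    """Return the morph label whose mean rating is nearest the scale midpoint."""
    means = {morph: mean(ratings) for morph, ratings in ratings_by_morph.items()}
    return min(means, key=lambda morph: abs(means[morph] - SCALE_MIDPOINT))


# Hypothetical ratings from four raters for three morph levels of one face pair.
example_ratings = {
    "morph_11": [2, 3, 3, 4],
    "morph_12": [4, 4, 3, 5],
    "morph_13": [5, 6, 5, 4],
}

print(closest_to_midpoint(example_ratings))
```

In the study this selection was made separately for each of the three sets of faces, giving three candidate midpoint morphs.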
In the main study, the facial composite construction software EFIT6, version 1.10.10, (Visionmetric Ltd.) was used to create facial composites of the target face on an Apple MacBook laptop computer (screen size 13.3 in., 1,440 × 900 pixels).

| Composite construction phase
Each constructor saw the same racially ambiguous face and then produced a composite of it, using EFIT6. Half of the participants in this phase were led to believe that the face they viewed was Indian, and half that it was European. Each participant produced their composite individually, interacting with the experimenter over the internet via Skype. To control for individual variation in expertise with the EFIT6 software, one experimenter was involved in the creation of all the facial composites.
After establishing a Skype call, using the audio-only function, the participant was sent a temporary internet link. Clicking on this randomly assigned them to one of two conditions. Either they were led to believe the face they would later see was Indian (by virtue of being exposed to stereotypically Indian names) or that it was European (due to being exposed to stereotypically British names). The experimenter did not have access to the participant's screen, and was thus blind to which condition the participant had been allocated to.
The participant began by reading a short text that asked them to imagine they had witnessed a bank robbery. The text said that all of the robbers had been caught except for the getaway driver:
"Imagine that you have witnessed the following crime. A bank robbery has been committed by a small gang of thieves. Three of them went into the bank and stole £100,000 in cash. They were caught by the police before they could escape with the money. However the driver of the getaway car has escaped. The three criminals who were caught, Rajesh, Mohan and Rahul (or Roger, Michael and Richard), will not reveal the identity of the getaway driver. However, you saw his face clearly before he drove away. You will be shown the image shortly. Please help us produce a composite of this face, which can be released to the public in the hope that someone will identify him and report him to the police."
All participants thus received the same instructions, except for being biased to believe that the "robbers" were either Indian or European, by the use of stereotypical Indian or British names. After reading the passage, participants saw the racially ambiguous target face on their computer screen for 30 s and were asked to memorise it.
FIGURE 1 The racially ambiguous target face that was viewed by composite constructors
Once the link expired on the participant's screen, they were instructed to alert the experimenter. The experimenter then used Skype's "share screen" facility to enable the participant to see the experimenter's screen. Six steps led to the creation of an initial facial composite. In the first five steps, the participant chose the face shape, nose, lips, eyes, and eyebrows. Nine alternatives were provided for each of these. In the sixth step, EFIT6 generated a face based on these choices. This face could then be modified further, as desired by the participant. For example, they could alter the ethnicity, skin colour, hair, eyes, age, expression, feature positioning, and so forth. The experimenter merely followed the participant's instructions, and provided no extra information or feedback that might influence the composite construction process.
Participants took as long as they wanted to produce a composite that they expressed satisfaction with (i.e., that they considered to be a good likeness of the target face). Each session typically lasted between 30 and 50 min.
After completing their composite, participants were asked to provide information about their age, gender and self-identified race or ethnicity. As a measure of inter-racial contact, they also completed an abbreviated version of the Social Experiences Questionnaire (SEQ; Slone, Brigham, & Meissner, 2000).

| Ratings of racial appearance
A different set of 120 participants rated the apparent race of each of the 61 facial composites that had been produced, using a seven-point Likert scale that ranged from −3 ("Very European looking"), through 0 ("Wholly racially ambiguous"), to +3 ("Very Asian looking"). The scale was recoded from 1 to 7 afterwards for statistical analysis. Each rater saw the composites in one of nine different random orders.
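The recoding step can be sketched as below. The text does not spell out the mapping, so this assumes a simple shift that preserves the direction of the scale (−3, "Very European looking", becomes 1; +3, "Very Asian looking", becomes 7); the function name is hypothetical.

```python
def recode_rating(raw):
    """Map a rating on the -3..+3 response scale onto the 1..7 analysis scale.

    Assumption: a simple shift by 4, preserving scale direction.
    """
    if raw not in range(-3, 4):
        raise ValueError(f"rating out of range: {raw}")
    return raw + 4


raw_ratings = [-3, -1, 0, 2, 3]
recoded = [recode_rating(r) for r in raw_ratings]
print(recoded)
```

With this mapping the ambiguous midpoint (0) becomes 4, the centre of the 1 to 7 scale.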

| Ratings of resemblance
Another group of 195 participants rated the likeness of each of the 61 composites to the original target face, using a slider scale to adjust the amount of resemblance from 0 (no resemblance) to 10 (perfect resemblance). On each trial, one of the composites was presented to the right of the target face, with the adjustment scale directly beneath the two faces. The trials were presented in a different random order for each participant. Each pair of faces remained on the screen until the participant made their decision.

| Contact
Questions on the SEQ seemed to conflate active contact (personal interaction) and passive contact, so these were looked at separately.
The question "how many Indians/Europeans do you know on a first name basis?" was considered as the most reliable measure of active contact. The question asking how many Indians/Europeans were encountered in stores was taken as a measure of passive contact (see Table 1).

| Racial appearance of composites
The raters' assessments of the race of the composites were analysed with a three-way mixed ANOVA. Composites were rated as more "Asian" in appearance if their constructors had been given Indian names than if they had been given European names (see Figure 3 for example composites). This was true for all three racial groups.
To pinpoint the source of the significant interaction between constructor race and constructor belief, we first calculated difference scores (collapsing across composite rater's race, as this was not significant). Thus the naming manipulation produced a slightly stronger effect on ratings of composites produced by the Overseas Indian constructors than it did on the composites produced by the other two groups of constructors. Notwithstanding the statistical significance of this interaction, its effect sizes are small, and the interaction does not detract from the principal conclusion, that all groups of composite constructors were significantly influenced by the names that were associated with the target face.

| Degree of resemblance between composites and the test face
As with the racial appearance data, the resemblance ratings were analysed with a three-way mixed ANOVA. The three IVs were: (a) constructor's race (with three levels: European Caucasian, Native Indian and Overseas Indian); (b) constructor's belief about the race of the ambiguous face; and (c) rater's race. (... p < .0001, ηp² = 0.14). Finally, the three-way interaction between constructors' race, constructors' belief about the ambiguous face, and the raters' race was significant, F(1.91, 368.33) = 6.62, p = .001, ηp² = 0.03. Given that the resemblance scale ranged from 0 to 10, and the highest mean rating was only 3.43 (for Overseas Indian constructors given Indian names), the main conclusion to be drawn from these data is that, in absolute terms, the composites were generally considered to be rather poor likenesses of the target face. However, if one considers the 60 composites on an individual basis, they did vary substantially in quality (see Figure 5).
FIGURE 2 Effects of racially stereotypical names on mean ratings of the apparent race of the facial composites. 1 = "highly European in appearance," 7 = "highly Asian in appearance."

| DISCUSSION
Biasing participants about the race of an unfamiliar face before they viewed it significantly affected their perception of that face. Participants who were led to believe the face was Indian produced facial composites that were subsequently rated as significantly more "Asian" than did participants who were led to believe that the same face was European. On the assumption that a facial composite is an approximation of a witness' memory of a face, then the subtle verbal cues provided before the racially ambiguous face was seen led to quite dramatic shifts in how it was perceived and represented. This finding is consistent with previous studies showing that racial categorisation can substantially affect the perception and representation of faces (e.g., Hilliar & Kemp, 2008;MacLin & Malpass, 2003).
Our results extend these in several ways. MacLin and Malpass (2003) manipulated the categorisation of racially ambiguous faces by changing a single racially stereotypical facial feature (hairstyle) and then asking participants to make categorical judgements about the face's race (i.e., deciding whether it was "Black" or "Hispanic"). They claimed that the featural change acted in a "conceptual" way, affecting the classification of the face's race, which then affected perception of the face itself in order to bring it into line with its racial categorisation.
As MacLin and Malpass noted themselves, an alternative, more mundane explanation is that, due to the configural or "holistic" processing that routinely occurs with faces, the featural change affected the rest of the face, making it look (perceptually, not conceptually) more like one race or the other.
MacLin and Malpass argued that the latter interpretation was unlikely, but could not rule it out entirely. However, as in Hilliar and Kemp's (2008) study, we used exactly the same target face for all groups of participants. Therefore, any changes in the appearance of the target face (as inferred from the changes in the composites that were constructed) must have arisen from the participants themselves, that is, from how they perceived the face, as a result of expecting it to be Asian or European. Like Hilliar and Kemp's work, our study also built on MacLin and Malpass' research by using a more subtle, sensitive measure (degree of racial appearance) rather than forcing participants to make binary, categorical judgements (e.g., this face is Black or White, Black or Hispanic).
Our results clearly show that a face's appearance is influenced by the viewer's beliefs about its race. We also hoped that the data on the resemblance between the composites and the target face would be useful in differentiating between perceptual and social cognitive explanations for the other-race effect. Although the target face was physically identical for all composite constructors, the ratings of the composites' racial appearance suggest that the face was encoded and/or remembered quite differently by constructors depending on whether they were led to believe the face was Indian or European. Because the target face is structurally intermediate between an Indian and a British face, its similarities and dissimilarities to each composite constructor's own race might possibly have influenced how it was encoded; but perceptual processing theories would suggest that a constructor's attributions about the face's race should have no influence on how the face is processed, because each member of a given race processes all faces in the same way regardless of race.
The appearance ratings data thus seem to be troublesome for perceptual processing explanations. What about the resemblance rating data? Social categorisation hypotheses hold that there is a strong link between facial appearance and processing: in effect, faces only get optimally processed after being categorised as own-group (in this case, own-race). Faces labelled as "own-race" should be processed in a more differentiated way, and hence should give rise to more recognisable composites than faces labelled as "other-race." Perceptual processing hypotheses would predict that initial racial categorisation of a face should have no direct effect on encoding and retrieval. This type of theory suggests that the other-race effect occurs because other-race faces are processed as if they were own-race faces, but this type of processing is somehow inappropriate for dealing with other-race faces. So social categorisation hypotheses predict that labelling will affect resemblance ratings, while perceptual processing hypotheses predict that labelling will have no effect on resemblance ratings.
In this study, we tested Indian and European composite constructors, race-raters and resemblance-raters. The Indian participants were either overseas Indians who reported having extensive contact with Europeans or native Indians who reported having significantly less contact with Europeans. 4 There was little evidence of any cross-cultural differences in any of the effects obtained. In terms of the racial appearance of the composites, Indian and European-Caucasian constructors seem to have been affected similarly by the biasing name information, and Indian and European raters appear to have judged the resulting composites similarly.
Indian and European raters did differ significantly in their ratings of the degree of resemblance between the composites and the target face, but not in ways that would be predicted by social categorisation hypotheses. These would predict that composite construction would be affected by an own-race bias, so that better likenesses might be produced if the composite constructor believed the ambiguous target face was a member of their own race rather than the other race.
Believing that they were viewing an own-race face should have led constructors to process the target face in a more "individuating" way rather than merely categorising it as a generic "other-race" face (e.g., Sporer, 2001; Hugenberg et al., 2007). An own-race bias in composite construction would have been shown by an interaction between race of constructor and race of target face: European Caucasian constructors who thought the target face was "European" should have produced better composites of the target face than European Caucasian constructors who believed the face was "Indian," and vice versa for the Indian constructors.
In practice, we found no evidence of any interaction between the constructors' race and their beliefs about the race of the target face.
As far as the European raters were concerned, all three groups of constructors produced better composites after seeing Indian names than European names. For Indian raters of the same set of composites, there were no effects of names on rated resemblance at all: all composites were rated as similarly poor likenesses of the target face, regardless of the constructor's race or their belief in the race of the face. Therefore we found no evidence of any own-race bias in either the constructors or the raters.
Thus, overall, the resemblance data in the present study provide no support for social categorisation accounts of the other-race effect.
However, this is a topic worth further investigation for various reasons. Firstly, it might be worth revisiting the issue using an ambiguous target between races that differ more in physiognomy than Europeans and Indians do. Secondly, although our European and Indian groups had relatively less contact with each other than with their own races, in absolute terms all of the participants had some degree of interracial contact. All three groups consisted of computer-literate individuals with access to the internet and extensive exposure to Western media. Consequently, even our native Indian participants would be quite familiar with Western faces.
Finally, because our composites were generally rather poor likenesses of the target face, we need to consider whether the resemblance data simply lack the sensitivity to show any difference between "own-race" and "other-race" composites in terms of similarity to the target face. It is true that in absolute terms the ratings of resemblance between the target and the composites were generally rather low. However, an alternative interpretation follows from a follow-up analysis that we performed. We directly compared the two European groups of composite constructors, as previous research suggests these would be the most likely to demonstrate an own-group bias (see Meissner & Brigham, 2001). We compared the resemblance ratings given to composites by European constructors who thought the target face was British, to the ratings given by European constructors who thought the target face was Indian. A repeated-measures t-test revealed a highly significant difference between the two sets of ratings, but in the "wrong" direction: our group of (British) resemblance raters thought that the composites constructed after seeing Indian names were significantly better likenesses of the target face than were composites constructed after seeing British names ("Indian names" M = 3.41, SD = 1.57, "British names" M = 2.91, SD = 1.43, t(129) = 7.35, p < .0001).
This is a puzzling result until one considers that the task of the resemblance raters was to compare each composite to the target face.
The latter was displayed on every trial. We do not know how each rater's own personal assessment of the target face's race affected their judgements about the resemblance between the target and each composite. Suppose, for example, that one rater judged the racially ambiguous target face to be "Indian" while another judged it to be "British." The first rater is likely to judge composites as better likenesses if they look Indian than if they look British, because this would align with their own interpretation of the target face's race. The second rater is likely to do the opposite, and rate the faces as better likenesses if they look British. These contrasting effects are likely to be muddied by variations between raters not just in the direction of their own bias but also in its extent, making them very difficult to allow for.
For these reasons, the design of our study does not allow the resemblance data to provide a strong test of the competing explanations for the other-race effect.
Turning to more "applied" matters, our results have obvious implications for police procedures: they suggest that officers should be aware that the process of composite construction might be systematically biased by extraneous information (such as casual comments about the suspect's race). Although the procedure for using  explicitly solicits information about the face's apparent ethnicity, care should be taken to ensure that this is left for the witness to decide for themselves.
However, before policy recommendations are made, further research is warranted to investigate whether these effects occur in contexts that are more relevant to criminal investigations. For example, the current study employed only one operator, in order to control for the operator's influence on the quality of the composite construction (Christie, Davies, Shepherd, & Ellis, 1981; Davies, Milne, & Shepherd, 1983; Ellis et al., 1978a, 1978b). This operator practiced extensively with the EFIT6 software until they were competent at producing composites, but they probably lacked the skill of a fully-trained police operative. Although the study was conducted "blind," the operator was of course aware of the experimental hypotheses. It would be wise to replicate the study using police personnel who were totally unaware of the topic under investigation. Another issue is that EFIT6 has two different modes of composite construction: the "traditional" feature-based method (in which a witness selects and modifies features such as the eyes, nose and mouth) and a "holistic" method (similar to that used by Evo-FIT, with good results in terms of producing identifiable composites: see Frowd et al., 2013). We used the feature-based method, but with hindsight it may have been preferable to use the "holistic" mode, as this is more in keeping with how humans encode and remember faces (i.e., in terms of their configural properties rather than as a collection of isolated features).
In the "holistic" procedure, the software produces an array of composites based on the witness' initial description of the target face.
The witness selects the composite that they think most closely resembles the target face, and then the computer generates a new set of variants based on this selection. This procedure is repeated over a number of trials, until the software arrives at an image that the witness judges to be a reasonable likeness of the target face. It would be interesting to repeat our study using this alternative method of composite construction (either with EFIT6 or Evo-FIT) given that this is likely to become the police's preferred method of composite construction in future.
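The select-and-regenerate cycle of the "holistic" procedure can be illustrated with a minimal Python sketch. This is a toy illustration only, not a description of how EFIT6 or Evo-FIT is implemented: it assumes that a face can be represented as a short list of numbers, that variants are produced by adding random noise to the current selection, and that the "witness" is simulated by choosing the candidate closest to a remembered target. All names (`distance`, `make_simulated_witness`, `evolve_composite`) and all parameter values are hypothetical.

```python
import math
import random


def distance(face_a, face_b):
    """Euclidean distance between two faces (assumed: faces are lists of numbers)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(face_a, face_b)))


def make_simulated_witness(remembered_face):
    """Return a chooser that picks the candidate closest to the remembered face.

    A real witness judges likeness by eye; this stand-in is an assumption
    made purely so that the loop can be run.
    """
    def choose(candidates):
        return min(candidates, key=lambda face: distance(face, remembered_face))
    return choose


def evolve_composite(choose, initial_face, n_variants=9, n_rounds=10,
                     spread=0.3, seed=0):
    """Repeat the cycle: show an array of faces, let the witness pick one,
    then generate new variants around that pick."""
    rng = random.Random(seed)
    current = list(initial_face)
    for _ in range(n_rounds):
        # The current selection stays in the array, so the witness is never
        # forced to accept a worse likeness than the one already chosen.
        candidates = [current] + [
            [value + rng.gauss(0, spread) for value in current]
            for _ in range(n_variants)
        ]
        current = choose(candidates)
    return current


if __name__ == "__main__":
    remembered = [0.5, -1.0, 2.0]   # hypothetical memory of the target face
    start = [0.0, 0.0, 0.0]         # hypothetical starting face from the description
    final = evolve_composite(make_simulated_witness(remembered), start)
    print(distance(start, remembered), distance(final, remembered))
```

In this sketch the witness never describes or manipulates individual features; the composite changes only through repeated whole-face choices. In the toy model, any bias in the remembered face (for example, one induced by a racial label) would enter through those choices.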
Our study revealed that labelling a face immediately before the encoding stage can significantly affect the encoding and recall of that face. We found no evidence of any effects of categorisation on the recognisability of the composites that were constructed, but it would be interesting to revisit this question, using the more ecologically valid procedure devised by  for evaluating composites (see also Fodarella, Kuivaniemi-Smith, Gawrylowicz, & Frowd, 2015). This involves using two groups of participants. One group see a face which is unfamiliar to them, and construct composites of it after a forensically relevant delay (e.g., 24 hr).
The second group, who are highly familiar with the target face, are presented with the composites and asked to identify them (e.g., by name). Spontaneous naming is probably a more sensitive measure of resemblance between composites and target face than the method that we used, that is, obtaining resemblance ratings from individuals who were (by necessity, given it was an artificially created face) unfamiliar with the target face.
As with many studies in this field, our experiment interposed only a short delay between participants viewing the target face and constructing their composite. In real-life settings, this delay is typically much longer (Frowd, McQuiston-Surrett, Kirkland, & Hancock, 2005). Do similar effects to those we have demonstrated occur if the biasing information is provided some time after the target face is seen, and if so, do these effects increase or decrease with the passage of time? In U.K. police procedures, witnesses are given a Cognitive Interview (CI) before they construct a composite.
The CI procedure includes the witness making an attempt at free recall of information about the suspect's appearance. It would be interesting to see if a CI (or the improved "holistic" CI used before composite construction by Frowd, Bruce, Smith, & Hancock, 2008) protected witnesses against the biasing effects that we have demonstrated. Related to this, as one of our reviewers pointed out, it would be interesting to see if racially biased composite construction had any contaminating effect on a witness' subsequent lineup performance.
The participants in the present study viewed the target face under optimum conditions: good lighting, without any distractors present while they viewed the image, and under conditions of no stress (Hancock, Burke, & Frowd, 2011). Such conditions are of course unlikely to be present during the encoding of criminals' faces in real life, and it would be interesting to see how these factors interact with the categorisation effects on composites that we have demonstrated. Davies and Oldman (1999) speculated that any gaps in a witness' memory for a face "will be filled in by the perceiver, relying on attributions and stereotyping" (p. 129). If so, one might expect the effects that we have demonstrated to be even more pronounced under sub-optimal witnessing conditions.
Finally, we provided information to systematically bias participants' encoding of the target face; would similar effects occur if the participant spontaneously categorised the face, without any prompting from the experimenters?
These questions remain to be explored, but the present study provides further evidence that a participant's initial categorisation of a face can profoundly affect their subsequent representation of it, consistent with Social Cognitive models of face processing.