Super-recognisers show an advantage for other race face identification

The accurate identification of an unfamiliar individual from a face photo is a critical factor in several applied situations (e.g. border control). Despite this, matching faces to photographic ID is highly prone to error. In the absence of effective training measures which could reduce face matching errors, the selection of ‘super-recognisers’ (SRs) provides the most promising route to combat misidentification or fraud. However, to date, super-recognition has been defined and tested almost exclusively using ‘own-race’ face memory and matching tests. Here, across three studies, we test Caucasian participants on tests of own-race (GFMT, MFMT, CFMT) and other-race (EFMT, CFMT-C) face identification. Our findings show that, compared to controls, high performing typical recognisers (Studies 1 & 2) and super-recognisers (Study 3) show superior performance on both the own- and other-race tests. These findings suggest that recruiting SRs in ethnically diverse applied settings could be advantageous.


Introduction
The use of face photos for accurate identity verification is critical in maintaining border security and ensuring that correct convictions occur within the criminal justice system. At border control, passport officers are required to decide whether the face of a traveller matches their passport photo, while police officers are routinely required to match the face of a suspect to poor-quality CCTV stills. In each of these cases, the target individuals are likely to be unfamiliar to the police officer or border control official. Despite this, it is now well established that matching pairs of unfamiliar faces is highly prone to error (Burton, 2009; Burton & Jenkins, 2011; Davis & Valentine, 2009; Hancock, Bruce, & Burton, 2000; Jenkins & Burton, 2011; Johnston & Edmonds, 2009; Robertson, 2018; Robertson & Burton, 2016).
Notably, errors within this context may lead to travellers with fraudulent passports entering the country illegally, or innocent suspects being convicted of a crime.
In addition, a number of recent experiments have found it difficult to train people to be better at facial identification, with individual differences in performance often outweighing the magnitude of improvement so that any positive effects demonstrated tend to be largest for, or restricted to, typically poor recognisers (e.g., see Robertson, Mungall, Watson, Wade, Nightingale, & Butler, 2018; and White, Kemp, Jenkins, & Burton, 2014 for work on feedback training). This difficulty in trying to improve an individual's facial recognition ability was further supported by a recent paper by Towler et al. (2019), which showed that professional facial identification training courses, which are used by agencies across the world, appear to have little or no impact on an individual's person identification performance.
Therefore, focus has now somewhat shifted from improving the performance of typical recognisers to the selection of individuals (see Balsdon, Summersby, Kemp, & White, 2018), known as super-recognisers (SRs), who naturally excel at face identification tasks as a result of a likely inherited (Wilmer, Germine, Chabris, Chatterjee, Williams, Loken, et al., 2010), face-specific (McCaffery, Robertson, Young, & Burton, 2018; Wilhelm, Herzmann, Kunina, Danthiir, Schacht, & Sommer, 2010; Yovel, Wilmer, & Duchaine, 2014) ability present in around 2% of the general population. Recent work has started to assess the processes which may underpin super-recognition, and the findings suggest that SRs may focus more on the inner features of unfamiliar faces (particularly the nose region; Bobak, Parris, Gregory, Bennetts & Bate, 2017), and may show enhanced early stage encoding of incoming facial information (Belanova, Davis, & Thompson, 2018), compared to typical recogniser controls. Despite these advances in the assessment of the neurocognitive markers of super-recognition, the CFMT+ remains the gold standard test for SR categorisation. The CFMT+ is a Caucasian learned face memory test. Participants are asked to memorise the faces of six people, followed by a memory test (3AFC) which includes novel instances of the learned identities. However, as noted above, the critical task at border control and in criminal identification is unfamiliar face matching, which does not place any demands on memory, and indeed in the early phase of super-recognition research it was not clear whether CFMT+ SRs would also excel in matching tasks.
However, recent findings have shown that the superior face memory ability found in CFMT+ SRs does generalise to the unfamiliar face matching domain. A series of recent studies have shown significantly greater accuracy rates for CFMT+ SRs on the GFMT and the more challenging Models Face Matching Test (MFMT) compared to typical recognisers (Bobak, Dowsett, & Bate, 2016; Davis, Lander, Evans, & Jansari, 2016; Robertson, Noyes, Dowsett, Jenkins, & Burton, 2016; see also Bobak, Hancock, & Bate, 2016; Davis, Treml, Forrest, & Jansari, 2018; Noyes, Hill & O'Toole, 2018; Phillips et al., 2018 for similar findings with newly developed matching tests). In addition, recent individual difference studies have reported positive correlations of moderate strength between scores on the CFMT+ and the GFMT (e.g. McCaffery, Robertson, Young, & Burton, 2018; Verhallen et al., 2017; see Fysh, 2018; Fysh & Bindemann, 2018 for equivalent findings with the CFMT/Kent Face Matching Test). Such correlations across face matching and face memory tasks support the idea of Verhallen's f (Verhallen et al., 2017) as a common underlying mechanism for face processing, akin to Spearman's (1927) g for intelligence. In the applied context, these findings confirm that CFMT+ SRs can also excel on matching tasks and could therefore be deployed as passport checkers at border control or as officers in criminal identification units in policing.
The finding that CFMT+ SRs also excel at matching pairs of faces is important in terms of the general utility of SRs across different occupations. However, it must still be viewed with caution because the face tasks employed in these studies (CFMT+, GFMT, MFMT) used only Caucasian faces (see Noyes & O'Toole, 2017), whereas in the real world, passport checkers and police officers regularly encounter faces from a wide range of ethnic groups.
Data from the 2011 UK Census (ONS, 2011) showed that six distinct ethnic groups are represented by more than one million UK citizens (i.e. White British, All Other White, Mixed, Asian, Black, and 'Other', with the 'Other' category representing many additional ethnic groups), and an official may encounter many other non-UK ethnicities at an airport. Verifying an individual's identity from a face photo is challenging enough when the viewer and the target are from within the same ethnic group. However, due to a well-established psychological phenomenon known as the other-race effect (ORE), accurately identifying a person from a different ethnic group results in even poorer performance (see Meissner & Brigham, 2001 for a review).
The ORE emerges early in development, with infants as young as nine months of age showing preferential recognition for own-race faces, while initial exposure to predominantly own-race faces shapes adult perception and performance (Kelly et al., 2007; Meissner & Brigham, 2001; O'Toole, Deffenbacher, Valentin, & Abdi, 1994; Walker & Tanaka, 2003).
A study by Meissner, Susa, and Ross (2013) demonstrated the ORE using a matching task which mirrored the passport control context, with the image pairs showing a high-quality face photo of the 'traveller' and a scanned photo-ID page from a passport. They reported the typical 20% error rate in the own-race condition (Mexican American observers/faces), which rose to 30% in the other-race condition (Mexican American observers/African American faces). In addition, findings from Megreya, White, and Burton (2011) displayed the ORE in a 1-10 matching task (UK/Egyptian Faces/Observers). Intriguingly, this study also reported moderate-to-strong correlations between accuracy rates on the own- and other-race tests for both groups (r = .60 for UK observers, r = .78 for Egyptian observers), although the sample size here was small (N = 26 for both groups). This suggests that participants who excelled on the own-race task were also likely to excel on the other-race task (relative to a lower mean score). Recent work by Kokje, Bindemann, and Megreya (2018) replicated both the ORE and the own-/other-race accuracy correlation with a larger sample (N = 74) using 1-1 matching tasks. However, they did not use the CFMT+, or the GFMT, when assessing individual differences in performance, limiting the generalisability of their findings to typical recognisers.
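To date, only one paper by Bate et al. (2018) has attempted to directly assess the performance of SRs on other-race face identification tests. Using a sample of 8 Caucasian SRs, Bate et al. (2018) presented participants with own- and other-race face memory tests (Experiment 1), and own- and other-race face matching tests (Experiments 2 and 3). They reported that their sample of SRs did not show a performance advantage over native typical recognisers (i.e. Asian observers/Asian face tests). However, the SRs did show an advantage over the Caucasian controls on the other-race face tests, although the accuracy cost for other-race faces remained, with no difference in magnitude compared with the control group. That is, the ORE was present in SRs, albeit from a higher baseline level of performance than controls. These are intriguing findings suggesting that SRs may be performing at the top end of a face recognition continuum rather than displaying qualitatively different cognitive processes. However, the findings from Bate et al. (2018) should be treated with caution, as both the size of the SR sample (N = 8) and its heterogeneity precluded statistical comparisons at the group level. Further work, with a much larger SR sample, is required to test the robustness of their findings.
Therefore, the present study sought to investigate individual differences in performance across a range of own- and other-race face tests in typical recognisers (Studies 1 & 2) and a large sample of Caucasian SRs (Study 3). In Study 1, we test a large sample of typical recognisers (Caucasian undergraduate students; N = 111) using a battery of facial identification tests which tap into own-race face memory (CFMT-short), own-race face matching (GFMT-short, MFMT-short), other-race face memory (CFMT-Chinese), and other-race face matching processes (Egyptian Face Matching Test; EFMT-long), to assess whether typical recognisers who perform accurately on own-race tests also show similar levels of performance on the other-race tests. If that is the case, it would provide support for a common mechanism which underlies both own- and other-race face identification. In Study 2, using a sample of typical recognisers (Caucasian undergraduate students; N = 43), we verify a shortened 40-item version of the other-race face matching test used in Study 1 (EFMT-short). Finally, in Study 3, in order to directly assess SRs' performances on own- and other-race face tasks, we test a large sample of Caucasian super-recognisers (SRs) (N = 35) using the CFMT+, Adult Face Recognition Test (AFRT), MFMT, and the other-race EFMT-short, relative to Caucasian typical recogniser controls (N = 420). Following the process reported by Bate et al. (2018), we seek to assess whether Caucasian SRs outperform Caucasian controls on an other-race unfamiliar face matching test, and whether or not the accuracy cost associated with the identification of other-race faces is present in the SR group, and if so, to what extent.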

Study 1
In Study 1, we use four established tests of face identification (CFMT-short, CFMT-Chinese, GFMT-short, MFMT-short) and a 200-item Egyptian Face Matching Test (EFMT-long; 100 match/100 mismatch trials). Here we seek to replicate previous work, outlined in the introduction, which has shown a robust correlation between the CFMT (learned face memory) and the GFMT (face matching). We also include the more challenging MFMT (face matching; highly variable male model images), as a direct correlation between this task and the CFMT and the GFMT has not been previously reported. Importantly, we also include an other-race face matching test (EFMT-long; Egyptian faces), and we assess whether this task produces an other-race accuracy cost, and whether accuracy on the own-race GFMT generalises to the other-race EFMT-long. Although the focus of this paper is on other-race face matching, we also include the CFMT-Chinese version (McKone et al., 2012) to assess cross-domain (i.e. matching/memory) and cross-race correlations (Caucasian, Egyptian, Chinese). The short version of the CFMT is used in this study, rather than the CFMT+, and therefore we cannot determine if there are any SRs in the sample. Therefore, in Study 1 we test typical recognisers (undergraduate students) only.

Method

Ethical Approval
Each study reported in this paper received ethical approval from the Ethics Committee

Glasgow Face Matching Test (GFMT)
The GFMT (short version) consists of 40 pairs of unfamiliar Caucasian faces. The test contains an equal number of trials in which the face pairs show the same person (match condition) or two different, but similar looking, people (mismatch condition). See Figure 1 for an example image pair and Burton et al. (2010) for further details.

Models Face Matching Test (MFMT)
The Models Face Matching Test (short version) consists of 30 pairs of unconstrained, highly variable, face photos of male models (15 match/15 mismatch). The MFMT is designed to be more difficult than the GFMT and, in line with the CFMT/CFMT+ distinction, is more likely to detect high performing face matchers. See Figure 1 for an example image pair and Dowsett and Burton (2015) for further details.

Egyptian Face Matching Test, Long Version (EFMT-long)
The Egyptian Face Matching Test (EFMT-long) that we use here consists of 200 pairs of unfamiliar male Egyptian faces (100 match/100 mismatch), as seen in Figure 1 (see Megreya, White, & Burton, 2011 for further details).

Cambridge Face Memory Test (CFMT)
The Cambridge Face Memory Test (short version) is a well-established 72-item learned face recognition-memory task which increases in difficulty with the addition of within-person variability and visual noise to the image set. Figure 1 shows an example of the stimuli used in the CFMT; see Duchaine & Nakayama (2006) for further details.

CFMT-Chinese Version (CFMT-C)
The Cambridge Face Memory Test - Chinese Version follows an identical format to that described above for the CFMT, with the exception that Chinese faces replace the Caucasian faces used in the original test. See McKone et al. (2012) for further details.

Procedure
The order of presentation of the tasks was randomised by block (unfamiliar face matching tests, face memory tests) and then by test (GFMT/MFMT/EFMT-long, CFMT/CFMT-C). On each trial of each of the face matching tests, participants were required to decide whether the face pair showed the same person or two different people. For the matching tests, each trial remained on screen until a participant made a response. For the face memory tests, participants were required to learn six target identities by viewing photos of them in three different orientations (left, forward facing, right), and to then detect photos of these identities in the presence of two foils in 3-AFC recognition trials. Recognition trials remained on screen until participants made their response. All responses were made via keyboard key, with testing sessions lasting approximately 1 hour.
While research shows that accuracy on other-race tasks is poorer than on own-race face tasks, here we find that EFMT-long accuracy was significantly higher than both the GFMT and MFMT performances (M = 85%, SD = 8%, Range = 60%-98%; F(1,110) = 17.29, p < .001, ηp² = .14 for the GFMT; F(1,110) = 122.26, p < .001, ηp² = .53 for the MFMT). This pattern is likely to be because, as mentioned above, the GFMT and MFMT consist of the most difficult items from longer test sets. This is not the case for the EFMT-long, in which the full 200-trial test was used, and so accuracy is likely to be inflated by the inclusion of a greater proportion of easy trials. A shortened version of this test, using the most difficult items from the current dataset, is therefore tested in Study 2.
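As a worked check on the effect sizes above, partial eta squared can be recovered from an F ratio and its degrees of freedom using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The short Python sketch below applies it to the two F values reported here; the function name is ours and is for illustration only.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values reported above for the EFMT-long vs. GFMT and EFMT-long vs. MFMT comparisons
gfmt_effect = partial_eta_squared(17.29, 1, 110)
mfmt_effect = partial_eta_squared(122.26, 1, 110)
print(round(gfmt_effect, 2), round(mfmt_effect, 2))
```

Both values round to the effect sizes reported in the text (.14 and .53).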

Individual Differences
As our principal aim was to explore potential correlations between different measures, we were more concerned with avoiding Type 2 than Type 1 errors, and therefore report uncorrected statistics. As a check on the reliability of these, however, we also used the Benjamini-Hochberg procedure with a false discovery rate of 0.2 to correct for multiple comparisons, and we also report confidence intervals (see McCaffery, Robertson, Young, & Burton, 2018).
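For readers unfamiliar with the Benjamini-Hochberg procedure, the following is a minimal Python sketch of the standard step-up rule: the p-values are ranked, and the largest rank k for which p(k) ≤ (k/m) × FDR determines how many tests are retained. The function name and the example p-values are hypothetical and are not taken from our data; this is not the analysis code used in the study.

```python
def benjamini_hochberg(p_values, fdr=0.2):
    """Return one boolean per test: True if it survives the BH step-up rule."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k whose p-value is at or below (k / m) * fdr
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_rank = rank
    # Every test ranked at or below max_rank is retained
    keep = [False] * m
    for rank, idx in enumerate(order, start=1):
        keep[idx] = rank <= max_rank
    return keep


# Hypothetical p-values, for illustration only
example_p = [0.001, 0.04, 0.03, 0.6, 0.15]
print(benjamini_hochberg(example_p, fdr=0.2))
```

With a false discovery rate of 0.2 the rule is deliberately lenient, which is consistent with the priority given here to avoiding Type 2 errors.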

Unfamiliar Face Matching (GFMT, MFMT, EFMT-long)
As seen in Figure 1, there was a significant positive correlation between the GFMT and the MFMT (r(111) = .541, uncorrected p < .001, 95% CI [.39, .66]), with individuals who perform highly on the GFMT also performing highly on the MFMT. This correlation replicates the effect reported by Bobak, Dowsett, and Bate (2016), and shows a level of stability in matching aptitude across the GFMT and the MFMT. It further supports the use of the MFMT as a more sensitive measure of face matching ability among high performers.
Importantly, participants' scores on the own-race GFMT and MFMT both correlated with the other-race EFMT-long (r(111) = .580, uncorrected p < .001, 95% CI [.44, .69] for the GFMT; r(111) = .535, uncorrected p < .001, 95% CI [.39, .65] for the MFMT). This finding extends previous research by Megreya, White, and Burton (2011), who reported a similar relationship using 1-10 face matching arrays, to the well-established GFMT. These findings suggest that individuals who perform highly in matching pairs of unfamiliar faces from their own race are also likely to perform highly when exposed to other-race faces.
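Confidence intervals of the kind reported for these correlations are commonly obtained with the Fisher z-transformation. The Python sketch below illustrates that approach for the GFMT/MFMT correlation, assuming a sample size of n = 111 and a normal critical value of 1.96; both the function name and the choice of method are assumptions made for illustration, and the intervals in the text may have been computed differently.

```python
import math


def pearson_ci(r, n, z_crit=1.96):
    """Approximate 95% CI for a Pearson correlation via the Fisher z-transformation."""
    z = math.atanh(r)                # Fisher transform of r
    se = 1.0 / math.sqrt(n - 3)      # standard error on the z scale
    lower = math.tanh(z - z_crit * se)
    upper = math.tanh(z + z_crit * se)
    return lower, upper


# GFMT vs. MFMT correlation reported above; n = 111 is an assumption
low, high = pearson_ci(0.541, 111)
print(f"{low:.2f}, {high:.2f}")
```

Under these assumptions the interval falls at roughly .39 to .66.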

Face Recognition Memory (CFMT, CFMT-C)
Here we replicate the strong positive correlation reported by McKone et al. (2012) between performances on the own-race Caucasian CFMT and the other-race CFMT-C, r(111) = .653, uncorrected p < .001, 95% CI [.53, .75]. This finding shows that individuals with a high aptitude for the recognition of new instances of a recently learned own-race face are also likely to perform well when the target identity is from a different ethnic group.

Cross-Domain and Cross-Race Correlations
As shown in Figure 1, all of the cross-domain (matching, memory) tests correlated with each other, suggesting shared underlying mechanisms for identity verification in both matching and memory contexts. While it has previously been established that scores on the CFMT and the GFMT correlate (McCaffery, Robertson, Young, & Burton, 2018; Verhallen et al., 2017), this is the first study to show such relationships between these tests and the other-race face tasks included in the battery. Importantly, we show a significant positive correlation between the CFMT and both the own-race GFMT (r(111) = .433, uncorrected p < .001, 95% CI [.27, .57]) and the other-race EFMT-long (r(111) = .449, uncorrected p < .001, 95% CI [.29, .58]). That is, aptitude on a face memory test generalises to both own- and other-race unfamiliar face matching accuracy. Taken together, the findings from Study 1 provide support for the view that a general face processing factor f (Verhallen et al., 2017) exists, and that it supports face processing across matching and memory domains for both own- and other-race faces.

Study 2
As reported in Study 1, mean accuracy on the 200-item EFMT-long was higher than on the 40-item GFMT-short (which consists of the 40 most challenging items from the GFMT-long). Here, in Study 2, we follow the same procedure as Burton, White, and McNeil (2010) by selecting the 40 most difficult items (i.e. those with the least accurate responses) from the EFMT set used in Study 1, to create a shorter version of the task.

Stimuli, Apparatus and Procedure
In this study, only the GFMT and our shortened version of the EFMT were used. In line with the GFMT, the EFMT-short used in the present study consisted of 40 trials (20 match / 20 mismatch). These 40 EFMT pairs represented the 40 most difficult pairs as measured by EFMT item error rates in Study 1, from a sample of fifty-two participants. The tasks were presented on a Dell PC, task order was counterbalanced, and trial order was randomised across participants.
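The item selection described above can be sketched in Python as follows. The sketch ranks items by error rate separately within match and mismatch trials and keeps the hardest items of each type, which is one way of arriving at a balanced 20/20 test; whether ranking was carried out within trial type or across the full item set is an assumption here, and the function name, item labels and error rates are hypothetical.

```python
def select_hardest_items(item_error_rates, trial_types, n_per_type=20):
    """Pick the n_per_type items with the highest error rate within each trial type.

    item_error_rates: dict mapping item id -> proportion of incorrect responses
    trial_types:      dict mapping item id -> "match" or "mismatch"
    """
    selected = []
    for trial_type in ("match", "mismatch"):
        candidates = [i for i in item_error_rates if trial_types[i] == trial_type]
        # Hardest items (highest error rate) first
        candidates.sort(key=lambda i: item_error_rates[i], reverse=True)
        selected.extend(candidates[:n_per_type])
    return selected


# Toy data, for illustration only: three match items and three mismatch items
errors = {"m1": 0.10, "m2": 0.40, "m3": 0.25, "x1": 0.05, "x2": 0.30, "x3": 0.50}
types = {"m1": "match", "m2": "match", "m3": "match",
         "x1": "mismatch", "x2": "mismatch", "x3": "mismatch"}
print(select_hardest_items(errors, types, n_per_type=2))
```

Ranking within trial type guarantees the equal split of match and mismatch trials; ranking across all items would not necessarily do so.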

Task Accuracy
Mean accuracy on the shortened version of the other-race EFMT was 74%, significantly lower than the own-race GFMT (81%), F(1, 42) = 19.63, p < .001, ηp² = .32. As seen in Figure 2, accuracy on the EFMT-short was lower in both the Match and Mismatch conditions (F(1, 42) = 5.02, p = .03, ηp² = .11 for Match; F(1, 42) = 8.83, p = .005, ηp² = .17 for Mismatch), and in line with the GFMT, accuracy rates did not differ between EFMT-short Match and Mismatch conditions, F(1, 42) = 2.15, p = .15, ηp² = .05. We note here that although the EFMT-short produced lower accuracy rates than the GFMT, without the inclusion of an Egyptian sample of participants we cannot say conclusively that our EFMT-short produces an other-race effect on accuracy. It could be the case that the EFMT-short items are simply more difficult than the GFMT items; we thank Reviewer 3 for bringing this to our attention. However, 75% of the items used in our EFMT-short were also included in a longer test by Kokje, Bindemann, and Megreya (2018), and analysis of the dataset which isolated our EFMT-short items revealed that the mean accuracy rate for the Egyptian observers was 78%, that is, 4% higher than for our Caucasian observers. This suggests that, should Study 2 be replicated with the inclusion of an Egyptian sample, the EFMT-short would be likely to generate an other-race effect on accuracy rates. Even if these data were not available from Kokje, Bindemann, and Megreya (2018), the EFMT-short would still provide a valid measure with which to assess between-group differences in identification accuracy using own- and other-race faces, as we do in Study 3.

Individual Differences
Here we replicate the findings from Study 1 with a significant positive correlation between overall scores on the GFMT and the EFMT-short (r(43) = .454, uncorrected p = .002, 95% CI [.18, .66]), again showing consistency in performance across own-race and other-race unfamiliar face matching tests. In addition, significant correlations were found across the tests when the match and mismatch trials were analysed separately (r(43) = .532, uncorrected p < .001, 95% CI [.27, .71] for Match trials; r(43) = .390, uncorrected p = .010, 95% CI [.10, .61] for Mismatch trials). These correlations remained significant after applying both the Bonferroni and Benjamini-Hochberg corrections.

self-paced, order and trial order were randomised across participants, and feedback scores were provided at the end of the study.

Group Comparisons
For the typical recogniser control group, mean accuracy rates on the tasks were: 80% for the CFMT+ (SD = 11%, Range = 46%-92%), 75% for the AFRT (SD = 9%, Range = 40%-95%), 91% for the GFMT (SD = 7%, Range = 58%-100%), 83% for the MFMT (SD = 9%, Range = 53%-100%), and 86% for the EFMT-short, the other-race face matching task (SD = 8%, Range = 55%-100%). Mean performance on each of these tests is around 8%-10% higher than previously published norms, which is likely to be due to a recruitment bias in which those likely to take part in this study have an interest in superior face recognition ability. Importantly, these results replicate our findings from Study 2, with poorer performance on our newly established short version of the other-race EFMT in comparison to the own-race GFMT, F(1, 359) = 118.07, p < .001, ηp² = .25. This confirms that the EFMT-short is challenging enough to produce an unfamiliar face matching other-race effect.
It is important to note that while SRs display enhanced accuracy on the EFMT-short in comparison to controls, in line with the controls the SRs still performed less accurately on the other-race EFMT-short (94%) compared to the own-race GFMT (97%; t(34) = 2.67, p = .012 for the difference). For the SRs, the mean difference in accuracy between the EFMT-short and GFMT was 3%, which was not significantly smaller than the 5% effect reported between the tests for the typical recogniser controls, t(393) = -1.48, p = .141. However, again, this could be due to the recruitment bias in the control group outlined above, and when the size of the SR difference in accuracy between the own- and other-race tests (3%) was compared to that of the typical recognisers recruited for Study 2 (7%; students), the magnitude of the SR cost was found to be significantly smaller, t(76) = -2.33, p = .022. We note again that our claim that the EFMT-short produces an other-race task cost should be replicated in a fully crossed design which includes native Egyptian observers.

Super-Recogniser Group
In contrast to the typical recogniser group, and as expected, there were no correlations between the CFMT+ and any of the other tests (all p's > .076), a consequence of selecting SRs on the basis of the CFMT+ scores, thus removing most of the variance from that set which would allow for an individual differences analysis.

Superior Performance Across All Tests
Although the majority of SRs did produce scores above mean control performance across tasks, it is important to note that 3 SRs scored below the control mean on the GFMT, 4 SRs scored below the control mean on the MFMT, and 2 SRs scored below the control mean on the EFMT-short. That is, it is not the case that all SRs, as categorised by the CFMT+ and the AFRT, will always show superior performance on other facial identification tasks.
Moreover, if we apply the conservative CFMT+ criterion for super-recognition (i.e. ≥ 2 SDs above the control mean) to the other tests, then, as seen in Figure 3, 16/35 SRs achieved this for the GFMT, 3/35 for the MFMT, and 9/35 for the EFMT-short. Out of the sample of 35 SRs, only 1 participant achieved scores of 100% across each of the three face matching tests. This has implications in terms of the types of tests that should be used to categorise SRs for specific occupations, as outlined in the general discussion.
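The 2 SD criterion referred to above can be expressed as a simple threshold on the control distribution. The Python sketch below illustrates the calculation with made-up control scores; the function names and data are hypothetical, and the use of the sample standard deviation is an assumption.

```python
import statistics


def sr_threshold(control_scores, n_sd=2.0):
    """Score needed to sit n_sd standard deviations above the control mean."""
    mean = statistics.mean(control_scores)
    sd = statistics.stdev(control_scores)  # sample SD (assumption)
    return mean + n_sd * sd


def meets_criterion(score, control_scores, n_sd=2.0):
    """True if the score is at or above the n_sd threshold."""
    return score >= sr_threshold(control_scores, n_sd)


# Hypothetical control accuracy scores (%), for illustration only
controls = [70, 75, 80, 85, 90]
print(round(sr_threshold(controls), 2))
print(meets_criterion(97, controls), meets_criterion(93, controls))
```

Because accuracy is bounded at 100%, such a threshold becomes harder to reach on tests where control performance is already high, which is one reason the number of SRs meeting it can differ markedly between tests.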

General Discussion
Across three studies we demonstrate a consistent performance cost for other-race face identification, both the context of recognition memory (Study 1; CFMT/CFMT-C) and importantly in unfamiliar face matching (Studies 1-3; GFMT, MFMT, EFMT), we show that Caucasian SRs do outperform Caucasian controls on an other-race face matching test, but that an other-race accuracy cost remains in that group.
Study 1 is, to our knowledge, the first to assess cross domain matching/memory performance in own/other-race tasks using this battery of well-established (CFMT, CFMT-Chinese, GFMT, MFMT) and novel (EFMT-long) tests, in a single well powered sample.The findings from Study 1 replicate previous work showing consistency in performance on the CFMT and GFMT (McCaffery, Robertson, Young, & Burton, 2018;Verhallen et al., 2017) and we extend this to the more challenging MFMT.The latter effect supports the idea that individuals who excel on the CFMT and GFMT are also likely to fare well in more ecologically valid tasks which contain highly variable face photos (i.e. the MFMT).Most importantly, we show that performance on the CFMT and GFMT correlate with scores on the EFMT-long.This suggests that performing well on own-race face memory/matching tasks is likely to result in superior performance when an individual encounters faces from out with their own ethnic group (Kokje, Bindemann, & Megreya, 2018;McKone et al., 2012;Megreya, White, & Burton, 2011;Meissner, Susa, & Ross, 2013).This finding along with the other cross-domain correlations (e.g.CFMT-C vs. GFMT) add further support to the idea that both face matching/memory and own/other-race face processing may tap the same underlying cognitive and perceptual processes, which Verhallen et al. (2017) has termed f, a general face perception factor (analogous to Spearman's g in the study of intelligence; Spearman, 1927), which may distinct from non-face cognitive abilities (McCaffery, Robertson, Young, & Burton, 2018;Wilhelm et al., 2010).However, while Verhallen et al. (2017) used a variety of face tests to assess the potential for a general face f, further work, including a variety of object based and other non-face tasks is required to assess whether this factor is indeed specifically indicative of individual differences in face processing.
Having assessed cross-domain performance in typical recognisers in Study 1 and verified our 40-item EMFT-short in Study 2, in Study 3 we used a battery of tests to assess own-and other-race face identification in a set of Caucasian SRs in comparison to Caucasian controls.The findings showed that while there was a SR advantage for accurately matching pairs of other-race faces, with an 8% increase in mean performance over controls, SR accuracy on the other-race EFMT-short was still lower than scores on the own-race GFMT.
These findings support the recent work by Bate et al. (2018) which also showed that SRs outperformed typical recognisers on other-race face tests, but that an accuracy cost or ORE remained evident in the SR group.Both the study by Bate et al. (2018) and the present findings provide support for the view that the SRs are displaying performance at the top end of a face recognition continuum, rather than engaging qualitatively different cognitive and perceptual processes.One limitation of the present study was that it did not include native Chinese (Study 1) or Egyptian (Studies 1 & 3) control groups, therefore we were not able to test whether SRs would outperform native observers.However, Bate et al. (2018) did include native control groups and they found that while Caucasian SRs outperformed Caucasian controls on other-race tests, the native observers (e.g.Asian observers/Asian face test) outperformed both of these groups.This suggests that while employing a Caucasian SR at border control may lead to greater detection of fraud attacks by other-race travellers, a native observer who shares the fraudsters ethnic group would outperform that SR.Again, the sample size used in the study by Bate et al. (2018) was small, so further work should seek to test this native observer vs. SR advantage.
The persistence of an ORE in SRs in the study by Bate et al. (2018) the other-race accuracy cost reported in this paper, is consistent with the idea that SRs represent the top end of a face recognition continuum, rather than a qualitatively distinct ability.Bobak, Parris, Gregory, Bennetts, and Bate (2016) used eye-tracking to assess face processing in SRs, typical recognisers and individuals with congenital prosopagnosia, and found that SRs spent a greater proportion of their time on the inner features of a face, particularly the nose region, when viewing social scenes.It could be this change in the time spent on the internal features of a face that is driving the SR advantage for other-race face identification.A series of studies has shown that the ORE may result from failing to direct attention to those features, such as the nose region, of an other-race face that are likely to provide the most diagnostic information for accurate identity perception (Hills, Cooper, & Pake, 2013;Hills & Lewis, 2006;Hills & Pake, 2013).Therefore, it could be the case that SRs are naturally attuned to deploy their attention more efficiently, and for longer, to central regions of the face which leads to greater identification accuracy for both own-and other-race faces.This could explain the greater accuracy on the EFMT-short in the SR group relative to controls; and the smaller magnitude of the SR EFMT-short cost (3%) compared to typical recognisers (7% in Study 2; but n.s. 5% in Study 3).
An important consideration in terms of the applied potential of our findings relates to the fact that within SR research, group-level analyses (i.e.SRs vs. typical recognisers) can mask the fact that not all SRs, as categorised by scores on the CFMT+, always outperform typical individuals on other tests of face processing (Davis, Lander, Evans, & Jansari, 2016;Noyes, Hill, & O'Toole, 2018).In both Study 1 and Study 2, we replicate the correlation between the CFMT+ and the GFMT reported by previous studies.This correlation suggests that CFMT+ SRs are also likely to perform above average in occupations where unfamiliar face matching is the critical task (i.e.passport control officer).However, these correlations are in the moderate range, and it is therefore the case not all CFMT+ SRs are likely to be 'super-face-matchers' and therefore tests would need to be performed in conjunction with the CFMT+ before an individual could be considered as a suitable SR candidate for roles in which face matching is the critical task.Similarly, as outlined in Study 3, and as seen in Figure 3, not all CFMT+ SRs, or indeed higher performers on the GFMT, showed outstanding performance on the EFMT.Therefore, in the applied context, professions which are seeking to recruit SRs should employ a battery of tests to assess their suitability for the specific role (see Bate et al., 2018;Ramon, Bobak, & White, 2019).It is not the case that selecting SRs on the basis of CFMT+ scores will ensure that each of these individuals will excel at unfamiliar face matching or indeed other-race unfamiliar face matching.
In conclusion, our findings of consistent associations in accuracy across face processing domains (matching/memory) and race add weight to the notion that these processes may be served by the same underlying mechanism, or f, a general face perception factor. SRs as a group, and to a large extent at the individual level, outperform typical recognisers from the SRs' own race on a test of other-race face matching, with an other-race accuracy cost remaining evident in this group. This SR advantage for other-race faces may be driven by more efficient attentional allocation to central regions of the face, particularly the nose, which are likely to provide greater diagnostic information for identity perception (Bobak, Parris, Gregory, Bennetts, & Bate, 2016; Hills, Cooper, & Pake, 2013; Hills & Lewis, 2006; Hills & Pake, 2013). Finally, police forces, border control agencies and private organisations who seek to select and employ SRs must include other-race face tasks in their assessment battery to ensure, at the individual level, that the people they select also excel in verifying the identities of individuals from outside their own ethnic group. Doing so would provide an effective addition to counter-measures which are designed to reduce fraud attacks at passport control, and wrongful criminal convictions.
should be treated with caution, as both the size of the SR sample (N = 8) and its heterogeneity precluded statistical comparisons at the group level. Further work, with a much larger SR sample, is required to test the robustness of their findings. Therefore, the present study sought to investigate individual differences in performance across a range of own- and other-race face tests in typical recognisers (Studies 1 & 2) and a large sample of Caucasian SRs (Study 3). In Study 1, we test a large sample of typical recognisers (Caucasian undergraduate students; N = 111) using a battery of facial identification tests which tap into own-race face memory (CFMT+), own-race face matching (GFMT-short, MFMT-short), other-race face memory (CFMT-Chinese), and other-race face matching processes (Egyptian Face Matching Test; EFMT-long), to assess whether typical recognisers who perform accurately on own-race tests also show similar levels of performance on the other-race tests. If that is the case, it would provide support for a common mechanism which underlies both own- and other-race face identification. In Study 2, using a sample of typical recognisers (Caucasian undergraduate students; N = 43), we verify a shortened 40-item version of the other-race face matching test used in Study 1 (EFMT-short).
of the University of Strathclyde School of Psychological Sciences and Health. Study 3 received concurrent approval from the University of Greenwich Research Ethics Committee.

Participants
One hundred and eleven Caucasian participants with a mean age of 22 years (SD = 5, Range = 18-53, 18 Male) were recruited from the University of Strathclyde School of Psychological Sciences and Health. All participants had normal or corrected-to-normal vision, each provided written informed consent, and upon completion of the study each received a course credit, or an optional piece of confectionery.

Figure 1 Correlation matrix for the five face identification tests used in Study 1; Glasgow

Participants
Forty-three Caucasian participants recruited from the University of Strathclyde School of Psychological Sciences and Health, with a mean age of 23 years (SD = 5, Range = 18-44, 11 Male) took part in this study. All participants had normal or corrected-to-normal vision, each provided written informed consent, and upon completion of the study they received a course credit to reimburse them for their time.

Figure 2 Mean accuracy for the GFMT and the shortened version of the EFMT (40 Trials), and separately, their match and mismatch conditions, * p < .05, ** p ≤ .005.