Genome‐wide association study of facial emotion recognition in children and association with polygenic risk for mental health disorders

Emotion recognition is disrupted in many mental health disorders, which may reflect shared genetic aetiology between this trait and these disorders. We explored genetic influences on emotion recognition and the relationship between these influences and mental health phenotypes. Eight‐year‐old participants (n = 4,097) from the Avon Longitudinal Study of Parents and Children (ALSPAC) completed the Diagnostic Analysis of Non‐Verbal Accuracy (DANVA) faces test. Genome‐wide genotype data was available from the Illumina HumanHap550 Quad microarray. Genome‐wide association studies were performed to assess associations with recognition of individual emotions and emotion in general. Exploratory polygenic risk scoring was performed using published genomic data for schizophrenia, bipolar disorder, depression, autism spectrum disorder, anorexia, and anxiety disorders. No individual genetic variants were identified at conventional levels of significance in any analysis although several loci were associated at a level suggestive of significance. SNP‐chip heritability analyses did not identify a heritable component of variance for any phenotype. Polygenic scores were not associated with any phenotype. The effect sizes of variants influencing emotion recognition are likely to be small. Previous studies of emotion identification have yielded non‐zero estimates of SNP‐heritability. This discrepancy is likely due to differences in the measurement and analysis of the phenotype.

bipolar disorder (Kohler et al., 2011), and anxiety (Demenescu, Kortekaas, den Boer, & Aleman, 2010). The ability to infer emotions displayed by others could represent an important influence on the individual that shapes their behavior within a society, as well as their mental health (Lopes, Salovey, Côté, Beers, & Petty, 2005).
To date, a few studies have examined the role of genes in facial emotion recognition, implicating variants in the oxytocin receptor gene OXTR in the identification of emotions, and the catechol-O-methyl transferase gene COMT in response time to emotional faces (Weiss et al., 2007). Investigations of the effect of variation in the promoter region of the serotonin transporter gene (5HTTLPR) found no differences between emotions in terms of identification but found some evidence of an association with the intensity of emotion at which recognition occurred (Antypa, Cerit, Kruijt, Verhoeven, & Van der Does, 2011). There has also been a considerable literature linking 5HTTLPR to amygdala activation, including in response to emotional faces (Canli & Lesch, 2007). However, these studies use a candidate gene approach, which is limited by focusing on a few regions of assumed relevance, and usually relies on small sample sizes that are underpowered to detect likely effect sizes (Dick et al., 2015; Ioannidis, 2003). Previous work in the Philadelphia Neurodevelopmental Cohort has investigated emotion identification (among other phenotypes) genome-wide, focusing on estimation of heritability (identifying a common-variant heritability of 36%) and polygenic risk relationships with schizophrenia (Germine et al., 2016; Robinson et al., 2015).
Alternative approaches have also provided insights into the genetics of emotion recognition. Epidemiological observation of emotion recognition deficits in X-linked disorders including Turner's syndrome and fragile X disorder argues for a role of variants on the X chromosome (Bouras, Turk, & Cornish, 1998; Lawrence, Kuntsi, Coleman, Campbell, & Skuse, 2003; Skuse, 2006). A family-based quantitative genetic study of individuals with schizophrenia has estimated the heritability (additive genetic component of variance) of emotion recognition in faces at approximately 35% (Greenwood et al., 2007). In contrast, research on typically developing twins in childhood identified a large heritable component of general emotion recognition in faces that was shared across different emotions (75%), although no emotion-specific components were identified (Lau et al., 2009).
Facial emotion recognition deficits have been reported in individuals with schizophrenia, bipolar disorder, depression, and autism spectrum disorder, and mixed evidence exists for similar deficits in anorexia and anxiety disorders (Bourke et al., 2010; Collin, Bindra, Raju, Gillberg, & Minnis, 2013; Demenescu et al., 2010; Harms et al., 2010; Kohler et al., 2010; Kohler et al., 2011). Large GWAS of these disorders exist, and may predict variance in emotion recognition in the present cohort (Otowa et al., 2016; Ripke et al., 2013; Schizophrenia Working Group of the Psychiatric Genomics C, 2014; Sklar et al., 2011). Increased understanding of intact emotion recognition may aid in understanding the nature and importance of emotion recognition deficits in these disorders. Accordingly, we investigated the association between polygenic risk scores derived from GWAS of these disorders and facial emotion recognition phenotypes to assess whether genetic correlations mirror reported comorbidities.
In this study, we performed GWAS of non-verbal emotion recognition in children from the Avon Longitudinal Study of Parents and Children (ALSPAC). We then used polygenic risk score analysis to predict individual differences in emotion recognition within this cohort, using polygenic risk scores from studies of psychiatric disorders in which emotion recognition is impaired.
Participants were drawn from ALSPAC, which has been described in detail elsewhere (Boyd et al., 2012). In brief, approximately 15,000 pregnant women resident in Avon, UK with expected dates of delivery between April 1st, 1991 and December 31st, 1992 were recruited into a prospective birth cohort to study the effects of environmental and genetic influences on health and development. Additional information on the ALSPAC cohort is available on the study website, through a fully searchable data dictionary (http://www.bris.ac.uk/alspac/researchers/data-access/data-dictionary/).

| Measurement of facial emotion recognition
A total of 7,297 of the child participants in ALSPAC underwent the Diagnostic Analysis of Non-Verbal Accuracy test (DANVA) as part of the "Focus at 8" assessment, performed when the participants were approximately 8 years old (Nowicki & Carton, 1993). The "Focus at 8" assessments comprised four sessions investigating psychometric and psychological characteristics, taking place across half a day. The DANVA was performed as part of the Activities session. Two tests from the DANVA were used, measuring the ability of participants to extract emotional information from the vocal tone (paralanguage) and face of actors. However, only data from the faces task was available to be analyzed in this study. During the task, the participant was shown 24 images of children displaying one of four emotions: happiness, sadness, anger, or fear. The image was displayed for 2 s, after which the participant was asked to identify the emotion verbally, and their response was recorded.

| Genotyping and assessment of population stratification
The generation and quality control of genome-wide genotype data are described on the ALSPAC website (http://www.bristol.ac.uk/medialibrary/sites/alspac/migrated/documents/gwas-data-generation.pdf).
In brief, 9,912 of the child participants were genotyped on the Illumina HumanHap550 Quad microarray, and the resulting genotypes imputed to the HapMap2 release 22 for autosomes, and HapMap 3 release 2 for the X chromosome using MACH and Minimac, respectively (Howie, Fuchsberger, Stephens, Marchini, & Abecasis, 2012; Li, Willer, Ding, Scheet, & Abecasis, 2010). Following quality control to remove poorly performing variants and samples, 8,365 samples and 2,487,351 variants were available (500,527 genotyped). Additional filtering to that described in the ALSPAC documentation was applied to the imputed dataset to remove rare variants (minor allele frequency < 0.01) and variants that had been poorly imputed (MACH Rsq < 0.3) before analysis.
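The additional filtering step described above can be illustrated with a minimal sketch. The `ImputedVariant` record and its field names are hypothetical and are not part of the ALSPAC pipeline or of MACH; only the two cut-offs (minor allele frequency < 0.01 and MACH Rsq < 0.3) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImputedVariant:
    """Hypothetical per-variant summary; field names are illustrative only."""
    rsid: str
    maf: float   # minor allele frequency
    rsq: float   # MACH imputation quality (Rsq)


def passes_filters(v: ImputedVariant, min_maf: float = 0.01, min_rsq: float = 0.3) -> bool:
    """Keep a variant unless it is rare (MAF < 0.01) or poorly imputed (Rsq < 0.3)."""
    return v.maf >= min_maf and v.rsq >= min_rsq


def filter_variants(variants):
    """Return only the variants that survive both filters."""
    return [v for v in variants if passes_filters(v)]
```

The thresholds are kept as parameters so that the same sketch could be reused with other cut-offs.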
Participants self-reported white Western European ancestry.
Principal components analysis of the genotyped data using EIGENSOFT yielded no principal components associated with the DANVA phenotypes at a level greater than chance. Given that the cohort comprised individuals of white Western European ancestry from a single region, no further correction for population stratification was made.

| Analysis
Results from the DANVA were used to measure the participant's general ability at emotion recognition by calculating a proportion index (Rosenthal & Rubin, 1989). This measures a participant's performance across all 24 trials of the DANVA, scaled such that a score of 0. In order to assess emotion-specific genetic influences, unbiased hit rates were created for each emotion (Wagner, 1993). Unbiased hit rates for each emotion were arcsine transformed and used as phenotypes in genome-wide association studies (GWAS) performed in ProbAbel (http://www.genabel.org/), using MACH-imputed dosage data (Aulchenko, Struchalin, & van Duijn, 2010; Li et al., 2010).
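The phenotype definitions above can be made concrete with a short sketch. It uses the standard published forms of the two measures: the proportion index π = P(k − 1) / (1 + P(k − 2)) for a task with k response options and raw proportion correct P, under which chance performance maps to 0.5 and perfect performance to 1, and the unbiased hit rate Hu = hits² / (stimuli shown × times the response was used). The exact arcsine transformation is not specified in the text; arcsin(√p) is assumed here because it is the usual variance-stabilising choice for proportions. Function names are illustrative.

```python
import math


def proportion_index(n_correct: int, n_trials: int, k: int = 4) -> float:
    """Rosenthal & Rubin (1989) proportion index for a k-alternative task.

    Rescales raw accuracy so that chance (1/k) becomes 0.5 and perfect
    accuracy stays at 1.0. k = 4 matches the four DANVA emotions.
    """
    p = n_correct / n_trials
    return p * (k - 1) / (1 + p * (k - 2))


def unbiased_hit_rate(hits: int, n_stimuli: int, n_responses: int) -> float:
    """Wagner (1993) unbiased hit rate for one emotion.

    hits        -- trials where the emotion was shown AND chosen
    n_stimuli   -- trials where the emotion was shown
    n_responses -- trials where the emotion was chosen (correctly or not)
    """
    if n_stimuli == 0 or n_responses == 0:
        return 0.0
    return (hits * hits) / (n_stimuli * n_responses)


def arcsine_transform(p: float) -> float:
    """Assumed form of the arcsine transformation: arcsin(sqrt(p))."""
    return math.asin(math.sqrt(p))
```

Penalising by the number of times a response was used means that a participant who answers "happy" on every trial does not receive full credit for happy faces.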
Following each individual GWAS, variants were clumped in PLINK1.9 to identify linkage-independent loci (Chang et al., 2015). Specifically, all variants were assigned to a locus if they were in linkage disequilibrium (r² > 0.25) with a nearby (<250 kb) variant with a lower p-value.
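The clumping rule can be sketched as a greedy procedure: take the variant with the lowest p-value as a sentinel, attach every unassigned variant within 250 kb that has r² > 0.25 with it, and repeat with the next unassigned variant. This is an illustration of the rule as stated, not PLINK's implementation; PLINK additionally applies p-value thresholds to index variants, which are not modelled here. The `Variant` record and the `r2` lookup callable are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical GWAS result row."""
    rsid: str
    chrom: str
    pos: int      # base-pair position
    p: float      # association p-value


def clump(variants, r2, r2_min=0.25, window_bp=250_000):
    """Greedy p-value-informed clumping.

    r2 is a caller-supplied function (Variant, Variant) -> float giving the
    pairwise linkage disequilibrium; how it is computed is out of scope.
    Returns a dict mapping each sentinel rsid to the rsids in its locus.
    """
    loci = {}
    assigned = set()
    for index in sorted(variants, key=lambda v: v.p):
        if index.rsid in assigned:
            continue  # already absorbed into a stronger locus
        assigned.add(index.rsid)
        members = [index.rsid]
        for v in variants:
            if v.rsid in assigned or v.chrom != index.chrom:
                continue
            if abs(v.pos - index.pos) >= window_bp:
                continue
            if r2(index, v) > r2_min:
                members.append(v.rsid)
                assigned.add(v.rsid)
        loci[index.rsid] = members
    return loci
```

Each returned key corresponds to what the text calls a sentinel SNP, the variant with the lowest p-value in its locus.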
All GWAS analyses controlled for fixed effects of gender, age at assessment (in weeks), IQ at assessment, and whether the activities session was the first, second, third, or fourth performed (with first used as the reference condition; Supplementary Table S2). Further covariates were considered for inclusion, including summary results from each section of the Development and Wellbeing Assessment (DAWBA; Goodman, Ford, Richards, Gatward, & Meltzer, 2000), and components of the Family Adversity Index (Bowen, Heron, Waylen, & Wolke, 2005). However, these additional covariates were found to be uncorrelated with the DANVA phenotypes, and so were not included.

Performance on the Social and Communication Disorders Checklist (SCDC) was correlated with the phenotypic outcome. This questionnaire is a measure of flexibility and responsiveness to social interactions, and as such may involve the same cognitive processes as the DANVA (Skuse, James, Bishop, & Coppin, 1997). Analyses were run both with and without this covariate; these results were very similar, and so only analyses not including the questionnaire are using summary statistics and genotype data, respectively. A previous study of emotion recognition in children examined a score equivalent to the summed correct responses score used prior to conversion to the proportion index, yielding a heritability estimate of 36% (Robinson et al., 2015). In order to compare results directly between this study and that of Robinson et al. (2015), sensitivity analysis was run in GCTA using the summed correct responses score.
Associations between external traits and specific and general emotion recognition were assessed using the default high-resolution polygenic risk scoring option implemented in PRSice (Euesden, Lewis, & O'Reilly, 2015; Purcell et al., 2009). Specifically, 10,000 risk scores were calculated from each of the external GWAS using an increasing threshold for the inclusion of single nucleotide polymorphisms (SNPs).
Variants were included if their associated p-value from the external GWAS fell beneath this threshold (p = 0.00005 to p = 0.5 in steps of 0.00005), and were weighted by their effect size in the external GWAS.
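The thresholding scheme can be sketched in a few lines. The grid from p = 0.00005 to p = 0.5 in steps of 0.00005 gives exactly 10,000 thresholds, and at each one the score is the effect-size-weighted sum of a participant's allele dosages over the SNPs whose external p-value falls beneath the threshold. This is a simplified illustration, not PRSice's code: it assumes dosages are already aligned to the effect allele and omits clumping and any per-allele averaging. All function and variable names are hypothetical.

```python
def p_thresholds(start=0.00005, stop=0.5, step=0.00005):
    """Evenly spaced p-value inclusion thresholds, inclusive of both ends."""
    n = round((stop - start) / step) + 1
    # Build from integer multiples to avoid accumulating float error.
    return [round(start + i * step, 5) for i in range(n)]


def polygenic_score(dosages, weights, pvalues, threshold):
    """Weighted allele-dosage sum for one participant at one threshold.

    dosages -- {snp: effect-allele dosage in [0, 2]} for the participant
    weights -- {snp: effect size from the external GWAS}
    pvalues -- {snp: p-value from the external GWAS}
    """
    return sum(
        dosages[snp] * weights[snp]
        for snp in dosages
        if snp in weights and pvalues[snp] < threshold
    )


def scores_across_thresholds(dosages, weights, pvalues, thresholds):
    """One score per threshold; scores at neighbouring thresholds overlap heavily."""
    return [polygenic_score(dosages, weights, pvalues, t) for t in thresholds]
```

Because each threshold's SNP set contains all SNPs from the stricter thresholds, the resulting 10,000 scores are strongly correlated with one another.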
Within each polygenic risk score analysis, an adjusted alpha threshold of p = 0.001 was used to correct for the assessment of multiple correlated risk scores (Euesden et al., 2015). Multiple analyses were performed across five DANVA phenotypes (recognition of happy, sad, angry and fearful faces, and overall recognition), using results from seven external GWAS (schizophrenia, bipolar disorder, major depressive disorder, autism, anorexia nervosa, and anxiety assessed as a case-control and as a continuous phenotype).
The number of effective tests resulting from these multiple analyses was determined using the Nyholt-Šidák method (Nyholt, 2004).
Specifically, the correlation matrix of the 35 optimal polygenic risk scores (Table 4) was calculated and spectral decomposition was used to determine the number of effective tests.
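A minimal sketch of this correction follows, using Nyholt's (2004) estimator M_eff = 1 + (M − 1)(1 − Var(λ)/M), where λ are the eigenvalues of the M × M correlation matrix, followed by a Šidák adjustment of the per-analysis alpha. The functions take the eigenvalues as input; obtaining them from a correlation matrix (for example with `numpy.linalg.eigvalsh`) is left to the caller. Applying the adjustment to the per-analysis alpha of 0.001 is an assumption, made because it is consistent with the adjusted threshold of 3.01 × 10⁻⁵ reported for 33.22 effective tests.

```python
from statistics import variance


def nyholt_effective_tests(eigenvalues):
    """Nyholt (2004) effective number of independent tests.

    eigenvalues -- eigenvalues of the correlation matrix of the M tests.
    Uncorrelated tests give M; perfectly correlated tests give 1.
    """
    m = len(eigenvalues)
    if m < 2:
        return float(m)
    # statistics.variance is the sample variance (divides by M - 1).
    return 1 + (m - 1) * (1 - variance(eigenvalues) / m)


def sidak_alpha(alpha, m_eff):
    """Per-test significance threshold keeping the family-wise rate at alpha."""
    return 1 - (1 - alpha) ** (1 / m_eff)
```

With `alpha = 0.001` and `m_eff = 33.22`, `sidak_alpha` returns a value that rounds to 3.01 × 10⁻⁵.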

| Ethics
Ethics approval for the study was obtained from the ALSPAC Ethics and Law Committee and the Local Research Ethics Committees.
ALSPAC operates in accordance with the principles laid out in the Declaration of Helsinki (Mumford, 1999).

| Data available for analysis and demographics
Of the 7,297 participants who completed the DANVA, 483 were excluded from the analysis because they provided responses for fewer than 23 of the 24 faces, 50 because their parent reported a diagnosis of autism spectrum disorder, and 118 because their IQ was less than 70.
This resulted in 6,646 participants, of whom 4,097 also had genome-wide genotyping data (2,487,351 variants available following imputation) and made up the analyzed cohort.
Demographic data for the cohort is displayed in Table 1. The cohort contained slightly more females (50.4%) and ranged from 7 to 10 years old (389-543 weeks, mean = 450 weeks, SD = 12 weeks). IQ ranged from 70 (lower IQs were removed) to 145 (mean = 106, SD = 15.7). Measurements of skewness and kurtosis suggest that the arcsine transformation of the proportion index was necessary to improve the normality of this measure (Supplementary Table S1). Unbiased hit rates were acceptably normal before transformation, and their normality was largely unaffected by arcsine transformations (Supplementary Table S1). Transformed phenotypes were used to ensure consistency of treatment of all proportional phenotypes. Sensitivity analyses were performed on untransformed hit rates to assess the effect of this transformation.

| Performance of the DANVA faces task
Correct identification of faces differed by emotion. Participants were better at detecting happy faces compared to all other emotions, better at detecting sad faces than fearful or angry faces, and better at detecting fearful faces compared to angry faces (Table 2).

| GWAS results
No variants were identified at conventional levels of genome-wide significance (p = 5 × 10⁻⁸) in any GWAS, but 10 loci reached suggestive levels of significance across the GWAS of individual emotions (Supplementary Figures S1-S4), with five of these loci attaining suggestive significance for general emotion recognition, along with an additional two variants (p < 5 × 10⁻⁶, Table 3, Figure 1).
Sensitivity analyses of the specific emotion GWAS using untransformed hit rates produced results that did not qualitatively differ from those using the transformed phenotypes (Supplementary Table S4). All variants with p < 5 × 10⁻⁶ in a specific analysis in the main GWAS had p < 5 × 10⁻⁵ in the relevant sensitivity analysis (Supplementary Table S5).
Post-hoc power analyses were conducted using Genetic Power Calculator (Purcell, Cherny, & Sham, 2003). The cohort of 4,097 participants is adequate to detect a variant capturing 0.97% of variance at 80% power. For comparison, the most variance captured by any of the top SNPs in the analysis of individual emotions was 0.047% (rs3770081, sad faces, Table 3); a total of 84,320 participants would be required to capture this level of variance at 80% power.

| Polygenic risk scoring
Three PRS passed correction for the 10,000 non-independent tests involved in a single PRSice analysis (p < 0.001; Table 4): autism predicting fear recognition (p = 7.32 × 10⁻⁴), anxiety (as a case-control phenotype) predicting recognition of happy faces (p = 6.72 × 10⁻⁴), and anxiety (as a factor score) predicting recognition of angry faces (p = 6.62 × 10⁻⁴; Euesden et al., 2015). However, none was significant when taking into account the testing of multiple phenotypes (all p > 3.01 × 10⁻⁵; Nyholt, 2004). Plots of PRS associations across common thresholds are provided for each analysis in the Supplementary Material (Supplementary Figures S5-S11).
WIPF3. Of these genes, chimerin 2 (CHN2) is the most interesting candidate. It encodes beta-chimerin, a rho-GTPase activating protein involved in the phospholipase C cell signaling pathway, proposed to have a regulatory function in the central nervous system (Yang & Kazanietz, 2007). CHN2 is highly expressed in the brain and has previously been implicated in schizophrenia, although neither this gene nor the locus of interest was present in the largest GWAS of schizophrenia to date (Hashimoto et al., 2005; Schizophrenia Working Group of the Psychiatric Genomics C, 2014). However, it should be noted that biological interest has proved an unreliable indicator of true association in GWAS to date (Collins & Sullivan, 2013). Furthermore, although the region discussed passes the threshold for suggestive significance, it is not genome-wide significant, and as such could be accounted for by random chance alone.

| Secondary analyses
The sample size studied is relatively large for a psychological study; however, it is modest for a GWAS. As such, analyses only had statistical power to detect moderate effect sizes. Studies of psychological and behavioral traits to date suggest emotion recognition is likely to be highly polygenic, with multiple variants each contributing only a small effect (Munafo & Flint, 2014). Our results are consistent with such a model, and place an upper bound on the effect sizes to be expected from any larger study or meta-analysis. However, these results are also consistent with the null hypothesis of no genetic effects. The weight of evidence from the literature supports the hypothesized polygenicity of emotion recognition (Germine et al., 2016;Robinson et al., 2015). The results presented herein do not provide additional support, yet polygenicity remains more likely than the absence of a common, additive genetic component to emotion recognition.
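The relationship between sample size, variance explained, and power that underlies this argument can be sketched with a textbook approximation. For a quantitative trait and a single variant explaining a fraction R² of phenotypic variance, the 1-df association test has non-centrality parameter approximately N·R²/(1 − R²), and its power can be computed from the normal distribution. This is not Genetic Power Calculator itself; the genome-wide alpha of 5 × 10⁻⁸ and the additive model are assumptions of the sketch, and the function name is illustrative.

```python
from statistics import NormalDist


def gwas_power(n: int, r2: float, alpha: float = 5e-8) -> float:
    """Approximate power of a 1-df test for a variant explaining r2 of variance.

    A non-central chi-square with 1 df is the square of a normal variable
    with mean sqrt(NCP), so the two-sided power follows from the normal CDF.
    """
    nd = NormalDist()
    ncp = n * r2 / (1 - r2)               # non-centrality parameter
    z_crit = -nd.inv_cdf(alpha / 2)       # two-sided critical value
    delta = ncp ** 0.5
    return nd.cdf(delta - z_crit) + nd.cdf(-delta - z_crit)
```

Under these assumptions, `gwas_power(4097, 0.0097)` and `gwas_power(84320, 0.00047)` both come out close to 80%, in line with the reported power figures, whereas `gwas_power(4097, 0.00047)` is far below 1%.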
Estimation of heritability was performed using common SNP data, which captures only a proportion of total heritability. No estimate of heritability could be obtained from the analyses presented. Previous attempts to use this method for behavioral phenotypes have reported similarly non-significant or low estimates of heritability, which may result from differences in analytical approach and sample characteristics (Pappa et al., 2015; St Pourcain et al., 2015; Trzaskowski, Dale, & Plomin, 2013). The null estimate of heritability does not appear to be due to sample size, as power calculations suggest the cohort was powered to detect the 36% SNP heritability previously reported (Robinson et al., 2015). Although this study and that of Robinson et al. (2015) assessed similarly sized cohorts of juvenile participants of European ancestry (N = 4,097 and N = 3,661, respectively), there are a number of methodological differences that may underlie the differing results. First, there are some demographic differences: Robinson et al. (2015) studied an American cohort with ages ranging 8-21, whereas the ALSPAC cohort is British and younger (ages ranged 7-10). The approach to measuring emotion recognition also differed. Robinson et al. (2015) used the Penn Computerized Neurocognitive Battery (CNB) Emotion Identification test (Gur et al., 2012). This measure assesses the same four emotions as the DANVA (but also includes a neutral face condition) and its output is the sum of all correct responses. As such, it is equivalent to the summed correct
Each locus is represented by a sentinel SNP, that with the lowest p-value in the locus. One locus on chromosome 7 showed different sentinel SNPs across different analyses, so is represented by three SNPs. Positive direction of effect means better recognition of emotion with each effect allele (A1). Locus information is provided in Supplementary Table S3.
Three associations pass the recommended threshold of p = 0.001 for a single analysis, but not the adjusted threshold (p = 3.01 × 10⁻⁵) for the 33.22 effective tests performed (Euesden et al., 2015; Nyholt, 2004).
Previous studies of emotion recognition by Greenwood et al. (2007) and Lau et al. (2009) differed considerably from the current study in their sample composition and analytical approach. The estimate of heritability from Greenwood et al. (2007) is derived from the Penn CNB Emotion Identification test described above (Kohler et al., 2003). In addition, the participants differ considerably: Greenwood et al. (2007) studied families of adults with schizophrenia, whereas the data analyzed herein were drawn from a population cohort of children prior to puberty (after which there is evidence for an
The cohort studied by Lau et al. (2009) was more similar to that investigated in this study, comprising 10-year-old twins, albeit selected for high levels of parent-reported anxiety (Lau et al., 2009).
However, although accuracy of emotion recognition was measured, the experiment used a face morphing from a neutral condition to an emotional condition, rather than static images. The analysis of heritability also differs. The reported figure of 75% is derived from a latent factor analysis model in which a single genetic factor influences emotion recognition in all faces. Estimates of heritability from individual emotions (both from univariate analyses and modeled as emotion-specific effects in the latent class analysis) were not significantly different from zero (Lau et al., 2009 years old) has previously been estimated at 74-78%, using twin-based methods (Scourfield, Martin, Lewis, & McGuffin, 1999;Skuse, Mandy, & Scourfield, 2005). The estimate of heritability from common variants is only a third of that estimated from twin methods, further demonstrating low heritability estimates from common variants in behavioral phenotypes.
Polygenic risk scoring was unable to identify significant predictors. Although power estimation is possible in polygenic risk scoring, the number of variables involved makes accurate estimation difficult without prior knowledge of the relationship between the phenotypes under study (Dudbridge, 2013;Palla & Dudbridge, 2015).
Emotion recognition is a complex phenotype requiring attention to cues in multiple areas of the face, which change subtly in real-time (Bassili, 1979). It is likely to involve an intricate network of neural interactions (Vuilleumier & Pourtois, 2007). The faces component of the DANVA (as used in the ALSPAC study) is a comparatively simple forced-choice test between static pictures of the four emotions studied. As such, the DANVA can only provide a limited measure of facial emotion recognition. Furthermore, because the DANVA does not include a neutral face condition, we were unable to control for general face recognition ability in this analysis. As such, we cannot separate associations between genetic variants and face recognition from those with emotion recognition. Future studies could achieve this separation by meta-analyzing GWAS of emotion recognition in faces and in voices. At least a proportion of the variants associated with emotion recognition in faces would be expected to be associated with recognition of emotion in verbal tone (such as in the paralanguage component of the DANVA, which was not available during this study).
We performed GWAS of non-verbal emotion recognition in a population cohort of children. Although no variants were identified at genome-wide significance, the modest power of the sample suggests an upper threshold on the expected effect sizes of individual variants on this phenotype. Similarly, we were unable to obtain an estimate of heritability for any emotion recognition phenotype, despite power to detect true SNP heritabilities of 22%, lower than the reported SNP heritability of 36%. Emotion recognition is a complex phenotype, and its measurement is a simplification by necessity. Insights into the genetics of emotion recognition could inform our understanding of psychiatric disorders and of the basis by which individuals interact with their environment. Accordingly, a challenge for future research will be to combine sensitive measures of emotion recognition with the sample sizes required to capture the small effect sizes of variants suggested by the behavioral genetic literature.