ATP2C2 and DYX1C1 are putative modulators of dyslexia‐related MMR

Abstract Background Dyslexia is a specific learning disorder affecting reading and spelling abilities. Its prevalence is ~5% in German‐speaking individuals. Although the etiology of dyslexia largely remains to be determined, comprehensive evidence supports deficient phonological processing as a major contributing factor. An important prerequisite for phonological processing is auditory discrimination and, thus, essential for acquiring reading and spelling skills. The event‐related potential Mismatch Response (MMR) is an indicator for auditory discrimination capabilities with dyslexics showing an altered late component of MMR in response to auditory input. Methods In this study, we comprehensively analyzed associations of dyslexia‐specific late MMRs with genetic variants previously reported to be associated with dyslexia‐related phenotypes in multiple studies comprising 25 independent single‐nucleotide polymorphisms (SNPs) within 10 genes. Results First, we demonstrated validity of these SNPs for dyslexia in our sample by showing that additional inclusion of a polygenic risk score improved prediction of impaired writing compared with a model that used MMR alone. Secondly, a multifactorial regression analysis was conducted to uncover the subset of the 25 SNPs that is associated with the dyslexia‐specific late component of MMR. In total, four independent SNPs within DYX1C1 and ATP2C2 were found to be associated with MMR stronger than expected from multiple testing. To explore potential pathomechanisms, we annotated these variants with functional data including tissue‐specific expression analysis and eQTLs. Conclusion Our findings corroborate the late component of MMR as a potential endophenotype for dyslexia and support tripartite relationships between dyslexia‐related SNPs, the late component of MMR and dyslexia.


| INTRODUCTION
Dyslexia is a learning disorder affecting the acquisition of reading and spelling skills. According to the Diagnostic and Statistical Manual of Mental Disorders: DSM-V (American Psychiatric Association, 2013), reading as well as spelling impairments belong to the category of specific learning disorders. They can occur independently or in combination.
However, knowledge regarding specific pathomechanisms translating genetic risk variants into a dyslexic phenotype is still very limited. Endophenotypes are a common concept for describing pathomechanistical processes. Endophenotypes are defined as measurable, phenotypic components contributing to disease-phenotype and found along the path from genes to the disease-phenotype (Gottesman & Gould, 2003). Certain dyslexia-related potential endophenotypes affected by genetic risk variants have been reported in neuroimaging studies analyzing specific hemodynamic brain activation patterns.
Exemplarily, SNPs in FOXP2 were associated with fMRI activation in the left inferior frontal and precentral gyri, whereas SNP rs17243157 in THEM2 was associated with asymmetry in the functional activation of the superior temporal sulcus (Pinel et al., 2012). Furthermore, Darki, Peyrard-Janvid, Matsson, Kere, and Klingberg (2012) reported gray and white matter variation to be linked with variants within DYX1C1, DCDC2, and KIAA0319 dyslexia candidate genes, while Scerri et al. (2012) showed white matter variation to be linked with variants within MRPL19/C2ORF3. These associations between dyslexia candidates and brain structure are in line with findings from MRI studies, where structural gray and white matter alterations were associated with dyslexia-relevant traits (Klingberg et al., 2000;Kraft et al., 2015).
Another potential neurophysiological endophenotype for dyslexia refers to automatic responses being observable in a specific component of the auditory event-related potential (ERP). This is called mismatch negativity or, more generally spoken, mismatch response (MMR) (Näätänen, Gaillard, & Mäntysalo, 1978). Altered MMRs are reported for individuals with dyslexia and SLI, and are assumed to be associated with deficient phonological processing (Lovio, Näätänen, & Kujala, 2010;Schulte-Körne, Deimel, Bartling, & Remschmidt, 1998. Indeed, first evidence for genetic variants affecting the late component of MMR was reported in a previous genome-wide association study (GWAS) for common variants (Roeske et al., 2011) and, subsequently, for certain rare variants (Czamara et al., 2011).
In this study we investigated the neuro-functional implications of dyslexia candidate genes. Specifically, we wanted to uncover the relationship between dyslexia-related phenotypes and the late component of MMR on the genetic level. To this end, we identified 25 independent SNPs from 10 genes previously reported to be associated with dyslexia or dyslexia-related phenotypes in at least two studies.
Details on the selected SNPs can be found in Materials and Methods.
Additionally, we investigated two common SNPs previously reported to be associated with the late component of MMR. These SNPs were tested regarding a possible association with the late component of the MMR in a sample of 67 children.

| Participants
Sixty-seven children (37% male subjects), mean age 9.63 years (SD = 0.53), participated in this study. All children were monolingual German, attending primary school (grade 3 and 4) without any history of hearing impairment or neurological disorders. Twelve children (10 male subjects) were diagnosed with attention deficit disorder (ADD).
German individuals with dyslexia are more likely to show spelling difficulties than reading difficulties (Landerl, Wimmer, & Frith, 1997;Wimmer, 1996). Therefore, the DERET (Deutscher Rechtschreibtest; German Spelling Test) (Stock & Schneider, 2008) was used to specify participants' spelling abilities. The DERET qualitatively and quantitatively assesses the orthographic abilities of primary students in accordance with German curricula. Dictations mirror children's ability to use German phoneme-grapheme-correspondence as well as orthographic rules (see Supporting Information 1 for details).
In addition, we assessed nonverbal intelligence using the German version of the Kaufmann-Assessment Battery for Children (K-ABC) (Kaufman, Kaufman, Melchers, & Preuß, 2009). Importantly, no child had an IQ (i.e., nonverbal) below the critical threshold of 85. Descriptive statistics for demographic and psychometric (DERET, K-ABC) variables are presented in Table 1 and further details can be found in Table S1.
Parents of participating children were reimbursed (€7.00 per hour).
The study followed American Psychological Association (APA) standards in accordance with the declaration of Helsinki from 1964 (World Medical Organization, 1996) and was approved by the medical faculty of the University Leipzig. Children and their parents were informed both orally and in writing about the procedures and parents had to provide written consent for their children's enrolment.

| Stimulus material
In order to analyze auditory speech discrimination capabilities by means of MMR, we conducted a passive oddball paradigm, where participants were presented with a frequently occurring standard syllable, occasionally replaced by a deviant syllable. We used the syllables /pa/ (266 ms in length) and /ga/ (409 ms in length), which were recorded by a native German speaker. The stimuli were recorded with a 16-bit sampling rate and digitized at 44.1 kHz.

| EEG testing
The children were seated in a comfortable chair in an electrically and acoustically shielded electroencephalography (EEG) cabin. Auditory stimuli were presented binaurally via the tannoy with an intensity of 64 dB sound pressure level (SPL). During the presentation, the children watched a silent video of "the little mole", a popular children's cartoon (http://www. imdb.com/title/tt0841927/), on a small video screen in front of them. This was done to prevent extreme eye movements and boredom. We used a two-block design because of the different duration characteristics of the two syllables. The syllable/ga/was used as the standard and the syllable/ pa/as the deviant in one block; and vice versa in the other block. The order of the two blocks was counterbalanced across the children. Within one block, 600 stimuli were presented with 510 standard (85%) and 90 deviant stimuli (15%). We pseudorandomized the presentations of the deviant stimuli so that at least two standard stimuli were presented in between the deviant stimuli. The inter-stimulus-interval (ISI) between two stimuli (offset to onset) varied between 1,450 and 1,750 ms related to the different duration characteristics of the syllables. This is a time range during which the MMR is still elicited (Sams, Hari, Rif, & Knuutila, 1993). The experiment lasted for 45 min and the total procedure for about 90 min.

| EEG data processing and analysis
Recordings were algebraically re-referenced to the average of both mastoids (A1, A2). To remove muscle artifacts from the EEG signal, a digital low-pass filter of 30 Hz was applied to each single subject dataset (−3 dB cutoff frequency of 26.27 Hz). The sampling rate was then reduced to 250 Hz and a high-pass filter of 0.5 Hz was applied to remove very slow drifts (−3 dB cutoff frequency of 0.501 Hz). Single EEG epochs or trials with a signal above ±80 μV within a sliding window of 200 ms were considered invalid (e.g., containing artifacts) and excluded.
The EEG data were averaged per participant and per condition (i.e., standards and deviants) between −200 and 1,250 ms relative to the onset of the stimuli. The response to the standard stimulus, which was presented directly after a deviant stimulus, was excluded from further analyses. A period of −200 to 0 ms relative to the stimulus onset was chosen for baseline correction. In a second step, grand averages were computed for each condition across subjects. All EEG processing was carried out with the EEP 3.2.1 software package (Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany).
Individual MMR was quantified as the mean signal within 300-600 ms after stimulus onset. This represents the time window for

| DNA extraction and genotyping
Saliva samples were used for genotyping and DNA extraction. DNA was extracted using standard procedures as described by Quinque, Kittler, Kayser, Stoneking, and Nasidze (2006)

| Statistical analyses
Differences in the averaged MMR signals among poor and good spelling probands were tested using a sliding window t-test. In total, 375 time windows of 3.2 ms length were tested.
We used the literature driven 300-600 ms time window for association analysis in order to provide replicability among the different studies. Association analyses among genotyped SNPs and late MMR were conducted using a multifactorial linear regression model adjusted for poor spelling (categorized by the lower 10% percentile of the DERET outcome). Thereby, we used an additive genetic model.
We analyzed the effect of ADD on the SNP-MMR associations by comparing the effect sizes with and without adjusting for ADD. We verified that significant findings were not caused by influential outliers by performing regression applying Cook's distance. To analyze the distribution of the observed p-values and to detect deviations from the expected p-value distribution due to the multiple testing issue, a QQ-plot was generated. The 95% confidence envelope is based on the order statistic of expected distribution. Thereby, we avoided bias due to linkage disequilibrium (LD) using only independent SNPs identified by clumping the SNP set by applying LD-based clumping implemented in PLINK (Purcell et al., 2007) using standard settings. This clumping procedure identifies the subset of SNPs not strongly correlated, that is, being independent from each other thereby keeping stronger associated SNPs. The p-values were controlled for multiple testing using the false discovery rate (FDR) (Benjamini & Hochberg, 1995).
An unweighted polygenic risk score (PRS) to estimate the joint discrimination capability was defined by summing up all risk alleles of the clumped, independent SNPs within each individual. This approach requires the information whether a certain allele can be considered a risk allele or not. We obtained this information from independent studies from the literature (Table S2)  information (e.g., genetics) to the model (Müller et al., 2016).
All statistical analyses were performed using the R statistical software, Version 3.0.2 (R Core Team, 2013).

| Power analyses
Our study had 80% power to detect an association of a SNP with the MMR at the level of nominal significance (p-value = .05) for effect sizes of at least 13.1% explained variance in a linear regression model when accounting for spelling performance. At a level of p-value .01 and .001 the power was 80% for effect sizes of 18.2% and 24.7% of explained variance, respectively. For calculating power, we used the framework of the general linear model as implemented in the R-package pwr 1.1-3 (Champely, 2015). The identified effect size corresponds to a medium to large effect size according to Cohen's classification (6%-16%, respectively). This is in accordance with previously reported effect sizes for the MMR phenotype (Roeske et al., 2011), which is until now-to the best of our knowledge-the only reported association of common genetic variants with dyslexia-related MMR.

| Functional in silico analyses
We characterized nominally associated SNPs, proxies of these SNPs (R 2 ≥ 0.3 and Lewontin's D′ ≥ 0.8), and respective genes for in silico evidence for functional effects. These investigations included eQTL analyses, annotations for local regulatory elements, and the investigation of the spatial distribution of genes and their expression products.
Tissue specificity of expressed proteins and RNA of respective genes was characterized using data from "The Human Protein Atlas" (Uhlen et al., 2015). Protein expression data from "The Human Protein Atlas" is derived from annotations of immune-histochemical staining of various cell types across different tissues. RNA levels were determined by RNAseq experiments.

| SNP selection, genotyping quality, and discrimination improvement
In total, 30 SNPs in 14 genes associated with dyslexia-related phenotypes and replicated in at least one study were identified and investigated (   Effect sizes of all nominally associated SNPs with the late component of the MMR are provided in Figure 3.

| Association of reported candidate SNPs with the late component of MMR
In order to investigate the relevance of ADD for our findings, we conducted an additional analysis where we adjusted for ADD status.
Corroboratingly, we found only little change in p-values and effect sizes when additionally adjusting our analysis on ADD status ( Figure   S1).

| Functional characterization of identified candidate SNPs and corresponding genes
The screen for eQTL effects revealed direct cis-effects for rs11100040, rs17819126, and rs3743204 in blood-derived monocytes (Table S3), that is, for these SNPs, carriage of risk alleles is correlated with the expression level of certain genes. These genes were PTPRU (rs11100040), PIGB (rs17819126), and DYX1C1 (rs3743204). All genes were expressed in neuronal tissues and all were abundant as RNA in the cerebral cortex (Table S5).

| DISCUSSION
Several studies show that an altered late component of the MMR, known to be connected to auditory discrimination, is associated with dyslexia (Hommet et al., 2009;Neuhoff et al., 2012;Schulte-Körne et al., 1998) showing stronger association than expected due to multiple testing and two SNPs with an association at an FDR of 5%. Furthermore, one SNP (rs11100040) previously associated with the late component of MMR was also nominally replicated in our study. Thereby, we controlled for environmental influences on the MMR by analyzing the spellingindependent MMR resulting from our adjustment strategy. Using functional data, we characterized these SNPs and corresponding genes.

| Discrimination between good and poor spelling by MMR and genetics
In accordance with the literature (Lovio et al., 2010;Roeske et al., 2011;Schulte-Körne et al., 1998, the late MMR significantly discriminated between people with good and poor spelling further corroborating its value as potential endophenotype for dyslexia. Also in accordance with previous reports, discrimination was found to be strong in a time window near 400 ms (Figure 1, Alonso-Búa et al., 2006;Cheour et al., 2001). The validity of selected SNPs for dyslexia in our cohort was strengthened as we found significant improvement in reclassification good and poor spelling probands when using of a score created from these SNPs in addition to the late component of the MMR. To the best of our knowledge, this is the first report of a genetic risk score for dyslexia improving prediction.

| Identified genetic modifiers of MMR
Next, we analyzed which subset of the selected SNPs might be associated with the late component of the MMR. We observed stronger associations than expected due to multiple testing with the late component of MMR (Figure 2), which supports a potential relationship were able to show that rs11100040 also affects the functional connectivity of the fronto-temporal processing hubs in German native speakers . The reduced functional connectivity between frontal and temporal brain areas might provide the basis for the dyslexia-related modulation of the late component of the MMR (f) rs11100040 because at least two regions are involved in generating the MMR: a frontal source is located in the inferior frontal gyrus and a temporal source located in the superior temporal gyrus (Doeller et al., 2003;Marco-Pallarés, Grau, & Ruffini, 2005). Thus, the affected functional connectivity of the fronto-temporal processing hubs, might lead to the observed functional alteration in the late component of the MMR for different numbers of risk alleles of rs11100040.
Importantly, the study by Skeide et al. (2015) and the present analysis were conducted in overlapping cohorts, which further strengthen the proposed connection between the functional connectivity of the fronto-temporal processing hubs, altered late component of the MMR and rs11100040.
In addition to Roeske et al. (2011), a second study investigated the association of genetic variants with the MMR (Czamara et al., 2011

Our strongest identified associations include effects of SNPs in
genes DYX1C1 and ATP2C2 on late component of the MMR (Table 2).
The two strongest SNPs revealed associations with an FDR of 5%, which means that for both SNPs the probability of being a false discovery due to multiple testing is only 5%. Reflecting the candidate approach, all these genes are of high neurobiological relevance: DYX1C1 (dyslexia susceptibility 1 candidate 1) encodes a protein expressed in cortical neurons and white matter glial cells (Taipale et al., 2003) and DYX1C1 contributes to neuronal migration in the developing rodent brain (Wang et al., 2006). ATP2C2 is an ATPase which transports Mg 2+ and Ca 2+ ions into the Golgi lumen for protein modification and is also involved in Ca 2+ signaling (Feng et al., 2010). Interestingly, it is known that an imbalanced ion transmembrane gradient may impact neurological functions and supports the nexus between neurological impairment and risk for dyslexia.
It should be noted that our study was adequately powered to detect nominal associations for effect sizes similar to those described in Roeske et al. (2011)

| Characterization of the Effect Directions of Associated SNPs on the Late Component of the MMR
We observed a significant, positive MMR for children with poor spelling skills (Figure 1). This is in line with the reported stronger shift of the late component of the MMR toward positive values compared with children not being at risk for dyslexia by Maurer, Bucher, Brem, and Brandeis (2003) in kindergarteners at risk for dyslexia. Consequently, a stronger positivity of the late component of the MMR in relation to the number of risk alleles would be a straight-forward assumption ( Figure S2). However, we only observed the expected effect direction for the strongest associated SNP (  2007). In fact, these "flip-flop" associations are relatively common in dyslexia studies. For example, Taipale et al. (2003) identified an association of two SNPs (rs3743205-DYX1C1 and rs57809907-DYX1C1) with dyslexia, thereby reporting −3A and T as risk alleles. Two subsequent studies replicated these findings for rs57809907, albeit with the opposite effect direction Wigg et al., 2004).
Similarly, the identified effect sizes of SNP rs2143340-KIAA0319 were reported with opposing risk alleles Luciano et al., 2007;Newbury et al., 2011). This might also contribute to the differences in the effect direction we observed: for dyslexia, different cognitive subtypes are described (Heim et al., 2008;

| Functional properties of associated SNPs
Most of the SNPs detected in association studies are intronic and do not change the protein structure. However, several studies have shown an effect of SNPs on gene expression levels. To follow this, we screened published eQTL databases for the significant SNPs or the best proxy of these SNPs. We regard functional evidence as an additional indicator for a true function of the respective SNP in dyslexia-related phenotypes. Indeed, four associated SNPs could be linked to altered expression of nearby genes (cis-eQTL, see Table   S3). Note that with the exception of DYX1C1 all these SNPs affect expression of other nearby genes to which they were not originally assigned to. Therefore, future research should consider these nearby genes as candidates when further investigating molecular pathomechanisms. We only considered eQTLs identified in brain tissue and blood-derived cells because we expect that eQTLs involved in the development of dyslexia are likely present in brain tissue. However, we also included eQTL studies done in blood-based tissue as these studies are traditionally very well powered. Furthermore, it is known that a large proportion of eQTLs are not tissue-specific, especially cisacting eQTLs (Petretto et al., 2006;Van Nas et al., 2010). Note that all genes affected by the associated eQTL were expressed in brain tissue (Uhlen et al., 2015) which strengthens the proposed functional effect of these genes on dyslexia. As the number of eQTL studies increases, we expect even more findings in the future including relevant eQTLs in cerebral tissue.
We further examined the analyzed SNPs for functional in silico evidence to obtain possible molecular mechanisms for functional effects.
For most of the analyzed SNPs, in silico functional data on the genetic level was found (Boyle et al., 2012), for example, footprinting and position weight matrix assays (PWM) provided evidence for rs16973771-ATP2C2 to affect the binding of Nuclear factor 1 (NF-1) transcription factor family members (Matys et al., 2006;Pique-Regi et al., 2011). A complete list is shown in Table S4.

| Limitations
We would like to address some limitations regarding this study. Due to our moderate sample size our results should be followed up in independent replication studies and when interpreting effect sizes the well-known winner's curse (Lohmueller, Pearce, Pike, Lander, & Hirschhorn, 2003) should be taken into consideration. Based on the findings of this study, we cannot make definite conclusions about the role of the analyzed genetic variants. Nevertheless, our results are supported by in silico functional data and a previous genetic MMR study (Roeske et al. (2011) and are available for meta-analysis in future studies.
As another potential limitation, our study sample was a homogenous Caucasian group, therefore, our results might differ in other ethnicities.

CONFLICT OF INTEREST
All contributing authors are members of the Legascreen consortium.