Shared genes between Alzheimer’s disease and ischemic stroke

Summary

Aims: Although converging evidence from experimental and epidemiological studies indicates Alzheimer's disease (AD) and ischemic stroke (IS) are related, the genetic basis underlying their links is less well characterized. Traditional SNP‐based genome‐wide association studies (GWAS) have failed to uncover shared susceptibility variants of AD and IS. Therefore, this study was designed to investigate whether pleiotropic genes existed between AD and IS to account for their phenotypic association, although this was not reported in previous studies.

Methods: Taking advantage of large‐scale GWAS summary statistics of AD (17,008 AD cases and 37,154 controls) and IS (10,307 IS cases and 19,326 controls), we performed gene‐based analysis implemented in VEGAS2 and Fisher's meta‐analysis of the set of overlapped genes of nominal significance in both diseases. Subsequently, gene expression analysis in AD‐ or IS‐associated expression datasets was conducted to explore the transcriptional alterations of pleiotropic genes identified.

Results: 16 AD‐IS pleiotropic genes surpassed the cutoff for Bonferroni‐corrected significance. Notably, MS4A4A and TREM2, two established AD‐susceptibility genes, showed remarkable alterations in the spleens and brains afflicted by IS, respectively. Among the prioritized genes identified by virtue of literature‐based knowledge, most are immune‐relevant genes (EPHA1, MS4A4A, UBE2L3 and TREM2), implicating crucial roles of the immune system in the pathogenesis of AD and IS.

Conclusions: The observation that AD and IS had shared disease‐associated genes offered mechanistic insights into their common pathogenesis, predominantly involving the immune system. More importantly, our findings have important implications for future research directions, which are encouraged to verify the involvement of these candidates in AD and IS and interpret the exact molecular mechanisms of action.


| INTRODUCTION
Alzheimer's disease (AD) is the world's leading cause of dementia. The hallmarks of AD are extracellular amyloid-β peptide (Aβ) accumulation and intracellular neurofibrillary tangles (NFTs), the latter of which are formed by hyperphosphorylated tau protein. Stroke, the second leading cause of mortality and disability worldwide, 1,2 is another major cause of age-related cognitive decline and dementia. 3 Ischemic stroke (IS), the most prevalent form, accounts for ~85% of stroke incidents. 3 Collectively, AD and IS impose a large burden on global public health and clinical practice.
Growing evidence indicates that there are links between AD and IS. First, emerging epidemiologic research shows that AD is associated with a considerably increased risk of IS, 4,5 and vice versa. 6 Second, neuropathological studies show that cerebrovascular lesions frequently coexist with AD pathology. 7 The two pathologies act synergistically to increase the odds of clinical dementia. 8 Indeed, a handful of studies have reported that brain ischemia is a non-negligible factor driving the development of AD through dysregulated expression of AD-associated genes, such as Aβ precursor processing genes and the tau protein gene. 9,10 Lastly, tau protein, a core hallmark of AD, can exacerbate brain injury in experimental IS through tau-mediated iron export and excitotoxicity. 11,12 Taken together, these findings led us to hypothesize that a shared genetic basis might underlie these connections between AD and IS.
Genome-wide association studies (GWAS) have yielded new insights into the genetics of AD 13,14 and IS. 15,16 Shared genetic variants between AD and IS or its subtypes were first sought by Traylor et al., 17 who tested whether established genome-wide significant single nucleotide polymorphisms (SNPs) for AD or IS were significantly associated with the other disease. Yet no such variants were found.
Conventional GWAS methods focus only on SNPs that pass a very stringent significance criterion (P < 5.00E−08) when exploring the genome.
There is an emerging consensus, however, that complex diseases are mostly driven by the joint action of a large number of SNPs with modest effects well below genome-wide significance. 18 Gene-based analysis offers an alternative: by combining the effects of all SNPs within the corresponding genes, it can recover additional associations and thereby expand knowledge about the genetic architecture of complex diseases.
Hence, in the present study, we performed gene-based association tests to identify potential candidate genes shared between AD and IS. Next, gene expression analyses were conducted to evaluate expression alterations of the shared genes in AD and IS brains or peripheral blood versus matched controls. Finally, to further interpret the molecular mechanisms that underpin AD and IS, we systematically examined the functions of the individual genes.

| Samples
The GWAS statistics data for AD and IS analyses were from the International Genomics of Alzheimer's Project (IGAP) 13

| VEGAS2 method
Using the GWAS summary data for AD and IS, we performed a gene-based association test implemented in an updated version of Versatile Gene-based Association Study-2 version 2 (VEGAS2). 19 Among the various methods of gene-based analysis, VEGAS2 is particularly suited to analyzing GWAS summary statistics where individual-level genotypic and phenotypic data are unavailable. Given the individual SNP IDs and their association P-values, VEGAS2 sums the effects of all the SNPs within a gene, corrects for linkage disequilibrium (LD) using the 1000G reference set, and generates a gene-based test statistic through simulations from the multivariate normal distribution. The simulation approach is computationally more efficient than methods that rely on permutations, such as PLINK and minSNP. 20 The default option "symmetric boundaries ±0 kb outside gene and SNPs in LD above r² = 0.8" was chosen to define gene boundaries, meaning that SNPs within a gene, as well as SNPs outside the gene in LD (r² > 0.8) with those inside it, were used to calculate gene-based P-values. This option both took into account the effects of nearby regulatory SNPs and reduced the non-specificity caused by large boundaries such as ±50 kb.
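The simulation idea described above can be illustrated with a toy sketch. This is not the VEGAS2 implementation: the function names (`cholesky`, `gene_based_p`), the number of simulations, and the use of a plain sum of 1-df chi-square statistics are assumptions made for illustration, and the LD matrix is supplied by hand rather than derived from a reference panel.

```python
import math
import random
from statistics import NormalDist


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def gene_based_p(snp_pvalues, ld_matrix, n_sims=20000, seed=0):
    """Empirical gene-level P-value from SNP P-values and an LD correlation matrix.

    Hypothetical helper for illustration only; assumes 0 < p <= 1 for every SNP.
    """
    norm = NormalDist()
    # Convert each two-sided SNP P-value to a 1-df chi-square statistic and sum them.
    observed = sum(norm.inv_cdf(p / 2.0) ** 2 for p in snp_pvalues)
    lower = cholesky(ld_matrix)
    rng = random.Random(seed)
    n = len(snp_pvalues)
    exceed = 0
    for _ in range(n_sims):
        # Correlated null z-scores: z = L e with e ~ N(0, I), so Cov(z) equals the LD matrix.
        e = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = [sum(lower[i][k] * e[k] for k in range(i + 1)) for i in range(n)]
        if sum(v * v for v in z) >= observed:
            exceed += 1
    return exceed / n_sims
```

In this sketch, two SNPs in strong LD contribute less independent evidence than two uncorrelated SNPs with the same P-values, so the simulated gene-level P-value is larger in the correlated case; this is the sense in which the simulation corrects for LD.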
To reduce the possibility of a single disease driving the cross-disease associations and to uncover truly pleiotropic genes shared by AD and IS, we focused on shared genes that were nominally significant in each disease (P_AD < 0.05 and P_IS < 0.05).
Table note: Top-SNP, most significant SNP from each gene. Significantly associated genes whose P-values in both phenotypes were below 0.001 are shown in bold.

| Meta-analysis using Fisher's method
We used Fisher's method to combine the P-values calculated by VEGAS2 for each gene shared by AD and IS. For a given gene, the Fisher formula for meta-analysis is:

$$\chi^2 = -2 \sum_{i=1}^{k} \ln P_i$$

where P_i is the P-value of the gene in the i-th study and k is the total number of studies. χ² follows a chi-square distribution with 2k degrees of freedom. 21 The gene-based meta-analysis was carried out using the R software package.
To avoid false positive signals, we applied the stringent Bonferroni correction accounting for the number of genes and phenotypes tested, that is, the significance threshold was set at 0.05/2N, where N represented the number of shared genes with nominal significance in both AD and IS.
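As a minimal sketch of the two steps above (Fisher's combination followed by the 0.05/2N Bonferroni threshold), the following Python code may help. The original analysis was run in R; the function names and the gene labels in the usage note are hypothetical. The closed-form tail probability used here holds because the chi-square distribution has an even number of degrees of freedom (2k).

```python
import math


def fisher_combined_p(pvalues):
    """Fisher's method: chi2 = -2 * sum(ln P_i), referred to chi-square with 2k df."""
    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # equals chi2 / 2
    # For even df = 2k the survival function is exp(-x/2) * sum_{j<k} (x/2)^j / j!
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half_stat / j
        total += term
    return math.exp(-half_stat) * total


def pleiotropic_genes(p_ad, p_is, nominal=0.05, alpha=0.05):
    """Keep genes nominally significant in both diseases, then test at alpha / (2N)."""
    shared = [g for g in p_ad if g in p_is and p_ad[g] < nominal and p_is[g] < nominal]
    threshold = alpha / (2 * len(shared)) if shared else 0.0
    combined = {g: fisher_combined_p([p_ad[g], p_is[g]]) for g in shared}
    hits = {g: p for g, p in combined.items() if p < threshold}
    return hits, threshold
```

For example, calling `pleiotropic_genes({"GENE_A": 1e-4, "GENE_B": 0.04}, {"GENE_A": 2e-4, "GENE_B": 0.045})` with these made-up values gives N = 2 shared genes and a threshold of 0.05/4; only the hypothetical GENE_A passes. Requiring nominal significance in both diseases before combining is what keeps a gene with a very strong signal in one disease alone from passing on the combined P-value.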

| Gene expression analysis method
To explore the expression alterations of shared genes in each disease, we surveyed the expression datasets of AD and IS from the Gene Expression Omnibus (GEO) repository. Gene expression analysis was mainly restricted to brains and peripheral blood, as they are the most affected by AD- or IS-associated pathology.
Because AD-associated neuropathology shows regional specificity, expression profiles from discrete brain regions are more informative for discerning AD molecular signatures than analyses based on whole-brain expression data. Thus, we examined expression data from separate regions of postmortem human AD and control brains, namely the dorsolateral prefrontal cortex (PFC) and the hippocampus. The former (GSE44770) sampled 549 brains from 376 late-onset AD patients and 173 controls. 22 The latter (GSE48350) comprised 19 AD cases versus 43 matched controls. 23 For IS, samples of brains and other tissues (e.g., spleen) from patients were generally not available. Considering that the core features of IS pathophysiology in rodents and humans are analogous, we included the expression data from peri-infarct brain areas of rats (GSE55260). 24 We also analyzed transcriptional data from peripheral blood collected within 24 hours of stroke onset in 39 IS patients and 24 controls (GSE16561). 25 Additionally, we explored transcriptional profiles from mouse spleens (GSE70841), as the spleen is the major lymphoid organ involved in the inflammatory milieu secondary to brain ischemia. All gene aliases in rat or mouse were converted to the official symbols of the corresponding human genes.
If there were multiple transcripts within the same gene, the one with the smallest P-value was selected. Due to between-study heterogeneity, not all transcripts of AD-IS pleiotropic genes appeared in each dataset. The differential expression was determined by Bonferroni correction accounting for the number of shared genes present in each dataset (P = 0.05/n, n ≤ 16).
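The selection rule above (smallest P-value per gene, then a Bonferroni threshold of 0.05/n over the shared genes actually present in a dataset) can be sketched as follows. The function name, the record layout, and the gene and transcript labels in the usage note are assumptions for illustration, not the authors' code.

```python
def differential_genes(records, shared_genes, alpha=0.05):
    """Return (significant genes, threshold) for one expression dataset.

    records: iterable of (gene_symbol, transcript_id, p_value) tuples (hypothetical layout).
    shared_genes: set of pleiotropic gene symbols to look up in this dataset.
    """
    best = {}
    for gene, _transcript, p in records:
        # Keep only shared genes, and for each gene the transcript with the smallest P-value.
        if gene in shared_genes and (gene not in best or p < best[gene]):
            best[gene] = p
    if not best:
        return {}, None
    # n counts only the shared genes that appear in this dataset.
    threshold = alpha / len(best)
    return {g: p for g, p in best.items() if p < threshold}, threshold
```

With made-up records `[("GENE_A", "t1", 0.2), ("GENE_A", "t2", 0.001), ("GENE_B", "t3", 0.04)]` and shared genes `{"GENE_A", "GENE_B", "GENE_C"}`, two shared genes are present, so the threshold is 0.05/2 and only the hypothetical GENE_A is called differentially expressed.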

| Gene-based testing for risk genes of AD and IS
Firstly, the VEGAS2 method was applied to individual GWAS summary data from AD and IS. For AD, 34 genes exceeded gene-wide significance (P < 2.35E−6 for 21,244 gene tests). Apart from 20 genes at 19q13.31-q13.32 harboring the well-known APOE locus,

| Gene expression analyses of shared genes
To validate the relevance of the 16 AD-IS genes, we evaluated their expression in brains or peripheral blood of AD or IS cases versus controls. Importantly, MS4A4A and TREM2, two established AD-susceptibility genes, showed remarkable alterations in the spleens and brains afflicted by IS, respectively. A comparison of differentially expressed genes in each dataset is presented in Table 2.
AD-associated expression profiles from the PFC region (Supporting Information Table S3A) revealed that 8 genes were differentially expressed, with MS4A4A, UBE2L3, TREM2, and HECTD4 being the most significant, while just 3 genes' expression levels (HECTD4, YDJC, and PABPC1) were altered in the hippocampal regions compared with control brains (Supporting Information Table S3B). In peri-infarcted rat brains (Supporting Information

| DISCUSSION
Unlike conventional SNP-based GWAS, our research used the VEGAS2 gene-based association test to detect pleiotropic genes jointly associated with AD and IS. To avoid joint effects arising from a dominant association with a single disease, we focused on shared genes with nominal significance in both AD and IS (P < 0.05). By this criterion, 16 genes survived the stringent Bonferroni correction. Next, they were screened for differential expression in AD and IS cases versus controls. To search for supportive evidence for their relevance to AD and IS, we paid extra attention to the molecular functions of these genes.

| ZYX and EPHA1 genes at 7q34-7q35
Zyxin (ZYX) encodes a zinc-binding adaptor protein that translocates from focal adhesions to the nucleus to conduct signal transduction and modulate gene expression. Recently, zyxin has been identified as a novel target of Aβ metabolism in AD. 26 In addition, zyxin is a novel interacting partner of SIRT1, 27 which is protective against aging-associated pathologies such as AD 28 and IS. 29 Here, ZYX was jointly associated with AD and IS (P_combined = 2.63E−07) with P < 0.001 in each disease. Notably, ZYX showed the most significant association (P = 2.70E−05) in the gene-based analysis of IS.
EPHA1 is an established risk locus of AD, and our gene-based analysis confirmed its gene-wide association with AD cell-to-cell communication in the central nervous system (CNS). 30,31 Moreover, EPHA1 can modulate leukocyte extravasation, chemotaxis, and inflammatory cell migration. 32-34 Indeed, ample evidence has implicated EPHA1, as well as MS4A4A and TREM2 discussed below, in the immune module of AD. 22,35,36 As previously described, 37 we observed no aberrant EPHA1 expression in the PFC region of AD patients, nor in the hippocampus.
Following IS, increased incidence of infections occurs, mainly in the form of pneumonia and urinary tract infections. 43,44 The underlying mechanism is insufficient antigen-presentation of monocytes/macrophages and DCs in peripheral immune organs, resulting from downregulation of MHC class II and co-stimulatory molecules and remarkable reduction of proinflammatory cytokines. 45 Herein, we speculated that the pronounced upregulation of MS4A4A in the spleen following IS might reflect a phenotypic switch of monocytes from the proinflammatory M1 phenotype to the anti-inflammatory M2 phenotype.

| UBE2L3, YDJC and SLC2A11 genes at 22q11
Ubiquitin-conjugating enzyme E2 L3 (UBE2L3) and YDJC are located at 22q11.21. The genetic relationship between the 22q11.21 region and multiple autoimmune diseases has been extensively elucidated. 46-48 In addition, SNPs near YDJC have been suggested to be a pleiotropic locus between AD and Crohn disease. 49 More recently, UBE2L3 has been identified as a hub gene in the gene regulatory networks of AD. 50 UBE2L3 encodes an E2 ubiquitin-conjugating enzyme. Through its action on ubiquitination in NF-κB signaling, UBE2L3 promotes NF-κB activation, which mediates its link with numerous autoimmune diseases. 51-53 Moreover, UBE2L3 modulates pro-IL-1β processing and mature IL-1β secretion, 54 the deregulation of which markedly intensifies neuronal damage in both AD and IS. 55,56 In addition, UBE2L3 directly interacts with the parkin protein, a ubiquitin-protein ligase that is protective against not only neurodegenerative diseases 57-59 but also cerebral ischemia-reperfusion injury. 60 Nonetheless, there is no conclusive evidence to date demonstrating a causative link between UBE2L3 and AD or IS.

| HECTD4 and OAS2 genes at 12q24
Mounting GWAS studies have demonstrated the pleiotropic effects of 12q24 locus on type 1 diabetes, 68 83 and a microtubule-binding protein essential for chromosome segregation in mitosis. 84 In addition to its biological significance in various cancers, 85 PINX1 gene is associated with subclinical cardiovascular events like carotid intima media thickness, 86 blood lipids, 87,88 and involved in AD as a potential interactor of Aβ. 89

| TREM2 gene at 6p21.1
Triggering receptor expressed on myeloid cells-2 (TREM2) is highly expressed on microglia as an innate immune receptor involved in phagocytosis, clearance of damaged neurons, and inhibition of the microglial proinflammatory response. 90 Rare variants in TREM2 confer a substantial increase in AD risk, 91-93 a finding that has been supported experimentally. 94,95 TREM2 has been reported to be upregulated and to participate in ischemic brain damage by modulating microglial phenotypes, although the findings conflict. 96-100 Here, TREM2 was shared by AD and IS with individual P = 0.0015 and P = 0.0095, respectively. Since rare variants (MAF < 0.01) were excluded from the GWAS panels of both AD and IS, we assumed that the true joint association signal of TREM2 with AD and IS might be stronger than we observed.

| Others
Although we found that the following genes (i.e., EFTUD1/FAM154B at 15q25.2, RRN3P1, SLC16A5, and ANKHD1-EIF4EBP3) were common association signals for AD and IS in the bioinformatic analysis, little is known about their biological roles because strong evidence relating them to AD or IS is lacking.
Based on the concise discussion about their biological significance, some of the pleiotropic genes underlying AD and IS were priori-
Although gene-based tests increase the power to detect disease-associated genes harboring multiple associated variants, they do have limitations. Firstly, the VEGAS test is prone to underestimating effects of low-frequency SNPs correlated with few SNPs in LD blocks, 102 and may be unable to distinguish the truly causal genes from several adjacent ones colocalizing in one significant locus.
Secondly, genes revealed by positional proximity to significant variants are not necessarily the causal ones for disease pathogenesis. In complex diseases, significant variants are mostly located in intronic/intergenic areas, presumably regulating gene expression, including that of distant genes. Next, we leveraged GEO datasets to estimate the expression alteration of candidate genes in disease-related tissues, the reliability of which depends largely on the raw data, for instance, the number of tissue samples in the original studies. More powerful approaches and sophisticated functional interpretation analyses are warranted to prioritize causal genes.
Moreover, IS is pathologically and genetically heterogeneous 103 and has different etiological subtypes (i.e., large vessel disease, cardioembolic stroke, small vessel disease, undetermined, and other).
Here, we surveyed only the genetic link between AD and overall IS.

| CONCLUSIONS
We presented a gene-based strategy that identifies shared candidate genes between AD and IS, followed by gene expression analysis, providing an example of how genetic studies can add to the biological understanding of cross-trait etiology. Literature mining supported the potential association of some of the novel candidate genes with both AD and IS. Our findings yielded mechanistic insights into the common pathogenesis underlying AD and IS, predominantly involving the immune system, and might suggest common intervention targets. More importantly, our findings should encourage more studies to verify the involvement of these candidates in AD and IS and to interpret the exact molecular mechanisms of action.

ACKNOWLEDGMENTS
This work was supported by the National Natural Science Foundation of China (81300945 to G-Y.L.; and 81401361 and 81870954 to X-F. M.). We greatly appreciate the GWAS summary statistics kindly provided by the IGAP and the METASTROKE collaboration.

CONFLICT OF INTEREST
The authors declare no conflict of interest.