Edinburgh Research Explorer Genome-Wide Meta-analysis identifies three novel loci associated with stroke

UK Biobank; AS, Any Stroke; AIS, Any Ischemic Stroke; GWAS, Genome-wide association study; MR, Mendelian Randomization; eQTL, expression quantitative trait locus

ABSTRACT
We conducted a European-only and trans-ancestral genome-wide association meta-analysis in 72,147 stroke patients and 823,869 controls using data from UK Biobank (UKB) and the MEGASTROKE consortium. We identified an exonic polymorphism in NOS3 (rs1799983, p.Glu298Asp; p=2.2E-8; OR=1.05 [1.04-1.07]) and variants in an intron of COL4A1 (rs9521634; p=3.8E-8; OR=1.04 [1.03-1.06]) and near DYRK1A (rs720470; p=6.1E-9; OR=1.05 [1.03-1.07]) at genome-wide significance for stroke. Effect sizes of known stroke loci were highly correlated between UKB and MEGASTROKE. Using Mendelian Randomization, we further show that genetic variation in the nitric oxide synthase (NOS) – nitric oxide (NO) pathway in part affects stroke risk via variation in blood pressure.


INTRODUCTION
Stroke is the leading cause of disability and the second most common cause of death worldwide. The identification of common genetic variants for vascular conditions has provided mechanistic insights, improved options for risk prediction, and facilitated the development of novel therapeutics. 1 The MEGASTROKE consortium recently reported on the largest genome-wide association study (GWAS) to date in >520,000 subjects from multiple ethnicities. 2 Aside from finding novel risk loci for any stroke (AS), any ischemic stroke (AIS), and ischemic stroke subtypes, this study demonstrated marked genetic overlap with related vascular traits. Still, much of the heritability of stroke remains unexplained, and the biological mechanisms and pathways underlying shared genetic influences with related traits are largely elusive.
The UK Biobank (UKB) was established to improve understanding of common diseases including stroke. Participants were recruited from the general adult population and, in addition to having provided self-reported medical history at recruitment, are followed prospectively, chiefly through linkage to their National Health Service records. 3 (http://www.ukbiobank.ac.uk) UKB recently released genotypes on over 500,000 participants thus adding to available GWAS data.
The current study aimed to identify additional susceptibility loci for stroke and obtain further insights into relevant pathways by meta-analyzing GWAS summary statistics from UKB with data from the MEGASTROKE European-only stratum, followed by a trans-ancestral analysis.

Accepted Article
This article is protected by copyright. All rights reserved.

SUBJECTS AND METHODS
The protocol for this study received prior approval by all IRBs, and informed consent was obtained from each subject.

UK Biobank
To define stroke cases we used UKB fields 42007 and 42009, the algorithmically defined stroke outcome, including both prevalent strokes (stroke prior to recruitment) and incident strokes (first stroke diagnosed during follow-up) (http://biobank.ctsu.ox.ac.uk/crystal/docs/alg_outcome_stroke.pdf). Stroke events that were self-reported only, without corroborating evidence from medical records, were excluded due to substantial uncertainty about the accuracy of stroke self-report. 4 Coded hospital admissions and death record data (ICD 9 and 10 coding systems) were included based on previous work showing good accuracy of these data sources for identifying stroke cases. 5 Participants without a stroke diagnosis were included as controls. Related participants and those of non-White-British descent were excluded, as were SNPs with MAF <0.01. The final UKB dataset consisted of 4,985 AS cases, 3,628 AIS cases and 369,419 controls. The genetics dataset of UKB is described at https://www.biorxiv.org/content/early/2017/07/20/166298. We fitted a logistic regression model with stroke as the outcome and each SNP as an independent variable, including age, sex, principal components 1-10, and genotyping chip as covariates.
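The per-SNP model described above can be illustrated with a minimal sketch. This is not the software used in the study: it is a plain-Python Newton-Raphson fit of a logistic regression with stroke status as the outcome, one SNP as the predictor, and a single covariate (sex) standing in for the full covariate set. The helper names (`solve_linear`, `fit_logistic`) and the toy counts are hypothetical.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_logistic(X, y, n_iter=25):
    """Newton-Raphson logistic fit; each row of X starts with 1 (intercept).

    Returns coefficients and their model-based standard errors.
    """
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for xi, yi in zip(X, y):
            eta = sum(b * v for b, v in zip(beta, xi))
            mu = 1.0 / (1.0 + math.exp(-eta))  # predicted stroke probability
            w = mu * (1.0 - mu)
            for j in range(p):
                grad[j] += (yi - mu) * xi[j]
                for k in range(p):
                    hess[j][k] += w * xi[j] * xi[k]
        step = solve_linear(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-10:
            break
    # Standard errors from the inverse information matrix (last Hessian).
    se = []
    for j in range(p):
        unit = [1.0 if i == j else 0.0 for i in range(p)]
        se.append(math.sqrt(solve_linear(hess, unit)[j]))
    return beta, se

# Hypothetical toy data: (carrier, sex, stroke) -> number of participants.
counts = {
    (0, 0, 1): 5, (0, 1, 1): 5, (0, 0, 0): 20, (0, 1, 0): 20,
    (1, 0, 1): 10, (1, 1, 1): 10, (1, 0, 0): 15, (1, 1, 0): 15,
}
X, y = [], []
for (g, sex, case), n in counts.items():
    X += [[1.0, float(g), float(sex)]] * n  # intercept, SNP, covariate
    y += [case] * n

beta, se = fit_logistic(X, y)
odds_ratio = math.exp(beta[1])
ci = (math.exp(beta[1] - 1.96 * se[1]), math.exp(beta[1] + 1.96 * se[1]))
print(f"SNP OR = {odds_ratio:.3f} [{ci[0]:.3f}-{ci[1]:.3f}]")
```

In a real analysis the design matrix would carry allele dosage plus age, sex, principal components 1-10, and genotyping chip, and a dedicated GWAS tool would be used rather than a hand-written solver.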

MEGASTROKE
We used the full summary statistics from MEGASTROKE after filtering as recently described. Analyses included 67,162 AS cases, 60,341 AIS cases, 6,688 large artery stroke (LAS) cases, 9,006 cardioembolic stroke (CES) cases and 11,710 small vessel stroke (SVS) cases. 2

Genome-wide association meta-analyses
We first performed a fixed-effects meta-analysis for AS and AIS using summary statistics from UKB and the European stratum from MEGASTROKE. The newly formed European stratum was then meta-analyzed with the East Asian, South Asian, African American, other Asian, and Latin American strata from MEGASTROKE using a fixed-effects model. [...] applied for all analyses. We report on results from both the new European-only meta-analysis and the trans-ancestral meta-analysis. Genome-wide significance was set at p<5E-
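As a hedged illustration of the fixed-effects approach, the sketch below combines per-study log odds ratios by inverse-variance weighting. The function name `fixed_effects_meta` and the input values are hypothetical and are not taken from UKB or MEGASTROKE; the software actually used for the meta-analysis is not named in this section.

```python
import math

def fixed_effects_meta(betas, ses):
    """Inverse-variance weighted fixed-effects meta-analysis.

    betas: per-study log odds ratios; ses: their standard errors.
    Returns pooled beta, pooled SE, z-score, and two-sided p-value.
    """
    weights = [1.0 / se ** 2 for se in ses]
    beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return beta, se, z, p

# Hypothetical per-study estimates for one SNP (two strata).
beta, se, z, p = fixed_effects_meta([0.05, 0.03], [0.01, 0.02])
print(f"pooled OR = {math.exp(beta):.3f}, z = {z:.2f}, p = {p:.2e}")
```

The more precise study (smaller standard error) dominates the pooled estimate, which is why adding a large cohort can move a borderline locus across a significance threshold.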


Mendelian Randomization analysis
To evaluate the causal association between recently published variants in the NOS3 pathway 6 and stroke risk, we performed a two-sample Mendelian Randomization (MR) analysis using blood pressure (BP) data from UKB as the exposure variable (systolic BP [SBP] and diastolic BP [DBP]: 318,417 subjects analyzed using a BOLT-LMM model) and stroke data from the combined European-only analysis as the outcome. We used the R package "MendelianRandomization" and report results obtained from the weighted median, inverse variance weighted (IVW), MR-Egger, and mode-based estimate (MBE) methods.
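To make the IVW estimator concrete, here is a minimal sketch assuming a fixed-effects IVW with first-order weights. It is not the implementation in the R package named above, and the function name `ivw_mr` and all per-variant values are hypothetical. Each variant contributes a Wald ratio (SNP-outcome effect divided by SNP-exposure effect), and the ratios are averaged with weights proportional to the squared exposure effect over the outcome variance.

```python
import math

def ivw_mr(beta_x, beta_y, se_y):
    """Fixed-effects inverse variance weighted MR estimate.

    beta_x: SNP effects on the exposure (e.g. blood pressure)
    beta_y: SNP effects on the outcome (log odds of stroke)
    se_y:   standard errors of beta_y
    Returns the causal estimate, its standard error, and the Wald ratios.
    """
    ratios = [by / bx for bx, by in zip(beta_x, beta_y)]
    weights = [bx ** 2 / sy ** 2 for bx, sy in zip(beta_x, se_y)]
    estimate = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return estimate, se, ratios

# Hypothetical summary statistics for three instruments.
estimate, se, ratios = ivw_mr(
    beta_x=[0.10, 0.20, 0.15],
    beta_y=[0.02, 0.05, 0.03],
    se_y=[0.01, 0.01, 0.01],
)
print(f"log OR per unit exposure = {estimate:.4f} (SE {se:.4f})")
```

The weighted median, MR-Egger, and mode-based estimators relax the assumption that every instrument is valid, which is why agreement across methods is usually reported alongside the IVW result.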

Expression quantitative trait loci (eQTLs)
For lead variants at the novel loci and SNPs in LD (r² >0.8) we queried the publicly available eQTL databases GTEx7 7 and GRASP2. 8

RESULTS
We first compared effect sizes and directions of previously published lead SNPs from MEGASTROKE between UKB and the European MEGASTROKE stratum. Two low-frequency variants at RGS7 and TM4SF4 were not available in UKB, leaving 30 loci for analysis. We observed significant positive correlations in the effect sizes for both AS [...] for association with AS in the European-only or the full trans-ancestral analysis (Table 1 and Figure 1A). Genomic inflation was estimated to be 1.05 for both AS and AIS (European-only analysis). For the trans-ancestral analyses, we found lambda values of 1.04 and 1.07, respectively. The explained phenotypic variance of the three novel loci is estimated to be 0.06% for AS in Europeans, assuming a disease prevalence of 0.055.
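The genomic inflation factor (lambda) quoted above is conventionally computed as the median of the observed association chi-square statistics divided by the median of a chi-square distribution with one degree of freedom (about 0.455). The sketch below assumes that convention; the section does not state exactly how lambda was derived, and the simulated z-scores are purely illustrative.

```python
import math
import random
from statistics import NormalDist, median

# Median of a 1-df chi-square distribution, obtained as the squared
# 75th percentile of the standard normal (approximately 0.455).
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2

def genomic_inflation(z_scores):
    """Lambda = median observed chi-square / expected median under the null."""
    return median(z * z for z in z_scores) / CHI2_1DF_MEDIAN

# Simulated null z-scores, plus a copy with variance inflated by 5%.
random.seed(1)
null_z = [random.gauss(0.0, 1.0) for _ in range(100_000)]
inflated_z = [z * math.sqrt(1.05) for z in null_z]

lambda_null = genomic_inflation(null_z)
lambda_inflated = genomic_inflation(inflated_z)
print(f"lambda (null) = {lambda_null:.3f}")
print(f"lambda (inflated) = {lambda_inflated:.3f}")
```

Values close to 1 indicate little systematic inflation of test statistics, for example from residual population stratification.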
The first finding is an exonic SNP in the Nitric Oxide Synthase 3 (NOS3) gene (rs1799983, p.Glu298Asp) that was associated with AS in the European-only analysis (p=2.2E-8; OR=1.05 [1.04-1.07]; Table 1 and Figure 1B). Functional variants in NOS3 and other genes of the NOS-NO pathway have been associated with hypertension. 6 Hence, we explored the possibility that the effects of genetically determined dysregulation of the NOS-NO pathway on stroke risk are mediated via BP using a two-sample MR analysis. We selected instruments in the genes encoding proteins in the NOS-NO pathway that exert known functional effects on NO signaling, as summarized in a recent review. 6 The genetic instruments included variants in NOS3 (rs1799983, rs2070744, rs3918226), the guanylate cyclase 1, soluble, alpha 3 gene GUCY1A3 (rs7692387), and the L-arginine transporter gene SLC7A1 (rs41318021); 6 rs7539120 in the nitric oxide synthase 1 adaptor protein gene NOS1AP was not available for analysis in UKB. Considering BP in UKB as the exposure and AS as the outcome, we found significant associations in the weighted median approach (OR=1.11, p<0.0001 for SBP; OR=1.14, p<0.0001 for DBP) and IVW (OR=1.15, p=0.005 for SBP; OR=1.19, p=0.002 for DBP), suggesting a causal association of the instruments with stroke via BP (Figure 2) with an estimated contribution 9 of 4%. We found no evidence of heterogeneity in the IVW analyses, and the intercepts in the MR-Egger analyses were not significant (all p>0.10), suggesting absence of significant pleiotropy. After excluding two potentially pleiotropic SNPs identified through PhenoScanner as being associated with CAD (rs3918226, rs7692387), the results remained significant.
We further found variants in COL4A1 to be associated with AS (rs9521634, trans-ancestral analysis: p=3.8E-8; OR=1.04 [1.03-1.06]; Table 1 and Figure 1C). rs9521634 is situated in intron 28 but does not act as a known eQTL for any gene in available tissues, as is also true for SNPs in linkage disequilibrium (LD, r² >0.8) with rs9521634.

DISCUSSION
We found rs1799983 encoding p.Glu298Asp in endothelial NOS (eNOS) to reach genome-wide significance for association with AS. Nitric oxide signaling is a key regulator of vascular tone, BP, and platelet aggregation. p.Glu298Asp lowers eNOS activity by disrupting eNOS caveolar localization. 11 Another variant, rs3918226, which is situated in the NOS3 promoter and was found to lower promoter activity, 12 has been shown to associate with both hypertension 12 and coronary artery disease. 13 rs3918226 is in low LD (r² =0.17) with our lead SNP and did not reach genome-wide significance for association with AS or AIS.
Still, our MR analysis suggests that the aggregate effects of common variants in the NOS-NO pathway on stroke risk are in part mediated through blood pressure. Similar results have recently been shown for coronary artery disease using a genetic risk score comprising two common variants in NOS3 and GUCY1A3. 14 Somewhat surprisingly, we found the strongest association with CES. While this might relate to limited power in stroke subtypes, potential mechanisms underlying this association might include prothrombotic effects as well as mechanisms that are as yet unknown.
Our findings further highlight a role of COL4A1 in stroke. Collagen type IV α1 is a major constituent of the vascular basement membrane and forms heterotrimers with collagen IV α2. Rare variants in COL4A1 cause monogenic small vessel disease, with hemorrhagic and ischemic stroke being part of the spectrum. 15,16 These mutations are associated with structural protein changes or altered expression levels of COL4A1, which interfere with the assembly, secretion, or biological function of COL4A1. While we found no eQTLs for variants in LD with our lead SNPs in GTEx7 and GRASP2, there might be eQTLs in relevant tissues or cell types not captured by these sources. In keeping with the role of COL4A1 in small vessel disease, we found rs9521634 to show the strongest association signal in SVS. Interestingly, common variants in the adjacent COL4A2 gene associate with SVS, 2,17 intracerebral hemorrhage, 17 and white matter hyperintensities. 18 Collectively, these findings define COL4A1 and COL4A2 as key molecules in the biology of stroke and small vessel disease.
Our association results in combination with the eQTL data further point to a potential role of DYRK1A in stroke. DYRK1A encodes dual-specificity tyrosine-phosphorylation-regulated kinase 1A, which has recently been shown to regulate angiogenic responses in vascular endothelial cells. 19 Dyrk1a heterozygous mice exhibit defects in retinal vascularization, and DYRK1A was found to positively regulate VEGF-dependent transcriptional responses in endothelial cells. 19 We found no association signal with specific ischemic stroke subtypes, possibly because of limited power. DYRK1A maps to the Down syndrome (DS) critical genetic region and is thought to contribute to the manifestations of DS. Recent work has drawn attention to an increased risk of stroke in DS. 20 While this might relate to other factors, our findings in conjunction with the above experimental data suggest a link between DYRK1A and stroke.

Table 1 Results from the fixed-effects (trans-ancestral and European-only) GWAS meta-analyses. For each locus, the variant with the lowest p-value in the fixed-effects trans-ancestral or European-only meta-analysis, respectively, is shown. Chr, chromosome; TRANS, trans-ancestral fixed-effects meta-analysis; EUR, European-only fixed-effects meta-analysis; OR, odds ratio; CI, confidence interval.

[...] x-axis and the odds ratio (OR) for any stroke from the European-only analysis is displayed on the y-axis together with their respective standard errors. The solid and dashed lines display the inverse variance weighted (IVW) effect estimate and 95% confidence interval (CI), respectively. Estimates for systolic and diastolic blood pressure were derived from UKB. Estimates for stroke were derived from inverse variance fixed-effects meta-analyses of MEGASTROKE and UKB.
