Genetic comorbidity between major depression and cardiometabolic traits, stratified by age at onset of major depression

It is imperative to understand the specific and shared etiologies of major depression and cardio-metabolic disease, as both traits are frequently comorbid and each represents a major burden to society. This study examined whether there is a genetic association between major depression and cardio-metabolic traits and whether this association differs by age at onset of major depression. Polygenic risk score analysis and linkage disequilibrium score regression were performed to examine whether differences in shared genetic etiology exist between depression case control status (N cases = 40,940, N controls = 67,532), earlier onset depression (N = 15,844), and later onset depression (N = 15,800) with body mass index, coronary artery disease, stroke, and type 2 diabetes in 11 data sets from the Psychiatric Genomics Consortium, Generation Scotland, and UK Biobank. All cardio-metabolic polygenic risk scores were associated with depression status. Significant genetic correlations were found between depression and body mass index, coronary artery disease, and type 2 diabetes. Higher polygenic risk for body mass index, coronary artery disease, and type 2 diabetes was associated with both early and later onset depression, while higher polygenic risk for stroke was associated with later onset depression only. Significant genetic correlations were found between body mass index and later onset depression, and between coronary artery disease and both early and late onset depression. The phenotypic associations between major depression and cardio-metabolic traits may partly reflect their overlapping genetic etiology, irrespective of the age at which depression first presents.


| INTRODUCTION
Major depressive disorder (MDD) and cardio-metabolic traits are both major causes of morbidity and mortality in high-income countries. Epidemiological studies have shown a well-established association between them (B. W. J. H. Penninx, 2017): MDD increases the risk of cardio-metabolic disease onset and mortality, but cardio-metabolic disease itself can also increase risk of developing MDD. Specifically, a meta-analysis of 124,509 individuals across 21 studies showed that depression is associated with an 80% increased risk for developing coronary artery disease (Nicholson, Kuper, & Hemingway, 2006). Mezuk et al (Mezuk, Eaton, Albrecht, & Golden, 2008) showed that MDD predicted a 60% increased risk of type 2 diabetes, while type 2 diabetes predicted a 15% increased risk in MDD. MDD is associated with an increased risk of developing stroke (HR 1.45) (Pan, Sun, Okereke, Rexrode, & Hu, 2011), but meta-analyses have also shown that 30% of stroke survivors suffered from MDD (Ayerbe, Ayis, Wolfe, & Rudd, 2013;Hackett & Pickles, 2014). Risk factors for cardio-metabolic disease, such as obesity, are also linked to depression. Milaneschi et al. (Milaneschi, Simmons, van Rossum, & Penninx, 2018) showed a bidirectional association between depression and obesity, where obesity increases risk for depression and depression increases risk for subsequent obesity. A more detailed review of the comorbidity between depression and cardio-metabolic traits is given in B. W. Penninx, Milaneschi, Lamers, and Vogelzangs (2013).
Multiple mechanisms have been proposed to explain the association between MDD and cardio-metabolic diseases, for example biological dysregulation, or an unhealthy lifestyle (B. W. J. H. Penninx, 2017).
However, it remains unclear to what extent these mechanisms influence the association, as most studies examining the mechanisms were based on epidemiological observational study designs. The association between MDD and cardio-metabolic disease could also be due to shared genetic factors. Twin studies have shown that genetic factors contribute 40% of the variation in liability to both MDD (Sullivan, Neale, & Kendler, 2000) and coronary artery disease (Polderman et al., 2015; Schunkert et al., 2011), 72% to type 2 diabetes (Willemsen et al., 2015), and 40-70% to BMI (Locke et al., 2015). No twin heritability estimates are available for stroke, as such studies have very limited numbers of twins. The most recent published GWAS for major depression, including 246,363 cases and 561,190 controls, identified 102 loci that were significantly associated with major depression, highlighting the highly polygenic nature of major depression (Howard et al., 2019).
The heritability for major depression based on single nucleotide polymorphisms (h²_SNP) was estimated to be 8.9% on the liability scale, based on a lifetime risk of 0.15. Similarly, recent efforts to identify common variants associated with cardio-metabolic disease have shown these traits to be highly polygenic (Malik et al., 2018; Nikpay et al., 2015; Scott et al., 2017; Yengo et al., 2018).
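To make the phrase "on the liability scale" concrete, the sketch below applies the widely used liability-threshold conversion from an observed-scale heritability in a case-control sample to the liability scale. It is an illustration only: the text does not state which transformation produced the 8.9% estimate, and the observed-scale heritability and sample case proportion used here are hypothetical placeholders. Only the lifetime risk of 0.15 comes from the text.

```python
from statistics import NormalDist


def observed_to_liability(h2_obs, pop_prev, sample_prev):
    """Rescale observed-scale heritability to the liability scale.

    h2_obs:      heritability estimated on the observed (0/1) scale
    pop_prev:    population lifetime prevalence K
    sample_prev: proportion of cases in the analysed sample P
    """
    std_normal = NormalDist()
    # Liability threshold above which an individual is affected.
    threshold = std_normal.inv_cdf(1.0 - pop_prev)
    # Height of the standard normal density at that threshold.
    z = std_normal.pdf(threshold)
    k, p = pop_prev, sample_prev
    return h2_obs * (k * (1.0 - k)) ** 2 / (z ** 2 * p * (1.0 - p))


# Hypothetical inputs; only pop_prev=0.15 is taken from the text.
h2_liab = observed_to_liability(h2_obs=0.06, pop_prev=0.15, sample_prev=0.30)
print(f"liability-scale h2: {h2_liab:.3f}")
```

The conversion depends on the assumed lifetime risk, which is why the estimate is reported together with the value 0.15.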
Several studies have shown genetic overlap between MDD and cardio-metabolic traits, in particular with coronary artery disease (Wray et al., 2018). Findings for other cardio-metabolic traits have been inconsistent. Previous studies have identified genetic overlap between MDD and BMI using polygenic risk scores (PRS) (Milaneschi et al., 2017), but not based on genetic correlations (Wong et al., 2018). However, more recent GWAS studies now have the power to detect a genetic correlation between MDD and BMI (Wray et al., 2018). Studies using twin data also showed that the phenotypic association between MDD and type 2 diabetes was partly due to genetic effects (Kan et al., 2016). This finding has been replicated using a polygenic risk score approach (Wong et al., 2019), but not using genetic correlations (Clarke et al., 2016;Wong et al., 2019). The Brainstorm Consortium did not find a significant genetic correlation between MDD and stroke (Anttila et al., 2018), while Wassentheil-Smoller et al (Wassertheil-Smoller et al., 2018) showed that higher polygenic risk for MDD was associated with increased risk for stroke, in particular small vessel disease.
The inconsistency in results is likely due to differences in methodological approaches or in summary statistics, but could also be due to the heterogeneity of MDD. MDD onset can occur at any stage of life, but the factors associated with MDD are often age specific or age restricted (Power et al., 2017). Increased genetic risk for major depression is associated with earlier age at onset (AAO) compared with later AAO (Wray et al., 2018). Earlier AAO MDD has a higher heritability and is associated with increased risk for MDD in relatives. On the other hand, vascular disease and its risk factors are linked to a later AAO for MDD (G. S. Alexopoulos et al., 1997; Naismith, Norrie, Mowszowski, & Hickie, 2012; Taylor, Aizenstein, & Alexopoulos, 2013). A large study of Swedish twins showed that a later AAO for MDD in one twin was associated with a higher risk for vascular disease in the other twin (Kendler, Fiske, Gardner, & Gatz, 2009). To date, no molecular genetic studies have examined the association between late onset MDD and cardiometabolic traits. In the current study, we have more power to replicate and further investigate this association by leveraging both summary statistics and common genetic variant information.
The main aim of the present study is to examine the genetic association between MDD and cardio-metabolic traits using PRS and genetic correlations. Secondly, we will examine the association stratified by AAO for MDD to test whether a higher genetic predisposition for cardio-metabolic traits is associated with a later AAO for MDD.

| Samples
This study was performed using data from the Psychiatric Genomics Consortium (PGC) MDD working group (PGC-MDD), Generation Scotland: The Scottish Family Health Study (GS:SFHS), and UK Biobank.

| PGC
Full details of the studies that form the PGC-MDD have previously been published (Wray et al., 2018). In summary, a subset of 11 studies from the full PGC-MDD analysis was included in the current study, based on the availability of AAO for MDD. All cases were required to have a lifetime diagnosis of MDD based on international consensus criteria (DSM-IV, ICD-9, or ICD-10) (American Psychiatric Association, 1994; World Health Organization, 1978, 1992). This was ascertained using structured diagnostic instruments from direct interview by trained interviewers or clinician administered checklists. In most studies (10/11), controls were randomly selected from the general population and were screened for absence of lifetime MDD. This led to a total of 9,518 cases and 11,557 controls with genotype data and AAO information.

| GS:SFHS
GS:SFHS is a family-based study consisting of 23,690 participants recruited from the population via general medical practices across Scotland. Sample characteristics and recruitment protocols have been described elsewhere (B. H. Smith et al., 2006). In summary, MDD diagnosis was based on the structured clinical interview for DSM-IV disorders (SCID) (First, Spitzer, Gibbon, & Williams, 1997). Participants who answered positively to two mental health screening questions were invited to complete the full SCID to ascertain MDD diagnosis. Cases were further refined through NHS linkage. Controls were defined as participants who answered negatively to the two screening questions or participants who did complete the SCID but did not meet criteria for MDD. This resulted in 1,947 cases and 4,858 controls, based on unrelated individuals.

| UK Biobank
UK Biobank is a large resource for identifying determinants of diseases in middle aged and older healthy individuals (www.ukbiobank.ac.uk) (Sudlow et al., 2015). A total of 502,655 community-dwelling participants aged between 37 and 73 years were recruited between 2006 and 2010 in the United Kingdom, and underwent extensive testing including mental health assessments. MDD status in UK Biobank was derived from the online mental health questionnaire as previously described (Coleman et al., 2020; Davis et al., 2018). Briefly, MDD cases were defined as individuals meeting lifetime criteria for MDD based on questions from the Composite International Diagnostic Interview. Individuals with a self-reported previous diagnosis of schizophrenia (or other psychosis) or bipolar disorder were excluded as MDD cases. Controls were defined as individuals who did not have any self-reported diagnosis of mental illness, did not take any anti-depressant medications, had not previously been hospitalized with a mood disorder, and did not meet previously defined criteria for a mood disorder (D. J. . For the current study, this led to 29,475 cases and 51,243 controls.

| AAO
AAO was defined as follows, based on previous work by Power et al. (2017) developed to account for the substantial by-study heterogeneity in the measure. Heterogeneity in AAO within the PGC MDD cohorts has been extensively investigated. Responses depend on the specific setting in which AAO is asked, and may reflect age at first symptoms, first visit to a general practitioner, or first diagnosis (Power et al., 2017). Using fixed cut offs for AAO (e.g., onset under 30) does not capture the variance in this measure, and we therefore followed Power et al. (2017) in using the within-study distribution to define early and later onset depression. This approach assumes that all cases were recruited from the same age at onset distribution, with differences due to study-specific parameters. Cases reporting an AAO older than the recorded age at interview were excluded from each study. Within each study, cases were ordered by AAO and divided into equal octiles (O1-O8). The first three octiles (O1-O3) were combined into the early AAO group, and the last three octiles (O6-O8) into the late AAO group. Splitting into equal octiles can result in individuals with the same AAO being arbitrarily placed in different octiles. To address this, cases in O4 with the same AAO as the maximum AAO in O3 were assigned to the early AAO group. Similarly, cases in O5 with the same AAO as the minimum AAO in O6 were assigned to the late AAO group. This led to a total of 15,844 MDD cases with early AAO, 15,800 MDD cases with late AAO, and 67,532 controls.
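The within-study octile rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the function name, the tuple layout, the rank-based octile assignment, and the toy data are all hypothetical, and cases in the middle octiles that are not reassigned are simply labelled "middle".

```python
def stratify_by_aao(cases):
    """Assign early/late AAO groups within ONE study.

    cases: list of (case_id, aao, age_at_interview) tuples.
    Returns a dict mapping case_id to "early", "late", or "middle".
    """
    # Exclude cases whose reported AAO exceeds their age at interview.
    valid = [c for c in cases if c[1] <= c[2]]
    if len(valid) < 8:
        raise ValueError("need at least 8 valid cases to form octiles")

    ordered = sorted(valid, key=lambda c: c[1])  # order cases by AAO
    n = len(ordered)
    # Rank-based octile index: 0 corresponds to O1, 7 to O8.
    octile = [(rank * 8) // n for rank in range(n)]

    max_o3 = max(c[1] for c, o in zip(ordered, octile) if o == 2)
    min_o6 = min(c[1] for c, o in zip(ordered, octile) if o == 5)

    groups = {}
    for (case_id, aao, _), o in zip(ordered, octile):
        if o <= 2 or (o == 3 and aao == max_o3):
            groups[case_id] = "early"   # O1-O3, plus O4 ties with max of O3
        elif o >= 5 or (o == 4 and aao == min_o6):
            groups[case_id] = "late"    # O6-O8, plus O5 ties with min of O6
        else:
            groups[case_id] = "middle"  # not used in the stratified analyses
    return groups


# Hypothetical toy study: 16 valid cases plus one with AAO > age at interview.
aaos = [14, 16, 18, 18, 20, 22, 22, 22, 25, 35, 35, 35, 40, 45, 50, 60]
toy_cases = [(f"c{i}", aao, 65) for i, aao in enumerate(aaos)]
toy_cases.append(("bad", 70, 65))

groups = stratify_by_aao(toy_cases)
counts = {g: list(groups.values()).count(g) for g in ("early", "middle", "late")}
print(counts)
```

In the toy data, the two O4 cases share the maximum AAO of O3 (22) and so join the early group, while one O5 case shares the minimum AAO of O6 (35) and joins the late group.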

| Genotyping and quality control
Genotyping procedures have been described in the original analysis for each study (Bycroft et al., 2018;Nagy et al., 2017;Wray et al., 2018).
All analyses were based on individuals of European ancestry only.

| Statistical analysis

| PRS
Six PRS were calculated based on GWAS summary statistics for body mass index (BMI), coronary artery disease (CAD) (Nikpay et al., 2015), stroke and two stroke subtypes (Malik et al., 2018), and type 2 diabetes (T2D) (Scott et al., 2017), including SNPs with an external GWAS p-value <0.5. To lower the multiple testing burden, a single p-value threshold of 0.5 was chosen. This threshold is often representative of predictive ability, since R² values often plateau across p-value thresholds in the range .05-1. Table 1 provides further detail on the GWAS summary statistics.
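As an illustration of what a PRS at a single p-value threshold amounts to, the sketch below sums effect-allele counts weighted by external GWAS effect sizes over the SNPs that pass the threshold. It is a simplified toy version and not the scoring software's implementation: the function name, SNP identifiers, and all numbers are hypothetical, and steps such as clumping and allele matching are omitted.

```python
def polygenic_score(dosages, effects, pvalues, p_threshold=0.5):
    """Weighted sum of effect-allele counts over SNPs passing the threshold.

    dosages: SNP id -> effect-allele count (0, 1, or 2) for one person.
    effects: SNP id -> per-allele effect size from the external GWAS.
    pvalues: SNP id -> external GWAS p-value.
    Returns (score, number of SNPs used).
    """
    score, n_used = 0.0, 0
    for snp, beta in effects.items():
        # Keep only SNPs below the p-value threshold that were genotyped.
        if pvalues[snp] < p_threshold and snp in dosages:
            score += beta * dosages[snp]
            n_used += 1
    return score, n_used


# Hypothetical GWAS summary statistics and one hypothetical participant.
gwas_effects = {"snp1": 0.10, "snp2": -0.05, "snp3": 0.20}
gwas_pvalues = {"snp1": 0.01, "snp2": 0.40, "snp3": 0.80}
person = {"snp1": 2, "snp2": 1, "snp3": 2}

score, n_used = polygenic_score(person, gwas_effects, gwas_pvalues)
print(f"score={score:.2f} from {n_used} SNPs")
```

With the threshold of 0.5, the third SNP (p = 0.80) is left out of the score. Raising the threshold to 1 would include every SNP.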
PRS were calculated in all genotyped participants in each study using PRSice v2 (https://github.com/choishingwan/PRSice) (Euesden, Lewis, & O'Reilly, 2015). Prior to creating the scores, clumping was
Logistic regression was used to test the associations between the six PRS and four different MDD case-control sets (all MDD cases vs. control subjects, early AAO cases vs. control subjects, late AAO cases vs. control subjects, and late AAO cases vs. early AAO cases).
These analyses test our two hypotheses; (a) testing the association of genetic risk for cardiometabolic traits with MDD, and whether this association differs by AAO by comparing controls and MDD cases (all MDD cases vs. control subjects, early AAO cases vs. control subjects, late AAO cases vs. control subjects), and (b) testing the association of genetic risk for cardiometabolic traits with MDD AAO itself, by comparing late AAO cases vs. early AAO cases only. Analyses were performed separately for each study, adjusting for relevant covariates in each study (PGC & GS:SFHS: 5 genetic principal components for population stratification; UK Biobank: 6 genetic principal components for population stratification, assessment center, and genotyping batch).
All PRS were standardized within each study. Meta-analyses of the results across studies for each PRS-MDD subtype combination were then conducted to synthesize the findings for maximum statistical power and to check for heterogeneity. Fixed-effects models were used in which the standardized regression coefficients were weighted by the inverse of their squared SE. We tested for the presence of between-study heterogeneity using Cochran's Q. We corrected for multiple testing across all 24 meta-analysis models (4 phenotypes × 6 PRS) using the Benjamini-Hochberg false discovery rate method (Benjamini & Hochberg, 1995). The resulting critical p-value was .0026: all p-values equal to or below this value were considered significant.
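The three steps in this paragraph (inverse-variance weighting, Cochran's Q, and the Benjamini-Hochberg step-up rule) can be sketched as below. This is a minimal illustration rather than the analysis code: the function names and toy numbers are hypothetical, the FDR level of 0.05 is an assumption since the text does not state it, and the pooled p-value uses a normal approximation.

```python
import math


def fixed_effects_meta(betas, ses):
    """Inverse-variance weighted pooled estimate, its SE, and Cochran's Q."""
    weights = [1.0 / se ** 2 for se in ses]
    total_w = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total_w
    pooled_se = math.sqrt(1.0 / total_w)
    # Under homogeneity, Q follows a chi-square with (k - 1) df.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    return pooled, pooled_se, q


def two_sided_p(beta, se):
    """Two-sided p-value from a normal approximation to beta / se."""
    return math.erfc(abs(beta / se) / math.sqrt(2.0))


def bh_critical_p(pvalues, fdr=0.05):
    """Largest p-value declared significant by Benjamini-Hochberg (0 if none)."""
    m = len(pvalues)
    critical = 0.0
    for rank, p in enumerate(sorted(pvalues), start=1):
        if p <= rank / m * fdr:
            critical = p  # step-up: keep the largest p that passes
    return critical


# Hypothetical per-study standardized coefficients and standard errors.
betas = [0.10, 0.06, 0.08]
ses = [0.05, 0.03, 0.02]
pooled, pooled_se, q = fixed_effects_meta(betas, ses)
pooled_p = two_sided_p(pooled, pooled_se)
print(f"pooled={pooled:.4f} se={pooled_se:.4f} Q={q:.3f} (df={len(betas) - 1})")

# Hypothetical set of meta-analysis p-values to correct jointly.
critical = bh_critical_p([0.001, 0.008, 0.039, 0.041, 0.60])
print(f"critical p-value: {critical}")
```

Every p-value at or below the returned critical value is treated as significant, which matches how the critical p-value is used in the text.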

| GWAS meta-analysis and genetic correlations
Genome wide association analyses for three AAO-stratified MDD case-control subsets were performed within each study, with adjustments for population stratification. Study-specific covariates-for example, site or familial relationships-were also fitted as required (see Supplementary Material). The GWAS for GS:SFHS included all individuals (compared with unrelated individuals in the PRS analysis) to maximize power. Quality control of the study-level summary statistics was performed using the EasyQC software (Winkler et al., 2014), which implemented the exclusion of SNPs with imputation quality <0.8 and minor allele count <25. p-value based meta-analyses, with genomic control, were then performed for each of the three outcome measures, using the METAL package (Willer, Li, & Abecasis, 2010).
SNPs with a combined sample size of less than 1,000 participants were excluded.
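For intuition, a p-value based meta-analysis with genomic control can be sketched as follows: each study's p-value and effect direction are converted to a signed Z-score, test statistics are deflated by the genomic inflation factor, and Z-scores are combined with weights proportional to the square root of the study sample size. This is a simplified sketch under those assumptions, not the meta-analysis software itself, and the function names and toy inputs are hypothetical. The minimum combined sample size of 1,000 follows the text.

```python
import math
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
CHI2_1DF_MEDIAN = 0.4549364  # median of a chi-square distribution with 1 df


def signed_z(p_value, direction):
    """Convert a two-sided p-value and effect sign (+1 or -1) to a Z-score."""
    return direction * _STD_NORMAL.inv_cdf(1.0 - p_value / 2.0)


def genomic_control(z_scores):
    """Deflate one study's Z-scores by its inflation factor lambda (if > 1)."""
    lam = median(z * z for z in z_scores) / CHI2_1DF_MEDIAN
    lam = max(lam, 1.0)  # never inflate statistics that are already deflated
    return [z / math.sqrt(lam) for z in z_scores], lam


def combine_snp(z_by_study, n_by_study, min_total_n=1000):
    """Sample-size weighted Z for one SNP; None if combined N is too small."""
    if sum(n_by_study) < min_total_n:
        return None
    weights = [math.sqrt(n) for n in n_by_study]
    z = sum(w * zi for w, zi in zip(weights, z_by_study))
    z /= math.sqrt(sum(w * w for w in weights))
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Hypothetical example: one SNP seen in two studies of 2,000 people each.
print(combine_snp([2.0, 2.0], [2000, 2000]))
# The same SNP with a combined N of 800 is excluded.
print(combine_snp([2.0, 2.0], [400, 400]))
```

Weighting by the square root of sample size lets larger studies contribute more without requiring effect sizes to be on a common scale.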
Genetic correlations between the cardio-metabolic traits and the four MDD summary statistics were calculated using linkage disequilibrium score regression (LDSC) (B. K. Bulik- using the default HapMap LD reference. To maximize power (and given that genetic correlation analyses in LDSC are robust to sample overlap), the largest available summary statistics were used. These included the previously described summary statistics for coronary artery disease (Nikpay et al., 2015), stroke (Malik et al., 2018), and type 2 diabetes (Scott et al., 2017), as well as summary statistics for major depression from the Quenouille, 1956;Tukey, 1958 (Table S1).
Figure 1 shows the distribution of AAO and the stratification into the early and late AAO groups in each of the studies. Figure S1 shows the cumulative distribution of AAO in each study.

| Polygenic risk analysis of cardio-metabolic disease
The meta-analyses across studies showed that PRS for BMI, coronary artery disease, all stroke, ischaemic stroke, small vessel disease, and type 2 diabetes were significantly associated with MDD case control  Table S2). Study specific results can be found in Table S3.
To calculate genetic correlations between the AAO stratified MDD traits and cardio-metabolic traits, we ran GWAS on the AAO stratified traits. Figures S3-S5 show the Manhattan and QQ plots for these traits. Table S4 provides further detail on the heritability estimates of the stratified MDD traits. One genome-wide significant variant was identified in the GWAS for early AAO versus controls (rs2789313 on chromosome 10, Z-score = 5.56, p = 2.74 × 10⁻⁸). This variant is located in the MALRD1 gene, which is involved in hepatic bile acid metabolism and lipid homeostasis (Vergnes, Lee, Chin, Auwerx, & Reue, 2013). This gene has not previously been associated with depression. Figure S3 shows a locus zoom plot of the region on chromosome 10 including rs2789313.
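As a quick arithmetic check on the reported association, the two-sided p-value implied by a Z-score can be computed from the normal distribution, as in the short sketch below. The threshold of 5 × 10⁻⁸ is the conventional genome-wide significance level and is an assumption here, since the text does not state the threshold used.

```python
import math


def two_sided_p_from_z(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


GENOME_WIDE_ALPHA = 5e-8  # conventional threshold (assumed, not from the text)

p = two_sided_p_from_z(5.56)  # Z-score reported for rs2789313
print(f"p = {p:.2e}, genome-wide significant: {p < GENOME_WIDE_ALPHA}")
```

The result is close to the reported p-value; a small difference is expected because the Z-score is given to two decimal places.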
The results for the stratified AAO traits showed a significant genetic correlation between BMI and late AAO versus controls  Table S4).

| DISCUSSION
This study explored whether the association between MDD and cardio-metabolic traits is partly due to genetic factors. Using data from publicly available GWAS and from the PGC, UK Biobank, and Generation Scotland, we showed significant genetic overlap between MDD and BMI, coronary artery disease, and type 2 diabetes from PRS and genetic correlations. study, but the depression phenotype was based on the definition as described by D. J. Smith, Nicholl, et al. (2013), and has lower heritability than the Mental Health Questionnaire phenotype used here (Cai et al., 2020;D. J. Smith, Nicholl, et al., 2013). This difference in results is likely due to power: based on the heritability, sample size, and disease prevalence, our study has 60% power to detect an association between depression and coronary artery disease, compared with 18% in Khandaker et al. (2019) (Dudbridge, 2013).
This is the first study to show genetic overlap between MDD and type 2 diabetes using both PRS and genetic correlations. Twin studies have previously shown genetic overlap between MDD and type 2 diabetes, but evidence from genetic studies has been weaker. The current study, however, does show evidence for a shared genetic etiology between MDD and type 2 diabetes based on both PRS and genetic correlations (Clarke et al., 2016;Kan et al., 2016;Wong et al., 2019).
The polygenic risk score analysis showed that individuals with a stronger genetic liability for stroke (and its subtypes) were more likely to have MDD with a later AAO. We did not find a significant association between genetic risk for stroke and AAO. This could be due to a lack of power, as the stroke outcomes had low heritability estimates (observed h² 1%, Table S4B) and the sample sizes of the stratified MDD outcomes are smaller than that of the overall MDD case-control comparison.
When stratifying by AAO, higher polygenic risk for BMI was more strongly associated with late AAO than early AAO compared with controls, while a significant genetic correlation was only found between late AAO versus control and BMI. A causal association using Mendelian randomization has previously been identified between BMI and depression, showing a 1.12 fold increase in depression for each SD increase in BMI (Wray et al., 2018). Similarly, Vogelzangs et al (Vogelzangs et al., 2010) have shown that over a 5 year period both overall and abdominal obesity were associated with an increased risk for MDD onset in men. Although the differences observed in this study are not significant following multiple testing correction, in the context of previous findings, they tentatively suggest that the etiological processes underlying later onset MDD are linked to vascular pathology and its risk factors such as obesity.
LD score regression produced fewer significant associations than the polygenic risk score analysis. Genetic correlations using LD score regression are based on summary statistics only and could therefore be less powerful than a polygenic risk score approach based on raw genotype data. However, the direction of effect was the same across both analysis approaches and significant genetic correlations were corroborated by significant polygenic risk score associations.
The current study has a number of limitations. The measures of AAO for MDD rely on self-report and are assessed differently across cohorts. We addressed this by stratifying AAO into octiles within each study, thereby assuming that each study recruited AAO from the same distribution, with differences between studies due to differences in the ascertainment measure. It should be noted that the mean AAO for late onset cases was 47.85 years, which is below what is generally considered to be late onset or geriatric depression (onset >60 years). More pronounced overlap might exist between late onset MDD and cardio-metabolic traits when focusing on cases with a later AAO, but this study included only a small number of such cases. Nevertheless, this study does show that the effect of genetic risk for somatic conditions on depression is not limited to late onset or geriatric depression. By standardizing the PRS in all models, we have possibly introduced bias into our results. When the sample prevalence of a trait is not equal to the population prevalence, the sample mean of the PRS will not represent the mean of the PRS in the population, which introduces bias that could lead to inflated effect estimates.
Future studies could focus on the effects of medication or disease status for the cardio-metabolic traits, as the current study was unable to adjust for this due to lack of information on these variables. Pathway analysis might provide further insight into the biology underlying the association between cardio-metabolic traits and MDD. In order to further dissect the comorbidity between MDD and cardio-metabolic disease, it is imperative to have cleaner phenotypes, in particular with regard to AAO. Biobanks with electronic health records might aid in providing improved phenotyping, but this still does not provide the same information as one gets from clinical research studies.
In summary, this study showed genetic overlap between MDD and cardio-metabolic traits based on PRS and genetic correlations.
These associations were largely irrespective of AAO for MDD, in particular for coronary artery disease and type 2 diabetes. The association with BMI showed some evidence for a stronger link with later onset MDD; however, this finding needs to be replicated.