A genome‐wide association study of the frailty index highlights brain pathways in ageing

Abstract Frailty is a common geriatric syndrome and strongly associated with disability, mortality and hospitalization. Frailty is commonly measured using the frailty index (FI), based on the accumulation of a number of health deficits during the life course. The mechanisms underlying FI are multifactorial and not well understood, but a genetic basis has been suggested with heritability estimates between 30 and 45%. Understanding the genetic determinants and biological mechanisms underpinning FI may help to delay or even prevent frailty. We performed a genome‐wide association study (GWAS) meta‐analysis of a frailty index in European descent UK Biobank participants (n = 164,610, 60–70 years) and Swedish TwinGene participants (n = 10,616, 41–87 years). FI calculation was based on 49 or 44 self‐reported items on symptoms, disabilities and diagnosed diseases for UK Biobank and TwinGene, respectively. 14 loci were associated with the FI (p < 5*10−8). Many FI‐associated loci have established associations with traits such as body mass index, cardiovascular disease, smoking, HLA proteins, depression and neuroticism; however, one appears to be novel. The estimated single nucleotide polymorphism (SNP) heritability of the FI was 11% (0.11, SE 0.005). In enrichment analysis, genes expressed in the frontal cortex and hippocampus were significantly downregulated (adjusted p < 0.05). We also used Mendelian randomization to identify modifiable traits and exposures that may affect frailty risk, with a higher educational attainment genetic risk score being associated with a lower degree of frailty. Risk of frailty is influenced by many genetic factors, including well‐known disease risk factors and mental health, with particular emphasis on pathways in the brain.


| INTRODUCTION
Frailty is a common geriatric syndrome involving multi-system impairment, which is associated with increased vulnerability to stressors (Dodds & Sayer, 2016). Frailty is a major public health issue in geriatric populations and is becoming increasingly common with ageing demography (Morley, 2016). Many frailty definitions exist but two of the most commonly used are the frailty phenotype (FP) (Fried et al., 2001) and the frailty index (FI) (Mitnitski et al., 2001).
The FP is a clinical syndrome based on the presence of three of five physical components (exhaustion, weakness, slow walking speed, unintentional weight loss and low physical activity), whereas the FI is based on the accumulation of a number of health deficits during the life course. While both measures are related to adverse ageing outcomes, the two instruments are differently orientated and the underlying physiological and biological differences between the FP and FI are important (Cesari et al., 2014) (Cesari et al., 2017). The FI may be better at discriminating at the lower to middle end of the frailty continuum, making it better suited for measuring frailty at younger age (Blodgett et al., 2015).
Higher FI values are associated with many negative health outcomes including disability, mobility limitations, a variety of chronic diseases, hospitalization and mortality (Williams et al., 2019) (Kojima et al., 2018) (Vermeiren et al., 2016). The prevalence of FI-defined frailty is estimated at 18% worldwide in community-dwelling older adults aged 60 and above (Siriwardhana et al., 2018). The mechanisms underlying the FI are likely multifactorial but not well understood.
Studies have suggested a genetic basis, with heritability estimates between 30 and 45% (Young et al., 2016) (Kim et al., 2013). Candidate gene association studies for the FI have suggested the involvement of genes in inflammatory pathways, including interleukin-18 (Mekli et al., 2015). A first GWAS of the FI included two representative samples from the United States and U.K. (the discovery sample: the Health and Retirement Study, n = 8532; and the replication sample: the English Longitudinal Study of Ageing, n = 5248), and identified two SNPs associated with FI variation (Mekli et al., 2018). rs6765037 (located near the gene KBTBD12) was associated in the discovery sample only, and rs7134291 (located in GRIN2B, which plays a role in brain development, synaptic plasticity and cognition) showed a nominal replication. Further large-scale GWAS are needed to corroborate these findings and to more thoroughly understand the genetic determinants of frailty and associated biological pathways. These might provide insights for interventions to prevent and delay frailty and hence promote healthier and more independent ageing.
We undertook a GWAS meta-analysis of frailty, measured using the FI, in 60- to 70-year-old participants of European descent from the UK Biobank cohort and Swedish TwinGene participants aged 41 to 87 years. To gain insights into the functional relevance of emergent genetic associations, we examined the relationship of the frailty risk loci with circulating proteins and tissue-specific gene expression and epigenetic profiles. We also leveraged the GWAS findings in Mendelian randomization analyses to identify modifiable physiological, lifestyle or environmental traits that could be targeted by clinical and/or public health interventions to mitigate frailty risk.

| Study characteristics
In this GWAS meta-analysis of UK Biobank and TwinGene, we included 164,610 UK Biobank participants of European descent aged 60 to 70 years at baseline (mean 64.1, SD 2.8), which included 84,819 females (51.3%). The FI score ranged from zero to 27, from 49 deficits in total, and the mean proportion of deficits was 0.129 (SD 0.075), with a slightly higher mean score in females (0.132, SD 0.076) compared to males (0.125, SD 0.074) ( Table 1). The TwinGene participants (n = 10,616) were Swedish nationals aged 41 to 87 years (mean 58.3, SD 7.9) with 5,577 females (52.5%), and all of European descent. The TwinGene FI score ranged from zero to 26.25 deficits (44 considered in total), and the mean proportion of deficits was 0.121 (SD 0.080).

TABLE 1 Baseline characteristics of study populations
The Swedish Adoption/Twin Study of Aging (SATSA) participants (n = 368), whose data were used in DNA methylation-related follow-up analyses, were all of European descent, aged 48 to 93 years (mean 68.6, SD 9.6), with 223 (60.6%) females. The SATSA FI score ranged from zero to 19 deficits (42 considered in total), and the mean proportion of deficits was 0.100 (SD 0.087).

| GWAS meta-analysis of the FI
We identified 14 loci associated (p < 5*10−8) with the FI (n = 2,007 associated variants in total) in the meta-analysis of UK Biobank and TwinGene participants (Figure 1, Table 2 and Table S1). Of the 14 signals that passed the threshold for genome-wide significance, two were also significant at the more stringent p-value threshold of 1*10−9: rs9275160 (nearest genes HLA-DQB1 and HLA-DQA2) and rs82334 (nearest gene HTT). Two of the 14 loci (rs583514 in NLGN1 and rs2396766 in FOXP2) had significant heterogeneity between the studies (effect allele associated with FI in opposite directions), so these should be interpreted with caution (Table 2).
Many of the loci (13 of the 14) have previously been associated with traits and diseases in the GWAS Catalog (Buniello et al., 2019) (as of 12 April 2021) such as body mass index (BMI), cardiovascular disease, smoking initiation, HLA proteins, depression and neuroticism (Table 2, Table S2). For some, it is the lead SNP itself that was identified in another GWAS, whereas for others it is a variant in partial LD that has appeared previously (e.g. rs4952693 in gene LRPPRC is associated with FI in our analysis, with a correlated variant rs10186876 (R² = 0.61) known to be associated with handgrip strength (Willems et al., 2017); see Table 2 and Table S2 for details). One of the associations appears to be novel: rs56299474 (p = 4*10−8) is located between genes HR and REEP4 and has not previously appeared in the GWAS Catalog. However, it is a single associated variant with no other variants at the locus showing appreciable evidence of association, so further evidence is required for a robust conclusion.
In a separate analysis of variants available only in UK Biobank (not imputed in TwinGene), we identified six further loci that require additional validation (Table 2). Five of the six have previously appeared in the GWAS Catalog, with rs796921150 in gene CSMD3 appearing in a GWAS for the first time.
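The heterogeneity flag on the NLGN1 and FOXP2 loci can be made concrete with a small sketch. This section does not say which meta-analysis software or model was used, so the following assumes a standard inverse-variance weighted fixed-effect meta-analysis with Cochran's Q as the heterogeneity statistic; the function name and the example numbers are hypothetical and are not taken from Table 2.

```python
import math

def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted fixed-effect meta-analysis (assumed model).

    betas, ses: per-cohort effect estimates and standard errors for one variant.
    Returns the pooled beta, its standard error, a two-sided p-value and
    Cochran's Q (to be compared with a chi-square on len(betas) - 1 df).
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # erfc stays accurate in the far tail, where GWAS p-values live
    p = math.erfc(abs(z) / math.sqrt(2.0))
    # Cochran's Q: weighted squared deviations from the pooled estimate
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    return beta, se, p, q

# Hypothetical variant: large cohort positive, small cohort negative
beta, se, p, q = fixed_effect_meta([0.02, -0.05], [0.004, 0.016])
print(f"pooled beta={beta:.4f}, se={se:.4f}, p={p:.2e}, Q={q:.1f}")
```

With two cohorts, Q has one degree of freedom, so a value above about 3.84 corresponds to heterogeneity p < 0.05. Because the larger cohort dominates the weights, a variant can reach genome-wide significance overall even when the smaller cohort points the other way, which is why such loci call for caution.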
FIGURE 1 Manhattan plot for genome-wide association study of Frailty Index. Meta-analysis GWAS of Frailty Index (normalized) in 164,610 UK Biobank participants aged 60-70 of European descent and 10,616 TwinGene participants aged 41-87 years. Primary analysis included 7,666,890 autosomal variants with minor allele frequency (MAF) >0.1%, Hardy-Weinberg p-value >1*10−9 and imputation quality >0.3 in both cohorts. Linear mixed-effects regression models (BOLT-LMM software, which accounts for relatedness and population structure) were adjusted for age, sex, assessment centre (22 categories) and genotyping array (2 categories: Axiom or BiLEVE). There are 14 loci associated with p < 5*10−8 (red line) in the meta-analysis, highlighted in blue. In secondary analysis of 8,828,853 variants only available in UK Biobank, 6 additional loci were associated at p < 5*10−8 (plotted but not highlighted). Genes are those nearest to the lead variants. See Table 2 for primary meta-analysis results. See Tables S1 and S2 for full details.
TABLE 2 GWAS meta-analysis associations with Frailty Index in UK Biobank and TwinGene

| Candidate gene association analysis
Based on prior evidence, we also specifically looked up variants for which we hypothesized an association with the FI a priori. First, the previous GWAS of FI (Mekli et al., 2018) identified two variants; neither was associated with the FI at genome-wide significance in our analysis: rs6765037 was not associated (p = 0.410), although rs7134291 was nominally significant (p = 0.018). We also investi- as CDKN2B-AS1-showed evidence of association, although did not reach genome-wide significance (p = 2*10−4). rs61348208 is associated with parents' lifespan (RHJ Timmers et al., 2019) and is close to significance in this analysis (RBM6 locus, p = 3*10−7) though still not reaching the genome-wide cut-off (5*10−8). See Table S2 for full results list.

| Tissue enrichment analysis implicates brain pathways
FUMA (Functional mapping and annotation of genetic associations (Watanabe et al., 2017)) analysis of differentially expressed genes using tissue-specific gene expression data (GTEx v8) identified four tissues with significant (multiple testing adjusted p < 0.05) downregulation of genes associated with FI: Brain (Frontal Cortex BA9), Brain (Cerebellar Hemisphere), Brain (Spinal cord cervical c-1) and Brain (Hippocampus) (see Table S3). With gene-set enrichment analysis using MAGMA software (in FUMA), we found nine pathways significantly enriched in the GWAS results for FI after adjustment for multiple testing (Bonferroni), including MHC Class II complex, Translocation of Zap70 to immunological synapse, and asthma (p < 3*10−6; see Table S4 for details). In stratified linkage disequilibrium score regression (LDSR) analysis (Finucane et al., 2018), which identifies whether GWAS statistics are associated with specific gene-tissue pairings in expression or chromatin modification, we found enrichment for the genetic determinants of frailty in several brain regions, including the hippocampus and frontal cortex (see Tables S5 and S6 for details).

| GWAS hits as molecular quantitative trait loci (QTLs)
We examined whether there are established associations of the top 14 SNPs with concentrations of circulating proteins, metabolites, gene expression and/or epigenetic markers, as reported by previous QTL GWAS with genome-wide significance (p < 5*10−8). Overall, 13 of the SNPs were associated with at least one molecular marker.

| Methylation-FI associations
To identify whether risk variants could be related to frailty via DNA methylation differences, we assessed whether the FI-associated genetic variants harboured methylation quantitative trait loci (mQTL) and whether methylation levels in such loci demonstrated further associations with the FI in SATSA. Of the total 2,007 variants associated with the FI, 1,573 were cis-methylation quantitative trait loci (cis-mQTL) identified in SATSA. These 1,573 variants had 12,081 demonstrable associations with CpG sites in total; single variants were commonly associated with multiple CpG sites, and there were 103 unique CpGs among all associations. Associated variants for the 103 CpGs are shown in Table S8. Methylation level in one (cg20614157) of the 103 CpG sites was significantly associated with the SATSA FI after Bonferroni correction (see Table 3; full results for all 103 sites shown in Table S9), and 6 CpG sites were associated with a nominal p < 0.05.
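As a worked illustration of the multiple-testing step, the sketch below applies a Bonferroni threshold to a set of per-CpG p-values. It assumes the correction was made across all 103 tested CpG sites, which this section implies but does not state outright; the same arithmetic reproduces the p < 0.0014 cut-off quoted for 35 tests in the Figure 2 caption. The function names and the example p-values are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error at alpha."""
    return alpha / n_tests

def bonferroni_significant(p_values, alpha=0.05):
    """Return the subset of {label: p} that passes the Bonferroni threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return {label: p for label, p in p_values.items() if p < threshold}

# Thresholds implied by the numbers of tests mentioned in the text
print(bonferroni_threshold(0.05, 103))  # CpG-FI tests (assumed family of 103)
print(bonferroni_threshold(0.05, 35))   # genetic risk score tests

# Hypothetical p-values for three illustrative sites (not real results)
example = {"cpg_a": 1e-5, "cpg_b": 0.02, "cpg_c": 0.5}
print(bonferroni_significant(example))
```

A site with a nominal p < 0.05 that fails this threshold is suggestive only, which is the distinction the text draws between the one Bonferroni-significant CpG and the six nominally associated sites.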

| Mendelian Randomization (MR)
To identify potentially modifiable phenotypic traits that may contribute to the development of frailty over the life course, we conducted a series of MR analyses. First, we calculated genetic risk scores (GRS) in UK Biobank for 35 traits for which we could retrieve one or more SNPs robustly associated with the traits from pre-existing GWAS results (Figure 2 and
Model types also produced inconclusive or more divergent findings for grip strength, waist-to-hip ratio, inflammatory bowel disease, parental attained age and age at first sexual intercourse, at least partly reflecting the use of less informative sets of genetic instruments for analyses of several of these traits.

| DISCUSSION
In this GWAS, we identified 14 genetic loci associated with frailty (as measured by the FI) in 164,610 UK community-based individuals aged 60 to 70 years of European ancestry, with meta-analysis of data from similarly aged Swedish individuals in TwinGene. We found suggestive evidence for others but more evidence is required to confirm these. Many of these loci have previously been associated with traits such as body mass index, cardiovascular disease, smoking initiation, HLA proteins, depression and neuroticism, suggesting roles for known disease risk factors and mental health in the development of frailty, with particular emphasis on genes expressed in the frontal cortex and hippocampus in the tissue expression analysis.
Our SNP-based heritability estimate of 11% for the FI was slightly lower than previous family- or twin-based estimates be-
The majority of FI-associated loci have previously been identified for traits that have widespread health consequences, such as BMI, smoking initiation, depression, or neuroticism. The results include variants in the major histocompatibility complex region of chromosome six, containing many HLA (Human Leukocyte Antigen) genes.
HLAs are cell-surface proteins crucial for regulating immune function, which is known to decline with age (Simon et al., 2015). Genetic variants affecting the efficiency of immune function at older ages could have profound consequences for morbidity: variants in this region are robustly implicated in many chronic diseases and traits (Buniello et al., 2019), and also weakness in older people (sarcopenia) (Jones et al., 2019).

TABLE 3 Association between methylation levels in mQTL-associated CpGs and Frailty Index in SATSA
*If >20 associated variants, the first 20 with lowest p-values are included (see Table S9 for full results).
In our analysis of blood-based DNA methylation associated with the FI, the only confirmed mQTL CpG (cg20614157) association was in a promoter for the TNXB/TNXA/STK19 [Tenascin XB/putative Tenascin XA/Serine/threonine-protein kinase 19] gene cluster, located in the HLA region. The association of a marker of potential functional relevance with the FI increases the likelihood that one or more of the genes in this cluster are of causal relevance for frailty risk, rather than others in the wider region. We found that high methylation level in cg20614157 was associated with higher frailty; a previous study has found that high methylation of the CpG site is
FIGURE 2 Genetic risk score associations with the frailty index in UK Biobank. Thirty-five exposures, including lifestyle factors, clinical measures, circulating biomarkers and diseases, were assessed for their association with the Frailty Index by genetic risk score analysis in UK Biobank participants of European descent aged 60-70 years. Linear regression models included age, sex, assessment centre (22 categories), genotyping array (2 categories: Axiom or BiLEVE) and principal components of ancestry 1-10 as covariates. The betas represent the SD change in FI per SD increase in genetic predisposition to the exposure. Positive betas suggest increased frailty in individuals with greater genetic predisposition to the exposure, whereas negative betas represent a protective effect with increasing genetic predisposition. See Table S10 for details. * = significant (p < 0.0014) after Bonferroni correction for 35 tests.
Abbreviations: BMI = body mass index; adjBMI = adjusted for BMI; IGFBP-3 = insulin-like growth factor-binding protein 3; SHBG = sex hormone binding globulin; IGF-1 = insulin-like growth factor 1; DHEAS = dehydroepiandrosterone sulphate; eGFR = estimated glomerular filtration rate; CIs = 95% confidence intervals
In tissue enrichment analysis, genes expressed in several brain regions are significantly downregulated by genetic variants associated with FI. Regions include the frontal cortex and hippocampus, the latter of which in particular has been linked to dementia (Moodley & Chan, 2014). The main genes associated with several of the lead SNPs are linked directly to neuronal function, including ANK3 (Ankyrin 3), HTT (Huntingtin), which is linked to Huntington's disease and may have a role in vesicle formation in autophagy, and SYT14 (Synaptotagmin 14), which mediates membrane trafficking in synaptic transmission and in which mutations cause spinocerebellar ataxia (Doi et al., 2011).
These results are consistent with FI-defined frailty having a neurological basis, as has been suggested by a previous study showing that frailty and chronic widespread pain have shared neurological pathways. Pain questionnaire responses constituted a notable proportion of the FI items in UK Biobank (9 of 49; 18.4%), which will have influenced these findings. However, the relationship of these genetic risk loci with frailty was also present in the TwinGene meta-analysis (where equivalent data were available), which included fewer items related to pain perception (2 of 44; 4.5%).
A previous FI GWAS (Mekli et al., 2018) identified two variants (rs6765037 and rs7134291), but neither was associated with FI at genome-wide significance in UK Biobank, which was more than ten times larger than this previous study. Surprisingly, loci implicated in parental lifespan (Wright et al., 2019) (RHJ Timmers et al., 2019) or key ageing hallmarks (Pilling et al., 2016) were not associated with the FI at genome-wide significance. This included genetic variants in APOE, 9p21.3 (CDKN2A/B), TERT and FOXO3A. The FI is strongly associated with morbidity and mortality epidemiologically, but it may be limited as a tool for identifying specific ageing pathways, given the broad definition that includes many diverse diseases and characteristics.
In Mendelian randomization analyses, we tested whether long-term effects of specific physiological, behavioural or lifestyle factors may affect FI scores. The most prominent findings indicated that predispositions to higher educational attainment and lower BMI were related to decreased frailty. Higher BMI directly affects cardiovascular health and subsequent disease risk, in addition to possible indirect effects by reducing socioeconomic status due to stigma (especially in women), with subsequent effects on frailty. This also supports a recent study which provided evidence against the notion of an obesity paradox in older people (Bowman et al., 2017).
To our knowledge, this is the largest genetic study of the frailty index to date. However, this study is limited to participants of European ancestry so results may not be generalizable to other populations. UK Biobank volunteers tend to be healthier and less socioeconomically deprived at baseline than the general UK population (Fry et al., 2017), perhaps reducing power to detect variants associated with the FI. The FI was also based on self-reported baseline data so is open to the possibility of misclassification bias, but the FI has previously been validated in UK Biobank and shown to be strongly predictive of all-cause mortality (Williams et al., 2019). Sample sizes in TwinGene and SATSA are also relatively small so power for the replication and CpG-FI analyses may have been limited. Although we show that the FI definitions are comparable between UK Biobank and TwinGene, differences in genetic background could also have contributed to the limited overlap in results.
In conclusion, frailty is influenced by a large number of genetic determinants linked to well-known disease risk factors such as BMI, cardiovascular disease, smoking and HLA proteins. Frailty is also influenced by genetic determinants for depression and neuroticism, suggesting a role for mental health, which may be underpinned by pathways linked to brain function. Future research is required to replicate our results in other cohorts, testing for consistency of genetic determinants when frailty has been measured differently. In particular, a comparable GWAS of the frailty phenotype, based on physical components of health (e.g. exhaustion, weakness) rather than comorbidities, would be likely to yield further risk loci specifically for these components of frailty.
FIGURE 3 Mendelian randomization estimates for the effect of educational attainment on the frailty index in UK Biobank. Points and error bars represent beta estimates and 95% confidence intervals for each SNP-education / SNP-FI association. The trend lines represent different methods for summarizing the estimates from individual SNPs: inverse variance weighting (IVW), weighted median and MR-Egger. The weighted median and MR-Egger estimates are less prone to bias from pleiotropy among the set of variants than IVW, given alternative assumptions hold. The MR-Egger method includes a test of whether the trend's intercept differs from zero, which indicates whether there is an overall imbalance (directional) of pleiotropic effects: such bias was not identified in this education-FI model.

| TwinGene
TwinGene data collection took place between 2004 and 2008, when the older participants of the Screening Across the Lifespan Twin (SALT) study were invited to donate blood for molecular and genetic analyses. Both same- and opposite-sex twins were included.
This study was based on the sample that had both genetic and FI data available (n = 10,616). All participants have given their informed consent. The TwinGene study was approved by the Regional Ethics Review Board, Stockholm.

| Swedish Adoption/Twin Study of Aging (SATSA)
SATSA is a longitudinal study in gerontological genetics, with participants drawn from the Swedish Twin Registry. SATSA was initiated in 1984 and ended in 2014, and it comprises nine questionnaire and ten in-person testing (IPT) waves. The participants are same-sex twins; some twin pairs were reared together and some were separated before age 11 and reared apart.
Ascertainment procedures for SATSA have been described previously (Finkel & Pedersen, 2004). This study made use of IPT data, and participants' first available measurement on FI and DNA methylation were included (n = 368). All participants have provided informed consent. The SATSA study was approved by the Regional Ethics Review Board in Stockholm.

| Frailty Index (FI)
We used an FI based on the accumulation of deficits model (Searle et al., 2008), as validated in UK Biobank previously (Williams et al., 2019). The FI was derived using 49 self-reported baseline data variables in UK Biobank. Variables were based on a variety of physiological and mental health domains, and included symptoms, disabilities and diagnosed diseases, which were self-reported by participants at baseline (see Table S11 for details of the FI components included and the proportion of individuals scoring one for each component). The FI was generated using a complete-case sample with information on all 49 individual components (n = 164,610) and presented as a proportion of the sum of all deficits. The FI was quantile normalized (i.e. transformed into a normal distribution) prior to the genome-wide association study (due to the skew of the untransformed trait) and sensitivity analyses using the non-transformed trait were performed.
The construction of the FIs for TwinGene and SATSA has been described previously, and both FIs have been validated for their ability to predict mortality (Li et al., 2019) (Raymond et al., 2020). Briefly, both the TwinGene FI and SATSA FI were constructed using self-reported questionnaire data (for TwinGene, the data collected in SALT were used), and they cover a variety of different health domains. The TwinGene/SALT FI consists of 44 deficits (Table S12), and the SATSA FI consists of 42 deficits. Prior to the analyses, the FIs were quantile normalized.
We compared the FIs used in UK Biobank and TwinGene. Of the 49 items used in UK Biobank, 29 have approximately equivalent items in TwinGene (see Table S13 for details of overlap). The FI based on this subset of 29 items in UK Biobank was well correlated with the FI based on the full 49 items (r² = 0.85, p < 0.0001).
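The FI derivation described above can be sketched in a few lines. The example computes the FI as the proportion of deficits present in a complete-case record and then applies a rank-based inverse normal transform as one common way of quantile normalizing a skewed trait. The exact transform used in the study is not specified in this section, so the (rank − 0.5)/n offset, the averaging of tied ranks and the function names are assumptions made for illustration.

```python
from statistics import NormalDist

def frailty_index(deficits):
    """Proportion of deficits present; each item is scored between 0 and 1."""
    if any(d is None for d in deficits):
        raise ValueError("complete-case only: every item must be answered")
    return sum(deficits) / len(deficits)

def quantile_normalize(values):
    """Rank-based inverse normal transform; tied values share their average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank for the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    # (rank - 0.5) / n keeps every quantile strictly inside (0, 1)
    return [std_normal.inv_cdf((r - 0.5) / n) for r in ranks]

# Hypothetical participants, each with 4 items instead of the real 49
raw = [frailty_index(items) for items in ([0, 0, 0, 0], [1, 0, 0.5, 0], [1, 1, 1, 0])]
print(raw)
print(quantile_normalize(raw))
```

Allowing fractional item scores is consistent with the non-integer TwinGene maximum of 26.25 deficits reported in the study characteristics. The transform preserves the ordering of participants while removing the right skew of the raw proportion.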

| GWAS Meta-analysis of the Frailty Index
We used data from UK Biobank v3 genotyping release, described in detail previously (Bycroft et al., 2018). In brief, 488,377 participants were successfully genotyped using custom Affymetrix microarrays for ~820,000 variants. Imputation was performed using 1000 Genomes
Genotyping of the TwinGene was performed using the Illumina OmniExpress platform and has been previously described (Magnusson et al., 2013). The present study made use of data im-

| Gene Ontology Pathways & QTL analyses
See Supplementary Methods for details.

| Mendelian Randomization (MR)
MR is the application of genetic variation to infer whether phenotypic traits or exposures affect diseases or health-related outcomes (Lawlor et al., 2008). We investigated a range of exposures for which genetic determinants (typically SNPs) have been identified in previous GWAS. In total, 35 exposure-frailty associations were modelled in UK Biobank. Exposures included lifestyle factors, clinical measures, circulating biomarkers and diseases. All traits are listed in Table S10, with references for the GWAS from which sets of instrumenting variants were sourced. We used genetic risk scores (GRS) for each trait in initial MR models, and followed up several of the lead findings with sensitivity analyses to detect and account for bias by pleiotropy among genetic instruments (described in the Supplementary Methods).
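To make the MR terminology concrete, the sketch below shows textbook forms of three quantities named in this paper: a weighted genetic risk score, the inverse variance weighted (IVW) estimate and the MR-Egger regression. The authors' actual implementation is described in the Supplementary Methods and is not reproduced here, so the function names, the first-order weights (1/SE² of the SNP-outcome estimate) and the example numbers are all assumptions.

```python
import math

def genetic_risk_score(dosages, weights):
    """Weighted GRS: sum of effect-allele dosages (0-2) times per-SNP weights."""
    return sum(d * w for d, w in zip(dosages, weights))

def _orient(bx, by):
    """Flip signs so every SNP-exposure effect is positive (needed for MR-Egger)."""
    pairs = [(-x, -y) if x < 0 else (x, y) for x, y in zip(bx, by)]
    return [x for x, _ in pairs], [y for _, y in pairs]

def ivw_estimate(bx, by, se_by):
    """IVW causal estimate: weighted regression of by on bx through the origin."""
    w = [1.0 / s ** 2 for s in se_by]
    denom = sum(wi * x * x for wi, x in zip(w, bx))
    beta = sum(wi * x * y for wi, x, y in zip(w, bx, by)) / denom
    return beta, math.sqrt(1.0 / denom)

def mr_egger(bx, by, se_by):
    """MR-Egger: the same weighted regression, but with a free intercept.

    A non-zero intercept suggests directional pleiotropy.
    """
    bx, by = _orient(bx, by)
    w = [1.0 / s ** 2 for s in se_by]
    sw = sum(w)
    x_bar = sum(wi * x for wi, x in zip(w, bx)) / sw
    y_bar = sum(wi * y for wi, y in zip(w, by)) / sw
    sxx = sum(wi * (x - x_bar) ** 2 for wi, x in zip(w, bx))
    sxy = sum(wi * (x - x_bar) * (y - y_bar) for wi, x, y in zip(w, bx, by))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical summary statistics for three instruments (not real data)
bx = [0.10, 0.20, 0.30]      # SNP-exposure effects
by = [-0.05, -0.10, -0.15]   # SNP-FI effects
se = [0.01, 0.01, 0.01]      # standard errors of the SNP-FI effects
print(ivw_estimate(bx, by, se))
print(mr_egger(bx, by, se))
```

When every instrument acts only through the exposure, both methods recover the same slope and the Egger intercept is zero. Adding a constant pleiotropic offset to the SNP-outcome effects shifts the IVW estimate but is absorbed by the Egger intercept, which is why the two are compared in sensitivity analyses.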

| Methylation-FI association analysis
To identify whether risk variants could be related to frailty via DNA methylation differences, we assessed whether the FI-associated genetic variants harboured methylation quantitative trait loci (mQTL) and whether methylation levels in such loci demonstrated further associations with the FI in SATSA (see Supplementary Methods for further details).

ACKNOWLEDGEMENTS
Access to UK Biobank Resource was granted under Application Number 14631. We would like to thank UK Biobank participants and coordinators for this dataset. We also acknowledge the Swedish Twin Registry for access to data. Genotyping for the Swedish Twin Registry cohorts used in this study was performed by the SNP&SEQ Technology Platform in Uppsala, Sweden. The authors would like to acknowledge the use of the University of Exeter High-Performance Computing (HPC) facility in carrying out this work.

CONFLICTS OF INTEREST
No conflicts of interest to declare.

AUTHOR CONTRIBUTIONS
JLA and LCP drafted the manuscript. JLA derived the FI in UK Biobank and performed descriptive statistics on the data. LCP performed GWAS and pathways analysis in UK Biobank. DMW performed QTL look-ups and Mendelian randomization analysis in UK Biobank. YW performed genetic analysis in TwinGene and JJ performed mQTL analyses. PKM and NLP were responsible for acquisition of the TwinGene genetic data. NLP coordinates the SATSA study and, together with SH, was responsible for acquisition of the SATSA methylation data. YL was responsible for imputing and pre-processing the TwinGene genotype data. YW was responsible for pre-processing the SATSA methylation data. All authors contributed to the design of the study, data interpretation and revision of the manuscript.