A trans‐omic Mendelian randomization study of parental lifespan uncovers novel aging biology and therapeutic candidates for chronic diseases

Abstract The study of parental lifespan has emerged as an innovative tool to advance aging biology and our understanding of the genetic architecture of human longevity and aging‐associated diseases. Here, we leveraged summary statistics of a genome‐wide association study including over one million parental lifespans to identify genetically regulated genes from the Genotype‐Tissue Expression project. Through a combination of multi‐tissue transcriptome‐wide association analyses and genetic colocalization, we identified novel genes that may be associated with parental lifespan. Mendelian randomization (MR) analyses also identified circulating proteins and metabolites causally associated with parental lifespan and chronic diseases offering new drug repositioning opportunities such as those targeting apolipoprotein‐B‐containing lipoproteins. Liver expression of HP, the gene encoding haptoglobin, and plasma haptoglobin levels were causally linked with parental lifespan. Phenome‐wide MR analyses were used to map genetically regulated genes, proteins and metabolites with other human traits as well as the disease‐related phenome in the FinnGen cohorts (n = 135,638). Altogether, this study identified new candidate genes, circulating proteins and metabolites that may influence human aging as well as potential therapeutic targets for chronic diseases that warrant further investigation.


| INTRODUC TI ON
Increasing lifespan and promoting healthy living into old age are among top priorities of health care systems around the world. Over the past decades, several policies have been set in place to improve sanitation, access to care and control risk factors for premature mortality. In Western societies, however, although lifespan is steadily improving, an increasing proportion of the population is affected by chronic diseases and multimorbidity, which may be attributable to a significant extent to poor lifestyles as well as environmental and socio-economic factors. Unprecedented efforts will be needed to target these risk factors and decrease age-associated disease burden and improve the quality of life of aging populations. Drugs aimed at improving lifespan may represent additional options that could be used to prevent aging-associated diseases. Increasing evidence suggests that drug targets with genetic support are more likely to be efficient and to be approved by regulatory authorities (Nelson et al., 2015). A better characterization of longevity genomics could therefore contribute to the identification of drug development strategies to delay the onset of aging-associated diseases.
The genetics revolution offers new opportunities to foster our understanding of complex traits such as human longevity (Cotto et al., 2018;Melzer et al., 2020). The definition of longevity in human genetic studies is highly debated and the lack of a universally recognized definition increases the possibility of biases and limits external validation. Studying the genetic architecture of long-lived individuals has shed some light on the loci at which genetic variation may influence human longevity. However, the most comprehensive meta-analysis of genome-wide association studies (GWAS) of survival percentiles only reported one locus (APOE) to be robustly associated with human longevity . On the other hand, other longevity traits such as parental lifespan (the lifespan of deceased or long-lived parents) have also been used to uncover longevity genes. In the most comprehensive GWAS of parental lifespan to date, Timmers et al. (2019) reported several new loci associated with parental lifespan, including APOE. Although it has been suggested that human longevity is not simply limited to the absence of life-threatening diseases, many of the reported genetic variants associated with lifespan are also associated with chronic diseases.

Mendelian randomization (MR) is a burgeoning field of research
that draws on the use of genetic variants as instruments to assess potentially causal relationships between a wide variety of exposures such as risk factors or drug targets and outcomes such as age-related disease or human longevity (Hemani et al., 2018). By taking advantage of the naturally randomized allocation of genetic variation, MR is not subject to many of the biases of observational studies such as reverse causality or random measurement error and less susceptible to confounding and reverse causality. The recent public release of a wide variety of deeply phenotyped gene-trait association datasets has enabled the use of multi-omic datasets as exposures in MR studies.
However, few studies have investigated the long-term consequences of lifelong exposure to over-or under-expressed genes (eGenes), plasma proteins, lipoproteins or metabolites on human lifespan and associated age-related chronic disease. In this study, we aimed at discovering and harnessing novel biological determinants of human longevity by identifying eGenes through a transcriptome-wide association study (TWAS) as well as genetically regulated circulating proteins (eProteins) and metabolites (eMetabolites) using a highly translational MR framework. We performed a TWAS for parental lifespan leveraging expression Quantitative Trait Loci (eQTL) data from 43 non-sexspecific tissues to identify eGenes associated with parental lifespans.
In light of recent investigations reporting that the majority of eQTLs do not necessarily influence protein levels (He et al., 2020;Yang et al., 2020) we used MR to identify key eProteins, and eMetabolites, that may influence human lifespan. In order to discover the role of these genetically regulated traits in human homeostasis while at the same time determining whether they could be effectively and safely targeted to promote healthy aging and human longevity, we report MR findings across the human disease-related phenome.

| Identification of eGenes associated with parental lifespan
We performed a TWAS using the MetaXcan framework (Barbeira et al., 2018) to identify potentially causal effects of eGenes on parental lifespan using GWAS summary-level statistics of the UK Biobank and LifeGen consortium meta-analysis (methods) (Timmers et al., 2019). TWAS is a type of 2-sample MR study that takes advantage of all cis-acting single-nucleotide polymorphisms (SNPs). These SNPs are often in linkage disequilibrium (LD) with each other and using all SNPs for TWAS can ensure optimal statistical power, in contrast to classical MR studies that use one or multiple independent SNPs. This analytical approach uses eQTL mapping from the Genotype-Tissue Expression (GTEx) project across 43 non-sex-specific tissues to estimate gene-level association with summary-level GWAS results.
This strategy led to the identification of new potential eGenes for parental lifespan after correction for multiple testing (Figure 1a). A detailed description of the association of eGenes-parental lifespan associations across all tissues is presented in Table S1. Because TWAS prioritizes multiple genes, some eGenes could be non-causal, owing to sharing of eQTLs or co-regulated genes. TWAS may thus be prone to false positives and spurious gene prioritization, especially in the absence of genetic colocalization, that is, when the lead genetic variant driving gene expression is not among the lead GWAS variants. We therefore took additional steps using genetic colocalization to control for spurious prioritization. After excluding genes with a posterior probability of statistical colocalization PPH4 < 0.75 and after excluding genes found in pleiotropic regions such as HLA, ABO and APOE, the number of parental lifespan eGenes was reduced to 30, spanning 17 loci (Figure 1b). Details on strength of the association, colocalization of these eGenes with parental lifespan as well as the lead tissue (i.e., tissue providing the strongest eGenes-parental lifespan estimate from MetaXcan) are provided in Table 1. Posterior probabilities of genetic colocalization (PPH4) using a wider range of priors are presented in Table S2. We identified tissue-specific expression regulation of several known parental lifespan genes and revealed potentially new parental lifespan genetic signals at the LRP8 (ApoER2), NEK10, CCDC71L, NRG1 and RAD52 loci. Although the use of statistical colocalization helped prioritize several genes, the causal gene potentially linked with parental lifespan could not be identified in all genetic regions. This is the case for the CELSR-PSRC1-SORT1, FURIN-FES, TXNL4B-HP-HPR and LAMA5-AL121832.2-CABLES2-CHRNA4 regions, within which the lead parental lifespan variant was linked with the expression of two or more genes. Multivariable MR, a MR technique used to identify the causal exposure accounting for potential confounders, did not identify the causal eGene from these loci (data not shown). We also observed that among colocalized eGenes for parental lifespan FURIN and HP, gene expression in one tissue was positively linked with parental lifespan while negatively linked with parental lifespan in another tissue. A Sankey diagram presents tissues underlying the genetic signals of colocalized eGenes ( Figure 1c). Table   S3 presents the results of a classical MR approach that reports associations between the level of expression of each eGene presented in Table 1 (in the tissue identified in Table 1) and parental lifespan using inverse variance weighted (IVW)-MR and other outlier robust MR methods. This analysis revealed that most of the colocalized eGenes were nominally associated with parental lifespan with the exception of LRP8 and BECN1. We also performed similar analyses using data on whole blood gene expression levels from 31,684 individuals of the eQTLGen consortium as exposure (Võsa et al., 2018). Although only 17 of the 28 independent and colocalized eGenes could be evaluated using MR, we found that most eGenes showed nominal associations with parental lifespan with the exception of LRP8, POM121C, FURIN, TXNL4B and LAMA5 (Table S4). These analyses revealed the important of considering tissue specificity and TWAS methods over classical MR for the identification of novel eGene-trait associations. In order to gain insight into potential tissue specificity of the parental lifespan associated eGenes, we obtained the tissue-specific gene expression metric (τ) as described by Kryuchkova-Mostacci and Robinson-Rechavi (2017). This analysis revealed that several of the eGenes had tissue-specific expression (τ ≥0.80), including the HP gene, which appears to be liver-specific, in accordance to our initial TWAS finding (Table S5). Other tissue-specific eGenes include KCNK3, NRG1, NEK10, CHRNA3/5 and CHRNA4. We used LocusCompare  to depict colocalization events within our framework and present as an example the colocalization of the top SNP associated with liver HP expression and parental lifespan (Figure 1d).

| Identification of eProteins associated with parental lifespan
Proteins are the target of most medicines. In order to identify circulating factors causally linked with parental lifespan, simultaneously representing therapeutic targets for aging-associated diseases, we performed a systematic screen of the human plasma proteome of the INTERVAL cohort (Sun et al., 2018) using MR. We first obtained robust instruments for protein levels by identifying proteins with ≥4 cis-acting independent variants (r 2 < 0.1) strongly associated with protein levels (p-value <5e −8 ). A total of 279 circulating proteins were available for MR analyses using these criteria. We then used IVW-MR to determine the association between these circulating proteins and parental lifespan. Nine proteins emerged as causal mediators of parental lifespan in the proteome-wide MR analysis, including haptoglobin (HP gene on chromosome 16q22), which has also been identified by GWAS and TWAS of liver-parental lifespan associations ( Figure 2a). Other proteins identified as causal mediators of parental lifespan include asporin (ASPN gene on chromosome 9q22), agouti-signaling protein (ASIP gene on chromosome 20q11), the soluble receptor of insulin-like growth factor 2 (IGF2R gene on chromosome 6q25), plexin-B2 (PLXNB2 gene on chromosome 22q13), interleukin-6 receptor subunit alpha (IL6R gene on chromosome 1q21), soluble intercellular adhesion molecule-5 (ICAM5 gene on chromosome 19p13), ectonucleotidase phosphodiesterase 7 (ENPP7 gene on chromosome 7q25) and Sushi, von Willebrand factor type A, EGF and pentraxin domain-containing protein 1 (SVEP1 gene on chromosome 9q31). In order to determine whether these associations were robust to pleiotropy and outliers, we used additional outlier robust (MR-PRESSO) and modeling MR methods (MR-Egger and contamination mixture). The association of these proteins with parental lifespan using these methods are presented in Table S6. Results of this analysis suggest that the causal association of these proteins with parental lifespan is robust to outliers. The MR method however detected evidence of horizontal pleiotropy for one protein, IL6-sRA. In order to determine if there was general agreement between eProteins and eGenes, we investigated whether the top pQTLs for eProteins also influenced gene expression levels in the GTEx tissues. Results presented in Table S7 suggest that most pQTLs were indeed eQTLs, but in a tissue dependent manner. Altogether, this analysis identified five circulating "protective" proteins that may be positively associated with parental lifespan and four circulating proteins that may be negatively associated with lifespan.

| Identification of eMetabolites associated with parental lifespan
We applied a similar MR framework as above to identify eMetabolites including lipoprotein metabolomic parameters that could be causally associated with parental lifespan using a GWAS-metabolomics dataset, previously described by Kettunen et al. (2016). The association between 123 metabolites from this datasets and parental lifes-  F I G U R E 1 A multi-tissue transcriptome-wide association study of parental lifespan. (a) Miami plot depicting the results of transcriptomewide association studies of parental lifespan in multiple tissues before filtering out eGenes without evidence of genetic colocalization. Each dot represents the effect of an eGene on parental lifespan and the top tissue underlying the signal is shown. eGenes negatively associated with parental lifespan are above the baseline and eGenes positively associated with parental lifespan are below the baseline. Some tissues (for instance those in the brain) were pooled in the legend to facilitate visualization of the tissues responsible for the eGene-parental lifespan associations. (b) Miami plot depicting the results of transcriptome-wide association studies of parental lifespan in multiple tissues after filtering out eGenes without evidence of genetic colocalization (posterior probability of statistical colocalization <0.75) and after excluding genes found in pleiotropic regions such as HLA, ABO and APOE. (c) Sankey diagram depicting tissues that are responsible for the eGene-trait associations. This analysis is based on 43 non-sex-specific tissues from GTEx. (d) LocusCompare plot depicting colocalization of the top SNP associated with liver HP expression and parental lifespan. Each dot represents a single-nucleotide polymorphism (SNP) at the HP locus. In the left panel, these SNPs are plotted to represent their effect on HP expression (top right) against their effect on parental lifespan (bottom right) methods that enabled us to identify causal proteins associated with parental lifespan were used and no evidence of horizontal pleiotropy or distortion of the causal estimates was detected using these methods (Table S8). Overall, results of this analysis suggest that elevated apolipoprotein-B-containing lipoprotein levels are strongly associated with shorter parental lifespan.

| Investigation of genetic colocalization across multiple traits at the HP locus
Given that the only parental lifespan signal supported by GWAS, TWAS and proteome-wide MR was at HP, the gene encoding haptoglobin (Hp), and that variation at the HP locus was linked with F I G U R E 2 Proteome-and metabolome-wide Mendelian randomization analysis of parental lifespan. Volcano plots representing the association between plasma eProteins from the study of Sun et al. (a) and eMetabolites from Kettunen et al. (b) and parental lifespan using inverse variance weighted Mendelian randomization. eProteins and eMetabolites in blue represent those positively associated with parental lifespan and those in red are negatively associated with parental lifespan F I G U R E 3 Statistical colocalization at the HP locus. (a) Regional association plot highlighting the lead SNP associated with liver HP expression (rs34042070) and circulating haptoglobin levels, low-density lipoprotein cholesterol levels and parental lifespan with color key indicating r 2 with the lead SNP. (b) Heatmap depicting the posterior probabilities of colocalization between each trait pair hyperlipidemia, we investigated genetic colocalization across multiple traits at the HP locus. For this purpose, we used a Bayesian algorithm called Hypothesis Prioritization in multi-trait Colocalization (HyPrCOLOC) (Foley et al., , 2021). In order to reduce the risk associated with spurious association from overlapping dataset, we used GWAS summary statistics from the Global Lipids Genetics Consortium for LDL-C. Results presented in Figure 3a suggest strong evidence of genetic colocalization between liver HP expression, circulating LDL levels, and parental lifespan. Upon further investigation, we found that liver HP expression, circulating LDL levels, parental lifespan but not circulating Hp levels showed evidence of genetic colocalization with a posterior probability of colocalization of 0.96 ( Figure 3b).

| Identification of drug repositioning opportunities for human lifespan and chronic diseases
As many of the eGenes, eProteins and eMetabolites were associated with age-related chronic disease, we aimed at identifying in a more comprehensive manner which disease traits were influenced by these exposures. We performed a systematic analysis of potential diseases associated with eGenes and genes encoding eProteins in the Open was also used to intersect eGenes and eProteins with the DrugBank and the Drug Gene Interaction database (Wishart et al., 2018). This analysis identified new targets potentially interacting with parental lifespan eGenes and eProteins (Tables S11 and S12). Finally, we used the network processing data tool GeneMANIA to identify new genes that may show co-expression or physical interaction with our targets (Franz et al., 2018). The inner circle of Figure 4a includes the eGenes was identified in the biological pathway analysis as interacting with several of the parental lifespan associated eGenes such as the LDLR.
Through this strategy, we also identified PCSK9 inhibitors as potential lifespan extending drug candidates. independent SNPs within the PCSK9 region strongly and independently associated with LDL cholesterol in the Global Lipids Genetic Consortium (r 2 < 0.1 and p-value < 5e −8 ). We evaluated the impact of carrier status of PCSK9 SNPs linked with high LDL and the impact of a weighted genetic score (GS) scaled to model a 1 mmol/L increase in LDL-C levels and found a significant association with having lower F I G U R E 4 Investigation of proprotein convertase subtilisin/kexin type 9 (PCSK9) inhibition as a potential therapy for chronic diseases. (a) Interaction network of parental lifespan eGenes and eProteins with other genes and top pathways of parental lifespan eGenes and eProteins (false discovery rate <5e −5 ) from the GeneMANIA prediction server. (b) Odds ratio (OR) and 95% confidence interval (CI) for high parental lifespan in participants of the UK Biobank separated into quartiles of the PCSK9 genetic score (GS) of 11 independent variants associated with low-density lipoprotein cholesterol levels. Analyses were adjusted for age, sex and 10 first ancestry-based principal components. The adjusted odds ratio and 95% CI for high parental lifespan, associated with a 1 mmol/L increase in low-density lipoprotein cholesterol associated with these variants is also shown. (c) Phenome-wide inverse variance weighted Mendelian randomization study depicting the association between variants at the PCSK9 locus weighted for their impact on low-density lipoprotein cholesterol and 1169 binary diseaserelated traits in FinnGen. Associations are reported after correction for multiple testing, which accounted for phenotypic correlation between traits. Arrows pointing up represent higher disease presence and arrows pointing down represent lower disease presence. The dotted line represents the nominal p-value of 0.05 and the green line represents the p-value after correction for multiple testing

| Genetic investigation into PCSK9 as a therapeutic target for chronic diseases
odds of high parental lifespan (Figure 4b). We next performed a phenome-wide MR analysis across the disease-related phenome to gain more insight on the potential benefits/risk ratio of PCSK9 inhibitors. In this phenome-wide MR, the association between 11 SNPs at the PCSK9 locus (weighted for their impact on plasma LDL cholesterol levels) and 1169 binary disease-related traits in 135,638 participants of the FinnGen cohorts was investigated using IVW-MR.
As expected, PCSK9 "genetic inhibition" may target aging-associated diseases through its association with lifelong exposure to low LDL cholesterol levels and associated reduction in the risk of cardiovascular diseases (Figure 4c and Table S13).

| Mendelian randomization studies across the human phenome
We used PhenomeXcan (Pividori et al., 2020), a resource enabling the investigation of eGenes across the human phenome, to identify human traits and diseases than may be causally influenced by parental lifespan eGenes. This analysis revealed 669 eGene-trait associations of interest after correction for multiple testing (  (Collins et al., 2016;Pedersen, 2016).
In support of the causal association of apolipoprotein-Bcontaining lipoprotein and parental lifespan, we found through a gene interaction network that the four genes responsible for the overwhelming majority of the cases for the most prevalent human genetic disease (Defesche et al., 2017), familial hypercholesterolemia, could be linked with genes regulating lifespan (LDLR, PCSK9, APOB and LDLRAP1). These results support the notion that lifelong exposure to low concentration of all apolipoprotein-B-containing lipoprotein particles might be an important determinant of atherosclerotic coronary artery diseases and human longevity. In this work, we also provide evidence that genetic variation in PCSK9 mimicking the effect of PCSK9 inhibitors is linked to lifelong exposure to low LDL cholesterol levels and higher parental lifespan. These results suggest that PCSK9 inhibitors, by reducing LDL cholesterol, could be key to decreasing agingassociated diseases and extending lifespan but need to be tested in adequately powered long-term randomized clinical trials.
A non-colocalized signal was also reported for a liver expressed Our findings add novel information on the regulation of genes that may be linked with parental lifespan. Although Timmers et al. provided evidence that some cis-acting variants associated with parental lifespan did influence gene expression, a TWAS followed by genetic colocalization investigating potentially new eGenes associated with parental lifespan was not performed as we report here. We also linked the genetically regulated genetic expression of these genes with several cardiometabolic traits as well as human diseases. Altogether, these results support the usefulness of combining various genetic investigation techniques relying on mortality risk factors and MR to identify new genetic and biological determinants of complex traits such as human longevity.
Results of this study further confirm the role of smoking behaviors as a strong determinant of lifespan and chronic diseases.
We found that many smoking-associated loci such as cholinergic receptor nicotinic 2, 3/5 and 4 subunits may actually be brain eGenes.

Timmers et al. (2019) did identify variation at CHRNA 3/5 on chromo-
some 15 to be associated with parental lifespan at the genome-wide significance level. Here we report that CHRNA4 on chromosome 20 and CHRNA2 on chromosome 8 as new potential parental lifespan loci (although the latter did not show evidence of genetic colocalization). Unsurprisingly, our search for potential drug targets for agingassociated diseases identified many smoking cessation therapies and the results of our phenome-wide MR studies confirmed the association of these eGenes with atherosclerotic cardiovascular diseases and neoplasms. These results confirm the importance of urgently adopting strict tobacco control policies throughout the world to improve global health, decrease chronic disease burden and human longevity.
In this analysis, the only protein that showed significant association in genome-wide, transcriptome-wide and proteome-wide MR studies was haptoglobin (Hp). We show here that HP is a liver expressed gene and that liver expression of HP, as well as plasma Hp is linked with parental lifespan. A study showed that Hp levels are higher in the plasma of Japanese semisuper centenarians (Miura et al., 2011). Hp is an acute-phase protein and is one of the most abundant proteins in human plasma. It binds free hemoglobin and facilitates its removal from the bloodstream. Our analysis also identified an association between genetically regulated HP genetic expression and hemoglobin levels. Hp also decreases oxidation of apolipoprotein E (encoded by the longevity gene APOE). A previous study showed that exon deletion in HP was associated with lower LDL cholesterol levels (Boettger et al., 2016). The authors proposed a mechanism whereby Hp reduced the oxidation of apolipoprotein E and facilitated its removal from the blood stream, thereby reducing LDL cholesterol levels. Upon further investigation of the HP locus on chromosome 15, we found strong evidence of genetic colocalization between liver HP expression, LDL-C levels and parental lifespan, but not plasma Hp levels. We believe that this finding might be due to the fact that the assessment of plasma Hp levels may not capture the full extent of the various circulating Hp isoforms (1-1, 1-2 and 2-2). Interestingly, our gene network analysis revealed a potential interaction of HP and APOE. Interaction of Hp, APOE and beta-amyloid was also reported in human brain tissues, where this complex might influence central cholesterol metabolism (Spagnuolo et al., 2014). This finding is also supported by our phenome-wide MR study in FinnGen reporting a positive association between liver expression of HP (but not blood levels of Hp) and Alzheimer's Disease. Functional genomics and experimental investigations will be needed to shed light on the mechanisms linking HP and longevity and to determine if HP might represent a potential therapeutic target to improve lifespan.
The majority of previous investigations into the biological determinants of aging have contributed to the identification of several biological mechanisms regulating longevity in model organisms. These mechanisms include, among others, the modulation of the insulin-like signalling pathway, target of rapamycin, sirtuins and NAD+, circadian clocks, mitochondria and oxidative stress, senescence, chronic inflammation, autophagy, proteostasis as well as genomic instability and telomere attrition (Campisi et al., 2019). Although the biological determinants of aging in humans are still incompletely understood, aging mechanisms identified in model organisms appear to contrast with the biological determinants of aging in humans. One exception appears to be chronic inflammation, which might influence longevity in both model organisms and humans. Both our recent report of genetic variation in the interleukin (IL)-6 signalling pathway being associated with parental lifespan (Rosa et al., 2019) and the results of the CANTOS trial, which reported important mortality benefits in patients with elevated high-sensitivity C-reactive protein levels following administration of the IL-1β monoclonal antibody canakinumab (Ridker et al., 2018) support this hypothesis. Our study builds on this previous work showing that genes involved in both innate and immune response may be associated with longevity (Chignon et al., 2020). Here, our proteome-wide MR study identified variation at the IL6R and ICAM5 loci to be asso- Interestingly, a previous analysis also reported a potential association in this gene with longevity in the Framingham Heart Study (Yashin et al., 2010). Finally, the association of CCDC71L expression in arteries with shorter parental lifespan and higher risk of coronary artery disease in FinnGen also warrants further exploration.
The results of this study need to be interpreted in the context of certain limitations of our approach, beginning with the definition of the study outcome. By definition, parents have to reach their reproductive age to be included (i.e., have children that were enrolled in the study cohorts), which may introduce selection bias. Although efforts were made to exclude relatives as well as parents that died before the age of 40, the cause of parental death was not available.
Parents were also exposed to a different environment than their children (exposure to world wars, famine, higher rate of smoking, heterogeneous living conditions, sanitation and access to medical care, etc.). This observation, combined with the fact that parental lifespan is a self-reported trait, may have introduced misclassification. The effect sizes of the variants that were used to assess our study exposures (eGenes, eProteins and eMetabolites) were obtained in modern cohorts from individuals not necessarily exposed prediction models were performed using elastic net, a regularized regression method, as implemented in the PredictDB pipeline (Barbeira et al., 2018;Gamazon et al., 2015). We used SNPs with a minor allele frequency greater than 1% from European ancestry participants.
Expression of protein coding, antisense, long intergenic non-coding and micro RNA was considered. The first three ancestry-based principal components, a set of covariates identified using the probabilistic estimation of expression residuals method (Stegle et al., 2012) with genotyping platform and sex were used as covariates.

| Assessment of genetic colocalization
Genetic colocalization was used to filter our LD-contaminated associations. A stringent Bayesian model is used to estimate the posterior probability of each eGene containing a single eQTL affecting both the eGene and the outcome(s) of interest. We used COLOC R package to estimate the probability of five hypotheses: H0 corresponds to no eQTL and no GWAS association, H1 and H2 correspond to eQTL association but no GWAS association or vice-versa, H3 corresponds to eQTL and GWAS associations but independent signals (weaker eQTL signal or GWAS hit) and H4 corresponds to shared eQTL and GWAS signal (the lead eQTL is also among the top GWAS hits) (Giambartolomei et al., 2014). We kept eGenes with evidence of genetic colocalization (PP.H4 > 0.75). To assess the role of the prior, we varied the critical parameter p12, which codes for the prior probability that a variant is associated with the exposure and the outcome (1e−4, 1e−5 and 1e−6). For cis-eQTL analyses, a p 12 = 1e−4 threshold is usually suggested. Table 1 presents the PPH4 results with the prior set at the more conservative threshold of p 12 = 1e−5. Locuscompare function from the LocuscompareR R package  was used to depict the colocalization events. This software enables visualization of the strengths of eQTLs and outcomes associations by plotting pvalues for each within a given genomic location, thereby contributing to distinguish candidates from false-positive genes. To account for multiple testing, a Bonferroni correction for all gene-tissue pairs: <0.05/number of gene-tissue pairs tested was applied.

| Associations of eGenes with parental lifespan using Mendelian randomization
The association of eGenes linked with parental lifespan in TWAS analysis and parental lifespan was investigated using GTEx and eQTL-Gen. We used IVW-MR with the mr function from TwoSampleMR package. The IVW-MR is comparable to performing a meta-analysis of each Wald ratio. Additional MR analyses were performed to evaluate heterogeneity (intercept p-value, from MR Egger (Bowden et al., 2017)) and the presence of outliers. We used MR-PRESSO (Verbanck et al., 2018), an outlier robust method, to detect the presence of outliers (variants potentially causing pleiotropy and influencing causal estimates) and causal estimates were obtained before and after excluding outliers. We also used the contamination mixture method as this method was recently identified as a robust modelling method to identify causal relationships in the presence of potentially invalid genetic instruments (Burgess et al., 2020).

| Tissue specificity of gene expression
The tissue-specific gene expression metric (τ) was obtained from all genes identified by TWAS. We used the formula from Yanai et al. (2005) to compare the level of gene expression across selected tissues based on RNA sequencing data from European ancestry donors from GTEx. All the genes with expression <1 RPKM were set as not expressed. The RNA-seq data were first log-transformed. After the normalization, a mean value from all replicates for each tissue separately was calculated. A τ value closer to 1 indicates tissue specificity while a τ value closer to 0 indicates ubiquitous gene expression. We considered that eGenes had tissue-specific expression when their τ statistic was ≥0.80.

| Associations of eProteins and eMetabolites with parental lifespan
We used GWAS summary statistics from the INTERVAL cohort (Sun et al., 2018) to identify eProteins that could potentially be causally associated with parental lifespan. Relative concentrations of 3,622 plasma proteins or protein complexes were assayed using an aptamer-based multiplex protein assay (SOMAscan) in 3,301 participants from the INTERVAL study. A minimum of 4 instrumental variables (IVs) were constructed using cis-pQTLs in a 1Mb window, clumped using plink 1.9 with a r 2 < 0.1 (from 1000 genomes phase physical distance threshold of 250 Kb. The MHC (6:28,477,448,354) and APOE regions were removed as well as ABO gene and immunoglobulins. Association of 279 circulating eProteins and parental lifespan was assessed with IVW-MR using mr function from TwoSampleMR package and a Bonferroni correction was applied (pvalue = 1.79e −4 (0.05/279 eProteins). Additional MR analyses were performed to evaluate heterogeneity (MR Egger (Bowden et al., 2017)) and the presence of outliers (MR-PRESSO (Verbanck et al., 2018)). We also used the contamination mixture method (Burgess et al., 2020). The same MR framework was used to identify eMetabolites potentially associated with parental lifespan. We used GWAS summary on 123 metabolites from the study of Kettunen et al. (2016) In this study, 123 blood lipids and metabolites were measured in 24,925 individuals from 10 European cohorts using highthroughput nuclear magnetic resonance spectroscopy. Metabolites contamination mixture methods were also used as described above.

| Investigation of genetic colocalization across multiple traits at the HP locus
We investigated genetic colocalization across multiple traits at chromosome 15 within 1 Mb of the HP gene between liver expression of the HP gene (from GTEx (Lonsdale et al., 2013)), plasma Hp levels (from INTERVAL (Sun et al., 2018)), plasma LDL-C levels (from GLGC (Willer et al., 2013)) and parental lifespan (from UK Biobank and LifeGen (Timmers et al., 2019)) using the hyprcoloc R package. HyPrColoc is a deterministic Bayesian algorithm using GWAS summary statistics that can detect colocalization across vast numbers of traits simultaneously (Foley et al., , 2021). Posterior probabilities of colocalization heatmap was performed with the sensitivity.plot function from the hyprcoloc R package with default settings (regional and alignment thresholds: 0.6, 0.7, 0.8 and 0.9, prior probabilities of colocalization: 0.98, 0.99 and 0.995). Regional association plot was obtained from the stack_assoc_plot function from the gassocplot R package and the 1000G phase 3 LD reference panel (European ancestry).

| Resources used to identify drug candidates for aging-associated diseases
We used the Open Targets Platform to determine the potential ther-

| Variation in PCSK9 and parental lifespan using individual participant data in the UK Biobank
We selected independent SNPs (r 2 < 0.10) at the PCSK9 locus SNPs independently associated with LDL-C levels. We constructed a weighted genetic score (GS) of SNPs at the PCSK9 locus scaled to a 1 mmol/L reduction in LDL-C levels. The associations between the GS and parental lifespan in the UK Biobank was tested using logistic regression adjusted for age, sex and the first 10 ancestrybased principal components using R (version 3.5.1). We used the definition of Pilling et al. (2017) to assess high parental lifespan in participants of the UK Biobank. Only participants between 55 and 69 years were included. Participants who were adopted, had missing information on age of parents' death or who had parents who died at a young age (<46 for the father and <57 for the mother) were excluded from these analyses. Parents were separated into three categories: long-lived (father still alive and older than 90 or father's age of death ≥90 and mother still alive and older than 93 or mother's age of death ≥93), medium-lived (age of death ≥66 and <89 for the father and ≥73 and <92 for the mother) and shortlived (age of death ≥46 and <65 for the father and ≥57 and <72 for the mother). We defined high parental lifespan as at least one long-lived parent (i.e., long/long or long/medium). The control group included participants with parents considered as short-or medium-lived (i.e., short/short, short/medium or medium/medium).
Participants discordant for mothers' and fathers' age of death (one long-lived parent and one short-lived parent) were also excluded from the present analyses. These analyses were conducted under UK Biobank data application number 25205.

| Effect of parental lifespan eGenes on other traits and human disease
The association between parental lifespan eGenes (presented in Table 1) and human traits and disease was investigated using PhenomeXcan (Pividori et al., 2020 The fastENLOC-assessed regional colocalization probability (RCP) is also presented.

| Association of eGenes, eProteins, eMetabolites and PCSK9 with disease traits in FinnGen
For MR analysis, cis-eQTL were identified using QTLtools (Delaneau et al., 2017) with the first three principal components, a set of covariates identified using the probabilistic estimation of expression residuals method (Stegle et al., 2012) with genotyping platform and sex as covariates. Variants were included if they had a minor allele frequency ≥1%. Missing genotypes were imputed as the mean of the other participants' genotypes. The cis-window size was set to 1Mb. We used the get_r_from_pn and get_r_from_lor functions from the TwoSampleMR package to calculate variance of each eQTL for quantitative and binary data respectively. SNPs that explain more of the variance in the outcome compared to the exposure was not included as instrumental variables (IV). Removing eQTLs that show evidence of reverse causality is of particular importance in the setting of human longevity studies as aging might influence genes in a tissue-specific manner (Yang et al., 2015). IVs were clumped using plink 1.9 with a r 2 < 0.1, a p-value <0.01 and a physical distance threshold of 250 Kb. If an eGene was significant in multiple tissues, the tissue that provided the strongest eGenes-parental lifespan estimate from S-PrediXcan was prioritized. Instruments for eProteins and eMetabolites were the same as those used to determine their association with parental lifespan. In FinnGen, a method called SAIGE (Scalable and Accurate Implementation of Generalized Mixed Models), which is based on generalized mixed models was developed to control for case-control imbalance, sample relatedness and population structure. GWAS was performed using over 16 million genetic markers genotyped with the Illumina or Affymetrix arrays or imputed using the population specific SISu v3 reference panel. Variables included in the models were sex, age, the 10-main ancestry-based principal components and genotyping batch. This analysis was performed using publicly available GWAS summary statistics from the FinnGen data freeze 3 (June 16, 2020).
Outcomes with <400 cases were excluded leaving 1169 traits for PheWAS. Since several of the phenotypes that were investigated were genetically correlated, accounting for all phenotypes as they were independent may be too conservative. We therefore used the PhenoSpD tool , to estimate the number of independent tests that are performed. PhenoSpD applies GWAS summary statistics to LD score regression to estimate the phenotypic correlation matrix of the traits and estimates the number of independent variables among the traits. We considered associations that had a p-value <6.5e −5 (0.05/773 traits) (instead of 1169) to be statistically significant. We used IVW-MR to determine the association between genetic instruments for eGenes, eProteins, eMetabolites, single SNPs without TWAS (APOE, LPA, LDLR, MAGI3 and 9p21 region) or LDL-C reductions associated with variants at the PCSK9 locus with these disease-specific binary traits.
We used genetic instruments mentioned above while keeping the SNPs in common between the compared datasets.

ACK N OWLED G EM ENTS
We would like to thank all study participants as well as all investigators of the studies that were used throughout the course of this investigation.

CO N FLI C T O F I NTE R E S T
B.J.A. is a consultant for Silence Therapeutics and Novartis and has received research funding from Ionis Pharmaceuticals, Pfizer and Silence Therapeutics.

AUTH O R CO NTR I B UTI O N S
N.P., W.P., J.B., C.C. and Z.L. analyzed the data, N.P., C.C. and B.J.A. made the figures, N.P. and B.J.A. drafted the manuscript. All the others interpreted the data and critically reviewed the manuscript prior to submission.

PATI ENT CO N S ENT S TATEM ENT
All contributing datasets obtained patient consent.

DATA AVA I L A B I L I T Y S TAT E M E N T
The summary statistics of the GWAS meta-analysis of parental lifes-