Proteomic assessment of serum biomarkers of longevity in older men

Abstract The biological bases of longevity are not well understood, and there are limited biomarkers for the prediction of long life. We used a high‐throughput, discovery‐based proteomics approach to identify serum peptides and proteins that were associated with the attainment of longevity in a longitudinal study of community‐dwelling men age ≥65 years. Baseline serum samples from 1196 men were analyzed using liquid chromatography–ion mobility–mass spectrometry, and lifespan was determined during ~12 years of follow‐up. Men who achieved longevity (≥90% expected survival) were compared to those who died earlier. Rigorous statistical methods that controlled for false positivity were utilized to identify 25 proteins that were associated with longevity. All these proteins were in lower abundance in long‐lived men and included a variety involved in inflammation or complement activation. Lower levels of longevity‐associated proteins were also associated with better health status, but as time to death shortened, levels of these proteins increased. Pathway analyses implicated a number of compounds as important upstream regulators of the proteins and implicated shared networks that underlie the observed associations with longevity. Overall, these results suggest that complex pathways, prominently including inflammation, are linked to the likelihood of attaining longevity. This work may serve to identify novel biomarkers for longevity and to understand the biology underlying lifespan.

of specific candidate biomarkers that are hypothesized to reflect relevant outcomes (Sanchis-Gomar et al., 2015), but some have also used more broad ranging analytical approaches aimed at identifying biomarker signatures, for instance using metabolomics (Cheng et al., 2015).
Mass spectrometry (MS)-based proteomic methods have been successfully adopted for biomarker discovery (Huang et al., 2017), but such proteomic approaches have been limited by technically demanding and time-consuming methods, and have had inherently low throughput. Previous studies were frequently restricted to relatively small sample sizes that are inadequate to assess associations on a population scale. Newer approaches, such as aptamer-based or antibody-based affinity proteomics, allow multiplexing and larger sample sizes but are constrained to the evaluation of candidate proteins (Benson et al., 2019; Gold et al., 2010).
We developed high-throughput and sensitive MS-based methods that allow a broad, discovery-based assessment of the serum proteome (Baker et al., 2010, 2014) and have used those methods to interrogate samples from a large longitudinal cohort of older men to identify proteins associated with bone loss and mortality (Nielson et al., 2017; E. S. Orwoll et al., 2018). Similar pipelines for large-scale discovery proteomics have been employed in several other pioneering studies (Geyer et al., 2016; Price et al., 2017; Surinova et al., 2015).
We have used discovery proteomics in a 12-year longitudinal study of older men to identify serum proteins that are associated with longevity and have explored the biological pathways that may be involved in their regulation. Some of these proteins are well documented to be associated with longevity, while others have not been previously reported. These results illustrate the utility of this approach for biomarker discovery, provide candidate protein biomarkers potentially useful to identify individuals who may be long-lived, and offer insight into the biological basis of longevity.

| Study participants
We utilized serum samples and phenotypic data from men ≥65 years enrolled in a large, prospective, longitudinal study (MrOS) (http://mrosdata.sfcc-cpmc.net). Of the entire MrOS cohort (N = 5994), a randomly selected subcohort (N = 2473) had serum proteomic assessments of baseline serum samples and were followed prospectively for 11.9 ± 4.6 years. In these analyses (the analytic cohort), we compared those with proteomic measures who achieved longevity, defined as reaching or exceeding the 90th percentile of expected age for their birth cohort (N = 554), to those who died before achieving longevity (N = 642) (Figure 1). Fewer than 1% of long-lived men died within 5 years of baseline, and all men in this group lived at least 3.7 years (the 10th percentile of follow-up time was 9.1 years). Just 13 (2%) of the shorter-lived men died within 1 year of baseline, and an additional 30 (5%) in this group died between 1 and 2 years of follow-up. Thus, our study design explicitly mitigated the risk of inadvertently detecting proteins associated with life-threatening acute illness effects. Potential confounding by age was minimized by requiring complete overlap in baseline age distributions between the two groups (see 4.2 Analytic sample). The characteristics of the overall MrOS cohort, the randomly selected subcohort with proteomic measurements, and the analytic cohort are shown in Table 1. The randomly selected subcohort with proteomic measures was similar to the entire MrOS cohort. In the analytic cohort, the mean age at baseline was 77.4 ± 3.2 years (range 73-84).
Generally, these men were similar to the overall MrOS cohort, apart from being slightly older on average due to the age selection criteria.
Compared to the shorter-lived men, the men who achieved longevity were slightly older, had minimally lower BMI, and had slightly better levels of self-reported health, scores in the physical component of the SF-12 and scores on the Healthy Aging Index (lower scores are better).

| Proteins associated with longevity
We analyzed 3831 serum peptides mapping to 224 proteins. The raw data are available as a MassIVE dataset (accession MSV000085611).
Protein identifiers used in the MassIVE files are provided (in the "Symbol" column) in Table S1. The effect sizes of the associations of peptides with longevity are shown in Figure 2a. Protein-level meta-analysis of the peptide associations revealed 25 proteins associated with longevity (Table 2), defined as having a meta-analyzed fold change of at least 1.1 in magnitude and a posterior probability of less than 0.1 that the true effect is opposite in direction to the estimate. An additional 34 proteins (second tier) had significant associations with longevity (Table S2), but with slightly smaller fold changes and slightly higher posterior probabilities of incorrect sign (see 4.4 Statistical analyses). The effect sizes of the protein-level associations are shown in Figure 2b. All 25 strongly associated proteins (and all but 3 of the 34 second-tier proteins) were of lower abundance in those men who achieved longevity (fold changes −1.10 to −1.22) than in shorter-lived men. Key quantitative results from the mass spectrometric data analysis are available in Table S3, including the protein identifiers, number of peptides quantitated, mean relative abundance levels for long-lived men and controls, fold changes, Bayesian posterior probabilities, and technical coefficients of variation (CVs).
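The first-tier criteria stated above (fold change of at least 1.1 in magnitude, posterior probability of incorrect sign below 0.1) can be illustrated with a minimal sketch. The class name, function name, protein labels, and numeric values below are hypothetical and are not taken from Table 2 or Table S3; the sketch assumes signed fold changes in the convention used in the text (for example, −1.15 for lower abundance in long-lived men).

```python
from dataclasses import dataclass


@dataclass
class ProteinResult:
    symbol: str            # hypothetical protein label
    fold_change: float     # signed meta-analyzed fold change, e.g. -1.15
    p_wrong_sign: float    # posterior probability that the effect has the opposite sign


def is_tier1(result, min_fold=1.1, max_p_wrong_sign=0.1):
    """Apply the two first-tier thresholds described in the text."""
    return abs(result.fold_change) >= min_fold and result.p_wrong_sign < max_p_wrong_sign


# Illustrative values only (not study data).
results = [
    ProteinResult("PROT_A", -1.15, 0.01),  # passes both thresholds
    ProteinResult("PROT_B", -1.05, 0.02),  # fold change too small
    ProteinResult("PROT_C", 1.12, 0.30),   # sign too uncertain
]

tier1 = [r.symbol for r in results if is_tier1(r)]
print(tier1)
```

The exact thresholds that define the second tier are not given in this section, so the sketch does not attempt to classify it.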
The relative abundance levels of the 25 longevity-associated proteins in the members of the analytic sample are shown in the heatmaps in Figure 2c. Among the men who achieved longevity, there was a large fraction with a pattern of consistently lower abundance levels. That pattern was present in a considerably smaller segment of the men who did not reach longevity, and in the latter group, a larger fraction had a pattern of consistently higher abundance of the longevity proteins.
In a clustering analysis of the complete set of identified serum proteins in the proteomics cohort, there was evidence of 12 clusters of intercorrelated proteins, and those clusters were similar when the clustering was performed separately in long-lived men and in controls. The 25 longevity-associated proteins grouped into 5 clusters (Figure S1) showing modest to high levels of pairwise correlation (r = 0.33-0.89) among proteins in each cluster, suggesting that proteins within a cluster may share some underlying regulation.
Although it is difficult to equate tissue levels to circulating protein levels, to explore the tissues that were likely to contribute to the serum proteins associated with longevity, we used stud- iv.org/content/10.1101/797373v2). Figure S2 shows the tissues that were most frequently described as having high levels of protein expression of the longevity-associated proteins. Cardiovascular and neurological tissues were most prominent. The proteins within clusters (above) did not appear to originate more often from unique tissue sources.
TABLE 1 (fragment) Healthy Aging Index (0-10) c: 3.0 ± 1.6; 2.9 ± 1.6; 3.1 ± 1.6; 3.0 ± 1.5; 3.3 ± 1.7. a To reach or exceed the 90th percentile of expected age for birth cohort. b Higher score is better. c Lower score is better.

| Prediction of longevity and mortality
To examine the hypothesis that baseline levels of proteins were predictive of subsequent longevity and mortality, we used several complementary analytical approaches. First, the ability of protein signatures to predict longevity was analyzed using receiver operating characteristics (ROC). All possible combinations of the 25 most robustly associated proteins were evaluated for ability to separate the long-lived and control groups, and the most informa-
Second, when considering either the entire cohort with proteomic measures (N = 2473) or the analytic cohort, higher abundances of each one of these 25 proteins were individually predictive of earlier mortality. Table 3 shows the age-adjusted hazard ratios of mortality corresponding to standard-deviation changes in protein abundance for the full proteomics cohort (hazard ratios 1.03-1.32, p < 0.0001 for most proteins).
An example of the relationships between protein abundance and death rate in the entire proteomics cohort is shown in Figure 3b; men in the highest tertile of C7 levels had a higher risk of death, and those in the lowest tertile a lower risk, compared to those in the middle tertile. As age increased and time to death shortened (see below), the differences were reduced. These analyses yielded similar results in the analytic cohort, including both the long-lived men and those who died earlier.
Finally, while the average abundance of the 25 longevity-associated proteins was lower, at any age, in the long-lived men than in those who died earlier, in both groups the levels of these proteins tended to be higher in individuals whose time to death was shorter, even after adjustment for baseline age (p = 0.026). In contrast, the average abundance of all the other measured proteins that were not associated with longevity was unrelated to the time to eventual death (p = 0.68) (Figure 3c). Although protein abundance was assessed at only one time point, these results suggest that an increase in these longevity-associated proteins heralds impending death.
FIGURE 2 (caption fragment) ... (Table 2); small black dots = tier 2 proteins (Supplemental Table 2); small gray dots = nonsignificant proteins. (c) Heatmaps showing the standardized relative abundance of the 25 tier 1 serum proteins associated with longevity in each of the 554 men who achieved longevity during observation (cases, top) and the 642 men who died before achieving longevity (controls, bottom). The z-scores for all protein associations were precalculated using the full cohort, so the z-score values (represented as heatmap colors) are directly comparable between the two panels.
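The ROC analysis mentioned in this section evaluates how well a protein-based score separates long-lived men from controls. As a minimal sketch of that idea, the area under the ROC curve can be computed as the probability that a randomly chosen long-lived man has a higher score than a randomly chosen control. The function name and the abundance values are illustrative assumptions, and the score used here (the negated abundance summary, since lower abundance accompanies longevity) is a simplification rather than the signature evaluated in the study.

```python
import numpy as np


def roc_auc(scores_pos, scores_neg):
    """AUC as P(score_pos > score_neg), counting ties as one half."""
    pos = np.asarray(scores_pos, dtype=float)[:, None]
    neg = np.asarray(scores_neg, dtype=float)[None, :]
    wins = (pos > neg).sum() + 0.5 * (pos == neg).sum()
    return float(wins / (pos.size * neg.size))


# Illustrative standardized abundance summaries (not study data).
abundance_long_lived = np.array([-0.8, -0.2, 0.1])
abundance_controls = np.array([-0.1, 0.4, 0.9])

# Lower abundance is associated with longevity, so negate abundance
# to obtain a score that is higher for long-lived men.
auc = roc_auc(-abundance_long_lived, -abundance_controls)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation; evaluating every combination of 25 proteins would repeat a calculation like this one for each candidate signature.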

| Proteins associated with longevity are associated with better health status
Centenarians have been reported to have an unexpectedly low burden of adverse health conditions (Gellert et al., 2018). Similarly, in the analytic cohort, lower levels of an overall abundance score summarizing the 25 longevity-associated proteins (lower score indicates lower protein abundance overall; see Protein abundance summary score in 4.4 Statistical analyses) were significantly associated with a better self- were the same when the entire proteomics cohort was considered.
Moreover, with the exception of the proteins in Cluster 5 (CD5L and IGHM; see Figure S1), each of the protein clusters and all of the proteins in each cluster were individually correlated with these health indices in the same direction as the overall protein score, on average at similar magnitudes but varying (ranging from ~0.02 to ~0.20 in size) depending on the health index and protein. CD5L and IGHM were not correlated with any of the health indices.

| Relationship of proteins associated with longevity, mortality and bone loss
In analyses of the MrOS cohort, we previously reported proteins that are associated with early mortality (E. S. Orwoll et al., 2018) and with accelerated bone loss (Nielson et al., 2017), and we explored to what extent the proteins associated with longevity are also associated with these other two phenotypes. In the Venn diagram in Figure 4a, it is apparent that there is considerable overlap among the proteins associated with longevity, mortality, and accelerated bone loss. However, almost universally, the directions of the associations with longevity are in the opposite direction to those of mortality and bone loss. Figure 4b shows the magnitude and direction of fold changes for each phenotype with each of the proteins referenced in the Venn diagram, and with few exceptions, the protein signature for bone loss and mortality in the top bands of the heatmap is similar and diametrically opposed to that of longevity in the bottom band.
To show these relationships in a more quantitative way, we also plotted the values of longevity protein fold changes versus mortality and bone loss fold changes (Figure S3).

| Pathway analyses: identification of upstream regulators of longevity-associated proteins
Ingenuity pathway analysis (IPA) was used to identify upstream regulators and pathways that could be responsible for the proteomic patterns associated with longevity. Upstream regulators are compounds whose biological actions can be directly linked to a protein of interest. The upstream regulators with activation scores |Z|>2 (either activated or inhibited) of the longevity-associated proteins are shown in Table 4, along with the associated target proteins in our dataset. Of note, accounting for the direction of association in each measured protein that is regulated by the upstream regulator, almost all the upstream regulator pathways highlighted are predicted to be inhibited in long-lived men. The pathways with high activation scores were all associated with multiple longevity-associated proteins, and some proteins were members of multiple (>3) upstream regulatory pathways, suggesting a potential convergence of multiple pathways resulting in an altered protein abundance observed in long-lived men in this study.
Several additional analyses supported the relevance of the IPA predictions of upstream regulators. First, serum concentrations of two upstream regulators with high activation scores in our IPA analyses were available from independent ELISA assays (Cauley et al., 2016): IL-6 and IL-10, with activation scores −2.834 and −2.166, respectively. As predicted by the IPA, long-lived men had IL-6 and IL-10 concentrations that were lower than other men (22% lower, p < 0.001, and 11% lower, p = 0.042, respectively). Similarly, serum levels were available for 3 other upstream regulators with less robust activation scores: TNF (activation score = −1.14), TNF receptor 1 (activation score = −1.41), and TNF receptor 2 (activation score = −0.85). Their levels were also, as predicted by IPA, lower in the long-lived men: 12% lower for TNF (p = 0.18), 13% lower for TNF receptor 1 (p < 0.001), and 6% lower for TNF receptor 2 (p = 0.004). Second, the relative abundance levels of proteins assessed in the present MS-based analyses that were not associated with longevity, but that were linked by IPA to upstream regulators, were also generally in the directions predicted by IPA. For example, 73% (29 of 40) of the abundance levels of proteins linked to IL-6 regulation were in the direction predicted, and 83% (10 of 12) of the relationships were as predicted for proteins linked to IL-10 regulation.
FIGURE 3 (caption fragment) ... (b) The relationship between C7 abundance at baseline age and death rate in the entire proteomic cohort (N = 2473). Shown are the hazard ratios (HR) of death (± 95% CI) for the highest and lowest tertiles of C7 abundance compared to the middle tertile across years of age at baseline. (c) A plot of an abundance index of the 25 tier 1 longevity-associated proteins (left) as a function of time to death, compared to an abundance index of all 165 measured proteins not associated with longevity (right).
Using IPA analyses, we also examined how the activation patterns of the upstream regulators of the proteins associated with longevity may differ from those of the mortality and accelerated bone loss phenotypes. In Figure 5a, we show the 35 regulatory pathways with strong activation scores for longevity (activation score |Z|≥2), along with a heatmap displaying the patterns of activation for the regulators across the 3 phenotypes. The directions of activation or inhibition in longevity are uniformly opposite those for bone loss and mortality. Finally, with the hypothesis that the 5 clusters of intercorrelated longevity-associated proteins described above may reflect shared biological foundations, we examined the upstream regulators that might be involved in regulating the proteins in each cluster. An IPA-derived network analysis of the largest cluster (including the proteins in the cluster and other biologically associated proteins) is shown in Figure 5b, illustrating the importance of IL6 (which participates in 31 of the 120 connections in the network) and alpha V integrin (ITGAV, participating in 10 connections, including one to IL6) in its regulation. By way of comparison, the average number of connections per gene (beta index) for the network is just 2.14, indicating that the typical amount of connectivity for genes in the network is considerably lower than that displayed by ITGAV and especially IL6. Excluding the connection between them, these two genes account for roughly one third (39/120) of the total number of edges in the network. Similar network analyses of the other 4 highly intercorrelated clusters containing at least 1 of the 25 longevity proteins are shown in Figure S4. Each of these networks contains at least one hub gene with a node degree 3 to 5 times larger than the beta index for the network.
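The connectivity figures quoted for the Cluster 1 network can be checked with simple arithmetic. The sketch below uses only the counts stated in the text; the variable names are illustrative, and the node count is inferred from the beta index (edges divided by nodes) rather than reported by the authors.

```python
# Figures quoted in the text for the Cluster 1 network.
total_edges = 120        # connections in the network
il6_edges = 31           # connections involving IL6
integrin_av_edges = 10   # connections involving alpha V integrin
shared_edges = 1         # the single connection joining the two hubs
beta_index = 2.14        # average connections per gene (edges / nodes)

# Edges touching either hub, leaving out the edge that joins the two hubs.
hub_edges = (il6_edges - shared_edges) + (integrin_av_edges - shared_edges)
hub_share = hub_edges / total_edges

# Node count implied by beta index = edges / nodes (an inference, not a reported value).
implied_nodes = total_edges / beta_index

# How many times larger each hub's degree is than the network average.
il6_vs_beta = il6_edges / beta_index
integrin_av_vs_beta = integrin_av_edges / beta_index

print(hub_edges, round(hub_share, 3), round(implied_nodes, 1))
print(round(il6_vs_beta, 1), round(integrin_av_vs_beta, 1))
```

Comparing a node's degree with the beta index is one simple way to flag hub genes, which is the comparison the text makes for the networks in Figure S4.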
A comprehensive mapping between the gene names used by IPA (and in this paper) and the corresponding UniProt names and identifiers for our 224 measured proteins is provided in Table S1.

| DISCUSSION
High-throughput proteomic analysis of population-based study samples provides the opportunity to identify biomarkers for important health outcomes. Using those methods, we identified serum proteins that are associated with longevity in a longitudinal study of older, community-dwelling men with a long follow-up period. Further, we used those findings to explore biological pathways that might be involved. The majority of the proteins we identified have been associated with inflammation, although some have multifunctional biological roles and their associations with longevity may reflect other mechanisms. Pathway analyses suggested that several major upstream regulators may be causally responsible for the associations. The proteins and regulatory pathways that are associated with longevity are also associated, but in opposite directions, with the adverse health outcomes of bone loss and mortality. Moreover, late-life disability and morbidity are lower among people who experience extreme longevity (Hitt et al., 1999). In concert with those findings, the longevity-associated proteins in this study were associated with several indices of better health status. Finally, we observed a gradual increase in the abundance of longevity-associated proteins as time to death shortened in both long-lived and shorter-lived men. Although
TABLE 3 Hazard ratios of longevity-associated proteins with mortality. The hazard ratios were adjusted for baseline age of the participants.
(Emilsson et al., 2018; Menni et al., 2015; Sun et al., 2018; Tanaka et al., 2018). Our approach offers an unbiased opportunity to identify serum peptides/proteins associated with long life.
In fact, our unbiased approach yielded longevity-associated proteins that were also measured in a recent analysis using a very large aptamer-based array (Emilsson et al., 2018), but also identified a number (4 of 25; FCGR3A, HPR, FCGBP, MCAM) that were apparently not assessed by that aptamer approach, highlighting the benefit of discovery proteomics.
The fact that many proteins were associated with longevity is not only biologically interesting but also supports the value of population proteomic approaches for identifying peptides and proteins with potential as biomarkers. MS-based discovery proteomic methods are evolving quickly and more in-depth measurements
FIGURE 4 Comparison of protein associations for longevity, mortality, and bone loss. (a) Venn diagram of the overlap of proteins associated with longevity, mortality, and bone loss. The accompanying table lists the overlapping proteins, with protein overlap groups color-coded to match the regions of the Venn diagram. Shown in parentheses are the directions of protein associations for each phenotype in the order (left to right): longevity, bone loss, mortality. (b) A heatmap of the relative protein abundance of proteins associated with longevity, mortality, and/or bone loss. Shown are the signed fold changes for all proteins that were significantly associated with at least one of the phenotypes using the same criteria for significance that are used in this study (meta-fold change at least 1.1 in magnitude and meta-p less than 0.1).
TABLE 4 Regulatory pathways for longevity-associated proteins. Tier 1 proteins associated with longevity appear in boldface, tier 2 proteins appear with neither boldface nor parentheses, and proteins that we did not find significant for longevity but that were linked to the upstream regulators in the IPA knowledge base appear in parentheses. UniProt names and identifiers corresponding to the gene names appearing in the table can be found in Supplemental Table S1. Columns: Upstream Regulator; Activation z-score; Target proteins measured in cohort.
(Li et al., 2016). Finally, cystatin C levels were lower in men who achieved longevity.
Cystatin C is a small molecular weight protein that is typically used as a marker of renal function, but higher levels have also been linked to the development of late-onset Alzheimer's Disease (Chuo et al., 2007), and it has been implicated in diverse aspects of immunity/inflammation and apoptosis (Zi & Xu, 2018).
To provide biological insight into the upstream regulators that could be involved in the generation of the proteomic patterns observed in our data, we utilized IPA analysis that is based on cu-
Kirkland & Tchkonia, 2017). In fact, many of the longevity-associated proteins (11 of 25) are considered SASP, and four (galectin-3-binding protein, CD166 antigen, 72 kDa type IV collagenase, cystatin-C) are considered "core" SASP-proteins that are consistently stimulated by a variety of senescent stimuli (Basisty et al., 2020). Moreover, there is overlap between the longevity-associated proteins we have identified (e.g., S100A9, S100A8), or their upstream regulators (e.g., APOE, IL1β, IL6, IL17), and the genes that are differentially expressed in response to caloric restriction and are related to inflammation (Ma et al., 2020).
FIGURE 5 (caption fragment) ... Orange shades indicate IPA-predicted activation and blue shades indicate predicted inhibition of the regulator. (b) Network analysis (IPA) of the largest cluster (Cluster 1) of intercorrelated serum proteins associated with longevity. To derive these networks, we used IPA network-building tools in a systematic and algorithmic manner to connect the proteins appearing in the clusters to one another and to annotate their relationships to other closely connected proteins. Green symbols show measured proteins that were decreased in long-lived men, red symbols measured proteins that were increased, and blue symbols unmeasured proteins or regulators that are predicted by IPA to be inhibited. Blue lines represent inhibitory relationships that were consistent with IPA predictions, orange lines activating relationships consistent with IPA prediction, yellow lines relationships inconsistent with IPA prediction, black lines relationships that exist in the IPA knowledge base but without a prediction, solid lines direct relationships and dashed lines indirect relationships. Arrows indicate directionality of activation, and flat ends show directionality of inhibition. Lines with neither arrows nor flat ends indicate only a general relationship or interaction of the molecules. The names appearing in the figure are IPA gene names, not UniProt identifiers; a mapping of gene names and current UniProt identifiers is in Supplemental Table S1.
(Emilsson et al., 2018), and similarly, we found that the proteins associated with longevity in the current study also clustered. Of the 25 longevity-associated proteins we identified, 18 were also measured by Emilsson et al. (2018) and 12 were found to be part of clusters in their cohort (AGES). Two of their clusters were enriched in proteins we also found to be associated with longevity. Four of our longevity-associated proteins (neuropilin, CD166, alpha-2 microglobulin and 72 kDa type IV collagenase; members of our related clusters 1 and 2, Figure S1) were part of their cluster PM27, a 378-protein module associated with prevalent heart failure and reduced survival. Three (beta-2-microglobulin, complement factor D and cystatin-C; members of our cluster 3, Figure S1) were part of their cluster PM26, a 390-protein module that was positively associated with prevalent and incident coronary heart disease and heart failure as well as reduced overall survival probability. Clusters such as these may suggest shared biological underpinnings, and our integrative analyses using IPA yielded pathways that appeared to converge on nodes that tied together the longevity-associated proteins and related proteins and regulators. These analyses may yield targets for additional research evaluation aimed at uncovering causative events related to longevity.
Our analysis has important strengths. It takes advantage of a large, prospective observational study that includes excellent follow-up and ascertainment of longevity. We analyzed discovery proteomic measures in almost 1200 men, thus representing the largest such experiment available. We utilized robust statistical methods to link peptides to proteins and to reduce the likelihood of type I (false-positive) error. Several limitations should also be mentioned.
The proteomic analysis we performed is limited in terms of sensitivity, but on the other hand, it is relatively comprehensive and we examined a very large number of participants. As our pathway and protein-protein interaction analyses demonstrate, many of the longevity-associated proteins we report are linked, and although we can implicate major pathways as being associated with longer life, it is more difficult to evaluate the relative importance of each peptide/protein. Since the numbers of minority participants were limited, we could not examine these associations in non-white men.
We did not include women. Observational studies such as ours are limited in their ability to disentangle correlative from causal factors, and from these analyses, we cannot determine the time of life at which potentially advantageous pathways become associated with longevity. Moreover, we did not include a direct assessment of health, but ultimately it will be important to understand both longevity and disease-free longevity. Future experimental studies may help to elucidate the relationships among proteins and with outcomes relevant to human health.
In summary, we performed broad-based serum proteomic analyses on a large number of older men and describe peptides and proteins that are associated with longevity. Many of the proteins we identified as being reduced in those who were long-lived are involved in inflammation, and a number were previously found to be linked to early mortality but in the opposite direction. Pathway analyses were highly enriched in regulators of inflammation and immunity, reinforcing the importance of inflammation in the determination of lifespan. These results provide the opportunity to further evaluate these peptides and proteins as biomarkers and highlight the potential importance of the biological pathways they implicate in the origins of long life.

| Study participants
The Study of Osteoporotic Fractures in Men (MrOS) is a prospective observational cohort study of men aged ≥65 years. The design and recruitment have been previously described (E. Orwoll et al., 2005). Briefly, 5994 community-dwelling, ambula-

| Analytic sample
For the current analysis, 2473 non-Hispanic white participants were randomly selected from MrOS ( Figure 1). Too few non-white men were enrolled (~10%) to enable analyses of racial or ethnic differences. Within the subcohort of 2473, we selected men who had the potential to achieve longevity, specifically to reach or exceed the 90th percentile of expected age for their birth cohort. That expected age for each birth cohort was defined by an analysis of actuarial life tables from the United States Social Security Administration (see Supplemental Methods). Men who enrolled at ages less than 73 were excluded because they did not have sufficient time to reach the 90th percentile of age for their birth year cohorts. Men who enrolled at ages greater than 84 were excluded because they were already quite close to the 90th percentile of their birth year cohorts; almost none of them failed to reach the 90th percentile during follow-up, and there were few or no same-aged controls to compare them to.
To guard against leverage effects, we required overlap in the age distributions of the long-lived and not-long-lived groups such that each discrete stratum by year of age would contain at least 5 participants from each group. Using these criteria, 1196 men were eligible. Control participants were MrOS subjects with enrollment ages in the selected baseline age range (73-84) who died during observation (or were lost to follow-up; 8%) before they reached the 90th percentile of age for their birth year cohort.
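The age-overlap requirement described above (at least 5 participants from each group in every one-year age stratum) can be sketched as a simple filter. This is an illustration under assumed inputs, not the authors' selection code; the function name and the example ages are hypothetical.

```python
from collections import Counter


def overlapping_age_strata(ages_long_lived, ages_controls, min_per_group=5):
    """Return the one-year age strata holding enough participants from both groups."""
    long_lived_counts = Counter(ages_long_lived)
    control_counts = Counter(ages_controls)
    shared_ages = long_lived_counts.keys() & control_counts.keys()
    return sorted(
        age
        for age in shared_ages
        if long_lived_counts[age] >= min_per_group
        and control_counts[age] >= min_per_group
    )


# Illustrative baseline ages in whole years (not study data).
ages_long_lived = [73] * 6 + [74] * 5 + [85] * 7
ages_controls = [72] * 9 + [73] * 5 + [74] * 4 + [85] * 1

eligible_ages = overlapping_age_strata(ages_long_lived, ages_controls)
print(eligible_ages)
```

In this toy example only age 73 qualifies: age 72 has no long-lived men, age 74 has too few controls, and age 85 has almost no controls, which mirrors the reason given in the text for excluding men who enrolled after age 84.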

| Serum proteomic analysis
The proteomics workflow is illustrated in Figure 1 and has been described in detail (E. S. Orwoll et al., 2018). Briefly, 150 µL of serum from the baseline MrOS visit that had been stored at −80°C since collection was depleted of 14 high-abundance proteins using IgY14 immunoaffinity depletion columns (Sigma-Aldrich, St. Louis, MO, USA) and digested with trypsin. A pooled serum from 102 MrOS participants served as a technical control for sample processing and analysis; these controls were embedded throughout the proteomic runs. The tryptic peptide samples were analyzed using an LC-IMS-MS platform (Baker et al., 2010, 2014). Specifically, the analytical platform utilized in this work coupled a 1-m ion mobility drift tube and an Agilent (Livesay et al., 2008). to 3200 m/z. The details of the platform performance have been described elsewhere (Baker et al., 2010). Detection and quantification of LC-IMS-MS features with their characteristics (mass, charge, LC elution time, IMS drift time and abundance) were performed using Decon2LS (Jaitly et al., 2009) and FeatureFinder (Crowell et al., 2013) software tools. The detected features were identified by mapping their mass, elution time, and drift time using VIPER software tools (Crowell et al., 2013; Zimmer et al., 2006) and linked to known proteins. Peptide abundances were log10-transformed. We removed outliers flagged by a multivariate distance measure. Data normalization was based on estimates of technical variability that were computed from measured abundances of peptides that were detected in all 102 pooled control samples.

| Peptide and protein-level associations with long-lived status
We have published portions of our statistical analysis pipeline previously (Nielson et al., 2017; E. S. Orwoll et al., 2018). We used bias-corrected estimates from linear regression models to estimate associations between individual peptide abundances and long-lived status, followed by Bayesian meta-analyses that combined peptide-level results to yield associations at the protein level. We then used a resampling procedure to ensure protein-level estimates were stable. The linear regression model used normalized log10 peptide abundance levels as the dependent variable, and incorporated adjustments only for participant age and the population-based birth-cohort cumulative hazard at age to account for differential hazard at the same age in different subcohorts. Measures of body mass index were essentially identical in long-lived and control groups. The models included indicators for MrOS clinical site (to ensure no variability based on unappreciated differences in study conduct between sites) and an indicator for peptides whose abundance was partially imputed during mass-spectrometry analysis. The fold difference between peptide abundance in the longevity group and the short-lived group was estimated as the antilog of $\beta_1$ from the following model:

$$\log_{10}(\text{peptide abundance}) \sim \beta_0 + \beta_1\,\text{alive}_{90\text{th}+}$$

Our approach to account for bias in the peptide fold changes caused by missing peptide values was application of the Heckman selection model (Heckman, 1979), the details of which have been previously described (Nielson et al., 2017). To investigate stability of effect size estimates, we performed a delete-half jackknife resampling sensitivity analysis based on a bootstrap of 200 jackknife replicates sampled with replacement (Efron, 1994; Shao, 1989). For each replicate, we ran each protein through the entire estimation pipeline and compared the bootstrap distribution of meta-effect estimates to the full-cohort estimate. We found that the replicate distribution of effects for proteins represented by at least 2 peptides reliably reproduced the credibility bounds obtained from the full-cohort meta-analysis. Singleton peptides were often less stable. Therefore, protein-level meta-effects were reported only for the 224 proteins represented by at least 2 measured peptides.

Proteins with differential abundance between the longevity and control groups were selected based on their meta-effect size and "meta-p" value (posterior probability that the sign of the meta-effect is incorrectly estimated). Proteins were prioritized if their bootstrap meta-p was less than 0.1 and the absolute value of their bootstrap log10 meta-effect was at least 0.041 (corresponding to a fold change of about 1.1 or 0.9). This rule led to the selection of 25 proteins (referred to as "tier 1"; see Table 2). A second tier of 34 proteins with absolute meta-fold change >1.05 and meta-p < 0.2 that also were in the top third of ranks generated by an empirical Bayes ranking procedure (https://arxiv.org/abs/1312.5776) was tabulated as well (Table S2). It is important to note that these meta-p values are Bayesian posterior probabilities and do not carry the same interpretation as p-values. They have already been adjusted for our prior expectations (via the Bayesian prior distributions) and do not require any correction for multiple comparisons.
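The two-tier selection rule can be illustrated with a minimal sketch. It assumes the thresholds quoted in the text and reads "absolute meta-fold change >1.05" as an absolute log10 meta-effect greater than log10(1.05). The function names and the `in_top_third_rank` flag are hypothetical and are not taken from the published pipeline.

```python
import math


def fold_change(log10_effect):
    """Convert a log10 meta-effect into a fold change (its antilog)."""
    return 10 ** log10_effect


def assign_tier(log10_meta_effect, meta_p, in_top_third_rank):
    """Return 1, 2, or None using the thresholds quoted in the text.

    in_top_third_rank is a hypothetical flag standing in for the
    empirical Bayes ranking step, which is not reproduced here.
    """
    abs_effect = abs(log10_meta_effect)
    # Tier 1: meta-p < 0.1 and |log10 meta-effect| >= 0.041
    if meta_p < 0.1 and abs_effect >= 0.041:
        return 1
    # Tier 2 (assumed reading): meta-p < 0.2, fold change beyond 1.05
    # in either direction, and in the top third of the ranking
    if meta_p < 0.2 and abs_effect > math.log10(1.05) and in_top_third_rank:
        return 2
    return None
```

Under this reading, a log10 meta-effect of ±0.041 corresponds to fold changes of roughly 1.10 and 0.91, which matches the "about 1.1 or 0.9" stated above.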
To facilitate comparisons of longevity proteomic associations with those of the mortality and bone loss phenotypes that we examined in previous papers (Nielson et al., 2017;E. S. Orwoll et al., 2018), we reanalyzed both phenotypes using methods identical to those employed for this study and selected robustly associated proteins using the same selection criteria (absolute meta-fold change >1.1 and meta-p < 0.1). The selected proteins for these phenotypes are presented in Figure 4a.

| Estimates of protein abundance
Protein clustering, receiver operating characteristic (ROC) curves, and correlations with health phenotypes were based on estimates of protein abundance, which were derived via a crossed-random-effects model that included all peptides observed for each protein (Nielson et al., 2017). Protein levels for each participant were estimated using predicted values for the total effects (i.e., fixed plus random effects) minus the best linear unbiased prediction of the corresponding random effect for each peptide.

| Clustering
To identify clusters of proteins that might share biological regulation, protein abundance estimates were standardized by protein, and pairwise distances between all proteins were calculated using the Gower dissimilarity measure. These distances were clustered hierarchically using Ward's linkage, yielding 12 clusters by the Duda-Hart stopping rule, 5 of which contained longevity-associated proteins (from either tier 1 or tier 2).
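As an illustration of the distance step only, the sketch below computes Gower dissimilarities for purely continuous data, where the measure reduces to a range-normalized mean absolute difference. It assumes each row is one protein and each column one participant's standardized abundance. The function name is hypothetical, and the Ward's linkage and Duda-Hart steps are not reproduced here.

```python
def gower_distance_matrix(rows):
    """Pairwise Gower dissimilarity for continuous variables.

    rows: list of equal-length lists (one row per protein, one column
    per participant). Each column's absolute difference is scaled by
    that column's range, then averaged over the informative columns.
    """
    n = len(rows)
    p = len(rows[0])
    ranges = []
    for k in range(p):
        column = [row[k] for row in rows]
        ranges.append(max(column) - min(column))

    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            total = 0.0
            used = 0
            for k in range(p):
                if ranges[k] == 0:
                    continue  # constant column carries no information
                total += abs(rows[i][k] - rows[j][k]) / ranges[k]
                used += 1
            d = total / used if used else 0.0
            dist[i][j] = dist[j][i] = d
    return dist
```

The resulting symmetric matrix is the kind of input a hierarchical clustering routine would consume.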

| Protein abundance summary score
Briefly, we standardized the measured abundances within each protein, combined these standardized values across the 25 longevity-associated proteins (additionally noting the subtotals separately for each cluster), and then standardized the combined total; it is this final overall standardized value that we refer to as the "overall abundance score" above. Scores were additionally calculated for each of the 5 clusters as averages of the (standardized) abundance values of the proteins in each cluster. Hence, a lower score indicates generally lower protein levels. The total protein score was correlated (using Spearman's correlation) with self-reported health, the SF-12 physical component, the Healthy Aging Index (Sanders et al., 2014), and the Fried Frailty Index adapted for MrOS (Cawthon et al., 2007; Fried et al., 2001), all measured at the MrOS baseline visit.
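The scoring steps can be sketched as follows. This is a minimal illustration that assumes complete data and the sample standard deviation (the text does not say which variant was used), with Spearman's correlation computed as the Pearson correlation of average ranks. All function names are hypothetical.

```python
from statistics import mean, stdev


def standardize(values):
    """Z-score a list using the sample standard deviation (an assumption)."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


def overall_abundance_score(abundance_by_protein):
    """Standardize within protein, sum across proteins, standardize the total.

    abundance_by_protein: dict mapping protein name to a list of
    abundances, one per participant, in the same participant order.
    """
    z_by_protein = [standardize(v) for v in abundance_by_protein.values()]
    totals = [sum(per_participant) for per_participant in zip(*z_by_protein)]
    return standardize(totals)


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's correlation as the Pearson correlation of average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

Because every protein is standardized before summing, each contributes equally to the total regardless of its absolute abundance.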

| Mortality and death-proximity associations
We investigated associations between the 25 longevity-associated proteins and mortality, anticipating that each protein's mortality association would be approximately the inverse of its longevity association. For each protein, we fit a semiparametric time-to-event model (using cubic splines), adjusted for participant age, on data from the entire proteomics cohort (N = 2473) as well as the analytic cohort, to obtain a hazard ratio of mortality. Furthermore, to investigate whether proteins were correlated with proximity to death (to calculate the slopes in Figure 3c), we fit a structural equation model (SEM) for the top 25 longevity-associated proteins, and another for the set of 165 proteins that were not associated with longevity (i.e., including neither tier 1 nor tier 2 proteins; note that tier 2 proteins were not used in either model). The models included a measurement component summarizing the protein abundances into a single factor score. The structural portion of each model assumed the effect of age on protein abundance levels was at least partially mediated by proximity to death. To disentangle the effect of death proximity from the expected aging effect, we used as instrumental variables health status (as measured by the physical component of the SF-12) and cumulative hazard of death (from US population statistics).
Finally, we calculated protein abundance score predictions from the models and performed a kernel-weighted (Epanechnikov) local polynomial spline smoothing of those predictions along a time-to-death axis. The linear association between predicted protein abundance scores and time to death was estimated as the average time derivative of the spline in each instance.
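The idea of summarizing a smoothed curve by its average time derivative can be illustrated with a simplified sketch. It assumes a local linear (degree-1) fit with Epanechnikov weights, evaluated on a grid of time-to-death values. The polynomial degree, bandwidth, and grid used in the study are not stated here, so these choices and all function names are hypothetical.

```python
def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def local_slope(times, scores, t0, bandwidth):
    """Slope of a kernel-weighted least-squares line centered at t0.

    Returns None when the window has no usable spread of points.
    """
    weights = [epanechnikov((t - t0) / bandwidth) for t in times]
    total = sum(weights)
    if total == 0.0:
        return None
    t_bar = sum(w * t for w, t in zip(weights, times)) / total
    s_bar = sum(w * s for w, s in zip(weights, scores)) / total
    sxx = sum(w * (t - t_bar) ** 2 for w, t in zip(weights, times))
    if sxx == 0.0:
        return None
    sxy = sum(w * (t - t_bar) * (s - s_bar)
              for w, t, s in zip(weights, times, scores))
    return sxy / sxx


def average_slope(times, scores, grid, bandwidth):
    """Average the local slopes over a grid of evaluation points."""
    slopes = [local_slope(times, scores, t0, bandwidth) for t0 in grid]
    slopes = [s for s in slopes if s is not None]
    if not slopes:
        raise ValueError("no grid point had enough data within the bandwidth")
    return sum(slopes) / len(slopes)
```

For exactly linear data the averaged local slope recovers the underlying slope; for curved data it summarizes the overall trend along the time-to-death axis.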

| Pathway analyses
We used the Ingenuity Pathway Analysis software (IPA, Spring 2019 release, QIAGEN Inc.; https://digitalinsights.qiagen.com/products-overview/discovery-insights-portfolio/analysis-and-visualization/qiagen-ipa/) to identify networks of interacting proteins associated with longevity, predicted upstream regulators of those associations, and causal networks potentially related to those effects (Kramer et al., 2014). The proteomic dataset was input into IPA using the Core Analysis platform (Ingenuity Systems, Redwood City, CA) under default settings: direct and indirect relationships between molecules supported by experimentally observed data were considered, de novo networks did not exceed 35 molecules, and all sources of data from human, mouse, and rat studies in the Ingenuity Knowledge Base were considered. IPA provides an upstream regulator analysis to determine likely direct regulators of the proteins in our dataset, designating them as "activated" or "inhibited" based on a z-score calculated from the fold change directions and magnitudes among the proteins in our data that could be mapped to the regulator.
Regulator associations are quantitated by the activation state, including the predicted direction of the associations (activated or inhibited), and the salience of the activation of the putative regulator, as measured by the magnitude of the z-score. For each cluster (see Clustering), we used IPA network-building tools in a systematic and algorithmic manner to create networks of genes that according to the IPA knowledge base are closely connected to the proteins comprising each of the 5 protein clusters associated with longevity. Connectivity of genes within each network was assessed by the degree of the gene node (i.e., the number of other genes in the network with a connection to that node) and compared to the overall beta index for the network, which characterizes the average number of connections per node (counting each pair only once).
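The two connectivity measures can be illustrated on a toy network. The sketch assumes an undirected network given as a list of node pairs, with the beta index taken as the number of unique connections divided by the number of nodes. The function name and node labels are hypothetical, and this is not IPA's own implementation.

```python
def degree_and_beta(edges, nodes=None):
    """Return (degree per node, beta index) for an undirected network.

    edges: iterable of (node_a, node_b) pairs. Duplicate pairs and
    reversed duplicates are counted once; self-loops are ignored.
    nodes: optional iterable of all nodes, so isolated nodes are counted.
    """
    unique_edges = {frozenset(pair) for pair in edges if pair[0] != pair[1]}
    degree = {node: 0 for node in (nodes or [])}
    for edge in unique_edges:
        for node in edge:
            degree[node] = degree.get(node, 0) + 1
    # Beta index: unique connections per node
    beta = len(unique_edges) / len(degree) if degree else 0.0
    return degree, beta
```

A node whose degree is well above the beta index is more connected than a typical node in the same network, which is the comparison described above.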

ACKNOWLEDGMENTS
The MrOS Study is supported by the following institutes under the Proteome data analysis was further supported by the preceding grant as well as NIH/NCATS grant UL1TR000128. CMN was supported by K01 AR062655.

CONFLICT OF INTERESTS
All authors have nothing to declare.

AUTHOR CONTRIBUTIONS
ESO, JW, and JL had full access to data and take responsibility for the integrity of the data and the accuracy of the data analysis.

DATA AVAILABILITY STATEMENT
Data concerning the MrOS cohort, including cohort and phenotypic information, are publicly available at http://mrosdata.sfcc-cpmc.net.
The raw proteomics data are available as a MassIVE dataset (accession MSV000085611). Protein identifiers used in the MassIVE files are provided (in the "Symbol" column) in Table S1.