A decade of epigenetic change in aging twins: Genetic and environmental contributions to longitudinal DNA methylation

Abstract Background Epigenetic changes may result from the interplay of environmental exposures and genetic influences and contribute to differences in age‐related disease, disability, and mortality risk. However, the etiologies contributing to stability and change in DNA methylation have rarely been examined longitudinally. Methods We considered DNA methylation in whole blood leukocyte DNA across a 10‐year span in two samples of same‐sex aging twins: (a) Swedish Adoption Twin Study of Aging (SATSA; N = 53 pairs, 53% female; 62.9 and 72.5 years, SD = 7.2 years); (b) Longitudinal Study of Aging Danish Twins (LSADT; N = 43 pairs, 72% female, 76.2 and 86.1 years, SD=1.8 years). Joint biometrical analyses were conducted on 358,836 methylation probes in common. Bivariate twin models were fitted, adjusting for age, sex, and country. Results Overall, results suggest genetic contributions to DNA methylation across 358,836 sites tended to be small and lessen across 10 years (broad heritability M = 23.8% and 18.0%) but contributed to stability across time while person‐specific factors explained emergent influences across the decade. Aging‐specific sites identified from prior EWAS and methylation age clocks were more heritable than background sites. The 5037 sites that showed the greatest heritable/familial–environmental influences (p < 1E−07) were enriched for immune and inflammation pathways while 2020 low stability sites showed enrichment in stress‐related pathways. Conclusions Across time, stability in methylation is primarily due to genetic contributions, while novel experiences and exposures contribute to methylation differences. Elevated genetic contributions at age‐related methylation sites suggest that adaptions to aging and senescence may be differentially impacted by genetic background.


| INTRODUC TI ON
The functional profiles of genes are not static and vary across time, and indeed across the lifespan, in part as a result of different environmental exposures and contexts (van Dongen et al., 2016;Jones, Goodman, & Kobor, 2015;Lappe & Landecker, 2015;McClearn, 2006). Measurable gene-environment dynamics for behavioral traits are possible due to advances in biotechniques for global epigenetic profiling at, for example, specific methylation sites in the human genome. Epigenetic changes may be critical to the development of complex diseases, accelerated aging, or steeper declines in cognitive and physical functioning with age (Lappe & Landecker, 2015). Understanding epigenetic changes over time in the elderly may identify pathways of decline or plasticity (e.g., maintenance or even boosts in functioning) during the aging process and help with elucidating the biology of aging and survival.
Epigenetic modifications resulting in altered gene expression may occur due to a number of processes, including direct methylation of DNA (Jones & Takai, 2001). DNA methylation results from intrinsic-programmed factors as well as non-genetic processes that may arise due to prenatal or early life exposures or at later points in development (Gottesman & Hanson, 2005;Kanherkar, Bhatia-Dey, & Csoka, 2014;Torano, Garcia, Fernandez-Morera, Nino-Garcia, & Fernandez, 2016). DNA methylation is characteristically produced by the addition of a methyl group to the DNA molecule cytosine within cytosine-guanine dinucleotides (CpGs), at an estimated 28 million sites across the human genome (Lovkvist, Dodd, Sneppen, & Haerter, 2016). Dense regions of CpGs referred to as "islands" represent about 5% of CpGs occurring in the genome (about 20,000 total) and often reside in promotor regions (Vinson & Chatterjee, 2012); in addition, surrounding "shores" and "shelves" to these islands are of interest and may be differentially methylated compared to islands (Jones et al., 2015). The addition of methylation tags to CpG sites is associated with altered gene expression, typically by interfering with or silencing gene transcription although upregulation of gene expression has been documented (Wang, Chen, Yang, Zhang, & Wong, 2019), and may differentially occur in cells across multiple tissue types including brain, muscle, and leukocytes (Fernandez et al., 2012). Methylation tags can be removed as a consequence of exposures as well, leading to dynamics in expression across time (Kanherkar et al., 2014).
Epigenetic changes may accelerate over time, whereby changes in gene expression due to exposures become more abundant and salient to phenotypic changes, hence potentiating the development of health and aging conditions earlier in life. Indeed, methylation is correlated with age (Ciccarone, Tagliatesta, Caiafa, & Zampieri, 2018;van Dongen et al., 2016), is used to define biological clocks that may more closely track biological aging (Field et al., 2018), and is associated with mortality (Zhang et al., 2017) and a number of physical and neuropsychiatric health traits (Kanherkar et al., 2014;Lappe & Landecker, 2015). Longitudinal studies of twins represent a valuable approach to evaluate genetic and environmental contributions to stability and change in methylation across the methylome (Tan, Christiansen, von Bornemann Hjelmborg, & Christensen, 2015).

Investigations of etiological contributions have relied primarily on
cross-sectional data (van Dongen et al., 2016;Hannon et al., 2018) and have addressed age-related differences (van Dongen et al., 2016) but not change. We evaluate individual differences in DNA methylation at individual CpG sites across the methylome across 10 years in two Scandinavian samples of same-sex aging twins, estimating the genetic and environmental contributions to stability as well as to novel influences that emerge. Moreover, we examine whether surrounding "shores" and "shelves" are differentially heritable compared to islands and whether sites identified as associated with rate of aging in epigenome-wide association study (EWAS)

| Materials
Methylation measurements from the Illumina HumanMethylation450 array (Illumina) were preprocessed and normalized with adjustments for cell counts and batch effects. Processing of the SATSA sample probes has been described previously (Jylhävä et al., 2019;Wang et al., 2018) and in brief included the following: (a) preprocessing with the R package RnBeads (Assenov et al., 2014) where filtering of samples and probes proceeded with a greedy-cut algorithm maximizing false-positive rate versus sensitivity at a detection pvalue of 0.05; (b) removal of sites that overlap with a known SNP site or reside on sex chromosomes; (c) normalization of data using dasen (Pidsley et al., 2013); (d) applying a Sammon mapping method (Sammon, 1969) to remove technical variance; (e) adjustment for cell counts (Jones, Islam, Edgar, & Kobor, 2017); (f) correction for batch effects using the ComBat approach in the sva package (Leek, Johnson, Parker, Jaffe, & Storey, 2012).
Processing of the LSADT data has been described previously (Svane et al., 2018)

| Filtering of sites post-analysis
We conducted additional filtering of probes where model-fitting results evidenced means or variances outside of expected values.
Specifically, we filtered based on the typical range of M-values (c.f., Du et al., 2010), with expected mean values falling outside the range −6.25 to 6.25 for 1812 sites under either ACE or ADE models at either timepoint. Likewise, we filtered based on expected standard deviations exceeding 1.5 under either ACE or ADE models (Du et al., 2010) resulting in 9554 sites out of range under either ACE or ADE models at either timepoint. The effective reduction in sites was from 368,391 to 358,836 after dropping 9555 unique sites from the analysis set.

| Analysis
Bivariate biometrical twin models of M-values were fitted to all available data across the pairs using full-information maximum likelihood (FIML), adjusting for centered age (centered at the average age across time = age -74 years), sex (0 = males, 1 = females), and country (0 = Sweden, 1 = Denmark). Bivariate ACE and ADE Cholesky models evaluated the degree to which additive genetic (A), dominance or non-additive genetic (D), common environmental (C), and non-shared factors (E), encompassing non-shared environmental influences, measurement error, and stochastic factors, contributed to variation and covariation in M-values within and across time (see Figure 1).The resolution of the genetic and F I G U R E 1 Bivariate Cholesky model. Note. ACE and ADE models were separately fitted to M-values at two waves 10 years apart environmental effects are done by comparing the relative similarity of monozygotic (MZ) twins who share 100% of their genes in common, including all additive effects and dominance deviations, versus dizygotic (DZ) twins who share on average 50% of segregating genes in common leading to expectations of 50% for additive effects and 25% for dominance deviations. Both twin types are presumed to have the same contribution of common environmental effects that contribute to similarity. We fitted ADE and ACE models as dominance (D), and common environment (C) could not be simultaneously estimated (see Figure 1).
Fit comparison between the ACE and ADE models was done via Akaike information criterion (AIC; Akaike, 1974). If the fit of the ADE model was as good or better than the ACE model, it was retained as "best" fitting, and otherwise, the ACE model was retained as best. We evaluated submodels including AE, CE, and E models. Differences in nested model deviance statistics [−2ln(L)] are distributed as chi-square (χ 2 ) with the difference in the number of parameters between the full and constrained models as the degrees-of-freedom (df). LSADT samples tended to show lower variability in methylation at any given probe compared to SATSA; hence, we allowed for scalar differences at each timepoint (k 1 , k 2 ) in standard deviations between the two samples (see Figure 1).
Thus, the relative contributions of A, C or D, and E were equated across LSADT and SATSA, but the scalar allowed for the variance components to differ by a constant at each assessment. Scalar Shelf', and a blank annotation field was treated as 'Open Seas'.
In comparing relative heritabilities across sites by location, as well as aging/clock CPGs sets to remaining CpGs, we fitted random effects regression models to ages 69 and 79 biometrical estimates using lme (version 1.1-21; Bates, Mächler, Bolker, & Walker, 2015).
We allowed for random effects between and within sites, reflecting consistency of effects by CpG sites across time and non-systematic variation within time.
To compare time1-time2 correlations from the biometrical estimates, we rescaled the a 12, d 12 or c 12 , and e 12 paths into correlations (r A , r D or r C , and r E ) and performed Fisher Z-transformations before submitting each to a skew-normal regression analysis using the sn package (Azzalini, 2020). Regression analyses compared low stability sites to remaining CpGs, after which regression weights were inverse-transformed into correlation units for interpretation.
Enrichment analyses were conducted using the GREAT 4.0.4 tool (McLean et al., 2010). Selected sites were mapped to the Human GRCh37 build and default settings were used for association rules (i.e., basal + extension: 5000 bp upstream, 1000 bp downstream, 1,000,000 bp max extension, curated regulatory domains included).
We present results of both biomial and hypergeometric tests where the false discovery rate (FDR) achieved p < .05 and where fold enrichment (FE) tests exceeded 2.0. We followed up the enrichment analyses using the mQTL Database (Gaunt et al., 2016) to annotate associations with methylation quantitative trait loci, noting the number of cis or trans variants.

| RE SULTS
We first evaluated the extent to which heritable and environmental  Note: A = additive genetic, D = non-additive genetic (dominance), C = common environment, E = non-shared factors.
across time is significant within site, M t2-t1 = −.058 (t = −232.0, df = 358,835, CI 95 = −0.058, −0.057). The decrease in heritability is due to an absolute increase in non-shared factors (E) compared to genetic influences (A, D) (see Table 1, Absolute Variances). Patterns of decline were observed for heritabilities (A) under the ACE model (0.150 and 0.109, respectively), and under best-fitting ADE or ACE models (see Table 1, Variance Components). Common environmental influences were generally stable in overall ACE results at over 5% (0.057, 0.054) and in best-fitting ACE results at 10% (0.106, 0.098) (see Table 1, Variance Components).
Across time, heritabilities showed divergence by location Table S1, Figure 2 is low for C even in large samples, as well as to distinguish D from A, we present full model estimates (Visscher, Gordon, & Neale, 2008).
In terms of contributions to stability and change in methylation due to genetic or environmental influences, across 358,836 sites, 58.5% showed cross-time associations at p < .05 (df = 3, where a 12 = [d 12 or c 12 ] = e 12 = 0) indicative of stability over time due to either genetic and/or environmental mechanisms. As shown in Figure 3, the cross-time stability was largely due to genetic effects in both the ADE best and ACE best models which were most often perfect in correlation.
As cross-sectional twin studies have reported that heritability may be higher for variable methylated sites (e.g., Hannon et al., 2018), we report the correlation between the estimated standard de-

| Age-related sites
We evaluated the best-fitting ADE and ACE results of two published CpG sets that were identified in EWAS as related to age that  Table S1). 0.06, ACE best; p = 3.87E−146) and 0.05 higher common environmentality (0.15 vs. 0.10; p = 1.88E−40) (see Table S2, Variance Components). Patterns in the absolute variances suggested the greater heritability was due primarily to lower non-shared factors (p ≤ 1.35E−07) and for ACE models coupled with higher additive genetic and common environmental influences (p ≤ 2.03E−04;

F I G U R E 2
see Table S2, Absolute Variances). Significantly higher heritabilities and common environmentality were also observed for Aging Set II (1934 sites) where the increased heritable and common environmental influences (all p ≤ 1.55E−31) were driven mainly by amplified genetic and common environmental influences (p ≤ 1.01E−11) and otherwise comparable non-shared factors between the Aging II set and remaining background CpGs (see Table   S2). Thus, the age-related sites showed a significantly higher proportion of variance attributed to heritable and shared environmental influences due to lower non-shared factors in Aging set I and due to higher genetic and common environmental influences in Aging set II.

| Methylation clock sites
Available CpG sites from four epigenetic clocks were evaluated in similar fashion using multilevel regression models fitted to ages were observed for the 1190 unique clock sites compared to all remaining CpGs (0.02-0.05 higher, p ≤ 8.68E−10, see Table S2, Variance Components). Comparisons of absolute variances suggested amplified genetic and common environmental influences (p ≤ 3.94E−05) as well as non-shared factors (p ≤ 3.90E−08) between the clock sites and remaining background CpGs (see Table   S2, Absolute Variances). Thus, the clock sites showed greater overall variability across sources of variance suggesting greater individual differences in these sites, with a significantly higher portion of variance attributed to heritable and shared environmental influences.
Among the 1190 unique CpG sites compared to one another, the  Table S3). The ratio of intercept variance to total variance (ρ) in heritability estimates was 0.559 for ADE best models and 0.680 for ACE best models suggesting 56% and 68% of the variation in heritability, respectively, was CpG site-specific across time and less than half of the variation was unique to CpG site and time, consistent with analyses by location (see Table   S1). Likewise, absolute variances showed strong between-site variations (66%-88%, see Table S3).

| Low stability sites
We identified 2020 CpGs with low stability but meaningful genetic or common environmental contributions at one or both timepoints, that is, p > 0.01 (df = 3, where a 12 = [d 12 or c 12 ] = e 12 = 0) and where e 1 or e 2 accounted for <50% of the total variation (1638 ADE best, 382 ACE best). Based on skew-normal analyses, low stability CpGs SD 1 of 1.08 to 1.09 (SD ratio = 0.13) for ACE and ADE best models, respectively. Moreover, heritabilities decreased across time while nonshared components tended to increase (see Figure S1). Compared to background CpGs, low stability CpGs tended to show higher A + D or A and C components (all p ≤ 2.03E−14) but generally lower overall absolute variances for A + D and E variances (p ≤ 15.51E−09) in ADE models (see Table S2). Higher absolute variance for A but lower variance for E was observed in ACE models (p ≤ 4.94E−03) (see Table   S2). Altogether, results suggest lower overall phenotypic variance in methylation among the low stability versus background CpGs across time (c.f., Table S2). However, within the set of lower stability CpGs, variance in methylation increased at time 2 mainly due to novel nonshared factors (c.f., Figure S1).

| Enrichment analysis: High heritability/ familiality
The set of 5037 CpGs achieving epigenome significance ( Table 2; for full ontology results, see Table S4). The sites that showed the greatest heritabilities showed enrichment in immune and inflammation pathways as well as CpG matches to 155,177 SNP variants from the Middle Age timepoint (see Table S6). Of the 1435 CpG matches, 1256 were associated with cis-mQTLs and 304 were associated with trans-mQTLs suggesting an abundance of associations with cis-mQTLs. The maximum number of mQTLs associated with any given CpG was for cg03202060 with 5230 cis-mQTLs variants plus 575 trans-mQTLs.
As polycomb repression may relate to age-related changes in DNA methylation, we filtered our set of 5037 CpGs to reflect genes annotated on the 27 k array and evaluated whether our set mapped to 1861 PolyComb Group Target genes (PCGTs) identified using the Illumina 27 k chip probes (Zhuang et al., 2012). We observed 493 CpGs within a set of 293 PCGTs overlapped, or a 15.7% overlap of PCGTs (see Table S6). A hypergeometric test of the 293 overlapping PCGTs was significant at p = 1.004E−11 suggesting overrepresentation, when considering the number of unique PCGTs in Zhuang et al. (2012), and the number of genes represented in the Illumina 27 k chip.

| Enrichment analysis: Low stability sites
The 2020 low stability CpGs were submitted to GREAT 4.0.4, showing enrichment for stress-related DNA and RNA transcription pathways (see Tables S7-S8). Hence, these sites may lie in genes/gene pathways that are sensitive to exogenous exposures to stress leading to increasing divergence in methylation profiles across time. The GO Biological RNA and DNA pathways noted relate to heat shock and response to hypoxia in a number of plant and animal species, including humans (c.f., annotations GO:0043620, GO:0061418; Table   S8).
The low stability CpGs were submitted in kind to the mQTL Database (Gaunt et al., 2016) producing 397 unique CpG matches to 7103 mQTLs at the Midlife timepoint. Of the 397 CpG matches, 58 annotations were to cis-mQTLs and 347 were to trans-mQTLs (see Table S9), suggesting an abundance of associations with trans-mQTLs. The maximum number of mQTLs linked with any given CpG was for cg07677296 matched with 576 cis-mQTLs. The cis-mQTLs variants associated with cg07677296 traverse FAHD1 and NUBP2 on chromosome 16 and have been implicated in aging pathways related to insulin-like growth factor (Teumer et al., 2016). A scatterplot of cg07677296 M-values of twin 1 by twin 2 across time shows comparable similarity for MZ and DZ pairs (see Figure S2c,d).

| D ISCUSS I ON
Overall, results suggest genetic contributions to DNA methylation tended to be small, vary by location, and decrease across a decade; however, genetic influence mainly contributed to the stability of methylation. Unique person-specific influences not shared by cotwins were emergent across 10 years suggesting that non-shared factors become more salient to DNA methylation in late life. The extent of variation in methylation at any given CpG site was positively correlated with observing stronger heritable effects. Moreover, 58% of sites showed stability across time due to strongly correlated genetic influences and modestly correlated non-shared factors, suggesting continuity of influences across 10 years for more than half the CpG sites. The sites that showed the greatest heritabilities showed enrichment in immune and inflammation pathways and neurotransmitter transporter activity pathways. Low stability sites meanwhile showed increased expression variability across time due to novel non-shared factors, with enrichment in stress-related pathways, suggesting that these sites are responsive to "new" environmental cues even in old age.
Prior studies report average heritabilities of 16.5%-19.0% across adulthood (17-79 years) (van Dongen et al., 2016;Hannon et al., 2018) and common environmental influences of 3.0%-12.6%, that are stronger in young adulthood (Hannon et al., 2018 CpG sites related to age show a greater impact of heritable influences consistent with genetic regulation of the rate of biological aging. Sites associated with age and longevity generally show higher heritabilities than the total background sites and varied in magnitude of heritabilities by location, where "islands," which often reside in promotor regions (Vinson & Chatterjee, 2012), typically showed lower heritability than those sites residing in surrounding "shores" and "shelves," which have been shown to be differentially methylated compared to islands (Jones et al., 2015).  (Tserel et al., 2015). Indeed, five CpGs in our set identified as associated with cis-mQTLs at midlife lie within the BCL11 gene (cg26396443) or RUNX3 gene (cg05162523, cg13566436, cg20674490, cg22509179) involved in T cell differentiation (Tserel et al., 2015). A related study of German and Danish individuals (including an overlapping sample of twins herein) evaluating RNAsequencing expression patterns and longevity identified expression patterns in biological processes contributing to immune system and response pathways (Häsler et al., 2017) and observed high heritabilities (30%-99%) among 20% of cis-eQTLS. Immunosenescence describes an age-associated decline in elderly individuals' immune functioning, such as mounting less effective responses to vaccines and lowered resistance to illnesses, with concomitant upregulation of pro-inflammatory cytokines, among several other cellular and physiological changes in the immune system (Accardi & Caruso, 2018). It has been proposed that heritable factors may be partly associated with differential immune responses (Derhovanessian et al., 2010;Poland, Ovsyannikova, Kennedy, Lambert, & Kirkland, 2014) and may predict influenza-related susceptibility and mortality (Poland et al., 2014), for example, and, broadly, successful aging and longevity (Derhovanessian et al., 2010). Hence, differential adaptions to aging processes including immunosenescence reflect gene-environment dynamics with some individuals showing better adaptions than others due to genetic influences.
High heritability CpGs were also enriched for PCGTs-a group of genes that are epigenetically regulated by polycomb-group proteins and involved in developmental processes and cell-fate decisions (Lanzuolo & Orlando, 2012). Enrichment of hypermethylated of PCGT has also been implicated in cancer and aging and show consistent patterns across different cell types (Teschendorff et al., 2010).
Our findings would thus support the role of heritable/familial-environmental factors in the epigenetic regulation of these fundamental cellular processes.
Enrichment analyses of low stability CpG sites suggest that stress-related DNA and RNA transcription pathways may be relevant for these environmentally responsive sites which showed increased novel environmental contributions to methylation. It is notable that unlike the high heritability set, the low stability set showed more associations with trans-mQTLs. That said, cg07677296 matched with 576 cis-mQTLs, with variants spanning FAHD1 and NUBP2, both implicated in metabolic and aging pathways related to insulin-like growth factor (IGF) (Teumer et al., 2016). Specifically, FAHD1 was identified as a cis-eQTL associated with a variant in NUBP2 (rs1065656) that may contribute to circulating IGF-I and IGFBP-3 concentrations (Teumer et al., 2016). Moreover, IGF-I is implicated in oxidative stress pathways (Gubbi, Quipildor, Barzilai, Huffman, & Milman, 2018).
The current study establishes the extent to which the genetic and environmental influences contribute to site-specific methylation across a 10-year span in a longitudinal sample of Swedish and Danish twins. While stability of methylation was largely due to genetic influences, person-specific environmental influences were emergent across time and explained change. By and large, the dynamics of methylation may be influenced by experiences and exposures, suggesting possible mediation of gene expression; however, the most heritable sites may participate in immune and inflammation pathways and neurotransmitter transporter activity pathways which suggest that adaptions to aging and senescence may be differentially impacted by genetic background.

CO N FLI C T O F I NTE R E S T
None declared.

AUTH O R CO NTR I B UTI O N S
CAR drafted the manuscript. CAR and EM analyzed data. QT and JH contributed to scripting and QT, SH, and JJ advised on enrichment analyses. NLP and SH contributed to the coordination of the study and acquisition of the SATSA methylation data. JJ contributed to preparation of SATSA data and interpretation of results. LC, QT, and JH coordinated the LSADT data acquisition. All authors participated in interpretation of the data, have read and commented on the manuscript, and approved the final version.

DATA AVA I L A B I L I T Y S TAT E M E N T
Methylation data for SATSA are available at EMBL-EBI (www.ebi. ac.uk/array express) under accession number E-MTAB-7309 (see Wang et al., 2018). For the LSADT study, legal restrictions prevent the deposit of data into a public database and transfer and sharing of individual-level data requires prior approval from the Danish Data Protection Agency. However, inquiries regarding collaboration and individual requests for data sharing are welcome.