Distinct chromatin signatures of DNA hypomethylation in aging and cancer

Summary Cancer is an aging‐associated disease, but the underlying molecular links between these processes are still largely unknown. Gene promoters that become hypermethylated in aging and cancer share a common chromatin signature in ES cells. In addition, there is also global DNA hypomethylation in both processes. However, the similarity of the regions where this loss of DNA methylation occurs is currently not well characterized, and it is unknown if such regions also share a common chromatin signature in aging and cancer. To address this issue, we analyzed TCGA DNA methylation data from a total of 2,311 samples, including control and cancer cases from patients with breast, kidney, thyroid, skin, brain, and lung tumors and healthy blood, and integrated the results with histone, chromatin state, and transcription factor binding site data from the NIH Roadmap Epigenomics and ENCODE projects. We identified 98,857 CpG sites differentially methylated in aging and 286,746 in cancer. Hyper‐ and hypomethylated changes in both processes each had a similar genomic distribution across tissues and displayed tissue‐independent alterations. The identified hypermethylated regions in aging and cancer shared a similar bivalent chromatin signature. In contrast, hypomethylated DNA sequences occurred in very different chromatin contexts. DNA hypomethylated sequences were enriched at genomic regions marked with the activating histone posttranslational modification H3K4me1 in aging, while in cancer, loss of DNA methylation was primarily associated with the repressive H3K9me3 mark. Our results suggest that the role of DNA methylation as a molecular link between aging and cancer is more complex than previously thought.


| INTRODUCTION
Age is among the most important risk factors for cancer (de Magalhães, 2013;DePinho, 2000). However, the underlying molecular mechanisms governing this relationship are still poorly understood. Recent research has established polycomb-target gene promoter hypermethylation as a common epigenetic characteristic of cancer (Schlesinger et al., 2007;Widschwendter et al., 2007). In this scenario, prior to alteration these promoters display an embryonic stem cell "bivalent chromatin pattern" consisting of the repressive histone mark H3K27me3 and the active mark H3K4me3 (Ohm et al., 2007). Genes affected by this process are associated with developmental regulation (Easwaran et al., 2012), implying a possible stem cell origin of cancer whereby aberrant hypermethylation could promote a continuously self-renewing embryonic-like state in cancer cells (Teschendorff et al., 2010). Interestingly, promoter hypermethylation of polycomb-target genes was later described in aging blood (Rakyan et al., 2010;Teschendorff et al., 2010) and other tissue types such as mesenchymal stem cells (Fern andez et al., 2015), ovary (Teschendorff et al., 2010), brain, kidney, and skeletal muscle (Day et al., 2013), findings which were also confirmed using whole-genome bisulfite sequencing (Heyn et al., 2012).
In addition to aberrant locus-specific DNA hypermethylation, tumoral cells are also globally hypomethylated as compared to their healthy counterparts. While this molecular alteration preferentially occurs at gene bodies, intergenic DNA regions, and repeated DNA elements (Ehrlich, 2009) and is proposed to be associated with chromosomal instability, reactivation of transposable elements, and loss of genomic imprinting, its precise functional role in cancer development is still poorly understood (Rodr ıguez-Paredes & Esteller, 2011).
Intriguingly, global loss of genomic DNA methylation has also been reported during the aging and senescence process (Cruickshanks et al., 2013;. Whole-genome bisulfite sequencing and methylation arrays have confirmed the global loss of DNA methylation in different human tissues including blood (Heyn et al., 2012), mesenchymal stem cells, and brain (Fern andez et al., 2015). On the other hand, other important tissues such as skeletal muscle do not seem to become hypomethylated with aging (Zykovich et al., 2014).
Despite the interesting parallelism in aging and cancer recently reported with respect to hypermethylated DNA regions, the relationship between hypomethylated DNA sequences in these two processes has not been sufficiently studied. Moreover, recent analyses mainly performed in mouse tissue have failed to confirm global hypomethylation during the aging process (Cole et al., 2017;Hahn et al., 2017;Masser et al., 2017) and, to date, no study has provided a back-to-back and systematic comparison of the epigenetic changes that occur in aging and cancer. To address this issue, here we have analyzed DNA methylation changes and their associated chromatin patterns in a total of more than 2,300 healthy and tumoral samples obtained from differentially aged individuals, using HumanMethylation450 BeadChip data generated by The Cancer Genome Atlas (TCGA) consortium and other datasets (Bormann et al., 2016;Guintivano, Aryee & Kaminsky, 2013;Hannum et al., 2013). Our results confirmed the relationship between DNA hypermethylation in aging and cancer, but they also revealed important differences in DNA hypomethylation changes in the two processes that might be important to understand the possible role of DNA methylation as a molecular link between decline related to aging and tumor development.

| DNA methylation profiling in aging and cancer
To identify DNA methylation changes in aging and cancer, we collected DNA methylation data obtained with the HumanMethyla-tion450 BeadChip (Illumina) (see Section 4) and compared the DNA methylation status of a total of 361,698 CpG sites across 1,762 samples corresponding to healthy and tumoral tissues obtained from differentially aged patients with breast, kidney, thyroid, skin, and brain tumors (see Tables 1 and S1 for extended information). Using an empirical Bayes moderated t test (see Section 4), we identified a high number of autosomal differentially methylated CpGs (dmCpGs; FDR < 0.05) between normal and tumoral samples, while a lower and more variable number of aging-related dmCpGs between young and old samples was found (Table 1; Figure 1a and Table S2 for additional information). Hierarchical clustering of samples using the dmCpGs enabled us to distinguish between tumoral and control samples with more efficiency than young and old samples ( Figure 1b and Figure S1). On the whole, whereas cancer-related DNA methylation changes had no dominance of either hyper-or hypomethylation, aging-related changes tended toward DNA hypermethylation, and showed a much more variable and tissue type-dependent magnitude of change. Globally, methylation changes were found to be more pronounced in cancer than in aging ( Figure 1c and Table S3, Wilcoxon tests; all p < .05), while comparison of hyper-vs. hypomethylation changes was variable and disease and tissue type dependent.
Intriguingly, most tumors obtained from differentially aged patients did not show significant age-associated DNA methylation changes, with the exception of thyroid cancer (Table 1, Figure 1a, bottom   panel and Table S2). Furthermore, analyses employing Horvath's     Figure S3 and Table S5), with this effect being even more pronounced for the aging dmCpGs.

| Tissue type-independent DNA methylation changes in aging and cancer
To determine the effect of tissue type on the DNA methylation changes during aging and cancer, we compared the previously identified dmCpG sites for each of the tissues. In cancer,~50% of hyperand hypomethylated CpGs were common to at least two different tumor types (Figure 3a), with 1,962 (1.1%) hyper-and 2,708 (1.5%) hypomethylated CpG sites being common to all five tumor types analyzed ( Figure 3b and Table S6). In contrast, the overlap between dmCpGs in aging across different tissues was considerably reduced.
Indeed, only 18% of the hyper-and 8% of the hypomethylated CpG  hyper-and 1 (0.002%) hypomethylated CpG sites were common to all five tissue types analyzed (Figure 3a,b and Table S6). However, statistical analyses of the pairwise overlaps between the different sets of probes showed overall enrichment in every case, especially for aging ( Figure 3d, Fisher's tests; all p < .001, Table S7). This overenrichment was also revealed through a simulation of a random sampling of probes from the array ( Figure S4a). Taken together, these results suggest that both cancer and aging manifest tissueindependent changes in DNA methylation.
We also identified a subset of dmCpG sites in aging and cancer that could potentially be either hyper-or hypomethylated depending on the tissue type involved ( Figure 3c)

| Distinct chromatin signatures of DNA hypomethylation in aging and cancer
To determine whether the chromatin signatures of DNA hypomethylation were also similar in aging and cancer, we compared the hypomethylated CpG sites identified in our study with data from the same histone modifications as described in the earlier analyses. Interestingly, the results showed that hypomethylated CpG sites in cancer were enriched in the repressive H3K9me3 histone modification, while in aging, hypomethylated CpGs were more enriched in the activating histone mark H3K4me1 (Figure 4a, lower panel; Table S8).
There were though exceptions to this general trend: hypomethylated CpG sites in thyroid tumors were also enriched at H3K4me1, and hypomethylated DNA sequences in aged skin were mainly co-associated with H3K9me3-marked DNA regions. Nevertheless, the ratio H3K4me1/H3K9me3 was always higher in aging than in cancer (Figure 4b). Moreover, when analyzing the dmCpGs shared by all five tissues in cancer or at least three tissues in aging, these distinct chromatin signatures became much more evident. In sum, these results indicate that, in contrast to DNA hypermethylation, chromatin signatures of DNA hypomethylation in aging and cancer differ considerably.
After deriving the chromatin signatures, we performed validation analyses on two additional datasets: the first related to tissue from TCGA control and lung adenocarcinoma and the second to whole blood from the classical Hannum et al. (2013) dataset ( Figure S5; see Table S9 for additional information). Interestingly, we were unable to find aging-related methylation changes in normal lung tissue using our pipeline. The magnitude and distribution of the hyper-and hypomethylation changes in lung cancer and whole blood aging followed the same trend as observed for the other datasets ( Figure S5a, see Table S2 for a list of dmCpGs). The histone enrichment analyses revealed the same hypermethylation signature previously found for cancer and aging, and very clear and different hypomethylation signatures of H3K9me3 for lung cancer and H3K4me1/3 for whole blood aging ( Figure S5b and Table S8).
Finally, we compared the overlap between either hypermethylated or hypomethylated CpGs across tumors and their correspond-   Table S8 for 98 full cell and tissue types). Color code indicates the significant enrichment based on log2 odds ratio (OR). Common signatures are calculated from hyper-and hypomethylated dmCpGs shared between five tissues for cancer (1,962 and 2,708 probes, respectively) or three tissues for aging (904 and 106 probes, respectively) (see Table S6 for CpG lists). (b) Barplots indicating the ratio of OR for the H3K4me1/H3K9me3 marks associated with hypomethylated dmCpGs in aging and cancer. The ratio is calculated by taking the mean of OR of those tracks with significant over-enrichment for each histone mark and dividing the obtained numbers. (c) Venn diagrams showing the number and overlap of total nonredundant hyper-and hypomethylated dmCpGs detected in cancer and aging. dmCpGs that were only hypermethylated or only hypomethylated between all tissues were chosen for the comparison P EREZ ET AL.  (Figure 5b; see Figure S8 and  Figure S7 for tissue-specific signatures, and Table S10 for 98 full cell and tissue types). Color code indicates the significant enrichment based on log2 odds ratio (OR). Common signatures are calculated from hyper-and hypomethylated dmCpGs shared between five tissues for cancer (1,962 and 2,708 probes, respectively) or three tissues for aging (904 and 106 probes, respectively) (see Table S6 for CpG lists). (b) Panel indicating significant (p < .05) transcription factor enrichment at hyper-and hypomethylated dmCpG sites in aging and cancer that occurred in at least four or five tissues (see Figure S8 for tissue-specific results and Table S11 for full data). Only the most representative transcription factors (those that appeared as significantly over-enriched in at least three tracks or with an OR > 3 in any track) were selected for data representation GATA2/3. In this case, aging hypomethylated dmCpG sites tended to display a more marked enrichment of most of the cancer hypomethylation factors and, additionally, revealed the presence of other family-or function-related proteins, like FOSL1/2, MAFF, MAFK, and STAT3. When examining enrichment at common dmCpG sites shared by different tissues in cancer and aging, the initial observations were further confirmed ( Figure S9).
Gene ontology analyses (Figure 6a; Table S12; see Figure S10 for tissue-specific results) revealed that hypermethylated CpGs in both processes belonged to genes that were mainly related to developmental functions. While genes containing hypomethylated CpGs in cancer were associated with extracellular signaling, those for aging were, in general, much less enriched in any gene ontology. In the case of KEGG pathways (Figure 6a; Table S12 and Figure S10   Gene ratio a b c F I G U R E 6 Dissimilar functional context of differentially methylated CpGs in aging and cancer. (a) Panels indicating gene and KEGG pathway ontology enrichment for common hyper-and hypomethylated dmCpGs shared between five tissues for cancer (1,962 and 2,708 probes, respectively) or three tissues for aging (904 and 106 probes, respectively) (see Table S6 for CpG lists and Figure S9). Color code indicates the significance of the over-enrichment based on log10 p-value. Size of circles indicates gene ratio, calculated as the ratio of a number of identified hits with respect to the total number of components in a given ontology. this issue, we focused on kidney tissue (KIRC) as this TCGA dataset displayed a reasonable number of control and cancer patients with paired methylation and gene expression data. We initially performed differential gene expression analyses comparing young vs. old or normal vs. tumoral kidney samples (Figure 7a and Table S13). These results allowed us to identify a total of 13 and 20,678 differentially expressed genes (DEGs) in aging and cancer conditions, respectively.
The majority of the aging DEGs were also found in cancer, including, for example, the CKM gene, which contained a dmCpG in the proximity of its promoter (Figure 7b), was differentially expressed in both processes ( Figure 7c) and displayed a considerable negative correlation between DNA methylation and gene expression in normal kidney (Spearman r = À.37, Figure 7d). To further explore the potential relationships between CpG methylation and gene expression in these processes, and due to the reduced number of DEGs observed in the aging context, we decided to perform all potential pairwise correlations between DNA methylation and gene expression using canceror aging-related dmCpGs and genes expressed in a subset of normal kidney tissue samples (n = 18). This approach enabled us to quantify the extent to which CpGs whose methylation status changes in cancer and aging originally influence gene expression in normal tissue.
We computed a total of 2.58e 09 and 3.84e 08 correlations between cancer-and aging-related dmCpGs, respectively, and genes expressed in the normal KIRC dataset (Figure 7e). Despite the considerable difference in a number of dmCpGs between cancer and aging, when compared to the total possible number of correlations, we found similar percentages of strong correlations between DNA methylation and gene expression in both processes (Table S14).
Moreover, these proportions were also higher than those observed when sampling random probes from the array and computing their correlations (see Figure S11). These results indicate that both cancer-and aging-related dmCpGs are enriched in CpGs that can influence, to some extent, gene expression in kidney tissue.
A more detailed inspection of the strongest correlations (>= 0.9 or <= À0.9) identified in these datasets revealed that, in cancer, the  Table S13  dmCpG-gene pairs identified in the previous analysis which displayed correlation scores above 0.9 (pos) or below À0.9 (neg) in cancer (top) or in aging (bottom) conditions. (g) Stacked barplots indicating relative distribution of unique dmCpGs obtained from the previous correlations according to their CpG island status. (h) Stacked barplots indicating relative distribution of unique dmCpGs obtained from the previous correlations according to their gene location status. (i) Venn diagrams illustrating the overlap between dmCpGs identified in aging and in cancer which displayed strong positive or negative correlations (>0.9 or <À0.9) with genes expressed in the normal kidney dataset also present in the group of cancer-related dmCpGs (Figure 7i).
These results point toward similarities of cancer-and aging-related dmCpGs in the control of gene expression in normal tissue, despite the fact that the number of cancer-related dmCpGs is clearly larger than their aging counterparts.

| DISCUSSION
Although it is widely accepted that cancer is an age-dependent disease, the underlying molecular mechanisms are still poorly characterized. In this work, we have looked at the similarities and differences in epigenetic changes associated with cancer and aging.
In agreement with previously published data (Fern andez et al., 2015), we observed that the number of aging-and cancer-associated The changes in cell type composition that occur with age and cancer are also well-known confounding factors that could affect our datasets (Zheng et al., 2017). However, the application of the SVA method of correction (and Houseman correction for blood) and the use of a pure-cell dataset such as the glia dataset (Guintivano et al., 2013) allowed us to tackle this issue in two different ways.
Additionally, the use of the blood validation dataset (Hannum et al., 2013) allowed us to verify the reliability of our workflow, as in terms of whole blood dmCpGs we obtained 89% concordance with previous studies using the same data (Fern andez et al., 2015).
When analyzing the genomic distribution of dmCpGs and, in line with previously published reports (Cruickshanks et al., 2013;Kulis et al., 2012;Yuan et al., 2015), we found that hypomethylated CpGs were enriched at open sea DNA regions, principally intronic and intergenic, irrespective of the type of process. The distribution of hypermethylated CpGs was found to be similar to that of the array, which is to a certain extent to be expected because it was designed to interrogate a promoter-and CpG dense-biased portion of the genome. Nonetheless, hypermethylation changes always occurred in far more CpG-dense regions than hypomethylation changes (Day et al., 2013;Yuan et al., 2015), and this observed effect was especially noticeable for aging dmCpGs.
When studying the potential effect of tissue type on DNA methylation changes, we found, in agreement with recently published data (Chen, Breeze, Zhen, Beck & Teschendorff, 2016), that DNA methylation changes in different tumor types were surprisingly similar, regardless of the tendency of the alteration. This observation is conceptually relevant because it has classically been considered that different tumor types are characterized by specific DNA methylation signatures (Ehrlich & Jiang, 2005;Portela & Esteller, 2010). In this sense, our data confirm that, although different tumor types might display specific DNA methylation patterns, there is a significant common nexus between them. The analysis of the DNA methylation changes observed with respect to the aging process also revealed a significant overlap between tissue types, although it is possible that these results are affected by the variability in the sizes of the sets of probes detected in the aging analysis.
Our data revealed that dmCpGs shared by two or more tissues were much less likely to have different behaviors in other tissues, perhaps pointing toward nonstochastic and possibly functional roles for these CpGs.
The systematic DNA methylation analyses described in this study confirm that DNA hypermethylation in aging and cancer is associated with the same set of histone marks, including the repressive H3K27me3 and H3K9me3 marks, and the activating H3K4me1/3 posttranslational modifications. Chromatin state analysis revealed that the hypermethylation-associated H3K27me3 and H3K4me1/3 marks configured bivalent chromatin domains, as has been extensively described in embryonic stem cells (Fern andez et al., 2015;Ohm et al., 2007;Rakyan et al., 2010;Schlesinger et al., 2007;Teschendorff et al., 2010;Widschwendter et al., 2007). Moreover, our data reveal that this chromatin signature is not restricted to only embryonic stem cells, but rather this trend should be considered an extended, global tissue-independent chromatin signature of DNA hypermethylation in aging and cancer. Interestingly, Chen and colleagues (Chen et al., 2016) have recently demonstrated that normal tissue signatures are better predictors of DNA hypermethylation changes than ESC signatures. Furthermore, hypermethylation changes were also associated with the repressive histone mark H3K9me3 (Ohm et al., 2007), which was correlated to ZNF genes and DNA repeats in our chromatin state analyses, and which might have a potential relationship with the malignant transformation process (Severson, Tokar, Vrba, Waalkes & Futscher, 2013).
Regarding DNA hypomethylation, our results showed that agerelated DNA hypomethylation is associated with the activating histone posttranslational modification H3K4me1, which supports previously published data (Fern andez et al., 2015). A slight tendency for the enrichment of H3K27Ac, a histone mark characteristic of active enhancers (Creyghton et al., 2010), was also detected in our analyses. Intriguingly, the chromatin signature of DNA hypomethylation in cancer was substantially different, being primarily enriched in the posttranslational repressive histone modification H3K9me3, a relationship that has been investigated in colon and breast cancer (Berman et al., 2011;Hon et al., 2012). This observation might be conceptually relevant because DNA methylation has been proposed to be a molecular link between aging and cancer (Fraga, Agrelo & Esteller, 2007;Klutstein, Nejman, Greenfield & Cedar, 2016). However, our results suggest that the role of DNA methylation as a possible link between aging and cancer is more complex than previously proposed. Importantly, even though many of the observed DNA methylation changes in aging were not shared by tissues, we were able to describe a common chromatin signature characteristic of the aging process.
Regarding the analysis of the chromatin states, DNA hypomethylation in cancer was associated with heterochromatin DNA regions, which is in line with previous work (Berman et al., 2011;Kulis et al., 2012). In contrast, chromatin marks of DNA hypomethylation in aging were associated with enhancers, reinforcing previous observations performed with the Infinium HumanMethylation27K Beadchip platform (Day et al., 2013). As DNA methylation changes in enhancers have been shown to play an important role in gene regulation (Aran, Sabato & Hellman, 2013;Blattler et al., 2014;Heyn et al., 2016), our results suggest that DNA hypomethylation during aging might have a different functional role in gene regulation compared to DNA hypomethylation changes in cancer.
With regard to the potential effectors of the distinct chromatin signatures, enrichment analyses of transcription factors revealed the presence of EZH2 and SUZ12 polycomb components at DNA hypermethylated sites, both in cancer and aging. Specific aging hypermethylation-associated factors were also observed in our comparisons, such as REST, which has been reported in previously published data in blood (Yuan et al., 2015), and has also been correlated with longevity (Lu et al., 2014). Concerning DNA hypomethylation, transcription factors such as FOS, JUN, and JUND were detected at both cancer and aging hypomethylated CpG sites, but again aging displayed stronger and more varied enrichment, and included the presence FOSL1/2, other bZIP-domain factors like MAFF and MAFK, and STAT3, which has been associated with recruitment of the H3K4 methyltransferase SET9 at promoters (Yang et al., 2010). Altogether, these observations would imply that hypomethylation in aging displays a more marked functional context than that of cancer, exhibiting an increased enrichment of some factors also detected at cancer hypomethylated sites and other specific factors not found associated with tumoral changes.
Interestingly, our gene ontology analyses revealed similar gene functionalities affected by cancer and aging DNA hypermethylation, mainly related to developmental processes, which is in line with the methylation of bivalent chromatin promoters of developmental regulators in cancer and aging (Easwaran et al., 2012;Rakyan et al., 2010). On the other hand, DNA hypomethylation in cancer was mainly associated with functions identified with cellular signaling, and much lower enrichments in gene functions were found for hypomethylated CpGs in aging. A preponderance of nongenic enhancer hypomethylation in aging could potentially explain this absence of gene function association in our data.
To date, the potential relationships between DNA methylation and gene expression have only been systematically analyzed in a small subset of studies (Gevaert, Tibshirani & Plevritis, 2015;Gutierrez-Arcelus et al., 2013, and the potential effects of these relationships on aging and cancer are yet to be elucidated. To this end, we explored the establishment of potential correlations between these two processes using the TCGA-KIRC dataset. While most correlative studies focus on CpGs located at particular genomic regions, such as DNA promoters (Moarii, Boeva, Vert & Reyal, 2015) and Finally, we found that most of the tumor types analyzed in this study did not show age-associated DNA methylation changes, which is in agreement with the reprogramming of the epigenetic clock in cancer cells (Horvath, 2013). As an exception, we identified ageassociated dmCpGs in thyroid tumors. Uncommonly, thyroid cancer includes age as a prognostic indicator in most staging systems (Haymart, 2009), implying that these cancers do suffer age-related changes in their behavior. Intriguingly, this tissue displayed the lowest level of DNA methylation changes in cancer and one of the lowest in aging. Although the reasons for the different behavior of DNA methylation changes in thyroid are currently unknown, they could be related to the good prognosis that typically characterizes this type of tumor. In fact, Yang Z. and collaborator's "epiTOC" mitotic clock (Yang et al., 2016) shows thyroid cancer to have the least deviation from the behavior of its normal tissue. In this regard, future research should be conducted to address this issue.
In conclusion, our results indicate that hyper-and hypomethylated changes in aging and cancer each have similar genomic distributions and manifest tissue-independent trends in both processes.
We confirm that chromatin signatures of DNA hypermethylation in aging and cancer are similar but, strikingly, we demonstrate that they are different for DNA hypomethylation. Collectively, our data suggest that the possible role of DNA methylation as a molecular link between aging and cancer is more complex than previously thought.  (Bormann et al., 2016), and glia (Guintivano et al., 2013), respectively. Tissues were chosen based on disease prevalence, control data availability, and previous literature analyses to include both novel and pre-analyzed tissues. We also performed analyses on two supplementary datasets: lung adenocarcinoma (LUAD) and control TCGA data, and whole blood from a healthy cohort (Hannum et al., 2013). Extended information about the samples for each tissue type is shown in Tables 1, S1 and S9. Data were preprocessed as detailed in supporting information.

| Differential DNA methylation analyses
Differentially methylated probes (dmCpGs) in aging and cancer were calculated with the R package limma (version 3.32.2) (Ritchie et al., 2015). Briefly, a linear model between methylation levels as response variable, the variable of interest (either age group or sample_type), and surrogate variables (see Supplementary Methods) was fitted for each of the analyses, adjusting p-values to control for false discovery rate (FDR < 0.05). For the calculation of age-related dmCpGs, samples were divided into age quantiles in such a way as to obtain groups with sizes of n = 15-30, and comparisons were performed between the upper (OLD) and the lower (YOUNG) quantile. Cancer-related dmCpGs were calculated between normal tissue (Solid Tissue Normal) and tumor samples (Primary Tumor) as indicated in Tables 1, S1 and S2. Probes with M-value changes of <0.5 were not considered as dmCpGs, as has been suggested elsewhere (Du et al., 2010). Venn diagrams of relationships between dmCpGs were generated with the online resource provided by the UGent/VIB bioinformatics unit (http://bioinformatics.

| Region set enrichment analysis
Enrichment analyses were performed with the R package LOLA (version 1.4.0) (Sheffield & Bock, 2016), which looks for over-enrichment by conducting one-sided Fisher's tests (p < .05 significance threshold), by comparing overlap of probes (10 bp probe-centered windows) with the dataset of interest. Enrichment of histone marks was determined using histone ChIP-seq peak tracks (H3K4me1, H3K4me3, H3K27me3, H3K36me3, H3K9me3, and H3K27ac marks) from 98 epigenomes (primary tissues, cultures, and cell lines) obtained from the NIH Roadmap and ENCODE projects (Bernstein et al., 2010;Consortium 2012) (datasets obtained from http://datab io.org/regiondb) (see Table S8). The same method was employed for chromatin-segment analysis using NIH Roadmap's ChromHMM expanded 18-state model tracks for the same 98 epigenomes (see Figure S7 and Table S10, custom database generated with data obtained from http://egg2.wustl.edu/roadmap/). In a similar fashion, ChIP-seq peak tracks from ENCODE for transcription factor binding sites (TFBS) comprising 689 datasets corresponding to 188 TFs analyzed in 91 cell and tissue types were employed for TFBS enrichment analysis (http://databio.org/regiondb, see Table S11).

| Availability
All data generated during this study are included in this published article and its supplementary information files and are also available in the Zenodo public repository, https://doi.org/10.5281/zenodo. 1086491.