DNA methylation is a covalent modification in which the 5′ position of cytosine is methylated in a reaction catalyzed by DNA methyltransferases (DNMTs) with S-adenosyl-methionine as the methyl donor. In mammals, this modification occurs at CpG dinucleotides and can be catalyzed by three different enzymes: DNMT1, DNMT3a, and DNMT3b. DNA methylation plays a role in the long-term silencing of transcription and in heterochromatin formation. As an epigenetic modification, DNA methylation permits these silenced states to be inherited throughout cellular divisions.
In the context of DNA methylation, sequences within the genome can be classified into two different groups: CpG poor regions and CpG islands. CpG islands are defined as being longer than 500 bp and having a GC content greater than 55% and an observed CpG/expected CpG ratio of at least 0.65 (Takai and Jones, 2002). CpG islands are often but not always found in promoter regions, and about 40% of genes contain CpG islands that are situated at the end of the 5′ region (promoter, untranslated region, and exon 1) (Jones and Baylin, 2002). The rest of the genome, such as the intergenic and the intronic regions, is considered to be CpG poor. In healthy cells, CpG poor regions are usually methylated whereas CpG islands are generally hypomethylated, with a few exceptions including the inactive X chromosome. During the development of cancer, many CpG islands undergo hypermethylation while the CpG poor regions become hypomethylated. This alteration in DNA methylation pattern leads to changes in chromatin structure, causing the silencing of tumor suppressor genes and instability of the genome (Fig. 1) (Jones and Baylin, 2002). This change in methylation pattern during cancer is similar to the pattern observed on the inactive X chromosome. The inactive X chromosome is hypermethylated at CpG promoters, but contains less methylation than the active X chromosome at CpG sites located downstream of the promoter (Fig. 1) (Jones, 1999; Hellman and Chess, 2007).
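The CpG island criteria above can be checked for a single candidate sequence with a short calculation. The following is a minimal sketch in Python, not a published island-finding algorithm: the function names are hypothetical, the observed/expected ratio is computed with the commonly used formula (number of CpG × sequence length) / (number of C × number of G), and treating the 0.65 ratio as an inclusive lower bound is an assumption.

```python
def cpg_island_stats(seq):
    """Return (length, GC fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # CpG dinucleotides on the given strand
    gc_fraction = (c + g) / n if n else 0.0
    # Observed/expected CpG = (CpG count * length) / (C count * G count)
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return n, gc_fraction, obs_exp


def meets_island_criteria(seq, min_len=500, min_gc=0.55, min_obs_exp=0.65):
    """Apply the length, GC content, and obs/exp thresholds quoted in the text.

    Assumption: length and GC content use strict '>' as worded in the text,
    while the obs/exp ratio is treated as an inclusive lower bound.
    """
    n, gc_fraction, obs_exp = cpg_island_stats(seq)
    return n > min_len and gc_fraction > min_gc and obs_exp >= min_obs_exp


if __name__ == "__main__":
    candidate = "CG" * 300  # artificial 600 bp sequence, for illustration only
    print(cpg_island_stats(candidate))
    print(meets_island_criteria(candidate))
```

A real analysis would scan a chromosome with a sliding window and merge adjacent qualifying windows; this sketch only evaluates one sequence that has already been extracted.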
Traditionally, DNA methylation has been divided into two types: de novo and maintenance methylation. De novo methylation is catalyzed by DNMT3a and DNMT3b and is important for the establishment of methylation patterns in early embryos, during development, and during carcinogenesis (Fig. 1) (Okano et al., 1999). In order to maintain these methylation patterns set by de novo methylation, DNMT1 is localized to the replication fork during cellular division and conducts maintenance methylation (Leonhardt et al., 1992; Liu et al., 1998). However, DNMT1 has been shown to be inefficient at maintaining the methylation of many CpG dense regions (Liang et al., 2002). Therefore, the de novo activities of DNMT3a and DNMT3b are also necessary in somatic cells in order to reestablish the methylation patterns so that they are not lost due to the inefficient activity of DNMT1 (Fig. 2).
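The argument that imperfect maintenance methylation must be supplemented by de novo activity can be illustrated with a toy calculation. The sketch below is a deliberately simplified model, not one taken from the cited studies: `maintenance_fidelity` (the chance that a methylated site is copied at each division) and `de_novo_rate` (the chance that an unmethylated site is methylated afterwards) are hypothetical parameters, and the values used are illustrative rather than measured.

```python
def simulate_methylation(divisions, maintenance_fidelity, de_novo_rate, start=1.0):
    """Track the methylated fraction of CpG sites in a region across cell divisions.

    Toy model (assumption): every site behaves independently and identically.
    """
    level = start
    history = [level]
    for _ in range(divisions):
        # Maintenance step: each methylated site is retained with the given fidelity.
        kept = level * maintenance_fidelity
        # De novo step: each unmethylated site is methylated with the given rate.
        level = kept + (1.0 - kept) * de_novo_rate
        history.append(level)
    return history


if __name__ == "__main__":
    # Illustrative values only: 95% maintenance fidelity, with and without de novo activity.
    without_de_novo = simulate_methylation(20, 0.95, 0.0)
    with_de_novo = simulate_methylation(20, 0.95, 0.5)
    print(f"maintenance only:       {without_de_novo[-1]:.3f}")
    print(f"maintenance + de novo:  {with_de_novo[-1]:.3f}")
```

Under these assumptions, maintenance alone lets the methylated fraction decay geometrically with each division, whereas even a modest de novo rate holds it near a stable level.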
Although much progress has been made in understanding the role DNA methylation plays in controlling cellular processes, there are still many details that are not fully understood. Recently the DNA methylation field has focused on trying to answer a few key questions: (1) What role does DNA methylation play in the silencing of genes? (2) How does DNA methylation silence genes? (3) How is de novo DNA methylation targeted? and (4) Why are CpG islands generally not methylated in normal cells and what are the possible mechanisms that could lead to methylation of CpG islands during the development of cancer cells? This review surveys the research that has been done in an attempt to answer these questions.
The Role of DNA Methylation
Silencing of genetic elements can be successfully initiated and retained by histone modifications and chromatin structure. However, these modifications are easily reversible, making them poor gatekeepers for long-term silencing (Shi et al., 2004; Takenchi et al., 2006). Therefore, mammalian cells must possess an additional mechanism for prolonged silencing of these sequences. An important component of this process is DNA methylation. DNA methylation is a stable modification that is inherited throughout cellular divisions. When found within promoters, DNA methylation prevents the reactivation of silent genes, even when the repressive histone marks are reversed (McGarvey et al., 2007). This allows the daughter cells to retain the same expression pattern as the precursor cells and is important for many cellular processes including the silencing of repetitive elements, X-inactivation, imprinting, and development.
The mammalian genome is complex, consisting not only of coding material but also of transposons and other parasitic elements that have been acquired by the genome over time. These repetitive sequences make up much of the intergenic and intronic regions of DNA. Many of these repetitive elements contain long terminal repeat promoters which permit the transcription of these sequences (Kochanek et al., 1995; Yoder et al., 1997). Since the expression of these sequences can allow for the movement of the parasitic elements within the genome, these elements must be persistently silenced by DNA methylation in order to preserve the integrity of the genome (Robertson and Wolffe, 2000). During tumorigenesis, however, these CpG poor regions become hypomethylated by an unknown mechanism. This hypomethylation leads to the expansion of the parasitic elements, and may play a role in carcinogenesis (Wilson et al., 2007).
In addition to silencing repetitive elements, CpG methylation is also an important constituent in X chromosome inactivation and in the establishment and maintenance of imprinted genes. Both X-inactivation and imprinting are forms of non-Mendelian inheritance in which one allele becomes methylated, leading to mono-allelic expression. During embryogenesis one of the X chromosomes is inactivated in order to preclude the expression of its genes. Methylation of CpG rich promoters found within the inactive X chromosome stabilizes the repressed state of the corresponding genes and allows these silent states to be inherited through every cell division (Chang et al., 2006). Imprinting is important for determining which parental allele will be expressed. Imprinted genes are marked in the gonads by DNA methylation of the imprinted control region (ICR), allowing the daughter cells to retain the same mono-allelic expression as their parental origin (Jelinic and Shaw, 2007).
The role DNA methylation plays in development has been a matter of debate. Although some germ-line and tissue specific promoters have been shown to undergo DNA methylation during cellular differentiation, there are many developmental genes which are not silenced by DNA methylation (Bird, 2002; Strichman-Almashanu et al., 2002; Fazzari and Greally, 2004). This leads many to question if DNA methylation really plays a substantial role in the developmental process. A recent study, however, has provided new evidence implicating DNA methylation in the silencing of germline specific genes (Weber et al., 2007). By looking at global DNA methylation, this study found a subset of genes that were methylated in fibroblasts but not in sperm. A large number of these differentially methylated genes were found to be germline specific. These results imply that somatic DNA methylation does contribute to differentiation by repressing key genes in the germline and irreversibly forcing the cell on a path to differentiation (Weber et al., 2007; Ziblerman, 2007).
Most of the research on DNA methylation and promoters has been focused on CpG islands and not CpG poor promoters. About 60% of human genes contain non-CpG rich promoters and although these promoters are CpG poor they are not necessarily methylated in healthy cells (Takai and Jones, 2002). Methylation of CpG islands has been shown to have a direct effect on the silencing of genes; however, no strong correlation has been established yet for CpG poor promoters. Many CpG poor promoters have been shown to be methylated in cell lineages, suggesting that methylation of CpG poor promoters may play a role in development and differentiation (Futscher et al., 2002; Hattori et al., 2004; Blelloch et al., 2006). This link has yet to be clearly established and calls for further research in this area.
Regulation of Transcription by DNA Methylation
DNA methylation has been correlated with transcriptional silencing for over 20 years. However, only recently have the mechanisms by which DNA methylation inhibits transcription begun to be uncovered. Numerous processes by which DNA methylation can influence transcription have been proposed. One model suggests that DNA methylation can directly impede the binding of transcription factors to their target sites, thus prohibiting transcription. Most of the other proposed mechanisms are based on the idea that methylation of CpG sequences can alter chromatin structure by affecting histone modifications and nucleosome occupancy within the promoter regions of genes.
Many transcription factors are targeted to CG-containing sequences, and methylation of CpG sites within these sequences has been shown to prevent the binding of these proteins to these sites (Comb and Goodman, 1990; Prendergast et al., 1991). Most of the evidence for this mechanism comes from studies done on c-Myc and CTCF. c-Myc is a transcription factor that is involved in the regulation of cell growth and differentiation. Studies using gel-shift assays have shown that DNA methylation excludes c-Myc from binding to its consensus site (Prendergast et al., 1991). CTCF is best known for the role it plays in imprinting at the H19/Igf2 locus. CTCF acts to silence the maternal copy of the Igf2 gene by binding to a site between the enhancer and promoter. At the paternal locus, however, methylation of the CTCF binding site prevents CTCF binding and allows for activation of Igf2 (Bell and Felsenfeld, 2000). Although this is an important mechanism, relatively few factors whose binding is affected by methylation have been identified. Furthermore, methylated DNA has been shown to be transcribed in the absence of chromatin or methyl-binding proteins (MBPs) (Kass et al., 1997). This suggests that there must be additional mechanisms involving chromatin structure by which DNA methylation can silence genes.
DNA methylation can prevent transcriptional activation by interfering with the propagation of active chromatin marks. Genes are poised for transcriptional activation by the methylation of H3K4 (histone H3 lysine 4) in a reaction that is catalyzed by a family of H3K4 methyltransferases (Ruthenburg et al., 2007). Some of these H3K4 methyltransferases have been shown to be targeted to sites which contain a local concentration of CpG dinucleotides (Voo et al., 2000; Birke et al., 2002; Ayton et al., 2004; Lee and Skalnik, 2005). Methylation of these sites prevents binding of these methyltransferases, thereby impeding these genes from being primed for activation (Voo et al., 2000; Birke et al., 2002; Ayton et al., 2004; Lee and Skalnik, 2005). In addition, recent evidence has shown that DNA methylation prevents the incorporation of H3K4me2 and H3K4me3 on episomes (Okitsu and Hsieh, 2007). This mechanism provides a way for DNA methylation to safeguard the silent state of genes.
In addition to directly inhibiting transcriptional factors from binding, DNA methylation also recruits MBPs that specifically bind to methylated CpGs. This family of proteins consists of five members all containing a homologous methyl-CpG-binding domain (Sansom et al., 2007). In addition, a non-homologous protein called Kaiso has also been shown to bind a methylated CGCG motif (Sansom et al., 2007). MBD1, MBD2, MBD3, MeCP2, and Kaiso have all been shown to play a role in methylation-dependent silencing of transcription. How these proteins repress transcription is not fully understood. However, it has been shown that MBPs can bind repressors and histone deacetylases which may lead to an inactive chromatin structure (Jones et al., 1998; Nan et al., 1998; Harikrishnan et al., 2005).
Methylation of CpG sites within promoters may also affect nucleosome occupancy at the transcriptional start sites of genes. This in turn could affect transcriptional activation of these genes. Nucleosome occupancy has been shown to decrease the binding of transcription factors and RNA polymerase II (Li et al., 2007). Experimental studies suggest that promoters contain nucleosome free regions at their transcriptional start sites which allow for the binding of transcriptional activators and RNA polymerase II to the DNA (Gal-Yam et al., 2006; Ozsalak et al., 2007; Lin et al., personal communication). Evidence suggests that DNA methylation can affect nucleosome occupancy. The first experimental data in support of this claim came from in vitro studies which showed that methylation of CpG sites could affect nucleosome positioning at particular sequences (Davey et al., 1997). Since then, studies of the MGMT and MLH1 promoters have shown that DNA methylation does affect nucleosome occupancy in these “nucleosome free” regions in vivo (Patel et al., 1997; Lin et al., personal communication). The mechanism by which DNA methylation dictates nucleosome occupancy is not fully understood. It has been shown that MeCP2 binds to Brahma, a catalytic subunit of the chromatin remodeling complex SWI/SNF (Harikrishnan et al., 2005). This interaction may allow for the targeting of the chromatin-remodeling complex to methylated CpG sites, resulting in changes of nucleosome occupancy. In addition, DNMT3a has been shown to interact with the chromatin remodeler hSNF2h, suggesting that chromatin remodelers may be directly targeted to these sites by DNA methyltransferases (Geiman et al., 2004).
Regulation of De Novo Methylation
In somatic cells between 70% and 90% of CpG dinucleotides found in the genome are methylated. In healthy cells most of this methylation occurs at CpG poor regions dispersed throughout the genome whereas most CpG islands are unmethylated (Jones and Baylin, 2002). Given the existence of both methylated and unmethylated CpG sites there must be mechanisms within the cell which control these patterns of methylation.
One mechanism by which DNA methyltransferases may be targeted to particular sites within the genome is by recognizing specific chromatin structures. Traditionally chromatin has been divided into two different states: an active euchromatic state and an inactive heterochromatic state. Euchromatin makes up a large portion of the genome and is usually defined by di- and trimethylation of lysine 4 on histone H3 and acetylation of the histones H3 and H4. Euchromatin represents a flexible state where genes can be kept on or turned off (Kouzarides, 2007). In contrast, heterochromatin is a compact structure involved in mitosis and in the protection of chromosome ends. In mammals, heterochromatin is marked by trimethylation of lysine 27 on histone H3, trimethylation of lysine 9 on histone H3, or methylation of lysine 20 on histone H4. H3K9me3 is preferentially localized to pericentromeric heterochromatin, whereas H3K27me3 is involved in homeotic gene silencing and marks the inactive X chromosome (Kouzarides, 2007). Since DNA methylation occurs at heterochromatic regions, these histone modifications, either individually or together, make probable targets for the DNA methyltransferases.
For many years it has been suspected that chromatin structure could recruit DNA methyltransferases to specific loci. DNA methyltransferases have been shown to interact with the histone methyltransferases SUV39, which is responsible for methylation of H3K9, and EZH2, which catalyzes H3K27 methylation (Fuks et al., 2003; Vire et al., 2006). These interactions have been proven to be important for DNA methylation of heterochromatin and EZH2 targeted genes. In both Neurospora and Arabidopsis, mutations in the histone H3K9 methyltransferase caused significant loss of genomic DNA methylation and in mammalian cells SUV39 has been shown to be required for Dnmt3b-dependent DNA methylation at pericentric repeats (Jackson et al., 2002; Lehnertz et al., 2003; Tamaru et al., 2003). Knockdown studies of the polycomb methyltransferase EZH2 show that EZH2 is necessary for the CpG methylation of EZH2 targeted genes (Vire et al., 2006). DNA methyltransferases have also been shown to interact with HP1, a protein that specifically binds to methylated lysine 9 on histone H3, and evidence supports a model in which HP1 is responsible for recruiting the DNA methyltransferases to loci marked by H3K9me3 (Smallwood et al., 2007). Further proof that methylation of H3K9 and H3K27 may target DNA methylation comes from genome wide studies showing that these histone modifications precede DNA methylation. These studies have shown that when DNA methylation is inhibited there is no effect on the methylation of H3K9 or H3K27 in repeat sequences or CpG islands (McGarvey et al., 2006). Taken together, these results suggest that trimethylation of H3K9 and H3K27 are important markers of DNA methylation.
Recently it has been proposed that the occurrence of H3K9me3/H3K9me2 and H3K27me3 within the same region of the genome serves as a signal for DNA methylation of CpG islands within promoters (Ohm and Baylin, 2007). The need for dual silencing marks to signal DNA methylation was first discovered in plants. Non-CpG methylation was shown to be targeted only to loci that contained both H3K27me3 and H3K9me3 (Lindroth et al., 2004). Since then it has been shown that H3K9me3/H3K9me2 and H3K27me3 are found in genes that are commonly hypermethylated in cancer, suggesting that a similar mechanism may occur in mammals within CpG islands (Ohm and Baylin, 2007). Recent studies have suggested that adult stem cells, which contain the bivalent histone marks H3K4me2 and H3K27me3 at many CpG islands, can lose the H3K4me2 mark and gain the H3K9me3 and H3K9me2 marks. Once this occurs, the dual marked CpG island is targeted for DNA methylation and the stem cell is on its way to becoming cancerous (Ohm and Baylin, 2007).
Although these studies provide convincing evidence that certain modifications are needed for DNA methylation, they do not prove that these modifications are the targeting factors for DNA methylation. Data have been collected from only a few genes, and genome wide studies will need to be done in order to determine if there is a global correlation between DNA methylation and these silencing histone modifications. In addition, there have been cases where H3K9me3 and H3K27me3 are present at loci where there is no DNA methylation (Lewis et al., 2004; Umlauf et al., 2004). Therefore, the H3K9me3, H3K9me2, and H3K27me3 marks may predispose sequences to DNA methylation, but additional mechanisms may be needed to target the DNA methyltransferases to these sites.
Another mechanism by which DNA methyltransferases can be targeted to promoters is by repressors. Evidence for this comes from studies done on the PML-RAR fusion protein and on Myc. These studies showed that the oncogenic PML-RAR fusion protein can induce gene hypermethylation and silencing by recruiting DNA methyltransferases to target promoters (Di Croce et al., 2002). In addition, the Myc protein was shown to associate with DNMT3a and recruit DNMT3a to the p21cip1 promoter. This recruitment results in the silencing of the p21cip1 promoter (Brenner et al., 2005).
RNAi has also been shown to play a role in targeting DNA methylation. In plants, RNAi-mediated silencing can result in de novo methylation of genes (Matzke and Birchler, 2005). Cell culture studies have shown that RNAi-mediated silencing can lead to de novo methylation (Morris et al., 2004). There are contradicting results, however (Murchison et al., 2005), and further research needs to be done to show that this mechanism occurs in mammals.
Aberrant DNA Methylation and Cancer
Early indications that DNA methylation may be involved in cancer came with the discovery of global hypomethylation in tumors (Riggs and Jones, 1983). Since then it has been shown that although the genome as a whole is hypomethylated in cancer, many CpG islands are hypermethylated (Jones and Baylin, 2007). Hypermethylation of CpG islands leads to silencing of genes, including many tumor suppressors, thereby contributing to the process of tumorigenesis. One question that has perplexed the field for decades is how this change in the distribution of methylation comes about. Unfortunately, without fully understanding the targeting mechanisms for DNA methylation, this question is difficult to answer.
Many hypotheses have been proposed to explain why certain genes are methylated in cancer. One Darwinian theory suggests that particular genes become methylated in tumors because inactivation of these genes provides a selective growth advantage for the cells (Esteller, 2005). Another hypothesis suggests that genes which are under the control of PcG are more vulnerable to DNA methylation. Recently it has been shown that PcG marked genes are 20 times more likely to be methylated in cancer. This idea has been reviewed in detail elsewhere (Ohm and Baylin, 2007). In addition, it has been suggested that the DNMTs may be aberrantly targeted to specific promoters by fusion proteins (Di Croce et al., 2002). Aberrant targeting of DNMTs by fusion proteins has been found to occur in a limited number of cases and is therefore probably not a general phenomenon in cancer.
Much progress has been made in our understanding of the basic mechanisms of DNA methylation, yet there are still many questions left unanswered. Though DNA methylation has definitively been shown to play a role in X-inactivation, imprinting, and silencing of repetitive sequences, the role of DNA methylation in development is still an area of debate. Exciting new evidence has implicated DNA methylation in the silencing of germline specific genes. However, more work needs to be done in this area to establish whether this methylation is a cause of differentiation rather than a result of it.
In addition, there is still more left to learn about the mechanism by which DNA methylation silences genes. There is strong evidence supporting the notion that DNA methylation prevents binding of some transcription factors to their targeted sites, but this seems limited to a subset of factors and genes and is not itself a global mechanism. Recent evidence suggests a silencing mechanism by which DNA methylation leads to the recruitment of chromatin remodelers possibly through MBPs. These chromatin remodelers would then affect the nucleosome occupancy within promoters leading to chromatin compaction and long-term silencing of genes (Fig. 3D,E). Further studies will need to be done to show that there is a link between DNA methylation of promoters and nucleosome occupancy on a global scale.
The mechanism that targets DNA methylation to particular sites is still not understood. Many studies have linked various histone modifications with the DNA methylation status of genes. These results suggest a model in which inactive marks on histones play a role in dictating which CpG sites will become methylated. When CpG islands contain active histone marks, they are not targets of DNA methylation. However, when these active marks are lost and the inactive H3K27me3 and H3K9me3 marks are acquired, these sites become susceptible to DNA methylation (Fig. 3). Whether this is all that is needed for DNA methylation to occur, or whether there is another layer of defense, is still a matter of debate.
Given that DNA methylation plays a prominent role in cancer development, it is important for us to fully understand the mechanisms used by the cell to target the DNA methyltransferases to specific loci. It is equally imperative that we have a complete grasp of how DNA methylation leads to silencing. Only then will we be able to develop better therapies for the prevention and treatment of cancer.
We thank Mark Miranda and Connie Cortez for proofreading the manuscript.