• Open Access

Mammalian epigenetic mechanisms

Abstract

The mammalian genome is packaged into chromatin that is further compacted into three-dimensional structures consisting of distinct functional domains. The higher order structure of chromatin is in part dictated by enzymatic DNA methylation and histone modifications to establish epigenetic layers controlling gene expression and cellular functions, without altering the underlying DNA sequences. Apart from DNA and histone modifications, non-coding RNAs can also regulate the dynamics of mammalian gene expression and various physiological functions including cell division, differentiation, and apoptosis. Aberrant epigenetic signatures are associated with abnormal developmental processes and diseases such as cancer. In this review, we will discuss the different layers of epigenetic regulation, including writer enzymes for DNA methylation, histone modifications, non-coding RNA, and chromatin conformation. We will highlight the combinatorial role of these structural and chemical modifications along with their partners in various cellular processes in mammalian cells. We will also address the cis and trans interacting “reader” proteins that recognize these modifications and “eraser” enzymes that remove these marks. Furthermore, an attempt will be made to discuss the interplay between various epigenetic writers, readers, and erasers in the establishment of mammalian epigenetic mechanisms. © 2014 IUBMB Life, 66(4):240–256, 2014

Introduction

The term “Epigenetics” was first introduced by Conrad Waddington in 1942, as the study of the causal interaction between genes and their products controlling phenotypic changes which occur over the course of development. After decades of scientific discovery, epigenetics has evolved into a subject that is focused on the study of changes that are inherited during mitosis/meiosis, without altering the underlying DNA sequences. Mammalian gene expression is regulated at multiple epigenetic layers. The four major determinants are DNA methylation patterns, histone modification signatures, chromatin conformation characteristics, and non-coding RNAs (ncRNAs) (Fig. 1).

Figure 1.

Landscape of epigenetic layers. The epigenetic layers, namely chromatin physical contact in three-dimensional space (chromatin conformation) (upper panel in Fig. 1), covalent chemical modification of DNA and histones (middle panel in Fig. 1), and post-transcriptional gene expression (lower panel in Fig. 1), work in concert to dictate mammalian physiological functions such as cell division, differentiation, and apoptosis. Red dots in the upper panel of Fig. 1 represent physical contacts among chromatin segments and between chromosomes; lollipops with red heads in the middle panel represent DNA methylation; green, blue, and brown dots in the middle panel represent covalent modifications on histone tails, such as phosphorylation, methylation, and acetylation; red curves in the lower panel represent mature miRNA or siRNA; orange curves in the lower panel represent mRNA.

DNA methylation is the most commonly studied epigenetic mark in the mammalian genome. It is primarily thought to suppress the binding of transcription factors to gene promoters, thus controlling gene expression. Recently, repressor proteins that can read and bind methylated DNA have been shown to be a major mechanism of transcriptional repression. This precise spatial and temporal gene transcription regulatory mechanism by appropriate gene methylation is of vital importance during mammalian development. DNA methylation also contributes to other developmental events like X-chromosome inactivation and genome stability [1]. DNA methylation patterns are faithfully inherited during both mitosis and meiosis. Recent studies have demonstrated that while some CpG regions are stably methylated, a small number of dynamic methylated regions could play a major role in controlling the transcription network of cells [2]. Failure to maintain correct methylation patterns leads to aberrant DNA methylation, often observed in human diseases including neurodevelopmental defects, neurodegenerative, neurological and autoimmune diseases, and cancers [3].

Nucleosomes are the basic units of chromatin and constitute the bulk of compacted chromosomes. A mononucleosome consists of genomic DNA wrapped around a histone octamer scaffold. The protruding NH2-tails of histones in the nucleosomes can be modified in a number of ways, for example, lysine and arginine methylation, lysine acetylation, serine, threonine and tyrosine phosphorylation, and lysine ubiquitination and sumoylation [4]. These modifications can change the charge of chromatin leading to a more condensed or open state, which subsequently dictates the accessibility of regulatory proteins including transcriptional regulators. In transcriptionally non-permissive chromatin, regulatory repressor proteins recognize modified histone tails for recruitment, leading to chromatin condensation and obstruction of transcriptional activator binding. Mammalian genomes are compacted into highly condensed chromatin via histones and other scaffold proteins to maintain a compartmentalized three-dimensional conformation. This topological organization is thought to control gene expression directly. Recent advances in chromatin conformation capture technologies have revealed that chromatin adopts a non-random three-dimensional structure with genes organized into hubs of different transcription states [5].

Advances in high-throughput sequencing of the transcriptome have led to the deciphering of so-called “dark regions” of the genome, which were previously thought to be junk DNA. These “dark regions” are now found to be transcriptionally active and the majority of these transcripts are non-coding RNAs, including microRNA (miRNA), small nucleolar RNA (snoRNA), and long non-coding RNA (lncRNA). These RNAs play many important roles in transcriptional repression and activation, including heterochromatization [6], targeting of enzymatic modification of rRNAs [7], genome imprinting, X chromosome inactivation [8], and a variety of other biological processes. Extensive cross-talk between different epigenetic layers maintains and regulates complex transcriptional networks in cells. Indeed, the promoters of miRNAs are frequently targets of DNA hypermethylation and are therefore silenced in various cancers [9, 10].

Here, we will discuss recent advances in epigenetic modifications and their relationship to transcriptional regulation. A major focus will be on the cross-talk among different epigenetic layers that leads to aberrant chromatin modification and dysregulation of gene expression in diseases including cancer.

Functional Consequences of DNA Methylation in Mammalian Cells

In the mammalian genome, methylation at the C-5 position of cytosine results in 5-methylcytosine (5mC). 5mC is the predominant DNA modification in mammals. However, in recent years, its oxidative derivatives, 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC), and 5-carboxylcytosine (5caC) have also been found in mammalian DNA. Those oxidative products are hypothesized to be intermediates in an active DNA demethylation pathway [11, 12]. DNA methylation generally occurs in CpG dinucleotides in mammalian cells, although a small percentage of methylation at CHG and CHH sequences is also observed in embryonic stem (ES) cells [13]. In the human genome, about 30% of the CpG islands, defined as regions longer than 200 bp with a higher than expected density of CpG sites, are located in the transcription start sites (TSS), leaving only about 32% in gene bodies [14]. Most of the CpG islands in promoter regions remain unmethylated in somatic cells to maintain transcription of the active genes. The CpG island-containing genes are known as housekeeping genes and are generally involved in cell cycle regulation and DNA repair.
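The CpG island definition above (a region longer than 200 bp with a higher than expected density of CpG sites) can be illustrated with a minimal sketch. The 200 bp length cut-off comes from the text; the GC-content threshold of 0.5 and the observed/expected CpG ratio threshold of 0.6 are commonly used values and are assumptions here, not figures from this review. The function names `cpg_stats` and `looks_like_cpg_island` are hypothetical.

```python
def cpg_stats(seq: str) -> dict:
    """Return length, GC fraction, and observed/expected CpG ratio of a DNA sequence."""
    s = seq.upper()
    n = len(s)
    c = s.count("C")
    g = s.count("G")
    observed = s.count("CG")  # CpG dinucleotides on the given strand
    gc_fraction = (c + g) / n if n else 0.0
    # Expected CpG count if C and G were distributed independently: (C * G) / N
    expected = (c * g) / n if n else 0.0
    obs_exp = observed / expected if expected else 0.0
    return {"length": n, "gc_fraction": gc_fraction, "obs_exp": obs_exp}


def looks_like_cpg_island(seq: str,
                          min_len: int = 200,         # from the text: longer than 200 bp
                          min_gc: float = 0.5,        # assumed threshold
                          min_obs_exp: float = 0.6    # assumed threshold
                          ) -> bool:
    """Apply simple length, GC-content, and CpG-density criteria to one sequence."""
    stats = cpg_stats(seq)
    return (stats["length"] > min_len
            and stats["gc_fraction"] > min_gc
            and stats["obs_exp"] > min_obs_exp)
```

For example, `looks_like_cpg_island("CG" * 150)` evaluates the criteria on a 300 bp artificial CpG repeat; real analyses scan genomes with sliding windows rather than testing a single sequence.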

DNA methylation could affect gene transcription in three ways: (a) by changing transcription factor binding affinity to a gene promoter; (b) affecting the binding of methylation-specific recognition factors to promoter or gene bodies; (c) altering the chromatin structure and spatial accessibility of transcription factors and/or other DNA binding proteins (Fig. 2).

Figure 2.

Model of how DNA methylation affects gene expression. Currently known models of how DNA methylation affects gene expression: A, transcription factors like YY1 are sensitive to methylated DNA and do not guide the initiation of transcription when the template is methylated; B, MBD proteins can tether to methylated DNA and insulate it from transcription factor binding; C, methylated DNA can be made more compact by chromatin remodeling enzymes, so that transcription factors cannot access the heterochromatin. Lollipops with red heads (in Figs. 2A, 2B) and red dots (in Fig. 2C) represent methylated CpG dinucleotides; orange curves represent RNA transcripts.

There are many examples of reduced binding preference of transcription factors to methylated target promoter sequences compared to unmethylated, thereby impairing transcription. Such transcription factors include, but are not limited to, the adenovirus major late transcription factor (MLTF) [15], the Gli-type transcription factor YY1 that controls the paternal expression of Peg3 and Usp29 [16], and the USF1/2 transcription factor that positively regulates the expression of UDP-glucuronosyltransferase 1A1 in colorectal cancer cells [17]. Impairment of transcription factor binding is usually determined by a single or a few pivotal methylated CpG sites; for example, the YY1-binding site in Peg3 contains one CpG site and methylation of this site is sufficient to abolish the binding of YY1 in vitro [16]. Nevertheless, there are other reports demonstrating that DNA methylation does not always suppress transcription. In a detailed transcription factor microarray binding analysis, a number of purified transcription factors displayed methylated CpG binding activities. Such examples are zf-C2H2, Homeobox, bHLH, Forkhead, bZIP, and HMG box subfamilies [18]. There are also transcription factors showing sequence specific binding patterns irrespective of the methylation status of the sequence as observed for Kruppel-like factor 4 (KLF4). These examples suggest that promoter DNA methylation inhibition of transcription can be dominant; in other cases, it may be transcription factor sequence specificity that dictates the transcription mechanism. Furthermore, recent findings of zinc finger family transcription factor Kaiso and ZFP57 binding on methylated DNA suggested the possibility that DNA methylation can promote the initiation of transcription [19]. These reports suggest that although DNA methylation can be repressive for transcription, in some cases it may work as a positive effector.

Modification of DNA may repel the binding of certain proteins while attracting others. One family of attracted proteins is the MBD (methyl-CpG-binding domain)-containing proteins, which include MeCP2 [20, 21], MBD1 [22, 23], MBD2 [24], MBD3 [25], and MBD4 [26]. These proteins recognize and bind methylated CpG sequences and act as insulators for transcription factor binding [27]. The mutation or duplication of MeCP2 has been known to cause severe neurodevelopmental disorders and progressive neurological diseases like Rett syndrome [28, 29]. Evidence indicates that MeCP2 can tether to methylated DNA and repress transcription. Indeed, in Mecp2 null mice ∼85% of genes such as Sst, Oprk1, and Mef2c show transcriptional activation due to a loss of MeCP2 binding and the recruitment of the CREB1 transcriptional activator [30]. DNA methylation in gene bodies was thought to inhibit spurious intragenic transcription initiation and thereby reduce transcriptional noise, which arises from unintentional transcription initiation within gene bodies. When microarray data were compared to base resolution DNA methylation data, it was discovered that gene body methylation inversely correlated to transcriptional noise [31]. Nevertheless, recent studies have shown that the relationship between transcriptional activity and gene body DNA methylation is non-monotonic and bell-shaped [32]. The proposed model suggests that highly condensed DNA is a poor substrate for both DNA methyltransferases and RNA polymerases; the DNA of genes with moderate transcription levels has a more open conformation and is accessible to DNA methyltransferases; while actively transcribed genes are loaded with RNA polymerases, so that DNA methyltransferases are excluded [32]. This model supports a consequential, but not causative, role of gene body methylation in gene expression.

Methylation is thought to compact the DNA, making it less accessible to the transcriptional machinery and thereby reducing transcriptional activity at the methylated gene. DNA methylation has been observed to compact chromatin fiber in the presence of histone H1 in vitro and methylated DNA is also more resistant to micrococcal nuclease digestion in NIH/3T3 cells [33]. However, studies on chromatin structure using Dnmt knockout mouse ES cell lines suggest that DNA methylation affects core histone modification and linker histone mobility, but not chromatin condensation in bulk chromatin and heterochromatin [34]. These studies indicate that DNA methylation may play an indirect role in modeling chromatin structure through core histone modification and linker histone occupancy. Indeed, combinatorial whole genome DNA methylation and chromatin conformation analyses have clearly elucidated the interdependency between chromatin compaction and DNA methylation [5, 35].

5hmC is the second DNA cytosine-5 modification in mammalian genomes. In addition to the intermediate role it plays in active DNA demethylation (discussed later), 5hmC also participates in gene expression regulation. Specifically, 5hmC is enriched in the intragenic regions of genes related to neurodegenerative disorders in mouse cerebellum cells in an age-dependent manner [36]. Additionally, enrichment of gene-body 5hmC facilitates transcription and helps maintain cellular identity in neuronal populations, especially in mature olfactory sensory neurons [37]. Furthermore, in ES cells, 5hmC is enriched in the body of actively transcribed genes and promoter regions of Polycomb-repressed developmental regulator genes [38]. Although an activator role of 5hmC has been reported in neuronal and ES cell types, the detailed mechanism of transcriptional activation remains elusive. Recently, the methyl-CpG-binding domain containing protein MeCP2 has been found to bind 5hmC in actively transcribed gene bodies with similar affinity to its binding of 5mC in repressed gene promoters [39], which indicates that MeCP2 may play a role in active transcription of genes enriched in 5hmC in their coding regions.

Establishment and Maintenance of DNA Methylation Patterns in Mammals by DNMTs

Three DNA cytosine-5 methyltransferase family members DNMT1, DNMT3a, and DNMT3b and an accessory protein known as DNMT3L are responsible for the establishment and maintenance of DNA methylation patterns in mammals. Purified recombinant DNMT1 shows 7–21-fold higher preference for hemimethylated versus unmethylated DNA substrates [40, 41], whereas DNMT3a has threefold higher activity on unmethylated versus hemimethylated DNA substrates [42]. These enzymes work in concert during mammalian development. During early embryo development, specifically post fertilization, the embryo loses DNA methylation. It is believed that de novo methyltransferases DNMT3a and DNMT3b are the early DNA methylation writers. After the eight-cell stage of development DNMT1 participates in DNA methylation, thus ensuring the faithful maintenance of established DNA methylation patterns. The peptide sequence of DNMT1 has a replication foci targeting domain along with a proliferating cell nuclear antigen (PCNA) binding domain ensuring its localization at the replication site for maintenance methylation of the newly synthesized daughter strand [43, 44]. Another crucial DNA methylation accessory protein at the replication fork is the Ubiquitin-like containing PHD and RING finger domains 1 (UHRF1). Uhrf1 null mouse ES cells lose over 70% of genomic DNA methylation [45]. Therefore, although DNMT1 is thought to be the major contributor to maintenance DNA methylation, accessory proteins such as PCNA and UHRF1 also play a major role in maintaining DNA methylation. Recent crystallography studies have also revealed that DNMT1 can be auto-inhibited for de novo methylation of unmethylated CpGs by the replication foci targeting sequence (RFTS) domain. Indeed, the RFTS domain of DNMT1 binds in proximity to its own catalytic pocket and prevents substrate DNA binding [46, 47]. Also, the SRA domain of UHRF1 facilitates DNA acceptance through the catalytic center of DNMT1 via blocking the RFTS domain [48]. Mutations in the RFTS domain can cause severe neurological diseases such as hereditary sensory and autonomic neuropathy, autosomal dominant cerebellar ataxia, and autonomic system dysfunctions [49-51].

Adequate levels of DNMT1 expression are essential to the normal development of mammals. Dnmt1 knockout mouse ES cells display threefold lower 5mC genomic levels, yet remain viable and show no growth retardation or abnormal morphology with only enhanced microsatellite instability [52, 53]. Interestingly, mouse embryos lacking Dnmt1 are stunted at about embryonic day 9.5 and undergo death in utero [52]. In contrast, conditional Dnmt1 knockout mice (with hypomethylated brain cells) remain viable, but display defects in learning and memory at adulthood, suggesting that DNA methylation maintenance plays an important role in the neuronal maturation of the central nervous system [54]. This evidence strongly indicates that loss of DNMT1 is deleterious to mammalian development, but an overabundance of DNMT1 also has a negative impact during development. It has been observed that Dnmt1 overexpression causes genomic hypermethylation, loss of imprinting for specific genes like Igf2 and H19, and consequent embryo death [55]. Taken together, all these studies indicate that adequate dose and temporal and spatial expression of DNMT1 are important to the growth and development of mammals.

Aberrant DNA methylation, which includes both hypomethylation of repetitive DNA elements and hypermethylation of tumor suppressor genes, has been observed in various cancer cell types. Promoter hypermethylation of tumor suppressor genes has been observed for the retinoblastoma (RB) [56], L3MBTL1 [57], and RIZ1 genes in various carcinomas [58]. Indeed, DNMT1 is responsible for maintaining the repressive state of some tumor suppressor genes. DNMT1 hypomorphs do not result in an obvious loss of genomic DNA methylation or cell death [59, 60], but complete inactivation of DNMT1 caused hypomethylation of the genome resulting in mitotic catastrophe and demethylation of tumor suppressor gene promoters [61-63]. It has also been reported that knockout of a single allele of DNMT1 is sufficient to activate bivalent chromatin domains and impair leukemia stem cell function [64]. These studies have revealed that DNMT1 is essential for the maintenance of DNA methylation of tumor suppressor genes and tumor cell survival.

DNMT3a and 3b play a major role in de novo methylation in mammalian cells as they display a preference for unmethylated DNA substrates in vitro, high expression in germ cells and embryos, and low expression in somatic cells. Indeed, de novo DNA methylation occurs mainly in the specification and maturation of germ cells, post-implantation embryos, ES cells, and embryonal carcinoma cells and is largely absent in differentiated somatic cells [65-67]. Dnmt3a and Dnmt3b knockout ES cells also displayed impaired methylation of proviral DNA sequences, suggesting de novo methyltransferases establish and maintain methylation of Moloney Murine Leukemia Virus sequences [68]. In a recent study, Dnmt3a and Dnmt3b deficient induced pluripotent stem cells (iPSCs) underwent self-renewal, but showed limited developmental potential. Introduction of Dnmt3a and Dnmt3b into these iPSCs fully rescued the developmental potential, suggesting a major role for de novo methyltransferases in pluripotency and development [69]. Moreover, the essential role of de novo methyltransferases for development is further supported by the observation that Dnmt3a knockout mice died four weeks after birth, and similarly Dnmt3b knockout mouse embryos stopped developing at embryonic day 11.5 [68]. Indeed, conditional knock-out of Dnmt3a led to a lack of DNA methylation at maternally imprinted loci, resulting in in utero embryonic death [70].

In recent years, there is increasing evidence that DNA methylation is not solely established by DNMT1 and DNMT3a/3b. A host of other proteins are essential to the DNA methylation process in mammalian cells [71]. For example, the chromatin remodeling family member LSH has been shown to regulate CpG methylation at repetitive sequences [72] and, more recently, to facilitate genome-wide cytosine methylation at non-repetitive DNA elements [73]. Deletion of Lsh led to alterations in H3K4me3 modification and gene expression. Although Dnmt1 knockouts show a profound loss of DNA methylation, Dnmt3a and Dnmt3b double knockout mouse ES cells only showed demethylation in various repetitive elements and single copy genes [74]. DNMT1 is also indispensable for maintenance methylation and synaptic plasticity in adult forebrain neurons [75]. Additionally, DNMT1 can form a complex with DNMT3a and 3b in vivo [76], suggesting DNMT3a and 3b also play a role in maintenance methylation. It is also worth noting that non-CpG DNA methylation, such as methylation in CHH and CHG contexts, is solely mediated by DNMT3a and DNMT3b in ES cells [65, 74]. Genome-wide DNA methylation mapping of 30 diverse human tissues and cell types concluded that about a fifth of methylated CpGs are methylated dynamically [2]. However, the relative contribution of each of the three methyltransferases to de novo and maintenance DNA methylation in embryo development or in carcinogenesis is poorly understood. Mapping the contribution of these enzymes could therefore be important for understanding developmental and disease-specific DNA methylation patterns.

Mechanism of DNA Demethylation

Passive dilution and active removal of 5mC are reported in mammalian cells. In the passive demethylation pathway, DNMTs are inhibited or targeted, leading to a gradual loss of 5mC during successive cell divisions. Several mechanisms have been proposed for active demethylation, including base excision repair by Gadd45 and other companion proteins [77], removal of 5mC by thymine DNA glycosylase (TDG) [78], and 5mC deamination by AID (activation-induced deaminase)/APOBEC (apolipoprotein B mRNA-editing enzyme complex) followed by consequent mismatch repair [79]. The DNA methylation-demethylation cycle has gained support from the discovery of TET (ten eleven translocation) dioxygenases. More specifically, 5mC can be oxidized in an iterative manner to 5hmC, and then to 5fC and 5caC, by the TET enzyme family members TET1, TET2, and TET3 [12]. 5caC is ultimately removed by TDG and the base excision repair (BER) pathway enzymes [11, 12, 80].
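The gradual loss of 5mC by passive dilution can be made concrete with a simple illustrative model. The sketch below assumes that each replication pairs a methylated parental strand with an unmethylated nascent strand and that maintenance methylation restores a fixed fraction of nascent sites; de novo methylation and active removal are ignored. The function name and the `maintenance_efficiency` parameter are hypothetical and are not taken from the studies cited here.

```python
def methylation_after_divisions(initial_fraction: float,
                                divisions: int,
                                maintenance_efficiency: float = 0.0) -> float:
    """Strand-averaged methylated fraction after a number of cell divisions.

    Assumed model: after each replication the parental strand keeps its marks,
    while the nascent strand regains a fraction `maintenance_efficiency` of them.
    The average level is therefore multiplied by (1 + efficiency) / 2 per division.
    """
    if not 0.0 <= initial_fraction <= 1.0:
        raise ValueError("initial_fraction must be between 0 and 1")
    if not 0.0 <= maintenance_efficiency <= 1.0:
        raise ValueError("maintenance_efficiency must be between 0 and 1")
    if divisions < 0:
        raise ValueError("divisions must be non-negative")
    retention_per_division = (1.0 + maintenance_efficiency) / 2.0
    return initial_fraction * retention_per_division ** divisions


if __name__ == "__main__":
    # Compare fully inhibited maintenance with perfect maintenance.
    for n in range(4):
        lost = methylation_after_divisions(0.8, n, maintenance_efficiency=0.0)
        kept = methylation_after_divisions(0.8, n, maintenance_efficiency=1.0)
        print(n, round(lost, 3), round(kept, 3))
```

Under these assumptions, complete inhibition of maintenance halves the methylation level at every division, whereas perfect maintenance leaves it unchanged.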

DNA demethylation occurs in several circumstances during mammalian development. Demethylation is observed in the paternal copy of the genome after the sperm penetrates the egg and before the fusion of the two nuclei. Active demethylation is proposed to be the mechanism in this process as the methylation patterns of the maternal genome are retained. However, the first oxidation product of 5mC, 5hmC, persists into the later embryo development stages, suggesting that a combination of active oxidation and passive dilution of 5hmC mechanism may be responsible for the demethylation of the paternal pronuclei [81]. The highly expressed TET family member, specifically TET3, is proposed to catalyze this demethylation step [81].

Similarly, the primordial germ cells (PGCs), progenitors of mature germ cells derived from the proximal epiblast, undergo genome-wide demethylation to maintain totipotency [82]. The mechanism underlying this phenomenon remains controversial. Both AID/APOBEC-mediated deamination and TET1/TET2-mediated oxidation of 5mC followed by DNA-replication-coupled dilution of 5hmC have been proposed [83, 84]. AID-deficient mice are viable and have threefold more methylation throughout their PGC genome [83]. Tet1/Tet2 double knockout (DKO) mice showed a higher incidence of midgestation abnormalities with perinatal lethality, and viable female DKO mice have smaller ovaries and reduced fertility [85].

In addition to the DNA demethylation observed during embryo development and gametogenesis, dynamic demethylation has also been found in somatic cells. Promoter DNA demethylation and subsequent gene expression were observed in the dentate gyrus of the adult mouse brain, a process proposed to be facilitated by TET1-catalyzed hydroxylation of 5mC and AID/APOBEC1-mediated deamination of 5hmC [86]. There are also studies demonstrating that physical exercise can induce genome wide DNA methylation changes in human adipose tissue, which potentially influences adipocyte metabolism, although the precise mechanism is yet to be understood [87].

Histone Modification

Methylation and Demethylation of Histone

Histone methylation occurs on arginine, lysine, and histidine residues. Mono-, di-, or tri-methylation has been discovered on histones H2A, H3, and H4. SET-domain-containing enzymes, for example, MLL1, SET1, SET7/9, and G9a, methylate lysine residues on histone tails, whereas DOT1 family enzymes, for example, DOT1L, are able to methylate the globular region of histones. Several protein arginine methyltransferases (PRMTs) have been shown to methylate histone arginine residues. PRMT1, 4, and 6 catalyze mono- and asymmetric dimethylation, whereas PRMT5 catalyzes mono- and symmetric dimethylation of arginine residues (see Table 1 for details) [88].

Table 1. Histone modification and responsible enzymes

| Modification | Responsible enzymes | De-modification enzymes | Role |
|---|---|---|---|
| Methylation | | | |
| H2AR11, H2AR29 | PRMT1, PRMT6 | | Transcriptional repression |
| H3R2 | PRMT6 | | Transcriptional repression |
| H3K4 | MLL1, MLL2, MLL3, MLL4, MLL5, SET1, SET7/9 | LSD1, Swm1, Su(var)3–9, JHDM1b, KDM1A, KDM1B, KDM1C, KDM1D | Transcriptional activation or repression, mutually exclusive with H3R2 methylation |
| H3R8 | PRMT5 | | Negatively regulate tumor suppressor gene NM23 |
| H3K9 | SUV39H1, SUV39H2, G9a, EHMT1 | LSD1, JHDM2a, JHDM2b, JMJD2A, JMJD2B, JMJD2C, JMJD2D | Barrier to somatic cell reprogramming to iPSC, heterochromatin anchor to nuclear envelope |
| H3R17, H3R26 | PRMT4 | | Facilitating transcription by discharging corepressors from chromatin |
| H3K27 | EZH1, EZH2 | UTX, JMJD3 | Gene silencing |
| H3K36 | SET2, NSD1, NSD2, SMYD2 | JHDM1a, JHDM1b, JMJD2A, JMJD2B, JMJD2C | Transcriptional activation |
| H3K56 | G9a | | Facilitate DNA replication by PCNA docking |
| H3K79 | DOT1L | | |
| H4R3 | PRMT1, PRMT5 | | Gene silencing |
| H4K20 | PR-SET7, SUV420H1, SUV420H2, NSD1 | PHF2 | Transcriptional regulation, X chromosome inactivation, DNA damage response, mitotic condensation, and DNA replication |
| Acetylation | | | |
| H2AK5 | Tip60 | HDAC1, HDAC2 | Transcriptional activation |
| H3K4, H3K9, H3K36 | KAT2A, KAT2B | HDAC1, HDAC2 | Transcriptional activation |
| H3K14, H3K18 | KAT2A, KAT2B, KAT3A, KAT3B | HDAC1, HDAC2 | Transcriptional activation |
| H3K23 | KAT3A, Tip60 | HDAC1, HDAC2 | Transcriptional activation |
| H3K56 | KAT2A | HDAC1, HDAC2 | Transcriptional activation |
| H4K5, H4K8, H4K12, H4K16 | Tip60 | HDAC1, HDAC2 | Transcriptional activation |
| Phosphorylation | | | |
| H1.2S173, H1.2S172, H1.4S187 | CDK2 | PP2A1 | Mitosis transcription |
| H2AS1 | MSK1 | | Mitosis, chromatin assembly, and transcriptional repression |
| H2AS139 | ATR, ATM, DNA-PK | | DNA repair |
| H2AXY142 | Mst1, WSTF | EYA1, EYA3 | Apoptosis and DNA repair |
| H2BS14 | Mst1 | | Apoptosis and meiosis |
| H2BS32 | RSK2 | | EGF signaling |
| H2BS36 | AMPK | | Transcription |
| H3T3 | Haspin/Gsg2 | | Mitosis |
| H3S10 | Aurora-B, MSK1, MSK2, IKK-α | PP1 | Mitosis, meiosis, and transcriptional activation |
| H3T11 | Dlk/Zip | PPγ | Mitosis |
| H3S28 | Aurora-B, MSK1, MSK2 | PP1 | Mitosis and immediate-early activation |
| H4S1 | CK2 | | Mitosis, chromatin assembly, and DNA repair |

Abbreviations: PRMT, protein arginine methyltransferases; MLL, mixed-lineage leukemia protein; JMJD, Jumonji domain–containing protein; SET, Su(var)3–9, Enhancer-of-zeste, Trithorax domain protein; LSD, Lysine-specific demethylase; JHDM, JmjC domain-containing histone demethylase; KDM, lysine (K)-specific demethylase; SUV39H1, suppressor of variegation 3–9 homolog 1; EHMT1, euchromatic histone methyltransferases 1; EZH, enhancer of zeste; UTX, ubiquitously transcribed tetratricopeptide repeat, X chromosome; NSD, nuclear receptor-binding SET domain protein; SMYD, SET, and MYND domain containing protein; DOT1L, DOT1-like, histone H3 methyltransferase; PHF2, plant homeodomain finger protein 2; Tip60, the acetyltransferase 60 kDa trans-acting regulatory protein of HIV type 1-interacting protein; CDK2, cyclin-dependent kinase 2; PP2A1, protein phosphatase 2A1; MSK1, mitogen- and stress-activated protein kinase; ATM, ataxia telangiectasia mutated; ATR, ATM and Rad3-related; DNA-PK, DNA-dependent protein kinase; Mst1, mammalian STE20-like kinase 1; WSTF, Williams–Beuren syndrome transcription factor; RSK2, ribosomal S6 kinase 2; AMPK, AMP-activated protein kinase; Haspin/Gsg2, germ cell associated 2 (haspin); IKK-α, IκB kinase α; CK2, casein kinase II.

Histone methylation has been associated with various cellular functions such as transcription, DNA replication, DNA damage response including repair, heterochromatin formation, and somatic cell reprogramming. Among these biological functions, transcriptional repression and activation are the most studied. H3K4 trimethylation has been associated with euchromatin and facilitates active transcription by recruiting the RNA polymerase II complex [89]. In contrast, H3K27 trimethylation is generally considered a repressive mark for transcription, and these two modifications are often mutually exclusive, perhaps due to the lower affinity of H3K27-methylated histone H3 for the SET1-like H3K4 methyltransferase complexes [90]. However, in bivalent chromatin domains, large chromatin regions of H3K27 methylation usually harbor smaller sub-regions of H3K4 methylation, which suppress developmental genes while keeping them poised for activation in ES cells [91]. The polycomb repressive complex 2 (PRC2) can repress a large number of genes involved in somatic processes, which is largely attributed to the H3K27 methylation activity of the core EED-EZH2 subunit in the complex [91]. In addition to H3K4, H3K36 methylation can deter the spread of H3K27 methylation; H3K36 methylation therefore perhaps insulates active marks for gene expression [92]. Unlike H3K4 methylation, H3K36 methylation is more confined to chromatin associated with gene body regions. Indeed, H3K36 methylation by SET2 decreases the acetylation of histones in the gene body regions and attenuates transcription from intragenic cryptic promoters, thus safeguarding authentic transcriptional elongation in cells [93].

Several different methyltransferases including SUV39H1, SUV39H2, G9a, and EHMT1 methylate H3K9. Although H3K9me1 is generally considered a gene expression activation mark, H3K9me2 and H3K9me3 are largely repressive signatures [94]. PR-SET7 catalyzes the monomethylation of H4K20, and SUV420H1 and SUV420H2 mediate further modifications, specifically di- or trimethylation. H4K20me1 is associated with transcriptional activation, whereas H4K20me2 and H4K20me3 are correlated with other important physiological processes such as DNA damage response and maintenance of genome integrity.

In contrast to histone tail modification, the methylation on the globular region of histones is more relevant to DNA replication. H3K56 lies near the entry-exit sites of the nucleosomal DNA helix and monomethylated H3K56 is the essential chromatin-docking site for PCNA prior to DNA replication. Therefore, disruption of H3K56 methyltransferase G9a resulted in impaired DNA replication [95].

Two families of histone demethylases, the amine oxidase LSD family, represented by LSD1, and the Jumonji (JmjC)-domain-containing JMJC family, represented by JHDM2a and JHDM2b, can demethylate histone lysines and maintain the dynamic equilibrium of histone methylation [96]. Although JMJD6 has also been reported to possess histone arginine demethylation activity, other studies have questioned the validity of this claim [97]. Histone demethylases regulate gene expression by removing histone marks. Indeed, UTX and JMJD3 can antagonize PRC-mediated repression by demethylating H3K27me2/3, which has been reported to regulate the expression of Hox transcription factors [98].

Acetylation

Acetylation, which occurs on the epsilon-amino group of conserved lysine residues, is another important modification of histones. Acetylation has been found on histones H2A, H3, and H4 (see Table 1 for details). Acetylation of histones is thought to relax condensed heterochromatin because it neutralizes the positive charge of lysine residues, weakening their interaction with the negatively charged DNA phosphate backbone and thus reducing the histone binding affinity for DNA. This hypothesis was validated by the discovery of the histone acetyltransferase (HAT) activity of several transcriptional activator complexes in mammalian cells, for example, the p300/CBP transcriptional co-activator proteins [99], the homeodomain transcription factor Crx-HAT complex [100], and the von Hippel–Lindau partner Jade-1-HAT complex [101]. Conversely, histone deacetylation activity was found in transcriptional repressor complexes, for example, Sin3A4, which is recruited to target genes by DNA-bound repressors, confirming that gene activation and repression are modulated by alternating histone acetylation and deacetylation, the latter carried out by histone deacetylases (HDACs) [102].

Beyond the association of histone acetyltransferase activity with transcription factors, histone acetylation also influences chromatin compaction and the rate of transcription through the recruitment of other chromatin remodeling complexes. Indeed, the bromodomain was found to recognize acetyllysine residues within histones. Acetyllysine can recruit and tether the SWI/SNF family of chromatin remodelers, which can consequently slide nucleosomes on DNA and make chromatin less compact. The SWI/SNF family of complexes also facilitates the exchange of variant histones with normal histone monomers, which evicts some histones [103].

However, the transcriptional activation role of histone acetylation has been challenged by the predominant presence of Rpd3/HDAC1 at actively transcribed gene promoters [104], along with the co-localization of some HATs, including p300, CBP, GCN5, P/CAF, and MOF, with HDACs. A possible explanation for this co-localization is that transcription initiation requires the cyclical use of HATs and HDACs: histone acetylation facilitates the recruitment of RNA polymerase at the onset of transcription, and it was similarly proposed that cessation of transcription by histone deacetylation is required before another round of transcription can initiate. This mechanism is observed for PS2 gene expression in Drosophila [105], but it is yet to be determined whether the same phenomenon occurs in mammalian cells.

The distribution of histone acetylation in the promoter region of an active gene versus the gene body is distinct and functionally relevant. In yeast, histone H4 associated with highly transcribed gene promoter regions is hyper-acetylated, whereas histone H4 associated with the corresponding gene bodies is hypo-acetylated. The H3K36 methylation present in the body of actively transcribed genes is a signal for histone H4 hypo-acetylation [106]. In mammalian ES cells, H4K16 acetylation is a mark of actively transcribed genes and enhancers. Indeed, the H4K16 acetylation mark is mainly located in the promoter regions of active genes and, to a lesser extent, spreads into gene bodies. Furthermore, some active genes are also marked by H3K4me1 and the KAT8 histone acetyltransferase but not by the EP300 transcriptional coactivator [107].

Phosphorylation

Phosphorylation of serine, threonine, and tyrosine residues is another type of histone modification. These modifications participate in various cellular physiological processes such as DNA damage repair, transcriptional regulation, and chromatin compaction. One of the best-studied processes related to histone phosphorylation is DNA damage repair. The DNA double strand break is a deleterious lesion which, if left unrepaired, causes severe consequences such as genomic instability, large fragment deletions, replication failure, and other detrimental outcomes including apoptosis. Once DNA double strand breaks are formed, the serine/threonine kinases ATM (ataxia telangiectasia mutated) and ATR (ATM and Rad3-related), or DNA-dependent protein kinase (DNA-PK), are recruited to the breakpoint and catalyze phosphorylation of serine 139 on the H2A variant H2AX. This phosphorylation yields γH2AX, which is a recruiting signal for many different DNA damage response (DDR) proteins such as MRE11/NBS1/RAD50, MDC1, 53BP1, and BRCA1. These DDR proteins are recruited to the double strand break site to perform DNA repair [108]. Another crucial function of γH2AX is to attract ATM to DNA double strand breaks; the ATM-dependent cell cycle checkpoint then arrests cell division until the damaged DNA is repaired [109]. Interestingly, H2AX requires several other prerequisite modifications, and the erasure of some modifications, before it undergoes phosphorylation at S139, for example, the dephosphorylation of H2AXY142ph by the eyes absent (EYA) family of proteins [110] and the monoubiquitination of H2AX by RING finger protein 2 (RNF2) [111]. Inactivation of EYA or RNF2 results in a DNA repair defect in response to ionizing radiation, which can be rescued by histone H2AX co-mutated at S139A and Y142A [112]. This evidence indicates that the DNA repair complex requires multiple interaction sites on histones for successful assembly and repair.

A number of different histone phosphorylation sites have been associated with transcriptional regulation. Histone H1 plays an important role in chromatin condensation and genome stability, and also participates in transcriptional regulation. Phosphorylation of histone H1.2 has been found at the T31, T146, T154, and S173 sites, whereas phosphorylation of H1.4 has been identified at T18, S27, T146, T154, S172, and S187. Phosphorylation of S173 on H1.2 and of S172/S187 on H1.4 is enriched at 45S pre-ribosomal RNA gene promoters, where it facilitates transcription and has an unanticipated function in ribosome biogenesis [113]. The mechanism underlying this transcriptional activation is not completely understood, but one hypothesis is that phosphorylation unpacks chromatin and makes it accessible to the transcriptional apparatus.

Phosphorylation-mediated transcriptional regulation on core histones has been extensively studied. Transcriptional activation by epidermal growth factor (EGF) is mediated via phosphorylation of H3S10, H3S28, and H2BS32 by RSK2 and PKM2. Upon EGF receptor activation, PKM2 directly binds to histone H3 and phosphorylates its T11 residue. PKM2-dependent histone H3 modifications are essential for EGF-induced expression of cyclin D1 and c-Myc, which in turn determine tumor cell proliferation, cell-cycle progression, and brain tumorigenesis. Phosphorylation of H3S10, H2BS32, and H3S28 is also associated with transcriptional activation of oncogenes such as c-fos, c-jun, and c-myc. Furthermore, simultaneous H3 phosphorylation and acetylation were also reported in mammalian cells upon EGF stimulation, suggesting the importance of coordinated modification of H3 [114]. Indeed, the prototypical histone acetyltransferase Gcn5 acetylates K14 on S10-phosphorylated H3 with 10-fold greater efficiency than on the non-phosphorylated substrate, and this K14 acetylation induces transcriptional activation [115]. In addition to its role in the DNA damage response, γH2AX has recently been reported to be associated with normal cell proliferation. RSK2 and DNA-PK are required for EGF-induced phosphorylation of H2AX S139, with no involvement of either ATM or ATR. The phosphorylation of histone H2AX by RSK2 enhances the stability of histone H2AX, which in turn prevents the EGF-facilitated transformation of cells to malignancy [116].

Histone phosphorylation is also associated with the chromatin compaction that occurs during mitosis and meiosis. Phosphorylation is proposed to be an essential step in the compaction and condensation of higher-order chromosomes, which is important for subsequent chromosome congression and segregation during cell division. Phosphorylation of several sites such as H3S10, H3T3, and H3T11 occurs during mitosis or meiosis, with H3S10 phosphorylation acting as an accepted marker of these cellular events. Indeed, highly condensed metaphase chromosomes are heavily phosphorylated at all these sites. Two serine/threonine kinase family members, Aurora-A and Aurora-B, are responsible for the phosphorylation of H3S10. The localization of these enzymes is mutually exclusive during the cell cycle, with Aurora-A being responsible for the phosphorylation of centrosome histones, whereas Aurora-B co-localizes with phosphorylated histone H3. Importantly, phosphorylation of H3S10 begins at late G2 phase, is maximal at mitotic prophase, and decreases upon exit from mitosis. A strong relationship between H3S10 phosphorylation and chromosome condensation has been observed, both of which start from pericentromeric heterochromatin regions and then spread to the euchromatin regions [117]. Given that both phosphate groups and the DNA backbone are negatively charged, histone phosphorylation is unlikely to facilitate chromosome compaction by neutralizing the negative charge of DNA; instead, it may function by recruiting chromosome condensation factors. H3S10 phosphorylation has been shown to promote the recruitment of the pre-mRNA-splicing factor SRp20 and the alternative-splicing factor (ASF)/pre-mRNA-splicing factor 2 (SF2) modular proteins to the chromosomes. These proteins may also function in the maintenance of genome stability and cell-cycle progression [118]. In early mouse embryos, H3S10 phosphorylation is an indispensable recruitment signal for heterochromatin protein 1 (HP1), which is required for proper heterochromatin structure and function. Furthermore, HP1 expression appears synchronized with H3S10 phosphorylation in the late S phase of the two-cell stage, at the time of pericentric heterochromatin replication [119].

Other Modifications on Histone

Other post-translational modifications, including ADP-ribosylation, biotinylation, O-GlcNAcylation, propionylation, sumoylation, and ubiquitination, are also detected on histone tails. These modifications also contribute to the epigenetic regulation of gene expression. One study has shown that less than 1% of all mammalian histone proteins are recognized and ADP-ribosylated by ADP-ribosyltransferase diphtheria toxin-like 1 (ARTD1) [120]. Recent studies have shown that proteins with macrodomains, a short 20-amino acid motif composed mainly of basic and hydrophobic amino acids that is mostly present in DNA damage checkpoint proteins, bind poly-ADP-ribosylated histones and initiate a cellular DNA damage response [121]. In the central nervous system, poly-ADP-ribosylation of histone H1, along with the corresponding enzyme poly[ADP]-ribose polymerase 1 (PARP1), is essential for memory stabilization in mice. Specifically, poly-ADP-ribosylation was enriched at the promoter regions of learning- and memory-related genes, such as cAMP response element-binding protein and nuclear factor-κB-dependent genes [122]. However, the role of poly-ADP-ribosylation in gene expression remains elusive.

The existence of naturally occurring histone biotinylation has been under debate for many years. Recent studies using anti-biotin, streptavidin, and target-specific antibodies have provided more concrete evidence that this modification exists in various human primary and transformed cell lines. In vitro experiments using recombinant holocarboxylase synthetase (HCS) revealed that this enzyme interacts directly with histone H3, resulting in biotinylation of K9 and K18 [123]. Moreover, patients with multiple carboxylase deficiency showed a dramatic reduction in histone biotinylation that may further support the occurrence of histone biotinylation [124]. The roles of histone biotinylation in gene expression regulation and other chromatin related cellular events are yet to be elucidated.

Histone ubiquitination has been mapped to the highly conserved lysine 119 residue of histone H2A and lysine 120 of H2B. As in other cellular ubiquitination processes, addition of a ubiquitin moiety to histones involves the sequential action of E1, E2, and E3 enzymes. Removal of the ubiquitin moiety is catalyzed by isopeptidases. Histone ubiquitination has been proposed to affect gene expression through different mechanisms, including recruitment of effector transcription factors and binding to other modified histones. Recent studies have revealed that H2AK119u1 and H3K27me3 are specifically enriched at polycomb target genes that are required to maintain ES cell identity [125]. Remarkably, a recent report has demonstrated that DNMT1 preferentially associates with ubiquitinated H3K23, a modification catalyzed by UHRF1. This suggests another mechanism of maintenance DNA methylation mediated by UHRF1 and DNMT1 [126].

Cross-talk Between Histone Modifications

Different types of histone modifications function in a combinatorial manner to fine-tune nuclear events. Using multiple histone modifications, the cell can integrate different cellular signaling pathways at the chromatin level. The complex histone modification network has been summarized as functioning in four steps: modification of the triggering histone residues, recognition of the trigger modification, modification of the consequent histone residues, and recruitment of effector complexes [127].

The vast majority of histone-modifying enzymes are present in large complexes harboring multiple functional subunits. These subunits endow the complex with the capacity to modify multiple targets. For example, members of the PHF (plant homeodomain finger) protein family, which include PHF2, PHF8, and KDM7A, harbor both a PHD and a JmjC domain. The PHD domain has been reported to bind methyllysine on histones, such as H3K4me3, whereas the JmjC domain is responsible for demethylation. The JmjC domain of KDM7A demethylates two repressive marks, H3K9me2 and H3K27me2 [128]. Furthermore, demethylation of H3K9me2 and H3K27me2 decreases when the KDM7A PHD domain is deleted. These observations suggest that the presence of H3K4 methylation might signal the demethylation of H3K9me2 and H3K27me2 by proteins containing bifunctional reader-eraser domains.

The intersection between ADP-ribosylation and methylation of histones is another example of the cross-talk between histone post-translational modifications. ADP-ribosylation of H3 and H1.4 by ARTD1 abolishes the subsequent methylation of H3 and H1.4 by SET7/9; however, methylation of H3 and H1.4 does not affect their ADP-ribosylation [129]. The detailed mechanism of this cross-talk is still not clear, but it is hypothesized that modification at one histone site can influence the modification of other sites and their subsequent function. For example, HP1 can recognize dimethylated K26 on human histone H1.4 and induce the formation and spreading of heterochromatin, but phosphorylation of S27 prevents the binding of HP1 to H1.4K26me2 [130]. Dimethylation of the adjacent K26 of H1.4 can regulate Aurora-B activity on S27, thus affecting the phosphorylation status of S27 and, in turn, the binding of HP1 [131]. Therefore, the cross-talk between two adjacent post-translational marks can affect binding of the HP1 reader molecule to H1.4. Research on histone modification cross-talk has focused on single sites or a few sites due to a lack of reagents, such as antibodies that can recognize dual modifications on histones. However, cross-talk between modifications is more likely to function in a complex network. Methods with improved accuracy, throughput, and coverage, such as liquid chromatography-tandem mass spectrometry with multiple reaction monitoring, might be useful in the study of these complex modification networks [132].

Non-Histone Proteins and Amino Acids Modifications in Mammalian Epigenetic Mechanisms

Histones are not the sole substrate for many of these modifying enzymes, which often target other cellular proteins. The role of phosphorylation in activating or inactivating non-histone proteins is well studied. In recent years, lysine methylation has been found to be important in the regulation of various protein activities, metabolic pathways, and physiological processes. The histone H3K4 methyltransferase SET7/9 also catalyzes the methylation of various non-histone proteins, including DNMT1, p53, Yap (Yes-associated protein), SUV39H1, ERα (nuclear hormone estrogen receptor alpha), and many other cellular proteins. Methylation of DNMT1 at the K142 residue promotes its proteasome-mediated degradation, whereas phosphorylation of S143 by the AKT1 kinase prevents K142 methylation, thereby stabilizing DNMT1 [133]. SET7/9 can methylate p53 at K372, which induces p53 transcriptional activation. It is also worth noting that p53K372 methylation inhibits the methylation of the neighboring K370 by SMYD2, a modification that would otherwise repress the transcriptional function of p53 [134]. SET7/9-mediated methylation also plays a role in mammalian development by methylating the Yap protein in the Hippo pathway. Methylation of K494 is critical for the cytoplasmic retention of Yap, and SET7 KO mice showed higher Yap expression and abnormal progenitor compartmentalization in the intestine [135]. The histone methyltransferase SUV39H1 can also be methylated at K105 and K123 by SET7/9, and the modified SUV39H1 shows a drastic decrease in histone methyltransferase activity without any change in its subcellular localization or stability. Furthermore, the methylation-mediated decrease of SUV39H1 activity in turn relaxes heterochromatin and increases the expression of genes located in satellite regions [136]. The evidence from these studies indicates that non-histone protein methylation is a crucial post-translational mechanism for regulating enzymatic activity or protein stability; however, it is still not known whether specific demethylases can reverse this process.

Non-Coding RNA

Non-coding RNAs are transcribed, but not translated into proteins. Besides tRNA and rRNA, such RNAs exist in mammalian cells in the form of lncRNA, miRNA, small interfering RNA (siRNA), and Piwi-interacting RNA (piRNA).

Large numbers of lncRNA transcripts are transcribed from diverse regions of eukaryotic genomes. These lncRNAs play important roles in cell differentiation, chromosome dosage compensation, organ development, and disease progression [137]. The transcription of lncRNAs has been analyzed in different cell types and tissues, revealing that lncRNAs are more diversely expressed than protein-coding genes. Moreover, lncRNAs are differentially expressed during different stages of differentiation, indicating their prominent role in gene regulation during cellular differentiation.

A large number of lncRNAs function by recruiting chromatin-modifying enzymes that modify chromatin and ultimately alter gene expression. It has been reported that lncRNA can form a triplex structure within the human rDNA promoter, which then facilitates the recruitment of the DNMT3b methylation complex. Subsequent methylation of the rDNA promoter sequence results in the silencing of the targeted rRNA genes [138].

Another important role for lncRNA is in X chromosome inactivation. The female mammalian genome contains two copies of the X chromosome, and one of them needs to be transcriptionally silenced as early as embryogenesis to achieve proper levels of gene expression. This inactivation process is mediated by the X chromosome inactivation center, which consists of four ncRNA genes, Xist, Tsix, Jpx, and Ftx. The bivalent protein YY1, which possesses the capacity for both DNA and RNA binding, docks Xist RNAs onto the X chromosome [8]. Xist docking then recruits polycomb repressive complexes (PRC) 1 and 2, which mediate DNA methylation, histone hypoacetylation, and MACROH2A deposition throughout the entire targeted X chromosome. It is the entirety of these modifications that brings about transcriptional inactivation of either the maternal or paternal copy of the X chromosome [139]. In terms of intra-chromosome conformation, the inactivated X chromosome is characterized by random organization, whereas the active X chromosome copy shows multiple long-range interactions spanning its chromatin [140]. Irregular X chromosome inactivation has been associated with female-predominant autoimmune diseases such as rheumatoid arthritis and autoimmune thyroid diseases [141]. Nevertheless, the direct mechanism of X chromosome inactivation in disease and development is still not well understood.

miRNAs and siRNAs are both well known for their post-transcriptional gene silencing through the RNA interference pathway. Using deep sequencing technology, more than 2,500 mature miRNAs have been identified in human cells, and these miRNAs are estimated to target 60% of human genes. miRNAs are generally produced from either the coding sequence of their corresponding genes or from the splicing product of their introns. Clusters of miRNA-encoding genes are typically transcribed by RNA polymerase II and yield pri-miRNA, which may contain multiple tandem miRNAs. pri-miRNAs are then processed by Drosha to yield pre-miRNA. The two-nucleotide overhang and hairpin structure of pre-miRNA enable it to be recognized and exported to the cytoplasm by exportin-5. Mature miRNA is produced from pre-miRNA by Dicer and loaded into the RNA-induced silencing complex (RISC). The Argonaute (Ago) protein in the RISC complex then cuts mRNAs that are complementary to the loaded miRNA. This cutting of targeted mRNAs thereby enforces post-transcriptional gene silencing. Recently, several nuclear-cytoplasmic shuttling complexes, CRM1, Importin-8, and TNRC6A, were reported to bind the cytoplasmic Ago protein and import miRNA from the cytoplasm to the nucleus [142-144]. This indicates that miRNA may function in the nucleus by a mechanism other than the conventional RNA interference pathway. For example, the mouse miR-709 has been shown to bind the pri-miR-15a/16-1 transcript through a 19-nt complementary sequence, which prevents the downstream processing of pri-miR-15a/16-1 [145]. Indeed, a large number of miRNAs in mammalian cells participate in both normal physiological processes and diseases, including cancer. This has led researchers to look into miRNAs as diagnostic or prognostic cancer markers. Reduced let-7 miRNA expression in lung cancers is associated with shortened post-operative survival [146], and miR-21 overexpression in breast cancer is associated with advanced clinical stage, lymph node metastasis, and poor patient prognosis [147].
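
The canonical biogenesis route described in this paragraph is an ordered chain of processing steps, which can be written down as a small data structure. The sketch below is only an illustration: the names Step, CANONICAL_PATHWAY, is_connected, and factors_from are hypothetical, the label "cytoplasmic pre-miRNA" is introduced here merely to distinguish the exported intermediate, and the nuclear import route mediated by CRM1, Importin-8, and TNRC6A is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One processing step: a factor converts an input into an output."""
    input: str
    factor: str
    output: str


# Canonical route as summarized in the text (illustrative labels).
CANONICAL_PATHWAY = [
    Step("miRNA gene", "RNA polymerase II", "pri-miRNA"),
    Step("pri-miRNA", "Drosha", "pre-miRNA"),
    Step("pre-miRNA", "exportin-5", "cytoplasmic pre-miRNA"),
    Step("cytoplasmic pre-miRNA", "Dicer", "mature miRNA"),
    Step("mature miRNA", "RISC (Ago)", "cleaved target mRNA"),
]


def is_connected(steps):
    """Check that each step consumes the product of the previous step."""
    return all(a.output == b.input for a, b in zip(steps, steps[1:]))


def factors_from(intermediate, steps=CANONICAL_PATHWAY):
    """List the factors that act on an intermediate and on everything downstream."""
    for index, step in enumerate(steps):
        if step.input == intermediate:
            return [s.factor for s in steps[index:]]
    raise ValueError(f"unknown intermediate: {intermediate}")
```

Asking which factors act from pre-miRNA onward, for example, returns exportin-5, Dicer, and RISC (Ago) in that order, mirroring the sequence of export, dicing, and RISC loading given above.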

siRNAs are double-stranded small RNAs that can also suppress gene expression post-transcriptionally via the RNAi pathway. In plants, endogenous siRNAs are produced from single-stranded RNA by RNA-dependent RNA polymerase (RdRP). Even though RdRP has not yet been identified in mammals, a small number of cell types have been reported to contain RdRP-dependent siRNAs. These endogenous siRNAs have been found in murine oocytes, ES cells, and male germ cells. The production of these siRNAs is DICER-dependent but not Drosha-dependent [148].

piRNAs are a family of single-stranded RNAs associated with PIWI proteins in germ cells. The biogenesis and function of piRNAs are distinct from, and less well understood than, those of either miRNAs or siRNAs. Precursor piRNAs are transcribed from the piRNA gene clusters [149]. The precursors are processed into mature piRNAs in the context of an electron-dense cytoplasmic material, but the precise generation process remains elusive. piRNA–PIWI complexes have been proposed to silence transposable elements and protect the integrity of the genome by promoting transcriptional repression in the embryo or by reinforcing silencing at the post-transcriptional level after birth [150].

Chromosome Conformation

Eukaryotic genomes are believed to be arranged in three-dimensional spatial structures. Within these spatial structures, the arrangement of the chromosomes heavily influences gene expression. This is well supported by the presence of long-range enhancer elements that may be located at considerable distances from their target genes or even on different chromosomes. The interaction of distal enhancers and effector genes could be mediated by chromosome looping or conformational change. Recent advances in chromosome conformation capture technology (3C) and its derivatives have enabled scientists to visualize the dynamic organization of chromatin fibers in a three-dimensional conformation. As a key organizer of the genome, the CTCF transcription factor is found to mediate both cis and trans interactions between distant DNA elements. CTCF-mediated chromosome conformation results in multiple distinct sections of chromosome, that is, transcriptional harbors, in mouse embryonic cells [5]. Transcriptional harbors were identified in CTCF-defined chromatin loops, and these harbors are enriched in active histone methylation signatures on H3K4 and H3K36. Independent of CTCF, repressive transcriptional domains have also been identified in the chromatin, and they are enriched with suppressive H3K9, H4K20, and H3K27 methylation marks [5]. The correlation between functional chromatin conformations and histone methylation marks indicates that chromatin conformation plays an important role in the regulation of gene expression. Single-cell chromatin conformation analysis and structural modeling of single-copy X chromosomes have revealed that chromosomes are dynamically organized at scales larger than a megabase. Nevertheless, the establishment of chromosomal boundaries is still preferred to insulate actively transcribed genes [151].

Cross-Talk of Different Epigenetic Layers

DNA methylation and histone modifications are highly intertwined and rely on each other to maintain the epigenetic state of mammalian cells. CpG methylation may serve as a signal for histone modifications. Indeed, the methyl-CpG binding proteins MeCP2 and MBD bind histone deacetylases and histone lysine methyltransferases [152]. MBD1 is associated with the SUV39H1-HP1 heterochromatic complex, and SUV39H1 is responsible for the repressive H3K9 methylation marks [152]. Simultaneous binding of MBD1 to both methylated CpG and SUV39H1 allows this protein to act as a bridge between CpG methylation and histone H3K9 methylation. This dual binding allows MBD1 to control two different layers of epigenetic regulation, thus ensuring epigenetic silencing at the targeted genes.

Histone methylation can also recruit DNMTs. DNMT3a/3b are recruited to H3K9-methylated chromatin by their direct interaction with the heterochromatin protein HP1, which binds to methylated H3K9 via its chromodomain [153]. DNMT1 can also be recruited by methylated H3K9 through UHRF1, which ensures the faithful inheritance of DNA methylation during mitosis [154]. This indicates that, in heterochromatin, methylated H3K9 plays a role in regulating DNA methylation (Fig. 3).

Figure 3.

H3K9 methylation and DNA methylation coupling. MBD1 protein can recognize methylated DNA and recruit SUV39H1, which is the H3K9 methyltransferase; HP1 protein binds K9-methylated histone H3 and recruits DNMT3 to place de novo methylation on nascent DNA; UHRF1 can also recognize H3K9 methylation and guide DNMT1-mediated maintenance DNA methylation. Lollipops with red heads represent methylated CpG dinucleotides; blue dots represent methylated H3K9; black dots represent the S-adenosyl methionine substrate for DNMTs.

Several miRNAs have been reported to target DNA methyltransferases, or enzymes responsible for DNA demethylation, thereby affecting the DNA methylation state [155, 156]. These miRNAs are frequently correlated with tumor suppressors or oncogenes. Indeed, ectopic expression of miR-143, which is frequently downregulated in breast carcinoma, decreases the expression of DNMT3a and inhibits the proliferation of breast cancer cells [155]. Furthermore, miR-29 targets and regulates the expression levels of both Tet1–3 and TDG mRNA, and overexpression of miR-29 causes a global decrease in genomic 5hmC levels. The inverse correlation of miR-29 with Tet3 and TDG indicates that miR-29 may play a role in the active demethylation pathway [156].

Direct inhibition of DNMT1 by RNA was discovered several decades ago [157]. A recent study demonstrated that active transcription of extra-coding RNA (ecRNA) inhibits locus-specific DNA methylation. Specifically, the authors identified an ecRNA transcribed from the CEBPα locus that binds DNMT1 and inhibits its activity. Moreover, the inhibition of DNMT1 by the CEBPα ecRNA occurs only in close proximity to the CEBPα locus, thus defining a novel mechanism for locus-specific inhibition of DNA methylation by inactivation of DNMT1 [158]. Nevertheless, conserved RNA-binding modules have not been reported on DNMT1, and mechanistically very little is known about how these RNAs inhibit DNMT1.

Concluding Remarks

Recent advances in epigenetic research have endowed us with a better understanding of the dynamics of DNA methylation-mediated gene expression and mammalian cell development. This was made possible by advances in high-throughput technologies such as DNA, RNA, and ChIP sequencing, along with microscopy and chromosome conformation studies. Recently, a large effort to perform single-cell epigenome studies has gained momentum. However, fundamental questions about how DNA methylation patterns are established and how imprinted genes are inherited remain unanswered. Current research examining the cross-talk between DNA methylation and histone modifications has generally been limited to only a few histone modification marks. Therefore, the true extent to which the interactive network of histone modifications functions in the regulation of DNA methylation is not entirely understood. In the non-coding RNA field, the occurrence of mature miRNA in the nucleus has raised questions about additional unknown roles of miRNA. Similarly, chromosome conformation analyses have been severely limited, with most studies examining only a single point of the cell cycle. To address this, chromosomal structure analyses should take a dynamic view of conformational changes by studying them across the entire cell cycle. These types of analyses may help us to understand how DNA methylation and histone modification signatures spatially organize chromatin interactions and orchestrate dynamic cellular events. Regardless of these limitations, our understanding of the different epigenetic layers and their participation in gene expression is rapidly expanding. Furthermore, we will continue to gain a better understanding of DNA methylation signatures, biomarkers, and diseases arising from epigenetic dysregulation, allowing for more efficient epigenetic drug development and therapeutics. Indeed, pharmaceutical companies are exploring an increasing number of epigenetic targets for drug development and future therapy.

Acknowledgements

The authors would like to thank Drs. Bill Jack, Pierre-Olivier Estève, and Jolyon Terragni for constructive comments and editing. The authors are grateful to Drs. Donald Comb and Rich Roberts at New England Biolabs, Inc. for their support and encouragement. We apologize for being unable to include all the contributions to mammalian epigenetic research due to the space constraints.
