Presented here is an introduction to epigenetic modifications, followed by descriptions of epigenetically regulated developmental events. The primer concludes with an insightful discussion about current topics in the field with epigeneticists Giacomo Cavalli, Ph.D., and Noel Buckley, Ph.D.
DNA methylation is a covalent modification that triggers heritable gene silencing (Levenson and Sweatt,2005). DNA methyltransferases (Dnmts) catalyze the reaction by transferring a methyl group to the 5 position of cytosine on CpG dinucleotides. This triggers silencing in one of two ways. First, methylation can directly interfere with transcription factor binding to recognition sites on DNA. Second, methyl-CpG binding domain proteins (MBPs) can reinforce silencing by recruiting corepressor complexes that harbor histone deacetylases (HDACs) or histone methyltransferases (HMTs). MBP-associated HMTs fuel a positive feedback loop between DNA methylation and another epigenetic silencing mark, methylated lysine 9 on histone 3 (H3K9; Fujita et al.,2003; Fuks et al.,2003).
The nucleosome, consisting of 146 bp of DNA wrapped around an octamer of histones, H2A, H2B, H3, and H4, is the basic structural unit of chromatin (Margueron et al.,2005). Histones have a central globular domain and relatively unstructured N- and C- terminal tails. The N-terminal tail is primarily subjected to posttranslational modifications, such as acetylation, phosphorylation, ubiquitination, and methylation, which can result in profound changes in local chromatin structure.
The histone code hypothesis states that specific combinations of histone modifications form a language that specifies the structural state of chromatin (Strahl and Allis,2000). Effector proteins read and carry out the code's instructions to specify heterochromatin formation, DNA that is tightly packed with nucleosomes, or more loosely packed euchromatin. Compared with euchromatin, heterochromatic DNA is largely inaccessible to transcription factors and chromatin remodelers, making it relatively transcriptionally inert.
Three histone marks encode epigenetic processes that are described in the sections below. H3K9 and H3K27 methylation are marks of silent or repressed chromatin (Cheung and Lau,2005). Heterochromatin protein 1 (HP1) binds methylated H3K9 by means of its chromodomain and induces local chromatin condensation. The chromodomain protein Polycomb (PC) binds methylated H3K27 and induces silencing in ways that are not yet understood. Promoters and transcribed regions of many active genes are decorated with methylated H3K4, a modification recognized by two factors associated with transcriptionally active genes. CHD1 and BPTF, a subunit of the multiprotein complex NURF, both remodel nucleosomes in an ATP dependent manner (Sims and Reinberg,2006).
Compared with other histone posttranslational modifications, lysine methylation is a long-lasting mark (Margueron et al.,2005). It is thermodynamically very stable, persists through mitosis, and is found on chromatin regions that are silenced for the long term, such as pericentric chromatin (dimethylated H3K9, H3K9me2) and the inactive X chromosome (H3K27me3, H3K9me2). Only recently was it discovered that histone methylation can be catalytically reversed by histone demethylases (Shi et al.,2004; Cloos et al.,2006). Demethylases are probably under tight control, because continual histone methylation is a hallmark of heritable transcriptional cellular memory.
It should be noted that specific histone modifications are not always predictive of gene activity. In embryonic stem (ES) cells, lineage-specific genes that are either repressed or transcribed at low levels are “bivalent,” bearing marks of both active (H3K4me, H3K9ac) and repressive (H3K27me3) chromatin (Azuara et al.,2006; Bernstein et al.,2006). These observations have led to the hypothesis that dual markings enable developmental genes to be repressed in ES cells while remaining poised for quick activation.
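As an informal illustration of the bivalent state, the following Python sketch labels a promoter from the set of histone marks detected on it. The mark groupings are taken from the paragraph above; the function name classify_promoter and the four labels are hypothetical conveniences, not an established annotation scheme, and marks not named above are simply ignored.

```python
# Marks named in the text for ES cell promoters. Treating them as two
# simple sets is a simplifying assumption made for illustration only.
ACTIVE_MARKS = {"H3K4me", "H3K9ac"}
REPRESSIVE_MARKS = {"H3K27me3"}


def classify_promoter(marks):
    """Return a coarse label for a collection of histone marks (hypothetical helper)."""
    marks = set(marks)
    has_active = bool(marks & ACTIVE_MARKS)
    has_repressive = bool(marks & REPRESSIVE_MARKS)
    if has_active and has_repressive:
        return "bivalent"       # both kinds of mark present: repressed but poised
    if has_active:
        return "active"
    if has_repressive:
        return "repressed"
    return "unclassified"       # none of the marks considered here were found


if __name__ == "__main__":
    print(classify_promoter(["H3K4me", "H3K27me3"]))
```

The point of the sketch is only that a single mark does not determine the label: the same H3K4me input yields a different call depending on whether H3K27me3 is also present.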
It is likely that chromatin remodelers and other histone modifications are additional sources of epigenetic changes. For example, histone deacetylase (HDAC) is a component of the polycomb group and REST/NRSF corepressor complexes, both epigenetic modifiers (see Polycomb and REST/NRSF sections). Relative histone acetylation state affects factor recruitment and/or structural changes in chromatin. The ATPase Brahma-related gene 1 (Brg1), a member of the trithorax group of epigenetic activators, is also a component of the SWI/SNF nucleosome remodeling complex, and regulates neurogenesis, myogenesis, and left–right asymmetry (Seo et al.,2005; Ohkawa et al.,2006; Takeuchi et al.,2007). SWI/SNF causes nucleosomes to slide along DNA, either exposing or occluding regulatory elements that affect the transcriptional status of nearby genes. Are these epigenetic mechanisms? It is partly a question of semantics (see Interview). At present, modification by HDACs and nucleosome remodeling are considered dynamic, whereas histone methylation is a tried and true long-lasting epigenetic mechanism. Although the distinction is clearly not so black-and-white, the discussion here is limited to established epigenetic mechanisms.
Chromatin proteins are dynamic, and a histone can be exchanged for a variant within its own class (Cheung and Lau,2005). Structural differences between variants impact overall nucleosome structure, altering accessibility of DNA to transcription factors and chromatin remodelers.
H2A has the largest number of identified variants, and because of its strategic position in the nucleosome, H2A histone variants are particularly suited to dramatically alter histone and DNA contacts. Two variants, macroH2A and H2A-Barr body-deficient (H2A-Bbd), are involved in X chromosome inactivation. macroH2A promotes gene silencing by preventing remodeling by the SWI/SNF complex, initiation of p300-dependent polymerase II transcription, and p300-dependent histone acetylation (Angelov et al.,2003; Doyen et al.,2006a). H2A-Bbd is associated with active chromatin. The structure of its central globular domain allows only 130 bp of DNA to be wrapped around the nucleosome, creating a more relaxed, accessible chromatin structure (Doyen et al.,2006b). H2A.Z is associated with both active and inactive chromatin (Cheung and Lau,2005). It is not understood how histone variants are targeted to specific chromosomal regions.
Long-Distance Chromosomal Interactions
Another epigenetic mechanism has been brought to the fore in recent years: alterations in gene activity induced by direct interactions between chromosomal regions that are positioned at long distances from one another. Long-distance chromosomal interactions can mediate gene activation or repression (Grimaud et al.,2006; Lomvardas et al.,2006), and both intra- and interchromosomal associations have been observed. In 2002, two papers were the first to report intrachromosomal interactions at the endogenous β-globin locus, between the locus control region (LCR) and active β-globin genes located some 50 kb away. Subsequently, it was found that, in T-cells, the LCR for cytokines Il4, Il5, and Il13 is positioned in close proximity to each of their promoters, bridging 120 kb in total (Spilianakis and Flavell,2004). The interactions may facilitate communication between regulatory elements, influencing the transcriptional state of associated genes.
Within the past year, there has been a surge of papers documenting nuclear colocalization of domains on two different chromosomes (Spilianakis et al.,2005; Bacher et al.,2006; Grimaud et al.,2006; Lomvardas et al.,2006; Xu et al.,2006). In naive T-helper cells, there is an interchromosomal association between the cytokine LCR mentioned above and interferon-γ, a determinant of the TH1 cell fate. Upon differentiation, the interaction is lost, suggesting that it coordinates active gene expression (Spilianakis et al.,2005). Interactions between X-chromosomes, polycomb response elements (PREs), and olfactory receptor enhancer and promoters have also been observed, and are described in the X-chromosome inactivation, Polycomb and Olfactory receptor sections below. The discovery of interchromosomal interactions highlights a possible role for nuclear architecture in epigenetic regulation. Regulatory chromosomal domains may be sequestered to a subcompartment of the nucleus, which acts as a center for coordinating either gene silencing or activation (O'Brien et al.,2003).
Broad implications can be gleaned from the long-distance chromosomal associations that have been discovered, even though only a handful of examples have been reported so far. Because intra- and interchromosomal associations occur among different types of genes and in varied cell types, the phenomenon is likely global. Moreover, all reported interactions are cell-type specific, and among interactions that are transient, the timing of chromosomal domain colocalization tracks with changes in cellular differentiation. Taken together, these findings suggest that long-distance chromosomal interactions are developmental regulators.
DEVELOPMENTAL GENE REGULATION
Mammalian females have two X chromosomes (XX), whereas males have one (XY), posing the biological problem of how to equalize the dosage of X-linked genes between the two sexes. The mammalian subclasses eutherians, marsupials, and monotremes, and other species, such as C. elegans and Drosophila, have each taken different approaches to solving this problem (Lucchesi et al.,2005). Presented here is the eutherian solution, mainly studied in mice, which uses epigenetic mechanisms to inactivate one of the X chromosomes in female cells (Heard,2005; Thorvaldsen et al.,2006).
In the preimplantation mouse embryo, all cells initially undergo imprinted inactivation of the paternal X chromosome (Xp). In the late blastocyst, inner cell mass cells reactivate Xp, while extraembryonic cells (trophectoderm, primitive endoderm) maintain Xp inactivation. Subsequently, the epiblast, or future embryo, initiates random X inactivation (Allegrucci et al.,2005). The mechanisms of imprinted inactivation of Xp and random X inactivation are largely similar; differences are noted below.
Before random X inactivation, there are transient interchromosomal interactions between X inactivation center (Xic) loci on the X chromosomes (Heard,2005; Xu et al.,2006). This event may promote cross-talk between the two Xics, somehow enabling the critical Xic functions of counting the number of X chromosomes in a cell and determining which X chromosome will be inactivated in female cells.
The future inactivated X chromosome (Xi) is characterized by accumulation of the noncoding (nc) RNA, Xist. Within Xic are three genes encoding ncRNAs that cross-regulate one another: Xist, Tsix, and Xite. Xist is required to initiate inactivation of one of the X chromosomes, Tsix is antisense to Xist and invokes Xist silencing, and Xite positively regulates Tsix transcription. Following Xic pairing, Xi down-regulates Tsix transcription, while Xist is up-regulated. On the active X chromosome (Xa), Tsix represses Xist, but counterintuitively does not do so by destabilizing Xist RNA (Sado et al.,2005; Sun et al.,2006). Rather, Tsix induces H3K4me2, a mark of euchromatin, which somehow results in Xist repression. The precise manner by which this occurs is under debate (Navarro et al.,2006; Sun et al.,2006). There are no interactions between Tsix and Xist during imprinted X inactivation, because they are imprinted maternally and paternally (Xi), respectively. In both random and imprinted X inactivation, Xist transcripts accumulate on Xi, “coat” it, and may function as a scaffold to recruit silencing histone modifying enzymes.
Many epigenetic mechanisms promote heterochromatization of Xi. First, Xi is decorated with histone methylation marks characteristic of chromatin silencing. The Polycomb group (PcG) HMT, EZH2, associates with Xi and endows it with the silencing mark H3K27me3 (Plath et al.,2003). Xi also acquires the repressive mark H3K9me2, and the activation mark H3K4me2 is underrepresented (Valley et al.,2006). Second, the histone composition on Xi is altered compared with Xa and autosomes. Xi is relatively devoid of the activating variant H2A-Bbd and of H2A.Z, and a large proportion of its H2A is replaced by the repressive variant macroH2A (Chadwick and Willard,2001,2003). Third, Xi in embryonic cells is sprinkled with hypermethylated CpG islands (Kratzer et al.,1983). Layers upon layers of inactivation might guarantee Xi repression, which must be extraordinarily stable to remain inactive for the lifetime of the animal.
Imprinting is the process by which epigenetic marks permit monoallelic expression from a single parental allele. There are around one hundred imprinted genes, many of which regulate placental and fetal growth. So far, research on different imprinted loci has shown that the molecular processes that regulate imprinting vary from locus to locus. Nevertheless, there are parallels between the epigenetics of genomic imprinting and X chromosome inactivation (XCI; Reik and Lewis,2005).
Like XCI, silencing of imprinted genes can be regulated by ncRNAs. Global identification of imprinted genes has shown that 30% encode ncRNAs, indicating that they may be commonly used for imprinting. Imprinted genes usually occur in clusters of 3–11 genes. Two noncoding RNAs that silence their respective ICRs are Air in the Igf2r cluster and Kcnq1ot1 in the Kcnq1 cluster (Delaval and Feil,2004). Genes in the Igf2r and Kcnq1 clusters are expressed on the maternal chromosome and encode a fetal growth regulator and a cardiac potassium channel, respectively. On the maternal chromosome, DNA methylation prevents transcription of Air and Kcnq1ot1, allowing clusters of imprinted genes to be expressed (Stoger et al.,1993; Engemann et al.,2000). On the paternal chromosome, all three coding genes in the Igf2r cluster, Igf2r, Slc22a2, and Slc22a3, are silenced by Air, even though it is only antisense to Igf2r (Sleutels et al.,2002). How Air silences the Igf2r cluster remains to be determined, but regulatory mechanisms behind the silencing of Kcnq1 may yield some clues. Paternal Kcnq1ot1 induces silencing by targeting PcG-mediated H3K27 and H3K9 methylation to the cluster (Umlauf et al.,2004). This finding suggests that ncRNA regulates genomic imprinting by altering chromatin states, like ncRNAs Tsix and Xist during XCI. Indeed, imprinting and XCI are believed to be evolutionarily linked (Reik and Lewis,2005).
In contrast to the Igf2r and Kcnq1 clusters, silencing of the clustered imprinted genes Igf2 and H19 appears to be mainly regulated by DNA methylation. Igf2 and H19 are in the same cluster, but expression of Igf2 is paternal and H19 is maternal. Their mutually exclusive expression is regulated by an imprint control region (ICR), a stretch of DNA that regulates expression of all genes in imprinted clusters (Delaval and Feil,2004). On the maternal allele, Igf2/H19 ICR is unmethylated. This permits an insulator, CCCTC binding factor (CTCF), to bind a boundary element within the ICR. CTCF inhibits Igf2 expression by preventing interactions between its promoter and enhancer (Hark et al.,2000). At the same time, lack of ICR methylation is permissive for maternal H19 expression. On the paternal allele, the ICR is methylated, directly preventing H19 expression and allowing for Igf2 expression by blocking CTCF binding (Engel et al.,2004). H19 encodes an ncRNA, but unlike Air and Kcnq1ot1, does not function as a silencer.
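The allele-specific logic at the Igf2/H19 locus can be restated as a small truth table. The Python sketch below is only a schematic of the rules given in the preceding paragraph (methylation of the ICR blocks CTCF binding and H19 expression; bound CTCF blocks Igf2); the function name igf2_h19_state and the dictionary keys are hypothetical, and the model ignores every other input to expression, such as enhancer activity or tissue context.

```python
def igf2_h19_state(icr_methylated):
    """Schematic outcome for one allele, given the methylation state of its ICR."""
    # CTCF can bind the boundary element only when the ICR is unmethylated.
    ctcf_bound = not icr_methylated
    # Bound CTCF insulates the Igf2 promoter from its enhancer.
    igf2_expressed = not ctcf_bound
    # ICR methylation directly prevents H19 expression.
    h19_expressed = not icr_methylated
    return {
        "CTCF_bound": ctcf_bound,
        "Igf2": igf2_expressed,
        "H19": h19_expressed,
    }


if __name__ == "__main__":
    # Per the text: the maternal ICR is unmethylated, the paternal ICR is methylated.
    for parent, methylated in {"maternal": False, "paternal": True}.items():
        print(parent, igf2_h19_state(methylated))
```

Under these rules a single binary input, ICR methylation, is sufficient to produce the mutually exclusive pattern: H19 from the maternal allele and Igf2 from the paternal allele.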
Polycomb Group Regulates Gene Silencing
Beyond XCI and genomic imprinting, one might think that epigenetics and embryonic development are incompatible. Epigenetic mechanisms affect transcription, positively or negatively, over the long term, while development is a dynamic process dependent on rapid and frequent changes in gene expression. Work with PcG genes first demonstrated that at least some aspects of development are under control of epigenetic mechanisms. PcG maintains the spatial pattern of Hox genes, conserved gene groups that regulate anterior/posterior patterning. Through deposition of the silencing epigenetic marks, H3K27me3 and to a lesser extent H3K9me2, PcG represses Hox genes in regions where they are not normally expressed. Flies and mice mutant for PcG genes exhibit ectopic Hox expression, sometimes eliciting homeotic transformations (Jürgens,1985; van der Lugt et al.,1994). Mice mutant for the PcG gene Bmi1 also show defects in hematopoietic and neural stem cell self-renewal, and in proliferation of primary fibroblasts (Jacobs et al.,1999; Lessard and Sauvageau,2003; Molofsky et al.,2003; Park et al.,2003). As mentioned previously, trimethylation of H3K27 by PcG also promotes silencing on Xi and imprinted genes. These examples illustrate that PcG controls many different cellular processes.
PcG binds to PREs in regulatory regions of target genes. Thus far only defined in Drosophila, PREs are composed of different combinations of specific binding sites. PcG targets include developmental genes such as various transcription factors (Fox, Sox, Pax) and signaling molecules (Wnts, Shh, BMPs; Bracken et al.,2006; Negre et al.,2006). Taken together with the finding that PcG target profiles differ in undifferentiated and differentiated cells, these data suggest that PcG may regulate many developmental processes.
PcG is composed of two complexes, Polycomb repressive complex (PRC) 1 and 2. Mammalian PRC2 contains four core proteins, EED, SUZ12, RbAP48, and EZH2, and PRC1 is composed of more than 10 subunits, including the chromodomain protein Polycomb (PC), the oncoprotein BMI1, and the E3 ubiquitin ligases RING1A and B (Lund and van Lohuizen,2004). PcG exerts its repressive effects in multiple ways. The catalytically active component of PRC2, EZH2, trimethylates H3K27, and recruits DNMTs to PcG target genes (Vire et al.,2006). PC binds H3K27me3 and brings with it the rest of the PRC1 complex (Min et al.,2003). PRC1 also imposes silencing, at least partially by inhibiting chromatin remodeling by SWI/SNF (Francis et al.,2001). Other properties of PRC1 may also contribute to its ability to repress transcription: PRC1 ubiquitination of H2A is critical for Hox silencing by an unknown mechanism (Cao et al.,2005), and at least one PRC1 variant contains HDAC1 (Huang and Chang,2004). PcG has also been reported to prevent transcription initiation (Dellino et al.,2004). Multiple epigenetic mechanisms may make repression by PcG failsafe.
There is mounting evidence that repression is at least partly mediated by PRE-dependent chromosomal interactions. PREs on the same chromosome or on different chromosomes can pair together, with the former resulting in a looping out of intervening sequence (Comet et al.,2006). These interactions induce pairing-sensitive silencing: PRE pairs silence gene expression more efficiently than a single PRE. Paired PREs and bound PcG proteins aggregate as tight foci in the nucleus, forming so-called PcG nuclear bodies (Grimaud et al.,2006). The purpose of the nuclear bodies may be to sequester PRE-containing genes to a nuclear subcompartment where silencing is propagated and/or maintained.
PcG nuclear bodies colocalize with or lie in close proximity to RNAi nuclear bodies bearing components of RNAi machinery, including Dicer-2, ARGONAUTE1, and PIWI. Genetic data show that RNAi machinery lies upstream of long-range PRE association. Mutants for single RNAi proteins experience decay in PRE association over time. PcG-dependent cosuppression, a phenomenon whereby introduction of multiple copies of a transgene results in their silencing, also requires RNAi machinery, but is apparently governed by an independent mechanism. While cosuppression is faulty in mutants of the RNAi machinery gene homeless, PRE association is not (Pal-Bhadra et al.,2004). Cosuppression mediates heterochromatization of repeat sequences, but how RNAi machinery maintains long-range association of PREs is unclear. The answer may lie in the observation that siRNAs are generated from PRE-containing transgenes. The authors propose that siRNAs from PRE loci act as “molecular glue” that keeps PcG complexes on different PREs together. So far, siRNAs from endogenous PREs have not been detected.
Trithorax Group Promotes Active Chromatin
As the mechanistic opposite of PcG, trithorax group (trxG) maintains gene expression, including Hox genes, by facilitating euchromatin (Beisel et al.,2002). Mammalian trxG consists of multiple proteins and includes a panoply of homologs to the Drosophila H3K4 HMT trithorax, namely Set1a, Set1b, Mll1, Mll2, Mll3, and Mll4, and the SWI/SNF ATPase Brg1. trxG binds the same chromosomal element as PcG. In the silenced state, the element recruits PcG (PRE), and in its active state recruits trxG (TRE). Activation of PRE/TRE requires transcription of the TRE. Noncoding TRE transcripts bind and recruit the Drosophila trxG HMT Ash1, which deposits marks for euchromatin, H3K4me3 (Sanchez-Elsner et al.,2006). Recent studies show that BPTF specifically binds H3K4me3, and loss of Xenopus BPTF results in aberrant Hox expression. BPTF is a subunit of NURF and may target the chromatin remodeling complex to histones methylated by trxG. NURF facilitates active chromatin by repositioning nucleosomes along DNA (Sims and Reinberg,2006). TrxG also promotes expression of the Hox gene Ubx by facilitating transcriptional elongation (Petruk et al.,2006). These findings may explain how trxG regulates gene activation.
REST/NRSF Activities Govern Neuronal Fate Decisions
As the embryo develops, multipotent cells activate specific gene programs that trigger differentiation into specialized cell types. Equally important, a cell must silence expression of genes specific to other cell types to secure its fate. Because repression must be maintained throughout the life of the animal, epigenetic mechanisms are ideal for mediating such events.
How a cell keeps from expressing inappropriate gene programs has been worked out for genes from just one cell type, the neuron. Repressor element 1 (RE-1) silencing transcription factor/neuron restrictive silencing factor (REST/NRSF) is a zinc finger protein that triggers epigenetic silencing of neuronal genes (Ballas and Mandel,2005). In mice lacking REST/NRSF, the neural-specific gene βIII-tubulin is expressed in some non-neural tissues, but many cell types are unaffected (Chen et al.,1998). However, the mice die at E11.5, precluding insight into the roles of REST in nervous system development. These data illustrate that REST/NRSF is not a master regulator of neuronal gene silencing, but rather maintains repression.
In non-neuronal cells, REST/NRSF binds RE-1 elements in regulatory regions of target genes and blocks their transcription. The elements reside in genes central to neurogenesis, including ion channels, neurotransmitter receptors, axonal guidance molecules, and the neurogenic gene NeuroD (Bruce et al.,2004). REST/NRSF imposes dynamic, impermanent repression through its association with the corepressors CTD phosphatase, which inhibits RNA polymerase II, and HDACs, which limit chromatin accessibility (Battaglioli et al.,2002; Yeo et al.,2005). The corepressor Co-REST coordinates stable, epigenetic repression by directly binding the H3K9 HMT G9a and the H3K4 demethylase LSD1 (Roopra et al.,2004; Shi et al.,2004). Co-REST also recruits additional epigenetic silencing factors, the methyl DNA binding protein MeCP2, and the H3K9 HMT SUV39H1 (Lunyak et al.,2002). Finally, heterochromatin protein 1 (HP1) binds and condenses neuronal genes marked with methylated H3K9 (Lunyak et al.,2002).
REST/NRSF is also expressed in cells that have the potential to become neurons, embryonic stem cells and neuronal progenitors. In multipotent cells, neuronal genes must be permissive for activation. Therefore, REST/NRSF must not simply be an “off” switch for neuronal genes. Rather, the degree of repression mediated by REST/NRSF is tailored to the developmental stage of the cell (Ballas et al.,2005). In pluripotent embryonic stem cells, REST/NRSF binds RE1 sites, but H3K9 remains unmethylated, suggesting that neuronal genes are poised for activation. In neuronal progenitors, posttranslational modifications lead to the down-regulation of REST/NRSF. A cell with low levels of REST/NRSF is permissive for transcription of some neuronal genes. In differentiated neurons, transcriptional repression of REST/NRSF results in derepression of most RE1-containing genes. Interestingly, in the mature neuron, a class of stimulus-inducible genes remains methylated at DNA and continues to be bound by the REST corepressors MeCP2 and Co-REST. Upon membrane depolarization, expression of one such gene, brain-derived neurotrophic factor (BDNF), is up-regulated and MeCP2 is released while Co-REST remains unaffected. The authors speculate that Co-REST persists as a platform to facilitate dynamic gene regulation in mature neurons. This work shows that slight manipulations of REST and its corepressors prepare a cell with neuronal potential for the next phase of development.
DNA and Histone Methylation Are Required for Differentiation of Specific Organs
Many genes that regulate epigenetic mechanisms are expressed ubiquitously, and are essential for life. These characteristics have made it difficult to parse out epigenetic requirements for the development and differentiation of specific cell types. Recently, Rai et al. circumvented this problem by reducing, but not eliminating, expression of Dnmt1 and the HMT Suv39h1 in zebrafish embryos by injecting translation-blocking antisense morpholino oligos (Rai et al.,2006). Surprisingly, morphants for the two genes largely phenocopy one another, and the observed defects are very specific. Terminal differentiation is impaired in the intestine, exocrine pancreas, and retina, while other organs are unaffected. Explaining the nearly identical phenotypes, genetic and biochemical data show that Dnmt1 and Suv39h1 function in the same pathway. Both cytosine DNA methylation and H3K9 methylation are reduced in Dnmt1 morphants, and overexpression of Suv39h1 rescues Dnmt1 morphants. It remains to be determined what genes are targeted by the two regulators, and what molecular cues trigger their silencing. The answers to these questions may help determine whether these epigenetic modifications regulate differentiation of other cell types.
In mice, an olfactory sensory neuron expresses only 1 of 1,300 olfactory receptor (OR) genes. How one gene is expressed exclusive of all others has been a subject of much study. Richard Axel and colleagues recently reported that OR choice is mediated by interchromosomal interactions (Lomvardas et al.,2006). The H DNA sequence was previously found to act as an enhancer element in cis for a cluster of OR genes. In this study, fluorescence in situ hybridization (FISH) visualized the association of H with OR genes in trans. The authors also discovered that one H allele is silenced by DNA methylation, a finding that gave rise to the idea that H might only activate one gene at a time. In support of this, olfactory neurons in transgenic mice bearing more than one active H element and OR pseudogenes express two OR genes, one endogenous and one pseudogene, in ≤1% of cells. An OR pseudogene was used to circumvent a negative feedback loop whereby OR gene selection prevents expression of a second OR gene. The authors hypothesize that OR choice is controlled by random, long-distance chromosomal association of a single H enhancer element with a single OR gene.
AN INTERVIEW WITH THE EXPERTS
Developmental Dynamics: Describe your research.
Giacomo Cavalli (Fig. 1): We are interested in patterning mechanisms that involve the function of Polycomb group (PcG) and trithorax group (trxG) proteins. These epigenetic components were originally discovered as regulators of Hox gene expression. Through this function, they play a master role in the specification of the anteroposterior axis of the body plan. In addition, they regulate a variety of other genes that play important roles in developmental patterning and regulation of cell differentiation and proliferation. One important feature of PcG and trxG proteins is that they are able to maintain the memory of developmental decisions through cell division.
We are studying the molecular mechanisms of PcG/trxG-dependent epigenetic inheritance of chromatin states, and we are trying to understand the global regulatory logic behind the function of these proteins. We have recently found that PcG proteins have the ability to induce the association of some of their target genes in the three-dimensional space of the cell nucleus. Now we would like to understand how widespread these contacts are and what influence they have on genome regulation.
Noel Buckley (Fig. 1): We have looked at transcriptional regulation in neurons for several years. The idea of “epigenetic regulation” is one that has taken hold relatively recently and has great resonance within neurobiology because of the potential to provide an understanding of the molecular underpinnings of the sorts of long-term changes that are the hallmark of neuronal plasticity and memory. Having said that, caution must be used to distinguish “epigenetic changes” from simple readily reversible histone modifications—even though the former may depend upon the latter. This is a well-recognized caveat and one that many, including Bryan Turner, have been careful to highlight.
Our own work has focused upon changes in histone modifications that accompany the transition from pluripotent ES cells, through multipotent neural stem cells to differentiated neurons, particularly in relation to the neuronal transcription factor, REST. More recently, we have begun to examine chromatin modifications at pan-neuronal and subtype specific genes as neuronal differentiation unfolds.
How did you become interested in epigenetics?
G.C.: Toward the end of the 1980s, it started to become clear in yeast that chromatin was not just a dull way to condense DNA in order to pack it into the nucleus, but rather a functional substrate of genomic regulatory processes. I had the privilege to join in the chromatin field during these years and to witness crucial advances showing that epigenetic regulation of chromatin is at the heart of genome function and contributes important information that can be passed on to the cellular progeny along with the DNA sequence. This is not only true in yeast and Drosophila, our model organism, but also in plants and vertebrates.
N.B.: We took up the theme of epigenetics relatively recently, more a slow embrace than an impassioned charge. The drive largely came in parallel with our burgeoning interest in neural stem cells. First, there was an interest in identifying transcription factors that determined or regulated neuronal phenotype. Second was a frustration born of the dearth of suitable, tractable, meaningful cellular systems to carry out these studies: there was no neural equivalent to the myoblast cell lines and hematopoietic stem cells that underpinned much of our current knowledge of hematopoiesis and myogenesis. This barrier has now been broken, and there are several robust neural stem cell models that are amenable to dissecting neurogenesis in a similar manner. The final thread of this trinity was the realization that many long-term changes induced by transcription factors were mediated by means of changes in histone modifications and/or DNA methylation. So, a rather indirect route brought us to epigenetics.
What papers have influenced you the most?
G.C.: One striking paper by Shiv Grewal and Amar Klar (Grewal and Klar,1996) showed that alternative chromatin states of the Schizosaccharomyces pombe mating type locus could be passed on to the majority of the mitotic and meiotic cellular progeny. This was the first clear demonstration that epigenetic regulation can affect heredity, making the point that genes are more than their primary DNA sequence. A second seminal paper in the field is that of Rea et al., identifying the first histone methyltransferase enzymes, which turned out to be proteins previously known for their role in heterochromatin formation (Rea et al.,2000). This study paved the way to the identification of many histone methylases involved in chromatin assembly and inheritance, including the widely conserved Enhancer of zeste protein of the PcG as well as histone methylase proteins of the trxG. More than a single paper, I would like to mention as a third event in epigenetics a flurry of papers published independently by several groups in 2006, all of them mapping the genomic distribution of PcG proteins in flies and mammals (Boyer et al.,2006; Bracken et al.,2006; Lee et al.,2006; Negre et al.,2006; Schwartz et al.,2006; Squazzo et al.,2006; Tolhuis et al.,2006). Together, these studies have a huge impact since they show that PcG proteins regulate in a coordinated fashion a number of conserved transcriptional pathways of fundamental importance for the development of the body plan. They also mark the beginning of a new phase in genome-wide location studies, where DNA chips of the latest generation, tiling through the large genomes of vertebrates, are coupled to chromatin immunoprecipitation to yield very accurate protein distribution maps. This robust technology will produce massive amounts of information in the next few years.
N.B.: As a neurobiologist I remain intrigued by the molecular and cellular basis of mood, perception, and cognition. Some time ago, a colleague pointed out a study that had discovered a significant increase in the incidence of adult schizophrenia in the offspring of mothers who had undergone prolonged starvation in the Chinese famine of 1959–61 in the province of Anhui (St. Clair et al.,2005). It is thankfully rare that we have access to studies such as these on the scale necessary to have sufficient statistical power to allow robust conclusions to be drawn, and although not an ounce of mechanism can be inferred, the influence of epigenetics on something as profound as human behavior can be clearly inferred.
Although not strictly epigenetics, ChIP-chip studies (pioneered by Rick Young in yeast) have shown on a global scale how changes in histone modifications can be mapped over the whole genome (Ren et al.,2000). This is an immensely powerful technology and allows individual histone marks to be overlaid onto features of the genomic landscape, and this landscape, in turn, can be viewed at different developmental time points. There are now many of these papers, but I would pick two papers from Eric Lander's and Mark Groudine's groups since they were the first to show large-scale correlation between histone methylation and acetylation and gene activity (Schubeler et al.,2004; Bernstein et al.,2005). I was also taken with Mandy Fisher's work (Azuara et al.,2006) and her demonstration that these marks can coexist in pluripotent stem cells, an idea echoed by Eric Lander later that same year (Bernstein et al.,2006). Both of these studies get away from the idea of individual histone marks as the definitive feature of the epigenome.
Relatively few epigenetically regulated developmental decisions have been identified (e.g., X inactivation, imprinting, maintenance of Hox silencing by PcG, control of neuronal genes by REST). Why are they so difficult to find?
N.B.: We have to first deal with the old chestnut of what we mean by “epigenetics.” There is a semantic and a scientific aspect to this and, unfortunately, one clouds the other, especially in neurobiology. A textbook definition of epigenetics usually goes along the lines of “reversible heritable changes in gene function or other cell phenotype that occur without a change in DNA sequence.” Neurons are terminally differentiated postmitotic cells, so any change in gene function is not heritable at either the cellular or organismal level. Epigenetic theories of neural development, therefore, embrace the implicit assumption that any “epigenetic change” must also occur in the germ cells in order for the change in gene function to be heritable. Changes in chromatin that are restricted to neurons cannot lead to heritable changes. Nevertheless, I want to expand this argument by reference to schizophrenia (Perkins et al.,2005; Sharma,2005), a devastating psychosis that affects around 1% of the population. Schizophrenia is a neurodevelopmental disorder, at least in the mind of many neurobiologists, and since large-scale population studies such as those of the Dutch and Chinese famines show a robust epigenetic component to inheritance (see reply to third question), we must accept that there is an epigenetic component to development, in this case the development of behavior.
The hard bit is the last part of the question. Except for a few studies showing changes in DNA methylation at schizophrenia risk genes, there is nothing to hint at the exact nature of these heritable changes. This, I think, is the principal difficulty in identifying epigenetic mechanisms in development: demonstrating the phenomenon of epigenetic inheritance and providing a direct molecular correlate. All too often, the twin aspects are not robustly linked.
G.C.: Noel is right: there has been a drift in the semantic definition of epigenetics, shifting from the developmentalist viewpoint of Waddington (the ensemble of the processes that use the information of the genotype to bring the phenotype into being) to a sharper and DNA-centered definition focused on mechanisms of inheritance not relying on the primary DNA sequence. This second definition went against the mainstream view, according to which ALL inheritance was dependent on DNA. Although fascinating as a concept, this put the people working in epigenetics in the trenches. Every piece of work they delivered “had to” demonstrate that there was such a thing as inheritance after DNA. In my opinion, this preoccupation has prevented many attempts to even identify developmental processes regulated by epigenetic components. Because everybody was trying to decrypt mechanisms, they took phenomena that were overtly regulated epigenetically. This is typically the case for the processes you mention.
In fact, it is likely that many processes are regulated by “classical” transcription factors as well as epigenetic components. Moreover, many processes regulated by epigenetic factors may not have a strong heritable component. Luckily, the groundwork of convincing the scientific community that epigenetic factors do regulate genes and can convey inheritance is now done, and this idea is starting to get across even to the larger public. This brings development, and also cognition and behavior, to the center of interest of people involved in epigenetics, and also facilitates access to epigenetic research for those who were not doing it before. I am sure that, in the next few years, we will see discoveries of every sort of process being regulated, in part, by known epigenetic components.
Coming back to semantics, however, this is not without consequences. If we don't need a demonstration of DNA sequence-independent inheritance to admit the implication of epigenetics in a certain process, then are there really “epigenetic” and “nonepigenetic” regulators? In an age when every transcriptional regulator is being found forming chromatin regulatory complexes, is there a fundamental difference between epigenetic stars like Polycomb or heterochromatin proteins and any other transcription factor? The only one I can imagine is that some epigenetic processes involve inheritance, while others don't. In this view, epigenetic regulation would be a very general concept and epigenetic inheritance would be restricted to a part of these processes.
Do you think that many developmental decisions are epigenetically regulated?
N.B.: It is hard to believe otherwise. Any developmental change is accompanied by wholesale changes in gene expression with entire batteries of genes activated or silenced. Changes in histone acetylation and methylation accompany all these events and appear to be stably maintained in the differentiated state. Phenomenologically, this is epigenetics. Again, as in my earlier comments, the missing aspect is often the mechanistic link. Further proof comes from gene knockout studies that show the impact of chromatin modifying activities on developmental decisions. There are many examples, including the role of chromatin remodeling activities such as Brg1 in myogenesis (de la Serna et al.,2001) and neurogenesis (Eroglu et al.,2006). Before being hoist upon my own petard, I accept that this does not in itself constitute proof-positive of epigenetics, but it is a very strong indicator.
G.C.: I fully agree with Noel. When you look at the list of genes bound and repressed by PcG proteins in mouse and human cells (Boyer et al.,2006; Lee et al.,2006), and when you further consider that many of these genes are derepressed upon ES cell differentiation, while they actually become permanently silenced by DNA methylation upon cancer development (Schlesinger et al.,2006; Widschwendter et al.,2006), how can one sensibly doubt that epigenetics regulates important developmental processes?
Both REST and PcG repress/silence genes involved in neuronal differentiation. Why do neuronal genes need their own repressor/silencer?
N.B.: This is one of the hardest outstanding questions and usually sits beside the equally vexed question of what governs whether a particular gene falls under REST (or PcG) regulation. The reason for coupling these two questions is that it unifies the idea that we need to provide an explanation for why particular genes become regulated by any individual transcription factor. In the case of REST, many targets, but by no means all, are neuronal genes. Equally important is the corollary that by no means all neuronal genes are regulated by REST. Inspection of the 1,300 or so RE1 sites in the human genome (Johnson et al.,2006) provides no insight into any linkage among the targets, other than a bias toward genes associated with neuronal development or function.
One speculation that provides insight, if not explanation, is our recent demonstration that many REST binding sites seem to have been deposited into 5′UTRs by means of LINE2 retrotransposons (Johnson et al.,2006). In other words, the “selection” of genes may be random. Deposition of a LINE2 element into a 5′UTR usually decreases the rate of transcription, so deposition of a LINE2 element that carries a transcription factor binding site confers regulated repression. Maybe this is an advantage in allowing batteries of genes to be regulated whilst in the continued presence of an activator. Simply switching off the activator may be more expensive in terms of alternative molecular compensatory mechanisms that would be necessary to keep essential genes turned on. Such fine-tuning of gene expression may be more necessary in the vertebrate nervous system, where cell number and cell–cell interaction surpass any other biological system.
To address the latter part of the question, PcG is normally (but not always) associated with silenced genes (stem cells provide an interesting exception), whereas our studies (Belyaev et al.,2004; Bruce et al.,2004; Greenway et al.,2006) show that REST is normally associated with active genes. That is, employment of REST can maintain repression of an otherwise active gene, whereas in the differentiated state, PcG normally acts to establish and/or maintain silence.
G.C.: Noel is making a strong case for the study of these questions from an evolutionary perspective. We often have a biased view of the processes we are studying, thinking that they are “mature,” i.e., close to their evolutionary endpoint. There is a temptation to think that if something is as we see it today, there must be a very good reason (strong selective pressure) to maintain it. If we understand how certain regulatory circuits have evolved, we come closer to understanding whether “chance” or “necessity” assigned certain regulators to certain processes. This approach is going to become more and more important as genomics progresses and many organisms, including nonmodel species, are sequenced and studied.
Coming to the specific question, it may be that PcG proteins do regulate a lot of neuronal genes, but the nervous system might require its own dedicated set of epigenetic factors as an additional layer of regulation.
Do you think other tissue-specific gene programs (e.g., muscle) have specialized repressors/silencers?
N.B.: It is interesting that they haven't been discovered, even in the light of complete sequencing of several vertebrate genomes. As alluded to earlier, regulation of gene expression in the nervous system may require finer controls than other tissues because of the scale of the cellular heterogeneity. One further point to bear in mind is that REST is an evolutionary newcomer on the block—it is only found in vertebrates—again coincident with a rapid expansion in brain size and neuronal diversity. Maybe not a completely compelling argument, but an interesting coincidence.
G.C.: I am at a loss here. On the one hand, I share Noel's feeling that the nervous system is a very “epigenetic object” with lots of cross-talk between stable cell fates and plasticity, which is the realm of action of epigenetic regulators. On the other hand, I would be very cautious about making a prediction in this infant field; we may learn soon that muscle development is epigenetically regulated. For instance, a recent paper (Mal,2006) showing a regulatory association between SUV39H1 and MYOD makes me feel very excited about the amount of hidden regulatory potential in tissue development. I hope to read beautiful stories about these subjects in the future to help make up my mind.
Both REST and PcG can occupy actively transcribed loci. How do you explain this?
N.B.: This comes back to the idea of whether you think REST and PcG are repressors or silencers. In most adult tissue and differentiated cells, we find REST present at actively transcribed loci, and abrogation of REST function leads to derepression, indicating that REST acts to regulate levels of transcription rather than to simply silence transcription. This derepression is accompanied by reciprocal increases in histone H3K9 acetylation and decreases in H3K9 dimethylation at the REST binding site (Belyaev et al.,2004; Greenway et al.,2006).
The idea of singular histone modifications offering a read-out of gene activity is losing ground, and it is likely that no singular mark will have such predictive or causative power. This is especially evident in studies of stem cells, where much of the cell's chromatin appears to carry a “bivalent” or mixed signature comprising both “active” and “repressed” marks (Azuara et al.,2006; Bernstein et al.,2006). Also, remember that in the REST−/− embryo, very few target genes were precociously activated (Chen et al.,1998). This, of course, could be due to the absence of appropriate activators. Since embryonic lethality occurred before neurogenesis, in vivo testing of these ideas awaits production of a conditional REST knockout.
G.C.: Noel is reminding us of a very important point: many regulatory processes result from a balance between putative activation and repression, with activators and repressors coexisting on their target rather than excluding one another. For instance, PcG proteins have to live with their trxG partners on their targets, and the papers cited above by Noel, as well as a fly study by Muller and coworkers (Papp and Muller,2006), really suggest that this is not just a mix of active and silent cells, but rather a true presence of bivalent marks on one and the same chromatin region.
The recent discoveries of the LSD1 and GASC1 histone demethylases suggest histone methylation is more dynamic than previously thought. How might this affect the interpretation of “silencing” histone methylation marks?
G.C.: The discovery of histone demethylases was very important, although in a way expected, since histone methylation patterns were earlier found to vary rather rapidly following certain treatments. Nonetheless, one should keep in mind that methyl groups have an average half-life about two orders of magnitude longer (approaching the half-life of histones themselves) than that of other histone marks like acetylation. Therefore, methyl marks are likely to be maintained stably on the chromatin template of most genes, making histone methylation a good candidate modification for transmission of epigenetic inheritance. Histone demethylases are probably able to “reset” this memory when needed, and it will be important to dissect when and how this reset is brought about.
N.B.: I think Giacomo's idea of “resetting” is key. Just because demethylases exist does not mean that all methyl marks will be erased. It simply lends more adaptability to allow reactivation of genes previously silenced. This could be particularly important in the recruitment, expansion, and differentiation of tissue-specific stem cells in response to damage in the adult where silenced genes may need to be reactivated (or conversely, active genes needed for maintenance of the multipotent state may need to be silenced). Identification of the transcription factors that target the demethylases will provide insight into when and where demethylation is employed.
Will it be important to distinguish stably methylated genes from ones that are subject to demethylation?
G.C.: I predict that, thanks to the explosion of high-throughput epigenetic mapping technology, we will actually “see” some of these demethylation events “on the fly.” Most importantly, we will learn about the genome-wide distribution of histone demethylases, and this might help us understand the whole issue. It is too early to say whether there will be specific signatures identifying genes that are reversibly methylated as opposed to genes that are not. Even the most stable regulatory gene states are likely to be reprogrammable under extreme circumstances. However, it is likely that different genes (or classes of genes) can be differentially reprogrammed during development or in response to external stimuli.
N.B.: I certainly concur with the power of ChIP-chip studies to provide genome-wide views of epigenetic changes. Considering that an oocyte nucleus can reprogram a somatic cell to derive a whole organism, it is clear that all marks are erasable.
Silencing of at least some PcG targets is mediated by pairing-sensitive silencing. Do you think other epigenetic silencers use this strategy?
G.C.: Long-distance pairing can also be driven by heterochromatin. Indeed, the first reported cases of long-distance repressive chromosomal associations were published by the Henikoff and Sedat labs 10 years ago and involved heterochromatic regions (Csink and Henikoff,1996; Dernburg et al.,1996). It is a common observation that heterochromatic loci cluster at the so-called chromocenters in both insect and vertebrate cells, and this clustering may affect silencing. Other epigenetic silencing systems may employ similar strategies and, indeed, long-range chromosomal associations are also linked to gene activation, as the work of several labs has recently shown (Spilianakis et al.,2005; Lomvardas et al.,2006).
N.B.: The notion of long-distance interactions, both in cis and in trans, is extremely important both for silencing and activation since it offers a mechanism to allow coordinate regulation across multiple genes, exactly the sort of mechanism you would expect to see during development and differentiation. The paper by Lomvardas et al. cited by Giacomo is a great example of the power of chromosome conformation capture (3C) to unravel these interactions. In passing, I should point out that we have no evidence that REST undertakes such interactions, although I accept that absence of evidence is not evidence of absence. The one report in the literature by Lunyak et al. (Lunyak et al.,2002) that purports to show that REST can silence an entire locus is unfortunately based upon an incorrectly annotated gene order (Belyaev et al.,2004).
What exciting ideas are emerging in the field?
G.C.: One new idea is related to nuclear compartmentalization. The established view of “genes” is that they are regulated by cis-elements located somewhere close to the transcription start site or within “reasonable” (1–100 kb) distances from it. However, the mounting evidence for chromosomal contacts involving loci from different chromosomes or from different parts of the same chromosome raises the exciting possibility that gene regulation uses these contacts.
Another new concept is that noncoding RNAs play important regulatory roles in the genome, affecting all steps of gene regulation, from chromatin structure to stability and trafficking of the mRNA and to regulation of RNA translation. Many different classes of noncoding RNAs exist, many of them playing a role in developmental regulation. It is thus important to analyze each of them, and the initial studies are very promising.
Finally, the flood of genome-wide data raises the need to put a huge effort into understanding the logic of gene-regulatory circuits. For instance, a glance at the PcG target genes reveals that they often include multiple genes involved in the same pathways. Frequently, these genes code for transcriptional regulators. In the case of transactivators, one may speculate that PcG proteins secure a tight block of certain pathways by simultaneously shutting down multiple genes at different steps of the regulatory cascade. But what is the logic of silencing several transcriptional repressors along multiple steps of a pathway? Superficially, shutting down the expression of a transcriptional repressor as well as of its own target genes does not make much sense. Clearly, one needs to understand how these events are regulated both spatially and temporally.
N.B.: Giacomo is absolutely right. We have produced detailed linear maps of multiple vertebrate genomes, and the application of increasingly sophisticated bioinformatics and cross-genome comparisons is overlaying detailed annotations of regulatory sites. The discoveries of long-range interactions, nuclear compartmentalization and noncoding RNAs all impose new dimensions of complexity onto this linear map. I think, as always, those of us working on development in higher eukaryotes will be keeping a close eye on the work of our colleagues in yeast genetics particularly in relation to building regulatory networks (Lee et al.,2002; Wyrick and Young,2002).
What important questions remain to be answered?
G.C.: Concerning nuclear architecture, a critical implication of the existence of long-range regulatory chromosomal contacts is that a phenotype may be induced from very remote parts of the genome. Therefore, it will be important to map long-range chromosome contacts systematically and to understand how frequently they affect gene regulation. It will also be crucial to study whether these contacts are the cause or the effect of regulation of gene transcription.
Another unanswered question concerns the mechanisms of epigenetic inheritance through DNA replication and mitosis (or meiosis). Indeed, chromosomes undergo major reorganization events during these two phases of the cell cycle. How a chromatin state can survive these events is not known and, largely, not even studied, because of the lack of appropriate approaches. Much of the current effort in molecular biology aims to develop tools that work on the scale of a limited number of molecules (ideally single molecules). This may allow the issue of chromatin inheritance to be tackled at the molecular scale in the next few years.
Finally, a grand challenge for the future is to understand how the different gene regulatory “tools” are harnessed in a coordinated way during development. Orchestration has always been an issue in developmental biology, but certainly the new discoveries give a mind-boggling perspective of the combinatorial complexity that can be reached. Everything, from transcriptional output to gene product's location, can be dynamically regulated in a fine-tuned way by different processes. The other side of the coin is that there are also many risks for molecular mistakes that can potentially damage the developmental process. We will thus need to analyze how these complex regulatory processes can be coordinated in order to combine developmental plasticity with robustness.
N.B.: All of this is true. Development has often been perceived as a balance between external signals and intrinsic programs, a description that finds its reflection in some classic embryological terms such as “competent” and “determined.” The interplay between intercellular signals and wholesale remodeling of the genome that occurs as differentiation and development proceed will increasingly occupy our experiments, whether such changes are defined as epigenetic or otherwise. At the moment, most researchers focus on either the chromatin or the signaling events, but these must converge if we are to offer comprehensive molecular correlates of “fate” and “potential.” Again, I think this interplay will be at its most complex in the nervous system, where the complexity of signals and diversity of cell types are greater than in any other tissue.
Many thanks to Jacqueline Wittmeyer, Christine Disteche, Rebecca Oakey, Nurit Ballas, and Itys Comet for helpful suggestions. I am also grateful to Noel Buckley and Giacomo Cavalli for sharing their insights.