Introduction: The Developmental Requirement for Gene Silencing
Organisms develop from a single cell and become progressively more complex by a process termed epigenesis. Cells must become highly specialized to carry out specific tasks, and must be organized in time and space, in a process called pattern formation. All cells are clones of the original fertilized egg, so they contain the same genetic information. However, very early in development, cells acquire specific fates and become fixed to follow a specific developmental path, in a process called determination. Determination causes selective use of genetic information (or differential gene expression) to bring about specific changes in cells. Because determination is normally irreversible, determination must fix patterns of gene expression in a cell and its descendants (Wolpert et al., 1998). A fundamental question of epigenetic change is: how is cell fate, and therefore a fixed pattern of gene expression, passed on from a cell to its descendants? See also Morphological Evolution: Epigenetic Mechanisms
Cells acquire different fates in three ways. Cells can receive positional information that tells them where they are in the body, or neighbouring cells can give them instructions that change their fates (induction), or they can receive information through the unequal distribution of a molecule, called a determinant, from the mother cell to the daughter cells. Positional information, induction and determinants all bring about differential gene expression by causing transcription factors to bind to regulatory regions of genes, which in turn causes gene expression to be turned on or off. These ways of changing cell fate have a very important feature in common. The signals from positional information, induction and determinants are very short-lived, and often last for only several hours. The gradients of molecules that establish positional information must be established when organisms or structures are small, because the distance that molecules can propagate is only a few cell diameters. With growth, or when cell rearrangements occur, the gradients are disrupted. Similarly, neighbouring cells sending the inductive signal can move away as a result of morphogenetic movements, or have their own fate changed and stop producing the signal. Determinants are diluted out by subsequent cell divisions. Yet in all these cases, the change in gene expression caused by the transitory signal must be maintained in a cell and its descendants for a lifetime. Therefore, there must be mechanisms to ensure that the information a cell gets from transitory signals is registered in its descendants. These mechanisms are called epigenetic mechanisms, and are thought to involve changes to DNA, or to the proteins that bind DNA to make chromatin. See also Gene Expression Induction, and Secondary Induction: Overview
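The dilution of a determinant by cell division can be illustrated with simple arithmetic. The following is a minimal sketch, not a measured model: it assumes that no new determinant is synthesized and that each division splits the remaining amount equally between the two daughters. The function names, the starting amount and the threshold are hypothetical and chosen only for illustration.

```python
def determinant_per_cell(initial_amount: float, divisions: int) -> float:
    """Amount of determinant per cell after a number of divisions.

    Assumes (for illustration only) equal partitioning at every division
    and no new synthesis, so the amount halves each time.
    """
    if divisions < 0:
        raise ValueError("divisions must be non-negative")
    return initial_amount / (2 ** divisions)


def divisions_until_below(initial_amount: float, threshold: float) -> int:
    """Number of divisions needed for the per-cell amount to fall below threshold."""
    if initial_amount <= 0 or threshold <= 0:
        raise ValueError("initial_amount and threshold must be positive")
    amount = initial_amount
    divisions = 0
    while amount >= threshold:
        amount /= 2  # equal split between the two daughter cells
        divisions += 1
    return divisions


if __name__ == "__main__":
    # Hypothetical numbers: 1024 arbitrary units in the mother cell.
    for n in (0, 5, 10):
        print(n, determinant_per_cell(1024.0, n))
    print(divisions_until_below(1024.0, 1.0))
```

Under these assumptions the signal falls a thousandfold in ten divisions, which is why a separate mechanism is needed to record the decision the determinant made.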
It should be clear that epigenetic mechanisms must exist to keep genes on where they are supposed to be on, and to keep genes off where they are supposed to be off. An error in either silencing or activation could have dramatic developmental consequences. One of the best-studied examples can be seen in the regulation of the homeotic genes that control axis determination in the fruitfly Drosophila melanogaster, and in vertebrates (McGinnis and Krumlauf, 1992). Homeotic genes are expressed in spatially restricted, overlapping patterns in embryos. These patterns create combinatorial codes that specify which structures should be formed in which locations. If the expression pattern of homeotic genes is altered, this creates new combinatorial codes throughout the embryo, and can result in dramatic transformations of one structure into another, such as the growth of antennae where legs are supposed to be. In Drosophila, the pattern of homeotic gene expression is initiated by transiently expressed transcription factors encoded by the segmentation genes. These transcription factors are expressed for about 3-4 hours and activate and repress homeotic genes. But once the initial regulatory cascade fades away, another set of factors is required to maintain the expression patterns of homeotic genes. There must be a similar requirement for epigenetic maintenance of gene expression patterns for all genes regulated by transiently expressed transcription factors. See also Mammalian Embryo: Establishment of the Embryonic Axes, Drosophila Embryo: Cell Signalling and Segmental Patterning, and Drosophila Embryo: Homeotic Genes in Specification of the Anterior-Posterior Axis
Setting the Pattern: Marking Genes On or Off
Turning a gene on or off requires a way to stabilize or encourage the basal initiation machinery to bind the promoter in the case of activation, or to interrupt this process in the case of repression. Normally, transcription factors bound to enhancer elements recruit the RNA polymerase holoenzyme to the promoter or prevent the holoenzyme from binding the promoter. Transcription factor-mediated activation or repression lasts only while the transcription factor is present and bound to the enhancer, and is usually disrupted during each cell cycle. To permanently maintain an on or off state, there must be mechanisms that recognize whether a gene is on or off, that are stable to DNA replication and cell division, that maintain themselves when the transient signal is no longer present, and that carry out the activation or repression. Each of these steps could be mediated independently by different molecules or mechanisms. See also Transcriptional Gene Regulation in Eukaryotes
Because DNA is replicated semiconservatively, a mechanism for marking DNA that is correlated with gene expression, and that is itself replicated, would be ideal for serving as an epigenetic tag. In fact, methylation of specific GC-rich sequences found upstream of many mammalian genes, termed CpG islands, behaves like an epigenetic tag. Methylation of CpG islands is associated with repression of the gene, and the methylation status of the CpG island is inherited after cell division. It is thought that specific proteins bind to methylated sites and recruit proteins that keep histones from being acetylated, and that these histones do not permit gene activation by most transcription factors. Methylation differentiates the on and off states of the gene, is heritable, and is required to recruit the proteins that make repression possible; it is the most straightforward example known of an epigenetic tag (Surani, 1998). See also DNA Methylation, and Imprinting (Mammals)
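The logic of a replicated mark can be made concrete with a toy model. The sketch below is an illustration only, written under stated assumptions: each CpG site is reduced to a pair of booleans (one per strand), replication gives each daughter duplex one old strand and one new, unmethylated strand, and a maintenance activity, which the text above does not describe, methylates the new strand wherever the old strand is methylated. All names (`Site`, `replicate`, `maintain`, `divide`) are hypothetical.

```python
from typing import List, Tuple

# One CpG site on a duplex: (top strand methylated, bottom strand methylated).
Site = Tuple[bool, bool]


def replicate(duplex: List[Site]) -> Tuple[List[Site], List[Site]]:
    """Semiconservative replication.

    Each daughter keeps one parental strand with its marks; the newly
    synthesized partner strand starts out unmethylated.
    """
    daughter_a = [(top, False) for top, _ in duplex]
    daughter_b = [(False, bottom) for _, bottom in duplex]
    return daughter_a, daughter_b


def maintain(duplex: List[Site]) -> List[Site]:
    """Assumed maintenance step: hemimethylated sites become fully methylated.

    Fully unmethylated sites are left untouched, so the off/on pattern
    is copied rather than created.
    """
    return [(top or bottom, top or bottom) for top, bottom in duplex]


def divide(duplex: List[Site]) -> Tuple[List[Site], List[Site]]:
    """One round of replication followed by maintenance in both daughters."""
    a, b = replicate(duplex)
    return maintain(a), maintain(b)


if __name__ == "__main__":
    island = [(True, True), (False, False), (True, True)]
    a, b = divide(island)
    print(a == island, b == island)  # both daughters carry the parental pattern
    # Without maintenance, the mark is lost from one lineage after two rounds.
    first, _ = replicate(island)
    _, second = replicate(first)
    print(second)
```

In this toy model the pattern survives division only because the maintenance step reads the old strand; replication on its own halves the mark each round, which is the property an epigenetic tag must overcome.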
However, not all organisms methylate their DNA, so there must be other ways of stably marking genes as on or off. If a DNA-binding protein that distinguishes the on and off state of a gene could remain bound to DNA semiconservatively during DNA replication, then it could act as an epigenetic tag. No examples are known of a DNA-binding protein that serves as a tag, although histones appear to remain bound to DNA during replication. This bound protein could recruit activating or silencing proteins to the DNA each cell cycle, or it could cause a stable change in chromatin structure. Another attractive idea is that transcription or its absence could cause formation of a particular chromatin structure at a gene that may be stably inherited. Such a model has been proposed to explain how the centromeres of higher eukaryotes are propagated through cell division (Murphy and Karpen, 1998). Of course, combinations of these ideas are also possible. See also Chromatin Remodelling and Histone Modification in Transcription Regulation, and Histones: from Gene Organization to Biological Roles
Maintaining Silencing: the PRE and Pc Complex
Much of what is known about epigenetic mechanisms of gene regulation has come from studying the polycomb group (PcG) genes required for maintenance of homeotic gene regulation in Drosophila. The PcG genes are named after the polycomb gene, so named because mutations cause the appearance of extra male-specific structures termed sex combs. Mutations in PcG genes cause homeotic transformations because repression of homeotic loci is disrupted, so the homeotic genes are expressed throughout the embryo instead of being spatially regulated. Although the homeotic loci are the best characterized targets of PcG activity, PcG proteins are important for regulation of many developmentally important genes. PcG mutants also exhibit other phenotypes, including nervous system, gut, wing, and segmentation defects, cell death and cancerous overgrowth. All PcG proteins bind multiple sites on polytene chromosomes, including the homeotic loci (Pirrotta, 1997) (Figure 1). See also Polytene Chromosomes
In Drosophila, there are 15 PcG loci, of which 13 have been cloned and sequenced. PcG proteins share sequences with a number of other chromatin proteins needed for repression and activation. PcG homologues exist in yeast, plants, Caenorhabditis elegans, and mammals, and where tested, also have conserved silencing functions in these organisms (Schumacher and Magnuson, 1997). M33, the mouse homologue of the Pc gene, rescues flies mutant for the Pc gene. Human or mouse enhancer of zeste can rescue defective yeast telomeric silencing, indicating a conservation of PcG-mediated silencing mechanisms even in simple eukaryotes. Mutations of PcG genes in mice and plants exhibit homeotic transformations, showing that PcG-dependent silencing is conserved in these organisms. However, PcG mutations also show phenotypes not seen in flies, like sex reversal in mice, consistent with PcG function being adapted for different developmental processes in different organisms. See also Evolutionary Developmental Biology: Homologous Regulatory Genes and Processes
Flies mutant for two PcG genes exhibit stronger homeotic transformations than flies mutant for either single PcG gene, suggesting that PcG proteins may function as a protein complex. Consistent with this idea, many PcG proteins bind to the same chromosomal target sites, interact in vitro, and coimmunoprecipitate or cofractionate. Evidence is mounting that there may be multiple PcG complexes, each containing different combinations of PcG and other proteins. To determine how PcG complexes function in silencing, four questions must be answered: (1) How do PcG complexes bind to target sites? (2) How do PcG complexes act on target sites only in regions of the embryo where silencing is required? (3) What is the mechanism of PcG-mediated silencing? (4) How is silencing maintained through multiple cell divisions once a target site has been recognized?
PcG proteins bind targets via PcG-response elements, or PREs. PREs were originally identified as DNA sequences that could confer maintenance of expression boundaries of homeotic regulatory regions late in embryogenesis. This maintenance is dependent on normal PcG function because the boundaries break down in PcG mutants. Incorporation of PREs into new locations in the genome created new binding sites for PcG proteins at these locations, indicating that PREs contain binding specificity for PcG proteins (Pirrotta, 1997). So far, only two PcG proteins have been shown to bind DNA directly. The transcription factor yin yang 1 (YY1) is encoded by the PcG gene pleiohomeotic, and binds to specific sequences in a PRE near the engrailed gene (Brown et al., 1998). These YY1 sequences are also found in other PREs, including those of homeotic genes. The mouse PcG protein Mel-18 binds specific DNA sequences in vitro. PcG complex formation may depend on binding sequence-specific PcG proteins, followed by recruitment of the non-DNA-binding components. However, PREs are complex, suggesting that there are multiple ways to assemble PcG proteins. There may be multiple factors, some of which are not PcG proteins, that mediate PcG complex formation at PREs.
PcG proteins repress the homeotic gene Ultrabithorax (Ubx) in the anterior, but not the posterior, part of the embryo, even though the DNA sequence is identical in all parts of the embryo. How do PcG complexes bind to PREs only where the gene is repressed? There are essentially two nonexclusive possibilities. A PcG complex may recognize repressed genes by interacting with specific repressive transcription factors present at the locus (such as segmentation gene products), or binding of a PcG complex may require the chromatin structure associated with a repressed state.
The specific mechanism of PcG-mediated repression of target loci is unknown. Several models of PcG function have been suggested: (1) that PcG proteins repress target loci by compacting the DNA into a heterochromatin-like structure; (2) that PcG proteins target DNA segments to repressive compartments in the nucleus; (3) that PcG proteins inhibit transcription initiation; or (4) that PcG proteins interfere with the interaction between enhancer elements and the promoter by looping out the intervening DNA. The most popular model is that a PcG complex binds, and then recruits other PcG proteins that spread along the chromosome, creating a highly condensed, heterochromatin-like structure that prevents transcription factors from binding DNA. The most recent studies suggest that PcG proteins are bound at discrete locations at target loci, rather than being spread throughout the locus. There is little direct evidence for a change in chromatin structure at loci that are repressed by PcG proteins, and no one has ruled out the possibility that the chromatin change is a consequence of transcriptional silencing, rather than the other way round. The compartment idea has not been disproven but it does not seem likely. PcG proteins bind about 100 sites on polytene and mitotic chromosomes that are not localized to a specific region of the nucleus, so one would have to argue that there are at least 100 separate repressive compartments in the nucleus. PcG proteins may interfere with basal transcription, perhaps by recruiting proteins that deacetylate histones. Another possibility is that PREs act as nucleation sites for PcG complexes that can interact with and stabilize complexes bound at other binding sites throughout the regulatory region. This would create a looped-out protein-DNA conformation that may prevent enhancer-promoter interactions and thus repress transcription. Repression of reporter constructs containing a PRE increases when the reporter is able to pair with other PRE-containing reporter constructs, suggesting that PREs interact with each other to stabilize repression (Pirrotta, 1998) (Figure 2). See also Heterochromatin and Euchromatin, and DNA Looping and Transcription Regulation
Once PcG proteins have assembled at a target site, the locus must remain silenced through multiple cell divisions. This returns to the problem of the epigenetic tag discussed above. The PcG complex could itself be the epigenetic tag. Analysis of binding of PcG proteins to chromosomes during the cell cycle shows that Pc is removed completely from the chromosomes during DNA replication and only returns to its binding sites after cell division. Trace amounts of the PcG proteins posterior sex combs and polyhomeotic remain associated with target sites throughout the cell cycle, suggesting that small levels of bound PcG proteins may act as tags for the reassembly of a PcG repressive complex at target sites (Buchenau et al., 1998). PcG proteins do not organize chromatin into a self-renewing structure in the absence of a PRE, because loss of a PRE in development prevents silencing. If individual PcG complexes are stable, and multiple PcG binding sites are required for repression, then PcG complexes might disassemble at a given PRE at a locus during DNA replication, and then reform, stabilized by interaction with PcG proteins at other binding sites that have not yet undergone replication, or that have already been replicated, so that the locus as a whole remains marked during replication. Alternatively, PcG proteins could cause deacetylation of histones, which could act as a tag by ensuring that new nucleosomes also contained histones that were not acetylated. See also Repression Mechanism
Competition between the On and Off States: The Trithorax Group
In the same way that PcG proteins are required to maintain silencing, it seems reasonable that there should be proteins required to maintain the on state after transitory expression of the initial transcription factor has ended. Genes encoding such proteins exist, and are called the trithorax group (trxG) after the first gene of this type to be discovered. Mutations in trithorax cause homeotic transformations resulting from reduction of homeotic gene expression. Interestingly, many trxG genes were discovered because they suppress the homeotic phenotypes caused by PcG mutations. This result implies that trxG proteins might oppose the action of PcG proteins, or vice versa (Kennison, 1995). See also Mammalian Embryo: Hox Genes
Mutations in many kinds of genes might prevent gene activation, so trxG genes can be expected to be more heterogeneous than the PcG genes. Consistent with this expectation, some trxG genes encode sequence-specific DNA-binding proteins like zeste or the GAGA factor, others are subunits of highly conserved chromatin remodelling complexes (brahma), one is an intracellular transport protein (vha55), whereas others are chromatin-binding proteins of unknown function, including trithorax itself, absent small or homeotic disks 1 and 2 (ash-1 and ash-2). There is biochemical support for the existence of multiple complexes containing trxG proteins, because complexes containing brahma do not contain trithorax, ash-1 or ash-2.
The existence of genes required for silencing implies that silencing is not merely the absence of transcription. Instead, silencing must be an active process, because if PcG genes are mutated in development, formerly repressed genes become reactivated. Similarly, the maintenance of transcription appears not to be the passive absence of silencing, because the trxG proteins are required constantly throughout development to keep genes on. Ultrabithorax is a homeotic gene that is repressed in the anterior, and expressed in the posterior of the embryo. PcG proteins bind to ultrabithorax PREs only in the anterior, and trxG proteins bind to ultrabithorax regulatory elements only in the posterior. But how is this achieved?
A subset of the trxG proteins, including the GAGA factor, trithorax, ash-1, and ash-2, act directly at homeotic loci. Attempts to use genetic and biochemical means to map trxG binding sites at the ultrabithorax locus suggest that trxG-response elements (TREs) overlap with PREs. The same DNA shown to repress reporter gene activity in vivo when polycomb is present has been shown to activate the reporter when trithorax is present. Similarly, trithorax binds to PRE DNA if it is moved to a new region of the genome. These observations suggest one simple model for maintaining activation or repression of ultrabithorax. Suppose there were competition for binding, so that either trxG or PcG proteins could bind the PRE/TRE, but not both. Prior activation of ultrabithorax might then promote binding by trxG proteins, which would prevent binding of PcG proteins. Similarly, prior silencing of ultrabithorax by transitorily expressed transcription factors might promote binding of PcG proteins, and prevent binding of trxG proteins. Unfortunately for this appealing model, trithorax protein binds homeotic loci in vivo simultaneously with PcG proteins, ruling out competition for binding sites (Figure 3).
A more complicated model suggests that both trxG and PcG proteins bind the PRE/TRE, but that only one group of proteins is assembled into a functional complex. PcG proteins bind in vivo to PREs of transcribed genes, supporting the idea that binding of PcG proteins is not sufficient for repression. At least in salivary glands, trxG proteins are present at the sites of silenced homeotic genes, suggesting that trxG proteins may not be sufficient for activation. Perhaps there is competition for a limiting factor or factors required for assembly of a functional PcG or trxG complex, and the availability, or accessibility of this factor is limited by chromatin changes associated with prior transcription or repression. In this model, proteins of both groups would be present, but would not be active until a complete activating or repressing complex was assembled after capture of the limiting factor.
Recent evidence suggests that the former idea of two completely separate complexes, a repressing PcG complex and an activating trxG complex, may be too simple. There may be functional overlap between proteins of the two groups. Two members of the PcG, enhancer of zeste and additional sex combs, have phenotypes that are associated with the trxG. One of these proteins, additional sex combs, may interact directly with trithorax. The GAGA factor, encoded by trithorax-like, can recruit chromatin remodelling factors required for activation. Yet the GAGA factor is required for assembly of some PcG complexes, and its binding to homeotic loci comaps with that of polycomb. It could be that GAGA is required to stabilize an open chromatin conformation that allows binding of activator or repressor complexes. It is possible that both activation and repression require some level of cooperative activity between PcG and trxG proteins. See also Evolutionary Developmental Biology: Hox Gene Evolution
Comparison with Other Systems
Epigenetic tags that work directly or indirectly via chromatin proteins are not restricted to the PcG and trxG. Position effects, in which genes are differentially regulated by chromatin depending on their location in the genome, probably provide the closest parallel to the PcG and trxG. Less closely related phenomena include X chromosome inactivation and imprinting in mammals.
In Drosophila, position effect variegation (PEV) occurs when a genome rearrangement brings a normally euchromatic gene close to heterochromatin. The result is that the euchromatic gene is sometimes, but not always, inactivated. The mechanism of PEV is unclear. Heterochromatin might spread to inactivate the neighbouring euchromatic gene, or heterochromatin might bring the euchromatic gene into a transcriptionally inactive nuclear compartment. Two PcG genes and several trxG genes also affect PEV. PEV is widespread in eukaryotes (Wakimoto, 1998). See also The Role of Insulators in Genome Organization and Gene Expression
A closely related phenomenon, called telomeric position effect (TPE), occurs in flies and yeast. Genes brought within a few kilobases of telomeres are inactivated. Elegant studies in yeast provide the only clear evidence of spreading of heterochromatin from telomeres to the adjacent euchromatin. Multiple proteins are required, including RAP-1, SIR1-4, the ORC complex and histones. RAP-1 binds to specific DNA sequences at telomeres, and physically interacts with SIR2-4, which in turn interact with histones to create a transcriptionally inactive DNA-histone complex. Hyperacetylation of histones can directly interfere with TPE, and this relaxation of silencing can last through multiple cell cycles. Thus acetylation of histones may be acting as an epigenetic tag. Less is known about TPE in Drosophila, but some PcG genes, and some genes required for PEV, also affect TPE. Many of the proteins required for TPE are also required for silencing at the mating type and ribosomal loci of yeast, suggesting that all these silencing processes are related. See also Telomeres
In mammals, one of the female X chromosomes is inactivated to equalize the dose of X chromosome genes in females, which have two X chromosomes, and males, which have only one. X chromosome inactivation has many similarities to other forms of silencing. Inactivation occurs early, causes changes in methylation and chromatin structure, and is clonally propagated throughout development. Expression of the Xist gene is necessary for, and precedes X inactivation. Maintenance of a silenced Xist gene on the active X, and an active Xist on the inactive X maintains the clonal inheritance of an active versus an inactive X. However, it is not known how the inactive and active states of Xist are maintained. Nor do we know the connection between Xist expression and the methylation and chromatin changes associated with X inactivation, although coating of the inactive chromosome with Xist RNA is probably important. See also X-
Imprinting provides a clear example of epigenetic tagging in mammals. Imprinting occurs when a gene is differentially transcribed depending on whether the chromosome was inherited from the father or the mother. Inactive, imprinted genes are methylated but these methylation changes may be a cause or consequence of the imprint. Because imprinting must occur in the germline, there must be a link between the transcriptional state of the gene in male or female cells and methylation. See also Genomic Imprinting
Glossary
- Chromatin: Cellular DNA packaged with proteins, including histones and nonhistone chromatin proteins, to make it more compact.
- Determination: A stable change in cell fate.
- Epigenetic: Epigenetic information is specified by modification of DNA, or modification of chromatin, as opposed to genetic information, which is specified by DNA sequence.
- Heterochromatin: Chromatin usually found at centromeres and telomeres, which is more condensed, usually gene poor, and later replicating than euchromatin, which is less condensed, gene rich, and early replicating.
- Homeotic: Describes genes that specify cell identity, or the changes of one body part into another homologous body part that mutations in these genes cause.
- Methylation: Modification of a cytosine residue by the addition of a methyl group.
- Morphogenetic movement: Cell or tissue migrations that lead to changes in organ or body shape.
- Phenotype: The observable features of a cell or organism. Mutant phenotypes are usually compared to wild-type phenotypes.
- Transcription factor: A regulatory protein required to initiate, upregulate or repress transcription. Usually refers to DNA-binding proteins that bind enhancers or silencers.