Keywords:

  • C. elegans;
  • transcription;
  • transcription factors;
  • development;
  • cell fate specification;
  • chromatin

Abstract

  1. Top of page
  2. Abstract
  3. INTRODUCTION
  4. GENERAL ARCHITECTURE OF GENE REGULATORY REGIONS IN C. elegans
  5. “SIMPLE PROMOTERS,” TERMINAL DIFFERENTIATION GENES, AND “TERMINAL SELECTOR” FACTORS
  6. “COMPLEX PROMOTERS” AND THEIR DEVELOPMENTAL REGULATORS
  7. CELL FATE, DIFFERENTIATION, AND BINARY SWITCHES
  8. PROSPECTS FOR CHARACTERIZING SPECIFIC TFS AND THEIR TARGETS
  9. MEDIATORS: FROM THE SPECIFIC TO THE GENERAL
  10. GENERAL TRANSCRIPTION FACTORS
  11. C. elegans CHROMATIN
  12. NUCLEOSOME POSITIONING
  13. HISTONE VARIANTS IN C. elegans
  14. HISTONE MODIFICATIONS IN C. elegans
  15. HIGHER ORDER STRUCTURE OF CHROMATIN
  16. OTHER CONTROLS ON C. elegans TRANSCRIPTION?
  17. FUTURE PROSPECTS?
  18. Acknowledgements
  19. REFERENCES

We review recent studies that have advanced our understanding of the molecular mechanisms regulating transcription in the nematode C. elegans. Topics covered include: (i) general properties of C. elegans promoters; (ii) transcription factors and transcription factor combinations involved in cell fate specification and cell differentiation; (iii) new roles for general transcription factors; (iv) nucleosome positioning in C. elegans “chromatin”; and (v) some characteristics of histone variants and histone modifications and their possible roles in controlling C. elegans transcription. Developmental Dynamics 239:1388–1404, 2010. © 2010 Wiley-Liss, Inc.


INTRODUCTION

Articles in this issue of Developmental Dynamics uniformly praise the nematode Caenorhabditis elegans as an experimental organism, extolling the powerful features of defined cell lineage, superb classical genetics, accurately annotated genome sequence, optical transparency (as if designed for green fluorescent protein [GFP] reporters), highly effective gene knockouts by means of RNAi, and more. We will need all of these powerful experimental approaches to understand how gene transcription is regulated to construct even a relatively simple animal such as C. elegans.

The defined cell lineage of C. elegans (Sulston et al.,1983) can be viewed as a series of binary decisions between different cell fates. More often than not, the central players in such developmental cell fate decisions turn out to be transcription factors that bind to specific DNA sequence motifs in target promoters. There are estimated to be ∼900 such transcription factors encoded in the C. elegans genome (Reece-Hoyes et al.,2005). Although an enormous amount is known about the genetic pathways that make these cell fate decisions, our understanding at the molecular level remains superficial. What exactly is the crucial defining property of a specified cell? What is the physical basis of the stability of the specified state and how is this robust plan stored and read out (at the molecular level) over the hours and days of a worm's lifetime? Why does loss of some transcription factors cause a cell to switch fate while loss of other transcription factors does not cause such switching, with cells simply failing to adopt any obvious identity? When fate switching does occur, why are the fate decisions (usually) either/or and not some unhappy hybrid? How (thermodynamically) stable is one cell fate relative to another? It has long been recognized that alternative stable states of a biological system can be produced by relatively simple regulatory interactions (Novick and Weiner,1957) but is this what really happens in metazoan development? That is, are distinct cell fates an “emergent” property of a regulatory network, emerging from a cloud of feedback and feed-forward molecular subroutines (Kauffman,1987,1993)? Alternatively, is cell fate encoded (or at least stored) in the epigenetic “state” of the chromosome (i.e., histone arrangements, variants and modifications), such that only a single transcription program is locked in and stably propagated? In the current article, we will discuss recent insights into these problems in C. elegans. 
However, it will be obvious that we have a long way to go before we can claim to understand what really happens when a cell adopts a particular fate.
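The idea that simple regulatory interactions can yield alternative stable states can be made concrete with a toy model. The sketch below is an illustration only, not a model of any specific C. elegans gene: it assumes a single factor that activates its own production cooperatively (a Hill function with exponent 2) and is removed at a constant rate. All parameter values (`basal`, `vmax`, `k_half`, `decay`) are arbitrary assumptions chosen so that two stable levels exist.

```python
def production(x, basal=0.05, vmax=1.0, k_half=0.5):
    """Production rate of a hypothetical self-activating factor.

    basal: leaky production; vmax, k_half: assumed Hill parameters.
    """
    return basal + vmax * x ** 2 / (k_half ** 2 + x ** 2)


def simulate(x0, decay=1.0, dt=0.01, steps=5000):
    """Integrate dx/dt = production(x) - decay * x with forward Euler."""
    x = x0
    for _ in range(steps):
        x += dt * (production(x) - decay * x)
    return x


# Two starting levels on either side of the unstable threshold (x = 0.25
# for these assumed parameters) settle at different stable levels.
low = simulate(0.10)
high = simulate(0.40)
print(f"start 0.10 -> {low:.3f}; start 0.40 -> {high:.3f}")
```

With these assumed parameters, a cell that starts just below the threshold relaxes to a low "off" level, whereas one that starts just above it settles at a high "on" level. Whether real cell fate decisions work this way is exactly the open question raised above.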

A comprehensive review of C. elegans transcription written in 2005 (Okkema and Krause,2005) had only just begun to incorporate results from microarray experiments. In the ensuing 4 years, microarrays have, some would say, come and gone, and we are now in the era of next-generation sequencing; at long last, transcript inventories can be assessed at the level of single copies per cell or even per animal. However, this new experimental power shifts the spotlight from the transcriptome to the ultimate phenotype of the particular cell that the transcriptome produces, the assemblies of gene products such as myosin or collagen or digestive enzymes that construct the differentiated cells and that make the worm what it is. How much feedback regulation of transcription is there within a differentiated cell type to proceed from the transcriptome to the final cell phenotype? Does transcription simply control the timing and level of individual gene transcripts, whereupon protein translation and protein–protein interactions take care of all subsequent steps? Are there rules that apply to these steps regardless of cell type? Or have different cells adopted different strategies for fate specification and regulation of differentiation?

We will begin by reviewing several recent examples defining the transcription factors and transcription factor combinations central to cell fate decisions (e.g., pharyngeal gland cells, several specific classes of neurons, and, at the simple end of the spectrum, the intestine). We will limit ourselves to studies that focus on transcriptional mechanisms and where direct targets have been identified. Hence, we will not discuss several recent studies that identify new cell-fate specifying transcription factor combinations, e.g., the redundant trio of hlh-1, unc-120, and hnd-1 that specifies body wall muscle cells (Fukushige et al.,2006; Fox et al.,2007; Lei et al.,2009) and the overlapping roles of ceh-51 and tbx-35 in specifying the fate of the MS-blastomere of the early embryo (Broitman-Maduro et al.,2009). We will emphasize studies published in the last 2–3 years and will generally acknowledge earlier work by referring to the many excellent chapters in the WormBook, which cover these topics in greater depth (Hobert,2005; Meyer,2005; Okkema and Krause,2005; Strome,2005; Blackwell and Walker,2006; Schaner and Kelly,2006; Cui and Han,2007; Mango,2007; McGhee,2007) as well as to several more recent reviews on related subjects (Hobert,2008; Maduro,2008; Mango,2009).

GENERAL ARCHITECTURE OF GENE REGULATORY REGIONS IN C. elegans

For the most part, transcription regulatory regions of C. elegans genes are relatively small (< 2 kb; Okkema and Krause,2005), a feature that helps in traditional “promoter bashing” as well as in computational analysis of regulatory sequences. Regulatory sequences presumably involve both “core” promoter sequences required for RNA Polymerase II (PolII) binding as well as distinct cis-regulatory elements (CREs) that function as binding sites for specific TFs and that, either singly or in combination, act as transcriptional enhancers. The structure of “core” promoters is poorly defined in C. elegans, in large part because widespread trans-splicing obscures the precise initiation site of the primary transcript (Blumenthal,2005). It is usually assumed that the most proximal upstream region contains sequences required for PolII binding (such as a TATA-box) and indeed, many C. elegans promoters contain recognizable TATA-like sequences (Blackwell and Walker,2006); however, the relevance of these sequences is generally not investigated. In fact, in at least some cases, proximal promoter regions (including presumptive TATA-boxes) can be deleted with no apparent effect on reporter expression, and some enhancer sequences can activate reporter expression in the absence of any discernible core promoter (Wenick and Hobert,2004; Smit et al.,2008). Such results may reflect cryptic promoter activity within the vectors used to study C. elegans gene expression or may suggest that promoters in C. elegans are more loosely defined than in other organisms.

The compact nature of the C. elegans genome, together with the rapid production of transgenic strains, has allowed for extensive characterization of many gene regulatory regions, including the identification of individual CREs and their cognate TFs. Specific CREs are typically found upstream of the predicted start codon of a gene, generally within 2 kb. However, CREs are also regularly found in introns (e.g., eat-20 and peb-1; Gaudet and Mango,2002) and there are examples of CREs located in downstream sequences (ges-1; Egan et al.,1995; Marshall and McGhee,2001) or within the sequence of neighboring upstream genes (tbx-2; Roy Chowdhuri et al.,2006). Influential (and apparently isolated) binding sites can be several kilobases away from a gene and two such CREs have been identified in forward genetic screens. A binding site for TRA-1 has been identified ∼6 kb downstream in the egl-1 gene, repressing egl-1 in hermaphrodite-specific neurons (Conradt and Horvitz,1999). A CHE-1 binding site has been identified ∼5 kb upstream of the cog-1 gene (O'Meara et al.,2009), activating cog-1 in ASE neurons. O'Meara et al. (2009) emphasize both the rarity and the importance of CREs identified by forward genetic screens. They also sound a word of caution and note certain discrepancies between the results obtained in vivo and by transgenic “promoter-bashing”; in particular, the necessary CHE-1 site identified by genetic mutation might well have been missed if they had relied only on transgenic analysis. Finally, we know of no case in C. elegans of long range enhancers (say, >20 kb away from the responding gene), which are so prominent a feature of gene expression in Drosophila and vertebrates.

As in most systems, there is clearly a range of complexity with regard to the regulation of individual C. elegans genes, as discussed in a recent review (Okkema and Krause,2005). On one end of the spectrum are genes with relatively small regulatory regions (hundreds of bp) that are subject to correspondingly simple regulatory control. These genes are often expressed in simple patterns (a single cell-type or tissue) in differentiated cells, and can be referred to as “terminal differentiation genes” (Hobert,2008). At the other end of the spectrum are genes with relatively large regulatory regions (several kb); such genes typically have complex expression patterns or critical roles in development and must accurately interpret and integrate multiple different transcriptional inputs (Nelson et al.,2004).

“SIMPLE PROMOTERS,” TERMINAL DIFFERENTIATION GENES, AND “TERMINAL SELECTOR” FACTORS

Regulation of terminal differentiation genes is often straightforward, involving a single critical CRE that is both necessary and sufficient for gene expression. In contrast to complex enhancers consisting of multiple transcription factor binding sites (discussed below), such critical CREs are typically found in only one or two copies and represent binding sites for a single TF or TF complex that activates gene expression. In the simplest case, specificity of CRE activity is determined by the expression pattern of the cognate TF, with the relevant activator being expressed in the same pattern as its downstream targets. In the case of CREs that respond to a TF complex, specificity is determined by the overlap of the expression patterns of the different components of the complex. Examples of this type of gene regulation by cell- or tissue-specific regulators are plentiful: regulation of AIY interneuron genes by TTX-3 and CEH-10 (Wenick and Hobert,2004), regulation of intestinal genes by ELT-2 (McGhee et al.,2007,2009), and regulation of pharyngeal gland genes by HLH-6 (Smit et al.,2008), to name a few. A recent review (Hobert,2008) makes a convincing case for this simple mode of regulation being commonplace in specification of subtype identity in neurons. In these cases, the TFs are referred to as “terminal selectors,” given their ability to activate the full set of terminal differentiation genes that uniquely define a given neuron. Regulation of terminal differentiation genes can thus appear to be simple and straightforward but Hobert (2008) describes at least two examples where the terminal selector gene activates a second transcription factor and the pair then activate a subset of the terminal differentiation genes that define the particular neuron type. In other tissues, such feed forward loops involving the terminal selector gene may be common. 
For example, the GATA-factor ELT-2 is expressed in all cells of the intestine and appears to be involved in transcription of all intestinal differentiation genes (McGhee et al.,2007,2009). However, there are several cases where ELT-2 is known to act in combination with other TFs within the intestine (Moilanen et al.,1999; Neves et al.,2007; Romney et al.,2008; J.D. McGhee, unpublished results) and ELT-2 is a good candidate to control transcription of these auxiliary factors as well. Such ELT-2/(auxiliary TF) combinatorial feed-forward loops might be a common mechanism to provide the intestine with flexible responses to nutrition and the environment but other mechanisms are possible. A particularly good example is provided by the zygotic expression phase of SKN-1 (the same SKN-1 whose maternal expression specifies mesendoderm). It is likely that ELT-2 activates skn-1 transcription in the intestine (McGhee et al.,2009) but, once produced, SKN-1 protein directly controls certain detoxification genes expressed in the intestine (An and Blackwell,2003). In an important layer of regulation beyond transcription, nuclear localization of SKN-1 is controlled by environmentally responsive kinase pathways (An et al.,2005; Inoue et al.,2005). A subset of SKN-1 binding sequences are also ELT-2 binding sequences (and vice versa) and it is possible that ELT-2 also directly controls the same detoxification genes.

Gene expression in some other cell types is not regulated by single terminal selectors, but rather by multiple TFs. In the pharyngeal glands, for example, HLH-6 activates the expression of only a subset of pharyngeal gland genes (Smit et al.,2008), with some gland genes being regulated by other, as-of-yet unidentified factors. Likewise, in the excretory cell, some genes are regulated by CEH-6, some by DCP-66, some by NHR-31 and still others by unknown factors acting through at least two other discrete CREs (Zhao et al.,2005; Mah et al.,2007; Hahn-Windgassen and Van Gilst,2009). In these cases, why are there multiple factors to generate the same expression pattern? One intriguing possibility is that co-regulated genes in a given cell type are functionally related, such that distinct aspects of a cell are parceled out to different gene batteries, each of which is regulated by a single TF. For example, NHR-31 activates expression of a set of vacuolar ATPase subunits in the excretory cell, while known HLH-6 targets in the glands are predominantly members of a family of mucin-like proteins (Smit et al.,2008; Hahn-Windgassen and Van Gilst,2009). A related possibility for multiple regulators acting in a specific cell is that expression of the different gene batteries may be regulated with subtle but important differences, such as differences in temporal regulation or in response to environmental conditions. Whether there is a similar organizational logic to co-regulated gene batteries remains to be seen; it is certainly possible that some gene batteries may be constructed in a more hodge-podge manner.

The simple expression patterns of terminal differentiation genes can be elaborated on by the presence of additional CREs that act either positively or negatively. With respect to positive combinations of CREs, one feature that terminal differentiation genes appear to have in common is the apparent modularity of their regulatory regions. The lys-8 gene, for example, is expressed in both pharyngeal glands and the intestine (Mallo et al.,2002), with expression in each tissue being driven by corresponding tissue-specific factors (HLH-6 and ELT-2, respectively) that act through independent CREs in the lys-8 promoter (Smit et al.,2008; and J. Gaudet, unpublished observations). The same modularity exists for terminal differentiation genes in neurons and is responsible for generating expression in multiple neuronal cell types. A given neuronal gene may be expressed in multiple neuronal subtypes (for example, in both ASE and AIY) with its expression in each cell being under the control of the respective terminal selector (CHE-1 and TTX-3/CEH-10; Uchida et al.,2003; Wenick and Hobert,2004; Hobert,2005). Note that the modular CREs do not result in synergistic activation of expression; rather, expression in each neuronal subtype is independently specified by the appropriate TF.

Terminal differentiation genes can also contain negative CREs that refine their expression. For example, the vitellogenin (yolk) genes contain positively acting CREs that bind to ELT-2 and activate expression in the intestine, as well as negatively acting CREs that bind to MAB-3 to repress expression in males (MacMorris et al.,1992; Yi and Zarkower,1999). Likewise, HLH-6 typically activates expression of target genes in all five pharyngeal glands, but at least two such targets (phat-5 and F15D4.4) are only expressed in a subset of the glands (the two g1A cells; Smit et al.,2008; J. Gaudet, unpublished observations). Preliminary analysis of the phat-5 promoter indicates the presence of a negative CRE that prevents expression in the non-g1A glands (J. Gaudet, unpublished observations). Of interest, such negative regulation has not been reported for the various terminal differentiation genes expressed in different neuronal subtypes (Hobert,2008), yet another possible difference between the neuronal terminal selectors and TFs in other tissues.

While most terminal differentiation genes have relatively simple and small regulatory regions, there are well-characterized exceptions. The myo-2 gene, for example, is expressed only in pharyngeal muscle cells, yet directly responds to more than three redundantly acting TFs (Okkema and Fire,1994; Kalb et al.,1998; Thatcher et al.,2001). One possibility to explain the relative complexity of myo-2 regulation may be that there is no single “pharynx muscle” TF but, instead, there are multiple regulators acting in different subsets of pharyngeal muscle. All characterized pharyngeal muscle TFs are, in fact, expressed in only subsets of pharynx muscle (e.g., ceh-22, pha-2, tbx-2; Okkema and Fire,1994; Morck et al.,2004; Roy Chowdhuri et al.,2006; Smith and Mango,2006), possibly reflecting the different lineages from which this group of cells arises.

“COMPLEX PROMOTERS” AND THEIR DEVELOPMENTAL REGULATORS

In contrast to the relatively simple regulation of terminal differentiation genes, genes that act earlier in development or that have repeated uses in different contexts are subject to more elaborate regulatory control. In general, genes with more complex regulation tend to have larger regulatory regions (>2 kb), presumably reflecting their complexity and greater number of transcriptional inputs (Nelson et al.,2004).

For each of the TFs described in the previous section (TTX-3/CEH-10, ELT-2, and HLH-6), there is a reasonably detailed understanding of the regulatory inputs that activate their expression in the desired cell- or tissue-type (summarized in Fig. 1). Regulation of ELT-2 is probably the simplest of the three (reviewed in McGhee,2007; Maduro,2008), possibly because of the relatively simple relationship between intestinal identity and cell lineage: all descendants of the E blastomere give rise to intestine and only intestine. In most other situations, lineally related cells can have vastly different identities, with the relationship between lineage and cell fate being less clear. Initiation of elt-2 expression at the 2E cell stage is controlled by the two redundant intestine-specific GATA factors END-1 and END-3, responsible for specifying the intestine precursor blastomere E. Indeed, most of the regulatory complications seem to arise upstream of E-cell specification: end-1 and end-3 are themselves under direct control of the maternal SKN-1 factor, cooperating with the HMG-protein POP-1 to confer developmental asymmetry to the E blastomere precursor, EMS. The zygotically expressed MED-1/2 GATA-like factors also participate in E-cell specification but in a relatively minor capacity (Maduro,2008). The majority of the above regulatory interactions are likely to be direct; candidate binding sites for SKN-1 (and MED-1/2) have been identified in end-1 and end-3 promoters (Maduro,2008); candidate binding sites for END-1/END-3, as well as ELT-2 in its autoregulatory role, have been identified in the elt-2 promoter (J.D. McGhee; unpublished observations). Recently, Owraghi et al. (2009) have produced end-1 end-3 doubly homozygous nulls, which indeed do not produce endoderm, thereby validating the longstanding model (see, for example, Zhu et al.,1997). The availability of these animals allowed the authors to resolve longstanding questions about the mesoderm/endoderm-repressing/activating functions of POP-1 and also to provide a surprising evolutionary perspective: this robust intricate pathway determining mesendoderm fate shows unexpected variations between embryos of C. elegans and C. briggsae, despite their near morphological identity (Lin et al.,2009).

Figure 1. Schematic view of gene regulation in three different cell types: AIY interneurons, intestinal cells and pharyngeal glands. A: Modified from (Hobert,2005). See text for details. B: Modified from (McGhee et al.,2009). See text for details. C: Modified from (Raharjo and Gaudet,2007). P-Act, unidentified posterior pharynx activator; GN-Act, unidentified activator functioning in glands and neurons; G-Act, unidentified activator of hlh-6-independent genes; G-Rep, unidentified repressor acting in non-g1A glands.

In the case of ttx-3/ceh-10 and hlh-6, regulation is more complex, involving the integration of multiple transcriptional inputs to establish precise expression patterns in a single cell type. Expression of the upstream activators is relatively broad, and it is the combination of factors that dictates the final pattern of expression. Such combinatorial control by multiple factors has been best demonstrated for ttx-3 and ceh-10, which involves activation of ttx-3 by the broadly expressed factors REF-2 and HLH-2, followed by activation of ceh-10 by TTX-3 and the Wnt effector POP-1 (Bertrand and Hobert,2009). Regulation by POP-1 depends on levels of nuclear POP-1, with high levels repressing ceh-10 expression and low levels activating expression. Regulation of hlh-6 expression is similarly complex, involving the combined action of at least four distinct CREs (Raharjo and Gaudet,2007; Ghai and Gaudet,2008). Importantly, activity of the individual CREs is not confined to pharyngeal glands nor is the activity of any one CRE sufficient for expression; it is only when they are combined that the gland-specific pattern of hlh-6 is generated. It remains unclear whether combinatorial action of CREs involves specific architectural concerns such as spacing (relative or absolute) or ordering of different CREs. In the case of hlh-6, recapitulation of gland-specific expression can be achieved with a fairly arbitrary arrangement of the relevant CREs, suggesting no particular spacing or ordering requirements. However, TFs that synergize through physical interactions may require specific spacing or phasing of CREs to allow for activation of gene expression.
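The logic of combinatorial control can be sketched with simple set operations. The example below is a toy illustration under stated assumptions: the cell-type labels and activity domains are invented for the purpose, and hlh-6 regulation is reduced to two of its inputs (the unidentified P-Act and GN-Act activators of Fig. 1C). Combinatorial control is modeled as an intersection (expression only where every input is active) and is contrasted with the modular arrangement described earlier for lys-8, which is modeled as a union of independent inputs.

```python
# Hypothetical activity domains; these sets are illustrative assumptions,
# not measured expression patterns.
P_ACT_DOMAIN = {"pharyngeal gland", "posterior pharyngeal muscle"}
GN_ACT_DOMAIN = {"pharyngeal gland", "neuron"}


def combinatorial_expression(*cre_domains):
    """AND logic: expression only in cells where every CRE is active."""
    return set.intersection(*map(set, cre_domains))


def modular_expression(*cre_domains):
    """OR logic: each CRE independently drives expression in its own domain."""
    return set.union(*map(set, cre_domains))


# hlh-6-like case: broadly active inputs intersect in a single cell type.
hlh6_like = combinatorial_expression(P_ACT_DOMAIN, GN_ACT_DOMAIN)

# lys-8-like case: a gland module and an intestine module act independently.
lys8_like = modular_expression({"pharyngeal gland"}, {"intestine"})

print(sorted(hlh6_like))
print(sorted(lys8_like))
```

In this sketch, neither broadly active input alone is gland-specific, but their overlap is. The model deliberately ignores spacing, ordering, and quantitative effects such as the level-dependent action of POP-1.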

Kuntz et al. (2008) have provided an excellent recent example of an even more complex promoter. They have analyzed the combined promoter regions lying between the divergently transcribed lin-39 and ceh-13 genes in the C. elegans Hox cluster. Identification of sequences conserved between multiple nematode species, followed up by transgenic reporter assays, identified 10 activating regions spread throughout the ∼20 kb region. A previous lower-resolution analysis by Wagmaister et al. (2006) had shown that these regions bind three different TFs: LIN-1, LIN-31, and LIN-39. Overall, the results strongly support the view that complex (multi-tissue multi-stage) expression patterns are the sum of individual more restricted expression subpatterns driven by discrete independent modular enhancers. Kuntz et al. (2008) also report the remarkable observation that a 700 bp region conserved in the mouse Hox genomic DNA was able to recapitulate much of the expression pattern driven by the homologous sequence in C. elegans. They further argue that this functional conservation was unlikely to result from convergent evolution; in other words, the forerunner of this enhancer was present in the archetypal Hox cluster of our bilaterian ancestor.

So far, there does not appear to be a single strategy (unlike the case of terminal selectors) for how cells adopt a specific fate or enact their unique differentiation programs. Nonetheless, some common themes emerge from the collected work. Foremost among these themes is combinatorial control to regulate expression of some TF, which then imparts this “pattern information” to downstream targets. The overall structure of such a regulatory arrangement has an hourglass shape, with multiple inputs feeding into a single TF, which then regulates expression of multiple downstream targets, as noted previously (Hobert,2008).

CELL FATE, DIFFERENTIATION, AND BINARY SWITCHES

An interesting observation in the relationship between transcriptional control and cell identity is the phenotypic effect of loss of different TFs. As noted above, loss of some TFs (usually early acting ones) leads to cells adopting alternate fates, generally relating to alterations in blastomere identity (however, see Owraghi et al.,2009 for an example of a mixed fate). In contrast, loss of other TFs (often later acting ones) does not result in obvious fate switches. In the case of neuronal subtypes, loss of a given terminal selector produces neurons of indeterminate identity: the cells retain neuronal characteristics but do not adopt the characteristics (including expression of specific terminal differentiation genes) of other neurons. Likewise, loss of the pharyngeal regulators HLH-6 or TBX-2 produces cells that do not express any cell-type–specific markers; the defective cells appear to arrest in an undifferentiated state (Smith and Mango,2006; Smit et al.,2008). The same situation occurs in body wall muscle, where loss of three redundantly acting TFs (HLH-1, HND-1, and UNC-120) results in a complete loss of muscle gene expression, yet the presumptive muscle cells do not adopt an obvious alternate identity (Fukushige et al.,2006). It is as if these cells are specified to adopt a particular fate but are unable either to act upon this decision (i.e., being unable to enact a specific differentiation program) or to adopt an alternate developmental path.

PROSPECTS FOR CHARACTERIZING SPECIFIC TFS AND THEIR TARGETS

Recent technological developments hold great promise for a more thorough description of the transcriptional networks controlling gene expression in C. elegans. In the past, analysis of gene regulation has often relied on traditional “promoter bashing” approaches, with deletion and mutation analysis leading to the identification of discrete cis-regulatory elements that are critical for promoter activity. Moving from cis-elements to their cognate trans-acting factors has then been achieved through molecular/biochemical identification of binding factors and/or genetic identification of relevant factors, either by testing of candidate factors or by genome-wide screening. These approaches will continue to provide valuable information, with new resources greatly facilitating the ease of both element and factor identification. The availability of multiple Caenorhabditis genome sequences allows for the identification of conserved sequence elements in gene regulatory regions, which can often lead to the identification of functionally relevant sequences (as noted earlier with respect to the analysis of lin-39/ceh-13 control; Kuntz et al.,2008). Additionally, online resources make such multi-genome comparisons rapid and easy (e.g., the UCSC Genome Browser; Kuhn et al.,2009). The relatively small size of C. elegans regulatory regions also makes them amenable to computational analyses for CRE discovery. Various tools exist to search for sequence motifs (candidate CREs) that occur in a given set of co-regulated promoters at frequencies greater than expected by chance alone (reviewed in Mango,2007). Such approaches have been used successfully to identify critical CREs (e.g., Pauli et al.,2006; McGhee et al.,2007,2009;Smit et al.,2008), although success is not guaranteed, particularly in cases where a given CRE is relatively degenerate (Wenick and Hobert,2004). 

Once functional cis-elements have been identified, efforts to identify relevant trans-acting factors benefit from well-annotated TF inventories (Reece-Hoyes et al.,2005) and from ever-increasing genome-scale expression data that can suggest candidate transcription factors that are present in the right cell at the right time. Of particular interest is current work that aims to determine the expression patterns of many (and eventually all) transcription factors at single-cell resolution throughout embryogenesis (Murray et al.,2008). Automated 4D analysis allows the relatively rapid determination of transcription factor expression patterns and overlays this information on the invariant C. elegans cell lineage. Such a detailed and complete description of expression patterns in the embryo (allowing for the appropriate time lags required for reporter folding and/or accumulation to detectable levels) will provide a ready resource for the identification of candidate transcription factors. Yet another exciting new technology has the potential to efficiently identify transcription factors controlling particular genes: a transgenic strain that expresses a GFP reporter is mutagenized and a large number of segregants are passed through a COPAS “Worm-sorter” capable of identifying rare animals with perturbed expression patterns (Doitsidou et al.,2008). Next generation sequencing (Shen et al.,2008) or comparative genomic hybridization (Flibotte et al.,2009) can greatly accelerate identification of the responsible mutation.
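
The motif searches mentioned above (sequence motifs occurring in a set of co-regulated promoters more often than expected by chance) can be illustrated with a minimal Python sketch. It is not the algorithm of any of the published tools: it simply counts fixed-length words on both strands and scores each with a hypergeometric tail probability. The function names, the word length k = 6, and the cutoff alpha are all assumptions made for illustration, and no multiple-testing correction is applied.

```python
from math import comb

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def canonical(kmer):
    """Treat a word and its reverse complement as the same motif."""
    return min(kmer, revcomp(kmer))

def kmers_present(seq, k):
    """Set of canonical k-mers that occur at least once in seq."""
    return {canonical(seq[i:i + k]) for i in range(len(seq) - k + 1)}

def hypergeom_tail(x, N, K, n):
    """P(X >= x) when n promoters are drawn from N, of which K carry the motif."""
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(x, min(K, n) + 1)) / comb(N, n)

def enriched_kmers(coregulated, all_promoters, k=6, alpha=1e-3):
    """Rank k-mers over-represented in the co-regulated promoters.

    all_promoters is the background and must include the co-regulated set.
    Returns (kmer, p_value) pairs with p_value < alpha, smallest first.
    """
    fg = [kmers_present(s, k) for s in coregulated]
    bg = [kmers_present(s, k) for s in all_promoters]
    N, n = len(bg), len(fg)
    hits = []
    for kmer in set().union(*fg):
        x = sum(kmer in s for s in fg)   # co-regulated promoters with the word
        K = sum(kmer in s for s in bg)   # all promoters with the word
        p = hypergeom_tail(x, N, K, n)
        if p < alpha:
            hits.append((kmer, p))
    return sorted(hits, key=lambda h: h[1])
```

Counting each promoter at most once per word keeps the test simple, but it also means a degenerate site that matches several related words will be split across them and may fall below the cutoff, which is one way of seeing why degenerate CREs are hard to recover.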

Once a specific TF has been identified, worms provide powerful approaches to investigate biological function, most importantly the ultimate availability of gene knockouts in all transcription factors in the C. elegans genome (Moerman and Barstead,2008). However, interpretation may not always be straightforward because of TF redundancy or the presence of early lethal phases complicating investigation of later stages or just the lack of obvious phenotypes. Nonetheless, it is possible in principle to identify the transcriptional consequences of TF loss, either by enriching for desired embryonic cells (e.g., Fox et al.,2007; Von Stetina et al.,2007; McGhee et al.,2009; Meissner et al.,2009) or, for later stages, by enriching the transcriptome itself (Roy et al.,2002; Von Stetina et al.,2007). No matter how one obtains a cell- or tissue-specific transcriptome, next generation sequencing now allows analysis in unprecedented detail. Additional genome-scale efforts are also poised to reveal the range of specific transcription factor interactions with their target sequences. Chromatin immunoprecipitation combined with microarray analysis (ChIP-chip) or deep sequencing (ChIP-seq) can potentially identify many targets of a given transcription factor. Such data can be further exploited to identify candidate recognition sequences for a given factor. In C. elegans, ChIP has been most successful in the study of abundant or broadly expressed factors (Oh et al.,2006; Ercan et al.,2007; Whittle et al.,2008,2009), with tissue-restricted proteins posing some challenges, possibly due to low abundance or limited distribution of the factor (Lei et al.,2009).

An area for future investigation will no doubt be the molecular mechanism(s) by which specific TFs activate transcription. How, for example, does combinatorial control actually work? Is it simply the presence of multiple TFs binding to a region of DNA that leads to PolII recruitment by means of “mass-action” or some threshold effect? Or do specific TFs undergo unique allosteric changes upon binding, enabling particular downstream general transcription factors (GTFs) to recruit RNA Polymerase II (Meijsing et al.,2009)? An initial step toward such mechanistic insights is to characterize the roles of the general transcription factors and particular subunits of the Mediator complex, together with their interactions with specific TFs.

MEDIATORS: FROM THE SPECIFIC TO THE GENERAL

Specific transcription factors must ultimately regulate gene expression by influencing the assembly, recruitment and/or activity of GTFs and the multicomponent RNA Polymerase II enzyme. The effects of specific TFs on the general TF machinery are mediated by the aptly named Mediator complex, first identified in yeast (reviewed in Casamassimi and Napoli,2007). Studies from yeast suggest a structure for the Mediator complex consisting of three modules: a head, middle and tail module. The head module contacts PolII and GTFs, while the tail module interacts with specific TFs that bind to DNA elements in the promoter. Mediator complexes may exist in a variety of forms, depending on the presence of specific sets of Mediator subunits (MDTs in C. elegans; MEDs in other systems), with different complexes possibly serving different functions in gene regulation. For example, an additional module of Mediator is the CDK8 or CDK module, which is present only in larger Mediator complexes and appears to function primarily (though not necessarily exclusively) in transcriptional repression.

Understanding the specific composition and function of Mediator complexes is being facilitated by genetic analysis in C. elegans. For example, recent work identified a role for a subset of Mediator components in maintaining the quiescent state of vulval precursor cells (VPCs; Clayton et al.,2008). In normal development, VPCs remain quiescent through two larval stages of development, before undergoing cell divisions in the third larval stage leading to production of the vulva. Reduction or elimination of Mediator subunits by RNA-mediated interference (RNAi) revealed a role for five Mediator components in maintaining VPC quiescence (Table 1), possibly through repression of downstream targets that initiate cell division. Of interest, the set of components implicated in VPC quiescence includes most of the C. elegans homologs of components of the repressive CDK module, suggesting that, as in yeast and mammals, C. elegans possesses functionally distinct subclasses of Mediator complexes, as had been suggested by earlier biochemical studies of purified C. elegans Mediator complexes (Kwon et al.,1999).

Table 1. Summary of C. elegans Mediator genes
Gene | Mammalian and Yeast Homologs | Function(s)
mdt-1.1/sop-3 | MED1, Med1 | Modulates Wnt signaling (Grant et al.,2001); RNAi lethal (Clayton et al.,2008)
mdt-1.2 | MED1, Med1 | ND
mdt-4 | MED4, Med4 | ND
mdt-6/let-425/med-6 | MED6, Med6 | Modulates Ras and Wnt signaling (Kwon et al.,2001); essential gene
mdt-7/let-49/med-7 | MED7, Med7 | Essential gene; RNAi lethal (Clayton et al.,2008); required for germline and gonad development (Kwon et al.,2001)
mdt-8 | MED8, Med8 | RNAi lethal (Clayton et al.,2008)
mdt-10/med-10 | MED10, Nut2 | RNAi lethal (Clayton et al.,2008)
mdt-11 | MED11, Med11 | RNAi viable (Clayton et al.,2008)
mdt-12/dpy-22/sop-1 | MED12, Srb8 | Modulates Ras and Wnt signaling (Moghal and Sternberg,2003; Yoda et al.,2005); mutants are viable
mdt-13/let-19 | MED13, Srb9 | Modulates Wnt signaling (Wang et al.,2004; Yoda et al.,2005); essential gene
mdt-14/rgr-1 | MED14, Rgr1 | Required for early embryonic transcription (Shim et al.,2002); essential gene
mdt-15 | MED15, Gal11 | Response to ingested material (Taubert et al.,2006,2008; Yang et al.,2006); mutants sick but viable; intestine and neuronal expression
mdt-17 | MED17, Srb4 | ND
mdt-18 | MED18, Srb5 | Role in axon guidance (Schmitz et al.,2007); RNAi lethal (Clayton et al.,2008)
mdt-19 | MED19, Rox3 | RNAi lethal (Clayton et al.,2008)
mdt-20 | MED20, Srb2 | Distal germline expression (Kohara,2001a,b)
mdt-21 | MED21, Srb7 | Role in axon guidance (Schmitz et al.,2007)
mdt-22 | MED22, Srb6 | RNAi lethal (Clayton et al.,2008)
mdt-23/sur-2 | MED23, - | Modulates Ras signaling (Singh and Han,1995; Howard and Sundaram,2002); RNAi viable (Clayton et al.,2008)
mdt-27 | MED27, - | RNAi viable (Clayton et al.,2008)
mdt-28 | MED28, - | RNAi viable (Clayton et al.,2008)
mdt-29 | MED29, - | RNAi viable (Clayton et al.,2008); modulation of Notch signaling (Chen et al.,2004), regulation of VPC quiescence (Xia et al.,2009)
mdt-31 | MED31, Soh1 | RNAi lethal (Clayton et al.,2008)

The best characterized C. elegans Mediator subunit is MDT-15, which was originally described as having a role in lipid metabolism and was shown to directly interact with the specific transcription factors NHR-49 (a nuclear hormone receptor) and SBP-1 (a basic helix-loop-helix protein; Taubert et al.,2006; Yang et al.,2006). More recently, Taubert et al. (2008) demonstrated an expanded role for MDT-15 in regulating the response to ingested material, including lipids, but also including other nutrients and xenobiotics. This work used microarrays to identify 187 genes that were down-regulated in mdt-15(RNAi) animals compared with wild-type. Products of these genes included proteins involved in lipid metabolism, as expected from previous studies, but also included proteins involved in detoxification (e.g., UDP-glucosyltransferases, glutathione S-transferases and cytochrome P450s) and proteins with speculative roles in pathogen response. Accordingly, mdt-15 appears to be required for up-regulation of detoxification genes in response to chemical toxins and mdt-15 mutants display increased sensitivity to toxins. This work further demonstrated that NHR-49 and SBP-1, the known TF partners of MDT-15, are dispensable for expression of detoxification genes, suggesting that different classes of MDT-15-dependent genes are regulated by distinct TFs. The TF involved in regulation of detoxification genes has not yet been identified but, in the intestine, ELT-2 and/or SKN-1 would seem to be reasonable candidates (see above). The organization of MDT-15-dependent genes into functionally relevant groups has an intuitive appeal, in that its regulatory role can be understood in terms of a set of related responses. Whether other Mediator components have similarly unified regulatory targets remains to be seen.

The specificity of mdt-15 function is at least partly a reflection of the fact that mdt-15 expression is enriched in the intestine (as is expression of at least some of the MDT-15-dependent genes), the tissue chiefly responsible for response to ingested material. Importantly, MDT-15 homologs are predicted to belong to the tail module of Mediator, where they can make contact with specific TFs and are not required for more general functions of Mediator such as contact with GTFs.

Additional genetic analysis, using gene mutations or RNAi, is likely to reveal other distinct functions of different Mediator components. Mediator components have been implicated in other specific tissues or processes, as reviewed previously (Blackwell and Walker,2006; Table 1). Some components appear to be involved in mediation of Ras/MAPK signaling, Wnt signaling and Notch signaling, while recent work has suggested a role for mdt-18 and mdt-21 in neuronal development and axon migration (Schmitz et al.,2007). Yet other work implicates mdt-29 in regulation of temporal identity in a subset of epidermal cells (called seam cells; Xia et al.,2009). In addition, some Mediator subunits exhibit spatially restricted expression patterns (like mdt-15), suggesting tissue-specific roles for these proteins. For example, mdt-20 is specifically expressed in the distal germline, based on in situ hybridization experiments (Kohara,2001a,b), although a biological role for mdt-20 has not been described.

Analysis of Mediator complexes in C. elegans is still in its early days, with many predicted subunits having no ascribed function, despite being essential for viability (Clayton et al.,2008). A more complete analysis of Mediator subunit expression patterns, together with phenotypic characterization and biochemical studies (such as ChIP), will certainly lead to a greater understanding of the extent and specificity of Mediator subunit activity. Many interesting questions remain unanswered for Mediator complexes as a whole. What is the specificity of subunit function? Do different subunits have clearly definable roles in regulating particular sets of targets, as in the case of MDT-15? Does all PolII-dependent transcription require Mediator? Depletion of core Mediator components from C. elegans, followed by analysis of transcriptional activity, may begin to address this question. We note that studies from yeast suggest that expression of many genes does not require Mediator (Fan et al.,2006; Zhu et al.,2006). How many different Mediator complexes exist and what is the functional relevance of different forms? Biochemical studies may resolve this issue to some extent, but careful genetic analysis of the contributions of different subunits to activation and/or repression of different target genes seems more likely to provide insight.

GENERAL TRANSCRIPTION FACTORS

As in other eukaryotes, mRNA transcription in C. elegans ultimately requires the activity of the RNA Polymerase II holoenzyme, oriented and organized by the GTFs (Kornberg,2007). As indicated above, core promoter elements of C. elegans genes are not well-described. Nonetheless, several GTFs have been studied in C. elegans, allowing a dissection of their distinct contributions. For example, a TBP-like factor encoded by tlf-1 is required for expression of some genes, suggesting the possibility that different genes may be regulated by different core promoter elements that respond to either TBP or TLF-1 (Dantonel et al.,2000; Kaltenbach et al.,2000).

Functions of other TBP-associated factors have been studied, many of which are broadly but not universally required for embryonic transcription (e.g., TAF-1, TAF-5, TAF-9, and TAF-10; Walker and Blackwell,2003; Walker et al.,2004). One notable exception is TAF-4, which appears to be required for all aspects of PolII-dependent embryonic expression (Walker et al.,2001).

Lin and coworkers have described an intricate process by which the necessary function of TAF-4 has been used to establish transcriptional silencing in C. elegans germline precursors (Guven-Ozkan et al.,2008). In the developing embryo, germline precursors are specified by a series of asymmetric divisions, beginning with the single-cell zygote (P0) and ending with the final precursor (P4). A hallmark of these precursors is that they are transcriptionally inactive. While transcriptional repression in later germline precursors (P2–P4) is regulated by the PIE-1 protein, repression in early precursors (P0–P1) is accomplished by the inactivation of TAF-4. Using the yeast two-hybrid system, Lin's group identified TAF-4 as a binding partner for both OMA-1 and OMA-2, a pair of redundantly acting proteins known to be present in the oocyte and very early (1–2 cell) embryo. In the very early wild-type embryo, TAF-4 is initially broadly distributed but becomes nuclear-localized in late 2-cell embryos (and onward). This change in location is correlated with changes in OMA-1/2 levels, suggesting (together with the above-mentioned two-hybrid result) that the OMA proteins might be responsible for sequestration of TAF-4 in the cytoplasm of 1–2 cell embryos. In support of this idea, RNAi-mediated depletion of the OMA proteins results in increased nuclear localization of TAF-4 in one- to two-cell embryos. Furthermore, depletion of OMA proteins results in derepression of zygotic transcription, suggesting that the sequestration of TAF-4 in the cytoplasm of very early embryos is a key mechanism for silencing transcription. Of interest, another GTF, TAF-12, is required for the nuclear localization of TAF-4, and appears to compete with the OMA proteins for binding to TAF-4. Additional studies of TAFs may reveal other regulatory targets like TAF-4, as well as reveal critical physical interactions and their relevance to transcriptional control.

C. elegans CHROMATIN

Up to this point, our discussion has avoided the all-important fact that transcriptional regulation does not occur with naked DNA but rather with the protein-DNA complex of “chromatin.” That is, C. elegans, like the vast majority of eukaryotes (dinoflagellates apparently remain a mysterious exception), has its genomic DNA arranged as nucleosomes, ∼145 bp units of DNA wrapped on the surface of an eight histone complex (two copies each of histones H2A, H2B, H3, and H4), separated by 0–100 bp of “linker” DNA associated with a fifth type of histone, the lysine rich histone H1 (Kornberg and Thomas,1974; Kornberg and Lorch,1999). In the case of C. elegans, the linker DNA is 10–20 bps, rather short as linkers go (Dixon et al.,1990). A typical diploid C. elegans nucleus thus contains on the order of one million nucleosomes. As has been recognized for several decades, interesting and important questions arise about the relations between chromatin and transcription; such questions generally fall into the realm of “epigenetics.” How are nucleosomes positioned relative to the underlying DNA sequences? How do nucleosome arrangements influence (or how are they influenced by) gene transcription or by other critical biological phenomena such as replication, recombination, DNA repair, cell cycle transit, etc? How are histone modifications and histone variants arranged over the genome? Important answers have emerged from other systems, in particular, S. cerevisiae (for reviews, see Henikoff and Ahmad,2005; Kouzarides,2007; Li et al.,2007; Cedar and Bergman,2009). Only recently are the same questions being asked about the chromatin of C. elegans, and the timing is appropriate: the new sequencing technologies produce millions of short sequence reads, making it conceivable to investigate the positions and characteristics of all nucleosomes in a C. elegans nucleus. 

The present time also corresponds to a wave of interest and technical expertise in defining sequences associated with histone variants and histone modifications (using demanding, but by now almost standard techniques, such as ChIP, ChIP-chip, and, the current front-runner, ChIP-seq; Barski and Zhao,2009).
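
The estimate of roughly one million nucleosomes per diploid nucleus can be checked with a short calculation. The sketch below takes the core and linker lengths from the text and assumes a haploid genome of about 100 Mb, a figure that is not stated in this section; the function and variable names are illustrative only.

```python
# Assumption (not from the text above): haploid genome of roughly 100 Mb.
HAPLOID_GENOME_BP = 100_000_000
CORE_BP = 145                  # DNA wrapped on the histone octamer
LINKER_RANGE_BP = (10, 20)     # C. elegans linker length range from the text

def nucleosomes_per_nucleus(genome_bp, core_bp, linker_bp, ploidy=2):
    """Genome length divided by the nucleosome repeat length (core + linker)."""
    return ploidy * genome_bp / (core_bp + linker_bp)

# Longest linker gives the fewest nucleosomes, shortest linker the most.
fewest = nucleosomes_per_nucleus(HAPLOID_GENOME_BP, CORE_BP, max(LINKER_RANGE_BP))
most = nucleosomes_per_nucleus(HAPLOID_GENOME_BP, CORE_BP, min(LINKER_RANGE_BP))
```

With these inputs the repeat length is 155 to 165 bp, so the count falls between about 1.2 and 1.3 million, consistent with "on the order of one million" if nucleosome-free regions are ignored.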

NUCLEOSOME POSITIONING

Fire and coworkers have led the application of these new sequencing methods to the study of worm chromatin (Fire et al.,2006; Johnson et al.,2006; Valouev et al.,2008; Gu and Fire,2009). In one recent study (Valouev et al.,2008), standard micrococcal nuclease digestion was used to produce mono-nucleosomes from (all tissues of) mixed stage wild type animals and the protected sequences were analyzed exhaustively. Although limited genomic regions could be detected where nucleosomes do appear to adopt a unique fixed phase relative to the underlying sequence (as estimated with stringent criteria, <1% of the genome), the authors state that: “the major feature … observed with C. elegans chromatin …(is)… the lack of universal sequence-dictated nucleosome positioning for a substantial majority of the genome.” It remains a question how or if this arrangement might change if chromatin could be investigated from a single unique tissue at a unique stage.

Further interesting features of worm chromatin emerge from the Fire lab's analyses. For example, they detect a significant depletion of nucleosomes around the site of transcription initiation, a long-standing observation in many other systems (see, e.g., McGhee et al.,1981) and detectable by nonenzymatic methods as well (Giresi et al.,2007; Auerbach et al.,2009). Fire et al. (2006) have also reported an unusual feature of worm DNA possibly related to chromatin conformation/arrangement, namely that a sizeable fraction of the genome (on the order of 6% by the adopted criteria, much higher than in the other genomes used in the comparison) has AA/TT sequences arranged in a ∼10 bp periodicity. Such periodic arrangements have long been suggested to confer physical properties that could dictate nucleosome position, with the net effect, for example, of rendering promoter regions nucleosome-free (Segal and Widom,2009). However, these so-called PATCs (periodic AA/TT clusters) are unlikely to be the driving force behind the nucleosome-depleted regions found near the start of C. elegans genes because PATCs are enriched in introns of genes transcribed in oocytes (Fire et al.,2006). In addition, the proposal that intrinsic DNA sequence leads to nucleosome depletion of promoters, while so attractive in unicellular yeast, is less appealing in metazoans where individual cell types must pursue their own transcriptional programs.
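
A ∼10 bp periodicity of AA/TT dinucleotides can be made concrete with a simple spacing count. The sketch below is only an illustration of the idea, not the criteria adopted by Fire et al. (2006): it records every position where AA or TT begins and counts how many pairs of such positions are separated by each distance. The function names and the 40-bp maximum lag are assumptions.

```python
def aa_tt_positions(seq):
    """Start positions of every AA or TT dinucleotide in an upper-case sequence."""
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] in ("AA", "TT")]

def lag_counts(seq, max_lag=40):
    """For each distance d, count pairs of AA/TT dinucleotides exactly d bp apart."""
    positions = set(aa_tt_positions(seq))
    return {d: sum((p + d) in positions for p in positions)
            for d in range(1, max_lag + 1)}

# Toy sequence: one AA every 10 bp, separated by CG repeats (20 repeats, 200 bp).
toy = "AACGCGCGCG" * 20
counts = lag_counts(toy)
```

In a sequence with helical-repeat periodicity, the counts at distances near 10, 20, and 30 bp stand out above those at intermediate distances; in the toy sequence, for instance, the count at 10 bp is 19 and the count at 5 bp is zero.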

HISTONE VARIANTS IN C. elegans

The four core histones are extremely conserved during evolution but there are also a small number of histone “variants,” which are themselves highly conserved. These variants (primarily H2A-Z and H3.3) are incorporated into chromatin without requiring S-phase DNA replication and are thus prime candidates for chromatin constituents that could be changed in response to biological needs (Henikoff and Ahmad,2005).

The histone variant H2A-Z has been investigated in numerous experimental systems. Conclusions about H2A-Z function are by no means unanimous but it is generally agreed that nucleosomes containing H2A-Z are enriched near the site of transcription initiation. The H2A-Z variant in C. elegans is called HTZ-1 and its disposition within chromatin has been investigated by Lieb and co-workers (Whittle et al.,2008). They first showed that HTZ-1 is required both for embryogenesis and for postembryonic development. HTZ-1 has a sizeable (and long-lasting) maternal contribution and can be detected in nuclear chromatin by the 4-cell stage of embryogenesis, in both soma and germline precursors; HTZ-1 persists in apparently all cells of all developmental stages thereafter. Anti-HTZ-1 polyclonal antibody was used to precipitate sheared crosslinked chromatin isolated from early embryos and subsequent microarray analysis revealed ∼5,000 sites of significant HTZ-1 incorporation. Incorporation tended to occur preferentially in intergenic regions, especially upstream of genes associated with growth and development. Several examples were provided where there was clear and impressive association of HTZ-1 with the 5′-end of genes, just upstream of the ATG initiation codon (where the primary transcript might be expected to initiate). Although the upstream gene in an operon showed higher incorporation levels of HTZ-1 than did most downstream genes in the same operon, roughly one third of all operons did have a significant HTZ-1 incorporation in a downstream gene, suggesting independent transcriptional initiation events within the operon.

The level of HTZ-1 incorporation into a particular gene was found to be positively correlated both with transcript levels and with the levels of immunologically detectable RNA Polymerase II bound to the same region. However, these correlations held only at low transcript levels or at low RNA Polymerase levels; at high levels, the correlations reversed sign, with high transcript or RNA Polymerase levels showing low levels of HTZ-1 incorporation. As pointed out by the authors, one possible interpretation of this behavior is that the most active genes might actually have lost all their nucleosomes near their initiation site. The authors are careful to point out that their observations reflect the net behavior of a large number of individual genes. When they inspected individual genes (in particular, the set of genes identified by Baugh et al. [2003] as active in the early embryo) almost every type of behavior was observed, i.e., different genes reflect different aspects of HTZ-1 behavior, or differentially reflect the kinetics of HTZ-1 incorporation or turnover.

What is the function of HTZ-1 in C. elegans? Updike and Mango (2006) showed that loss of HTZ-1 was synthetic lethal with the pharynx organ identity factor PHA-4 (and so were several chromatin proteins previously associated in other systems with HTZ-1 incorporation). A YFP-tagged HTZ-1 could be detected associated with a multicopy transgenic array of a pharyngeal promoter (myo-2) inside living embryos and this localization was shown to depend both on the presence of PHA-4 and on the presence of PHA-4 sites in the myo-2 promoter.

How does HTZ-1 relate to the nucleosome-depleted region around the 5′-ends of active genes? It has been suggested that HTZ-1, especially when in combination with the histone H3.3 variant, produces an unstable nucleosome that is easily displaced, either by transcription or perhaps even by experimental manipulation (Jin et al.,2009). In C. elegans, expression patterns of H3.3 have been described during gametogenesis (Ooi et al.,2006) but H3.3 association with HTZ-1 has not yet been reported. Recently, Ooi et al. (2009) have described a clever scheme to enrich for H3.3-containing chromatin; their initial results indicate that H3.3 is enriched on the body of transcribed genes, with the H3.3 abundance positively related to levels of transcription.

HISTONE MODIFICATIONS IN C. elegans

The four best-known ways that histones can be postsynthetically modified are by acetylation, methylation, phosphorylation, and ubiquitinylation (histones can also be sumoylated, ADP-ribosylated, deiminated, and proline-isomerized; Kouzarides,2007). Modifications such as acetylation and methylation have been intensely studied over the past two decades, primarily because of the proposal (and considerable evidence) that they can act as a store of epigenetic information, providing a molecular memory to enhance or suppress transcription, replication and repair. Detailed modifications can vary in histone type, position within the primary amino acid sequence, and, in the case of methylation, whether particular residues are mono-, di-, or tri-methylated. Moreover, combinations of the different modifications, either within the same histone or the same nucleosome or the same nucleus, provide an enormous number of potentially distinguishable epigenetic states. In experimental systems ranging from unicellular yeast to human cells, a bewildering variety of functions have been assigned to different histone modifications, producing a fog of competing observations, interpretations and even opinions. Yet certain conclusions reproducibly emerge from the most diverse data, especially when modifications are surveyed globally and comprehensively using the new sequencing technology (see, for example, Jothi et al.,2008; Barski et al.,2009; Wang et al.,2009). Several of these conclusions have been supported (and extended) by recent work in C. elegans.

First consider methylation of lysine residue 4 in histone H3 (in shorthand, H3K4me, not distinguishing between dimethyl and trimethyl modifications). Gu and Fire (2009) have investigated the disposition of this modification, using micrococcal-nuclease-generated, sucrose-gradient-purified mononucleosomes isolated from staged adult C. elegans (whole animals). DNA was isolated from H3K4me-enriched nucleosomes (obtained by immunoprecipitation) and subjected to Solexa-Illumina sequencing. Their first observation was that nucleosomes positioned within the 1 kb upstream of the ATG are significantly (several-fold) enriched in H3K4me, compared with the frequency of unmodified nucleosomes in the same region. This 5′-enrichment is what is usually found in other experimental systems, from yeast to human (Kouzarides, 2007). Across the genome, the H3K4me-containing nucleosomes more or less mirror the distribution of genes, i.e., they are concentrated several-fold in the middle of autosomes and depleted somewhat from the X-chromosome. In contrast, nucleosomes containing a different modification on H3 (H3K9me3) are enriched on the one end of each chromosome that is known to be the pairing centre in meiosis, as if H3K9me3 reflected some property of meiosis or chromosome pairing, not transcription.

Inspection of the frequency of sequence reads (corresponding to “starts” or “ends” of nucleosome cores) revealed that a significant fraction of H3K4me-containing nucleosomes appear to exist in ordered arrays. To determine how (or if) this apparent order is disposed relative to gene sequences, the authors first identified a set of ∼4,000 genes that show good evidence for an H3K4me-containing nucleosome close to the ATG (so-called “H3K4me-anchored genes”); the frequency of sequence reads (nucleosome “starts” or “ends”) was then positioned relative to the dyad axis of this anchoring H3K4me-containing nucleosome and summed over all the genes in the set. The plot (see Fig. 2) shows a clear series of peaks in read frequency spaced ∼150 bp apart, three peaks downstream of the ATG and five peaks upstream. Moreover, the two series of peaks are not continuous but are phase-shifted by ∼120 bp, a distance smaller than a single nucleosome, immediately upstream of the “anchored nucleosome.” This region of the promoters is also enriched for sequences that are conserved in related nematodes. One of the proposed models is that the 120-bp gap immediately upstream of the anchored nucleosome could be due to the transcription apparatus (e.g., RNA Polymerase II) shoving the local nucleosomes aside and forcing them into a close-packed array. The subset of genes used in this analysis is enriched for genes that might be expected to be widely expressed in adult C. elegans, i.e., genes coding for ribosomal proteins or proteins involved in the cytoskeleton. Presumably this widespread expression explains why the signal from the ordered array can be detected in whole-worm extracts; it also raises the possibility that equally dramatic and revealing H3K4me-containing nucleosome arrays might exist around tissue-specific genes.
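The anchoring-and-summing step described above can be illustrated with a minimal Python sketch. It is not the pipeline used by Gu and Fire (2009); the function name `aggregate_profiles`, the input layout (dictionaries of sorted read positions per chromosome), the 1 kb window, and the toy coordinates are all assumptions made purely for illustration.

```python
from bisect import bisect_left, bisect_right
from collections import Counter

def aggregate_profiles(anchors, fwd_reads, rev_reads, window=1000):
    """Sum read positions relative to anchoring-nucleosome dyads.

    anchors:   iterable of (chrom, dyad_position, strand), strand = +1 or -1
               (hypothetical input format)
    fwd_reads: dict chrom -> sorted list of forward-strand read positions
    rev_reads: dict chrom -> sorted list of reverse-strand read positions
    Returns two Counters (starts, ends) mapping the signed offset from the
    dyad, in gene orientation, to the summed read count.
    """
    starts, ends = Counter(), Counter()
    for chrom, dyad, strand in anchors:
        # For a minus-strand gene, reverse-strand reads mark nucleosome
        # "starts" when viewed in the orientation of the gene.
        if strand == 1:
            start_src, end_src = fwd_reads, rev_reads
        else:
            start_src, end_src = rev_reads, fwd_reads
        for src, profile in ((start_src, starts), (end_src, ends)):
            positions = src.get(chrom, [])
            lo = bisect_left(positions, dyad - window)
            hi = bisect_right(positions, dyad + window)
            for pos in positions[lo:hi]:
                profile[(pos - dyad) * strand] += 1
    return starts, ends

# Toy data (invented): one plus-strand and one minus-strand anchored gene,
# each with a 147-bp core centered on its dyad (dyad - 73 to dyad + 73).
toy_anchors = [("chrI", 1000, 1), ("chrI", 5000, -1)]
toy_fwd = {"chrI": [927, 4927]}
toy_rev = {"chrI": [1073, 5073]}
toy_starts, toy_ends = aggregate_profiles(toy_anchors, toy_fwd, toy_rev)
```

With real data, peaks in the two summed profiles would correspond to the ordered nucleosome “starts” and “ends” plotted in Fig. 2; in this toy case both genes simply contribute to the same start and end offsets.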

Figure 2. Patterns of H3K4-methyl-containing nucleosomes near the transcription start sites of 3,903 C. elegans genes. Reproduced from Figure 6c of Gu and Fire (2009), plotting the number of sequence reads (nucleosome “starts” = reads on forward strand; nucleosome “ends” = reads on reverse strand) as a function of distance to the dyad axis of the anchoring H3K4methyl nucleosome at or near the transcription start site of the selected 3,903 genes, as described in greater detail in the text. The grey ovals represent 147 base pair nucleosome core DNA, lying between successive start and end peaks.

In contrast to the promoter-enriched location of H3K4me-containing nucleosomes, nucleosomes containing trimethyl modifications of lysine 36 of histone H3 (H3K36me3) tend to be enriched in the body of genes (Kouzarides, 2007; Li et al., 2007). Ahringer and co-workers (Kolasinska-Zwierz et al., 2009) have investigated this rule in C. elegans and have added an important new insight. They immunoprecipitated H3K36me3-containing nucleosomes from staged L3 larvae and probed the contained sequences using microarrays. They were able to verify the enrichment of H3K36me3-containing nucleosomes in the gene body but made the important new discovery that H3K36me3-containing nucleosomes are significantly enriched in exons compared with introns. Parallel control experiments showed no such exon-vs.-intron differential enrichment for either H3K4me3- or H3K9me3-containing nucleosomes. A clear relation between the degree of H3K36me3 marking and the level of transcripts associated with the gene led the authors to conclude that the exon marking is associated with transcription. Alternatively spliced exons show a lower degree of marking than do adjacent constitutively spliced exons, suggesting that the differential H3K36me3 marking of exonic nucleosomes reflects the process of transcript splicing, and not some static feature of exons, such as GC content. The authors extended their observations to mouse and human cells, and indeed their basic observation has since been verified in a re-analysis of data in the literature (Andersson et al., 2009).

As if eukaryotic transcription were not sufficiently complicated already, this latter observation of differential exon–intron histone modification puts the focus on the association of the splicing machinery with the transcription apparatus during the actual act of transcription. The authors point out two general categories of models that will have to be distinguished. Perhaps the exonic enrichment of H3K36me3-nucleosomes is a consequence of splicing, as would occur if the splicing machinery were to recruit the histone methyltransferases. Alternatively, histone marking might facilitate splicing, as might occur if the splicing machinery were to be recruited by the marked histones. This second model clearly raises the question of how the worm decides to mark only nucleosomes in exons.

HIGHER ORDER STRUCTURE OF CHROMATIN

So far, we have only discussed histone variants and (a small fraction of) histone modifications. What about the proteins responsible for histone replacement or histone modifications? A large variety of such proteins have been implicated in the control of chromatin higher order structure. Prominent among these proteins are members of the Drosophila trithorax and polycomb complexes (and their homologs in other systems), involved in long-term stability of transcriptional programs (contributing to sustained gene activity or gene repression, respectively). A further class of proteins of intense interest is the nucleosome remodelers (Clapier and Cairns, 2009), which so often involve histone acetylases and deacetylases, and which are known to cause important effects in C. elegans (see, for example, Shi and Mello, 1998). This vast field has been reviewed often in the past few years and we only point to two potential roles of such proteins that stand out in worms: (i) the role of polycomb-related complexes in silencing transcription from the X-chromosome in the germline (Strome, 2005), and (ii) the finding that many of the “SynMuv” genes encode chromatin-modifying enzymes (Cui and Han, 2007; Fay and Yochem, 2007). The term SynMuv means synthetic multivulva phenotype; many individual genes, when mutated, cause a Muv phenotype; in the case of SynMuv genes, two mutations are required, one from each of two or three different gene classes. The molecular identities of roughly three dozen SynMuv genes are known, of which ∼1/3 are likely to be involved somehow in chromatin structure/function. Several SynMuv genes are also associated with the RNAi pathway and it will undoubtedly be complicated to work out their molecular roles in controlling transcription (Cui and Han, 2007; Fay and Yochem, 2007). Nonetheless, the SynMuv observations provide the opportunity to apply the power of C. elegans classical genetics to these interesting and important problems.

We will now briefly describe two reports that emphasize the logical importance of regulators of chromatin structure to the long-term stability of developmental programs in C. elegans. As mentioned earlier, nucleosomes that contain the histone modification H3K4methyl (without stipulating whether the modification is mono-, di-, or tri-methyl) have been repeatedly associated with transcriptionally active genes. H3K4methyl modifications are apparent cytologically in the C. elegans germline (presumably associated with genes that must be transcribed in the germline) but are normally removed at an early stage (the Z2–Z3 cells) of the developing germline of the next generation. Katz et al. (2009) have asked what would happen if removal of this histone modification were prevented. Would the chromatin mark persist into the next generation germline and would there be developmental consequences? The spr-5 gene encodes the C. elegans ortholog of the LSD1/KDM1 H3K4me2 demethylase. Null mutants in spr-5 become visibly defective in their germline and ultimately sterile; however, the onset of the germline defects and sterility is gradual, increasing slowly over 20–30 generations. (The fact that the decline in germline function is so gradual is presumably explained by parallel pathways that can also remove the histone methylation marks.) Genes that become misregulated over these 20–30 generations as a result of the spr-5 mutation are highly enriched in spermatogenesis-expressed genes. As would be predicted, H3K4methyl-containing chromatin increases over this time period, both as detected cytologically in the germline precursor cells and as detected by ChIP of selected spermatogenesis-expressed genes. The authors conclude that demethylation of H3K4 is necessary to reprogram epigenetic memory “to maintain germline immortality.” This work presents an important genetic observation that will provide purpose and direction to biochemical experiments.

The previous paragraph concerns the inability of a mutant worm to remove an “activating mark.” What about the related situation, the inability to deposit a repressive mark, in this case H3K27 methylation associated with inactive genes? Yuzyuk et al. (2009) investigated the role of the mes-2 gene, the C. elegans homolog of enhancer of zeste, and part of the trimeric PRC2 complex (mes-2, mes-3, and mes-6) that is involved in X-chromosome silencing in the germline (Strome, 2005) and that also provides all H3K27 methylation marks in the early embryo (Bender et al., 2004). The authors showed that lack of mes-2 activity prolonged embryonic “plasticity,” assayed by the ability of embryonic cells to change fates when subjected to ectopic transcription factors (e.g., HLH-1 or END-1, driving cell fates toward muscle or endoderm, respectively). Furthermore, several genes that are expressed in the early embryo and that would ordinarily be down-regulated by the 8E cell stage persist at higher than normal levels. Loss of mes-2 activity caused observable effects (conformation and level of engaged RNA Polymerase II) on chromatin, both on transgenic reporter arrays and on several endogenous single-copy loci. Overall, the authors argue that such loss of plasticity is not a downstream effect of the cell-fate-determining transcription factor (e.g., HLH-1 or END-1) but rather is more likely to be a direct effect of PRC2-associated methylation events. It is important to keep in mind that, in the normal course of events, embryos that lack mes-2 are largely normal (Bender et al., 2004) and that loss of mes-2 does not prevent “loss-of-plasticity,” only delays it (Yuzyuk et al., 2009). Thus, it is likely that other factors besides H3K27 methylation contribute to long-term stability of transcription programs and cell fate.

OTHER CONTROLS ON C. elegans TRANSCRIPTION?

Although paused RNA Polymerase II molecules have been well studied in other systems (especially on the heat shock genes of Drosophila; Fuda et al., 2009), they are only now being investigated in C. elegans. As part of a study on genes expressed (or repressed) in starved L1 larvae, Baugh et al. (2009) detected RNA Polymerase II molecules enriched on thousands of genes that tend to accumulate transcripts during L1 arrest and that are also expressed upon hatching in the presence of food. Genes that had the most 5′-bias of the associated RNA Polymerase II molecules were also the genes that were the most up-regulated upon feeding. Upon feeding, a redistribution of the RNA Polymerase could be detected, with more molecules now on the gene body rather than localized at the 5′-end. The authors suggest that the paused/poised polymerase molecules provide a priming mechanism, through which the starvation-arrested L1 larvae can respond rapidly to the addition of food, reflecting the boom-or-bust nature of the worm's normal lifestyle.

Kim and co-workers (Roy et al., 2002; Pauli et al., 2006) have analyzed the chromosomal distribution of genes expressed (not necessarily specifically) in particular tissues and concluded that genes expressed in muscle or intestine have a statistically significant propensity to be more closely positioned to each other than would be expected by chance (the particular algorithm used to estimate this clustering propensity calculates the probability that two genes have their 5′-ends lying within a chromosomal interval of 10 kb). The authors suggest that this apparent clustering could reflect higher order chromatin structure, e.g., by “opening” of multi-gene loops for transcription in the same tissue, or could reflect genes controlled by a tissue-specific enhancer. Their subsequent analysis (Pauli et al., 2006) noted that it is actually the “housekeeping genes” expressed in muscle or intestine that appear to cluster, not the genes that are muscle- or intestine-enriched or specific. The authors argue that this (perhaps unexpected) result rules out the second local-influence-of-enhancer model and instead supports a chromatin loop model. While the overall clustering signal is statistically significant, the actual effect on an individual gene is likely to be rather modest: for the case of the intestine, 684 of the 1,746 intestine-expressed genes have chromosomal positions within 10 kb of each other, whereas 519 would be expected by chance. It is not obvious how such a statistical bias translates into concrete molecular mechanisms of transcriptional control. Moreover, Yanai and Hunter (2009) have recently concluded that gene neighbor relations are poorly conserved between C. elegans and C. briggsae, even though the two animals are anatomically and developmentally highly similar. On the other hand, higher order controls on gene expression certainly exist in C. elegans; perhaps the best understood example of such phenomena is dosage compensation, by means of which genes on the X-chromosome are expressed in the hermaphrodite soma at ∼50% of the level they are expressed from the single X chromosome in the male soma. Excellent reviews have been written by two of the main contributors to the field (Meyer, 2005; Ercan and Lieb, 2009).
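To put the intestine clustering figures quoted above (684 observed vs. 519 expected, out of 1,746 genes) into perspective, the following Python sketch works through the arithmetic and shows one simple way such a count could be made. The function `count_clustered`, its input layout, and the toy coordinates are hypothetical; this is not the algorithm of Roy et al. (2002), which estimates a probability rather than a simple neighbor count.

```python
def count_clustered(five_prime_ends, max_gap=10_000):
    """Count genes whose 5' end lies within max_gap bp of another gene
    in the same set on the same chromosome.

    five_prime_ends: dict chrom -> list of 5'-end positions
                     (hypothetical input format)
    """
    clustered = 0
    for positions in five_prime_ends.values():
        ordered = sorted(positions)
        for i, pos in enumerate(ordered):
            near_prev = i > 0 and pos - ordered[i - 1] <= max_gap
            near_next = i + 1 < len(ordered) and ordered[i + 1] - pos <= max_gap
            if near_prev or near_next:
                clustered += 1
    return clustered

# Figures quoted in the text for intestine-expressed genes.
observed, expected, total = 684, 519, 1746
fold_enrichment = observed / expected            # observed relative to chance
excess_fraction = (observed - expected) / total  # share of genes above chance

# Toy coordinates (invented) to exercise the counting function.
toy_genes = {"chrI": [1_000, 8_000, 50_000], "chrII": [20_000]}
toy_count = count_clustered(toy_genes)

print(f"fold enrichment: {fold_enrichment:.2f}")
print(f"excess fraction of genes: {excess_fraction:.1%}")
```

On these numbers the observed count is roughly 1.3 times the chance expectation, and the excess of 165 genes amounts to a little under 10% of the intestine-expressed set, which is consistent with the point that the effect on any individual gene is modest.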

Anyone who has stared at the seemingly endless array of genes revealed by the C. elegans genome browser must have wondered how the controls of these close-packed genes keep from getting entangled. Transcripts from one gene can presumably be prevented from running into neighboring genes by local termination signals (see, for example, Haenni et al., 2009). It is less clear how enhancers controlling one gene keep from influencing neighboring genes. In vertebrates, this problem is solved in large part by the CTCF protein, which acts as an “insulator” between adjacent genes and adjacent chromosomal regions (Bushey et al., 2008). C. elegans apparently does not have a CTCF homolog (Heger et al., 2009), and so we must look elsewhere for mechanisms to keep control localized. Perhaps C. elegans enhancers are, by their design and nature, only able to work over short distances. Indeed, most identified enhancers in C. elegans consist of a binding site or two for a particular factor or pair of factors, not the complex three-dimensional assemblages of multiple transcription factors that appear to rule early development in Drosophila (Segal et al., 2008), or that have been described in mammalian gene control (Panne, 2008). Perhaps the simplicity and limited range of C. elegans enhancers could be the reason for the conserved anatomy of nematodes, i.e., nematodes do not have the far-acting enhancers that could be so amenable to mutation and expanding regulatory control (Carroll, 2005). An interesting and feasible experiment would be to test how far a C. elegans enhancer can be displaced and still retain its effect on its target gene.

FUTURE PROSPECTS?

We are the first to admit that studies in C. elegans have not led the way in understanding the molecular details of transcription and transcriptional control, certainly compared with the pioneering studies in yeast, in Drosophila, and in mammalian tissue culture cells. C. elegans studies have, however, provided fundamental understanding of how transcription factors specify cell fate and how differentiated cell phenotypes result from particular transcriptional programs; the present review discussed several such contributions. But what does the future hold? Will studies in C. elegans reveal anything fundamental and general about the molecular details of transcription and transcriptional regulation? We are optimistic and have presented several examples in the present review attesting to the accelerating pace of such studies in C. elegans. We thus end our review as we began it, emphasizing the power of C. elegans as an experimental system but, in this second incarnation, pointing to features that provide advantages for molecular studies of transcription. For example, the compact gene structure of C. elegans means that enhancers are close to initiation sites and there is thus no need to search hither and yon to understand transcriptional control, as in other animals. Above all, animal transparency combined with the new generation of microscopic methods raises the exciting possibility of single-molecule studies inside living animals. By means of facile and powerful RNAi, transcription factors, either general or specific, can be knocked down with ease, and this should be especially powerful for microscopic studies on individual animals. The rapid time course of embryonic development means that processes must happen within minutes, making it potentially easier to order crucial biochemical steps. In other words, C. elegans presents the possibility of asking what a gene is doing at the present moment, when we know it will be active in twenty minutes. Powerful new biochemical tools to purify native chromatin fragments that are extremely demanding in higher organisms (Dejardin and Kingston, 2009) might be applied more easily to the multicopy transgenic arrays that form naturally in C. elegans. And behind all of these advantages lies the power of the worm's genetics, its defined cell lineage, and, to end unashamedly with a pun, its simple elegance.

REFERENCES
