Broad Chromatin Domains: An Important Facet of Genome Regulation

Chromatin composition differs across the genome, with distinct compositions characterizing regions associated with different properties and functions. Whereas many histone modifications show local enrichment over genes or regulatory elements, marking can also span large genomic intervals defining broad chromatin domains. Here we highlight structural and functional features of chromatin domains marked by histone modifications, with a particular emphasis on the potential roles of H3K27 methylation domains in the organization and regulation of genome activity in metazoans.


Introduction
The linear sequence of DNA is compacted into chromatin by organization into nucleosomes, and functional units of chromatin can be distinguished by the enrichment of different histone modifications and chromatin binding proteins. [1,2] While defined genomic elements such as promoters and other regulatory elements are usually associated with local enrichment for particular histone modifications, recent research has shown the existence of broad domains of histone modifications that spread over large genomic regions. Patterns of histone modifications on genomic elements have been shown to be similar across species; [3] however, much less is known about the formation, function, and conservation of broad domains. Here, we examine work over the last decade or so that has demonstrated the existence of broad chromatin domains in a range of organisms. We will focus, in particular, on domains defined by the methylation of the lysine 27 residue of histone H3 (H3K27me), and discuss their functions and relationships with other histone modifications and with the three-dimensional architecture of the metazoan genome.

Chromatin Domains Are Common
Chromatin domains are extended genomic intervals characterized by the continuous enrichment of a given histone modification, spanning from tens of kilobases to megabases in length, and usually encompassing multiple genes. Their large size and relatively uniform basal enrichment profile distinguish broad chromatin domains from other histone modification-enriched loci, such as extended H3K4me3 enrichment on individual regulatory regions [4,5] or collections of narrow peaks at regions with a high density of regulatory elements, such as super-enhancers. [6] Chromatin domains have been described in most well-studied metazoan species. Paradigmatic examples are the large domains of H3K27me3 deposited over Hox gene clusters in mammals and Drosophila. [7][8][9] These chromatin domains can span hundreds of kilobases in length, and the Hox genes they encompass are transcriptionally repressed. Other histone modifications have also been observed to coat extended chromatin intervals, such as the large domains of H3K9me2/3 on mammalian and Drosophila pericentric heterochromatin. [2,10] More recently, large domains of H3K27me2 were observed in mouse [11] and Drosophila [12] covering up to 70% of the genome. Thus, the association of one histone modification with large genomic regions is a common feature in metazoans.

Broad Chromatin Domains of H3K36me3 and H3K27me3 Define Transcriptionally Distinct Regions in Caenorhabditis elegans
Recent studies in Caenorhabditis elegans highlighted roles for broad chromatin domains in genome regulation. Analyses of patterns of histone modifications in undifferentiated embryo and differentiated larval stages uncovered a genome-wide pattern of alternating chromatin domains that have differing marking and genomic activity (Figure 1A). [13,14] "Active" domains are highly enriched for H3K36me3-marked genes expressed in the germ line and usually also widely and stably expressed across cell types. "Regulated" domains, in contrast, are characterized by the extended deposition of H3K27me3 over genes and intergenic regions, and are enriched for genes with temporally, spatially, or environmentally regulated expression. The regions between active and regulated domains, termed borders, were shown to have features of transcription regulation, such as transcription factor binding sites and long intergenic regions, as well as enrichment for particular repeat elements. The positions of domains largely overlapped across the two developmental stages assayed, suggesting that this chromatin organization is a core property of the genome.
How the domains form and their functional roles are not yet clear, but intriguingly germ line activities appear to be important. MES-4 is a germ line and maternally active H3K36me3 histone methyltransferase that maintains the memory of germ line transcription. [15] In early embryos, maternally provided MES-4 maintains the H3K36me3 marking of genes that were expressed in the germ line. MES-4 was shown to act antagonistically with the C. elegans PRC2-like complex that generates H3K27me3: when mes-4 was depleted, germ line specific genes acquired H3K27me3. [13] Studying this interaction at the level of domains, it was found that active domains contracted and regulated domains marked by H3K27me3 expanded in embryos with reduced MES-4, [14] supporting the view that germ line events play a role in determining domain structure. The association of features of transcription regulation at the borders between active and regulated domains suggests that transcription might be involved in domain separation, although this has not yet been tested.

Do Chromatin Domains Exist in Other Species?
The existence of chromatin domains associated with genes having either widespread stable or regulated gene expression across the C. elegans genome raises the question of whether a similar chromatin organization characterizes other species. The possibility of active and regulated regions in other metazoans is supported by the long-known non-random distribution of genes with similar expression profiles, [16,17] as well as by the overall similar distribution of chromatin modifications across distantly related organisms. [3] The first genome-wide analyses of H3K27me3 modification profiles in Drosophila revealed the presence of several large chromatin domains, mostly covering genes involved in developmental processes. [8,9] Despite their similarity with the regulated domains described in worm, these domains encompass only a small fraction of the genome. However, recent work has reported a much more widespread distribution of H3K27me3 in Drosophila primary spermatocytes (Figure 1B). [18] In this cell type, the H3K27me3 mark is enriched over broad regions, a pattern significantly different from that observed in Drosophila embryos and cell lines. Based on the relative enrichment of this chromatin modification across the whole genome, the authors defined thousands of H3K27me3-enriched and H3K27me3-depleted domains, showing strong correspondence with polytene chromosome bands and chromatin architecture (see section 7). Intriguingly, the H3K27me3-depleted domains cover the vast majority of housekeeping genes, and are also enriched for a set of chromatin marks associated with active transcription, including H3K36me3, H3K9ac, and H3K27ac. In contrast, H3K27me3-enriched domains include mostly regulated genes with cell-type or tissue-specific expression that lack chromatin marks classically associated with gene activity.
This result is in agreement with a previous report showing that developmentally regulated genes undergoing active transcription lack histone modifications normally associated with gene activity, [19] and suggests differential chromatin control of active and regulated gene expression. Overall, the domains defined by the relative enrichment of H3K27me3 in Drosophila spermatocytes are reminiscent of the active and regulated domains in C. elegans, although this chromatin organization has not been observed in other tissues or developmental stages in the fruitfly.
Whether H3K27me3 in mammals defines chromatin domains similar to those observed in C. elegans is more uncertain. Early genome-wide analyses described H3K27me3 enrichment as generally confined to narrow regulatory regions (except for a few broad domains) in pluripotent mouse ES cells and lineage-committed embryonic fibroblasts (MEF). [20] In these cells, H3K27me3 was found either alone or associated with H3K4me3 at thousands of promoter regions, and its presence was associated with transcriptional repression. A subsequent study performed on similar MEF cell lines found that most H3K27me3 peaks were contained in broad H3K27me3 domains, termed BLOCs, spanning on average 43 kbp in length. [21] The authors also showed that at a larger scale, H3K27me3 is enriched over megabase-sized domains that strongly overlap with regions of high gene density and with the typical light Giemsa banding pattern in mouse metaphase chromosomes (R-bands), and that are anticorrelated with large regions of H3K9me3 and H4K20me3. Subsequent studies further supported the broad distribution of H3K27me3 in fully differentiated cells of mouse and human, where it was observed that H3K27me3 covers up to 40% of the genome in differentiated cells, compared to 8% in pluripotent samples (Figure 1C). [22,23] Therefore, in contrast to the focal enrichment initially described in ES cells, more recent studies additionally revealed a broad distribution of H3K27me3 over large domains in mammalian lineage-committed and differentiated cell types.
Figure 1. Histone modification distributions in C. elegans, D. melanogaster, and M. musculus. A) In C. elegans, active (magenta) and regulated (black) domains from Ref. [14] overlap mutually exclusive enrichments of H3K36me3 and H3K27me3 previously noted in Ref. [13]. B) Histone modification profiles in Drosophila melanogaster. H3K27me3 is distributed over large domains in primary spermatocytes, and its profile closely resembles the profile of H3K27me2 in the embryo-derived Kc167 cell line. In late embryonic and L3 larval stages, H3K27me3 covers fewer domains. Some domains are still present (e.g., shaded region), usually overlapping developmentally regulated genes such as the Optix gene (highlighted in red). C) In mouse oocytes and pre-implantation embryos (e.g., 2-cell embryos), H3K27me3 is found over broad domains that occupy the same regions. In post-implantation epiblasts and embryo-derived ESCs, H3K27me3 is focally enriched at regulatory elements. The H3K27me3 profile changes again in more differentiated adult tissues; in adult organs, focal regions are present, but in addition broad regions of H3K27me3 also cover large portions of the genome, including gene bodies and intergenic regions. C. elegans data are from modENCODE (see Ref. [3]). Drosophila H3K27me3 in primary spermatocytes are from Ref. [18] (GSE85502), H3K27me2 in Kc167 cells are from Ref. [33] (GSE32825), and H3K27me3 from embryos and larvae from Ref. [3]. Mouse H3K27me3 in oocytes, 2-cell embryos, and epiblasts are from Ref. [25] (GSE76687). Mouse H3K27me3 in ESCs are from Ref. [11] (GSE51006), in adult liver are from Ref. [65] (ENCFF001KMV, ENCFF001KMW). Images of C. elegans (by Bob Goldstein), D. melanogaster (by André Karwath) and mouse are under a CC-BY-SA license.
www.advancedsciencenews.com www.bioessays-journal.com
More recently, the development of new, low-input ChIP-seq protocols has allowed the analysis of histone modification distributions in samples with limited starting material. [24] Using the STAR-ChIP technique, Zheng et al. profiled H3K27me3 in mouse oocytes and early embryos, observing that H3K27me3 is found in broad domains from oocytes to pre-implantation blastocysts. Later, in post-implantation E6.5 epiblasts, H3K27me3 shows a more punctate pattern, similar to that seen in ES cells (Figure 1C). [25] In sum, current results indicate that H3K27me3 is generally found over large chromatin domains in most mammalian cell types. The more focal distribution seen in ES cells was proposed to be due to a more dynamic chromatin state based on the observation that these cells have a wide distribution of H2AZ. [22,26] The aforementioned studies consistently found that transcription of genes covered by H3K27me3 domains is repressed, supporting a negative regulatory role for this mark in differentiated cells. Nonetheless, it is as yet unclear whether the broad domains defined by H3K27me3 distinguish genes with regulated expression from those with wide stable expression, and to what extent the domains identify the same genomic regions across different cell types or developmental stages. Moreover, it is worth noting that there appear to be species-specific differences in broad domain marking. Indeed, despite the similarities in the distribution of many histone modifications, there are significant differences in the abundance of some histone modifications between C. elegans, Drosophila, and mammals. For example, trimethylation is the most abundant modification at the H3K27 residue in C. elegans across all developmental stages. [27] Instead, in Drosophila larvae [28] and mouse ESCs, [29] the H3K27 residue is mostly dimethylated. In these species, H3K27me2 domains cover vast portions of the genome and repress transcription of the underlying sequences. [11,12] Notably, there is a remarkable correspondence between H3K27me2 domains in Drosophila Kc167 cells and the H3K27me3 domains in primary spermatocytes [18] (Figure 1B), suggesting that different H3K27 methylation states may define these chromatin domains at different stages of Drosophila development.
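A correspondence between two sets of domains, such as the H3K27me2 and H3K27me3 domains compared above, can be expressed as a single number. The following is a minimal sketch, not the method used in the cited studies: it computes a base-pair Jaccard index between two domain sets on one chromosome. The function names and the half-open (start, end) interval convention are assumptions made for illustration.

```python
def merge_intervals(intervals):
    """Sort half-open (start, end) intervals and merge those that overlap or touch."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered_bp(intervals):
    """Number of base pairs covered by a set of intervals (overlaps counted once)."""
    return sum(end - start for start, end in merge_intervals(intervals))


def intersection_bp(a, b):
    """Base pairs covered by both interval sets, via a two-pointer sweep."""
    a, b = merge_intervals(a), merge_intervals(b)
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            total += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def jaccard_bp(a, b):
    """Jaccard index: shared base pairs divided by base pairs covered by either set."""
    inter = intersection_bp(a, b)
    union = covered_bp(a) + covered_bp(b) - inter
    return inter / union if union else 0.0


# Hypothetical coordinates, for illustration only.
domains_cell_type_1 = [(0, 100), (200, 300)]
domains_cell_type_2 = [(50, 250)]
score = jaccard_bp(domains_cell_type_1, domains_cell_type_2)
```

A value near 1 indicates that the two domain sets cover largely the same sequence; a value near 0 indicates little shared coverage. For real data, each chromosome would be scored separately or the base-pair counts summed across chromosomes before taking the ratio.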

Domains of H3K9 and H3K4 Methylated Chromatin
Domains of H3K9 methylated chromatin, often associated with gene repression and/or repetitive element silencing, have been shown to exist in multiple species. A well-known example is the chromatin surrounding centromeres (pericentromeric heterochromatin) where large, multi-megabase domains of H3K9 methylated chromatin are associated with DNA that is gene poor and highly enriched for repetitive sequences. [30] There, loss of H3K9 methylation leads to transcription of repetitive sequences and impaired centromeric function. [30] Domains of H3K9 methylated chromatin also form outside of pericentric heterochromatin. An early study showed that human and mouse chromatin contains large regions modified by H3K9me2, termed LOCKs, which were observed to span regions up to 4.9 Mbp. [10] LOCKs appeared to be associated with differentiation, as they covered a larger fraction of the genome of differentiated cells (10-36%) compared to undifferentiated ES cells (4%). LOCKs were dependent on the histone methyltransferase G9a and the genes within them were generally repressed. Interestingly, LOCKs were also significantly reduced in cancer cells. The results support a role for H3K9me2 domains in regulating cell type specific gene expression. Mammalian genomes also contain large non-pericentric domains of H3K9me3, which, like H3K9me2 domains, are more prevalent in differentiated cells than undifferentiated cells. [31] They are associated with lineage specific gene repression and their presence is inhibitory for reprogramming. [31,32] In Drosophila, as in mammals, broad domains of H3K9me2 are found within euchromatic regions, in addition to high levels of H3K9me2 and H3K9me3 in pericentric heterochromatin. [33] Some of the broad H3K9me2 domains were found to be common to all cell types examined whereas others were cell type specific.
The functional roles of the broad H3K9me2 domains are as yet unclear, but there may be diversity of function because some domains cover genes that are well expressed, but within other domains, genes are transcriptionally silent. [33] In contrast to mammals and Drosophila, C. elegans chromatin does not contain large megabase-scale domains of H3K9me2 or H3K9me3, likely due to its holocentric chromosomes. Rather than having a single centromere surrounded by pericentric heterochromatin, centromeres are distributed along the chromosomes and pericentric heterochromatin is not apparent. Most H3K9 methylation is present on distal arm regions of autosomes. [34,35] H3K9me3 domains are on average 30 kbp, and similar to other organisms, H3K9me3 covers a larger fraction of the genome of differentiated larvae compared to undifferentiated embryos. [3] H3K9 methylation is associated with transcriptional repression, as loss of the enzymes that generate methylated H3K9 leads to derepression of genes and repetitive elements. [36,37] C. elegans H3K9me3 domains also differ from those in other animals in that they generally also contain H3K27me3. [3] Domains of chromatin methylated on H3K27 or H3K9 are amongst the best known examples of chromatin domains, but methylated H3K4, usually associated with gene activity, has also been shown to form broad domains. Broad H3K4 di- and trimethylated domains are located at HoxA and HoxB loci, where they were observed only in cells in which these genes are specifically expressed. [38] However, broad H3K4me3 domains are not always associated with gene activity. Recent studies found that broad H3K4me3 domains cover 22% of the genome in mouse oocytes, where they significantly overlapped genes and putative regulatory elements associated with subsequent zygotic genome activation (ZGA). [24,39] Mapping at the 2- and 8-cell stage showed that the broad H3K4me3 domains were progressively lost, leaving punctate peaks of H3K4me3 at promoters of ZGA genes.
The broad H3K4me3 domains in oocytes appear to be repressive, because depletion of the H3K4 demethylases KDM5A/B resulted in their persistence at 2- and 8-cell stages, the downregulation of ZGA genes, and the failure of embryo development. [39] In summary, multiple lines of evidence indicate that broad chromatin domains, or the proteins responsible for their deposition, have roles in the control of transcriptional processes.

Mechanisms of Domain Formation
A common feature associated with chromatin domains is a mechanism enabling the histone modification to spread. Histone methyltransferases are often associated with adaptor proteins or contain protein domains that can bind the histone modification generated and hence aid in expanding the mark to neighboring nucleosomes. For example, SUV39H2 contains a chromodomain that recognizes methylated H3K9. [40] Heterochromatin Protein 1 (HP1), which also harbors a chromodomain that binds H3K9me2 and H3K9me3, recruits SUV39H2, forming a feedback loop to spread H3K9me2/H3K9me3 along the chromatin. [41] A similar mechanism for H3K9 spreading occurs in Schizosaccharomyces pombe centromeric chromatin. [42] The spreading of H3K27 methylation is dependent on the ability of EED in the PRC2 complex to bind methylated H3K27 via its WD40 domain. EED-bound H3K27me3 is sensed by the stimulatory recognition motif (SRM) of EZH2 and stimulates its methyltransferase activity, allowing H3K27me3 to be propagated along the chromatin. [43] H3K4me3 and H3K36me3 modifications, associated with gene activity, have been shown to allosterically inhibit the methylation activity of PRC2 via SUZ12, [44,45] thus limiting the spread of H3K27me3 domains. This phenomenon may explain the expansion of C. elegans H3K27me3 domains when the H3K36 methyltransferase MES-4 is inhibited. [13,14]
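The read-write feedback described above can be caricatured with a deterministic toy model. It is an illustrative sketch only, not a kinetic model and not taken from the cited work. Nucleosomes sit on a line, a mark starts at nucleation sites, and in each round every unmarked nucleosome next to a marked one acquires the mark unless it is "blocked", which here stands in for carrying an antagonistic active mark such as H3K36me3. The function name, parameters, and nucleosome positions are all hypothetical.

```python
def spread_mark(n_nucleosomes, nucleation_sites, blocked=()):
    """Return a list of booleans: True where the mark is present after spreading.

    n_nucleosomes    -- length of the linear nucleosome array
    nucleation_sites -- indices where the mark is initially deposited
    blocked          -- indices refractory to marking (assumed antagonistic marks)
    """
    blocked = set(blocked)
    marked = [False] * n_nucleosomes
    for i in nucleation_sites:
        if i not in blocked:
            marked[i] = True

    while True:
        updated = list(marked)
        for i in range(n_nucleosomes):
            if marked[i] or i in blocked:
                continue
            left = i > 0 and marked[i - 1]
            right = i < n_nucleosomes - 1 and marked[i + 1]
            # "Read-write": a marked neighbor recruits the writer to this nucleosome.
            if left or right:
                updated[i] = True
        if updated == marked:  # no change, the domain has stopped growing
            return marked
        marked = updated


# Antagonistic marks at positions 5, 6 and 15 confine the domain to 7..14.
confined = spread_mark(20, nucleation_sites=[10], blocked=[5, 6, 15])
# Without the antagonistic marks, the domain expands over the whole array.
expanded = spread_mark(20, nucleation_sites=[10])
```

In this sketch, removing the blocked positions loosely mimics the loss of an antagonistic mark and lets the domain expand, in the spirit of the H3K27me3 expansion seen when MES-4 is reduced. Real spreading is stochastic and also depends on enzyme kinetics and histone turnover, none of which is modeled here.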

Chromatin Domains and 3D Genome Organization
Segmentation of chromatin into broad linear domains resembles the organization of genomes into topologically associated domains (TADs) (for reviews on this topic, see Refs. [46,47]). TADs are contiguous genomic regions defined by strong physical self-interactions and their relative insulation from other sequences, as revealed by high-throughput chromosome conformation capture techniques such as Hi-C. [48] The genomes of mammals and Drosophila have been shown to be spatially organized into TADs usually comprising 1-10 genes, [49][50][51] whereas the three-dimensional architecture of the C. elegans genome appears to differ significantly. Indeed, a Hi-C physical interaction map defined self-interacting TAD-like domains primarily on the X chromosome in C. elegans. [52] These megabase-scale domains contain an average of 200 genes and their structure is regulated by dosage compensation. The lack of a C. elegans TAD architecture similar to that in Drosophila and mouse might be related to the apparent lack of architectural insulator proteins in C. elegans, including CTCF, [53] which is important for TAD definition, at least in mammals. [54] Interestingly, topological domains in fruitfly and mammals have characteristics similar to C. elegans chromatin domains, since chromatin marking across a TAD is often relatively uniform [47] and TAD positions are highly conserved across different cell types. [55] These observations suggest that there may be a functional overlap between the two types of genome organization. Are active/regulated chromatin domains and topological domains separate entities, possibly fulfilling similar roles in distinct species? Or, if chromatin domains characterize the genomes of other metazoans, how do they correlate with the three-dimensional genome architecture? Recent work in Drosophila suggests a strong correlation between broad H3K27me3 domains in primary spermatocytes and three-dimensional chromatin organization. [18] Active chromatin domains (low H3K27me3) show a very strong overlap with TAD boundaries, and are characterized by significantly more short-range physical interactions compared to the high H3K27me3 domains. This result agrees with previous reports showing that TADs overlapping transcriptionally active domains, which encompass most of the stably expressed genes in Drosophila, [56] have a peculiar contact structure characterized by generally close intra-domain interactions. [50] Although the chromatin and topological domains compared in this work were obtained from different cell types, a significant overlap was observed between the H3K27me3 profile in primary spermatocytes and the H3K27me2 landscape in the Kc167 cells from which the Hi-C data were produced. It will therefore be interesting to study whether different methylation levels of the H3K27 residue might reflect the topological architecture of the genome in other tissues and developmental stages in the fruitfly.
Recent work on the 3D architecture of mouse gametes and early embryos also found a correlation between chromatin architecture and broad H3K27me3 domains. From the zygote to the 8-cell stage, [57] broad H3K27me3 domains were observed to cover so-called "B" compartments, genomic regions that interact in 3D and that are associated with inactive chromatin. [48] Notably, at E6.5, when strong broad domains of H3K27me3 are no longer present and instead this modification has a punctate pattern at regulatory elements (see section 4), [25] H3K27me3 is enriched in the "A" (active) compartment, similar to the pattern in lymphoblastoid cell lines. [48,57] If confirmed, these observations would support a unified model in which the histone modification landscape matches chromatin topology, raising questions regarding the interplay between the two levels of genome organization. For example, it will be important to define whether chromatin state determines the physical insulation of domains, or alternatively genome topology constrains and drives the deposition of specific histone marks. The latter scenario, based on current data, is unlikely. Indeed, although previous reports suggested that CTCF might actively delimit H3K27me3 domains, [58] more recent work revealed that this protein, directly involved in the definition of TADs in mammals, does not constrain the spreading of H3K27me3 to neighboring regions. [54,59] On the other hand, there is initial evidence for a role of chromatin modifications or chromatin modifiers in the definition of TADs. A recent study focused on the onset of zygotic transcription in early Drosophila embryos showed that some histone modifications were significantly enriched at future TAD borders before the formation of topological domains, suggesting that chromatin modifications might directly or indirectly define chromatin topology. [60] Nonetheless, reliance of TAD formation upon chromatin modifications might be expected to involve the conservation of a given chromatin state throughout organismal development. In this respect, the finding that germ line chromatin marking is relevant to chromatin domain structure in C. elegans and Drosophila suggests a potential mechanism of transmission of domain information. [13,14,18,61] Whether certain histone modifications are stably found at the same genomic loci during the life cycle of an organism is currently unclear, and therefore further work is needed to define the profile of more chromatin modifications during development, as well as their potential link with the establishment of topological domains.

Conclusions and Future Directions
In summary, different histone modifications form broad chromatin domains, and such domains appear to be functionally important in genome regulation, but important details about their biology are still missing. For example, despite the similarities in chromatin domain properties in different species, it is as yet unclear to what extent they are shared traits in metazoans. Indeed, the apparent conservation of active and regulated domain structures at different stages of C. elegans development contrasts with the more dynamic H3K27me3 patterns during Drosophila and mouse development. Another aspect that will require further investigation is their epigenetic role. The structure of H3K27me3 chromatin domains makes them ideal candidates for the epigenetic transmission of gene regulation control, [62] and recent experimental work supports an epigenetic role for this histone mark in animals. [60,63]
To address the functions and conservation of broad chromatin domains, attention needs to be given to technical aspects of their study. Guidelines for ChIP-seq analyses, such as those defined by the ENCODE consortium, have been optimized for the identification of narrow peaks of enrichment, [64] but datasets fulfilling these guidelines often have insufficient depth of coverage for analyses of broad domains, where enrichment levels are relatively low. Moreover, a uniform computational approach for delineating broad domains would facilitate their comparison between cell types and species. These improvements, coupled to more comprehensive profiling of chromatin modifications, perturbation analyses, and manipulation of domain structures using genome editing, will help uncover the conservation, roles and regulation of broad chromatin domains in the control of genome activity.
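To make concrete what a uniform approach to delineating broad domains might involve, the sketch below shows one deliberately simple scheme. It is not a method proposed in the works cited here, and the threshold, gap tolerance, and minimum length are hypothetical parameters that would need tuning. Binned enrichment values (for example, log2 ChIP over input) are thresholded, enriched bins separated by short gaps are bridged, and only intervals above a minimum length are reported.

```python
def call_broad_domains(signal, bin_size, threshold=0.0, max_gap_bins=1,
                       min_length=10_000):
    """Delineate broad domains from per-bin enrichment values on one chromosome.

    signal       -- list of enrichment values, one per consecutive bin
    bin_size     -- bin width in base pairs
    threshold    -- a bin counts as enriched if its value exceeds this
    max_gap_bins -- enriched bins separated by at most this many
                    non-enriched bins are joined into one domain
    min_length   -- domains shorter than this (in bp) are discarded
    Returns half-open (start_bp, end_bp) tuples.
    """
    domains = []
    start = None      # first bin of the domain currently being built
    last_hit = None   # most recent enriched bin in that domain
    for i, value in enumerate(signal):
        if value <= threshold:
            continue
        if start is None:
            start = i
        elif i - last_hit - 1 > max_gap_bins:
            # Gap too long: close the current domain and open a new one.
            domains.append((start, last_hit + 1))
            start = i
        last_hit = i
    if start is not None:
        domains.append((start, last_hit + 1))

    return [(s * bin_size, e * bin_size)
            for s, e in domains
            if (e - s) * bin_size >= min_length]


# Hypothetical log2 enrichment values in 5 kbp bins.
example_signal = [0.2, 0.5, -0.1, 0.4, 0.6, -0.5, -0.6, -0.4, 0.9, -0.2]
example_domains = call_broad_domains(example_signal, bin_size=5_000)
```

In this example the first five bins are joined into one 25 kbp domain despite the single sub-threshold bin, while the isolated enriched bin near the end is dropped for being shorter than the minimum length. Applying the same explicit parameters to every dataset is what would make domain calls comparable between cell types and species.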