Understanding fundamental principles of enhancer biology at a model locus

Abstract Despite ever‐increasing accumulation of genomic data, the fundamental question of how individual genes are switched on during development, lineage‐specification and differentiation is not fully answered. It is widely accepted that this involves the interaction between at least three fundamental regulatory elements: enhancers, promoters and insulators. Enhancers contain transcription factor binding sites which are bound by transcription factors (TFs) and co‐factors expressed during cell fate decisions and maintain imposed patterns of activation, at least in part, via their epigenetic modification. This information is transferred from enhancers to their cognate promoters often by coming into close physical proximity to form a ‘transcriptional hub’ containing a high concentration of TFs and co‐factors. The mechanisms underlying these stages of transcriptional activation are not fully explained. This review focuses on how enhancers and promoters are activated during differentiation and how multiple enhancers work together to regulate gene expression. We illustrate the currently understood principles of how mammalian enhancers work and how they may be perturbed in enhanceropathies using expression of the α‐globin gene cluster during erythropoiesis as a model.


INTRODUCTION
It is estimated that in mammalian genomes there are ∼20 000 genes regulated by ∼900 000 enhancer-like elements, [1] interspersed with ∼30 000 CCCTC-binding factor (CTCF)-bound elements, [2,3] many of

FIGURE 1 Schematic representation of erythropoiesis. S0-low to S5 mark seven defined cellular stages of erythropoiesis representing an immunophenotyping-based purification strategy [7] that allows the isolation of the desired population from mouse fetal livers. The stages of erythroid differentiation are shown starting with Hematopoietic Stem Cells (HSC) committing, amongst other lineages, to the erythroid progenitors (burst colony forming unit and colony forming unit erythroid, BFUe and CFUe respectively). α-globin expression is first detected and progressively increases as cells differentiate from proerythroblasts to mature red blood cells, as indicated by the gradually increasing red colour in the red triangle. The schematic representation of the progressively maturing erythroid cells highlights the morphologically distinguishable stages both in the varying size and hemoglobinisation states of the cells.
Importantly, we are also documenting how these processes are perturbed in human genetic disease. Despite these huge advances and the availability of an almost overwhelming resource, we still do not fully understand the mechanisms by which individual genes are switched on or off during development, lineage specification and differentiation. In our work, we have focussed on understanding the regulation of a single gene using the orthologous human and mouse α-globin loci as our model and this continues to provide new insights into how enhancers control gene expression in the context of a regulatory domain.
The globin genes are exclusively expressed during the process of erythropoiesis, which produces 1-2 million red blood cells every second in healthy adults [4] and occurs when haematopoietic stem cells undergo lineage specification and differentiation (Figure 1). Mutations of the human α-globin locus cause a common form of inherited anaemia (α-thalassaemia) and therefore provide a rich source of informative naturally occurring mutations including mutations of the enhancer cluster. [8] Such mutations can be modelled in the orthologous mouse locus, which faithfully recapitulates the molecular and cellular phenotypes arising from human mutations. The mouse also provides an excellent test-bed for producing newly engineered mutations, not present in humans, to further test hypotheses concerning gene regulation. Importantly, unlike many other genes that have been analysed, α-globin has no downstream targets and therefore during erythropoiesis, natural and engineered mutations do not alter the expression of any other genes that might affect cell fate and confound the primary effects of the mutations that are introduced. This is an ideal situation when analysing the principles by which enhancers regulate transcription rather than how they control cell fate.
It is often argued that understanding a single locus might not reveal the general principles of gene regulation: evolution is only influenced by the selective advantage of the final output and so there may be great differences in the mechanisms of mammalian gene regulation. Whilst this is possible, all mechanisms of gene regulation first established at the α- and β-globin loci have been found to be very widely used.
These include nearly all processes involving enhancer-driven expression, insulator elements, transcription, RNA processing and translation.
We are therefore optimistic that the mechanisms by which the globin enhancers communicate with and activate transcription from their cognate promoters will elucidate general principles of gene regulation.
Here we review current information about the order of events as the mouse α-globin enhancers and promoters are activated during erythropoiesis. We will then focus on two important unanswered questions in enhancer biology. First, what are the roles of individual elements in the context of a cluster of enhancers (sometimes referred to as a locus control region [LCR] or super-enhancer [SE]): do the enhancer elements act individually or as a group? Second, in contrast to individual enhancers, do enhancer clusters act in an orientation-dependent manner? Finally, we will discuss our current understanding of how the enhancer communicates with the promoter and how this process is perturbed in enhanceropathies.

THE STRUCTURE OF THE α-GLOBIN CLUSTER
The mouse α-globin cluster on chromosome 11 is located in a ∼65 kb erythroid-specific sub-TAD (topologically associating domain), which is contained within a larger ∼165 kb TAD present in all tested cell types (Figure 2). [9] Such structures typically emerge as a consequence of loop extrusion delimited by largely convergent CTCF boundary elements [10] and reflect domains of preferred interactions amongst DNA elements. [11,12] The locus includes an embryonic ζ-globin gene (Hba-x) and a pair of almost identical adult α-globin genes (Hba-1 and Hba-2). The cluster also contains two θ-globin genes (Hbθ-1 and Hbθ-2) of unknown function.
All of these genes are regulated by a set of five erythroid-specific enhancer elements (R1, R2, R3, Rm and R4) present ∼8-31 kb 5′ (upstream) of the cluster (Figure 2A). Four of these enhancers (R1, R2, … binding sites for TFs known to regulate erythropoiesis (GATA1, TAL1, NFE2 and KLF1) and the enhancer cluster fulfils the definition of a SE. [13,14] The α-globin genes and their enhancers are flanked by multiple largely convergent CTCF-binding elements at the boundaries of the sub-TAD. [15] These elements in the mouse are largely conserved in the human α-globin locus. [16]

FIGURE 2 … : 32 060 000-32 260 000) with the higher intensity colours reflecting the higher frequency of contact between the α-globin major cis-regulatory elements and delineating the sub-TAD (dark grey bar under the matrix), and contacts established between the convergent CTCF sites marking the span of the TAD (light grey bar under the matrix), both structures aligned to the schematic of the locus below.

THE CELLULAR EVENTS IN ERYTHROPOIESIS
Globin gene expression occurs within the context of erythropoiesis (Figure 1). Via different pathways, haematopoietic stem cells ultimately form bi-potential progenitors which may differentiate into erythroid cells or megakaryocytes which eventually form platelets required for haemostasis. The first cell type fully committed to erythropoiesis alone is referred to as the erythroid burst forming unit (BFU-E), which retains considerable capacity for expansion and gives rise to erythroid colony forming units (CFU-Es). Of interest, these cells pass through a cell cycle with an unusually rapid S-phase; [7] they then become fully committed to terminal differentiation and form a synchronous population of cells which progress through 3-4 subsequent divisions and ultimately enucleate to form mature red blood cells. Using flow cytometry, in mice these cells have been stratified into seven subgroups (S0 low and S0 medium, S1, S2, S3, S4 and S5) [7] (Figure 1). Using single-cell RNA-seq, α-globin RNA can be detected at basal levels even in stem and progenitor cells (S0). [9] Expression of α-globin increases dramatically as cells transition from late CFU-E (S1) to proerythroblasts (S2), and plateaus at S3 as cells become fully committed to terminal differentiation. [9,17]

THE ORDER OF EVENTS LEADING TO α-GLOBIN TRANSCRIPTION
Using primary erythroid cells and erythroid cell lines corresponding to the various stages of mouse erythropoiesis (S0-S5), the order of events leading to activation of the α-globin genes has been characterised in some detail. [9,17] It is important to consider these events not as fixed step-wise progressions; rather they are highly dynamic processes by which the probability of activation increases with time throughout differentiation.
Changes in α-globin expression are driven by changes in the protein factors bound at the α-globin enhancers and promoters, whilst insulator elements are bound by CTCF at all stages of erythropoiesis.
In early progenitors and precursors, the α-globin gene promoters, but not the enhancers, are bound by components of the polycomb complex (Figure 3) which is thought to maintain the silencing of these genes, at least in part, via histone deacetylation. [18] This process was originally identified at the human α-globin cluster but has more recently been shown to also occur at the mouse locus (Beagrie R et al., in preparation). In early erythroid cells, the α-globin regulatory elements are also bound by GATA2 suggesting it may act as a pioneer factor [23] for opening chromatin. In early progenitors, low levels of other TFs involved in α-globin expression (GATA1, TAL1, LDB1 and ZBP89) are also found at the enhancer elements [24,25] (Figure 3). In these precursors, enhancers are also modified by H3K4me1, implying that the MLL3/4 COMPASS complex is also present and active at these sites. [21,22,26] At these early stages of erythropoiesis, when the α-genes are not being transcribed, the level of histone acetylation is low and contact frequency between the α-globin enhancers and promoters appears relatively low. [9] The appearance of enhancer-RNAs (eRNAs) at enhancers precedes the activation of nearby genes; the occurrence of eRNAs has been variously proposed to keep the enhancer region in an open configuration, [27,28] and/or to facilitate enhancer-promoter looping [29,30] (eRNA roles are extensively reviewed in ref. [31]). Together, these findings suggest that in the stages before the α-genes are transcribed (particularly S0 low and S0 medium), the regulatory elements are forming via a dynamic process while the target genes are substantially repressed via the polycomb complex, preventing premature activation of the α-genes.
Several changes occur at the critical stage of erythroid commitment (S1-S2). The polycomb complex is removed. Chromatin accessibility reaches its maximum level. [34] GATA1 is recruited at high levels, fully replacing GATA2. This 'GATA switch', crucial for progressive erythroid maturation, reflects the RNA levels of these TFs and is thought to be driven by the increasing abundance of GATA1 displacing the decreasing levels of GATA2. [35] However, this interpretation of the GATA switch was questioned by a report showing discrepancy between the levels of GATA1/2 RNAs and proteins as differentiation proceeds. [36] The CCAAT-box binding factor (NFY) is detectable at the promoters and the levels of H3K4me1 (at enhancers) and H3K4me3 (at promoters) mediated by the Histone (H) Lysine (K) methyltransferases, MLL3/4 and MLL1/2, respectively, reach their maximum levels. However, at this stage, there are still no readily detectable components of the pre-initiation complex (PIC) or RNA polymerase II (Pol II) at the α-globin promoters. Thus, it appears that, at this stage of erythropoiesis, the elements are ready for activation but relatively little α-globin transcription occurs (Figure 3).

Activation of α-globin expression occurs at the cell transition between S1 and S2 reaching a maximum at S2-S3. This activation is associated with recruitment of TAL1 (at the enhancers) possibly as a member of the pentameric erythroid complex (TAL1, E2A, LDB1, LMO2 and GATA1). [25,37,38] Activation is also associated with recruitment of KLF1 at the enhancers and promoters. [39,40] The recruitment of all tested components of the PIC and Pol II to the α-globin promoters is documented [39] as well as an increase in histone acetylation across most of the sub-TAD. It appears that the primary role of the enhancers is to recruit the PIC and initiate transcription. [41] Although this is the primary effect, it does not rule out a role for enhancers in subsequent stages of the transcription cycle. Increased α-globin expression is associated with a concomitant increase in contact probability between the enhancer and promoter. [9] It is clear that the Mediator complex and its co-factor BRD4 play a role in activating α-globin transcription as both protein complexes are present at the α-globin enhancers and promoters in committed erythroid cells [14] and reduced when the enhancers and α-globin transcription are compromised. [42] Of interest, the Mediator complex has been shown to occupy enhancers in the β-globin LCR in ES cells, even though looping of this enhancer with its promoter does not occur until the erythroid lineage. [43,44] In future, to complete the model, it will be important to determine exactly when the Mediator complex and BRD4 complexes are recruited to the α-globin enhancers during erythropoiesis.
The current model suggests that the enhancers are first primed, then interact with the promoter and facilitate recruitment of the PIC and Pol II. This is supported by human enhanceropathies which cause α-thalassaemia by deletion of the enhancers and experiments in mice in which the enhancers have been partially or completely deleted. In all cases, the primary effect of these mutations is to reduce the recruitment of the PIC and Pol II (see below).

FIGURE 3 The dynamics of the molecular events at the α-globin locus throughout the concomitant erythroid differentiation and α-globin expression. The series of events from early erythroid progenitors (S0) where GATA2 acts as a pioneer factor at enhancers, low contact frequency between α-globin enhancers and promoters, α-globin is not transcribed (A, H3K4me1 at enhancers deposited by MLL3/4 COMPASS; B, Polycomb-mediated repression of gene expression at promoters). As the erythroid differentiation progresses (S1), GATA1 replaces GATA2 (C, H3K4me3 deposited at promoters by the trithorax proteins MLL1 and MLL2; D, Polycomb complex removed and high levels of H3K27Ac deposited by p300 and CBP leading to maximum chromatin accessibility). The more differentiated erythroid progenitor state (S2) is marked by a more established binding of erythroid-specific transcription factors (the pentameric complex) and by an increase in contact probability between the enhancer and promoters and noisy transcriptional bursting. As the more mature erythroid cells emerge (S3), continuous transcriptional bursting is observed, followed by a return to a noisy transcriptional bursting state (S4). [73]

HOW ARE INTERACTIONS BETWEEN ENHANCERS AND PROMOTERS ESTABLISHED DURING DIFFERENTIATION?
The 165 kb TAD containing the α-globin cluster and five widely expressed genes lying upstream is found in all cell types tested. By contrast, the 65 kb sub-TAD containing the α-genes and their enhancers is only seen in erythroid cells. [9] Within single cells analysed by Hi-C, defined fixed TADs do not exist: it is only when populations of cells are considered that the patterns of TADs emerge. [45,46] Therefore, it is important to consider TADs and sub-TADs as interaction probabilities rather than as defined structures. Similarly, interactions which juxtapose enhancers and promoters within sub-TADs are dynamic with estimated time spans of 15 min. [47,48] The true rate of molecular contact between enhancers and promoters at the nanoscale is challenging to study because of technical limitations. [49] Of interest, the sub-TAD containing the α-globin locus defined by imaging can be found in 76% of erythroid cells [50] although the precise borders defining these structures are not known at high resolution since the probes cover relatively large regions (64-139 kb). [50] Reconciling the chromatin structure models inferred from chromatin conformation capture population averages (3C and Hi-C) and single-cell imaging approaches remains a challenge. [51,52] The view of chromatin organisation as a dynamic ensemble of contacts impacts discussions around enhancer-promoter interactions, especially around how stable or transient they are. [52] What brings the elements into close proximity is also highly debated. [53,54] Popular models explaining enhancer-promoter contacts involve loop extrusion [10,55,56] and/or passive diffusion of enhancers and promoters. [57,58] These mechanisms are not mutually exclusive. Several lines of evidence suggest that both TADs and sub-TADs are formed by loop extrusion (Figure 4A), itself a dynamic process, mediated by the cohesin complex and delimited by CTCF-bound insulators. Consistent with this, both cohesin and its associated protein NIPBL, which is thought to load cohesin and play a role in its translocation, [59] are detectable at increased levels at the α-globin enhancers and promoters in erythroid cells. [15,60] Furthermore, deletion of two CTCF-bound insulators lying upstream of the α-globin enhancers extends the sub-TAD and this is associated with activation of newly incorporated genes within the extended sub-TAD. [15] It is also possible that enhancer-promoter proximity occurs by passive diffusion (Figure 4A). However proximity is induced, contacts are thought to be transiently stabilised by homotypic protein interactions, for example involving the LDB1 component of the erythroid pentameric complex [61] or SP1.

FIGURE 4 Theoretical and α-globin-specific depiction of enhancer-promoter interactions. (A) Two models describing molecular states of enhancer-promoter interaction. In both models, it is thought that proximity between the enhancer and promoter may be, at least in part, the result of loop extrusion. How information is transferred from the enhancer to the promoter is unknown. One possibility is that the Mediator complex provides a protein bridge between the two elements. Another model proposes the formation of a transcriptional hub in which molecular crowding or liquid-liquid phase separation forms a high concentration of TFs and CoFs creating an environment that is conducive to transcription. (B) A schematic representation of the α-globin interacting domain, as visualised using super-resolution microscopy, formed specifically in erythroid cells whereby enhancers and promoters are interacting in a de-compacted segment of chromatin delimited by flanking convergent CTCF sites. Below, a linear representation of the locus with all the elements indicated as in Figure 2, and highlighting in colour the segments that correspond to probes used in the labelling process; green and red for the flanks and blue for the enhancer-promoter region. The schematic above shows the enhancer-promoter interaction domain observed only in erythroid cells. (C) The super-resolution image of the α-globin interaction domain; the blue probe highlights de-compacted chromatin spanning the α-globin enhancers and promoters bounded by the flanking red and green probes encompassing the convergent CTCF sites. Scale bar 0.5 μm.
As erythroid differentiation and α-globin transcription proceed, the probability of interaction between the enhancers and promoters, as judged from tiled capture-C and super-resolution microscopy, appears to increase (Figure 2B). [9] It may be that this is related to increased loop extrusion. In support of this, in recent experiments we have placed a CTCF-bound insulator between the enhancers and the promoters and found that this significantly decreases α-globin expression in an orientation-dependent manner (Stolper R, Tsang F et al., in preparation). Of interest, the insulator causes a greater effect on α-gene expression when orientated such that it would block cohesin translocating from the enhancer to the promoter. This suggests that loop extrusion is required for the interaction between the α-globin enhancers and promoters and further work is underway to fully test this hypothesis. It remains possible that there is another as yet unknown mechanism by which insulators can alter enhancer-promoter interactions or transcription in an orientation-dependent manner.
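The point made earlier in this section, that TADs and sub-TADs are better thought of as interaction probabilities than as fixed structures, can be illustrated with a toy simulation. The sketch below is not a model of the α-globin locus: it assumes that each cell independently shows an enhancer-promoter contact with a hypothetical probability `p_contact` (the value 0.3 is arbitrary) and shows that a graded contact frequency only appears once many binary single-cell states are averaged, as in a population-based 3C or Hi-C experiment.

```python
import random


def simulate_contacts(n_cells, p_contact, seed=0):
    """Return one 0/1 contact state per simulated cell.

    p_contact is a hypothetical per-cell contact probability, not a
    measured value.
    """
    rng = random.Random(seed)
    return [1 if rng.random() < p_contact else 0 for _ in range(n_cells)]


def contact_frequency(states):
    """Population-average contact frequency across all simulated cells."""
    return sum(states) / len(states)


# Each single cell is either in contact (1) or not (0) ...
cells = simulate_contacts(n_cells=10_000, p_contact=0.3)
# ... whereas the population average approximates the underlying probability.
print(contact_frequency(cells))
```

In this picture, an increase in contact probability during differentiation corresponds to a higher `p_contact`, not to a structure that is absent in one stage and fixed in the next.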

WHAT IS THE NATURE OF THE ENHANCER-PROMOTER INTERACTION?
It is generally agreed that enhancer-driven transcription is associated with increased proximity between enhancers and their cognate promoters, [62][63][64][65][66] although there are exceptions. [67,68] This is consistent with the concept that enhancers and activated promoters may be found in transcriptional hubs (originally referred to as transcription factories) in which there is a high concentration of the many factors and co-factors required for transcription (e.g., TFs, Pol II, Mediator and BRD4). Various models have been proposed describing the relationship between these proteins, chromatin, DNA and RNA including eRNAs. Although originally proposed as fixed nuclear sub-structures, it has been shown that these hubs are more likely transient, non-membrane-bound nuclear compartments. [69] The biophysical nature of these hubs is a matter of current debate and it has been proposed that the intrinsically disordered domains of proteins within the hubs may form liquid-liquid phase separated condensates. [70,71] It has also been suggested that the transient nature of the hubs and interactions between proteins may explain why transcription occurs in bursts often lasting minutes rather than occurring continuously. [49,72,73] Using super-resolution microscopy, we have observed the active α-globin locus in structures consistent with loop-extruded chromatin flanked by CTCF-bound insulators (Figure 4B). This, together with tiled capture-C experiments, shows that the enhancer and promoter come into close proximity only in erythroid cells. [9,50] Using tri-C experiments, which capture interactions between regulatory elements in single cells, it appears that the five α-globin enhancers work as a group rather than as individual elements. [74] This observation is further supported by micro-capture-C (MCC), which shows at nucleotide resolution that the enhancers interact more frequently with each other than with the α-globin promoters. [60] In addition, this work suggests that the α-globin enhancers not only interact with the globin genes but also with other flanking promoters in the same TAD, suggesting that the hub contains several transcriptionally active promoters. [78] The content, distribution, concentrations and roles of the very large number of proteins (histones, TFs, co-factors, enzymes, etc.) thought to be contained within the transcriptional hub remain poorly defined. One important complex with 26 subunits that has been considered in detail is the Mediator complex and its associated protein BRD4. Subunits at the tail module of Mediator serve as a major interaction surface for a variety of sequence-specific TFs located at the enhancer whilst the head module serves as a docking site for Pol II and general TFs [79,80] at the promoter. Depletion of Mediator globally diminishes the level of gene expression, [81] but enhancer-promoter interactions are largely preserved [81,82] or slightly reduced [83] even after acute depletion of Mediator or PIC components, implying that Mediator does not serve as a major structural bridge between enhancers and promoters.
In line with the hub hypothesis, the α-genes, like many other genes, are transcribed in frequent transcriptional bursts each lasting ∼5 min.
As in other systems, our findings suggest that the enhancers influence the frequency of transcriptional bursting. [73] A key unanswered question is whether the transcriptional bursts result from direct physical interaction between the enhancer and promoter or more simply from transient assembly and dispersal of the transcriptionally favourable hub. Measuring dynamic changes in the distance between enhancers and promoters is challenging and at the limit of current imaging. Nevertheless, recent efforts to measure the distances between enhancers and their target promoters during active transcription suggest that, although in proximity, at the atomic scale they are separated by large distances, in the order of 200-300 nanometers (nm). [49,84,85] Given the sizes of the proteins and molecular crowding within the hub, this separation may simply reflect the contact between multiprotein complexes binding to the enhancers and promoters. MCC, a cell population assay, defines enhancer-promoter interactions with base-pair resolution, [60] and this reveals the patterns of proteins interacting at the α-globin enhancers and promoters. Whilst this suggests that there are atomic-range interactions between these elements, it does not tell us how frequent they are or their relationship to transcription. Two elements separated by 200-300 nm might well contact each other by random diffusion and not be related to transcriptional activation. Models correlating contact and transcriptional activation may be further complicated by the requirement for multiple enhancer-promoter contact events to activate transcription. [63]
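To make the link between burst frequency and transcriptional output concrete, the relationship can be sketched with a standard two-state (ON/OFF) promoter model. This is an illustrative assumption rather than a model fitted to the α-globin data: only the ∼5 min burst duration is taken from the text, while the function name and the switching-rate values are hypothetical.

```python
def fraction_active(k_on_per_hour, burst_minutes=5.0):
    """Steady-state fraction of time a two-state promoter spends ON.

    k_on_per_hour: rate at which an inactive promoter switches on
        (a hypothetical stand-in for enhancer-dependent burst frequency).
    burst_minutes: mean burst duration; the default follows the ~5 min
        bursts mentioned in the text.
    """
    k_on = k_on_per_hour / 60.0   # switching-on rate per minute
    k_off = 1.0 / burst_minutes   # switching-off rate per minute
    return k_on / (k_on + k_off)


# Halving the switching-on rate lowers the time spent transcribing
# without changing the duration of each individual burst.
for rate in (12.0, 6.0):
    print(rate, round(fraction_active(rate), 3))
```

Under these assumptions, an enhancer mutation that lowers burst frequency reduces output even though each burst still lasts about the same time, which is the behaviour described above.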

DO CLUSTERS OF ENHANCERS WORK AS A GROUP OR INDIVIDUALLY?
The visualisation of the extensive contacts between different cis-regulatory elements in high resolution has led to the hypothesis that clusters of enhancers (SE) accentuate the impact of protein crowding for the formation of transcriptional hubs for gene regulation, and it brings to the fore the importance of understanding how these genomic elements function. The fundamental elements of the genome (enhancers, promoters and insulators) undoubtedly interact depending on their individual attributes and their distributions with respect to each other in the genome. The situation is made more complex because there is considerable overlap in the function of these elements which are very frequently initially classified by their locations with respect to transcriptional start sites and their epigenetic signatures. Most enhancers also act as promoters producing eRNAs, or meRNAs if located within the introns of a gene; the α-gene enhancers are no exception. [86] Some promoters can act as enhancers and sometimes may act as insulators, preventing an enhancer activating a more distal gene. [87,88] It has been estimated that each gene may be regulated by 40-50 enhancers: although this remains to be seen, it seems unlikely.
Genome-wide identification of enhancer elements based on their chromatin signatures does not correlate very well with the activity of such elements in classical enhancer assays. This suggests that many elements identified by current chromatin signatures are not enhancers and other regulatory elements such as tethering elements and recently described facilitators which have no intrinsic enhancer activity may share signatures that are currently indistinguishable from classical enhancers. [42,65,66] The identification of as many as 11 different classes of multipartite clusters of enhancers including SEs and LCRs, has added to the complexity of enhancer biology. [89] It is not clear to what extent these multipartite enhancers act simply as a group of conventional enhancers or whether they include other classes of regulatory elements. Furthermore, it is not clear if these multipartite enhancers cooperate additively or if they act together to be more than the sum of their parts.
We previously analysed potential synergy between the five α-globin enhancers (R1, R2, R3, Rm and R4) by studying them in conventional enhancer assays and then by removing each enhancer individually and in informative but limited combinations from the endogenous cluster in mouse models. This showed that although all elements had the signature of an enhancer, most of the activity was encoded within R1 (40%) and R2 (50%) with R3, Rm and R4 having little or no activity when removed individually from the cluster. From these experiments, it appeared that the elements acted additively (Figure 5A).
To test this further, we recently rebuilt the cluster by generating an enhancerless allele and subsequently adding each element individually and in combination. [42] This work showed that on their own, without the context of the other elements, R1 (10% transcription) and R2 (15% transcription) had much less activity than predicted from previous experiments (Figure 5A). Furthermore, sequential addition of the enhancer-like elements, with no inherent enhancer activity, ultimately restored full function (Figure 5B). Importantly, the extent to which each element restored activity was dependent on its position relative to R1 and R2. We have called these elements facilitators and they share the known chromatin marks with other elements of the α-globin enhancer cluster. It is worth noting that facilitators do not score in conventional enhancer assays and consequently are probably discarded as inactive enhancers. Their deletion and impact on gene expression in situ could be interpreted similarly to canonical enhancers, as necessary, redundant or inactive, obscuring their differing nature. Also, the fact that the facilitators only have activity in the presence of active enhancers further obscures assignment of their role within the cluster; if activators are deleted, target gene expression is totally abolished even if facilitators are intact. It is only by combining their various characteristics with extensive genetic dissection that their role in regulating optimal levels of target gene expression can be revealed. Identifying and analysing elements which serve this type of role in other multipartite enhancers will be important.
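The discrepancy between the deletion and rebuilding experiments can be restated as simple arithmetic. The sketch below uses only the percentages quoted in the text; treating the "little or no activity" of R3, Rm and R4 as exactly 0, and the names `deletion_loss`, `observed_alone`, `additive_prediction` and `shortfall`, are assumptions made for illustration.

```python
# Loss of expression (% of wild type) when each element is deleted on its own.
# R3, Rm and R4 are set to 0 as a simplifying assumption ("little or no activity").
deletion_loss = {"R1": 40, "R2": 50, "R3": 0, "Rm": 0, "R4": 0}

# Expression (% of wild type) when an element is added alone to the
# enhancerless allele, as quoted in the text.
observed_alone = {"R1": 10, "R2": 15}


def additive_prediction(elements):
    """Expression expected if every element contributed independently."""
    return sum(deletion_loss[e] for e in elements)


def shortfall(element):
    """Additive prediction minus the observed single-element activity."""
    return additive_prediction([element]) - observed_alone[element]


for name in observed_alone:
    print(name, "predicted", additive_prediction([name]),
          "observed", observed_alone[name], "shortfall", shortfall(name))
```

A purely additive model predicts that R1 and R2 alone would account for 40% and 50% of expression, whereas the rebuilt alleles show far less; the shortfall is the contribution that depends on the context provided by the other elements.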

DO CLUSTERS OF ENHANCERS WORK IN AN ORIENTATION DEPENDENT MANNER?
Individual enhancers, by definition, act in an orientation-independent manner. [90,91] As the initial reports describing enhancers and their characteristics were plasmid-based, the topic remains subject to debate. [94] We have inverted R2, the major enhancer element in the α-globin enhancer cluster, both in human and mouse loci in erythroid cultures and showed no effect on the expression of α-globin, [92,94] supporting a single-enhancer orientation-independent function. However, some reports highlight the promiscuity of enhancers and their ability to simultaneously control more than one gene including genes lying upstream and downstream of the enhancer. [77] Also, in the context of chromatin, the organisation of the genome can act to constrain enhancer activity in one particular direction. The enhancer orientation-independent function paradigm therefore should be revisited. Equally, it is not known if clusters of enhancer-like elements, working as a unit, harbour functional polarity. Of interest, in one set of experiments using large, randomly integrated transgenic inserts derived from bacterial artificial chromosomes, it was found that inversion of the β-globin LCR (a well-characterised cluster of enhancers) reduced expression of the linked β-globin gene cluster. [98] In addition, when the β-globin LCR was inserted in either orientation within a group of housekeeping genes, although it activated genes both 5′ and 3′, the upregulated genes changed depending on the orientation of the LCR. [99] Finally, when the complex enhancer cluster lying between the Kcnj2 gene and the Sox9 gene was inverted, there was a reciprocal change in expression of these two genes. [100] Together, these observations suggest that, whereas single enhancers act in an orientation-independent manner, some clusters of enhancers may act as a unit with an encoded bias to the direction in which they activate gene expression. However, it remains unclear whether such directionality is a general feature of SEs and what the underlying mechanisms might be.
We have recently analysed the direction of interaction and the effects on gene expression of the cluster of α-globin enhancers. [92] To examine any effect of the orientation of enhancer clusters on transcription, we used the mouse α-globin locus as an experimental model.
FIGURE 5
The cooperation of α-globin super-enhancer elements in driving gene expression. Top, UCSC Refseq gene annotation across the α-globin locus and ATAC-seq peaks highlighted in grey bars corresponding to the α-globin SE constituent elements (R1, R2, R3, Rm, R4). (A) The reduction in α-globin expression, as a percentage of a wildtype expression level, corresponding to SE elements upon deletion of a single element (lane 1, for example the deletion of R1 only causes a drop of 40% in expression whilst the single deletion of R3 causes no expression change) or a combination of elements (R1 and R2 deleted together whilst other elements remain intact, lane 2) as reported in ref. [14] (B) Contribution to α-globin expression, presented as a percentage of a wildtype expression level, of single elements (lane 3, adding R1 only in the absence of all other elements contributed only 12% of expression whereas adding R3 caused no activation) or a combination of elements when added to an enhancer-less α-globin locus (lane 4: R1 + R2, lane 5: R1 + R2 + Rm, lane 6: R1 + R2 + R4, lane 7: R1 + R2 + Rm + R4, lane 8: R1 + R2 + R3 + Rm + R4) as reported in ref. [42] Note the discrepancy between the expected versus observed levels of expression based on deletion versus sequential addition of elements. R3, Rm and R4, inactive based on the deletion study (lane 1), prove crucial for the full activity of R1 and R2 elements (compare lanes 2 to 4 and 8).

We have previously shown that the cluster of enhancers regulating α-globin expression represents one of the most highly ranked SEs in erythroid cells. [14] Although the α-globin SE has a major influence on expression of the α-globin genes lying 30 kb 3′, it has little activity on the genes within the ∼165 kb α-globin TAD lying 12-35 kb upstream of R1.
We found that by inverting the entire α-globin SE, with or without surrounding CTCF binding sites or intervening promoters, the predominant interactions from the SE change direction and, while α-globin expression is severely reduced, expression of the genes lying upstream, 5′ of the SE, is increased (Figure 6). [92] Together, these findings show that clusters of enhancers (such as SEs), in contrast to individual enhancers, may interact and influence gene expression in an orientation-dependent manner. This functional polarity is promoter-agnostic and encoded within the cluster itself. At present it is not clear if the newly discovered facilitator elements, whose activity depends on position, play a role in determining the orientation of the activity of the cluster. We have described a functional hierarchy among the facilitators that depends more on each element's position with respect to the promoter than on its sequence. [42] This observation may shed light on possible mechanisms that contribute to the cluster's functional directionality. In the α-globin SE inversion model, the position of R1 and R2 is almost unchanged, whereas the three facilitators are re-oriented towards the upstream genes, suggesting that their re-positioning may cause the associated changes in gene expression. We hypothesise that a unidirectional linear tracking mechanism, such as loop extrusion powered by cohesin, [10,56] may underlie the inherent functional orientation of an SE but this remains to be investigated.

NATURAL VARIATION IN THE ENHANCERS AND HUMAN DISEASE
03] It is becoming increasingly clear that the genetic, structural and/or epigenetic disruption of enhancers represents a major causative factor in many human diseases referred to as enhanceropathies, ranging from rare congenital disorders to common diseases associated with ageing and lifestyle (e.g., cancer, diabetes). [106] By contrast, mutations causing well-defined monogenic diseases are most often found in the coding sequences of well-defined genes; these include single nucleotide variants and insertions/deletions. [107] In monogenic

FIGURE 6
The α-globin super-enhancer functions in an orientation-dependent manner. (A) The α-globin SE in its native configuration preferentially interacts with the α-globin genes in erythroid cells as schematically indicated and drives normal levels of expression, as presented in the table (indicated as 100%). The genes lying upstream of the α-globin SE are also expressed, albeit at much lower levels than the α-globin, in in vitro-derived erythroid cells. (B) When inverted, the α-globin SE interacts less with the α-globin genes, which are now expressing at 20% of the wildtype levels in in vitro-derived erythroid cells. The inverted SE interacts more with Rhbdf1 and Snrnp25, the genes lying 5′ of the native locus, in the direction of the inversion, upregulating both genes, as schematically shown and as reported in ref. [92] (C) Potential confounding factors, the CTCF boundary elements and the intervening MpG gene, were deleted. The phenotype persisted, demonstrating that the functional polarity is enhancer-cluster driven.
Consistent with these general observations, common natural variation of the human α-globin cluster causing α-thalassaemia is almost always due to deletions or nucleotide variants in the coding sequence.
Nevertheless, deletions and duplications of the α-globin enhancer elements have been seen in sporadic families with α-thalassaemia (Figure 7), often from geographical regions where α-thalassaemia is otherwise rare. These have arisen by illegitimate recombination, telomeric truncation and translocation of the enhancers. These rare families provided some of the first examples of human enhanceropathies caused by deletion of enhancers and first pointed to the existence of distal regulatory elements controlling α-globin gene expression.
Alpha-globin enhanceropathies remove between one and all four of the human α-globin enhancer elements and, of interest, all of the deletions characterised to date include R2. This is a classical enhancer that, in humans, contributes 90% of the enhancer activity of the cluster of four α-globin enhancers. One patient homozygous for a deletion of R2 had a moderately severe anaemia, showing that the remaining enhancers can drive sufficient α-globin gene expression to sustain a relatively normal life. This supports the idea that R1 together with R3 and R4 can act as 'shadow' enhancers: seemingly redundant enhancers that have been shown to add robustness to tissue- and developmental-specific gene expression. [94,111] Despite careful analysis of ∼50 individuals with the phenotype of α-thalassaemia and yet no associated deletions or insertions, we have never observed a deleterious single nucleotide polymorphism in the R2 enhancer. Although such polymorphisms may exist, it is likely that they do not cause sufficient change in α-globin expression to cause a recognisable change in phenotype.

CONCLUSIONS AND FUTURE QUESTIONS
The key challenges of understanding how enhancers activate gene expression and their role in disease have been well summarised.
Clearly, enhancer(s) must be securely assigned to their specific targets.
Importantly, we need a consistent definition of an enhancer based on its activity within its normal chromosomal context rather than in transient assays, randomly integrated transgenes, or on the basis of a chromatin signature. The correlation between enhancer signatures and activities is poor. Most enhancers work in specific cellular contexts and must be tested in these contexts. Analysing enhancer-gene pairs that alter cell fate results in difficulties in interpreting changes in gene expression since the entire transcriptional and epigenetic programme will have changed, potentially producing secondary effects on expression of the gene in question.
The α-globin model discussed here fulfils all of these criteria and is allowing us to ask the fundamental general questions about enhancer-promoter biology. First, to fully characterise the individual elements in the cluster of α-globin enhancers: they are not all classical enhancers. Second, to ask about enhancer-promoter compatibility: the α-globin enhancers preferentially activate the α-globin promoters rather than other promoters that lie closer to them in linear proximity. Third, to address how distal enhancers come into proximity with their cognate promoters: it appears that loop extrusion plays some role in this but may not provide a full explanation. The recent discovery of a role for the orientation of the α-globin enhancers in promoter choice is of great interest, particularly in the context of a linear tracking model of cohesin-mediated enhancer-promoter proximity. Finally, the nature of the transcriptional hub and its relationship to transcriptional bursting needs further examination: we need to understand the anatomy of a hub at the nanoscale and the dynamics of DNA, RNA and proteins within such structures. Although there will be variations on the mechanisms elucidated by studying the α-globin locus, the history of our understanding of mammalian gene regulation from the principles established at the globin loci suggests that the fundamental underlying mechanisms may be very similar.
The final arbiter of whether or not we fully understand how enhancers activate transcription from their cognate promoters will be to build an accurately regulated locus from scratch in a neutral region of the genome using synthetic biology. This approach is now possible and underway.

FIGURE 7 Deletions encompassing the α-globin enhancer cluster which give rise to α-thalassaemia.
R3 and Rm) are located in introns of the adjacent widely expressed gene Nprl3. Each enhancer contains various combinations of bind-

FIGURE 2 The α-globin locus regulatory domain. (A) At the top, UCSC track representing the α-globin locus (coordinates [mm9]: 32 120 000-32 200 000). Middle, the regulatory elements of the α-globin locus are indicated by accessible chromatin (Assay for Transposase-Accessible Chromatin using sequencing: ATAC-seq) and Chromatin Immunoprecipitation followed by sequencing (ChIP-seq) tracks for H3K4me1 marking the enhancers (dark and light pink rectangles), H3K4me3 indicating the promoters (light blue rectangles), CTCF binding pattern corresponding to the CTCF sites (green and orange triangles), and Rad21 binding across the locus. Bottom, a schematic representation of the locus with the elements represented as described above and specifically the enhancers R1 and R2 and facilitators R3, Rm and R4 as well as the embryonic (ζ) and adult α-globin genes indicated. (B) Chromatin conformation capture (3C, Tiled-C Capture) contact matrix covering 200 kb spanning the mouse α-globin cluster (coordinates [mm9]

For example, it is possible that random diffusion of enhancers and promoters is limited by the 3D structure of the sub-TAD, which might be altered by the position and orientation of newly introduced CTCF insulators. In summary, long-range enhancer-promoter communication could result from the combination of diffusion and loop extrusion bringing the enhancer and promoter in close proximity and bridging of TFs and co-factors.