Folding dynamics of polymorphic G‐quadruplex structures

G‐quadruplexes (G4), found in numerous places within the human genome, are involved in essential processes of cell regulation. Chromosomal DNA G4s are involved for example, in replication and transcription as first steps of gene expression. Hence, they influence a plethora of downstream processes. G4s possess an intricate structure that differs from canonical B‐form DNA. Identical DNA G4 sequences can adopt multiple long‐lived conformations, a phenomenon known as G4 polymorphism. A detailed understanding of the molecular mechanisms that drive G4 folding is essential to understand their ambivalent regulatory roles. Disentangling the inherent dynamic and polymorphic nature of G4 structures thus is key to unravel their biological functions and make them amenable as molecular targets in novel therapeutic approaches. We here review recent experimental approaches to monitor G4 folding and discuss structural aspects for possible folding pathways. Substantial progress in the understanding of G4 folding within the recent years now allows drawing comprehensive models of the complex folding energy landscape of G4s that we herein evaluate based on computational and experimental evidence.

This broad and simple definition of G4s can be fulfilled by a variety of different G4 and G4-like [5][6][7] structures (Figure 1, I). Restricting and fine-tuning in specific sequences found in the genome, like $[G^{1}_{x}-L^{I}_{y}-G^{2}_{x}-L^{II}_{y}-G^{3}_{x}-L^{III}_{y}-G^{4}_{x}]$ with x = 3-5 and y = 1-7, leads to a manifold of different conformations characterized by their relative strand orientation and the resulting intramolecular loop geometries (Figure 1, II). [9] A sequence-based prediction for distinct conformations is already complex within this canonical set of G4 topologies. [10] In addition to this canonical G4 structural polymorphism, new aspects of structural complexity have been described recently, which are referred to as non-canonical polymorphism. [2,11-13] Aspects including bulges, [14][15][16] exceptional loop arrangements [17,18] and snap-back motifs [19][20][21][22][23] can be observed in G4s from sequences that do not comply with the narrow definition given above (Figure 1, III).
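The canonical sequence pattern above can be turned into a simple motif search. The following is a minimal sketch, assuming the ranges x = 3-5 and y = 1-7 from the text; the function name and the test sequence are hypothetical. Allowing any nucleotide (including G) in the loops is a simplification of this sketch, not a statement about how G4 prediction tools work.

```python
import re

# Four G-tracts of 3-5 guanines separated by three loops of 1-7 nucleotides.
# Loops are written as [ACGT], so they may also contain G (a simplification).
G4_MOTIF = re.compile(
    r"G{3,5}[ACGT]{1,7}G{3,5}[ACGT]{1,7}G{3,5}[ACGT]{1,7}G{3,5}"
)


def find_canonical_g4(sequence: str) -> list[tuple[int, str]]:
    """Return (start index, matched substring) for non-overlapping motif hits."""
    seq = sequence.upper()
    return [(m.start(), m.group()) for m in G4_MOTIF.finditer(seq)]


# Hypothetical example: four GGG tracts joined by TTA loops.
example = "TTAGGGTTAGGGTTAGGGTTAGGGTT"
hits = find_canonical_g4(example)
print(hits)  # [(3, 'GGGTTAGGGTTAGGGTTAGGG')]
```

A pattern of this kind only flags sequences that satisfy the narrow definition; the non-canonical cases discussed above (bulges, snap-back motifs) would be missed by construction.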
Polymorphism is pronounced among G4s from different sequences, but is also observed within a given G4 forming sequence, leading to concurrent folding isomers in heterogeneous ensembles. [24][25][26] In an enlarged conformational space, due to additional combinatorial possibilities, two particularly intriguing forms of folding isomerism arise. Firstly, if the number of subsequent G-residues exceeds (or undercuts [27][28][29] ) the number of G-tetrad layers in a quadruplex, an exchange of a single G-register along the G4 core can be observed, leading to a conformational subset of shifted G-register isomers. [30] Secondly, if the number of subsequent G-tracts is greater than four, different isomers can be formed by incorporating different G-tracts [8] into the formation of the G4 core. [31,32] For the latter, the term spare-tire isomerism has been newly coined. [31,33] Note that we here define both forms of folding isomerism with respect to the relation of the involved distinct G4 conformations arising from the same nucleotide sequence.

| Regulatory role of G4
The emerging role of G4 forming elements in the human genome has been reviewed extensively in the context of transcription regulation, [34] replication, [35,36] genomic instability, [37,38] epigenetic modifications [39][40][41][42][43] and telomere stabilization. [44] Since the first reports on transcription regulation through small-molecule ligands that target gene promoter G4s, [45] a plethora of approaches have evolved that focus on the development and characterization of G4 stabilizing agents. [46,47] While these strategies aim at stabilizing G4s as a molecular mechanism in novel anti-cancer therapies, [48] more recently, converse strategies have been proposed to counteract genomic instability induced by the stable formation of G4s in chromosomal DNA. [37] Both these general approaches, stabilization and destabilization, interfere with a misregulated G4 formation at a pathological stage. The ambivalent consequences of G4 structure formation in different contexts highlight the requirement for the dynamic regulation of G4s: transient folding and unfolding has to be maintained for balanced cellular homeostasis. It is thus not surprising that an inherently dynamic nature is also a key feature of G4 formation in RNAs. [49][50][51] The versatile potential to fold into a manifold of distinct G4 structures has thermodynamic, kinetic and also biological consequences. In particular for the latter, it is a crucial aspect to ensure adaptability of G4 elements in response to external stimuli. As a prime example for the structural adaptability, the functional role of spare-tire G-tracts has been proposed to maintain G4 formation after oxidative damage, since G-stretches with increasing lengths are especially prone to oxidation. [31,32,52-55] This mechanism enables the subsequent recruitment of repair machineries, which can reset the damaged DNA stretch. [55,56]
In this simple picture, the maintenance of function for polymorphic G4s refers to an on/off switching, for example in transcriptional control, depending on whether a G4 structure is present or not.
However, there is now growing evidence that G4 polymorphism itself is a crucial aspect of G4 regulatory function. We exemplify these new findings with two structural aspects: Spare-tire isomers with different loop lengths have drastically different affinities towards G4 interacting proteins, even for G4s with the same folding topology (e.g., parallel loop isomers). [57][58][59] Structural isomers with different topologies (e.g., hybrid and parallel) result in vastly different unwinding efficiencies for G4 specific helicases. [38,[60][61][62][63][64][65] These structural aspects might already arise from small changes. Thus, while formation and fold topology of a G4 might be maintained in principle, the result of modified sequences (due to oxidation or mutations [66,67] ) could be a completely altered regulation cascade. [59] In view of the consequences of G4 polymorphism, the simple model that G4s act as steric bulking structures, often anthropomorphically called roadblocks, is not suited to explain their regulatory function. [39] Hence, statements about a general functionality for G4 formation have to be taken with care, whenever the specific conformation is not considered. For G4 forming sequences that are able to fold into different stable conformations, the individual kinetics of concurrent folding pathways towards a specific conformation might be biologically even more relevant than the stabilities at thermal equilibrium.

| Preparation of non-equilibrium conformational states
A prerequisite to study coherent structural changes is the preparation of suitable starting points away from equilibrium ( Figure 2).
The experimental approaches to prepare any kind of trapped, retained or excited state can heavily influence the folding progression, possible pathways and kinetics. [68][69][70] It is thus worthwhile to compare and evaluate the experimental premises to understand possible ambiguous results.

Mechanical unwinding
Mechanical unwinding is an intriguing, but technically demanding possibility to investigate G4 folding as a measure of force under isothermal experimental conditions and at constant cation concentration (Figure 2A). Cheng et al. have used this method to study the folding/unfolding of a BCL2 promoter G4 with single molecule force spectroscopy; the observed force changes are in the range of pN. [71] Using a tethered oligonucleotide that has been fixed with magnetic beads, they were able to describe kinetic differences for spare-tire isomers of the BCL2 G4 with this approach. While this method clearly ensures a pure unfolded state, it should be noted that this state will be characterized by inherently lower conformational entropy compared to other denatured states due to fewer translational degrees of freedom. [72]

Thermal hysteresis
In reversible thermal melting and annealing experiments, G4 forming oligonucleotides can show pronounced hysteresis (Figure 2B). [73][74][75] The complex behavior at thermal transitions can be used to gain hysteresis. [30,69,73,76] Rapid temperature jumps (T-jump) have been used to induce G4 folding in circular dichroism (CD)-spectroscopic [77] and mass spectrometric [78] setups and new probe designs could potentially allow T-jump induced folding also for NMR spectroscopy. [79][80][81] Thermal (un-)folding is typically limited to lower than physiological K⁺ concentrations, due to the high thermal stability of many G4 structures. [74,75]

2.1.2 | Irreversible folding (II) and refolding (III)

Cation-induced folding
A widespread strategy of inducing G4 folding is to dissolve DNA with G4 forming sequences under buffer conditions that lack G4 stabilizing cations (Figure 2C). Thus, G4 folding is inhibited at ambient temperatures. Under careful experimental control (e.g., avoiding unwanted cation uptake from tubes), DNA G4s in particular can be prepared in an unfolded state. Folding at a specific temperature can then be induced by mixing with, for example, Na⁺ or K⁺, which allows any spectroscopic method to be applied as readout. [82] In general, cation-induced folding is a very simple, reliable and broadly applicable method, yet the degree of denaturation has to be evaluated carefully with spectroscopic methods. For RNA G4s, for example, preparation of an unfolded state following this procedure is more difficult. In a previous study, pre-formation of an RNA G4 fold was observed even in the obvious absence of K⁺ ions, and complete folding occurred at much lower K⁺ equivalents as compared to the corresponding DNA sequence. [83] CD spectra provide characteristic patterns for specific G4 architectures. Even more insightful, NMR spectra of the fingerprint region show hydrogen-bonded imino ¹H signals that allow a sensitive evaluation of residual structure formation. [84] Indeed, in many cases the formation of pre-folded states (see below) or structure formation even at very low, substoichiometric K⁺ concentrations has been observed. [83,85] In many experiments, the concentration of K⁺ is below physiological concentrations, which greatly affects the thermal stability of G4s. [75] The kinetics are accelerated with increasing K⁺ concentration (even at unphysiologically high concentrations of greater than 100 mM) [86] but the main effects, in particular the branching of pathways, are present already at very low K⁺ concentrations (<3 mM). [82,83]

Photocaging for conformational selection
Photocaging of RNA and DNA with photolabile protecting groups on their nucleobase moieties can be applied for studying G4 folding. [87,88] Photocages can inhibit hydrogen bond interactions site-specifically on distinct nucleobases in the oligonucleotide and can be removed upon light irradiation with a selective wavelength, thereby releasing the completely unmodified nucleobase. This concept was first exploited to study RNA refolding, [89][90][91][92] and has been applied to G4s recently in two different ways: either to select single folded conformations out of a polymorphic conformational ensemble from a G4 forming sequence instead of using sequence mutations (Figure 2E), [69] or to completely suppress G4 folding and trap the unfolded state (Figure 2D). [33,76] The method allows investigating the isothermal folding at constant experimental conditions and ensures a robust disruption of hydrogen bond interactions or preferential pre-orientations. While photocages are introduced at the nucleobases and act irreversibly in the direction of folding, the incorporation of photosensitive scaffolds into the DNA backbone can be used to cleave and hence unfold G4s (Figure 2F), [93] or enable a reversible switching between different G4 conformational states. [94]

FIGURE 2 Non-equilibrium G4 dynamics.

2.2 | Spectroscopic methods to study folding and refolding kinetics

| Circular dichroism
CD spectroscopy is an excellent method to monitor folding kinetics of G4s and provides easy access to the basic structural constitution of G4s. Much of the pioneering work on G4 folding has been conducted or supported by CD spectroscopy. [75,82,95,100-102] CD spectroscopy gives a simple and characteristic readout that can be used to distinguish between an unfolded state and different folded conformations [parallel: ~260-265 nm (+), anti-parallel: ~295 nm (+), hybrid: ~265/295 nm (+)]. [103] CD spectroscopy is mainly indifferent to aspects of non-canonical polymorphism such as parallel G-register shifted or spare-tire isomers. However, there are now sophisticated analysis tools available that allow deconvolution of complex CD spectra from polymorphic G4s. [103,104] Time-resolved CD spectroscopy was used in combination with a laser-induced temperature jump to monitor G4 folding down to a millisecond timescale. [77] UV spectroscopy is suited as a readout in a similar way, but is restricted to a simple "folded/non-folded" monitoring with characteristic changes at 295 nm. [105][106][107][108]
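The qualitative CD readout described above can be illustrated with a coarse classifier. This is only a sketch under stated assumptions: the wavelength windows (260-265 nm and 290-300 nm), the zero threshold, the function name and the example values are all hypothetical choices for illustration, and real spectra of polymorphic G4s require the deconvolution tools cited in the text.

```python
def classify_cd_signature(ellipticity_by_nm: dict[float, float],
                          threshold: float = 0.0) -> str:
    """Assign a coarse topology label from positive CD bands.

    Keys are wavelengths in nm, values are ellipticities in arbitrary units.
    """
    def max_in_window(low_nm: float, high_nm: float) -> float:
        values = [v for nm, v in ellipticity_by_nm.items() if low_nm <= nm <= high_nm]
        return max(values) if values else float("-inf")

    positive_near_262 = max_in_window(260.0, 265.0) > threshold
    positive_near_295 = max_in_window(290.0, 300.0) > threshold

    if positive_near_262 and positive_near_295:
        return "hybrid"
    if positive_near_262:
        return "parallel"
    if positive_near_295:
        return "anti-parallel"
    return "unassigned"


# Made-up example values, not measured data.
print(classify_cd_signature({262.0: 5.0, 295.0: -0.5}))  # parallel
print(classify_cd_signature({262.0: -1.0, 295.0: 3.0}))  # anti-parallel
print(classify_cd_signature({265.0: 2.0, 295.0: 2.0}))   # hybrid
```

Such a rule cannot separate G-register or spare-tire isomers of the same topology, which is consistent with the limitation noted above.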

| Mass spectrometry
Mass spectrometry (MS) gives orthogonal insight into aspects of G4 folding that are not directly available with spectroscopy, while MS itself does not provide direct structural information. [109] MS has been used for the evaluation of G4-ligand binding and of the folding pathways of G4s by analysis of cation binding stoichiometry. [110,111] A precise evaluation of K⁺ binding to the DNA strand is crucial to understand the enthalpic and entropic contributions that affect G4 folding in thermal experiments. [111] In a recent fascinating paper, Gabelica et al. have presented an approach for the detection of mass-resolved CD spectra of G4 forming oligonucleotides. [112] This powerful method in combination with advanced computational methods for the deconvolution of CD spectroscopic and mass spectrometric parameters will enable new perspectives on G4 folding. [104,113-116]

| Single molecule spectroscopy
Force spectroscopy/microscopy can be used for directed force manipulations on G4 oligonucleotides accomplished with magnetic [71,117-119] or optical beads/tweezers, [120][121][122] also in combination with fluorescence detection. [123][124][125] Sugiyama et al. have demonstrated the observation of G4 folding in DNA nanostructures using high-speed atomic force microscopy (AFM). [72,126-128] Förster resonance energy transfer (FRET) yields a very specific readout for two-site distances, which requires, however, the incorporation of dye labels that potentially bias the G4 structural integrity. [75,129] While FRET provides no direct information on the G4 conformation, it allows a very selective observation of folding trajectories in single molecule experiments. [99,130,131] The selective observation of different FRET states makes this method suitable for application in high molecular weight complexes, in particular for the investigation of G4 unwinding by helicases. [64,65,132-134]

2.2.4 | Nuclear magnetic resonance
Nuclear magnetic resonance (NMR) spectroscopy is a powerful and versatile method to study G4 structural dynamics at atomic resolution. [81,135] Substantial information on the number of states adopted by a given G4 can already be read off in one-dimensional NMR spectra, as the spectral regions for imino hydrogen atoms involved in Watson-Crick, Hoogsteen or i-motif interactions are clearly distinct. Counting the number of resolved imino hydrogen atoms often already provides a direct readout of multiple, polymorphic states in slow conformational exchange, implying at least millisecond lifetimes of these states. [136] The quantification of arising imino ¹H signals in time-resolved experiments was used to study the folding of RNA and DNA G4s [33,69,83,85] and DNA i-motifs, [137] respectively. To access the rich and complex structural information of NMR spectra, however, higher dimensional homo- or heteronuclear correlated spectra are required since signal resolution decreases with increasing molecular size. [138][139][140] Nucleic acids in particular show an inherently poor spectral dispersion because only four different nucleobases constitute the basic polymer building blocks. [136]

3 | FOLDING ENERGY LANDSCAPE AND FOLDING PATHWAYS

| Ensemble effects
From NMR and CD spectroscopic G4 folding experiments, we derive a conformational energy landscape that depicts the entire experimentally observable conformational space of G4 DNAs. To some extent, this landscape is simplified, compared to theoretical landscapes predicted by molecular dynamics (MD) simulations that potentially aim to represent the complete phase space. [70,141] In spectroscopic experiments only macrostates, represented by a particular conformational state (or structural fold) of the DNA strand of a certain lifetime, can be observed. The multitude of subordinate microstates that contribute to, for example, the conformational entropy of a macrostate are thus a subject of MD simulations rather than of experimental evidence. Nevertheless, also in experimentally derived kinetic models conformational states should be referred to as ensembles (e.g., transitory ensemble), when the range of involved microstates exceeds a structurally clearly defined macrostate. [142] The description of folding pathways can be experimentally achieved on a single molecule level or in an ensemble average. In NMR, for example, the observation of an ergodic ensemble [143] (100 μM NMR sample ≈ 300 nmol DNA ≈ 10¹⁷ folding events) leads to an extensive mapping of the energy landscape. In CD spectroscopy, this number is lower by a factor not smaller than 10⁻³. In contrast to other biomolecules, in the special case of chromosomal DNAs the folding of a particular G4 sequence is not an ensemble process, but a single event in each living cell. The assumption that G4 folding is an infrequent event is based on the fact that G4 folding does not happen spontaneously in double stranded DNA. A presumable requirement for G4 folding in any chromosomal region different from the single stranded telomeres is negative superhelicity, which is induced during transcription or replication. [39,144-147] Interestingly, G4 folding is also associated with accessible chromatin states and therefore might even precede transcription. [148] Regardless of other cellular triggers for G4 folding, we try to give a rough estimate for the relevant rates of G4 folding events with respect to transcription. We assume a total intracellular concentration of ~10⁵ mRNAs per cell, [149,150] with a median copy number of ~17 mRNAs per gene (in comparison: the protein concentration is ~10⁹ per cell, [151] with ~50,000 copies per protein [152,153] ). The typical intracellular lifetimes for mRNAs are longer than several hours, [152][153][154] but especially regulatory mRNAs have significantly shorter lifetimes, [154] leading to a potentially higher transcription rate of certain G4 mediated genes. However, these calculations still lead to only very few (approx. <100) potential folding events of a distinct promoter G4 per hour per cell. Hence, the total number of transcription initiated G4 folding events can hardly be called a dynamic ensemble. This situation changes if tissues are considered: in a tumor tissue, for example, ~10⁸ to ~10⁹ cells are observed per gram of tissue, [155] which adds up to a tremendous number of independent G4 folding processes in vivo.
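The orders of magnitude quoted in this section can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, the 300 nmol amount and the bound of 100 folding events per hour per cell are taken from the text, and multiplying the per-cell bound by the cell count assumes that all cells in the tissue behave alike, which is a simplification.

```python
AVOGADRO = 6.02214076e23  # particles per mole (exact SI value)


def molecules_from_nmol(amount_nmol: float) -> float:
    """Convert an amount of substance in nmol into a number of molecules."""
    return amount_nmol * 1e-9 * AVOGADRO


def tissue_events_per_hour(events_per_cell_per_hour: float,
                           cells_per_gram: float) -> float:
    """Scale a per-cell event rate to one gram of tissue (identical cells assumed)."""
    return events_per_cell_per_hour * cells_per_gram


# NMR ensemble: 300 nmol DNA as quoted in the text.
nmr_ensemble = molecules_from_nmol(300.0)
print(f"{nmr_ensemble:.2e}")  # 1.81e+17, i.e. on the order of 10^17

# Upper bound of 100 events per hour per cell, 10^8 to 10^9 cells per gram.
low = tissue_events_per_hour(100.0, 1e8)
high = tissue_events_per_hour(100.0, 1e9)
print(f"{low:.0e} to {high:.0e} events per hour per gram (upper bound)")
```

The comparison makes the point of the paragraph explicit: a single cell contributes at most on the order of 10² events per hour, whereas an NMR sample averages over roughly 10¹⁷ molecules.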

| A view from computation
In stark contrast to funnel-like energy landscapes that describe folding trajectories of proteins, G4-forming oligonucleotides exhibit a rough conformational energy landscape (Figure 3, I). [70,141] Kinetic partitioning has been studied with MD simulations, which revealed possible involvement of hairpins, [142,156] triplexes [157] and strand slipped conformations. [29,70,158] Stable G-hairpins, [159] cross or parallel G-hairpins [142] and newly discovered pseudocircular G-hairpins [160] represent interesting possible waymarks along the folding pathways. Derived from the computational picture of the conformational energy landscape, the underlying molecular folding mechanisms with competing trajectories are described as a kinetic partitioning mechanism, as opposed to a funnel-like mechanism. [70]

3.3 | Folding kinetics and rate constants
The kinetic partitioning populating parallel folding pathways causes complex and multiphasic folding kinetics (Figure 3, II). The kinetic rates for G4 folding are vastly different [86] and span a range from sub-seconds [77] to minutes. [161] The kinetics report on structural rearrangements that are driven by enthalpic and entropic contributions, leading to the observation of different temperature dependencies in an Arrhenius analysis. Linear Arrhenius behavior with either positive [100] or negative [162] activation energies, as well as non-linear [86] Arrhenius behavior, have been reported. [33] Since the apparent activation energies report only on the observable rate constants, it is conceivable that the overall folding kinetics are determined by different rate limiting steps in the multiphasic folding mechanism. Notably, the kinetics are highly influenced by the experimental setup: temperature-induced folding is typically observed to be orders of magnitude faster than, for example, isothermal K⁺-induced folding. [69,77] The different kinetics can be explained by differences in the energetic nature of the unfolded states, that is, different entropy contributions, or the presence of pre-folded states. [68,70,111] In general, the comparably slow folding kinetics of DNA G4s reflect the consequences of the kinetic partitioning mechanism, including the sampling of misfolded macrostates. [70] The pronounced polymorphism found in DNA G4s is less abundant in RNA G4s and consequently the folding kinetics are remarkably faster. [83,86] A main contribution to the diverging kinetics of DNA versus RNA G4s might be the slightly favored anti-glycosidic conformation of RNA. [83] Assuming a random distribution of syn- and anti-glycosidic conformations results in 2¹² = 4096 possible microstates for the 12 guanosines of an unfolded DNA chain able to fold into a 3-tetrad G4, while an RNA chain does not require syn/anti flipping. [70] At this point, we wish to compare the folding kinetics with the biologically relevant timescale for DNA G4 folding. In the context of transcription regulation, the crucial temporal parameter is the transcription rate of RNA polymerase II (RNAPII). The reported RNAPII rates are estimated at ~1.1-4.3 kilobases per minute (~15 ms per nt on average) [163,164] in eukaryotic cells. This rate can be significantly slowed down to ~6 bases per second (~170 ms per nt), [164] which is in line with observed dwell times of several seconds for molecules in transcription domains. [165]
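To relate the transcription rates above to a per-nucleotide timescale, the unit conversion can be written out explicitly. This is a minimal sketch: the helper names are hypothetical, and the 27-nucleotide length used in the last line is borrowed from the cMYC sequence discussed later in this review purely as an illustrative input.

```python
def ms_per_nt_from_kb_per_min(rate_kb_per_min: float) -> float:
    """Convert an elongation rate in kilobases per minute to milliseconds per nucleotide."""
    nt_per_second = rate_kb_per_min * 1000.0 / 60.0
    return 1000.0 / nt_per_second


def ms_per_nt_from_nt_per_s(rate_nt_per_s: float) -> float:
    """Convert an elongation rate in nucleotides per second to milliseconds per nucleotide."""
    return 1000.0 / rate_nt_per_s


def transit_time_s(n_nt: int, ms_per_nt: float) -> float:
    """Time in seconds for the polymerase to traverse n_nt nucleotides."""
    return n_nt * ms_per_nt / 1000.0


fast = ms_per_nt_from_kb_per_min(4.3)    # upper end of the quoted range
slowed = ms_per_nt_from_nt_per_s(6.0)    # slowed-down rate quoted in the text
print(round(fast, 1), round(slowed, 1))

# Illustrative only: traversal of a 27 nt stretch at the slowed-down rate.
print(round(transit_time_s(27, slowed), 1), "s")
```

At the slowed-down rate, a stretch of a few tens of nucleotides is traversed within seconds, which is the timescale against which the folding rates discussed in this section (sub-seconds to minutes) have to be compared.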

| Loop effects
It is self-evident that the constitution of the loops restrains the flexibility of any advanced G4 folding states and thus the loops make major contributions to the folding dynamics. The effects of loop lengths on stability and preferences for particular conformations have been dissected earlier. [166][167][168] The stable formation of parallel conformations is guided by single nucleotide lateral loops ($L^{I}$ + $L^{III}$), while the length of the proximal loop is less crucial. [169][170][171][172] Interestingly, the proximal loop was found to be the main effector for the folding kinetics of parallel G4s. Nguyen et al. found a nearly linear dependence of the folding time on the length (1-25 nts) of the proximal loop ($L^{II}$). [161] However, if the loop sequence is redesigned to form a duplex stem loop (internal hairpin), the folding time re-accelerates as compared to an unstructured loop of the same length. In a recent study, we have also observed decelerated kinetics for a 6 nt long loop, as compared to a single nucleotide loop in a cMYC spare-tire isomer. [33] This conformation showed non-Arrhenius kinetics that highlight the impact of the flexibility of the proximal loop for parallel G4s.

net-sum for the orientation vectors of all residues. Under real, nonideal sample conditions the deviations from the ideal definition of unfolded states are highly determined by the experimental design. In any case, the resulting unfolded ensembles will likely feature a nonrandom distribution of conformational microstates resulting in a preferential pre-orientation or the formation of pre-folded states.

| Pre-folded states
How much the existence of different pre-folded states affects the folding pathways has been demonstrated by Frelih et al., [68] who examined the consequences of temperature and pH on the folding of human telomeric G4 sequences. They find different defined pre-folded structures even in the absence of cations that guide the folding towards either a hybrid-1 or a hybrid-2 final conformation. Similar observations for the formation of pre-folded states, presumably hairpin-like conformations, are also reported for the bimolecular Oxytricha nova telomeric G4 [173] as well as for the EGFR [25] and cKit [174] promoter G4s. The hydrogen bond interactions in pre-folded states mark energy barriers that need to be overcome if the pattern is non-productive for the final fold. While the aforementioned sequences show only partial folding patterns that are also retained in the final G4 folds, the G4-forming sequence of the WNT1 promoter adopts a completely aligned, stable hairpin conformation under K⁺-free conditions. [175] Though this hairpin cannot be defined as a pre-folded state but rather as an alternative fold, it gives crucial insight into the consequences of starting the folding as a transition from a base-paired state to the G4 rather than from an unfolded or partially folded state. The folding kinetics in the reported study were dominated by the unfolding rate of the hairpin and hence are rather slow (82 min⁻¹).

| Intermediate states
We can imagine any kind of intermediate state being either along a folding pathway that leads to the final G4 conformation (on-pathway) or being along a second folding pathway that reaches an impasse or leads to an alternative G4 conformation (off-pathway). For on-pathway trajectories, several intermediate states during G4 folding have been proposed and predicted, most prominent among them hairpins and triplexes. [142,176] While the spectroscopic evidence for hairpin pre-folded states gives a good representation of a trapped intermediate state, the transient formation of intramolecular G-triplexes remains more elusive. [177][178][179] The quest to detect triplex folding intermediates is mainly driven by the assumption of a consecutive, sequential assembly of G-tracts: single strand → hairpin → triplex → quadruplex. This kind of stepwise strand recruitment folding mechanism was first discussed for inter/tetramolecular G4s, where this stepwise assembly is intuitive. [180][181][182] Predictions from molecular dynamics simulations give insight into the feasibility of intramolecular triplex formation, [70,142,157] but experimental evidence for their transient formation remains ambiguous. [127] Intramolecular parallel, anti-parallel and hybrid G4s have different steric requirements for the orientation of the strands and the resulting interconnecting loops. Even for a rigorous sequential strand recruitment, several options are possible that lead to different intermediates. While triplex formation for a hybrid G4 resulting from a hairpin and a third strand can be a reasonable arrangement, for an anti-parallel (chair) G4 the merging of two hairpins is an easy way to bypass triplexes. [183] The relative strand orientation does not necessarily imply a hydrogen bonded, rigid tetrad formation. The assembly of a quadruplex can also involve a zipping along tetrad formation of the G-tracts into the final G4 structure. [110] This results in n − x possible G-tetrad intermediates for an n-layered G4. Gabelica et al. demonstrated with mass spectrometry that the recruitment of cations occurs stepwise. [110] Their detection of a single K⁺-bound state is thus indicative of a 2-tetrad (or tetrad + triad) [111] intermediate state, where the cation is coordinated between the G-planes.

| Misfolded states
During folding, different kinds of misfolded states can be generated that are associated with the different aspects of structural polymorphism in G4s, and what is called a misfolded state depends on the definition. Misfolding during folding can lead to off-pathway intermediates that need to unfold to form a stable G4 structure. Misfolding can also be defined as the formation of different stable or metastable G4 conformations. An interesting example for the latter can be seen in human telomeric tandem repeats. The uniformity of the repeating G-tracts can lead to kinetic frustration, given a random nucleation of G4s without positional preference. [120,184,185] On the other hand, the formation of G-wires, for example in Tetrahymena telomeric repeats, can result in different species of higher-order G4 structures. [186] These structures are polymorphic, but with a well-defined periodicity, which makes the term misfolded inappropriate.

| Metastable states
Misfolding can lead to the formation of fully folded intermediates that are long-lived (exceeding biologically relevant timescales), which we define as metastable states in the following. Examples for kinetic products of G4 folding away from thermal equilibrium have been reported with direct NMR-spectroscopic evidence for topological isomers of the telomeric G4 (hybrid-1 and -2), [85] for G-register shifted [69] and for spare-tire [33] isomers of the cMYC G4. In all instances, the subsequent refolding into thermal equilibrium is on a timescale of several hours at ambient temperatures.

| Coexisting states
The refolding dynamics between isomeric G4 conformations might as well be reversible, instead of (nearly) depleting an initial kinetic product. The G4 sequence from the human telomerase promoter (hTERT) adopts a hybrid and a parallel conformation in a 0.4:0.6 ratio. [24] Nußbaumer et al. [187] have investigated the refolding dynamics with time-resolved heteronuclear 2D NMR after incorporation of ¹³C building blocks. We have also investigated the refolding dynamics between the hybrid and the parallel conformation and found consistently slow kinetics (~0.7 h⁻¹) proceeding towards a thermal reequilibration of the populations, irrespective of the starting point (100% hybrid or 100% parallel). [69] This example shows that coexisting states do communicate and are in dynamic exchange, although on a slow timescale.
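The statement that reequilibration is independent of the starting point follows directly from a reversible two-state model, which can be sketched as follows. The assumptions here are ours and only illustrative: the exchange is treated as a simple hybrid ⇌ parallel equilibrium, the quoted ~0.7 h⁻¹ is interpreted as the observed relaxation rate (the sum of forward and backward rate constants), and the 0.4:0.6 ratio is used as the equilibrium population. The function names are hypothetical.

```python
import math


def parallel_fraction(t_h: float, p0: float,
                      p_eq: float = 0.6, k_obs: float = 0.7) -> float:
    """Fraction of the parallel conformer after t_h hours, starting from fraction p0.

    Two-state relaxation: p(t) = p_eq + (p0 - p_eq) * exp(-k_obs * t).
    """
    return p_eq + (p0 - p_eq) * math.exp(-k_obs * t_h)


def microscopic_rates(p_eq: float = 0.6, k_obs: float = 0.7) -> tuple[float, float]:
    """Split k_obs into hybrid->parallel and parallel->hybrid rate constants (per hour)."""
    k_hybrid_to_parallel = k_obs * p_eq
    k_parallel_to_hybrid = k_obs * (1.0 - p_eq)
    return k_hybrid_to_parallel, k_parallel_to_hybrid


# Both pure starting states relax towards the same equilibrium population.
for t in (0, 1, 3, 6, 12):
    from_hybrid = parallel_fraction(t, p0=0.0)
    from_parallel = parallel_fraction(t, p0=1.0)
    print(t, round(from_hybrid, 3), round(from_parallel, 3))
```

In this picture the same observed rate governs both directions of reequilibration, so identical apparent kinetics from 100% hybrid and 100% parallel starting points are expected rather than coincidental.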

| PREVALENCE OF G4 ELEMENTS WITH NON-CANONICAL FOLDING ISOMERISM IN GENE PROMOTERS
The vast majority of G4 forming sequences that are found in gene promoters feature the possibility to undergo G-register or spare-tire exchanges. [9,30,31,188] In the cMYC G4 both kinds of folding isomerism are found in a compact sequence segment that can be deconvoluted in structural and kinetic aspects. However, in many cases the promoter regions are convoluted with multiple interdependent G4 elements, which complicates a full disentanglement of the folding dynamics.

| cMYC promoter
The G4 forming sequence (Figure 4, I) derived from the nuclease hypersensitive element III₁ (NHE-III₁), −142 to −115 nucleotides upstream of the P1 promoter of the human cMYC oncogene, is a prime example for a polymorphic G4 ensemble and its structure has been studied extensively over the past two decades. [20,45,145,192-196] The 27 nucleotide long sequence features five G-tracts (1-5), of which three G-tracts have four consecutive G residues. A total of three stable parallel spare-tire isomers involving strands 1234, [145] 1245 [195] or 2345 [194] have been reported with a subset of four different G-register isomers [30,162,192] for the majorly populated spare-tire isomer 2345 (Figure 4, II). [145,162,192,194,195] Even though the overall structure of the G4 core architecture is very much alike, the different lengths of the lateral and proximal loops cause diverging thermal stabilities for all conformations. [162,197] The resulting conformational landscape is determined by two attributes with a vastly different impact on folding kinetics. While the separation into spare-tire isomers requires recruitment of differently distanced parts of the DNA chain, the division in G-register isomers is topologically more indifferent.

FIGURE 4 Non-canonical polymorphism in promoter G4s. (I) wt-Promoter sequences [189] of cMYC [45] (PU27), BCL2 [169,190] (P1G4 and PU39), hTERT [24,191] (putative quadruplex sequences 1, 2, 3) and cKIT [174] (kit2, kit* and kit1). [8] (Adapted with permission from Grün et al. [69] Copyright 2020 American Chemical Society. Adapted with permission from Grün et al. [33] Copyright 2021 The Authors. Published by American Chemical Society)
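The combinatorics behind spare-tire isomerism can be made explicit by enumerating which four of the five G-tracts could build the G4 core. This is a small illustrative sketch; the variable names are hypothetical, and the set of reported isomers is taken from the paragraph above.

```python
from itertools import combinations

G_TRACTS = "12345"  # the five G-tracts of the cMYC sequence, numbered 5' to 3'

# Every choice of four tracts is a formal spare-tire candidate.
candidates = ["".join(combo) for combo in combinations(G_TRACTS, 4)]
print(candidates)  # ['1234', '1235', '1245', '1345', '2345']

# Isomers reported as stable parallel G4s in the text.
reported = {"1234", "1245", "2345"}
not_reported = [c for c in candidates if c not in reported]
print(not_reported)  # ['1235', '1345']
```

The enumeration only counts formal possibilities; whether a candidate is populated depends on the loop lengths it implies, as discussed for the lateral and proximal loops.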
One may thus assume (Figure 5) three major wells representing the spare-tire isomers, where the well for conformation 2345 is fine-structured into four competing basins of attraction that represent the G-register isomers. The possibility to fold towards a fourfold basin with an increased conformational space thus compensates for the entropy penalty during folding, resulting in an accelerated concurrent overall folding for 2345. [76] We denote the G-register isomers of 2345 with a two-digit abbreviation that refers to G-tracts 3 (first digit) and 5 (second digit) and indicates whether the respective register is shifted in the 5′- or 3′-direction, resulting in conformations 33, 35, 55 and 53. [30,69,76] Out of the four possible G-register isomers, only two are predominantly populated in the wild-type sequence at thermal equilibrium, namely conformations 33 and 53. [30,198] While 33 is thermodynamically slightly preferred, we have observed a kinetic overshoot for 53 in K+-induced folding. The subsequent refolding between these states takes place on a timescale of hours and proceeds via transitory ensembles that do not require a complete unfolding-refolding mechanism.
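A kinetic overshoot of the kind described for 53 can be reproduced qualitatively with a minimal three-state scheme: an unfolded pool U folds in parallel into a fast-forming state A and a slower-forming but more stable state B, and A and B interconvert slowly without passing through U. The Python sketch below is an illustration under these assumptions, not the model used in the cited studies. Identifying A with 53 and B with 33 is an interpretive choice, and all rate constants are hypothetical values chosen only so that interconversion occurs on a timescale of hours.

```python
def simulate(k_ua, k_ub, k_ab, k_ba, t_end=24.0, dt=0.001):
    """Integrate U -> A, U -> B, A <=> B with explicit Euler steps.

    Rates are first-order constants in 1/h, times in h.
    Returns a list of (time, U, A, B) tuples; populations start at U = 1.
    """
    u, a, b = 1.0, 0.0, 0.0
    trajectory = [(0.0, u, a, b)]
    n_steps = int(round(t_end / dt))
    for step in range(1, n_steps + 1):
        du = -(k_ua + k_ub) * u                  # U is consumed by both channels
        da = k_ua * u - k_ab * a + k_ba * b      # A: fast formation, slow exchange
        db = k_ub * u + k_ab * a - k_ba * b      # B: slow formation, slow exchange
        u += du * dt
        a += da * dt
        b += db * dt
        trajectory.append((step * dt, u, a, b))
    return trajectory


def overshoot(trajectory):
    """Return (peak population of A, time of the peak, final population of A)."""
    t_peak, _, a_peak, _ = max(trajectory, key=lambda row: row[2])
    a_final = trajectory[-1][2]
    return a_peak, t_peak, a_final


if __name__ == "__main__":
    # Hypothetical rate constants (1/h); not fitted to any experimental data.
    traj = simulate(k_ua=6.0, k_ub=2.0, k_ab=0.3, k_ba=0.2)
    a_peak, t_peak, a_final = overshoot(traj)
    print(f"A peaks at {a_peak:.2f} after {t_peak:.2f} h, then relaxes to {a_final:.2f}")
```

With these illustrative constants, the population of A transiently exceeds its long-time value, set by the ratio `k_ba / (k_ab + k_ba)`, before relaxing towards it. This is the signature of an overshoot under kinetic control.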
It was also shown that the remaining two competing wells with spare-tire isomers 1234 and 1245 can be populated if any of the randomly sampled initial contacts or geometries are trapped sufficiently long to result in macrostates that can undergo further kinetic steps towards other spare-tire isomers. [33] The folding kinetics of each conformation are highly dependent on the lifetimes of potential on-/off-pathway intermediates. [33,95,110] The experimental data from our recent publication, [33] in line with observations in other experimental studies, [95,110] suggest that the stability of a major possible intermediate is determined by the length of the proximal loop (Figure 6). These experimental findings support the idea that an initial collapse into lateral hairpins might be a reasonable first step down the folding energy landscape. The findings of several pre-folded hairpins in other G4-forming sequences are in line with the conclusions for cMYC G4 folding. [25,173,174] We have described this intermediate as a misfolded state during folding of the spare-tire isomer 1234 based on CD- and NMR-spectroscopic observations. [33] While the stable G4 [...] anti-parallel CD-signature. [...] [110] which prompted the proposal of a 2-tetrad chair conformation. The low stability of the resulting single-nucleotide lateral loops in the case of a closed attical G-tetrad is predicted from simulations [141,142,156] and further supports this structural model.
[8] (Adapted with permission from Grün et al. [33] Copyright 2021 The Authors. Published by American Chemical Society)

| BCL2 promoter
In the BCL2 promoter the situation is more complex than in cMYC, since a total of three G4 elements can be found here, each of which shows inherent polymorphism (Figure 4, I; P32 [199]: −1906 to −1875, not shown; Pu39 [169,200]: −1489 to −1451; P1G4 [190]: −1439 to −1412). [201,202] Even though each G4 element (Pu39 or P1G4; for P32 no structural data are available) shows an increased number of isomeric conformational substates, the G4 elements are still well separated, which in principle allows an isolated investigation of each G4 ensemble. In particular, the Pu39 G4 element can adopt different spare-tire isomers incorporating either G-tracts 2345 (hybrid) [200] or 1245 (parallel). [190] The folding of Pu39 has been investigated with force spectroscopy, and it was found that BCL2-2345 (7 nt long proximal loop) is kinetically favored, despite BCL2-1245 (13 nt long proximal loop) being thermally more stable. [71,201] The findings further suggest the involvement of additional states during folding, presumably BCL2-1234, which had not been described structurally before.

| hTERT promoter
The G4-forming sequence in the human telomerase reverse transcriptase (hTERT) promoter consists of an extended G-rich sequence (−20 to −110 nt upstream of the TSS; Figure 4, I) that can form up to three concurrent G4s, which adopt a subset of G-register isomers. [24,191,203,204] It was initially proposed that this G4 element consists of two stacked G4s with an unprecedentedly long hairpin loop formation (Figure 4, I: PQS2 III/IV + PQS3 I/II); [204] newer models for the entire G4 element, however, presume a stacked triple parallel G4 arrangement. [191,203] In the only available study on the folding dynamics of possible higher-order G4 structures, published by Selvam et al., possible folding pathways in hTERT have been investigated with a mechanical unwinding approach. [205] The presented findings for the complex unfolding patterns imply multiple pathways and associative interactions between neighboring G4s.
The isolated 5′-proximal G4 (Figure 4, I: PQS1) was shown to coexist in a hybrid and a parallel conformation. [24] While the hybrid conformation is thermodynamically slightly favored, the folding kinetics for the parallel conformation are ~1.6× faster. [69] The refolding dynamics between the hybrid and parallel conformations are slow (~0.7 h⁻¹). [69,187] These findings prompt the hypothesis of a potential cooperative kinetic mechanism that guides the G4 element towards the "correctly" (parallel) folded higher-order G4 structures in hTERT.
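To give a feeling for these numbers, the short Python sketch below converts a rate into a time scale and a rate ratio into an initial population split. It rests on two assumptions that are not stated in the text: that the refolding rate of about 0.7 h⁻¹ behaves as a first-order (single-exponential) rate constant, and that the hybrid and parallel conformations form from a common unfolded pool through parallel first-order channels. The function names are hypothetical.

```python
import math


def relaxation_time(rate_per_hour):
    """Time constant tau = 1/k of a single-exponential process, in hours."""
    return 1.0 / rate_per_hour


def half_life(rate_per_hour):
    """Half-life t_1/2 = ln(2)/k of a single-exponential process, in hours."""
    return math.log(2.0) / rate_per_hour


def fast_channel_fraction(rate_ratio):
    """Initial fraction entering the faster of two parallel first-order channels.

    `rate_ratio` is k_fast / k_slow; the fraction is k_fast / (k_fast + k_slow).
    """
    return rate_ratio / (1.0 + rate_ratio)


if __name__ == "__main__":
    k_refold = 0.7   # 1/h, value quoted in the text; first-order behaviour is assumed
    ratio = 1.6      # parallel folds about 1.6x faster than hybrid (from the text)
    print(f"tau   = {relaxation_time(k_refold):.2f} h")     # tau   = 1.43 h
    print(f"t_1/2 = {half_life(k_refold):.2f} h")           # t_1/2 = 0.99 h
    print(f"parallel fraction under kinetic control = {fast_channel_fraction(ratio):.2f}")  # 0.62
```

Under these assumptions, roughly three fifths of the molecules would initially fold into the parallel conformation, and the subsequent redistribution towards the hybrid conformation would take on the order of an hour.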

| cKIT promoter
A situation similar to the hTERT promoter can be found in the promoter region of cKIT (Figure 4, I), [206–208] with three highly polymorphic nearby G4 structures (cKIT-1, [21] cKIT*, [25,209] cKIT-2, [105,210,211] in this particular order in 3′-5′ direction). The adjacent G4s cKIT* and cKIT-2 show crosstalk and mutually influence each other's stability. [209] cKIT-2 forms a parallel G4 and features complex folding kinetics with branched pathways, long-lived intermediates and clear evidence for the formation of pre-folded states even in the absence of monovalent cations. [105] In contrast, cKIT* forms a two-tetrad anti-parallel (chair) G4, which folds significantly faster than cKIT-2 (and presumably cKIT-1). [110,174] While it is reasonable to assume that the faster folding of cKIT* might guide the folding of the adjacent G4s, it is worth noting that the folding of cKIT* itself is highly influenced by the constitution of the 3′-tail and thus depends on a folding cooperativity with cKIT-1. Hence, a mutual dependence of the folding dynamics of each G4 in higher-order G4 structures seems very likely.

| DISCUSSION AND FUTURE CHALLENGES IN THE FIELD
We have herein presented an overview of experimental studies that shed light on different aspects of the complex folding dynamics of G4s, in particular DNA G4s from gene promoter sequences. To our understanding, it is not expedient to draw conclusions about generalized folding pathways, or even recurrent mechanistic patterns, across different G4-forming sequences.
Attempts to draw simplified folding trajectories will always fall short of the conformational diversity of possible macrostates during folding. There are plenty of ways to fold a G4, and it is likely that many of them contribute to a realm, or corridor, towards folded G4s. However, not all routes lead to Rome: some are wrong tracks, irreversible pathways that lead to misfolded states, or short tracks that lead to kinetically favored states. Every G4 folding event in chromosomal DNA is basically a single take, unlike the ensemble average of folding events typically investigated in most spectroscopic experiments. There might be cellular regulation mechanisms that guide G4 folding and refolding. Up to now, too little is known about how the folding dynamics are affected by the presence of G4-interacting proteins and about the molecular mechanisms of G4 folding in a genomic context. It is conceivable that G4 folding varies during different stages of the cell cycle or that potential G4 chaperones mediate refolding of different G4 conformations. There is now growing evidence that the non-canonical polymorphism of G4 structures is involved in regulation cascades of proteins that differentiate between distinct G4 conformations. [59] Nucleolin is one well-established example of the discrimination of different loop isomers in parallel G4s, [57,58] and G4-unwinding helicases likewise show vastly different unwinding efficiencies on different topological isomers. [38] Furthermore, the epigenetic modulation of chromatin states is now strongly linked to G4 formation and their influence on genetic stability. [39,41] G4 folding in this context might not only be influenced locally but could also be mechanically mediated from distant sites in gene promoters through the DNA polymer chain. [212] Thus, biophysical studies on systems of increasing molecular size, more complex than single-stranded oligonucleotides, will be required to elucidate the altered folding dynamics of G4-forming sequences.