Centromere diversity: How different repeat‐based holocentromeres may have evolved

In addition to monocentric eukaryotes, which have a single localized centromere on each chromosome, there are holocentric species, with extended repeat‐based or repeat‐less centromeres distributed over the entire chromosome length. At least two types of repeat‐based holocentromeres exist, one composed of many small repeat‐based centromere units (small unit‐type), and another one characterized by a few large centromere units (large unit‐type). We hypothesize that the transposable element‐mediated dispersal of hundreds of short satellite arrays formed the small centromere unit‐type holocentromere in Rhynchospora pubera. The large centromere unit‐type of the plant Chionographis japonica is likely a product of simultaneous DNA double‐strand breaks (DSBs), which initiated the de novo formation of repeat‐based holocentromeres via insertion of satellite DNA, derived from extra‐chromosomal circular DNAs (eccDNAs). The number of initial DSBs along the chromosomes must be higher than the number of centromere units since only a portion of the breaks will have incorporated eccDNA at an appropriate position to serve as future centromere unit sites. Subsequently, preferential incorporation of the centromeric histone H3 variant at these positions is assumed. The identification of repeat‐based holocentromeres across lineages will unveil the centromere plasticity and elucidate the mechanisms underlying the diverse formation of holocentromeres.

Most "regional" monocentromeres are formed by blocks of CENH3-containing nucleosomes intermingled with canonical ones. [5] These blocks are termed "centromere units"; they possess either centromere- or noncentromere-specific DNA and all essential features to form a functional kinetochore protein complex. There are one or two constitutive centromere-associated network (CCAN) complexes per CENH3 nucleosome according to structural models of the centromeric chromatin-to-microtubule connection (reviewed in ref. [6]). The CCAN is a subcomplex of the kinetochore that localizes to the centromere throughout the cell cycle. The "regional centromeres", representing the majority of centromeres studied so far, are often assembled on repetitive DNAs and flanked by heterochromatin domains. [7] Also, ChIP-seq experiments in different species, such as soybean, [8] Arabidopsis, [9] maize, [10] and Juncus effusus, [11] demonstrated that regional monocentromeres consist of multiple neighbouring CENH3-enriched subdomains forming centromere units. The most direct proof that monocentric regional centromeres are composed of repeated centromere units is the observation of telocentric chromosomes, formed by centromere fission of a (sub)metacentric chromosome (e.g., ref. [12]). In addition, an elegant in vitro experiment has shown that dimers of the recombinant centromere protein CENPC bind stably to two nucleosomes, permitting further assembly of all other kinetochore subunits with relative ratios closely matching those of endogenous human kinetochores. [13] In contrast, holocentric chromosomes seem to be composed of chromosome-wide distributed centromere units, as first suggested by Franz Schrader, the pioneer of holocentromere research. [14,15] In addition, so-called metapolycentric chromosomes are known, which harbor several distinct clusters of adjacent centromere units, yet restricted to the extended primary constrictions of the chromosomes, as in pea, [16,17] fire ants, [18] and the beetle Tribolium castaneum. [19]
In monocentric species, new centromeres can appear during evolution at an ectopic chromosomal location. A new centromere often tends to form near the progenitor centromere, [8,20,21] and frequently, the establishment of a new centromere is accompanied by inactivation or loss of function of the old centromere. Initially, newly formed centromeres do not usually contain repeat DNAs but mature gradually through the acquisition and accumulation of repeats. As soon as one or a few repeats invade the novel centromere, their accumulation can be achieved by extended gene conversion during DSB repair.
If no major structural rearrangements (e.g., inversions) accompany the shifting of the centromere position, [8] the phenomenon is called centromere repositioning. The mechanism(s) underlying this process is/are not yet completely understood (for review, see ref. [22]).
Because holocentric species are frequently found within phylogenetic lineages possessing monocentric chromosomes, holocentric chromosomes are assumed to be derived from monocentric ones. These one-way transition events occurred multiple times in distant lineages. [23,24] As a consequence of independent evolution, the holocentromeres are diverse in composition and organization (reviewed in ref. [25]).

How to identify holocentric chromosomes
• Lack of a primary constriction in mitotic metaphase chromosomes and parallel separation of mitotic anaphase chromatids.
• A distinct longitudinal centromere groove in large holocentric plant chromosomes.
• Existence of inverted meiosis (first sister chromatids, then homolog separation) in some holocentric species.
• Line-like distribution of kinetochore proteins.
• Attachment of tubulin fibers along the entire length of chromosomes.
• Uniform or line-like distribution of (peri)centromere-specific histone marks (e.g., phosphorylated histone H3S10 and H3S28) in plant mitotic metaphase and anaphase chromosomes.
• The number of centromeric signals in interphase nuclei exceeds the actual number of chromosomes.
• A combination of Hi-C assembly-based characterization of genomic features.
• Stable transmission of irradiation-induced chromosome fragments.

Holocentromeres vary in their organization
Unlike in most monocentrics, the assembly site for the kinetochore is not epigenetically determined by CENH3 in all holocentric species.
In some insect lineages, multiple independent events of CENH3 loss were found to be associated with the transition from mono- to holocentricity. [26] Instead, kinetochore complexes are established at transcriptionally inactive sites through a centromere protein CENP-T-dependent process. [27,28] Also, in holocentric Cuscuta plants, microtubules can attach to chromosome regions both with and without CENH3. [29,30] The CENH3-possessing holocentric species Luzula elegans is rich in satellite repeats, but none of the tested repeats is centromere-specific. [31] The holocentric nematode Caenorhabditis elegans displays no centromere-specific DNA sequence, [32,33] and its centromeres coincide with binding hotspots for various transcription factors without a particular preference for any specific transcription factor. [34] Hence, in various holocentric species, the mechanisms for centromere identity and kinetochore determination differ, supporting the notion of the independent emergence of holocentric lineages.
In holocentrics, the higher-order organization of centromere units varies between mitotic interphase and metaphase. Unlike in monocentrics, where the dispersion of the centromeres during interphase is often restricted to chromocenters, the line-like centromeres of holocentrics disassemble during interphase and appear, depending on the species, as dot-like (Figure 1A) or clustered (Figure 1B) centromere units throughout the nucleus. With the onset of mitotic chromosome condensation, the centromeric units join and form "line-like" structures along both chromatids. After the segregation of chromatids, the dispersion of holocentromeres is concomitant with chromatin decondensation. Hence, the higher-order organization of holocentromeres is cell cycle-dependent, and the line-like centromeres at mitotic metaphase result from the alignment of centromeric units during chromosome condensation. Polymer modelling suggests that the mitotic assembly of the holocentromere relies on the interaction between centromeric nucleosomes and chromatin fiber loop extruders, such as Structural Maintenance of Chromosomes (SMC) complexes like cohesin and condensin, during chromosome condensation. [35]
In monocentric species, the presence of additional centromeres at a distance necessitates the inactivation or loss of one of the centromeres to prevent chromosome destruction caused by the rupture of anaphase bridges. Such bridges result from twisting of the sister chromatids between the centromeres, which allows centromeres on the same chromatid to attach to opposite poles. In contrast, holocentrics are not subjected to such constraints, because the tight alignment along the sister chromatids prevents twisting between them. [1] Instead, the microtubule-binding activity of all neighboring centromere units along the entire chromatids ensures the segregation of the holocentric sister chromatids to opposite spindle poles.
Similar to the de novo formation of monocentromeres, [36] gene-poor regions are likely preferred for de novo formation of centromere units in holocentrics. Interlocus gene conversion may then aid in homogenizing the centromeric satellite DNAs. In most monocentrics, the centromere-associated DNA consists of fast-evolving repetitive sequences, including satellite repeats and mobile elements (reviewed in refs. [37,38]). However, the genome proportion and composition of centromeric satellites and retrotransposons vary enormously between species and profoundly influence the genome architecture (reviewed in ref. [39]). For holocentrics, the first centromere-specific repeats were identified in the sedge Rhynchospora pubera (Cyperaceae). The holocentromeres of this species harbor thousands of regularly spaced 15-25 kb-long CENH3-interacting satellite arrays and occasionally centromeric retrotransposons. [40,41] Due to the large number of small centromeric units, its genome reveals regularly interspersed eu- and heterochromatic subdomains at a broad scale. [40]

A holocentromere can be composed of only a few large units resembling monocentromeres
Recently, a new type of repeat-based holocentromere with an exceptionally high genome proportion (16%) of centromeric satellite DNA was identified in the plant Chionographis japonica. [42] In contrast to other holocentric species, the centromere units of this species are similar in size to the centromeres of some monocentric chromosomes, while individual centromere units of all other holocentric species studied in detail are significantly smaller. Each of the 68-137 Mb large C. japonica chromosomes carries only 7-11 evenly spaced CENH3-positive centromere units that represent arrays of the 23 and 28 bp-long minisatellite repeats (Chio1 and Chio2) (Figure 2A). The average size of single centromere units of 1.89 Mb (ranging from 0.24 to 4.46 Mb) is in the range of centromeric arrays in many monocentrics, such as Arabidopsis thaliana [9] and Zea mays, [10] and is 200-fold larger than that of holocentric Rhynchospora chromosomes. [41,43] The average distance between centromeric units is 9.97 Mb in C. japonica. A comparable centromere organization was recently reported for the mulberry Morus notabilis. The holocentromeres of this species consist of 3-9 satellite DNA-rich centromere units with an average size ranging from 2.14 to 3.03 Mb. [44] However, the dynamics of this centromere during the cell cycle still need to be analyzed.
Due to the varying number and size of centromere units, the large-scale eu- and heterochromatin arrangement at interphase differs between holocentric species with many small (small unit-type, e.g., R. pubera) or, alternatively, few large centromere units (large unit-type, e.g., C. japonica) (Figures 1 and 2). In the former, eu- and heterochromatin marks are uniformly distributed. [31,41] In contrast, in the latter, multiple centromere units cluster in blocks that form conspicuous chromocenters in interphase nuclei, [42] resembling the situation in many monocentric species.
In C. japonica, a distinctive eu- and heterochromatin arrangement at mitotic metaphase exists across the chromosome diameter instead of along the chromosome length as in monocentric chromosomes (Figure 2B). The immunosignals of the heterochromatin-associated histone mark H3K9me2 mirrored a holocentromere-like CENH3 distribution pattern along the chromosomes, while the signals of the euchromatin-associated histone mark H3K4me2 were enriched throughout the chromosomes except in (peri)centromeric regions. [42] Likely, the chromatid folding and different condensation of eu- and heterochromatin explain why immunostaining of mitotic chromosomes showed a distinct eu- versus heterochromatin distribution compared to the interphase patterns obtained by ChIP-seq analysis (Figure 2C). This pattern resembles the chromatin organization in monocentric species with small genomes. [45]
The existence of palindromes (dyad symmetries) is a prevalent trait of centromeric DNA, suggesting that these dyad symmetry structures may be able to maintain the centromere and act as an epigenetic marker. [46,47] Monocentromeres are rich in DNA that deviates from the usual B-conformation, forming structures such as hairpin loops and cruciforms.
Also, in C. japonica, the centromeric Chio satellites, with monomers of only 23 and 28 bp in length, contain short dyad symmetries likely forming non-B DNA secondary structures, [42] which might be required for the formation of centromere units. [46] Besides, similar dyad symmetry-rich centromeric satellites were found in the holocentric Meloidogyne nematodes. [48] Thus, mono- and holocentromeric DNA may share a non-B-conformation.
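As an illustration of what a short dyad symmetry means at the sequence level, the following minimal Python sketch scans a DNA string for inverted repeats, that is, a left arm followed by an optional loop and the reverse complement of that arm. It is not the method used in the cited studies. The function name, the thresholds `min_arm` and `max_loop`, and the example sequence are hypothetical choices made for illustration; the actual Chio monomer sequences are not given here and are not reproduced.

```python
# Complement table for unambiguous DNA bases; any other symbol never pairs.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def find_dyad_symmetries(seq, min_arm=4, max_loop=4):
    """Return (start, arm_length, loop_length) for each inverted repeat.

    For every possible end of a left arm and every loop length up to
    max_loop, the arms are extended outward for as long as the bases
    stay complementary. Only arms of at least min_arm bases are kept.
    Thresholds are illustrative assumptions, not published values.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for left_end in range(1, n):          # left arm occupies seq[..left_end)
        for loop in range(max_loop + 1):  # unpaired bases between the arms
            right_start = left_end + loop
            arm = 0
            while (
                left_end - 1 - arm >= 0
                and right_start + arm < n
                and COMPLEMENT.get(seq[left_end - 1 - arm]) == seq[right_start + arm]
            ):
                arm += 1
            if arm >= min_arm:
                hits.append((left_end - arm, arm, loop))
    return hits


# Hypothetical 12-bp sequence: arm ACGG, loop TTTT, arm CCGT (reverse complement of ACGG).
print(find_dyad_symmetries("ACGGTTTTCCGT"))  # [(0, 4, 4)]
```

A perfect palindrome is reported with a loop length of 0. Nested hits for the same symmetry can appear when loops are allowed, so a real analysis would need to merge overlapping hits and would more likely rely on an established inverted-repeat finder.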

Repeat-based holocentromeres - an evolutionary consequence of multiple DNA double-strand breaks and their erroneous DNA repair?
Helitron transposable element-mediated dispersal of hundreds of Tyba satellite arrays has been suggested as a mechanism to disperse centromere units in the holocentric species R. pubera [41] (Figure 3A). However, such a mechanism seems improbable in the case of C. japonica, because Chio repeats in the C. japonica genome neither show sequence similarity with any transposable element in this genome nor are flanked by particular transposable elements. In addition, the transport of megabase-scale Chio arrays by kilobase-sized transposable elements is unlikely. [42] Consequently, different types of repeat-based holocentromeres have evolved, for example, those composed of many small centromere units or few large centromere units, each through different mechanisms.
We speculate that DNA double-strand breaks (DSBs) and their erroneous repair could initiate the de novo formation of repeat-based holocentromeres of C. japonica (Figure 3B). DSBs are caused by environmental factors such as ionizing irradiation or by endonucleases, and their repair can happen in all cell cycle stages. Several DSB repair mechanisms are known. All of them repair DSBs mainly in a correct manner. Occasionally, however, misrepair leads to deletions, insertions, or ligation of the ends of different DSBs, the latter resulting in diverse structural chromosome rearrangements (e.g., translocations, inversions) (for review, see ref. [49]).
We suggest that extra-chromosomal circular DNA (eccDNA) of the monocentric precursor species was inserted almost genome-wide during insertion-linked DSB repair after simultaneous multiple breakages. EccDNAs could result from either deletion-linked error-prone repair of DSBs within repeat arrays [50] or, even more likely, from complementary DNA synthesis of centromeric transcripts.
Indeed, centromeric eccDNA has been observed in some species, such as A. thaliana and Oryza sativa. [51] Centromeric Chio repeat arrays of C. japonica were observed exhibiting alternate forward and reverse orientations. [42] This could be due to the insertion of eccDNA in different orientations into DSBs, to additional small inversions during the repair of multiple DSBs, or alternatively due to ectopic recombination.
Breakage-fusion-bridge (BFB) cycles, another option to yield inverted repetitive sequences via rupture of (transient) dicentric chromosomes, [52] seem less likely in the context of holocentricity with few large centromere units per chromosome. A spontaneous release of eccDNAs followed by a chromosome-wide and simultaneous random reintegration into the genome might occur rarely. De novo formation of a holocentromere additionally requires (i) an assembly of CENH3-containing nucleosomes at eccDNA-derived centromeric satellite repeats, (ii) a sufficient size of centromere units to bind microtubules, and/or (iii) blocks of centromere units at a distance that prevents twisting of sister chromatids between them, ensuring correct segregation during nuclear division. Only newly formed centromere units that fulfill these prerequisites may allow the formation of repeat-based holocentromeres that survive natural selection. To test our hypothesis, we propose the characterization of the eccDNA fraction in C. japonica by taking advantage of short- and long-read next-generation sequencing. EccDNAs with sequence similarity to the centromeric Chio repeats should exist if eccDNAs were involved in the de novo formation of the holocentromere in C. japonica. Next, if eccDNA insertion served as the mechanism for holocentromere formation in C. japonica, then the genome synteny between the holocentric C. japonica and the putative monocentric precursor Chamaelirium luteum proposed by Tanaka [53] should be highly conserved. To confirm synteny conservation, we suggest chromosome-wide oligopainting FISH experiments and comparative genomic analysis between the two species.
An alternative to eccDNA integration into multiple simultaneous DSBs would be the "fusion" of small monocentric chromosomes by translocation with terminal breakpoints. However, this route requires an ancestor with a corresponding number of monocentric chromosomes of a size that prevents the twisting of sister chromatids between the centromeres after "chromosome fusion." Chromosome fusion seems unlikely as a mechanism for the formation of Chionographis holocentromeres, as its putative monocentric precursor, Chamaelirium luteum, shares the same number of chromosomes. [53]
Taken together, considering the different prerequisites and limitations for the de novo formation of a repeat-based holocentromere, the transition from mono- to holocentromere via simultaneous spreading of centromere units across all chromosomes of the species seems to be a rare but possible event. However, an origin via a stepwise, distally proceeding insertion of clustered centromere units, starting from the original monocentromeres (with metapolycentromeres as an intermediate stage), cannot be excluded. Intriguingly, in monocentric Juncus effusus, [11] a close relative of the holocentric genus Luzula, two types of centromeres have been found after detailed CENH3-ChIP-seq analysis. Type 1 centromeres resemble canonical monocentromeres with a single CENH3 domain, but type 2 centromeres show a few additional CENH3 domains embedded within a restricted centromere region. It was speculated that this initial split of CENH3-rich domains could constitute a transient state between mono- and holocentricity, which later facilitates progressive chromosome fusions as observed in Rhynchospora. [41]

Figure 1. Simplified models of the dynamic organization of centromere units during mitosis of the repeat-based holocentromeres of (A) Rhynchospora pubera, composed of many small repeat-based centromere units (small unit-type), and (B) Chionographis japonica, characterized by a few large centromere units (large unit-type). To simplify, for both types, only one chromosome is depicted.

Figure 2. Comparison of the holocentric species Rhynchospora pubera and Chionographis japonica, both possessing repeat-based holocentromeres. (A) Centromere and genome characteristics in both species. The data were retrieved from refs. [41] and [42], respectively. (B) Ideogram showing immunostaining patterns of CENH3, histone H3K9me2, and H3K4me2/3 in mitotic metaphase chromosomes of both species. (C) Enlarged view of a 50-Mb region showing the interphase enrichment of CENH3, the euchromatin marks H3K4me3 in R. pubera and H3K4me2 in C. japonica, and the heterochromatin mark H3K9me2 in both species. ChIP-seq signal tracks are shown as log2(ChIP/input) for C. japonica and as log2(IP) for R. pubera.