What Doesn't Kill You Makes You Stronger: Transposons as Dual Players in Chromatin Regulation and Genomic Variation

Transposable elements (TEs) are sequences currently or historically mobile, and are present across all eukaryotic genomes. A growing interest in understanding the regulation and function of TEs has revealed seemingly dichotomous roles for these elements in evolution, development, and disease. On the one hand, many gene regulatory networks owe their organization to the spread of cis‐elements and DNA binding sites through TE mobilization during evolution. On the other hand, the uncontrolled activity of transposons can generate mutations and contribute to disease, including cancer, while their increased expression may also trigger immune pathways that result in inflammation or senescence. Interestingly, TEs have recently been found to have novel essential functions during mammalian development. Here, the function and regulation of TEs are discussed, with a focus on LINE1 in mammals. It is proposed that LINE1 is a beneficial endogenous dual regulator of gene expression and genomic diversity during mammalian development, and that both of these functions may be detrimental if deregulated in disease contexts.


Introduction: Overview of TE Biology
Approximately one third to half of the typical mammalian genome is derived from transposable elements (TEs). [1,2] TEs are DNA sequences that are capable of inserting into new locations in the genome, giving rise to repeated copies of the elements and to genome expansion. Most of our knowledge of mammalian TEs is derived from a limited number of organisms, in particular mice and humans. TEs can be classified as: i) retrotransposons or class I elements, which replicate using an RNA intermediate in a 'copy and paste' mechanism; and ii) DNA transposons or class II elements, which transpose without any RNA intermediate. We focus here on retrotransposons, as they are the most active and abundant in mammalian genomes. There are two main classes of retrotransposons: i) long terminal repeats (LTR) elements, which are characterized by the presence of 100-300 bp LTR at their 5′ and 3′ ends; and ii) non-LTR elements which lack such structures. Endogenous retroviruses (ERVs) are the main LTR elements and are thought to have evolved from ancestral retroviral infections (reviewed in ref. [3]). Long interspersed nuclear elements (LINEs) and short interspersed nuclear elements (SINEs) are the two most active and studied non-LTR elements. LINEs are thought to have evolved from bacterial or organelle group II introns, [4] whereas SINEs, which depend on LINE machinery for transposition, have evolved from a variety of cellular RNAs. [5] Mammalian genomes have higher TE content compared to nonmammalian vertebrates. Mammalian genomes contain low amounts of DNA transposons (<3%) and limited LTR elements (4-10%) but are dominated by the accumulation and activity of the non-LTR elements (about 75% in most mammals) (reviewed in ref. [2]). 
Genomic distribution of a TE family often follows a pattern originating from a two-step process: first, intrinsic features of transposition direct the initial location of the insertions, and second, selective pressures shape the evolving distribution of TEs (reviewed in ref. [6]). LINE1 elements are ≈6 kb long and encode two proteins (ORF1p and ORF2p) that constitute their transposition machinery, [9] with primate LINE1 coding for an additional peptide, ORF0p. [10] LINE1 contains an internal ribosome entry-like sequence between ORF1 and ORF2. [11,12] ORF1p is a LINE1 RNA-binding protein, whereas ORF2p embodies the enzymatic activities of endonuclease (EN) and reverse transcriptase (RT). Despite their abundance, most LINE1 elements are immobile due to the accumulation of mutations. [9] Only 80-100 LINE1 elements in human and an estimated 2300 in mice are transposition-competent, out of ≈868 000 and ≈599 000 elements in their respective genomes. [13,14] LINE1 insertion takes place mainly via endonuclease-dependent target-primed reverse transcription (TPRT) [15,16] (reviewed in refs. [17,18]), although some insertions can also take place at preexisting DNA lesions. In TPRT, a LINE1 mRNA serves as the template for the synthesis of numerous copies of ORF1p but only one or two copies of ORF2p. LINE1 RNA associates with several ORF1p homotrimers and at least one ORF2p dimer in cis to form a ribonucleoprotein (LINE1 RNP) particle in the cytoplasm. The LINE1 RNP enters the nucleus, where the ORF2p endonuclease nicks a degenerate consensus sequence in the genome and liberates a 3′ hydroxyl group. This free 3′ hydroxyl group is then used as a primer by the ORF2p reverse transcriptase to synthesize LINE1 cDNA starting from the polyA tail of LINE1 mRNA. The polyA tail of nonautonomous SINE element RNA can compete with LINE1 RNA polyA for the LINE1 ORF2p reverse transcriptase, thus hijacking the LINE1 machinery to transpose. [19,20]
Moreover, LINE1 ORF2p can also retrotranspose unique protein coding mRNAs and small nuclear RNA. [21,22] While LINE1 is considered an autonomous element, it is clear that it depends on other cellular proteins for retrotransposition. A number of positive regulators of LINE1 activity have been identified [23] (reviewed in ref. [24]). For example, nucleolin and heterogeneous nuclear ribonucleoproteins (hnRNPs) both bind the IRES in LINE1 RNA. While nucleolin promotes LINE1 and LINE1-assisted Alu transposition, hnRNPs limit their mobility. [11] Other positive regulators of LINE1 retrotransposition include PABPC1, PCNA, mitogen-activated protein kinases, and cyclin-dependent kinases (reviewed in ref. [24]). In addition, the DNA repair machinery is required for completion of LINE1 insertion (reviewed in refs. [17,18]).
Conversely, there are several mechanisms to limit LINE1 transposition in mammalian genomes, at transcriptional, posttranscriptional, and posttranslational levels. LINE1 transcription is silenced via CpG DNA methylation and repressive histone modifications at the LINE1 promoter [25] (reviewed in ref. [26]). Distinct KRAB-ZFPs selectively recognize ERVs and LINE1 and recruit KAP1 to silence them via formation of heterochromatin. [25,27,28] The HUSH complex and MORC2 have also been shown to target evolutionarily young euchromatic LINE1 elements for silencing by deposition of H3K9me3. [23,29] Posttranscriptional silencing of LINE1 RNA is mediated via RNA interference by small RNAs, such as piRNAs and rasiRNAs. [30][31][32] PIWI proteins use piRNAs to guide the cleavage of LINE1 RNA in germ cells. In addition, the piRNA pathway can lead to de novo methylation and silencing of LINE1 DNA in primordial germ cells (PGCs). [33] LINE1 retrotransposition is also posttranslationally restricted by a number of interferon-stimulated genes, including some of the APOBEC3 cytidine deaminase family members, MOV10, TREX1, and several other factors (reviewed in ref. [24]).
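As a back-of-the-envelope illustration of how small the transposition-competent pool is, the copy numbers quoted earlier in this section (80-100 of ≈868 000 elements in human, an estimated 2300 of ≈599 000 in mouse) can be turned into fractions. The short Python sketch below does only that arithmetic; the dictionary keys and the function name are illustrative choices made here, not terminology from the cited studies.

```python
# Copy numbers as quoted in the text: (transposition-competent, total LINE1).
# The human value is given as a range (80-100), so both ends are kept.
line1_counts = {
    "human_low": (80, 868_000),
    "human_high": (100, 868_000),
    "mouse": (2_300, 599_000),
}


def competent_fraction(competent, total):
    """Fraction of LINE1 copies that remain transposition-competent."""
    return competent / total


fractions = {
    label: competent_fraction(competent, total)
    for label, (competent, total) in line1_counts.items()
}

for label, fraction in fractions.items():
    # Print as a percentage with four decimal places.
    print(f"{label}: {fraction:.4%}")
```

With these inputs, the competent fraction comes out at roughly 0.01% of LINE1 copies in human and just under 0.4% in mouse, which is consistent with the statement that most LINE1 elements are immobile.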

The Connection between Derepression of TEs and Disease
The existence of multilayer mechanisms to suppress LINE1 in normal cells supports the hypothesis that uncontrolled LINE1 activity is deleterious. LINE1 disruptive insertions were discovered as far back as 1988, within the factor VIII gene of a subset of hemophilia A patients. [34] A large body of research is now devoted to understanding the potential roles of LINE1 in many cancers [35] (reviewed in ref. [36]). The CpG-rich LINE1 promoter is subject to global hypomethylation in cancer along with other regions of the genome, which correlates with widespread reactivation of LINE1 Orf1p expression. [35] Moreover, LINE1 expression in cancer is associated with both increased retrotransposition and poorer prognosis in many tumor types. [37,38] The most commonly cited pathogenic role for LINE1 is as a mutagen. First described in the disruption of APC in colon cancer, [39] LINE1 somatic insertions have since been found in many tumors, [40,41] where they have the potential to disrupt both the insertion locus and the expression of nearby genes. [41] Uncontrolled retrotransposition has been associated with several conditions outside of cancer. In mice, disruptions in the piRNA pathway result in male sterility, accompanied by LINE1 Orf1p upregulation and DNA damage during adult spermatogenesis. [42] In wild type mouse ovaries, higher Orf1p expression correlates with meiotic defects. Inhibitors that block LINE1 reverse transcriptase (RTis) increase oocyte numbers in both wild type and piRNA pathway-mutant mice, pointing to a link between LINE1 expression and fetal oocyte attrition. [43] In another model of piRNA deficiency, Mov10l1-mutant mice exhibit upregulated LINE1 expression and retrotransposition, yet infertility is not rescued by RTis. [44] This and an increasing body of evidence point to potential links between TE expression and pathology outside of retrotransposition. 
The upregulation of IAP elements in Miwi2 and Dnmt3l null spermatogonia is not associated with DNA damage, but instead transcriptional deregulation of nearby genes. [42,45] Outside the germline, TE overexpression is described in several neurodegenerative disorders, including ALS and Alzheimer's Disease (reviewed in ref. [46]). In this case, existing evidence points to the transcription of TE-derived RNAs as responsible for pathology. Several TE promoters, including certain LINE and ERV elements, drive bidirectional transcription and the production of double-stranded RNA that triggers innate immune responses. [47] In addition, MeCP2 mutations in Rett syndrome lead to upregulated LINE1 activity in neuronal cells. [48,49] Most recently, LINE1 elements have been shown to become activated in late-stage senescence, with LINE1-derived cDNA inducing an interferon response mediated by the cGAS-STING cytoplasmic DNA sensor. [50,51] Remarkably, RTi treatment is sufficient to improve several aging-related metrics. [50] Overall these results suggest there is much more to be learned about TE-induced pathology, including functions that may be independent of retrotransposition.

TE DNA and the Evolution of Cis-Regulatory Networks
The mutagenic activity of TEs has made them potent drivers of genomic change and evolution (reviewed in ref. [52][53][54][55]). It is striking to note that the genomic content of TEs but not genes has steadily increased with species biological complexity. [56] The best example of TE contribution to genomic evolution is in the widespread mobilization of cis-elements by retrotransposition. Transcription factor binding sites for several proteins including the pluripotency factors OCT4 and NANOG, as well as master regulators TP53 and ESR1, share low sequence homology between mouse and human. These sites have to a large extent been propagated due to species-specific TE mobilization. [57][58][59] A comprehensive study of 26 pairs of transcription factors in mouse and human cells revealed that over 20% of all binding sites are within TE repeats. [60] Thus, whilst many important transcription factors may share similar roles between mouse and human, a large fraction of their binding site repertoire has been independently generated through TE mobilization.
It is only via retrotransposition during early embryo or germline development that a TE can be vertically transmitted to future generations. Therefore, it is unsurprising that increasing roles are being uncovered for TEs in developmental gene regulatory networks. This is particularly true for ERV regulatory elements, which can function as either promoters or enhancers. Distinct classes of ERV elements are expressed during early human embryogenesis and are often spliced into nonviral exons, [61,62] or may function as enhancer elements with the potential to affect both nearby and distant gene expression. [63] The ERVL LTR promoter drives early cleavage-stage specific genes in both mouse and human embryos, and has been associated with zygotic/embryonic genome activation. [64][65][66][67] Most recently, both MERVL and HERVH elements have been implicated in 3D genome organization, where their presence can affect TAD boundaries as well as nearby gene expression. [68,69] As genome-wide mapping and analysis tools continue to improve, it is likely that many more examples of TE-derived cis-regulatory elements and networks will be uncovered.

Emerging Evidence for Essential Roles for Expressed TEs during Development
Increasing evidence suggests that regulatory functions of TEs are not limited to provision of cis-elements at the DNA level, but that TE RNA and/or protein products may adopt functional roles in cells. In several organisms, ERV-derived products have been shown to interfere with or inhibit viral infections, in part by competing with exogenous viral proteins to inhibit replication. [70] Interestingly, these same interactions may also drive the evolution or diversification of ERV proteins. [70] RNA derived from intronic Alu elements has been suggested to be important for nucleolar structure and Pol I transcription, [71] with splice-sites contained within Alu RNA also contributing to alternative splicing events. [72] Inverted Alu repeats are additionally frequent in the genome, where they may form double-stranded RNA structures that have the potential to affect RNA editing, retention of Alu-containing RNA, or circular RNA production [73] (reviewed in ref. [74]).
During early mammalian development, the relaxation of epigenetic repression of TEs has allowed their expression and redeployment for diverse cellular roles. The placenta has been known for several decades to be highly permissive to ERV expression. ERV elements are hypomethylated in trophoblast cells, [75] and have adopted several functions in placental development, contributing to rapid diversification of this organ. For example, RLTR13D5 DNA elements in mice function as species-specific trophoblast stem cell enhancer elements to direct placental gene expression programs. [76] Placental syncytins, which are derived from retroviral envelope (Env) proteins, are a well-known example of TE co-option and are essential for trophoblast fusion and normal placental development. [77,78] Intriguingly, these proteins separately evolved from retroviruses in both mouse and human for similar functions. Most recently, a role for syncytins in mediating the fetal response to maternal infection during pregnancy has been suggested, whereby IFN-mediated activation of IFITM proteins directly inhibits syncytin-induced trophoblast fusion. [79] TEs such as LINE1 and various ERVs become globally demethylated and expressed during the major waves of epigenetic erasure that take place during preimplantation and germline development (Figure 1). [80][81][82][83][84][85] Similarly, a relaxation of H3K9me2-based repression of TEs occurs in PGCs, contributing to their derepression. [86][87][88] As described above, many TE families act as important cis-elements for gene regulation. However, it is becoming evident that there are also roles for TE gene products in early development. RNA derived from HERVH elements is important for maintenance of the undifferentiated state of human embryonic stem cells (ESCs), possibly by acting as lncRNAs. [89,90] The HERVK-encoded protein, Rec, is highly expressed in human cleavage embryos, where it increases antiviral resistance via upregulation of IFITM1 proteins. 
Rec also binds to a subset of endogenous RNAs, regulating their ribosome occupancy and expression levels in a manner that may be important for early human development. [62]

A Novel Role for Nuclear LINE1 RNA in the Regulation of Gene Expression During Development
In mice, we and others recently discovered novel roles for LINE1 during early embryonic development. LINE1 RNA is abundant within the nuclei of ESCs and cancer cells, in contrast to LINE1 Orf1p protein. [91,92] LINE1 is highly expressed in preimplantation embryos and downregulated upon differentiation (Figure 1). [80,92,93] Despite its abundance, LINE1 appears to be under tight dosage regulation, with either its repression or overexpression resulting in a reduction in blastocyst formation. [92,93] Perhaps due to the association of LINE1 RNA with euchromatin as determined by in situ hybridization, [91] the expression of LINE1 appears to regulate global chromatin accessibility. LINE1 RNA knockdown or transcriptional inhibition leads to global reductions in DNA accessibility. [92,93] This effect is not rescued by exogenous injection of LINE1 mRNA, suggesting a role for nuclear LINE1 RNA in cis. [93] In support of this notion, we found that LINE1 RNA is associated with genomic loci of both rDNA and also a potent MERVL activator, Dux [66,67] in ESCs. [92] Interestingly, LINE1 depletion in ESCs and embryos induces not only a loss of global transcription, including ribosomal RNA, but also specific activation of Dux and the MERVL expression program. This function of LINE1 RNA is mediated by nucleolin, which associates with LINE1 RNA and depends on it for recruitment to rDNA and Dux (Figure 1). Moreover, when ESCs transition to a 2-cell (2C)-like state, the Dux locus is relocalized from the nucleolar periphery, where nucleolin and LINE1 DNA are known to accumulate, [94] to the nucleoplasm. [92] We hypothesize that LINE1 RNA acts in cis as a nuclear scaffold that is essential for chromatin organization and gene regulation during development (Figure 1). While LINE1 RNA had previously been suggested to be required for early mouse development, possibly by acting as a source of RT activity, [95,96] our data [97] and those of Jachowicz et al. [93] indicate that the developmental function of LINE1 RNA is not mediated by retrotransposition. Taken together, these data suggest that there may be many more cases of unexpected roles for LINE1 and other TEs in gene regulation within both development and disease contexts.
Figure 1. Dynamic expression and essential roles of LINE1 in mammalian development. LINE1 is highly expressed during several stages of mouse and human development, notably during the global reprogramming of DNA methylation and histone modifications that takes place during preimplantation and germline development. The study of one such context, preimplantation development, has revealed essential functions for LINE1 RNA in regulating both rRNA and 2-cell specific gene expression by promoting recruitment of key factors to chromatin. LINE1 expression also impacts global chromatin organization, via direct or indirect mechanisms that remain to be elucidated. In separate studies, LINE1 expression and mobility have been linked to neurogenesis. See text for details.
It remains unclear whether there are specific LINE1 elements that are key regulators of gene expression during development, or if multiple transcribed LINE1s cooperate to perform this function. Further work will be required to distinguish these possibilities, although at present we favor the latter one. The widespread distribution of LINE1 elements in mammalian genomes makes them good candidates for nucleators of local chromatin organization and higher order chromatin architecture. Interestingly, LINE1 and SINE elements, both of which depend on the activity of LINE1 ORF2p for dispersion in the genome, have anticorrelated genomic distributions: LINE1 elements are enriched at gene-poor, silenced, heterochromatic, AT-rich regions, whereas SINE elements are overrepresented at gene-rich, expressed, euchromatic, GC-rich regions. [1,98] A similar overall segregation of genomic regions is seen with B (silenced) and A (expressed) compartments in Hi-C studies of higher order chromatin conformation. [99] Recent results raise the intriguing possibility that LINE1 and SINE elements not only correlate with the B and A compartments, respectively, but may also causally contribute to the folding of chromosomal regions into these two compartments. [100,101] The high association of repeat RNA including LINE1 RNA with chromatin, [91] and the evidence that nascent chromatin-bound RNA promotes global chromatin organization, [102,103] lend additional support to this model. Further studies will be required to determine the mechanisms by which transcription of LINEs and SINEs might impact 3D genome organization and its dynamics during development.

Moving Beyond the "TEs versus host" Way of Thinking
The studies discussed above point out the essential roles that TEs play during mammalian development. If they were not mobile elements, or derived from elements that had once been mobile, they would fall under the category of DNA regulatory sequences that orchestrate early developmental transitions. However, the fact that some elements are mobile, can in some cases cause disease, and have mostly been studied from that perspective has led to the view that they are in conflict with the rest of the genome, referred to as the "host." While suspected cases of horizontal transmission of LINE1 have been reported, [8] LINE1 itself is not thought to ever have been infectious, since the proteins it codes for carry out an intracellular "life cycle." Yet, it is perhaps the most ancient TE in eukaryotes, present across fungi, plants, and animals. In other words, LINE1 has been with us, eukaryotes, for over 1 billion years, with a notable explosion in mammals, where it accounts for ≈30% of the genome if we include SINEs. Is LINE1 really a parasite that we simply cannot find a way to get rid of, or an integral symbiont and part of our biology?
There are, as summarized above, cellular pathways that have evolved to antagonize transposition. In some cases, these pathways may be part of evolving "arms-races" between TEs and their "hosts", such as the piRNA pathway and KRAB-ZNF transcription factors (reviewed in ref. [104]). In the study of such pathways, or in cases of recent horizontal transfers of TEs into a new species, it may be helpful to continue to assume the "TE vs host" perspective. However, in the case of LINE1 in mammals, and potentially other cases, this perspective may be limiting. We would argue that the existence of cellular pathways that limit transposition does not necessarily warrant the view of TEs such as LINE1 exclusively as parasites exploiting hosts. Biology is rich in examples of cellular pathways with opposing functions, such as translation to make proteins and the ubiquitin/proteasome system to degrade them, signaling pathways that induce the expression of their own inhibitors, etc. These "conflicts", rather than being detrimental, support two essential features of biological systems: robustness, by being able to buffer fluctuations in individual components, and flexibility, by allowing novelty to arise in response to changes in external conditions. [105] We propose that the relationship between many TEs and the pathways that limit their mutagenic activity can be viewed from this perspective. This relationship maintains genomes in an overall metastable state capable of supporting reproducible organismic development to reproductive age, while remaining poised for the rapid generation of genetic novelty in a changing environment. In turn, this metastability may in some cases lead to disease by deleterious retrotransposition events. In other words, the potential reactivation of TEs in adult somatic cells and their link to diseases such as cancer may be the price to pay for their roles in development and evolvability.

Stress, TEs, and Evolvability
Barbara McClintock speculated that the activation of TEs is triggered by some sort of genomic shock, be it internally derived or caused by stresses external to the organism, which contributes to genomic rearrangements that drive speciation. [106] In fact, McClintock described how her discovery of TEs in the experiments she began in 1944 was enabled by the fact that the maize strains that she was studying contained a chromosomal abnormality (a broken telomere) that may have caused activation of TEs as a stress response. [106] In other words, if it were not for the activation of TEs in response to stress, McClintock might not have discovered TEs. Over 35 years after McClintock's proposal that TEs are stress-responsive agents of variation, [106] the evidence has mounted in its favor. Numerous studies in organisms including plants, yeast, flies, mice, and humans have shown that TEs such as LINE1 and others are reactivated under conditions of environmental stress, suggesting an inherent mechanism to increase genetic diversity in order to adapt to harsh conditions, although the underlying mechanisms remain poorly understood (Figure 2, reviewed in refs. [53,107,108]). We highlight two recent examples of how cellular stress may be molecularly linked to TE activation in mammalian cells. In one such study in mice, SIRT6 was shown to bind directly to the LINE1 promoter, facilitating its repression via KAP1 and HP1a. [109] Both DNA damage and aging lead to loss of SIRT6 from the LINE1 promoter and induction of its expression. In a different study in mice, HSP90, a stress-responsive chaperone that buffers genetic variation, [110] was shown to promote repression of ERVs and nearby genes, again via interaction with KAP1. [111] Stress conditions can compromise HSP90 function and lead to the derepression of ERVs, although in this case retrotransposition was not investigated. [111] A related role for HSP90 in repression of transposons had previously been reported in Drosophila. [112]
It is worth noting that, unlike plants and most other animals, mammalian embryos are not "orphan", in that they are not left to fend for themselves in direct contact with the external environment. The maternal body imposes several layers of separation between mammalian embryos and the environment, notably the uterus and the placenta. It is therefore reasonable to ask whether mammalian embryos in utero are capable of "sensing" environmental stress. The answer is yes, including in humans. For example, subnutrition during gestation leads to higher incidence of metabolic, cardiovascular and neurological disorders in adulthood. [113] Recent studies have begun to reveal potential molecular mechanisms of how environmental perturbations during pregnancy can impact the fetus. [87,114] Interestingly, these molecular mechanisms of environment-embryo communication involve epigenetic layers of gene regulation known to impact directly the expression of TEs, such as DNA methylation. [115,116] It will be important to further explore the sensitivity of mammalian embryos to environmental stressors in vivo, including how these may affect the regulation and activity of TEs.

TEs as Agents of Genomic Diversification
As described above, TEs such as LINE1 are highly expressed in early mammalian development, and this expression is essential for developmental progression. The vast majority of such TEs are incapable of transposing due to acquired mutations, but some LINEs, SINEs, and ERVs are still transposition-competent in certain species. Moreover, there are multiple ways that TEs can be mutagenic beyond their canonical transposition (see below). The expression and function of TEs during early development and the embryonic germline poises them to generate heritable genomic novelty. This, in turn, may provide natural selection with a palette of new variants. Most TE-related mutations are lost to genetic drift, and the cases where they have an irreparably detrimental impact tend to be lost to natural selection. It is therefore misplaced to ask whether TE-induced variation overall has an adaptive value, much in the same way that we would not ask that question of other sources of variation, such as point mutations; the answer would naturally be: "it depends." What is clear is that TE-induced variation carries a very high information content, given the numerous ways in which it can modulate gene activity, [54,55] and the profound impact that it has had in the evolution of genomes.
Figure 2. TEs as dual regulators of chromatin organization and genomic variation. TEs can contribute to a multitude of cellular functions, either at the DNA, RNA, or protein level, notably to chromatin organization and dynamics during development. TEs also contribute to genomic variation, by retrotransposition as well as other mechanisms, in both somatic cells and the germline. Several pathways prevent uncontrolled TE activity at multiple levels. These pathways can be compromised by environmental stressors, which may thereby lead to higher levels of genomic variation induced by TEs. In addition to the developmental roles and mutagenic activity of TEs during ontogeny (panels "TE function"-"Impact of Stress"), germline-transmitted TE-induced variation may be subject to natural selection, and contribute to genome diversification over evolutionary time (panel "Evolution"). Note that this figure focuses on retrotransposons, which constitute the vast majority of TEs in mammals. See text for details.
When it comes to the mutagenic activity of LINE1, estimated frequencies of new insertions are approximately 1 in 100 human births [117] and 1 in 8 mouse births. [118] These are conservative estimates and there are many ways by which LINE1 can impact the genome. [119] To summarize a few: i) Canonical LINE1 activity induces genomic insertions, as described above, via an RNA intermediate and a TPRT mechanism. It is important to note that, given the 3′ to 5′ nature of canonical LINE1 retrotransposition, a mutation can only be classified as a LINE1 insertion if it is long enough to extend past the polyA tail of the mRNA and into sequences diagnostic of LINE1; ii) LINE1 activity drives the retrotransposition of SINE/Alu elements, themselves major modulators of genomic variation; [5] iii) it is possible that the endonuclease activity of LINE1 ORF2 may have mutagenic effects independent of retrotransposition; [120] iv) LINE1 and SINE elements can impact short tandem repeats, particularly AT-rich ones, which are abundant in the genome and have much higher mutation rates than non-repeat regions; [121] v) due to their repetitive nature, LINE1 and SINE elements can induce large-scale genomic variants such as duplications, inversions, and deletions via recombination. [5,119,122] In addition, two recent studies reveal surprises with regard to the insertion preferences (or lack thereof) of LINE1 in the human genome. [123,124] Despite the enrichment for LINE1 elements at gene-poor, heterochromatin regions, these studies show that de novo insertions of LINE1, while restricted by degenerate, short nucleotide motifs, have no preference for any particular chromatin feature except for replication timing, suggesting a link to host DNA replication. [123,124] Thus, LINE1 is a much more versatile mutagen than previously thought, and is then subject to the powerful forces of natural selection during evolution, to potentially lead to its enrichment at heterochromatin. 
Bourque and colleagues recently stated that "on average, any two human haploid genomes differ by approximately a thousand TE insertions, primarily from the LINE1 or Alu families." [54] The full spectrum of mutagenesis induced by LINE1 deserves further investigation.
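To make the quoted per-birth frequencies more concrete, the following Python sketch works through the arithmetic for the two estimates given above (approximately 1 in 100 human births and 1 in 8 mouse births). It is an illustration only: the function names are hypothetical, and the second function assumes that births are independent and that each carries a new insertion with the quoted frequency, which is a simplification introduced here rather than a claim from the cited studies.

```python
from fractions import Fraction

# Estimated frequency of a new LINE1 insertion per birth, as quoted in the text.
insertion_rate = {"human": Fraction(1, 100), "mouse": Fraction(1, 8)}


def expected_births_with_insertion(species, n_births):
    """Expected number of births carrying a new insertion among n_births."""
    return insertion_rate[species] * n_births


def prob_at_least_one(species, n_births):
    """Probability that at least one of n_births carries a new insertion.

    Assumes independent births with a constant per-birth frequency
    (a simplifying assumption made for this sketch).
    """
    return 1 - (1 - insertion_rate[species]) ** n_births


# How much more frequent the mouse estimate is than the human one.
mouse_to_human_ratio = insertion_rate["mouse"] / insertion_rate["human"]

print(expected_births_with_insertion("human", 1000))
print(float(prob_at_least_one("mouse", 8)))
print(float(mouse_to_human_ratio))
```

Under these assumptions the quoted mouse frequency is 12.5 times the human one, and about 10 of every 1000 human births would be expected to carry a new LINE1 insertion.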
As mentioned above, inhibition of pathways that repress LINE1 expression in the female mouse germline, notably DNA methylation and the piRNA pathway, leads to excessive LINE1 retrotransposition and oocyte death. [43,125,126] Interestingly, treatment of mice with AZT, an RTi, prevents oocyte demise but only temporarily, suggesting that the EN activity of ORF2 (which is not blocked by RTis) can also contribute to DNA damage and apoptosis. [43] In support of this notion, in mice mutant for the checkpoint kinase Chk2, which acts in the pathway that induces apoptosis in response to DNA damage, treatment with AZT leads to preservation of the oocyte pool, with no effects on fertility. [126] In other words, mutations potentially induced by LINE1 ORF2 EN can be tolerated in the germline of mice, if the DNA damage sensing pathway is compromised. This could be one mechanism by which LINE1-induced genomic variation is added to the gene pool in mammals.
In sum, as long as the activity of TEs is controlled such that it does not compromise organismic development and reproduction, a certain level of TE-induced mutagenesis may be tolerated, and perhaps increased in conditions of environmental stress (Figure 2). TEs may, at least in certain cases, fulfill in biology the philosopher Friedrich Nietzsche's aphorism "what does not kill me makes me stronger." [127]

Conclusions and Looking Ahead: LINE1 as Dual Regulator of Gene Expression and Genomic Diversity
We have summarized here some of the growing evidence of the critical roles played by TEs in genome evolution, chromatin architecture, and embryonic development. These roles call into question the prevailing view of TEs solely as genomic parasites. The recent data suggest that mammalian LINE1 and perhaps other TEs can be envisioned similarly to mitochondria, which we no longer regard as foreign bodies in the eukaryotic cell, even though they arose from an endosymbiotic relationship. [128] In fact, the proposed evolution of LINE1 from bacterial or organelle group II introns [4] is in line with this notion. In mammals, we consider the roles of LINE1 as chromatin regulator and, in certain conditions, generator of genomic diversity, to be compatible and deserving of an integrative approach in the contexts of both development and disease (Figure 2).
There is much exciting work ahead on the role of LINE1 in chromatin organization and gene expression during development. Although LINE1 RNA is suspected to act in cis at or in the vicinity of sites of LINE1 transcription, this remains to be investigated in detail. It will also be important to decipher which parts of the LINE1 RNA interact with which nuclear factors, how these interactions regulate the local chromatin landscape, and how these relationships change during developmental progression. We anticipate that LINE1 RNA will have roles beyond preimplantation development and ESCs. Exploring potential roles in somatic cell progenitors or in regeneration contexts will be of particular interest. These studies may benefit from the development of new tools to perturb LINE1 expression in vivo that are not dependent on delivery of antisense oligonucleotides. It is important to note that inferring LINE1 RNA expression levels from current RNA-seq strategies may be misleading, for two principal reasons. First, global changes in transcriptional output are not captured by standard RNA-seq methods. For example, we recently used cell-number normalized RNA-seq to document that the mouse embryonic germline is in a state of global hypertranscription, including at TEs. [85] Second, a high proportion of LINE1 RNA is chromatin associated, and this fraction tends to be resistant to standard extraction of total cellular RNA. [91] Using approaches such as RNA FISH or enrichment for chromatin-bound RNA may provide a more faithful picture of LINE1 RNA abundance. It will in addition be of interest to devise new methods to detect and manipulate the expression of specific LINE1 families and elements, so as to dissect their relative contribution to chromatin organization and development.
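The first of these caveats can be illustrated with a toy calculation. The sketch below is not the procedure used in ref. [85]; the counts, feature names, and function names are all hypothetical, and it assumes that absolute transcript counts per sample are available (for instance, through spike-in controls added in proportion to cell number). It shows why library-size normalization (counts per million) can mask a uniform increase in transcriptional output that scaling by cell number retains.

```python
def counts_per_million(counts):
    """Library-size normalization: each feature as a share of all counts."""
    total = sum(counts.values())
    return {name: 1e6 * c / total for name, c in counts.items()}


def per_cell(counts, n_cells):
    """Cell-number normalization: counts divided by the number of cells."""
    return {name: c / n_cells for name, c in counts.items()}


# Hypothetical absolute transcript counts from the same number of cells.
n_cells = 1000
baseline = {"LINE1": 2000, "geneA": 5000, "geneB": 3000}

# Global hypertranscription, modeled here as a uniform 2-fold increase.
hyper = {name: 2 * c for name, c in baseline.items()}

cpm_base, cpm_hyper = counts_per_million(baseline), counts_per_million(hyper)
cell_base, cell_hyper = per_cell(baseline, n_cells), per_cell(hyper, n_cells)

# Fold change of each feature under the two normalization schemes.
cpm_fold = {g: cpm_hyper[g] / cpm_base[g] for g in baseline}
cell_fold = {g: cell_hyper[g] / cell_base[g] for g in baseline}

for g in baseline:
    print(f"{g}: CPM fold = {cpm_fold[g]:.2f}, per-cell fold = {cell_fold[g]:.2f}")
```

In this toy case every feature shows no change after counts-per-million scaling but a 2-fold increase per cell, because dividing by the library total cancels any factor shared by all transcripts.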
The ability of LINE1 to contribute to genomic variation in conditions of stress raises fascinating future avenues of research. New technologies in single cell genome sequencing may uncover the extent to which stress-induced LINE1 activity leads to genomic mosaicism, and how this can in turn affect development and disease. An understanding of the stress responses that early mammalian embryos employ and how they impact LINE1 activity may shed light on the potential long-term impact of assisted reproduction technologies or gestational stress. LINE1-mediated mosaicism is certainly ripe for exploration in cancer, but also potentially in other instances where cells may be impacted by stress, such as in metabolic disorders or (auto-)immunity. The continuous improvement of technologies for long-read DNA sequencing will facilitate the discovery of LINE1-mediated genomic variation. The high expression and evidence for mobility of LINE1 in neural progenitors and epithelial cancers have led to proposals that LINE1 contributes to genome diversification in these contexts, and we anticipate much to be learned in these areas in the year ahead (for reviews, see [119,129,130]). Given our recent data [92] uncovering a role for LINE1 RNA in ESC self-renewal and early embryo development via regulation of rDNA transcription, independently of retrotransposition, it will be interesting to explore this alternative or additional function of LINE1 in neural progenitors and cancer. The intriguing possibility that LINE1 may be transferred between cells in extracellular vesicles [131] deserves further investigation. Finally, it will be of interest to perform comparative studies including species where the activity of LINE1 is proposed to have become extinct, such as in sigmodontine rodents [132] or megabats. [132] These species contain LINE1 elements in their genome, but with little or no evidence that such elements are still capable of retrotransposition. In these contexts, LINE1 may facilitate genomic variation independent of retrotransposition, as mentioned above. It will also be interesting to determine whether LINE1 DNA/RNA can still have roles in chromatin organization in such species, and/or whether some roles, including a response to stress, were taken over by the expansion of other TEs. [133]
In conclusion, the roles of LINE1 and potentially other TEs as chromatin regulators and genome diversifiers open new vistas across biomedical research fields, whether their focus be on development, adult homeostasis, cancer, transposition biology, or evolution. Exciting surprises on the function of TEs during ontogeny and phylogeny no doubt lie ahead.