Contrasting evolutionary trajectories of multiple retrotransposons following independent allopolyploidy in wild wheats

Authors


Summary

  • Transposable elements (TEs) are expected to be central to genome evolution. To assess the impact of TEs in driving genome turnover, we used allopolyploid genomes, which show considerable deviation from the predicted additivity of their diploid progenitors and have thus undergone major restructuring.
  • Genome survey sequencing was used to select 17 putatively active families of long terminal repeat retrotransposons. Genome-wide TE insertions were genotyped with sequence-specific amplified polymorphism (SSAP) in diploid progenitors and their derived polyploids, and compared with changes in random sequences to assess restructuring of four independent Aegilops allotetraploid genomes.
  • Generally, TEs with different evolutionary trajectories from those of random sequences were identified. Thus, TEs presented family-specific and species-specific dynamics following polyploidy, as illustrated by Sabine showing proliferation in particular polyploids, but massive elimination in others. Contrasting with that, only a few families (BARE1 and Romani) showed proliferation in all polyploids. Overall, TE divergence between progenitors was strongly correlated with the degree of restructuring in polyploid TE fractions.
  • TE families present evolutionary trajectories that are decoupled from genome-wide changes after allopolyploidy and have a pervasive impact on their restructuring.

Introduction

The last decades have highlighted the astonishing dynamics of plant genomes over evolutionary time, mainly in relation to prominent mechanisms such as retrotransposition, recombination and polyploidy (Kejnovsky et al., 2009; Murat et al., 2012). Polyploidy (i.e. hybridization between more or less divergent genomes, associated with duplication) is a recurrent process across angiosperms (Jiao et al., 2011). Nascent and established polyploids show significant deviation from the expected additivity of their progenitor genomes, indicating major restructuring and epigenetic changes after the origin of polyploid lineages (Soltis & Soltis, 1999; Doyle et al., 2008). Molecular mechanisms and evolutionary processes underlying such genome reorganization are not fully clear yet (Tayalé & Parisod, 2013), but the major fraction of plant genomes represented by transposable elements (TEs) seems to be specifically affected by polyploidy (McClintock, 1984; Fedoroff, 2012). ‘Genomic stress’ such as polyploidy may indeed activate TEs, and fast-evolving polyploids thus represent promising models to investigate the role of TEs in driving genome evolution (Comai et al., 2003; Parisod & Senerchia, 2012; Levy, 2013).

Long terminal repeat (LTR) retrotransposons represent the predominant TEs in plants (Kumar & Bennetzen, 1999). Active LTR retrotransposons indeed increase their copy number through a copy-paste mode of transposition, affecting genome size and overall organization (Lynch & Conery, 2003; Ma & Bennetzen, 2004; Wang & Dooner, 2006). However, sequence elimination was highlighted as a major process in TE genome fractions (Vitte & Panaud, 2005), especially following polyploidy (Parisod et al., 2010; Yaakov & Kashkush, 2011; Parisod & Senerchia, 2012). A balance between proliferation and partial elimination by ectopic and illegitimate recombination may thus lead to a high genome turnover (Bennetzen & Kellogg, 1997; SanMiguel et al., 1998; Vitte & Panaud, 2005; El Baidouri & Panaud, 2013). Given that mutations accumulate among TE copies during and after transposition, the different TE lineages form large populations of repetitive sequences that evolve within genomes (Wicker et al., 2007; Jurka et al., 2011; Wicker, 2013). TE families (i.e. TE copies sharing at least 80% nucleotide similarity) may thus show distinct evolutionary dynamics (Senerchia et al., 2013). However, the processes underlying the particular trajectories of TEs remain unclear and may be influenced by intrinsic TE properties as well as processes acting at the host level (Tenaillon et al., 2010; Esnault et al., 2011).

The wheat group belongs to the Triticeae clade of the grass family and presents a great diversity of species that evolved through homoploid divergence, hybridization and allopolyploidy. In particular, wild relatives of cultivated wheat (i.e. Aegilops species) independently evolved multiple di-, tetra- and hexaploid species that mostly co-occur in the Middle East (van Slageren, 1994). The reticulate evolutionary relationships among Aegilops taxa have been thoroughly inferred by genetic and cytogenetic analyses, although they remain poorly dated so far (reviewed in Baum et al., 2012). Aegilops allopolyploids are young entities that mostly formed by combinations of diploid species with D or U genomes (Zohary & Feldman, 1962). Within such polyploid clusters, early cytogenetic evidence showed that chromosomes shared among polyploids (i.e. pivotal genome) remain largely unaltered as compared with their diploid donor, whereas the other genome (i.e. differential genome) presented considerable alteration (Zohary & Feldman, 1962; Feldman, 1965a,b,c). Zohary & Feldman (1962) postulated that this pattern results from gene flow among established allopolyploids, with common pivotal genomes providing substrates for homologous recombination and buffering the initial loss of hybrid fertility, while the differential genomes were accumulating restructuring events through homeologous recombination.

The overall genome organization of hybridizing Aegilops supports this hypothesis to a certain extent (Zohary & Feldman, 1962; Wang et al., 2000), but hardly explains the balanced restructuring of both genomes reported in some species (Kimber et al., 1988; Badaeva et al., 2002, 2004) or that natural hybrids between species with so-called pivotal genomes are noticeably rare. Challenging the pivotal-differential hypothesis, recent studies used various molecular markers to show that asymmetrical restructuring of polyploid genomes may depend on other factors such as sequence types and genome sizes (reviewed in Bento et al., 2011). Additional investigations are thus required to understand the processes underlying the evolution of genomes within species complexes such as Aegilops.

Aegilops allopolyploids present fast-evolving genomes, of which TEs make up c. 80% (Li et al., 2004; Sabot & Schulman, 2009; Wicker & Buell, 2009; Senerchia et al., 2013), thus offering promising models to assess the role of TEs in genome restructuring, and shedding light on the processes shaping genome architecture. Here, we focused on four diploid species, Aegilops caudata (genome C), Aegilops comosa (M), Aegilops tauschii (D) and Aegilops umbellulata (U) that recently combined into four natural allotetraploid species, Aegilops crassa (DM), Aegilops cylindrica (DC), Aegilops geniculata (UM) and Aegilops triuncialis (UC) (Fig. 1). Our aims were to investigate genome restructuring and assess the evolutionary trajectories of abundant LTR retrotransposon families after independent allopolyploidy events. Based on genome survey sequence (GSS) data from two Aegilops polyploid genomes, we designed complementary molecular fingerprint assays tracking restructuring events in random sequences as well as 17 LTR retrotransposon families. Comparisons between TEs and random sequences highlighted the proliferation of several TE families and the predominant sequence deletion in others, indicating species-specific and TE-specific evolutionary trajectories following allopolyploidy. In particular, TE dynamics were associated with divergent insertions between progenitor species, suggesting that genome shocks at the origin of polyploid lineages had a long-lasting influence on genome organization.

Figure 1.

Genome composition of investigated Aegilops diploid and derived allopolyploid species. Genome composition followed van Slageren (1994). Deviations of observed tetraploid genome size from expectations based on the addition of diploid progenitors, according to Eilam et al., 2007, 2008 (using 1 pg of DNA = 978 Mb).

Materials and Methods

Plant material

Three accessions of the polyploids Ae. crassa (PI 487286, PI 542178, TA 1880), Ae. cylindrica (TA 2204 = AE 719, PI 486235, PI 554221), Ae. geniculata (TA 1800, PI 487221, PI 287737), Ae. triuncialis (PI 487246, PI 542345, PI 491442) as well as of diploid progenitor species Ae. caudata (IG 47965, IG 48080, IG 107317), Ae. comosa (AE 1376, AE 1260, AE 1378), Ae. tauschii (IG 47219, IG 46847, AE 528), and two accessions of Ae. umbellulata (IG 48082, IG 46964) were obtained from the Institut für Pflanzengenetik und Kulturpflanzenforschung (AE), the United States Department of Agriculture (PI), the International Center for Agricultural Research in the Dry Areas (IG) and the Wheat Genetic and Genomic Resources Center (TA). These accessions have been characterized as genetically variable in previous studies (Badaeva et al., 2002, 2004; Meimberg et al., 2009). They have been maintained in germplasms by selfing and were thus assumed to be inbred. Plants were grown under controlled conditions (18°C, 18 h light) and DNA of 2-wk-old seedlings was extracted from fresh leaves disrupted in liquid nitrogen, following the standard DNeasy Plant Extraction Mini Kit protocol from Qiagen.

Genome Sequence Survey and selection of TE families

Forty nanograms of genomic DNA of one individual from both Ae. cylindrica (TA 2204 = AE 719) and Ae. geniculata (TA 1800) were mechanically shotgunned and sequenced on half a plate of the Roche 454 GS FLX titanium platform (service provided by Microsynth, Balgach, Switzerland, following manufacturer's instructions). BLASTN of nonredundant reads against the complete TREP database (wheat.pw.usda.gov/ITMI/Repeats/) classified TEs following Senerchia et al. (2013). Portions of reads matching to 300 bp at the 5′ end of the LTR regions of 27 TE families were aligned using ClustalW and analyzed through molecular population genetics to distinguish active from quiescent families following Senerchia et al. (2013). We selected 16 LTR retrotransposon families showing evidence of recent transpositional activity (BARE1, Barbara, Cereba, Claudia, Daniela, Danae, Derami, Fatima, Lila, Maximus, Nusif, Romani, Quinta, Sabine, WHAM, and Xalax) as well as one quiescent TE family (Egug, selected as a negative control) and evaluated nucleotide diversity (π) along their LTRs within a sliding window of 5 bp using DnaSP v5 (Librado & Rozas, 2009). For each TE family, a sequence-specific amplified polymorphism (SSAP) primer was designed across the most conserved region of the alignments using Primer 3 (Rozen & Skaletsky, 2000).
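The sliding-window scan of nucleotide diversity used to locate conserved LTR regions can be sketched as follows. This is a minimal illustration and not the DnaSP implementation: it assumes gap-free aligned sequences of equal length, takes π as the mean proportion of differing sites over all sequence pairs, and uses a made-up toy alignment. The function names are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    pairs = list(combinations(seqs, 2))
    length = len(seqs[0])
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)


def sliding_pi(seqs, window=5, step=1):
    """Return (window start, pi) for each window along the alignment."""
    length = len(seqs[0])
    return [
        (start, nucleotide_diversity([s[start:start + window] for s in seqs]))
        for start in range(0, length - window + 1, step)
    ]


# Toy alignment (hypothetical data): one variable site at position 4.
alignment = ["ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC"]
windows = sliding_pi(alignment, window=5)

# The window with the lowest pi is the most conserved candidate primer region.
most_conserved = min(windows, key=lambda w: w[1])
print(windows)
print("most conserved window starts at", most_conserved[0])
```

In this toy case every window overlapping the variable site has a nonzero π, and only the last window is fully conserved.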

Molecular fingerprint techniques

Amplified fragment length polymorphism (AFLP) and SSAP were carried out following the protocol of Parisod & Christin (2008). Briefly, digestion of genomic DNA with EcoRI and MseI (New England Biolabs Inc.) and then ligation of double-stranded adaptors with T4 ligase (Promega) were performed at 37°C in plates with randomly positioned individual DNA. After inactivation of enzymes at 65°C for 20 min, primers corresponding to adaptors were used for preselective amplification with GoTaq® DNA polymerase (Promega). Twenty-fold diluted PCR products were selectively amplified with a touchdown PCR using primers with three additional nucleotides. For AFLP, unlabeled MseI primers were used together with fluorescently labeled EcoRI primers, whereas SSAP used fluorescently labeled TE family-specific primers (Supporting Information, Table S1) instead of EcoRI primers. SSAP thus mostly amplifies the region encompassing the termini of retrotransposon insertions and their flanking genomic sequences up to a restriction site cleaved by MseI. To minimize experimental inconsistencies, digestion, ligation and preselective amplification were performed simultaneously and PCRs were carried out on thermocyclers with fixed ramp rate. Genotyping error rates (Eb) were estimated based on the replication of the whole procedure on five samples (i.e. 22% of the dataset) as Eb = Mrepl/nbin, where Mrepl is the total number of mismatches between replicated samples and nbin is the total number of scored bins (Bonin et al., 2004). Of the 64 different EcoRI–MseI combinations of AFLP selective primers tested, and the eight MseI selective primer combinations tested for each TE family, only the most reproducible combinations were retained and fully analyzed (17 AFLP combinations and a mean of 4.5 SSAP combinations per TE family).
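As an illustration of the error-rate estimate Eb = Mrepl/nbin, the sketch below compares binary profiles of replicated samples. It assumes that nbin counts every scored bin once per replicated sample (one possible reading of the definition above); the function name, sample labels and profiles are hypothetical.

```python
def genotyping_error_rate(original, replicate):
    """E_b = M_repl / n_bin, computed over all replicated samples.

    Assumption: n_bin counts each scored bin once per replicated sample.
    """
    mismatches = 0
    n_bins = 0
    for sample, profile in original.items():
        repeat = replicate[sample]
        if len(profile) != len(repeat):
            raise ValueError(f"profiles of {sample} differ in length")
        mismatches += sum(a != b for a, b in zip(profile, repeat))
        n_bins += len(profile)
    return mismatches / n_bins


# Hypothetical presence (1) / absence (0) profiles for two replicated samples.
original = {"acc1": [1, 0, 1, 1, 0], "acc2": [0, 0, 1, 0, 1]}
replicate = {"acc1": [1, 0, 1, 0, 0], "acc2": [0, 0, 1, 0, 1]}

rate = genotyping_error_rate(original, replicate)
print(f"E_b = {rate:.2%}")
```

Here one mismatch over ten compared bins gives an error rate of 10%.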

Selective PCR products amplified with FAM™, VIC®, NED™ fluorescent dye were pooled with GeneScan™ 500 LIZ™ Size Standard and separated with an ABI 3500 capillary sequencer. Resulting electropherograms were visualized and scored with GeneMapper 3.5 (Applied Biosystems) using AFLP default peak detection parameters. The scoring was manually checked and loci were recorded as present (1) or absent (0) in binary matrices.

AFLP and SSAP loci analysis

For each species, Nei's gene diversity was estimated, with standard error, for AFLP as well as SSAP for each TE family using the Bayesian method with nonuniform prior distribution of allele frequencies (assuming FIS = 1) as implemented in AFLPsurv (Vekemans, 2002).

For each locus j, frequencies of band presence (1) and absence (0) were estimated among accessions of the diploid progenitors (D1(1)j, D1(0)j and D2(1)j, D2(0)j, respectively) as well as accessions of the derived allopolyploid (A(1)j, A(0)j). The expected polyploid profile (i.e. the additivity of the progenitors) was estimated as probabilities of band presence and absence based on the frequencies in diploid progenitors. The probability of band presence in the polyploid was calculated as E(1)j = D1(1)j × D2(1)j + D1(1)j × D2(0)j + D1(0)j × D2(1)j, whereas the probability of band absence was calculated as E(0)j = D1(0)j × D2(0)j. The observed polyploid profile was then compared with the expected polyploid profile assessing the probability that the locus j was: additive (i.e. identity between expected and observed polyploid profiles) as E(1)j × A(1)j + E(0)j × A(0)j; a new band (i.e. presence of a band in the observed polyploid profile despite absence in the expected profile) as E(0)j × A(1)j; or a lost band (i.e. band absence in the observed polyploid profile despite predicted presence in the expected profile) as E(1)j × A(0)j. Probabilities of additive, new and lost bands were calculated for each locus, summed across loci and divided by the total number of bands to obtain proportions of additive, new and lost band for AFLP and for each SSAP of each TE family. Proportions were calculated for each polyploid accession and averaged within species. The same procedure was applied to the four allopolyploids.
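The probabilities defined above can be written out as a short worked example. The sketch below is a minimal illustration, assuming binary profiles coded as lists of 0/1, a single polyploid accession (so that A(1)j is either 0 or 1) and division by the number of loci; all function names and the toy profiles are hypothetical.

```python
def band_frequency(profiles, locus):
    """Frequency of band presence at one locus among accessions."""
    return sum(p[locus] for p in profiles) / len(profiles)


def locus_probabilities(d1_present, d2_present, a_present):
    """Probabilities that a locus is additive, new or lost in the polyploid."""
    d1_absent, d2_absent, a_absent = 1 - d1_present, 1 - d2_present, 1 - a_present
    # Expected polyploid profile: the band is present if at least one progenitor carries it.
    e_present = (d1_present * d2_present
                 + d1_present * d2_absent
                 + d1_absent * d2_present)
    e_absent = d1_absent * d2_absent
    return {
        "additive": e_present * a_present + e_absent * a_absent,
        "new": e_absent * a_present,
        "lost": e_present * a_absent,
    }


def restructuring_proportions(d1_profiles, d2_profiles, polyploid_profile):
    """Sum the per-locus probabilities and divide by the number of loci."""
    n_loci = len(polyploid_profile)
    totals = {"additive": 0.0, "new": 0.0, "lost": 0.0}
    for j in range(n_loci):
        probs = locus_probabilities(
            band_frequency(d1_profiles, j),
            band_frequency(d2_profiles, j),
            polyploid_profile[j],
        )
        for key in totals:
            totals[key] += probs[key]
    return {key: value / n_loci for key, value in totals.items()}


# Hypothetical profiles: two accessions per progenitor, one polyploid accession.
progenitor_1 = [[1, 1, 0, 0], [1, 0, 0, 0]]
progenitor_2 = [[0, 1, 0, 1], [0, 1, 0, 1]]
polyploid = [1, 1, 1, 0]

proportions = restructuring_proportions(progenitor_1, progenitor_2, polyploid)
print(proportions)
```

In this toy example, two loci are additive, the third is a new band (absent from both progenitors) and the fourth is a lost band (fixed in one progenitor but absent from the polyploid). Averaging such proportions over the accessions of a species would follow the same pattern.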

For each tetraploid species, one-way ANOVA followed by multiple comparisons with Tukey's honest significant differences (HSD) post hoc tests were carried out to identify TE families with proportions of new bands and ratios of lost bands divided by new bands significantly different from AFLP at the 5% level, using the package ‘agricolae’ in R (cran.r-project.org).

Correlation of nonshared bands between progenitors and new bands in polyploids

For each pair of diploid progenitors, the proportion of nonshared bands among progenitor species, taking multiple accessions into account, was estimated for each locus as D1(0)j × D2(1)j + D1(1)j × D2(0)j. It was then summed over loci and divided by the total number of loci for AFLP as well as SSAP for each TE family.
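The proportion of nonshared bands can be illustrated in the same way. The sketch below is a minimal example assuming binary profiles coded as lists of 0/1; the function names and toy profiles are hypothetical.

```python
def band_frequency(profiles, locus):
    """Frequency of band presence at one locus among accessions."""
    return sum(p[locus] for p in profiles) / len(profiles)


def nonshared_proportion(d1_profiles, d2_profiles):
    """Mean over loci of D1(0)j x D2(1)j + D1(1)j x D2(0)j."""
    n_loci = len(d1_profiles[0])
    total = 0.0
    for j in range(n_loci):
        d1 = band_frequency(d1_profiles, j)
        d2 = band_frequency(d2_profiles, j)
        # Probability that the band is present in exactly one of the two progenitors.
        total += (1 - d1) * d2 + d1 * (1 - d2)
    return total / n_loci


# Hypothetical profiles: two accessions per progenitor species, four loci.
progenitor_1 = [[1, 1, 0, 0], [1, 0, 0, 0]]
progenitor_2 = [[0, 1, 0, 1], [0, 1, 0, 1]]

nonshared = nonshared_proportion(progenitor_1, progenitor_2)
print(nonshared)
```

Loci that are polymorphic within a progenitor contribute a fractional value, which is how multiple accessions are taken into account.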

Correlations of the proportions of nonshared bands between progenitors and the proportions of new bands and nonadditive bands in the derived polyploids were assessed with a mixed effect linear model using the maximum likelihood method with the package ‘lme4’ and including TE families nested within polyploid species as random effects. R2 was defined using a function correlating fitted values to observed values in R (Nakagawa & Schielzeth, 2013).

Origin of lost bands in polyploids

The origin of lost bands was assessed for each polyploid on loci presenting band presence in only one progenitor (i.e. either the pivotal or the differential genome in the polyploid). First, Kruskal–Wallis tests assessed whether the proportions of lost bands differed significantly between pivotal and differential genomes in each polyploid. Then, differences in proportions of lost bands between the pivotal and differential genome for each of the 17 TE families were tested with prop.test in R.
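For readers unfamiliar with the second test, a two-sample test of equal proportions can be sketched as a chi-square test on a 2 × 2 table. The example below is a standalone approximation written for illustration, assuming a Yates continuity correction (the default behaviour of prop.test in R) and one degree of freedom; the function name and the band counts are hypothetical and do not come from the study.

```python
import math


def two_proportion_test(x1, n1, x2, n2, correct=True):
    """Chi-square test (1 df) that two proportions x1/n1 and x2/n2 are equal."""
    observed = [[x1, n1 - x1], [x2, n2 - x2]]
    row_totals = [n1, n2]
    col_totals = [x1 + x2, n1 + n2 - x1 - x2]
    total = n1 + n2
    yates = 0.5 if correct else 0.0
    stat = 0.0
    for i in range(2):
        for k in range(2):
            expected = row_totals[i] * col_totals[k] / total
            deviation = abs(observed[i][k] - expected)
            deviation -= min(yates, deviation)  # continuity correction
            stat += deviation ** 2 / expected
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts: lost bands out of progenitor-specific loci
# for the differential genome (50 of 100) and the pivotal genome (30 of 100).
stat, p_value = two_proportion_test(50, 100, 30, 100)
print(f"chi-square = {stat:.2f}, P = {p_value:.4f}")
```

With these made-up counts the difference in band loss between the two genomes would be significant at the 5% level.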

Results

Candidate TE families

Genome survey sequence of Ae. cylindrica (genome DC) and Ae. geniculata (genome UM) by 454 sequencing identified > 70% of the genome as TEs, including > 165 LTR retrotransposon families (Senerchia et al., 2013). Comparison of 300 bp at the 5′ end of LTR regions of the 17 selected TE families identified seven families (BARE1, Danae, Daniela, Fatima, Lila, Romani, Xalax) with clear evidence of recent activity, whereas nine (Barbara, Cereba, Claudia, Derami, Maximus, Nusif, Quinta, Sabine, WHAM) showed signs of both old and recent transpositional activity (Table 1). By contrast, Egug was considered as quiescent and selected as a negative control. Based on GSS data, specific primers for each of these 17 candidate TE families were designed in conserved regions at the 5′ end of the LTR (Table S1) and used to perform SSAP assays representative of the diversity of TE copies within Aegilops genomes.

Table 1. Summary statistics of investigated long terminal repeat (LTR) retrotransposon families based on the sequencing of 300 bp at the 5′ end of the LTR region in Aegilops cylindrica (CY) and Aegilops geniculata (GE)

TE family   Reads CY (a)     π CY (b)   Reads GE (a)     π GE (b)   Activity (c)
Barbara     11082 (1.66)     0.08       11677 (1.81)     0.06       +
BARE1       79188 (11.60)    0.06       89573 (13.41)    0.03       +
Cereba      5212 (0.78)      0.05       8263 (1.28)      0.06       +
Claudia     3841 (0.57)      0.08       4803 (0.74)      0.08       +
Danae       8129 (1.22)      0.19       9903 (1.53)      0.16       +
Daniela     12022 (1.80)     0.04       12587 (1.95)     0.1        +
Derami      5782 (0.87)      0.09       4319 (0.67)      0.13       +
Egug        5368 (0.81)      0.09       5538 (0.86)      0.1        −
Fatima      38357 (5.41)     0.04       38567 (5.97)     0.05       +
Lila        6779 (1.02)      0.14       4018 (0.62)      0.18       +
Maximus     10118 (1.52)     0.04       12798 (1.98)     0.06       +
Nusif       10822 (1.62)     0.11       12200 (1.89)     0.1        +
Quinta      1567 (0.23)      0.06       1906 (0.29)      0.06       +
Romani      10101 (1.51)     0.09       10436 (1.61)     0.09       +
Sabine      7141 (1.07)      0.06       274 (0.04)       na         +
WHAM        15715 (2.35)     0.11       15162 (2.35)     0.1        +
Xalax       7090 (1.06)      0.12       9819 (1.52)      0.15       +

(a) Number of reads from GSS with proportion of reads out of the total of sequences shown in parentheses.
(b) Nucleotide diversity (π) among copies within genome. na, not applicable.
(c) Activity = TE evolutionary dynamics (recently active, +; quiescent, −), inferred from molecular population genetics (details in Senerchia et al., 2013).

Genetic variation in Aegilops and genome restructuring in polyploids

Variable genetic diversity was assessed within Aegilops species, showing specific variation in AFLP as well as the different TE families (Table 2). Genetic diversity within diploid and polyploid species ranged from 0.104 for the TE Romani in Ae. umbellulata to 0.272 for BARE1 in Ae. comosa, and from 0.099 for Cereba in Ae. cylindrica to 0.262 for Derami in Ae. triuncialis, respectively. Several TEs such as BARE1 showed higher genetic diversity than genome-wide AFLPs, but no clear pattern differentiating diploid and tetraploid species was evident. Noticeably, Sabine showed a significantly higher diversity in Ae. cylindrica than in Ae. geniculata. Genetic divergence among species for AFLP as well as the different TE families showed considerable variation depending on the genome fraction considered (Notes S1). In particular, multiple Mantel tests correlating genetic distances between species as measured by AFLP and by SSAP of the different TEs were nonsignificant (after stringent Bonferroni correction for multiple testing), indicating TE-specific patterns of divergence among species.

Table 2. Distribution of loci within and among Aegilops species

Diploid species (Div)
Marker    Ae. caudata (C)   Ae. comosa (M)   Ae. tauschii (D)   Ae. umbellulata (U)
AFLP      0.209 ± 0.006     0.191 ± 0.006    0.191 ± 0.006      0.180 ± 0.006
Barbara   0.226 ± 0.015     0.172 ± 0.015    0.203 ± 0.016      0.194 ± 0.015
BARE1     0.227 ± 0.008     0.272 ± 0.008    0.251 ± 0.008      0.203 ± 0.009
Cereba    0.180 ± 0.010     0.200 ± 0.010    0.201 ± 0.011      0.138 ± 0.008
Claudia   0.178 ± 0.009     0.180 ± 0.009    0.170 ± 0.008      0.115 ± 0.007
Danae     0.179 ± 0.009     0.151 ± 0.009    0.159 ± 0.009      0.139 ± 0.009
Daniela   0.193 ± 0.009     0.236 ± 0.009    0.177 ± 0.008      0.158 ± 0.007
Derami    0.198 ± 0.013     0.238 ± 0.013    0.132 ± 0.012      0.195 ± 0.013
Egug      0.173 ± 0.009     0.193 ± 0.010    0.183 ± 0.009      0.190 ± 0.010
Fatima    0.234 ± 0.009     0.192 ± 0.009    0.154 ± 0.008      0.147 ± 0.008
Lila      0.195 ± 0.009     0.170 ± 0.009    0.181 ± 0.009      0.156 ± 0.009
Maximus   0.174 ± 0.012     0.249 ± 0.013    0.187 ± 0.013      0.174 ± 0.012
Nusif     0.159 ± 0.010     0.169 ± 0.011    0.163 ± 0.011      0.189 ± 0.011
Quinta    0.186 ± 0.010     0.220 ± 0.011    0.225 ± 0.011      0.183 ± 0.100
Romani    0.118 ± 0.008     0.165 ± 0.009    0.158 ± 0.009      0.104 ± 0.008
Sabine    0.257 ± 0.013     0.212 ± 0.014    0.172 ± 0.013      0.190 ± 0.013
WHAM      0.163 ± 0.014     0.183 ± 0.015    0.188 ± 0.015      0.134 ± 0.011
Xalax     0.227 ± 0.012     0.161 ± 0.011    0.244 ± 0.013      0.179 ± 0.011

Ae. crassa (genome DM)
Marker    Div             S       A       L       N
AFLP      0.238 ± 0.007   63.57   55.11   38.34   6.46
Barbara   0.189 ± 0.015   65.77   50.84   31.90   16.73
BARE1     0.259 ± 0.008   59.47   58.07   30.71   11.09
Cereba    0.164 ± 0.009   65.24   54.28   32.66   12.80
Claudia   0.164 ± 0.008   54.95   52.13   35.51   12.16
Danae     0.160 ± 0.009   49.39   48.72   40.61   10.41
Daniela   0.173 ± 0.008   54.65   57.15   31.69   11.02
Derami    0.204 ± 0.013   53.33   50.04   33.74   15.83
Egug      0.184 ± 0.010   67.33   65.38   28.29   6.04
Fatima    0.172 ± 0.008   56.32   46.78   40.14   12.91
Lila      0.186 ± 0.009   60.17   60.98   28.65   10.14
Maximus   0.220 ± 0.013   56.95   51.94   36.84   10.84
Nusif     0.166 ± 0.011   67.35   58.14   30.81   10.70
Quinta    0.178 ± 0.010   61.88   56.91   31.26   11.54
Romani    0.172 ± 0.009   60.42   42.99   35.30   21.51
Sabine    0.151 ± 0.012   67.35   62.18   30.96   6.34
WHAM      0.161 ± 0.014   63.11   56.00   36.66   6.74
Xalax     0.183 ± 0.012   63.34   58.36   33.36   7.93

Ae. cylindrica (genome DC)
Marker    Div             S       A       L       N
AFLP      0.161 ± 0.005   59.21   67.89   27.06   5.26
Barbara   0.145 ± 0.012   54.75   66.47   26.20   6.80
BARE1     0.149 ± 0.006   64.50   69.76   22.05   8.04
Cereba    0.102 ± 0.007   61.90   66.54   27.11   6.07
Claudia   0.113 ± 0.006   55.03   68.71   24.09   6.98
Danae     0.129 ± 0.008   54.08   67.85   25.30   6.57
Daniela   0.110 ± 0.006   55.97   67.80   23.21   8.73
Derami    0.154 ± 0.012   50.56   54.72   33.26   11.57
Egug      0.131 ± 0.007   73.30   76.15   19.91   3.64
Fatima    0.108 ± 0.006   48.46   64.40   26.10   9.31
Lila      0.111 ± 0.006   59.94   73.99   20.37   5.40
Maximus   0.174 ± 0.012   48.53   63.73   30.93   4.91
Nusif     0.099 ± 0.007   60.32   70.51   24.32   4.80
Quinta    0.130 ± 0.008   63.38   71.09   21.06   7.59
Romani    0.077 ± 0.006   55.28   63.89   24.97   10.87
Sabine    0.142 ± 0.009   58.34   78.91   13.87   6.76
WHAM      0.107 ± 0.008   60.75   76.85   16.65   5.56
Xalax     0.108 ± 0.006   57.30   74.28   18.18   7.19

Ae. geniculata (genome UM)
Marker    Div             S       A       L       N
AFLP      0.214 ± 0.006   69.61   62.06   33.14   4.65
Barbara   0.173 ± 0.014   57.11   55.34   34.83   9.27
BARE1     0.238 ± 0.008   55.55   57.60   29.06   13.20
Cereba    0.146 ± 0.008   49.83   59.06   28.64   12.06
Claudia   0.147 ± 0.007   46.21   64.58   26.54   8.67
Danae     0.144 ± 0.008   46.19   59.36   27.17   13.20
Daniela   0.209 ± 0.008   50.52   59.24   30.74   9.85
Derami    0.202 ± 0.012   55.57   65.58   28.07   5.95
Egug      0.171 ± 0.009   66.76   71.51   23.00   5.19
Fatima    0.167 ± 0.008   48.61   57.54   31.80   10.48
Lila      0.145 ± 0.008   55.10   66.02   24.86   8.76
Maximus   0.205 ± 0.013   51.96   56.15   33.33   10.01
Nusif     0.154 ± 0.009   55.40   68.36   24.89   6.42
Quinta    0.152 ± 0.008   50.39   64.04   27.91   7.80
Romani    0.125 ± 0.007   49.45   50.91   29.75   19.11
Sabine    0.066 ± 0.005   59.47   59.27   37.51   2.69
WHAM      0.173 ± 0.014   54.98   68.69   23.72   6.95
Xalax     0.160 ± 0.010   60.60   66.99   23.49   9.15

Ae. triuncialis (genome UC)
Marker    Div             S       A       L       N
AFLP      0.215 ± 0.006   59.43   67.61   25.82   6.42
Barbara   0.210 ± 0.015   58.29   59.87   32.84   6.76
BARE1     0.216 ± 0.008   57.06   64.37   24.05   11.43
Cereba    0.151 ± 0.009   52.94   63.55   28.98   7.19
Claudia   0.151 ± 0.008   36.56   66.16   26.52   7.09
Danae     0.130 ± 0.008   49.95   60.90   29.82   9.01
Daniela   0.147 ± 0.007   48.44   65.39   27.88   6.55
Derami    0.262 ± 0.015   50.62   56.59   38.77   4.21
Egug      0.160 ± 0.009   78.71   77.28   18.63   3.78
Fatima    0.160 ± 0.007   68.76   65.52   26.13   8.18
Lila      0.161 ± 0.008   57.55   70.31   21.71   7.74
Maximus   0.189 ± 0.012   54.43   64.70   27.36   7.52
Nusif     0.170 ± 0.010   62.89   72.85   21.94   4.87
Quinta    0.181 ± 0.009   55.46   66.87   24.70   8.17
Romani    0.131 ± 0.008   47.07   59.43   26.10   14.18
Sabine    0.187 ± 0.012   59.52   73.02   20.48   5.89
WHAM      0.151 ± 0.013   64.54   70.51   23.64   5.18
Xalax     0.203 ± 0.012   55.90   63.67   32.40   3.57

Nei's gene diversity (Div) within diploid and allotetraploid species (genome composition following van Slageren, 1994) is presented (± SE) for random sequences (amplified fragment length polymorphism, AFLP) as well as the 17 transposable element (TE) families (sequence-specific amplified polymorphism, SSAP). Mean proportions of shared bands among diploid progenitors (S; nonshared bands correspond to 1 − S) and comparisons with the expected additivity of the diploids (A, additive bands; L, lost bands; N, new bands) are presented for each allotetraploid species.

Genome restructuring in four Aegilops allotetraploids was assessed by comparing the fingerprint profiles of multiple accessions for random sequences (i.e. using AFLP) and for the 17 candidate TE families (i.e. using SSAP) with the expected additivity of their respective diploid progenitor species. Comparison of AFLP with SSAP allows contrasting background genome restructuring and specific reorganization of TE fraction. Genotyping of 1224 AFLP loci (with an error rate of 3.36%) and 238–792 SSAP loci for each of the 17 TE families (with an error rate ranging from 1 to 6.6%; Tables S2, S3) from three accessions per species took genetic variation into account.

Combinational probabilities based on the frequencies of band presence and absence in the diploid progenitors were used to determine the expected polyploid profile at each locus and were then compared with the observed profile for each polyploid accession, assessing the probability that the locus was additive or lost, or that a new locus appeared in the polyploid (Table 2). Averaged over loci, the mean proportion for additive AFLP bands across polyploids was 63.1%, close to the overall proportion of SSAP bands for the 17 TE families (62.9%). The proportion of additive AFLP bands, however, varied considerably among the four allopolyploid species, indicating a species-specific degree of background restructuring following the different polyploidy events. In particular, Ae. crassa showed a higher degree of restructuring (i.e. nonadditive AFLP bands, 44.9%) than Ae. cylindrica (32.1%), Ae. geniculata (37.9%) and Ae. triuncialis (32.9%). Proportions of additive SSAP bands showed considerable variation among TE families, ranging from 43.0% for Romani in Ae. crassa to 78.9% for Sabine in Ae. cylindrica.

New bands in random sequences and in multiple TE families

Comparing degrees of restructuring from a large number of SSAP loci for each TE family with the degree of restructuring in random sequences (AFLP) allowed us to highlight TEs with specific evolutionary trajectories, while accounting for possible ambiguities of fingerprint methods (Petit et al., 2010; Sarilar et al., 2013). Tukey HSD tests assessed that 15 out of 17 TE families showed significantly higher proportions of new bands than AFLP, indicating proliferation in at least one of the tetraploid species (Fig. 2a). The number of TEs showing higher proportions of new bands than AFLP varied among the tetraploids (Table S3). Nine TE families in Ae. crassa, 10 in Ae. cylindrica and 13 in Ae. geniculata, but only two TEs in Ae. triuncialis, showed such a pattern indicative of TE proliferation. However, few TEs showed consistent patterns of new bands in all polyploids. Only BARE1 and Romani presented significantly higher proportions of new bands than AFLP in the four tetraploids, whereas Egug and Nusif never exhibited significant differences. Most TE families instead showed species-specific patterns, with some TE families showing significantly more new bands in three tetraploids (Barbara, Claudia, Daniela, Fatima and Quinta in Ae. crassa, Ae. cylindrica and Ae. geniculata), in two tetraploids (Cereba in Ae. crassa and Ae. geniculata; Derami in Ae. crassa and Ae. cylindrica; Xalax in Ae. cylindrica and Ae. geniculata) or in only one tetraploid (Danae, Lila, Maximus and WHAM in Ae. geniculata or Sabine in Ae. cylindrica).

Figure 2.

Deviation from the expected additivity of progenitors in Aegilops allopolyploid species. Proportions of deviating bands for the 17 transposable element (TE) families investigated with sequence-specific amplified polymorphism (SSAP) and random sequences (amplified fragment length polymorphism, AFLP). (a) Proportions of new bands; (b) ratio between proportions of lost and new bands (log-scaled). TE families (crosses) presented in gray show nonsignificantly different proportions from the AFLP (black ellipse), as assessed by Tukey tests. TE families differing significantly from the AFLP (in black) and not sharing small letters show significantly different proportions.

As expected, TE families with clear evidence of recent activity as assessed by 454 GSS (i.e. Barbara, BARE1, Cereba, Claudia, Daniela, Danae, Derami, Fatima, Lila, Maximus, Quinta, Romani, Sabine, WHAM and Xalax) showed higher proportions of new SSAP bands indicative of transposition after polyploidy in Ae. cylindrica and/or Ae. geniculata (see later). By contrast, molecular fingerprints detected nonsignificantly higher proportions of new SSAP bands than AFLP for Nusif and Egug. Egug, but also Nusif to a certain extent, indeed showed quiescence based on GSS data (also see Senerchia et al., 2013). Both high-throughput sequencing and fingerprinting approaches showed congruent results, but their respective advantages deserve further attention.

Balance of lost/new bands in random sequences and in multiple TE families

The proportion of lost bands divided by the proportion of new bands offered an integrative proxy of the evolutionary trajectories of TE families towards either sequence deletion (> 1) or proliferation (< 1) when compared with AFLP. Ratios for AFLP as well as TE families were usually larger than one, indicating that sequence loss was the major restructuring process following polyploidy (Fig. 2b). Tukey HSD tests assessed that all 17 TE families had significantly different ratios from the AFLP ratio in at least one of the four polyploids (Table S3). The number of TEs with significantly different ratios of lost/new band proportions from the AFLP ratio varied considerably among polyploids. None of the TE families displayed significantly different ratios from the AFLP ratio in Ae. triuncialis, whereas 12 and 10 TE families offered evidence of proliferation with significantly lower ratios than the AFLP ratio in Ae. crassa and Ae. cylindrica, respectively. In Ae. geniculata, all TEs presented significantly different ratios from AFLP, with 16 families showing a lower ratio indicating proliferation, whereas Sabine presented a higher ratio indicating considerable sequence loss after polyploidy. TE families thus showed species-specific long-term evolutionary trajectories following independent polyploidy events.

Most TEs with significantly different ratios of proportions of lost/new bands compared with AFLP showed evidence of proliferation, but also revealed noticeable exceptions. In particular, Sabine showed considerable variation among tetraploid species, with clear evidence of proliferation in Ae. cylindrica (ratio of 2.05, significantly lower than AFLP), whereas considerable band loss was reported in Ae. geniculata (ratio of 13.94, significantly higher than AFLP). These results were firmly supported by random GSS, which identified Sabine as the TE presenting the largest difference in number of reads, with > 7000 out of 667 485 reads in Ae. cylindrica (i.e. 1.06% of the genome), whereas as few as 274 out of 646 327 reads (0.04%) were detected in Ae. geniculata. Semiquantitative PCR using specific primers designed in the 300 bp of the LTR of Sabine on the same DNA extracts further confirmed the contrasting abundance of TEs in the allopolyploid species (data not shown).
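The ratios reported for Sabine can be recomputed directly from the proportions of lost (L) and new (N) bands listed in Table 2. The short sketch below does this arithmetic for Sabine and for AFLP in the two species; the dictionary layout and variable names are hypothetical, and the comparison is purely descriptive (it does not reproduce the Tukey HSD tests).

```python
# (lost, new) band proportions in percent, taken from Table 2.
lost_new = {
    ("Sabine", "Ae. cylindrica"): (13.87, 6.76),
    ("Sabine", "Ae. geniculata"): (37.51, 2.69),
    ("AFLP", "Ae. cylindrica"): (27.06, 5.26),
    ("AFLP", "Ae. geniculata"): (33.14, 4.65),
}

# Ratio of lost to new bands for each marker and species.
ratios = {key: lost / new for key, (lost, new) in lost_new.items()}

for species in ("Ae. cylindrica", "Ae. geniculata"):
    sabine = ratios[("Sabine", species)]
    aflp = ratios[("AFLP", species)]
    trend = "lower" if sabine < aflp else "higher"
    print(f"{species}: Sabine {sabine:.2f} vs AFLP {aflp:.2f} ({trend} than AFLP)")
```

The Sabine ratio falls below the AFLP ratio in Ae. cylindrica and above it in Ae. geniculata, matching the contrast described above.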

Correlation between progenitor divergence and restructuring after polyploidy

A linear mixed model nesting TE families within polyploid species as random effects assessed that the proportion of nonshared bands between pairs of diploid progenitors was significantly associated with the proportion of new bands in the derived polyploid (Fig. 3a; maximum likelihood linear mixed effect model; t-value = 4.18, P < 0.01, R2 = 0.72). Thus, for each TE family, the more divergent the arrangements of TE insertions among the diploid genomes being merged, the more new SSAP bands indicative of transposition were observed for that specific TE in the derived allotetraploid.

Figure 3.

Association of divergence between progenitor species and restructuring in derived Aegilops polyploids. General linear mixed model associating proportions of nonshared bands between the progenitors with the proportion of new bands (a) or the proportions of lost and new bands (b) in the allopolyploids for each transposable element (TE) family. Gray arrows indicate proportions of nonshared bands between the progenitor species for random sequences (amplified fragment length polymorphism, AFLP), whereas letters and lines show proportions and relationships, respectively, for the 17 TE families in Aegilops crassa (R, solid lines), Aegilops cylindrica (C, short dashed), Aegilops geniculata (G, dotted) and Aegilops triuncialis (T, long dashed).

A similar model assessed that the proportion of nonshared bands between pairs of diploid progenitors was significantly associated with the proportions of nonadditive bands for each TE family (Fig. 3b, linear mixed effect model; t-value = 5.73, P < 0.01, R² = 0.74). Thus, the higher the divergence between the respective diploid progenitors, the higher the proportion of nonadditive bands (i.e. new bands and lost bands) in the derived polyploid.
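
The band categories entering these models can be sketched as set operations on presence/absence profiles. The Python example below is a deterministic simplification using one profile per species: it does not reproduce the probabilistic handling of intraspecific variation used in this study, the denominators are assumptions made for illustration, and the band sizes are hypothetical.

```python
def band_proportions(parent_a: set, parent_b: set, polyploid: set) -> dict:
    """Proportions of nonshared, new, lost and nonadditive bands.

    Each argument is the set of band sizes scored as present in one profile.
    """
    expected = parent_a | parent_b    # additive expectation for the polyploid
    nonshared = parent_a ^ parent_b   # present in only one progenitor
    new = polyploid - expected        # absent from both progenitors
    lost = expected - polyploid       # expected but missing in the polyploid
    scored = expected | polyploid     # every locus scored in the comparison
    return {
        "nonshared": len(nonshared) / len(expected),
        "new": len(new) / len(scored),
        "lost": len(lost) / len(scored),
        "nonadditive": (len(new) + len(lost)) / len(scored),
    }


# Hypothetical band sizes in bp (illustration only).
result = band_proportions(
    parent_a={100, 150, 200, 250},
    parent_b={150, 200, 300},
    polyploid={100, 150, 200, 320},
)
print(result)
```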

Origin of lost bands following polyploidy

The proportion of lost bands originating from the pivotal vs the differential genome was assessed on a subset of loci that were specific to each progenitor species (Table S4). The proportions of lost bands were significantly higher in the differential than in the pivotal genome in Ae. crassa (Kruskal–Wallis test, χ² = 69.04, P < 0.001) for all TEs except Quinta (Fig. 4). Similarly, Ae. triuncialis (χ² = 29.77, P < 0.001) and Ae. cylindrica (χ² = 4.39, P = 0.036) showed significantly higher band loss from the differential genome, but only specific TEs significantly followed this pattern (Claudia, Danae, Fatima, Nusif and WHAM for Ae. triuncialis, and BARE1, Maximus and Nusif for Ae. cylindrica). By contrast, Ae. geniculata showed significantly higher proportions of lost bands from the pivotal genome (χ² = 50.81, P < 0.001), with significant differences for Cereba, Maximus, Romani, Sabine and WHAM.
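
For readers wishing to see how such a test statistic translates into a P-value, the following Python sketch implements the Kruskal–Wallis H statistic with the usual tie correction, using the standard library only. It assumes a comparison of two groups (pivotal vs differential), so that the statistic is referred to a χ² distribution with one degree of freedom; this degree of freedom is inferred from the design rather than stated in the text. Under that assumption, χ² = 4.39 corresponds to P ≈ 0.036, consistent with the value reported for Ae. cylindrica. The input proportions are hypothetical.

```python
import math
from collections import Counter


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with the standard tie correction."""
    pooled = sorted(value for group in groups for value in group)
    n_total = len(pooled)

    # Assign each distinct value the mean of the ranks it occupies.
    rank_of = {}
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    rank_term = sum(
        sum(rank_of[v] for v in group) ** 2 / len(group) for group in groups
    )
    h = 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)

    # Tie correction; undefined (division by zero) if all values are identical.
    ties = Counter(pooled).values()
    correction = 1 - sum(t**3 - t for t in ties) / (n_total**3 - n_total)
    return h / correction


def chi2_pvalue_df1(statistic):
    """Upper-tail P of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))


# Hypothetical proportions of lost bands per TE family (illustration only).
pivotal = [0.10, 0.12, 0.15, 0.11, 0.09]
differential = [0.20, 0.25, 0.18, 0.30, 0.22]
h = kruskal_wallis_h([pivotal, differential])
print(f"H = {h:.2f}, P = {chi2_pvalue_df1(h):.4f}")

# Consistency check against a statistic reported in the text (chi2 = 4.39).
print(f"P for chi2 = 4.39: {chi2_pvalue_df1(4.39):.3f}")
```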

Figure 4.

Proportion of lost bands originating from the differential and the pivotal genome in the allotetraploid species Aegilops crassa (DM), Aegilops cylindrica (DC), Aegilops geniculata (UM) and Aegilops triuncialis (UC). Significant differences were assessed by Kruskal–Wallis tests. Medians and quartiles are shown as boxes, whereas whiskers show minimum and maximum values within 1.5 interquartile ranges. Open circles, outlier data; closed circles, proportions of lost random sequences (amplified fragment length polymorphism, AFLP).

Discussion

Restructuring of polyploid genomes

The four established Aegilops allotetraploids examined here, each combining different diploid genomes, were surveyed with two complementary molecular fingerprinting methods tracking either random (AFLP) or TE (SSAP) sequences and revealed considerable departure from the expected additivity of their progenitors (Table 2). Despite a relatively low number of surveyed accessions per species, our quantitative handling of the presence/absence of hundreds of loci through a probabilistic approach sheds light on genome evolution in wild wheats, taking intraspecific variation into account. This approach departs from the traditional comparison of single accessions through direct counts and could be straightforwardly extended to large numbers of samples across the distribution ranges of species.

Genome restructuring highlighted here is congruent with GSS data (Senerchia et al., 2013) and coherent with the otherwise reported instability of the Triticum–Aegilops genomes over evolutionary time (Badaeva et al., 2002, 2004, 2007; Brenchley et al., 2012; Yaakov et al., 2013). It further highlights allopolyploidy as a major process eliciting drastic genome changes (McClintock, 1984; Comai, 2000; Liu & Wendel, 2003) as otherwise reported in cultivated wheat (Kashkush et al., 2003; Kraitshtein et al., 2010; Feldman & Levy, 2012).

Aegilops is a radiating genus, whose polyploids most likely originated during the Pleistocene (Baum et al., 2012). Despite questionable dating of polyploidy events (Doyle & Egan, 2010), Aegilops allopolyploids were assumed to be of comparably recent origin. However, our results indicate variable degrees of genome restructuring as illustrated by Ae. crassa presenting a higher proportion of nonadditive bands than Ae. cylindrica, Ae. geniculata or Ae. triuncialis. Noticeably, Ae. crassa presents an overall distinctive evolutionary trajectory, as it is the only polyploid examined here showing genome upsizing (Eilam et al., 2008). This species presents an atypically restricted distribution range for a polyploid Aegilops (Kilian et al., 2011), but it remains unknown to what extent such genome dynamics are associated with processes acting at the level of natural populations (Bonchev & Parisod, 2013).

As expected under the pivotal-differential hypothesis proposed by Zohary & Feldman (1962) to explain chromosomal repatterning in polyploid species of the U-cluster, D and U genomes showed overall lower amounts of band loss than differential genomes (here, C and M) in all polyploids except Ae. geniculata. In agreement with this, detailed cytogenetic work has shown substantial modification of both parental genomes in Ae. geniculata, but also in Ae. crassa (Kimber et al., 1988; Badaeva et al., 2002, 2004). The present results only partially match expectations raised by the pivotal-differential hypothesis and further indicate that restructuring does not consistently affect the larger parental genome as suggested by Bento et al. (2011). Accordingly, processes other than long-term gene flow between species and intergenomic recombination also drive genome reorganization in established polyploids (Feldman et al., 2012). In particular, the specific trajectories of repeated sequences reported here suggest that the intrinsic TE dynamics played a significant role in restructuring polyploid genomes.

TE evolutionary trajectories following allopolyploidy

This study tracks several LTR retrotransposon families at a genome-wide scale, thus offering a powerful comparative framework to contrast their dynamics, particularly when compared with random sequences. Large numbers of SSAP loci were indeed scored for the 17 TE candidate families and systematically compared with AFLP proportions to minimize potential bias owing to: polymorphic loci segregating among populations of the diploid species; nonadditive bands appearing following molecular changes at the insertion site and modifying the size of the amplified product rather than being indicative of transposition events; and the sensitivity of the EcoRI restriction enzyme to rare cytosine methylation states (Cervera et al., 2003). Nonadditive SSAP bands could indeed find their origin in more complex scenarios than transposition and deletions, as shown by traditional studies that cloned and sequenced polymorphic SSAP and that reported new/lost bands corresponding to transposition/deletion events only to a certain extent (Petit et al., 2010; Sarilar et al., 2013). In this report, we provide an original solution circumventing these limitations by relying on quantitative comparisons between SSAP (i.e. tracking TE insertions) and AFLP (i.e. tracking random sequences). As these techniques share very similar features, molecular events that are not TE-specific (e.g. chromosomal rearrangements, deletions or introgression) result in nonadditive bands in both SSAP and AFLP profiles, whereas TE-specific events are apparent in corresponding SSAP profiles only. Such a probabilistic approach does not designate specific bands, but highlights TEs showing significantly higher proportions of nonadditive SSAP bands than AFLP profiles as a result of a predominance of TE-specific events (e.g. reorganization in TE-rich genome regions such as heterochromatin; Le Rouzic et al., 2007; Parisod et al., 2009). In loosely compartmentalized genomes such as Aegilops, SSAP vs AFLP patterns can thus be chiefly interpreted as evolutionary trajectories of the corresponding TEs (e.g. proliferation).

This study assessed TE families with significantly higher proportions of new SSAP bands than new AFLP bands and offered convincing support for proliferative transposition events after allopolyploidy. Noticeably, few TE families showed consistent proliferation in all polyploids. Only BARE1, a TE known to be active in Triticeae (Vicient et al., 1999), and Romani always presented evidence of continual transpositional activity, whereas Nusif and Egug were seemingly quiescent. Other TE families presented evidence of proliferation after particular allopolyploidy events, but remained quiescent after others (Fig. 2).

High proportions of lost bands highlighted sequence deletion as a major restructuring process following polyploidy. In particular, the balance between lost and new bands confirmed proliferation of a majority of TE families as compared with random sequences, but indicated that deletion of TE sequences was usually predominant. Such evolutionary trajectories are in line with the general downsizing of polyploid genomes (Leitch & Bennett, 2004; Leitch et al., 2008) as reported in Ae. cylindrica, Ae. geniculata or Ae. triuncialis (Ozkan et al., 2003; Eilam et al., 2007, 2008). Accordingly, the predominance of band loss reported here matches predictions raised by the increase/decrease model of genome size evolution as a balance between transposition and small deletions resulting from illegitimate recombination (Ma et al., 2004; Parisod et al., 2010; El Baidouri & Panaud, 2013). Consequently, TE genome fractions showed considerable turnover after independent polyploidy events, supporting genome divergence whose precise evolutionary causes and consequences have still not been fully investigated.

The survey of multiple LTR retrotransposons revealed both species-specific and TE-specific dynamics in response to allopolyploidy as a common trigger. The contrasting fate of Sabine in different polyploids is particularly illustrative of the variable evolutionary trajectories of TEs after polyploidy. Several lines of evidence indeed revealed that this retrotransposon presented significant proliferation in Ae. cylindrica, but massive elimination in Ae. geniculata, whereas it was apparently quiescent in the other polyploids. The molecular mechanisms underlying such a deletion of Sabine sequences in Ae. geniculata remain elusive, as illegitimate recombination is unlikely to achieve such specific elimination of interspersed TE insertions (Devos et al., 2002; El Baidouri & Panaud, 2013). The evolutionary processes driving contrasting evolutionary trajectories of TEs in different species deserve further attention.

Determinism of allopolyploid genome evolution

Despite species-specific and TE-specific evolutionary trajectories, TE dynamics following allopolyploidy seems nonrandom. Differences in arrangements of TE insertions between progenitors, rather than genome-wide differentiation, showed a significant association with restructuring of corresponding TEs in the polyploids. In other words, for a given TE family, the higher the divergence between the diploid genomes being merged, the higher the turnover of the corresponding TE fraction (i.e. new bands indicative of transposition and nonadditive bands indicative of genome changes) in established polyploids examined here.

Several nonexclusive processes may account for TE dynamics after allopolyploidy (reviewed in Parisod et al., 2010). The present results highlighting TE divergence among progenitors as an influential factor in the TE dynamics in allopolyploids largely support the genome shock hypothesis. Allopolyploidy expectedly weakens key epigenetic processes repressing TEs, leading to their activation (Bourc'his & Voinnet, 2010; Parisod & Senerchia, 2012). Consequently, the merging of specific parental genomes with divergent TEs at the origin of a particular allopolyploidy event could result in genomic conflicts that would drive nonrandom restructuring of TE families (Tayalé & Parisod, 2013), as reported here in all polyploids examined.

Conclusions

The evolutionary trajectories of TEs do not necessarily match the dynamics of genomic backgrounds (e.g. Le Rouzic et al., 2007; Parisod et al., 2012). TEs can indeed show intrinsic dynamics through proliferation in response to various triggers, such as genome shocks, but their dynamics are also influenced by extrinsic factors such as sequence deletion (Kejnovsky et al., 2009; Murat et al., 2012; El Baidouri & Panaud, 2013) or evolutionary processes acting at the host level (Tenaillon et al., 2010; Bonchev & Parisod, 2013).

Transposable elements generally followed predictions raised by the pivotal-differential hypothesis in Ae. crassa, suggesting that extrinsic factors have shaped the evolutionary trajectories of TEs to a large extent. However, most TEs were evenly removed from both parental genomes and showed TE-specific trajectories as compared with random sequences in the majority of polyploids. Furthermore, the present study revealed long-term restructuring of TE genome fractions that is remarkably coherent with expectations raised by intrinsic properties of TEs in response to the initial genome shock of allopolyploidy. Intrinsic and extrinsic processes may act in concert in driving the evolutionary trajectories of TEs (Ma & Gustafson, 2005), but their relative importance remains elusive. As polyploidy-induced genome shocks probably result in TE activation and genome restructuring during the first generations after the origin of new polyploid species (Ha et al., 2009; Parisod & Senerchia, 2012), resynthesized polyploids could be compared with established ones to distinguish processes acting after initial genome shock and gradual changes occurring throughout the life span of host species. In that respect, polyploid speciation is a promising model to investigate the multiple factors controlling TE dynamics.

Acknowledgements

We thank R. A. Slobodeanu for statistical advice, as well as M-A. Grandbastien, A. Tayalé and anonymous reviewers for valuable comments on the manuscript. This work was funded by the Swiss National Science Foundation through the National Centre of Competence in Research ‘Plant Survival’ and a grant (PZ00P3-131950 to C.P.).
