Although the genome of Arabidopsis thaliana has a small amount of repetitive DNA, it contains representatives of most classes of mobile elements. However, to date, no miniature inverted-repeat transposable element (MITE) has been described in this plant. Here, we describe a new family of repeated sequences that we have named Emigrant, which are dispersed in the genome of Arabidopsis and fulfil all the requirements of MITEs. These sequences are short, AT-rich, have terminal inverted repeats (TIRs), and do not seem to have any coding capacity. Evidence for the mobility of Emigrant elements has been obtained from the absence of one of these elements in a specific Arabidopsis ecotype. Emigrant is also present in the genome of different Brassicae and its TIRs are 74% identical to those of Wujin elements, a recently described family of MITEs from the yellow fever mosquito Aedes aegypti.
Transposable elements have been divided into two classes according to their mode of propagation. Class I elements, also known as retrotransposons, transpose via an RNA intermediate, while class II elements transpose by a DNA–DNA mechanism. Elements of both classes have been described in plants and both seem to be widely distributed, although retrotransposons are by far the most abundant (Grandbastien 1992; Saedler & Gierl 1996).
In the last few years a new class of transposable elements, called MITEs (miniature inverted-repeat transposable elements), has been described in plants (Wessler et al. 1995). These elements share features of both class I and class II elements and, therefore, remain unclassified (Wessler et al. 1995). The different MITEs described so far share structural, but not sequence, similarity. They are short A/T-rich DNA sequences, have no coding capacity, have the potential to form DNA secondary structure, and are flanked by inverted repeat sequences (Wessler et al. 1995). Elements from the same family share similar inverted-repeat sequences, but only elements belonging to the same subfamily have internal sequence similarities (Bureau & Wessler 1994a; Río et al. 1996).
While MITEs were first described in plants, other short interspersed repeated elements having some characteristics of MITEs are present in animals, e.g. Xenopus laevis (Morgan & Middleton 1990; Ünsal & Morgan 1995) and humans (Morgan 1995; Smith & Riggs 1996). Recently, different families of mobile elements with all the characteristics of MITEs have been described in the yellow fever mosquito Aedes aegypti (Tu 1997).
Emigrant is a new family of MITEs from Arabidopsis thaliana
During the characterisation of the Arabidopsis thaliana chromosome IV genomic sequences obtained in our laboratory (within the framework of the European Arabidopsis Genome Project), a short sequence was found to have a high level of sequence similarity with four sequences dispersed in the Arabidopsis genome. A careful search in databases, using these five sequences as a query, revealed that the genome of Arabidopsis contains at least 14 short sequences displaying a high degree of sequence similarity (see Table 1). Other sequences showing a more limited degree of sequence similarity are also present in the genome of this plant (not shown). These sequences are found in different locations on different chromosomes of Arabidopsis (see Table 1). We have named this new family of repetitive sequences Emigrant (Emi).
Table 1. Characteristics of Emigrant elements
Columns: Similarity to consensus (%); Closest ORF prediction; ΔG° (kcal mol⁻¹)
ΔG° was not determined (n.d.) when only partial sequence was available. ORF predictions are not available for sequences flanking Emi1, Emi2, Emi3, Emi6, Emi7, Emi8, and Emi10; for these elements the closest ORF prediction is not shown (n.d.).
The name of the BAC clone containing this element is given.
We have not found any sequence similarity between Emigrant sequences and any other repetitive sequence described to date. Nevertheless, Emigrant has some of the features of transposable elements. The localization of the 14 Emi elements on three different chromosomes (see Table 1), together with the results of Southern blot hybridizations (not shown), suggests that this element is dispersed in the Arabidopsis genome. The ends of Emigrant elements are inverted repeated sequences of 24 nt (see Fig. 1). Terminal inverted repeats (TIRs) are characteristic of class II transposons, and also of the new class of short repeated elements known as MITEs. Like MITEs, Emigrant elements do not seem to have any coding capacity, are AT-rich, and have the potential to form stable secondary structures with ΔG° values comparable to those reported for other families of MITEs (Bureau & Wessler 1992; Bureau & Wessler 1994a,b) (see Table 1). In addition, Emi elements are flanked by the dinucleotide TA, which could represent a target site duplication generated upon insertion (see Fig. 1) and coincides with the TA(A) target site duplications of MITEs (Bureau & Wessler 1992; Bureau & Wessler 1994a,b; Bureau et al. 1996; Río et al. 1996; Tenzen et al. 1994). Since the TIR sequences of Emigrant elements do not have any significant homology with those of other plant MITE families described (see Fig. 2), we propose that the Emigrant elements are a new family of MITEs.
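The structural criteria listed above (terminal inverted repeats, AT-richness, and a flanking TA dinucleotide) can be checked on a candidate sequence with a few lines of code. The following is a minimal sketch only, not the procedure used in this work: the function names are hypothetical, and the toy sequence is invented for illustration and is not an Emigrant sequence.

```python
# Hypothetical helpers for checking MITE-like structural features.
# The toy sequence below is invented; it is NOT a real Emigrant element.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def tir_identity(element: str, tir_len: int) -> float:
    """Fraction of positions at which the left end of the element matches
    the reverse complement of its right end (1.0 = perfect inverted repeat)."""
    left = element[:tir_len].upper()
    right_rc = reverse_complement(element[-tir_len:])
    matches = sum(a == b for a, b in zip(left, right_rc))
    return matches / tir_len


def at_content(seq: str) -> float:
    """Fraction of A and T bases in the sequence."""
    seq = seq.upper()
    return (seq.count("A") + seq.count("T")) / len(seq)


def has_ta_duplication(left_flank: str, right_flank: str) -> bool:
    """True if the element is immediately flanked by TA on both sides."""
    return left_flank.upper().endswith("TA") and right_flank.upper().startswith("TA")


# Toy element: a 10-nt TIR, an AT-rich core, and the inverted copy of the TIR.
toy_tir = "CAGTAAAACC"
toy_element = toy_tir + "ATATTTTAAT" + reverse_complement(toy_tir)

print(tir_identity(toy_element, 10))         # 1.0 for this perfect toy repeat
print(at_content(toy_element))               # AT fraction of the toy element
print(has_ta_duplication("GGCTA", "TACCG"))  # True
```

For a real element, `tir_len` would be set to the TIR length reported in the text, and imperfect repeats would give an identity below 1.0.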
We have studied the possible mobility of Emigrant elements by looking for their presence at particular sites among different Arabidopsis ecotypes. PCR amplification of seven regions that contain an Emigrant element in the Columbia ecotype revealed that two of them are polymorphic among the four Arabidopsis ecotypes studied here. Figure 3 presents the analysis of one of these polymorphic regions. This result suggests that Emigrant elements have actively transposed since the divergence of these ecotypes. Comparison of the sequences containing an Emigrant insertion with the corresponding empty sites has allowed us to confirm that Emi elements generate a duplication of the dinucleotide TA upon insertion (see Fig. 3), which coincides with the consensus TA(A) target site duplication of previously described MITE elements (Bureau & Wessler 1992; Bureau & Wessler 1994a,b; Bureau et al. 1996; Río et al. 1996; Tenzen et al. 1994).
Emigrant is not associated with genes in Arabidopsis
The presence of 14 highly homogeneous Emi elements in the 15 Mbp of genomic Arabidopsis DNA available through the databases as of December 31st 1997 suggests that, if Emi elements were homogeneously distributed, Arabidopsis should contain around 150 of these highly conserved sequences within the 145 Mbp of its genome. However, the number of Emi-related sequences should be higher, as sequences having a more limited degree of similarity have also been detected in these searches (not shown). In order to determine the number and distribution of Emi-related sequences, we have analysed their presence in different Arabidopsis ecotypes and Brassicae by slot-blot hybridization. The results presented in Fig. 4 show that all the different Arabidopsis genomes analysed contain between 500 and 1000 Emi-related sequences, and that this element is also present in other Brassicae. Emigrant is thus less abundant than other MITEs, which can be present at more than 10 000 copies per genome (Wessler et al. 1995).
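The extrapolation above is a simple proportional scaling from the sequenced fraction to the whole genome. As a minimal sketch (the function name is hypothetical, and a homogeneous distribution is assumed, as in the text), the arithmetic with the figures quoted here gives roughly 135 copies, which is of the same order as the "around 150" stated above:

```python
def extrapolate_copies(observed: int, sampled_mbp: float, genome_mbp: float) -> float:
    """Scale a copy count observed in a sampled portion of the genome to the
    whole genome, assuming the elements are homogeneously distributed."""
    return observed * genome_mbp / sampled_mbp


# Figures taken from the text: 14 elements in 15 Mbp, genome of 145 Mbp.
estimate = extrapolate_copies(observed=14, sampled_mbp=15, genome_mbp=145)
print(round(estimate))  # 135
```

The slot-blot estimate of 500 to 1000 Emi-related sequences is several-fold higher than this figure, consistent with the presence of more divergent copies that the database searches for highly conserved sequences do not count.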
MITEs, as well as retrotransposons, have frequently been found to be associated with genes in plants (Wessler et al. 1995; White et al. 1994). However, genomic sequencing projects have shown that the organisation of the genome of Arabidopsis may differ in some respects from that of other plant genomes. Indeed, retro-elements seem to be dispersed in the genome of Arabidopsis (Bevan et al. 1998), in clear contrast to the pattern of retro-elements in larger genomes such as maize, where retrotransposons form nested structures of multiple elements comprising at least 50% of the nuclear DNA of the plant (SanMiguel et al. 1996). The Emi elements described here lie in non-coding regions, and only one of them has an open reading frame prediction within 1 kb upstream or downstream (see Table 1). It would seem that, in contrast to other MITEs (Wessler et al. 1995), Emi elements are present in low copy number in the genome of Arabidopsis and are not frequently associated with genes. As Emigrant elements are longer than other previously described families of MITEs, their insertion within transcribed regions is more likely to interfere with gene expression. This could explain their particular pattern of insertion. Nevertheless, if other MITEs exist in Arabidopsis, they probably share this characteristic with Emi elements, as recent computer-based searches that detected 37 MITE sequences within rice genes failed to detect these elements in the close vicinity of Arabidopsis genes, although there are four times as many Arabidopsis gene sequences as rice gene sequences in the GenBank and EMBL databases (Bureau et al. 1996). Because of the close association of MITEs with plant genes, it has been suggested that these elements could have been involved in the evolution of genes in plants (Bureau et al. 1996; Wessler et al. 1995).
Nevertheless, the results presented here show that while MITEs are present in Arabidopsis, their impact on the evolution of gene regulation in this species has been less important than in other species, such as maize or rice. On the other hand, it has also been suggested that the association of MITEs with genes could be a consequence of their as yet unknown mechanism of transposition. If MITEs transpose by an RNA intermediate, their presence within transcribed regions could facilitate mobilisation (Río et al. 1996). Alternatively, the association of MITEs with coding sequences could reflect a preference of this type of element for integration in transcribed sequences. The existence of MITEs not associated with genes in Arabidopsis suggests that this association is not essential for the transposition of these elements, although we cannot rule out the possibility that the elements described here were generated from other active Emi elements lying in the close vicinity of a gene.
The inverted repeats of Emigrant are similar to those of Wujin from the yellow fever mosquito
Within the 23 nt of Emigrant TIR sequences, 17 are identical to those of Wujin (see Fig. 2), a recently described MITE in the yellow fever mosquito Aedes aegypti (Tu 1997). The sequence of the TIRs, as well as the size and sometimes the sequence of the target site duplication generated upon integration, are believed to be specific for each family of transposable elements that share integration machinery. There is no sequence similarity between Emigrant and Wujin elements except in their TIRs. This situation is similar to that found for the different subfamilies of plant Tourist elements, which have 65–85% identity in their TIRs and little, or no, similarity in their internal sequences (Bureau & Wessler 1994a; Río et al. 1996). Therefore, Emigrant and Wujin are probably two different subfamilies of the same MITE family of elements, and constitute the first example of a MITE family present in two species that belong to different phylogenetic kingdoms. It is tempting to present this as an example of horizontal transfer between plant and animal genomes. Horizontal transmission events have been repeatedly proposed to explain the wide distribution of other mobile elements, such as copia-like retrotransposons, between very distant species (see Flavell et al. 1994). Nevertheless, when an extensive sampling of elements from related species is performed, the results obtained are consistent with a vertical transmission-based evolution of these elements (VanderWiel et al. 1993; Vernhettes et al. 1998). If MITEs are transmitted mainly vertically, as retrotransposons seem to be, the presence of the same MITE family in the genomes of Arabidopsis and the yellow fever mosquito would indicate an ancient association of MITEs with the eukaryote genome. Alternatively, it could be an indication of a convergent evolution of the TIRs of both elements due to constraints imposed by the use of a conserved cellular machinery for their mobility.
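The identity figure used in this comparison is a position-by-position match count over the aligned TIRs. The sketch below illustrates the calculation under stated assumptions: the function name is hypothetical, the two short strings are invented placeholders rather than the Emigrant or Wujin TIRs, and only the counts 17 and 23 and the 65–85% Tourist range come from the text.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions between two aligned sequences
    of equal length (no gaps are handled in this sketch)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100 * matches / len(a)


# Counts reported in the text: 17 identical positions out of 23.
emigrant_wujin_identity = 100 * 17 / 23
print(round(emigrant_wujin_identity))  # 74

# Range reported in the text for TIRs of Tourist subfamilies.
print(65 <= emigrant_wujin_identity <= 85)  # True

# Invented placeholder sequences, only to show how the function is called.
print(percent_identity("ACGTACGT", "ACGTTCGA"))  # 75.0
```

The value of about 74% falls within the range reported for subfamilies of a single plant MITE family, which is the basis of the argument made above.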
Until now, no MITEs have been described in Arabidopsis. The characterisation of the Emigrant family of elements shows that, as for the other families of transposable elements, the genome of Arabidopsis does contain MITEs. Nevertheless, Emigrant is present at a lower copy number than typical MITEs in other plant genomes. Emigrant elements may have been abundant in an ancestor of Arabidopsis, being mostly lost since then, as suggested for retrotransposons (Wright et al. 1996). Alternatively, MITEs could have been unsuccessful in proliferating after being introduced in Arabidopsis. If Emigrant elements, in contrast to the previously described families of MITEs, avoid transcribed regions, it will perhaps be difficult for these elements to find targets because of the high gene density of the Arabidopsis genome. In any case, our results show that, as for the other classes of mobile elements, MITEs are not as abundant in Arabidopsis as in other plant genomes. This suggests a general rule restricting mobile elements to a low copy number in Arabidopsis, which seems to control their activity more strictly. Subtle differences in the host DNA repair machineries of maize and Arabidopsis have recently been suggested to explain differences in the footprints generated after Ac excision in these two plants (Rinehart et al. 1997). Thus, the constraints of the Arabidopsis genome on mobile element proliferation, in comparison to other plant genomes, could be a consequence of differences in the general cellular mechanisms responsible for genome dynamics and integrity.
DNA sequencing and computer analyses
The nucleotide sequence of Emi4 was determined by the dideoxynucleotide chain termination method using an automatic fluorescence sequencer (ABI377 Perkin-Elmer). Sequence similarity searches were made using the FASTA and BLAST programs of the UWGCG software package (Genetics Computer Group, Madison, WI, USA) against the AT Data Base, which contains the last submission of the Arabidopsis Genomic project (http://genome-www.stanford.edu/Arabidopsis). Multiple alignments of sequences were performed using the CLUSTAL V and Boxshade (UWGCG) programs. ΔG° values were calculated using the MFOLD program of the UWGCG package. The consensus Emigrant sequence used to calculate the percentage of sequence similarity of the different copies was constructed after CLUSTAL V prediction. The coding sequence predictions were made with the program Genefinder (Green and Hillier, in preparation). The ORF predictions of clone 19P19 were made using BLAST analysis and the NetPlantGene program (Hebsgaard et al. 1996).
Slot blot analysis
DNA from four different Arabidopsis ecotypes (Columbia, Landsberg erecta, RLD and WS) and two Brassica species (Brassica napus and Brassica juncea) was obtained by standard procedures (Dellaporta et al. 1984). One μg and 0.1 μg of total genomic DNA of each Arabidopsis ecotype, and 4 μg and 0.4 μg of total genomic DNA of Brassica species, were denatured and applied to a Nytran membrane (Schleicher and Schuell). One ng, 0.1 ng, 10 pg and 1 pg of a plasmid which contains the Emi4 element, corresponding to 1000, 100, 10 and 1 copies of the Emigrant element, were also applied to the membrane. After neutralisation and fixation, the membrane was hybridized and washed at low stringency (20 mM Na2HPO4 pH 7.2, 1% SDS, 1 mM EDTA, at 37°C) with a probe corresponding to the Emi4 element.
PCR amplifications were performed by standard procedures with oligonucleotides corresponding to sequences flanking the Emi12 element in the Columbia ecotype (5′-GAGAGCTTTAGAGTGTCATACC-3′ and 5′-GCGCCATGGAGGATACTCTTC-3′). PCR products were run in an agarose gel and transferred to a nylon membrane (Schleicher and Schuell) by standard procedures. The membrane was hybridized with an Emigrant-specific probe and washed at medium stringency (20 mM Na2HPO4 pH 7.2, 1% SDS, 1 mM EDTA, at 50°C).
We acknowledge the support of the European Genome Project and Plan Nacional de Investigación Científica y Técnica (grant BIO97–1419-CE). This work has been carried out within the framework of the Centre de Referència de Biotecnologia de Catalunya.