• Arabidopsis lyrata;
  • chloroplast;
  • duplications;
  • parallel changes;
  • recombination;
  • trnF(GAA) pseudogenes


  1. Top of page
  2. Abstract
  3. Introduction
  4. Materials and methods
  5. Results
  6. Discussion
  7. Acknowledgments
  8. References
  9. Supporting Information

Extensive intraspecific variation in the chloroplast trnL(UAA)–trnF(GAA) spacer of the model plant Arabidopsis lyrata is caused by multiple copies of a tandemly repeated trnF pseudogene undergoing parallel independent changes in copy number. Linkage disequilibrium and secondary structure analyses indicate that the diversification of pseudogene copies is driven by complex processes of structurally mediated illegitimate recombination. Dispersed repeats sharing similar secondary structures interact, facilitating reciprocal exchange of structural motifs between copies via intramolecular and intermolecular recombination, which forms chimeric sequences and drives iterative expansion and contraction in pseudogene copy number. Widely held assumptions that chloroplast sequence evolution is simple and that structural changes are informative are violated. Our findings have important implications for the use of this highly variable region in Brassicaceae studies. The reticulate evolution and nonindependent nucleotide substitution render the pseudogenes inappropriate for standard phylogenetic reconstruction, but over short evolutionary timescales they may be useful for assessing gene flow, hybridization and introgression.


Introduction

Recombination is not thought to play a major role in the evolution of plastid genomes, due to the predominance of uniparental inheritance and haploidy (Birky, 2001). The overall structure of the chloroplast genome is generally conserved among land plants and structural rearrangements are considered to be phylogenetically informative (Palmer, 1991; Cosner et al., 2004; Raubeson & Jansen, 2005). Indels have been extensively used in chloroplast evolutionary analyses of closely related taxa, or in population studies, increasing the number of characters available to compensate for relatively low base pair substitution rates (Muse, 2000). Short structural changes (<10 bp) are the most frequently detected and are useful for increasing phylogenetic resolution at the species level (Golenberg et al., 1993; Kelchner, 2000) and discriminating haplotypes for population analyses (Schaal et al., 1998; Hewitt, 2004; Coates & Byrne, 2005; Mitchell-Olds et al., 2005). Large structural changes involving many nucleotide positions (>50 bp) are much rarer. Complex structural rearrangements are known, such as the evolution of a minimal plastid genome in the holoparasite Epifagus virginiana (L.) W. Barton (Wolfe et al., 1992), inversions (Hoot & Palmer, 1994; Cosner et al., 1997; Perry et al., 2002), translocations (Ogihara et al., 1988) and the loss of one inverted repeat (Palmer et al., 1988; Lavin et al., 1990). Gene duplications are occasionally reported in plastid genomes, including rpl2 and rpl23 (Bowman et al., 1988), psaM (Wakasugi et al., 1994) and tRNA (pseudogene) genes (Hiratsuka et al., 1989; Lindholm & Gustafsson, 1991; Hipkins et al., 1995; Vijverberg & Bachmann, 1999; Drábková et al., 2004; Raubeson & Jansen, 2005).

There is increasing evidence that the value of structural changes for evolutionary analyses may be compromised by complex underlying mechanisms (Kelchner, 2000; Ingvarsson et al., 2003). Small structural changes can arise through slipped-strand mispairing during DNA replication (Levinson & Gutman, 1987). Short repeats can act as substrates for illegitimate recombination (Ohnishi et al., 1999; Ogihara et al., 2002), whereas larger indels are often associated with hairpin formation (Kelchner & Wendel, 1996). Interestingly, tRNA gene duplications have been documented in taxa that exhibit large-scale genome rearrangements and may facilitate intramolecular recombination (Hiratsuka et al., 1989). Furthermore, because each plastid contains many copies of its genome (Deng et al., 1989; Pyke, 1999), intermolecular recombination can take place.

The trnL(UAA)–trnF(GAA) intergenic spacer (IGS) region has been one of the most extensively utilized noncoding portions of the chloroplast genome for evolutionary analysis since the development of universal primers (Taberlet et al., 1991). Recently, a variable number of trnF pseudogene duplications have been documented in the trnL–trnF IGS region of around 20 Brassicaceae genera (Koch et al., 2005). The copies are nonfunctional, consisting of a partial trnF gene fragment of c. 50 bp and the associated 5′ flanking region, but the exact origin of these duplications remains uncertain. Previously, these duplications were not reported in the family despite extensive trnL–trnF IGS fragment length variation being noted in Cardamine (Franzke & Hurka, 2000; Lihová et al., 2004), Rorippa (Bleeker & Hurka, 2001), Lepidium (Mummenhoff et al., 2001) and Halimolobos (Bailey et al., 2002). Their absence from many genera (e.g. Brassica, Draba, Noccaea, Raphanus, Sinapis and Thlaspi) indicates that they may serve as a deep lineage marker within the Brassicaceae (Koch et al., 2005). However, there is evidence that the duplications are subject to reticulate evolution. Within Cardamine, Lepidium and Rorippa, there is extensive interspecific copy number variation (Koch et al., 2005), and phylogenetic analysis suggests independent copy number changes in divergent clades of Halimolobos (reviewed by Koch et al., 2005). Similar-sized trnL–trnF IGS fragments occur among members of species complexes from Cardamine (Franzke & Hurka, 2000; Lihová et al., 2004) and Rorippa (Bleeker & Hurka, 2001), indicating that parallel intraspecific copy changes have recently taken place. Understanding the mechanisms responsible for pseudogene diversification is necessary if the highly variable trnL–trnF region is to be useful for Brassicaceae studies. To achieve this we advocate the use of dense sampling among a few closely related taxa to resolve any complicated processes.

Arabidopsis lyrata (L.) O'Kane & Al-Shehbaz (1997) has emerged as a model organism to study plant ecological and evolutionary processes (Mitchell-Olds, 2001). As part of a wider study of its biogeographic history we have been examining the geographic distribution of trnL–trnF IGS nucleotide variation in the European taxon A. lyrata ssp. petraea (L.) O'Kane & Al-Shehbaz (S.W. Ansell, unpublished data). This provides an opportunity to study trnF pseudogene evolution throughout the entire distribution range of a single taxon and to address the following questions: (A) How did the duplication originate? (B) To what extent does the number of pseudogene copies vary within a single species? (C) To what extent are parallel changes in pseudogene copy number occurring? (D) Are the patterns of tandem array length changes consistent with recombination processes? (E) What are the wider implications of the pseudogene sequence evolution for Brassicaceae studies utilizing the trnL–trnF region?

Materials and methods


Leaf material from 540 diploid plants (confirmed by allozyme electrophoresis; S.W. Ansell, unpublished data) of A. lyrata ssp. petraea was collected from 54 populations from localities in Austria, Bavaria (S. Germany), Czech Republic, Harz Mountains (N. Germany), Iceland, Norway, Scotland, Shetland Isles, Sweden and Wales, so that all major elements within this taxon's distribution range (Jalas & Suominen, 1994) were represented (Table S1, Supplementary material online). Fifteen samples of Arabidopsis halleri (L.) O'Kane & Al-Shehbaz were also collected from three localities in Austria and Germany to assist with the interpretation of spacer evolution (Table S1, Supplementary material online).

Total DNA was extracted from desiccated leaves using a CTAB protocol (Doyle & Doyle, 1987). The trnL(UAA)–trnF(GAA) IGS region was amplified using PCR and the ‘E’ and ‘F’ primers of Taberlet et al. (1991). Amplifications were performed using Bioline Taq polymerase under the following conditions: 94 °C for 2 min, followed by 28 cycles of 94 °C for 30 s, 50 °C for 30 s, 72 °C for 30 s, followed by one cycle of 72 °C for 2 min. PCR products were assayed on agarose gels and visualized with ethidium bromide under UV light (Sambrook et al., 1989). Nucleotide variation in Bavarian A. lyrata ssp. petraea samples was detected by restriction fragment length polymorphism (RFLP) assay and by sequencing one to three representatives of each banding pattern type in each population. For all other samples (290 individuals), including A. halleri, nucleotide variation was assayed directly by sequencing PCR products.
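The cycling program described above can be written down as data, which makes the programmed hold time easy to check. This is only an illustrative sketch: the variable and function names are hypothetical, and the total ignores ramp times between temperatures, which depend on the thermal cycler.

```python
# PCR program as stated in the text; all durations in seconds.
INITIAL_DENATURATION_S = 2 * 60          # 94 C for 2 min
CYCLE_STEPS_S = [
    ("denature 94 C", 30),
    ("anneal 50 C", 30),
    ("extend 72 C", 30),
]
N_CYCLES = 28
FINAL_EXTENSION_S = 2 * 60               # one cycle of 72 C for 2 min


def total_hold_time_s():
    """Sum of programmed hold times only (ramp times are not included)."""
    per_cycle = sum(seconds for _, seconds in CYCLE_STEPS_S)
    return INITIAL_DENATURATION_S + N_CYCLES * per_cycle + FINAL_EXTENSION_S


print(total_hold_time_s() / 60)  # 46.0 minutes of programmed holds
```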

Single-banded PCR products were excised from the agarose gel and purified using a home-made spin column (Chuang & Blattner, 1994). Two microlitres of each purified PCR product (c. 100 ng of DNA) was digested using five units of the restriction enzyme MseI (Fermentas, York, UK) following the manufacturer's instructions. Digested samples were radioactively labelled with 0.2 μCi γ-ATP (GE Healthcare, Little Chalfont, UK) and 0.5 units of T4 polynucleotide kinase (Fermentas) using the exchange reaction, following the manufacturer's instructions. Digested samples were denatured in formamide loading dye and fractionated on 6% denaturing polyacrylamide gels following standard procedures (Sambrook et al., 1989). Digested products were visualized by autoradiography and fragment sizes were calculated from side-by-side comparisons to a 10–330 bp radio-labelled AFLP marker ladder (Invitrogen, Paisley, UK) included on each gel. PCR products were bi-directionally sequenced using BigDye Terminator kits (v. 1.1, Applied Biosystems, ABI) and an ABI 3730 capillary DNA analyser.

TrnF pseudogene recognition and sequence alignment

The GenBank accession of the Arabidopsis thaliana (L.) Heynh. chloroplast genome (AP000423, positions 46894–48247) covering the trnL (UAA)–trnF (GAA) IGS region was selected as a molecular standard in comparison with the trnL–trnF IGS region of Brassica nigra (L.) Koch. (AB213009, positions 420–814). The latter has been shown not to contain trnF pseudogenes (Koch et al., 2005), enabling the molecular origin of the tandemly repeated structures to be postulated.

Sequences were assembled using SeqMan (v. 6 Lasergene; DNAstar, Madison, WI, USA) and were manually aligned with MegAlign (v. 6 Lasergene). Sequences containing eight trnF pseudogene copies (the maximum number detected in this study) were aligned first, enabling shorter sequences to be aligned by the insertion of gaps. Manual checks of the trnF pseudogene duplications confirmed that they did not contain additional ‘F’ primer annealing sites and that all sequencing products contained at least 44 bases of the trnF gene. Nucleotide sequences generated here have been submitted to GenBank (accession numbers DQ989814–DQ989862). BioEdit (v. 7.05, Hall, 1999) was employed to calculate the theoretical MseI RFLP pattern for all Arabidopsis trnL–trnF IGS sequence variants identified.
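The theoretical digestion step can be illustrated with a minimal in silico digest. The study used BioEdit for this; the sketch below is not that implementation. It assumes only the known MseI recognition site (TTAA, cut after the first T) and reports double-stranded fragment lengths for a linear sequence. The function name and the toy sequence are hypothetical.

```python
def digest_fragment_lengths(seq, site="TTAA", cut_offset=1):
    """Return fragment lengths after cutting a linear sequence at every site.

    cut_offset is the number of bases of the site left on the upstream
    fragment (1 for MseI, which cuts T^TAA).
    """
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Toy sequence (not from the study) with two MseI sites.
toy = "GGTTAACCCTTAAG"
print(digest_fragment_lengths(toy))  # [3, 7, 4]
```

Comparing such predicted fragment lists between sequence variants shows which variants would share a banding pattern and therefore need sequencing to be told apart.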

Pseudogene sequence analysis

Nucleotide variation preceding the tandem array (alignment positions 1–180) was used to define haplotypes. The minimum number of differences between each pair of haplotypes was used to manually construct a minimum spanning network (MSN). A haplotype was assigned to each trnL–trnF IGS sequence variant. Parallel changes of length within the tandem array were assessed by mapping the pseudogenes present in each IGS variant onto the MSN.
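The network in this study was built by hand. As a rough sketch of the same idea, the code below computes pairwise differences between haplotypes and keeps every link that belongs to at least one minimum spanning tree, which is one common definition of a minimum spanning network. The haplotype strings are toy data, not the Table 2 states, and all names are hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Number of differing positions between two equal-length haplotypes."""
    return sum(x != y for x, y in zip(a, b))


def minimum_spanning_network(haplotypes):
    """Return edges (name_a, name_b, distance) present in any minimum spanning tree."""
    names = list(haplotypes)
    parent = {name: name for name in names}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    edges = sorted(
        (hamming(haplotypes[a], haplotypes[b]), a, b)
        for a, b in combinations(names, 2)
    )
    network = []
    i = 0
    while i < len(edges):
        weight = edges[i][0]
        group = []
        while i < len(edges) and edges[i][0] == weight:
            group.append(edges[i])
            i += 1
        # Keep every edge of this weight that joins components formed by
        # strictly shorter edges, then merge those components.
        kept = [(a, b, weight) for _, a, b in group if find(a) != find(b)]
        network.extend(kept)
        for a, b, _ in kept:
            parent[find(a)] = find(b)
    return network


toy_haplotypes = {"H1": "AA", "H2": "AT", "H3": "TA", "H4": "TT"}
network = minimum_spanning_network(toy_haplotypes)
print(network)
```

With the toy data, all four one-step links are retained, forming a loop. A single minimum spanning tree would drop one of them arbitrarily, which is why a network is the safer summary when ties exist.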

Pseudogene sequences were excised from the main alignment and were manually realigned using MegAlign. Chi-squared tests were performed on the A. lyrata ssp. petraea pseudogenes using SITES (Hey & Wakeley, 1997) to test for linkage disequilibrium (LD) between pairs of variable positions and hence the possibility of recombination. The presence of stem-loop structures within the trnL–trnF IGS and trnF gene region was determined using the DNA version of the Mfold web server (v. 3.2, Zuker, 2003), with the default settings: 37 °C, [Na+] = 1.0, [Mg2+] = 0.0, percentage suboptimality of 5.0, an upper bound of 50 foldings, and linear DNA sequences. The relationships between the pseudogene sequences were investigated by constructing NeighborNet networks in SplitsTree (v. 4.6, Huson & Bryant, 2006). Networks were reconstructed using LogDet distances, with weights modified using least squares and maximum dimensions set to four. Networks are particularly useful for inferring phylogenetic patterns in which alternative hypotheses need to be visualized, e.g. in studies considering recombination (Huson & Bryant, 2006).
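To make the pairwise LD test concrete, the sketch below computes a chi-squared statistic for a 2 × 2 table of allele combinations at two biallelic sites. It is an assumption-laden illustration, not the SITES implementation: it applies no continuity correction, requires both sites to be biallelic, and takes the P value from the chi-squared distribution with one degree of freedom. The function name and toy columns are hypothetical.

```python
import math
from collections import Counter


def chi_squared_ld(site_a, site_b):
    """Chi-squared test of association between two biallelic alignment columns.

    site_a and site_b hold one character per sequence, in the same order.
    Returns (chi2, p_value) for 1 degree of freedom.
    """
    if len(site_a) != len(site_b):
        raise ValueError("columns must cover the same sequences")
    alleles_a, alleles_b = sorted(set(site_a)), sorted(set(site_b))
    if len(alleles_a) != 2 or len(alleles_b) != 2:
        raise ValueError("this sketch handles biallelic sites only")
    n = len(site_a)
    observed = Counter(zip(site_a, site_b))
    count_a, count_b = Counter(site_a), Counter(site_b)
    chi2 = 0.0
    for a in alleles_a:
        for b in alleles_b:
            expected = count_a[a] * count_b[b] / n
            chi2 += (observed[(a, b)] - expected) ** 2 / expected
    # Survival function of chi-squared with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Toy columns: perfectly associated sites across eight sequences.
chi2, p = chi_squared_ld("AAAAGGGG", "CCCCTTTT")
print(chi2, p < 0.005)  # 8.0 True
```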


Results

Population survey of A. lyrata ssp. petraea and A. halleri

The PCR-RFLP survey of Bavarian A. lyrata ssp. petraea samples yielded 11 trnL–trnF IGS MseI digestion patterns, with only three patterns occurring in more than two populations (Table S1, Supplementary material online). DNA sequencing confirmed that only pattern PET 1 had multiple sequence variants for Bavaria when one to three individuals were sequenced per population for each RFLP type detected. Direct sequencing of PCR products identified a further 34 trnL–trnF IGS sequence variants from the rest of Europe. From the combined data, a total of 46 distinct trnL–trnF IGS sequences were identified, corresponding to 34 MseI digestion patterns (Table S2, Supplementary material online), with multiple variants for patterns PET 1, PET 19, PET 21 and PET 31 (Table 1). The sequencing survey of A. halleri identified three IGS sequences (HALL 1–3, Table 1).

Table 1. Arabidopsis thaliana, A. lyrata ssp. petraea (PET) and A. halleri (HALL) trnL–trnF IGS sequences and their trnF pseudogene composition.
Species and IGS variant            GenBank accession   Haplotype   trnF pseudogene copies
Arabidopsis lyrata ssp. petraea
 PET 1A                            DQ989814            2           I.1 II.1 VI.1 VII.1 IX.1 X.1
 PET 1B                            DQ989815            2           I.1 II.1 VI.1 VII.1 IX.1 X.1
 PET 1C                            DQ989816            2           I.6 II.1 VI.1 VII.1 IX.1 X.1
 PET 2                             DQ989817            2           I.1 II.1 VI.1 VII.1 IX.1 X.2
 PET 3                             DQ989818            2           I.1 II.1 VI.1 VII.1 IX.1 X.3
 PET 4                             DQ989819            2           I.1 II.1 VI.1 VII.1 IX.1 X.1
 PET 5                             DQ989820            2           I.6 II.1 VI.3 VII.1 IX.1 X.1
 PET 6                             DQ989821            2           I.1 II.1 VI.1 VII.3 –
 PET 7                             DQ989822            2           I.2 –
 PET 8                             DQ989823            2           I.3 VI.1 VII.1 IX.1 X.1
 PET 9                             DQ989824            2           I.4 IX.1 X.1
 PET 10                            DQ989825            2           I.1 VI.1 IX.2 X.1
 PET 11                            DQ989826            2           I.6 II.1 X.7
 PET 12                            DQ989827            2           I.1 II.1 VI.2 IX.1 X.3
 PET 13                            DQ989828            2           I.1 II.1 X.5
 PET 14                            DQ989829            2           I.6 II.1 VI.1 VII.4 –
 PET 15                            DQ989830            2           I.3 X.7
 PET 16                            DQ989831            2           I.1 II.1 X.7
 PET 17                            DQ989832            1           I.10 –
 PET 18                            DQ989833            3           I.1 II.3 VI.3 IX.3 X.4
 PET 19A                           DQ989834            3           I.1 II.3 IX.6
 PET 19B                           DQ989835            6           I.1 II.3 IX.8
 PET 20                            DQ989836            5           I.5 II.1 VI.1 VII.2 IX.1 X.4
 PET 21A                           DQ989837            6           I.6 II.1 VI.4 VII.5 IX.1 X.4
 PET 21B                           DQ989838            6           I.6 II.2 VI.4 VII.7 IX.1 X.4
 PET 21C                           DQ989839            6           I.6 II.1 VI.5 VII.7 IX.1 X.4
 PET 21D                           DQ989840            7           I.6 II.1 VI.4 VII.7 IX.1 X.4
 PET 21E                           DQ989841            10          I.6 II.1 VI.5 VII.7 IX.1 X.4
 PET 22                            DQ989842            6           I.1 II.1 VI.4 VII.7 IX.1 X.4
 PET 23                            DQ989843            6           I.6 II.4 VI.4 VII.6 IX.1 X.4
 PET 24                            DQ989844            6           I.1 VI.4 VII.7 IX.1 X.4
 PET 25                            DQ989845            6           I.1 VI.4 VII.7 IX.1 X.4
 PET 26                            DQ989846            6           I.1 VI.4 VII.6 IX.1 X.4
 PET 27                            DQ989847            6           I.7 IX.1 X.4
 PET 28                            DQ989848            6           I.6 II.1 VI.4 VII.8 –
 PET 29                            DQ989849            6           I.6 II.4 VI.4 VII.8 –
 PET 30                            DQ989850            6           I.1 II.1 VI.4 VII.8 –
 PET 31A                           DQ989851            6           I.6 II.1 X.8
 PET 31B                           DQ989852            8           I.6 II.1 X.8
 PET 31C                           DQ989853            4           I.8 II.1 X.7
 PET 31D                           DQ989854            9           I.6 II.1 X.8
 PET 31E                           DQ989855            6           I.6 II.1 X.9
 PET 31F                           DQ989856            6           I.6 II.1 X.8
 PET 32                            DQ989857            6           I.6 II.4 X.9
 PET 33                            DQ989858            6           I.9 –
 PET 34                            DQ989859            6           I.9 –
Arabidopsis halleri
 HALL 1                            DQ989860            1           I.6 II.1 VI.3 VII.7 IX.1 X.6
 HALL 2                            DQ989861            1           I.1 II.5 IV.1 VI.4 VII.9 VIII.1 IX.1 X.4
 HALL 3                            DQ989862            1           I.1 IV.1 V.1 VI.4 VII.7 IX.1 X.4
A. thaliana Col.                   AP000423            0           I.AT II.AT III.AT VI.AT IX.AT X.AT

Origin of the trnF pseudogene duplication

We recognized six duplicated sequences within the trnL–trnF IGS of A. thaliana that ended with the motifs GTCAC, GTCAT or GCCAC (Fig. 1). Comparison with the trnL–trnF IGS of B. nigra (no pseudogenes) revealed that the A. thaliana duplications are derived from an imperfect duplication of the trnF gene and its 5′ flanking region (Fig. 1). This fragment includes the −10 bp promoter (region A) found in most land plants (Quandt et al., 2004), one or two copies of an intervening 15 bp short repeat TGATACTTCGGTAAT or its derivatives (region B) and the first 50 bp of the 72 bp long trnF (GAA) gene (region C). The latter also has a 6 bp deletion in the putative D domain (Fig. 1), so only the putative anti-codon domain remains intact. These incomplete copies are therefore no longer functional as transfer RNAs and will be called pseudogenes. Searches for repeated sequences around the trnF gene region of B. nigra identified one 10 bp and one 12 bp motif that share eight positions and directly border the margins of a fragment equivalent to the duplications found in Arabidopsis (Fig. 1).


Figure 1.  Alignment of the Brassica nigra and Arabidopsis thaliana trnF gene and 5′ region, and the trnF pseudogene copies identified in three Arabidopsis taxa. *Putative chimeric sequences.


IGS sequence data

The Arabidopsis sequence alignment is 1076 bp in length (Fig. S1, Supplementary material online). Individual sequences varied from 330 to 907 bp, and the tandem array varied from 150 to 727 bp. In all three Arabidopsis taxa, the entire tandem array precedes the functional trnF gene, and the array is separated from the trnF gene by a modified short repeat of TGATCCTT or TGATTCTT, rather than a standard 15 bp copy. Within the tandem array, 50 pseudogene variants were identified across all three taxa. These differ by at least one substitution or indel and range from 67 to 107 bp in length (excluding copy III, Fig. 1). The variants were assigned to pseudogene copy families (I-X) according to substitution composition and alignment position (Fig. 1), with up to 10 variants assigned to a single copy family. Between one and six copies were identified in A. lyrata ssp. petraea, six to eight copies in A. halleri, and six in the Columbia strain of A. thaliana (Table 1). Arabidopsis thaliana copy III is highly degraded, being only 32 bp in length, and is absent from the other Arabidopsis taxa. Arabidopsis lyrata ssp. petraea also differs from A. thaliana by the addition of copy VII, and A. halleri differs from A. lyrata ssp. petraea by the addition of copies IV, V and VIII. The latter three appear to be recent duplications, as copies IV and V are identical and are duplications of copy II, whereas copy VIII is a duplication of copy VII. The six A. thaliana pseudogene variants were not shared among the other studied taxa. Of the remaining variants, 30 were confined to A. lyrata ssp. petraea, five were confined to A. halleri and nine were shared by both taxa. Apart from six minor indels (1–15 bp) within a few pseudogenes, length variation in the tandem array was caused by loss or gain of complete pseudogene copies (Table 1).

Haplotype analysis and parallel length changes

Ten variable positions were identified 5′ to the tandem array (Table 2) and 11 distinct haplotypes were detected (Table 2). These formed two star-shaped clusters within an MSN, with A. thaliana being ancestral to cluster I (haplotype 0). Haplotype 1 was shared by A. halleri and A. lyrata ssp. petraea. Parallel independent changes in pseudogene copy numbers were evident when the pseudogene content of each IGS variant was mapped onto the network (Fig. 2). Intergenic spacers that contain one, three, four, five or six copies are represented in both clusters (e.g. five-copied PET 12 and PET 25), and alternative versions of the same copy numbered array are present in each cluster (e.g. three-copied PET 27 and PET 32 of cluster II, Fig. 2, Table 1). Only a small proportion of the IGS variants have undergone haplotype divergence after the length change event (e.g. PET 31A vs. PET 31B/C, clade I).
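The mapping step can be illustrated with a small subset of Table 1. The sketch below assigns each IGS variant to a cluster from its haplotype (haplotypes 1–5 in cluster I, 6–10 in cluster II, as stated in the Table 2 caption) and then lists the copy numbers found in both clusters. Only 11 of the 46 variants are included, so this is an illustration rather than a reanalysis; the function and variable names are hypothetical.

```python
# Subset of Table 1: IGS variant -> (haplotype, pseudogene copies present).
TABLE1_SUBSET = {
    "PET 17":  (1, ["I.10"]),
    "PET 33":  (6, ["I.9"]),
    "PET 15":  (2, ["I.3", "X.7"]),
    "PET 9":   (2, ["I.4", "IX.1", "X.1"]),
    "PET 27":  (6, ["I.7", "IX.1", "X.4"]),
    "PET 6":   (2, ["I.1", "II.1", "VI.1", "VII.3"]),
    "PET 28":  (6, ["I.6", "II.1", "VI.4", "VII.8"]),
    "PET 12":  (2, ["I.1", "II.1", "VI.2", "IX.1", "X.3"]),
    "PET 25":  (6, ["I.1", "VI.4", "VII.7", "IX.1", "X.4"]),
    "PET 1A":  (2, ["I.1", "II.1", "VI.1", "VII.1", "IX.1", "X.1"]),
    "PET 21A": (6, ["I.6", "II.1", "VI.4", "VII.5", "IX.1", "X.4"]),
}


def cluster_of(haplotype):
    """Cluster I = haplotypes 1-5, cluster II = haplotypes 6-10."""
    if 1 <= haplotype <= 5:
        return "I"
    if 6 <= haplotype <= 10:
        return "II"
    raise ValueError(f"haplotype {haplotype} is outside both clusters")


def copy_numbers_by_cluster(table):
    """Collect the set of array copy numbers observed in each cluster."""
    result = {"I": set(), "II": set()}
    for haplotype, copies in table.values():
        result[cluster_of(haplotype)].add(len(copies))
    return result


by_cluster = copy_numbers_by_cluster(TABLE1_SUBSET)
shared = sorted(by_cluster["I"] & by_cluster["II"])
print(shared)  # [1, 3, 4, 5, 6]
```

The same copy number appearing on both sides of the network, often with different copy compositions, is what points to parallel rather than shared length changes.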

Table 2.   Haplotypes defined by variable positions preceding the tandem duplications, based on positions 1–180 of the alignment shown in Fig. S1. Haplotypes 1–5 are cluster I, haplotypes 6–10 are cluster II.
Haplotype    Positions of trnL–trnF alignment: 46, 47, Indel A, 88, 96, 119, 136, 154, 156, 167

Figure 2.  Minimum spanning network of Arabidopsis trnL-trnF IGS haplotypes, with the trnF pseudogene copies (shaded) of each IGS type mapped according to haplotype state. Haplotypes 1–5 are cluster I, haplotypes 6–10 are cluster II.


A. lyrata pseudogene sequence analysis

Nucleotide variation in the 30 pseudogenes of A. lyrata ssp. petraea appears to be highly ordered, mainly consisting of a small number of positions that are subject to recurring substitutions. These cluster as four subregions, corresponding to the degraded promoter within A2, the short repeat of B3, and subregions C1 and C2 of the partial trnF gene fragment. Most of the pseudogene differences can be explained by alternative combinations of minor variants of these four subregions (Fig. 1). LD analysis of these sequence variants revealed that 99 of the 465 pairwise combinations of nucleotide positions had significant (P < 0.005) LD (based on a matrix of 31 variable sites, Table 3). These clustered as eight islands within the chi-squared matrix, and the positions corresponded to the four subregions (A2, B3, C1 and C2) responsible for the structured substitutions, or to combinations of these subregions (A2/C1, A2/B3, B3/C1 and C1/C2). For 13 pseudogene sequences, there is strong evidence of chimeric (hybrid) sequence formation (marked by an asterisk in Fig. 1). This is most clearly demonstrated for types 5–9 of pseudogene copy X, whereby alignment positions 12–23 are derived from copy VI, whereas positions 60–133 are consistent with being derived from copy X (Fig. 1). Overall, these patterns are consistent with an exchange of motifs between the pseudogene copies, and suggest that recombination and unequal crossing over may be operating.
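The figure of 465 pairwise combinations follows directly from the 31 variable sites, since the number of unordered pairs of n sites is n(n − 1)/2. A short check, with hypothetical variable names:

```python
import math

n_sites = 31                       # variable sites in the LD matrix (Table 3)
n_pairs = math.comb(n_sites, 2)    # unordered pairs: 31 * 30 / 2
significant_pairs = 99             # pairs reported with P < 0.005

fraction_significant = significant_pairs / n_pairs
print(n_pairs, round(fraction_significant, 3))  # 465 0.213
```

Roughly one in five site pairs therefore showed significant LD at the stated threshold.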

Table 3.   Matrix of chi-squared tests between all pairs of variable nucleotide positions detected among the 30 A. lyrata ssp. petraea pseudogenes taken from the alignment in Fig. 1. Non-significant (ns), *P < 0.005, **P < 0.001.

IGS secondary structure analysis

DNA secondary structure analysis of the six-copy variant PET 1A identified 24 stem-loop structures (Fig. 3). Six occurred before the tandem duplications and three were associated with the trnF gene. Within the tandem array there were eight small and seven large stem-loop structures associated with the six pseudogene copies (Fig. 3). Five of the eight short stem-loops (stems 7, 11, 14, 17 and 20) coincide with the short repeat (region B) portion of the pseudogenes, whereas six of the seven large stem-loops (stems 8, 10, 12, 15, 19 and 21) coincide with the partial trnF sequence (region C) portion of the pseudogenes (Fig. 3). Thus, the tandem array consists of regularly spaced stem-loops that share similar internal folding patterns for the pseudogene subregions previously identified by the LD analysis as potentially undergoing exchange between the pseudogene duplications.


Figure 3.  DNA folding analysis of the trnL-trnF IGS and trnF gene sequence of PET 1A, using the sequence defined in Fig. S1. Pre-tandem repeat (Pre-TR) and post-tandem repeat regions (Post-TR) are indicated.


Pseudogene sequence similarities

The NeighborNet analysis of the 50 Arabidopsis pseudogene sequences is shown in Fig. 4. Individual pseudogene sequences clustered with members of the same copy family, and copies suspected of being recently duplicated grouped with the putative ancestral copies (i.e. copies IV and V clustered within copy II sequences, and copy IIIAT clustered with copy II, also from A. thaliana). Overall, the pseudogene sequences are highly divergent with respect to the outgroup trnF gene sequences of A. thaliana and B. nigra. The groupings containing copies IX and VI are nearly completely merged, reflecting a high number of shared mutations. This agrees with the earlier finding that copies IX types 7–10 are probably chimeric sequences, derived from copies X and VI. The many alternative branching routes between the pseudogene sequences indicate extensive parallel substitutions, which is consistent with the occurrence of recombination, or with limited sequence divergence. With no clearly separated groups and hierarchies of pseudogene relationship, we do not reconstruct the history of pseudogene duplications.


Figure 4.  NeighborNet network based on the 50 Arabidopsis trnF pseudogenes from the alignment in Fig. 1 and the trnF gene sequences from outgroups B. nigra and A. thaliana. Network statistics: nSplits = 115 and the total weight = 1.11497847. Putative chimeric sequences (*), duplicated copies ([ ]), sequences derived from A. thaliana (AT).



Discussion

This comprehensive survey of A. lyrata ssp. petraea shows that extensive trnF pseudogene copy variation occurs at the intraspecific level within the Brassicaceae. Indeed the intraspecific variation detected in A. lyrata ssp. petraea equals that previously only known at the generic level (Koch et al., 2005). Our study highlights that the trnL–trnF IGS is a hotspot for length mutations once the duplicated sequence is incorporated, and that repeated independent copy number changes have taken place in the period since A. lyrata ssp. petraea diverged from A. thaliana around 5.1 Ma (Koch et al., 2001). These results have important implications for the use of the trnL–trnF region in Brassicaceae phylogenetic analysis.

We adopted an outgroup (B. nigra) comparison to define the pseudogene copies in A. lyrata. In our opinion, this approach emphasizes the actual structural origins of the duplications and avoids Koch et al.’s (2005) problem of splitting the partial trnF gene sequences by internalizing the trnF gene 5′ flanking region, which creates chimeric sequences derived from two copies. In doing so, we more precisely define the pseudogene duplication cut-off points and for the first time recognize two bordering motifs, which are also present in the putative ancestral region of B. nigra (Fig. 1). The plastid genome of A. lyrata’s close relative A. thaliana (Sato et al., 1999) has the typical gene order arrangement of other angiosperms (Nicotiana, Spinacia and Oenothera) (Wakasugi et al., 1998; Hupfer et al., 2000; Schmitz-Linneweber et al., 2001; Yukawa et al., 2005). Thus, we find no evidence of genome-wide reorganization to account for the origin of the duplication in Arabidopsis. With between 400 and 1600 genome copies per chloroplast (Pyke, 1999), we postulate that the duplication originated locally within the trnF gene region, via the bordering motifs mediating inter-plastid (intermolecular) recombination without disrupting the rest of the genome.

The origin of the Brassicaceae chloroplast trnF duplication has been dated to 16–21 MYA from nuclear gene sequences (Koch et al., 2005). However, the exact phylogenetic origin is uncertain, as a complete Brassicaceae phylogeny is still under construction (Bailey et al., 2006). Outside of the Brassicaceae, there have only been three reports of a trnF duplication: twice within the Lactuceae in the unrelated genera Microseris and Taraxacum (Vijverberg & Bachmann, 1999; Wittzell, 1999) and once from the monocot Juncus (Drábková et al., 2004). In both Microseris and Taraxacum, the entire trnF gene has been tandemly duplicated, and the duplications maintain a high sequence similarity (88–99%) to the trnF gene. In Juncus, only the 5′ acceptor stem and anticodon domain are represented, and there has been an 8 bp insertion in the D domain (Drábková et al., 2004). This situation closely resembles that of A. lyrata.

The pattern of Arabidopsis trnL–trnF IGS structural changes is not consistent with a slipped-strand mispairing model of length variation. This process involves short motifs (<10 bp, Kelchner & Wendel, 1996; Vijverberg & Bachmann, 1999) and the production of highly homoplasious strings of length variation during DNA replication (Kelchner, 2000). This mechanism probably accounts for length variation reported in chloroplast minisatellite and microsatellite repeats (Provan et al., 1999; Cozzolino et al., 2003; Ceplitis et al., 2006). The Arabidopsis length changes, by contrast, involved entire pseudogene repeats (67–107 bp) without a breakdown of sequence homology, and parallel copy number changes occurred in divergent haplotype lineages. With multiple genome copies per plastid (Deng et al., 1989; Pyke, 1999), the presence of repeated sequences within the Arabidopsis trnL–trnF IGS enables both intramolecular and intermolecular recombination to diversify the number of pseudogenes present.

Linkage disequilibrium analysis confirmed that the IGS fragment had a pattern of substitutions that did not conform to the expectations of strict uniparental inheritance. The chloroplast genome of A. lyrata ssp. petraea is assumed to be maternally inherited like that of its close relative A. thaliana (Corriveau & Coleman, 1988); hence, the entire IGS fragment should approximately behave as a single linkage group. Our analysis identified significant deviations from linkage equilibrium over several neighbouring nucleotide positions and four linkage groups were recognized. Three of these corresponded to the short repeat (subregion B3) and partial trnF elements (subregions C1 and C2, Table 3). Significantly, the distribution of these units coincided with the distribution of stem-loop structures within the tandem array (Fig. 3), suggesting that structurally mediated recombination diversifies the pseudogene sequences and copies.

Short repeats (4–15 bp) are associated with sites of length change in the chloroplast genome (Hipkins et al., 1995; Sang et al., 1997; Ogihara et al., 2002) and may act as substrates for recombination (Kanno et al., 1993; Kawata et al., 1997; Kano et al., 1997; Ohnishi et al., 1999). Between one and three 15 bp short repeats are present in each Arabidopsis pseudogene; all share a high proportion of nucleotides (subregion B3, Fig. 1) and can form internal stem structures through base pair complementation (stems 7, 11, 14, 17, 20, Fig. 3). Palindromic or inverted short repeats can undergo stem-loop formation and are susceptible to intramolecular recombination (Cosner et al., 1997; Kim & Lee, 2004). Interactions between widely dispersed repeats in Arabidopsis potentially facilitate intramolecular recombination, causing the elimination of intervening regions and the formation of chimeric pseudogene sequences. Interestingly, the number of short repeats present in copies I, VI and X has increased from one to two or three repeats, forming a mini array. This may increase the frequency of recombination, and may be an important factor in generating the extensive A. lyrata copy number variation.

The short repeats can also facilitate intermolecular recombination via multiple copies of the genome per plastid (Deng et al., 1989; Pyke, 1999) and the sharing of homologous sequences among genomes. A diverse array of recombination products is predicted under unequal crossing over with reciprocal exchange, including insertions (i.e. duplications), deletions and reversions to single copy arrays (Vijverberg & Bachmann, 1999). Our survey of A. lyrata ssp. petraea and A. halleri detected IGS types consistent with these predicted products. For example, identical duplicated pseudogene sequences were present in all three A. halleri HALL types (Table 1), whereas 26 of the 46 A. lyrata ssp. petraea IGS types had intermediate (two to five copies) copy numbers (i.e. in mid-expansion/contraction), and 4/46 had reverted to single copy arrays (Table 1).

Plastid transfer RNA gene duplications are also associated with chloroplast genome structural rearrangements (Palmer et al., 1988; Cosner et al., 2004), and recombination may be mediated via their secondary structure and homologous motifs (Hiratsuka et al., 1989; Kelchner, 2000). The chimeric A. lyrata ssp. petraea pseudogene sequences indicate the pseudogene's involvement during array length changes (copy X, types 6–9, Fig. 1). Interestingly, the three pseudogene copies most frequently involved in forming chimeric (hybrid) pseudogene sequences (copies I, VI and IX) form near-identical internal stem-loop structures in the region containing the partial trnF gene sequence (stems 8, 12, 21, Fig. 4). Stem bending coupled with shared homologous motifs may greatly facilitate genetic exchange via intermolecular or intramolecular forms of illegitimate recombination.

These combined mechanisms create a highly reticulated history of trnF pseudogene copy evolution in A. lyrata. Copy number evolution is likely to follow iterative cycles of parallel expansion and contraction, through an antagonistic relationship between intermolecular and intramolecular recombination. These cyclical processes will disrupt the history of copy changes and sequence divergence, rendering reliable phylogenetic reconstructions of pseudogene evolution problematic. As similar dynamic patterns are likely in related Arabidopsis taxa, a large sequencing effort involving many specimens of each species with trnF duplications would be required to infer the history of the Arabidopsis duplications (although see Koch et al., 2005).

Half the genera so far examined contain at least one trnF pseudogene copy (Koch et al., 2005), most likely reflecting a single origin within the ‘halimolobine’ tribe (Bailey et al., 2002; corresponding to the ‘arabidopsis’ and ‘cardaminoide’ clades of Heenan et al., 2002). Consequently, we agree with the findings of Koch et al. (2005, 2007) that the pseudogenes may serve as a deep lineage marker for this family. This is especially important, as nearly all morphological characters historically used for the classification of this family are highly homoplasic (Mummenhoff et al., 1997; Koch et al., 2003; Al-Shehbaz et al., 2006). However, as eight genera contain species with three or more pseudogene copies (see Table 1 of Koch et al., 2005), intraspecific variation is likely to be widespread in this family. The probability of length changes taking place in tandem repeats increases with the number of repeats (Kelchner, 2000). Consequently, for genera with multiple copies, such as Arabidopsis, the chloroplast genome probably exists as a population of cytotypes within each plastid. Vegetative sorting of plastids during apical meristem cell division (Birky, 1995) will determine the intraspecific diversity of IGS sequences. Thus, the sampling strategies employed during molecular studies (systematic or population based) will have a dramatic influence on the number of copies detected. Superficially similar tandemly repeated structures may arise independently in divergent lineages, leading to paralogy problems for phylogenetic reconstruction (Martin & Burg, 2002; Razafimandimbison et al., 2004). Our data also suggest that pseudogene sequence diversification occurs via the exchange of multi-nucleotide motifs between copies. Nucleotide substitutions therefore no longer have unique histories, and historical analytical methods (e.g. parsimony) will be inappropriate for describing sequence relationships. Rate heterogeneity will also exist among the ‘segregating’ structural motifs.

We strongly recommend that only pseudogene presence/absence be used as a phylogenetic character, and that pseudogene copy number and substitutions be ignored. Studies that have inadvertently included the duplications in their sequence analyses should be viewed with some caution (Wittzell, 1999; Bleeker & Hurka, 2001; Bailey et al., 2002; Lihová et al., 2004). We share Vijverberg & Bachman's (1999) opinion that structural mutations cannot be universally considered as reliable markers of evolutionary time.
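
The recommended character coding can be expressed as a short sketch: copy number is collapsed to a binary presence/absence state before any phylogenetic analysis. The taxon labels and copy counts below are hypothetical placeholders, not data from this study, and the function name is an assumption.

```python
# Sketch of presence/absence coding; taxa and counts are hypothetical examples.
def code_presence_absence(copy_counts):
    """Map each taxon's pseudogene copy count to 1 (present) or 0 (absent)."""
    return {taxon: int(count > 0) for taxon, count in copy_counts.items()}


example_counts = {"taxon_A": 0, "taxon_B": 1, "taxon_C": 5}
coded = code_presence_absence(example_counts)
print(coded)
```

Taxa with one copy and taxa with many copies receive the same state, so parallel expansions and contractions of the array do not contribute to the character matrix.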


Acknowledgments

The authors would like to thank Marcus Koch and Michael Frohlich for their observations, Anne-Marie Van Dodeweerd for constructive criticism of this manuscript, and especially the following, who helped with field collecting: Joseph Griemler, Paul Harvey, Andreas Hemp, Bill Kunin, Maria Clauss and Chris Preston. This work is part of a wider study of the biogeographic history of A. lyrata ssp. petraea undertaken during the PhD of SA, funded by N.E.R.C. grant no. GR3/12073 and supported by a B.S.B.I. travel award.


References
  • Al-Shehbaz, I.A., Beilstein, M.A. & Kellogg, E.A. 2006. Systematics and phylogeny of Brassicaceae (Cruciferae): an overview. Plant Syst. Evol. 259: 89–120.
  • Bailey, C.D., Carr, T.G., Harris, S.A. & Huges, C.E. 2003. Characterization of angiosperm nrDNA polymorphism, paralogy, and pseudogenes. Mol. Phylogenet. Evol. 29: 435–455.
  • Bailey, C.D., Koch, M.A., Mayer, M., Mummenhoff, K., O'Kane, S., Warwick, S.I., Windham, D. & Al-Shehbaz, I. 2006. Toward a global phylogeny of the Brassicaceae. Mol. Biol. Evol. 23: 2142–2160.
  • Bailey, C.D., Price, R.A. & Doyle, J.J. 2002. Systematics of the halimolobine Brassicaceae: evidence from three loci and morphology. Syst. Bot. 27: 18332.
  • Birky, C.W. 1995. Uniparental inheritance of mitochondrial and chloroplast genes. Proc. Natl Acad. Sci. U.S.A. 92: 11331–11338.
  • Birky, C.W. 2001. The inheritance of genes in mitochondria and chloroplasts, mechanisms and models. Annu. Rev. Genet. 35: 125–148.
  • Bleeker, W. & Hurka, H. 2001. Introgressive hybridisation in Rorippa (Brassicaceae): gene flow and its consequences in natural and anthropogenic habitats. Mol. Ecol. 10: 2013–2022.
  • Bowmann, C.M., Barker, R.F. & Dyer, T.A. 1988. The location and possible evolutionary significance of small dispersed repeats in wheat ctDNA. Curr. Genet. 10: 931–941.
  • Ceplitis, A., Su, Y.T. & Lascoux, M. 2006. Bayesian inference of evolutionary history from chloroplast microsatellite in the cosmopolitan weed Capsella bursa-pastoris (Brassicaceae). Mol. Ecol. 14: 4221–4233.
  • Chuang, S. & Blattner, F.R. 1994. Ultra-fast DNA recovery from agarose by centrifugation through a paper slurry. Biotechniques 17: 634.
  • Coates, D.J. & Byrne, M. 2005. Genetic variation in plant populations. In: Plant Diversity and Evolution: Genotypic and Phenotypic Variation in Higher Plants (R. J. Henry, ed.), pp. 139–164. CABI Publishing, Cambridge, MA.
  • Corrivea, J.L. & Coleman, A.W. 1988. Rapid screening method to detect potential biparental inheritance of plastid DNA and results for over 200 angiosperm species. Am. J. Bot. 75: 1443–1458.
  • Cosner, M.E., Jansen, R.K., Palmer, J.D. & Downie, S.R. 1997. The highly rearranged chloroplast genome of Trachelium caeruleum (Campanulaceae): multiple inversions, inverted repeat expansion and contraction, transposition, insertions/deletions, and several repeat families. Curr. Genet. 31: 419–429.
  • Cosner, M.E., Raubeson, L.A. & Jansen, R.K. 2004. Chloroplast DNA rearrangements in Campanulaceae: phylogenetic utility of highly rearranged genomes. Evol. Biol. 4: 27.
  • Cozzolino, S., Cafasso, D., Pellegrino, G., Musacchio, A. & Widmer, A. 2003. Molecular evolution of a plastid tandem repeat locus in an orchid lineage. J. Mol. Evol. 57: S41–S49.
  • Deng, X.-W., Wing, R.A. & Gruissem, W. 1989. The chloroplast genome exists in multimeric forms. Proc. Natl Acad. Sci. U.S.A. 86: 4156–4169.
  • Doyle, J.J. & Doyle, J.L. 1987. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. Phytochem. Bull. 19: 11–15.
  • Drábkova, L., Kirschner, J., Vlček, Č. & Paček, V. 2004. TrnL–trnF intergenic spacer and trnL intron define major clades within Luzula and Juncus (Juncaceae): importance of structural mutations. J. Mol. Evol. 59: 1–10.
  • Franzke, A. & Hurka, H. 2000. Molecular systematics and biogeography of the Cardamine pratensis complex (Brassicaceae). Plant Syst. Evol. 224: 213–234.
  • Golenberg, E.M., Clegg, M.T., Doebley, D.J. & Ma, D.P. 1993. Evolution of a non-coding region of the chloroplast genome. Mol. Phylogenet. Evol. 2: 52–64.
  • Hall, T.A. 1999. BioEdit: A User-Friendly Biological Sequence Alignment Editor and Analysis Program. North Carolina State University, Raleigh, NC.
  • Heenan, P.B., Mitchell, A.D. & Koch, M. 2002. Molecular systematics of the New Zealand Pachycladon (Brassicaceae) complex: generic circumscription and relationship to Arabidopsis sens. lat. and Arabis sens. lat. N. Z. J. Bot. 40: 543–562.
  • Hewitt, G.M. 2004. Genetic consequences of climate oscillations in the quaternary. Philos. Trans. R. Soc. B 359: 183–195.
  • Hey, J. & Wakeley, J. 1997. A coalescent estimator of the population recombination rate. Genetics 145: 833–846.
  • Hipkins, V.D., Marshall, K.A., Neale, D.B., Rottmann, W.H. & Strauss, S.H. 1995. A mutation hotspot in the chloroplast genome of a conifer (Douglas-fir: Pseudotsuga) is caused by variability in the number of direct repeats derived from a partially duplicated tRNA gene. Curr. Genet. 27: 572–579.
  • Hiratsuka, J., Shimada, H., Whitter, R. et al. (12 co-authors) 1989. The complete sequence of the rice (Oryza sativa) chloroplast genome: intermolecular recombination between distinct tRNA genes accounts for a major plastid DNA inversion during the evolution of the cereals. Mol. Gen. Genet. 217: 185–194.
  • Hoot, S.B. & Palmer, J.D. 1994. Structural rearrangements, including parallel inversions, within the chloroplast genome of Anemone and related genera. J. Mol. Evol. 38: 274–281.
  • Hupfer, H., Swiatek, M., Hornung, S., Herrmann, R.G., Maier, R.M., Chiu, W.L. & Sears, B.B. 2000. Complete nucleotide sequence of the Oenothera elata plastid chromosome, representing plastome I of five distinguishable Euoenothera plastomes. Mol. Gen. Genet. 263: 581–585.
  • Huson, D. & Bryant, D. 2006. Application of phylogenetic networks in evolutionary studies. Mol. Biol. Evol. 23: 254–267.
  • Ingvarsson, P.K., Ribstein, S. & Taylor, D.R. 2003. Molecular evolution of insertions and deletions in the chloroplast genome of Silene. Mol. Biol. Evol. 20: 1737–1740.
  • Jales, J. & Suominen, J. 1994. Cardaminopsis petraea. In: Atlas of the Floraea Europaea, Volume 10. Cruciferae (Sisymbrium to Aubrieta) (J. Jales & J. Suominen, eds), pp. 180–181. Helsinki University Press, Helsinki, Finland.
  • Kanno, A., Watanabe, N., Nakamua, I. & Hirai, A. 1993. Variation in chloroplast DNA from rice (Oryza sativa): differences between deletions mediated by short-direct repeat sequences within a single species. Theor. Appl. Genet. 86: 579–584.
  • Kano, A., Lee, Y.-O. & Kameya, T. 1997. The structure of the chloroplast genome in members of the genus Asparagus. Theor. Appl. Genet. 95: 1196–1202.
  • Kawata, M., Harada, T., Shimamoto, K., Ono, K. & Takaiwa, F. 1997. Short inverted repeats function as hotspots of intermolecular recombination giving rise to oligomers of deleted plastid DNA (ptDNAs). Curr. Genet. 31: 179–184.
  • Kelchner, S.A. 2000. The evolution of non-coding chloroplast DNA and its application in plant systematics. Ann. Mo. Bot. Gard. 87: 482–498.
  • Kelchner, S.A. & Wendel, J.F. 1996. Hairpins create minute inversions in non-coding regions of chloroplast DNA. Curr. Genet. 30: 259–262.
  • Kim, K.J. & Lee, H.L. 2004. Complete chloroplast genome sequences from Korean ginseng (Panax schinseng Nees) and comparative analysis of sequence evolution among 17 other vascular plants. DNA Res. 11: 247–261.
  • Koch, M.A., Al-Shehbaz, I.A. & Mummenhoff, K. 2003. Molecular systematics, evolution, and population biology in the mustard family (Brassicaceae). Ann. Mo. Bot. Gard. 90: 151–171.
  • Koch, M.A., Dobeš, C., Kiefer, C., Schmickl, R., Klimes, L. & Lysak, M.A. 2007. Supernetwork identifies multiple events of plastid trnF(GAA) pseudogene evolution in the Brassicaceae. Mol. Biol. Evol. 24: 63–73.
  • Koch, M.A., Dobeš, C., Matschinger, M., Bleeker, W., Vogel, J., Kiefer, C. & Mitchell-Olds, T. 2005. Evolution of the trnF(GAA) gene in Arabidopsis relatives and the Brassicaceae family: monophyletic origin and subsequent diversification of a plastidic pseudogene. Mol. Biol. Evol. 22: 1032–1043.
  • Koch, A.M., Haubold, B. & Mitchell-Olds, T. 2001. Molecular systematics of the Brassicaceae: evidence from coding plastidic matK and nuclear CHS sequences. Am. J. Bot. 88: 534–544.
  • Lavin, M., Doyle, J. & Palmer, J.D. 1990. Evolutionary significance of the loss of the chloroplast DNA inverted repeat in the Leguminosae subfamily Papilionoideae. Evolution 44: 390–402.
  • Levinson, G. & Gutman, G.A. 1987. Slipped-strand mispairing: a major mechanism for DNA sequence evolution. Mol. Biol. Evol. 4: 203–221.
  • Lindholm, J. & Gustafsson, T. 1991. The chloroplast genome of the gymnosperm Pinus contorta: a physical map and a complete collection of overlapping clones. Curr. Genet. 20: 161–166.
  • Lihová, J., Aguilar, J.F., Marhold, K. & Feliner, G.N. 2004. Origin of the disjunct tetraploid Cardamine amporitana (Brassicaceae) assessed with nuclear and chloroplast DNA sequence data. Am. J. Bot. 91: 1231–1242.
  • Martin, A.P. & Burg, T.M. 2002. Perils of paralogy: using HSP70 genes for inferring organismal phylogenies. Syst. Biol. 51: 570–587.
  • Mitchell-Olds, T. 2001. Arabidopsis thaliana and its wild relatives: a model system for ecology and evolution. Trends Ecol. Evol. 16: 693–700.
  • Mitchell-Olds, T., Al-Shehbaz, I.A., Koch, M.A. & Sharbel, T.F. 2005. Crucifer evolution in the post-genomic era. In: Plant Diversity and Evolution: Genotypic and Phenotypic Variation in Higher Plants (R. J. Henry, ed.), pp. 119–136. CABI Publishing, Cambridge, MA.
  • Mummenhoff, K., Bruggeman, H. & Bowman, J.L. 2001. Chloroplast DNA phylogeny and biogeography of Lepidium (Brassicaceae). Am. J. Bot. 81: 2051–2063.
  • Mummenhoff, K., Franzke, A. & Koch, M. 1997. Molecular data reveal convergence in fruit characters used in the classification of Thlaspi s.l. (Brassicaceae). Bot. J. Linn. Soc. 125: 183–199.
  • Muse, S.V. 2000. Examining rates and patterns of nucleotide substitutions in plants. Plant Mol. Biol. 42: 481–490.
  • O'Kane, S.L. & Al-Shehbaz, I.A. 1997. A synopsis of Arabidopsis (Brassicaceae). Novon 7: 323–327.
  • Ogihara, Y., Isono, K., Kojima, T., et al. (19 co-authors) 2002. Structural features of a wheat plastome as revealed by complete sequencing of chloroplast DNA. Mol. Gen. Genom. 266: 740–746.
  • Ogihara, Y., Terachi, T. & Sasakuma, T. 1988. Intramolecular recombination of chloroplast genomes mediated by short direct-repeat sequences in wheat species. Proc. Natl Acad. Sci. U.S.A. 85: 8573–8577.
  • Ohnishi, Y., Tajiri, H., Matsuoka, Y. & Tsunewaki, K. 1999. Molecular analysis of a 21.1 kbp fragment of wheat chloroplast DNA bearing RNA polymerase subunit (rpo) genes. Genome 42: 1042–1049.
  • Palmer, J.D. 1991. Plastid chromosomes: structure and evolution. In: Cell Culture and Somatic Cell Genetics in Plants, Vol. 7. The Molecular Biology of Plastids (L. Bogorad & I. K. Vasil, eds), pp. 5–53. Academic Press, San Diego, CA.
  • Palmer, J.D., Jansen, R.K., Michaels, H., Manhart, J. & Chase, M. 1988. Chloroplast DNA variation and plant phylogeny. Ann. Mo. Bot. Gard. 75: 1180–1206.
  • Perry, A.S., Brennan, S., Murphy, D.J., Kavanagh, T.A. & Wolfe, K.H. 2002. Evolutionary re-organisation of a large operon in adzuki bean chloroplast DNA caused by inverted repeat movement. DNA Res. 9: 157–162.
  • Provan, J., Soranzo, N., Wilson, N.J., Goldstein, D.B. & Powell, W. 1999. A low mutation rate for chloroplast microsatellites. Genetics 153: 943–947.
  • Pyke, K.A. 1999. Plastid division and development. Plant Cell 11: 549–556.
  • Quandt, D., Müller, K., Stech, M., Frahm, J.-P., Frey, W., Hilu, W. & Borsch, T. 2004. Molecular evolution of the chloroplast trnL-F region in land plants. Monogr. Syst. Bot. Mo. Bot. Gard. 98: 13–37.
  • Razafimandimbison, S.G., Kellogg, E.A. & Bremer, B. 2004. Recent origin and phylogenetic utility of divergent ITS putative pseudogenes: a case study from Naucleeae (Rubiaceae). Syst. Biol. 53: 177–192.
  • Raubson, L.A. & Jansen, R.K. 2005. Chloroplast genomes of plants. In: Plant Diversity and Evolution: Genotypic and Phenotypic Variation in Higher Plants (R. J. Henry, ed.), pp. 45–68. CABI Publishing, Cambridge, MA.
  • Sambrook, J., Fritsch, E.F. & Maniatas, T. 1989. Molecular Cloning, A Laboratory Manual, 2nd edn. Cold Spring Harbor Laboratory Press, New York.
  • Sang, T., Crawford, D.J. & Stussey, T.D. 1997. Chloroplast DNA phylogeny, reticulate evolution and biogeography of Paeonia (Paeoniaceae). Am. J. Bot. 84: 1120–1136.
  • Sato, S., Nakamura, Y., Kaneko, T., Asamitzu, E. & Tabata, S. 1999. Complete structure of the chloroplast genome of Arabidopsis thaliana. DNA Res. 6: 283–290.
  • Schaal, B.A., Hayworth, D.A., Olsen, K.M., Rauscher, J.T. & Smith, W.A. 1998. Phylogeographic studies in plants: problems and prospects. Mol. Ecol. 7: 465–475.
  • Schmitz-Linneweber, C., Maier, R.M., Alcaraz, J.P., Cottet, A., Herrmann, R.G. & Mache, R. 2001. The plastid chromosome of spinach (Spinacia oleracea): complete nucleotide sequence and gene organization. Plant Mol. Biol. 45: 307–315.
  • Taberlet, P., Gielly, L., Pautou, G. & Bouvet, J. 1991. Universal primers for amplification of three non-coding regions of chloroplast DNA. Plant Mol. Biol. 17: 1105–1109.
  • Vijverberg, K. & Bachman, K. 1999. Molecular evolution of a tandemly repeated trnF (GAA) gene in the chloroplast genomes of Microseris (Asteraceae) and the uses of structural mutations in phylogenetic analyses. Mol. Biol. Evol. 16: 1329–1340.
  • Wakasugi, T.J., Sugita, M., Tsudzuki, T. & Sugiura, M. 1998. Updated gene map of tobacco chloroplast DNA. Plant Mol. Biol. Rep. 16: 231–241.
  • Wakasugi, T.J., Tsudzuki, T.J., Ito, S., Nakashima, K., Tsudzuki, T. & Sugiura, M. 1994. Loss of all ndh genes as determined by sequencing the entire chloroplast genome of black pine Pinus thunbergii. Proc. Natl Acad. Sci. U.S.A. 91: 9794–9798.
  • Wittzell, H. 1999. Chloroplast DNA variation and reticulated evolution in sexual and apomictic sections of dandelions. Mol. Ecol. 8: 2023–2035.
  • Wolfe, K.H., Morden, C.W. & Palmer, J.D. 1992. Function and evolution of a minimal plastid genome from a nonphotosynthetic parasitic plant. Proc. Natl Acad. Sci. U.S.A. 89: 10648–10652.
  • Yukawa, M., Tsudzuki, T. & Sugiura, M. 2005. 2005 version of the chloroplast DNA sequence from tobacco (Nicotiana tabacum). Plant Mol. Biol. Rep. 23: 359–365.
  • Zuker, M. 2003. Mfold web server for nucleic acid folding and hybridisation prediction. Nucleic Acids Res. 31: 3406–3415.

Supporting Information


Table S1. The distribution of trnL-trnF IGS variants detected among samples of Arabidopsis, as determined by PCR RFLP analysis of Bavarian samples or by direct DNA sequencing of PCR products from elsewhere. IGS variants were classified by taxon (PET1-34, HALL1-3) and by subtype (A-F) if more than one nucleotide variant was detected.

Table S2. Arabidopsis trnL-trnF IGS MseI RFLP patterns. Note that digestion profiles are corrected to represent the sizes expected for the entire PCR product generated using the ‘E’ and ‘F’ primers of Taberlet et al. (1991).

JEB1397SF1.doc (126K): Supporting info item
JEB1397ST1.tif (5435K): Supporting info item
JEB1397ST2.tif (3866K): Supporting info item
