|I.||Introduction – overview|
|III.||Factors that have shaped the genomes of angiosperms and gymnosperms|
|IV.||The future of genomic studies in seed plants|
The large-scale replacement of gymnosperms by angiosperms in many ecological niches over time and the huge disparity in species numbers have led scientists to explore factors (e.g. polyploidy, developmental systems, floral evolution) that may have contributed to the astonishing rise of angiosperm diversity. Here, we explore genomic and ecological factors influencing seed plant genomes. This is timely given the recent surge in genomic data. We compare and contrast the genomic structure and evolution of angiosperms and gymnosperms and find that angiosperm genomes are more dynamic and diverse, particularly amongst the herbaceous species. Gymnosperms typically have reduced frequencies of a number of processes (e.g. polyploidy) that have shaped the genomes of other vascular plants and have alternative mechanisms to suppress genome dynamism (e.g. epigenetics and activity of transposable elements). Furthermore, the presence of several characters in angiosperms (e.g. herbaceous habit, short minimum generation time) has enabled them to exploit new niches and to be viable with small population sizes, where the power of genetic drift can outweigh that of selection. Together these processes have led to increased rates of genetic divergence and faster fixation times of variation in many angiosperms compared with gymnosperms.
Seed plants today comprise the Angiospermae (hereafter referred to as the angiosperms or flowering plants) and the Acrogymnospermae (hereafter referred to as the gymnosperms) (Cantino et al., 2007). Whilst much attention has been devoted to angiosperm genome structure and evolution, rather little has been expended on gymnosperms. Nevertheless, recent advances suggest that substantial differences exist between their genomes. Here we assess the evolutionary trajectories of genome diversification in these two groups, taking into account what is known from the monilophytes (comprising horsetails, whisk ferns, eusporangiate and leptosporangiate ferns) and lycophytes (comprising club and spike mosses and quillworts) to make an assessment as to which mechanisms and evolutionary processes are derived or lineage-specific. Such a review is timely given the recent surge in sequence and genomic data arising from technological advances.
Gymnosperms, which today comprise c. 780 species, are represented by just four groups (cycads, Ginkgo, gnetaleans and Coniferales (conifers), Fig. 1). Yet they have a long and extensive fossil record which goes back to the Carboniferous (c. 290 million yr ago (mya)), and comprises arguably 14 groups with considerably greater morphological and species diversity in the Mesozoic era than today (Willis & McElwain, 2002; Hilton & Bateman, 2006). By contrast, morphologically recognizable fossil angiosperms first appeared more recently (in the lower Cretaceous c. 130 mya), and this was followed by a 20–30 million yr (myr) period before there is any fossil evidence of their rise to ecological dominance in the mid-Cretaceous (c. 100 mya) (Bateman et al., 2006; Friis et al., 2010). Today, angiosperms comprise c. 352 000 species (Paton et al., 2008) and they dominate most terrestrial ecosystems. The sudden appearance and rapid diversification of angiosperms in the fossil record were, to Darwin, an ‘abominable mystery’ (letter from Darwin to Hooker 1879; see Friedman, 2009) and indeed, there is still considerable interest in their early evolution. Yet despite probing features such as gene divergence, development, anatomy and morphology, including, where possible, representatives in the fossil record (e.g. Bateman et al., 2006; Doyle, 2008), the closest living relatives of angiosperms remain controversial (Frohlich & Chase, 2007). The large-scale replacement of gymnosperms by angiosperms in many ecological niches over time and the huge disparity in species numbers between these two seed plant groups have also led scientists to explore the factors that may have contributed to the astonishing rise in angiosperm diversity.
Indeed, a number of features have been proposed, some or all of which may be significant, including polyploidy (Fawcett et al., 2009; Jiao et al., 2011), seedling growth rate (Bond, 1989), breeding and developmental systems (Donoghue & Scheiner, 1992), and the evolution of the flower and insect pollinators (Willis & McElwain, 2002). However, Davies et al. (2004) considered that multiple interactions between biological traits and the environment, rather than a few key processes, are responsible for angiosperm diversity.
Here, we explore a new dimension – that of genome structure and the mechanisms by which seed plant genomes are shaped. We compare and contrast the structure, evolution and dynamics of angiosperm and gymnosperm genomes. We argue that the angiosperms are undergoing relatively fast genome evolution compared with gymnosperms, but that gymnosperms have evolved genetic novelties not encountered in either the angiosperms or other vascular plants (i.e. monilophytes and lycophytes). Thus the genomes in these two groups have undergone distinct evolutionary trajectories. Fundamental differences in life strategy and genome dynamism may well have resulted in angiosperms being well placed among existing seed plants to become the dominant land plant group.
To appreciate differences between angiosperm and gymnosperm genomes, we first explore the relationships between these two groups, so that available genomic data can be viewed in an appropriate evolutionary context. We then examine the broad nature and extent of genomic differences across vascular plants (lycophytes, monilophytes, gymnosperms and angiosperms) and finally we speculate as to which aspects of seed plant biology may have influenced, or been influenced by, genome dynamics.
Vascular plants are considered to have originated and diverged > 425 mya in the early Silurian (Steemans et al., 2009). Today they fall into three distinct clades: lycophytes, monilophytes and seed plants (Pryer et al., 2001). Seed plants first became dominant in the world flora in the Permian (290–248 mya) with the expansion of two early seed plant groups (pteridosperms and cordaites) and the emergence of new groups, including cycads, Ginkgo, Bennettitales and glossopterids. By the Upper Permian, 60% of the world’s flora was dominated by gymnosperms (Willis & McElwain, 2002). The expansion in the Permian was associated with major shifts in environmental conditions generated in part by the formation of the large land mass, Pangea, which formed a continental climate in tropical latitudes (Willis & McElwain, 2002).
From these seed plants, what emerged and survives to this day are the four extant lineages of gymnosperms and angiosperms. The most abundant extant gymnosperm group is Coniferales, whose species perhaps still occupy niches similar to the ones that they occupied in the early phase of their evolution (Willis & McElwain, 2002). Indeed Coniferales may be specialists in dealing with water stress, occurring today in tundra, uplands and water-stressed soils, where they can still form the dominant component of the flora. The remaining gymnosperm groups were at their zenith earlier in the geological record, and today have lower species numbers (Table 1).
| ||Lycophytes||Monilophytes||Cycads||Ginkgo||Coniferales||Gnetales||Angiosperms|
|Number of species||1200 spp.||9189 spp.||210 spp.||1 sp.||614 spp.||Ephedra: c. 40 spp.; Gnetum: 30 spp.; Welwitschia: 1 sp.||352 000 spp. (Paton et al., 2008)|
|Growth form||Small herbaceous terrestrial, xerophytic, aquatic and epiphytic plants requiring water for reproduction||‘Trees’, terrestrial, aquatic, epiphytic, annual and rhizomatous plants, requiring water for reproduction||Trees of warm climates||Tree species probably saved from extinction by cultivation (Nakao et al., 2001)||Trees and shrubs with one parasitic species known (Parasitaxus usta, Podocarpaceae)||Ephedra: xeromorphic shrubs; Gnetum: trees and lianas in tropical forests; Welwitschia: desert shrub (Mabberley, 2008)||Trees to ephemeral weeds growing in terrestrial, aquatic (including semi-marine) habitats. Includes parasitic and epiphytic species. Only absent in areas of perpetual ice or completely barren rock|
|Sexual system||Heterosporous and homosporous||Heterosporous and homosporous||Dioecious||Dioecious||Monoecious/dioecious||Ephedra: most species dioecious (Ickert-Bond, 2003), some mutants reported that are hermaphroditic (Sporne, 1974); Gnetum: monoecious and dioecious; Welwitschia: dioecious||All known sexual systems found although hermaphrodite flowers are most common|
|Breeding system||Heterosporous lycopods likely outcrossing, while homosporous lycopods with subterranean gametophytes likely inbreeding||Most species are highly outcrossing, although some inbreeding species are also reported (Soltis & Soltis, 1990)||Predominantly outcrossing, some inbreeding reported in Cycas seemannii (Keppel et al., 2002). Wind- and/or insect-pollinated||Outcrossing. Wind-pollinated||Predominantly outcrossing, although inbreeding does occur, but individuals are rare in populations and in Pinaceae they die between fertilization and seed maturity (Williams, 2007). Selfed conifers usually exhibit inbreeding depression (Ahuja, 2009)||Ephedra: highly outcrossing. Mostly wind pollinated but insect pollination reported for a few species (Kato et al., 1995); Gnetum: insect pollinated (Kato et al., 1995); Welwitschia: inbreeding rare (Jacobson & Lester, 2003). Likely to be insect pollinated (small amounts of pollen and nectar are produced, Wetschnig & Depisch, 1999)||Inbreeding commonly found although diverse mechanisms have evolved to promote outcrossing in many groups (e.g. self incompatibility genes, temporal separation in maturation of carpels and anthers, insect pollination systems)|
|Vegetative reproduction (no = unaware of any reports)||Agamospory in Isoetes and Selaginella reported (Walker, 1984)||Frequent across a wide variety of ferns (Walker, 1984)||Viable suckers reported in some species of Encephalartos (http://en.wikipedia.org/wiki/Encephalartos_woodii)||Yes (http://en.wikipedia.org/wiki/Ginkgo_biloba)||Occurs in Sequoia sempervirens (Ahuja, 2005) and some species of Abies, Picea, Pinus, Larix, Pseudotsuga, Thuja, Chamaecyparis and Cryptomeria (Land, 1913)||Ephedra: yes (Land, 1913); Gnetum: no; Welwitschia: no|
By contrast, angiosperms underwent extensive diversification in the mid- to late Cretaceous (Friis et al., 2006) and perhaps became predominant by filling new niches and taking advantage of vacant niches exposed by the mass extinction events at the Cretaceous–Tertiary (KT) boundary, c. 65 mya (Fawcett et al., 2009; Fawcett & Van de Peer, 2010). The angiosperms are the dominant land plant group today, comprising over 400 families (APGIII, 2009).
Much effort has been expended in an attempt to resolve the relationships between angiosperms, gymnosperms and their extinct relatives (fossil gymnosperms), using both morphological and DNA sequence-based characters. There is general agreement that Bennettitales, with a flower-like structure, and some other extinct gymnosperms (e.g. Pentoxylon, Caytonia, Corystospermales) form a distinct natural group with the angiosperms, the ‘anthophytes’ (Crane, 1985; Doyle, 2008). However, the relationships within extant gymnosperm groups and with angiosperms remain controversial (Frohlich & Chase, 2007). Unfortunately, available genomic data do not allow us to contribute to this debate because most of our knowledge of gymnosperm genomes is derived from Coniferales.
The distribution and range of genome sizes in each of the major groups of vascular plants are shown in Fig. 2 and Table 2 and it is clear that there are considerable differences between them. Of the c. 900 species of lycophytes, just 24 have C-values, although each of the major groups (clubmosses – Lycopodiaceae, quillworts – Isoetaceae and spikemosses – Selaginellaceae) is represented. Their genome sizes range 139-fold (1C = 0.086–11.97 pg, Fig. 2a). For the monilophytes, comprising c. 11 000 species, data are available for 93 species and their genome sizes range 94-fold (1C = 0.77–72.68 pg). However, the ranges in genome size differ substantially between the major groups of monilophytes. Horsetails (Equisetum), marattioid ferns and leptosporangiate ferns have small to medium-sized genomes (up to 1C = c. 30 pg), whilst considerably larger genomes are found in the whisk ferns (Psilotum 1C = 72.7 pg) and some ophioglossoid ferns (Ophioglossum 1C = c. 65 pg) (Fig. 2b). For further discussion of the implications and limitations of these data, see Leitch & Leitch (2012).
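The fold-ranges quoted here are simply the largest 1C-value in a group divided by the smallest. As a minimal sketch, the calculation can be reproduced from the extreme values given in the text and in Table 2; the dictionary and function names are illustrative only, and the gymnosperm extremes are taken as the smallest Gnetum and largest Pinus values listed in Table 2.

```python
# 1C genome-size extremes (pg) as quoted in the text and Table 2.
# Names below are illustrative, not from any published tool.
RANGES_PG = {
    "lycophytes": (0.086, 11.97),
    "monilophytes": (0.77, 72.68),
    "gymnosperms": (2.25, 36.0),
    "angiosperms": (0.063, 152.23),
}


def fold_range(smallest_pg, largest_pg):
    """Return how many times larger the largest genome is than the smallest."""
    if smallest_pg <= 0:
        raise ValueError("genome size must be positive")
    return largest_pg / smallest_pg


# Fold-range for every group, keyed by group name.
FOLD = {group: fold_range(lo, hi) for group, (lo, hi) in RANGES_PG.items()}

for group, value in FOLD.items():
    print(f"{group}: {value:.0f}-fold")
```

Because a fold-range depends only on the two extremes, it says nothing about how species are distributed between them; that is why the mode and mean are considered separately.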
| ||Lycophytes||Monilophytes||Cycads||Ginkgo||Coniferales||Gnetales||Angiosperms|
|Genome size range, 1C DNA amount (Leitch & Leitch, 2012 and references within)||Range: 139-fold. 1C = 0.086 pg in five Selaginella spp. to 11.97 pg in Isoetes lacustris||Range: 94-fold. 1C = 0.77 pg in the water fern Azolla microphylla to 72.68 pg in the whisk fern Psilotum nudum||Range: 1.8-fold. 1C = 12.05 pg in Zamia angustifolia to 21.1 pg in Encephalartos villosus||Monotypic genus. 1C = 9.95 pg||Range: 5.5-fold. 1C = 6.6 pg in Lepidothamnus intermedius to 36.0 pg in Pinus ayacahuite||Range: 8.1-fold. Ephedra: 1C = 8.9 pg (2x)–18.22 pg (4x); Gnetum: 1C = 2.25–3.98 pg; Welwitschia: 1C = 7.2 pg||Range: 2400-fold. 1C = 0.063 pg in Genlisea margaretae to 152.23 pg in Paris japonica|
|Chromosome number range (cf. Bennett & Leitch, 2010; Leitch & Leitch, 2012 and references within)||2n = 14–550. 2n = 14 in some Selaginella spp. (Jermy, 1967) to 2n = 550 in Huperzia prolifera (Tindale & Roy, 2002)||2n = 18 to c. 1440. 2n = 18 in some Salvinia spp. (Tatuno & Takei, 1969) to 2n = c. 1440 in Ophioglossum reticulatum (Abraham & Ninan, 1954)||2n = 16–28. 2n = 16 in Stangeria, Ceratozamia and Zamia to 2n = 28 in Zamia and Microcycas||2n = 24||2n = 18–66. 2n = 18 in Podocarpaceae to 2n = 66 in Sequoia sempervirens (Murray, 2012). There is a dubious count of 2n = 14 in Amentotaxus. Most Pinaceae have the same number, 2n = 24. Podocarpaceae is the most variable family, 2n = 18–38 (Hair & Beuzenberg, 1958)||2n = 14–56. Ephedra: 2n = 14, 28, 42, 56; Gnetum: 2n = 22 and 44 (Hizume et al., 1993); Welwitschia: 2n = 42||2n = 4–640. 2n = 4 reported in six species to 2n = 640 in Sedum suaveolens (Crassulaceae) (Uhl, 1978)|
|Major chromosome evolution mechanisms||Polyploidy and dysploidy |
Polyploidy is most commonly found in homosporous lycophytes, up to 50-ploid in Huperzia (Husband et al., 2012)
Dysploid variations in base chromosome numbers (e.g. in Selaginella Jermy, 1967)
|Polyploidy and dysploidy |
Chromosomal polyploidy often to high levels in homosporous fern species (e.g. 96× in Ophioglossum) and/or chromosomal fragmentation. In heterosporous ferns, polyploidy is less frequent
|Robertsonian fusion/fissions |
Robertsonian fusion/fissions and unequal chromosome translocations frequently observed (Ehrendorfer, 1976). One triploid Encephalartos hildebrandtii reported with 2n = 27 (Abraham & Mathew, 1966)
| Robertsonian fusions/fissions |
In Podocarpaceae (Jones, 1998)
 Polyploidy rare 1.5% of species chromosomally polyploid, most in Cupressaceae. Sequoia sempervirens considered hexaploid (2n = 66). Two species of Pseudolarix (Pinaceae) considered tetraploid (Murray, 2012)
 Ephedra: 44% of species reported to be tetraploid, but two octoploid cytotypes also found (Ickert-Bond, 2003)
 Gnetum: Unclear
 Welwitschia: Unclear (monotypic genus)
|Polyploidy and dysploidy |
Polyploidy can occur to high levels, 80x reported in Sedum suaveolens (Uhl, 1978) and is recurrent in many lineages (Jiao et al., 2011).
Robertsonian fusions and fissions occur only sporadically (Jones, 1998)
|Rate of chromosomal diversification||No data reported||No data reported||Very low rate |
0.00 (10 genera examined)
|Low rate |
0.00012 (26 genera examined)
|No data reported||High rates |
Herbs: 0.0736 (99 genera examined); Shrubs: 0.0102 (63 genera examined); Hardwoods: 0.0014 (39 genera examined)
|Occurrence of endoreduplication||None reported |
Based on analysis of six species by Bainard et al. (2011)
Up to 16C reported in gametophyte and apical meristems of sporophytes in Equisetum and some leptosporangiate ferns (reviewed in Polito, 1980; Bainard & Newmaster, 2010)
|No data found |
Web-of-knowledge search parameters (endo* and cycad*)
|Occurs 64C reported in female gametophyte (Avanzi & Cionini, 1971)||Rare |
Endomitosis followed by nuclear fusion to give endopolyploid cells to 6C in female gametophytes in Cupressaceae (Pichot & El Maataoui, 1997; El Maataoui & Pichot, 1999)
| Ephedra: |
No data found Web-of-knowledge search parameters (endo* and Ephedra)
Gnetum and Welwitschia: Endoreduplication not reported, but nuclear fusion leading to polyploid nuclei observed in both genera (Waterkeyn, 1954; Martens & Waterkeyn, 1974)
 Common in annuals and biennial herbs (in e.g. glandular and vascular tissue)
 Largely absent in woody species
 Endosperm typically, although not ubiquitously, triploid through nuclear fusion (Barow & Meister, 2003; Barow & Jovtchev, 2007)
|Angiosperm-type siRNA directed DNA methylation DCL3 occurrence and size of small RNA fraction||Perhaps |
In Selaginella moellendorffii, DCL3 occurs (but two other dicer-like family members in angiosperms are absent). Predominantly 21 nt small RNA fraction, and a small 24 nt RNA fraction (Banks et al., 2011)
No data on DCL3 found. 21 and 24 nt small RNA fraction observed (Dolgosheina et al., 2008)
No data on DCL3 found. Small RNA fraction unclear (Dolgosheina et al., 2008)
No data on DCL3 found. 21 nt fraction present, presence of other fractions unclear (Dolgosheina et al., 2008)
|Probably not |
Conifers: No DCL3, dearth of 24 nt small RNA fraction whereas 21 nt fraction abundant (Dolgosheina et al., 2008; Morin et al., 2008)
 Ephedra. No data on DCL3 found. Small RNA fraction unclear (Dolgosheina et al., 2008).  Gnetum and Welwitschia. No data found
DCL3 and 24 nt fraction – used for siRNA directed DNA methylation (Lisch, 2009)
|No. of nucleolar organizing regions (NORs)/ribosomal DNA (rDNA) sites in meristematic cells||No data||Limited data |
Osmunda japonica has eight sites (terminal) (2n = 44) (Kawakami et al., 1999) Ceratopteris richardii (2n = 78) has 16 sites (i.e. four major sites and 12 minor sites) (most terminally located) (McGrath & Hickok, 1999)
|Numerous pericentromeric or terminal rDNA sites, ranging from six sites in Zamia angustifolia (2n = 16) to 20 sites in Z. muricata (2n = 23) (Tagashira & Kondo, 2001). In Cycas revoluta rDNA is distal on 4 submetacentric and all 12 telocentric chromosomes (Hizume et al., 1992). Two to four sites reported for Bowenia (see Kokubugata et al., 2000)||4 sites of rDNA (number of satellites varies between three in male and four in female plants) (Nakao et al., 2005)||Typically, large numbers of rDNA loci and highly variable chromosomal distribution between related taxa. Pinus: 8–20 sites in 19 species (Cai et al., 2006). |
Picea 12–16 sites (Siljak-Yakovlev et al., 2002)
Abies– 10 sites (Puizina et al., 2008)
Podocarpus 6 sites colocalizing to 5S rDNA (Murray et al., 2002) 7–10 sites based on review of Pinus (Cai et al., 2006). Great variation in distribution of rDNA in Pinus contrasts with stable karyotype of 2n = 24
|1 pair of satellite chromosomes and 2 NORs in Welwitschia (Ahuja, 2005)||1–10 sites based on review of Cai et al. (2006)|
The seed plants have better representation in terms of genome size data. Indeed, gymnosperms have the highest proportion of species with recorded C-values (c. 25% of species), with at least one genome size estimate for each family (Leitch et al., 2001). Despite this, gymnosperms have surprisingly little genome size variation (just 16-fold overall, Table 2, Fig. 2c–e) compared with c. 2400-fold variation in angiosperms, the largest range for any plant group (Pellicer et al., 2010). Furthermore, given that only c. 1.8% of angiosperms (of c. 352 000 species) have been measured, it is likely that an even greater range in genome size exists in this group.
In addition to the large range of genome sizes, angiosperms are unusual in that the distribution is strongly skewed towards small genomes (Fig. 2f), with the modal and mean genome sizes being just 1C = 0.6 and 5.9 pg, respectively (based on data for 6287 species). This small size is despite polyploidy in the ancestry of all lineages (see Section II.3, Jiao et al., 2011). By contrast, the distribution of genome sizes in the gymnosperms and monilophytes is less skewed, with higher mode and mean values (gymnosperms mode 1C = 10.0 pg, mean 1C = 18.8 pg; monilophytes mode 1C = 8.0 pg, mean 1C = 14.0 pg).
Lycophytes and particularly monilophytes have an astonishing range in chromosome number (from 2n = 14 to c. 2n = 1440), with the karyotypes typically composed of many tiny chromosomes that are often difficult to count (Table 2, e.g. Fig. 3a). Much of this diversity has been accounted for by polyploidy, although there are too few detailed karyotype studies to know whether more complex processes are also involved. We also find a large diversity of chromosome numbers (2n = 4–640) in angiosperms although here there is also considerable variation in size (from < 0.5 to > 30 μm), both contributing to the large range in genome sizes (Figs 2f, 3b,c).
By contrast, chromosome numbers and sizes are less variable in gymnosperms. The Coniferales and cycads are characterized by fairly large (c. 5–15 μm) chromosomes in comparison to most angiosperms, and chromosome numbers vary only from 2n = 18 to 2n = 66 in Coniferales and 2n = 16 to 2n = 28 in cycads. In Gnetales, species in Ephedra (2n = 14–56), Gnetum (2n = 22, 44) and Welwitschia (2n = 42) all have smaller chromosomes (≤ 5 μm), which are often telocentric. These data indicate reduced chromosome number diversity in gymnosperms, particularly in the most species-rich family of Coniferales, the Pinaceae, where only three out of 157 species counted deviate from 2n = 24 (Murray, 2012).
Such observations are supported by the study of Levin & Wilson (1976), who estimated chromosome diversification rates in herbaceous, shrubby and woody angiosperms and compared the data with Coniferales and cycads. Values were calculated from chromosome number diversity within genera and ages of the genera as predicted from the fossil record. The data showed that herbaceous angiosperms had significantly higher rates of chromosome diversification, both aneuploidy and polyploidy, than woody and shrubby angiosperms, which themselves had significantly higher rates of chromosome diversification than the Coniferales (Table 2).
Much chromosome number diversity in angiosperms is associated with polyploidy (whole genome duplication, see Section II.3) and dysploidy (the increase or decrease of chromosome numbers in association with chromosome rearrangements). Dysploidy is often caused by chromosome rearrangements, and in angiosperms these may be complex (Schubert & Lysak, 2011; Fig. 3d) and involve recombination between nonhomologous sequences, potentially causing losses and gains of DNA. Dysploidy may be common in angiosperms because polyploidy and paleopolyploidy give rise to genetic redundancy and perhaps an enhanced tolerance to DNA losses associated with karyotype restructuring.
In gymnosperms, dysploidy likely arises through Robertsonian rearrangements – the fusion or fission of chromosomes at or around the centromere, generating one metacentric or two telocentric chromosomes per rearrangement event, respectively. This is particularly prevalent in the cycad genus Zamia where chromosome numbers include 2n = 16, 18, and 21–28 (Caputo et al., 1996; Tagashira & Kondo, 2001; Fig. 3e), although chromosome number changes between and within cycad genera may also involve more complex rearrangements (Kokubugata & Kondo, 1998).
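The bookkeeping behind Robertsonian change can be sketched in a few lines: a fission replaces one metacentric with two telocentrics, a fusion does the reverse, and the number of chromosome arms stays constant while the chromosome number changes. The starting karyotype below (12 metacentrics and four telocentrics) is hypothetical, chosen only so that the counts span the 16–28 range reported for Zamia; it is not a published karyotype, and the function names are illustrative.

```python
def fission(metacentrics, telocentrics):
    """Split one metacentric at the centromere into two telocentrics."""
    if metacentrics < 1:
        raise ValueError("no metacentric chromosome left to split")
    return metacentrics - 1, telocentrics + 2


def fusion(metacentrics, telocentrics):
    """Join two telocentrics at the centromere into one metacentric."""
    if telocentrics < 2:
        raise ValueError("need two telocentrics to fuse")
    return metacentrics + 1, telocentrics - 2


def chromosome_number(metacentrics, telocentrics):
    """Total chromosome count of the karyotype."""
    return metacentrics + telocentrics


def arm_number(metacentrics, telocentrics):
    """Total number of chromosome arms (two per metacentric, one per telocentric)."""
    return 2 * metacentrics + telocentrics


# Hypothetical starting karyotype: 12 metacentrics + 4 telocentrics (2n = 16).
karyotype = (12, 4)
START_ARMS = arm_number(*karyotype)
HISTORY = [chromosome_number(*karyotype)]

# Apply fissions one chromosome at a time until no metacentrics remain.
while karyotype[0] > 0:
    karyotype = fission(*karyotype)
    HISTORY.append(chromosome_number(*karyotype))

END_ARMS = arm_number(*karyotype)
print(HISTORY, START_ARMS, END_ARMS)
```

In this simplified model each event changes the count by one, so odd totals correspond to an individual carrying a rearrangement on only one member of a homologous pair.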
Another source of chromosome diversification is found in sex chromosomes, although they are infrequently encountered. In angiosperms, they are reported in Silene, Rumex, Humulus, Cannabis and some Cucurbitaceae (Bernasconi et al., 2009; Janousek et al., 2012). In gymnosperms there are dubious reports of sex chromosomes in Ginkgo, as there are inconsistencies over which chromosomes are considered responsible and which is the heterogametic sex (cf. Lee, 1954; Ruiyang et al., 1987). In cycads, differences in the number of secondary constrictions at nuclear ribosomal DNA (rDNA) loci between male and female plants have been observed (Segawa et al., 1971; Nakao et al., 2005), which, if correct, indicate epigenetic control. In lycophytes and monilophytes, no sex chromosomes have been reported, probably because sex in these groups is largely determined environmentally.
Polyploidy, affecting all cells in the plant, has occurred frequently in the divergence of many angiosperm species. Indeed Jiao et al. (2011) reported one whole genome duplication (polyploidy) event in the common ancestor of all seed plants and another in the common ancestor of all angiosperms. It has been estimated from chromosome counts mapped to phylogenetic trees that c. 15% of all angiosperm speciation events have arisen through polyploidy (Wood et al., 2009), and polyploid lineages have clearly been extremely successful (Soltis et al., 2009). Many angiosperms show evidence of recurrent polyploidy in their ancestry. For example, the model plant Arabidopsis thaliana, selected for whole genome sequencing because of its small genome size and low chromosome number (1C = 157 Mb and 2n = 10, Bennett et al., 2003), shows evidence of at least three ancient whole-genome duplications since the origin of angiosperms (Bowers et al., 2003). Furthermore, c. 12 000–300 000 yr ago, a progenitor of this species was involved in another allopolyploidy event, showing that polyploidy is both ancient and ongoing (Jakobsson et al., 2006).
Amongst the other vascular plants, polyploidy is also common and can reach high levels in homosporous ferns (monilophytes) and lycophytes. For example, Ophioglossum has chromosome numbers ranging from 2n = 60 to 2n = 1440, with most species having counts that are multiples (or near multiples) of n = 120 (Khandelwal, 1990). Interestingly, there is also a good linear relationship between chromosome number and genome size for a range of monilophyte species (Nakazato et al., 2008; Bainard et al., 2011), suggesting that polyploidy is not associated with substantial genome downsizing, the loss of DNA following polyploidy, that typically occurs in angiosperms (Leitch & Bennett, 2004). Perhaps diploidizing processes that erode the chromosomal signature of polyploidy in angiosperms over time are less prevalent in monilophytes and lycophytes; for example, analyses of the diploid fern Ceratopteris richardii only found evidence of a single ancient duplication at 180 mya (Barker & Wolf, 2010) and there was no evidence of paleopolyploidy in the genome of the lycophyte Selaginella moellendorffii (Banks et al., 2011).
In contrast to the high frequency of polyploidy observed in the lycophytes, monilophytes and angiosperms, within the gymnosperms, polyploidy is generally rare, with just a few tetraploid and one hexaploid (Sequoia sempervirens) species in Cupressaceae (Stebbins, 1948; Delevoryas, 1980), and one report of a triploid cycad (Encephalartos hildebrandtii; Abraham & Mathew, 1966). The exception is found in the Gnetales genus Ephedra, where c. 44% of species are chromosomally tetraploid and two octoploid cytotypes have also been reported (Ickert-Bond, 2003).
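The ploidy levels mentioned here follow from dividing the somatic chromosome number (2n) by the base number (x) of the lineage. The sketch below assumes x = 7 for Ephedra (inferred from the diploid count of 2n = 14) and x = 11 for Sequoia sempervirens (implied by a hexaploid with 2n = 66); both base numbers are assumptions made for illustration, and the function name is hypothetical.

```python
def ploidy_level(two_n, base_number):
    """Return the number of chromosome sets for a somatic count.

    two_n       -- somatic chromosome number (2n)
    base_number -- assumed base chromosome number (x) of the lineage
    """
    if base_number <= 0:
        raise ValueError("base number must be positive")
    if two_n % base_number != 0:
        # A remainder suggests dysploidy or a wrongly assumed base number.
        raise ValueError("2n is not a whole multiple of x")
    return two_n // base_number


# Ephedra counts listed in Table 2, assuming x = 7.
EPHEDRA = {two_n: ploidy_level(two_n, 7) for two_n in (14, 28, 42, 56)}

# Sequoia sempervirens (2n = 66), assuming x = 11.
SEQUOIA = ploidy_level(66, 11)

print(EPHEDRA, SEQUOIA)
```

The calculation is only as good as the assumed base number, which is why counts that are not whole multiples of x are flagged instead of rounded.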
Why polyploidy is less important in the diversification of cycads and conifers than in angiosperms, monilophytes and lycophytes is unclear. The frequent occurrence of polyploidy across vascular plants suggests that the incidence of polyploidy may have been reduced in these gymnosperm lineages. One possible explanation is based on the observation that polyploids often arise through the production of unreduced gametes. In angiosperms, the high mean frequency of unreduced gametes (0.56% of gametes, rising 50-fold to 27.52% in hybrids) may lead to triploids and tetraploids, which in turn may act as bridges to higher ploidy levels (Ramsey & Schemske, 1998; Husband, 2004). In gymnosperms, unreduced gametes may be produced less frequently. Indeed, we are aware of only one paper reporting unreduced gametes in gymnosperms and this was in Cupressus dupreziana (Pichot & El Maataoui, 2000). It may be significant that this species belongs to Cupressaceae, one of the few conifer families reported to have polyploid representatives.
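To illustrate why triploids may matter as a bridge, the unreduced-gamete frequencies quoted above can be turned into expected offspring classes. This is a deliberately simplified sketch: it assumes diploid parents, the same unreduced-gamete frequency in male and female gametes, random and independent union of gametes, and equal viability of all offspring. None of these assumptions comes from the studies cited, and the function name is illustrative.

```python
def offspring_ploidy_probabilities(p_unreduced):
    """Expected offspring classes when each gamete is unreduced with probability p."""
    if not 0.0 <= p_unreduced <= 1.0:
        raise ValueError("frequency must lie between 0 and 1")
    p = p_unreduced
    return {
        "diploid": (1 - p) ** 2,      # reduced x reduced
        "triploid": 2 * p * (1 - p),  # one unreduced gamete (either parent)
        "tetraploid": p ** 2,         # unreduced x unreduced
    }


# Mean frequencies quoted in the text (as proportions, not percentages).
NONHYBRID = offspring_ploidy_probabilities(0.0056)
HYBRID = offspring_ploidy_probabilities(0.2752)

# Fold increase in unreduced gametes in hybrids relative to non-hybrids.
FOLD_INCREASE = 0.2752 / 0.0056

for label, result in (("non-hybrid", NONHYBRID), ("hybrid", HYBRID)):
    print(label, {k: round(v, 5) for k, v in result.items()})
```

Under these assumptions, triploids are expected to be far more frequent than tetraploids formed in a single step whenever unreduced gametes are rare, which is consistent with the bridge idea described above.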
Errors in meiosis are a source of unreduced gametes, as shown in Atps1 mutants of A. thaliana which produce a high frequency (c. 65%) of diploid pollen as a consequence of aberrant spindle orientation in meiosis II (d’Erfurth et al., 2008). However, unreduced gametes may also arise via somatic doubling of chromosomes in the germline as a result of mitotic errors, which, if occurring early enough in development, can influence all or many cell lineages. Indeed, the first reported example of an allopolyploid in angiosperms arose from spontaneous doubling of chromosomes in somatic cells of a single branch of a Primula hybrid (generating the new polyploid species Primula kewensis (Newton & Pellew, 1929)).
There may also be a correlation between the incidence of polyploidy and of endoreduplication, which involves rounds of DNA replication without nuclear or cell division. Endoreduplication is commonly encountered in many angiosperms, although it is restricted to certain cell types, often glandular and vascular tissue. In some species endoreduplication can reach astonishingly high levels, for example, 1024C in Scilla bifolia antipodal cells, 8192C in Phaseolus coccineus suspensor cells and potentially (based on nuclear volumes) up to 24 576C in the endosperm haustoria of Arum maculatum (Bennett, 2004). By contrast, in gymnosperms, endoreduplication is much less prevalent. In Gnetum, Welwitschia and Cupressaceae, only low numbers of endopolyploid cells have been reported, and these have resulted from nuclear fusions in multinucleate cells within gametophytic tissues (Table 2, Pichot & El Maataoui, 1997; El Maataoui & Pichot, 1999). The highest degree of endoreduplication in gymnosperms is reported in Ginkgo biloba at 64C (Avanzi & Cionini, 1971).
Both endoreduplication and polyploidy are particularly common in annual and biennial angiosperms, whereas polyploid series (Stebbins, 1971) and endoreduplication (Barow & Meister, 2003) are rarer in tree species of both angiosperms and gymnosperms (Table 2). Furthermore, it may be no coincidence that where endoreduplication is found in conifers (Cupressaceae, Pichot & El Maataoui, 1997; El Maataoui & Pichot, 1999), polyploidy has also been reported. It could be envisaged that the genes involved in regulating endoreduplication (Chevalier et al., 2011), if expressed erroneously in vegetative cells leading to the germline lineage, would give rise to polyploid gametes. There are no data on endopolyploidy in Ephedra as far as we know, but if there is a correlation between the occurrence of polyploidy and endoreduplication in development, one might expect endopolyploidy to be encountered in this genus, where polyploidy is common.
Comparative genomic studies have revealed that the angiosperm genome is astonishingly dynamic and flexible in terms of its structure and the rate and occurrence of DNA integration and excision (Kejnovsky et al., 2009). DNA is integrating into the nuclear genome from a variety of sources, including (retro)transposable elements, viral DNA, plastid and mitochondrial sequences. Furthermore, stress and polyploidy activate retrotransposon transcription and integration (Grandbastien, 1998; Petit et al., 2010). The inevitable genome enlargement is counteracted by mechanisms which excise DNA (e.g. unequal crossing over and illegitimate recombination). The result is that much of the genome may be turning over within a few myr. For example, in rice, it has been estimated that retroelements have a half-life of only 6 myr (Ma et al., 2004), while in Nicotiana, there is evidence of near-complete turnover of genomic DNA within 5 myr (Lim et al., 2007). A consequence of this high turnover is that the genome becomes homogenized and few chromosome-specific characters emerge.
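A half-life can be converted into the fraction of sequence expected to persist after a given time. The sketch below assumes simple exponential decay at a constant rate, which is an idealization introduced here for illustration and not a model taken from the studies cited; the function name is hypothetical.

```python
def fraction_remaining(elapsed_myr, half_life_myr):
    """Fraction of the original sequence left after elapsed_myr,
    assuming constant-rate exponential loss."""
    if half_life_myr <= 0:
        raise ValueError("half-life must be positive")
    if elapsed_myr < 0:
        raise ValueError("elapsed time cannot be negative")
    return 0.5 ** (elapsed_myr / half_life_myr)


# Rice retroelements: half-life of 6 myr, as quoted in the text.
for elapsed in (5, 6, 12, 30):
    remaining = fraction_remaining(elapsed, 6)
    print(f"after {elapsed} myr: {remaining:.1%} remaining")
```

On this idealized model, roughly 3% of a retroelement cohort would remain after 30 myr, which gives a sense of how quickly chromosome-specific features could be erased.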
The question is whether there is any evidence for similar degrees of genome dynamism in gymnosperms. However, most of our understanding of gymnosperm chromosomes is derived from conifers, particularly Pinaceae, especially Pinus, and, to a lesser extent, the cycads. Here we summarize some general features (for a more extensive analysis, see Murray, 2012).
Fluorochrome staining of chromosomes shows that many gymnosperms have bands of chromatin enriched in either GC or AT nucleotides, typically occurring at centromeric regions and secondary constrictions (Jacobs et al., 2000). Whilst some angiosperm groups and species have similar bands, they are not so regularly encountered.
Species belonging to the gymnosperm genera Pinus, Podocarpus and Cunninghamia (but not Abies, Picea, Cryptomeria, Larix and Ginkgo; Hizume et al., 2000; Puizina et al., 2008) are also unusual amongst plants in having numerous large blocks of interstitial telomere sequences that form bands, enabling individual chromosomes to be identified (e.g. Fig. 4a, Schmidt et al., 2000; Murray et al., 2002; Islam-Faridi et al., 2007). Interstitial telomere sequences do occur in angiosperms (e.g. in Cestrum; Sykorova et al., 2003), but we are unaware of any species where they form such discrete, large and numerous bands on chromosomes. In Zamia, there are blocks of interstitial telomeric repeats at centromeres (composed of the motif (TTTAGGG)n), their location perhaps arising from Robertsonian fusions of chromosomes (Kondo & Tagashira, 1998).
Next-generation sequencing studies on Pinus taeda show that its relatively large genome (1C = c. 22 pg) is dominated by a high number of low abundance repeats (Morse et al., 2009; Kovach et al., 2010). Such a finding suggests that all of the repeats have been similarly constrained in their ability to amplify to high copy numbers. This contrasts with studies of many angiosperms, where a few individual repeat families, typically retroelements, have amplified greatly to become a significant proportion of the genome (e.g. Gorge3 elements in Gossypium herbaceum, 34% of genome (Hawkins et al., 2006); RIRE1 element in Oryza australiensis, 27% of genome (Piegu et al., 2006); Ogre element in Pisum sativum, 20–33% of genome (Macas et al., 2007); see also review by Hawkins et al. (2008)).
Collectively, these data for Coniferales (AT- and GC-rich banding patterns, interstitial telomeric sequences and lack of specific highly abundant repeats contributing to genome size differences) suggest a more highly structured, less dynamic chromosome than is typical of angiosperms. Kejnovsky et al. (2009) argued that a highly structured and compartmentalized genome arises when there is a high ratio of intrachromosomal to interchromosomal homogenization. Chromosomal substructuring is likely to break down with elevated interchromosomal homogenization involving, for example, the mobility of (retro)transposons, illegitimate recombination and unequal homologous recombination. Indeed, such processes in angiosperms are thought to generate the high-copy-number, widely dispersed repeats that can dominate their genomes. By contrast, the more constrained and highly structured genomes of conifers suggest that these processes may not be so active and/or operate less extensively across the genome.
Interestingly, the apparently low rate of chromosomal divergence reported in conifers is not reflected in the distribution of 45S rDNA loci, where in Pinaceae, their numbers vary considerably between species. This is reminiscent of the situation in the angiosperm genus Aloe where variability in rDNA distribution within a fixed karyotype structure was also reported (Adams et al., 1998). Nevertheless, Pinus species have more numerous 45S rDNA sites than is typical of angiosperms (Murray, 2012). Indeed, Pinus taeda has 38 rDNA sites distributed across its 24 chromosomes, a likely record in plants (Fig. 4b, Islam-Faridi et al., 2007). Conifers also have a longer 45S rDNA unit length (ranging from 27 to 40 kb) than is typical of angiosperms (ranging between 6 and 13 kb), with the differences largely accounted for by the length of the intergenic spacer region (Ribeiro et al., 2008).
Data on genome and chromosomal dynamics in monilophytes and lycophytes are scarce, making it difficult to assess whether it is the gymnosperms (at least the conifers) that have experienced a slowing down of chromosome and genome evolution, or angiosperm genomes that have become more dynamic. Nevertheless, analyses of genome size and chromosome numbers (Nakazato et al., 2008; Bainard et al., 2011) together with molecular data (Nakazato et al., 2006) hint at the latter scenario being the more likely, as the data suggest that monilophyte and lycophyte genomes are less dynamic than angiosperms (Nakazato et al., 2008; Barker & Wolf, 2010).
Angiosperm genomes are characterized by being rich in both tandem and dispersed repeats, especially retrotransposable elements, which contribute to the dynamic nature of their genomes over evolutionary time (Kejnovsky et al., 2009). Despite this dynamism, repeat mobility and amplification are considered to be constrained via heterochromatinization. The mechanism driving this process involves the activity of RNA-dependent RNA polymerase (RdRP) and a specific class of the endonuclease family Dicer called Dicer-Like 3 (DCL3), which together generate a diverse array of 24 nucleotide (nt) RNA fragments. These small interfering RNAs (siRNAs) are targeted back to the DNA where, through homology with genomic sequences, they facilitate siRNA-directed DNA methylation (RDDM) and histone modifications (Henderson & Jacobsen, 2007; Zhang, 2008; Lisch, 2009). This, in turn, is associated with DNA condensation and (retro)transposon silencing.
However, epigenetic mechanisms involved in constraining repeat amplification and mobility may not be the same across all land plants. In some monilophytes, a 24 nt siRNA fraction is reported (Dolgosheina et al., 2008) while in the moss Physcomitrella patens, there is a broader range of siRNA size categories compared with angiosperms (from 21 to 24 nt, Cho et al., 2008). By contrast, there was a dearth of 24 nt fragments in the small RNA fraction of several conifer species investigated. Instead the conifers had an abundant fraction of 21 nt RNAs, occurring with high sequence diversity (Dolgosheina et al., 2008; Morin et al., 2008).
In addition to lacking a 24 nt small RNA fraction, conifers are thought to lack DCL3 (Dolgosheina et al., 2008), which in angiosperms generates the 24 nt fraction of small RNA. Furthermore, an analysis of 2833 orthologues from 101 genera across seed plants, including representatives of the Coniferales, Gnetales and cycads, revealed that the clade comprising gymnosperms is strongly supported by the divergence of NRPD2, an essential protein in processing the 24 nt siRNA (Lee et al., 2011). In angiosperms, the processed siRNAs are targeted back to the repeats, where they direct DNA methylation and influence the distributions of particular histone modifications (Lisch, 2009). Interestingly, investigated gymnosperm genomes show different distribution patterns of histone modifications. For example, while mono- and dimethylation of lysine 9 on histone H3 (i.e. H3K9me1/me2) is a mark for heterochromatin in angiosperms, the same histone modification in gymnosperms (Picea abies and Pinus sylvestris) marks euchromatin (Fuchs et al., 2008). Collectively these data suggest significant differences in genome silencing mechanisms in gymnosperms (or perhaps conifers) compared with angiosperms.
Given that homologues of DCL3 are found in P. patens and the lycophyte S. moellendorffii (Cho et al., 2008; Banks et al., 2011), the data also suggest that siRNA-directed DNA methylation arose early in land plant evolution. If so, this would suggest that it has since been lost or modified in gymnosperms (or conifers, see Table 2). However, we do not know enough about the distribution of DCL3, 24 nt RNA fractions and patterns of histone modifications in other gymnosperm groups to know if the differences reported are restricted to conifers.
During pollen grain maturation in A. thaliana, the active vegetative nucleus produces copious quantities of siRNAs associated with retroelement activation. However, the siRNAs then accumulate in the sperm cells, one of which will form the zygote. In the sperm cells, the siRNAs probably act to ensure the retroelements remain transcriptionally silent (Slotkin, 2010). In gymnosperms there cannot be protection of the germline in the same way since there is no double fertilization mechanism or clear demarcation of nuclei for particular roles, as in angiosperms. (Although double fertilization has been reported in Gnetales, it is not considered to be homologous to that found in angiosperms, and its role is unclear, since it does not form a second embryo or a nutritive tissue (reviewed in Donoghue & Scheiner, 1992).)
Overall, there are clearly elaborate molecular and cellular mechanisms to silence repeats in angiosperms; however, these mechanisms may be different in gymnosperms and this might contribute to their more stable genomes over time. Furthermore, Dolgosheina et al. (2008) proposed that the larger modal genome size of conifers may also be correlated with different genome silencing mechanisms. Potentially such major shifts in mechanisms for genome silencing may prove to be useful new characters for resolving angiosperm vs gymnosperm affinities.
From the evidence summarized in the previous section, it is apparent that angiosperm genomes have experienced faster chromosome and genome evolution and they are more genetically diverse than those of gymnosperms, particularly in the Coniferales. The underlying reasons that may have driven the greater dynamism of angiosperm genomes are likely to be a consequence of fundamental differences in their biology, falling into two broad categories (Fig. 5):
Studies on genome size (Beaulieu et al., 2010), chromosome structural diversity (Levin & Wilson, 1976), and molecular evolution of nuclear, plastid and mitochondrial sequences (Smith & Donoghue, 2008) all suggest that herbaceous angiosperms have higher rates of evolution than woody angiosperms. Whilst ecological factors (e.g. small population sizes, shorter generation times; see later and Fig. 5) might explain these data for herbaceous and woody angiosperms, rates of chromosomal diversification are even lower for gymnosperm than for angiosperm trees, suggesting that genetic factors are also involved (Levin & Wilson, 1976).
In the context of vascular plants as a whole, our understanding is hampered by limited data. However, it is apparent that each group has features that characterize the diversification of their genomes. For example, monilophytes have high degrees of polyploidy, but their overall stable genome structure means that this is not associated with ‘genome downsizing’. The Coniferales may be unusual amongst vascular plants in having a low incidence of polyploidy and they may have evolved their own epigenetic mechanisms to silence (retro)transposons. The angiosperms have the most dynamic genomes and genome diversity, with the large range of genome sizes encountered reflecting the balance between DNA amplification and losses over time.
Whilst polyploidy likely arises in all groups of organisms (polyploids are a product of meiotic and mitotic errors; Ramsey & Schemske, 1998), the greater occurrence of polyploidy in angiosperms compared with gymnosperms may be because it arises more frequently (see potential association with endoreduplication discussed earlier). Furthermore, polyploids may be more likely to become fixed in angiosperms because they can frequently inbreed, and because polyploids must inevitably arise from small numbers, potentially individual plants.
Polyploidy, which is more frequent in angiosperms than in gymnosperms, offers polyploids potential advantages over their diploid progenitors, for example, through fixing heterozygosity (mitigating inbreeding depression), the evolution of genetic novelty from the gene duplicates (neofunctionalization and subfunctionalization) or via the integration of regulatory and biochemical networks and the formation of novel molecular complexes, all of which have the potential to generate transgressive characters (Soltis & Soltis, 2000; Chen, 2007; Leitch & Leitch, 2008; Kejnovsky et al., 2009).
Nevertheless, there are also potential disadvantages to polyploidy, including fertility losses arising from inappropriate chromosome pairing and segregation at meiosis and an increase in genome size. However, these too are likely to be contributing to the dynamic nature of angiosperm genomes. Studies of natural and synthetic angiosperm polyploids in early generations after their formation have shown that individuals can have relatively low fertility and significantly altered karyotypes (e.g. carrying chromosomal translocations, deletions and duplications; Lim et al., 2006; Gaeta et al., 2007). Even in synthetic polyploids, changing copy numbers of repeats can be observed (Petit et al., 2010; Renny-Byfield et al., 2011). From these variants, selection favours the most fertile individuals where chromosome pairing and segregation are regular. In addition, selection and drift will also act on the evolution of genome size.
Large-scale surveys of C-values across many taxa have shown that a common response to polyploidy in angiosperms is ‘genome downsizing’, whereby DNA is eliminated from the polyploid, so that its genome size is significantly less than expected (Leitch & Bennett, 2004). Such data contrast with monilophytes, where C-values seem to increase in proportion to chromosome number, suggesting in this group the stable inheritance of genome structure over long time periods (Barker & Wolf, 2010). Currently there are no comparable data for polyploids of Ephedra so it is unknown whether this gymnosperm genus undergoes genome downsizing like angiosperms or is more similar to monilophytes in its response to polyploidy.
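The comparison behind ‘genome downsizing’ can be sketched as a simple calculation: the expected 1C-value of a polyploid is taken as the sum of the 1C-values of its progenitor genomes, and the observed value is compared against it. The Python sketch below makes that arithmetic explicit. The function names and all numerical values are hypothetical and chosen purely for illustration; they are not data from Leitch & Bennett (2004) or any other cited study.

```python
from typing import Sequence


def expected_polyploid_1c(progenitor_1c_pg: Sequence[float]) -> float:
    """Expected 1C-value (pg) of a polyploid if no DNA were gained or lost.

    Assumes simple additivity of the progenitor genomes.
    """
    if not progenitor_1c_pg:
        raise ValueError("at least one progenitor 1C-value is required")
    return sum(progenitor_1c_pg)


def downsizing_fraction(observed_1c_pg: float,
                        progenitor_1c_pg: Sequence[float]) -> float:
    """Proportion of the expected DNA amount that is missing from the polyploid.

    Positive values indicate downsizing, values near zero indicate additivity,
    and negative values indicate a genome larger than expected.
    """
    expected = expected_polyploid_1c(progenitor_1c_pg)
    return 1.0 - observed_1c_pg / expected


if __name__ == "__main__":
    # Hypothetical allotetraploid formed from two diploids (values invented).
    progenitors = [2.0, 3.0]
    observed = 4.5
    print(f"expected 1C = {expected_polyploid_1c(progenitors):.2f} pg")
    print(f"downsizing  = {downsizing_fraction(observed, progenitors):.1%}")
```

A fraction close to zero would correspond to the proportional increase described for monilophytes, whereas a clearly positive fraction would correspond to the downsizing reported for many angiosperm polyploids.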
In addition to polyploidy, (retro)transposition results in an increase in genome size; however, these mechanisms that increase DNA amount can be counteracted by processes that act to reduce the genome (e.g. replication or recombination-based errors generating indels, unequal recombination between sister chromatids or chromosomes; Grover & Wendel, 2010). Collectively these opposing processes contribute to the highly dynamic nature of the angiosperm genome such that even within the context of a ‘diploid’ there can be near-complete turnover of the nongenic component of the genome within a few myr (Ma et al., 2004; Lim et al., 2007).
(Retro)transposon mobility, which can be triggered by both polyploidy and stress (Grandbastien et al., 2005; Petit et al., 2010), is regulated by epigenetic processes. Epigenetic mechanisms are also involved in regulating DNA condensation, which can be predicted to influence the frequency of recombination-based mechanisms that also act to homogenize the genome. Certainly it has been argued that condensed rDNA is less vulnerable to sequence homogenization than actively transcribed rDNA in some angiosperms (Kovarik et al., 2008). The fundamental differences in epigenetic mechanisms that are emerging between angiosperms and gymnosperms might influence the dynamics and frequency of (retro)transposon and recombination events, and, in so doing, influence the dynamics of their evolution. It may also alter the frequency of interchromosomal homogenization compared with intrachromosomal homogenization which we previously suggested may contribute to the more highly structured genome in gymnosperms.
A key feature distinguishing angiosperms from gymnosperms is the combination of a rapid generation time and a herbaceous habit (i.e. plants lacking persistent woody stems above ground). Together these novelties may have contributed to the more dynamic genomes observed in many angiosperms through their influence on population size and nutrient availability.
Donoghue & Scheiner (1992) argue that the reduction in generation time was achieved in part through modifications to the development of the female and male gametophyte, leading to a more rapid sexual cycle and a change in the source of the nutritive tissue. In gymnosperms, gametophyte development can be slow; the haploid female gametophyte matures before fertilization to provide the nutritive tissue for the zygote while the time from pollination to fertilization can be extremely long. For example, in Pinus it takes 1 yr (Sporne, 1974) and in Ginkgo 4–5 months (Nakao et al., 2001), although in Ephedra it may be as little as 10–36 h. In addition, maturation of the embryo may also be slow, for example, requiring another 8 months in Ginkgo (Donoghue & Scheiner, 1992).
By contrast, the zygote in angiosperms is nourished by the endosperm which is typically a triploid tissue arising from the fusion of two haploid maternal gametophytic nuclei with one of the male nuclei (the other fuses with the egg cell to form the zygote). The presence of the endosperm and the potential for a rapid rate of early embryo development have enabled some angiosperms to evolve rapid life cycles which can be completed within 1 yr. Indeed, ephemeral species such as A. thaliana can complete their life cycle in just 6 wk. The minimum generation time for gymnosperms is found in Ephedra and it is several years.
In addition to the fast and efficient embryo growth found in angiosperms, they also display a much wider range of morphological forms than gymnosperms, which are all trees or shrubs. Indeed, as noted earlier, one of the novelties in angiosperms was the evolution of the short-lived herbaceous habit (absent in gymnosperms, monilophytes and lycophytes), which facilitated the evolution of species with distinct ecologies, including those that are aquatic, ephemeral/annual, weedy, insectivorous or halophytic. It seems likely that the combined features of a rapid life cycle and a herbaceous or shrubby habit were important in the establishment of angiosperms in the first 20–30 myr of their evolution (Taylor & Hickey, 1996; Friis et al., 2010). Certainly fossil angiosperm wood is very rare from this period, whilst for other plant organs, notably the flowers, there are numerous fossils (Friis et al., 2010).
Associated with a herbaceous habit in angiosperms is a higher frequency of inbreeding than observed in gymnosperms, favouring the establishment and maintenance of small viable populations. Indeed, tree species (found in both angiosperms and gymnosperms) are generally characterized by much larger effective population sizes of predominantly outcrossing species (Petit & Hampe, 2006). It has been argued by Scofield & Schultz (2006) that there may be selection against inbreeding in trees, arising because there is no sequestration of the germline. Since more cell divisions, and hence accumulated mutations, will occur in trees than in herbaceous plants, this will lead to a higher genetic load, higher magnitudes of inbreeding depression and thus selection for outcrossing, self-incompatibility and dioecy.
Long-distance seed dispersal and wind pollination are also likely to promote the large effective population sizes that are frequently observed in trees. For example, Robledo-Arnuncio (2011) showed that in a population of P. sylvestris there was over 4% gene flow between populations located 100 km apart. By contrast, insects may play an important role in maintaining small effective population sizes in herbaceous angiosperms occurring at low density, as they can specifically transfer pollen to appropriate individuals. Whilst insect pollination has been reported in cycads and Gnetales, it is absent from Coniferales, which are exclusively wind-pollinated. Furthermore, epigenetically controlled developmental or physiological plasticity reported in angiosperms may enable the establishment of small viable populations in different ecological settings. Certainly, different epigenetic profiles are observed between populations of orchids and it can be envisaged that over time drift and selection will result in their divergence (Paun et al., 2010, 2011).
These combinations of factors may well have contributed to the higher frequency of small population sizes seen in angiosperms compared with gymnosperms, which in turn may have played an important role in the evolution of the more dynamic angiosperm genomes. Lynch (2010) argues that with small effective population sizes, the power of selection is outweighed by genetic drift, preventing the evolution of processes that more faithfully replicate DNA between generations and resulting in faster rates of genetic divergence. Such an argument can also be used to explain why RDDM, a mechanism that heterochromatizes DNA and silences retroelements in angiosperms, is rather ineffective given the relatively high frequency of retrotransposition reported. Perhaps the power of selection for a more efficient mechanism is outweighed by that of genetic drift in small populations. Certainly, such a hypothesis is consistent with Levin & Wilson’s (1976) data on different rates of chromosome evolution between herbaceous and woody angiosperms and gymnosperms. Indeed, these authors suggested that the high rates of chromosome change observed in herbaceous angiosperms arose because of their occurrence in small isolated populations, which were more likely to experience the dual pressures of drift and local selection. By contrast, the slow rates of chromosome change observed in gymnosperms and angiosperm trees arose because they comprised long-lived woody plants with large effective population sizes that experienced a more uniform environment over their lifetime.
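The qualitative argument that drift can outweigh selection in small populations can be illustrated numerically. The Python sketch below uses Kimura's classical diffusion approximation for the fixation probability of a new mutation; this formula is brought in here for illustration and is not taken from Lynch (2010) or from the studies reviewed above. It assumes a diploid population whose census size equals its effective size, additive selection, and a mutation that starts as a single copy. The function names, the selection coefficient and the population sizes are hypothetical.

```python
import math


def fixation_probability(s: float, ne: int) -> float:
    """Kimura's diffusion approximation for a new single-copy mutation.

    Assumes a diploid population with census size equal to effective size
    (ne), additive selection coefficient s, and initial frequency 1 / (2 * ne).
    """
    if ne <= 0:
        raise ValueError("ne must be positive")
    if s == 0:
        return 1.0 / (2 * ne)  # neutral expectation
    try:
        # (1 - exp(-2s)) / (1 - exp(-4 * ne * s)), written with expm1
        return math.expm1(-2.0 * s) / math.expm1(-4.0 * ne * s)
    except OverflowError:
        # Strongly deleterious relative to drift: effectively never fixes.
        return 0.0


def relative_to_neutral(s: float, ne: int) -> float:
    """Fixation probability divided by the neutral expectation 1 / (2 * ne)."""
    return fixation_probability(s, ne) * 2 * ne


if __name__ == "__main__":
    s = -0.001  # hypothetical, slightly deleterious mutation
    for ne in (100, 10_000, 100_000):  # hypothetical population sizes
        ratio = relative_to_neutral(s, ne)
        print(f"Ne = {ne:>7}: fixation probability is {ratio:.3g} x neutral")
```

With these illustrative values, the slightly deleterious mutation behaves almost like a neutral one in the small population but is efficiently removed in the large ones, which is the sense in which drift is said to outweigh selection.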
The differences between angiosperms and gymnosperms are apparent in their diversity and distribution of genome sizes. Angiosperms have a nearly 2400-fold range in their genome size, far exceeding that found in the gymnosperms (16-fold range). Despite this, 50% of angiosperm species for which we have data have a small genome size of 1C = 2.5 pg or less, and the modal genome size for angiosperms is just 1C = 0.5 pg (Fig. 2e). It is likely, therefore, that for most angiosperms there is strong selection against large genomes. That selection may arise in part through constraints imposed by limiting amounts of available phosphates and nitrates that can be assimilated and are needed for DNA synthesis (Leitch & Leitch, 2008). In addition, given that there is a correlation between genome size, cell cycle time and minimum generation time (the minimum time taken from the seed of one generation to the seed of the next), this may impose an upper limit on the genome sizes of all species that are annual or ephemeral (Bennett, 1987).
By contrast, the smallest genome so far reported for a gymnosperm (1C = 2.25 pg in Gnetum ula) is over four times larger than the modal genome size in angiosperms (Fig. 2). The absence of annual and ephemeral gymnosperms and an ecology that is likely to be most limited by available water (most species live in dry sandy or high altitude/latitude habitats) rather than nutrient availability (Table 1) may act together or independently to reduce selection pressures for small genome sizes. It may be significant that the smallest genomes are in Gnetum, which grows in areas that are not water-limited, but on soils that have particularly low available nutrients.
Recent advances in next- and third-generation sequencing technologies are set to revolutionize research on a wide range of organisms, extending our understanding far beyond the few tens of model organisms that have been used for the majority of genetic studies in the past. With these new studies, comparative genomics will come of age, enabling large-scale comparisons across widely divergent groups, and the testing of hypotheses relating to genome evolution in different ecological settings.
The focus of the immediate future should be on gymnosperms, particularly outside of the Coniferales, where there is a dearth of data. Hitherto, the focus on Coniferales has given a misleading impression of the genetics of gymnosperms; for example, polyploidy is regularly encountered in Ephedra. A gymnosperm model is also urgently needed to spearhead detailed developmental, cell biology and genetic studies. We suggest that Ephedra or Welwitschia may present the best opportunities for this, having the shortest generation times (a few yr), and a plant size that makes them amenable to glasshouse growth.
We also suggest that more in-depth genomic studies of gymnosperms, particularly outside Coniferales, may prove influential in resolving affinities between seed plants. Already the differences recently observed in the epigenetic mechanisms between Coniferales and angiosperms may be fundamental, and comparison with other gymnosperm groups may provide some powerful and novel phylogenetic tools.
With respect to ecological data there is an urgent need to better understand how drift and selection act together in the shaping of genomes and the underlying cellular processes. Such an understanding will be essential if we are to predict how climate change will influence the different vascular plant groups and how they respond in different ecological settings.
We thank NERC for support. We thank friends and colleagues for useful comments, especially Richard Buggs, Jeff Duckett, Laura Kelly and Richard Nichols for helpful conversations. We also thank the referees for their helpful and constructive comments.