Reconstructing relative genome size of vascular plants through geological time



  • The strong positive relationship evident between cell and genome size in both animals and plants forms the basis of using the size of stomatal guard cells as a proxy to track changes in plant genome size through geological time.
  • We report for the first time a taxonomic fine-scale investigation into changes in stomatal guard-cell length and use these data to infer changes in genome size through the evolutionary history of land plants.
  • Our data suggest that many of the earliest land plants had exceptionally large genome sizes and that a predicted overall trend of increasing genome size within individual lineages through geological time is not supported. However, maximum genome size steadily increases from the Mississippian (c. 360 million yr ago (Ma)) to the present.
  • We hypothesise that the functional relationship between stomatal size, genome size and atmospheric CO2 may contribute to the dichotomy reported between preferential extinction of neopolyploids and the prevalence of palaeopolyploidy observed in DNA sequence data of extant vascular plants.


A strong positive correlation between cell size and genome size is apparent not only across animals, where the relationship is widely regarded as ubiquitous (Gregory, 2005), but also across plants (e.g. Bennett & Leitch, 2005; Beaulieu et al., 2008; Greilhuber & Leitch, 2013). This correlation has been used in occasional palaeontological studies to reconstruct genome size of disparate groups of fossil vertebrates, including lungfish, conodonts and dinosaurs (Thomson, 1972; Conway Morris & Harper, 1988; Organ et al., 2007, 2011). In plants, Masterson (1994) investigated stomatal area in extant and fossil leaves to infer changes in ploidy in three extant families of angiosperms and suggested that the majority (c. 70%) of angiosperms have polyploidy within their ancestry. This view is supported by large-scale surveys of chromosome numbers in extant angiosperms (e.g. Raven, 1975) and subsequent work in molecular evolution, which together indicate that whole-genome duplication (polyploidy) events are common throughout angiosperm evolution (Levin, 2002; De Bodt et al., 2005; Jiao et al., 2011; Fawcett et al., 2013).

Since Masterson's publication, there has been a significant increase in the number of genome size estimates of extant plants (e.g. Obermayer et al., 2002; Bennett & Leitch, 2005, 2011, 2012), providing sufficient data to enable a broader-scale investigation of the relationship between guard-cell length (GCL) and genome size (GS) (Fig. 1). Beaulieu et al. (2008) showed that the relationship was highly significant across a phylogenetically diverse suite of angiosperms, including seven magnoliids, 32 monocots and 62 eudicots, that together encompassed a wide range of growth forms (41 herbaceous, 26 shrub and 34 tree species). Lomax et al. (2009) demonstrated that the GS–GCL relationship in Arabidopsis thaliana was robust across a wide range of ecologically and geologically relevant environmental conditions, including elevated CO2, drought and UV-B radiation. Together, these studies encourage estimation of GS from GCL measurements of fossil plants. Franks et al. (2012a) sampled key nodes in the phylogeny of extant vascular plants to generate a relationship between GS and GCL, in order to predict the genome size of fossil members of several plant groups (including lycophytes, ferns, sphenophytes, progymnosperms, pteridosperms, bennettites, cycads and angiosperms). These data were then used to reconstruct changes in palaeogenome size in major land-plant clades through geological time, finding a strong correlation between genome size and megacycles of atmospheric CO2 (pCO2).

Figure 1.

Relationship between genome size (GS) and guard-cell length (GCL) in extant angiosperms. Open circles, published data for a wide spectrum of angiosperms (Beaulieu et al., 2008); closed circles, Allium species. Linear regression through the entire dataset shows a highly significant positive relationship (y = −7.068 + (0.801 × x), r2 = 0.71, F = 25.795, P < 0.001) between GCL and GS, and independent contrast analysis shows that this relationship is independent of phylogeny (PicR = 0.45, P < 0.001).

Although the approach of Franks et al. (2012a) has delivered valuable data and insights into the evolution of GS through geological time and its possible link to megacycles of pCO2, the relationship between GS and GCL is not tightly defined across contrasting tracheophyte groups. From a genome perspective, the different groups of vascular plants may be evolving at different rates (e.g. Leitch & Leitch, 2013). For example, angiosperm genome size evolution is dynamic; both increases and decreases in GS occur in closely related lineages, often following polyploidy or retrotransposon amplification or deletion (Leitch & Bennett, 2004; Doyle et al., 2008; Leitch et al., 2008; Soltis et al., 2009). The frequency of these processes may well contribute to the 2400-fold range of genome sizes observed in angiosperms (Pellicer et al., 2010). By contrast, ploidy and genome size changes appear less frequent in extant gymnosperms (Khoshoo, 1959; Ohri & Khoshoo, 1986), perhaps explaining the narrow range of genome sizes encountered in this seed-plant group. Furthermore, no episodes of polyploidy have been identified within the genome of Picea abies, the first gymnosperm to be sequenced (Nystedt et al., 2013). Within extant lycophytes and monilophytes (ferns plus sphenophytes), current (albeit limited) data suggest that these groups also have a more static genome than angiosperms (Barker & Wolf, 2012).

There is also considerable uncertainty regarding how plant GS responds to environmental stimuli (Leitch & Leitch, 2012; Greilhuber & Leitch, 2013). Hence, the quantitative prediction of GS based on a limited, yet well-spaced, phylogenetic spectrum of samples may lead to over- or under-predictions of GS in extinct species. Furthermore, no taxonomically fine-scale fossil plant genome-size reconstruction has yet been undertaken. We therefore conducted an extensive palaeobotanical literature survey of GCL that covers the entire period of stomatal preservation, beginning in the Early Devonian (c. 415 Ma) and spanning the subsequent diversification of vascular land plants. Placing our fossil GCL database into a phylogenetic context allowed us to infer GS change through time over a range of taxonomic scales. However, given the concerns already highlighted, we do not (yet) seek to quantitatively reconstruct palaeogenome size.

Materials and Methods

In order to reconstruct GS in fossil plants, it is important to show that the positive correlation noted in the introduction holds across the whole range of GS and GCL values encountered in plants. Recognising that the Beaulieu et al. (2008) study under-represented species with large GS, we sampled 39 extant species of Allium – a monocot genus documented to maintain unusually large GS values (minimum 2C = 14 866 Mbp, maximum 2C = 145 722 Mbp, mean 2C = 38 546 Mbp; Bennett & Leitch, 2012). Allium species chosen had known genome sizes that had previously been archived in the Plant DNA C-values database (Bennett & Leitch, 2012). A 1-cm2 area was painted with clear enamel and removed with a piece of adhesive tape that was attached to the corner and then gently peeled back. The resulting impression in the enamel film of the guard cells was then placed in a drop of water on a microscope slide under a cover slip. Images were captured using a compound microscope with a digital camera and measurements of GCL were undertaken as in Beaulieu et al. (2008).

To further test for the effects of elevated CO2 on the relationship between GS and GCL in large-genome species, two species of Allium – A. cepa (onion: 2C = 32 763 Mbp) and A. tuberosum (garlic chives: 2C = 62 768 Mbp) – were grown in ambient (400 ppm) and elevated (800 ppm) CO2 in Conviron (Winnipeg, MB, Canada) BDR16 reach-in plant growth chambers at the University of Sheffield central annexe growth suite. The plant growth environment specified was: irradiance 300 μmol m−2 s−1; relative humidity 55%; day : night cycle of 16 h : 8 h at a temperature of 21°C : 18°C, respectively. GCL was determined as in Lomax et al. (2009).

Our extensive literature survey resulted in a matrix of 130 measurements of GCL that spanned the past 414 Myr of plant evolution (Supporting Information Table S1). To have been included in our dataset a record must have met the following selection criteria: (1) either a reported GCL or an image of a guard-cell pair with either captioned magnifications or an accurate scale bar from which we could estimate GCL, and (2) an acceptably narrow stratigraphic age (within 10 Myr). Our database contains examples where GCL was used as part of a taxonomic diagnosis and a range of GCL values was reported. In these cases, we scored the midpoint of the given range. We recognise that our methodology incurs several limitations, specifically determining for each species whether (1) the midpoint is representative of the true mean, (2) a published image is representative of the species as a whole, and (3) contrasts in preservation and image quality between featured specimens can be accommodated satisfactorily. Despite these limitations, this study represents an important step in quantifying changes in GCL through geologic time and thereby inferring substantial changes in genome size.
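The two selection criteria and the midpoint rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure actually used to build Table S1: the `FossilRecord` fields, the function name `score_gcl` and the reading of "within 10 Myr" as a maximum span between the oldest and youngest age estimates are all assumptions, and GCL values estimated from scaled images are assumed to have been converted to micrometres beforehand.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumed reading of criterion (2): age resolved to a span of at most 10 Myr.
MAX_AGE_SPAN_MYR = 10.0


@dataclass
class FossilRecord:
    """Hypothetical container for one literature record."""
    taxon: str
    age_min_ma: float                                   # youngest age estimate (Ma)
    age_max_ma: float                                   # oldest age estimate (Ma)
    gcl_um: Optional[float] = None                      # single reported or image-derived GCL
    gcl_range_um: Optional[Tuple[float, float]] = None  # (low, high) from a diagnosis


def score_gcl(record: FossilRecord) -> Optional[float]:
    """Return the GCL (in micrometres) scored for a record, or None if it is excluded."""
    # Criterion (2): reject records whose stratigraphic age is too poorly constrained.
    if record.age_max_ma - record.age_min_ma > MAX_AGE_SPAN_MYR:
        return None
    # Criterion (1): a single GCL value is used directly.
    if record.gcl_um is not None:
        return record.gcl_um
    # Where only a range is reported, score its midpoint.
    if record.gcl_range_um is not None:
        low, high = record.gcl_range_um
        return (low + high) / 2.0
    # No usable GCL information.
    return None
```

For example, a made-up record dated to 405–410 Ma with a diagnostic GCL range of 60–80 μm would be scored at 70 μm, whereas the same record dated only to 390–410 Ma would be excluded.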

Results and Discussion

Genome sizes in our Allium dataset of 39 species ranged from 2C = 22 934 to 2C = 92 861 Mbp and guard-cell lengths ranged from 28 to 100 μm. Integrating these measurements with the dataset of Beaulieu et al. (2008) increased the strength of the GS–GCL relationship (y = −7.068 + (0.801 × x), r2 = 0.71, F = 25.795, P < 0.001; Fig. 1). Independent contrast analysis incorporating phylogenetic relationships (Felsenstein, 1985; Ackerly et al., 2006; Webb et al., 2008) showed that the regression analysis was robust (PicR = 0.45, P < 0.001). The analysis of this expanded matrix increased confidence in the utility of using GCL to infer the GS of fossil plants.
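For readers unfamiliar with the quantities reported for the regression (intercept, slope and r2), the following minimal Python sketch shows how an ordinary least-squares line of the form y = a + b × x is fitted. It is a generic illustration only: the function name `fit_line` is hypothetical, the text above does not state which variable was treated as x or whether the data were transformed before fitting, and the sketch does not attempt the phylogenetically independent contrast analysis.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (intercept, slope, r_squared).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain more than one distinct value")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a simple linear regression, r2 is the squared Pearson correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return intercept, slope, r_squared
```

Applied to paired measurements for each species, the returned intercept and slope correspond to the two coefficients of the fitted line and `r_squared` to the proportion of variance explained.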

Our fossil plant GCL dataset is perhaps best viewed as a series of snapshots in the evolution of GS through geological time. Moreover, the plant fossil record is known to favour sun-leaves over shade-leaves and wetland over dryland species (e.g. Spicer, 1981) – habitats where large-GS species are infrequent in the modern flora (Knight et al., 2005). In addition, studies of extant angiosperms show that GS can increase instantly through genome duplication and rapidly through amplification of repetitive DNA such as transposable elements, though more gradual genome-size reduction can follow either category of increase (Leitch & Leitch, 2008; Grover & Wendel, 2010). Despite these factors, we are confident that our GCL dataset has revealed several striking and biologically meaningful patterns (Fig. 2a,b).

Figure 2.

Changes in guard-cell length (GCL) through geological time. (a) Plots of GCL against geological time. (b) 25 Myr time-binned average of GCL plotted against geological time; stars indicate timing of mass-extinction events (end-Ordovician, c. 443 Ma; end-Devonian, c. 375 Ma; end-Permian, c. 251 Ma; Triassic–Jurassic boundary, c. 200 Ma; Cretaceous–Paleogene boundary, c. 65 Ma); error bars, ± SE. (c) Summary of possible phylogenetic relationships among tracheophytes (from Jiao et al., 2011); grey bars indicate the approximate timing of significant whole-genome duplication (WGD) events inferred by Jiao et al. (2011): the grey bar at c. 350 Ma is the ancestral seed plant WGD and the grey bar at c. 200 Ma is the ancestral angiosperm WGD. (d) Long-term changes in atmospheric CO2 (GEOCARB III), redrawn from Berner & Kothavala (2001); solid line indicates best-estimate model and dotted lines indicate uncertainty around the model predictions.

Early Devonian plants have exceptionally large stomata

The range of GCL among the earliest Devonian (c. 410 Ma) plants (22–86 μm) encompasses almost the full range of GCL observed in extant plants (Beaulieu et al., 2008) and the present work (15–100 μm). Given the strong correlation between GCL and GS, these data suggest that a wide range in GS emerged during the primary diversification phase of vascular land plants. Several species from the Rhynie Chert (c. 410 Ma) have a large GCL (Horneophyton lignieri, 86 μm; Aglaophyton major, 82 μm; Nothia aphylla, 67 μm; Asteroxylon mackiei, 73 μm; and Rhynia gwynne-vaughanii, 63 μm). The architecturally similar extant species Psilotum nudum has very large GCL (99 μm: Pant & Khare, 1971) and a correspondingly large GS (2C = 142 152 Mbp: Obermayer et al., 2002). The large GCL of Psilotum nudum (and of the closely related extant ‘fern’ Ophioglossum petiolatum, measured at 61 μm: Peterson & Hambleton, 1978; 2C = 128 216 Mbp: Obermayer et al., 2002) compares well with the Early Devonian species, although the Devonian plants are only distantly related to Psilotum and Ophioglossum, and they occupied radically different, hot-spring habitats. When combined with our understanding of extant plants, these fossil data imply that many Early Devonian vascular plants were characterised by large genomes. However, given the early-divergent position that these Devonian plants occupy in embryophyte phylogeny, it is likely that at least some of the mechanisms that link GCL with GS were less constrained by canalisation and/or followed contrasting pathways of epidermal development (Rudall et al., 2013).

Extant plants with large genomes are absent from extreme environments and under-represented in disturbed environments (Knight et al., 2005). However, this does not seem to be the case for some of the Early Devonian plants. Horneophyton lignieri and Rhynia gwynne-vaughanii are characterised as primary colonisers of ever-wet and drier ground, respectively. Aglaophyton major is interpreted as being adapted to periodic flooding and waterlogging, whereas Nothia aphylla and Asteroxylon mackiei are regarded as successional plants rather than primary colonisers (Powell et al., 2000). This indicates that Early Devonian plants occupied a variety of niches associated with hot springs. This environment may have been equable in the short-term (seasonal) but in the longer-term (generational) it undoubtedly represents an extreme and dynamic, disturbance-prone environment. The relationship between ecology and genome size suggests that, in angiosperms at least, this environment would in theory be dominated by species possessing small genomes (Knight et al., 2005). However, the fossil data imply that this relationship does not hold through geological time or across all phylogenetic groups. Unfortunately, comparison with modern hot-spring floras is weakened by radical taxonomic differences.

Gaps and mass extinctions in the fossil record

There are two notable stratigraphic gaps within our database (Fig. 2a). The first spans the Late Devonian to Mississippian (c. 380–330 Ma) and the second, shorter gap falls in the earliest Triassic (c. 250–236 Ma). The paucity of Late Devonian GCL data can be ascribed to a lack of wetland biotas, whereas Mississippian data are scarce due to an ongoing lack of systematic investigations (Scott & Galtier, 1996). The Early Triassic gap is driven by climate: widespread aridity favoured oxidation and the loss of plant material before it could be preserved. Both of these gaps in our database are problematic as they coincide with major evolutionary and climatic events. The Late Devonian gap is associated with a large-scale draw-down in CO2 (Berner & Kothavala, 2001; Berner, 2006), the end-Devonian extinction event (c. 360 Ma; primarily a feature of the marine fossil record: Bond & Wignall, 2008), and the origin of true megaphyllous leaves (Kenrick & Crane, 1997; Osborne et al., 2004; Galtier, 2010). The Early Triassic gap represents the aftermath of the end-Permian (c. 250 Ma) mass extinction event (e.g. Hallam & Wignall, 1997). Consequently, plant genome-size responses to these important global events remain elusive.

In order to determine whether patterns of change in GCL, and therefore in inferred GS, could be associated with any of the other mass extinction events (asterisked in Fig. 2b), we calculated the running average in GCL through 25 Myr time-bins (Fig. 2b). Since the earliest of these extinction events, the end-Ordovician (443 Ma), occurred before the earliest well-documented diversification of vascular plants, this event was not analysed further. As for the Triassic/Jurassic (200 Ma) or the Cretaceous/Tertiary (K/T: 65 Ma) extinction events, no long-term impacts in GS could be identified from the available data, despite the fact that both events left excellent plant fossil records. For example, cuticle from macrofossils is readily recoverable from the Triassic/Jurassic boundary sections of Greenland (McElwain et al., 1999), and suites of dispersed cuticle assemblages are available from the Cretaceous/Tertiary boundary of the western continental interior of the USA (Wolfe & Upchurch, 1987a; Upchurch, 1995).
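The time-binned averages and standard errors plotted in Fig. 2(b) can be illustrated with the short Python sketch below. It is a sketch under stated assumptions rather than the authors' own procedure: the function name `bin_means` is hypothetical, bin edges are assumed to fall on multiples of the bin width (0, 25, 50 Ma and so on), and the standard error is taken as the sample standard deviation divided by the square root of the number of records in the bin.

```python
import math
from collections import defaultdict


def bin_means(records, bin_width_myr=25.0):
    """Average GCL within fixed-width time bins.

    records: iterable of (age_ma, gcl_um) pairs.
    Returns {bin_start_ma: (mean_gcl, standard_error_or_None, n)}.
    """
    bins = defaultdict(list)
    for age_ma, gcl_um in records:
        # Assumed convention: a bin covers [start, start + width).
        start = math.floor(age_ma / bin_width_myr) * bin_width_myr
        bins[start].append(gcl_um)

    summary = {}
    for start in sorted(bins):
        values = bins[start]
        n = len(values)
        mean = sum(values) / n
        if n > 1:
            variance = sum((v - mean) ** 2 for v in values) / (n - 1)
            standard_error = math.sqrt(variance / n)
        else:
            standard_error = None  # SE is undefined for a single record
        summary[start] = (mean, standard_error, n)
    return summary
```

Bins containing a single record return no standard error, which is one reason why sparsely sampled intervals should be interpreted with caution.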

The lack of any clear change in GCL (and hence GS) associated with the K/T extinction event is perhaps surprising, given that a recent phylogenetic investigation by Fawcett et al. (2009) highlighted a statistically significant increase in the frequency of angiosperm polyploids immediately following the K/T boundary that would be predicted to lead to larger genomes. Clearly, denser sampling of the well-preserved fossil record on either side of the K/T boundary would provide a more rigorous test of any perceived association.

Stomatal density (SD) and stomatal index (SI) analysis through both boundary sequences has determined that both mass-extinction events are accompanied by a large increase in pCO2 (Triassic/Jurassic boundary, McElwain et al., 1999; Steinthorsdottir et al., 2011; K/T boundary, Beerling et al., 2002). Increases in atmospheric pCO2 typically result in a decrease in SD and SI (e.g. Woodward, 1987). Also, GCL increases as stomatal density declines, resulting in the plant having fewer but larger stomata (Hetherington & Woodward, 2003). This negative correlation implies that plants with larger GCL, and hence larger GS values, might be expected to become more frequent during angiosperm re-colonisation and recovery, given the high CO2 of the post-impact environment of the early Paleogene (c. 65 Ma; Lomax et al., 2000; Beerling et al., 2002). Thus, the relationships between stomatal density, GCL and CO2 may explain why some plants with large genome sizes, including perhaps early Paleogene polyploids, may have preferentially prospered in the early Paleogene high-CO2 world (Fig. 2a).

Extrinsic influences: changes through time in atmospheric CO2

Two key palaeopolyploidy events have recently been placed within, and dated using, broad-brush molecular phylogenies of land plants (Jiao et al., 2011) (summarised in Fig. 2c). The earlier event (c. 350 Ma) coincides approximately with the diversification of major groups of gymnosperms, and the later event (c. 200 Ma) is purportedly contemporaneous with the initial diversification of extant angiosperm groups that preceded the separation of the monocots and eudicots. Both palaeopolyploidy events appear to coincide with relatively high, yet declining, atmospheric CO2 concentration (Fig. 2c,d). However, using the fossil record rather than molecular dating techniques, the earlier event actually corresponds with the initial radiation of seed plants in the Late Devonian and Mississippian, while the latter is contemporaneous with the Triassic–Jurassic boundary and hence significantly pre-dates the earliest morphologically recognisable angiosperms (e.g. Bateman et al., 2006). To explore the relationship between GCL/GS and CO2 we compared our GCL trend with the GEOCARB III model (Berner & Kothavala, 2001; Fig. 2d). Like the taxonomically broad analysis of Franks et al. (2012a), we see a positive trend between pCO2 and GCL/GS within our species-level investigation. The broad correlation between CO2 and climate is well illustrated by the Permo-Carboniferous glaciation (c. 330–255 Ma), which corresponds with a period of low CO2 (Mora et al., 1996), and the Cretaceous period, when warm poles and low latitudinal temperature gradients (e.g. Wolfe & Upchurch, 1987b; Spicer & Hermann, 2010) coincided with high CO2. Evidently, additional broad-scale climatic variables may influence plant GS through physiological and ecological processes.

Many tracheophytes have recorded episodes of palaeopolyploidy within their genome (e.g. Jiao et al., 2011). However, analyses of neopolyploid plants have also shown that the average speciation rate of neopolyploids may be significantly lower than that of diploids whereas their extinction rates are significantly higher (Mayrose et al., 2011). These results may seem counter-intuitive, but Mayrose et al. (2011) suggested that this dichotomy is a result of successful palaeopolyploids having an expanded genomic potential that drives their longer-term evolutionary success. Plants with fewer, larger stomata (polyploids) are typically less efficient in terms of stomatal responses to environmental stimuli (e.g. Hetherington & Woodward, 2003), which may then result in concomitant changes in water-use efficiency and photosynthetic performance – these factors may in turn eventually impact on overall plant fitness. Most studies investigating the relationship between GCL and perceived plant fitness have focused on comparisons between diploid and polyploid angiosperms. However, the precise nature of these relationships remains contentious, given that results differ between treatments and species. For example, polyploids can be more (Garbutt & Bazzaz, 1983; Watanabe, 1986; Li et al., 1996) or less (Baldwin, 1941) tolerant of water stress when compared with their diploid progenitors. Similarly dichotomous results are observed for other abiotic factors that are recorded in the literature (reviewed by Maherali et al., 2009). Such studies highlight the importance of understanding the role played by polyploidy in the crucial relationship between speciation and extinction (Arrigo & Barker, 2012).

Given elevated pCO2, there is typically a reduction in stomatal conductance as a result of increasing stomatal closure (Drake et al., 1997; Ainsworth & Long, 2005; and references therein). Elevated atmospheric CO2 may therefore provide an alternative explanation for the apparent dichotomy of selective extinction of neopolyploids noted above (Mayrose et al., 2011). In the modern atmosphere relatively deprived of CO2, increases in GCL associated with increases in ploidy may select against neopolyploids. By contrast, events responsible for palaeopolyploidy observed today in extant lineages would have occurred in the CO2-enriched atmospheres of the past, when stomatal closure via elevated CO2 may have reduced this potential selective pressure. However, this intriguing hypothesis requires testing in experimental conditions, where polyploid plants can be compared with their diploid progenitors over a range of CO2 concentrations relevant to the evolutionary history of vascular land plants.

Effect of CO2 on stomatal size

Allium cepa showed a statistically significant 11% decrease in GCL whereas A. tuberosum showed no statistically significant difference in GCL (ambient 400 ppm vs elevated 800 ppm CO2). These new experimental data partially support many previous studies that cover a range of taxa (angiosperms and gymnosperms) and genome sizes. For example, Lomax et al. (2009) demonstrated that GCL values were independent of CO2 concentration in Arabidopsis thaliana (2C = 157 Mbp). Ryle & Stanley (1992), working on Lolium perenne (2C = 5390 Mbp), found no statistical difference in GCL with CO2 (340 vs 680 ppm). Radoglou & Jarvis (1990) and Ceulemans et al. (1995) found no statistical difference in GCL with altered CO2 in poplar clones. Knapp et al. (1994) examined Andropogon gerardii (2C = 8839 Mbp) and Sorghastrum nutans and found no statistical difference in GCL with CO2 treatment (ambient vs double ambient) in either species. Ogaya et al.'s (2011) study of ‘living fossils’ found that GCL did not change in Araucaria araucana (2C = 44 499 Mbp), Sequoia sempervirens (2C = 62 856 Mbp) or Metasequoia glyptostroboides (2C = 21 594 Mbp). However, Taxodium distichum (2C = 19 462 Mbp) showed a statistically significant 8% decrease in GCL and Nothofagus cunninghamii showed a statistically significant 10% increase in GCL between the CO2 treatments (400 vs 800 ppm). Haworth et al. (2013) examined the stomatal parameters of eight species with both passive (Osmunda regalis, 2C = 27 393 Mbp; Ginkgo biloba, 2C = 22 983 Mbp; Podocarpus macrophyllus, 2C = 18 973 Mbp; and Agathis australis, 2C = 30 905 Mbp) and active (Lepidozamia peroffskyana, 2C = 54 083 Mbp; Nageia nagi, 2C = 10 954 Mbp; Solanum lycopersicum; and Hordeum vulgare, 2C = 10 855 Mbp) stomatal control and found no statistical difference in stomatal pore length (equivalent to GCL) with pCO2 (380 vs 1500 ppm).

However, Franks et al. (2012b) showed that in four phylogenetically disparate species (Selaginella uncinata, 2C = 176 Mbp; Osmunda regalis; Commelina communis; Vicia faba, 2C = 26 064 Mbp) stomatal size was significantly affected by CO2; stomatal size increased with elevated pCO2 (240, 450 and 1000 ppm). These authors also reported an apparent change in genome size, but this was attributed to a change in packaging of the DNA leading to altered stoichiometry in the binding of the fluorochrome to the DNA rather than to an actual increase in the amount of DNA (Franks et al., 2012b).

Plotting the range in GCL observed in Allium cepa and A. tuberosum due to growth environment onto the herbaceous dataset of Beaulieu et al. (2008) shows that changes in GCL for the two Allium species are greater than the changes observed in Arabidopsis (Fig. 3b), despite A. thaliana being subjected to a wider range of environmental perturbations (atmospheric CO2 400–3000 ppm, drought, relative humidity, irradiance, ultraviolet radiation and pathogen attack; Lomax et al., 2009). These data suggest that plants with a larger GS may have a higher degree of flexibility with respect to scaling GCL to GS when compared with plants possessing a smaller genome. Furthermore, these observations emphasise the complexity of the relationship between GCL and CO2 and indicate the existence of both responders (both increasing and decreasing in GCL with elevation of pCO2) and nonresponders, analogous to the stomatal density and index response to altered CO2 concentrations (Woodward & Kelly, 1995; Royer, 2001). However, this variation in response does not appear to relate to any one variable, including GS, phylogenetic history or the mechanism of stomatal control. These observations imply not only that broad-scale palaeo-GS reconstructions based on limited datasets (e.g. Franks et al., 2012a) might be overly simplistic, but also that the relationship between CO2, GCL and GS requires further investigation.

Figure 3.

(a) Box-and-whiskers plots of guard-cell length (GCL) for Allium cepa and A. tuberosum grown at ambient (400 ppm, open box) and elevated (800 ppm, dark grey box) CO2, showing the median (solid line), the mean (dotted line) and the 25th and 75th percentiles (box). The whiskers are the 5th and 95th percentiles and the outliers are closed circles. Differences (ANOVA) between CO2 treatments for each species: A. cepa F = 15.44, P < 0.001 (**); A. tuberosum F = 0.48, P = 0.50. (b) Log10-transformed GCL data of Arabidopsis thaliana (Lomax et al., 2009; shown as a black box-and-whiskers plot), A. cepa (light grey box) and A. tuberosum (white box), compared with the herb GS–GCL dataset of Beaulieu et al. (2008).
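The treatment comparison summarised in Fig. 3(a) rests on two simple calculations: the percentage change in mean GCL between treatments and a one-way ANOVA F statistic. A minimal Python sketch of both follows. The function names are hypothetical and the details of the authors' statistical software are not given in the text; the sketch stops at the F statistic and its degrees of freedom, since converting F to a P value requires the F distribution from a statistics library.

```python
def percent_change(ambient_values, elevated_values):
    """Percentage change in the mean from the ambient to the elevated treatment."""
    ambient_mean = sum(ambient_values) / len(ambient_values)
    elevated_mean = sum(elevated_values) / len(elevated_values)
    return (elevated_mean - ambient_mean) / ambient_mean * 100.0


def one_way_anova_f(*groups):
    """One-way ANOVA. Returns (F, df_between, df_within)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or n_total <= k:
        raise ValueError("need at least two non-empty groups and n_total > k")
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of individual measurements around their own group mean.
    ss_within = sum(sum((v - m) ** 2 for v in g)
                    for g, m in zip(groups, group_means))
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

With two treatments per species, as here, the ANOVA has one between-group degree of freedom and is equivalent to a two-sample t-test with pooled variance (F equals t squared).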

Intrinsic influences on genome size

Angiosperm-focused research shows that two processes are particularly common in genome evolution – polyploidy and the accumulation of noncoding repetitive DNA sequences, especially transposable elements, in the genome. Evidence of their widespread occurrence in angiosperm genomes suggests that the null hypothesis of genome size evolution should be a trend of increasing genome size through time (Bennetzen & Kellogg, 1997). Certainly, polyploidy can lead to an increase in cellular DNA content and a concomitant increase in GCL; for example, within the Coffea arabica polyploid series there is a 43% increase in GCL from diploid through tetraploid to the octoploid cytotype (Mishra, 1997). Similarly, diploid species that differ in genome size due to variations in the amount of repetitive DNA also show differences in GCL (e.g. Cypripedium; Khandawala, 2009). Analysis of our database, which is novel in being dominated by nonflowering seed-plants (83/130; 64%), enabled testing of whether GS (as inferred by changes in GCL) has increased within the gymnosperms (Fig. 4a–c).

Figure 4.

Changes in fossil gymnosperm guard-cell length (GCL) through geological time. (a) Gymnosperm GCL divided into Coniferales (closed circles) and all other gymnosperm orders (open triangles). (b) Data-points categorised according to the remaining gymnosperm orders: Cordaitales (grey triangles), Medullosales (red triangles), Glossopteridales (blue triangles), Bennettitales (white triangles) and Ginkgoales (black triangles). (c) Coniferales GCL values represented at the family level: Cupressaceae (white circles), Podocarpaceae (blue circles), Pinaceae (grey circles), Cheirolepidiaceae (red circles) and Araucariaceae (black circles). One datum (Brachyphyllum crucis, c. 180 Ma) is bicoloured to reflect uncertainty in its family-level affiliation. None of the trends in GS/GCL values through time is statistically significant (for statistical analysis see Supporting Information Table S2).

Maximum GS has increased through gymnosperm evolution, driven primarily by an increase in GCL/GS within some Coniferales (Fig. 4a). Despite this trend, species with a small GS do occur among extant gymnosperms (e.g. the gnetaleans Gnetum and Welwitschia, 2C = 4401–7785 Mbp and 14 083 Mbp, respectively). When the data are analysed at a finer taxonomic resolution, we detect increases in the GCL of Cordaitales and Bennettitales and hence infer an increase in GS (Fig. 4b), though admittedly data from these groups are based on a limited sample size. For groups yielding more numerous data points ( 7), such as the Medullosales, Glossopteridales and Ginkgoales, we identified no significant trends in GS (Fig. 4b, Table S2). In the case of Medullosales and Glossopteridales, the observed ranges of GS may include evidence of genome duplication events within the data or genome size expansion via amplification of repetitive DNA sequences at the diploid level. However, in the absence of rigorous phylogenetic frameworks in which to analyse such ‘localised’ GS distributions, this hypothesis cannot be adequately tested. The contemporaneous (in a geological sense) nature of the data for these groups (Fig. 4b) precludes confident identification of long-term trends. Analysing conifer GCL sizes at family level suggests that average GS increased through time in the Cheirolepidiaceae and Araucariaceae but decreased in Cupressaceae (Fig. 4c); however, these trends are not (yet) statistically significant (Table S2).

Directions of future research

We view the present contribution as an interim report on progress achieved in this relatively novel field of research, intended to stimulate more detailed investigations. Although additional data from any taxonomic group and time period should prove helpful, factors potentially confounding the positive correlation between stomatal guard-cell size and genome size will be fewer and/or of lesser impact on future studies if time periods are deliberately shortened and taxonomic spectra are narrowed to closer relatives. The comparatively large guard-cell size evident in Early Devonian land plants is probably best explored by more detailed investigation of ‘breathing pores’ in extant nonvascular land plants and the lycopsid descendants of Asteroxylon. In terms of interpreting both the effects of developmental canalisation and ecophenotypic modification of guard-cell size, controlled experiments conducted in geologically relevant conditions and on palaeobotanically relevant species will clearly continue to pay dividends. In addition, as noted above, some evidence exists that molecular responses to genome duplication events may differ between major groups of land plants (e.g. lycophytes, monilophytes, gymnosperms, angiosperms). Given that most research on polyploidy to date has focused on angiosperms, expanding these studies to include other land-plant groups will also be essential for understanding how polyploidy impacts on the evolution of guard-cell size and genome size.


B.H.L.'s research on palaeogenome size is funded via NERC New Investigators grant NE/J004855/1. J.H. and R.M.B. acknowledge the previous support of NERC responsive mode grant NE/E004369/1.