A universe of dwarfs and giants: genome size and chromosome evolution in the monocot family Melanthiaceae

Summary

  • Since the occurrence of giant genomes in angiosperms is restricted to just a few lineages, identifying where shifts towards genome obesity have occurred is essential for understanding the evolutionary mechanisms triggering this process.
  • Genome sizes were assessed using flow cytometry in 79 species and new chromosome numbers were obtained. Phylogenetically based statistical methods were applied to infer ancestral character reconstructions of chromosome numbers and nuclear DNA contents.
  • Melanthiaceae are the most diverse family in terms of genome size, with C-values ranging more than 230-fold. Our data confirmed that giant genomes are restricted to tribe Parideae, with most extant species in the family characterized by small genomes. Ancestral genome size reconstruction revealed that the most recent common ancestor (MRCA) for the family had a relatively small genome (1C = 5.37 pg). Chromosome losses and polyploidy are recovered as the main evolutionary mechanisms generating chromosome number change.
  • Genome evolution in Melanthiaceae has been characterized by a trend towards genome size reduction, with just one episode of dramatic DNA accumulation in Parideae. Such extreme contrasting profiles of genome size evolution illustrate the key role of transposable elements and chromosome rearrangements in driving the evolution of plant genomes.

Introduction

The majority of angiosperms are characterized by possessing small to very small genomes (1C ≤ 3.5 pg; Leitch et al., 1998) according to the data compiled in the Plant DNA C-values database (http://data.kew.org/cvalues/; Bennett & Leitch, 2012). However, when it comes to genome size (GS), it is evident that angiosperms are one of the most diverse groups of living organisms, as illustrated by the 2400-fold range of C-values encountered (Pellicer et al., 2010; Fedoroff, 2012). Large genomes (1C ≥ 35 pg) occur in a range of taxonomic groups (Kelly et al., 2012), although truly giant ones have evolved independently in just a few genera. These are found in Ophioglossales and Psilotales in ferns (Obermayer et al., 2002), and in the monocot orders Liliales and Asparagales and the eudicot order Santalales in angiosperms (Zonneveld, 2010; Leitch & Leitch, 2013).

The debate concerning the genetic processes controlling genome size in plants is ongoing (Ma et al., 2004; Bennetzen et al., 2005; Hawkins et al., 2006, 2009; Fedoroff, 2012), although it is now recognized that the main mechanisms contributing to genome expansion in plants are polyploidy and the proliferation of repetitive DNA sequences (especially transposable elements (TEs); see reviews by Grover & Wendel, 2010; Kejnovsky et al., 2012; Leitch & Leitch, 2012). Long terminal repeat (LTR)-retrotransposons have been shown to be particularly abundant in many plant genomes, often reaching high copy numbers (up to thousands or tens of thousands) and accounting for a significant fraction of large plant genomes (Bennetzen, 2000; Kejnovsky et al., 2012; Slotkin et al., 2012). Such observations raise several questions. To what extent are failures in the mechanisms counterbalancing TE amplification responsible for the occurrence of such obese genomes? When in plant evolution did such shifts in GS take place? What are the genetic and environmental factors that may have triggered such genome expansion? In this context, phylogenetic trees provide crucial evolutionary frameworks for testing model-based approaches to track wide-scale GS changes over time (Soltis et al., 2003; Leitch et al., 2005) and for gaining insights into the mode and tempo of GS evolution at lower taxonomic levels (Leitch et al., 2007; Lysák et al., 2009; Gurushidze et al., 2012; Andrés-Sánchez et al., 2013; Pellicer et al., 2013).

In this paper, we present a survey of GS and chromosome data across the monocot family Melanthiaceae. As circumscribed in APG III (2009), the family comprises c. 16 genera and 180 species (Govaerts, 2013) of bulbous or rhizomatous woodland and alpine perennial herbs, mainly occurring in temperate to Arctic zones of the Northern Hemisphere (Fig. 1a). The only exception is the genus Schoenocaulon, which ranges into the northern Andean region of South America (Zomlefer et al., 2006). The family probably arose during the Cenozoic (c. 46–62 million yr ago (Mya)), according to Vinnersten & Bremer (2001), and exhibits classic ‘Arcto-Tertiary’ disjunctions (e.g. North America/East Asia).

Figure 1.

(a) Global distribution of the family Melanthiaceae. (b) Floral diversity of tribes of Melanthiaceae: 1, Chionographideae (Chionographis japonica); 2, Heloniadeae (Heloniopsis leucantha); 3, Parideae (Paris polyphylla); 4, Melanthieae (Toxicoscordion venenosum); 5, Parideae (Trillium ovatum); 6, Xerophylleae (Xerophyllum tenax) (images 1 and 6 are taken from Wikimedia commons (http://commons.wikimedia.org); images 2–5, J. Pellicer). (c) Fifty per cent majority-rule consensus tree from Bayesian inference based on the trnL-trnF data set. Supported branches (posterior probability (PP) ≥ 0.95) are indicated in bold. Nuclear DNA contents for each species (1C) were mapped onto the tree and ancestral state reconstruction was conducted with maximum parsimony (MP) using 1Cx values.

The circumscription of Melanthiaceae has been controversial as a consequence of the diverse morphology (Fig. 1b) (see Zomlefer et al. (2006) for a review). One of the early classifications (Dahlgren et al., 1985) already considered tribes Chionographideae, Melanthieae and Xerophylleae within the family, but also included other tribes, such as Petrosavieae and Narthecieae (currently recognized as unrelated families). Taxonomic rearrangements have been common in the group, and several authors (Takhtajan, 1997) supported the segregation into five independent families: Chionographidaceae, Heloniadaceae, Melanthiaceae s.s., Trilliaceae and Xerophyllaceae. Most of these segregates were later returned into an expanded Melanthiaceae s.l. by Tamura (1998a), with Trilliaceae still recognized as a distinct family (Tamura, 1998b; Farmer, 2006), albeit with much debate. Since then, the systematics of the group has been revisited several times at different taxonomic levels (Chase et al., 2000; Fuse & Tamura, 2000; Rudall et al., 2000; Zomlefer et al., 2001; Fay et al., 2006). As currently recognized (Zomlefer et al., 2006; APG III, 2009), Melanthiaceae are divided into five monophyletic tribes: Chionographideae, Heloniadeae, Melanthieae, Parideae and Xerophylleae (Fig. 1c).

The karyological history of the family is also complex, with reports of several base numbers (x = 5, 8, 10, 11, 12, 15 and 17; Tamura, 1998a; Zomlefer et al., 2006). Although little is known about the mechanisms of chromosome evolution in Melanthiaceae (Tanaka & Tanaka, 1979; Lee, 1985), such a diversity of numbers is indicative of extensive chromosomal rearrangements, with the high base numbers in Heloniadeae and Xerophylleae (x = 15 and 17; Kokubugata et al., 2004) probably arising via ancient episodes of polyploidization and subsequent chromosome reorganizations.

Previous studies have reported the existence of giant genomes in the family and a series of karyotypic processes operating that are likely to influence GS (Zonneveld et al., 2005; Pellicer et al., 2010; Zonneveld, 2010). With the main objective of evaluating the impact of changes in these genomic traits on the evolution of the family, we conducted a comprehensive survey of nuclear DNA contents and chromosome data across Melanthiaceae. These data were analyzed using statistical approaches in a phylogenetic context allowing us to: track the current distribution of C-values and chromosome numbers across the family; reconstruct ancestral character states for GS and chromosome numbers; and identify where shifts in GS have taken place during the evolution of the family and gain insights into the key processes involved.

Materials and Methods

Taxon sampling and plant material

Fresh leaf tissue of 99 accessions, representing 79 species (including six varieties and one known hybrid), was used for GS estimation. Once processed, samples were silica-dried and stored at room temperature in sealed plastic bags. Detailed information about provenances and herbarium vouchers is listed in Supporting Information Table S1. For samples that we could not voucher, photographic information is kept in the Living Collection Virtual Herbarium at RBG, Kew, and is available from the authors on request.

Chromosome counts

Actively growing roots were obtained from the same plants as used for GS estimations whenever possible. The method used was based on Pellicer et al. (2010). Root tip meristems were pretreated in either a saturated solution of 1-bromonaphthalene at 20°C for 24 h or in a 0.05% solution of aqueous colchicine at 20°C for 2–3 h. Material was then fixed in absolute ethanol:glacial acetic acid (3 : 1) for 24 h at 4°C, transferred to 70% ethanol and placed at −20°C for long-term storage. For chromosome preparations, root tips were washed in deionized water for 20 min, hydrolyzed in 1 M hydrochloric acid (HCl) for 5–9 min at 60°C, stained with Schiff's reagent for 30 min and squashed on slides in 45% acetic acid. The average total karyotype length (TKL) of selected species in tribe Parideae was calculated by measuring chromosome lengths with the software ProgRes Capture Pro v2.8.8 (Jenoptik Optical Systems GmbH, Jena, Germany) for ten metaphase plates from different roots. Specimens used for such a purpose were from the same populations as used to estimate GS.

GS estimation by flow cytometry

Nuclear DNA contents were estimated based on Pellicer et al. (2012). Fully expanded leaf tissue from each specimen (c. 1 cm²) was chopped along with the appropriate internal standard (see Table 1 for details) using a new razor blade in 1 ml of ‘general purpose isolation buffer’ (GPB; Loureiro et al., 2007) supplemented with 3% polyvinylpyrrolidone (PVP-40). A further 1 ml of GPB was then added and the sample was passed through a 30-μm nylon filter. The homogenate was stained with propidium iodide (1 mg ml−1; Sigma) and ribonuclease AII (3 mg ml−1; Sigma) to give a final concentration of 50 μg ml−1 and kept on ice for 20 min. For each species analyzed, samples from three individuals were prepared and three replicates of each were run, recording 5000 particles using a Partec Cyflow SL3 (Partec GmbH, Münster, Germany) flow cytometer fitted with a 100-mW green solid state laser (Cobolt Samba, Solna, Sweden). The resulting histograms were analyzed with the FlowMax software (v. 2.7; Partec GmbH).
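The arithmetic behind an internal-standard estimate can be sketched as follows. This is a minimal illustration, assuming the usual ratio of mean fluorescence peak positions between sample and standard; the function name and the peak positions (150 and 200 channel units) are hypothetical, and only the Pisum sativum 2C value (9.09 pg) is taken from the list of calibration standards in Table 1.

```python
def estimate_sample_2c(sample_peak_mean, standard_peak_mean, standard_2c_pg):
    """Estimate a sample's 2C value (pg) from the ratio of the sample and
    internal-standard 2C fluorescence peak means (hypothetical helper)."""
    if sample_peak_mean <= 0 or standard_peak_mean <= 0:
        raise ValueError("peak means must be positive")
    return sample_peak_mean / standard_peak_mean * standard_2c_pg


# 2C value of the Pisum sativum 'Ctirad' standard, as listed in Table 1.
PISUM_2C_PG = 9.09

# Hypothetical peak positions read from one histogram (channel units).
estimate_pg = estimate_sample_2c(150.0, 200.0, PISUM_2C_PG)
print(f"Estimated 2C = {estimate_pg:.4f} pg")
```

In practice one such estimate would be obtained for every replicate and the values averaged to give the mean and SD reported per accession.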

Table 1. List of species used in the present study, including infrageneric classification, chromosome numbers, inferred ploidy and genome size

Taxa | Chromosome number (2n) [a] | 2C value (pg) ± SD | 1C value (pg) | 1C value (Mb) [b] | 1Cx value (pg) | Standard [c]

Tribe Chionographideae
Chamaelirium luteum | 24 (2x)* | 1.96 ± 0.03 | 0.98 | 958.44 | 0.98 | 2
Chionographis japonica | 24 (2x)* | 3.05 ± 0.01 | 1.53 | 1496.34 | 1.53 | 1

Tribe Heloniadeae
Helonias bullata (1) | 34 (2x)* | 6.45 ± 0.06 | 3.23 | 3154.05 | 3.23 | 3
Helonias bullata (2) | 34 (2x)* | 6.37 ± 0.02 | 3.19 | 3114.93 | 3.19 | 3
Heloniopsis kawanoi | 34 (2x)* | 7.27 ± 0.11 | 3.64 | 3555.03 | 3.64 | 3
Heloniopsis koreana (1) | 34 (2x)* | 5.01 ± 0.01 | 2.51 | 2449.89 | 2.51 | 3
Heloniopsis koreana (2) | 34 (2x)* | 5.1 ± 0.01 | 2.55 | 2493.90 | 2.55 | 3
Heloniopsis leucantha (1) | 34 (2x)* | 8.21 ± 0.03 | 4.11 | 4014.69 | 4.11 | 2
Heloniopsis leucantha (2) | 34 (2x)* | 8.00 ± 0.03 | 4.00 | 3912.00 | 4.00 | 2
Heloniopsis orientalis (1) | 34 (2x)* | 6.3 ± 0.04 | 3.15 | 3080.70 | 3.15 | 3
Heloniopsis orientalis (2) | 34 (2x)* | 6.35 ± 0.03 | 3.18 | 3105.15 | 3.18 | 3
Heloniopsis orientalis var. breviscapa (1) | 34 (2x)* | 8.06 ± 0.01 | 4.03 | 3941.34 | 4.03 | 2
Heloniopsis orientalis var. breviscapa (2) | 34 (2x)* | 7.97 ± 0.03 | 3.99 | 3897.33 | 3.99 | 2
Heloniopsis orientalis var. flavida | 34 (2x)* | 7.53 ± 0.01 | 3.77 | 3682.17 | 3.77 | 2
Heloniopsis tubiflora | 34 (2x)* | 4.99 ± 0.03 | 2.50 | 2440.11 | 2.50 | 3
Heloniopsis umbellata (1) | 34 (2x)* | 9.50 ± 0.03 | 4.75 | 4645.50 | 4.75 | 2
Heloniopsis umbellata (2) | 34 (2x)* | 9.56 ± 0.09 | 4.78 | 4674.84 | 4.78 | 2
Ypsilandra cavaleriei | 34 (2x)* | 10.99 ± 0.17 | 5.50 | 5374.11 | 5.50 | 2
Ypsilandra thibetica | 34 (2x)* | 9.64 ± 0.04 | 4.82 | 4713.96 | 4.82 | 2

Tribe Melanthieae
Amianthium muscitoxicum | 32 (4x) | 6.75 ± 0.07 | 3.38 | 3300.75 | 1.68 | 3
Anticlea elegans var. glaucus | 32 (4x) | 7.66 ± 0.01 | 3.83 | 3745.74 | 1.91 | 2
Anticlea occidentalis | 16 (2x) | 3.16 ± 0.06 | 1.58 | 1545.24 | 1.58 | 2
Anticlea volcanica | 32 (4x) | 8.57 ± 0.02 | 4.29 | 4190.73 | 2.14 | 2
Schoenocaulon macrocarpum | 16 (2x) | 1.44 ± 0.01 | 0.72 | 704.16 | 0.72 | 1
Schoenocaulon texanum | 16 (2x) | 1.31 ± 0.01 | 0.66 | 640.59 | 0.66 | 1
Stenanthium densum | 20 (2x) | 3.44 ± 0.01 | 1.72 | 1682.16 | 1.72 | 1
Stenanthium gramineum | 20 (2x) | 3.36 ± 0.03 | 1.68 | 1643.04 | 1.68 | 1
Stenanthium leimanthoides | 20 (2x) | 3.3 ± 0.01 | 1.65 | 1613.70 | 1.65 | 1
Toxicoscordion exaltatum | 22 (2x) | 4.85 ± 0.03 | 2.43 | 2371.65 | 2.43 | 3
Toxicoscordion fontanum | 22 (2x) | 4.93 ± 0.03 | 2.47 | 2410.77 | 2.47 | 3
Toxicoscordion fremontii | 22 (2x) | 5.85 ± 0.05 | 2.93 | 2860.65 | 2.93 | 3
Toxicoscordion gramineum | 22 (2x) | 4.83 ± 0.04 | 2.42 | 2361.87 | 2.42 | 3
Toxicoscordion micranthum | 22 (2x) | 4.91 ± 0.03 | 2.46 | 2400.99 | 2.46 | 3
Toxicoscordion paniculatum | 22 (2x) | 5.14 ± 0.08 | 2.57 | 2513.46 | 2.57 | 3
Toxicoscordion venenosum | 22 (2x) | 4.89 ± 0.04 | 2.45 | 2391.21 | 2.45 | 3
Veratrum album (1) | 32 (4x) | 6.57 ± 0.05 | 3.29 | 3212.73 | 1.64 | 3
Veratrum album (2) | 32 (4x) | 6.69 ± 0.09 | 3.35 | 3271.41 | 1.67 | 3
Veratrum californicum var. caudatum | 32 (4x) | 5.43 ± 0.02 | 2.72 | 2655.27 | 1.35 | 3
Veratrum fimbriatum (1) | 32 (4x) | 6.49 ± 0.01 | 3.25 | 3173.61 | 1.62 | 3
Veratrum fimbriatum (2) | 32 (4x) | 6.46 ± 0.03 | 3.23 | 3158.94 | 1.61 | 3
Veratrum grandiflorum | 32 (4x) | 6.84 ± 0.01 | 3.42 | 3344.76 | 1.71 | 3
Veratrum lobelianum | 32 (4x) | 6.72 ± 0.03 | 3.36 | 3286.08 | 1.68 | 3
Veratrum longibracteatum (1) | 16 (2x) | 4.61 ± 0.01 | 2.31 | 2254.29 | 2.31 | 3
Veratrum longibracteatum (2) | 16 (2x) | 4.54 ± 0.05 | 2.27 | 2220.06 | 2.27 | 3
Veratrum maackii var. japonicum | 16 (2x) | 4.8 ± 0.01 | 2.40 | 2347.20 | 2.40 | 3
Veratrum maackii var. maackii | 16 (2x) | 4.23 ± 0.03 | 2.12 | 2068.47 | 2.12 | 3
Veratrum nigrum (1) | 16 (2x) | 5.52 ± 0.02 | 2.76 | 2699.28 | 2.76 | 3
Veratrum nigrum (2) | 16 (2x) | 5.47 ± 0.03 | 2.74 | 2674.83 | 2.74 | 3
Veratrum nigrum (3) | 16 (2x) | 5.57 ± 0.10 | 2.79 | 2723.73 | 2.79 | 3
Veratrum parviflorum | 16 (2x) | 5.88 ± 0.01 | 2.94 | 2875.32 | 2.94 | 3
Veratrum schindleri | 16 (2x) | 4.15 ± 0.02 | 2.08 | 2029.35 | 2.08 | 3
Veratrum stamineum | 32 (4x) | 7.12 ± 0.04 | 3.56 | 3481.68 | 1.78 | 3
Veratrum virginicum | 16 (2x) | 3.47 ± 0.02 | 1.74 | 1696.30 | 1.74 | 3
Veratrum viride | 32 (4x) | 6.70 ± 0.05 | 3.35 | 3276.30 | 1.67 | 3
Veratrum woodii | 16 (2x) | 4.43 ± 0.01 | 2.22 | 2166.27 | 2.22 | 3
Zigadenus glaberrimus | 54 (6x) | 10.64 ± 0.02 | 5.32 | 5202.96 | 1.77 | 3

Tribe Parideae
Paris forrestii | 10 (2x) | 113.17 ± 0.5 | 56.59 | 55 340.13 | 56.59 | 4
Paris incompleta | 10 (2x) | 84.49 ± 0.55 | 42.25 | 41 315.61 | 42.25 | 4
Paris mairei | 10 (2x) | 111.81 ± 0.99 | 55.91 | 54 675.09 | 55.91 | 4
Paris polyphylla (1) | 10 (2x) | 107.21 ± 0.23 | 53.61 | 52 425.69 | 53.61 | 4
Paris polyphylla (2) | 10 (2x) | 108.43 ± 0.42 | 54.22 | 53 022.27 | 54.22 | 4
Paris quadrifolia | 10 (2x) | 101.04 ± 0.71 | 50.52 | 49 408.56 | 50.52 | 4
Paris tetraphylla | 10 (2x) | 81.49 ± 0.16 | 40.75 | 39 848.61 | 40.75 | 4
Paris thibetica | 10 (2x) | 109.16 ± 0.57 | 54.58 | 53 379.24 | 54.58 | 4
Paris thibetica var. thibetica | 10 (2x) | 105.03 ± 0.95 | 52.52 | 51 359.67 | 52.52 | 4
Paris verticillata | 10 (2x) | 62.41 ± 0.32 | 31.21 | 30 518.49 | 31.21 | 4
Trillium camschatcense | 10 (2x) | 101.82 ± 0.43 | 50.91 | 49 789.98 | 50.91 | 4
Trillium catesbaei | 10 (2x) | 100.76 ± 0.48 | 50.38 | 49 271.64 | 50.38 | 4
Trillium cernuum | 10 (2x) | 101.01 ± 0.90 | 50.51 | 49 393.89 | 50.51 | 4
Trillium chloropetalum | 10 (2x) | 101.84 ± 0.81 | 50.92 | 49 799.76 | 50.92 | 4
Trillium cuneatum | 10 (2x) | 105.14 ± 0.81 | 52.57 | 51 413.46 | 52.57 | 4
Trillium decipiens | 10 (2x) | 103.62 ± 0.14 | 51.81 | 50 670.18 | 51.81 | 4
Trillium discolor | 10 (2x) | 102.98 ± 0.78 | 51.49 | 50 357.22 | 51.49 | 4
Trillium erectum | 10 (2x) | 101.96 ± 0.95 | 50.98 | 49 858.44 | 50.98 | 4
Trillium erectum x flexipes | 10 (2x) | 100.07 ± 0.19 | 50.04 | 48 934.23 | 50.04 | 4
Trillium flexipes | 10 (2x) | 101.24 ± 0.79 | 50.62 | 49 506.36 | 50.62 | 4
Trillium govanianum | 20 (4x) | 137.27 ± 0.48 | 68.64 | 67 125.03 | 34.31 | 4
Trillium grandiflorum | 10 (2x) | 104.06 ± 0.60 | 52.03 | 50 885.34 | 52.03 | 4
Trillium kurabayashii | 10 (2x) | 100.09 ± 0.75 | 50.05 | 48 944.01 | 50.05 | 4
Trillium lancifolium | 10 (2x) | 106.07 ± 0.18 | 53.04 | 51 868.23 | 53.04 | 4
Trillium luteum | 10 (2x) | 109.11 ± 0.69 | 54.56 | 53 354.79 | 54.56 | 4
Trillium maculatum | 10 (2x) | 107.7 ± 0.08 | 53.85 | 52 665.30 | 53.85 | 4
Trillium nivale | 10 (2x) | 95.01 ± 0.25 | 47.51 | 46 459.89 | 47.51 | 4
Trillium ovatum (1) | 10 (2x) | 106.00 ± 0.95 | 53.00 | 51 834.00 | 53.00 | 4
Trillium ovatum (2) | 10 (2x) | 107.35 ± 0.62 | 53.68 | 52 494.15 | 53.68 | 4
Trillium parviflorum | 10 (2x) | 100.02 ± 0.23 | 50.01 | 48 909.78 | 50.01 | 4
Trillium pusillum var. pusillum | 10 (2x) | 99.12 ± 0.31 | 49.56 | 48 469.68 | 49.56 | 4
Trillium recurvatum | 10 (2x) | 103.23 ± 0.98 | 51.62 | 50 479.47 | 51.62 | 4
Trillium rivale | 10 (2x) | 55.01 ± 0.15 | 27.51 | 26 899.89 | 27.51 | 4
Trillium sessile | 10 (2x) | 108.17 ± 0.59 | 54.09 | 52 895.13 | 54.09 | 4
Trillium sulcatum (1) | 10 (2x) | 99.15 ± 0.23 | 49.58 | 48 484.35 | 49.58 | 4
Trillium sulcatum (2) | 10 (2x) | 99.16 ± 0.36 | 49.58 | 48 489.24 | 49.58 | 4
Trillium taiwanense | 20 (4x) | 149.57 ± 0.60 | 74.79 | 73 139.73 | 37.39 | 4
Trillium tschonoskii | 20 (4x) | 141.92 ± 0.85 | 70.96 | 69 398.88 | 35.48 | 4
Trillium undulatum | 10 (2x) | 78.62 ± 0.67 | 39.31 | 38 445.18 | 39.31 | 4
Trillium viride | 10 (2x) | 103.08 ± 0.61 | 51.54 | 50 406.12 | 51.54 | 4

Tribe Xerophylleae
Xerophyllum asphodeloides | 30 (2x)* | 6.92 ± 0.02 | 3.46 | 3383.88 | 3.46 | 3
Xerophyllum tenax (1) | 30 (2x)* | 5.91 ± 0.03 | 2.96 | 2889.99 | 2.96 | 3
Xerophyllum tenax (2) | 30 (2x)* | 5.94 ± 0.01 | 2.97 | 2904.66 | 2.97 | 3

[a] Counts of chromosome numbers were made from the same plants as used for genome size estimation whenever possible. Alternatively, published chromosome counts were used for calculations (see Supporting Information Table S1 for further clarification). *Potential ancient polyploids, but diploid-like behavior in meiosis.
[b] 1 pg = 978 Mb (Doležel et al., 2003).
[c] Calibration standards used for genome size assessment: (1) Solanum lycopersicum L. ‘Stupiké polní rané’ (2C = 1.96 pg; Doležel et al., 1992). (2) Petroselinum crispum ‘Champion Moss Curled’ (2C = 4.50 pg; Obermayer et al., 2002). (3) Pisum sativum L. ‘Ctirad’ (2C = 9.09 pg; Doležel et al., 1998). (4) Allium cepa ‘Ailsa Craig’ (2C = 33.55 pg; Van't Hof, 1965).
Underlined species indicate those taxa not included in phylogenetic or statistical analyses.

DNA extraction, amplification and sequencing

Many of the DNA sequences used for phylogenetic analyses were downloaded from GenBank. For those taxa that we sequenced de novo (see Table S1), genomic DNA was isolated following the 2x CTAB extraction procedure (Doyle & Doyle, 1987) and purified using a CsCl/ethidium bromide density gradient and dialysis. The plastid trnL-trnF region and matK gene were amplified by PCR using the same primer combinations as in Zomlefer et al. (2001) and Osaloo & Kawano (1999) in 25-μl reactions, each reaction containing 22.5 μl of ReddyMix PCR master mix (1.5 mM Mg; Abgene, Epsom, UK), 1.5 μl of 0.4% bovine serum albumin (BSA), 0.6 μl of H2O, 60 ng of primers and c. 35 ng of DNA template. PCR products were purified using DNA purification columns according to the manufacturer's protocol (QIAquick; Qiagen Ltd, Crawley, UK) and cycle sequenced using the Big Dye terminator v3.1 chemistry (ABI, Warrington, UK) following the protocol recommended by the manufacturer. See Buerki et al. (2012) for specific information on PCR and sequencing conditions.

Sequence editing, alignment and phylogenetic analysis

Nucleotide sequences were assembled and edited using BioEdit version 7.0.9 (Hall, 1999). Alignments were made separately for each region with ClustalW (Thompson et al., 1997) using default settings implemented in BioEdit, and gaps were manually adjusted. A single-partition data set (trnL-trnF) was employed to build a representative family-level tree; a second region (matK) was added to a reduced trnL-trnF data set, focused on tribe Parideae and relatives. Note that for six species for which we estimated GS in Parideae (Table 1), we were not able to obtain unambiguous sequences, and these were excluded from statistical analyses based on phylogenetic results. Phylogenetic reconstructions using Bayesian inference (BI) were carried out with MrBayes version 3.1.2 (Ronquist & Huelsenbeck, 2003). The most appropriate nucleotide substitution models for each partition were chosen with MrModeltest (v.2.; Nylander, 2004). The best-fitting model based on both AIC (Akaike information criterion) and hLRT (hierarchical likelihood ratio test) was GTR + G for the single-partition data set (trnL-trnF) and GTR + I + G for the combined data set (trnL-trnF + matK). For each analysis, four Markov chains were run simultaneously for 40 × 10^6 generations and sampled every 1000 generations. The MCMC (Markov chain Monte Carlo) sampling was considered sufficient as the effective sample size (ESS) was > 200 in each case, as evaluated in Tracer v.1.5 (Rambaut & Drummond, 2007). Data from the first 10 × 10^6 generations were discarded as the ‘burn-in’ period in each analysis, and the remaining trees were used to construct 50% majority-rule consensus trees. Posterior probabilities (PPs) of nodes were calculated from the pooled samples.
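The sampling scheme translates into tree counts as sketched below. This is a rough illustration only, reading the run length as 40 × 10^6 generations and the burn-in as 10 × 10^6 generations; it ignores the initial state that some programs also write at generation 0, and the variable names are hypothetical.

```python
# Run settings as described in the text.
total_generations = 40 * 10**6
sampling_interval = 1000
burn_in_generations = 10 * 10**6

# One tree is stored per sampling interval.
trees_sampled = total_generations // sampling_interval
trees_discarded = burn_in_generations // sampling_interval
trees_retained = trees_sampled - trees_discarded

print(f"sampled:   {trees_sampled}")
print(f"discarded: {trees_discarded}")
print(f"retained:  {trees_retained}")
print(f"burn-in fraction: {trees_discarded / trees_sampled:.2f}")
```

Under these assumptions a quarter of each run is discarded, and the consensus trees summarize the remaining samples.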

Reconstruction of ancestral character states

Following the recommendations of Cusimano et al. (2012), haploid ‘n’ instead of basic ‘x’ chromosome numbers were used to reconstruct ancestral chromosome numbers. Analyses were run in ChromEvol v. 1.3 (http://www.zoology.ubc.ca/prog/chromEvol.html) (Mayrose et al., 2010), using BI and maximum likelihood (ML) approaches. This software implements eight models of chromosome number changes, with a probabilistic estimation of the number of evolutionary events taking place along the branches, including the following parameter rates: polyploidization (ρ), demi-polyploidization (μ) and dysploidization (chromosome gains (λ) and losses (δ)). These models were fitted to our data, each with 10 000 simulations. The maximum number of chromosomes was set to 10 times higher than the highest number found in the empirical data, the minimum was set to 1 and the tree was re-scaled with the ‘branchMul’ parameter set to 0.01. Haploid chromosome numbers used for inferences were based on previously published reports and our new counts (see Tables 1, S1).
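The four event types can be pictured as single-step transitions between haploid numbers. The sketch below is not ChromEvol code; it is a minimal illustration, assuming that a gain or loss changes n by one, polyploidization doubles n, and demi-polyploidization multiplies n by 1.5 (with both roundings allowed for odd n). The function name is hypothetical, and the lower bound of 1 mirrors the minimum chromosome number set in the analysis.

```python
import math


def one_step_transitions(n, minimum=1):
    """Return the haploid numbers reachable from n by one event of each type.

    Illustrative only: gain = n + 1, loss = n - 1, polyploidization = 2n,
    demi-polyploidization = 1.5n (rounded down or up when n is odd).
    """
    half = n / 2
    demi = sorted({n + math.floor(half), n + math.ceil(half)})
    events = {
        "gain": [n + 1],
        "polyploidization": [2 * n],
        "demi-polyploidization": demi,
    }
    if n - 1 >= minimum:  # a loss cannot go below the minimum number
        events["loss"] = [n - 1]
    return events


for start in (8, 11):
    print(start, one_step_transitions(start))
```

For example, under these assumptions n = 8 can reach n = 16 by polyploidization, while n = 11 can reach n = 16 or n = 17 by demi-polyploidization.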

Before conducting statistical inference of ancestral GS reconstruction, we considered the potential effect of polyploidy (i.e. whether to use 1C or 1Cx values sensu Greilhuber et al., 2005). We calculated 1Cx values and used them for ancestral reconstruction with the proviso that our primary goal was to provide a general overview of the dynamics of GS at the family level. Given the admixture distribution (i.e. mixture of different populations) of GS values in our data (see Fig. S1a), we tested several transformations, trying to obtain a normal distribution of GS. However, none was successful so we could not use the generalized least squares (GLS) model in a Bayesian framework as has previously been used to reconstruct ancestral GS and evaluate rates of parameter evolution (e.g. Bayes Traits; Pagel, 1999). Instead, we used unordered maximum parsimony (MP) as implemented for continuous characters with Mesquite v.2.73 software (Maddison & Maddison, 2010). Character history was also reconstructed on the 50% consensus tree using ML under the Mk1 model and under MP using Mesquite v.2.73, after coding of continuous data (1Cx values) into discrete categories (because continuous data cannot be analyzed under ML in Mesquite).
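Coding continuous 1Cx values into discrete states, as required for the Mk1 analysis, can be sketched as follows. The category boundaries actually used are not given in the text, so the bin edges below are purely hypothetical, as is the function name; only the example 1Cx values are taken from Table 1.

```python
from bisect import bisect_left

# Hypothetical upper bounds (pg) of the first four states; values above the
# last edge fall into the fifth state. These edges are NOT from the paper.
BIN_EDGES_PG = [1.4, 3.5, 14.0, 35.0]


def code_state(one_cx_pg):
    """Map a 1Cx value (pg) to a discrete state 0-4.

    A value equal to an edge is assigned to the lower state.
    """
    return bisect_left(BIN_EDGES_PG, one_cx_pg)


# Example 1Cx values (pg) from Table 1.
examples = {
    "Schoenocaulon texanum": 0.66,
    "Xerophyllum asphodeloides": 3.46,
    "Ypsilandra cavaleriei": 5.50,
    "Paris forrestii": 56.59,
}
states = {taxon: code_state(value) for taxon, value in examples.items()}
print(states)
```

The resulting integer states can then be treated as an unordered discrete character; the choice of boundaries will influence the reconstruction and should be reported alongside it.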

Statistical analyses

GS comparisons between specific lineages were performed with Statgraphics Plus v.5.1 (Statistical Graphics Corp., Warrenton, VA, USA). The normality of the data distributions was tested with the Kolmogorov–Smirnov test and the homogeneity of variances with Levene's test. The significance of differences in GS between selected groups was determined using the nonparametric Kruskal–Wallis test.
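The rank-based statistic behind the Kruskal–Wallis test can be sketched in a few lines. This is not the Statgraphics implementation; it is a minimal pure-Python illustration that assigns average ranks to ties but omits the tie correction, and it uses only a small subset of 1Cx values from Table 1, so the result is not comparable with the statistic reported in the Results. The function name is hypothetical.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (average ranks for ties, no tie correction)."""
    pooled = sorted(value for group in groups for value in group)

    # Assign each distinct value the mean of the 1-based ranks it occupies.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    n_total = len(pooled)
    rank_term = sum(
        sum(rank_of[value] for value in group) ** 2 / len(group)
        for group in groups
    )
    return 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)


# Small illustrative subset of 1Cx values (pg) from Table 1.
chionographideae = [0.98, 1.53]
xerophylleae = [3.46, 2.96, 2.97]
parideae_subset = [56.59, 42.25, 55.91]

h_statistic = kruskal_wallis_h([chionographideae, xerophylleae, parideae_subset])
print(f"H = {h_statistic:.2f}")
```

The significance of H is then assessed against a chi-squared distribution with (number of groups − 1) degrees of freedom, which statistical packages do automatically.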

Results

Evolutionary relationships in Melanthiaceae

The single partition Bayesian analysis yielded a well-resolved phylogenetic tree for the family supporting its monophyly and the segregation into five strongly supported tribes (Fig. 1c), although some relationships at the species level lacked significant support. As in Zomlefer et al. (2001), tribe Melanthieae is resolved as sister to the remaining four tribes of the family (Fig. 1c, clade 2). Tribes Chionographideae and Heloniadeae (Fig. 1c, clades 3 and 4), both with Arcto-Tertiary disjuncts, are closely related and sister to the lineage comprising tribes Xerophylleae and Parideae (Fig. 1c, clades 5 and 6). The well-supported phylogenetic relationship between Xerophylleae and Parideae has been previously reported, and the addition of the matK gene to a reduced data set confirmed the sister position of Pseudotrillium rivale to the remaining representatives of Parideae and improved resolution of some of the internal nodes in Parideae and Heloniadeae (Fig. 2).

Figure 2.

Fifty per cent majority-rule consensus tree from Bayesian inference based on the combined trnL-trnF and matK data sets focusing on tribe Parideae. Nuclear DNA contents for each species (1C) were mapped along the tree, and ancestral state reconstruction was conducted under maximum parsimony (MP) using 1Cx values. Ps, Pseudotrillium; K, Kinugasa.

GS and chromosome variation across Melanthiaceae

With C-values reported for 79 species of Melanthiaceae (see Table 1), this study represents the most comprehensive survey of GS for the family. The flow cytometric analysis resulted in high-resolution histograms (Fig. S1b) with 2C peaks for the target sample and the reference standard of good quality (CV% 1.65–4.82; mean 3.28 for target samples). GS measurements from different accessions of the same species varied < 2.55%, falling within acceptable limits of intraspecific variation (i.e. < 5%; Doležel & Bartos, 2005).

Overall, 1C values in Melanthiaceae varied 230-fold, ranging from 0.66 pg in Schoenocaulon texanum (2n = 2x = 16) to 152.23 pg in Paris japonica (2n = 8x = 40; Pellicer et al., 2010). At the 1Cx level, this variation dropped down to c. 86-fold, with the smallest 1Cx estimate in the diploid Schoenocaulon texanum, and the largest 1Cx estimate in the diploid Paris forrestii (1C = 56.59 pg). Nuclear DNA contents (1C) were mapped onto the phylogenetic trees (Figs 1, 2), highlighting the impressive shift towards GS expansion during the diversification of tribe Parideae.
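The fold ranges quoted above can be checked directly from the extreme values. A minimal sketch, using only figures given in the text and Table 1; the variable names are illustrative.

```python
# Extreme values (pg) quoted in the text.
smallest_1c = 0.66     # Schoenocaulon texanum (also its 1Cx, as a diploid)
largest_1c = 152.23    # Paris japonica (Pellicer et al., 2010)
largest_1cx = 56.59    # Paris forrestii

fold_range_1c = largest_1c / smallest_1c
fold_range_1cx = largest_1cx / smallest_1c

print(f"1C range:  {fold_range_1c:.1f}-fold")
print(f"1Cx range: {fold_range_1cx:.1f}-fold")
```

The drop from the 1C to the 1Cx range reflects the removal of the polyploidy contribution, since the largest 1C value belongs to an octoploid.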

The ranges of GS in each tribe are listed in Table 2 and illustrated in Fig. 3. The Kruskal–Wallis test showed that GS differences were significant only in the case of Parideae (test statistic = 55.25; P < 0.0001), with the remaining four tribes forming homogeneous groups. Within Parideae, mean GS values for Paris and Trillium were not significantly different (test statistic = 0.03; P > 0.05).

Table 2. Ranges of C-values (pg) in Melanthiaceae

Tribe | Coverage genera [a] | Coverage species [a] | 1C min. [b] | 1C max. [b] | 1C mean ± SD [b] | 1Cx min. [b] | 1Cx max. [b] | 1Cx mean ± SD [b]
Chionographideae | 2/2 | 2/7 | 0.98 | 1.53 | 1.25 ± 0.38 | 0.98 | 1.53 | 1.25 ± 0.38
Melanthieae | 7/7 | 31/76 | 0.66 | 5.32 | 2.62 ± 0.96 | 0.66 | 2.94 | 1.97 ± 0.54
Heloniadeae | 3/3 | 9/13 | 2.50 | 5.50 | 3.81 ± 0.95 | 2.50 | 5.50 | 3.81 ± 0.95
Parideae | 3/3 | 36/79 | 27.51 | 152.23 | 54.01 ± 18.44 | 27.51 | 56.59 | 48.17 ± 7.29
Xerophylleae | 1/1 | 2/2 | 2.96 | 3.46 | 3.21 ± 0.35 | 2.96 | 3.46 | 3.21 ± 0.35

[a] Number of genera and species extracted from Govaerts (2013).
[b] Values calculated based on the data in Table 1. Calculations in Parideae also include the C-value for P. japonica taken from Pellicer et al. (2010).

Figure 3.

Box plot showing the distribution of 1Cx values across tribes. Solid vertical lines in boxes represent median values and whiskers standard deviation.

Chromosome numbers were obtained de novo for the taxa studied (Table 1), and previously published reports were also used to infer ploidy (see Table S1). A selection of metaphase plates (Fig. 4) illustrates the diversity of chromosome numbers and sizes in the family. Three ploidies were encountered in Melanthieae: diploids with 2n = 16 (Anticlea, Schoenocaulon and Veratrum), 20 (Stenanthium) or 22 (Toxicoscordion), tetraploids with 2n = 32 (Amianthium, Anticlea and Veratrum) and one hexaploid in Zigadenus glaberrimus with 2n = 54 (W. B. Zomlefer et al., unpublished). By contrast, tribes Chionographideae, Heloniadeae and Xerophylleae have consistent chromosome numbers, with 2n = 24, 34 and 30, respectively, and all behave cytologically as diploids (i.e. with regular bivalent formation in meiosis). In Parideae, the majority of species analyzed were diploid with 2n = 10. Polyploidy was only encountered in some of the Asian representatives: tetraploids Trillium govanianum, Trillium tschonoskii and Trillium taiwanense (2n = 20), and octoploid P. japonica (2n = 40; Pellicer et al., 2010).

Figure 4.

Chromosome number diversity and evolution in Melanthiaceae inferred under Bayesian optimization. Pie chart color codes represent probabilities of inferred chromosome numbers (numbers that scored the highest probability are depicted). Note that the same chromosome numbers were reconstructed under maximum likelihood (see Supporting Information Fig. S3). Numbers along branches represent events inferred with posterior probability (PP) > 0.5. Metaphase chromosome photographs scale bars, 10 μm.

Ancestral GS and chromosome evolution in Melanthiaceae

According to the MP approach, the most recent common ancestor (MRCA) of Melanthiaceae was reconstructed with a 1Cx = 5.37 pg (Fig. 1c, clade 1). Overall, excluding Parideae, the trend was towards genome size reduction in the family (Fig. 1c, clades 2–5), with the largest decrease in Chionographideae (MRCA 1Cx = 1.15 pg). These dynamics, however, were not apparent in Parideae, in which the diversification of the extant lineages (Fig. 1c, clade 6) was accompanied by a significant GS expansion. The MRCA of the tribe was inferred as having a 1Cx value of 35.10 pg, in striking contrast to the small ancestral 1Cx value of 4.47 pg reconstructed for Xerophylleae, the sister group of Parideae (Fig. 1c). Analysis of an expanded data set for Parideae (Fig. 2) revealed that continued GS increases probably occurred gradually in this tribe with only relatively modest increases in the ancestral 1Cx values inferred at the internal nodes (see clades 4, 5 and 6; Fig. 2). The inferred 1Cx values for the MRCA of Paris (43.73 pg) and of the main Trillium clade (43.66 pg) were similar (Fig. 2). Under ML (and MP in categorized data), the reconstructed 1Cx ranges were smaller overall than those obtained with MP in continuous data. However, the results mirrored the dynamics found with the former approach, with four of the five tribes in the family characterized by small to very small genomes, in contrast to the punctuated genome expansion in Parideae (Fig. S2a,b).

Analysis of chromosome evolution in Melanthiaceae recovered the three-parameter model (M2) of Mayrose et al. (2010), as the one best fitting our data, with inferred rates of change as follows: ρ = 0.79, λ = 0, δ = 1 (see Table S2 for further details). The ancestral chromosome numbers reconstructed using BI, their probabilities (PPs) and the number of chromosome events inferred with PP > 0.5 are shown in Fig. 4. Ancestral chromosome numbers inferred using ML were always coincident with the highest scores in BI (Fig. S3).

The haploid (n) chromosome number at the root of the family with the highest PP was n = 9. Increases in chromosome numbers were derived from demi-polyploidization (e.g. Heloniadeae) or polyploidization events (e.g. Veratrum and Paris), and not through chromosome gains (i.e. fissions), at least not with high probability. Chromosome losses were inferred frequently during the evolution of the family, and specifically on branches leading to the major tribal diversifications, with a significantly higher rate in Parideae (Fig. 4). In contrast, polyploidization (i.e. whole-genome duplication (WGD)) events were reconstructed in more derived positions, occurring mainly at the tips rather than at internal nodes in the tree (Fig. 4; e.g. in Amianthium, Paris, Trillium and Veratrum).

Discussion

Chromosome diversity and evolution in Melanthiaceae

Across the family as a whole, Melanthiaceae are relatively variable in chromosome numbers (Table 1, Fig. 4). However, such diversity contrasts with the constancy of haploid numbers in most tribes (i.e. Chionographideae n = 12, Heloniadeae n = 17, Xerophylleae n = 15, and Parideae n = 5). The exception to this is found in tribe Melanthieae, with four haploid chromosome numbers reconstructed in our analysis (n = 8, 10, 11, and 16; Fig. 4).

Changes in karyotypic traits (e.g. size and number) are reflected in the GS of organisms and their evolution is intimately linked (e.g. Bliss & Suzuki, 2012; Nam & Ellegren, 2012; Jang et al., 2013; Pellicer et al., 2013). One of the best illustrative examples is that of Parideae, in which the lowest diploid chromosome numbers (2n = 10) in the family, arising from a significant number of chromosome losses (Fig. 4), occurred in parallel with high rates of DNA amplification. This is shown by the large average length of chromosomes in Parideae (c. 20 μm) compared with the small size of those from the sister tribe Xerophylleae (e.g. Xerophyllum tenax; mean = 2.65 μm). In Parideae, an almost two-fold GS difference among representatives of the same ploidy was reported (Table 2). Again, this was reflected by differences in the total karyotype lengths (TKLs), that is, chromosome sizes, as previously reported by Warmke (1937). For example, in Pseudotrillium rivale, with the smallest 1C value of any of Parideae investigated (1C = 27.51 pg), we calculated a TKL = 80.67 ± 3.05 μm, whereas at the upper end of the scale, in Paris thibetica (1C = 54.58 pg) or Trillium ovatum (1C = 53.68 pg), the TKLs were 97.91 ± 3.05 μm and 133.54 ± 8.93 μm, respectively (Fig. S4). Tribe Melanthieae has also undergone several chromosomal restructuring events during its diversification, leading to the current diversity in chromosome numbers observed (i.e. 2n = 16, 20, 22, 32, and 54). Among diploids, it is striking that the greatest differences in GS were found in taxa with the same chromosome number (e.g. Schoenocaulon and Veratrum; 2n = 16). This would suggest that, at this scale, it is specific deletions of DNA segments coupled with different rates of DNA amplification that are significant in promoting GS variation (Petrov, 2002; Grover & Wendel, 2010; Hu et al., 2011). To gain insights into the origin of such diverse karyotype profiles, molecular cytogenetic techniques involving physical chromosome mapping and high-throughput sequencing approaches would be useful, as demonstrated for several plant groups (Lan & Albert, 2011; Buggs et al., 2012; Jang et al., 2013).
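The relationship between 1C values and total karyotype length quoted above for the three Parideae species can be checked with simple arithmetic. The short sketch below uses only the mean values given in the text (the ± terms are ignored) and reports the 1C and TKL of each species relative to Pseudotrillium rivale. The variable names are illustrative, and the ratios are descriptive only; no statistical inference is implied.

```python
# Mean values quoted in the text: (1C in pg, total karyotype length in micrometres).
parideae = {
    "Pseudotrillium rivale": (27.51, 80.67),
    "Paris thibetica": (54.58, 97.91),
    "Trillium ovatum": (53.68, 133.54),
}

ref_gs, ref_tkl = parideae["Pseudotrillium rivale"]

relative = {}
for species, (gs_pg, tkl_um) in parideae.items():
    # Ratios relative to the species with the smallest genome in the tribe.
    relative[species] = (gs_pg / ref_gs, tkl_um / ref_tkl)

for species, (gs_ratio, tkl_ratio) in relative.items():
    print(f"{species}: 1C x{gs_ratio:.2f}, TKL x{tkl_ratio:.2f} relative to P. rivale")
```

On these figures the 1C value of Paris thibetica is just under twice that of P. rivale, which corresponds to the "almost two-fold" difference mentioned above, whereas the corresponding TKL ratios are smaller.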

To determine the ancestral chromosome number of the family and the main evolutionary mechanisms responsible for generating the extant karyotypic diversity (i.e. chromosome gains and losses, polyploidization etc.), a statistical modeling approach was taken using the program ChromEvol. Our results showed that the main processes responsible for generating changes in chromosome number during the evolution of the family were chromosome losses and episodes of polyploidy, with polyploidy being prevalent mostly at the tips of the tree rather than early in the evolution of Melanthiaceae (Fig. 4). Although at a lower rate, demi-polyploidization events (i.e. those resulting in odd ploidies) were also inferred in several lineages (e.g. Heloniopsis, n = 17; Toxicoscordion, n = 11). This may reflect the presence of taxa of hybrid origin arising from parental genome donors with different haploid numbers (e.g. via the union of reduced and unreduced gametes) or via apomixis, followed by subsequent chromosomal reorganizations.

The prevalence of polyploidy (i.e. WGD) as a derived trait, with apparently no significant impact on the early evolution of the family, has previously been reported in other families using similar statistical modeling approaches. In Araceae, an extremely high rate of chromosome loss was shown to be the main cause for the chromosome diversity observed in the family, probably as a consequence of the high base number recovered for the MRCA of Araceae (Cusimano et al., 2012). A similar result was also reported in Crocus (Iridaceae), with high rates of chromosome loss playing a key role during chromosome evolution within this genus (Harpke et al., 2013). Data based on ChromEvol analyses are relatively scarce, and large-scale statistical inference requires careful interpretation. Bearing in mind the caveats of any statistical approach, Soltis et al. (2005) hypothesized a low base chromosome number for the angiosperms (between 6 and 9). However, they did not evaluate the potential mechanisms leading to the extant diversity in chromosome numbers, and the question of whether chromosome losses have been more prevalent relative to other evolutionary mechanisms inducing chromosome number change in angiosperms remains unanswered. Recent large-scale genome comparative analyses have provided evidence of an ancient WGD at or close to the origin of angiosperms (Cui et al., 2006; Jiao et al., 2011). These findings highlight that, although analysis of extant chromosome numbers is an extremely useful approach for understanding the relative importance of polyploidy versus other types of chromosomal evolution, it may not uncover these processes deep in the evolutionary tree, which could explain why many of the instances of polyploidy are inferred near the tips of the trees. For Melanthiaceae, no ancestral haploid chromosome number has previously been proposed. In the present study, n = 9 was inferred as the ancestral haploid chromosome number under BI and ML approaches, similar to the number of 8 reported for the closely related family Liliaceae (Peruzzi et al., 2009). This number, despite receiving the highest level of support in the analysis (Fig. 4), must be considered as tentative given the statistical nature of the analysis, which was based only on extant species.

Genome size diversity and evolution in Melanthiaceae

Before this study, C-value estimates had been reported for only a few genera of Melanthiaceae, including Heloniopsis, Paris, Trillium and Veratrum (Bharathan et al., 1994; Zonneveld et al., 2005; Pellicer et al., 2010; Zonneveld, 2010). Nonetheless, these data were sufficient to indicate the existence of contrasting GS patterns in the family. Thus, the question arose as to what extent increases and decreases in GS had occurred during the evolution of this group. In this comprehensive study, we included the entire generic diversity in the family and revealed that just one extraordinary episode of genome expansion occurred during the evolution of the family, in the lineage leading to Parideae. This tribe contains the genera Paris, Pseudotrillium and Trillium, with the single species of Pseudotrillium, P. rivale, placed as sister to all the remaining species (Fig. 2). The taxonomic placement of this taxon has not been exempt from debate, and it was included in Trillium until recently, when Farmer & Schilling (2002) proposed its formal segregation based on morphological and phylogenetic data. The novel phylogenetic and GS data obtained here provide further evidence supporting the segregation of this taxon as an independent genus.

Overall, GS diversity across the family ranges 230-fold and, as a consequence, Melanthiaceae are currently the most GS-diverse family of land plants. Such a diversity is unusual given the rather limited number of species (c. 180 species) relative to some of the other large monocot families such as Poaceae (c. 10 000 species; c. 85-fold range in GS) or Asparagaceae (c. 12 000 species; c. 117-fold range in GS), and only challenged by Orchidaceae (c. 25 000 species; c. 168-fold range in GS) (Leitch et al., 2009; Bennett & Leitch, 2012). Nevertheless, as illustrated in Fig. 5 (see also Fig. S1a), overall the GS distribution in Melanthiaceae mimics the pattern reported in angiosperms as a whole (Kelly & Leitch, 2011), with most species possessing genomes that fall within the very small (< 1.4 pg/1C) and small (≤ 3.5 pg/1C) categories of Leitch et al. (1998).
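The size categories referred to above can be expressed as a small classification function. The sketch below uses only the thresholds stated in this article: very small (< 1.4 pg), small (≤ 3.5 pg) and, from the Introduction, large (≥ 35 pg). The label "intermediate" for everything in between, and the function name, are assumptions made for the example rather than categories defined here. The example values are 1C figures quoted elsewhere in the text.

```python
def gs_category(c_value_pg):
    """Assign a 1C value (pg) to a genome size category.

    Thresholds follow those stated in the article; the 'intermediate'
    label for values between 3.5 and 35 pg is an assumption of this sketch.
    """
    if c_value_pg < 1.4:
        return "very small"
    if c_value_pg <= 3.5:
        return "small"
    if c_value_pg >= 35:
        return "large"
    return "intermediate"


# 1C values (pg) quoted in the text.
examples = {
    "ancestral monocot estimate (Leitch et al., 2010)": 1.85,
    "Pseudotrillium rivale": 27.51,
    "Trillium ovatum": 53.68,
    "Paris thibetica": 54.58,
}

categories = {name: gs_category(value) for name, value in examples.items()}
for name, label in categories.items():
    print(f"{name}: {label}")
```

Binning continuous values in this way discards information within each class, which is one reason why reconstructions based on categorical data can differ from those based on continuous data.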

Figure 5.

Comparison of the genome size (GS) distribution patterns between Melanthiaceae and other angiosperms (angiosperm data from the Plant DNA C-values database; http://data.kew.org/cvalues/).

With the exception of Parideae, extant representatives of Melanthiaceae have small genomes (Fig. 3), leading to the hypothesis that the ancestor of the family may also have had a relatively small genome. This was confirmed here using both MP and ML analyses, although the GS of the MRCA reconstructed for Melanthiaceae under MP (1Cx = 5.37 pg) was higher than that obtained using ML (< 3.5 pg). A further MP analysis run using the categorical data set mirrored the profiles obtained with ML (Fig. S2a,b). Bearing in mind that all the approaches conducted revealed the same dynamics towards GS reduction in four of the tribes, with a reversal of this trend in Parideae, the differences in the values recovered from the different analyses may just reflect the influence of transformation of continuous data into categorical ranges. A relatively small ancestral GS of 1Cx = 6.67 pg was previously recovered in Liliaceae (Leitch et al., 2007), a family closely related to Melanthiaceae that also contains giant diploid genomes (e.g. Fritillaria; 1C = 30.1–89.2 pg). The analysis of GS dynamics for all the major lineages of monocots suggested that they were characterized by small genomes, with an ancestral GS of 1C = 1.85 pg (Leitch et al., 2010). Such results support the concept that GS is generally skewed towards small sizes, and species with large genomes have evolved only rarely (Soltis et al., 2003).

According to our character reconstruction, the apparent burst of GS expansion in Parideae occurred early in the diversification of this tribe, as evidenced by the medium-sized ancestral GS recovered for the MRCA of Xerophylleae and Parideae (1Cx = 9.84 pg; node 1 in Fig. 2), followed by a further significant increase in the MRCA of extant Parideae (1Cx = 35.85 pg; node 3 in Fig. 2). Given that the GS of the MRCA of Paris and Trillium (excluding P. rivale; node 4 in Fig. 2) was reconstructed as 1Cx = 41.24 pg, it suggests that a more than four-fold increase in GS occurred following the divergence of Xerophylleae and Parideae but before the diversification of the majority of extant species of Parideae. It therefore seems that the evolution of giant genomes in Paris and Trillium occurred before the split of these two main lineages, with more limited increases (e.g. P. mairei; 1Cx = 55.91 pg) or even decreases in GS (e.g. T. undulatum; 1Cx = 39.31 pg) occurring after this point.
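The fold changes implied by these reconstructed values follow directly from the ratios of the node estimates. The sketch below recomputes them from the 1Cx values quoted in the paragraph above; the dictionary keys and the helper function are illustrative names only.

```python
# Reconstructed 1Cx values (pg) quoted in the text for nodes in Fig. 2.
node_1cx = {
    "MRCA Xerophylleae + Parideae (node 1)": 9.84,
    "MRCA extant Parideae (node 3)": 35.85,
    "MRCA Paris + Trillium (node 4)": 41.24,
}

# Extant species values (1Cx, pg) cited as later increases or decreases.
tip_1cx = {"P. mairei": 55.91, "T. undulatum": 39.31}


def fold_change(ancestral, derived):
    """Ratio of a derived genome size to an ancestral one."""
    return derived / ancestral


node1 = node_1cx["MRCA Xerophylleae + Parideae (node 1)"]
node3 = node_1cx["MRCA extant Parideae (node 3)"]
node4 = node_1cx["MRCA Paris + Trillium (node 4)"]

fold_1_to_3 = fold_change(node1, node3)
fold_1_to_4 = fold_change(node1, node4)
print(f"node 1 -> node 3: {fold_1_to_3:.2f}-fold")
print(f"node 1 -> node 4: {fold_1_to_4:.2f}-fold")

# Changes after node 4: values above 1 are increases, below 1 decreases.
tip_folds = {sp: fold_change(node4, gs) for sp, gs in tip_1cx.items()}
for sp, fold in tip_folds.items():
    print(f"node 4 -> {sp}: {fold:.2f}-fold")
```

The ratio between node 1 and node 4 exceeds four, consistent with the "more than four-fold" increase stated above, while the changes after node 4 are comparatively modest in either direction.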

Most of our understanding of the processes generating genome size diversity has come from analyzing plants with small or very small genomes. In the absence of recent polyploidy, such studies have indicated that the differential proliferation of TEs, with or without chromosomal rearrangements, has largely contributed to differences in genome size observed between species (Kejnovsky et al., 2012; Slotkin et al., 2012). For example, retrotransposon activity is responsible for doubling the GS in plants such as rice (Oryza australiensis) and cotton (Gossypium spp.) (Hawkins et al., 2006; Piegu et al., 2006). Although on a different scale, the enormous genomes in Parideae may similarly be explained by bursts of TE transposition leading to genome expansion, accompanied by a low efficiency of counterbalancing deletion mechanisms as observed in a few limited studies of other organisms with large genomes. For example, in a study examining the genetic landscape of the gymnosperm Picea abies (GS/1C = 20.04 pg) (Nystedt et al., 2013), > 86% of the repetitive elements recovered were singletons, suggesting that the GS increase had arisen through the amplification of many individual TE families, rather than of a few specific elements that had been amplified to high copy numbers, as seen in angiosperms with smaller genomes, coupled with low rates of deletion via unequal intra-strand recombination. Likewise, in salamanders, low rates of DNA loss as a result of fewer deletions and smaller deletion sizes than found in animal species with smaller genomes favor the occurrence of gigantic genomes (1C = 14.30–44.90 pg; Sun et al., 2012).

Several hypotheses concerning the potential evolutionary forces responsible for generating the diversity of GS in angiosperms have been proposed. Given the skewed distribution of GS towards small genomes, a neutral proportional model for GS evolution was proposed (Oliver et al., 2007), suggesting that it was unlikely for small genomes to become large and remain large, but easier for large genomes to evolve and become smaller. Therefore, large genomes should be scarcer, even in the absence of any selection pressures against them. In contrast, other authors have postulated that genome obesity is maladaptive, and that obese genomes arise as a consequence of the failure of purifying selection to eliminate them efficiently (Lynch & Conery, 2003). Petrov (2002) proposed that, under a neutral model, bias towards deletions could be the primary determinant generating genome size diversity. Although this was strongly argued against by Gregory (2003), the above-mentioned genomic analyses of Nystedt et al. (2013) and Sun et al. (2012) have provided further evidence to suggest that the slow rate of DNA elimination via reduced recombination may play a role in the evolution of large genomes of plants and vertebrates. Fedoroff (2012) reviewed the role of epigenetic mechanisms controlling recombination in the accumulation of TEs and suggested that plants with large genomes could have evolved as a result of possessing highly efficient epigenetic silencing pathways, which target and suppress TE activity by packaging them into heterochromatin, hence preventing their removal via recombination.

Some of these hypotheses have been invoked in explaining the evolution of giant genomes in Liliaceae (Leitch et al., 2007). Based on the hypothesis that increased diversification rates might promote variability of traits (Schluter, 2000), Leitch et al. (2007) proposed that the initial radiation in tribe Lilieae (in which the giant genomes of Fritillaria, Lilium, Cardiocrinum etc. are found) could potentially have provided the engine for generating the variability in GS observed. Although it is noted that the emergence of giant genomes in Melanthiaceae could also be explained by several of these theories, they differ from those of Liliaceae as the occurrence of giant genomes is restricted to just a single lineage (Figs 1, 2). Tribe Parideae arose c. 35–50 Mya (Vinnersten & Bremer, 2001) so the emergence of giant genomes was probably the result of an explosive process. Nevertheless, this assumption is based on evaluating extant species and the possibility remains that extinct species with smaller GS, similar to those of P. rivale and T. undulatum, might have existed, and potentially there was a long period (perhaps tens of millions of years) during which DNA could have slowly accumulated. Although the main mechanisms responsible for genome expansion are relatively well known (Grover & Wendel, 2010), the ‘behind the scenes’ machinery controlling genetic activity (i.e. epigenetic regulation) remains unclear, especially for species with giant genomes such as those found in Parideae. Further in-depth DNA sequence analyses are essential to uncover the types of DNA sequences involved, the mechanisms responsible, and the nature of the triggers which lead to such extreme expansions of the genome.

Concluding remarks

The analyses of genome size and chromosome dynamics presented here have confirmed that the genomic evolution of Melanthiaceae has been characterized by a single episode of dramatic DNA accumulation and chromosome number reduction restricted to one of the five tribes, with genome downsizing being predominant in the rest of the family. Such extreme contrasts in genomic dynamics between closely related lineages have not been previously reported, and this illustrates the key role that TEs and chromosome rearrangements can play in driving the evolution of plant genomes.

Acknowledgements

The authors thank the following botanists and horticulturists for their generous assistance in providing samples and maintaining the RBG, Kew living collections used in this study: Philip Cantino (BHO), Oriane Hidalgo (U. Barcelona), Walter S. Judd (FLAS), Thomas F. Wieboldt (VPI), Holly C. Forbes (U. California, Botanical Garden), R. Kernick, K. Strange, K. Price and M. Christenhusz (RBG, Kew), Thomas Wendt (TEX), Derick B. Poindexter (NCU), Michael W. Denslow (BOON), and Steven R. Hill (ILLS). James Allison, Wilson Baker, Jason Comer (GA), Laura Lukas (GA), Patrick Lynch (GA), Angus Gholson Jr, David E. Giannasi (GA), Jeremy Rentsch (GA), Gil Nelson and Alexander Reynolds assisted W.B.Z. in the field. We also acknowledge the Royal Botanic Gardens, Kew and the National Science Foundation (NSF DEB-083009, J. H. Leebens-Mack (PI) and W.B.Z. (coPI)) for financial support. J.P. benefited from a Beatriu de Pinós postdoctoral fellowship with the support of the Secretary for Universities and Research of the Ministry of Economy and Knowledge (Government of Catalonia) and the co-fund of Marie Curie Actions (European Union 7th R&D Framework Programme). L.J.K. received a postdoctoral fellowship from the Natural Environment Research Council (NERC) funded project ‘Evolutionary Dynamics of Genome Obesity’ (NE/G01724/1).
