Adaptive and nonadaptive genome size evolution in Karst endemic flora of China


  • Ming Kang,

    1. Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
  • Junjie Tao,

    1. Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
    2. University of Chinese Academy of Sciences, Beijing, China
  • Jing Wang,

    1. Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
  • Chen Ren,

    1. Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
  • Qingwen Qi,

    1. Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China
    2. University of Chinese Academy of Sciences, Beijing, China
  • Qiu-Yun Xiang,

    1. Department of Plant and Microbial Biology, North Carolina State University, Raleigh, NC, USA
  • Hongwen Huang

    Corresponding author
    1. Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, China


  • Genome size variation is of fundamental biological importance and has been a longstanding puzzle in evolutionary biology. Several hypotheses for genome size evolution including neutral, maladaptive, and adaptive models have been proposed, but the relative importance of these models remains controversial.
  • Primulina is a genus that is highly diversified in the Karst region of southern China, where genome size variation and the underlying evolutionary mechanisms are poorly understood. We reconstructed the phylogeny of Primulina using DNA sequences for 104 species and determined the genome sizes of 101 species. We examined the phylogenetic signal in genome size variation, and tested the fit to different evolutionary models and for correlations with variation in latitude and specific leaf area (SLA).
  • The results showed that genome size, SLA and latitudinal variation all displayed strong phylogenetic signals, but were best explained by different evolutionary models. Furthermore, significant positive relationships were detected between genome size and SLA and between genome size and latitude.
  • Our study is the first to investigate genome size evolution on such a comprehensive scale and in the Karst region flora. We conclude that genome size in Primulina is phylogenetically conserved but its variation among species is a combined outcome of both neutral and adaptive evolution.

Introduction

Genome size variation is of fundamental biological importance and has been a longstanding puzzle in evolutionary biology (Bennett & Leitch, 2005, 2011). Genome size, measured as the haploid nuclear DNA content (C value), varies over 2400-fold across angiosperm lineages (Pellicer et al., 2010). Such wide variation has been hypothesized to be the result of several genetic mechanisms. Among these, polyploidization and the proliferation of transposable elements are considered the most prevalent processes contributing to genome size increase, while high rates of deletion and selection against transposable elements, unequal crossing over, and illegitimate recombination are believed to be the major factors leading to genome size decrease (Morgan, 2001; Wendel et al., 2002; Ma et al., 2004; Bennetzen et al., 2005). Several evolutionary models have been proposed to explain genome size variation, including neutral (Petrov, 2002), maladaptive (Lynch & Conery, 2003) and adaptive models (Gregory, 2002; Vinogradov, 2004). However, the relative importance of these models, their biological mechanisms and the evolutionary consequences of genome size diversity remain controversial.

In the past decade, model-based phylogenetic comparative methods have become a standard statistical approach to understanding trait evolution, including the mode and tempo of genome size evolution. For example, using this approach, a mode of punctuated genome size evolution was found in Orobanche (Weiss-Schneeweiss et al., 2006), Liliaceae (Leitch et al., 2007) and Allium subgenus Melanocrommyum (Gurushidze et al., 2012), while a gradual mode was found in Hieracium (Chrtek et al., 2009). The phylogenetic comparative approach has also allowed for tests of correlated evolution between genome size and other organismal traits, such as morphological, cytological, reproductive, physiological, and ecological traits (Beaulieu et al., 2007a; Knight & Beaulieu, 2008; Whitney et al., 2010; Herben et al., 2012). Although the observation of a significant correlation between genome size and other phenotypic characters in many previous studies has led to the hypothesis of adaptive evolution of genome size, the validity of this hypothesis remains to be established. In these studies, the biological link between genome size variation and the phenotypic traits was not clearly revealed and the causes of these correlations were not convincingly determined (Oliver et al., 2007). For example, the fact that specific leaf area (SLA) lies at the intersection of cell structure and leaf functions (Wright et al., 2004; Poorter et al., 2009) leads to the prediction that SLA variation might be linked to the evolution of DNA content in the cells of a species. Recently, two meta-analyses suggested that 2C DNA content is positively correlated with SLA in angiosperms, but the relationship was negative in phylogenetic independent contrasts (PIC) analysis (Beaulieu et al., 2007b; Knight & Beaulieu, 2008). Furthermore, latitudinal range has normally been used as a proxy for a suite of environmental variables such as temperature and precipitation (De Frenne et al., 2013). Genome size variation of phylogenetically independent lineages along latitudinal gradients is often interpreted as an adaptive response to variation in a temperature- and/or precipitation-based selective regime; however, previous studies have often produced conflicting results (Knight & Ackerly, 2002).

Most comparisons of genome sizes have been meta-analyses of diverse phylogenetic clades representing deep divergences such as families (Kellogg, 1998), floras (Grime & Mowforth, 1982; Knight & Ackerly, 2002) or life forms (Bennett & Leitch, 1995), with limited sampling of species within the respective clades. Such analyses at higher taxonomic ranks have little power to disentangle the different evolutionary forces at work. Several recent studies indicate that ecological factors probably play a more important role in shaping genome size variation at lower taxonomic levels than at higher levels (Jakob et al., 2004; Eilam et al., 2007; Dušková et al., 2010). Hawkins et al. (2008) pointed out that phylogenetic scale is an important factor in comparative analyses of genome size variation, and suggested that analyses among closely related species within a single genus should provide greater interpretive power than analyses comparing more distant lineages at higher taxonomic ranks. Comparative analyses of closely related species and populations allow detection of ongoing evolutionary forces and mechanisms driving genome size evolution. Unfortunately, to date, studies addressing genome size variation among closely related species and its relationship to phenotypes and environmental factors are still scarce (but see Šmarda et al., 2007; Díez et al., 2013).

As one of the most important biodiversity characters, C values have been estimated for > 7500 angiosperm species (Bennett & Leitch, 2012), representing c. 2% of flowering plant species and > 50% of all angiosperm families (APG III, 2009). These data are biased towards particular regions such as Europe and North America. The generality of previous findings on plant genome size evolution may thus be limited, as plant genome sizes in many other regions of the world with much higher species richness and geographic endemism remain unknown (Bennett & Leitch, 2011). One of the most interesting regions is the area of Karst in southern and southwestern China, which boasts over 20 000 plant species and whose flora is ranked as the most endemic-rich subtropical flora in the world. However, limited information is available on genome sizes of Karst plants. In southern and southwestern China, the Karst landform is characterized by high edaphic and topographic heterogeneity and offers a multitude of ecological niches for plant diversification and speciation. The Karst environment is generally characterized by low soil water content, periodic water deficiency, and poor nutrient availability, which exert strong selective forces on plant evolution, resulting in remarkably high species richness and endemism in the region. For example, many calcicoles (species adapted to calcareous soil) have evolved in the flora. Because of their highly diverse and unique biota, Karst regions in Southeast Asia have long been regarded as ‘natural laboratories’ for ecological and evolutionary studies to understand natural selection and speciation (Clements et al., 2006). The assessment of genome size variation and its correlation with ecological and geographic traits should provide insights into the mode and mechanisms of genome size evolution, as well as its roles in the evolution and diversification of the Karst flora. 
In this study, we conducted such analyses on Primulina, a genus that is highly diversified in the Karst region of southern China.

Primulina, based on circumscription of recent molecular phylogenetic analyses, is one of the largest genera of the Old World Gesneriaceae (Wang et al., 2011; Weber et al., 2011). The newly revised Primulina is a monophyletic group comprising > 140 species of perennials that are widely distributed throughout the Karst regions of China and adjacent countries of Southeast Asia. Approximately 85% of the species (120 species) are endemic to southern and southwestern China. The genus occurs in a wide latitudinal range (18°N–31°N) and is adapted to remarkably diverse habitats and niches from steep cliffs and cave entrances to lowland sandstone. However, most species are ‘point endemics’ (Samways & Lockwood, 1998) found only in a single or microareal location. Nutrient constraints in calcareous soils, particularly for nitrogen (N) and phosphorus (P), nutrients that are essential for the synthesis of nucleic acids, might have selectively favored smaller genome sizes (Hessen et al., 2010).

The rich species diversity of the genus along with the high degree of microhabitat specialization makes Primulina an ideal system for studying evolutionary divergence, adaptation and speciation. In this study, we investigated genome size variation and its relationship to SLA and latitudinal distribution to gain insights into the mechanisms driving genome size evolution. We report a large data set of new chromosome counts for 56 Primulina species, and DNA content and SLA measurements for > 100 species. We examine genome size variation within and between species, test phylogenetic signal in genome size, SLA and latitude, and assess the fit of these variables or traits to different evolutionary models. We further evaluate the relationship of genome size to SLA and latitude in a phylogenetic framework.

Materials and Methods


The species diversity of Primulina has only been recently recognized, because of its high endemism and edaphic specialization. During 2010–2012, we conducted several special field investigations on this genus and discovered a number of new taxa. Living plants or seeds were collected in the field throughout the geographic range of the genus in China and grown in glasshouses at South China Botanical Garden, Chinese Academy of Sciences (CAS). A list of the species is provided in Supporting Information Table S1. These samples represent 234 populations of 88 described Primulina species, as well as 16 undescribed species (coded as Primulina sp. nov. 1–16). Geographic coordinates were recorded using a Garmin-Etrex GPS instrument (Garmin International Inc., Olathe, KS, USA). Herbarium vouchers were deposited in the IBSC (South China Botanical Garden, CAS). Wild-collected seeds were first germinated in Petri dishes. Seedlings were then transplanted into pots (8 cm diameter) and maintained in glasshouses. Representatives of three species (i.e. Primulina brachytricha, Primulina linearicalyx and Primulina varicolor) died, and thus 101 species were available for flow cytometry analysis.

Phylogeny reconstruction

The circumscription of Primulina was recently revised based on phylogenetic analyses of nuclear internal transcribed spacers (ITSs) and plastid intergenic spacer trnL-trnF (Wang et al., 2011; Weber et al., 2011). For the purposes of this study, we constructed a comprehensive species-level phylogeny of this newly circumscribed genus using DNA sequences of ITS and intergenic spacer regions trnL-trnF, rpl32-trnL, and atpB-rbcL of 104 species. Most species were represented by two to five populations, each represented by one individual. The two species Didymocarpus hancei and Petrocodon dealbatus, close relatives of Primulina, were used as outgroups. The entire ITS region including the 5.8S gene was amplified and sequenced using the ITS5P and ITS8P primers (Möller & Cronk, 1997). The trnL-trnF intergenic region was amplified using universal primers designed by Taberlet et al. (1991). Amplification and sequencing of rpl32-trnL and atpB-rbcL were conducted using primers published by Shaw et al. (2007) and Mayer et al. (2003), respectively. The combined sequence matrix was initially aligned in ClustalX (Thompson et al., 1997) and further edited manually in Bioedit (Hall, 1999). Sequences were submitted to GenBank (Table S2).

We used Bayesian inference as implemented in MrBayes v. 3.1.2 (Ronquist & Huelsenbeck, 2003) to infer the phylogeny, partitioning the data set by locus and choosing the substitution model for each partition according to the Akaike information criterion corrected by sample size (AICc) values from jModelTest 0.1.1 (Posada, 2008). The GTR+I+Γ substitution model was selected for each partition, and each model parameter was unlinked across partitions. Markov chain Monte Carlo (MCMC) with four chains was performed for 10 000 000 generations, with sampling of trees every 100 generations. We discarded 25% of the trees as burn-in after checking for stationarity and convergence of the chains with Tracer v1.5 (Rambaut & Drummond, 2007). The phylogeny obtained from MrBayes analyses did not fully resolve relationships among the species, preventing direct use of the phylogeny for comparative analyses. We used the following strategies to construct a workable phylogenetic framework for subsequent comparative analyses. Because accessions of the same species formed a monophyletic clade for all species except for Primulina eburnea, the most widely distributed species, we reduced the data matrix to include only one accession per species by random selection among accessions. For P. eburnea, accessions fell into two different clades. We thus chose one individual for each clade randomly to represent these clades in the phylogeny. The 50% majority-rule consensus tree of the 104 Primulina species with bootstrap values can be found in Fig. S1. Although theoretical work has shown that polytomies and missing branch-length information have only negligible effects on comparative analyses (Garland & Diaz-Uriarte, 1999; Münkemüller et al., 2012), a fully resolved phylogenetic tree is required by many comparative methods. Therefore, to bypass this problem, the polytomies in the tree were randomly resolved in Mesquite v. 2.75 (Maddison & Maddison, 2011). The outgroups were also removed from the phylogenetic tree in the comparative analyses.
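The random resolution of polytomies described above can be illustrated with a minimal Python sketch. It is not Mesquite's implementation: the nested-tuple tree representation, the function names and the example tree are all hypothetical, and branch lengths are ignored entirely.

```python
import random

def resolve_polytomies(tree, rng):
    """Return a strictly bifurcating copy of a nested-tuple tree.

    Leaves are strings; internal nodes are tuples of children.
    Nodes with more than two children are resolved by repeatedly
    joining two randomly chosen children into a new internal node.
    """
    if not isinstance(tree, tuple):
        return tree  # a leaf needs no resolution
    children = [resolve_polytomies(child, rng) for child in tree]
    while len(children) > 2:
        i, j = sorted(rng.sample(range(len(children)), 2))
        right = children.pop(j)  # pop the larger index first
        left = children.pop(i)
        children.append((left, right))
    return tuple(children)

def leaves(tree):
    """List the leaf labels of a nested-tuple tree."""
    if not isinstance(tree, tuple):
        return [tree]
    return [leaf for child in tree for leaf in leaves(child)]

def is_bifurcating(tree):
    """True if every internal node has exactly two children."""
    if not isinstance(tree, tuple):
        return True
    return len(tree) == 2 and all(is_bifurcating(c) for c in tree)

# Hypothetical tree with a three-way root and a four-way polytomy
polytomy = ("A", ("B", "C", "D", "E"), "F")
resolved = resolve_polytomies(polytomy, random.Random(1))
```

Each random seed gives one of many possible resolutions, so comparative results can be rerun across several resolved trees to see how sensitive they are to the arbitrary choice.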

Specific leaf area

Approximately 10 fully expanded leaves were randomly collected from multiple adult individuals of a population in the field. The fresh leaves were stored in zip-lock bags for SLA analysis. Immediately upon returning from the field, leaf area (LA; cm2) was measured for each leaf using a Li-3000C Portable Laser Area Meter (Li-Cor Inc., Lincoln, NE, USA). Measured leaves were then placed in an air oven to dry for a minimum of 72 h at 80°C and the final dry mass (MD; g) was recorded. SLA (cm2 g−1) was calculated as the ratio of leaf area to dry mass: SLA = LA / MD.
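As a minimal sketch of the SLA calculation, the Python snippet below divides leaf area by dry mass for each leaf and averages over a population. The leaf measurements are hypothetical, and averaging per-leaf SLA values (rather than pooling area and mass) is an assumption, because the text does not say which was done.

```python
def specific_leaf_area(leaf_area_cm2, dry_mass_g):
    """Return SLA (cm2 g-1): leaf area divided by oven-dry mass."""
    if dry_mass_g <= 0:
        raise ValueError("dry mass must be positive")
    return leaf_area_cm2 / dry_mass_g

def population_mean_sla(leaf_measurements):
    """Average per-leaf SLA over (leaf_area_cm2, dry_mass_g) pairs."""
    values = [specific_leaf_area(area, mass) for area, mass in leaf_measurements]
    return sum(values) / len(values)

# Hypothetical measurements for three leaves of one population
leaf_measurements = [(30.0, 0.10), (45.0, 0.15), (24.0, 0.12)]
mean_sla = population_mean_sla(leaf_measurements)
```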

Determining chromosome number

Chromosome number was reported for 60 species of Primulina (11 were new and unpublished) in two recent studies. All the species examined were 2n = 36 except Primulina longgangensis, which was either 2n = 36 (Liu et al., 2012) or 2n = 72 (Christie et al., 2012). Chromosome numbers for the other 66 species included in this study are, however, unknown. Thus, we determined chromosome number for these species using the method described in Ren et al. (2012) with minor modification. Chromosome numbers were counted under light microscopy (Nikon DXM1200F; Kawasaki, Kanagawa, Japan) for at least five cells.

Genome size

Based on the uniformity of chromosome numbers in the genus (all species we examined were 2n = 36, including the accessions of P. longgangensis), we interpreted 2C DNA content as genome size throughout the article. We first developed an optimized protocol by evaluating the reliability of different nuclear extraction buffers and staining conditions. We tested several commonly used buffers and chose LB01 (Doležel et al., 1989) as a suitable buffer for Primulina, with an optimal concentration of propidium iodide (PI; Sigma-Aldrich, St Louis, MO, USA) of 100 μg ml−1 and a staining duration of 40 min. Briefly, young, fresh leaves (20 mg) were co-chopped for 30–60 s in 1 ml of LB01 cold buffer on ice using a sharp razor blade in a Petri dish. The resulting homogenate was filtered through a 50-μm mesh filter. RNase A (Sigma-Aldrich) was added at a concentration of 50 μg ml−1 and PI was used according to protocol standards. Stained samples were analyzed on a Partec CyFlow Space cytometer (Partec GmbH, Münster, Germany) and the fluorescence intensity of 10 000 particles was recorded. Solanum lycopersicum cv ‘Stupické polni rané’ (1C = 0.98 pg; Doležel et al., 1992) was chosen as an appropriate primary reference standard. Oryza sativa ssp. japonica (1C = 0.43–0.45 pg; Arumuganathan & Earle, 1991) served as a secondary reference standard; its 1C value was calibrated against S. lycopersicum (10 replicates). The genome size value for each sample is reported as the average of three technical replicates in this study. The number of plants analyzed per population varied from one to six individuals, resulting in a total of 714 samples from 101 species examined in this study. Means and standard deviations were calculated for each taxon.
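The text does not spell out how fluorescence was converted to DNA content. The sketch below assumes the conventional peak-ratio calculation used with an internal reference standard (sample 2C = sample peak / standard peak × standard 2C) and averages three technical replicates. The peak positions are hypothetical; only the tomato 1C value comes from the text.

```python
TOMATO_2C_PG = 2 * 0.98  # 2C of the primary standard, from 1C = 0.98 pg

def sample_2c_pg(sample_peak, standard_peak, standard_2c_pg):
    """Estimate sample 2C DNA content (pg) from G1 peak positions.

    Assumes fluorescence is proportional to DNA content, so the ratio
    of the sample peak to the standard peak scales the standard's 2C.
    """
    if standard_peak <= 0:
        raise ValueError("standard peak must be positive")
    return sample_peak / standard_peak * standard_2c_pg

def mean_of_replicates(peak_pairs, standard_2c_pg):
    """Average the 2C estimates over (sample_peak, standard_peak) pairs."""
    estimates = [sample_2c_pg(s, r, standard_2c_pg) for s, r in peak_pairs]
    return sum(estimates) / len(estimates)

# Hypothetical peak means (arbitrary fluorescence units), three replicates
replicate_peaks = [(98.0, 100.0), (101.0, 100.0), (101.0, 100.0)]
genome_size_pg = mean_of_replicates(replicate_peaks, TOMATO_2C_PG)
```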

Models of trait evolution

The generalized least-squares method implemented in BayesTraits (Pagel & Meade, 2007) was used to test the mode and tempo of genome size, SLA, and latitude evolution. Although latitude is not an organismal trait, we treated latitude as a continuous trait because it is the main factor affecting plant distribution. In the analyses, species means of genome size and SLA were used in the tests of evolutionary models and comparative analyses, while for latitude, the maximum latitude at which a species occurs was used. We first evaluated whether a random-walk (model A) or directional change model (model B) was the most appropriate for explaining the evolution of each of these variables. Model A corresponds to the standard constant-variance random-walk model, and model B is a directional random-walk model (Pagel, 1999). To obtain insights into the evolutionary history of genome size in the genus, we reconstructed the ancestral genome size and traced the evolutionary history of genome size on the rooted phylogenetic tree using the squared-change parsimony reconstruction method in Mesquite 2.75 (Maddison & Maddison, 2011).

Pagel's lambda (λ), kappa (κ) and delta (δ) were estimated for the species data to determine phylogenetic associations, mode and tempo of trait evolution, respectively. The parameter λ assesses the contribution of phylogeny to the covariance among species on a given trait (i.e. phylogenetic signal). A value of λ = 0 indicates that trait evolution has proceeded independently of phylogeny, whereas λ = 1 suggests that phylogenetic relationships predict effectively the observed pattern of trait variation (pure Brownian-motion process). Intermediate values of 0 < λ < 1 indicate different degrees of phylogenetic signal. The branch-length scaling parameter κ measures the contribution of a punctuated mode versus a gradual mode of trait evolution in a phylogeny. If κ = 0, trait evolution is independent of branch length, which is consistent with a punctuated mode of evolution, while κ = 1 indicates that trait evolution is directly proportional to branch length (gradual mode). Values of κ < 1 suggest proportionally more evolution in shorter branches, whereas κ > 1 suggests proportionally more evolution in longer branches. Finally, the path-length scaling parameter δ, which rescales the path lengths between the root and the tips, was estimated to detect differential rates of evolution over time; δ = 1 indicates a constant rate of evolution (gradual evolution). Values of δ < 1 indicate temporally early trait evolution or ‘early burst’, the signature of adaptive radiation. If δ > 1, then longer path lengths have contributed disproportionately to trait evolution, which is interpreted as accelerated evolution over time (species-specific adaptation). Likelihood ratio tests (LRTs; Huelsenbeck & Rannala, 1997) were used to determine the most appropriate models.
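The scaling parameters and the likelihood ratio test can be made concrete with a short Python sketch. It is not the BayesTraits implementation. The covariance matrix and branch lengths are hypothetical, and the test assumes that twice the log-likelihood difference follows a chi-square distribution with one degree of freedom, which is the usual approximation when a single parameter is fixed.

```python
import math

def lambda_transform(vcv, lam):
    """Scale the off-diagonal (shared-history) covariances by lambda."""
    n = len(vcv)
    return [[vcv[i][j] if i == j else lam * vcv[i][j] for j in range(n)]
            for i in range(n)]

def kappa_transform(branch_lengths, kappa):
    """Raise every branch length to the power kappa."""
    return [b ** kappa for b in branch_lengths]

def lrt_pvalue(lnl_full, lnl_restricted):
    """P value of a one-degree-of-freedom likelihood ratio test."""
    stat = 2.0 * (lnl_full - lnl_restricted)
    # Chi-square survival function for df = 1: erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(max(stat, 0.0) / 2.0))

# Hypothetical covariance matrix for three species: A and B share half
# of their root-to-tip path, C diverged at the root.
vcv = [[1.0, 0.5, 0.0],
       [0.5, 1.0, 0.0],
       [0.0, 0.0, 1.0]]
star = lambda_transform(vcv, 0.0)      # lambda = 0: species independent
brownian = lambda_transform(vcv, 1.0)  # lambda = 1: matrix unchanged
equal_branches = kappa_transform([0.2, 0.5, 1.0], 0.0)  # kappa = 0

# Genome size log likelihoods from Table 1: lambda estimated vs forced = 0
p_lambda_vs_zero = lrt_pvalue(79.007, 48.167)
```

With κ = 0 every branch has the same length, so the amount of change depends only on the number of speciation events, which is the punctuated reading given in the text.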

We also compared the fit of Pagel's models, including Brownian motion (BM), change concentrated at the tips (lambda), speciational (kappa) and accelerating or decelerating (delta), plus Ornstein–Uhlenbeck (OU), to explore the best fitting model of trait evolution. The OU model assumes that a trait evolves towards a hypothetical optimum (θ) (Hansen, 1997; Butler & King, 2004). The model also includes a parameter α measuring the rate of adaptation towards the optimum and a stochasticity component σ, which measures the intensity of the random fluctuations in the evolutionary process. The model fitting was conducted with the R package geiger (Harmon et al., 2008). The best fitting model of evolution was identified using log likelihood and AICc (Hurvich & Tsai, 1989). A higher log likelihood and a lower AICc signify better fit.
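The AICc comparison can be sketched in Python as follows. The formula is the standard small-sample correction of Hurvich & Tsai (1989), and the log likelihoods and parameter counts are the genome size values from Table 2. The sample size of 101 species is an assumption, so the scores may differ slightly from those the authors report.

```python
def aicc(log_lik, k, n):
    """Second-order Akaike information criterion.

    AICc = -2 lnL + 2k + 2k(k + 1) / (n - k - 1)
    """
    if n - k - 1 <= 0:
        raise ValueError("sample size too small for the AICc correction")
    return -2.0 * log_lik + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)

# Genome size models from Table 2: name -> (log likelihood, k)
genome_size_models = {
    "BM": (57.817, 2),
    "lambda": (79.007, 3),
    "delta": (63.041, 3),
    "kappa": (70.345, 3),
    "OU": (64.425, 3),
}
N_SPECIES = 101  # assumption: species with genome size estimates

scores = {name: aicc(lnl, k, N_SPECIES)
          for name, (lnl, k) in genome_size_models.items()}
best_model = min(scores, key=scores.get)
# Difference from the best model; larger values mean weaker support
delta_aicc = {name: score - scores[best_model]
              for name, score in scores.items()}
```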

Phylogenetic regression of the traits

We evaluated the relationship between trait pairs using a phylogenetic generalized least-squares (PGLS) approach, in which the phylogenetic regression was performed with a phylogenetic tree whose internal branches were all multiplied by λ, leaving tip branches at their original length (Revell, 2010). In this approach, when λ is forced to 0, PGLS is equivalent to ordinary (nonphylogenetic) least-squares regression (OLS), which assumes a star phylogeny in which residual variation is independent among species. When λ is forced to be equal to 1, PGLS is functionally equivalent to PIC regression (Felsenstein, 1985). However, when characters display different degrees of phylogenetic signal (0 < λ < 1), neither the OLS nor the PIC method is suitable given that they respectively underestimate and overestimate the influence of phylogeny (Hernández et al., 2013). In this study, the R package caper (Orme et al., 2012) was used to compute the three types of regression model with λ forced to equal 0, 1, and estimated values, with SLA and latitude as dependent variables and genome size as the independent variable. In addition, the significance of the relationship between pairs of comparisons was evaluated using a linear quantile regression model following Díez et al. (2013). We estimated the quantile regression functions from the 15th to the 95th quantile using the R package quantreg (Koenker, 2012).
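The following Python sketch shows how one regression covers all three cases. It is not the caper implementation: it only computes the generalized least-squares intercept and slope for a fixed λ, using a hypothetical four-species covariance matrix and hypothetical trait values, and it does not estimate λ by maximum likelihood.

```python
def lambda_transform(vcv, lam):
    """Multiply shared (off-diagonal) covariances by lambda."""
    n = len(vcv)
    return [[vcv[i][j] if i == j else lam * vcv[i][j] for j in range(n)]
            for i in range(n)]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def pgls(x, y, vcv, lam):
    """Return (intercept, slope) of y ~ x under a lambda-scaled covariance."""
    v = lambda_transform(vcv, lam)
    ones = [1.0] * len(x)
    vinv_ones = solve(v, ones)  # V^-1 1
    vinv_x = solve(v, x)        # V^-1 x
    # Normal equations (X' V^-1 X) beta = X' V^-1 y for X = [1, x]
    a11, a12, a22 = dot(ones, vinv_ones), dot(ones, vinv_x), dot(x, vinv_x)
    b1, b2 = dot(vinv_ones, y), dot(vinv_x, y)
    det = a11 * a22 - a12 * a12
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det

# Hypothetical ultrametric tree ((A,B),(C,D)) and hypothetical trait values
vcv = [[1.0, 0.5, 0.0, 0.0],
       [0.5, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.5],
       [0.0, 0.0, 0.5, 1.0]]
genome_size = [1.0, 2.0, 3.0, 4.0]  # independent variable
sla = [2.1, 3.9, 6.2, 7.8]          # dependent variable

ols_fit = pgls(genome_size, sla, vcv, 0.0)   # lambda = 0: ordinary least squares
pic_like_fit = pgls(genome_size, sla, vcv, 1.0)  # lambda = 1: full Brownian covariance
```

Because the example tree is ultrametric with unit root-to-tip length, setting λ = 0 turns the covariance matrix into the identity, which is why the first fit matches ordinary least squares.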

To further explore the correlation between genome size/SLA and climatic variation across latitudinal gradients, we extracted 19 bioclimatic variables, averaged across the georeferenced records of each species, from the WorldClim climate database (at 2.5 min scale; Hijmans et al., 2005). These bioclimatic variables (Table S3) represent summaries of the temperature and precipitation dimensions of the environment. The analyses were performed under both OLS and PGLS models with the caper package.

Results

The chromosome numbers of the 61 Primulina species are presented in Table S1 and the somatic chromosomes are illustrated in Fig. S2. Of these species, chromosome numbers were determined for the first time in this study for 56 species, and all have 2n = 36. The chromosome number of P. longgangensis estimated in this study is 2n = 36 (Fig. S2), which is consistent with that reported by Liu et al. (2012), but different from the report of Christie et al. (2012), suggesting intraspecific variation of ploidy levels in the species. Our results confirmed that chromosome numbers in Primulina are uniform, as reported before (Christie et al., 2012; Liu et al., 2012). The flow cytometry analysis yielded high-resolution histograms with a coefficient of variation (CV) < 5% for the majority of measurements (Table S1). Representative histograms are shown in Fig. S3. The 2C values assessed among Primulina species show a 2.27-fold difference between the lowest (Primulina huajiensis; 1.12 pg) and the highest (Primulina gueilinensis var. brachycarpa; 2.54 pg) (Table S1), with a mean value of 1.92 pg. Although substantial intraspecific variation in DNA content was detected for a few species (up to 41.23% for Primulina linearifolia), genome size is generally stable within species, with variation < 10% for most species (74 out of 101 species; Table S1).

Reconstructing ancestral genome size on the phylogeny indicated that the estimated ancestral genome size for the genus is 1.692 pg. Genome size was observed to both increase and decrease across the range of the genus, with a general increase northward (to high latitude) and decrease southward (to low latitude) (Fig. S4). SLA in Primulina species showed 25.05-fold variation, from 47.23 cm2 g−1 in Primulina ophiopogoides to 1183.27 cm2 g−1 in Primulina xiuningensis (mean SLA 282.14 cm2 g−1) (Table S1). Species with low SLA are always found in the southern part of the distribution of Primulina, while species with high SLA can be found throughout the range of the genus.

For genome size, SLA and latitude, the likelihood scores for model B were not significantly greater than those for model A, indicating that the evolution of these traits does not show any general trend toward either increase or decrease. Estimates of Pagel's λ indicated that all three variables exhibited a significant amount of phylogenetic signal, with λ significantly different from 0 (Table 1). The holoploid genome size (2C DNA) was most strongly influenced by the phylogenetic relationships (λ = 0.900), followed by latitude (λ = 0.819) and SLA (λ = 0.752). λ values of these variables also differed significantly from 1.0, implying that the evolution of these traits did not result from a pure drift process alone. The estimates of the parameter κ suggested that genome size and latitude evolved according to a gradual model of trait evolution with increased rates of evolution in shorter branches (0 < κ < 1; Table 1). By contrast, the evolution of SLA was consistent with a punctuated mode (κ = 0.245; not significantly different from 0; Table 1). Estimates of the δ parameter revealed a model of species-specific adaptation in the evolution of genome size (δ = 2.792), while the evolution of SLA and latitude had constant rates over time (Table 1; δ not significantly different from 1). Maximum likelihood tests of the continuous models showed that genome size and latitude best fit a lambda-based model, while SLA best fit the OU model. However, the OU model could not be distinguished from the lambda (P = 0.420) and kappa (P = 0.351) models for SLA (Table 2), and the lambda model could not be distinguished from the kappa model for latitude (P = 0.130).

Table 1. Likelihood ratio tests (LRTs) for the observed versus expected values of phylogenetic scaling parameters for different models of trait evolution of the Primulina genus

Trait                 Observed value   Log likelihood   P for LRT

Genome size
  Lambda λ
    λ estimated*      0.900            79.007
    λ forced = 1                       57.817           < 0.0001
    λ forced = 0                       48.167           < 0.0001
  Kappa κ
    κ estimated*      0.513            70.345
    κ forced = 1                       57.817           0.0004
    κ forced = 0                       60.338           0.0015
  Delta δ
    δ estimated*      2.792            63.041
    δ forced = 1                       57.817           0.022

SLA
  Lambda λ
    λ estimated*      0.752            −67.345
    λ forced = 1                       −79.204          0.0005
    λ forced = 0                       −77.658          0.0013
  Kappa κ
    κ estimated       0.245            −67.780
    κ forced = 1                       −79.204          0.0007
    κ forced = 0*                      −68.825          0.3066
  Delta δ
    δ estimated       1.944            −77.885
    δ forced = 1*                      −79.204          0.2507

Latitude
  Lambda λ
    λ estimated*      0.819            150.587
    λ forced = 1                       136.801          0.0002
    λ forced = 0                       120.833          < 0.0001
  Kappa κ
    κ estimated*      0.306            149.443
    κ forced = 1                       136.801          0.0003
    κ forced = 0                       120.833          < 0.0001
  Delta δ
    δ estimated       1.647            137.510
    δ forced = 1*                      136.801          0.3997

The observed parameters (λ, κ and δ) were contrasted with values expected under the null hypothesis (value = 0 and 1). When the observed models show no significant difference from expected models, the latter was selected. The selected models are marked with an asterisk (*). SLA, specific leaf area.

Table 2. Model selection statistics for the evolution of genome size, specific leaf area (SLA) and latitude of the Primulina genus

Model        Parameters    Log likelihood   k   AICc

Genome size
  BM                       57.817           2   −111.518
  Lambda*    λ = 0.939     79.007           3   −151.780
  Delta      δ = 2.792     63.041           3   −119.846
  Kappa      κ = 0.512     70.345           3   −134.455
  OU         α = 55.724    64.425           3   −122.614

SLA
  BM                       −81.385          2   166.887
  Lambda     λ = 0.753     −67.670          3   141.575
  Delta      δ = 1.788     −79.733          3   165.466
  Kappa      κ = 0.245     −67.780          3   141.795
  OU*        α = 98.302    −67.345          3   140.926

Latitude
  BM                       136.801          2   −269.485
  Lambda*    λ = 0.819     150.587          3   −294.939
  Delta      δ = 1.648     137.510          3   −268.785
  Kappa      κ = 0.306     149.442          3   −292.885
  OU         α = 62.927    143.811          3   −281.386

Log likelihood, logarithm of the maximized likelihood; k, total number of parameters in the model; AICc, second-order estimator of the Akaike information criterion; BM, pure Brownian motion; lambda (λ), Pagel's lambda; delta (δ), Pagel's delta; kappa (κ), Pagel's kappa; SLA, specific leaf area; OU, Ornstein–Uhlenbeck model. The best fitting models are marked with an asterisk (*).

Fig. 1 shows the phylogenetic tree of Primulina species with genome size and SLA trait information indicated for each species. The results of the trait correlation detected by different models in phylogenetic regression analyses are summarized in Table 3. Overall, results of PGLS were almost identical to those of OLS in both of the comparisons, although the PGLS yielded a weaker explanatory power (R2 = 0.050–0.068) than OLS (R2 = 0.137–0.271) (Table 3). A significant positive relationship between SLA and genome size was also detected by PIC regression (R2 = 0.128; P < 0.0001), but no relationship was found in the latitude and genome size comparisons (Table 3). In all analyses, the PGLS model had much higher likelihoods than the nonphylogenetic OLS (i.e. λ = 0) or PIC (i.e. λ = 1) (Table 3), and the LRTs showed that the PGLS model fitted the data better than both the OLS and PIC models (P < 0.0001). Significant positive correlations were detected for both of the comparisons by the PGLS model. These results are consistent with our linear quantile regression model (Fig. 2), in which both of the comparisons showed a significant positive relationship (P < 0.05) for all six quantiles considered (0.15, 0.30, 0.45, 0.60, 0.75 and 0.90).

Table 3. Summary of trait correlation as estimated by three types of linear regression models of the Primulina genus: ordinary least-squares regression (OLS; i.e. nonphylogenetic regression), phylogenetic independent contrasts (PIC) and phylogenetic generalized least-squares (PGLS)
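| Model | SLA–genome size: Lh | SLA–genome size: b | SLA–genome size: R² | SLA–genome size: P | Latitude–genome size: Lh | Latitude–genome size: b | Latitude–genome size: R² | Latitude–genome size: P |
|---|---|---|---|---|---|---|---|---|
| OLS (λ = 0) | 56.07 | 0.114 | 0.137 | **< 0.0001** | 65.075 | 1.026 | 0.271 | **< 0.0001** |
| PIC (λ = 1) | 65.15 | 0.099 | 0.128 | **< 0.0001** | 58.154 | −0.166 | 0.006 | 0.519 |
| PGLS (λ = MLE) | 82.53 | 0.063 | 0.068 | **0.0007** | 80.575 | 0.350 | 0.050 | **0.005** |

  1. MLE, maximum likelihood estimates of λ; Lh, log likelihood; b, slope; SLA, specific leaf area. P values significant at < 0.05 are in bold.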
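
The likelihood ratio tests reported for Table 3 can be illustrated from the tabulated log likelihoods. The sketch below is not the authors' code: it assumes each test compares PGLS with a model in which λ is fixed (OLS at 0, PIC at 1), so the statistic is referred to a chi-square distribution with 1 degree of freedom, and it uses the closed form for the 1-df chi-square tail to avoid external libraries.

```python
import math

def lrt_df1(lh_full, lh_restricted):
    """Likelihood ratio statistic and chi-square (1 df) tail probability."""
    stat = 2.0 * (lh_full - lh_restricted)
    # For 1 df, P(chi2 > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Log likelihoods (Lh) copied from Table 3
LH = {
    "SLA-genome size": {"OLS": 56.07, "PIC": 65.15, "PGLS": 82.53},
    "Latitude-genome size": {"OLS": 65.075, "PIC": 58.154, "PGLS": 80.575},
}

results = {}
for comparison, lh in LH.items():
    for restricted in ("OLS", "PIC"):
        results[(comparison, restricted)] = lrt_df1(lh["PGLS"], lh[restricted])

for (comparison, restricted), (stat, p) in results.items():
    print(f"{comparison}: PGLS vs {restricted}: LR = {stat:.2f}, P = {p:.2e}")
```

Under these assumptions all four comparisons give P well below 0.0001, in line with the reported result that PGLS fits better than both alternatives.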
Figure 1.

Genome size (white) and specific leaf area (SLA) (gray) mapped on a phylogenetic tree of 104 species of Primulina. Didymocarpus hancei and Petrocodon dealbatus are outgroups. #, data not available.

Figure 2.

Scatter plot and quantile regression showing the relationships between (a) specific leaf area (SLA) and genome size and (b) latitude and genome size for the Primulina species (n = 101). The dashed gray lines correspond to the quantiles (0.15, 0.30, 0.45, 0.60, 0.75 and 0.90), the solid blue line shows the median fit to the data, and the red line is the least-squares estimate of the conditional mean function.

The PGLS analyses further showed a significant correlation of genome size with most temperature-related variables, but no significant relationship of genome size with precipitation-related variables (Table S3). By contrast, under the PGLS model, SLA was significantly correlated with only two of the 19 climate variables analyzed, that is, mean diurnal range (mean of monthly (max temp – min temp)) (BIO2) and precipitation of warmest quarter (BIO18) (Table S3).


This is the first study to explore the variation and evolution of genome size and its correlation with ecological and geographic traits in the Karst floristic region. We have shown here that genome size, SLA, and latitude variation all display strong phylogenetic signals, but best fit different evolutionary models. The significant positive relationships detected between genome size and SLA and between genome size and latitude support an adaptive hypothesis of genome size evolution in Primulina. Our main conclusion is that genome size in Primulina is phylogenetically conserved but its variation among species represents a combination of both neutral (genetic drift) and adaptive evolution.

Phylogenetic signal and trait evolution

The strong phylogenetic signals detected in all three traits of Primulina suggest that patterns of trait variation were not random but rather associated with phylogenetic divergence. It has been suggested that phylogenetic signal may vary across different phylogenetic/taxonomic scales (Kamilar & Cooper, 2013). Numerous studies have found strong phylogenetic dependence of genome size at higher taxonomic scales (Beaulieu et al., 2007b; Knight et al., 2010; Whitney et al., 2010). However, to date, only a few studies have quantified the strength of phylogenetic signal for plant genome size. Bainard et al. (2012) reported a strong phylogenetic signal for genome size of 41 herbaceous plant species representing 17 families (λ = 0.74 and 0.54 for 2C and 1C DNA content, respectively). At genus level, strong phylogenetic signals in genome size were detected in Orobanche (λ = 1; Weiss-Schneeweiss et al., 2006), Hieracium (λ = 0.908; Chrtek et al., 2009) and Filago (λ = 0.934; Andrés-Sánchez et al., 2013). This evidence together with our finding of strong phylogenetic signal in Primulina suggests that genome size is phylogenetically conserved among closely related species in general.

The strong phylogenetic signals detected for both genome size and SLA suggest that the observed variation of these traits in Primulina is probably a result of genetic drift or neutral evolution; however, a pure drift model alone is not sufficient to explain the patterns. The deviation from a pure stochastic process of evolution, as evidenced by the fit of the SLA data to a single-optimum OU model and a punctuated mode of SLA evolution, indicates that different degrees of adaptation probably occurred in some branches of the phylogeny. This would not be a surprise because SLA has long been known to be a functional trait with adaptive value in special soil habitats such as serpentine and limestone (Westoby et al., 2002; Ackerly & Monson, 2003). SLA has been shown to be an important adaptive trait in Karst flora, where leaves of Karst plants generally have lower SLA than those of non-Karst plants (Cheng et al., 2011; Guan et al., 2011; Wei et al., 2011).

Our assessment of the tempo of genome size evolution revealed an increase in the rate of genome size change in recent phases of the evolution of the genus. This suggests that genome size variation has been shaped by species-specific adaptation in Primulina. Speciation in Primulina has most likely involved species-specific adaptation to the various microhabitats of the Karst region. Species-specific adaptive evolution of genome size was also inferred in Allium subgenus Melanocrommyum (Gurushidze et al., 2012). However, the elevated rate of genome size evolution could also be explained by severe genetic drift as a result of small populations localized in these microhabitats. The strong phylogenetic signal detected for genome size in the genus provides evidence congruent with the drift hypothesis.

Correlated evolution of the traits

We tested correlated evolution of genome size, SLA and latitude with three different methods: the nonphylogenetic OLS, phylogenetic independent contrasts (PIC), and phylogenetic generalized least-squares (PGLS). Our LRTs showed that the PGLS model fits the data better than either the OLS or PIC model. Therefore, in the following discussion, we refer mainly to results from the PGLS model with comparisons to those from OLS and PIC.
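
The link between the three regression models can be made concrete through Pagel's λ, which rescales the off-diagonal entries of the phylogenetic covariance matrix. The sketch below uses a hypothetical three-species covariance matrix (not taken from the Primulina tree) to show that λ = 0 removes all phylogenetic covariance, as assumed by OLS, whereas λ = 1 leaves the Brownian motion covariance unchanged, as assumed by PIC. PGLS estimates λ between these extremes by maximum likelihood.

```python
def lambda_transform(cov, lam):
    """Multiply off-diagonal covariances by lam; leave variances untouched."""
    n = len(cov)
    return [
        [cov[i][j] if i == j else lam * cov[i][j] for j in range(n)]
        for i in range(n)
    ]

# Hypothetical covariance matrix: species A and B share 60% of their
# root-to-tip path, species C shares none with either.
toy_cov = [
    [1.0, 0.6, 0.0],
    [0.6, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

ols_like = lambda_transform(toy_cov, 0.0)      # independent species (OLS)
pic_like = lambda_transform(toy_cov, 1.0)      # full Brownian covariance (PIC)
intermediate = lambda_transform(toy_cov, 0.5)  # an arbitrary intermediate value

for name, mat in [("lambda=0", ols_like), ("lambda=1", pic_like),
                  ("lambda=0.5", intermediate)]:
    print(name, mat)
```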

Genome size and SLA

SLA is a leaf density parameter that measures the cost, in terms of dry mass, of photosynthetically active leaf surfaces (Wright et al., 2004). Much of the biomass of plant material is composed of cell walls, and there is a strongly positive relationship between genome size and cell size (Beaulieu et al., 2008). Small cell volumes associated with small genome sizes should have a larger ratio of cell wall to unit volume, which leads to the prediction that genome size should be positively correlated with SLA. In accordance with the theoretical prediction, a significant positive correlation between genome size and SLA was found in the genus Primulina under all three regression models (OLS, PIC and PGLS), although the relationship was much stronger under the less optimal OLS and PIC models than under the PGLS model (R² = 0.068–0.137; Table 3). It is possible that small cell volume is favored by selection for nutrient conservation in the Karst region, resulting in the positive correlation of the two traits. Thus, evolution of genome size may also have been adaptive and associated with SLA, as already discussed.
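
The geometric step in this argument can be illustrated with an idealized cell. The sketch below assumes a spherical cell, which is a simplification introduced here and not part of the original analysis: wall area per unit cell volume then equals 3/r, so smaller cells carry more wall material per unit volume.

```python
import math

def wall_area_per_volume(radius):
    """Surface area divided by volume for an idealized spherical cell."""
    area = 4.0 * math.pi * radius ** 2
    volume = (4.0 / 3.0) * math.pi * radius ** 3
    return area / volume  # simplifies to 3 / radius

# Arbitrary radii in arbitrary units, chosen only for illustration
for radius in (1.0, 2.0, 4.0):
    print(f"radius = {radius}: wall area per volume = "
          f"{wall_area_per_volume(radius):.3f}")
```

Doubling the radius halves the ratio, which is the sense in which small cells are expected to produce denser leaf tissue and hence lower SLA.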

Positive relationships between genome size and SLA have also been found in an analysis of 249 angiosperms across various clades of the angiosperm phylogeny under the nonphylogenetic OLS model (Knight & Beaulieu, 2008). However, the authors found a significant negative correlation between genome size and SLA when the data set was analyzed under the PIC model (Beaulieu et al., 2007a; Knight & Beaulieu, 2008). These investigations demonstrated inconsistency between nonphylogenetic regression and PIC analyses in assessing correlations of trait evolution at deep evolutionary divergences (Knight & Beaulieu, 2008). To date, few studies have examined correlated evolution between genome size and SLA for a clade of recently derived species. A recent study by Gallagher et al. (2011) found a significant positive relationship between genome size and SLA in an analysis of 92 Acacia species using nonphylogenetic regression, but the relationship disappeared when the PIC method was employed. In another study, of Pinus, a significant negative relationship between genome size and SLA was found with both nonphylogenetic regression and PIC methods (Grotkopp et al., 2004). In our study, a significant correlation between genome size and SLA was found with all three methods of regression analysis. Together, these results suggest that the relationship is sensitive to analytical methods and thus highlight the importance of taking phylogenetic history into account when explaining correlated evolution of traits. Nevertheless, the positive relationship between SLA and genome size in Primulina, detected by multiple methods, is probably robust, suggesting that selection on SLA might have also acted on the evolution of genome size.

Genome size and latitude

The correlation of plant traits with latitudinal gradients is often taken as a signature of adaptive evolution. Although the relationship between genome size and latitude has been explored across different taxonomic scales in several studies (Levin & Funderburg, 1979; Bottini et al., 2000; Hall et al., 2000; Joyner et al., 2001; Knight & Ackerly, 2002), phylogeny-based analyses of correlated evolution of these traits are scarce. In a phylogenetic analysis, Grotkopp et al. (2004) reported a significant negative relationship between genome size and northern latitudinal limit in Pinus. In our study, the significant positive relationship between genome size and latitude in Primulina was detected in analyses of data considering the geographic range of species distribution and intraspecific variation of genome size with both PGLS and OLS models (Table 3). The contrasting results between this study and that of Grotkopp et al. (2004) may be attributed to any number of reasons, including differences between gymnosperms and angiosperms, geographic regions (high versus low latitude regions), and growth habits (tree versus herb). Complex patterns of selection along latitudinal gradients and/or variation in sensitivity to climatic conditions among lineages may result in taxon-specific relationships.

The results of this study suggest a possible ecological adaptation in genome size evolution that was associated with phylogenetic divergence across the latitudinal range of the genus. To our knowledge, this is the first study of the geographic pattern of genome size variation in tropical/subtropical China, where limestone Karst habitats harbor an extremely high species richness and endemism of plants. The positive correlation between genome size and latitude in Primulina is consistent with previous studies that demonstrated a pattern of variation of genome size along ecogeographic gradients, which has been argued as evidence of adaptive evolution along latitudinal gradients (Knight & Ackerly, 2002; Knight et al., 2005). Our further regression analysis of genome size with climatic variables in Primulina indicated that the selective forces were probably temperature-related rather than precipitation-related (Table S3). In addition, our results support the hypothesis that small genome size evolves as an adaptation to stressful environments. The Karst environment is characterized by poor soil development, drought and heat stress. Under such conditions, plant uptake of nitrogen and phosphorus may have been restricted (Hu et al., 2013), and therefore small genome size might offer a selective advantage for species in the low latitudes by reducing the nutrient and energy cost associated with seedling establishment and growth in hot and dry habitats. We did find that species occurring in non-Karst moist habitats generally have larger genome sizes than those in limestone habitats (Table S1). This hypothesis was also supported by our ancestral state reconstruction of genome sizes, in which we observed a general southward (to low latitude) reduction in genome size from ancestral to extant states.

To summarize, we have presented large-scale genome size estimates based on extensive sampling of 101 species, representing c. 70% of the species of Primulina. The study illustrates the power of dense species sampling at a fine scale for better understanding of genome size evolution. Further studies addressing the underlying mechanisms of correlated evolution between traits are necessary to shed light on the evolutionary forces driving genome size variation.


We thank Guofeng Li, Bo Pan, Yan Liu and Taijiu Zhou for their collaboration in the field investigation and sample collection. Many thanks to Jun Chen and Hui Liu for their help with data analyses. We also thank Richard Abbott and the anonymous reviewers for their valuable comments on the manuscript. We are indebted to Norman Douglas for his critical reading of the manuscript and correction of English. This work was supported by the Natural Science Foundation of China (31270427).