The California Floristic Province exhibits one of the richest floras on the planet, with more than 5500 native plant species, approximately 40% of which are endemic. Despite its impressive diversity and the attention it has garnered from ecologists and evolutionary biologists, historical causes of species richness and endemism in California remain poorly understood. Using a phylogenetic analysis of 16 angiosperm clades, each containing California natives in addition to species found only outside California, we show that CA's current biodiversity primarily results from low extinction rates, as opposed to elevated speciation or immigration rates. Speciation rates in California were lowest among Arcto-Tertiary lineages (i.e., those colonizing California from the north, during the Tertiary), but extinction rates were universally low across California native plants of all historical, geographic origins. In contrast to long-accepted ideas, we find that California diversification rates were generally unaffected by the onset of the Mediterranean climate. However, the Mediterranean climate coincided with immigration of many desert species, validating one previous hypothesis regarding origins of CA's plant diversity. This study implicates topographic complexity and climatic buffering as key, long-standing features of CA's landscape favoring plant species persistence and diversification, and highlights California as an important refuge under changing climates.

Biodiversity varies dramatically across space. Nearly half of all plant species and over one third of all vertebrate species are endemic to “biodiversity hotspots” that cover only 1.4% of the Earth's available land (Myers et al. 2000). Understanding the mechanisms underlying such sharp gradients in biodiversity is a central goal in ecology and evolution, and could facilitate conservation efforts for species-rich regions. Biodiversity hotspots may be centers of origin (Mora et al. 2003), accumulation (Ladd 1960; Bellwood and Meyer 2009), and/or preservation (Stebbins 1974) of species, corresponding to factors that promote high speciation rates, high immigration rates, and low extinction rates, respectively. Because these factors are likely to vary in importance among lineages and across time scales, comparative data on multiple lineages inhabiting a region are required to understand which processes have been most important there. To date, there is little consensus on the causative factors responsible for most biodiversity hotspots.

The California Floristic Province (CFP) is a temperate biodiversity hotspot with over 5500 plant species, 40% of which are endemic (Myers et al. 2000). However, only about one quarter of the CFP's ecosystems remain in relatively pristine condition (Conservation International 2010), so understanding the causes of the CFP's high biodiversity is imperative to motivate conservation strategies. Furthermore, California (CA) has been a center of plant evolutionary and ecological research over the last century, including work by the early and influential biosystematists (reviewed in Smocovitis 1997), ecologists (e.g., Whittaker 1960), and botanists (reviewed in Ertter 2004), as well as many present-day biologists. It is thus one of the few biodiversity hotspots where there is sufficient data across diverse lineages to attempt a comprehensive comparative analysis of diversification patterns.

Hypotheses for the high species richness of the CFP include several factors that may increase speciation, lower extinction, increase immigration, or typically, affect some combination of these processes.

The recent change to a winter-wet, summer-dry Mediterranean climate is considered a primary contributor to high biodiversity and endemism in CA. The transition to a Mediterranean climate regime began approximately 2 to 5 Ma (Axelrod 1973; Suc 1984) when decreasing ocean temperatures and altered patterns of ocean and atmospheric circulation affected west-facing coastal regions between 30° and 50° latitude in both hemispheres. Speciation may then have accelerated due to strong selection favoring annual life histories and shifts in reproductive phenology (Raven 1973; Raven and Axelrod 1978), and immigration of warm-adapted lineages may have increased (Raven and Axelrod 1978; Ackerly 2009). Accelerated speciation under the recently derived Mediterranean climate is primarily evidenced by high biodiversity and neoendemism in all five disjunct regions with Mediterranean climates worldwide, although hypotheses for potential mechanisms are not well developed.

CA also displays a complex and varied topography. It has some of the highest and the lowest points in North America, and a concomitant variety of edaphic zones, including serpentine soils and other heavily mineralized soil types, sedimentary basins, shale outcrops, and granitic soils. Topographic and edaphic variation is positively correlated with plant α and β diversity within CA at the local scale (Stebbins and Major 1965; Richerson and Lum 1980; Harrison and Inouye 2002), suggesting that these factors have played an important role in driving diversification. Sharp elevational and edaphic gradients can present both barriers to gene flow and divergent ecological niches. Disruptive selection combined with reduced gene flow may accelerate speciation rates, in comparison to less topographically complex regions (Stebbins and Major 1965; Kruckeberg 1986; Anacker et al. 2010; Kay et al. 2011).

Topographic complexity may also reduce extinction by providing both climatic buffering and multiple niches for tight species packing. Species that were more widespread prior to the Sierra Nevadan uplift may have persisted in the CFP, and gone locally extinct under desertification in the rain shadow to the east of this mountain range. For example, the genera Sequoia and Sequoiadendron, which respectively contain the coast redwood and the giant sequoia, were once more widespread across North America, but are now narrow CA paleoendemics (Raven and Axelrod 1978). Topographic complexity also may buffer against extinction under the extensive climate change in recent epochs because species can move up and down in response to fluctuating global climates, thus escaping extremes of heat, cold, or aridification at much shorter dispersal distances than for species inhabiting flatter regions (Loarie et al. 2008). Finally, with its wide variety of climatic and edaphic zones, the CFP may function as a center for accumulation for plant lineages originating in varied habitats worldwide.

In 1978, Raven and Axelrod (hereafter, R&A) published their landmark study on the origins of the CA flora. Using fossil floras and current distributions of CA native plant species, R&A analyzed CA's extant diversity as a function of its reconstructed paleobotanical history. Their analysis provided insight into the broad patterns behind regional and local patterns of diversity, and many of their conclusions are well accepted, although they have not been reevaluated in a phylogenetic context.

From the available fossil data, R&A identified much of the current CA flora as modern descendants of two ancestral floras that had overlapping ranges in CA during the Tertiary: the Arcto- and Madro-Tertiary geofloras. Extant CA representatives of the Arcto-Tertiary geoflora are relicts of a mesic forest assemblage that covered much of North America throughout the Tertiary; most descendant species of this geoflora are found in northern temperate regions today. Modern representatives of the Madro-Tertiary geoflora are descended from subtropical, semiarid lineages that are often sclerophyllous and fire-adapted, and which became adapted to warmer and drier climates during the hottest periods of the Tertiary (Axelrod 1958). Both geofloras were much more widespread in the past than they are today, and their ranges currently do not overlap, except within a few isolated regions, including the CFP.

Because the fossil record for many plant lineages is limited, R&A also separately classified plant groups by their current distributions, or patterns of biogeographic affinity. They suggested that Arcto-Tertiary ancestry of CA genera can often be inferred from predominantly North Temperate extant relatives (North Temperate biogeographic affinity; Table 4 in R&A). Modern CA representatives of Madro-Tertiary lineages, in turn, are often heavily represented in the CFP (CFP association; Table 6 in R&A) or may share ancient or more recent ties with lineages found presently in the Mediterranean basin (Mediterranean biogeographic affinity; Table 5 in R&A; Axelrod 1975). Other common CA lineages are not derived from either ancestral geoflora, but were thought to have invaded CA from southern deserts in more recent times (Warm Temperate/Desert biogeographic affinity; Table 8 in R&A).

Having categorized the CFP flora, R&A proposed that factors hypothetically affecting speciation, extinction, and immigration rates varied in importance across these different types of plant lineages and across different time periods. According to R&A, lineages derived from the Madro-Tertiary geoflora or with a Mediterranean biogeographic affinity were particularly likely to contribute to the burst of speciation in CA following the onset of the Mediterranean climate, because these lineages were preadapted to hot, dry conditions, and therefore have likely produced many of CA's neoendemics (Stebbins and Major 1965). In contrast, speciation of Arcto-Tertiary or North Temperate lineages may not have been positively affected by this climate change event. R&A further predicted that the onset of the Mediterranean climate bolstered CA's species richness by facilitating immigration of Warm Temperate/Desert affiliated clades in the past few million years. Finally, R&A predicted that Arcto-Tertiary lineages (or those with North Temperate biogeographic affinities) were most likely to have benefited from reduced extinction in the equable climate of CA. Higher elevations and cooler, wetter climatic niches created by the Coast, Klamath, and Sierra Nevada ranges may have permitted the persistence of many of these cold/wet-adapted lineages (Raven and Axelrod 1978), such that Arcto-Tertiary lineages produced many but not all of CA's paleoendemic species (Stebbins and Major 1965; Raven and Axelrod 1978).

We tested each of these often-cited hypotheses for the origins of CA's biodiversity by reconstructing time-calibrated phylogenies with publicly available sequence data for a selection of 16 monophyletic angiosperm clades and comparing historical speciation, extinction, and migration rates inside and outside CA (Table 1). Our 16 clades each contain lineages that have colonized and diversified within CA, as well as lineages that inhabit other regions, and overall included a total of 444 sampled CA species (176 of which are endemic to CA) and 1243 sampled non-CA species, from 114 genera and 14 families.

Table 1.  Clade biogeographic, species richness, diversification, and colonization data. Current and historic geographic affinities for each clade, based on Raven and Axelrod (1978), species richness and sampling within each clade, and estimated diversification and migration rates (in units of lineage/lineage/million years for λ0,1 and q01,10, and in units of extinction event/speciation event for ɛ0,1). Error terms are time series standard errors calculated from 990,000 MCMC steps.
Column groups: geoflora and current geographic relationships (Geoflora; Biogeographic affinity; Strongly associated with CFP?), species richness and sampling (total CA native species richness; proportion native CA species sampled; total non-CA native species richness; proportion non-CA native species sampled), estimated diversification and migration rates (λ0, λ1, ɛ0, ɛ1, q01, q10), and estimated age of CA residency (Ma).

| Clade | Geoflora | Biogeographic affinity | Strongly associated with CFP? | Total CA native species richness | Proportion native CA species sampled | Total non-CA native species richness | Proportion non-CA native species sampled | λ0 | λ1 | ɛ0 | ɛ1 | q01 | q10 | Estimated age of CA residency (Ma) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Antirrhineae (Plantaginaceae) | Neither | Mediterranean | No | 21 | 0.81 | 307 | 0.20 | 1.25±0.008 | 0.56±0.003 | 0.95±0.001 | 0.34±0.001 | 0.01±0.000 | 0.34±0.002 | 6.23 |
| Arbutoideae (Ericaceae) | Madro-Tertiary | Mediterranean | No | 65 | 0.68 | 22 | 0.55 | 0.30±0.006 | 0.31±0.011 | 0.73±0.005 | 0.72±0.010 | 0.08±0.005 | 0.02±0.002 | 26.31 |
| Artemisia (Asteraceae) | Arcto-Tertiary | North temperate | No | 18 | 0.89 | 432 | 0.31 | 0.45±0.006 | 0.54±0.008 | 0.73±0.006 | 0.52±0.006 | 0.01±0.000 | 0.39±0.008 | 7.17 |
| Chironiinae (Gentianaceae) | Neither | Mediterranean | No | 13 | 0.62 | 95 | 0.64 | 0.49±0.003 | 0.80±0.007 | 0.58±0.003 | 1.53±0.007 | 0.47±0.002 | 2.22±0.009 | 4.38 |
| Coreopsideae (Asteraceae) | Madro-Tertiary | Warm Temperate/Desert | Yes | 12 | 0.92 | 478 | 0.26 | 2.38±0.009 | 0.30±0.002 | 0.96±0.000 | 0.66±0.002 | 0.00±0.000 | 0.10±0.001 | 12.25 |
| Ericameria (Asteraceae) | Neither | Warm Temperate/Desert | No | 28 | 1.00 | 17 | 0.94 | 0.30±0.005 | 0.39±0.005 | 0.91±0.008 | 0.53±0.006 | 1.37±0.011 | 0.94±0.008 | 7.86 |
| Lepidium (Brassicaceae) | Neither | Warm Temperate/Desert | No | 16 | 0.94 | 159 | 0.60 | 0.96±0.023 | 1.92±0.024 | 0.73±0.013 | 0.80±0.014 | 0.27±0.003 | 2.48±0.017 | 1.25 |
| Lotus (Fabaceae) | Madro-Tertiary | Mediterranean | No | 30 | 0.63 | 120 | 0.71 | 0.49±0.006 | 0.28±0.003 | 0.95±0.001 | 0.65±0.002 | 0.00±0.000 | 0.07±0.001 | 44.99 |
| Lupinus (Fabaceae) | Arcto-Tertiary | North temperate | No | 75 | 0.41 | 175 | 0.49 | 1.64±0.020 | 1.74±0.021 | 0.96±0.003 | 0.66±0.003 | 0.02±0.001 | 0.36±0.006 | 9.77 |
| Lycieae (Solanaceae) | Neither | Warm Temperate/Desert | No | 9 | 0.89 | 80 | 0.46 | 0.28±0.002 | 0.40±0.003 | 0.31±0.002 | 1.48±0.004 | 0.28±0.001 | 1.88±0.007 | 1.37 |
| Phrymoideae (Phrymaceae) | Madro-Tertiary | Unknown | Yes | 83 | 0.73 | 66 | 0.64 | 0.21±0.002 | 0.33±0.003 | 1.10±0.009 | 0.48±0.005 | 0.09±0.005 | 0.15±0.004 | 19.30 |
| Polemoniaceae | Madro-Tertiary | North temperate | Yes | 184 | 0.66 | 201 | 0.67 | 0.30±0.003 | 0.28±0.002 | 0.72±0.008 | 0.74±0.013 | 0.13±0.004 | 0.14±0.004 | 12.54 |
| Salvia (Lamiaceae) | Madro-Tertiary | Mediterranean | Yes | 18 | 0.67 | 900 | 0.13 | 1.78±0.011 | 0.61±0.004 | 0.88±0.001 | 0.60±0.003 | 0.00±0.000 | 0.17±0.001 | 14.76 |
| Saniculoideae (Apiaceae) | Arcto-Tertiary | North temperate | No | 27 | 0.59 | 254 | 0.57 | 0.30±0.004 | 0.15±0.001 | 0.58±0.005 | 0.45±0.003 | 0.00±0.000 | 0.03±0.000 | 24.60 |
| Sidalcea and related species (Malvaceae) | Arcto-Tertiary | North temperate | Yes | 25 | 0.84 | 31 | 0.58 | 0.11±0.002 | 0.08±0.001 | 0.65±0.005 | 0.93±0.007 | 0.92±0.009 | 1.13±0.012 | 35.93 |
| Visceae (Santalaceae) | Arcto-Tertiary | North temperate | No | 16 | 0.94 | 524 | 0.14 | 0.60±0.006 | 1.95±0.012 | 0.93±0.001 | 0.76±0.004 | 0.04±0.000 | 1.70±0.014 | 1.05 |
| Combined clades | | | | | | | | 0.73±0.002 | 0.30±0.002 | 0.93±0.000 | 0.57±0.002 | 0.00±0.000 | 0.09±0.000 | |

Using the Binary-State Speciation and Extinction (BiSSE) model of diversification (Maddison et al. 2007), we tested whether species native to CA have diversified more rapidly in comparison to non-CA natives from the same clades, and if so, whether elevated CA diversification rates are due to elevated speciation rates in CA, decreased extinction rates in CA, or both. We also estimated the rates of migration to and from CA within each clade, to determine if historical patterns of dispersal contribute to CA's high extant plant biodiversity.

We estimated these speciation, migration, and extinction rates separately for each clade (separate-clades rates) and simultaneously across all 16 clades (combined clades rates). Combining clades increases our power to detect differences across the CA border and transcends idiosyncratic patterns of diversification and migration within any particular lineage. Using our separate-clades rates, we further examined whether speciation, extinction, and immigration rates within CA vary across lineages according to R&A's ancestral geoflora or biogeographical affinity classifications. We also examined how the date of CA colonization varies according to ancestral geoflora and biogeographic affinity: Arcto- and Madro-Tertiary clades are predicted to have been present in CA the longest, and Warm Temperate/Desert clades the shortest. Finally, we analyzed temporal variation in diversification rates within CA native subclades to identify potential effects of the Mediterranean climate on plant diversification in CA.



We selected 16 angiosperm clades for phylogenetic analysis that are well-represented both in CA and beyond, and which contain at least four to five CA endemic species (such that some diversification can be presumed to have occurred within CA). Information on endemism was obtained from http://www.calflora.org. Our selected clades were initially identified from Tables 4–6 and 8 within R&A. We required that clades should be well-sampled for ITS sequences both within CA and beyond, and that fossil, vicariance, or other substitution rate calibration data be available. The proportion of species sampled from each clade ranged from 0.41 to 1.00 (mean of 0.76) for CA species and 0.13 to 0.94 (mean of 0.49) for non-CA species. Geofloras and biogeographic affinities of each clade were assigned based on the tables and text of R&A and are presented in Table 1.

We estimated phylogenetic relationships separately for each clade, using internal transcribed spacer (ITS) nrDNA sequences, which are commonly available for plants and evolve in a relatively clock-like manner (Baldwin et al. 1995; Kay et al. 2006). ITS sequences were obtained from GenBank (accession numbers provided in Table S1) and aligned in MUSCLE v.3.7 (Edgar 2004). We checked alignments by eye in Mesquite v.2.6 and 2.7 (Maddison 2009) and excluded any coding (18S, 5.8S, and 26S nrDNA) or poorly aligned regions from the analysis. We used MrModelTest v.2.3 (Nylander 2004) to determine the appropriate model of nucleotide substitution for each clade with hLRT criteria (Table S2). We then estimated phylogenetic relationships using Bayesian Markov Chain Monte Carlo (MCMC) analysis implemented in BEAST v.1.4.8 and 1.5.0 (Drummond and Rambaut 2007), with a lognormally distributed relaxed molecular clock model (Drummond et al. 2006) and a birth-death prior in estimating divergence times. For each clade, we obtained at least two fossil or vicariance calibration dates from the literature, presented in Table S2. The only exception is Lepidium, for which calibration dates were not available; therefore we used a previously published rate estimate for Brassicaceae to calibrate nodes within that clade. Phylogenies were evaluated with a chain length of 10,000,000 states, increased to as high as 50,000,000 states for clades with low effective sample sizes for posterior distributions. For each clade, two BEAST runs were compared for convergence and then combined, after removing a burn-in of 10% from each run. A set of 1000 trees was saved for each clade, so that we could evaluate and account for the effects of variation in estimated tree topology and branch lengths on diversification analyses.


We analyzed state-dependent diversification using the Binary-State Speciation and Extinction model (BiSSE; Maddison et al. 2007), which allows transitions to occur between the two states of a binary character (at rates q01 and q10), and for speciation (λ0, λ1) and extinction (μ0, μ1) rates to depend on character states. Our character state of interest in applying BiSSE was “Native to CA?” (yes = 1, no = 0), assigned based on the designations in the CalFlora database (http://www.calflora.org). This database assigns native status based on CA state borders, and not CFP borders, although most species native to CA are native to the CFP. We avoided including clades primarily restricted to the desert or Great Basin provinces (i.e., clades native to CA but not to the CFP). Species recently naturalized in CA were assigned as non-native. At the time of our sampling, the CalFlora database used the Jepson Manual I (Hickman 1993) as the taxonomic authority for CA species; name changes reflected in the second edition of the Jepson Manual (Baldwin et al. 2012) are presented in Table S1. Most name changes do not affect our analysis, with the exception of Arceuthobium (tribe Visceae), in which seven CA native species have now been reduced to subspecies (Nickrent 2012). This may have caused us to overestimate speciation rates in Visceae, which had the highest rates of speciation in CA of all of our clades (Table 1).

Note that our BiSSE analysis overestimates the rate of speciation in CA, because it assumes that CA native species arose in CA, when in fact the speciation event may have occurred in another part of the species range. However, given our results (of lower speciation rates in CA than elsewhere), our conclusions are not compromised by this limitation of the analysis. GeoSSE (Goldberg et al. 2011) represents a possible alternative analysis, wherein state-dependent diversification rates are calculated specifically in reference to alternative geographic states. Use of GeoSSE would therefore avoid overestimation of speciation rates in CA. However, GeoSSE has other limitations, which we wished to avoid. First, when estimating μ0,1, GeoSSE fails to discriminate between species-level extinctions versus local extirpations. Our BiSSE estimates avoid this limitation by including only species-level extinctions in estimates of μ0,1, and estimating the rate of local extirpations from CA as q10, thus effectively distinguishing between these two processes. Another limitation of GeoSSE is that, when estimating q01,10, it does not distinguish between colonization of new niches versus niche filling following speciation. Because all speciation events within CA native lineages are ascribed to CA in our BiSSE analysis, our estimates of q01 capture only primary CA colonization events and are not contaminated by estimates of re-invasion via niche filling (i.e., when a wide-ranging CA native lineage speciates outside of CA, and daughter species subsequently reestablish themselves throughout their immediately ancestral range). Because we were more interested in testing the impact of the CFP on lineage-level speciation, extinction, and colonization events than we were in recreating the behavior of individual species and populations across a geographic border, BiSSE was ultimately a better fit for our study.

We applied BiSSE to our posterior set of phylogenetic trees for each clade, using Bayesian techniques to estimate posterior probability distributions for each rate parameter (FitzJohn et al. 2009); rates are assumed to be constant over time and across branches within the clade. All rates are calculated as events per lineage, per million years. We implemented BiSSE in the diversitree module (FitzJohn et al. 2009) of R (R Development Core Team 2012) and applied a correction for incomplete sampling of phylogenies, under the “skeletal tree” scenario, which includes the probability that an extant species is sampled in estimating the diversification and extinction models (FitzJohn et al. 2009). Sampling probabilities/frequencies for each character state are presented in Table 1.

We estimated diversification rates over 1000 trees per clade, selected at even intervals throughout the combined BEAST runs. Estimating diversification rates over multiple topologies allowed us to test for alternative peaks in the likelihood surface for diversification rates, and to evaluate the effects of topology and branch length uncertainty. In our combined clades analysis, we generated combined rate estimates over all 16 of our clades, by fitting joint likelihood functions. To do this, we randomly selected one tree from each clade, and multiplied its likelihood with that of a single, randomly selected tree from every other clade. We repeated this process 1000 times to generate 1000 unique tree combinations. For each of these separate and combined trees, we calculated BiSSE diversification rates using MCMC in diversitree, with 1000 steps per tree and an exponential prior. One thousand steps resulted in convergence of estimates, but we confirmed this with a supplementary analysis using 10,000 steps per tree, to compare our results to those obtained with a longer MCMC chain. Rates were virtually identical between the two analyses, and the direct comparison is presented in Table S3. After evaluating and removing a burnin of 10 steps per tree, a total of 990,000 MCMC samples per clade (and for combined clades) were analyzed for comparative diversification rates in CA versus elsewhere. We examined three-dimensional scatterplots of speciation, extinction, and migration estimates derived from a random subset of our MCMC samples, to check for the presence of multiple likelihood peaks, which could confound our estimates (Valente et al. 2010). We found no evidence for multimodality.
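The joint-likelihood construction described above can be sketched in a few lines. This is a minimal Python illustration under stated assumptions, not the diversitree code used in the study: the function names `combined_log_likelihood` and `draw_tree_combinations` are hypothetical, and each per-clade likelihood is represented by a placeholder callable. Multiplying likelihoods across clades at one shared parameter set is the same as summing their log-likelihoods.

```python
import random


def combined_log_likelihood(clade_loglik_fns, params):
    """Joint log-likelihood for one shared parameter set.

    Multiplying per-clade likelihoods equals summing their logarithms.
    Each element of clade_loglik_fns is a (hypothetical) callable that
    returns the log-likelihood of one clade's tree given params.
    """
    return sum(fn(params) for fn in clade_loglik_fns)


def draw_tree_combinations(trees_per_clade, n_combinations, seed=0):
    """Draw unique combinations holding one tree index per clade.

    trees_per_clade lists how many posterior trees each clade has.
    """
    total = 1
    for n in trees_per_clade:
        total *= n
    if n_combinations > total:
        raise ValueError("not enough distinct combinations available")
    rng = random.Random(seed)
    seen = set()
    while len(seen) < n_combinations:
        seen.add(tuple(rng.randrange(n) for n in trees_per_clade))
    return sorted(seen)


# Sample bookkeeping from the text: 1000 trees (or tree combinations),
# 1000 MCMC steps each, with the first 10 steps discarded per tree.
n_trees, steps_per_tree, discarded_steps = 1000, 1000, 10
retained_samples = n_trees * (steps_per_tree - discarded_steps)
```

With 16 clades and 1000 trees each, `draw_tree_combinations([1000] * 16, 1000)` yields the 1000 unique combinations, and `retained_samples` reproduces the 990,000 samples reported per analysis.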

For further statistical analysis of diversification rates, we performed the following transformations: for extinction parameters, we first calculated relative extinction (i.e., extinction relative to speciation rates in each region: ɛ0 = μ0/λ0 and ɛ1 = μ1/λ1), because estimates of μ were correlated with underlying speciation rates, whereas estimates of ɛ were not (i.e., a regression of μ0 on λ0 across our clades results in r² = 0.99, P < 0.0001; μ1 regressed on λ1 results in r² = 0.72, P < 0.0001. Alternatively, ɛ0 regressed on λ0 results in r² = 0.04, P = 0.44 and ɛ1 on λ1 results in r² = 0.007, P = 0.76). By using relative extinction rates, we avoid confounding our extinction results with effects arising from speciation rate differences. In addition to evaluating values of λ0,1, ɛ0,1, and q01,10, we calculated net diversification rates in each region (r0 = λ0 − μ0 and r1 = λ1 − μ1). Both r0,1 and ɛ0,1 were estimated within each MCMC sample, and then averaged across samples. All reported error terms are time-series standard errors, calculated using the coda package in R (Plummer et al. 2009). We evaluated for potential upward or downward bias in our combined clades estimates of ɛ0,1, which can occur when net diversification is highly variable among clades (Rabosky 2010). We found no evidence of bias, as our combined clades estimates of ɛ0,1 were well predicted by the distributions of ɛ0,1 estimates from our separate clades (within ± 0.1 of the median, and well within the interquartile range).
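The order of operations matters here: relative extinction and net diversification are computed within each MCMC sample and only then averaged, which is not the same as dividing or subtracting posterior means. A minimal Python sketch, assuming each sample is a dictionary with the hypothetical keys `lambda0`, `lambda1`, `mu0`, and `mu1`:

```python
from statistics import mean


def derived_rates(samples):
    """Average per-sample relative extinction and net diversification.

    samples: list of dicts with keys lambda0, lambda1, mu0, mu1
    (illustrative key names, not taken from the original analysis).
    """
    eps0 = [s["mu0"] / s["lambda0"] for s in samples]  # relative extinction, non-CA
    eps1 = [s["mu1"] / s["lambda1"] for s in samples]  # relative extinction, CA
    r0 = [s["lambda0"] - s["mu0"] for s in samples]    # net diversification, non-CA
    r1 = [s["lambda1"] - s["mu1"] for s in samples]    # net diversification, CA
    return {
        "eps0": mean(eps0),
        "eps1": mean(eps1),
        "r0": mean(r0),
        "r1": mean(r1),
    }


# Toy values for illustration only; they are not estimates from the study.
toy_samples = [
    {"lambda0": 1.0, "mu0": 0.5, "lambda1": 0.5, "mu1": 0.1},
    {"lambda0": 2.0, "mu0": 1.8, "lambda1": 0.4, "mu1": 0.2},
]
toy_summary = derived_rates(toy_samples)
```

The time-series standard errors reported in the text come from the coda package and are not reproduced in this sketch.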

Our combined clades BiSSE analysis implies phylogenetic independence of diversification and transition rates among our selected clades. To evaluate this assumption, we tested for phylogenetic signal in BiSSE rates across our separate clades, using a pruned version of the angiosperm consensus tree (Davies et al. 2004) available on the Phylomatic website (http://www.phylodiversity.net/phylomatic/), with each of our clades represented at a tip. We tested phylogenetic signal by calculating Pagel's λ (Pagel 1999a) for each BiSSE rate using the fitContinuous() function in the Geiger package for R (Harmon et al. 2007), where λ= 0 indicates phylogenetic independence of our rate estimates, and λ= 1 indicates that rates have evolved consistently with Brownian motion and thus exhibit high signal. None of our BiSSE rates resulted in a value for λ significantly different from 0. This does not necessarily imply that diversification rates are unconserved within the large subset of angiosperms represented by our 16 clades. Our small sample size of 16 tips, connected by relatively deep nodes, is unlikely to provide sufficient power to rigorously test such a hypothesis (Boettiger et al. 2012), and we do not attempt to draw biological conclusions from this result. Instead, our result of no phylogenetic signal across our selected clades provides evidence that the assumption of phylogenetic independence among our clades (implicit in our combined clades analysis) is reasonable and appropriate, and that further hypothesis tests with our separate-clades rates do not require phylogenetic correction.


For our ancestral state reconstruction of the first CA colonization event, we used a single, summary tree to represent each clade. This (ultrametric, time-calibrated) tree was the one with maximum posterior credibility topology from our BEAST runs, but with averaged branch lengths. We then estimated maximum likelihood BiSSE transition rates (using MCMC and an exponential prior to integrate our best-model rates over the range of parameter uncertainty; best-model rates were obtained using AICc to compare the full, six-parameter model to reduced models) to reconstruct marginal ancestral states within each clade (Pagel 1999b; Goldberg and Igic 2008). These BiSSE-based ancestral state reconstructions were used to estimate the approximate time when each of our clades first entered CA (Fig. S1). Immigration was considered confidently reconstructed at the oldest node at which the probability of CA-residency exceeded 88% (Mooers and Schluter 1999; Goldberg and Igic 2008).
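The dating rule in the last sentence can be illustrated with a short Python sketch. It assumes the reconstruction has already been reduced to a list of (node age in Ma, marginal probability of CA residency) pairs; the function name `first_confident_colonization` is hypothetical, and the sketch ignores tree structure, so it is a simplification of the procedure rather than the analysis itself.

```python
def first_confident_colonization(nodes, threshold=0.88):
    """Return the age (Ma) of the oldest node whose marginal probability
    of CA residency exceeds the threshold, or None if no node qualifies.

    nodes: iterable of (age_ma, prob_ca) pairs.
    """
    confident_ages = [age for age, prob_ca in nodes if prob_ca > threshold]
    return max(confident_ages) if confident_ages else None


# Toy reconstruction for illustration only (not data from the study).
toy_nodes = [(30.0, 0.40), (20.0, 0.88), (12.5, 0.91), (8.0, 0.99)]
toy_entry_age = first_confident_colonization(toy_nodes)
```

In the toy input, the node at 20 Ma sits exactly at the threshold and is therefore not counted, because the criterion requires the probability to exceed 88%.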

We tested the hypothesis that CA's species richness is due to increased speciation rates following the onset of the Mediterranean climate, approximately 2 to 5 Ma, in two ways. First, we tested for speciation and extinction rate heterogeneity in selected CA native lineages (Morlon et al. 2011). To select these “CA subclades,” we pruned each of our clades to include only the earliest CA resident (according to our reconstruction; Fig. S1), plus all of its descendants (excluding the South American radiation of Lupinus, which derives from a CA species). These CA subclades capture the first colonization of CA at their root, and therefore capture the maximal span of diversification time in CA for each clade. In clades with multiple CA invasions, selecting the oldest CA subclade (instead of randomly selecting among potential CA subclades) maximizes our ability to detect differences in diversification before versus after the onset of the Mediterranean climate. We did not consider more than one CA subclade within each clade, to avoid pseudoreplication within clades (if subclades exhibit similar patterns of CA-diversification within each clade). We addressed potential bias in our results arising from nonrandom CA subclade selection in a subsequent analysis. For selected CA subclades represented by ≥ 5 extant species, we then evaluated 10 possible models of diversification rate shifts (Morlon et al. 2011): exponentially increasing or decreasing speciation and extinction rates over time, linearly increasing or decreasing speciation and extinction rates, constant rates, extinction rates equal to zero, and various combinations of these (Table S4). These models were compared using AICc (Morlon et al. 2011; Table S4), and the best fit diversification model for each clade is visually presented in Figure 3. 
This method of assessing diversification rate heterogeneity assumes that clades evolved according to a birth-death process in which speciation and extinction rates can vary over time and extinction rates can exceed speciation rates, realistically allowing for periods of declining diversity (Morlon et al. 2011). In using this method, we applied the built-in correction for incomplete sampling, which is calculated similarly to the “skeletal tree” scenario described earlier.
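Model comparison by AICc can be sketched as follows. This is a minimal Python illustration using the standard small-sample formula AICc = −2 ln L + 2k + 2k(k + 1)/(n − k − 1); what counts as the sample size n for a phylogeny is treated here as an assumption, and the model names and likelihood values are invented for the example.

```python
def aicc(log_lik, k, n):
    """Small-sample corrected AIC for a model with k parameters
    fitted to n observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n <= k + 1")
    return -2.0 * log_lik + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)


def best_model(fits, n):
    """Return the name of the model with the lowest AICc.

    fits: dict mapping model name -> (log-likelihood, number of parameters).
    """
    return min(fits, key=lambda name: aicc(fits[name][0], fits[name][1], n))


# Toy fits for illustration only (not results from the study).
toy_fits = {
    "constant_rates": (-100.0, 2),
    "exponential_speciation": (-99.5, 3),
}
toy_best = best_model(toy_fits, n=20)
```

In the toy example the extra parameter improves the log-likelihood by only 0.5, which does not offset the penalty, so the constant-rate model is preferred.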

As a second test of the effect of the recent onset of the Mediterranean climate on CA diversification, we used BiSSE estimates of diversification within each of our separate clades, and compared a model with a break-point in CA speciation rates (λ1) at 5 Ma to a model in which λ1 did not differ before versus after 5 Ma. For this test, we used maximum likelihood BiSSE estimates calculated from the summary tree for each clade, and used likelihood ratio tests to determine if a break-point in λ1 resulted in a significantly better fit to the data than a constant λ1. This test determines if increasing or decreasing speciation rates obtained from the Morlon et al. (2011) method reflect a recent (<5 Ma) shift, versus older or more gradual shifts that are not relevant for our hypothesis. Furthermore, our BiSSE/break-point analysis is based on the entire phylogeny for each clade (i.e., this analysis was not limited to particular subclades), so it does not depend on our ancestral state reconstructions, and it accounts for and estimates diversification rate shifts across multiple CA invasions within each clade.
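The likelihood ratio test for the break-point model can be sketched in Python as below. The sketch assumes that the break-point model adds exactly one free parameter (a second value of λ1), so the test statistic is compared to a chi-square distribution with one degree of freedom; the source does not state the degrees of freedom, and the function name is hypothetical.

```python
import math


def likelihood_ratio_test(loglik_constant, loglik_breakpoint):
    """Compare nested models that differ by one parameter.

    Returns (test statistic, P-value). For one degree of freedom, the
    chi-square survival function is erfc(sqrt(x / 2)).
    """
    statistic = 2.0 * (loglik_breakpoint - loglik_constant)
    statistic = max(statistic, 0.0)  # guard against tiny negative values
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Toy log-likelihoods for illustration only (not results from the study).
toy_stat, toy_p = likelihood_ratio_test(-100.0, -98.0)
```

A gain of two log-likelihood units gives a statistic of 4.0, which falls just under the conventional 0.05 significance level for one degree of freedom.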


To test general predictions about the historical processes responsible for high extant plant species richness in CA, we calculated the mean difference between CA versus non-CA BiSSE rates across all 990,000 samples in the combined clades analysis, and estimated probability as the proportion of samples for which a particular statement (e.g., ɛ0 > ɛ1) is true. We then retested R&A's classic hypotheses about the origin of the CA flora by examining the effects of geoflora and biogeographical affinity on speciation, extinction, and immigration/emigration rates in CA versus elsewhere, using our separate-clades rate estimates in a variance-weighted analysis of variance (ANOVA) in JMP v8.0.1 ©2009. To examine the effect of geoflora on a clade's initial date of entry into CA, we performed an ANOVA; to examine the effect of the Warm Temperate/Desert biogeographical affinity or of CFP association (R&A Table 6), we performed two-sample t-tests. Both analyses were run in JMP v8.0.1 ©2009, with log-transformed values of each clade's CA colonization date as the response variable.
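The posterior comparison in the first sentence amounts to a paired difference over MCMC samples. A minimal Python sketch, with a hypothetical function name and toy inputs:

```python
from statistics import mean


def posterior_difference(rate_a, rate_b):
    """Mean of (rate_a - rate_b) across paired MCMC samples, and the
    proportion of samples in which the difference is positive."""
    if len(rate_a) != len(rate_b):
        raise ValueError("rate vectors must be paired sample by sample")
    diffs = [a - b for a, b in zip(rate_a, rate_b)]
    prob_positive = sum(d > 0 for d in diffs) / len(diffs)
    return mean(diffs), prob_positive


# Toy samples for illustration only (not posterior draws from the study).
toy_eps0 = [0.90, 0.95, 0.92, 0.50]
toy_eps1 = [0.60, 0.55, 0.58, 0.60]
toy_mean_diff, toy_prob = posterior_difference(toy_eps0, toy_eps1)
```

Applied to the full posterior, the same calculation produces statements of the form "mean difference, positive with probability p".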


Based on our combined clades analysis, we find that relative extinction rates have been significantly lower in CA than elsewhere (mean difference ɛ0–ɛ1= 0.36 ± 0.002, positive with probability 1.0; Fig. 1A), resulting in higher net diversification in CA than elsewhere (mean difference r1−r0= 0.073 ± 0.0004, positive with probability 0.99; Fig. 1B). The hypothesis that high extant CA biodiversity results from increased speciation rates was not supported, and speciation rates in CA were lower than elsewhere on average (mean difference λ1−λ0=−0.43 ± 0.002, positive with probability 0.0; Fig. 1C). CA's plant biodiversity is also unlikely to have been caused by species accumulation; species were more likely to emigrate from CA than to colonize it (mean difference q01−q10=−0.087 ± 0.0003, positive with probability 0.0; Fig. 1D). Rate estimates are listed in Table 1 (see Table S5 for rate comparisons within individual clades).

Figure 1.

Density plots of BiSSE diversification and migration rates across our 990,000 MCMC samples and 1000 combined phylogenies. (A) Relative extinction has been lower in CA (ɛ1) than elsewhere (ɛ0). (B) Overall, lower extinction rates in CA resulted in higher net diversification rates in CA (r1) than elsewhere (r0). (C) Speciation rates have also tended to be lower in CA (λ1) than elsewhere (λ0). (D) Species have migrated out of CA (q10) more commonly than into CA (q01). Units for (A) are extinction events per speciation event. Units for (B) are net lineage increases per lineage per million years. Units for (C) and (D) are speciation or migration events per lineage per million years.
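The following sketch illustrates how a region can combine a lower speciation rate with a higher net diversification rate when its relative extinction is sufficiently low. It assumes the conventional definitions, which match the units given in the caption above but are not spelled out in this section: relative extinction ɛ = μ/λ (with μ the extinction rate) and net diversification r = λ − μ = λ(1 − ɛ). The input values are hypothetical and are not estimates from the study.

```python
def net_diversification(speciation, relative_extinction):
    """Net diversification r = lambda * (1 - epsilon), with epsilon = mu / lambda."""
    return speciation * (1.0 - relative_extinction)


# Hypothetical rates (events per lineage per million years), for illustration only.
r_outside = net_diversification(speciation=1.0, relative_extinction=0.9)
r_ca = net_diversification(speciation=0.6, relative_extinction=0.5)

print(f"r outside CA = {r_outside:.2f}, r in CA = {r_ca:.2f}")
```

Because the reported mean differences are averages over posterior samples, they need not satisfy this identity exactly when the averaged values are substituted directly.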

Based on our separate-clades rates analysis, we find that speciation rates in CA (λ1) have been significantly affected by geoflora of origin (F2,13= 5.40, P= 0.02), with lineages derived from the Arcto-Tertiary geoflora speciating at significantly lower rates in CA than Madro-Tertiary or other lineages (Fig. 2A). Complementarily, modern biogeographic affinities also explained differences among clades in speciation rates within CA (F2,12= 4.45, P= 0.04), with North Temperate clades speciating at the lowest rates, significantly lower than Mediterranean lineages, but at rates statistically indistinguishable from the (intermediate) Warm Temperate/Desert clades (Fig. 2B). However, extinction rates within CA (ɛ1) were not different among ancestral geofloras (P= 0.47), or biogeographic affinities (P= 0.95; Fig. 2C, D). The difference in extinction rate across the CA border (ɛ0–ɛ1) was also equivalent for descendants of the Arcto-Tertiary versus Madro-Tertiary geofloras (F1,9= 2.03, P= 0.19). Clades that R&A considered to have been long-standing and prevalent residents of the CA flora (CFP association) did not differ from other clades in their speciation (F1,14= 1.06, P= 0.32) or extinction (F1,14= 0.65, P= 0.44) rates within CA.
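A variance-weighted one-way ANOVA of the kind reported above can be sketched as follows. This is an illustration under stated assumptions rather than a reproduction of the JMP analysis: each clade's rate estimate is weighted by the inverse of its variance, the group labels and numbers are hypothetical, and the function returns only the F statistic and its degrees of freedom.

```python
def weighted_anova_f(groups):
    """Weighted one-way ANOVA.

    groups maps a group label to a list of (estimate, variance) pairs.
    Each observation receives weight 1 / variance (an assumption of this sketch).
    Returns (F, df_between, df_within).
    """
    k = len(groups)
    n = sum(len(obs) for obs in groups.values())
    total_w = sum(1.0 / v for obs in groups.values() for _, v in obs)
    grand_mean = sum(y / v for obs in groups.values() for y, v in obs) / total_w

    ss_between = 0.0
    ss_within = 0.0
    for obs in groups.values():
        w_g = sum(1.0 / v for _, v in obs)
        mean_g = sum(y / v for y, v in obs) / w_g
        ss_between += w_g * (mean_g - grand_mean) ** 2
        ss_within += sum((y - mean_g) ** 2 / v for y, v in obs)

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical per-clade speciation rates and variances, grouped by origin.
example = {
    "group A": [(0.10, 0.01), (0.15, 0.02), (0.12, 0.01)],
    "group B": [(0.40, 0.05), (0.35, 0.02), (0.50, 0.04)],
    "group C": [(0.30, 0.03), (0.45, 0.05), (0.38, 0.02)],
}
f_stat, df1, df2 = weighted_anova_f(example)
print(f"F({df1},{df2}) = {f_stat:.2f}")
```

With equal variances for every clade, the calculation reduces to an ordinary one-way ANOVA.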

Figure 2.

CA speciation and extinction rates, and earliest CA colonization, by ancestral geoflora and modern biogeographic affinity of our clades. (A) Arcto-Tertiary lineages have speciated in CA at lower rates than clades of other historical origins, as predicted by R&A. (B) The result in (A) is complemented by the result that CA clades with modern relatives in North Temperate regions speciated at lower rates in CA than lineages with other modern affinities. (C, D) Extinction rates in CA have not been affected by geographic origin or affinities. (E) Arcto- and Madro-Tertiary lineages have resided in CA since the mid-Tertiary, longer than lineages of other origins, supporting R&A's categorizations of Tertiary geofloras. (F) The result in (E) is complemented by the result that Warm Temperate/Desert (i.e., southern and desert) clades have colonized CA most recently, likely during recent climatic drying periods, as described by R&A. Categories not sharing letters are significantly different from each other (see text for the details of these tests). Data in (E, F) were back transformed for graphical presentation.

The inferred, earliest date of CA colonization within a clade was significantly affected by ancestral geoflora membership (F2,13= 4.75, P= 0.03), with Arcto- and Madro-Tertiary lineages having been in residence the longest, as predicted by R&A. The mean age of residency in CA is 15.70 Ma and 21.69 Ma for Arcto- and Madro-Tertiary lineages, respectively, versus 4.22 Ma for lineages from neither of these geofloras (Fig. 2E). Analyzing by geographic affinity, lineages with Warm Temperate/Desert affinities have been in CA the least amount of time, as predicted by R&A (t= 1.92, P1-tailed= 0.04). The mean age of Warm Temperate/Desert clade residence in CA is 5.68 Ma versus 17.25 Ma for clades with other geographic affinities (Fig. 2F). Whether or not R&A categorized a clade as having a long-standing association with the CFP had a marginally significant effect on its age of residence in CA (t=−1.66, P1-tailed= 0.06), with the mean CA residency of CFP-associated clades being 18.96 Ma versus 12.27 Ma for non-CFP-associated clades.
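The treatment of colonization dates can be sketched as follows: dates are log-transformed before testing, and a mean computed on the log scale is back-transformed for presentation, which yields a geometric mean. The dates below are hypothetical, and the pooled-variance form of the t statistic is an assumption of this sketch; the section does not state which form of the two-sample test was used.

```python
import math
import statistics


def back_transformed_mean(dates_ma):
    """Mean of log-transformed dates, returned on the original scale (Ma)."""
    logs = [math.log(d) for d in dates_ma]
    return math.exp(statistics.fmean(logs))


def pooled_t_statistic(sample_a, sample_b):
    """Two-sample t statistic with pooled variance (assumed form)."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    pooled = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    se = math.sqrt(pooled * (1.0 / n_a + 1.0 / n_b))
    return (statistics.fmean(sample_a) - statistics.fmean(sample_b)) / se


# Hypothetical colonization dates in Ma (not the study's data).
desert = [3.0, 5.0, 8.0, 6.0]
other = [12.0, 20.0, 25.0, 15.0, 18.0]

t_stat = pooled_t_statistic([math.log(d) for d in desert],
                            [math.log(d) for d in other])
print(f"t = {t_stat:.2f}")
print(f"back-transformed means: {back_transformed_mean(desert):.2f} Ma, "
      f"{back_transformed_mean(other):.2f} Ma")
```

The test is run on the log scale, while the means are reported in Ma after back-transformation.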

Our analysis of speciation and extinction rate temporal heterogeneity for CA native subclades largely fails to support the hypothesis of increased diversification following the onset of the Mediterranean climate (Table S4, Fig. 3). Our analyses did not reveal any consistent temporal patterns of diversification, with most CA subclades best modeled by constant or decelerating diversification rates over time (i.e., diversification rates in CA have not accelerated in recent millennia). Only 4 of 16 of our CA native subclades experienced increasing speciation rates toward the present (Arbutoideae, Lotus, Lupinus, and Salvia; see Table S4 for diversification model comparisons within each clade), such that a pattern of increasing diversification is significantly underrepresented among our clades (χ2= 4.00, P= 0.045; versus a conservative expectation of increasing diversification in half of our clades). Our BiSSE/break-point analysis confirms this pattern, with only three of our clades experiencing significantly greater values of λ1 after 5 Ma than in earlier millennia (Lotus, χ2= 5.52, P= 0.02; Lupinus, χ2= 3.97, P= 0.046; and Phrymoideae, χ2= 7.49, P= 0.006). Our BiSSE/break-point analysis also indicates that CA speciation rates have decreased since 5 Ma in Sanicula (χ2= 4.77, P= 0.03) and Sidalcea (χ2= 4.10, P= 0.04), both members of the Arcto-Tertiary geoflora. The remaining eight clades did not show significant changes in λ1 at 5 Ma.
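The goodness-of-fit test for underrepresentation of increasing speciation rates can be checked with a short calculation. The sketch below assumes a two-category chi-square test (increasing versus not increasing) with one degree of freedom and an expectation of half of the 16 subclades in each category, which reproduces the reported statistic.

```python
import math


def chi_square_two_categories(observed, expected):
    """Chi-square goodness-of-fit statistic and P-value for two categories (1 df)."""
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# 4 of 16 subclades with increasing speciation rates, versus an expected 8 of 16.
stat, p_value = chi_square_two_categories(observed=[4, 12], expected=[8, 8])
print(f"chi2 = {stat:.2f}, P = {p_value:.4f}")
```

Each category contributes (4 − 8)²/8 = 2 and (12 − 8)²/8 = 2, giving χ2 = 4.00.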

Figure 3.

Temporal shifts in diversification within CA native subclades. Best-fit models for diversification were calculated using the method from Morlon et al. (2011) (dashed, gray lines). The fitted diversification models overlay lineage through time (LTT) plots (black lines). AICc comparisons of multiple diversification models for each clade can be found in Table S4.


Our results indicate that the species richness of the CFP biodiversity hotspot primarily results from processes of species diversification within the CFP, rather than species accumulation. Higher diversification rates of plants in CA (vs. elsewhere) are driven largely by substantially lower extinction. In contrast, speciation rates for our clades are not higher in CA, and under the combined analysis, are generally lower in CA than other regions. Low extinction rates may be due to climatic buffering in CA. This buffering effect is likely caused by both a broad elevational range (allowing species to move up and down mountains within the region during periods of rapid climate change, e.g., during the Plio-Pleistocene; Loarie et al. 2008) and the moderate precipitation created by the Sierra Nevada, Coast, and Klamath ranges (Richerson and Lum 1980). The Sierra Nevada, the region's tallest range, may have been particularly instrumental in reducing plant extinction rates in the CFP. This range formed at least 50 Ma (House et al. 1998; Mulch et al. 2006; Cassel et al. 2009; Hren et al. 2010), stabilizing precipitation patterns in the CFP, while creating a rain shadow to the east, during the post-Eocene period of global aridification at CA's latitude (Axelrod 1973; Davis and Moutoux 1998; Sheldon and Retallack 2004; Dupont-Nivet et al. 2007). Multiple topographic, climatic, and edaphic niches may also play a part in reducing extinction in CA by reducing competition for niche space (facilitating multiple species coexistence) and providing opportunities for species to become adapted to a broad range of conditions. Like low extinction rates, overall lower rates of speciation in CA than elsewhere across our clades may also result from climatic buffering, which stabilizes existing niches and slows the production of new niches.

Most of our selected CA native subclades experienced constant or decelerating rates of diversification toward the present (Fig. 3). Four of these subclades (Arbutoideae, Lotus, Lupinus, and Salvia) experienced increasing diversification rates toward the present, although these clades also had low or unexceptional rates of speciation in CA compared to elsewhere (Table S5). Furthermore, the onset of the Mediterranean climate only accounts for the observed speciation rate changes in two of these clades: only Lotus and Lupinus exhibit significantly greater CA speciation rates after a 5 Ma break-point than before. Results differ between our two analyses of time-dependent diversification in CA for the clade Phrymoideae, which likely reflects methodological limitations of our CA subclade selection. The older CA subclade of Mimulus (Phrymoideae; corresponding to Mimulus subgen. Schizoplacus) used in our subclade analysis did not experience time-dependent changes in diversification, suggesting that an increasing speciation rate after 5 Ma (revealed in our second, break-point analysis) reflects processes occurring in the more recent invasion of Mimulus, which corresponds to Mimulus subgenus Synplacus, sensu Grant (1924), and contains the model organisms for speciation studies M. guttatus, M. lewisii, and M. cardinalis (Bradshaw et al. 1998; Wu et al. 2008).

Combining the results of our two time-dependent CA diversification analyses, we can conclude that CA speciation rates have increased over time in 5 of 16 lineages (Arbutoideae, Lotus, Lupinus, Phrymoideae, and Salvia), although diversification rate increases reflect the onset of the Mediterranean climate (5 Ma) in only three of these cases (Lotus, Lupinus, and Phrymoideae), with the other two cases (Arbutoideae and Salvia) reflecting earlier shifts or more gradual processes over longer spans of time. There is also evidence that speciation rates in CA have decreased toward the present in 5 of 16 lineages (Antirrhinum, Ericameria, Polemoniaceae, Sanicula, and Sidalcea), and CA speciation rates have remained constant over time in the remaining six lineages. Overall, speciation rates increased in response to the onset of the Mediterranean climate in only 3 of 16 lineages. These results indicate that the onset of the Mediterranean climate may have been favorable to the diversification of some lineages, but it cannot be concluded to be the major driver of diversification across the CA flora.

Our results do not suggest that speciation has been unimportant in CA, or that studies focusing on ecologically driven speciation in CA are misguided. Many CA neoendemics arose recently, within the past 3 to 5 million years, suggesting that CA is a site of active speciation (Kraft et al. 2010). However, our results indicate that lineages speciating in CA are also speciating elsewhere at comparable or higher rates, on average. For example, Lupinus has undergone recent and rapid radiations in the Mexican highlands and Andes (Hughes and Eastwood 2006; Drummond et al. 2012) with high speciation rates comparable to those exhibited by CA Lupinus (Table 1, Drummond et al. 2012). We did not compare rates for CA lineages to a random draw of lineages present in other regions; it may be that the CFP has a disproportionate representation of lineages prone to relatively rapid speciation throughout their range, and this, combined with lower extinction rates in CA, has resulted in high net diversity. Furthermore, several of our selected clades do show some evidence of higher speciation rates in CA than elsewhere (Table S5), including the clade containing Mimulus.

Although all five of the earth's Mediterranean regions are biodiversity hotspots, the onset of the Mediterranean climate did not lead to increased diversification rates in CA in most of our selected clades. This suggests that high species richness in Mediterranean regions may generally predate the onset of the Mediterranean climate. It may also be that we see these regions as exceptionally diverse essentially in contrast to other temperate regions. While not all Mediterranean regions exhibit unusually high topographic complexity or mountain ranges comparable in elevation to the Sierra Nevada (Cowling et al. 1996), all five of these regions are bordered by a major desert on their lower-latitude side, open ocean on their westward side, and either a site of recent glaciation or open ocean on their pole-ward side. This suggests that climatic buffering could be a shared cause of decreased extinction (and thus high biodiversity) in these regions, with topographic features, intermediate latitude, and oceanic current patterns conspiring to prevent desertification and glaciation, compared to surrounding regions. It is also noteworthy that South-Western Australia and South Africa exhibit considerably shorter elevational gradients than CA, but higher plant species richness, suggesting that climatic buffering results from interactions among multiple climatic, edaphic, and topographic factors, and does not necessarily require the presence of dramatic mountain ranges.

Recent work suggests that many plant adaptations commonly found in Mediterranean regions, such as sclerophylly and the ability to resprout following fire, also predate the Mediterranean climate (Herrera 1992; Verdu et al. 2003). Furthermore, Valente et al. (2010) report a similar pattern of diversification (to what we report here) in Proteas of the Cape Flora: an unexceptional average speciation rate in the Cape over the past approximately 15 million years, multiple species persistence at fine spatial scales, and slowing diversification there in recent millennia. Together with our results, these studies support the conclusion that the floristic uniqueness of, and similarities among, Mediterranean regions are only indirectly related to their current, shared climate.

We also find that immigration rates into CA are lower than emigration rates out of CA. This suggests that CA is not a “center of accumulation” (Ladd 1960) of biodiversity, and that immigration does not explain extant species richness (Mora et al. 2003). However, because our “outside CA” character state represents the rest of the terrestrial biosphere, which surrounds CA and overwhelms it in size, a randomly dispersing plant would be much more likely to land outside of the CFP than inside its boundaries. Even if CA does have a slight, measurable acquisition bias compared to other regions of similar size and shape, it would likely be swamped out in our analysis, due to the discrepancy in sizes of our compared geographic regions. Furthermore, because net diversification is higher in CA than elsewhere on average, CA has more species to export than do many of the surrounding regions that may supply CA with its immigrant pool, leading to a net species export even if individual species have an equal chance of dispersing in either direction. Therefore, we cannot rule out the possibility that the CFP attracts more species immigrants than other regions of similar size. However, we can conclude that CA more commonly supplies biodiversity to surrounding regions than it accumulates immigrants from beyond its borders.

Our molecular phylogenetic analysis corroborates some conclusions of R&A's classic study of CA's plant origins. Our ancestral state reconstructions indicate that Arcto- and Madro-Tertiary lineages colonized CA much earlier than clades of other origins, and clades that R&A characterized as having a long-standing association with the CFP may also have colonized CA somewhat earlier than other lineages. These results support R&A's conclusions based on the fossil record. In contrast to R&A's ideas, however, the substantial variance in colonization dates of clades within a geoflora suggests that these were not constant plant assemblages that migrated as a unit, but rather that individual lineages migrated independently. Warm Temperate/Desert lineages arrived in CA most recently (in our reconstructions), as predicted by R&A's analysis of current geographic affinities and recent climate changes. Furthermore, we find that Arcto-Tertiary/North Temperate clades have speciated at low rates in CA, in comparison with clades of other origins, as predicted by R&A. Arcto-Tertiary lineages do experience significantly reduced extinction probabilities in CA in comparison to elsewhere, as predicted by R&A (and references therein, p. 16). However, reduced extinction in CA is not unique to or particularly prevalent among Arcto-Tertiary lineages, as the observation of relict stands of Arcto-Tertiary charismatic macroflora like Sequoia and Torreya might suggest. Lineages of all origins experience reduced extinction in CA versus elsewhere. Moreover, Madro-Tertiary lineages (and lineages with CFP or Mediterranean biogeographic affinities) do not speciate at higher rates in CA than lineages from other origins, in contrast to R&A's diversification hypotheses.

In conclusion, we find that the role of the current Mediterranean climate in promoting diversification has been overemphasized, at least in CA, and likely in other Mediterranean regions as well (Linder 2003; Sauquet et al. 2009). The Mediterranean climate may have played a secondary role in promoting regional diversity by facilitating the immigration of tropical and desert species. However, CA biodiversity is primarily due to low rates of extinction since the Tertiary. We do not find that any particular ancestral geoflora or modern biogeographic affinity has contributed especially to CA plant diversification via reduced extinction or elevated speciation rates, in contrast to previous hypotheses. However, our results confirm previous findings that CA exports many lineages to other regions. For example, the Hawaiian radiation of silverswords and the Andean radiation of Lupines each derive from CA native lineages (Baldwin et al. 1991; Drummond 2008), suggesting that CA is an important species refuge and source of biodiversity both within and beyond its borders. This study indicates that relatively permanent features of CA's landscape, such as its topographic complexity and geographical location, are most critical for plant species persistence and diversification, whereas its temporary climatic conditions have been less important. As global climates continue to shift, we predict that the CFP will likely represent an important refuge for plant species. Our results also indicate that we have renewed cause for alarm as habitats are lost to CA's high rate of suburban and agricultural development (Underwood et al. 2009).

Associate Editor: L. Harmon


We thank Emma Goldberg and Helene Morlon for generous assistance with analyses and scripts, and Ammon Corl, Emma Goldberg, Susan Harrison, Christy Hipsley, Justen Whittall, associate editor Luke Harmon, and three anonymous reviewers for comments on the article. This work was conducted while L. T. Lancaster was a center fellow at the National Center for Ecological Analysis and Synthesis, funded by National Science Foundation grant No. EF-0553768, The University of California, Santa Barbara, and the State of California.