Keywords:

  • biodiversity;
  • birth–death;
  • diversification;
  • diversity dependence;
  • extinction;
  • latitudinal gradient;
  • phylogeny;
  • speciation;
  • species richness

Summary

  1. Top of page
  2. Summary
  3. Introduction
  4. Materials and Methods
  5. Results
  6. Discussion
  7. Summary and recommendations
  8. Acknowledgements
  9. References

1. Rates of evolutionary diversification play a fundamental role in the assembly of regional communities, but the relative balance of diversity-dependent and diversity-independent rate control remains controversial. Recent studies have reported a significant relationship between the amount of time a geographic region has been occupied and species richness, implying that feedbacks between species interactions and diversification rates may be less important than diversity-independent mechanisms in generating regional species pools.

2. Previous analyses of the regional age-diversity relationship have used a range of metrics to quantify the amount of ‘evolutionary time’ that a region has been occupied, but the relative performance of these metrics has not been quantified.

3. Here, I evaluate the performance of the most commonly used methods and data transformations for assessing the regional age-diversity relationship.

4. I find that process-based models of diversification are more appropriate than process-independent models for evaluating the influence of time on species richness. I also demonstrate that time should not be log-transformed when testing the regional time-for-speciation hypothesis, as in some recent studies.

5. Application of this framework to patterns of elevational richness in several recent studies provides support for a logistic model of diversity accumulation within elevational bands and implies that evolutionary age alone cannot fully account for current species richness.

6. These results indicate that process-based models, in concert with appropriate data transformation, provide a robust foundation for inference on the causes of regional diversity gradients.


Introduction

Species richness varies dramatically among geographic regions. One of the most prominent and widely studied examples of this variation is the latitudinal diversity gradient (Willig, Kaufman & Stevens 2003; Mittelbach et al. 2007), but species richness also varies with respect to elevational and other gradients. Perhaps the most fundamental question in regional diversity studies is the extent to which species richness is determined by diversity-independent and diversity-dependent mechanisms. If diversity-independent mechanisms control regional species richness, then the rate at which regional diversity changes is largely independent of current diversity. This is an explicitly non-equilibrial view of diversity dynamics and describes a scenario in which abiotic factors (Cracraft 1982) or evolutionary time play an important role in determining regional diversity differences. In contrast, regulation by diversity-dependent factors implies that rates of speciation, extinction and/or interregional dispersal are influenced by standing diversity (MacArthur 1969). Diversity dependence is often assumed to have a dampening effect on diversity trajectories (Rosenzweig 1975; Alroy 1996; Phillimore & Price 2008), although it could also have a facilitative effect if species richness itself generates ecological opportunity for new species (Nee, Mooers & Harvey 1992).

The tempo and mode of diversity accumulation within regions through time is central to understanding the relative balance of diversity-dependent and diversity-independent control of regional diversity. If the dynamics of richness are governed by diversity-independent factors, then richness should generally increase through time. This phenomenon is sometimes referred to as the ‘time-for-speciation’ effect (Stephens & Wiens 2003) and is simply the notion that positive correlations between evolutionary time-within-regions and species richness will generally occur in the absence of any diversity-dependent control of speciation and extinction rates. Even if regions show substantial variation in their net rate of species diversification, such diversity independence should generally lead to a positive relationship between time-within-regions and richness (Rabosky 2010).

In contrast, diversity-dependent feedback on speciation and extinction rates can lead to a decoupling between evolutionary time-within-regions and species richness. If regions are governed by equilibrial or carrying-capacity dynamics, then standing richness may be approximately stable through time, and the amount of time a given region has been occupied will have little bearing on species richness once regional carrying capacities have been reached (Ricklefs 2007; Rabosky 2009a). However, time may nonetheless be an important predictor of species richness even under strict diversity-dependent control of speciation and extinction unless regions are truly saturated with species. A relationship between time and species richness need not imply an absence of diversity-dependent regulatory mechanisms, because diversity-independent mechanisms presumably play at least some role during the early phase of diversity accumulation.

Two general types of data have been used to test hypotheses regarding the accumulation of species richness within geographic regions. The most straightforward scenario occurs when a set of clades can be identified within a particular geographic region (e.g., Ricklefs 2006, 2007; Gamble et al. 2009; Sauquet et al. 2009). In this case, the dynamics of species richness through time can be modelled using the standard birth–death process that has been widely used to study rates of diversification (Magallon & Sanderson 2001; Rabosky 2009b). Because clades are (by definition) monophyletic, clade richness is a function only of speciation and extinction rates: these are the only processes that can change the numbers of species within clades. I refer to this approach as the ‘clade-based’ approach.

The clade-based approach is limited, however, in complex biogeographic scenarios where lineages may have transitioned repeatedly between different biogeographic regions. Suppose that a lineage colonized a particular geographic region at some point in the past and diversified within that region, but that additional lineages subsequently dispersed to the region. The simple clade-based approach is formally invalid in this case, because it fails to consider the possibility that lineages are gained by immigration. Moreover, owing to the potentially complex distribution of geographic character states across the tips of the phylogeny, we have no direct information about the time when a particular geographic region was colonized.

One pragmatic and innovative solution to this problem has been to use reconstructions of ancestral geographic characters to indirectly infer the state of nodes across the tree. Such characters might include the latitudinal or elevational midpoints of species’ distributions, biogeographic regions (e.g., ‘Palearctic’, ‘Neotropical’) or even species’ habitat attributes. Character state reconstructions at these nodes are then used to quantify the amount of evolutionary time for diversification within each geographic region (Wiens et al. 2007; Kozak & Wiens 2010). I refer to this approach as the ‘time-of-colonization’ approach, as it relies on inferences of ancestral ‘time of colonization’ across time-calibrated phylogenetic trees. Researchers then test whether these metrics of evolutionary time are correlated with the current species richness of each region, without assuming a particular underlying diversification model (Smith et al. 2007; Li et al. 2009).

Here, I address the following question: given that a set of colonization times have been correctly inferred across a phylogenetic tree and that we know current patterns of species richness within a set of geographic regions, how should we test the regional time-for-speciation hypothesis? Multiple methods have been used to analyse evolutionary ‘time-within-regions’, but the statistical performance of these approaches has not been evaluated. I assess the relative ability of these metrics and data transformations to predict diversity when evolutionary time is the only factor influencing species richness. I then illustrate how process-based models can be used to test the importance of evolutionary time in generating regional diversity differences. Using data from several previous studies, I apply this framework to explore the variation in species richness along elevational gradients in several vertebrate taxa.

Materials and Methods

Birth–Death Process and Data Transformation

The time-homogeneous birth–death process is perhaps the simplest statistical model that can be used to describe the accumulation of species richness within geographic regions. Under this process, new lineages arise with per capita rate λ, and lineages go extinct with rate μ. In a very real sense, this model (Kendall 1948) is the ‘time for speciation’ model: if λ and μ do not vary through time or among regions, then species richness will be completely determined by time and the stochastic error associated with the process. It is convenient to reparameterize the simple birth–death process in terms of the net rate of diversification r, or λ − μ, and the relative extinction rate κ, or μ/λ. Here I follow Ricklefs (2006, 2007) in using κ rather than ε to denote this parameter to avoid confusion with e, the base of natural logarithms. These two parameters (r, κ) determine the distribution of species richness through time under the constant rate process. Specifically, for a process beginning with N0 species at time t = 0, we can compute the expected richness as

  • $N_t = N_0 e^{rt}$ (eqn 1)

However, researchers frequently condition this expectation on the probability that a clade is not extinct, as we are generally looking at clades that have survived to the present to be observed. Define P0,t as the probability that a clade has gone extinct by time t, or

  • $P_{0,t} = \left[ \dfrac{\kappa (e^{rt} - 1)}{e^{rt} - \kappa} \right]^{N_0}$ (eqn 2)

The expected richness through time is then obtained by simply conditioning eqn (1) on the probability that the clade is not extinct, or

  • $N_t = \dfrac{N_0 e^{rt}}{1 - P_{0,t}}$ (eqn 3)

as discussed by previous researchers (Raup 1985; Magallon & Sanderson 2001).
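
The quantities in eqns (1–3) can be sketched numerically as follows. This is a minimal illustration, not the author's code: the function names are hypothetical, and the extinction probability is assumed to take the standard constant-rate birth–death form given by Magallon & Sanderson (2001), written in terms of r and κ.

```python
import math


def expected_richness(t, r, n0=1):
    """Unconditioned expected richness after time t: N0 * exp(r * t)."""
    return n0 * math.exp(r * t)


def prob_extinct(t, r, kappa, n0=1):
    """Probability that a clade founded by n0 lineages is extinct by time t.

    Assumes the standard constant-rate birth-death result, valid for
    r > 0 and 0 <= kappa < 1.
    """
    e = math.exp(r * t)
    alpha = kappa * (e - 1.0) / (e - kappa)
    return alpha ** n0


def conditioned_richness(t, r, kappa, n0=1):
    """Expected richness conditioned on the clade surviving to time t."""
    return expected_richness(t, r, n0) / (1.0 - prob_extinct(t, r, kappa, n0))
```

With κ = 0 the extinction probability is zero and the conditioned expectation reduces to N0·exp(rt); for N0 = 1 the conditioned expectation simplifies algebraically to (exp(rt) − κ)/(1 − κ).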

Several important results follow immediately from eqns (1–3). First, if extinction is equal to zero, the natural logarithm of species richness will increase linearly with respect to time. Second, the natural logarithm of species richness will also increase linearly with respect to time in the limit as t → ∞, as it follows from eqn (2) that

  • $\lim_{t \to \infty} P_{0,t} = \kappa^{N_0}$ (eqn 4)

Thus, as time becomes large, log(N) can be approximated by a linear model with slope r and y-intercept

  • $\log(N_0) - \log(1 - \kappa^{N_0})$ (eqn 5)

These results indicate that time should not be log-transformed in testing hypotheses concerning the age-diversity relationship, although this has been done in previous studies (Smith et al. 2007; Wiens et al. 2007; Li et al. 2009). To illustrate this, I used eqn (3) to generate expected richness-through-time curves that resulted in N = 1000 species at time t = 1 time units under a variety of relative extinction rates between κ = 0 and κ = 0·90. It is clear that for most values of κ, the relationship between log(N) and time is approximately linear (Fig. 1a), whereas log transformation of time induces a strong nonlinearity in the relationship (Fig. 1b).

Figure 1.  Theoretical diversity accumulation curves for a clade reaching N = 1000 species after 1·0 time units as a function of (a) time and (b) log-transformed time. In each plot, curves are shown for four relative extinction rates (κ = 0, 0·3, 0·6, and 0·9). Logarithmic transformation of time (b) induces strong nonlinearities in the relationship between time and log-transformed species richness.

Because most previous studies have tested the time-for-speciation hypothesis by fitting linear models to the age-diversity relationship, I conducted simulations to quantify and compare the error associated with modelling diversity as a function of untransformed time vs. the natural logarithm of time. For each simulation, I first selected a value of κ and r that would result in an expected clade size of N = 1000 species at time = 100 and denoted these values as κi and ri. I then generated a set of 100 random clade ages from a uniform distribution on (0, 100). Using eqn (3), I computed the expected richness for each of the 100 colonization times assuming κi and ri. Thus, species richness across the 100 clades represents the exact theoretical expectation under κi and ri. There is no stochastic variation in clade size in this exercise, and clade size is perfectly specified by κi, ri and the clade age t. Using this set of clades, I fit a linear model to the relationship between (i) log(N) and clade age, as well as (ii) log(N) and the logarithm of clade age. I conducted this exercise across a range of relative extinction rates, with 1000 replicate sets of 100 simulated clade ages per κi, and compared the coefficient of determination (r²) for the unlogged vs. log-transformed time regressions.

These results clearly indicate that log transformation of time can lead to substantial residual error when modelling the regional age-diversity relationship (Fig. 2). Simple linear regressions of log(N) on untransformed time explain most of the variation (>95%) in species richness across the range of relative extinction rates considered. This is not the case for the regressions of log(N) on log(time), where a large percentage of the variation in the age-diversity relationship remains unexplained (10–35%).
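
A single replicate of this exercise might be sketched as below. The sketch rests on several assumptions that are not stated in the text: clades start from one lineage (N0 = 1), the conditioned expectation then takes the simplified form (exp(rt) − κ)/(1 − κ), and r is obtained in closed form from that expression. All function names are hypothetical.

```python
import math
import random


def conditioned_richness(age, r, kappa):
    """Expected richness of a surviving clade founded by one lineage (N0 = 1)."""
    return (math.exp(r * age) - kappa) / (1.0 - kappa)


def rate_for_target(n_target, age, kappa):
    """Net rate r that yields n_target expected species at the given age."""
    return math.log(n_target * (1.0 - kappa) + kappa) / age


def r_squared(x, y):
    """Coefficient of determination for the least-squares line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def compare_transforms(kappa, ages):
    """Return (r2 for log N on time, r2 for log N on log time)."""
    r = rate_for_target(1000.0, 100.0, kappa)
    log_n = [math.log(conditioned_richness(a, r, kappa)) for a in ages]
    log_t = [math.log(a) for a in ages]
    return r_squared(ages, log_n), r_squared(log_t, log_n)


if __name__ == "__main__":
    rng = random.Random(1)  # arbitrary seed, for reproducibility only
    ages = [a for a in (rng.uniform(0.0, 100.0) for _ in range(100)) if a > 0.0]
    for kappa in (0.0, 0.45, 0.9):
        print(kappa, compare_transforms(kappa, ages))
```

Repeating the call over many random sets of ages per κ would give distributions of r² values comparable in spirit to those summarized in Fig. 2.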

Figure 2.  Percentage of the variance of the theoretical age-diversity relationship explained by ordinary least squares modelling of log-richness as a function of (a) time and (b) log-transformed time, under different ratios of extinction to speciation. Lower, middle and upper lines on boxes denote the 2·5%, 50%, and 97·5% quantiles, respectively, of the distribution of r² values from simulated data sets. Logarithmic transformation of time performs poorly. Compare to lineage accumulation curves shown in Fig. 1.

Multiple Colonization Scenarios

The analyses presented in Fig. 2 describe an explicitly ‘clade-based’ scenario: lineages colonize geographic regions and we can simply model the species richness of clades through time without considering subsequent dispersals into the region. However, as clearly described by previous workers (Wiens et al. 2006, 2007), this approach poses a number of challenges. It may be difficult to clearly delimit a set of clades within a particular geographic region, but we may nonetheless have estimates of a set of times when the region was colonized by reconstructing ancestral geographic states (Smith et al. 2007) and we presumably have an estimate of current species richness within the region. In this situation, researchers have tested the time-for-speciation hypothesis by modelling species richness as a function of some single or compound metric of the inferred colonization times of each region (Smith et al. 2007).

To address the statistical performance of these methods, I conducted a series of simulations under a scenario where species richness within regions is a function of multiple colonization events. Following approaches used in the recent literature, I then modelled regional species richness as a function of: (i) the initial time of colonization of a geographic region, ignoring all subsequent colonization events that have occurred (Stephens & Wiens 2003; Roncal et al. 2011); (ii) the natural logarithm of the time of first colonization (Smith et al. 2007; Li et al. 2009); (iii) the sum of all inferred colonization times (Wiens et al. 2009); and (iv) the natural logarithm of the sum of all colonization times (Smith et al. 2007; Li et al. 2009). I compare the performance of these four metrics and transformations to that of a simple birth–death model, modelling richness as a simple function of the initial time of colonization and ignoring all transitions to and from a given geographic region.

As in the exercise above, I chose initial values of κ and r for each simulation that would result in an expected clade size of N = 1000 species at time = 100. I then sampled a vector of 100 initial colonization times from a uniform (0, 100) distribution. There are thus 100 geographic regions, each with a potentially unique ‘time of colonization’. I assumed that secondary colonizations occur within regions according to a Poisson process with exponentially distributed waiting times. Each simulation was thus associated with a Poisson rate parameter θi, with θ values taken from the integer set ranging from 1 to 10. A rate parameter θi = 5 implies that on average, five secondary colonizations are expected to occur on the time interval (0, 100) and a region with an initial colonization occurring at t = 50 would have a mean of 2·5 secondary colonizations with this value of θ. Waiting times between successive colonization events within regions were chosen from an exponential distribution with mean 100/θi. Thus, a geographic region with young initial colonization time (e.g., t < 25) might have only a few secondary colonizations, but a geographic region with an old initial colonization time (e.g., t > 75) might be associated with many secondary colonizations.

For the initial colonization as well as all secondary colonizations, I computed the expected clade diversity using eqn (3). I then computed the total diversity within each region as the sum of these expected richness values, or

  • $N_{total} = N_{t_0} + \sum_{k=1}^{W} N_{t_k}$ (eqn 6)

where t0 is the initial colonization time and tk is the time of the k’th secondary colonization, taken over a total of W secondary colonizations. Thus, the total diversity within a region is the sum of all clade diversities within the region. For example, suppose a region is initially colonized at time t = 20, and a secondary colonization occurs at t = 45. Assuming κ = 0 and r = 0·069, this region would thus contain 249·6 species from the first colonization (a clade age of 80 time units when the simulation ends at t = 100) and 44·5 species from the second (a clade age of 55 time units), for a total of 294·1 species at the end of the simulation. A total of 500 simulated data sets were generated under each combination of θ (θ = 0, 1, 2, …, 10) and κ (κ = 0, 0·45, 0·90). I did not restrict species richness to integer values, as I was interested in how well existing approaches could model a ‘perfect’ theoretical age-diversity relationship under a time-for-speciation process.
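
The multiple-colonization scenario can be sketched as follows. This is an illustration under stated assumptions rather than the original simulation code: colonization times are measured forward from the start of the simulation (so clade age is 100 − t, as in the worked example), each colonist founds a clade from a single lineage, and the function names are hypothetical.

```python
import math
import random

T_END = 100.0  # length of the simulation, as in the text


def clade_richness(age, r, kappa):
    """Expected richness of a surviving clade of the given age (N0 = 1 assumed)."""
    return (math.exp(r * age) - kappa) / (1.0 - kappa)


def secondary_colonizations(t0, theta, rng):
    """Secondary colonization times after t0 from a Poisson process.

    theta is the expected number of events on (0, T_END), so waiting
    times are exponential with mean T_END / theta.
    """
    times = []
    if theta <= 0:
        return times
    t = t0
    while True:
        t += rng.expovariate(theta / T_END)
        if t >= T_END:
            return times
        times.append(t)


def regional_richness(colonization_times, r, kappa):
    """Total expected richness: sum over all clades in the region."""
    return sum(clade_richness(T_END - t, r, kappa) for t in colonization_times)


# Worked example from the text: colonizations at t = 20 and t = 45,
# with kappa = 0 and r = 0.069.
example_total = regional_richness([20.0, 45.0], 0.069, 0.0)
```

A full replicate would draw 100 initial colonization times from a uniform (0, 100) distribution, call `secondary_colonizations` for each region, and pass the combined times to `regional_richness`.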

For each simulated set of 100 geographic regions, I used ordinary least squares regression to model the natural logarithm of regional richness (N) as a function of (i) the initial time of colonization (InitTime); (ii) the logarithm of the colonization time (LogInitTime); (iii) the sum of colonization times (SumTime); and (iv) the logarithm of the summed colonization times (LogSumTime). For comparison, I also fit a simple birth–death model (eqn 3) to the data using nonlinear least squares regression, ignoring all secondary colonizations and assuming regional richness is a strict function of the initial colonization time. For the fit of the birth–death model, I estimated both r and κ from the data but assumed that N0 was known. I estimated the residual error of each fitted model as the sum of the absolute value of model residuals for each of the five approaches described earlier.
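
The four linear-model comparisons might look like the sketch below. It assumes that each ‘colonization time’ enters the metrics as a clade age (time before the end of the simulation), which the text does not state explicitly; the helper names are hypothetical, and the nonlinear least squares fit of the birth–death model is omitted because it would need a numerical optimizer.

```python
import math


def ols_abs_residual(x, y):
    """Fit y = b0 + b1 * x by least squares; return the sum of |residuals|."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    b0 = my - b1 * mx
    return sum(abs(b - (b0 + b1 * a)) for a, b in zip(x, y))


def time_metrics(ages):
    """The four predictors for one region, given the ages of its colonizations."""
    init = max(ages)   # oldest (initial) colonization
    total = sum(ages)  # summed colonization times
    return {
        "InitTime": init,
        "LogInitTime": math.log(init),
        "SumTime": total,
        "LogSumTime": math.log(total),
    }


def residual_error_by_metric(region_ages, log_richness):
    """Total absolute residual error of each linear model across regions."""
    metrics = [time_metrics(ages) for ages in region_ages]
    return {
        name: ols_abs_residual([m[name] for m in metrics], log_richness)
        for name in metrics[0]
    }
```

Here `region_ages` is a list with one list of colonization ages per region, and `log_richness` holds the natural logarithm of each region's total richness.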

The five approaches show substantial differences in residual error under the multiple colonization scenario (Fig. 3). The worst overall approach appears to be one where log(N) is modelled as a function of the logarithm of the initial colonization time. Surprisingly, both approaches that utilize summed colonization times performed poorly. Even a simple linear model that discards all information about secondary colonizations (Fig. 3, first column) outperforms these approaches; it is striking that residual error does not show an appreciable increase with the number of secondary colonizations contributing to total regional richness. The overall best approach is the simple birth–death model, ignoring secondary colonizations (Fig. 3, column 5). Despite the fact that richness within some regions is a compound variable composed of multiple independent clades of different ages, the simple birth–death process does a remarkable job of approximating the accumulation of richness through time (Fig. 4). These results are conditional on the ‘perfect’ theoretical expectation for the relationship between age and richness under the birth–death process. Adding stochastic variation in species richness to the simulation will almost certainly increase the total residual error and could potentially reduce the striking differences between the approaches illustrated in Fig. 3.

Figure 3.  Residual error in linear models of the regional age-diversity relationship under four metrics of evolutionary ‘time-within-regions’, as a function of interregional dispersal rates. Each column denotes a separate metric or transformation of ‘evolutionary time’, and the fifth column is the result of fitting a simple birth–death model (eqn 3) using nonlinear least squares regression. Rows represent different values of the relative extinction rate. Lower, middle and upper lines on boxes denote the 2·5%, 50% and 97·5% quantiles, respectively, of the distribution of total residual error from 500 simulations per set of simulation parameters. Initial colonization times for each region were chosen from a uniform (0, 100) distribution, and secondary colonizations of each region followed a Poisson process with rate θ. A simple linear regression of the relationship between log(N) and the initial time of colonization (InitTime) outperforms all other approaches that are typically used in the literature, but a constant-rate birth–death model is the overall best approach for modelling diversity dynamics under the time-for-speciation hypothesis. The performance of the fitted birth–death model under high rates of secondary colonization is remarkable, because the model assumed that regional richness was a function of the initial colonization time only. Results are based solely on the expected age-diversity relationship and neglect stochastic variation in richness through time.

Figure 4.  A simple constant-rate birth–death process provides a good approximation of regional diversity accumulation curves even when species richness is the sum of multiple independent diversification processes. Here, diversification within a region began with a single colonization event at t = 50 my before present, and secondary colonizers arrived every 7·5 my (thin grey lines). The total richness within the region (thick grey line) is the sum of the diversities of all colonizing lineages at that point in time (eqn 6). Dashed line is the fit of a constant-rate birth–death model that assumed all richness resulted from the initial colonization event at t = 0.

Hypothesis Testing Framework

Analyses of the regional time-for-speciation hypotheses have generally relied on simple correlational analyses of the relationship between species richness and one of the four time metrics described earlier (InitTime, LogInitTime, SumTime, LogSumTime). Although such correlations presumably indicate some role for evolutionary time in generating regional diversity differences, they provide little insight into the potential role of diversity-dependent and diversity-independent controls on species richness, because significant age-diversity relationships may be observed even if a strong signal of diversity dependence is present in the data. Importantly, the fit of simple constant-rate birth–death models (e.g., time-for-speciation models) is generally not compared against alternative models that entail logistic diversity dynamics within regions. Appropriate data transformation is critical for these analyses because log transformation of time will compress the time axis and can weaken the signal of logistic diversity accumulation through time (Fig. 5). If ‘older’ regions are at carrying capacity, a log(time) transformation can lead to an apparent linear increase of richness through time and researchers may miss the signal of asymptotic diversity accumulation within regions (Fig. 5). Likewise, a true linear relationship between log(N) and time may be misinterpreted if time is log-transformed (Figs 1 and 2).

Figure 5.  Logarithmic transformation of time can effectively linearize a logistic diversity accumulation curve. Upper figure shows theoretical diversity-through-time curve (light grey line) for five sampled regions that share a common carrying capacity of K = 100 species but that differ in their initial time of colonization. The fit of a linear model (dashed line) to this relationship is relatively poor. However, logarithmic transformation of time for the same data (bottom) greatly improves the fit of the simple linear model by reducing the contribution of ‘older’ regions to the fitted model. Log transformation of time is thus especially inappropriate for testing hypotheses about the factors that mediate the dynamics of species richness within regions.

To illustrate how comparison of multiple alternative models can provide insight into regional diversity dynamics, I reanalysed several published data sets on patterns of elevational species richness in a range of vertebrate taxa. For each data set, I compared a simple time-for-speciation model to a model that specified diversity-dependent dynamics within regions. I modelled the time-for-speciation effect with a constant-rate birth–death model (eqn 3), noting that this model outperformed all other approaches even under high rates of secondary regional colonization (Fig. 3). For the diversity-dependent model, I considered a simple logistic model of diversity accumulation within regions, with richness given by

  • $N_t = \dfrac{K N_0 e^{rt}}{K + N_0 (e^{rt} - 1)}$ (eqn 7)

where K is the regional carrying capacity and r is the net rate of lineage accumulation within regions. This model is the solution to a standard differential equation of logistic population growth, or

  • $\dfrac{dN}{dt} = rN \left( 1 - \dfrac{N}{K} \right)$ (eqn 8)

The logistic model should fit the data best if evolutionary time predicts species richness for recently colonized regions but not for regions that have been occupied for substantial amounts of time. The birth–death (time-for-speciation) and logistic models have exactly two parameters each (birth–death: r, κ; logistic: r, K), and for the purposes of this exercise, I assume that rates of species diversification do not vary among regions and that regions do not vary in their carrying capacity.
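
The logistic model can be sketched numerically as below, using the standard closed-form solution of the logistic growth equation. The function names are hypothetical, and the default of a single founding lineage (N0 = 1) is an assumption made so that the model has only the two free parameters r and K.

```python
import math


def logistic_richness(t, r, K, n0=1.0):
    """Richness at time t under logistic growth with carrying capacity K."""
    e = math.exp(r * t)
    return K * n0 * e / (K + n0 * (e - 1.0))


def logistic_rate(n, r, K):
    """Right-hand side of the logistic growth equation, dN/dt = r N (1 - N/K)."""
    return r * n * (1.0 - n / K)
```

Richness starts at N0, grows roughly exponentially while N is far below K, and levels off at K, so regions that were colonized long ago are expected to show little further increase with time.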

Note that the logistic model considered here is not conditioned on clade survival to the present, because the simple birth–death probability of clade extinction is not applicable to diversity-dependent scenarios. For example, consider a clade that is regulated by diversity-dependent speciation and extinction rates. This model is potentially consistent with a clade extinction probability of zero, even though speciation and extinction rates may be simultaneously high and balanced at equilibrium clade size. Clades must be small to go extinct, but extinction rates may be very low for small clade sizes (e.g., McInnes, Orme & Purvis 2011). Clade extinction as a result of stochastic fluctuations in clade size may be extremely unlikely if extinction rates are inversely correlated with clade size.

As a process-independent alternative, I modelled species richness using a simple linear model, such that log(N) = β0 + β1t0. This model is expected to fit the data well if time-within-regions is the dominant process influencing species richness and if background extinction rates have been low. Alternatively, this model could fit the data as well as or better than the alternatives if there is no pattern in the data. The latter possibility is consistent with a range of processes, from diversity-dependent regulation of richness (Rabosky 2010) to systematic error in the estimation of regional colonization times. Models were fitted using nonlinear least squares regression, and small-sample corrected AIC scores (AICc) were computed for each model.
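
For least squares fits, AICc is commonly computed from the residual sum of squares. The sketch below uses that common form as an assumption; the text does not give the exact expression or state how parameters were counted (for example, whether the residual variance is included in k), so the function name and conventions here are illustrative only.

```python
import math


def aicc_least_squares(rss, n, k):
    """AICc for a least-squares fit (common textbook form, assumed here).

    rss: residual sum of squares
    n:   number of observations (e.g., elevational bands)
    k:   number of estimated parameters under the chosen counting convention
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    aic = n * math.log(rss / n) + 2 * k
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)
```

Because the candidate models are fitted to the same data, only differences in AICc among them matter; a lower score indicates better support.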

I fit these models to data drawn from (i) Mesoamerican hylid frogs (Smith et al. 2007), (ii) Mesoamerican salamanders (Wiens et al. 2007), (iii) cyprinid fishes on the Tibetan plateau (Li et al. 2009) and (iv) plethodontid salamanders from the Appalachian mountains of eastern North America (Kozak & Wiens 2010). The data in each study consist of an estimate of the number of species that occur within a set of elevational bands (e.g., 800–1000 m), as well as an estimate of the earliest colonization time of that elevational band. Colonization time estimates were derived from ancestral state reconstruction on time-calibrated phylogenetic trees and represent the earliest dated node on the phylogeny that could be assigned with a high level of confidence to a particular elevational band. All four studies tested the regional time-for-speciation hypothesis by assessing the relationship between log(N) and log-transformed time-of-initial-colonization and/or the log-transformed sum of colonization times. Based on theoretical considerations and simulation results presented in this paper, I base my reanalysis on the relationship between log(N) and untransformed time.

Results

A logistic model fits the data better than a simple time-for-speciation model for all four data sets (Table 1). For Mesoamerican and Appalachian salamander data sets, the conditional probability of the logistic model given the candidate set of models is ≥0·99 (Δi > 0·99). Visual assessment of the regional time-diversity relationship strongly suggests that rates of species accumulation within regions have slowed through time, particularly in the three amphibian data sets (Fig. 6). In each case, the fitted birth–death models require extremely high relative extinction rates (κ = 1; or λ ≈ μ) to fit the observed data. Given that I have used these models to approximate lineage diversification within regions without accounting for secondary regional colonizations, specific parameter estimates should be treated with caution. In each of these data sets, it is clear that evolutionary time-within-regions is a substantial predictor of species richness only for regions that have been recently colonized.

Table 1.   AICc scores for three models of regional diversity accumulation fitted to four data sets. Akaike weight (Δi) of each model is given in parentheses. A two-parameter logistic model fits best in each case.

Data set                   References             Linear         Birth–death    Logistic
Hylid frogs                Smith et al. (2007)    25·3 (0)       12 (0·19)      9·2 (0·81)
Mesoamerican salamanders   Wiens et al. (2007)    27·2 (0)       3·7 (<0·01)    −8·4 (>0·99)
Tibetan cyprinids          Li et al. (2009)       14·9 (0)       −2·0 (0·32)    −3·5 (0·68)
Appalachian salamanders    Kozak & Wiens (2010)   −10·2 (0·01)   8·4 (0)        −19 (0·99)

Figure 6.  Log-transformed species richness within elevational bands as a function of the initial colonization time for four groups of vertebrates. Grey line denotes fitted logistic model (eqn 7), and dotted line denotes fitted birth–death (time-for-speciation) model (eqn 3). For each data set, the two-parameter logistic model outperforms the birth–death model (Table 1); this is especially true for both groups of salamanders (ΔAICc > 12). Inset figures show fitted relationship between log-richness and log-transformed time as presented in the original published sources. The log-time transformation results in low power to detect asymptotic diversity accumulation and leads to the perception that diversity has increased through time without bounds (Fig. 5).
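
The Akaike weights reported in parentheses in Table 1 (denoted Δi there) can be recovered from the AICc scores with the standard formula, as sketched below. The function name is hypothetical, and because the published scores are rounded, recomputed weights may differ from the table in the second decimal place.

```python
import math


def akaike_weights(scores):
    """Akaike weights for a set of AIC or AICc scores.

    Each weight is exp(-delta / 2) normalized to sum to one, where delta
    is the difference between a model's score and the lowest score.
    """
    best = min(scores)
    rel = [math.exp(-(s - best) / 2.0) for s in scores]
    total = sum(rel)
    return [x / total for x in rel]


# AICc scores for Tibetan cyprinids from Table 1:
# linear, birth-death, logistic.
cyprinid_weights = akaike_weights([14.9, -2.0, -3.5])
```

For the Tibetan cyprinid scores this gives weights of about 0·32 for the birth–death model and 0·68 for the logistic model, in agreement with Table 1.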

Discussion

These results have a number of methodological implications for the study of regional species accumulation curves. First, researchers must be cautious when applying data transformations in the study of regional age-diversity relationships. I have shown that a logarithmic transformation of time is unwarranted on theoretical grounds and has poor statistical performance in practice. Second, seemingly reasonable metrics to quantify evolutionary time, such as the sum of all colonization times, do a poor job of predicting patterns of species richness within regions, even when secondary colonization rates are high (Fig. 3). It is possible that simulation scenarios could be identified where these or other metrics would perform better (or worse) than the simple scenarios considered here. My results indicate that such approaches should not be used without theoretical or simulation-based analyses of their statistical performance. Simple approaches that ignore all secondary colonizations perform surprisingly well, even though these methods are at best approximations that ignore the dynamics of interregional dispersal.

Most importantly, confronting regional age-diversity patterns with multiple alternative models can provide a far richer perspective on the factors that influence diversity dynamics. Previous research on this topic has generally entailed asking simply whether species richness is significantly correlated with evolutionary time-within-regions. This is an important question, but mere demonstration of a positive age-diversity correlation within regions cannot rule out the possibility that diversity-dependent dynamics play an important role in mediating regional richness patterns.

My analysis of elevational diversity patterns in four groups of vertebrates suggests that in each case, the tempo and mode of diversity increase through time is more consistent with a logistic model than an unconstrained time-for-speciation process (Table 1; Fig. 6). Indeed, within each data set, regions with initial colonization times greater than the median value show effectively no relationship between time and richness (Fig. 6). These results suggest the possibility of diversity-dependent regulation of regional richness and are consistent with some analyses of regional diversification dynamics (Weir 2006).

Although previous analyses have suggested that simple decoupling between evolutionary time and richness can reflect diversity dependence (Ricklefs 2007, 2009; Rabosky 2009a), the results presented here (Fig. 6) provide a much more robust form of evidence for this phenomenon. I have not merely documented a lack of relationship between age and richness; rather, the focal groups appear to contain the signal of a growth phase (‘young’ regions) as well as an equilibrial phase (‘older’ regions). This intriguing dimension of these regional diversity accumulation curves was only apparent after analysis of the data on an appropriate time-scale: logarithmic transformation of time in the original studies masked an apparent asymptote in diversity with respect to time within elevational bands (Fig. 6).
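The masking effect of a log-time axis can be illustrated with a small numerical sketch. It assumes a generic logistic growth curve (which may differ in parameterization from eqn 7), hypothetical parameter values, and evenly spaced hypothetical colonization times; none of these come from the four data sets. It compares how well a straight line describes log-richness against raw time and against log-transformed time.

```python
import math

def logistic_richness(t, r=0.3, k=50.0, n0=1.0):
    """Richness under logistic growth from n0 species toward capacity k (assumed form)."""
    return k / (1.0 + ((k - n0) / n0) * math.exp(-r * t))

def r_squared(x, y):
    """Squared Pearson correlation, i.e. R^2 of a straight-line fit of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

# Hypothetical colonization times (arbitrary units): 2, 4, ..., 40
times = [2.0 * i for i in range(1, 21)]
log_richness = [math.log(logistic_richness(t)) for t in times]

r2_raw_time = r_squared(times, log_richness)
r2_log_time = r_squared([math.log(t) for t in times], log_richness)
print(f"R^2 of a straight line on raw time: {r2_raw_time:.2f}")
print(f"R^2 of a straight line on log time: {r2_log_time:.2f}")
```

Under these assumptions the straight line fits more tightly on the log-time axis, even though the underlying curve is flat for the older half of the regions. A strong log–log correlation therefore says little about whether diversity is still increasing.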

Although a logistic model generally outperforms a simple time-for-speciation model, some caution is needed in interpreting this pattern. It is possible that this pattern reflects error in estimation of colonization times at deeper nodes in the phylogeny. Specifically, if the ages of old nodes – but not young nodes – are associated with considerable error variance, then error alone could potentially lead to a decoupling of time-within-regions and species richness. Likewise, any biases associated with ancestral geographic state reconstruction that were present in the original analyses were propagated through my reanalysis. However, it is unclear how this could lead to a decoupling of regional age and richness for older nodes only.

Another general issue in testing these models is that for much of parameter space, a diversity-dependent/logistic process and a constant-rate birth–death process will result in similar diversity trajectories through time. In fact, for the balanced speciation-extinction process (λ ≈ μ), rates of diversity accumulation will appear to undergo a temporal deceleration (e.g., Fig. 6). Species richness of young clades under this model will appear to accumulate at a rate approaching λ, but will appear to decrease to rate λ − μ for old clades. This is entirely a retrospective phenomenon that occurs as a result of lineage extinctions: with high extinction, there are many young lineages at any point in time, but most of these will go extinct. In salamanders, richness patterns within regions generally appear inconsistent with a simple birth–death process. However, whether we believe a constant-rate model can adequately explain patterns in hylid frogs and Tibetan cyprinids depends on our prior beliefs about extinction rates. It is certainly true that extinction rates are high in the fossil record (Gilinsky 1994; Alroy 2000, 2008), but it seems unlikely that they have been equal to speciation rates for large clades that have survived to the present to be observed. For example, the long-term average extinction rate in mammals approaches the speciation rate (Alroy 1996), but this estimate includes clades that have gone extinct in their entirety (Bininda-Emonds et al. 2007) and presumably overestimates rates for clades that have survived to the present day. In a Bayesian context, a prior hypothesis that λ > μ will only increase support for a logistic model for data similar to those shown in Fig. 6.
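The apparent deceleration under a constant-rate process can be checked numerically. The sketch below uses the standard expectation for the richness of a clade that survives to time t under a constant-rate birth–death process starting from one lineage, E[N(t) | survival] = (λe^{(λ−μ)t} − μ)/(λ − μ). This is a textbook result and is assumed here; it may not be identical in form to eqn 3. The rate values are hypothetical.

```python
import math

def expected_richness_given_survival(lam, mu, t):
    """E[N(t) | clade survives] for a constant-rate birth-death process from one lineage."""
    if not lam > mu >= 0:
        raise ValueError("this sketch assumes lam > mu >= 0")
    r = lam - mu
    return (lam * math.exp(r * t) - mu) / r

def log_richness_slope(lam, mu, t):
    """Time derivative of log E[N(t) | survival]."""
    if not lam > mu >= 0:
        raise ValueError("this sketch assumes lam > mu >= 0")
    r = lam - mu
    growth = math.exp(r * t)
    return lam * r * growth / (lam * growth - mu)

# Hypothetical rates with high relative extinction (mu / lam = 0.9)
lam, mu = 0.5, 0.45
slope_young = log_richness_slope(lam, mu, 0.0)    # approaches lam
slope_old = log_richness_slope(lam, mu, 400.0)    # approaches lam - mu
print(slope_young, slope_old)
```

The slope of log-richness starts at λ for very young clades and declines toward λ − μ for old clades, so a deceleration appears without any diversity dependence. The closer μ is to λ, the larger the contrast between the two slopes.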

Regardless, further research is clearly needed to evaluate the central assumption of this paper and many other papers: that such geographic state reconstructions provide a useful window into the true amount of evolutionary time that lineages have occupied a particular geographic region. My analyses began with the explicit assumption that ancestral geographic state estimates are unbiased estimates of the true colonization times. No simulation-based study has rigorously evaluated the methods that underlie this body of research. For example, suppose that a set of geographic regions differ in regional carrying capacities and that species richness within each region is at equilibrium. Given that some geographic states are more common than others, it is possible that the most frequent character state will generally be reconstructed at the earliest nodes in the tree. This could potentially lead to a spurious time-for-speciation effect, but neither this nor other scenarios have been rigorously tested by simulation. Likewise, there is as yet no treatment of the basic problem of taxa that occur in multiple regions: how should researchers deal with these species, and how do they influence the nature of the analyses presented here and elsewhere?

The approaches utilized here may ultimately be superseded by new geographic state-dependent models that explicitly allow rates of speciation, extinction, and dispersal to vary among geographic regions (Goldberg, Lancaster & Ree 2011) and through time (Rabosky & Glor 2010). However, these approaches may be limited by the low power of state-dependent diversification models (Maddison, Midford & Otto 2007), by difficulties modelling continuous geographic distributions on phylogenetic trees (Goldberg, Lancaster & Ree 2011) and by incomplete taxon sampling in molecular phylogenies (FitzJohn, Maddison & Otto 2009; Cusimano & Renner 2010; Brock, Harmon & Alfaro 2011). One challenge will be to accommodate the potential explosion of character states that will occur in these models when individual lineages are able to occupy multiple geographic regions. For example, in the basic GeoSSE model described by Goldberg, Lancaster & Ree (2011), lineages can occupy either of two geographic regions, or they may occupy both. The number of possible states for such a model will increase as a function of the number of combinations of geographic regions (Ree et al. 2005; Ree & Smith 2008). Great progress has been made in jointly modelling geography and diversification in recent years, and both clade-based and time-of-colonization-based (Smith et al. 2007; Wiens et al. 2007; Li et al. 2009) approaches potentially have much to contribute to our general understanding of the dynamics and causes of species richness within geographic regions.
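The growth in the number of character states is easy to make concrete. The sketch below assumes that a lineage may occupy any non-empty combination of regions, with no constraints on adjacency or maximum range size. That is a simplifying assumption made here, not a feature of any particular published model, and the helper name `range_states` is illustrative.

```python
from itertools import combinations

def range_states(regions):
    """All non-empty sets of regions a lineage could occupy (unconstrained case)."""
    states = []
    for size in range(1, len(regions) + 1):
        states.extend(combinations(regions, size))
    return states

# Two regions give the three states of the two-region case: A, B, or both
print(range_states(["A", "B"]))  # [('A',), ('B',), ('A', 'B')]

# The state count is 2**n - 1 for n regions
for n_regions in (2, 4, 8):
    print(n_regions, len(range_states(list(range(n_regions)))))
```

With eight regions the unconstrained state space already holds 255 states, each of which could in principle carry its own speciation, extinction and dispersal parameters.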

Summary and recommendations

These results suggest a number of practical recommendations:

  1. Logarithmic transformation of time is not appropriate for evaluating the regional time-for-speciation hypothesis. Most importantly, log transformation of time can effectively linearize an asymptotic diversity curve (Figs 5 and 6), leading to potentially misleading conclusions about the relationship between evolutionary time and species richness.
  2. Theory and/or simulation should be used to evaluate the performance of metrics that purport to quantify total evolutionary history within regions. Here, I have shown that a widely used metric involving summed colonization times does not perform as well as simple metrics that ignore complex dynamics of interregional dispersal.
  3. Simple tests for time-diversity correlations represent a minimally adequate approach for testing the time-for-speciation hypothesis. Much greater insight into regional diversity dynamics can result from fitting process-based models to regional diversity accumulation curves.
  4. Time-of-colonization methods remain a useful tool in the study of regional diversity dynamics. However, additional work is needed to evaluate whether these methods provide robust estimates of colonization times under alternative models of diversity regulation within regions and/or in the presence of regional differences in speciation, extinction and dispersal rates.
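As a rough illustration of the third recommendation, the sketch below fits two alternative models to a simulated regional diversity accumulation curve and compares them by AICc. Everything in it is an assumption made for illustration: the data are synthetic, the logistic form is a generic one rather than eqn 7, the unbounded alternative is a plain straight line in log-richness rather than any model fitted in Table 1, the fits are least squares with a coarse grid search, and AICc uses the least-squares form n·ln(RSS/n) + 2k with the small-sample correction.

```python
import math
import random

def logistic_log_richness(t, r, k):
    """Log richness under logistic growth from a single colonist (assumed form)."""
    return math.log(k / (1.0 + (k - 1.0) * math.exp(-r * t)))

def aicc_least_squares(rss, n, n_params):
    """AICc for a least-squares fit; n_params counts the error variance too."""
    aic = n * math.log(rss / n) + 2 * n_params
    return aic + 2 * n_params * (n_params + 1) / (n - n_params - 1)

def rss_line(x, y):
    """Residual sum of squares of the ordinary least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

def rss_logistic(x, y):
    """Smallest residual sum of squares over a coarse grid of (r, k) values."""
    best = float("inf")
    for r_step in range(1, 101):        # r = 0.01 .. 1.00
        for k_step in range(2, 201):    # k = 2 .. 200
            r, k = r_step / 100.0, float(k_step)
            rss = sum((yi - logistic_log_richness(xi, r, k)) ** 2 for xi, yi in zip(x, y))
            best = min(best, rss)
    return best

# Synthetic regions: hypothetical colonization times and noisy log-richness
rng = random.Random(1)
times = [2.0 * i for i in range(1, 21)]
observed = [logistic_log_richness(t, 0.3, 50.0) + rng.gauss(0.0, 0.2) for t in times]

n = len(times)
aicc_line = aicc_least_squares(rss_line(times, observed), n, 3)          # a, b, variance
aicc_logistic = aicc_least_squares(rss_logistic(times, observed), n, 3)  # r, k, variance
print(f"AICc straight line: {aicc_line:.1f}")
print(f"AICc logistic:      {aicc_logistic:.1f}")
```

Because the data were simulated from a saturating curve, the logistic model should receive the lower AICc. A simple correlation test on the same data would report only that richness increases with time.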

Acknowledgements

I thank M. Alfaro, J. Losos, I. Lovette, A. Rabosky, R. Ricklefs and G. Slater for comments on the manuscript and/or discussion of these topics. I also thank J. Wiens and C. Fu for providing data. This research was supported by the Miller Institute for Basic Research in Science at UC Berkeley.

References
