Convergent evolution is among the strongest lines of evidence for the power of natural selection to shape organisms to their environment (Simpson 1953; Harvey & Pagel 1991; Losos 2011). The repeated evolution of similar phenotypes in similar environments implies a deterministic aspect of phenotypic evolution. In some evolutionary radiations, including African cichlids (Kocher et al. 1993), Caribbean Anolis lizards (Losos 2009) and Hawaiian Tetragnatha spiders (Gillespie 2004), communities of similar ecological specialists have evolved largely independently. Such clade-wide convergence can be interpreted as lineages independently responding to the same selective regimes, or equivalently, discovering the same adaptive peaks on a macroevolutionary adaptive landscape (Schluter 2000). In what follows, we do not distinguish between convergence and parallelism, as either represents evidence for nonrandom evolutionary change, and as we model convergence at the phenotypic level without addressing its underlying genetic and developmental basis (Simpson 1953; Arendt & Reznick 2008; Losos 2011).
While studies of convergence have resulted in many key insights into adaptation and adaptive radiation, several issues complicate the statistical detection of exceptional convergence in continuous traits. First, some lineages may evolve similar trait values by chance even in the absence of deterministic convergence. Simulations demonstrate that even a random walk (Brownian motion) model of evolution can lead to considerable incidental convergence, especially if trait space is low-dimensional (Stayton 2008). If repeated convergent evolution is to be interpreted as adaptation to shared environments, the frequency of convergence should be distinguishable from what is expected by chance. More subtly, tests for convergence may be motivated by the observed similarity of sets of species, such as ‘ecomorphs’ that have evolved similar morphology in response to similar ecological conditions (Williams 1972; Gillespie 2004; Losos 2009). The a priori identification of ecomorphs creates the potential for bias in tests for convergence, for two reasons. First, nonecomorph species may be ignored in the analysis, exaggerating the extent of phenotypic clustering in a clade (Losos et al. 1998; Beuttell & Losos 1999). Second, testing whether ecomorphs are convergent in a set of traits has an element of circularity if those traits played any role in ecomorph designation. Tests for convergence should be able to rule out phenotypic similarity due to chance, and to avoid identifying candidate convergent species a priori when it is inappropriate to do so.
We present a new method for identifying convergent evolution without the a priori designation of ecomorphs or selective regimes. The method takes as input only a phylogenetic tree and continuous trait data, and fits a series of stabilizing selection models to identify cases where multiple lineages have discovered the same selective regimes. Our method is called ‘SURFACE’, a recursive acronym for ‘SURFACE Uses Regime Fitting with Akaike Information Criterion (AIC) to model Convergent Evolution’. It builds upon two recent developments in comparative phylogenetic analysis: methods allowing selective regimes to be ‘painted’ onto the branches of a phylogenetic tree (Hansen 1997; Butler & King 2004; Beaulieu et al. 2012), and data-driven stepwise algorithms that locate evolutionary shifts on a tree (Alfaro et al. 2009; Thomas & Freckleton 2012). SURFACE consists of a ‘forward’ stepwise phase in which selective regimes are added to the tree, followed by a ‘backward’ phase that identifies cases where the same regime is reached by multiple lineages (Fig. 1). This results in an estimate of the macroevolutionary adaptive landscape that includes measures of the extent of phenotypic convergence.
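The logic of the backward phase can be illustrated with a deliberately simplified, non-phylogenetic sketch. It assumes that each regime is just a group of trait values with its own mean and a shared variance, scores each grouping with AICc, and greedily merges the pair of regimes whose collapse most improves the score. The function names (`aicc`, `grouped_aicc`, `collapse_regimes`), the Gaussian likelihood and the example values are all hypothetical stand-ins rather than the Hansen model or the surface package's own routines, and this sketch accepts one collapse per iteration.

```python
import math
from itertools import combinations


def aicc(log_lik, n_params, n_obs):
    """Small-sample corrected Akaike Information Criterion."""
    correction = 2 * n_params * (n_params + 1) / (n_obs - n_params - 1)
    return -2 * log_lik + 2 * n_params + correction


def grouped_aicc(groups):
    """AICc of a toy model: one mean per group plus one shared variance."""
    n_obs = sum(len(g) for g in groups)
    rss = 0.0
    for g in groups:
        mean = sum(g) / len(g)
        rss += sum((x - mean) ** 2 for x in g)
    var_hat = rss / n_obs  # maximum-likelihood estimate of the variance
    log_lik = -0.5 * n_obs * (math.log(2 * math.pi * var_hat) + 1)
    return aicc(log_lik, len(groups) + 1, n_obs)


def collapse_regimes(groups, min_improvement=0.0):
    """Greedily merge pairs of regimes while AICc keeps improving."""
    groups = [list(g) for g in groups]
    current = grouped_aicc(groups)
    while len(groups) > 1:
        candidates = []
        for i, j in combinations(range(len(groups)), 2):
            rest = [g for idx, g in enumerate(groups) if idx not in (i, j)]
            candidate = rest + [groups[i] + groups[j]]
            candidates.append((grouped_aicc(candidate), candidate))
        best_score, best_groups = min(candidates, key=lambda c: c[0])
        if current - best_score <= min_improvement:
            break  # no collapse improves the model enough
        groups, current = best_groups, best_score
    return groups


# Hypothetical trait values for four regimes found by a forward phase;
# regimes 1 and 3 sit near 0, regimes 2 and 4 sit near 5.
regimes = [
    [0.1, -0.2, 0.05, 0.0],
    [5.0, 5.2, 4.9, 5.1],
    [0.15, -0.1, 0.0, 0.05],
    [5.05, 4.95, 5.1, 5.0],
]
collapsed = collapse_regimes(regimes)
```

With these made-up values, the four starting regimes collapse to two, because merging regimes with nearly identical means saves a parameter at almost no cost in fit, whereas merging the two remaining regimes would inflate the residual variance.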
Figure 1. The forward and backward phases of SURFACE. (a) Generating Hansen model used to simulate trait evolution on a pure-birth tree with a total depth of 10 My, painted with one ancestral and two convergent regimes (shifts denoted * and #). Three traits (values proportional to symbol size) were simulated with relatively rapid adaptation (α = 0.5, σ² = 0.25). (b–f) Steps of the forward phase in which a regime shift is added to the branch with the lowest ∆AICc score (only values <10 shown). (g) ∆AICc values for each candidate pairwise regime collapse in the backward phase (all collapses were compatible and completed in one step). (h) Hansen model returned by SURFACE: in this case, all regime shifts were recovered.
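To give a sense of what 'relatively rapid adaptation' means for the parameter values in Fig. 1, the following sketch evaluates the standard moments of a single-trait Ornstein-Uhlenbeck process along one lineage. It assumes the usual parameterization, in which α pulls the trait towards an optimum θ and σ² scales the stochastic component; the helper names are hypothetical and the code ignores the tree structure entirely.

```python
import math


def ou_half_life(alpha):
    """Time for the expected distance to the optimum to halve."""
    return math.log(2) / alpha


def ou_mean(x0, theta, alpha, t):
    """Expected trait value after time t, starting from x0 with optimum theta."""
    return theta + (x0 - theta) * math.exp(-alpha * t)


def ou_variance(sigma2, alpha, t):
    """Variance of the trait after time t under a constant regime."""
    return sigma2 / (2 * alpha) * (1 - math.exp(-2 * alpha * t))


alpha, sigma2 = 0.5, 0.25           # values quoted in the Fig. 1 caption
half_life = ou_half_life(alpha)     # in the same time units as the tree (My)
remaining = math.exp(-alpha * 10)   # fraction of initial distance left after 10 My
stationary_var = sigma2 / (2 * alpha)
```

Under these assumptions the half-life is ln(2)/α, or roughly 1.4 My, which is short relative to the 10 My depth of the simulated tree, so lineages that shift regime reasonably early are expected to sit close to their new optimum by the present.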
The SURFACE method is implemented as open source software in the R environment (R Core Team 2012), and is available as the extension package surface from the Comprehensive R Archive Network (CRAN). The surface package calls functions in the ouch package (Butler & King 2004) to fit models with selective regime shifts, and incorporates functions from the ape (Paradis, Claude & Strimmer 2004), geiger (Harmon et al. 2008), pmc (Boettiger, Coop & Ralph 2012), and igraph (Csardi & Nepusz 2006) packages. The two phases of the SURFACE algorithm are carried out by the functions surfaceForward and surfaceBackward, or by the wrapper function runSurface. These functions take as input a phylogenetic tree (this can contain polytomies, which should be left unresolved), and data for one or more continuous traits for each species in the tree. Other features of surface include the function surfaceSimulate for generating data sets, utilities for converting between data formats and accessing outputs, functions for visualizing the results of an analysis, and a vignette that demonstrates the major features of the package.
We have described a new method for inferring the macroevolutionary adaptive landscape for a clade, allowing the assessment of phenotypic convergence given only a phylogenetic tree and continuous trait measurements. SURFACE fills a gap in the set of currently available phylogenetic comparative methods by combining features of two recent developments. First, recent applications of the OU model allow researchers to paint the branches of a tree with hypothesized selective regimes (Butler & King 2004; Beaulieu et al. 2012). This provides a powerful way to test whether taxa in similar environments have evolved similar phenotypes, but does not solve the issue of circularity if hypothetical regimes were identified in part based on the traits being modelled. The second development is a set of methods for fitting shifts in evolutionary parameters to a tree without an a priori hypothesis about where the shifts should occur. The first of these methods, MEDUSA, uses stepwise AIC to locate shifts in speciation and extinction rates on a tree (Alfaro et al. 2009), and subsequent methods allow shifts in the Brownian rate of trait evolution σ² to be located using stepwise AIC (trait MEDUSA: Thomas & Freckleton 2012) or Bayesian Markov chain Monte Carlo methods (Eastman et al. 2011; Venditti, Meade & Pagel 2011; Revell et al. 2012). The method most similar to ours is MATICCE, which uses a model-averaging information theoretic approach to evaluate support for a number of candidate Hansen models (Hipp & Escudero 2010). The major differences are that SURFACE does not take candidate regime shift scenarios as inputs, and that it includes routines for evaluating whether regimes are convergent.
The main innovation of SURFACE is its backward phase, which assesses whether multiple regime shifts are towards the same regimes. Our simulations show that SURFACE performs fairly well at recovering the true convergent and nonconvergent regimes in simulated data sets under a range of conditions, particularly given multidimensional trait data and fast adaptation to new optima (Figs 3 and 4). In general, features that increase the degree to which taxa in the same regime are clustered in trait space should improve the performance of SURFACE. Greater trait dimensionality increases the likelihood of separation in at least one dimension, while more widely spaced optima, faster adaptation, or lower rates of stochastic evolution should lead to a greater signal of deterministic vs. stochastic evolutionary processes. We have also described how simulations can be used to test evolutionary hypotheses, such as whether the extent of convergence is greater than expected under a given null model. This test has good statistical power given multidimensional trait data and a moderate or high extent of convergence in the generating model (Fig. 5). We leave it to users to decide whether this null model approach is appropriate to test their hypotheses of interest, or if they prefer to make inferences strictly based on the model AICc and parameter values.
Figure 5. Approximate statistical power and type I error of the simulation-based hypothesis test for unexpectedly high convergence, given different numbers of taxa (n) and traits (m) and different true levels of convergence (∆k). Each point shows the proportion of significant tests out of 20 ‘true’ simulated data sets, using 50 resimulated data sets from a Hansen null model without convergence to generate a null distribution of ∆k.
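The simulation-based test summarized in Fig. 5 can be sketched as follows. The code assumes that a convergence statistic such as ∆k has already been computed for each 'true' data set and for each data set resimulated under the null model; it then computes an upper-tail P-value for each true data set and reports the proportion of significant tests. The function names, the add-one correction to the P-value and the numerical values are assumptions made for illustration, not details taken from the surface package.

```python
def upper_tail_p(observed, null_values):
    """Proportion of null statistics at least as large as the observed one.

    Uses the common (count + 1) / (n + 1) convention so that the P-value
    can never be exactly zero; this choice is an assumption of the sketch.
    """
    exceed = sum(1 for value in null_values if value >= observed)
    return (exceed + 1) / (len(null_values) + 1)


def proportion_significant(observed_stats, null_sets, level=0.05):
    """Share of data sets whose observed statistic is significant."""
    hits = sum(
        1
        for observed, null_values in zip(observed_stats, null_sets)
        if upper_tail_p(observed, null_values) <= level
    )
    return hits / len(observed_stats)


# Hypothetical null distribution of the statistic from 50 resimulated data sets.
null_distribution = [0] * 40 + [1] * 8 + [2] * 2

# Hypothetical observed statistics for four 'true' data sets.
observed = [3, 3, 2, 0]
power = proportion_significant(observed, [null_distribution] * 4)
```

When the generating model contains convergence, this proportion estimates statistical power; when it does not, the same proportion estimates the type I error rate. Note that with 50 null data sets and this convention, the smallest attainable P-value is 1/51.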
The ability to carry out data-driven tests for exceptional convergence presents an opportunity to re-evaluate clades that have previously been recognized as containing many cases of convergence. In this study, an analysis of morphological evolution in Tetragnatha spiders on Hawaii identified only a limited extent of convergence that occurred between subclades (Fig. 2), although it is important to note that we could not incorporate the discrete characters understood to be convergent within subclades (Gillespie, Croom & Hasty 1997; Gillespie 2004). SURFACE provides an opportunity to objectively evaluate the extent of convergence in other clades, including classic replicated radiations such as cichlids in Africa's Great Lakes (Kocher et al. 1993).
SURFACE may have several additional uses to researchers interested in a data-driven estimation of the adaptive landscape. For example, one may wish to test whether regime shifts are nonrandomly associated with biogeographical events, or are concentrated early or late in a clade's history, although it is important to remember that regime groupings of extant taxa and broad measures of convergence are recovered more reliably than precise positions of regime shifts (Figs 3 and 4). Other hypotheses may concern the inferred adaptive landscape, which could be compared to a landscape predicted on the basis of resource distributions and phenotype-resource use mapping (e.g. Schluter & Grant 1984). The methods described here can potentially be extended to allow simulation-based hypothesis tests tailored to a range of biologically motivated questions.
SURFACE infers a Hansen model without the need to paint selective regimes onto the tree in advance, but it may often be more appropriate to test specific adaptive hypotheses. This is particularly the case when the adaptive hypothesis was identified using traits other than those being fitted with the Hansen model. For example, one may test whether habitat use results in convergent evolution of morphology (Collar, Schulte & Losos 2011), or whether ecomorphs recognized by one set of traits (e.g. microhabitat and morphology) are also convergent in other traits (e.g. sexual dimorphism; Butler & King 2004). We do note that SURFACE should be robust to a ‘trickle-down’ effect that can mislead inference when hypothetical evolutionary shifts are placed on specific branches, whereby a shift on a phylogenetically nested branch may provide false support for a hypothesized shift on an earlier branch (Moore & Chan 2004; Revell et al. 2012).
The choice of traits will be an important component of any SURFACE analysis. First, as the method assumes that traits have independent rates of adaptation (α) and diffusion (σ²), traits with strong evolutionary correlations should be avoided. SURFACE performs poorly when given only a single trait, but two to four traits are often enough to ensure good performance (Figs 3 and 4), assuming these traits are in fact affected by the selective regime shifts. Including too many traits, especially axes lacking clear biological interpretation or unlikely to be involved in environmental adaptation, may limit the ability of SURFACE to recover convergence in ecomorphological traits. On the other hand, selection of only traits already believed to be convergent may predispose the analysis to finding a positive result. Researchers using SURFACE should ensure that the number and type of input traits are appropriate for addressing a given question in their clade of interest.
SURFACE uses stepwise AICc as a computationally tractable means of exploring the space of possible evolutionary scenarios (Alfaro et al. 2009; Thomas & Freckleton 2012). This stepwise approach has drawbacks: the constraint of adding one regime shift per step means the optimal configuration may not be found, and the answer can be sensitive to the choice of ∆AICc* and the topology and branch lengths of the tree. While SURFACE can be run on multiple credible trees and can optionally allow stochasticity in the sequence of regimes added, there is still no guarantee of finding an optimal model or fully quantifying uncertainty. Future Bayesian methods may allow a more thorough exploration of model space and a better accounting for uncertainty in the placement and degree of convergence of regimes and in the phylogeny itself. In the meantime, SURFACE offers a valuable step forward in the application of comparative methods to test hypotheses about convergent evolution.
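The sensitivity of a stepwise search to the acceptance threshold can be seen in a small sketch. It assumes that the AICc of the best candidate model at each successive step is already known, and counts how many steps are accepted before the improvement falls to or below a chosen threshold. The function name, the acceptance rule as written and the AICc values are hypothetical and serve only to illustrate the point.

```python
def accepted_steps(aicc_path, threshold=0.0):
    """Count stepwise additions accepted before the search stops.

    aicc_path[0] is the AICc of the starting model; each later entry is the
    AICc of the best candidate at that step. A step is accepted only if it
    lowers AICc by more than `threshold`.
    """
    steps = 0
    for previous, candidate in zip(aicc_path, aicc_path[1:]):
        if previous - candidate <= threshold:
            break  # improvement too small: stop the stepwise search
        steps += 1
    return steps


# Hypothetical AICc values along a forward phase.
path = [120.0, 95.0, 80.0, 78.5, 78.2, 79.0]
lenient = accepted_steps(path, threshold=0.0)
strict = accepted_steps(path, threshold=2.0)
```

With these made-up scores, a threshold of zero accepts four regime shifts, whereas a threshold of two accepts only the first two, so the final model can differ substantially depending on how small improvements are treated.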
Many clades have long been understood to contain extensive convergence, but statistically appropriate methods for testing the extent of convergence have been lacking. SURFACE allows reassessment of such data sets, and can be used to test whether convergence is greater than expected by chance. As an objective tool for characterizing the macroevolutionary adaptive landscape of a clade, SURFACE provides many new opportunities to understand the dynamics of adaptive radiation.