It is widely recognized that macroecological patterns are not independent of the evolution of the lineages involved in generating these patterns. While many researchers have begun to evaluate the effect of ancestor–descendant relationships on observed patterns using the phylogenetic comparative method, most macroecological studies only utilize the cross-sectional comparative method to ‘remove the phylogenetic history’, without considering the option of evaluating its effect without removing it.
Currently, most researchers use this method without explicitly evaluating three fundamental evolutionary assumptions of the comparative method: (i) that the phylogeny is constructed without error (which implies evaluating phylogenetic uncertainty); (ii) that more closely related species tend to show more similar characters than expected by chance (which implies evaluating the phylogenetic signal); and (iii) that the model of character evolution effectively recapitulates the history of the characters (which implies comparing the fit of several evolutionary models and evaluating the uncertainty of the estimated model parameters).
Macroecological studies will benefit from the use of the comparative method to assess the effect of phylogenetic history without removing its effect. The comparative method will also allow for the simultaneous analysis of trait evolution and its impact on diversification rates; it is important to evaluate these processes together because they are not independent. In addition, explicit evaluations of the assumptions of comparative methods using Bayesian inferences will allow researchers to quantify the uncertainty of specific evolutionary hypotheses accounting for observed macroecological patterns.
We illustrate the usefulness of the method using the phylogeny of the genus Sebastes (Pisces: Scorpaeniformes), together with data on the body size–latitudinal range relationship to estimate the effect of phylogenetic history on the observed macroecological pattern.
Macroecology is a research program that focuses on the search for general principles or natural laws underlying the organization of ecological systems over distinct spatial and temporal scales (Brown 1999; Marquet 2001; Marquet 2002) using ecological statistics (Smith et al. 2008). Macroecology is a synthesis of multiple disciplines (Marquet 2001) that has emerged over the last two decades (Brown 1999; Smith et al. 2008) with the aim of improving understanding of ecological systems through the study of their general properties and the use of interdisciplinary questions, such as interactions between ecology, biogeography and macroevolution (Brown 1995; Gaston & Blackburn 2000; Maurer 2000; Blackburn & Gaston 2002, 2006). In macroecological analyses, individual species function as replicates in the search for emergent patterns (Brown & Maurer 1987; Kelt & Brown 2000), such as patterns related to geographic range, body size, population density, trophic status and species number (e.g. Brown & Maurer 1989; Blackburn & Gaston 2001). While this discipline has reported various consistent patterns in nature, such as the positive relationship between body size and range of distribution (Gaston & Blackburn 1996a,b,c), in general it has been difficult to identify the macroevolutionary processes that underlie these patterns, and, moreover, identify the cases in which it is necessary to consider a macroevolutionary hypothesis.
Current macroecological patterns are not independent of the evolutionary history of the lineages involved in generating these patterns (e.g. Maurer, Brown & Rusler 1992; Taylor & Gotelli 1994; Maurer 1998a,b; Gotelli & Taylor 1999; Cumming & Havlicek 2002; Diniz-Filho 2004). In this sense, the history of the lineages, or macroevolution (Futuyma 1998), could play a fundamental role in the origin of current macroecological patterns (Fig. 1) conferring a dynamic context to macroecological analyses and a potential process through which these patterns could originate.
The majority of traditional macroecological studies assumes the importance of processes of evolution and diversification (e.g. Brown 1995) as fundamental for the generation and maintenance of current macroecological patterns. Nevertheless, few studies conduct explicit evaluations of this assumption as the macroecological approach is generally static and does not explicitly consider the cladogenetic and anagenetic history of the taxa. For this reason–and because studies comparing related species are confounded if they do not take into account the phylogenetic relationships between taxa (Felsenstein 1985; Harvey & Pagel 1991; Martins 1996)–macroecologists have begun to evaluate the effect of ancestor–descendant relationships on observed patterns (e.g. Taylor & Gotelli 1994; Poulin 1995; Blackburn & Gaston 1998; Pyron 1999; Harvey 2000; Diniz-Filho & Tôrres 2002; Freckleton, Pagel & Harvey 2003; Knouft & Page 2003; Price 2003; Purvis, Orme & Dolphin 2003; Olifiers, Vieira & Grelle 2004; Hernández-Fernández & Vrba 2005; Rodriguez-Serrano & Bozinovic 2009). This approach, called the phylogenetic comparative method (PCM; Felsenstein 1985; Pagel & Harvey 1988; Harvey & Pagel 1991) has become a standard statistical approach for analysing interspecific data (Ashton 2004). It is based on the observation that phylogenetically related species tend to resemble each other in many aspects of their phenotype as well as in ecological characteristics more than is expected by pure chance, and thus, they cannot be considered to be independent points (e.g. Felsenstein 1985; Harvey & Pagel 1991). As a result, comparative studies that do not take into account phylogenetic relationships can present a high rate of Type I error in the evaluation of the hypotheses (see Felsenstein 1985, 1988; Harvey & Pagel 1991; Martins & Garland 1991; Garland, Harvey & Ives 1992; Garland et al. 1993; Díaz-Uriarte & Garland 1996; Harvey & Rambaut 1998). 
Because of this potential for error, the majority of macroecological studies that incorporate the comparative method have primarily aimed at solving the statistical problem of non-independence to carry out statistical analyses on patterns utilizing the cross-sectional comparative method (CSCM), which calculates the changes across the tips and nodes of a phylogeny to remove its history from the data. The CSCM includes phylogenetic independent contrasts (PICs), a technique frequently used by ecologists. According to Harvey & Pagel (1991), this is a statistical approach which allows the removal of the effect of shared ancestry on the variability in characters, such that the points (i.e. taxa) are statistically independent. This method is based on the Brownian-motion model of evolution (Felsenstein 1985; Pagel 1993). However, most researchers utilize CSCM without considering that if it is possible to ‘remove the phylogenetic history’ in the observed variation in current characters, then it must also be possible to evaluate the effect of history without having to remove it. This idea was conceptualized in the directional comparative method (DCM; i.e. using ancestral character states estimated explicitly or implicitly through evaluating the evolution of characters along the branches of a phylogenetic tree, sensu Pagel 1993) and recognizes that phylogenetic trees retain information about the mechanisms of evolutionary events that led to extant diversity (Harvey et al. 1991; Hey 1992; Nee, Mooers & Harvey 1992; Harvey, May & Nee 1994; Nee, May & Harvey 1994b; Nee et al. 1994a; Kubo & Iwasa 1995; Mooers & Heard 1997; Nee 2001). However, Pagel (1993, 1997) showed that the distinction between CSCM and DCM is unnecessary for the estimation of slope parameters under a pure Brownian-motion evolutionary model. Moreover, Blomberg et al. 
(2012) demonstrated that the estimates of the slope parameters from PICs (CSCM) and PGLS (Phylogenetic Generalized Least-Squares) methods are equivalent under a pure Brownian-motion model of evolution, reinforcing the idea that in this case the dichotomy is unnecessary. Nevertheless, PICs are usually valid only under Brownian-motion evolution and not when there are deviations from that model. In such cases, PGLS should be applied because of its strong statistical performance and ability to accommodate more complex evolutionary models (Diniz-Filho & Tôrres 2002; Revell 2010; Venditti, Meade & Pagel 2011).
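The equivalence of the PIC and PGLS slopes under pure Brownian motion can be checked numerically. The following is a minimal sketch, not the implementation used in any of the cited works: the three-species tree, the trait values and the helper function `pic` are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical tree ((A:1,B:1):1,C:2). A tip is a name; an internal node is a
# list of two (child, branch_length) pairs.
tree = [([("A", 1.0), ("B", 1.0)], 1.0), ("C", 2.0)]
x = {"A": 1.0, "B": 2.0, "C": 4.0}  # hypothetical predictor values
y = {"A": 2.0, "B": 1.0, "C": 5.0}  # hypothetical response values

def pic(node, trait):
    """Return (node value, extra branch length, contrasts) for a binary tree."""
    if isinstance(node, str):
        return trait[node], 0.0, []
    (left, v1), (right, v2) = node
    x1, e1, c1 = pic(left, trait)
    x2, e2, c2 = pic(right, trait)
    v1, v2 = v1 + e1, v2 + e2                       # lengthen branches below estimated nodes
    contrast = (x1 - x2) / np.sqrt(v1 + v2)         # standardized contrast
    value = (x1 / v1 + x2 / v2) / (1 / v1 + 1 / v2) # weighted ancestral value
    return value, v1 * v2 / (v1 + v2), c1 + c2 + [contrast]

cx = np.array(pic(tree, x)[2])
cy = np.array(pic(tree, y)[2])
slope_pic = np.sum(cx * cy) / np.sum(cx ** 2)       # regression through the origin

# PGLS under Brownian motion: C holds shared path lengths (tip order A, B, C).
C = np.array([[2.0, 1.0, 0.0], [1.0, 2.0, 0.0], [0.0, 0.0, 2.0]])
X = np.column_stack([np.ones(3), [x[t] for t in "ABC"]])
Y = np.array([y[t] for t in "ABC"])
Ci = np.linalg.inv(C)
intercept, slope_gls = np.linalg.solve(X.T @ Ci @ X, X.T @ Ci @ Y)
```

With these toy values both `slope_pic` and `slope_gls` take the same value, which illustrates why the CSCM/DCM dichotomy is immaterial for slope estimation when the Brownian-motion model holds.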
Macroecologists who use PICs and similar methods generally do not evaluate the three fundamental evolutionary assumptions of these approaches (Fig. 1a, 1b): (i) that the phylogeny is constructed without error, which implies taking into account phylogenetic uncertainty and utilizing it in the analysis (e.g. Huelsenbeck, Rannala & Masly 2000); (ii) that more closely related species tend to show more similar characters than expected by chance, which implies evaluating the phylogenetic signal of the variable or variables under study (see Revell 2010); and (iii) that the evolutionary model used is appropriate, which requires comparing different models (e.g. Brownian motion, Brownian motion-directional, Ornstein-Uhlenbeck; e.g. Collar, Schulte & Losos 2011) and evaluating the uncertainty of the estimated model parameters (e.g. Ronquist 2004).
Importance of evaluating evolutionary assumptions
With respect to the first assumption, researchers typically perform comparative analyses on a single phylogenetic tree under the assumption that the phylogeny or the evolutionary history of the group under study is known without error (e.g. Huelsenbeck, Rannala & Masly 2000; Rezende & Garland 2003). In this sense, the PCM is based on the hypothesis that the phylogenetic tree being utilized is a valid representation of the history of hierarchical relationships between the species of a monophyletic group and of the relative degree of divergence between the species. Nevertheless, phylogenies are rarely known with complete certainty (e.g. Schluter 1995) and are usually inferred from groups of morphological or molecular data (e.g. Stepien & Kocher 1997; Felsenstein 2004), which are themselves subject to error and uncertainty (Revell, Harmon & Glor 2005). This presents a problem when the phylogenetic relationship or the hierarchical relationships between the species show high uncertainty because different phylogenetic trees could give different answers to the same comparative questions. As a result, all of the conclusions derived from the comparative analyses using a single phylogenetic tree are conditional upon the particular phylogeny selected for analysis.
It has been suggested that the Bayesian method using Markov Chain Monte Carlo (hereafter BMCMC) offers a solution to the problem of sampling phylogenies using a formal statistical procedure to sample from the probability distribution of phylogenetic trees (e.g. Larget & Simon 1999; Huelsenbeck, Rannala & Masly 2000; Huelsenbeck et al. 2001; Holder & Lewis 2003; Pagel & Meade 2004, 2005a). This method can be applied to comparative analyses aimed at studying the evolution of characters (Huelsenbeck et al. 2001; Lutzoni, Pagel & Reeb 2001; Pagel & Lutzoni 2002; Pagel & Meade 2004, 2006), such as macroecological traits at the species level. Given a sample of the probability distribution of phylogenetic trees, the phylogenetic uncertainty is dealt with by estimating the parameters of interest in each tree and integrating the estimations over all the trees (Pagel & Meade 2004, 2005a,b, 2006; Pagel, Meade & Barker 2004). Therefore, BMCMC provides a method that accounts for the phylogenetic uncertainty of comparative studies by evaluating macroevolutionary hypotheses in a statistically justified sample of phylogenetic trees. However, when the phylogenetic relationship shows low uncertainty, the conclusions derived using a single phylogenetic tree or a sample of trees are the same (Avaria-Llautureo et al. 2012).
Addressing the second fundamental assumption of the PCM involves evaluating the phylogenetic signal of the variable or variables under study. The use of approaches that incorporate the PCM has greatly increased in recent years, and some authors now suggest that analyses incorporating phylogeny should be routinely used (Price 1997; Blomberg, Garland & Ives 2003). Moreover, the need to evaluate when it is necessary to incorporate phylogenetic information in the comparison of characteristics at the species level has been emphasized many times (Losos 1999; Freckleton, Harvey & Pagel 2002; Blomberg, Garland & Ives 2003; Ashton 2004; Rheindt, Grafe & Abouheif 2004; Freckleton 2009; Münkemüller et al. 2012). This evaluation is based on determining the ‘phylogenetic signal’ (see Pagel 1997, 1999a, 2002; Blomberg, Garland & Ives 2003) of the characters under study, which allows researchers to describe whether the similarity in the characteristics of the species is influenced by the phylogenetic relationships of the species. The option of evaluating the phylogenetic signal in macroecological variables opens the door for studying the effects of macroevolutionary processes on observed macroecological patterns, effectively evaluating the importance of processes of evolution and diversification on these patterns (Fig. 1c, 1d).
If macroecological variables are found to have a phylogenetic signal, then the comparative method uses the phylogeny not only to investigate the ancestor–descendant relationships between taxa but also to evaluate the evolution of the traits that characterize the species and the relationships between traits (Harvey & Pagel 1991; Martins & Housworth 2002; Blomberg, Garland & Ives 2003). In this context, the DCM allows researchers to infer the evolution of characters to determine the direction of diversification and the rate of evolutionary change in characters between ancestors and descendants (e.g. Nee, Mooers & Harvey 1992; Pagel 1993, 1997, 1999a,b, 2002; Hansen 1997; Schluter et al. 1997; Mooers, Vamosi & Schluter 1999; Knouft & Page 2003; Bokma 2008; Cooper & Purvis 2010; Harmon et al. 2010; Lartillot & Poujol 2010; Laurin 2010; Monroe & Bokma 2010; Venditti, Meade & Pagel 2011). These studies are especially useful in the absence of fossil records, given that it is sometimes possible to use the reconstruction of ancestral characters to estimate the evolution of continuous and discrete characters (Hansen 1997; Pagel 1997, 1999a,b, 2002; Ronquist 2004; Bokma 2008). Consequently, the DCM complements traditional paleontological approaches in the study of the past and can shed new light on understanding the origin of current macroecological patterns. Specifically, the current implementation of the DCM in a Bayesian framework (Pagel, Meade & Barker 2004; Ronquist 2004) allows researchers to combine information about the uncertainty of the phylogeny with uncertainty in the estimation of the model parameters. In fact, the implementation of this new approach to the comparative method provides the opportunity to evaluate complex scenarios of correlated evolution with continuous (Organ et al. 2007) and discrete (Organ et al. 2009) characters based on the robust probabilistic evidence that Bayesian analyses can offer.
Models of trait evolution
The third fundamental assumption of the PCM concerns the evolutionary model used to describe the evolution of a given trait. If the model of trait evolution assumed by a phylogenetic comparative method is incorrect, subsequent comparative analyses may be invalid (Harvey & Purvis 1991; Freckleton & Harvey 2006). It is possible to evaluate the adequacy of the chosen evolutionary model by comparing the Brownian-motion model with other models that are variations of the simple Brownian-motion model, such as the Ornstein-Uhlenbeck (OU) model, the Early Burst (EB) model, the Directional model, and other tree transformation models (see The directional comparative method: character evolution section). All these variations on the simple Brownian-motion model involve the transformation of the parameter αroot (i.e. the root state) and of the V matrix (σ²C, i.e. the product of the Brownian rate parameter σ² and the variance-covariance matrix C given the phylogeny) (see Harmon et al. 2010; Slater, Harmon & Alfaro 2012).
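To make the roles of αroot and V = σ²C concrete, the sketch below builds C from a small rooted tree and computes the generalized least-squares estimate of the root state together with the maximum-likelihood estimate of σ². The tree, the trait values and the function name `vcv` are hypothetical assumptions introduced for illustration; they are not taken from the cited software.

```python
import numpy as np

def vcv(node):
    """Tip names and shared-path-length matrix C for a rooted tree.

    A tip is a name (str); an internal node is a list of (child, branch_length).
    """
    if isinstance(node, str):
        return [node], np.zeros((1, 1))
    names, blocks = [], []
    for child, length in node:
        child_names, child_C = vcv(child)
        names += child_names
        blocks.append(child_C + length)  # this branch is shared by every tip below it
    C = np.zeros((len(names), len(names)))
    start = 0
    for block in blocks:                 # tips in different subtrees share no history
        k = block.shape[0]
        C[start:start + k, start:start + k] = block
        start += k
    return names, C

# Hypothetical tree ((A:1,B:1):1,C:2) and trait values
tree = [([("A", 1.0), ("B", 1.0)], 1.0), ("C", 2.0)]
names, C = vcv(tree)
y = np.array([1.0, 2.0, 4.0])

Ci = np.linalg.inv(C)
ones = np.ones(len(y))
alpha_root = (ones @ Ci @ y) / (ones @ Ci @ ones)  # phylogenetically weighted mean
resid = y - alpha_root
sigma2 = (resid @ Ci @ resid) / len(y)             # ML estimate of the Brownian rate
V = sigma2 * C                                     # expected variance-covariance of tip values
```

Each alternative model mentioned above can then be viewed as a different way of rescaling C, or of modelling the expected tip values, before the same likelihood machinery is applied.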
Evolution of macroecological characters: some hypotheses to evaluate using DCM
A fundamental issue in macroecology is to understand how patterns arise, as well as the evolutionary processes that generate these patterns (Losos 1994). For example, the frequently reported positive relationship between body size and distribution range may be explained by the ability of larger bodied species to disperse more rapidly and successfully than small-bodied species. Alternatively, larger bodied species may be evolutionarily older and therefore have had longer periods in which to disperse, establish, and attain larger geographic ranges (see Gaston & Blackburn 1996b,c). The latter hypothesis predicts an increase or decrease in body size and range of distribution on an evolutionary scale (e.g. Burness, Diamond & Flannery 2001; Diniz-Filho & Tôrres 2002; Olifiers, Vieira & Grelle 2004; for reviews see Gaston & Blackburn 1996b,c). If both variables are correlated over time (i.e. larger bodied species dispersed more during the evolution of the focal taxon), then the pattern can be attributed to the dispersal of these species. Alternatively, if the observed body size values of a lineage are a product of its evolutionary age, assuming that ancestors were bigger, then larger bodied species dispersed more during their long history, generating the current pattern. This last evolutionary explanation suggests a trend over time from large ancestors to small descendants (i.e. miniaturization, see Stanley 1973; and Hanken & Wake 1993). A further alternative explanation is that the ancestor of the focal taxon was small and increased in body size through time (i.e. Cope's Rule, see Stanley 1973; and Avaria-Llautureo et al. 2012). Considering that species with large body sizes require larger home ranges to obtain the minimal resources for survival, the distribution range may increase with time, resulting in a positive correlation between body size and range size (e.g. Marquet & Tarper 1998; Burness, Diamond & Flannery 2001).
The positive relationship between body and range size is one of the most frequently described macroecological patterns (e.g. Brown 1981; Diniz-Filho & Tôrres 2002; for a review see Gaston & Blackburn 1996b,c) and has been described by a triangular polygon (e.g. Gaston & Blackburn 1996c; Diniz-Filho & Tôrres 2002; Olifiers, Vieira & Grelle 2004). According to this model, the geographic range of a species tends to increase with body size as does its probability of extinction, based on the minimum geographic area required for a species to survive given its body size (Gaston & Blackburn 1996c; Marquet & Tarper 1998; Rosenfield 2002). This suggests that minimum geographic range is of great importance for conservation biology (Gaston & Blackburn 1996c) and implies that large species cannot survive in restricted areas.
The most integrative approach to the macroevolutionary study of macroecological patterns is to evaluate the effect of a given ecological trait on the diversification process of the taxon under study (Fig. 1c, d). This issue has important consequences for understanding the origin and diversification of lineages (e.g. Gould 1988; McShea 1994; Marquet 2001; Wang 2001; Gaston & He 2002). In this respect, the use of the DCM to fit models of continuous trait evolution allows for the evaluation of different models of character evolution, in addition to evaluating their time and mode of evolution (Hansen 1997; Pagel 1997, 1999a,b, 2002; Knouft & Page 2003; but see Bokma 2008 and Hadfield 2010 for other models of trait evolution). For example, with respect to the positive relationship between body size and distribution range, we could evaluate the hypothesis that this macroecological pattern is the result of macroevolutionary processes associated with changes in these variables throughout the history of the taxon (Fig. 1). In this case one would expect that the macroecological variables under study will show a phylogenetic signal (Fig. 1b; Table 1, parameter λ), and correlated evolutionary change (Fig. 1c; Table 1, parameter r in PGLS). In addition, if these variables show a phylogenetic signal, then we can evaluate other historical processes associated with the origin of the macroecological variables (Fig. 1; Table 1) such as: (i) if the evolutionary forces are homogeneous causing a directional evolutionary change (parameter β, see Table 1); (ii) if evolutionary change accumulates gradually or if evolution occurs in the speciation events of lineages (i.e. punctuated equilibrium; parameter κ, see Table 1); (iii) if there is a non-constant rate of trait evolution in adaptive radiations (i.e. 
if evolution is fastest early in a clade's history and slows through time, using parameter delta, δ, or r, see Table 1); (iv) if species' traits are attracted, with a strength of selection α, towards an optimum value through time (parameter theta, θ, in the OU model, Table 1); and (v) the importance of trait evolution for diversification, taking into account a possible evolutionary trend in the traits (i.e. trait-dependent diversification using a Birth-Death model; Fig. 1, Table 1).
Table 1. Summary of the evolutionary processes that can be evaluated with the approaches discussed and implemented in this article, the models on which the approaches are based, the associated parameters and their significance, some key references, and some software packages that implement each approach
Evolutionary process that can be evaluated: Similarity in species traits is influenced by the phylogenetic relationships of the species (i.e. phylogenetic signal)
Parameters and their significance: Lambda (λ): λ = 1 indicates that the phylogenetic relationships effectively predict the patterns of similarity between species traits. λ = 0 indicates that patterns of trait similarity among species are independent of phylogeny. 0 < λ < 1 indicates different levels of phylogenetic signal.
Evolutionary process that can be evaluated: Species traits have evolved according to phyletic gradualism or punctuated evolution through their history
Parameters and their significance: Kappa (κ): κ = 1 indicates gradual evolution. κ < 1 indicates proportionally more evolution in shorter branches. κ > 1 indicates that longer branches contribute proportionally more to trait evolution. κ = 0 indicates that the amount of evolution is associated only with speciation events (i.e. punctuated evolution).
Evolutionary process that can be evaluated: Correlated evolution between two or more continuous species traits
Parameters and their significance: Correlation coefficient (r): indicates the correlation of two variables through the phylogeny. In PGLSλ, the phylogeny is incorporated as a variance-covariance matrix with λ in the error term of the regression equation.
Evolutionary process that can be evaluated: Trait-dependent diversification rates and directional trends in trait evolution
Parameters and their significance: σ²: the evolutionary rate of the trait. Ɵ: the directional trend (‘drift’) parameter, which captures the deterministic or directional component of character evolution. λs: the speciation rate. μ: the extinction rate.
Abbreviations used for the models referred to in this table. PGLS: Phylogenetic Generalized Least Squares; EB: Early Burst; OU: Ornstein–Uhlenbeck. Software references:
These hypotheses can be evaluated by estimating specific parameters of continuous trait evolution models (see Table 1) that are based on the original pure Brownian-motion model and its modifications; hence, all these models can be compared using different statistics, such as the likelihood ratio test and the Akaike Information Criterion (Harmon et al. 2008, 2010). In particular, the evaluation of these hypotheses would allow researchers to uncover the suite of macroevolutionary processes that exist between the macroecological variables, such as the potential correlated tendencies between range of distribution, body size and diversification proposed in the Taxon Cycle Hypothesis (e.g. Wilson 1961; Ricklefs & Cox 1972, 1978; Miles & Dunham 1996; Ricklefs & Bermingham 1999, 2002; Ricklefs 2005).
In this study, we used the PCM to evaluate whether the variation in body size and range of distribution, as well as the relationship between the two characters, are explained by historical processes, using rockfish in the genus Sebastes (Cuvier) as a study model. The genus Sebastes is a monophyletic (Rocha-Olivares, Rosenblatt & Vetter 1999; Rocha-Olivares et al. 1999), species-rich (currently 112 named species), ecologically and morphologically diverse group of rockfishes (Magnuson-Ford et al. 2009), and is phylogenetically well-characterized (Love, Yoklavich & Thorsteinson 2002; Hyde & Vetter 2007).
Materials and methods
Macroecological data collection and evaluation of patterns
We built a database of the maximum body size (total length in centimetres) and latitudinal range extent (latitudinal degrees) reported for all of the species of the genus Sebastes, mainly compiled from the international database FishBase (Froese & Pauly 2010) as well as from other sources (Appendix S1). We used latitudinal range extent because it is the main factor in the distribution of marine animals, especially fishes (Stevens 1989; Rohde 1992; Rohde, Heap & Heap 1993; Macpherson & Duarte 1994; Smith & Gaines 2003; Alcaraz, Vila-Gispert & García-Berthou 2005) and consequently is a good descriptor of geographic range size and the species' habitats. The macroecological pattern was determined by evaluating the relationship between body size and range of distribution via regression analysis. The significance of this relationship was evaluated using the Quantreg R package (Koenker 2012). We fit an ordinary least squares regression (OLS) to the data and then used the bootstrap approach (10 000 random matrices) to test the null hypothesis that the slope was equal to 0 (a significance level of P = 0·05 was assumed in this study). Also, to evaluate the presence of a minimum geographic range, we searched for an inferior limit to the distributional range–body size relationship, determining the linear regression of the lowest significant quantile using Quantreg, which establishes the significance of the slope (with the null hypothesis of a slope equal to 0) using the rank score test for quantile regression (Koenker 1994; Koenker & Machado 1999). This test evaluates the probability (P) of a Chi-square distribution, using a bootstrap approach (for this analysis we used 10 000 randomizations). Both analyses were performed using natural logarithm transformations of the variables. The OLS analysis was then done using a CSCM approach, removing the phylogenetic effect in the macroecological variables based on PICs, which generated a new data set (i.e. 
contrasts of body size and range of distribution) using the ultrametric consensus tree (see below) based on the R package Ape (Paradis, Claude & Strimmer 2004). With this new data set, it was possible to apply any parametric statistic (e.g. OLS) without the effect of phylogeny (Felsenstein 1985, 1988).
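The logic of the slope test on log-transformed variables can be sketched as follows. This is an illustrative sketch only: the data are simulated placeholders rather than the Sebastes data set, the resampling scheme (case resampling with a percentile interval) is an assumption that may differ from the procedure implemented in Quantreg, and fewer replicates are used than the 10 000 reported above.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated placeholder data (NOT the Sebastes data): natural-log body size
# and natural-log latitudinal range for 60 hypothetical species.
log_size = rng.normal(3.5, 0.5, 60)
log_range = 0.8 * log_size + rng.normal(0.0, 0.3, 60)

def ols_slope(x, y):
    """Slope of the ordinary least-squares line of y on x."""
    return np.polyfit(x, y, 1)[0]

observed_slope = ols_slope(log_size, log_range)

# Case-resampling bootstrap: redraw species with replacement and refit.
n_species = len(log_size)
n_boot = 2000
boot_slopes = np.empty(n_boot)
for b in range(n_boot):
    idx = rng.integers(0, n_species, n_species)
    boot_slopes[b] = ols_slope(log_size[idx], log_range[idx])

# A 95% percentile interval that excludes 0 is taken as evidence against
# the null hypothesis of a zero slope at the 0.05 level.
low, high = np.percentile(boot_slopes, [2.5, 97.5])
slope_differs_from_zero = not (low <= 0.0 <= high)
```

The same resampling loop could be applied to the independent contrasts instead of the raw species values, provided the regression on contrasts is forced through the origin.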
Bayesian phylogenetic reconstruction of the Sebastes genus
We used aligned DNA sequence data from eight loci for 99 of the 112 currently described species, downloaded in NEXUS format from TreeBASE (http://treebase.org/treebase-web/search/study/summary.html?id=2031); these correspond to the species used by Hyde & Vetter (2007) in the phylogenetic reconstruction of the genus Sebastes. We applied a general likelihood-based mixed model (MM) of gene-sequence evolution as described by Pagel & Meade (2004, 2005a), which accommodates cases in which different sites in the alignment evolved in qualitatively distinct ways, but does not require prior knowledge of these patterns or partitioning of the data. The Reversible-Jump Markov Chain Monte Carlo (RJMCMC) procedure (Pagel & Meade 2006) was used with the objective of finding the best MM that summarizes the sequence evolution, using the BayesPhylogenies 1·1 software (http://www.evolution.rdg.ac.uk/BayesPhy.html). This approach enables researchers to explore the variety of possible models and parameters, converging towards the model that best fits the data in the posterior tree sample. Nine independent BMCMC analyses were run using 46 580 000 generations of phylogenetic trees, sampling every 10 000th tree to ensure that successive samples were independent. We used the three independent runs that reached the same convergent zone, combining them into a mixed sample of trees. From this mixed sample, the first 200 trees were removed to avoid including trees sampled before the convergence of the Markov Chain, and we re-sampled every 25 trees to obtain a final sample of 544 independent trees, which were used for the comparative analyses.
To obtain an ultrametric tree for use with the QuaSSE method of character evolution and diversification (see below), we analysed the sequence alignment with the BEAST 1·6·2 software (Drummond & Rambaut 2007). This analysis was conducted in a BMCMC framework to estimate the posterior probability of phylogenetic trees, and the resulting consensus tree was used in the comparative analysis. As prior information, we used a GTR + Γ + I model of sequence evolution, the Yule process of speciation and one fossil calibration point: 8 ± 2 million years ago for the origin of the genus Sebastes (Hyde & Vetter 2007). Analyses were based on four models of mutation rate: (i) a strict molecular clock; (ii) an uncorrelated lognormal relaxed clock; (iii) an uncorrelated exponential relaxed clock; and (iv) a random local clock. The MCMC chain was run for 35 000 000 generations (1 000 000 generations were discarded as burn-in before the posterior probability distribution of the selected diversification model converged), with parameters sampled every 10 000 steps. Examination of MCMC samples using the Tracer v. 1·5 software (Rambaut & Drummond 2007) showed that effective sample sizes for all parameters of interest were greater than 500. To find the best molecular clock model, we used Bayes factors to compare the four clock models, given that this is the soundest theoretical framework for model comparison in a Bayesian setting (Drummond & Rambaut 2007).
The directional comparative method: character evolution
Using the PGLS model (Martins & Hansen 1997; Pagel 1997, 1999a,b, 2002) implemented in a Bayesian framework, we first evaluated the form and mode of the evolutionary patterns using the five phylogenetic scaling parameters defined by Pagel (1999a,b, 2002) (Beta β, alpha αroot, lambda λ, kappa κ, and delta δ; estimated from species data and the BMCMC sample of non-ultrametric phylogenetic trees) to determine four aspects of trait evolution.
We evaluated whether a random-walk (Model A) or directional change model (Model B) (Fig. 1c) was the most appropriate model for explaining the evolution of macroecological variables. Model A corresponds to the standard constant-variance (σ²) random-walk model (sometimes called Brownian motion). In this model, the σ² parameter of evolution is determined by choosing a value of α from the random-walk model, where α is the trait value assigned to the root of the tree based on the phylogenetically controlled mean of the tip data (Pagel 2002). Model B is a directional random-walk model. This model has two parameters: the σ² parameter, as in Model A, plus the directional change parameter, β. This parameter effectively measures the regression of trait values across species against total path length (from the root of the tree to the tips), which is interpreted as the direction and magnitude of change in a character per unit of divergence (Pagel 2002). However, this model can only be implemented with non-ultrametric trees in which branch lengths represent some measure of genetic divergence.
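The regression interpretation of β can be illustrated with a generalized least-squares fit of tip values on root-to-tip path lengths. This is a minimal sketch under stated assumptions: the four-species matrix C and the trait values are hypothetical, and the point estimates shown here are not the Bayesian posterior estimates used in this study.

```python
import numpy as np

# Hypothetical shared-path-length matrix C for four species on a
# non-ultrametric tree (the diagonal holds the root-to-tip path lengths).
C = np.array([
    [2.5, 1.5, 0.5, 0.0],
    [1.5, 3.5, 0.5, 0.0],
    [0.5, 0.5, 2.0, 0.0],
    [0.0, 0.0, 0.0, 3.0],
])
path_length = np.diag(C).copy()

# Toy trait values constructed with root state 1 and a trend of 2 per unit path length.
y = 1.0 + 2.0 * path_length

# Model B: expected tip value = alpha + beta * path length, covariance sigma^2 * C.
X = np.column_stack([np.ones(len(y)), path_length])
Ci = np.linalg.inv(C)
alpha_hat, beta_hat = np.linalg.solve(X.T @ Ci @ X, X.T @ Ci @ y)

# On an ultrametric tree all path lengths are equal, so the two columns of X
# are collinear and beta cannot be estimated.
is_ultrametric = np.allclose(path_length, path_length[0])
```

Setting β to zero recovers Model A, so the two models are nested and can be compared directly.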
We evaluated the extent to which the phylogeny correctly predicts patterns of similarity in body size and latitudinal range of Sebastes species (i.e. phylogenetic signal, Fig. 1b) using the phylogeny scaling parameter, λ. In this approach, λ reveals whether the phylogeny fits the patterns of covariance among species for a given trait. This parameter evaluates whether one of the key assumptions underlying the use of the comparative method (i.e. that species are not independent), fits the data for a given phylogeny and trait, assessing the strength of the phylogenetic signal. Values close to zero indicate there is no concordance between phylogeny and the trait values of species (phylogenetic independence). If traits are evolving as expected, given the tree topology and branch lengths, λ takes the value of 1 (i.e. pure-historical pattern or pure Brownian-motion model; the observed pattern of trait variation among the species is predicted by a model of evolution along the phylogeny; see Münkemüller et al. 2012). Intermediate values of λ, between 0 and 1, indicate different degrees of a phylogenetic signal. Therefore, both non-historical and pure-historical patterns are not ideal models, because they respectively underestimate and overestimate the influence of phylogeny.
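The way λ rescales the expected covariance among species can be sketched as follows: the off-diagonal elements of C are multiplied by λ while the diagonal is left unchanged, and the likelihood of the trait data is evaluated for each candidate value. The matrix, the trait values and the function names are hypothetical, and the grid search over a profile likelihood is a simplification of the Bayesian estimation used in this study.

```python
import numpy as np

def lambda_transform(C, lam):
    """Multiply off-diagonal (shared) path lengths by lambda; keep the diagonal."""
    out = C * lam
    np.fill_diagonal(out, np.diag(C))
    return out

def profile_loglik(C, y, lam):
    """Brownian-motion log-likelihood at lambda, with root state and rate at their ML values."""
    V = lambda_transform(C, lam)
    n = len(y)
    Vi = np.linalg.inv(V)
    ones = np.ones(n)
    alpha = (ones @ Vi @ y) / (ones @ Vi @ ones)
    resid = y - alpha
    sigma2 = (resid @ Vi @ resid) / n
    _, logdet = np.linalg.slogdet(V)
    return -0.5 * (n * np.log(2 * np.pi * sigma2) + logdet + n)

# Hypothetical four-species example
C = np.array([
    [2.0, 1.5, 0.0, 0.0],
    [1.5, 2.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, 1.0],
    [0.0, 0.0, 1.0, 2.0],
])
y = np.array([3.1, 3.3, 4.0, 4.6])

grid = np.linspace(0.0, 1.0, 101)
loglik = np.array([profile_loglik(C, y, lam) for lam in grid])
lambda_hat = grid[np.argmax(loglik)]
```

At λ = 0 the transformed matrix is diagonal (species are treated as independent), and at λ = 1 it equals C (pure Brownian motion), which corresponds to the two reference models described above.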
Next, we contrasted punctuational vs. gradual trait evolution (Fig. 1c) using the branch-length scaling parameter κ. In this test, κ differentially rescales individual branch lengths, which allows the relationship between branch length and trait evolution to be evaluated (Pagel 1994, 2002). If κ is 1, trait evolution is directly proportional to branch length, and the gradual mode of trait evolution is better supported. Values of κ greater than 1 indicate proportionally more evolution in longer branches. Values of κ less than 1 indicate proportionally more evolution in shorter branches. In the extreme case of κ = 0, trait evolution is independent of branch length, which is consistent with a punctuational mode of evolution.
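The effect of κ can be sketched by raising each branch length to the power κ, which is the usual form of this transformation. The function name and the branch lengths below are hypothetical values used only to show the behaviour at κ = 0, κ < 1 and κ = 1.

```python
def kappa_transform(branch_lengths, kappa):
    """Raise every branch length to the power kappa."""
    return {node: length ** kappa for node, length in branch_lengths.items()}


# Hypothetical branch lengths (arbitrary units), all strictly positive.
branches = {"A": 4.0, "B": 1.0, "C": 0.25}

for kappa in (0.0, 0.5, 1.0):
    print(kappa, kappa_transform(branches, kappa))
```

At κ = 0 every branch has length 1, so the expected amount of change is the same on all branches regardless of their original length; at κ = 0·5 the ratio between the longest and shortest branch shrinks from 16 to 4, which gives shorter branches proportionally more weight.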
Finally, we evaluated whether the rate of evolution was constant through time (Fig. 1c) using the path-length scaling parameter, δ. This parameter detects differential rates of evolution over time by rescaling path lengths in the phylogeny; δ = 1 indicates a constant rate (gradual evolution). If δ < 1, shorter paths (i.e. earlier evolution of the trait in the phylogeny) contribute disproportionately to trait evolution (‘early burst’). If δ > 1, longer paths contribute more to trait evolution; this is the signature of accelerating evolution as time progresses, with most change occurring later in the history of the group (sensu Pagel 2002).
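A minimal sketch of the δ transformation follows, assuming that δ is applied as a power to the path lengths stored in the phylogenetic variance-covariance matrix (root-to-tip lengths on the diagonal, shared path lengths off the diagonal). Software implementations may differ in detail, for example by rescaling the tree afterwards; the function name and matrix are hypothetical.

```python
def delta_transform(vcv, delta):
    """Raise every path length in vcv to the power delta (delta > 0)."""
    return [[value ** delta for value in row] for row in vcv]


# Hypothetical matrix for three species on a tree of depth 1:
# species 1 and 2 diverged after sharing the first 0.25 of the tree.
vcv = [
    [1.0, 0.25, 0.0],
    [0.25, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

for delta in (0.5, 1.0, 2.0):
    print(delta, delta_transform(vcv, delta))
```

With δ = 0·5 the shared (early) path of 0·25 is stretched to 0·5 of the total depth, so early evolution accounts for a larger share of the expected covariance; with δ = 2 it shrinks to 0·0625, placing more of the expected change late in the tree.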
To evaluate the evolution of macroecological variables, we estimated the phylogenetic scaling parameters using a Bayesian framework, sampling the parameter values from the posterior probability for a particular model of evolution and sample of trees. We used the distribution of parameter values over the BMCMC sample to evaluate the deviation of the estimated parameters from the null model of pure Brownian motion (i.e. constant-variance model with λ, κ and δ equal to 1), and from a non-historical model (i.e. λ equal to 0). The BMCMC approach allows us to integrate both parameter and phylogenetic uncertainty (Ronquist 2004). These analyses were conducted using the Continuous module implemented in BayesTraits 1·0 (Pagel & Meade 2007). We used the Bayes factor (Gelman et al. 1995) to compare the marginal likelihood of the observed model (i.e. constant-variance model, with λ, κ and δ estimated in a Bayesian framework) with those of the pure Brownian-motion and non-historical models. Because the marginal likelihood of a model is the integral of the model likelihoods over all values of the model's parameters and over all possible trees, it is difficult to estimate. For this reason, we used the method proposed by Newton & Raftery (1994), based on a weighted likelihood bootstrap, with the modification by Suchard, Weiss & Sinsheimer (2001) implemented in Tracer v1·5. The estimates were obtained using importance sampling in Tracer with 1000 bootstrap replicates (Suchard, Weiss & Sinsheimer 2005).
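The idea behind this comparison can be sketched with the plain harmonic mean estimator of the marginal likelihood, which underlies the Newton & Raftery approach. This is a simplified illustration and not the smoothed, bootstrapped estimator implemented in Tracer; the function names and the log-likelihood samples are hypothetical.

```python
import math


def log_harmonic_mean(log_likelihoods):
    """Log of the harmonic mean of sampled likelihoods, in log space.

    HM = n / sum(1 / L_i), so log HM = log(n) - logsumexp(-log L_i).
    """
    n = len(log_likelihoods)
    negated = [-value for value in log_likelihoods]
    shift = max(negated)  # subtract the maximum to avoid overflow
    log_sum = shift + math.log(sum(math.exp(v - shift) for v in negated))
    return math.log(n) - log_sum


def log_bayes_factor(log_marginal_a, log_marginal_b):
    """Log Bayes factor of model a over model b."""
    return log_marginal_a - log_marginal_b


# Hypothetical posterior samples of the log-likelihood under two models.
estimated_model = [-120.4, -121.0, -119.8, -120.9]
forced_model = [-131.2, -130.5, -132.0, -131.1]

log_bf = log_bayes_factor(
    log_harmonic_mean(estimated_model), log_harmonic_mean(forced_model)
)
print(log_bf, math.exp(log_bf))
```

The exponentiated value corresponds to a Bayes factor on the raw scale, which is the scale on which a threshold such as 3 would be applied in this sketch.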
We also compared the fit of Pagel's models with that of the Ornstein-Uhlenbeck (OU) and Early Burst (EB) models to identify the best-fitting model for the macroecological variables (Table 3). The OU model describes a trait that evolves stochastically but is pulled back towards an optimal value (θ) with a strength corresponding to α (the ‘rubber-band’ parameter; Hansen 1997; Butler & King 2004). The EB model is a random-walk model in which the net rate of evolution slows exponentially through time as the radiation proceeds (Harmon et al. 2010); it has the additional parameter r to describe the pattern of rate change through time. Maximum likelihood estimation of the parameters associated with character evolution was done using the R package GEIGER (Harmon et al. 2008), and the models were compared using the Akaike Information Criterion corrected for sample size (AICc).
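The AICc comparison can be sketched as follows, using the standard small-sample correction AICc = −2 lnLik + 2k + 2k(k + 1)/(n − k − 1). The log-likelihoods, parameter counts and sample size below are hypothetical placeholders, not the values reported in Table 3.

```python
def aicc(ln_lik, k, n):
    """Akaike Information Criterion corrected for sample size.

    ln_lik: maximised log-likelihood; k: number of parameters;
    n: number of observations (here, species). Requires n > k + 1.
    """
    aic = -2.0 * ln_lik + 2.0 * k
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)


# Hypothetical fits: model name -> (lnLik, number of parameters).
fits = {"BM": (-105.0, 2), "Lambda": (-100.0, 3), "OU": (-104.5, 3)}
n_species = 50  # hypothetical sample size

scores = {name: aicc(ln_lik, k, n_species) for name, (ln_lik, k) in fits.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

The model with the lowest AICc is preferred; differences between scores, rather than their absolute values, carry the information.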
The directional comparative method: using PGLSλ to evaluate correlated evolutionary change
We evaluated the phylogenetic effect on the trends in character relationships between taxa (i.e. the observed macroecological pattern), using the best model of evolution previously found for each character. To do this, we evaluated the significance of the relationship between the pair of characters using a measure of correlated evolution in a Bayesian framework implemented in BayesTraits 1·0 (Pagel & Meade 2007) (Fig. 1c). We introduced λ in the regression analysis (PGLSλ, sensu Revell 2010), where the phylogeny is incorporated as a variance-covariance matrix, scaled by λ, in the error term of the regression equation. The error term is then decomposed into a component that represents the phylogeny and the remaining error term (Pagel 1997, 1999a; Freckleton, Harvey & Pagel 2002). In this approach, when λ is forced to equal 0, the analysis is equivalent to OLS regression (i.e. a species-level analysis in which the phylogeny is not considered). On the other hand, when λ is forced to equal 1, the results are similar to those obtained with phylogenetically independent contrasts (PIC; Pagel 1999b; Garland & Ives 2000; Lavin et al. 2008). However, when the estimated λ lies between 0 and 1, neither OLS nor PIC is suitable, given that they respectively underestimate and overestimate the influence of phylogeny (see Capellini, Venditti & Barton 2010, 2011; Revell 2010).
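The regression step can be sketched as a generalised least squares fit in which the error covariance is the λ-scaled phylogenetic matrix. The sketch below is a maximum-likelihood-style point estimate for a fixed λ, not the Bayesian procedure used in BayesTraits; the function names, the small linear solver, the matrix and the trait values are all hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def lambda_vcv(vcv, lam):
    """Multiply off-diagonal elements of vcv by lam."""
    n = len(vcv)
    return [[vcv[i][j] if i == j else lam * vcv[i][j] for j in range(n)]
            for i in range(n)]


def pgls(x, y, vcv, lam):
    """Return [intercept, slope] of y on x under covariance lambda_vcv."""
    v = lambda_vcv(vcv, lam)
    ones = [1.0] * len(x)
    vinv_ones = solve(v, ones)  # V^-1 1
    vinv_x = solve(v, x)        # V^-1 x

    def dot(p, q):
        return sum(s * t for s, t in zip(p, q))

    # Normal equations (X' V^-1 X) b = X' V^-1 y with X = [1, x].
    xtx = [[dot(ones, vinv_ones), dot(ones, vinv_x)],
           [dot(x, vinv_ones), dot(x, vinv_x)]]
    xty = [dot(vinv_ones, y), dot(vinv_x, y)]
    return solve(xtx, xty)


# Hypothetical data for four species in two sister pairs.
vcv = [[1.0, 0.5, 0.0, 0.0], [0.5, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.5], [0.0, 0.0, 0.5, 1.0]]
body_size = [0.0, 1.0, 2.0, 3.0]
lat_range = [1.0, 3.0, 2.0, 5.0]

for lam in (0.0, 0.5, 1.0):
    print(lam, pgls(body_size, lat_range, vcv, lam))
```

With λ = 0 the covariance matrix in this example is the identity and the estimates coincide with OLS; intermediate values of λ weight the species according to how much history they share.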
As the null hypothesis we used a model in which the covariance between characters was set to zero (i.e. complete character independence), and the alternative hypothesis was the observed covariance between characters (Pagel 1999a,b). If the null hypothesis was rejected, we concluded that the phylogenetic relationships and the models of evolution of the characters influenced the observed macroecological patterns. We used the Bayes factor (Gelman et al. 1995) to compare these hypotheses. We fitted three models: one in which λ was forced to equal 0, another in which λ was forced to equal 1, and a third in which λ was estimated. To assess which model had the best fit, we again used the Bayes factor. We summarized the parameters of all selected models using the mean and the 95% highest posterior density interval (HPD).
The directional comparative method: diversification models and character evolution
The evolution of traits is not independent of diversification rates (Paradis 2005; Maddison, Midford & Otto 2007; Freckleton, Phillimore & Pagel 2008), and inferring character evolution is problematic when the character affects speciation or extinction (Maddison 2006; Paradis 2008), even more so if the evolution of the trait has a directional tendency (FitzJohn 2010). For this reason, we compared the random vs. random-directional models of evolution with the method proposed by FitzJohn (2010) to evaluate the mode of evolution of the macroecological variables (Fig. 1d). This method takes an ultrametric tree and a set of trait measurements for the tip species and fits a series of birth–death models in which the speciation and extinction probabilities are either independent of trait evolution or vary along branches as a function of a continuous trait that evolves according to a diffusion process, with or without an evolutionary tendency (i.e. increase or decrease over time). These models have the following parameters: the speciation and extinction rate parameters (λs, μ); the diffusion parameter (σ²), which is the expected squared rate of change and captures the stochastic elements of character evolution; and the directional trend or ‘drift’ parameter (θ), which captures the deterministic or directional component of character evolution. The drift parameter is the expected directional change in the character over time and may be due to selection or any other within-lineage process that has a directional tendency (FitzJohn 2010). The analyses were performed in the R package diversitree (FitzJohn 2012) with the QuaSSE method (FitzJohn 2010). Finally, we selected the best model of diversification using the likelihoods and the associated P values.
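Model selection based on likelihoods and P values for nested models is commonly done with a likelihood-ratio test, which is sketched below under that assumption. The test statistic is twice the difference in log-likelihood and is compared with a chi-square distribution whose degrees of freedom equal the number of additional parameters. The function names and the log-likelihood values are hypothetical; the chi-square tail probability is computed from closed-form recurrences so that only the standard library is needed.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    # Start from Q(1, z) = exp(-z) for even df, Q(1/2, z) = erfc(sqrt(z))
    # for odd df, then apply Q(s + 1, z) = Q(s, z) + z^s e^-z / Gamma(s + 1).
    if df % 2 == 0:
        s, q = 1.0, math.exp(-z)
    else:
        s, q = 0.5, math.erfc(math.sqrt(z))
    while s < df / 2.0:
        q += z ** s * math.exp(-z) / math.gamma(s + 1.0)
        s += 1.0
    return q


def likelihood_ratio_test(ln_lik_reduced, ln_lik_full, df):
    """Return (chi-square statistic, P value) for two nested models."""
    stat = 2.0 * (ln_lik_full - ln_lik_reduced)
    return stat, chi2_sf(stat, df)


# Hypothetical log-likelihoods: a model without drift vs. one with drift.
stat, p_value = likelihood_ratio_test(-290.0, -284.0, df=1)
print(stat, p_value)
```

A small P value indicates that the additional parameter (here, a drift term) improves the fit more than expected by chance.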
Macroecological pattern and the CSCM approach
The macroecological pattern of the Sebastes genus indicated a positive relationship between Ln of body size and range of distribution (P = 9·25E–10; r = 0·57; Fig. 2a), with a significant lower limit to the geographic range at the 0·001 quantile (P = 0·003; Fig. 2a), describing a triangular polygon. The PICs of these variables showed a positive relationship (P = 0·001; r = 0·32; Fig. 2b).
The directional comparative method: character evolution
The directional comparative method results, based on the sample of Bayesian trees, showed that the best predictor of body size and latitudinal range evolution in the Sebastes genus was a Random-Walk model (BF = 1·002 and 1·005 for body size and latitudinal range, respectively), with αroot = 42·46 cm for body size and 24·67° for latitudinal range. Both characters were significantly influenced by phylogeny (λ > 0; Table 2, Fig. 3), but body size was more strongly influenced by the phylogenetic relationships (λ = 0·88; Table 2, Fig. 3) than was latitudinal range of distribution (λ = 0·51; Table 2, Fig. 3). The evolution of body size was gradual, with short branches having a major effect on body size differences among species (0 < κ < 1; Table 2, Fig. 3): given that the observed κ was less than 1 (0·27), proportionally more evolution occurred in shorter branches. The rate of evolution of body size was constant over time (δ = 1·26, not significantly different from 1; Table 2, Fig. 3), so there was no early burst or late acceleration of change in this trait in the history of the group. On the other hand, the evolution of range of distribution was consistent with a punctuational model (κ = 0; Table 2, Fig. 3). The observed δ for this trait was 1·74, but this value was not significantly different from 1, so its rate of evolution was also constant over time (Table 2, Fig. 3).
Table 2. Bayes factors used to test the observed vs. expected values of phylogenetic scaling parameters for different models of trait evolution. The observed λ were contrasted with values expected under the hypotheses of no phylogenetic signal (λ = 0) and the pure Random-Walk model (λ = 1). The observed κ were contrasted with expected values for punctuated evolution (κ = 0), and the pure Random-Walk model (κ = 1). The observed δ were contrasted only with the expected values for the pure Random-Walk model (δ = 1). When the Bayes factor was less than 3, the simplest model was selected.
* indicates the selected model
λ Forced = 1: 8 × 10E7
λ Forced = 0: 1 × 10E11
κ Forced = 1: 4 × 10E8
κ Forced = 0
δ Forced = 1: 3 × 10E7
λ Forced = 1: 4·12 × 10E17
λ Forced = 0
κ Forced = 1: 1·5 × 10E16
κ Forced = 0
δ Forced = 1
The comparison of the continuous models using maximum likelihood, based on the Bayesian consensus tree, showed that the kappa-based model had the best fit to body size evolution (Table 3), with shorter branch lengths contributing more to body size variability (κ = 0·21; Table 2). In contrast, the lambda-based model had the best fit to latitudinal range evolution (Table 3). The intermediate value of the lambda parameter (λ = 0·51; Table 2) indicates that neither a pure Brownian-motion model (λ = 1) nor a non-historical model (λ = 0) is suitable, given that they respectively overestimate and underestimate the influence of phylogeny.
Table 3. Summary of comparisons of model fit to log body size and latitudinal range of distribution of the Sebastes genus. k = number of model parameters; lnLik = natural logarithm of the maximum likelihood; AICc = corrected Akaike Information Criterion; BM = pure Brownian motion; Lambda = Pagel's lambda; Delta = Pagel's delta; Kappa = Pagel's kappa; OU = Ornstein-Uhlenbeck model; EB = early burst model; Directional = Pagel's directional model
The best-fitting model is shown in bold
PGLSλ to evaluate the relationship of body size and latitudinal range
The comparisons between models of phylogenetic regression analysis (i.e. with λ = 1, 0, and estimated) indicated that the PGLSλ model provided a better fit to the data than both the OLS model and PIC. Under this model, the variables presented an R2 = 0·15, with a slope (β) = 0·74 and intercept (αroot) = 49·95, and the residuals had a phylogenetic signal of λ = 0·74. This suggests that both traits were correlated through the phylogeny, indicating that a significant historical relationship between the characters exists, with body size significantly predicting the values of latitudinal ranges throughout the evolutionary history of this group.
Diversification models and character evolution
The results of the analysis of body size and latitudinal range evolution and its relationship with speciation rate showed that, for both variables, the best-fitting model was a Drift Linear model with a positive trend (lnLik = −283·73, P = 0·0004, for body size; lnLik = −363·82, P = 0·0006, for latitudinal range; Table 4). The positive trends for body size and latitudinal range were described by θ = 0·347 and θ = 0·705, respectively (Table 4). On the other hand, the relationships between speciation rate and the two variables were negative, with values of −0·124 and −0·063 for body size and latitudinal range, respectively (Table 4). These results indicate that both traits have a general tendency to increase over time, and that species with a larger body size and/or larger latitudinal ranges have lower speciation rates.
Table 4. Maximum likelihood parameter estimates used to select the best model of speciation rate, based on the consensus tree obtained from the Bayesian approach, for body size and latitudinal range. d.f. = degrees of freedom of each model; lnLik = natural logarithm of the maximum likelihood; Drift = trait evolutionary trend; Chi-Sq = chi-square value; and Pr(>|Chi|) = chi-square probability value. The best model is indicated in bold
Our results showing a positive relationship between body size and range of distribution agree with the commonly observed macroecological pattern described in the literature (e.g. Gaston & Blackburn 1996b,c). The observed macroecological pattern is greatly influenced by the phylogenetic history of the genus Sebastes, probably because both traits increased together during the species' diversification (Table 2). On the basis of our results, we propose that the triangular shape and the positive relationship that describe the current macroecological pattern in the Sebastes genus could be explained in a historical context by the following logic, which offers a testable hypothesis for future work (Fig. 4): (i) given that body size had a stronger phylogenetic signal than range of distribution, throughout history changes in body size were likely more affected by the ancestor–descendant relationship than by the ecological context experienced by species, while the changes in range of distribution were more related to the ecological context; (ii) the evolution of body size likely began with a small-bodied ancestor (i.e. directional change model with a positive trend) that had a narrow range of distribution, given the minimum geographic size that allowed the species to survive; (iii) during the diversification of the genus, body size and ranges of distribution increased (i.e. directional change model with a positive trend); (iv) this resulted in a net positive trend associated with a decreasing speciation rate (i.e. lineages with large body size and wide distributions tend to diversify less), but given that body size significantly predicted the values of latitudinal ranges and showed a stronger phylogenetic signal, the ranges of distribution follow the evolutionary changes of body size while being highly influenced by the ecological context; (v) the minimum geographic size that allowed the species to survive restricted the changes in ranges of distribution throughout history.
This logic, together with an observed increase in body size during evolution, is hypothesized to have produced the current body size–latitudinal range diversity and the macroecological pattern observed for the Sebastes genus. Moreover, when PICs are used to analyse the data, the strength of the macroecological relationship tends to disappear (Fig. 2b), indicating that the current macroecological pattern described by a triangular polygon (i.e. positive relationship and a minimum geographic range, Fig. 2a) arises as a product of macroevolutionary processes. In fact, the PGLSλ approach (R² = 0·15) showed that neither the OLS nor the PIC method is suitable, given that they respectively overestimate (R² = 0·21) and underestimate (R² = 0·01) the relationship between body size and range of distribution given the phylogeny. This logic, based on the results of the DCM, allows us to propose a new evolutionary mechanism that explains the current macroecological pattern by considering the correlated evolutionary change and the mode of evolution of body size and range of distribution.
Considering that macroecological patterns are affected by historical processes, the phylogenetic comparative method constitutes an extremely useful approximation for exploring the processes associated with these patterns through the study of character evolution and the relationship between characters (e.g. Losos 1994; Diniz-Filho & Tôrres 2002; Knouft & Page 2003; Olifiers, Vieira & Grelle 2004) when the two fundamental evolutionary assumptions of the comparative method (i.e. that the phylogeny is constructed without error and that the model of evolution of the characters effectively recapitulates their history) are explicitly evaluated. The correct evaluation of the effects of macroevolutionary processes on macroecological patterns first requires estimation of the phylogenetic signal of the variables involved with the purpose of determining whether the given model of evolution accounts for the evolution of the variable. With this information, further exploration of the macroevolutionary processes which may contribute to macroecological patterns can be justified (Fig. 1b). Moreover, it has recently been shown that using λ to evaluate phylogenetic signals would be more appropriate than other parameters because it facilitates choosing between a chronogram and a phylogram to infer ancestral states, and it has a stronger relation with the inference accuracy (Litsios & Salamin 2012; Münkemüller et al. 2012). When macroecological variables have a phylogenetic signal, the DCM allows for the evaluation of a variety of historical processes (Fig. 1c; 1d). For example, evolutionary tendency (Fig. 1c), and whether this tendency is related to the diversification rate of the group under study, can be determined (Fig. 1d). However, the use of ultrametric or non-ultrametric trees is fundamental for inferring and interpreting trends. 
In particular, the directional model of Pagel (2002) can only be evaluated using trees with different root-to-tip lengths (non-ultrametric trees), and it provides direct evidence of the direction and amount of change per unit of divergence, not per unit of time. Consequently, Pagel's directional change model should only be used in studies where the branch lengths, obtained from a given genetic marker, are potentially related to the evolution of the macroecological variables. In contrast, the QuaSSE method (FitzJohn 2010) of diversification is most appropriate for evaluating the hypothesis of directional trends of character evolution, as it combines seven models that use the amount of change over time and also incorporates and tests the effects of speciation and extinction processes. A similar method to evaluate directional trends with ultrametric trees was proposed by Bokma (2008); however, that method has no associated software that would allow researchers in macroecology to easily apply it to macroecological questions.
We suggest that, in order to improve our ability to explain current macroecological patterns, CSCM and DCM approaches should be integrated. This will allow researchers to disentangle the current and historical processes underlying macroecological patterns. In this context, some DCM approaches, like Pagel's model, complement the use of CSCM approaches, like PICs, and effectively measure the phylogenetic signal of continuous variables (e.g. Münkemüller et al. 2012), the mode of character evolution (see Pagel 2002) and the correlated evolution between characters (e.g. Revell 2010). Overall, it is important to evaluate the existence of correlated evolution if the macroecological pattern under study is related to a trait, such as body size, that is also correlated with other individual-level traits that present a phylogenetic signal (e.g. metabolism; Capellini, Venditti & Barton 2010), if the pattern shows scaling with other traits of the species in general (e.g. range of distribution; McKinney 1990), and/or if the pattern presents a clear phylogenetic signal (Freckleton, Harvey & Pagel 2002; Blomberg, Garland & Ives 2003; Ashton 2004).
Future research in macroecology should consider recent DCM methods that improve some specific points of historical inferences of the macroecological patterns. Some examples include evaluating the correlated evolution between continuous and binary characters (Ives & Garland 2010), evaluating temporal shifts in the rate of evolution of macroecological characters in different clades of a tree over time (Harmon et al. 2010; Revell et al. 2012; Venditti, Meade & Pagel 2011), and potentially incorporating in the analysis of character evolution those nodes of the phylogeny that are hidden due to extinction (Bokma 2002, 2008; Ingram 2011). Finally, macroecological studies will benefit from the use of the DCM to assess the effect of phylogenetic history without removing this effect. This approach complements the traditional CSCM; together with explicit evaluations of the assumptions of the comparative methods using Bayesian approaches this will allow researchers to quantify the uncertainty of specific evolutionary hypotheses accounting for observed macroecological patterns.
This study was funded by FONDECYT Grant # 11080110 to C.E.H. Also, the authors are very grateful to Luke Harmon, Paula E. Neill and Tara Massad, and anonymous reviewers for comments and suggestions that greatly improved the final version of the manuscript. Previous versions of this manuscript were greatly improved by comments from Marcelo Rivadeneira, Marco Mendez and F. Patricio Ojeda. D.B-B and C.B.C-A were supported by Doctoral Fellowships for the ‘Programa de Doctorado en Sistemática y Biodiversidad’, from the graduate school of the Universidad de Concepción, and CONICYT Doctoral Fellowship respectively. J.A-Ll and B. M-P were supported by CONICYT Master Fellowship.