Many evolutionary processes can lead to a change in the correlation between continuous characters over time or on different branches of a phylogenetic tree. Shifts in genetic or functional constraint, in the selective regime, or in some combination thereof can influence both the evolution of continuous traits and their relation to each other. These changes can often be mapped on a phylogenetic tree to examine their influence on multivariate phenotypic diversification. We propose a new likelihood method to fit multiple evolutionary rate matrices (also called evolutionary variance–covariance matrices) to species data for two or more continuous characters and a phylogeny. The evolutionary rate matrix is a matrix containing the evolutionary rates for individual characters on its diagonal, and the covariances between characters (of which the evolutionary correlations are a function) elsewhere. To illustrate our approach, we apply the method to an empirical dataset consisting of two features of feeding morphology sampled from 28 centrarchid fish species, as well as to data generated via phylogenetic numerical simulations. We find that the method has appropriate type I error, power, and parameter estimation. The approach presented herein is the first to allow for the explicit testing of how and when the evolutionary covariances between characters have changed in the history of a group.

Biologists are frequently interested in the evolutionary correlations between continuously distributed characters measured from species related by a phylogenetic tree. These correlations can arise from a number of causes, including natural selection. For example, an “evolutionary correlation,” that is, a correlation between the evolutionary changes in two characters, can arise if the traits are under correlated selection or evolving along a ridge in the adaptive landscape (Felsenstein 1988; Arnold et al. 2001; Martins et al. 2002; Jones et al. 2003), if the traits are evolving in response to selection toward randomly moving optima, in which the movement of each optimum is by correlated Brownian motion (Hansen and Martins 1996; Revell and Harmon 2008), or if the traits are functionally integrated to perform an ecological task (Walker 2007; Collar et al. 2008). Evolutionary correlations can also arise by genetic drift. For example, an evolutionary correlation between two characters will arise under drift if the characters themselves are genetically correlated (Lande 1979; Arnold et al. 2001; Revell and Harmon 2008).

However, the evolutionary correlation between characters can also change over time as the adaptive, functional, and genetic relationships between characters evolve. For example, selective regimes are expected to change as the phenotype and external environment change through time and among phylogenetic lineages. In fact, many hypotheses about adaptive phenotypic evolution specifically predict changes in the evolutionary correlation between traits. These include hypotheses pertaining to the origin of a novel trait or function that provides a lineage access to a new adaptive zone (so-called “key innovations”; Miller 1949; Galis 2000), the mechanical decoupling of morphological features (Liem 1973; Vermeij 1973; Lauder 1981), and shifts in the selective regime following the invasion of a novel habitat, ecological niche, or geographical area (Grant 1972; Schluter 1988).

Additionally, genetic constraints evolve (Roff 1997; Steppan et al. 2002; Jones et al. 2003; Bégin and Roff 2004) and so must the tendency toward a particular evolutionary correlation by drift. Persistent changes in the genetic covariances between characters can be induced by deterministic factors, such as a change in the regime of correlational selection (e.g., Jones et al. 2007; Revell 2007), as well as by stochastic forces, such as gene duplication or a population bottleneck (e.g., Whitlock et al. 2002; Revell and Harmon 2008). Under drift, a persistent change in the genetic correlation between characters is also expected to induce a persistent change in their evolutionary correlation (Revell and Harmon 2008).

A classic example of a hypothesis that predicts a change in the evolutionary correlation between characters is Liem's (1973) hypothesis that the origin of a novel pharyngeal jaw form in cichlid fishes contributed to the exceptional morphological and ecological diversity found in that group. Liem (1973) hypothesized that the specialization of the cichlid pharyngeal jaw on food processing freed the oral jaw to differentiate in different lineages to capture different kinds of prey. This hypothesis thus predicts that the evolutionary correlation between pharyngeal and oral jaws will be weaker in the cichlid radiation, where pharyngeal and oral jaws are functionally disassociated, than in other lineages of percoid fishes (Liem 1973; Lauder 1981; Hulsey et al. 2006).

Similarly, the genetic decoupling of serially repeated structures such as cilia, limbs, and teeth is hypothesized to have led to new multivariate patterns of morphospace occupation in clades as diverse as trochophores and rotifers (Strathmann et al. 1972; Vermeij 1973), theropod dinosaurs (including birds; Gatesy and Dial 1996; Hunter 1998), and mammals (e.g., Walker 1987; Stock 2001). A hypothesis of complete or partial genetic decoupling of these serial structures provides the testable prediction that the evolutionary correlation has decreased in the affected evolutionary lineages where characters have more freedom to evolve independently.

Although these hypotheses for multivariate phenotypic diversification imply a covariance structure among characters that changes over time, typical approaches for the analysis of continuous characters on a phylogeny ignore this possibility. To analyze the evolutionary correlation in a phylogenetic context we usually either: (1) obtain phylogenetically independent contrasts, using the method of Felsenstein (1985), and then analyze these contrasts using parametric regression or correlation techniques; or (2) use the method of phylogenetic generalized least squares (Grafen 1989; Martins and Hansen 1997; Rohlf 2001) to fit a bivariate or multiple regression model while controlling for the phylogenetic relationships among the taxa in the tree assuming a single, constant rate of Brownian motion evolution, or some modification thereof (Pagel 1999; e.g., Revell and Harrison 2008). The typical application of these approaches thus generally assumes both constant evolutionary rates for individual characters and invariant correlations between characters across the branches of the phylogenetic tree.

Existing methods have some flexibility to accommodate heterogeneity in the evolutionary rate (Garland and Ives 2000), and can be used, under some circumstances, to test for differences in the relationship between characters in different parts of the tree (Garland et al. 1993; Garland and Ives 2000). These methods are either based on independent contrasts (Felsenstein 1985), or on numerical simulation (Garland et al. 1993). Contrasts-based methods suffer the shortcoming that they require the unambiguous assignment of all sets of paired sister nodes to an evolutionary rate category (e.g., Garland and Ives 2000). Simulation-based methods suffer the shortcoming that they do not allow meaningful estimation of the parameters of the evolutionary process (e.g., the phylogenetic analysis of covariance of Garland et al. 1993). Alternative methods that use a generalized linear modeling approach to study the evolutionary rate of a single character (e.g., Martins 1994; Martins and Hansen 1997) could be plausibly adapted to estimate rate heterogeneity in multiple characters; however, these methods also rely on the calculation of independent contrasts (Martins 1994).

Herein, we propose a new method to test for shifts in the evolutionary correlation by using maximum likelihood to fit multiple evolutionary rate matrices, also called evolutionary variance–covariance matrices, to different branches of a phylogenetic tree. The evolutionary rate matrix contains, on its diagonal, the evolutionary variances or rates for individual characters, and, on its off-diagonal, the evolutionary covariances (Revell and Harmon 2008). Following Revell and Harmon (2008), we use the term “evolutionary rate matrix” because the matrix is a multivariate representation of the evolutionary rate (O’Meara et al. 2006), and (under Brownian motion) completely describes the distribution from which evolutionary changes are drawn. Because the evolutionary correlation between two characters is a function of their evolutionary variances and covariance, fitting different rate matrices also yields different evolutionary correlations for different parts of the tree. As far as we know, this is the first method in which a likelihood approach is used to estimate them.

The method is based on a Brownian motion model of continuous character evolution, in which evolutionary changes are drawn from a multivariate normal distribution with variances and covariances that are proportional to the time elapsed. Thus, our approach extends existing methods that also use a Brownian model to test for changes in the rate of evolution for a single character (O’Meara et al. 2006; Thomas et al. 2006) or concerted changes in all the elements of the multivariate evolutionary rate matrix (Revell and Harmon 2008). However, our method additionally tests for nonproportional shifts in the evolutionary variances and covariances (of which the evolutionary correlations are a function) among phylogenetic lineages.

A particular advantage of our approach is that it allows testing of any specific a priori hypothesis for rate matrix heterogeneity—regardless of whether that hypothesis consists of monophyletic sets of taxa (i.e., our approach is “noncensored,” in the sense of O’Meara et al. 2006). For example, we can use the reconstruction of a binary or discrete multistate characteristic on the tree to assign branches and even portions of branches to different rate matrix categories (O’Meara et al. 2006). These reconstructed traits can pertain to habitat, biogeographic region, ecology, or any other plausible influence on the evolutionary rate matrix. With appropriate a priori justification, we can go even further and assign internal branches and parts of branches of the tree to different rate matrix categories. For example, we might test a hypothesis in which the evolutionary rate matrix differed between the Pliocene and Pleistocene, or a hypothesis in which the rate matrix differed on the branches of the tree before and after a mass extinction event (O’Meara et al. 2006).

In the present study, we apply the likelihood method to an empirical dataset and phylogeny from centrarchid fishes to test an a priori hypothesis that the evolutionary correlation between two aspects of buccal morphology has changed as a consequence of a change in trophic habit. We also test the method for type I error and power using simulated data generated by one and two rate matrix Brownian motion simulations.



To fit a single correlation model we used the following procedure. First, for a tree containing n species, we calculated a single n×n matrix, C. C contains, in each element Cij, the height above the root for the common ancestor of species i and j (Felsenstein 1973; Rohlf 2001). When C is computed from a phylogenetic tree in which the branches of the tree are proportional to time, Cij represents the time of shared history between the species. Unless it can be sensibly justified, phylogenetic trees for which branch lengths are not expected to be proportional to time (such as Maximum Parsimony trees) and trees for which branch lengths are unavailable should be avoided.
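The construction of C can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the study: the tree encoding (a dictionary mapping each node to its parent and the length of the branch leading to it), the function names, and the three-taxon example tree are all assumptions made for the sketch.

```python
import numpy as np

def path_to_root(node, parent):
    """Nodes from `node` up to, but excluding, the root.
    Each node stands for the branch that leads to it."""
    path = set()
    while node in parent:
        path.add(node)
        node = parent[node][0]
    return path

def phylo_cov(tips, parent):
    """C[i, j] = summed length of the branches shared by the
    root-to-tip paths of tips i and j (height of their common ancestor)."""
    n = len(tips)
    paths = [path_to_root(t, parent) for t in tips]
    C = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            C[i, j] = sum(parent[v][1] for v in paths[i] & paths[j])
    return C

# Hypothetical ultrametric tree ((A:1,B:1):2,C:3); "AB" is an internal node.
parent = {"A": ("AB", 1.0), "B": ("AB", 1.0),
          "AB": ("root", 2.0), "C": ("root", 3.0)}
C = phylo_cov(["A", "B", "C"], parent)
print(C)
```

The diagonal holds the total root-to-tip distance for each species, and the off-diagonal entries hold the shared history of each pair.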

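By assuming multivariate Brownian motion as our model for the evolutionary process, we can then compute an analytic solution for the maximum-likelihood estimate (MLE) of the evolutionary rate matrix for m traits as follows:

$$\mathbf{R} = \frac{(\mathbf{X} - \mathbf{1}\mathbf{a}')'\,\mathbf{C}^{-1}\,(\mathbf{X} - \mathbf{1}\mathbf{a}')}{n} \qquad (1)$$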


In this equation, R is the MLE of the evolutionary rate matrix, assuming here that a single rate matrix prevails on all the branches of the tree, X is an n × m matrix containing the observations for the n species in rows and the m traits in columns, 1 is an n × 1 column vector of ones, and a is a vector containing the “phylogenetic means,” which is equivalent to the set of m MLEs for the ancestral states for each character at the root node of the tree. Equation (1) here is the same as equation (1) in Revell and Harmon (2008). Like the ML estimator for the variance, the estimator R will be biased by the factor (n − 1)/n, which goes to 1.0 as n goes to ∞ (thus making R asymptotically unbiased). Even for small n, such as n = 28 in this study, the expected bias of the estimator is slight (e.g., −3.6%).
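As a rough numerical sketch of the single rate matrix estimator, the following Python function computes the phylogenetic means and then R, dividing by n as the (n − 1)/n bias factor above implies. The function name and the toy values of C and X are assumptions for illustration only. The means are computed with the single matrix form a′ = (1′C⁻¹1)⁻¹1′C⁻¹X.

```python
import numpy as np

def fit_single_rate_matrix(X, C):
    """ML estimate of one evolutionary rate matrix under Brownian motion.
    X: n x m trait matrix (species in rows); C: n x n shared-history matrix."""
    n = X.shape[0]
    Cinv = np.linalg.inv(C)
    one = np.ones((n, 1))
    # Phylogenetic means (1 x m): GLS estimate of the root state of each trait.
    a = np.linalg.solve(one.T @ Cinv @ one, one.T @ Cinv @ X)
    resid = X - one @ a
    # Dividing by n gives the ML estimate, biased by (n - 1)/n.
    R = resid.T @ Cinv @ resid / n
    return R, a.ravel()

# Toy inputs (made up for the sketch): three species, two traits.
C = np.array([[3.0, 2.0, 0.0], [2.0, 3.0, 0.0], [0.0, 0.0, 3.0]])
X = np.array([[1.0, 0.5], [1.2, 0.9], [0.1, -0.3]])
R, a = fit_single_rate_matrix(X, C)
# Evolutionary correlation implied by R.
r = R[0, 1] / np.sqrt(R[0, 0] * R[1, 1])
```

When C is the identity matrix (a star phylogeny with unit branch lengths), the estimate reduces to the ordinary ML covariance matrix of the columns of X, which is a convenient sanity check.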

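The likelihood of the rate matrix R can be found by evaluating the following equation for the likelihood (L1):

$$L_1 = \frac{\exp\left[-\tfrac{1}{2}\,(\mathbf{y} - \mathbf{D}\mathbf{a})'\,(\mathbf{R} \otimes \mathbf{C})^{-1}\,(\mathbf{y} - \mathbf{D}\mathbf{a})\right]}{\sqrt{(2\pi)^{nm}\,\left|\mathbf{R} \otimes \mathbf{C}\right|}} \qquad (2)$$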


In this equation, which is based on the multivariate normal density, y is a columnarized n·m × 1 vector containing the data from X (such that y1 through yn are the data from trait 1, yn+1 through y2n are the data from trait 2, etc.), D is an n·m × m design matrix in which each entry, Dij, is 1.0 if (j − 1)·n < i ≤ j·n, and 0.0 otherwise (Felsenstein 1973; Freckleton et al. 2002; Revell and Harmon 2008), and R ⊗ C is the Kronecker tensor product of R and C (Revell and Harmon 2008). The computation R ⊗ C, in which each element of R is multiplied by each element of C, results in an n·m × n·m matrix which is the expected variance–covariance matrix for the observations at all tips for all traits, given the single evolutionary rate matrix R and Brownian motion as the evolutionary process (Hohenlohe and Arnold 2008; Revell and Harmon 2008).
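The single rate matrix likelihood can be evaluated on the log scale, which is numerically safer than the raw density. The sketch below assumes NumPy and uses hypothetical function names; it builds y, D, and R ⊗ C exactly as defined above and returns log(L1).

```python
import numpy as np

def design_matrix(n, m):
    """n*m x m matrix with D[i, j] = 1 when row i of y belongs to trait j."""
    return np.kron(np.eye(m), np.ones((n, 1)))

def loglik_single(X, C, R, a):
    """Log-likelihood of one rate matrix R under multivariate Brownian motion."""
    n, m = X.shape
    y = X.T.reshape(-1, 1)            # trait 1 for all species, then trait 2, ...
    D = design_matrix(n, m)
    V = np.kron(R, C)                 # expected covariance among all tip values
    resid = y - D @ np.asarray(a, dtype=float).reshape(-1, 1)
    _, logdet = np.linalg.slogdet(V)
    quad = (resid.T @ np.linalg.solve(V, resid)).item()
    return -0.5 * (quad + n * m * np.log(2 * np.pi) + logdet)

# Toy check: one trait, three unrelated species with unit variance, data at the mean.
ll = loglik_single(np.zeros((3, 1)), np.eye(3), np.array([[1.0]]), [0.0])
```

In the toy check the quadratic form and the log-determinant are both zero, so the result is simply −(3/2)·log(2π).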

Given an a priori alternative hypothesis of two or more evolutionary rate matrices in different parts of the phylogenetic tree, we can also evaluate the likelihood of this hypothesis. To do so, we must first construct p matrices, C1 through Cp, one for each of the p versions of R that we have a priori hypothesized. These can be computed by summing the branches of the phylogenetic tree associated with each hypothesized rate matrix into the matrix Ci for each of i = 1, 2, …, p hypothesized rate matrices. This procedure is illustrated in considerable detail in the appendix to Revell (2008). For example, for two rate matrices, R1 and R2, we construct two C matrices, C1 and C2, and then evaluate the following equation for the likelihood (L2):

$$L_2 = \frac{\exp\left[-\tfrac{1}{2}\,(\mathbf{y} - \mathbf{D}\mathbf{a})'\,(\mathbf{R}_1 \otimes \mathbf{C}_1 + \mathbf{R}_2 \otimes \mathbf{C}_2)^{-1}\,(\mathbf{y} - \mathbf{D}\mathbf{a})\right]}{\sqrt{(2\pi)^{nm}\,\left|\mathbf{R}_1 \otimes \mathbf{C}_1 + \mathbf{R}_2 \otimes \mathbf{C}_2\right|}} \qquad (3)$$


The branches that we add into the matrices C1 and C2 are not required to be monophyletic, adjacent, or even whole branches. If our hypothesis dictates different evolutionary rate matrices for different parts of a branch, then the portions of branch length can be added to C1 and C2 as if there is an invisible node bisecting the branch (Revell 2008).
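A sketch of how the partitioned matrices might be assembled, including a branch split between two rate matrix categories, is given below. The data layout is an assumption for illustration: each branch carries a dictionary of regime labels and the length of the branch spent in each regime, so a mid-branch shift is just a branch with two entries.

```python
import numpy as np

def ancestors(node, parent):
    """Nodes on the path from `node` to the root (root excluded)."""
    path = set()
    while node in parent:
        path.add(node)
        node = parent[node]
    return path

def regime_cov(tips, parent, segments, regime):
    """Shared-history matrix counting only branch length assigned to `regime`."""
    n = len(tips)
    paths = [ancestors(t, parent) for t in tips]
    Ck = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            Ck[i, j] = sum(segments[v].get(regime, 0.0)
                           for v in paths[i] & paths[j])
    return Ck

# Hypothetical tree ((A:1,B:1):2,C:3). The stem branch leading to (A,B)
# spends 1.5 units in regime 0 and its last 0.5 units in regime 1.
tips = ["A", "B", "C"]
parent = {"A": "AB", "B": "AB", "AB": "root", "C": "root"}
segments = {"A": {1: 1.0}, "B": {1: 1.0},
            "AB": {0: 1.5, 1: 0.5}, "C": {0: 3.0}}
C_other = regime_cov(tips, parent, segments, 0)
C_shift = regime_cov(tips, parent, segments, 1)
```

Because every unit of branch length is assigned to exactly one regime, the two matrices sum to the ordinary C matrix for the whole tree.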

Because C in equation (2) is equivalent to C1 + C2 in (3), L1 and L2 are equivalent expressions when R1 = R2 (i.e., when there is no difference in the rate matrix for different parts of the tree), due to the distributive property of tensor products. Unfortunately, for the situation in which R1 ≠ R2, there is no analytic solution that maximizes the likelihood (L2), so this must be maximized numerically.
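The equivalence of the two likelihoods when R1 = R2 is easy to confirm numerically. The following sketch, with made-up inputs and a hypothetical function name, evaluates the log-likelihood for any number of rate matrices and checks that the two matrix value collapses to the one matrix value when both rate matrices are identical.

```python
import numpy as np

def loglik_multi(X, C_list, R_list, a):
    """Log-likelihood with V equal to the sum of R_i (Kronecker) C_i."""
    n, m = X.shape
    y = X.T.reshape(-1, 1)
    D = np.kron(np.eye(m), np.ones((n, 1)))
    V = sum(np.kron(R, C) for R, C in zip(R_list, C_list))
    resid = y - D @ np.asarray(a, dtype=float).reshape(-1, 1)
    _, logdet = np.linalg.slogdet(V)
    quad = (resid.T @ np.linalg.solve(V, resid)).item()
    return -0.5 * (quad + n * m * np.log(2 * np.pi) + logdet)

# Made-up inputs: partitioned shared-history matrices, traits, and parameters.
C1 = np.array([[1.5, 1.5, 0.0], [1.5, 1.5, 0.0], [0.0, 0.0, 3.0]])
C2 = np.array([[1.5, 0.5, 0.0], [0.5, 1.5, 0.0], [0.0, 0.0, 0.0]])
X = np.array([[1.0, 0.5], [1.2, 0.9], [0.1, -0.3]])
R = np.array([[1.0, 0.3], [0.3, 0.5]])
R_alt = np.array([[0.4, -0.1], [-0.1, 2.0]])
a = [0.5, 0.2]

ll_one = loglik_multi(X, [C1 + C2], [R], a)
ll_two_equal = loglik_multi(X, [C1, C2], [R, R], a)      # same as ll_one
ll_two_diff = loglik_multi(X, [C1, C2], [R, R_alt], a)   # generally differs
```

In practice the two matrix log-likelihood would be handed to a numerical optimizer; the choice of optimizer is left open here.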

Optimization is somewhat complicated by the fact that R1 and R2 must satisfy the requirements of covariance matrices, that is, they must be positive semidefinite. The computational headache that would result from having to test every pair of covariance matrices for positive semidefiniteness prior to likelihood computation can be somewhat alleviated by optimizing the eigenstructure of R1 and R2 instead of the matrices themselves. This is because the eigenstructure of a covariance matrix is subject to several readily definable constraints (e.g., eigenvalues ≥ 0; eigenvectors orthogonal).
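One way to exploit these constraints for two traits is to let the optimizer search over two log-eigenvalues and one rotation angle, from which a valid covariance matrix is always recovered. This particular parameterization (log scale, single angle) is an assumption of the sketch rather than a description of the procedure used in the study.

```python
import numpy as np

def rate_matrix_from_eigen(log_eigvals, theta):
    """Build a 2 x 2 covariance matrix from unconstrained parameters.
    Exponentiating keeps the eigenvalues positive, and a rotation matrix
    keeps the eigenvectors orthonormal."""
    c, s = np.cos(theta), np.sin(theta)
    Q = np.array([[c, -s], [s, c]])
    return Q @ np.diag(np.exp(log_eigvals)) @ Q.T

R = rate_matrix_from_eigen([0.0, -2.0], 0.3)
```

Any real values of the three parameters give a symmetric, positive definite matrix, so no separate validity test is needed during the search.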

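For equations (1), (2), and (3), the vector of phylogenetic means, a, can be estimated as follows:

$$\mathbf{a} = (\mathbf{D}'\mathbf{V}^{-1}\mathbf{D})^{-1}\,\mathbf{D}'\mathbf{V}^{-1}\mathbf{y} \qquad (4)$$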


(Hohenlohe and Arnold 2008). In this equation, V = R ⊗ C for the situation of one hypothesized rate matrix, and V = R1 ⊗ C1 + R2 ⊗ C2 for the situation in which two rate matrices are assumed (V = R1 ⊗ C1 + … + Rp ⊗ Cp in the general case of p rate matrices). For one trait, as well as for any case in which R1 = kR2 (including k = 1.0), this equation will yield the same vector a as the equation a′ = (1′C⁻¹1)⁻¹(1′C⁻¹X), for C = C1 + … + Cp. This is the equation provided in Revell and Harmon (2008; from Rohlf 2001 and O'Meara et al. 2006). However, the expression is not generally equivalent to equation (4), above (derived from Hohenlohe and Arnold 2008), for circumstances in which the hypothesized matrices are not proportional (i.e., R1 ≠ kR2). The design matrix, D, is as previously defined.
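The generalized least squares estimate of the phylogenetic means, and its agreement with the simpler expression when a single rate matrix applies, can be checked with a short sketch. Function and variable names are hypothetical and the inputs are made up.

```python
import numpy as np

def phylo_means(X, V):
    """GLS estimate of the m phylogenetic means given the full
    n*m x n*m covariance matrix V of the stacked data vector y."""
    n, m = X.shape
    y = X.T.reshape(-1, 1)
    D = np.kron(np.eye(m), np.ones((n, 1)))
    Vinv_D = np.linalg.solve(V, D)
    Vinv_y = np.linalg.solve(V, y)
    return np.linalg.solve(D.T @ Vinv_D, D.T @ Vinv_y).ravel()

C = np.array([[3.0, 2.0, 0.0], [2.0, 3.0, 0.0], [0.0, 0.0, 3.0]])
X = np.array([[1.0, 0.5], [1.2, 0.9], [0.1, -0.3]])
R = np.array([[1.0, 0.3], [0.3, 0.5]])

a_gls = phylo_means(X, np.kron(R, C))
# Simpler single matrix form: (1'C^-1 1)^-1 (1'C^-1 X).
one = np.ones((3, 1))
Cinv = np.linalg.inv(C)
a_simple = np.linalg.solve(one.T @ Cinv @ one, one.T @ Cinv @ X).ravel()
```

With one rate matrix the two estimates coincide regardless of the value of R; with nonproportional R1 and R2 only the first function should be used.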

Once the quantities L1 and L2 have been evaluated using equations (2) and (3) respectively, they can be compared using a likelihood ratio or transformed to obtain their corresponding information criteria (such as the Akaike information criterion, AIC). For the former comparison, we first calculate −2log(L1/L2), which should be asymptotically distributed as a χ² with degrees of freedom equal to the difference in the number of parameters estimated in the denominator and numerator models. For p = 2 and m = 2 (as in our centrarchid data, below), three more parameters are estimated in the two rate matrix model than in the one rate matrix model, thus the likelihood ratio is expected to be asymptotically distributed as a χ² with three degrees of freedom.
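For the case of three degrees of freedom, the upper tail of the χ² distribution has a closed form, so the test can be sketched with the Python standard library alone. The function names are hypothetical; the statistic 9.317 is the value reported for the centrarchid data in Table 1.

```python
import math

def likelihood_ratio(loglik_null, loglik_alt):
    """-2 log(L1 / L2) computed from the two log-likelihoods."""
    return 2.0 * (loglik_alt - loglik_null)

def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variate with 3 degrees of freedom."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))

p_value = chi2_sf_df3(9.317)
print(round(p_value, 4))
```

For other degrees of freedom a general routine, such as the χ² survival function in a statistics library, would be needed instead of this special case.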

When n is small, as in this study (see below), −2log(L1/L2) may not be χ² distributed (Revell 2008). In this circumstance, one can also obtain the null distribution for the likelihood ratio by way of numerical simulation. In this case, we should generate a large number of datasets (say, 1000 or more) by simulation using the single matrix MLE of R as our generating evolutionary rate matrix, and compute the likelihood ratio for each. We then estimate the P-value for our test as the fraction of likelihood ratios obtained in simulation that are equal to or larger than the likelihood ratio obtained from our observed data and tree.
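The two building blocks of this simulation test, drawing tip data under a single rate matrix and converting a set of simulated likelihood ratios into a P-value, can be sketched as follows. The names and toy inputs are assumptions; fitting the one and two matrix models to each simulated dataset is the step that would sit between the two functions.

```python
import numpy as np

def simulate_bm(C, R, a, rng):
    """Draw one n x m dataset under multivariate Brownian motion
    with rate matrix R, shared-history matrix C, and root states a."""
    n, m = C.shape[0], R.shape[0]
    L = np.linalg.cholesky(np.kron(R, C))
    mean = np.repeat(np.asarray(a, dtype=float), n)   # trait-major, like y
    y = mean + L @ rng.standard_normal(n * m)
    return y.reshape(m, n).T                          # back to species x traits

def simulated_p_value(observed_lr, null_lrs):
    """Fraction of simulated likelihood ratios >= the observed one."""
    return np.mean(np.asarray(null_lrs) >= observed_lr)

C = np.array([[3.0, 2.0, 0.0], [2.0, 3.0, 0.0], [0.0, 0.0, 3.0]])
R = np.array([[1.0, 0.3], [0.3, 0.5]])
rng = np.random.default_rng(1)
X_sim = simulate_bm(C, R, [0.0, 1.0], rng)
p = simulated_p_value(2.0, [1.0, 2.0, 3.0, 4.0])
```

In the toy call three of the four simulated ratios are at least as large as the observed value, so the estimated P-value is 0.75.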

As an alternative to hypothesis testing, we can compare our one and two matrix models by first evaluating their AIC values (Akaike 1974). AIC provides a model selection criterion that weighs the likelihood of a model against the number of parameters estimated. The AIC value for a particular fitted model can be calculated as AIC = 2k − 2log(L), in which k is the number of parameters estimated and L is the likelihood of the model. The preferred model is the one with the lowest AIC score. Hurvich and Tsai (1989) also provided a modification of AIC corrected for small sample size (AICc), defined as follows:

$$\mathrm{AIC}_c = \mathrm{AIC} + \frac{2k(k+1)}{n-k-1} \qquad (5)$$


Here, as before, n is the number of the taxa in the analysis. When the number of traits, m, is m= 2 (as in this study), for the one-rate matrix model we estimate m(m− 1)/2 +m= 3 parameters in the rate matrix, and an additional m= 2 parameters in the vector a, thus k= 5 for this model. For the two rate matrix model, we estimate m(m− 1)/2 +m= 3 more parameters for the second rate matrix, and thus k= 8 for this model. Burnham and Anderson (2002) recommend using AICc in preference to AIC when the ratio of the number of observations divided by the number of parameters estimated, n/k, is small (e.g., n/k < 40). The values of AIC and AICc will converge for sufficiently large n/k.
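The parameter counts and information criteria are simple enough to verify by hand or with a few lines of Python. The sketch below uses hypothetical function names and plugs in the log-likelihoods reported for the centrarchid data in Table 1 (72.19 and 76.85, with n = 28).

```python
def n_params(m, p):
    """p rate matrices with m(m - 1)/2 + m free elements each, plus m root states."""
    return p * (m * (m - 1) // 2 + m) + m

def aic(loglik, k):
    return 2 * k - 2 * loglik

def aicc(loglik, k, n):
    """AIC with the small sample correction of Hurvich and Tsai (1989)."""
    return aic(loglik, k) + 2 * k * (k + 1) / (n - k - 1)

n = 28
k1, k2 = n_params(2, 1), n_params(2, 2)              # 5 and 8
aic1, aic2 = aic(72.19, k1), aic(76.85, k2)
aicc1, aicc2 = aicc(72.19, k1, n), aicc(76.85, k2, n)
print(round(aic1, 1), round(aicc1, 1), round(aic2, 1), round(aicc2, 1))
```

With n/k of 5.6 and 3.5 for the two models, both fall well below the n/k < 40 guideline, which is why the corrected criterion is reported alongside AIC.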

It should be kept in mind that this method assumes that the branches of the phylogeny assigned to each rate matrix have been hypothesized a priori. If, alternatively, we were to explore several different hypotheses for evolutionary heterogeneity for a given dataset, then we should also control for experiment-wise type I error using an appropriate procedure for multiple test correction (see Sokal and Rohlf 1995; Quinn and Keough 2002).


We applied this method to an empirical dataset and phylogeny for 28 centrarchid fish species. We analyzed two log-transformed, size-corrected, morphological features that describe the shape of the mouth cavity, gape width and buccal length (Fig. 1), in the context of the available phylogenetic reconstruction for these species (Near et al. 2005; Collar and Wainwright 2006). A complete description of the phylogenetic size correction and several diagnostic tests that were used to assess the validity of Brownian motion as a model for the evolutionary process in these data is provided in a prior article (Collar and Wainwright 2006).

Figure 1.

(A) Species mean trait values for gape width and buccal length (panel inset) for 28 species in the family Centrarchidae. (B) The phylogeny with branch lengths for the same 28 fish species. Species in the genus Micropterus, which consists of specialists on large, evasive prey, are highlighted in grey in both panels.

We hypothesized two different rate matrices: one for the Micropterus clade and the other for the remaining centrarchid lineages (Fig. 1). This a priori hypothesis is based on an inferred shift in the selective regime in the Micropterus lineage. Relative to other centrarchids, Micropterus species feed on a narrower range of prey items, primarily fish and crayfish (Collar et al. 2005). These large, evasive prey have likely imposed functional demands on mouth morphology that differ from those experienced by the other centrarchid fish species. We fit one and two rate matrix models to the data and tree, evaluated the likelihoods of each model, and then compared the models using log-likelihood ratios and information criteria.


We also conducted a simulation test of the method. We tested the type I error rate by simulating data under a rate homogeneous model for the evolutionary process. For this we used the MLE of the single rate matrix from our empirical dataset and tree. We simulated 1000 datasets on our 28 taxon centrarchid phylogeny. For each simulated dataset we estimated the likelihood of one and two rate matrix models, evaluated the log-likelihood ratio, and computed AIC and AICc values for each model. We estimated the type I error rate as the fraction of tests that yielded a significant log-likelihood ratio when compared to a χ² distribution with three degrees of freedom. We also evaluated the fraction of analyses in which AIC or AICc indicated that the two matrix model should be preferred. Because we generated the data for the type I error test under the conditions of our null hypothesis (rate matrix homogeneity), we were also able to use the distribution of likelihood ratios obtained here as our null distribution for the simulation test component of our empirical analysis, as described above.

We also explored the power and parameter estimation of the method. Again using the results from the centrarchid analysis as our generating model, we simulated 1000 datasets on the empirical tree, this time applying the branch assignment and rate matrix estimates from the empirical two rate matrix test. We then estimated the full, two matrix model using likelihood and evaluated parameter estimation by calculating the mean, mean bias, and variance of the estimators across runs. We also evaluated the power of the test under these circumstances as the fraction of analyses for which the method detected significant heterogeneity in the evolutionary rate matrix throughout the tree (i.e., the fraction of times in which the one matrix model was rejected by our likelihood-ratio hypothesis test; or, similarly, the fraction of times in which a two matrix model was selected by AIC or AICc).



Table 1 shows the single and full rate matrices from the test for evolutionary rate matrix heterogeneity on our centrarchid data and tree. The single rate matrix null was rejected by the likelihood-ratio test (−2log(L1/L2) = 9.317). This was true regardless of whether the likelihood ratio was evaluated against the χ² with three degrees of freedom (P = 0.0254), or by simulation (P = 0.0380 from 1000 simulations; Table 1). In the full model, Micropterus exhibits a threefold slower evolutionary rate for gape width but a much faster rate for buccal length, as well as a more than twofold higher evolutionary correlation between the characters, relative to the other centrarchid lineages. Model selection results were ambiguous (Table 1). Although the uncorrected information criterion (AIC) indicated that the two matrix model should be preferred (AIC1 − AIC2 = 3.317), the small sample corrected criterion (AICc) recommended the one matrix model (AICc1 − AICc2 = −1.535). Nonetheless, because the difference in AICc between the models is less than 2 units in this case, the preference indicated by AICc should be considered ambiguous (Burnham and Anderson 2002) (Table 1).

Table 1.  Empirical results. Shown is the maximum likelihood estimate (MLE) of R from a single evolutionary rate matrix model for the evolutionary process, and the MLEs for R1 (other centrarchid lineages) and R2 (Micropterus) from a two rate matrix model. The inferred evolutionary correlation(s), log likelihood, and uncorrected and small sample corrected Akaike information criterion values (AIC, AICc) are also presented for each model, along with the results from hypothesis testing and model selection. From the likelihood-ratio test the two matrix model is preferred (whether the P-value was obtained by comparison to the χ² distribution or via simulation). The results from AIC model selection are ambiguous, with the preferred model (indicated by the lowest AIC score) depending on whether small sample correction was used.

| Model | Rate matrix | r | log(L) | AIC | AICc |
| One matrix model | R (matrix image not recovered) | 0.415 | 72.19 | −134.4 | −131.7 |
| Two matrix model | R1 (matrix image not recovered) | 0.353 | 76.85 | −137.7 | −130.1 |
| | R2 (matrix image not recovered) | 0.801 | | | |

| Likelihood-ratio test: −2log(L1/L2) | P(χ², df = 3) | P(simulation) | Model selection: AIC1 − AIC2 | AICc1 − AICc2 |
| 9.317 | 0.0254 | 0.0380 | 3.317 | −1.535 |


Table 2 shows the results from the tests of type I error and power. We found that our method had very close to appropriate type I error, with the likelihood ratio and AIC slightly too liberal (i.e., choosing the incorrect, more complex model at a rate > 0.05), and the AICc was conservative (Table 2). We also found that the method had high power under the conditions of our simulations, rejecting rate matrix homogeneity for the majority of simulations, regardless of the criterion employed. Here again, however, AICc was more conservative than the likelihood ratio or AIC (Table 2).

Table 2.  Results from the tests of type I error and power. In columns are the models selected (one or two rate matrix), given the true models in rows. Model selection was performed either by comparison of the likelihood ratio to a χ² with appropriate degrees of freedom (LR-test), or by use of Akaike Information Criteria with (AICc) and without (AIC) small sample correction. For convenience and simplicity of presentation, we interpret the success or failure to reject the null hypothesis of rate matrix homogeneity in our likelihood-ratio test as implicating a two or one matrix model, respectively. When the true (i.e., generating) model is the one matrix model (rows 1, 2, and 3), the rightmost column represents the type I error rates for each criterion. By contrast, when the true model is the two matrix model (rows 4, 5, and 6), the rightmost column represents the power of the method to reject rate matrix homogeneity under the simulation conditions used in the present study.

| True model | Criterion | Selected: one matrix model | Selected: two matrix model |
| One matrix model | LR-test | 0.917 | 0.083 |
| Two matrix model | LR-test | 0.219 | 0.781 |

Table 3 shows the mean MLE of each matrix under conditions of evolutionary rate matrix homogeneity (the one matrix model) and heterogeneity (the two matrix model). Also shown are the mean biases of the estimates and the variability of parameter estimates across analyses. Generating conditions were the one and two matrix model MLE rate matrices from our empirical results (Table 1). We found that, on average, both the one and two matrix model parameter estimates closely approximated their corresponding generating values. Mean bias ranged from 0.95 to 0.99, with average bias across all simulations and ML optimizations close to the ratio theoretically expected under the conditions of our study (mean bias = 0.970; (n− 1)/n= 0.964 for n= 28).

Table 3.  Mean parameter estimates from one and two matrix simulated data. For comparison, the generating matrices are presented in Table 1 as the MLEs for the one and two matrix models estimated for the empirical dataset and tree. Mean biases, calculated as the element by element ratio of the estimate over its known generating value, are also shown (1.0=no bias). Standard errors are not errors of the estimates, but standard deviations of the MLEs of Ri across simulations.
One matrix model: (matrix entries were images and could not be recovered)
Two matrix model: (matrix entries were images and could not be recovered)


We propose a new method to fit multiple evolutionary rate matrices to the phylogeny and species data for two or more continuous characters. Our approach extends the single character methods of O’Meara et al. (2006) and Thomas et al. (2006) to multiple characters, and extends the multivariate method of Revell and Harmon (2008) by allowing the evolutionary rate matrix to change by factors other than a proportionality constant. The evolutionary rate matrix contains evolutionary variances (rates) on its diagonal and covariances elsewhere. As the evolutionary correlation between two traits is a function of their evolutionary variances and covariance, the method can be used to detect differences among phylogenetic lineages in the evolutionary correlations between continuous characters. This method can be applied to test hypotheses about how, when, and in what manner the evolutionary correlation has changed over time and among lineages. We illustrate the method with our empirical demonstration that the evolutionary rate matrix in centrarchids changes in a manner temporally coincident with an inferred shift in the dietary regime.

The likelihood method of this article can be similarly applied to other circumstances in which the evolutionary rate matrix might be expected to differ on the different branches of a phylogenetic tree. The method will be useful to test hypotheses about shifts in the shape and orientation of the adaptive surface, which may occur because of changes in the selective environment or in the functional integration of traits, as well as hypotheses about the consequences of changes in the additive genetic variance–covariance matrix, perhaps resulting from a shift in the mutation rate, the effect and degree of pleiotropy, or the effective population size. Despite the broad interest of such hypotheses, to our knowledge no prior study has provided a method whereby multiple evolutionary rate matrices (potentially differing in their correlation structure) are estimated for different parts of a phylogenetic tree.

Our illustrative example and simulations are performed using a phylogeny in which each hypothesized rate matrix partition in the tree consists of a set of contiguous branches (one a paraphyletic group, the other a nested monophyletic clade and stem branch; Fig. 1). As noted earlier, assigning branches to rate matrix partitions in this way is not at all a requirement of the method (e.g., Revell 2008). Given an a priori hypothesis justifying it, any set of branches can be assigned to any partition—so long as no portion of branch length is assigned to two different partitions. In fact, fractions of a branch can even be assigned to different partitions, for example, by using the set of stochastic character maps for a discrete character hypothesized to affect the evolutionary rates or covariances (Huelsenbeck et al. 2003; Bollback 2006).

We found that the method presented herein also has good statistical properties. Type I error was slightly elevated above its expected value, but low nonetheless. It is likely that the elevated estimated type I error in this study arises because the likelihood ratio only approximates a χ² distribution asymptotically, as the sample size increases. The small size of the empirical phylogeny used for simulation (n = 28) and the findings of Revell (2008; e.g., appendix fig. A4) support this interpretation. Lending further credence is the fact that the likelihood ratio of our empirical centrarchid two matrix versus one matrix comparison was less significant when assessed via simulation (although P < 0.05 in both cases; Table 1). Thus, it seems likely that the type I error of our method will decrease toward 0.05 as the number of species sampled is increased.

In this article, we focus on full one and two matrix models in which the estimated matrices can differ in all possible manners. Differences among lineages in the rate of evolution of either character, in the evolutionary correlation, or in some combination thereof might result in a significantly better fit for the two matrix model by our method. As an alternative to the simpleminded approach of fitting only one and two matrix models, we could have implemented a test in which various degrees of evolutionary rate matrix dissimilarity are hierarchically compared (e.g., Phillips and Arnold 1999). For example, we might have tested whether the matrices shared various aspects of their eigenstructure.

Hohenlohe and Arnold (2008) propose a related analysis in which different aspects of eigenstructure similarity between a hypothesized and ML estimated evolutionary rate matrix are compared. To preliminarily explore the application of this approach to our method, we used the same hierarchy of tests to evaluate successively more heavily parameterized models of character evolution for the data and tree from Micropterus and other centrarchid fishes (Fig. 1). The levels in the hierarchy are based on sharing a successively smaller fraction of the eigenstructure among matrices and are (for a 2 × 2 matrix): equality, proportionality, shared eigenvectors, and no common structure. We found that the only significant likelihood ratio falls between the shared eigenvectors and no common structure models. This probably reflects the fact that the greatest difference between the rate matrices estimated in this study is in their orientation (Table 4).

Table 4. Results from a hierarchy of tests for shared eigenstructure between evolutionary rate matrices conducted on the empirical centrarchid dataset and tree (Fig. 1). Rate matrices (Ri) and Σ are shown × 10⁴. Parameters are: Σi, the sum of the diagonals (trace) of Ri; ɛi, the ratio of the primary eigenvalue of Ri to the trace (i.e., λ1,i/Σi); and ϕi, the orientation in radians of the primary eigenvector of Ri with respect to the first trait axis. Likelihood ratios [−2(LR)] are taken between the model in each row and the model in the row above it (e.g., equality vs. proportionality in row 2, etc.). Degrees of freedom (df) is the difference in the number of parameters between compared models. Probability of type I error [P(LR)] is evaluated against a χ² distribution with df degrees of freedom. Akaike Information Criteria are shown both with (AICc) and without (AIC) small sample correction; the lowest (in this case, most negative) AIC value indicates the best-fitting model. A parameter constrained to be shared by the two matrices is listed once, in the column for the first matrix. The estimated matrices appeared as images in the source and are marked [image].

| Model | MLE(R1) | MLE(R2) | Σ1 | Σ2 | ɛ1 | ɛ2 | ϕ1 | ϕ2 | log(L) | −2(LR) | df | P(LR) | AIC | AICc |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Equality | [image] | (shared) | 8.88 | (shared) | 0.760 | (shared) | 0.423 | (shared) | 72.19 | | | | −134.4 | −131.7 |
| Proportionality | [image] | [image] | 8.73 | 9.45 | 0.761 | (shared) | 0.411 | (shared) | 72.20 | 0.025 | 1 | 0.999 | −132.4 | −128.4 |
| Shared eigenvectors | [image] | [image] | 9.16 | 7.96 | 0.810 | 0.518 | 0.258 | (shared) | 72.98 | 0.460 | 1 | 0.213 | −132.0 | −126.4 |
| No shared structure | [image] | [image] | 9.27 | 7.73 | 0.811 | 0.918 | 0.248 | 1.033 | 76.85 | 7.739 | 1 | 0.005 | −137.7 | −130.1 |
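
The summary parameters in Table 4 follow from the eigendecomposition of a symmetric 2 × 2 matrix. The following is a minimal sketch under stated assumptions, not the code used for the analysis: the function names `eigen_summary` and `from_summary` are hypothetical, the example matrix is invented, and the orientation is assumed to be reported in the range (−π/2, π/2].

```python
import math


def eigen_summary(r11, r12, r22):
    """Summarize the symmetric matrix [[r11, r12], [r12, r22]].

    Returns (trace, eps, phi): the trace, the primary eigenvalue divided by
    the trace, and the angle in radians between the primary eigenvector and
    the first trait axis.
    """
    trace = r11 + r22
    # Eigenvalues of a symmetric 2 x 2 matrix are trace/2 +/- radius.
    radius = math.hypot((r11 - r22) / 2.0, r12)
    lam1 = trace / 2.0 + radius
    phi = 0.5 * math.atan2(2.0 * r12, r11 - r22)
    return trace, lam1 / trace, phi


def from_summary(trace, eps, phi):
    """Rebuild (r11, r12, r22) from the three summary parameters."""
    lam1 = eps * trace
    lam2 = trace - lam1
    c, s = math.cos(phi), math.sin(phi)
    r11 = lam1 * c * c + lam2 * s * s
    r12 = (lam1 - lam2) * c * s
    r22 = lam1 * s * s + lam2 * c * c
    return r11, r12, r22


# Invented example matrix (not one of the estimates in Table 4).
trace, eps, phi = eigen_summary(2.0, 1.0, 2.0)
print(trace, eps, phi)
print(from_summary(trace, eps, phi))
```

In this parameterization, equality constrains all three quantities to be shared, proportionality frees the trace, shared eigenvectors additionally frees the eigenvalue ratio, and the no common structure model frees the orientation as well, which matches the one-parameter steps (df = 1) between rows of the table.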

Although Hohenlohe and Arnold (2008), following Flury (1988) and Phillips and Arnold (1999), focus on the progression of matrix similarity illustrated in Table 4, we note that matrices can differ in a whole variety of ways other than those reflected in this series. For example, matrices might share a common set of evolutionary rates, but differ in their correlation structure (or vice versa). This type of similarity can be difficult to detect using the hierarchy of tests in Table 4.

We focus on a comparison of two rate matrices each involving only two phenotypic traits; however, this is not a limitation of our method. It is straightforward to extend this method to fit a model in which matrices for more than two rate matrix categories or more than two phenotypic traits are evaluated. Nonetheless, empiricists should keep in mind that the number of parameters to be estimated increases as p[m(m − 1)/2 + m] + m, where p is the number of rate matrix partitions (groups of branches associated with each rate matrix) in the tree, and m is the number of characters. Thus, the number of parameters increases more rapidly with the addition of characters than with the addition of rate matrix partitions. Because numerical optimization is required to obtain the ML parameter estimates and likelihood, estimation will become progressively more difficult as characters are added to the analysis.
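
To make the growth in parameter number concrete, the formula above can be tabulated for a few combinations of p and m. This is a minimal sketch; the function name `n_parameters` is hypothetical.

```python
def n_parameters(p, m):
    """Number of parameters, p[m(m - 1)/2 + m] + m, for p rate matrix
    partitions and m characters."""
    per_matrix = m * (m - 1) // 2 + m  # covariances plus rates for one matrix
    return p * per_matrix + m


for p, m in [(1, 2), (2, 2), (3, 2), (2, 3), (2, 4)]:
    print(f"p = {p}, m = {m}: {n_parameters(p, m)} parameters")
```

For two characters, the one and two matrix models have 5 and 8 parameters, respectively. Adding a third partition brings the total to 11, whereas adding a third character to the two matrix model brings it to 15.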

We illustrate this method with an empirical test of similarity in the evolutionary rate matrices for two groups of centrarchid fishes. These rate matrices describe the evolution of gape width and buccal cavity length, two morphological features of the suction-feeding mechanism (Carroll et al. 2004). The demonstrated difference in evolutionary rate matrices for these lineages suggests that mouth shape has evolved differently in the Micropterus clade than it has throughout the rest of the centrarchid tree (Table 1). The strong, positive association between the evolution of gape width and buccal length in Micropterus suggests that changes in buccal cavity size are occurring without modification of shape. In contrast, the low evolutionary correlation between the characters in the other centrarchid lineages reveals that evolution has resulted in modifications to both buccal cavity size and shape.

The shift in the evolutionary rate matrix in Micropterus is concordant with a shift to a diet comprising primarily fish and crayfish, which are relatively large and evasive prey items. The other centrarchid lineages feed on a wider breadth of prey items that impose a greater range of functional demands on their capture. These vary from microcrustacea that swim freely in open water, to aquatic insects that burrow in the benthos or cling to substrates (Collar et al. 2005). Therefore, the observed shift in the evolutionary rate matrix might reflect these differences in functional demands imposed by different trophic niches.

We note that hypothesis tests (i.e., likelihood ratio) and model selection (i.e., AIC, AICc) yielded slightly discordant results in our study. In particular, AICc suggested that a simple, one matrix model was the best-fit model, whereas the AIC model selection criterion and likelihood-ratio tests based on comparison to the χ² distribution or simulation recommended the more complicated, two rate matrix model. This result is consistent with our finding that AICc is quite conservative regarding the selection of the more heavily parameterized model, choosing it when not the generating model in only 2.0% of simulations (Table 3).

Importantly, the discrepancy between results from hypothesis tests and information criteria does not necessarily represent a “failure” of either approach, because they have different goals. In fact, statistical hypothesis testing and information-theoretic model selection criteria represent fundamentally different paradigms in model choice (Burnham and Anderson 2002). For example, model selection criteria do not have as their target a type I error probability of 0.05, but rather their goal is to choose the best approximating model for inference (Burnham and Anderson 2002). Thus, concordance between the results of likelihood ratio or other hypothesis tests and model selection criteria is not necessarily expected under all circumstances. Figure 2 illustrates the predicted discordance between the model selection criteria AIC and AICc, and a likelihood-ratio test evaluated by comparison to the χ² distribution. In this figure, the region above each curve is the region of log-likelihood difference in which the more complicated model (Model 2 in the illustration) is chosen over the simpler model. For the sake of simplicity, we set n = 28 and k1 = 5 in the calculation of ΔAICc, as in this study. From a practical perspective, we suspect that AICc and hypothesis testing via simulation will be more conservative approaches to model choice when the number of samples is small, whereas all methods will probably converge in their recommendation for larger n (at some point this must be true for AIC and AICc, because AICc → AIC as n → ∞).

Figure 2. A plot of the model recommendations for a given difference in the number of parameters (k2 − k1) and log-likelihood scores. If a log-likelihood difference falls above the curve for a given model selection (AIC, AICc) or hypothesis testing criterion [LR(χ²)], then Model 2 is recommended by the criterion; whereas if the likelihood difference falls below the curve, Model 1 is recommended (or cannot be rejected). For AICc, we fixed k1 = 5 and n = 28 for this illustration. The vertical line k2 − k1 = 3 and our observed likelihood difference of log(L2) − log(L1) = 4.66 are also plotted (the latter by a star). Because our observed likelihood difference falls above the curves for AIC and LR(χ²), Model 2 is preferred by those criteria. By contrast, Model 1 is indicated by AICc, as in Table 1.
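
The comparison summarized in Figure 2 can also be checked numerically. The sketch below applies the standard definitions AIC = 2k − 2 log(L) and AICc = AIC + 2k(k + 1)/(n − k − 1), together with a likelihood-ratio test against a χ² distribution. It takes the equality and no shared structure rows of Table 4 as Models 1 and 2 (an assumption, although their log-likelihoods differ by the reported 4.66), and it assumes k2 = 8 from k1 = 5 and a three-parameter difference. The helper names are hypothetical, and `chi2_sf` is a closed-form tail probability for integer degrees of freedom written here only to avoid an external dependency.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (integer df >= 1, x >= 0)."""
    half = x / 2.0
    if df % 2 == 0:
        terms = sum(half ** j / math.factorial(j) for j in range(df // 2))
        return math.exp(-half) * terms
    terms = sum(half ** (j - 0.5) / math.gamma(j + 0.5)
                for j in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * terms


def aic(log_l, k):
    return 2.0 * k - 2.0 * log_l


def aicc(log_l, k, n):
    # Small-sample correction added to AIC.
    return aic(log_l, k) + 2.0 * k * (k + 1) / (n - k - 1)


n = 28                     # number of species
log_l1, k1 = 72.19, 5      # one matrix model (assumed: equality row)
log_l2, k2 = 76.85, 8      # two matrix model (assumed: no shared structure row)

lr = 2.0 * (log_l2 - log_l1)
p_value = chi2_sf(lr, k2 - k1)

prefers_2_lr = p_value < 0.05
prefers_2_aic = aic(log_l2, k2) < aic(log_l1, k1)
prefers_2_aicc = aicc(log_l2, k2, n) < aicc(log_l1, k1, n)

print(f"LR P-value: {p_value:.4f}; Model 2 preferred: {prefers_2_lr}")
print(f"AIC prefers Model 2: {prefers_2_aic}")
print(f"AICc prefers Model 2: {prefers_2_aicc}")
```

With these inputs, the likelihood-ratio test and AIC favor Model 2, whereas AICc favors Model 1, as described in the caption.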

In the present article, we provide a likelihood method and hypothesis testing framework for the analysis of evolutionary rate matrix heterogeneity in the context of evolutionary trees. We suggest that investigations into changes in the evolutionary rate matrix may provide additional insights into the differential diversification of forms among evolutionary lineages. Recently, methods have been developed to apply the phylogenetic approach to the comparison of morphological disparity among groups (Collar et al. 2005; O’Meara et al. 2006; Thomas et al. 2006). These methods have thus far focused primarily on the univariate analysis of evolutionary rate heterogeneity among lineages. The rate of evolution can be interpreted as the rate of accumulation of within-clade variation (Martins 1994; Hansen and Martins 1996) and as such has a direct relation to the manner in which morphospace is filled by a particular group as it evolves. However, an additional aspect of morphospace occupation involves the evolutionary covariance between characters. Low evolutionary correlations between characters should lead to broader multidimensional morphospace occupation, and may indeed be a hallmark of very diverse groups. We advocate the use of our method to investigate the role that might be played by the evolutionary correlations between characters in facilitating or inhibiting the acquisition of morphological disparity in different evolutionary lineages.

Associate Editor: A. Mooers


The authors would like to acknowledge helpful comments from L. Harmon, J. Losos, A. Mooers, P. Wainwright, and two anonymous reviewers, as well as the monetary support of the National Science Foundation (DEB-0519777 and DEB-0722485) and Harvard University's Department of Organismic and Evolutionary Biology.