We compiled phylogenies that represent composites of two recently published source phylogenies. Our primary (hereafter “main”) phylogeny includes a substantial sample of all coelurosaurian theropods (those more derived than Tyrannosauroidea) from compsognathids to maniraptorans, including Mesozoic Aves, reflecting the focus of the study. The phylogeny is complete for relatively well-represented fossil taxa, but it excludes those based on fragmentary fossils or whose validity is questioned. The main source phylogenies are a comprehensive tree of theropods (Turner et al. 2012), and a Mesozoic avian phylogeny (O'Connor and Zhou 2013). We further complemented the main composite tree by the addition of species missing from the source phylogenies, and by the resolution of polytomies. For the main phylogeny, we excluded Jixiangornis and Shenzhouraptor as they are considered possible synonyms of Jeholornis (Zhou and Li 2010). In our main tree, Epidexipteryx is classified as a nonavian paravian (Dececchi and Larrson 2013) and Epidendrosaurus is also excluded from the main analyses as it represents a juvenile (Zhang et al. 2002). Further, in the main analyses, Microraptor zhouianus is synonymous with Microraptor gui (Gong et al. 2012; Xing et al. 2013).
We tested the sensitivity of our analyses to alternative phylogenetic topologies and branch lengths with four additional trees: (1) the recent phylogeny of Godefroit et al. (2013), in which the new taxon Aurornis is regarded as the most basal bird, with a sister-group relationship between the Troodontidae and Aves; in this phylogeny no species are added, and any polytomies are reduced to descending species; (2) a revised version of the main tree with possible synonyms and juveniles added for completeness (“full”); (3) an alternative version of the main tree (“arch”) with the single change of placing Archaeopteryx outside Aves as the most basal member of the troodontids; and (4) a final tree (“alt. branch”) based upon the full tree and including Eshanosaurus deguchiianus (Barrett 2009), used to test differences in branch lengths (discussed further below). Further details of the phylogenies are provided in the Supporting Information (Informal Supertree Construction) and the phylogenies are available at figshare (http://dx.doi.org/10.6084/m9.figshare.820135). The ages of all taxa in the study were collated by MJ Benton (MJB) from synoptic resources and current primary literature. In all trees, branch lengths were scaled to time using fossil occurrence dates of taxa in the tree (Brusatte et al. 2008), using a script written for R (http://www.graemetlloyd.com/methdpf.html). This method dates each node according to its oldest descendant taxon, which means that every node would subtend at least one zero-length branch, even if the first occurrences in the fossil record are congruent with phylogeny. Therefore, to prevent zero-length branches, a unit of time is added to the descendant of the node from a preceding branch length (Brusatte et al. 2008). As the root length also contributes to the sharing of time between descendant branches, the root length can influence branch lengths elsewhere in the tree.
For the main tree, the root branch length was set to either five or 10 million years, meaning that the root is placed at either 171 Ma or 176 Ma. These estimates are broadly in line with stratigraphy for the origin of coelurosaurians that are more derived than tyrannosauroids (Carrano et al. 2012). To test for the possibility that underestimated branch lengths might have an effect on rate calculations (e.g., leading to false inference of high rates), we added the dubious theropod Eshanosaurus deguchiianus from the Early Jurassic (Barrett 2009; Zanno 2010) to the main phylogeny, thereby placing the root of the tree back to around 200 Ma and having the effect of lengthening internal branches on the phylogeny. We consider the effects of Eshanosaurus to be extreme: the branch length leading to Paraves is almost 20× longer (6.08 Ma) for trees including, compared to those excluding, Eshanosaurus (0.31 Ma on the full tree).
As a further test of the influence of branch lengths on rate calculations, we undertook a sensitivity analysis using the timePaleoPhy function in the R package paleotree (Bapst 2013). In these analyses, we used three alternative branch-scaling methods: minimum branch length (set to 2 Ma), in which branches are set to a minimum defined value and later branches are shortened to accommodate the true timing of diversification; additive branch length (with an additive value of 1 Ma), in which 1 Ma is added to each branch; and equal, which is equivalent to the Graeme Lloyd script but here takes dates from a uniform distribution of the age range. For the main phylogeny and the phylogeny based on Godefroit et al. (2013), we obtained a sample of five trees for each of these three branch-scaling methods; age ranges of taxa were taken from the Paleobiology Database (Carrano et al. 2013).
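The effect of dating nodes by their oldest descendant, and of enforcing a minimum branch length, can be illustrated with a minimal sketch. The toy tree, taxon names, and dates below are hypothetical, and the minimum-branch-length function is a simplified variant that only pushes nodes back in time; it is not the implementation used in the R script or in paleotree.

```python
# Hypothetical toy tree (child -> parent) with first-appearance dates in Ma.
PARENT = {"A": "n1", "B": "n1", "n1": "root", "C": "root"}
TIP_AGE = {"A": 150.0, "B": 125.0, "C": 150.0}


def children(node):
    """Return the direct descendants of a node."""
    return [child for child, parent in PARENT.items() if parent == node]


def basic_node_ages():
    """Date every internal node by its oldest descendant tip."""
    ages = dict(TIP_AGE)

    def visit(node):
        if node not in ages:
            ages[node] = max(visit(child) for child in children(node))
        return ages[node]

    visit("root")
    return ages


def min_branch_ages(min_len):
    """Simplified assumption: push each node back so that every
    descendant branch is at least min_len long."""
    ages = dict(TIP_AGE)

    def visit(node):
        if node not in ages:
            ages[node] = max(visit(child) + min_len for child in children(node))
        return ages[node]

    visit("root")
    return ages


def branch_lengths(ages):
    """Branch length = age of parent node minus age of child."""
    return {child: ages[parent] - ages[child] for child, parent in PARENT.items()}


if __name__ == "__main__":
    print(branch_lengths(basic_node_ages()))     # contains zero-length branches
    print(branch_lengths(min_branch_ages(2.0)))  # every branch is >= 2 Ma
```

In this toy case the basic dating leaves the branches to A, C, and the internal node at zero length, which is the problem the time-sharing and minimum-branch-length approaches are designed to avoid.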
Data were collected on the length of the femur and forelimb elements. Femur length is a widely used proxy for overall body size in theropods, which itself is a proxy for a large number of biologically relevant traits (Carrano 2006). Forelimb size is recorded as total humerus + forearm (radius or ulna) + manus lengths. All data were collected from published sources and the Paleobiology Database (Carrano et al. 2013), and where multiple measurements were available they were recorded as means. Measurements for all elements were preferably taken from a single specimen, but as we wanted to maximize the number of taxa included, this was not always possible. Clearly there could be problems in assessing relative metrics if data for femur and forelimb lengths are taken from animals of very different body sizes. To mitigate this, the analyses were run twice, first with our compiled data (Carrano 2006, and other sources) and then with data from a single paper (Dececchi and Larrson 2013). The body size proxy from this source was snout-vent length (SVL), an alternative to femur length; although femur length is widely used as a proxy for body size in theropod dinosaurs (Carrano 2006), another view is that it may be a poor estimator of body size in Paraves, particularly among Aves (Dececchi and Larrson 2013). Prior to analysis, all data were log-transformed. All data and phylogenies are available at doi.org/10.6084/m9.figshare.820135.
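The data preparation described above can be sketched as follows. The function names and example measurements are hypothetical, and base-10 logarithms are an assumption, as the base of the log-transformation is not stated.

```python
import math
from statistics import mean


def forelimb_length(humerus, forearm, manus):
    """Total forelimb length = humerus + forearm (radius or ulna) + manus."""
    return humerus + forearm + manus


def log_trait(measurements):
    """Average repeated measurements for a taxon, then log-transform.
    Base 10 is an assumption made for this illustration."""
    return math.log10(mean(measurements))


if __name__ == "__main__":
    # Hypothetical measurements in mm.
    total = forelimb_length(humerus=100.0, forearm=80.0, manus=120.0)
    print(total, log_trait([90.0, 110.0]))
```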
We found morphological data on femur length from 125 species and on forelimb length from 76 species, of which 71 species had data on both femur and forelimb. Branch lengths were estimated on complete phylogenies, which were subsequently pruned to match the available data. The final trees with the effective tree size for femur (125 species), forelimb (76 species), and femur and forelimb (71 species) are shown (Fig. S1).
MODELS OF TRAIT EVOLUTION
We modeled morphological evolution using four complementary phylogenetic comparative methods: (1) we estimated the phylogenetic position and magnitude of changes in the rate of evolution of body size and forelimb length using the trait MEDUSA method of Thomas and Freckleton (2012); (2) we compared the mode and rate of evolution for body size and forelimb among Paraves, Aves, and nonparavian Theropoda; (3) we assessed the fit of models of directional evolution of body size in Paraves and Aves respectively to test if miniaturization continued within either of these clades; and (4) we tested for changes in the coevolutionary relationship between body size and forelimb length among Paraves, Aves, and nonparavian Theropoda. Below we describe the models in detail.
- Rates of evolution
The trait MEDUSA method tests for shifts in the rate of evolution on the branches of the phylogeny in which the location and magnitude of shifts are not known a priori (Thomas and Freckleton 2012). These shifts can either be clade-wide (shared by all branches of a clade), or on a single internal branch leading to the node that represents the most recent common ancestor of a clade. The trait MEDUSA algorithm starts with a baseline of a homogeneous Brownian motion model across the entire phylogeny; it then iterates across all nodes in the tree, allowing a different rate at each clade and individual branch in turn, to locate the shift that most improves the likelihood of the model. This single shift is fixed and is the starting point for the next step, where a second shift is located. This process continues up to a user-defined maximum number of shifts (Thomas and Freckleton 2012). The best overall model is assessed by comparison of the Akaike Information Criterion (AIC) among the best constant-rate, one-shift, two-shift, and subsequent models. Alternative approaches to identifying shifts in evolutionary rates have been developed that use Reversible-Jump MCMC (RJMCMC) methods (Eastman et al. 2011; Venditti et al. 2011). The RJMCMC method of Eastman et al. (2011), as used by Benson and Choiniere (2013), only allows rate shifts to occur across whole clades. This is an important constraint, because shifts that occur along a single internal branch cannot be readily identified and the method may instead average such a high rate on a single branch across all descendant lineages, so leading to the false inference of high rates across whole clades. The RJMCMC method of Venditti et al. (2011) models both single-branch and clade shifts, but the current implementation is limited to univariate analyses. The trait MEDUSA model, as implemented in the R package MOTMOT, can accommodate multivariate data.
To do so, it models changes in covariance among traits, where all elements of the trait covariance matrix are modified by a single scalar such that the proportionality of the matrix is constant across the tree, that is, the eigenvectors (the correlations among traits) are constant but eigenvalues are proportional. Revell and Collar (2009) modeled multivariate evolution by allowing each element of the trait covariance matrix to vary among lineages, which increases the generality and potentially the biological realism of the model. However, their model requires that the locations of shifts are defined a priori and cannot be applied to shifts that occur along single internal branches in the phylogeny.
We used the function transformPhylo.ML with the tm2 algorithm in the R package MOTMOT (Thomas and Freckleton 2012) to fit the trait MEDUSA model. We fitted the model to (1) body size (univariate), (2) total forelimb length (univariate), and (3) body size + forelimb length (multivariate). We allowed up to five possible rate shifts (although in practice the best-fitting model always contained fewer shifts) and set a minimum clade size of five, which prevents the algorithm from searching for shifts in very small clades. To determine an appropriate ΔAICc threshold, a simulation was run in which BM was modeled on the phylogeny 1000 times, and the trait MEDUSA was then used to detect a single shift on these simulated data of BM. The 95th percentile of the difference between the AICc of the BM and single-shift model was then used as the AICc cut-off value (Thomas and Freckleton 2012). After simulation, this value was found to be 9.22; this means that during application of the tm2 algorithm, an additional rate shift that improves the AICc by <9.22 compared to a model with fewer shifts would be rejected.
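The stopping rule described above can be sketched as a stepwise comparison of AICc values. This is an illustration only and not the MOTMOT code: the function names are hypothetical, the log-likelihoods are invented, and the parameter counts (two for the constant-rate model, two more per shift) are assumptions made for the example.

```python
def aicc(log_lik, k, n):
    """Small-sample corrected AIC for k parameters and n observations."""
    return 2 * k - 2 * log_lik + (2 * k * (k + 1)) / (n - k - 1)


def select_n_shifts(log_liks, n, cutoff=9.22, base_k=2, k_per_shift=2):
    """log_liks[i] is the log-likelihood of the best model with i shifts.

    An extra shift is kept only if it lowers AICc by at least `cutoff`
    relative to the current best model; otherwise the search stops.
    base_k and k_per_shift are assumed parameter counts.
    """
    best_shifts = 0
    best_score = aicc(log_liks[0], base_k, n)
    for shifts in range(1, len(log_liks)):
        score = aicc(log_liks[shifts], base_k + k_per_shift * shifts, n)
        if best_score - score < cutoff:
            break
        best_shifts, best_score = shifts, score
    return best_shifts


if __name__ == "__main__":
    # Hypothetical log-likelihoods for the 0-, 1-, and 2-shift models.
    print(select_n_shifts([-100.0, -90.0, -88.0], n=71))
```

With these invented values, the first shift improves AICc by more than the cut-off and is retained, whereas the second does not and is rejected.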
An increase in evolutionary rate on a single branch constitutes either rapid change on that branch, or a parallel (directional) clade-wide shift in a trait value (Thomas and Freckleton 2012). As the trait MEDUSA method cannot distinguish between these alternative evolutionary interpretations, below we describe two complementary approaches (methods 2 and 3 listed above) to clarify these patterns.
We used variants of the Ornstein–Uhlenbeck (OU) model of evolution (Hansen 1997; Butler and King 2004; Beaulieu et al. 2012) to test alternative evolutionary patterns among Paraves and Aves. We fitted alternative OU-based models using the R package OUwie (Beaulieu et al. 2012), which allows lineages to differ in three parameters: (1) θ, often referred to as the primary adaptive optimum; (2) α, variously referred to as the evolutionary pull toward those optima, or the strength of stabilizing selection; and (3) σ, the rate of stochastic evolution (Beaulieu et al. 2012). We fitted seven alternative evolutionary models to the femur and forelimb data respectively and compared model fit using AICc. We repeated the models allowing a shift in parameters among either Paraves or Aves. The models were as follows: (1) BM1, Brownian motion model, single σ, does not estimate α or θ; (2) BMS, two σs, does not estimate α or θ; (3) OU1, OU model, single θ, α, and σ; (4) OUM, two θs, single α and σ; (5) OUMV, two θs, two σs, single α; (6) OUMA, two θs, two αs, single σ; and (7) OUMVA, two θs, two σs, two αs. In all models, the ancestral state θ0 was not included, so in each model the starting value of θ is estimated from an OU stationary distribution.
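The roles of θ, α, and σ can be illustrated with the standard expressions for the expected value and variance of a single-regime OU process. This is a minimal sketch and not the OUwie implementation; the function names are hypothetical, and the rate is parameterized here as the variance term σ² (sigma2), which is an assumption about notation.

```python
import math


def ou_mean(x0, theta, alpha, t):
    """Expected trait value after time t, starting from x0."""
    return theta + (x0 - theta) * math.exp(-alpha * t)


def ou_variance(sigma2, alpha, t):
    """Trait variance after time t; reduces to Brownian motion when alpha = 0."""
    if alpha == 0:
        return sigma2 * t
    # expm1 keeps precision when alpha * t is very small.
    return sigma2 / (2 * alpha) * -math.expm1(-2 * alpha * t)


def stationary_variance(sigma2, alpha):
    """Variance of the OU stationary distribution (requires alpha > 0)."""
    return sigma2 / (2 * alpha)


if __name__ == "__main__":
    print(ou_mean(x0=0.0, theta=1.0, alpha=0.5, t=2.0))
    print(ou_variance(sigma2=1.0, alpha=0.5, t=2.0))
    print(stationary_variance(sigma2=1.0, alpha=0.5))
```

As α approaches zero the variance approaches σ²t, the Brownian motion expectation, which is why the BM models in the list can be viewed as special cases that do not estimate α or θ.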
As we noted above, OU models are often presented in the context of stabilizing selection, but some of the parameters, particularly α, are difficult to interpret because statistical support for high α could arise from either evolution among lineages toward a shared optimum or from unaccounted error in the phylogeny or the data. We avoid speculative interpretation of α by treating it as a statistical tool to account for deviation from Brownian motion, rather than a function of evolutionary process.
High rates on single internal branches inferred using trait MEDUSA may be caused by high rates at the origin of that clade, or by directional changes of evolution across the clade away from the ancestral state. To test this possibility, we pruned the phylogenies to Paraves only, and compared different models of evolution within these clades. The null BM model was compared to a directional model of evolution, where a significantly higher likelihood for the directional model indicates evolution away from the ancestral state of the clade (Pagel 1997, 1999). We used BayesTraits (Pagel and Meade 2013) and fitted a BM model (model A) and a directional model (model B). We compared the fit of the models using likelihood ratio tests.
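A likelihood ratio test of this kind can be sketched as follows. The function name and log-likelihood values are hypothetical, and the sketch assumes that the directional model has exactly one more free parameter than the BM model, so that the test statistic is compared to a chi-square distribution with one degree of freedom.

```python
import math


def likelihood_ratio_test(lnl_null, lnl_alt):
    """Return (statistic, p-value) for nested models differing by one
    parameter (assumed here: directional model versus BM)."""
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    # Hypothetical log-likelihoods for model A (BM) and model B (directional).
    print(likelihood_ratio_test(lnl_null=-50.0, lnl_alt=-46.0))
```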
Aves and Paraves are expected to show different relationships between body size and forelimb compared to nonparavian theropods. Specifically, Dececchi and Larrson (2013) infer that apparent forelimb elongation in Aves arose from an allometric move to a small body size. To test for variation in the relationship of femur and forelimb between major theropod groups, we used the phylogenetic generalized least squares (PGLS) function in the R package caper to perform an analysis of covariance (ANCOVA; Orme et al. 2011). This corrects for statistical nonindependence in trait values (Felsenstein 1985; Freckleton et al. 2002) by simultaneously estimating and correcting for the strength of phylogenetic signal in the model residuals using Pagel's λ (Pagel 1997, 1999). A λ value of 1 indicates strong phylogenetic signal (i.e., evolution of a trait is consistent with a constant-rate Brownian motion model), and a value of 0 indicates a lack of phylogenetic signal (deviation from Brownian motion). We fitted a series of alternative models, with interaction terms defined by major clades. Interaction terms were fitted to forelimb length and one of the discrete dummy variables to define either (i) Aves (0), other taxa (1); (ii) Paraves (0), other taxa (1); or (iii) Aves (0), other Paraves (1), other taxa (2). A statistically significant interaction term would imply that the slope of the relationship between femur and forelimb differs among major theropod taxa.
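The role of Pagel's λ in this correction can be illustrated with the standard λ transformation, in which the off-diagonal elements of the phylogenetic variance-covariance matrix (shared branch lengths) are multiplied by λ and the diagonal is left unchanged. The sketch below is not the caper implementation; the function name and the three-taxon matrix are hypothetical.

```python
def lambda_transform(vcv, lam):
    """Multiply shared-history covariances (off-diagonals) by lam,
    leaving the diagonal (root-to-tip path lengths) unchanged."""
    size = len(vcv)
    return [
        [vcv[i][j] if i == j else lam * vcv[i][j] for j in range(size)]
        for i in range(size)
    ]


if __name__ == "__main__":
    # Hypothetical matrix: taxa 1 and 2 share 2 units of history,
    # taxon 3 diverges at the root.
    vcv = [[3.0, 2.0, 0.0],
           [2.0, 3.0, 0.0],
           [0.0, 0.0, 3.0]]
    print(lambda_transform(vcv, 1.0))  # unchanged: Brownian motion
    print(lambda_transform(vcv, 0.0))  # diagonal: no phylogenetic signal
```

With λ = 1 the matrix is unchanged, corresponding to Brownian motion, and with λ = 0 all covariances vanish, so that residuals are treated as phylogenetically independent.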
We used simulations to test for potential biases toward finding shifts in particular clades or branches. We note that this is not a test of the accuracy of the branch length reconstruction, but rather a test to determine whether any inferred rate shifts could be statistical artifacts. First, we simulated 1000 datasets under a constant-rate BM model on the main phylogeny pruned to 71 species, so as to match the number of species in the femur + forelimb dataset. Data simulated under these conditions are not expected to result in identification of major rate shifts. We then fitted the trait MEDUSA model to each simulated data vector and recorded the frequency and position of identified shifts. We paid particular attention to the short branch leading to Paraves, and recorded the position of each identified shift as the number of nodes away from the Paraves node, where negative numbers indicate nodes outside Paraves and positive numbers indicate nodes within Paraves. If any bias toward identifying rate shifts on the Paraves branch exists, we expect the frequency of shifts to peak at node 0.
Second, we simulated data on the same tree with an increase in rates on the Paraves branch of 10, 50, 100, 500, or 1000 times the background rate. For each magnitude of rate shift, we simulated 1000 datasets and again fitted the trait MEDUSA model, recording the frequency, location, and magnitude of shifts.
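Both simulation designs can be sketched with a simple simulator of Brownian motion along the branches of a tree, with an optional rate multiplier on one branch. This is an illustration under stated assumptions and not the code used in the study: the toy tree, node names, and function name are hypothetical, and edges must be listed so that each parent appears before its descendants.

```python
import math
import random

# Hypothetical toy tree: (parent, child, branch length), root first.
EDGES = [("root", "n1", 1.0), ("root", "C", 3.0),
         ("n1", "A", 2.0), ("n1", "B", 2.0)]


def simulate_bm(edges, sigma2, root_value=0.0, shift_branch=None,
                multiplier=1.0, rng=random):
    """Simulate trait values at all nodes under Brownian motion.

    The branch leading to `shift_branch` evolves at sigma2 * multiplier;
    all other branches evolve at the background rate sigma2.
    """
    values = {edges[0][0]: root_value}
    for parent, child, length in edges:
        rate = sigma2 * (multiplier if child == shift_branch else 1.0)
        step_sd = math.sqrt(rate * length)
        values[child] = values[parent] + rng.gauss(0.0, step_sd)
    return values


if __name__ == "__main__":
    rng = random.Random(1)
    # Constant-rate data, then data with a 100-fold increase on one branch.
    print(simulate_bm(EDGES, sigma2=1.0, rng=rng))
    print(simulate_bm(EDGES, sigma2=1.0, shift_branch="n1",
                      multiplier=100.0, rng=rng))
```

Setting the multiplier to 1 gives the constant-rate case; larger multipliers mimic a burst of change confined to a single internal branch, whose effect is inherited by all descendants of that branch.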