## ARE PHYLOGENIES INFORMATIVE?

Since their introduction into the comparative method over two and a half decades ago, phylogenetic methods have become increasingly common and increasingly complex. Despite this, concern persists about the ubiquitous use of these approaches (Price 1997; Losos 2011). From a statistical perspective, these concerns can be divided into two categories: (1) Do we have appropriate models that reflect the biological reality of evolution and represent meaningful hypotheses? and (2) Do we have adequate data to fit these models and to choose between them? The models have been greatly improved since their introduction, and can now account for stabilizing selection (Hansen and Martins 1996), multiple optima (Butler and King 2004), and differing rates of evolution across taxa (O’Meara et al. 2006) or through time (Pagel 1999; Blomberg et al. 2003); but little attention has been given to this second concern about data adequacy. In this article, we highlight the importance of these concerns, and illustrate a method for addressing them.

It can be difficult to accurately interpret the results of comparative methods without quantification of uncertainty, model fit, or power. Most current comparative methods do not attempt to quantify this uncertainty; consequently, inadequate power can easily lead to false biological conclusions. For instance, below we illustrate how estimates of phylogenetic signal (Gittleman and Kot 1990) using the λ statistic (Pagel 1999; Revell 2010) can reach opposite conclusions (from no signal, λ = 0, to approximately Brownian, λ ≈ 1) when applied to different simulated realizations of the same process. We also show that model selection by information criteria can prefer over-parameterized models by a wide margin. On the other hand, when a simpler model is chosen, it may be difficult to determine whether this merely reflects a lack of power. In both cases, the results can be correctly interpreted by estimating the uncertainty in parameter estimates and the statistical power (the ability to distinguish between models) of the model selection procedure.
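The sampling variability underlying this problem is easy to demonstrate outside a phylogenetic context. The sketch below is a deliberately simplified analogue, not a λ analysis: it repeatedly estimates a correlation coefficient from small samples drawn from one fixed process (true correlation 0.5) and shows that individual realizations can suggest anything from no association to a nearly perfect one, just as different realizations of the same evolutionary process can yield λ estimates anywhere from 0 to 1.

```python
import numpy as np

rng = np.random.default_rng(1)

# One fixed generating process: bivariate normal, true correlation 0.5
cov = [[1.0, 0.5], [0.5, 1.0]]

# Estimate the correlation from many independent small samples (n = 8)
ests = []
for _ in range(1000):
    z = rng.multivariate_normal([0.0, 0.0], cov, size=8)
    ests.append(np.corrcoef(z[:, 0], z[:, 1])[0, 1])
ests = np.array(ests)

# Individual realizations of the same process span a huge range:
# some samples suggest no (or negative) association, others near-perfect.
spread = (ests.min(), ests.max())
```

With only eight observations per realization, the estimates routinely span negative values to above 0.9 even though every sample comes from the identical process; point estimates alone, without uncertainty, would support contradictory conclusions.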

Here, we provide one solution to these problems using a parametric bootstrapping approach that fits easily within the framework used by many comparative methods. Because comparative methods rely on explicit models, this is readily implemented by simulating under the specified models. For the problem of uncertainty in parameter estimation, the bootstrap is a well-established and straightforward method (Efron 1987). A few areas of comparative methods have used a similar approach: for instance, phylogenetic ANOVA (Garland et al. 1993) calculates *P*-values of the test statistic by simulation under Brownian motion (BM). A similar approach was later introduced in the Brownie software (O'Meara et al. 2006) to generate the null distribution of likelihood ratios under BM; Revell and Harmon (2008) applied it and showed that this distribution can deviate substantially from χ^{2}, and Revell and Collar (2009) applied it as well. Unfortunately, such approaches have never become common in comparative analyses. Here, we describe a method due to Cox (1962) and used by others (Goldman 1993; Huelsenbeck and Bull 1996) that can be used in place of information criteria for model choice, allowing estimation of power and false positive rates, and can provide reliable confidence intervals on model parameter estimates. Although simulations are often performed when a new method is first presented, this practice rarely becomes routine. By providing a simple R package ("pmc," Phylogenetic Monte Carlo) for the method outlined, we hope Monte Carlo-based model choice and estimates of power become common in comparative methods.
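The logic of this Monte Carlo model-choice procedure can be sketched in a few lines. The example below is a minimal illustration, not the pmc package (which is written in R and simulates trait data on a phylogeny): here a pair of nested Gaussian models (a zero-mean null versus a free-mean alternative) stands in for, say, BM versus OU. The steps are the same ones the text describes: fit both models to the observed data, simulate many datasets under the fitted null to obtain the null distribution of the likelihood ratio (and hence a *P*-value and critical value), then simulate under the fitted alternative to estimate power.

```python
import numpy as np

rng = np.random.default_rng(0)

def loglik(x, mu, sigma):
    # Gaussian log-likelihood
    return np.sum(-0.5 * np.log(2 * np.pi * sigma**2)
                  - (x - mu)**2 / (2 * sigma**2))

def delta(x):
    # Likelihood ratio statistic: 2 * (lnL_alternative - lnL_null)
    s0 = np.sqrt(np.mean(x**2))               # MLE of sigma under null (mu = 0)
    s1 = np.sqrt(np.mean((x - x.mean())**2))  # MLE of sigma under alternative
    return 2 * (loglik(x, x.mean(), s1) - loglik(x, 0.0, s0))

# "Observed" data (here drawn under the alternative, mu = 0.5, for illustration)
n = 20
obs = rng.normal(0.5, 1.0, size=n)
d_obs = delta(obs)

# Null distribution: simulate under the fitted null, refit both models each time
s0_hat = np.sqrt(np.mean(obs**2))
null_d = np.array([delta(rng.normal(0.0, s0_hat, size=n)) for _ in range(2000)])
p_value = np.mean(null_d >= d_obs)
crit = np.quantile(null_d, 0.95)  # critical value at the 5% level

# Power: simulate under the fitted alternative; how often do we exceed crit?
mu_hat = obs.mean()
s1_hat = np.sqrt(np.mean((obs - mu_hat)**2))
alt_d = np.array([delta(rng.normal(mu_hat, s1_hat, size=n)) for _ in range(2000)])
power = np.mean(alt_d >= crit)
```

In a phylogenetic application, the simulation step would draw trait data on the tree under the fitted models (BM, OU, etc.) rather than from a univariate Gaussian, but the comparison of the observed likelihood ratio against its simulated null distribution, and the power calculation from the alternative, carry over unchanged.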

To set the stage, we will review common phylogenetic models and describe the Monte Carlo approach to model choice. We then present the results of our method applied to example data and discuss its consequences.