## Introduction

Phylogeny-based analyses of diversification and trait evolution, commonly referred to as phylogenetic comparative methods, are a powerful route for understanding macroevolutionary tempo and mode. These methods appear to have even greater potential when data sets are not limited to a particular moment in time, such as the present, but include additional information from the fossil record on ancestral and extinct lineages (Finarelli & Flynn 2006; Slater, Harmon & Alfaro 2012). However, such analyses require time-scaled phylogenies, that is, branching diagrams (trees) that accurately describe the temporal relationships among lineages, including ancestor–descendant relationships. Applying these analyses in the fossil record therefore requires time-scaled phylogenies of fossil taxa, but hypothesized relationships among extinct organisms are typically available only in the form of a cladogram, a branching diagram unscaled to time that depicts only the nesting relationships among morphologically differentiated taxon units (‘morphotaxa’; Fig. 1). Although some methods consider information on the temporal occurrence of fossils simultaneously with inferring relationships from morphological characters (Fisher 1991, 1994; Wagner 1998; Marcot & Fox 2008; Pyron 2011; Ronquist *et al*. 2012), most palaeobiological trees are not constructed using these approaches. Thus, methods are needed for integrating inferred cladograms with temporal data to approximate the true time-scaled phylogeny (Fig. 2a).

A frequently applied approach for integrating temporal and cladistic data to produce a time-scaled phylogeny was formalized by Norell (1992) and Smith (1994), hereafter referred to as the ‘basic’ time-scaling method. In this method, clades are treated as being as old as the first appearance date of their earliest-appearing descendant (Fig. 2b). Although some workers recommend treating plesiomorphic taxa as ancestors, which can appear before branching events (Smith 1994), this is often not done in recent time-scaling attempts, as the cladograms used are supertrees lacking information on apomorphies. As the first appearing lineage can be nested relative to other lineages on the cladogram, this method can cause the branch lengths between successive nodes to collapse into zero-length branches (‘ZLBs’; dashed lines in Fig. 2c). These ZLBs are potentially unrealistic artefacts resulting from gaps in the pattern of evolutionary relationships (Hunt & Carrano 2010). In addition, their inclusion in a tree can cause issues for analyses of trait evolution, as any evolutionary change across a ZLB will appear to be instantaneous. Some comparative methods are unable to evaluate such trees because the necessary phylogenetic variance–covariance matrix (Garland & Ives 2000) can become singular and thus unusable for analytical operations. To avoid the theoretical and methodological issues of including ZLBs, many workers first calculate node ages using the basic method and then extend branch lengths under various algorithms, such as restricting branches to some minimum length (Laurin 2004; Brusatte *et al*. 2008; Laurin, Canoville & Quilhac 2009). Even setting aside the issues with ZLBs, the basic time-scaling method and its various derivations with branch length extensions do not allow for uncertainty in node ages. Furthermore, the basic method assumes that the phylogeny of morphotaxa exactly matches the cladogram used, even though a large number of phylogenies with ancestor–descendant relationships are consistent with any given topology (Platnick 1977; Wagner & Erwin 1995; Bapst 2013).
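The way the basic method produces ZLBs can be illustrated with a minimal Python sketch. The four-taxon cladogram, the tip names, the first appearance dates (in millions of years before present) and the function names are all hypothetical, chosen only so that the earliest-appearing taxon is deeply nested; this is not code from any published implementation.

```python
# Hypothetical toy cladogram: each internal node maps to its children.
# Tip names and first appearance dates (Ma) are invented for illustration.
children = {
    "root": ["A", "n1"],
    "n1": ["B", "n2"],
    "n2": ["C", "D"],
}
first_appearance = {"A": 90.0, "B": 80.0, "C": 100.0, "D": 70.0}


def basic_node_ages(children, first_appearance, root="root"):
    """Date each clade by the oldest first appearance among its descendants."""
    ages = dict(first_appearance)

    def visit(node):
        if node not in children:  # tip: its age is its first appearance date
            return ages[node]
        ages[node] = max(visit(child) for child in children[node])
        return ages[node]

    visit(root)
    return ages


def branch_lengths(children, ages):
    """Branch length is parent age minus child age (ages count back from present)."""
    return {
        (parent, child): ages[parent] - ages[child]
        for parent, kids in children.items()
        for child in kids
    }


ages = basic_node_ages(children, first_appearance)
lengths = branch_lengths(children, ages)
# Edges whose two ends received the same age are zero-length branches.
zero_length = sorted(edge for edge, length in lengths.items() if length == 0.0)
```

Because the hypothetical taxon `C` appears first but sits in the most nested position, every node between it and the root inherits its date, so the internal branches along that path, and the terminal branch leading to `C`, all collapse to zero length.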

Here, I propose a general algorithm for stochastic time-scaling of palaeontological phylogenies, which I call the ‘zipper’ method, in which node ages are sampled randomly. This stepwise process of drawing node ages should be repeated many times to generate large numbers of time-scaled phylogenies, approximating the potential range of time-scaled phylogenies for a given data set. Macroevolutionary analyses should then be applied across such samples of phylogenies, rather than to a single tree, as the stochastic quality of the time-scaling method makes any single time-scaled tree a potentially poor predictor of the temporal relationships. This stochastic approach is similar to methods for dealing with the uncertainty arising from soft polytomies or appearance times known only from discrete intervals (see Appendix S1 for more detail on solutions for discrete interval data and uncertain times of observation). The zipper method is also extended to allow for potential ancestor–descendant relationships and to resolve soft polytomies.
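The general idea of repeatedly drawing node ages and analysing the resulting sample of trees can be sketched as follows. This is not the zipper algorithm itself: the root-to-tip order, the uniform draw between bounds, the upper bound on the root age (`ROOT_MAX_AGE`) and the toy data are all assumptions made for illustration, and the uniform draw is only a placeholder for whatever sampling distribution is actually used.

```python
import random

# Hypothetical toy cladogram and first appearance dates (Ma), for illustration.
children = {"root": ["A", "n1"], "n1": ["B", "n2"], "n2": ["C", "D"]}
first_appearance = {"A": 90.0, "B": 80.0, "C": 100.0, "D": 70.0}
ROOT_MAX_AGE = 120.0  # assumed upper bound on the root age, not from the text


def oldest_descendant(node):
    """Oldest first appearance among the tips descended from a node."""
    if node not in children:
        return first_appearance[node]
    return max(oldest_descendant(child) for child in children[node])


def sample_node_ages(rng, root="root"):
    """Draw one set of node ages, working from the root toward the tips."""
    ages = dict(first_appearance)

    def visit(node, upper):
        if node not in children:
            return
        # A node can be no younger than its oldest descendant
        # and no older than its parent (or the assumed root bound).
        lower = oldest_descendant(node)
        ages[node] = rng.uniform(lower, upper)  # placeholder: uniform draw
        for child in children[node]:
            visit(child, ages[node])

    visit(root, ROOT_MAX_AGE)
    return ages


rng = random.Random(1)  # fixed seed so the sketch is reproducible
sample = [sample_node_ages(rng) for _ in range(1000)]

# Downstream analyses would be summarised across the whole sample,
# for example the mean sampled root age, rather than taken from one tree.
mean_root_age = sum(tree["root"] for tree in sample) / len(sample)
```

Each draw yields a different set of node ages that is consistent with the same cladogram and first appearance dates, which is why a summary across the sample is more informative than any single tree.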

A probability distribution is required to describe the random sampling of branching times for each node under the zipper method. Rather than randomly assigning node ages with uniform probability between some set of bounds, the stochastic sampling in this implementation is weighted according to a distribution defined by a probabilistic model of unsampled phylogenetic history. This model predicts the amount of unobserved evolutionary history as a function of branching, extinction and sampling rates, and thus the combination of this model with the zipper method is referred to here as the three-rate-calibrated time-scaling method (‘*cal3*’).