## Introduction

Phylogenetic diversity (PD), the total branch length of a phylogenetic tree, has been extensively used as a measure of biodiversity. Originally conceived of as a method for prioritising regions for conservation (Faith 1992), PD has seen wider use in other applications such as biogeography (Davies & Buckley 2011), macroecology (Meynard *et al*. 2011) and microbial ecology (Lozupone & Knight 2008; Turnbaugh *et al*. 2008; Caporaso *et al*. 2012; Yu *et al*. 2012; Phillips *et al*. 2012). This increasing breadth of application can be attributed to a number of desirable properties including the following: (1) explicitly addressing the nonequivalence of species in their contribution to overall diversity, (2) acting as a surrogate for other aspects of diversity such as functional diversity (Cadotte *et al*. 2009, but see also Faith 1996), (3) incorporating information on the evolutionary history of communities and biotas and (4) being robust to problems of species delineation because the relationships between populations and even individuals can be represented by relative branch lengths without the need to establish absolute species identity. Further, the original simple formulation of Faith (1992) has been built on to produce a broader ‘PD calculus’ measuring such aspects of diversity as phylogenetic endemism (Faith *et al*. 2004; Rosauer *et al*. 2009), evenness (Hill 1973; Allen *et al*. 2009) and resemblance (Ferrier *et al*. 2007; Lozupone & Knight 2008; Faith *et al*. 2009; Nipperess *et al*. 2010). For the purposes of this study, when referring to ‘phylogenetic diversity’ and ‘PD’, we refer explicitly to the definition of Faith (1992), where diversity is measured as the sum of branch lengths of a phylogenetic tree.
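As an illustration of this definition, the following minimal sketch computes PD for a subset of tips on a toy rooted tree. The tree, its branch lengths and the function name `faith_pd` are hypothetical and chosen only for illustration. The sketch also assumes the rooted convention, in which the path from each tip back to the root is counted.

```python
# Toy rooted tree stored as child -> (parent, length of the branch to the parent).
# Topology and branch lengths are invented for illustration only.
TREE = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "n1": ("root", 2.0),
    "C": ("root", 3.0),
}


def faith_pd(tree, tips):
    """Sum the lengths of all branches on the paths from the given tips to the root.

    Assumes the rooted convention: every branch between a chosen tip and the
    root is counted exactly once, however many chosen tips share it.
    """
    counted = set()
    total = 0.0
    for tip in tips:
        node = tip
        # Walk towards the root, stopping once we reach a branch already counted.
        while node in tree and node not in counted:
            counted.add(node)
            parent, length = tree[node]
            total += length
            node = parent
    return total


pd_ab = faith_pd(TREE, ["A", "B"])        # branches A, B and n1
pd_all = faith_pd(TREE, ["A", "B", "C"])  # every branch of the toy tree
```

Shared branches (here the branch above `n1`) are counted once, which is why adding a close relative of a tip already present increases PD less than adding a distant one.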

Like many other measures of biodiversity, phylogenetic diversity increases with sampling effort. Thus, the comparison of the phylogenetic diversity of communities is not straightforward when sample sizes differ, as is common with real data sets. Unless data are standardized in some sense to account for differences in sample size or effort, the relative diversity of communities can be profoundly misinterpreted (Gotelli & Colwell 2001).

The established solution to the problem of interpreting diversity estimates with samples of varying size is rarefaction. The rarefaction of a given sample of size *n* to a level *k* is simply the uniform random choice of *k* of the *n* observations (typically without replacement). The observations are usually of either individual organisms or collections of organisms, giving either individual-based or sample-based rarefaction curves (Gotelli & Colwell 2001). To consider a given measure of diversity under rarefaction, the measure is applied to the rarefied sample. Researchers are generally interested in the expectation and variance of a measure of diversity under rarefaction.
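A single rarefaction draw, as described above, can be sketched in a few lines. The sample below, the species labels and the helper names `rarefy` and `richness` are assumptions made for illustration. Species richness stands in for whichever diversity measure is of interest.

```python
import random


def rarefy(observations, k, rng=random):
    """Choose k of the n observations uniformly at random, without replacement."""
    observations = list(observations)
    if not 0 <= k <= len(observations):
        raise ValueError("k must lie between 0 and the sample size n")
    return rng.sample(observations, k)


def richness(sample):
    """Number of distinct species labels in a sample (an example diversity measure)."""
    return len(set(sample))


# Hypothetical individual-based sample: n = 10 individuals from 3 species.
sample = ["sp1"] * 5 + ["sp2"] * 3 + ["sp3"] * 2

rng = random.Random(0)           # seeded only to make the draw repeatable
rarefied = rarefy(sample, 4, rng)
observed = richness(rarefied)    # the measure applied to the rarefied sample
```

Repeating the draw many times and summarising `observed` gives empirical estimates of the expectation and variance of the measure at level *k*.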

Rarefaction curves can be used to understand the depth of sampling of a community compared with its total diversity. Additionally, rarefaction curves capture information about evenness (Olszewski 2004) and beta-diversity (Crist & Veech 2006), depending on whether observations are of individuals or collections. Rarefaction curves have been computed for phylogenetic diversity (Lozupone & Knight 2008; Turnbaugh *et al*. 2008; Caporaso *et al*. 2012; Yu *et al*. 2012). In each of these cases, rarefaction was not by counts of individual organisms or collections of such, but was instead based on counts of unique sequences or operational taxonomic units. Rarefaction by such units, including taxonomic species, makes sense in the context of phylogenetic diversity where it might not with other measures of biodiversity. In effect, with these examples, rarefaction is by the tips of the tree, and the resulting curve gives an indication of tree shape and distribution of sample observations among the tips of the tree.

One way to obtain summary statistics such as expectation and variance under rarefaction is to compute these statistics on samples drawn using a Monte Carlo procedure, that is, to calculate the desired statistics on a collection of random draws. On the other hand, there are closed-form solutions for the mean of many measures of biodiversity under rarefaction. For example, an analytical solution for species diversity is well known, can be calculated for rarefaction by either individuals or samples, and is much more efficient than resampling (Hurlbert 1971; Ugland *et al*. 2003; Chiarucci *et al*. 2008). However, we are not aware of such a formula for any phylogenetic diversity metric.
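To make the contrast between the two approaches concrete, the sketch below compares a Monte Carlo estimate of expected species richness with the standard closed-form expression for individual-based rarefaction, in which the expected number of species in a draw of *k* individuals is the sum over species *i* of 1 − C(*n* − *n*<sub>*i*</sub>, *k*)/C(*n*, *k*), with *n*<sub>*i*</sub> the count of species *i*. The abundance counts and the function names are hypothetical, and the code concerns species richness only, not phylogenetic diversity.

```python
import random
from math import comb


def expected_richness(counts, k):
    """Closed-form expected species richness when k of n individuals are drawn.

    Each species contributes the probability that at least one of its
    individuals appears in the rarefied sample.
    """
    n = sum(counts)
    return sum(1 - comb(n - c, k) / comb(n, k) for c in counts)


def monte_carlo_richness(counts, k, draws=20000, seed=0):
    """Estimate the same expectation by averaging over repeated random draws."""
    rng = random.Random(seed)
    # Expand the counts into one label per individual.
    pool = [species for species, c in enumerate(counts) for _ in range(c)]
    total = 0
    for _ in range(draws):
        total += len(set(rng.sample(pool, k)))
    return total / draws


counts = [5, 3, 2]                            # hypothetical abundances, n = 10
exact = expected_richness(counts, 4)          # one pass over the species
approx = monte_carlo_richness(counts, 4)      # thousands of random draws
```

The closed form needs a single pass over the species, whereas the Monte Carlo estimate needs many draws and still carries sampling error.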

In this study, we establish analytical formulae for the mean and variance of phylogenetic diversity under rarefaction. We develop these formulae in the setting of a phylogenetic tree with ‘marks’, which are a simple generalization allowing multiplicity of observations and arbitrary positions of observations along the tree.