Volume 33, Issue 1

A diversity of beta diversities: straightening up a concept gone awry. Part 2. Quantifying beta diversity and related phenomena

First published: 04 March 2010
H. Tuomisto (hanna.tuomisto@utu.fi), Dept of Biology, FI‐20014 Univ. of Turku, Finland

Abstract

The present two‐part review aims to put the different phenomena that have been called “beta diversity” over the years into a common conceptual framework and to explain what each of them measures. The first part (Tuomisto 2010) discussed basic definitions of “beta diversity”. Each arises from a different way of combining a definition of “diversity” with a definition of its alpha component and with a mathematical relationship between the alpha and gamma components. This second part assumes that an appropriate basic definition of a beta component (which may or may not be true beta diversity) has been chosen, and the focus here will be on how to quantify it for a given dataset. About twenty different approaches have been used for this purpose. It turns out that only two of these approaches accurately quantify the selected beta component: one does so for the entire dataset, and the other for two sampling units at a time. The other approaches actually quantify other phenomena, such as mean species turnover between sampling units, compositional gradient length (with or without reference to an external gradient), distinctness of a focal sampling unit, rate of species accumulation with increasing sampling effort, rate of compositional turnover along an external gradient, or the rate of decay in compositional similarity with increasing geographical distance. Although most of these phenomena can be expressed as a function of a beta component of diversity, they do not equal a beta component of diversity. Many of these derived variables are not even numerically correlated with the beta component on which they are based, which needs to be taken into account when interpreting the results. The effects of sampling decisions when results are extrapolated beyond the available data will also be discussed.

The first part of the present review (Tuomisto 2010) discussed eight different basic definitions of “beta diversity”. This second part discusses derived definitions of “beta diversity” that arise from different approaches to quantifying the chosen kind of “beta diversity”, or a phenomenon related to it, for a given dataset. The focus in this part is mostly on five of the eight basic definitions of “beta diversity”, collectively referred to here as beta components. Table 1 presents an overview of both the basic and the derived definitions of “beta diversity” and indicates in which part of the present review each definition is explained in detail.

Table 1. Summary of the notation and names used for different kinds of “beta diversity” in the present paper. The first five are basic definitions in which true gamma diversity (γ=qDγ) of the dataset of interest is partitioned into alpha and beta components using a simple mathematical function. The others are derived definitions, the notation for each of which summarises its relationship to the basic definitions, other components of diversity, and explanatory gradients (“n.n.” means no specific notation). The alpha component is either mean gamma diversity within sampling units αt, whose measurement unit is spE, or true alpha diversity αd (= qDα), whose value is the same but whose measurement unit is (effective species)/(compositional unit)=spE/CU. When either all N sampling units are weighted equally or q=1, αt is constrained to the interval [γ/N, γ]. This also allows the range of each beta component to be expressed in terms of N; these ranges are shown together with the measurement units. The last column indicates in which subsection of the present two‐part review each kind of “beta diversity” is discussed in detail. Part 1 refers to Tuomisto (2010) and Part 2 to the present paper.
Notation | Definition | Measurement unit [range] | Section
βMd | true beta diversity = γ/αd | CU [1 CU to N CU] | Part 1: 1
βMt | regional‐to‐local diversity ratio = γ/αt | spE/spE [1 to N] | Part 1: 1
βAt | absolute effective species turnover = γ−αt | spE [0 to (N−1)αt] | Part 1: 2
βMt−1 | Whittaker's effective species turnover = (γ−αt)/αt = γ/αt−1 | spE/spE [0 to N−1] | Part 1: 3
βPt | proportional effective species turnover = (γ−αt)/γ = 1−αt/γ | spE/spE [0 to 1−1/N] | Part 1: 4
Δc | any of the effective species turnover measures, i.e. βAt, βMt−1 or βPt | as in the chosen turnover | Part 2: Introduction
βMtot or Δctot | a beta component quantified for the entire dataset | as in the chosen beta component | Part 2: 1.1
βMj,k or Δcj,k | a beta component quantified for a subset of the dataset that consists of the sampling units j and k | as in the chosen beta component | Part 2: 1.2
inline image | average of all the species turnover values that can be calculated for different sampling unit pairs in the dataset (with j≠k) | as in the chosen turnover | Part 2: 2.1
inline image | average of all the species turnover values that can be calculated between a real sampling unit and a regional compositional centroid in the dataset | as in the chosen turnover | Part 2: 2.2
Δcj,kmax or Δc′max | compositional gradient length in the dataset along the compositional dimension with most turnover | as in the chosen turnover | Part 2: 2.3
Δc(Δg) | compositional gradient length along a specified section of an external gradient g | as in the chosen turnover | Part 2: 2.4
ΔΔg(Δlog(1–Δc)) | number of half‐change units, i.e. observed amount of change in differences in explanatory gradient g expressed in terms of decrease in compositional similarity | (unit of g)/(unit of g) | Part 2: 2.5
inline image | compositional distinctness of the focal sampling unit F | as in the chosen turnover | Part 2: 3.1
n.n. | compositional nestedness of a species‐poor sampling unit in a more species‐rich one | sp/sp | Part 2: 3.2
n.n. | logically inconsistent beta components in which α and γ are based on different datasets | as in the chosen beta component | Part 2: 3.3
n.n. | average of all pairwise beta component values with compositional data taken from outside the sampling units of interest | as in the chosen beta component | Part 2: 3.4
Δγ/Δx | rate of gamma diversity accumulation with increasing (logarithm of the) number of sampling units | spE/SU or spE/log(SU) | Part 2: 4.1.A
Δαt/Δx | rate of alpha diversity accumulation when sampling unit size increases in multiples of (logarithm of the) original size | spE/SU or spE/log(SU) | Part 2: 4.1.B
Δlog(γ)/Δx | rate of gamma entropy accumulation with increasing logarithm of the number of sampling units | log(spE)/log(SU) | Part 2: 4.2.A
Δlog(αt)/Δx | rate of alpha entropy accumulation when sampling unit size increases in multiples of the logarithm of original size | log(spE)/log(SU) | Part 2: 4.2.B
ΔβM/Δx or ΔΔc/Δx | rate of change in a beta component of diversity with increasing number of sampling units | (unit of the beta component)/SU | Part 2: 4.3.A
ΔβM/Δx or ΔΔc/Δx | decay rate of a beta component of diversity when sampling unit size increases in multiples of original size | (unit of the beta component)/SU | Part 2: 4.3.B
ΔβPt/Δx | proportional effective species turnover accumulation rate when an increasing proportion of the available sampling units is taken into account | (spE/spE)/SU | Part 2: 4.3.C
Δlog(qβM)/Δx | rate of change in beta entropy or regional entropy excess with increasing logarithm of the number of sampling units | (unit of entropy)/log(SU), e.g. bits/log(SU) | Part 2: 4.4
n.n. | species diversity or entropy accumulation rate with alpha and gamma diversities based on different data | as in the chosen accumulation rate | Part 2: 4.5
Δc(Δg)/Δg | compositional turnover rate along a specified section of an external gradient g | (unit of chosen turnover)/(unit of external gradient) | Part 2: 4.6
ΔΔc(ΔΔg)/ΔΔg or Δlog(1–Δc)(ΔΔg)/ΔΔg | rate of change in (the logarithm of the one‐complement of) pairwise effective species turnover with increasing distance along an explanatory gradient g (slope of a distance decay regression) | (unit of chosen turnover)/(unit of external gradient) or log(unit of turnover)/(unit of external gradient) | Part 2: 4.7

Two of the beta components treated in this paper measure compositional heterogeneity in the dataset. True beta diversity (qDβ=qDγ/qDα or qβMd=γ/αd) quantifies the number of compositional units (compositionally distinct virtual sampling units that have the same species diversity as the actual sampling units do on average, abbreviated CU) in the dataset. Regional‐to‐local diversity ratio (qβMt=γ/αt) quantifies how many times as rich in effective species the entire dataset is as a single compositional unit. The two differ in measurement units but obtain the same numerical value when based on the same data; when both are referred to simultaneously, the notation qβM is used. The other three beta components measure effective species turnover. Absolute effective species turnover (qβAt=γ−αt) quantifies the cumulative number of effective species that change among all compositional units in the dataset. Relative effective species turnover is obtained by dividing qβAt by either αt or γ. Whittaker's effective species turnover (qβMt−1=qβAt/αt) quantifies the number of times that the effective species composition changes completely among all compositional units. Proportional effective species turnover (qβPt=qβAt/γ) quantifies what proportion of the effective species composition of the entire dataset changes among the compositional units. Effective species turnover can be thought of as change in effective community composition, so all three turnover measures are collectively referred to by Δc.
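As a worked illustration of how the beta components relate to one another, the following minimal sketch computes all of them from a single pair of γ and αt values. The function name `beta_components` and the dictionary keys are hypothetical labels chosen for this example only; the formulas are those listed in Table 1.

```python
def beta_components(gamma, alpha_t):
    """Return the beta components of Table 1 for given gamma and alpha_t.

    gamma and alpha_t must be true diversities (effective species numbers)
    calculated with the same value of q.
    """
    return {
        # compositional heterogeneity (numerical value shared by beta_Md and beta_Mt)
        "beta_M": gamma / alpha_t,
        # absolute effective species turnover
        "beta_At": gamma - alpha_t,
        # Whittaker's effective species turnover
        "beta_Mt_minus_1": gamma / alpha_t - 1,
        # proportional effective species turnover
        "beta_Pt": 1 - alpha_t / gamma,
    }


example = beta_components(gamma=12.0, alpha_t=4.0)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

With γ=12 and αt=4, the dataset contains three compositional units, eight effective species turn over in total, the composition changes completely twice, and two thirds of the regional composition changes among the compositional units. The four numbers describe the same data but are not interchangeable.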

True gamma diversity qDγ=γ is the number of effective species in a dataset, quantified as the inverse of the mean of the proportional abundances of the actual species. The parameter q defines which kind of mean is used, and in practice controls to what degree the rarest species are taken into account (see The starting point: what is diversity? in Tuomisto 2010). When q=0, even the rarest of the S actual species are counted as one effective species, so species diversity equals species richness. When q=1, diversity equals the exponential of the Shannon entropy, and when q=2, the inverse of Simpson concentration. The larger the value of q, the smaller the fraction of an effective species that is contributed by the rare actual species. The same is true of alpha diversity (qDα=αd, or mean qDγj=αt), so diversity values based on different values of q are not comparable, and α and γ based on different q cannot be used to calculate a beta component (qβ). When q=0, effective species turnover equals actual species turnover, but increasing q causes the two to diverge as effective species turnover becomes increasingly determined by differences among sampling units in the proportional abundances of the most abundant species. Abundance itself can be measured in different ways that are not interchangeable. For example, when tree plots of a fixed surface area are used, increasing the value of q increases the differences between results based on the number of stems and those based on basal area or biomass. With these considerations in mind, the superscript q will be omitted in the present paper, unless a specific value of q needs to be indicated.
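The effect of q can be illustrated with a short sketch that computes the effective number of species for a single abundance vector. The function name `true_diversity` is a hypothetical label, and the abundance values are invented for illustration; the q=1 case is handled separately as the exponential of the Shannon entropy, which is the limit of the general expression.

```python
import math


def true_diversity(abundances, q):
    """Effective number of species of order q for one abundance vector."""
    total = sum(abundances)
    # proportional abundances of the species that are actually present
    p = [x / total for x in abundances if x > 0]
    if q == 1:
        # limit of the general formula: exponential of the Shannon entropy
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1 / (1 - q))


uneven = [50, 30, 15, 5]          # invented abundances of four species
for q in (0, 1, 2):
    print(q, round(true_diversity(uneven, q), 3))
```

For the uneven community the value decreases from the species richness (4) as q increases, because rare species contribute progressively less. For a perfectly even community all values of q return the species richness.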

Each definition of a beta component quantifies a different phenomenon, so their values are not commensurate. The minimum value is zero in some beta components and unity in others. The maximum values of all beta components vary with the number of sampling units N, but they also depend on the value of q, on sampling unit weights (which ideally reflect an appropriate measure of sampling effort) and, in one case, on alpha diversity. The beta components only have a consistent upper bound if all sampling units are weighted equally or q=1 (Table 1 and Tuomisto 2010). In the present paper, it is therefore assumed that sampling unit weights are equal. Because the upper bound depends on N, comparing beta components based on different N is problematic. Therefore, the number of sampling units used when calculating the beta component will, in some cases, be made explicit with the help of a subscript.

Some basic definitions of “beta diversity” are based on the raw value of a diversity index rather than on a true diversity. Such measures include regional Shannon entropy excess (inline image, where H′ is the Shannon entropy) and regional variance excess (inline image, where 2λ is the Simpson concentration; see Tuomisto 2010 for details). When regional Shannon entropy or variance excess is discussed in the present paper, this will be stated explicitly; the notation Δc does not refer to them.

A basic definition of a beta component can be applied to an existing dataset in several different ways. Many of the possible approaches do not actually quantify the chosen beta component itself. Instead, they may quantify just a part of the beta component, the rate of change in the beta component along some external gradient, or some other phenomenon. Nevertheless, the results from all of these approaches have been called “beta diversity” (some of these were discussed by Jurasinski et al. 2009). The purpose of the present paper is to review what the different approaches in fact quantify, and how they relate to one another. Attention is also given to sampling considerations when the purpose of the study is to draw conclusions on “beta diversity” in a region of interest that is too large to be entirely inventoried, and the available dataset therefore only provides a sample of it.

1. Accurate approaches to quantifying βM or Δc

1.1 The regional approach βMtot or Δctot

When the aim of a study is to quantify the total amount of compositional heterogeneity or effective species turnover in a dataset, the regional approach is the way to go. In this approach, the chosen basic definition of a beta component is applied such that all available data are used, and the subscript “tot” can be used to express this. The interpretation of the obtained βMtot or Δctot value is then exactly according to the chosen definition.

Often the available dataset is a sample of a larger region of interest, and the purpose is to compare the beta components in separate regions. If so, it is important to take into account that the maximum values of all beta components depend on the number of sampling units N (Table 1). Erroneous conclusions on which regions are compositionally most heterogeneous or have most effective species turnover can easily be drawn if regions are represented by different sample sizes. This problem can be avoided by using the same N in every region, by way of rarefaction methods if necessary (Perelman et al. 2001). Rarefaction implicitly assumes that each sampling unit is equally good in documenting local species diversity. It is therefore best applied to datasets where each sampling unit contains the same total abundance of the organisms of interest (this can, in turn, be achieved by rarefaction within the sampling units), or sampling effort has been standardised in some other way that is appropriate for the questions at hand.
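The dependence on N can be sketched with presence‐absence data (q=0) as follows. This is only an illustration of the idea of subsampling to a common N, not the specific procedure of Perelman et al. (2001); the function names, the number of random draws and the fixed seed are assumptions made for the example, and each sampling unit is represented as a set of species identifiers.

```python
import random


def regional_to_local_ratio(units):
    """beta_Mt = gamma / alpha_t for presence-absence data (q = 0)."""
    gamma = len(set().union(*units))                  # total species richness
    alpha_t = sum(len(u) for u in units) / len(units) # mean richness per unit
    return gamma / alpha_t


def rarefied_ratio(units, n, draws=200, seed=0):
    """Mean beta_Mt over random subsets of n sampling units (hypothetical helper)."""
    rng = random.Random(seed)
    values = [regional_to_local_ratio(rng.sample(units, n)) for _ in range(draws)]
    return sum(values) / len(values)


# Two invented regions in which no sampling units share species;
# region A was sampled with six units, region B with three.
region_a = [{2 * i, 2 * i + 1} for i in range(6)]
region_b = [{100 + 2 * i, 101 + 2 * i} for i in range(3)]

print(regional_to_local_ratio(region_a))   # uses N = 6
print(regional_to_local_ratio(region_b))   # uses N = 3
print(rarefied_ratio(region_a, n=3))       # region A rarefied to N = 3
```

Both regions are maximally heterogeneous, yet the raw values differ because the upper bound equals N. After rarefying region A to three sampling units, the two regions obtain the same value.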

Total absolute effective species turnover βAt can be larger in a species‐rich dataset where all of the actual sampling units share species with all other sampling units than in a species‐poor dataset where no sampling units share any species. If a species turnover measure that is independent of alpha diversity is required, a relative effective species turnover measure (βMt−1 or βPt) should be used. Conversely, relative effective species turnover and the compositional heterogeneity measures (βM) cannot be interpreted in terms of the absolute number of species involved.

1.2 The pairwise approach βMj,k or Δcj,k

The regional approach to quantifying the beta component gives a single value for the entire dataset of interest. However, often researchers would like to study how the beta component varies and what external factors are correlated with it. This can be achieved by calculating values of the beta component for different subsets of the data, such as sampling unit pairs.

Alpha and gamma diversity are derivable from a sites by species raw data table of N rows and S columns, where each cell value equals the proportional abundance of the ith species in the jth sampling unit (level 1 table A in Fig. 1). Subsets of the data can be formed by taking into account just two of the rows at a time; forming all possible pairwise combinations of the jth and the kth sampling unit yields N² new raw data tables. The beta components of all these two‐row tables can be quantified and arranged into a derived data table of N rows and N columns (Fig. 1, level 2 table D). The raw data table corresponds to the first level of abstraction sensu Tuomisto and Ruokolainen (2006, 2008; this is not the same as the first level of abstraction sensu Legendre et al. 2005), and the derived table to the second level of abstraction (sensu both sets of authors).

Figure 1.

How different approaches to quantifying effective species turnover and related measures correspond to the three levels of abstraction of Tuomisto and Ruokolainen (2006, 2008). Level 1 consists of the raw data measurements at each sampling unit (SU); species abundances are shown on the left in table A and the values of an explanatory gradient on the right in table C. Community composition c is a summary of the species abundance information. The score of a sampling unit along the first axis of an ordination (cell values in table B) approximates its position along the most important compositional gradient, whose relationship with explanatory gradient g is shown in the level 1 scatterplot (the origin of the y axis is at average community composition). Level 2 data are derived as dissimilarities based on the level 1 data. Effective species turnover corresponds to Δc; pairwise turnover values are shown in the level 2 dissimilarity matrix D. Dissimilarities corresponding to the same sampling units but based on explanatory data are shown in the level 2 table E, and the relationship between the two dissimilarity matrices is shown in the level 2 scatterplot. Level 3 data are derived as dissimilarities based on the level 2 data. SUP stands for sampling unit pair. Since the dissimilarity matrices are symmetric, values below the diagonal are not shown. Proposed definitions of “beta diversity” include variance in table A, the difference between the largest and smallest cell value in table B, a single cell value in table D, mean of all off‐diagonal cell values in table D, the slope of the level 1 regression line, the length of the vertical side of the triangle delimited by the level 1 regression line, the slope of the level 2 regression line, the length of the vertical side of the triangle delimited by the level 2 regression line, and the length of the horizontal side of the triangle delimited by the level 2 regression line. For further explanation, see text.

The level 2 table corresponds to a dissimilarity matrix if the beta component has a minimum value of zero when the two sampling units are compositionally identical. This is the case when any of the effective species turnover measures Δcj,k is used. If the sampling units have equal weights, 0βAt multiplied by 2 equals the squared Euclidean distance and the Manhattan metric calculated using presence‐absence data, 0βMt−1 equals the one‐complement of the Sørensen index, and 0βPt ranged to the interval [0, 1] equals the one‐complement of the Jaccard index (Sections 3, 4 and 5, respectively, in Tuomisto 2010). Using a dissimilarity matrix based on 0βMt−1 or 0βPt has been popular in studies related to “beta diversity”, but 0βAt has been used much less, because its dependence on alpha diversity complicates interpretation of the results.
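These equivalences can be checked numerically for presence‐absence data. In the sketch below, each sampling unit is a non‐empty set of species identifiers, and the function names and the two example species lists are invented for illustration. For a pair of sampling units, γ is the number of species in their union and αt is the mean of their species richnesses.

```python
def pairwise_components(u, v):
    """Effective species turnover (q = 0) between two equally weighted units."""
    a, b, c = len(u & v), len(u - v), len(v - u)   # shared, only in u, only in v
    gamma = a + b + c
    alpha_t = (2 * a + b + c) / 2
    return {
        "beta_At": gamma - alpha_t,
        "beta_Mt_minus_1": gamma / alpha_t - 1,
        "beta_Pt": 1 - alpha_t / gamma,
    }


def sorensen(u, v):
    return 2 * len(u & v) / (len(u) + len(v))


def jaccard(u, v):
    return len(u & v) / len(u | v)


def turnover_matrix(units, key):
    """N x N level 2 table of the chosen pairwise turnover measure."""
    return [[pairwise_components(u, v)[key] for v in units] for u in units]


u = {"A", "B", "C"}
v = {"B", "C", "D", "E"}
comp = pairwise_components(u, v)

manhattan = len(u ^ v)                 # also the squared Euclidean distance for 0/1 data
ranged_beta_pt = comp["beta_Pt"] / 0.5 # maximum of beta_Pt is 1 - 1/N = 0.5 when N = 2

print(2 * comp["beta_At"], manhattan)
print(comp["beta_Mt_minus_1"], 1 - sorensen(u, v))
print(ranged_beta_pt, 1 - jaccard(u, v))
```

Each printed pair of numbers agrees, and the diagonal of a matrix built with `turnover_matrix` is zero, as required of a dissimilarity matrix.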

Many data analysis methods use a dissimilarity matrix as a starting point. Principal coordinates analysis (PCoA, also known as metric multidimensional scaling) and non‐metric multidimensional scaling (NMDS) can be used to visualise the Δcj,k values in an ordination diagram. Agglomerative clustering produces a hierarchical classification of the sampling units such that the sampling units with the smallest Δcj,k values are combined first. The Mantel test quantifies the correlation between the cell values in two dissimilarity matrices. Finally, multiple regression on dissimilarity matrices and generalised dissimilarity modelling quantify the relative contributions of alternative explanatory dissimilarity matrices to explaining variation in the response dissimilarity matrix and provide means for forecasting values of Δcj,k.

Most of these analysis methods are included in numerical ecology textbooks (Legendre and Legendre 1998), and sometimes their use is explicitly described in terms of “beta diversity” (Magurran 2004, Tuomisto and Ruokolainen 2006, 2008, Ferrier et al. 2007). However, these methods would only address true beta diversity if the pairwise values were equal to βMd, which they never are: the minimum value of βMd is not zero, so it is not a dissimilarity measure. Nevertheless, because Whittaker's effective species turnover βMt−1 is a linear transformation of βMd, it can be used to obtain results that are consistent with true beta diversity.

2. Approaches that quantify a derivative of Δc

2.1 Average of all pairwise values inline image

One possible measure of compositional differentiation in a dataset is the average of all the pairwise effective species turnover values Δcj,k (Whittaker 1972). In Fig. 1, this corresponds to the average of the off‐diagonal cell values in the level 2 dissimilarity matrix D, or to the corresponding single y value in the level 2 scatterplot. This average quantifies how much effective species turnover of the chosen kind (βMt−1, βPt or βAt) is expected between two sampling units drawn at random (without replacement) from the dataset.

When average pairwise values inline image from several datasets are compared with the corresponding regional values Δctot, a positive correlation can be expected between the two kinds of measure. However, the pairwise values are constrained by N=2, whereas the total values are generally based on larger values of N, so inline image is usually much smaller than Δctot (Fig. 2).
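The difference between the regional value and the mean pairwise value can be illustrated with presence‐absence data and Whittaker's species turnover. The sketch below uses invented species lists and hypothetical function names; sampling units are sets of species identifiers with equal weights.

```python
from itertools import combinations


def whittaker_turnover(units):
    """gamma / alpha_t - 1 for presence-absence data (q = 0)."""
    gamma = len(set().union(*units))
    alpha_t = sum(len(u) for u in units) / len(units)
    return gamma / alpha_t - 1


def mean_pairwise_turnover(units):
    """Arithmetic mean of the turnover values of all pairs with j != k."""
    pairs = list(combinations(units, 2))
    return sum(whittaker_turnover(pair) for pair in pairs) / len(pairs)


# Four sampling units that share no species at all.
disjoint = [{"a", "b"}, {"c", "d"}, {"e", "f"}, {"g", "h"}]
# Three sampling units forming a chain of partially overlapping compositions.
chain = [{"a", "b"}, {"b", "c"}, {"c", "d"}]

for name, units in (("disjoint", disjoint), ("chain", chain)):
    print(name, whittaker_turnover(units), mean_pairwise_turnover(units))
```

In the dataset without shared species the regional value is N−1=3, whereas every pairwise value, and hence their mean, equals 1, the maximum possible when N=2.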

Figure 2.

How different approaches to calculating compositional heterogeneity 0βM and compositional turnover Δc (as based on Whittaker's species turnover 0βMt−1) reflect the data from a 9‐cell window of a regular grid. All sampling units (identified by letters) with the same community type (numbers) have identical species compositions, and sampling units with different community types share no species. βMtot and Δctot are calculated for the entire window (Section 1.1 of the text); inline image and inline image are averages of all pairwise values between the cells (Section 2.1); inline image is the average of pairwise values between each cell and a regional compositional centroid (Section 2.2); and inline image is the compositional distinctness of the focal sampling unit F (Section 3.1).

Following the suggestion of Whittaker (1972), quantifying “beta diversity” with average pairwise compositional dissimilarity has been relatively common. The median has been used occasionally (Clarke and Lidgard 2000), but most studies have used the arithmetic mean. For presence‐absence data, inline image seems most commonly to have been based on mean 0βPt as ranged to the interval [0, 1] (which equals the one‐complement of the Jaccard index; Scheiner 1990, Scheiner and Rey‐Benayas 1994, Clarke and Lidgard 2000, Balvanera et al. 2002, Ellingsen and Gray 2002, Mac Nally et al. 2004, Tuomisto and Ruokolainen 2005, Urban et al. 2006, Shurin et al. 2009). Mean 0βMt−1 (which equals the one‐complement of the Sørensen index) has also been used (Vazquez and Givnish 1998, Hernández et al. 2008). It has even been suggested that any dissimilarity index can be applied (Tuomisto and Ruokolainen 2008, Ricotta and Burrascano 2009), and some studies have indeed used indices that are not derivable from alpha and gamma diversity and therefore do not correspond to any beta component (Oliver et al. 1998, Ellingsen and Gray 2002, Hewitt et al. 2005, Jankowski et al. 2009, Qian 2009). However, even those dissimilarity measures that correspond to a beta component (βAt, βMt−1 and βPt) do not quantify true beta diversity (βMd).

The mean of pairwise squared Euclidean distances, which equals the variance of the raw data table, has also been promoted as a measure of “beta diversity” (Legendre et al. 2005). However, this is only compatible with the concept of Δc when presence–absence data are used, because in this case the squared Euclidean distance is a linear transformation of absolute species turnover 0βAt (Section 3 in Tuomisto 2010). With proportional abundance data, it quantifies regional variance excess inline image (as in ter Braak 1983; see also Section 7 in Tuomisto 2010). With absolute abundance data, the squared Euclidean distance is not a function of alpha and gamma diversity at all (as in Legendre et al. 2005, 2009, Arias‐González et al. 2008).

Bacaro and Ricotta (2007) discussed using the semivariogram to examine how “beta diversity” changes with distance. The semivariogram is obtained by plotting the semivariance against geographical distance. Because the semivariance equals the average squared Euclidean distance between sampling units in a given distance class, it can only be used to model changes in mean pairwise 0βAt or inline image. True beta diversity (βMd), regional‐to‐local diversity ratio (βMt) and the relative effective species turnover measures (βMt−1 and βPt) cannot be expressed in terms of squared Euclidean distances, so their behavior cannot be studied with the semivariogram. Instead, an analogous graph could be constructed using the average values per distance class from a dissimilarity matrix based on βMt−1 or (ranged) βPt.

2.2 Average of pairwise values between sampling units and a regional compositional centroid inline image

Anderson et al. (2006) proposed that multivariate dispersion provides a better measure of “beta diversity” than average pairwise species turnover as calculated using the actual sampling units ( inline image). Multivariate dispersion is quantified as the mean distance in ordination space between a sampling unit and the regional compositional centroid. The centroid is a virtual sampling unit whose average compositional dissimilarity with the real sampling units is as small as possible, so it is found at the origin of the ordination axes obtained by subjecting a compositional dissimilarity matrix to principal coordinates analysis. If the chosen compositional dissimilarity measure is compatible with effective species turnover (Δc), then the mean distance can be indicated by inline image. This quantifies how much effective species turnover of the chosen kind is expected between the regional compositional centroid and a sampling unit drawn at random from the dataset.

Since being proposed by Anderson et al. (2006), the inline image measure has been used at least once (Terlizzi et al. 2009) to quantify “beta diversity”. However, inline image is a measure of effective species turnover Δc rather than of true beta diversity βMd. Since the centroid is by definition in the middle of the real sampling units, mean turnover between the centroid and a real sampling unit is smaller than that between two real sampling units inline image. The latter is already smaller than the overall effective species turnover in the dataset Δctot, because its maximum value is constrained by N=2. The value of inline image will therefore always be smaller than the values of inline image and Δctot for the same dataset. The three measures can be expected to be positively correlated across datasets, but the relationships are not linear (Fig. 2).
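The relationship between the mean distance to the centroid and the mean pairwise distance can be sketched as follows. This is a simplified illustration, not the procedure of Anderson et al. (2006): it assumes that the dissimilarities can be represented exactly as Euclidean distances, in which case the squared distance of each sampling unit to the centroid equals the corresponding diagonal element of the Gower‐centred matrix used in principal coordinates analysis. The function names and the example matrix are invented.

```python
import math
from itertools import combinations


def mean_distance_to_centroid(d):
    """Mean distance of the sampling units to their centroid in PCoA space.

    d is a symmetric dissimilarity matrix (list of lists) that is assumed
    to be Euclidean-embeddable.
    """
    n = len(d)
    squared = [[d[j][k] ** 2 for k in range(n)] for j in range(n)]
    row_means = [sum(row) / n for row in squared]
    grand_mean = sum(row_means) / n
    distances = []
    for row_mean in row_means:
        g_jj = row_mean - grand_mean / 2   # diagonal of the Gower-centred matrix
        if g_jj < -1e-12:
            raise ValueError("dissimilarities are not Euclidean-embeddable")
        distances.append(math.sqrt(max(g_jj, 0.0)))
    return sum(distances) / n


def mean_pairwise_distance(d):
    pairs = list(combinations(range(len(d)), 2))
    return sum(d[j][k] for j, k in pairs) / len(pairs)


# Invented dissimilarities among three sampling units
# (equivalent to positions 0, 1 and 3 along a single axis).
d = [[0, 1, 3],
     [1, 0, 2],
     [3, 2, 0]]

print(mean_distance_to_centroid(d))   # about 1.11
print(mean_pairwise_distance(d))      # 2.0
```

In this example the mean distance to the centroid is little more than half of the mean pairwise distance, in line with the argument above.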

2.3 Compositional gradient length Δcj,kmax and Δc'max

The first axis of an ordination represents, by definition, the compositional dimension with most variation in a dataset (Legendre and Legendre 1998). Its length can hence be used as a measure of compositional gradient length. Fig. 1 shows ordination scores, or the positions of the sampling units along the first ordination axis, in the level 1 table B. Compositional gradient length is approximated by the difference between the largest and smallest cell values in this table (Δc′max). Many studies have used the first axis of detrended correspondence analysis (DCA) to quantify “beta diversity” (Hill and Gauch 1980, Økland 1986, Eilertsen et al. 1990, Økland et al. 1990, Naranjo et al. 1998). The advantage of DCA axis length is that it is expressed in standard deviation units, which are considered comparable across datasets. However, the implicit dissimilarity measure in DCA is the chi‐square metric, which is not a function of gamma and alpha diversity. Therefore, DCA axis length quantifies compositional turnover in a way that is not compatible with those effective species turnover measures Δc that can be considered beta components of diversity.

A suitable effective species turnover measure Δc can be used by subjecting a dissimilarity matrix (level 2 table D in Fig. 1) to principal coordinates analysis (PCoA). The length of the first PCoA axis (Δc′max) can be interpreted as the maximum amount of effective species turnover of the chosen kind that can be observed in one compositional dimension in the dataset. Compositional gradient length is then related to the highest values in the compositional dissimilarity matrix (Δcj,kmax). However, the interpretation of compositional axis length based on PCoA is complicated by the fact that the relative effective species turnover measures (βMt−1 and βPt) saturate at a fixed maximum value when sampling units share no species. This leads to spurious structures in the ordination space and underestimation of compositional gradient length. The simplest solution to this problem is the “step‐across” or “extended dissimilarity” approach, in which intermediate sampling units are used to estimate the total compositional distance between sampling units that share no species (Williamson 1978, De'ath 1999). Hybrid multidimensional scaling (HMDS), in turn, uses small compositional dissimilarity values in a metric setting and large values in a non‐metric setting (Faith et al. 1987). This preserves the rank order of the dissimilarities rather than their actual values, which solves the problem of dissimilarity saturation but makes it difficult to compare gradient lengths across datasets. Although both approaches have potential, I am aware of no studies that have used PCoA or HMDS axis length as an explicit measure of compositional gradient length.
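A minimal sketch of how the length of the first PCoA axis could be obtained from a dissimilarity matrix is given below, assuming that NumPy is available. The function name is hypothetical and the example matrix is invented; no step‐across correction is applied, so dissimilarities that have saturated at their maximum value would lead to the underestimation described above.

```python
import numpy as np


def pcoa_first_axis_length(d):
    """Range of the sampling unit scores along the first PCoA axis."""
    d = np.asarray(d, dtype=float)
    n = d.shape[0]
    centring = np.eye(n) - np.ones((n, n)) / n
    gower = -0.5 * centring @ (d ** 2) @ centring   # Gower-centred matrix
    eigvals, eigvecs = np.linalg.eigh(gower)        # eigenvalues in ascending order
    largest = max(eigvals[-1], 0.0)
    scores = eigvecs[:, -1] * np.sqrt(largest)      # scores on the first axis
    return float(scores.max() - scores.min())


# Invented dissimilarities that correspond to positions 0, 1 and 3
# along a single compositional dimension.
d = [[0, 1, 3],
     [1, 0, 2],
     [3, 2, 0]]

print(pcoa_first_axis_length(d))
```

Because the three sampling units in the example lie along a single dimension, the axis length equals the largest pairwise dissimilarity in the matrix. With more than one compositional dimension, the first axis captures only part of the total variation.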

Compositional gradient length is measured along a single dimension of the multidimensional compositional space, so only that part of total effective species turnover that takes place in this particular dimension is quantified. Datasets can differ substantially in how large a proportion of the total compositional variation takes place in the first dimension, so Δcj,kmax and Δc′max can rank datasets in a different way than Δctot does.

2.4 Compositional gradient length along an external gradient Δc(Δg)

So far, we have been operating with compositional data only. However, many ecological studies also record the values of some external factor g (such as spatial position or an environmental variable) that is hypothesised to contribute to change in community composition. The gj values recorded in each sampling unit j then form a second raw data matrix (level 1 table C in Fig. 1). The compositional gradient length along a specified interval of the external gradient Δg is then Δc(Δg), which quantifies the amount of effective species turnover of the specified kind that is related to the explanatory gradient. In Fig. 1, this is approximated by the difference between the y values that the regression line in the level 1 scatterplot obtains at gmax and gmin, i.e. ∣c′(gmax)−c′(gmin)∣.
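The regression‐based approximation can be sketched with ordinary least squares. The function name, the gradient values and the ordination scores below are invented for illustration; the scores stand for the positions c′ of the sampling units along the first ordination axis.

```python
def gradient_length_along_g(g, scores):
    """|c'(g_max) - c'(g_min)| from a least-squares line of scores on g."""
    n = len(g)
    mean_g = sum(g) / n
    mean_c = sum(scores) / n
    sxx = sum((x - mean_g) ** 2 for x in g)
    sxy = sum((x - mean_g) * (y - mean_c) for x, y in zip(g, scores))
    slope = sxy / sxx
    # the intercept cancels out when the two fitted values are subtracted
    return abs(slope) * (max(g) - min(g))


g = [0.0, 1.0, 2.0, 3.0]            # invented values of the external gradient
scores = [1.0, 3.0, 5.0, 7.0]       # invented first-axis ordination scores

print(gradient_length_along_g(g, scores))
```

The result is the vertical side of the triangle delimited by the regression line; dividing it by the length of the gradient section gives back the slope.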

Another approach to quantifying Δc(Δg) is to calculate the beta‐turnover measure of Wilson and Shmida (1984; see also Shmida and Wilson 1985). To do so, one first arranges the sampling units along the external gradient of interest and defines for each species its total range along the gradient. Beta‐turnover then equals ∣G(Δg)+L(Δg)∣/2αint. Here αint is the arithmetic mean number of species at the points along the gradient, after the species ranges have been interpolated such that they are continuous. G is the cumulative number of species gained and L is the cumulative number of species lost between the beginning and the end of the gradient section of interest.

All species that are observed along the gradient have to be gained once, unless they are already present in the beginning of the gradient section. Similarly, all species have to be lost once, unless they are still present in the end. Consequently,

$$G(\Delta g) + L(\Delta g) = [\gamma - (a + b)] + [\gamma - (a + c)] = 2\gamma - (2a + b + c)$$

where γ is the total species richness of the dataset, a is the number of species shared between the beginning and the end of the gradient section, b is the number of species present at the beginning but not the end of the gradient section, and c is the number of species present at the end but not the beginning of the gradient section. From this follows that

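$$\text{beta-turnover} = \frac{2\gamma - (2a + b + c)}{2\alpha_{int}} = \frac{\gamma - \alpha_{b,e}}{\alpha_{int}} = \frac{\gamma}{\alpha_{int}} - \frac{\alpha_{b,e}}{\alpha_{int}}$$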

where αb,e is mean species richness of the beginning and end of the gradient. If the endpoints have the same mean richness as any other points along the gradient, the term αb,e/αint equals unity. Beta‐turnover then appears similar to 0βMt−1=γ/αt−1 (see also Vellend 2001). However, there is one important difference: αint is based on interpolated ranges of species, which means that all species are assumed to be present at each point between their first and last occurrence along the habitat gradient, whether they were observed in intermediate sampling units or not. The filling of gaps in species ranges increases the value of αint above that of αt. The smaller the amount of interpolation that is needed to make species ranges continuous along the habitat gradient, the smaller the amount by which the value of αint exceeds the value of αt, and the smaller the amount by which 0βMt−1 (i.e. Δctot) exceeds the value of beta‐turnover (i.e. Δc(Δg)).
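The calculation described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the original authors' implementation: the function names and the toy presence‐absence table `transect` are hypothetical, sampling units are assumed to be already ordered along the gradient, and species richness (q=0) is assumed throughout. The toy data contain one gap in a species range, so that the effect of interpolation on αint can be seen.

```python
def interpolate_ranges(presence):
    """Fill gaps so each species is present from its first to its last occurrence."""
    n_units, n_species = len(presence), len(presence[0])
    filled = [[0] * n_species for _ in range(n_units)]
    for s in range(n_species):
        occupied = [j for j in range(n_units) if presence[j][s]]
        if occupied:
            for j in range(occupied[0], occupied[-1] + 1):
                filled[j][s] = 1
    return filled


def beta_turnover(presence):
    """(G + L) / (2 * alpha_int) for sampling units ordered along a gradient."""
    filled = interpolate_ranges(presence)
    n_species = len(presence[0])
    observed = [s for s in range(n_species) if any(row[s] for row in presence)]
    # Each observed species is gained once unless present at the beginning,
    # and lost once unless still present at the end.
    gained = sum(1 for s in observed if not filled[0][s])
    lost = sum(1 for s in observed if not filled[-1][s])
    alpha_int = sum(sum(row) for row in filled) / len(filled)
    return (gained + lost) / (2 * alpha_int)


def whittaker_turnover(presence):
    """gamma / alpha_t - 1, using the observed (non-interpolated) richness."""
    n_species = len(presence[0])
    gamma = sum(1 for s in range(n_species) if any(row[s] for row in presence))
    alpha_t = sum(sum(row) for row in presence) / len(presence)
    return gamma / alpha_t - 1


# Hypothetical transect: rows are sampling units, columns are species.
# The second species is absent from the second unit (a gap in its range).
transect = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]

print(beta_turnover(transect))       # interpolated ranges
print(whittaker_turnover(transect))  # observed ranges
```

In this toy example the interpolated richness is the same at every point, so beta‐turnover equals γ/αint−1, and it is smaller than γ/αt−1 because filling the gap raises αint above αt.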

Most studies that have used beta‐turnover as a measure of “beta diversity” have applied it to sampling unit pairs (Simmons and Cowling 1996, Naranjo et al. 1998, Davis et al. 1999, Koleff et al. 2003a,b, Bach et al. 2007). In this case, no interpolation of species ranges is possible, so beta‐turnover takes the same value as 0βMt−1 (= the one‐complement of the Sørensen index). Some studies used N>2 sampling units arranged along a geographical rather than a habitat gradient, and counted the cumulative number of gained and lost species from one end of the gradient to the other without first interpolating species ranges (Blackburn and Gaston 1996, Willig and Gannon 1997). Disjunct species ranges cause the same species to be counted as lost and/or gained more than once, which leads to values that overestimate Δc(Δg) and can even be larger than Δctot.

Another approach to estimating compositional gradient length along an external gradient is to first quantify the instantaneous rates of compositional turnover (to be defined in Section 4.6) for all points along the gradient, and then take the sum of these (which corresponds to taking the integral of the instantaneous turnover rate function; Wilson and Mohler 1983, Oksanen and Tonteri 1995). Depending on how the instantaneous rate of compositional turnover is quantified, this approach may or may not measure turnover in a way that is consistent with a beta component of diversity.

Compositional gradient length along an external gradient Δc(Δg) is dependent not only on compositional data, but also on which external reference gradient is chosen and which part of that gradient is observed. This is because only that part of total compositional gradient length Δcj,kmax or Δc′max is quantified that is related to the relevant interval of the external gradient of interest Δg. A given compositional dataset can easily show a substantial amount of effective species turnover along one environmental gradient but very little along another. Consequently, Δc(Δg) measures a different phenomenon than the corresponding total effective species turnover Δctot does. The two may or may not be correlated across datasets, and this can depend on, among other things, whether the environmental gradients themselves are comparable.

2.5 Number of half‐change units ΔΔg(Δlog(1–Δc))

In one of his examples on how “beta diversity” can be quantified, Whittaker (1960) introduced the number of half‐change units. The half‐change unit is derived from a scatterplot in which the x axis shows Δgj,k values and the y axis shows log(CS) values, CS being the Sørensen index and representing 1−Δcj,k (Fig. 3; compare with the level 2 scatterplot of Fig. 1). The number of half‐change units (HC) equals:

$$HC = \frac{\Delta\Delta g_{max}}{\Delta\Delta g(\Delta 0.5C)}$$

Figure 3. The derivation of the half‐change unit and the number of half‐change units in a dataset. The data from the level 2 scatterplot in Fig. 1 is here displayed such that the y axis has been converted to similarities by subtracting from unity and then log‐transformed. Compositional similarity is quantified using the Sørensen index (CS). The regression line shows the expected decrease in log(CS) as a function of increasing intersample distance along gradient g. The half‐change unit is the distance Δg(0.5C) at which the expected compositional similarity equals half of the estimated value at Δg=0, and the number of half‐change units is ΔΔgmax/ΔΔg(Δ0.5C).

Simple geometry dictates that HC can also be calculated as (Fig. 3):

$$HC = \frac{\log(CS(\Delta g = 0)) - \log(CS(\Delta g_{max}))}{\log 2}$$

Here CS(Δgmax) is the expected Sørensen index value between those sampling units that are furthest away from each other along the environmental gradient of interest (see also Whittaker 1972, Vellend 2001). The half‐change unit itself is ΔΔg(Δ0.5C), which quantifies the amount of increase in environmental distance needed to reduce compositional similarity between sampling units by one‐half. The number of half‐change units then expresses the length of the environmental distance gradient in multiples of the half‐change unit. In terms of community composition, the number of half‐change units quantifies how many times the compositional similarity is halved along the Δg axis, when comparison is consistently made with the value at the end of the previous half‐change unit. This measure assumes that compositional similarity CS behaves like the number of atoms of a radioactive element: adding one half‐change unit to the x axis value always halves CS, no matter what the initial Δg value. For example, the regression model may predict that CS between sampling units in the same geographical location is 1.0 and CS between sampling units 50 km apart is 0.5. Then CS between sampling units 100 km apart is predicted to be 0.25, between sampling units 150 km apart 0.125 and so on. A study extent of 150 km would then correspond to 150 km/50 km=3 half‐change units. As the number of half‐change units increases, the estimate that HC gives of the length of the explanatory gradient becomes less accurate, just as age determination based on radioactive decay becomes less accurate when the number of half‐lives increases.
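The 150 km example can be reproduced with a short Python sketch. It assumes the exponential (radioactive‐decay‐like) behaviour of CS described above and takes the two expected similarity values as given rather than fitting a regression; the function names are hypothetical and introduced only for illustration.

```python
import math


def number_of_half_changes(cs_at_zero, cs_at_max):
    """How many times similarity is halved between distance 0 and the maximum distance."""
    return math.log2(cs_at_zero / cs_at_max)


def half_change_unit(delta_g_max, cs_at_zero, cs_at_max):
    """Distance along the gradient over which expected similarity is halved."""
    return delta_g_max / number_of_half_changes(cs_at_zero, cs_at_max)


def predicted_similarity(delta_g, cs_at_zero, unit):
    """Expected similarity at distance delta_g under strict halving per unit."""
    return cs_at_zero * 0.5 ** (delta_g / unit)


# Values from the worked example in the text: CS = 1.0 at 0 km, 0.125 at 150 km.
hc = number_of_half_changes(1.0, 0.125)
unit_km = half_change_unit(150.0, 1.0, 0.125)
print(hc, unit_km, predicted_similarity(100.0, 1.0, unit_km))
```

With these inputs the sketch gives 3 half‐change units, a half‐change unit of 50 km, and an expected similarity of 0.25 at 100 km, matching the numbers in the example.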

The number of half‐change units represents a higher level of abstraction than the previous beta diversity measures, namely level 3 (Tuomisto and Ruokolainen 2006, 2008). This level is a result of taking Δc values from level 2 and calculating ΔΔc values for them. All pairwise ΔΔcj,k values can be arranged into a level 3 dissimilarity matrix with N² rows and N² columns, where each cell value is the difference between two sampling unit pairs in their within‐pair compositional dissimilarity (table F in Fig. 1). Level 1 data correspond to community composition c, level 2 data to difference in community composition Δc and level 3 data to difference in difference in community composition ΔΔc. In a similar way, level 3 data are derived for the environmental data (table G in Fig. 1). The number of half‐change units can be referred to by ΔΔg(Δlog(1–Δc)) which indicates that it quantifies level‐3 explanatory data in terms of differences in log‐transformed one‐complements of level 2 compositional dissimilarities. Obviously, values of ΔΔg(Δlog(1–Δc)) obtained for the same compositional data can vary widely depending on the choice of reference gradient g. The beta components of diversity correspond to Δc, and therefore differ from HC in both the focal kind of data (compositional vs explanatory) and in the level of abstraction addressed (level 2 vs level 3). The number of half‐change units has nevertheless been used as a measure of “beta diversity” in several studies (Whittaker 1960, 1972, Lee and La Roi 1979, Oksanen 1983, Vellend 2001).

3. Approaches that distort βM or Δc

3.1 Compositional distinctness of a focal sampling unit (mean ΔcF,j)

Compositional distinctness of a focal sampling unit (mean ΔcF,j) quantifies how much species turnover is expected to take place between the focal sampling unit F and another sampling unit drawn at random from the dataset. In Fig. 2, mean ΔcF,j is the mean pairwise species turnover between the focal sampling unit F and the other sampling units j within the specified geographical window.

In practice, compositional distinctness of a focal sampling unit has mostly been used in macroecological studies where a large area is divided into a grid of equally‐sized cells. A moving window of a fixed number of grid cells is used, for example 3 by 3 cells. The grid cell at the center is the focal sampling unit, and the other grid cells in the window are the sampling units whose pairwise species turnover values with the focal sampling unit are averaged. Alternatively, the average of pairwise values obtained between the focal sampling unit and a specified number of its neighbours to any of the cardinal directions can be used.

Although compositional distinctness of a focal sampling unit has been used to quantify “beta diversity” (Lennon et al. 2001, Hewitt et al. 2005, Gaston et al. 2007, Hernández et al. 2008), it is in fact insensitive to the amount of compositional heterogeneity within the window. Instead, the value of mean ΔcF,j depends on which sampling unit is chosen as the focal one; consider using sampling unit G instead of F in Setup 1 of Fig. 2. Obviously, what mean ΔcF,j quantifies is not correlated with either Δctot or mean Δcj,k, so using mean ΔcF,j as a measure of “beta diversity” can give very misleading results.
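The dependence on the choice of focal sampling unit can be illustrated with a small Python sketch. The window below is a made‐up example (it is not Setup 1 of Fig. 2), the unit labels and species names are hypothetical, and pairwise turnover is assumed to be measured as the one‐complement of the Sørensen index.

```python
def sorensen_dissimilarity(x, y):
    """One-complement of the Sørensen index for two sets of species."""
    shared = len(x & y)
    return 1 - 2 * shared / (len(x) + len(y))


def focal_distinctness(units, focal):
    """Mean pairwise dissimilarity between the focal unit and every other unit."""
    others = [name for name in units if name != focal]
    total = sum(sorensen_dissimilarity(units[focal], units[o]) for o in others)
    return total / len(others)


# Hypothetical window: one unit with a unique flora, three identical units.
window = {
    "F": {"sp1", "sp2", "sp3"},
    "G": {"sp4", "sp5", "sp6"},
    "H": {"sp4", "sp5", "sp6"},
    "I": {"sp4", "sp5", "sp6"},
}

print(focal_distinctness(window, "F"))
print(focal_distinctness(window, "G"))
```

The compositional data in the window are the same in both calls, yet the value is 1 when F is the focal unit and 1/3 when G is, which is why the measure says little about the heterogeneity of the window as a whole.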

3.2 Compositional non‐nestedness of a species‐poor sampling unit in a species‐richer one

Simpson (1943) was interested in the degree to which species in a species‐poor sampling unit are a subset of species in a more species‐rich sampling unit, and quantified this with Ssim=a/[a+min(b, c)]. Here a is the number of species shared by both sampling units, b is the number of species unique to the first sampling unit and c is the number of species unique to the second sampling unit. This index measures the degree to which the species‐poorer sampling unit is compositionally nested within the species‐richer sampling unit. Ssim resembles the Sørensen index CS=1−0βMt−1 but ignores all species that are unique to the species‐richer sampling unit, and hence obtains a larger value than CS whenever b and c differ. Whereas 0βMt−1 quantifies relative species turnover (which takes into account both species gains and losses between two sampling units), 1−Ssim quantifies relative species loss when moving from the less species‐rich sampling unit to the more species‐rich one, i.e. the proportion of the species‐poorer sampling unit that is not nested in the species‐richer one.
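The relationship between the two indices can be checked numerically. The sketch below uses the formula for Ssim given above and the standard Sørensen formula CS=2a/(2a+b+c); the function names and the species counts are hypothetical and chosen only for illustration.

```python
import math


def sorensen_similarity(a, b, c):
    """Sørensen index CS from shared (a) and unique (b, c) species counts."""
    return 2 * a / (2 * a + b + c)


def simpson_similarity(a, b, c):
    """Simpson's Ssim: ignores the unique species of the richer sampling unit."""
    return a / (a + min(b, c))


# Unequal richness: 10 shared species, 2 unique to one unit, 8 to the other.
print(sorensen_similarity(10, 2, 8), simpson_similarity(10, 2, 8))

# Equal richness (b == c): the two indices coincide.
print(sorensen_similarity(10, 5, 5), simpson_similarity(10, 5, 5))
```

In the first case Ssim (10/12) exceeds CS (20/30) because the eight species unique to the richer unit are ignored; in the second case both indices equal 2/3.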

Baselga (2010) partitioned “beta diversity”, which he equated with 0βMt−1, into two additive components, one of which was 1−Ssim. Several other studies have used 1−Ssim itself as a measure of “beta diversity” (Koleff et al. 2003b, Mena and Vázquez‐Domínguez 2005, Baselga and Jiménez‐Valverde 2007, Baselga 2008, La Sorte et al. 2008, Qian 2008, 2009, Leprieur et al. 2009). Most of these studies referred to the beta‐sim measure of Lennon et al. (2001), but this is actually the mean 1−Ssim value between a focal sampling unit and each of its neighbours in a specified geographical window. Beta‐sim as originally described has therefore been used as a measure of “beta‐diversity” less often than appears from the number of times it has been cited (Koleff and Gaston 2002, Gaston et al. 2007, Kallimanis et al. 2008, Melo et al. 2009).

Beta‐sim is analogous to compositional distinctness of a focal sampling unit (Section 3.1, above), but the phenomenon it quantifies is the expected degree to which the less species‐rich sampling unit is compositionally not nested within the more species‐rich one when the focal sampling unit is compared to another sampling unit drawn at random from a geographical window. Just like compositional distinctness of a focal sampling unit, beta‐sim is very sensitive to the choice of the focal sampling unit. Consequently, the correlation between beta‐sim and Δctot or mean Δcj,k may be weak, and interpreting beta‐sim as “beta diversity” can be very misleading.

3.3 Regional approach with α and γ based on different datasets

It has been rather popular to calculate “beta diversity” such that the alpha and gamma diversities are derived from entirely or partly different datasets. In this case, however, the idea of partitioning γ into α and β components is lost, because the sampling units used to quantify α are not the same as those used to quantify γ. Sometimes such a discrepancy has arisen inadvertently when the aim was to harmonise sampling effort among sites: sites with very small sampling effort were excluded when quantifying α but included when quantifying γ (Clarke and Lidgard 2000).

Some studies have divided the total species richness of a regional dataset by the species richness of a focal sampling unit to quantify βMt (Lennon et al. 2001, Gaston et al. 2007), βMt−1 (Kallimanis et al. 2008) or log(βMt) (Soininen et al. 2007). However, βMt=γ/αt where αt is the mean species richness in all the sampling units j that contributed to γ (Diversity in relation to two classifications in Tuomisto 2010). Replacing αt by the species richness of a focal sampling unit gives the ratio γ/γF (where γF is gamma diversity at a more local scale, within sampling unit F). This quantifies how many times as rich in species the regional dataset is than the focal sampling unit. Selecting a different focal sampling unit from the same regional dataset could give a very different value of γ/γF. The degree of error made if this measure is interpreted as if it equaled 0βMt depends on the degree to which γF deviates from the mean species diversity of all sampling units αt.

Following the beta‐2 measure of Harrison et al. (1992) and the beta‐3 measure of Williams (1996), several studies have used the sampling unit with the highest species richness γjmax as the focal sampling unit (Blackburn and Gaston 1996, Oliver et al. 1998, Davis et al. 1999, Clarke and Lidgard 2000, Gray 2000, Koleff and Gaston 2001, Plaza Pinto et al. 2008). This approach quantifies the smallest possible value of ranged 0βMt (in the case of beta‐2) or 0βPt (in the case of beta‐3), i.e. the value that would be obtained if all sampling units had as many species as the most species‐rich one that has actually been observed, but total species richness remained unchanged. The approach has been justified by arguing that using γjmax makes the measure less sensitive to trends in within‐sampling‐unit species diversity (Harrison et al. 1992). However, such a trend is an indication of compositional heterogeneity, and therefore relevant when the focus is on beta diversity. A consequence of using γjmax instead of αt is that it makes the estimate of “beta diversity” dependent on the variability among sampling units in γj, with the amount of underestimation being related to the amount by which the highest within‐sampling‐unit species richness γjmax happens to exceed αt in the data. An even more extreme approach was taken by Arias‐González et al. (2008), who used the variability in γj as a measure of “beta diversity”, and thereby entirely detached the concept of “beta diversity” from alpha and gamma diversity.

Harrison (1997) used each sampling unit as the focal one in turn to obtain as many estimates of γ/γF for each region as there were sampling units, and reported their arithmetic mean. However, mean γ/γF equals γ/αt only when all sampling units have the same species richness γj. As the variability in γj among the sampling units increases, mean γ/γF increasingly exceeds 0βMt.
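The three regional‐to‐local ratios discussed in this section can be compared on a toy dataset. The sketch below is illustrative only: the function name, the dictionary keys and the richness values are hypothetical, and species richness (q=0, arithmetic mean) is assumed.

```python
def regional_to_local_ratios(gamma, richness):
    """Compare gamma/alpha_t with the focal-unit variants discussed in the text."""
    alpha_t = sum(richness) / len(richness)
    return {
        # Regional-to-local ratio based on mean within-unit richness.
        "gamma_over_alpha_t": gamma / alpha_t,
        # Each unit used as the focal one in turn, then averaged.
        "mean_gamma_over_gamma_F": sum(gamma / g for g in richness) / len(richness),
        # Richest unit used as the focal one.
        "gamma_over_gamma_max": gamma / max(richness),
    }


# Hypothetical region with 50 species in total.
uneven = regional_to_local_ratios(50, [10, 20, 40])
even = regional_to_local_ratios(50, [20, 20, 20])
print(uneven)
print(even)
```

With unequal richness, the ratio based on the richest unit (1.25) falls below γ/αt (about 2.14), whereas the mean of the focal‐unit ratios (about 2.92) exceeds it. With equal richness all three coincide at 2.5.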

Stevens and Willig (2002) quantified “beta diversity” as 1−γF/γ for different regions in the Americas such that γF values were taken from publications reporting numbers of species observed in field sites, and γ was inferred from a compilation of species distribution maps. Rather than quantifying proportional species turnover (0βPt=1−αt/γ) in a defined dataset, this measure quantifies how large a proportion of the species that are thought to occur in a region were actually found in a given field sampling unit. Since the sampling effort was not uniform for either γF or γ, the continental variation in 1−γF/γ reflects variation in the quality of the distribution maps and the thoroughness of the field sampling to an unknown degree (Biases and constraints, below).

3.4 Average of all pairwise values with compositional data taken from outside the sampling units of interest

Mumby (2001) suggested that “beta diversity” can be quantified with the help of a habitat map. First, field sampling units are classified into habitat types using compositional data. Then the habitat classification is extrapolated to the entire area of interest using a satellite image. Finally, a moving window of a fixed number of pixels is passed over the satellite image. For each window position, the average of all pairwise compositional dissimilarity values between those field sampling units that represent the spectral habitat types found in the pixel window is calculated.

The main problem with the approach is that none of the field sampling units that are used to calculate average pairwise compositional dissimilarity between pixels in a given window need to come from that window. Therefore, the obtained compositional dissimilarity values represent the compositional heterogeneity of the habitat classes over the entire extent of the study, which may be considerably larger than the extent of the pixel window being considered. As a consequence, average compositional dissimilarity between pixels within the window is probably overestimated.

Harborne et al. (2006) modified the approach by taking into account the proportional abundances of the different habitats in each window. They calculated the sum of the squared Bray‐Curtis index values between habitat type pairs, log‐transformed the sum, and multiplied the result by a diversity index based on habitat type abundances within the pixel window. It is difficult to see what this index quantifies, but clearly it is not related to any of the beta components that can be derived from gamma and alpha diversity.

4. Approaches that quantify a rate of change

4.1 Species diversity accumulation rate ΔqD/Δx

Species diversity accumulation can be defined as the increase in the effective number of observed species qD with increasing sampling effort (traditional species accumulation is obtained as a special case when q=0). Sampling effort can here be quantified with the number of sampling units, provided that each sampling unit (SU) represents the same sampling effort (in terms of the number of individuals observed, surface area inventoried, or other appropriate criterion). Species diversity accumulation rate is a measure of how rapidly the effective number of species increases as sampling effort increases. This can be calculated as Δy/Δx, which is the mean slope of a species diversity accumulation curve over a defined interval of sampling effort Δx. Here Δy=ΔqD is the difference in mean observed species diversity corresponding to Δx. If the x axis is log(sampling effort), then

$$\frac{\Delta y}{\Delta x} = \frac{{}^{q}D(n_2) - {}^{q}D(n_1)}{\log(n_2) - \log(n_1)}$$

where n is the number of sampling units at the subset size of interest. Equally well, the x axis can be linear, in which case

$$\frac{\Delta y}{\Delta x} = \frac{{}^{q}D(n_2) - {}^{q}D(n_1)}{n_2 - n_1}$$

Both the interpretation of the result and its numerical value will be very different depending on whether the x axis is log‐transformed or not. In both cases, the value of Δx is dependent on how the sampling units have been defined and how many of them there are.

Scheiner (2003) classified species‐area curves (which are species accumulation curves where sampling effort is defined by surface area) into four different types according to the geographical distribution of sampling units. Type II curves are obtained when the study region is divided into a grid, and each grid cell is a sampling unit (lag=0). Type III curves are obtained when sampling units are spread over the entire extent of the study region but do not cover it all, so that adjacent sampling units are not contiguous (lag>0). In practice, Type III curves are prevalent among studies that obtain their data by doing field work, whereas Type II curves are mostly used in macroecological studies that obtain their data from museum records or distribution maps. Scheiner's concepts can be made more general by allowing other definitions of sampling effort than surface area; let us here use the number of individuals. Then lag=0 means that all individuals of the target group that actually exist within the extent of the study are included in one of the sampling units, and lag >0 means that a part of the individuals remain unobserved.

Both Type II and Type III curves can be either spatially explicit (Type A) or spatially non‐explicit (Type B; Scheiner 2003). This distinction is what concerns us now, because it determines whether the y axis quantifies total species diversity in the focal set of n sampling units (γ) or mean species diversity within sampling units (αt). This leads to two kinds of effective species accumulation rate.

4.1.A Gamma diversity accumulation rate Δγ/(Δn SU) or Δγ/Δlog(n SU)

Spatially non‐explicit (Type B) species accumulation curves trace the increase in mean observed gamma diversity as an increasing number of sampling units is taken into account. Such curves can be produced by drawing sampling units at random from the entire pool of available sampling units, as is done in software such as EstimateS (Colwell 2009). The mean effective number of species for subset size n SU is then determined (mean γ(n)) and an average species accumulation curve drawn. These curves have also been called collector's curves, rarefaction curves and randomised or smoothed species accumulation curves (Colwell and Coddington 1994, Gotelli and Colwell 2001, Colwell et al. 2004). Figure 4A shows an example based on species richness (q=0); examples of (unsmoothed) gamma diversity accumulation curves for different values of q can be found in Magurran (1988, p. 53).

Average gamma diversity accumulation rate can be determined between any two subset sizes as the slope of the line connecting the corresponding points along the smoothed curve (Fig. 4A). It is noteworthy that lag decreases with increasing n, because larger n means that more sampling units are drawn from the same region. In contrast, grain and extent are constant, because the definition of a sampling unit and the total pool of sampling units remain unchanged.
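A smoothed gamma diversity accumulation curve and its average slope can be sketched as follows. This is an illustrative sketch for species richness (q=0) only, and it is not how EstimateS works internally: instead of random draws it averages over every possible subset of n sampling units, which is feasible only for small datasets. The function names and the toy data are hypothetical.

```python
import math
from itertools import combinations


def mean_gamma(units, n):
    """Mean species richness over all subsets of n sampling units (q = 0)."""
    subsets = list(combinations(units, n))
    return sum(len(set().union(*subset)) for subset in subsets) / len(subsets)


def accumulation_rate(units, n1, n2, log_x=False):
    """Average slope of the curve between subset sizes n1 and n2."""
    delta_y = mean_gamma(units, n2) - mean_gamma(units, n1)
    delta_x = math.log(n2) - math.log(n1) if log_x else n2 - n1
    return delta_y / delta_x


# Hypothetical dataset: four sampling units, each a set of species.
units = [{"a", "b"}, {"b", "c"}, {"c", "d"}, {"d", "e"}]

curve = [mean_gamma(units, n) for n in range(1, len(units) + 1)]
print(curve)
print(accumulation_rate(units, 1, 4))              # linear x axis
print(accumulation_rate(units, 1, 4, log_x=True))  # log-transformed x axis
```

The two calls to `accumulation_rate` return different numbers for the same pair of points, because the slope depends on whether the x axis is log‐transformed. The vertical difference between the points, by contrast, is the same in both cases.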

Figure 4. Species diversity accumulation curves. (A) In the smoothed spatially non‐explicit curves, n sampling units are drawn at random from the full set of sampling units. For each subset size n SU, the curve shows mean gamma diversity (mean γ(n)), which equals the gamma diversity of the entire dataset when n=N, and the expected species diversity in a single sampling unit when n=1. Mean Δγ(1;n) equals mean absolute species turnover in subsets of n sampling units βAt(n,1) (scale at left). When n=N, Whittaker's species turnover can be obtained as Δγ(1;N)/αt(N,1) (scale at right). (B) In the spatially explicit curves, m spatially adjacent sampling units are pooled to form one new sampling unit. For each sampling unit size m, the curve shows the expected species diversity of a single sampling unit αt(N,m) in the entire dataset. The vertical difference Δαt(N,m;N,N) equals absolute species turnover βAt(N,m) (scale at left). Proportional species turnover βPt(N,m) can be obtained for any grain m as Δαt(N,m;N,N)/γ(N) (scale at right).

4.1.B Alpha diversity accumulation rate Δαt/(Δm SU) or Δαt/Δlog(m SU)

In the spatially explicit curves (Type A), lag and extent remain constant but the grain of the sampling is changed. This is achieved by pooling m spatially adjacent original sampling units to form one larger sampling unit before quantifying alpha diversity αt(N,m) where N is the total number of sampling units in the dataset. Alpha diversity equals the generalised mean with exponent 1−q of the within‐sampling unit species diversities qDγj, which corresponds to the arithmetic mean when q=0 (Diversity in relation to two classifications in Tuomisto 2010). All N available original sampling units are used, but they are allocated to N/m new sampling units of m times the original size, such that each different value of m corresponds to a different grain.

The alpha diversity accumulation curve traces the increase in the expected species diversity within a sampling unit as the sampling unit grows larger (Fig. 4B; see also Lira‐Noriega et al. 2007). Average alpha diversity accumulation rate can be determined between any two sampling unit sizes as the slope of the line connecting the corresponding points along the curve.
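The pooling procedure can be sketched for a one‐dimensional transect. The sketch assumes species richness (q=0, so the arithmetic mean applies), sampling units listed in spatial order, and a grain m that divides N evenly; the function name and the toy data are hypothetical.

```python
def alpha_at_grain(units, m):
    """Mean richness after pooling m spatially adjacent units (q = 0).

    Assumes `units` is in spatial order and len(units) is divisible by m.
    """
    pooled = [set().union(*units[i:i + m]) for i in range(0, len(units), m)]
    return sum(len(p) for p in pooled) / len(pooled)


# Hypothetical transect of four sampling units with gradual species replacement.
units = [{"a", "b"}, {"b", "c"}, {"c", "d"}, {"d", "e"}]

alpha_curve = {m: alpha_at_grain(units, m) for m in (1, 2, 4)}
print(alpha_curve)

# Average alpha diversity accumulation rate between grains m=1 and m=2.
print((alpha_curve[2] - alpha_curve[1]) / (2 - 1))

# Vertical distance from the point m=2 to the top of the curve (m=N).
print(alpha_curve[4] - alpha_curve[2])
```

At m=N the whole dataset forms one sampling unit, so the last point of the curve is the total richness of the dataset (5 species here).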

Interpretation of diversity accumulation rates

Gamma diversity accumulation curves (spatially non‐explicit or Type B curves) visualise the effect of across‐region sampling intensity, i.e. how inventorying more sampling units of a fixed sampling effort increases observed total species diversity. In contrast, alpha diversity accumulation curves (spatially explicit or Type A curves) visualise the effect of within‐site sampling intensity, i.e. how changing sampling unit size by allocating a larger part of the total sampling effort to a single sampling unit increases observed species diversity within sampling units. Note that this formulation of alpha diversity corresponds to αt rather than true alpha diversity αd; this is necessary if a beta component of diversity is to be read off the diversity accumulation curves.

The degree to which the shapes of the spatially explicit and non‐explicit curves differ depends on the spatial structure of compositional heterogeneity within the study region. If species composition changes gradually from one sampling unit to the next, alpha diversity accumulation rate is small between small values of m and increases towards larger m, because all neighbouring sampling units are compositionally similar. As the degree of random variation and/or local‐scale patchiness increases, nearest neighbours can also be more different. Then the alpha diversity accumulation curve rises steeply at small values of m and levels off as m increases. If species are distributed at random (at a given grain), contiguous sampling units are no more similar than non‐contiguous sampling units, and the alpha diversity accumulation curve converges on the gamma diversity accumulation curve (at that grain). The degree of difference between the two types of curve is therefore an indicator of the degree of spatial aggregation in the data (see also He and Legendre 2002, Olszewski 2004). This interpretation assumes that mean diversity is calculated in the same way in both curves, i.e. that the generalised mean with exponent 1−q is used.

These relationships are paralleled in the contrast between individual‐based and sample‐based rarefaction curves (Colwell and Coddington 1994, Gotelli and Colwell 2001, Colwell et al. 2004). An individual‐based rarefaction curve can be thought of as a gamma diversity accumulation curve based on sampling units that contain exactly one individual each. A sample‐based rarefaction curve is obtained when adjacent original sampling units (individuals) are pooled to form new, larger sampling units (as in the alpha diversity accumulation curve), and randomly selected subsets of these, at the desired grain, are used to construct a gamma diversity accumulation curve.

Species diversity accumulation rate has been used or promoted as a measure of “beta diversity” (Ricotta et al. 2002, Scheiner 2003, 2004, Passy and Blanchet 2007). However, the slope of a species accumulation curve does not quantify compositional heterogeneity but the rate of change in gamma diversity with increasing number of sampling units (Δγ/Δx) or the rate of change in alpha diversity with increasing sampling unit size (Δαt/Δx). Different kinds of species turnover (Δc) can be deduced from a species diversity accumulation curve, but all of these are represented by a difference in the y axis value between two points along the curve, not the slope of the curve.

The smoothed gamma diversity accumulation curve is constructed by varying subset size, so the difference in y axis value between points that correspond to subset sizes n SU and m SU (with n>m) equals mean γ(n)−mean γ(m). Provided the generalised mean with exponent 1−q is used, mean γ(1) equals mean αt(n,1), so the vertical distance between the point at n and the point at m=1 becomes

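$$\overline{\gamma(n)} - \overline{\gamma(1)} = \overline{\gamma(n)} - \overline{\alpha_t(n,1)} = \overline{\beta_{At}(n,1)}$$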

i.e. mean absolute effective species turnover in all subsets of n sampling units (Fig. 4A; see also Crist and Veech 2006, Gardezi and Gonzalez 2008). If the y axis is divided by αt(N,1), the difference between the y values of the lowest and highest points of the curve becomes

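$$\frac{\gamma(N) - \alpha_t(N,1)}{\alpha_t(N,1)} = \frac{\gamma(N)}{\alpha_t(N,1)} - 1 = \beta_{Mt-1}(N,1)$$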

i.e. Whittaker's effective species turnover in the entire dataset. Unlike mean absolute effective species turnover, mean Whittaker's effective species turnover cannot always be accurately read off the gamma diversity accumulation curve when n<N. This is because the mean of ratios, such as mean[γ(n)/αt(n,1)], does not equal the ratio of the corresponding mean numerators (mean γ(n)) and mean denominators (mean αt(n,1)) unless either the geometric mean is used (i.e. q=1) or all denominators are equal. In real datasets, αt(n,1) values may vary among the different subsets of n sampling units, especially when n is small.

The alpha diversity accumulation curve shows the actual increase in the mean within‐sampling unit species diversity as grain increases. The difference in y axis value between points that correspond to sampling unit sizes n SU and m SU (with n>m) equals αt(N,n)−αt(N,m) (Fig. 4B). When the entire dataset forms a single sampling unit, αt(N,N) equals γ(N). The difference in y axis value between this point and another point along the curve αt(N,m) equals

$$\alpha_t(N,N) - \alpha_t(N,m) = \gamma(N) - \alpha_t(N,m) = \beta_{At}(N,m)$$

i.e. absolute effective species turnover in the entire dataset when sampling units are m times as large as originally. Interpreting αt(N,n) as mean γ(n) allows quantifying mean absolute effective species turnover for any combination of extent and grain by

$$\overline{\beta_{At}(n,m)} = \alpha_t(N,n) - \alpha_t(N,m)$$

If the y axis is divided by αt(N,N)=γ(N), the vertical difference between the point n=N and another point m becomes

$$\frac{\gamma(N) - \alpha_t(N,m)}{\gamma(N)} = 1 - \frac{\alpha_t(N,m)}{\gamma(N)} = \beta_{Pt}(N,m)$$

i.e. proportional effective species turnover in the entire dataset at grain m. Unlike mean βAt(n,m), mean βPt(n,m) cannot be read off the curve when n<N because a different denominator (mean γ(n)) needs to be used for each different value of n.

4.2 Species entropy accumulation rate Δlog(qD)/Δx

If the species diversity accumulation curves in Fig. 4 are modified such that the y axis shows species entropy instead of species diversity, the species entropy accumulation curve is obtained. The most commonly used entropy in this context has been the Rényi entropy qH=log(qD), because species accumulation curves are often displayed in a log‐log plot. The slope of this curve is the species entropy accumulation rate, which quantifies how rapidly species entropy increases as log(sampling effort) increases. Two kinds of species entropy accumulation rate can be derived analogously to the two kinds of species diversity accumulation rate.

4.2.A Gamma entropy accumulation rate Δlog(γ)/Δlog(n SU)

A smoothed spatially non‐explicit entropy accumulation curve can be obtained by log‐transforming the y values of the smoothed gamma diversity accumulation curve of Fig. 4A. The curve traces the increase in gamma entropy as the number of sampling units that are taken into account increases. The average gamma entropy accumulation rate at a specified interval of sampling effort equals the average slope of the curve at that interval.

4.2.B Alpha entropy accumulation rate Δlog(αt)/Δlog(m SU)

A spatially explicit entropy accumulation curve is derived by log‐transforming the y axis of Fig. 4B. It traces the increase in alpha entropy as the sampling units grow larger (with total sampling effort remaining constant). The average alpha entropy accumulation rate at a specified interval of sampling unit size equals the average slope of the curve at that interval.

Interpretation of entropy accumulation rates

Species entropy accumulation rate has been used or promoted as a measure of “beta diversity” (Caswell and Cohen 1993, Rosenzweig 1995, Vazquez and Givnish 1998, Scheiner 2003, 2004, Smith 2008), but the approach has also been criticised (Connor and McCoy 1979, Wilson and Shmida 1984). Indeed, the slope of an entropy accumulation curve does not quantify the amount of compositional heterogeneity in the data, and the two need not even be correlated. However, the difference in the y axis value between specific points along the curve can be interpreted in terms of the regional‐to‐local diversity ratio βMt, provided that the generalised mean with exponent 1−q was used to calculate mean species diversity before log‐transformation. It seems that MacArthur (1965, p. 528) interpreted entropy accumulation curves in this way, rather than (as suggested by several authors) equating the slope of the curve with “beta diversity”.

In the gamma entropy accumulation curve, the difference in y axis value between points that correspond to subset sizes n SU and m SU (with n>m) equals

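$$\log\overline{\gamma(n)} - \log\overline{\gamma(m)} = \log\frac{\overline{\gamma(n)}}{\overline{\gamma(m)}}$$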

When n=N and m=1, this equals

$$\log\gamma(N) - \log\alpha_t(N,1) = \log\frac{\gamma(N)}{\alpha_t(N,1)} = \log\beta_{Mt}(N,1)$$

i.e. the logarithm of regional‐to‐local diversity ratio in the entire dataset. As with gamma diversity accumulation curves (Section 4.1), this interpretation is not accurate with other values of m unless q=1. If log(βMt(N,1)) is used as a measure of “beta diversity” instead of using βMt(N,1) itself, then “beta diversity” is equated with regional Rényi entropy excess, which gives regional Shannon entropy excess inline image as a special case when q=1. This is not a beta component of true diversity but of a raw diversity index value (Sections 6 and ∞ in Tuomisto 2010).

In the alpha entropy accumulation curve, the difference in y value between points that correspond to sampling unit sizes n SU and m SU (with n>m) equals

$$\log\alpha_t(N,n) - \log\alpha_t(N,m) = \log\frac{\alpha_t(N,n)}{\alpha_t(N,m)}$$

When the larger sampling unit contains the entire dataset (n=N), this can be interpreted as

$$\log\frac{\gamma(N)}{\alpha_t(N,m)} = \log\beta_{Mt}(N,m)$$

or the logarithm of regional‐to‐local diversity ratio in the entire dataset at grain m. Although

image

has been used as a measure of “beta diversity” (Arita and Rodriguez 2002), it does not equal mean γ(n)/αt(n,m)=mean βMt(n,m) when n<N unless q=1 or the alpha diversities of all subsets with n sampling units are identical.

4.3 Rate of change in a beta component of diversity with increasing sampling effort ΔβM/Δx or ΔΔc/Δx

4.3.A Accumulation rate of a beta component of diversity

Figure 5A shows the accumulation of arithmetic mean true beta diversity βMd with increasing subset size (n SU) for which αd(n,1) and γ(n) are quantified in a spatially non‐explicit setting. Mean true beta diversity accumulation rate between any two points along the curve equals the slope of the line connecting them. The slope between points corresponding to a single sampling unit and a set of n sampling units equals

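$$\frac{\Delta\beta_{Md}}{\Delta n\,\mathrm{SU}} = \frac{\overline{\beta_{Md}(n,1)} - 1\,\mathrm{CU}}{(n-1)\,\mathrm{SU}}$$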

Figure 5.

(A) The accumulation of mean true beta diversity with increasing n in a spatially non‐explicit setting, in which a subset of n sampling units is randomly drawn from the entire dataset. The slopes of the lines quantify the average rate of change in mean true beta diversity as number of sampling units inventoried increases over a specified interval. (B) The decrease in true beta diversity with increasing m in a spatially explicit setting, in which m sampling units are pooled to form a single new sampling unit before calculating true alpha diversity. The squares represent a dataset with a strong spatial gradient in species composition, and the diamonds a dataset in which the species are distributed at random. The slopes of the lines quantify the average rate of change in true beta diversity as size of the sampling units increases over a specified interval. (C) The accumulation of mean proportional effective species turnover with increasing n in a spatially non‐explicit setting, in which a subset of n sampling units is randomly drawn from the entire dataset. The slopes of the lines quantify the average rate of change in proportional effective species turnover as an increasing proportion of the sampling units is taken into account over a specified interval.

A numerically identical result (but with measurement unit 1/SU instead of CU/SU; Table 1) is obtained if the y axis shows regional‐to‐local diversity βMt or Whittaker's effective species turnover βMt−1. When n=N, the unitless value of the slope equals the one‐complement of the qCSn similarity measure for the entire dataset (generalisation of the Sørensen index; Section 4 in Tuomisto 2010). When n<N, mean 1−qCSn for all subsets with n sampling units is obtained. For presence‐absence data, 1−0CSn equals the beta‐1 measure of Harrison et al. (1992). Beta‐1 and 1−qCSn can therefore be used to quantify the average rate at which compositional heterogeneity and Whittaker's (effective) species turnover increase as sampling effort increases from 1 to N sampling units. Obviously, ΔβMd/(Δn SU) does not quantify the same phenomenon as βMd itself, and the same is true of ΔβMt/(Δn SU) and ΔβMt−1/(Δn SU) (the latter can be referred to, less specifically, as ΔΔc/Δx).

Beta‐1 has been justified by arguing that when beta diversity is constrained to a fixed range, its values can be compared across regions that differ in N (Harrison et al. 1992, Blackburn and Gaston 1996). However, beta‐1 is no less affected by N than βMd is. As the number of sampling units in a sample increases, each additional sampling unit contributes progressively fewer new species to the sample (unless all sampling units were identical to start with, or the region of interest has an effectively infinite species diversity). The amount of increase in mean βMd(n,1) per unit increase in sampling effort therefore necessarily decreases with increasing n within any one dataset (Section 4 in Tuomisto 2010). In general, beta‐1 values can therefore be expected to be smaller in datasets with large N than in datasets with small N.
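The dependence of this slope on the number of sampling units can be illustrated with a minimal sketch for presence‐absence data (q=0, alpha taken as mean species richness). The four sampling units and the function names are hypothetical. The slope is computed as (mean β−1)/(n−1) over all subsets of n sampling units, following the description above of a line drawn from a single sampling unit to a set of n sampling units.

```python
from itertools import combinations

def whittaker_ratio(units):
    """Regional-to-local diversity ratio gamma / mean alpha for species richness."""
    gamma = len(set().union(*units))
    alpha = sum(len(u) for u in units) / len(units)
    return gamma / alpha

def mean_slope(units, n):
    """Slope (mean beta - 1) / (n - 1), averaged over all subsets of n units."""
    subsets = list(combinations(units, n))
    mean_beta = sum(whittaker_ratio(s) for s in subsets) / len(subsets)
    return (mean_beta - 1) / (n - 1)

# Hypothetical presence-absence data for N = 4 sampling units.
units = [{"a", "b"}, {"b", "c"}, {"c", "d"}, {"d", "e"}]

for n in (2, 3, 4):
    print(n, mean_slope(units, n))
```

In this toy dataset the slope declines from 0.75 at n=2 to 0.5 at n=N=4, in line with the argument that the value obtained depends on how many sampling units are included.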

4.3.B Decay rate of a beta component of diversity

Figure 5B shows the decrease in true beta diversity βMd(N,m)=γ(N)/αd(N,m) with increasing m, where m is the number of sampling units fused to form one new sampling unit in a spatially explicit setting (see also Lira‐Noriega et al. 2007). Each point along the curve is a single value rather than a mean, because both γ(N) and αd(N,m) are based on the entire dataset. Increasing m increases αd(N,m) but has no effect on γ(N), so the curve is monotonically decreasing. This is in contrast with the spatially non‐explicit curve (Fig. 5A), which is monotonically increasing because γ(n) increases with n but αd(n,1) remains approximately constant. The slope between the point corresponding to the original sampling units and sampling unit size m SU equals

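$$\frac{\Delta\beta_{Md}}{\Delta m\,\mathrm{SU}} = \frac{\beta_{Md}(N,m) - \beta_{Md}(N,1)}{(m-1)\,\mathrm{SU}}$$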

A numerically identical result (although with measurement unit 1/SU) is obtained if the y axis shows regional‐to‐local diversity βMt or Whittaker's effective species turnover βMt−1. This slope is ecologically interpreted as the rate at which compositional heterogeneity and Whittaker's effective species turnover decrease as sampling unit size increases from 1 to m times the size of the original sampling units. When m=N, βMt(N,m) equals unity, and the above equation gives a result with the same absolute numeric value as beta‐1 and 1−qCSN for the entire dataset (Section 4.3.A, above), but opposite sign. Obviously, beta diversity decay rate ΔβMd/(Δm SU) does not quantify the same phenomenon as βMd itself, and the same is true of ΔβMt/(Δm SU) and ΔβMt−1/(Δm SU) (the latter can be referred to, less specifically, as ΔΔc/Δx).

The spatially non‐explicit beta component accumulation curve is not affected by spatial structure in the compositional data, but the spatially explicit beta component decay curve is. The initial decrease in the beta component (when sampling units are still small) is more rapid when species are randomly rather than patchily distributed (Fig. 5B), although the mean rate of change between the extremes is the same as long as the same sampling unit definition is used at m=1.
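The effect of spatial structure on the decay curve can be sketched with a toy example for presence‐absence data (q=0). The same four hypothetical sampling units are arranged in two ways: ordered along a compositional gradient, and with compositionally unrelated units as neighbours. The function names and data are assumptions made for illustration, and N is assumed to be divisible by m.

```python
def pool(units, m):
    """Fuse each run of m adjacent sampling units into one larger unit."""
    return [set().union(*units[i:i + m]) for i in range(0, len(units), m)]

def beta_ratio(units, m):
    """gamma(N) / alpha(N, m) for species richness (q = 0)."""
    gamma = len(set().union(*units))
    pooled = pool(units, m)
    alpha = sum(len(u) for u in pooled) / len(pooled)
    return gamma / alpha

# Hypothetical data: the same sampling units in two spatial arrangements.
gradient = [{"a", "b"}, {"b", "c"}, {"c", "d"}, {"d", "e"}]   # ordered along a gradient
scattered = [{"a", "b"}, {"d", "e"}, {"b", "c"}, {"c", "d"}]  # neighbours unrelated

for m in (1, 2, 4):
    print(m, round(beta_ratio(gradient, m), 3), round(beta_ratio(scattered, m), 3))

# The mean rate of change between m = 1 and m = N is the same for both arrangements.
n_units = len(gradient)
decay_rate = (beta_ratio(gradient, n_units) - beta_ratio(gradient, 1)) / (n_units - 1)
print(decay_rate)
```

At m=2 the ratio has already dropped further in the scattered arrangement than in the gradient arrangement, while both curves start at 2.5 and end at 1.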

4.3.C Proportional species turnover accumulation rate

Figure 5C shows the increase in the arithmetic mean of proportional species turnover βPt(n,1) when an increasing proportion of the available sampling units is taken into account in a spatially non‐explicit setting. The rate of change in mean proportional species turnover between subsets of a single sampling unit and n sampling units is the slope of the line connecting the corresponding points along the curve:

image

The slope between the lowest and highest points along the curve is obtained when n=N, and it equals the one‐complement of the qCJn measure discussed in Section 5 of the first part of the present review (generalisation of the Jaccard index; Jost 2006 and Tuomisto 2010). From Fig. 5C it can be seen that 1−qCJn quantifies the average rate at which proportional species turnover increases as proportional sampling effort increases from one nth part to all of the sampling units in the dataset. This can also be referred to, less specifically, as ΔΔc/Δx, which obviously quantifies something different than Δc (= βPt in this case) itself does.

4.4 Rate of change in beta entropy Δlog(qβM)/Δx

If Fig. 5A were replotted such that both the x and the y axis were log‐transformed, the y axis would show the beta component of Rényi entropy [=log(qβMd)] rather than true beta diversity. The accumulation of arithmetic mean beta Rényi entropy with increasing log(sample size) would then be traced by a curve drawn through the points representing the arithmetic means of the log‐transformed qβMd(n,1) values corresponding to each log(n). The slope of a line connecting two points on this curve quantifies the average rate at which mean beta Rényi entropy increases between the two sampling efforts. When sample size increases from 1 SU to n SU, this equals:

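$$\frac{\Delta\log({}^q\beta_M)}{\Delta\log(n\,\mathrm{SU})} = \frac{\overline{\log({}^q\beta_{Md}(n,1))}}{\log(n)}$$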

Exactly the same slope is obtained irrespective of whether the y axis shows log(qβMd) or log(qβMt) values, even in terms of measurement units since these cancel out.

When q=1, the slope between the lowest and highest point of the curve equals log(1βM(N,1))/log(N). This equals RhN, i.e. the Horn index of heterogeneity as generalised to N sampling units (Section 6 of Tuomisto 2010). When q=1 and n<N, the equation gives the expected RhN for a subset of n sampling units. Obviously, the Horn index, and more generally Δlog(qβM)/Δx, quantifies a different phenomenon than qβM itself does.
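The ratio log(1βM(N,1))/log(N) can be sketched for abundance data as follows. This is a minimal illustration that assumes all sampling units are weighted equally, in which case log(β) for q=1 equals the Shannon entropy of the pooled data minus the mean Shannon entropy of the sampling units. The function names and abundance tables are hypothetical.

```python
import math

def shannon(proportions):
    """Shannon entropy (natural logarithm) of a list of proportional abundances."""
    return -sum(p * math.log(p) for p in proportions if p > 0)

def horn_heterogeneity(abundances):
    """log(beta) / log(N) for q = 1, assuming equally weighted sampling units."""
    n_units = len(abundances)
    props = [[x / sum(row) for x in row] for row in abundances]
    pooled = [sum(col) / n_units for col in zip(*props)]
    h_gamma = shannon(pooled)
    h_alpha = sum(shannon(p) for p in props) / n_units
    return (h_gamma - h_alpha) / math.log(n_units)

# Hypothetical sites-by-species abundance tables (rows are sampling units).
identical = [[3, 1], [3, 1]]
disjoint = [[5, 5, 0, 0], [0, 0, 5, 5]]

print(horn_heterogeneity(identical), horn_heterogeneity(disjoint))
```

The value is bounded between 0 (identical sampling units) and 1 (no shared species, given equal weights), which illustrates that it is a standardised rate rather than the beta component itself.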

4.5 Species diversity or entropy accumulation rate with alpha and gamma diversities based on different data

In Sections 4.1 and 4.2, vertical differences between specific points in Scheiner's (2003) Type II and Type III species diversity or entropy accumulation curves were interpreted in terms of effective species turnover or regional‐to‐local diversity ratio. Such interpretations are possible when the same set of sampling units is used in all cases, because then inline image (for the spatially non‐explicit curves) and αt(N,N)=γ(N) (for the spatially explicit curves). This is not the case with Scheiner's (2003) Type I curves. These quantify species diversity or entropy accumulation in a set of nested sampling units, where each grain is represented by a single sampling unit, whose size also defines the extent of the study region at that grain. Type I curves therefore quantify simultaneously the increase in total species richness γ(N) as extent becomes larger, and the increase in the species richness of a focal sampling unit γF as its size increases (these curves are typically based on presence‐absence data obtained from published species lists). If the grain and extent are decoupled, it becomes possible to quantify γ(N)−γF or log(γ(N)/γF) such that γF is based on a smaller grain and extent than γ(N). However, since these measures are based on the species richness of a focal sampling unit γF rather than on the mean species richness of all sampling units inline image, they suffer from the same problems as the measures discussed in Section 3.3, above. Although the slopes of neither linear nor log‐transformed Type I curves quantify a beta component of diversity, they have been equated with “beta diversity” (Lennon et al. 2001, Soininen et al. 2007).

4.6 Compositional turnover rate or effective species turnover rate Δc(Δg)/Δg

Compositional turnover rate is the rate at which community composition changes along an explanatory gradient. This can be quantified between any two sampling units in a way that is compatible with the concept of effective species turnover by dividing a pairwise effective species turnover value Δcj,k by the corresponding Δgj,k value (see the level 2 tables D and E in Fig. 1). Effective species turnover rate is also approximated by the slope of the regression line in the level 1 scatterplot in Fig. 1; average turnover rate can be quantified for the entire observed range of the explanatory gradient, or smaller sections of the gradient can be used (Økland 1986). The instantaneous rate of compositional turnover can also be estimated using the derivatives of species response functions or of a compositional change function along the gradient of interest (Bratton 1975, Wilson and Mohler 1983, Oksanen and Tonteri 1995). However, if these methods are used, the results may not be compatible with effective species turnover as defined on the basis of alpha and gamma diversity. Compositional turnover rate has been called “beta diversity” (Bratton 1975, Cody 1975, 1986), but it is actually Δc(Δg)/Δg and therefore quantifies a different phenomenon than true beta diversity qβMd or even the effective species turnover measures Δc.

Before compositional turnover rate can be calculated, the external factor for the x axis has to be defined, which leads to two main variants of compositional turnover rate.

a) The explanatory gradient is geographical location, i.e. the rate of compositional turnover is measured at a specific geographical location and in a specific compass direction.

b) The explanatory gradient is an environmental variable (or a surrogate thereof, such as elevation), i.e. the rate of compositional turnover along the gradient is measured at a specific value (or interval) of the environmental variable of interest.

Very different compositional turnover rates can be obtained with the same compositional data depending on which reference gradient is chosen. Compositional turnover rate has been used to identify zones of rapid turnover along an explanatory gradient of interest (such as elevation or ground water depth; Økland 1986). Pairwise turnover values between sampling units separated by a fixed distance along the gradient can also be plotted against the explanatory gradient itself. The points in such graphs actually represent (Δcj,k, gj), but they can (with caution) be interpreted in terms of (Δcj,k/Δgj,k, gj) since each Δcj,k corresponds to the same Δgj,k (Beals 1969, Bratton 1975, Wilson and Shmida 1984, Bach et al. 2007, Jankowski et al. 2009).
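Dividing a pairwise turnover value by the corresponding gradient difference can be sketched as follows. The example uses Whittaker's species turnover for presence‐absence data as Δc and elevation as the explanatory gradient; the three plots, their elevations and the function name are hypothetical.

```python
def whittaker_turnover(a, b):
    """Whittaker's species turnover (gamma / mean alpha - 1) for two species sets."""
    return len(a | b) / ((len(a) + len(b)) / 2) - 1

# Hypothetical plots: (elevation in metres, species present).
plots = [
    (100, {"a", "b", "c", "d"}),
    (200, {"a", "b", "c", "e"}),
    (300, {"e", "f", "g", "h"}),
]

rates = []
for (g_j, species_j), (g_k, species_k) in zip(plots, plots[1:]):
    delta_c = whittaker_turnover(species_j, species_k)  # pairwise turnover
    delta_g = g_k - g_j                                 # distance along the gradient
    rates.append(delta_c / delta_g)                     # turnover per metre of elevation

print(rates)
```

Because both pairs are separated by the same elevation difference, the three‐fold difference in turnover translates directly into a three‐fold difference in turnover rate, pointing to a zone of rapid turnover between the second and third plots.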

4.7 Rate of change in (the logarithm of the one‐complement of) pairwise compositional turnover with increasing distance along an explanatory gradient (slope of a distance decay regression) ΔΔc(ΔΔg)/ΔΔg or Δlog(1–Δc)(ΔΔg)/ΔΔg

Rate of change in pairwise compositional turnover is the average rate at which sampling units become compositionally more dissimilar as distance between them along the external gradient of interest increases. It can be quantified as the slope of the regression line drawn through the level 2 scatterplot in Fig. 1, ΔΔc(ΔΔg)/ΔΔg. If the y axis values (Δcj,k) in this plot are subtracted from unity to convert them to similarities, a distance decay plot is obtained (Nekola and White 1999). The slopes of the two plots have the same absolute value but opposite signs. The distance decay plot can also be displayed with a logarithmic y axis, in which case the slope of the regression line equals Δlog(1–Δc)(ΔΔg)/ΔΔg (Fig. 3). The slopes obtained with linear vs logarithmic y axes quantify different things and their values are therefore not commensurate.
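The relationship between the three slopes can be checked with a minimal sketch. The pairwise values below are hypothetical and were constructed so that similarity decays exponentially with distance; `ols_slope` is an ordinary least‐squares helper written for this example, and natural logarithms are assumed for the log‐transformed axis.

```python
import math

def ols_slope(x, y):
    """Slope of the ordinary least-squares regression of y on x."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical pairwise data: distance between units and compositional turnover.
distance = [1.0, 2.0, 3.0, 4.0]
turnover = [0.2, 0.36, 0.488, 0.5904]          # constructed as 1 - 0.8**distance
similarity = [1 - t for t in turnover]

slope_turnover = ols_slope(distance, turnover)
slope_similarity = ols_slope(distance, similarity)
slope_log_similarity = ols_slope(distance, [math.log(s) for s in similarity])

print(slope_turnover, slope_similarity, slope_log_similarity)
```

The first two slopes differ only in sign, whereas the slope obtained with the logarithmic y axis has a different value altogether, so the two kinds of slope should not be compared with each other.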

Before the slope of a distance decay regression can be calculated, the external factor for the x axis has to be defined. Very different rates of change can be obtained with the same compositional data depending on which reference gradient is chosen. Two main variants can be distinguished.

a) The factor of interest is geographical distance, i.e. the rate of change in (the logarithm of the one‐complement of) pairwise compositional turnover is measured as the geographical distance between sampling units increases. An average value can be obtained for all directions if space is assumed isotropic, but it is also possible to differentiate between rate of change in pairwise compositional turnover in different directions, for example north–south vs east–west.

b) The factor of interest is environmental difference (or a surrogate thereof, such as difference in elevation), i.e. the rate of change in (the logarithm of the one‐complement of) pairwise compositional turnover is measured as the environmental dissimilarity between sampling units increases. Environmental difference can be defined using just one environmental variable or several at a time.

In one of his examples, Whittaker (1960) introduced both variants, since he used geographical distance as a surrogate for environmental difference. However, he did not designate the rate of change as a measure of “beta diversity”, but rather used it as an intermediate step in estimating the number of half‐change units (Section 2.5, above). Pielou (1975, p. 101) reinterpreted “beta diversity” such that it was no longer the number of half‐change units ΔΔg(Δlog(1–Δc)), but the inverse of the length of the half‐change unit 1/ΔΔg(Δ0.5C). This equals the absolute value of the slope of the distance decay regression divided by a constant (log2; Fig. 3). Cody (1993) referred to the slope itself as “beta diversity”, and recently doing so has become quite popular. Some studies have used a log‐transformed y axis (Whittaker 1960, Pielou 1975, Vazquez and Givnish 1998, Leach and Givnish 1999, Qian et al. 2005, 2009, Qian and Ricklefs 2007, Qian 2008, Smith 2008), others a linear one (Cody 1993, Condit et al. 2002, Davidar et al. 2007, Novotny et al. 2007, Muneepeerakul et al. 2008, Qian 2009). Both approaches can be justified, but each quantifies a different rate of change (compare with Sections 4.1 and 4.2, above). Recently, the halving distance, i.e. the length of the half‐change unit Δg(0.5C), has also been used as a measure of “beta diversity” (Qian 2009). This equals the inverse of Pielou's measure if the same similarity index (CS) is used.
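The link between the slope of a log‐transformed distance decay regression, the halving distance and its inverse can be written out in a few lines. The slope value is hypothetical (it corresponds to similarity declining by 20% per distance unit), and natural logarithms are assumed throughout.

```python
import math

# Hypothetical slope of log(similarity) against distance (natural logarithm).
slope = math.log(0.8)

# Inverse of the length of the half-change unit: |slope| divided by log 2.
inverse_halving_distance = abs(slope) / math.log(2)

# Halving distance: the distance over which similarity drops to one half.
halving_distance = 1 / inverse_halving_distance

print(round(inverse_halving_distance, 4), round(halving_distance, 4))
```

Here the halving distance is about 3.1 distance units, and raising 0.8 to that power indeed gives 0.5. The two measures carry the same information, but they are expressed in inverse units and change in opposite directions as the slope steepens.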

These rate‐related measures have led to much confusion, because the term “beta diversity” can now refer to concepts as different as the amount of effective species turnover (Δc), the rate of effective species turnover (Δc(Δg)/Δg), the rate of change in effective species turnover (ΔΔc(ΔΔg)/ΔΔg), the amount of change in differences along an explanatory gradient (ΔΔg(Δlog(1–Δc))), the halving distance (Δg(0.5C)), or the inverse of the halving distance (1/Δg(0.5C)). Sometimes two or three of these meanings have been used in the same paper (Cody 1993, Vazquez and Givnish 1998, Leach and Givnish 1999, Novotny et al. 2007, Qian 2009), which makes accurate communication of ideas and results very difficult. This is especially problematic because the slope of a regression line (such as ΔΔc(ΔΔg)/ΔΔg) is independent of the average y value of the observations (such as inline image). Thus, one cannot be used as a predictor for the other. The only logical link between the two is that the steepness of the slope is constrained by the variability of the y values; if all y values are equal, the regression slope is necessarily zero.
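That a regression slope carries no information about the average y value can be verified with a small sketch. The turnover values are hypothetical, and `ols_slope` is an ordinary least‐squares helper written for this example.

```python
def ols_slope(x, y):
    """Slope of the ordinary least-squares regression of y on x."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

distance = [1.0, 2.0, 3.0, 4.0]

# Hypothetical pairwise turnover values: low average, high average, and constant.
low_turnover = [0.10, 0.15, 0.20, 0.25]
high_turnover = [y + 0.5 for y in low_turnover]
constant_turnover = [0.4, 0.4, 0.4, 0.4]

slope_low = ols_slope(distance, low_turnover)
slope_high = ols_slope(distance, high_turnover)
slope_constant = ols_slope(distance, constant_turnover)

print(slope_low, slope_high, slope_constant)
```

The first two datasets differ greatly in mean turnover but yield the same slope, and the dataset with no variation in turnover yields a slope of zero.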

Biases and constraints

The bottom line

Quantifying diversity necessitates that the target organism group be explicitly defined. The target group may be, for example, trees with diameter at breast height exceeding 10 cm, herbaceous vascular plants, breeding birds, ungulates, any aquatic animals caught in a net of a specified mesh size, any arthropods caught in a pitfall trap, a specific family of beetles caught in a pitfall trap, and so forth. Possible target groups differ greatly in the size of individuals, population density per unit area, detectability of the individuals and existing number of species, and all of these variables have a great impact on diversity estimates.

Diversity can be calculated for any dataset (sites by species table), but meaningful ecological conclusions necessitate that sampling efforts are standardised among sampling units and datasets (Gotelli and Colwell 2001; see also Sections 4.1–4.4, above). There are many ways of quantifying sampling effort, and these are not interchangeable. One possibility is to define each sampling unit such that it contains a fixed total abundance of the organisms of interest. If the number of individuals observed is used as the abundance criterion, the results are relatively easy to compare, because each individual has an equal probability of contributing a new species to the dataset. However, sometimes a more appropriate measure of observed abundance may be biomass, basal area or cover. Another way of standardising sampling effort is to use a plot of a fixed surface area or a fixed period during which observations are made. When such approaches are used, it needs to be taken into account that differences in diversity among areas or target groups may simply be due to differences in the density of individuals. This, in turn, may depend on the size of the individuals or a variety of other factors that are specific to the target organism. These can be, for example, spatial (size and shape of the sampling unit in relation to the size of the organisms), temporal (observation season and time of day) or behavioral (territoriality and tendency to avoid being seen or caught). Great caution is therefore needed if diversity values obtained with different kinds of organisms are compared.

Extrapolating alpha and gamma diversity

All diversity components (α, β and γ) are exactly quantifiable in any dataset where the observed entities have been classified into both sampling units and species. However, often researchers are not interested in the diversity components of the dataset itself, but in using the dataset to infer diversity components for larger areas. Typically, the region of interest contains orders of magnitude more individuals than the sample dataset, such that the species identity of only a small proportion of the relevant individuals is known. Similarly, each site or habitat of interest may contain many more individuals than the sampling unit representing it. Using the species diversities of a single sampling unit qDγj and the sampled dataset qDγ (collectively referred to as diversity of the observed unit qDO) as estimates of the species diversity of the corresponding target units qDT (site, habitat or region) is therefore faced with extrapolation problems. It is also possible that the dataset includes individuals that do not belong to the target set. For example, the interest may be in breeding birds but the available dataset also includes non‐breeding individuals. This may cause qDT to be overestimated.

More commonly, the target unit contains all the individuals of the observed unit and additional individuals as well. Then observed species richness (0DO) can at most equal the actual species richness of the target unit (0DT). The rarer a species, the more individuals need to be observed (on average) before it is found, so the more rare species there are, the larger the downward bias when 0DO is used as an estimate of 0DT. As q increases, the effect of the rare species on the value of qDO decreases, so the downward bias in qDT diminishes. However, usually both sampling and species distributions are spatially autocorrelated, which biases the sampling towards observing some species over others and overestimating the abundances of the observed species. Hence, qDT is probably underestimated at all values of q.

In general, the larger the size discrepancy (difference in the number of individuals) between the observed unit and the target unit, the larger the underestimation error when qDO is used as an estimate of qDT. If all individuals of the target group that exist in the target unit are actually sampled, then qDO necessarily equals qDT. How large a proportion of the individuals in the target unit need to be observed for qDO to converge on qDT depends on the species richness, species abundance distribution and spatial structure of the target unit, as well as on sampling strategy (Scheiner 1990, Colwell and Coddington 1994, Gray 2000, Gotelli and Colwell 2001, Chao et al. 2009).

Imagine two target units, one with 10 species and the other with 1000 species, each represented by an observed unit of 100 individuals. For the species‐rich target unit, 0DO underestimates 0DT by at least 90%, but for the species‐poor target unit, 0DO may equal 0DT. A much higher sampling effort is obviously needed in a species‐rich target unit than in a species‐poor one for it to be possible to observe all species. Furthermore, if a target unit is dominated by a single species, this will also dominate the sample and fewer species will be observed. If the degree of dominance is much higher in a species‐rich target unit than in a species‐poor one, it is even possible for 0DO of the former to be smaller than 0DO of the latter.
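The richness example can be simulated with a minimal sketch. It assumes, purely for illustration, that all species within a target unit are equally abundant and that individuals are drawn independently; the function name and the random seed are arbitrary choices.

```python
import random

def observed_richness(n_species, n_individuals, rng):
    """Species seen when individuals are drawn at random from equally abundant species."""
    return len({rng.randrange(n_species) for _ in range(n_individuals)})

rng = random.Random(1)  # fixed seed, for repeatability only

poor = observed_richness(10, 100, rng)    # species-poor target unit
rich = observed_richness(1000, 100, rng)  # species-rich target unit

# Proportion of the actual richness that went unobserved in the species-rich unit.
shortfall_rich = 1 - rich / 1000

print(poor, rich, round(shortfall_rich, 3))
```

Whatever the seed, the observed richness of the species‐rich unit cannot exceed the 100 individuals sampled, so the shortfall is at least 90%, whereas the species‐poor unit is typically sampled completely.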

The spatial and environmental setup of sampling within the target unit also affects bias. If species distributions within the target unit are spatially structured, a geographically biased sample can generally be expected to yield smaller qDO than a random sample with the same number of individuals. In practice, geographically biased sampling caused by limited accessibility in some parts of the target unit is extremely common. To counteract undersampling bias, known environmental gradients within the target unit can be used to maximise compositional coverage of the observations (e.g. by the “gradsect” method; Austin and Heyligers 1989).

If each sampling unit is its own target unit, then qDO=qDT and the qDγj values have no undersampling bias. Nevertheless, the alpha diversity estimate for the target region (mean qDγj) will have sampling error if all sampling units together do not contain all individuals of the target region. If the sampling units are biased towards sites with higher or lower species diversity than the actual regional mean, then the alpha diversity estimate will be similarly biased.

Often both sampling units and the corresponding target units are circumscribed by spatial coordinates. Then both qDγj and the corresponding qDT values may refer to unknown numbers of individuals, which makes estimating the degree of undersampling bias difficult. Macroecological studies in particular suffer from this problem, because actual sampling efforts may differ by orders of magnitude among sampling units of the same surface area, but the ancillary information needed to estimate sampling efficiency is often not available. Comparing diversity patterns among organisms that differ in individual density, detectability and/or species richness is also risky even if sampling effort per surface area is uniform. Results of such studies need to be treated with extreme caution to avoid interpreting patterns in the vagaries of sampling as patterns in species diversity.

Extrapolating the beta component of diversity

The beta components of diversity (βM and Δc) are quantified using available alpha and gamma diversity values, which are subject to the sampling biases discussed above if extrapolated beyond the actual dataset sampled. If the observed qDγ is an underestimate of the gamma diversity of the target region, the beta component will be underestimated. Conversely, if the observed mean qDγj is an underestimate of the alpha diversity of the target region, the beta component will be overestimated. Thus, the beta component can be either underestimated or overestimated, depending on the relative magnitudes of the biases in the alpha and gamma diversity estimates.

The implications of undersampling are somewhat different for the regional and pairwise approaches to quantifying the beta component. The following is written in terms of effective species turnover Δc, but similar reasoning applies to the heterogeneity measures βM as well. Imagine that each sampling unit is representative of the corresponding target unit, but the total set of sampling units is not representative of the target region. Then the Δctot value of the dataset is clearly an underestimate of Δctot of the target region, but the pairwise Δcj,k values of the sampling units are accurate estimates of the pairwise Δcj,k values of the corresponding target units. The average of the pairwise values in the dataset is an unbiased estimate of the target average if sampling unit placement is unbiased, but can become either an underestimate or an overestimate with biased sampling. The situation is different if the sampling units are very small in relation to the target units, but their number is so large that the target region is well represented. In this case, both the regional and pairwise Δc values are overestimates of their target values.

Imagine further that a region of interest contains two habitats that have 10 species each and share 5 species, and two habitats that have 1000 species each and share 500 species. Whittaker's species turnover (0βMt−1) between the two low‐diversity habitats is exactly the same as that between the two high‐diversity habitats, namely 0.5. This is not the case for sampling units containing 100 individuals each, however. If some species in a habitat have a higher probability of being included in the corresponding sampling unit than others (e.g. because overall abundances differ or species distributions are aggregated), then the values of 0βMt−1 observed between sampling units can theoretically vary over the entire interval [0, 1] depending on vagaries of sampling, no matter what the actual 0βMt−1 value between the corresponding habitats. If every species does have the same probability of being included in a sampling unit, then 0βMt−1≅ 0.5 between the sampling units representing the species‐poor habitats, because all species that are actually present in each habitat are probably included in the corresponding sampling unit. In contrast, 0βMt−1≅ 1 between the sampling units representing the species‐rich habitats, because the number of individuals sampled is only 10% of the number of species present in each habitat, and it is unlikely that the same shared species happen to be sampled in both habitats (Wolda 1981, Scheiner 1990, Colwell and Coddington 1994, Plotkin and Muller‐Landau 2002, Chao et al. 2006). It is obviously incorrect to conclude from such 0βMt−1 values that Whittaker's species turnover is higher between the two species‐rich habitats than between the two species‐poor habitats. However, a justified interpretation can be made in terms of the sampling units themselves: Whittaker's species turnover really is higher between the two species‐rich sampling units than between the two species‐poor sampling units. This is, after all, what was actually measured.
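The habitat example can be simulated with a minimal sketch under the equal‐probability scenario described above: every species in a habitat has the same chance of being represented by each sampled individual. The species labels, function names and random seed are hypothetical choices for illustration.

```python
import random

def whittaker_turnover(a, b):
    """Whittaker's species turnover (gamma / mean alpha - 1) for two species sets."""
    return len(a | b) / ((len(a) + len(b)) / 2) - 1

def habitat_pair(n_species, n_shared):
    """Two hypothetical species lists of equal length that share n_shared species."""
    shared = [f"shared{i}" for i in range(n_shared)]
    only_a = [f"a{i}" for i in range(n_species - n_shared)]
    only_b = [f"b{i}" for i in range(n_species - n_shared)]
    return shared + only_a, shared + only_b

def sample(habitat, n_individuals, rng):
    """Species observed among individuals drawn with equal probability per species."""
    return {rng.choice(habitat) for _ in range(n_individuals)}

rng = random.Random(1)  # fixed seed, for repeatability only
results = {}
for label, (n_species, n_shared) in {"poor": (10, 5), "rich": (1000, 500)}.items():
    hab_a, hab_b = habitat_pair(n_species, n_shared)
    actual = whittaker_turnover(set(hab_a), set(hab_b))
    observed = whittaker_turnover(sample(hab_a, 100, rng), sample(hab_b, 100, rng))
    results[label] = (actual, observed)
    print(label, actual, round(observed, 3))
```

The actual turnover is 0.5 for both habitat pairs, but with 100 individuals per sampling unit the observed value stays close to 0.5 only for the species‐poor pair and approaches 1 for the species‐rich pair.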

Diversity differences comparable to this example are probably quite common in nature, for example between tree inventories in tropical vs boreal forests. Consequently, great care is needed if patterns in observed βM or Δc along external gradients, such as latitude, are interpreted in terms of habitats or other entities larger than the sampling unit. If there is a trend in species diversity along the gradient of interest, a spurious trend in compositional heterogeneity and effective species turnover may result simply because the degree of undersampling and hence the amount of bias depends on species diversity (Colwell and Hurtt 1994).

When two target units are compositionally very similar, actual Δcj,k and βMj,k values are small and can hence be overestimated by a much wider margin than when the actual Δcj,k and βMj,k values are large. Severe undersampling therefore causes pairwise effective species turnover and compositional heterogeneity to converge towards high values for all sampling unit pairs (Jobe 2008, Cardoso et al. 2009). This makes unraveling ecological or spatial trends in Δc and βM more difficult for species‐rich than for species‐poor habitats or target organisms (Jones et al. 2008). When abundance data are available, the undersampling bias can be corrected to some degree (Chao et al. 2006), but extrapolation of Δc and βM should still be done with caution.

Cardoso et al. (2009) tested how sensitive various “beta diversity” indices are to undersampling, and used their robustness to make recommendations on which to use. However, each one of the indices they tested quantifies a different phenomenon. These included Whittaker's species turnover βMt−1, ranged proportional species turnover βPt, non‐nestedness, measures based on the species richness of a focal sampling unit, and rates of change. Knowledge about the sampling behaviour of an index is important when interpreting the results, but the primary criterion for choosing an index should be whether or not it quantifies the phenomenon of interest.

Discussion

It follows from the diversity of beta diversities that researchers should be explicit about what kind of “beta diversity” they refer to. Some recent papers on “beta diversity” are not cited in the present review, because they did not define “beta diversity” explicitly, and could therefore not be placed in the general framework. The framework itself is not exhaustive, because some papers used such complicated calculations to derive “beta diversity” that I was unable to figure out the ecological meaning of the final variable. Many studies have failed to address their stated research question because the selected method of quantifying “beta diversity” was not compatible with it, or have compared their results with those of earlier studies in which a different kind of “beta diversity” had been used. Most of the indices that have been used as explicit measures of “beta diversity” actually do not measure a (beta) component of diversity at all (see, for example, Table 1 in Koleff et al. 2003a). In the absence of clear criteria by which to select among the available indices, many studies have opted for using two or more indices in parallel. Sometimes one of these did measure a beta component of diversity, other times not.

Species turnover and beta diversity are often considered synonymous, but Vellend (2001) argued that they should be kept separate. Vellend considered beta diversity “a value that can be related mathematically to α‐ and γ‐diversity”, and species turnover “the rate or magnitude of change in species composition along predefined spatial or environmental gradients”. I find even these definitions too broad. Not only true beta diversity (βMd) and regional‐to‐local diversity ratio (βMt) but also several other variables are simple functions of γ and α (Tuomisto 2010). Three of these (βAt, βMt−1 and βPt) quantify the total number or proportion of effective species that change among sampling units in a dataset, i.e. absolute or relative effective species turnover Δc. This equals actual species turnover when based on presence‐absence data (q=0). Many other definitions of compositional turnover have also been used, but since these are not derivable from gamma and alpha diversity, they do not quantify (effective) species turnover in this strict sense. In addition, magnitude of total species turnover (Δctot of Section 1.1, above) and species turnover in a single compositional dimension (Δc′max of Section 2.3) are not affected by the choice of a reference gradient, but species turnover along an external gradient (Δc(Δg) of Section 2.4) is. Furthermore, magnitude of change along a gradient does not equal rate of change along that same gradient (Δc(Δg)/Δg of Section 4.6). The magnitude corresponds to the difference in y axis value between two points on the regression line in the level 1 scatterplot in Fig. 1, whereas the rate corresponds to the slope of the regression line. The “species turnover” measure preferred by Vellend (2001) actually represents the next level of abstraction, and corresponds to the maximum difference in x axis values in the level 2 scatterplot of Fig. 1 (ΔΔg(Δlog(1–Δc)) of Section 2.5). Rather than quantifying change in species composition, this measure expresses change in environmental difference in terms of change in similarity in species composition. Whittaker (1960) himself included βM, Δc and ΔΔg(Δlog(1–Δc)) in the original concept of “beta diversity”; the recent literature on “beta diversity” has been so confused that a clarification of terminology is urgently needed.
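
The distinction drawn above between magnitude and rate of change along a gradient can be made concrete with a small numerical sketch. It assumes that the level 1 scatterplot has pairwise difference along the external gradient on the x axis and pairwise turnover on the y axis. The data values and function names are hypothetical, and ordinary least squares is used only as a convenient stand‐in for whatever regression method a given study applies.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def magnitude_of_change(slope, intercept, x_from, x_to):
    """Difference in fitted y value between two points on the regression line."""
    return (slope * x_to + intercept) - (slope * x_from + intercept)


# Hypothetical pairwise gradient differences and pairwise turnover values.
gradient_difference = [1.0, 2.0, 3.0, 4.0, 5.0]
turnover = [0.12, 0.19, 0.31, 0.38, 0.50]

rate, intercept = fit_line(gradient_difference, turnover)

# The rate is a single slope; the magnitude depends on the gradient span compared.
magnitude_full = magnitude_of_change(rate, intercept, 1.0, 5.0)
magnitude_half = magnitude_of_change(rate, intercept, 1.0, 3.0)

print(f"rate of change per gradient unit: {rate:.3f}")
print(f"magnitude over 4 gradient units:  {magnitude_full:.3f}")
print(f"magnitude over 2 gradient units:  {magnitude_half:.3f}")
```

The same regression line thus yields one rate but as many magnitudes as there are gradient spans of interest, which is why the two quantities cannot be used interchangeably.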

Jurasinski et al. (2009) reviewed different “beta diversity” concepts and divided them into two groups. The first group included multiplicative and additive partitioning of total species diversity. In the present paper, these correspond to regional‐to‐local diversity ratio (βMt) and absolute effective species turnover (βAt), respectively. The second group included a number of different phenomena subdivided into four categories. The first category included all approaches where sampling units are compared in a pairwise manner using dissimilarity coefficients. As explicit examples, they mentioned a number of indices that do not correspond to any beta component of diversity, and the Sørensen and Jaccard indices, which in the present review correspond to different basic definitions of a beta component. Their second category included the sum of squares of a sites by species matrix, which corresponds to the average of all pairwise squared Euclidean distances between sampling unit pairs (Section 2.1, above). Their third category included the average distance to a compositional centroid (Section 2.2, above), compositional gradient length (Section 2.3, above) and the number of half‐change units (Section 2.5, above). The present paper aims to clarify the differences among variants even further.

Terminological confusion is apparent, for example, in the study by Novotny et al. (2007). They used three different similarity measures to derive “beta diversity”. The Sørensen index corresponds to the one‐complement of βMt−1 (Section 1.2, above). The codominance index C(d) (used also by Chave and Leigh 2002 and Condit et al. 2002) quantifies the probability that two individuals drawn at random from sampling units d kilometres apart belong to the same species. In high‐diversity systems, the “beta diversity” estimate obtained with C(d) approaches zero whatever the compositional similarity between the sampling units (Jost 2006 and Section 7 of Tuomisto 2010). The third index was based on γF/γ, where γF is the species richness of a focal site and γ the known species richness of the entire country. γF/γ would equal 1/βMt if γF equaled average sampling unit diversity (Section 3.3, above), but it is more likely that γjmax was used. In tropical rain forest datasets, γjmax is easily orders of magnitude higher than mean sampling unit diversity, and Δctot is hence considerably underestimated if γjmax is used instead of the mean. Furthermore, the authors' claim that γF/γ indicated “low beta diversity” cannot be evaluated, because no information on sampling effort was provided. The possible range of values that γF/γ can take given the data at hand is hence unknown (Section 5 of Tuomisto 2010; Biases and constraints, above). Their fourth “beta diversity” measure was distance decay rate ΔΔc(ΔΔg)/ΔΔg, which is not correlated with any measure of Δc (Section 4.7). Although their target region was over 500 km long, Novotny et al. (2007) used data from only 4–8 sampling units, which were purposefully placed in similar environments. This leads to 0DO being a strongly downward biased estimate of 0DT for gamma diversity. At the same time, very high within‐site sampling effort led to 0DO being a rather good estimate of 0DT for alpha diversity. The implications of this sampling strategy were not discussed when the observation of “low beta diversity” was extrapolated to tropical forests in other parts of the world.
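
Two of the statements above about similarity indices can be checked numerically. The following is a minimal sketch, not a reconstruction of the computations in Novotny et al. (2007). It assumes that the Sørensen index is computed from presence‐absence data as twice the number of shared species divided by the summed richness of the two units, and that the codominance of one pair of sampling units is the sum over species of the products of their relative abundances. All function names and example communities are hypothetical.

```python
def sorensen_similarity(unit_a, unit_b):
    """Sorensen index for two species sets: 2 * shared / (richness_a + richness_b)."""
    shared = len(unit_a & unit_b)
    return 2 * shared / (len(unit_a) + len(unit_b))


def whittaker_turnover(unit_a, unit_b):
    """Whittaker's species turnover (gamma / mean alpha - 1) for two species sets."""
    gamma = len(unit_a | unit_b)
    mean_alpha = (len(unit_a) + len(unit_b)) / 2
    return gamma / mean_alpha - 1


def codominance(abund_a, abund_b):
    """Probability that one random individual from each unit belong to the same species."""
    total_a = sum(abund_a.values())
    total_b = sum(abund_b.values())
    return sum(
        (count / total_a) * (abund_b.get(species, 0) / total_b)
        for species, count in abund_a.items()
    )


# Sorensen similarity and Whittaker's turnover sum to one for any pair of units.
unit_a = set(range(10))     # 10 species
unit_b = set(range(7, 15))  # 8 species, 3 of them shared with unit_a
similarity = sorensen_similarity(unit_a, unit_b)
turnover = whittaker_turnover(unit_a, unit_b)

# Codominance of a community compared with an identical copy of itself,
# with all species equally abundant: the value is 1 / richness.
species_poor = {species: 10 for species in range(5)}
species_rich = {species: 10 for species in range(1000)}
codominance_poor = codominance(species_poor, species_poor)
codominance_rich = codominance(species_rich, species_rich)

print(f"Sorensen {similarity:.3f} + turnover {turnover:.3f} = {similarity + turnover:.3f}")
print(f"codominance, 5 species: {codominance_poor:.3f}; 1000 species: {codominance_rich:.4f}")
```

Even though both comparisons involve compositionally identical units, the codominance value is 0.2 for the five‐species community and 0.001 for the thousand‐species community in this sketch, which shows how strongly the index depends on diversity rather than on compositional similarity.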

A more analytical approach was taken by Koleff et al. (2003b), who made a laudable effort to evaluate how much the correlation between “beta diversity” and latitude depends on how the computations are done. They evaluated ten “beta diversity” indices, which quantify the following (in the same order as in their Table 2):

  1. regional‐to‐local diversity ratio βMt (Section 1.1, above)
  2. rate of change in regional‐to‐local diversity ratio with increasing number of sampling units ΔβMtx (Section 4.3.A)
  3. amount of absolute species turnover along an external gradient βAt(Δg) (Section 2.4)
  4. amount of Whittaker's species turnover along an external gradient βMt−1(Δg) (Section 2.4)
  5. the rate at which the sum of pairwise βMt−1 values increases as the number of sampling units in a transect increases
  6. rate of change in 1/βMt with increasing proportion of sampling units inventoried (Section 4.3.C)
  7. rate of change (with increasing number of sampling units) in how many times as many species all sampling units together contain than the most species‐rich one of them (Section 3.3)
  8. the proportion of species pairs that do not co‐occur in any sampling unit
  9. compositional non‐nestedness of a focal sampling unit (Section 3.2)
  10. distinctness in species richness of a focal sampling unit.

In addition, Koleff et al. (2003b) tested how the results are affected by different methods of defining the sampling units and the study region. Method 1 computes pairwise “beta” index values for adjacent grid cells along a latitudinal transect, and hence quantifies if the amount of “beta” between adjacent latitudes is correlated with latitude. Method 2 computes the “beta” index values between entire latitudinal bands, so any latitudinal trend that may actually exist in the “beta” index between adjacent latitudes will be confounded by the effect of varying sampling unit size. This problem was pointed out and extensively discussed by Koleff et al. (2003b). In method 3a, the entire latitudinal belt is used as the study region, and each grid cell within it as a sampling unit. These values reflect “beta” within each latitudinal belt, but latitudinal comparisons are problematic because both the extent of the study region and the number of sampling units vary among latitudes. In method 3b, “beta” is calculated pairwise for adjacent grid cells within each latitudinal belt and the average of these pairwise values is used. Such an average is less affected by the number of sampling units than a regional “beta” value, especially when pairwise comparisons are only done among adjacent neighbours. Consequently, the “beta” values obtained using method 3b are comparable across latitudes even when latitudinal belts differ in the number of sampling units, and this method is suitable when exploring if within‐latitude “beta” is correlated with latitude.
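
The contrast between methods 3a and 3b can be illustrated with a toy latitudinal belt. The sketch below is only an illustration under stated assumptions: “beta” is taken to be Whittaker's species turnover computed from presence‐absence data, grid cells are arranged in a single row so that adjacency means consecutive position, and each hypothetical cell holds two species of which one is shared with the next cell. None of the names or data come from Koleff et al. (2003b).

```python
def whittaker_turnover(units):
    """Whittaker's species turnover (gamma / mean alpha - 1) for any number of species sets."""
    gamma = len(set().union(*units))
    mean_alpha = sum(len(unit) for unit in units) / len(units)
    return gamma / mean_alpha - 1


def regional_turnover(belt):
    """Method 3a analogue: one value for all grid cells of the belt together."""
    return whittaker_turnover(belt)


def mean_adjacent_turnover(belt):
    """Method 3b analogue: average of pairwise values between neighbouring grid cells."""
    pairs = list(zip(belt, belt[1:]))
    return sum(whittaker_turnover(pair) for pair in pairs) / len(pairs)


def make_belt(n_cells):
    """Hypothetical belt in which cell i holds species i and i + 1."""
    return [{i, i + 1} for i in range(n_cells)]


short_belt = make_belt(4)
long_belt = make_belt(8)

for name, belt in (("4 cells", short_belt), ("8 cells", long_belt)):
    print(
        f"{name}: regional {regional_turnover(belt):.2f}, "
        f"mean adjacent pairwise {mean_adjacent_turnover(belt):.2f}"
    )
```

In this constructed case the compositional change between neighbouring cells is identical in both belts, and the mean adjacent pairwise value reflects that (0.5 in both), whereas the regional value rises from 1.5 to 3.5 simply because the longer belt contains more sampling units.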

Koleff et al. (2003b) obtained rather disparate results from the different methods, and concluded that we still do not know if there are latitudinal gradients in “beta diversity”. As discussed above, the different analytical approaches in effect quantified different phenomena, many of which are logically uncorrelated with each other. In hindsight, the outcome of the analyses was perhaps not very surprising; if you ask a different question, you may get a different answer. The logical next step is to explore if a more consistent pattern is found when several datasets are used to address the same question.

Conclusions

The original definition of “beta diversity” by Whittaker (1960) was already very broad, and since the coining of the concept, it has been tremendously stretched to cover the most varied phenomena. Beta diversity and different kinds of species turnover can be linked to external factors in different ways, and related phenomena can be quantified at different levels of abstraction. Most of the derived phenomena are interesting and ecologically meaningful in their own right, but the accuracy of scientific communication necessitates that conceptually different things be referred to by different terms. Therefore, care is needed not to confuse beta diversity itself with the various other phenomena that are derived from alpha and gamma diversity or otherwise related to beta diversity. Documenting and understanding patterns in beta diversity and species turnover at different spatial scales, in different areas and for different target organisms is in many ways a huge challenge. Let us avoid the additional difficulties that emerge from the inconsistent use of terminology.

Acknowledgements

I thank Robert K. Colwell for inviting me to write this review, for many useful comments on several versions of the manuscript, and for patience and support through a process that took much more time and resulted in a considerably heavier paper than either one of us had bargained for. The review has greatly benefited from discussions with Kalle Ruokolainen on beta diversity and related concepts. Helpful suggestions were also made by Mirkka Jones and four anonymous reviewers. The Academy of Finland is acknowledged for funding.
