de Bello, F. (corresponding author, firstname.lastname@example.org), Lavergne, S. (email@example.com) & Thuiller, W. (firstname.lastname@example.org): Laboratoire d'Ecologie Alpine, UMR CNRS 5553, Université Joseph Fourier, 38041, Grenoble Cedex 9, France. Meynard, C.N. (email@example.com): UMR 5554 - ISEM, Université Montpellier II, Place Eugène Bataillon, CC 065, 34095 Montpellier Cedex 5, France. Lepš, J. (firstname.lastname@example.org): Faculty of Science, University of South Bohemia and Institute of Entomology, Biology Centre of ASCR, Branišovská 31, CZ-37005 Ceské Budejovice, Czech Republic. de Bello, F.: Present address: Department of Functional Ecology, Institute of Botany, Czech Academy of Sciences, Dukelská 135, CZ-379 82 Třeboň, Czech Republic.
Co-ordinating Editor: Dr. Valério Pillar.
A methodology for partitioning of biodiversity into α, β and γ components has long been debated, resulting in different mathematical frameworks. Recently, use of the Rao quadratic entropy index has been advocated since it allows comparison of various facets of diversity (e.g. taxonomic, phylogenetic and functional) within the same mathematical framework. However, if not well implemented, the Rao index can easily yield biologically meaningless results and lead into a mathematical labyrinth. As a practical guideline for ecologists, we present a critical synthesis of diverging implementations of the index in the recent literature and a new extension of the index for measuring β-diversity. First, we detail correct computation of the index that needs to be applied in order not to obtain negative β-diversity values, which are ecologically unacceptable, and elucidate the main approaches to calculate the Rao quadratic entropy at different spatial scales. Then, we emphasize that, similar to other entropy measures, the Rao index often produces lower-than-expected β-diversity values. To solve this, we extend a correction based on equivalent numbers, as proposed by Jost (2007), to the Rao index. We further show that this correction can be applied to additive partitioning of diversity and not only its multiplicative form. These developments around the Rao index open up an exciting avenue to develop an estimator of turnover diversity across different environmental and temporal scales, allowing meaningful comparisons of partitioning across species, phylogenetic and functional diversities within the same mathematical framework. We also propose a set of R functions, based on existing developments, which perform different key computations to apply this framework in biodiversity science.
Among the different existing mathematical frameworks (Magurran 2004), Rao's quadratic entropy index (1982) can provide a general approach for partitioning biodiversity into α, β and γ components. Indeed, the Rao entropy is currently the only existing estimator of diversity that formally combines different measures of species dissimilarity (e.g. phylogenetic or functional) with relative species abundances, providing a standardized methodology applicable to compare α, β and γ components between different facets of diversity (e.g. taxonomic, phylogenetic and functional diversity; Pavoine et al. 2004; Ricotta 2005a; Hardy & Senterre 2007). Furthermore, the index provides one of the few direct measures of species redundancy within and among biological communities (de Bello et al. 2007, 2009). These unique properties of the Rao index could open new perspectives to understand mechanisms driving the turnover of diversity along environmental and temporal scales.
However, some key methodological issues regarding the spatial partitioning of diversity with the Rao index have been hotly debated in the recent literature (Ricotta 2005a, b; Hardy & Jost 2008; Villeger & Mouillot 2008; de Bello et al. 2009). Whether the index could lead to negative β values has been discussed (Hardy & Jost 2008; Villeger & Mouillot 2008), with no clear agreement yet on how Rao's index should be computed. Moreover, the index has not been able to offer a robust estimation of β-diversity (de Bello et al. 2009), giving systematically low estimates of β-diversity, even for complete species replacement between communities. To help ecologists find a way out of this mathematical labyrinth, we provide clarification and technical guidelines to derive a more realistic partition of diversity using the Rao index (i.e. producing a β-diversity that behaves as ecologists would expect). We discuss these issues with numerical examples and a case study, which demonstrate how promising a corrected version of the Rao index can be for partitioning the α, β and γ components of diversity. As the idea for the study was conceived during the IAVS meeting held in Crete (2009), we hope this study will constitute a thread to follow for ecologists, as in the Theseus myth, and revive classic ecological questions using a reliable mathematical framework.
Different calculations of Rao α, β and γ entropy
With the Rao index, within-community diversity (α) can be defined as the extent of dissimilarity between species in a community. In particular, the Rao index for α-diversity represents the expected dissimilarity between two randomly chosen individuals from a sampled community. If pic is the proportion of species i in community c (i.e. the relative abundance of the ith species in the cth sampling unit or site), s is the number of species (species richness) in the community, and dij is the dissimilarity (or “distance”) between each pair of species i and j, the Rao α-diversity can be defined as follows:
$$\alpha_{Rao} = \sum_{i=1}^{s}\sum_{j=1}^{s} d_{ij}\, p_{ic}\, p_{jc} \tag{1}$$
where pic can be calculated as the number of individuals of a species (Aic) over the sum of individuals of all species in the community, i.e. $p_{ic} = A_{ic} / \sum_{i=1}^{s} A_{ic}$, which means that pic=1/s if all species are equally abundant. More generally, α Rao is the sum of the dissimilarity between all possible pairs of species, weighted by the product of species proportions (which can be based on any measure of species abundance, e.g. biomass, cover, etc.).
There are several possible ways to calculate dij depending on the type of data and facet of biodiversity considered (Ricotta 2005a; Lepš et al. 2006; Hardy & Senterre 2007). For taxonomic diversity, where dij=1 for every i≠j and dij=0 otherwise (i.e. a unity matrix with null diagonal), α Rao equals the Simpson diversity index (Pavoine et al. 2004; Botta-Dukát 2005; Ricotta 2005a). More generally, dij can be expressed using various measures of species differences, e.g. functional or phylogenetic dissimilarity between species (e.g. Shimatani 2001; Pavoine et al. 2004). It is recommended to constrain dij to vary from 0 (species i and j are identical) to 1 (as maximum distance; Botta-Dukát 2005). This way, Simpson's index represents the potential maximum value that the Rao index can reach if species are completely different, and biological redundancy can be calculated as the difference or ratio between Simpson's index and the Rao index (de Bello et al. 2007).
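As an illustration of the definition above, the following is a minimal Python sketch (not the R function of Appendix S2) that computes α Rao for a single community and checks that, with taxonomic distances, it coincides with the Simpson index expressed as 1 minus dominance. The function names and the abundance values are hypothetical and chosen only for the example.

```python
def rao_alpha(p, d):
    """Rao quadratic entropy: sum over all i, j of d[i][j] * p[i] * p[j]."""
    s = len(p)
    return sum(d[i][j] * p[i] * p[j] for i in range(s) for j in range(s))


def taxonomic_distances(s):
    """Distance 1 between different species and 0 on the diagonal."""
    return [[0.0 if i == j else 1.0 for j in range(s)] for i in range(s)]


# Hypothetical abundances (e.g. numbers of individuals) of three species.
abundances = [10, 30, 60]
total = sum(abundances)
p = [a / total for a in abundances]

rao_value = rao_alpha(p, taxonomic_distances(len(p)))
simpson_value = 1.0 - sum(x * x for x in p)  # 1 minus dominance

print(rao_value, simpson_value)  # both are approximately 0.54
```

With any other distance matrix scaled between 0 and 1, the same function returns a value no larger than the Simpson value, which is what makes the difference between the two usable as a redundancy measure.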
Application of the Rao index to calculate diversity at different spatial scales depends on several crucial choices. Overall, to calculate γ Rao it is necessary to treat the study region as a single sampling unit by pooling local communities together. Let S be the total number of species in the region and Pi be the regional species relative abundance for species i, then:
$$\gamma_{Rao} = \sum_{i=1}^{S}\sum_{j=1}^{S} d_{ij}\, P_{i}\, P_{j} \tag{2}$$
The question is then what to consider as Pi. This issue, we believe, resulted in a long-lasting confusion in the literature. Ricotta (2005a), on the basis of Rao's work (1982), originally stated that the regional relative species abundances had to be computed from the average (weighted or not, see below) of the local relative abundances of each species (Fig. 1). This equals:
$$P_{i} = \sum_{c=1}^{n} w_{c}\, p_{ic} \tag{3}$$
with n being the number of sites (1 to n) in the region and wc being a weighting parameter for the cth sampling unit or community. Most often wc=1/n (Pavoine et al. 2004; Ricotta 2005a), which corresponds to the “unweighted” form of Pi, simply calculated as the average of pic:
$$P_{i} = \frac{1}{n}\sum_{c=1}^{n} p_{ic} \tag{4}$$
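Ricotta also showed that, whatever the value of wc, β-diversity cannot be negative as long as $\sum_{c=1}^{n} w_{c} = 1$ (see e.g. equation 3 in Ricotta 2005a) and β is expressed as:
$$\beta_{Rao} = \gamma_{Rao} - \sum_{c=1}^{n} w_{c}\, \alpha_{c} \tag{5}$$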
where Σwcαc represents the weighted contribution of α-diversities to the regional diversity γ. In this way, the same weight applied to pic when calculating Pi needs to be applied to weight αc.
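The partitioning described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names `rao` and `partition`, the requirement that every community is given over the same regional species list, and the two example sites are all hypothetical. The key point it shows is that the same weights wc are used both to build Pi and to average the local α values.

```python
def rao(p, d):
    """Rao quadratic entropy for proportions p and distance matrix d."""
    k = len(p)
    return sum(d[i][j] * p[i] * p[j] for i in range(k) for j in range(k))


def partition(communities, d, weights=None):
    """Return (mean alpha, beta, gamma) for a list of communities.

    communities: one list of relative abundances per site, all covering the
    same regional species list (zeros for absent species).
    weights: site weights w_c; defaults to 1/n (the "unweighted" form).
    """
    n = len(communities)
    if weights is None:
        weights = [1.0 / n] * n
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("site weights must sum to 1")
    n_species = len(d)
    # Regional proportions P_i as the weighted average of local proportions.
    regional = [sum(w * c[i] for w, c in zip(weights, communities))
                for i in range(n_species)]
    gamma = rao(regional, d)
    # The same weights are applied to the local alpha values.
    mean_alpha = sum(w * rao(c, d) for w, c in zip(weights, communities))
    return mean_alpha, gamma - mean_alpha, gamma


# Two hypothetical sites with two species each and no species in common,
# using taxonomic distances (1 between different species).
d = [[0.0 if i == j else 1.0 for j in range(4)] for i in range(4)]
sites = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]
mean_alpha, beta, gamma = partition(sites, d)
print(mean_alpha, beta, gamma)  # 0.5 0.25 0.75
```

Note that even with complete species replacement between the two sites, the uncorrected β is only one third of γ in this example.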
However, certain ambiguity remains regarding how to calculate regional species relative abundances. An example can be found in Ricotta (2005b, Table 1, p. 369) where the formula used is not that originally proposed by Ricotta (2005a). Along the same lines, Villeger & Mouillot (2008) showed an example of how to calculate regional species relative abundance as the sum of the number of individuals of a given species in the whole region divided by the total number of individuals of all species, i.e. $f_{i.} = \sum_{c=1}^{n} A_{ic} / \sum_{c=1}^{n}\sum_{i=1}^{S} A_{ic}$ (see example in Fig. 1; note that fi. of Villeger and Mouillot replaces Pi in equation 2). Here we show that fi. is only a special case of Pi, which occurs when wc equals the total number of individuals in a plot divided by the total number of individuals in a region (called f.k by Villeger and Mouillot). In this case, fi. corresponds exactly to the sum of wcpic (see Fig. 1). Note that fi. is equal to Pi unweighted only if the number of individuals is the same at all sites (e.g. Hardy & Senterre 2007, which in some sampling designs corresponds to having the same sampling effort in all sampling units).
When using fi. as a special case of Pi, one should not forget to apply the same value used for wc to weight all α-diversities (as in the original formula in equation 5). Otherwise β values can indeed be negative (Fig. 1). As a matter of fact, Villeger and Mouillot, to avoid negative β values when using fi., further suggested correcting (as a general rule) α-diversities by a factor f.k, which in fact corresponds exactly to wc in Fig. 1 (i.e. if expressed, as mentioned above, as the total number of individuals in a plot divided by the total number of individuals in a region). In doing this, they did nothing more than apply the original formula of weighted Pi and αc (i.e. using the same value of wc in both equations 3 and 5, rather than only in equation 3).
To summarize, the problem of negative β Rao values is due to a specific case of estimating Pi while not applying the same weighting to the mean α-diversities of a region. Hence, the correction f.k proposed by Villeger and Mouillot for wc should be applied only when fi. is used to estimate Pi and not as a general rule. Indeed, Hardy & Jost (2008) noticed that the cause of negative values obtained by Villeger & Mouillot (2008) probably resulted “from an inadequate mixing of parameter definitions and parameter estimators” and that the correction should be applied only in certain circumstances. We believe that a clear elucidation of the specific reasons of such inadequate use of the Rao index and mixing of parameters is needed in order to make the index a more practical and useful tool for ecologists.
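The origin of negative β values can be reproduced with a small numerical sketch in Python. The counts below are hypothetical (they are not the data of Fig. 1), and taxonomic distances are assumed so that the Rao index reduces to the Simpson index. The sketch contrasts three options: pooled abundances (fi.) combined with an unweighted mean α, pooled abundances combined with the matching weights wc=f.k, and the fully unweighted form.

```python
def simpson(p):
    """Rao index with taxonomic distances, i.e. 1 minus dominance."""
    return 1.0 - sum(x * x for x in p)


# Hypothetical counts of two species at two sites: a large, species-poor
# site and a small, even one.
counts = [[98, 2], [1, 1]]
site_totals = [sum(site) for site in counts]
grand_total = sum(site_totals)
n = len(counts)

local = [[a / t for a in site] for site, t in zip(counts, site_totals)]
alphas = [simpson(p) for p in local]

# Pooled regional abundances (f_i. of Villeger & Mouillot) and the site
# weights w_c = f_.k that reproduce them from the local proportions.
pooled = [sum(site[i] for site in counts) / grand_total for i in range(2)]
w = [t / grand_total for t in site_totals]
gamma_pooled = simpson(pooled)

# Inconsistent: pooled gamma but unweighted mean alpha.
beta_mixed = gamma_pooled - sum(alphas) / n
# Consistent: the same w_c used for gamma and for the mean alpha.
beta_weighted = gamma_pooled - sum(wc * a for wc, a in zip(w, alphas))
# Consistent: unweighted P_i with unweighted mean alpha.
unweighted = [sum(p[i] for p in local) / n for i in range(2)]
beta_unweighted = simpson(unweighted) - sum(alphas) / n

print(beta_mixed < 0, beta_weighted >= 0, beta_unweighted >= 0)
```

Only the inconsistent combination yields a negative β; both consistent weightings give non-negative values, although they answer slightly different questions.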
In this sense, as convincingly discussed by Hardy & Jost (2008), the weighting parameter wc in the calculation of the mean α Rao should be used in the original context proposed by Pavoine et al. (2004) and Ricotta (2005a), i.e. it should be most often equal to 1/n (thus resulting in equation 4 for “unweighted” Pi). Correcting for an uneven total number of individuals at sites is possible but should be performed only in very special contexts (Hardy & Jost 2008). As a guideline, wc=f.k should be preferred if all sub-communities within a given habitat and community type have been exhaustively sampled, with the aim of estimating within-community heterogeneity and the overall community diversity. Most often, however, field sampling units are chosen as a representative selection within a given habitat and community type, with the main question being the extent of diversity among habitats (β). Then wc=1/n should be used (i.e. “unweighted” Pi). The development and application of other possible corrections for wc is, in this sense, open to further research.
Biased β-diversity values and necessary transformations
Jost (2007) demonstrated for various species diversity indices, including the Simpson index of species diversity, that β-diversity approaches zero as α-diversity becomes larger, even if the sampling units share no similar species. Overall, this means that the β-diversity will be low regardless of the actual species overlap and the change in diversity across sampling units (Jost 2007; de Bello et al. 2009). Therefore β-diversity estimated using Simpson's formulation could lead to meaningless ecological results (Ricotta & Szeidl 2009; Jost et al. 2010). This was shown to also be the case for indices commonly used in population genetics (Jost 2008). This limitation of the Simpson index in partitioning the spatial components of taxonomic diversity can be resolved by applying the correction proposed by Jost (2007) derived from equivalent numbers:
$$\alpha_{Eqv} = \frac{1}{1 - \alpha} \tag{6}$$
$$\gamma_{Eqv} = \frac{1}{1 - \gamma} \tag{7}$$
According to Jost (2007), the β-diversity in a region in terms of equivalent numbers can then be expressed as:
$$\beta_{Eqv} = \frac{\gamma_{Eqv}}{\alpha_{Eqv}} \tag{8}$$
Here βEqv represents the number of communities that have no diversity overlap (i.e. having no species in common). Therefore, the lower limit of the index is 1 (all communities have the same composition) and the upper limit is the number of sampling units (if communities share no species). If we define a more intuitive β, called βprop, which represents the proportion of diversity accounted for by the differentiation between communities (or sampling units) in a given region, we can rewrite equation 8 as follows:
$$\beta_{Eqv} = \frac{\gamma_{Eqv}}{\alpha_{Eqv}} = \frac{1}{1 - \beta_{prop}}$$
so that
$$\beta_{prop} = 1 - \frac{1}{\beta_{Eqv}} \tag{10}$$
Here we warn ecologists that, in this notation, the α-diversity used to calculate αEqv (equation 6), βEqv (equation 8) and βprop (equation 10) should equate to the average α in a region. This approach should be preferred to using the averages of αEqv calculated from single sampling unit α's. The two approaches produce slightly different results (see Appendix S1, second section, for details) but, more importantly, equations 8 and 10 are only valid with the first approach (e.g. αEqv calculated applying the Jost correction on the mean regional α).
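The following Python sketch illustrates the equivalent-number transformation and the warning above, assuming the diversity values are expressed in the 1 minus dominance form and lie between 0 and 1. The function name `equivalent_number` and all numerical values are hypothetical. The first part converts a mean α and a γ into equivalent numbers and derives the multiplicative β. The second part shows that transforming the mean α differs from averaging the transformed single-site α values.

```python
def equivalent_number(diversity):
    """Convert a Simpson-type diversity (1 minus dominance) to 1 / (1 - x)."""
    if not 0.0 <= diversity < 1.0:
        raise ValueError("diversity must lie in [0, 1)")
    return 1.0 / (1.0 - diversity)


# Two hypothetical sites with two equally abundant species each and no
# species in common: mean alpha = 0.5 and gamma = 0.75.
alpha_mean = 0.5
gamma = 0.75
alpha_eqv = equivalent_number(alpha_mean)   # 2.0 equivalent species per site
gamma_eqv = equivalent_number(gamma)        # 4.0 equivalent species overall
beta_eqv = gamma_eqv / alpha_eqv            # 2.0 distinct communities
beta_prop = 1.0 - 1.0 / beta_eqv            # 0.5
print(alpha_eqv, gamma_eqv, beta_eqv, beta_prop)

# Order of operations: transform the regional mean alpha (the approach
# recommended in the text) versus averaging single-site equivalents.
site_alphas = [0.5, 0.75]
from_mean_alpha = equivalent_number(sum(site_alphas) / len(site_alphas))
mean_of_equivalents = sum(equivalent_number(a) for a in site_alphas) / len(site_alphas)
print(from_mean_alpha, mean_of_equivalents)  # about 2.67 versus 3.0
```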
The logic of the original correction of Jost (2006, 2007) is based on the concept of “equivalent communities”. If α and β are to be independent of each other, so that one is not constrained in any way by the other, then this is the unique correct partitioning (Jost et al. 2010). This corresponds to calculating diversity for the case of s equally common species in a sampling unit (each species therefore with a proportion of 1/s), with a resulting α-diversity expression that should equal the actual number of species in a community, i.e. species richness (Jost 2006). It should be noted that the correction proposed by Jost (2006, 2007) corresponds to the original formulation of the Simpson diversity, i.e. expressed as 1/dominance instead of the other possible notation as 1 minus dominance (with dominance, $D = \sum_{i=1}^{s} p_{i}^{2}$; Magurran 2004). This notation of 1/D also corresponds to the index of diversity N2 proposed by Hill (1973). Given the particular way to derive equivalent communities, the Jost-corrected Simpson diversity equals the number of species if all species have the same relative abundance in a sampling unit (pi) or in a region (Pi; Fig. 2), which is intuitive and biologically interpretable (i.e. the maximum value of the reciprocal of the Simpson index corresponds to the number of species in a community; Hill 1973).
Although Jost's correction was originally proposed only for multiplicative partitioning of diversity (i.e. β=γ/α), we show here that the correction could be equivalently applied to additive partitioning (see below and Fig. 2). By resolving equation 10, βprop can actually be expressed as a percentage of the diversity of a whole region:
$$\beta_{prop} = \frac{\gamma_{Eqv} - \alpha_{Eqv}}{\gamma_{Eqv}}$$
This last formula clearly shows that the difference between γ and α in terms of equivalent numbers is a meaningful measure of β-diversity. If αEqv expresses the average number of equivalent species at the sample scale, and γEqv the number of equivalent species at the regional scale, their difference expresses how many equivalent species are found across sampling units. We call this the “β-equivalent-additive”, i.e. βEqvAdd, with:
$$\beta_{EqvAdd} = \gamma_{Eqv} - \alpha_{Eqv}$$
With this extension of Jost's correction, β-diversity can be expressed as a proportion of the total regional diversity, which can be very useful when comparing different facets of diversity together (e.g. taxonomic, functional, phylogenetic).
Indeed, equation 12 has an upper limit that depends on the number of sampling units (it equals 1−1/n if the n samples are all completely distinct) and produces a βprop value always lower than 1. For example, maximum differentiation between two plots results in βprop=0.5 (or, which is the same, 50%; Fig. 2 and Jost 2007). Thus, it should be used carefully to compare results from data sets with very different n. In this case, we propose to normalize this equation to the interval [0, 1]. For example:
$$\beta_{Norm-prop} = \frac{\beta_{prop}}{1 - 1/n}$$
It should be noted, however, that for sufficiently large data sets, 1/n will tend to zero so that βNorm−prop will not be significantly higher than βprop.
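The additive, proportional and normalized forms of β discussed above can be computed together, as in the following Python sketch. It assumes that αEqv and γEqv have already been obtained from the mean α and from γ; the function name `beta_summaries` and the input values are hypothetical. The two calls correspond to two and three completely distinct sites, each holding two equally abundant species.

```python
def beta_summaries(alpha_eqv, gamma_eqv, n_sites):
    """Return (beta_EqvAdd, beta_prop, normalized beta_prop)."""
    if n_sites < 2:
        raise ValueError("at least two sampling units are needed")
    beta_add = gamma_eqv - alpha_eqv          # equivalent species gained across sites
    beta_prop = beta_add / gamma_eqv          # proportion of regional diversity
    beta_norm = beta_prop / (1.0 - 1.0 / n_sites)  # rescaled to [0, 1]
    return beta_add, beta_prop, beta_norm


# Two completely distinct sites: alpha_Eqv = 2, gamma_Eqv = 4.
two_sites = beta_summaries(2.0, 4.0, 2)
# Three completely distinct sites: alpha_Eqv = 2, gamma_Eqv = 6.
three_sites = beta_summaries(2.0, 6.0, 3)
print(two_sites)
print(three_sites)
```

In both cases βprop reaches its ceiling of 1−1/n (0.5 and about 0.67, respectively), while the normalized value is approximately 1, which is what makes data sets with different n comparable.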
As a next step, we show that these extensions of the Jost correction can be similarly applied for the Rao index, whenever it is used for functional or phylogenetic diversity. In fact, since the Rao decomposition of diversity is a generalization of the Simpson index (see above), we can imagine that β-diversity expressed by the Rao index does not behave as ecologists would expect (de Bello et al. 2009). This is shown in Fig. 2, where the β Rao index does not adequately depict a complete change in functional diversity across sites (in Case 1, with no functionally similar species shared by two “equivalent communities” with similar α-diversity, β-diversity should equal 50% of γ-diversity). This underestimation of β-diversity could have devastating consequences on the interpretation of ecological patterns (Jost 2007; de Bello et al. 2009; Ricotta & Szeidl 2009).
Based on the concept of equivalent numbers, the example in Fig. 2 (and its extension in Appendix S1) also shows that the same correction proposed by Jost for the Simpson index can be applied to the Rao index. This is not surprising since the Simpson index is only a specific case of the Rao index (Ricotta & Szeidl 2009). The α and γ Jost-corrected indices should equal the number of species (i.e. species richness) in the case of equivalent communities. This is true in Case 1 in Fig. 2, with two equivalent communities sharing no functionally similar species, which shows that the Jost-corrected Rao index correctly returns the number of distinct species within local communities (αEqv), within an entire region (γEqv) and between communities (βEqvAdd). In addition, as convincingly demonstrated by Jost (2006), this correction does not produce negative β values. Since γ Rao is always greater than, or equal to, the average α Rao (equation 5; Ricotta 2005a), the Jost-corrected α Rao will always be lower than, or equal to, the corrected γ Rao. In this sense, it is important to emphasize that the distance matrix used to calculate this index must be based on ultrametric distances (i.e. dij≤max(dik, dkj), Pavoine et al. 2004; Ricotta 2005a) and that dij should vary from 0 to 1. If dij is not scaled between 0 and 1, the results of the partitioning of diversity will be exactly the same, but the functional or phylogenetic diversity cannot be related, in absolute values, to taxonomic diversity.
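To make the effect of the correction on the Rao index concrete, the following Python sketch uses a hypothetical example (not the data of Fig. 2): two sites with four equally abundant species each, no species in common, a functional distance of 0.8 between species of the same site and 1 between species of different sites. This distance matrix is ultrametric and scaled between 0 and 1. All names and values are assumptions made for the illustration.

```python
def rao(p, d):
    """Rao quadratic entropy for proportions p and distance matrix d."""
    k = len(p)
    return sum(d[i][j] * p[i] * p[j] for i in range(k) for j in range(k))


n_species = 8
# Species 0-3 occur at site A, species 4-7 at site B (hypothetical).
d = [[0.0 if i == j else (0.8 if i // 4 == j // 4 else 1.0)
      for j in range(n_species)] for i in range(n_species)]
site_a = [0.25] * 4 + [0.0] * 4
site_b = [0.0] * 4 + [0.25] * 4
regional = [(a + b) / 2 for a, b in zip(site_a, site_b)]  # w_c = 1/n

alpha_mean = (rao(site_a, d) + rao(site_b, d)) / 2   # about 0.6
gamma = rao(regional, d)                             # about 0.8

# Uncorrected beta as a proportion of gamma.
beta_prop_uncorrected = (gamma - alpha_mean) / gamma

# Jost-type correction applied to the mean alpha and to gamma.
alpha_eqv = 1.0 / (1.0 - alpha_mean)
gamma_eqv = 1.0 / (1.0 - gamma)
beta_prop_corrected = (gamma_eqv - alpha_eqv) / gamma_eqv

print(round(beta_prop_uncorrected, 3), round(beta_prop_corrected, 3))  # 0.25 0.5
```

Without the correction, β accounts for only about 25% of γ although the two sites share no functionally similar species; with the correction it reaches the 50% expected for two completely distinct communities.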
To show the importance of the Jost correction for the Rao index with real data, we compared diversity partitioning for taxonomic, functional and phylogenetic diversity obtained with and without the Jost-derived corrections. A case study from the Guisane Valley, in the French Alps, was used. The data set comprises 82 plant communities (of 10 m × 10 m in size with visual estimates of species composition and cover) sampled along multiple environmental gradients (altitude, soil characteristics, slope). For the 212 species in the data set, we used a phylogenetic supertree derived from the phylomatic web tool (Webb & Donoghue 2005). For functional diversity, we calculated species dissimilarity in terms of two plant traits: Specific Leaf Area (SLA), a quantitative trait (i.e. leaf area divided by the leaf dry weight), and Raunkiær's classical life form, a categorical trait. To calculate species dissimilarities with these two traits we applied the Gower distance, a standardized approach proposed by Botta-Dukát (2005) as appropriate for the computation of the Rao index.
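For readers unfamiliar with the Gower distance, the following Python sketch shows how a dissimilarity scaled between 0 and 1 can be obtained from one quantitative and one categorical trait. It is a simplified illustration under stated assumptions: the two traits receive equal weight, there are no missing values, and the SLA values and life forms are hypothetical rather than taken from the Guisane Valley data set. The function name is also hypothetical.

```python
def gower_distance_matrix(sla, life_form):
    """Gower dissimilarity from one quantitative and one categorical trait.

    Quantitative part: absolute difference divided by the trait range.
    Categorical part: 0 if the categories match, 1 otherwise.
    The two parts are averaged with equal weight (an assumption).
    """
    n = len(sla)
    sla_range = max(sla) - min(sla)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d_sla = abs(sla[i] - sla[j]) / sla_range if sla_range > 0 else 0.0
            d_form = 0.0 if life_form[i] == life_form[j] else 1.0
            d[i][j] = (d_sla + d_form) / 2.0
    return d


# Hypothetical trait values for three species.
sla = [10.0, 20.0, 30.0]
life_form = ["hemicryptophyte", "hemicryptophyte", "chamaephyte"]
d = gower_distance_matrix(sla, life_form)
print(d[0][1], d[0][2], d[1][2])  # 0.25 1.0 0.75
```

The resulting matrix has a null diagonal and values bounded by 1, so it can be passed directly to a Rao computation alongside the taxonomic distance matrix.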
The results of this case study show various important patterns illustrating the necessity of using the Jost correction for the partitioning of diversity. First, as shown by Jost (2007), the β taxonomic diversity (TD) is clearly strongly underestimated without the correction (4% TD-NonJost versus 80% for TD-Jost; Fig. 3). Second, without the Jost correction the β functional and phylogenetic diversity (FD- and PD-NonJost, respectively) can be higher than the taxonomic diversity (the values of FD- and PD-NonJost were sometimes higher than TD-NonJost; Fig. 3). This is a biologically nonsensical result, as β taxonomic diversity should represent the upper limit of functional and phylogenetic β-diversity (i.e. in the case that all species are different). This clearly indicates that comparing TD, FD and PD without the Jost correction can lead to biologically meaningless interpretations. Third, the FD- and PD-Jost indices showed higher β-diversity than the FD- and PD-NonJost indices (P<0.001, Fig. 3), indicating that the Jost correction reduces the risks of a possible underestimation of β-diversity. It should be noted that, potentially, the Jost-corrected β Rao could be even lower than the “uncorrected” β (see Appendix S1). This is, however, a particular case that occurs, for example, between those sampling units where the FD- and PD-NonJost are higher than the TD-NonJost (see above).
At the same time, it should be noted that the Jost-corrected and non-corrected indices are, logically, strongly correlated (the Pearson correlations between Jost versus NonJost indices were 0.78 for TD, 0.97 for PD and 0.96 for FD). Therefore, it is not surprising that using Jost versus NonJost indices does not markedly alter the relationship between β-diversity and environment (i.e. diversity turnover; Vellend 2001). For example, in a Mantel test between β-diversity (in pairs of communities) and environmental dissimilarity, we found a similar effect of environment on β-diversity with or without the Jost correction (for TD, FD and PD; not shown). Similarly, when using null models to compare the observed versus expected partitioning of diversity (see e.g. de Bello et al. 2009), we found no markedly different results comparing Jost versus NonJost indices (not shown). Therefore, the conclusions regarding possible community assembly mechanisms should remain rather unchanged with or without the Jost correction against random expectations.
Overall, these results highlight the idea that the Jost correction is especially important when comparing taxonomic diversity against functional and phylogenetic diversities. Such corrections may, therefore, prove to be fundamental if we intend to jointly use and compare biodiversity indices to understand mechanisms driving the assembly and functioning of natural communities across space and time (Ackerly & Cornwell 2007; Prinzing et al. 2008).
The examples shown here (Figs 1 and 2) are intended to offer a simple guideline to ecologists aimed at comparing spatial partitioning of different facets of diversity with the same mathematical framework. When applying the Rao index framework, however, ecologists should be aware of different crucial choices that need to be made to correctly compute the index. These are synthesized here, as a guideline:
• Different views may lead to calculating Pi, relative abundance of a species in the study region, by modifying the parameter wc within the general formula expressed in equation 3 (Fig. 1). In general, wc should be equal to 1/n (thus resulting in “unweighted” Pi; equation 4), especially if sampling is not equally exhaustive in different habitats.
• Corrections for uneven numbers of individuals in sampling units (i.e. with wc=f.k, i.e. the total number of individuals in a plot divided by the total number of individuals in a region) should be used only together with the corresponding weighting of average α-diversities, i.e. in order to avoid potentially negative β values (equation 5; Fig. 1).
• Regardless of how Pi is calculated, the Rao index can produce systematically lower-than-expected β-diversity values (Figs 2 and 3).
• This may be solved by applying simple corrections based on equivalent numbers, as proposed by Jost (2007) for the Simpson species diversity index. This expansion of the correction will certainly provide a helpful tool to compare spatial partitioning of taxonomic, phylogenetic and functional diversities.
The examples presented are further complemented by a function that applies these calculations within the statistical package R (Appendix S2). This new R function, called “Rao”, can be a useful tool for ecologists in partitioning diversity with the Rao index. The function further returns α, β and γ, both with and without the correction proposed here and by Jost (2006, 2007), for the Simpson and Rao indices of diversity.
Acknowledgements. We are grateful for the constructive inputs of Zoltán Botta-Dukát, Norman Mason and Valerio Pillar, which considerably improved this study. We also thank Philippe Choler for the initial discussions that stimulated this work and David Mouillot for providing insightful comments on an earlier draft of the manuscript. Marco Moretti and Guillaume Lentendu kindly helped to improve the clarity of the manuscript. Jan Jongepier improved the English language. The Station Alpine Joseph Fourier provided support during field sampling. FdB wishes to personally thank Lou Jost and Carlo Ricotta for their sincere and thoughtful suggestions on the final manuscript version. This research was funded through the IFB-ANR DIVERSITALP project (ANR 2008-2011, contract No. ANR 07 BDIV 014), the EU funded EcoChange project (FP6 European Integrated project 2007–2011, contract No 066866 GOCE), the CNRS APIC RT PICs project 4876 and the project LC06073 (Czech Ministry of Education).