As proposed by Lepš et al. (2006), the ‘total community variance’ can be decomposed into the sum of ‘between-species variance’ and ‘within-species variance’. The equation below (eqn 1) formalizes this approach for the first time. Consider a community of Nsp species (Nsp being the species richness, i.e. the number of species), indexed by i. Within each species i, quantitative trait values (xai) have been measured for several individuals ai, for a total of Nindi individuals per species i (the total number of individuals sampled across all species being Nind). The left-hand side of eqn 1 represents the total community trait variance, while the right-hand side corresponds to the between-species variance and the within-species variance, respectively (from left to right).
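Eqn 1 itself is not reproduced in this text. A form consistent with the definitions above, offered as a reconstruction rather than the original typesetting, is given below. It assumes that xi denotes the mean trait value of species i and that xcom denotes the unweighted mean of the species means.

```latex
\frac{1}{N_{sp}} \sum_{i=1}^{N_{sp}} \frac{1}{N_{ind_i}} \sum_{a=1}^{N_{ind_i}} \left( x_{a_i} - x_{com} \right)^2
=
\frac{1}{N_{sp}} \sum_{i=1}^{N_{sp}} \left( x_i - x_{com} \right)^2
+
\frac{1}{N_{sp}} \sum_{i=1}^{N_{sp}} \frac{1}{N_{ind_i}} \sum_{a=1}^{N_{ind_i}} \left( x_{a_i} - x_i \right)^2,
\qquad
x_i = \frac{1}{N_{ind_i}} \sum_{a=1}^{N_{ind_i}} x_{a_i},
\quad
x_{com} = \frac{1}{N_{sp}} \sum_{i=1}^{N_{sp}} x_i
```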
This formulation implies that each species contributes identically to the variance decomposition (i.e. differences in sampling effort across species are ignored), as all terms are weighted by 1/Nsp. If one instead considers that the number of individuals sampled reflects the abundance of species in the community (i.e. differences in sampling effort across species do count), then the parameter 1/Nsp should be replaced by Nindi/Nind. It should be noted that, with this replacement, the total and within-species sums of squares are in effect divided by the total number of individuals (see below for the equivalence with PERMANOVA, Anderson 2001, 2005, and Appendix S1 for a practical example). However, as discussed in the last section of this work (‘The selection of individuals’), very often the individuals sampled for trait measurements represent only part of the population of a given species in a community. Note also that xcom, whose formulation likewise depends on how the relative abundance of species is considered, corresponds to the community trait mean, as applied e.g. by Garnier et al. (2007) and Lavorel et al. (2008).
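The two weighting choices can be compared numerically. The following is a minimal sketch in Python, not the R code of Appendix S1; the species names, trait values and the helper `decompose` are hypothetical and serve only to illustrate that total variance equals between-species plus within-species variance under either weighting.

```python
# Hypothetical toy data: trait values per species (illustrative only).
traits = {
    "sp_a": [1.0, 2.0, 3.0],
    "sp_b": [4.0, 6.0],
    "sp_c": [10.0, 11.0, 12.0, 13.0],
}


def mean(values):
    return sum(values) / len(values)


def decompose(traits, weights):
    """Return (total, between, within) variance for species weights summing to 1."""
    sp_means = {sp: mean(xs) for sp, xs in traits.items()}
    # Community mean: species means weighted by the chosen species weights.
    x_com = sum(weights[sp] * sp_means[sp] for sp in traits)
    total = sum(
        weights[sp] * mean([(x - x_com) ** 2 for x in xs])
        for sp, xs in traits.items()
    )
    between = sum(weights[sp] * (sp_means[sp] - x_com) ** 2 for sp in traits)
    within = sum(
        weights[sp] * mean([(x - sp_means[sp]) ** 2 for x in xs])
        for sp, xs in traits.items()
    )
    return total, between, within


n_sp = len(traits)
n_ind = sum(len(xs) for xs in traits.values())
equal_w = {sp: 1 / n_sp for sp in traits}                       # 1/Nsp
sampled_w = {sp: len(xs) / n_ind for sp, xs in traits.items()}  # Nind_i/Nind

for label, w in (("1/Nsp", equal_w), ("Nind_i/Nind", sampled_w)):
    total, between, within = decompose(traits, w)
    print(label, round(total, 4), round(between, 4), round(within, 4))
```

With the weights Nindi/Nind, the total returned by this sketch coincides with the variance of all individuals pooled together (sum of squares divided by Nind), in line with the remark above.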
Although this kind of variance partitioning should be rather intuitive for most ecologists, the equation could be written in more general terms. Indeed, any form of variance could be expressed in terms of the mean dissimilarity between pairs of observations. For example, as demonstrated by Champely & Chessel (2002), for the dissimilarity between pairs of species:
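Eqn 2 is not reproduced in this text. Written with the symbols defined immediately below, and assuming that xi is the mean trait value of species i and xcom the abundance-weighted community mean, a form consistent with Champely & Chessel (2002) would be:

```latex
\sum_{i=1}^{N_{sp}} p_i \left( x_i - x_{com} \right)^2
=
\sum_{i=1}^{N_{sp}} \sum_{j=1}^{N_{sp}} p_i \, p_j \, \frac{d_{ij}^2}{2},
\qquad
x_{com} = \sum_{i=1}^{N_{sp}} p_i \, x_i
```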
where dij = ||xi − xj|| expresses the trait dissimilarity between each pair of species i and j, i.e. the Euclidean distance between their trait values, and pi expresses the relative abundance of species i in the community (with Σi pi = 1). Generally, if all observations have the same abundance, then pi = 1/Nsp; otherwise pi can express different measures of species relative abundance (based on species cover, biomass, number of individuals etc.; see Lavorel et al. 2008). As a matter of fact, the right-hand term of eqn 2 corresponds to the Rao quadratic entropy index of diversity for the case in which the Euclidean distance between observations is squared and divided by two (Rao 1982, 2010; Pavoine & Dolédec 2005). This important equivalence between variance and diversity in terms of quadratic entropy has several interesting implications.
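The equivalence between abundance-weighted variance and the Rao index computed on half squared Euclidean distances can be checked on a small example. The sketch below uses hypothetical species means and relative abundances; it is an illustration of the identity, not the authors' implementation.

```python
# Hypothetical species mean trait values and relative abundances (illustrative only).
species_means = [2.0, 5.0, 11.5]
p = [0.5, 0.3, 0.2]  # relative abundances, assumed to sum to 1

# Abundance-weighted variance of the species means around the community mean.
x_com = sum(pi * xi for pi, xi in zip(p, species_means))
variance = sum(pi * (xi - x_com) ** 2 for pi, xi in zip(p, species_means))

# Rao quadratic entropy with dissimilarity d_ij^2 / 2, where d_ij = |x_i - x_j|.
n = len(species_means)
rao = sum(
    p[i] * p[j] * (species_means[i] - species_means[j]) ** 2 / 2
    for i in range(n)
    for j in range(n)
)

print(variance, rao)  # the two quantities agree up to floating-point error
```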
First, the partitioning of variance expressed earlier (eqn 1) can be articulated as the partitioning of quadratic diversity with the Rao index (eqn 3 below). The partitioning of diversity with the Rao index is often used to decompose total regional diversity into between-community and within-community components (Champely & Chessel 2002; de Bello et al. 2010). Here, building on Pavoine & Dolédec (2005), we propose to use the partitioning of the quadratic entropy to decompose total community diversity into between-species and within-species components:
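Eqn 3 is not reproduced in this text. A reconstruction consistent with eqns 1 and 2 and with the verbal description that follows is sketched below. The subscript notation for the dissimilarity between individual a of species i and individual b of species j is an assumption, and the equality is exact when d is the Euclidean distance on a quantitative trait.

```latex
\sum_{i=1}^{N_{sp}} \sum_{j=1}^{N_{sp}} \sum_{a=1}^{N_{ind_i}} \sum_{b=1}^{N_{ind_j}}
\frac{p_i}{N_{ind_i}} \, \frac{p_j}{N_{ind_j}} \, \frac{d_{a_i b_j}^2}{2}
=
\sum_{i=1}^{N_{sp}} \sum_{j=1}^{N_{sp}} p_i \, p_j \, \frac{d_{ij}^2}{2}
+
\sum_{i=1}^{N_{sp}} p_i \sum_{a=1}^{N_{ind_i}} \sum_{b=1}^{N_{ind_i}}
\frac{1}{N_{ind_i}^2} \, \frac{d_{a_i b_i}^2}{2}
```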
As in eqn 1, the total diversity is represented by the left-hand side of eqn 3, while the right-hand side corresponds to between-species diversity (defined in eqn 2) and within-species diversity. The total diversity is represented by the average dissimilarity between each pair of individuals (a and b), weighted by the relative abundance of the species they belong to. The within-species diversity, for each species, is represented by the average dissimilarity between each pair of individuals (a and b) within that species. The contribution of the within-species diversities to the total diversity is also weighted by the relative abundance of the species they belong to. This partitioning of diversity corresponds exactly to the decomposition of the Rao index into within- and among-samples diversity (see details in Pavoine & Dolédec 2005; Ricotta 2005; de Bello et al. 2010). The only difference is that the ‘samples’ in the analyses are not represented by a species × communities matrix but by an individuals × species matrix (so that species are formally considered as ‘samples’ in the analyses).
Although this formulation might be less intuitive than variance, it has several advantages. For example, when pi = 1/Nsp, eqn 3 corresponds to the ‘unweighted’ decomposition of diversity (which is equivalent to eqn 1), i.e. with the within-sample diversity averaged over the number of samples (Ricotta 2005; note again that ‘samples’ are represented by species in eqn 1). However, this established formulation also allows for pi ≠ 1/Nsp, which corresponds to the ‘weighted’ decomposition of diversity (Rao 1982; Ricotta 2005; de Bello et al. 2010) and which in our case implies that the diversity within each species is weighted by a factor indicating the contribution of that species to the overall diversity. This makes it possible to include a parameter that accounts for the different abundances of species in the field, and not only for how intensively each species has been sampled for trait measurements. Also, as briefly mentioned above, when pi = Nindi/Nind the approach corresponds to the decomposition of the mean sum of squares as in AMOVA and PERMANOVA (see the worked example in Appendix S1), which is also equivalent to the weighted diversity decomposition approach proposed by Villeger & Mouillot (2008). It should be remembered that the total diversity also includes the parameter pi in its calculation. Overlooking this parameter in the calculation of total diversity, or in the contribution of within-species diversity, might lead to negative between-samples diversity (see de Bello et al. 2010). For the correspondence between the variance and the Rao partitioning, and the application with different pi values, see the worked example in Appendix S1. Different R functions decomposing total community diversity into within- and between-species diversity are also available in Appendix S1 (based on the functions in de Bello et al. 2010). The function ‘RaoRel.r’ can be used for the equivalence between variance and Rao, either considering species abundances in the field or not, i.e. using the parameter ‘weight = T’ or ‘weight = F’, respectively; the function ‘RaoAdo.r’ can be used for the equivalence with PERMANOVA, where the proper weight, i.e. pi = Nindi/Nind, is applied by the option ‘weight = T’ (see more details in the example).
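The weighted partitioning described above can be sketched for arbitrary species weights pi. The Python function below is a hypothetical illustration and is not a port of ‘RaoRel.r’ or ‘RaoAdo.r’; the name `rao_partition`, the toy data and the field abundances are assumptions. Between-species diversity is obtained here as the difference between total and within-species diversity.

```python
def rao_partition(groups, p, dist):
    """Partition Rao diversity (on d^2 / 2) into between- and within-species parts.

    groups: {species: [individual trait records]}
    p:      {species: relative abundance}, assumed to sum to 1
    dist:   function returning the dissimilarity between two individuals
    Returns (total, between, within), with between = total - within.
    """
    def mean_half_sq(xs, ys):
        # Average of d^2 / 2 over all ordered pairs (a in xs, b in ys),
        # self-pairs included, as in the Rao index.
        return sum(dist(a, b) ** 2 / 2 for a in xs for b in ys) / (len(xs) * len(ys))

    species = list(groups)
    total = sum(
        p[i] * p[j] * mean_half_sq(groups[i], groups[j])
        for i in species
        for j in species
    )
    within = sum(p[i] * mean_half_sq(groups[i], groups[i]) for i in species)
    return total, total - within, within


# Hypothetical toy data and field abundances (illustrative only).
groups = {
    "sp_a": [1.0, 2.0, 3.0],
    "sp_b": [4.0, 6.0],
    "sp_c": [10.0, 11.0, 12.0, 13.0],
}
p_field = {"sp_a": 0.6, "sp_b": 0.3, "sp_c": 0.1}

total, between, within = rao_partition(groups, p_field, lambda a, b: abs(a - b))
print(total, between, within)
```

Because the same pi enter both the total and the within-species term, the between-species component of this sketch stays non-negative for Euclidean distances; using different weights in the two terms is the situation the text warns against.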
An interesting consequence of the analogy between variance and the Rao index is that this approach is already applied, implicitly or explicitly, in different existing algorithms. Therefore, the method can integrate existing approaches. In particular, with our approach, similarly to PERMANOVA (Anderson 2005; whose algorithm, as mentioned, can lead to eqn 3), it is possible to test the effect of within- and among-group diversity against null expectations (through permutations of the dissimilarity between observations). Note also that the PERMANOVA analysis of variance based on a distance matrix, with the corresponding randomization procedures, is actually equivalent to the one described by Pillar & Orloci (1996). Our approach nevertheless allows a more flexible use of the relative abundance of species than PERMANOVA, which is fundamental for the question being asked. A similar principle is also used in the test of homogeneity of dispersion from the group centroids in PERMDISP (Permutational analysis of multivariate dispersion; Anderson 2006; available at http://www.stat.auckland.ac.nz/~mja/Programs.htm or in R, as the function ‘betadisper’ in the package ‘vegan’). In our case, the species mean trait values (xi) represent the centroids for each species, and the dispersion of traits around this centroid (as calculated with PERMDISP) represents within-species diversity. Hence, when applying the PERMDISP algorithms to our case, the test indicates whether the extent of the within-species diversity changes across species (in essence, one calculates an F-statistic to compare the average distance of observation units to their group centroid). It should be noted that, to the best of our knowledge, the existing PERMDISP algorithms do not at present allow users to consider different species relative abundances (pi) in the calculation (see the help of the fdisp function, package ‘FD’, in R).
Another interesting result of the equivalence between variance and diversity in terms of quadratic entropy is that different measures of trait dissimilarity can be used in this equation, not only those for quantitative traits. This implies that the approach can be applied to categorical, fuzzy, circular traits etc. Also, the trait dissimilarity can be computed based on several traits together, even traits of different types. For example, Botta-Dukat (2005) and Pavoine et al. (2009) proposed a standardized approach to compute trait dissimilarity based on multiple traits as appropriate for the computation of the Rao index. In these established approaches, each quantitative trait is standardized by dividing the trait value by the range of possible values for this trait (which corresponds to the Gower distance based on one trait; see Botta-Dukat 2005 and Pavoine et al. 2009 also for standardizations based on other types of traits). After this standardization, the Euclidean distance between pairs of individuals, calculated for each single trait, ranges between 0 and 1. When several traits are used, the Euclidean distance calculated across traits is no longer bounded by the upper limit of one. We applied this approach in the first case study.
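The effect of range standardization on single-trait and multi-trait distances can be illustrated as follows. This is a minimal sketch with hypothetical trait values; it standardizes by the observed range of the sample, which is an assumption, since the text refers to the range of possible values for the trait.

```python
import math


def range_standardize(values):
    """Divide each value by the observed range (max - min) of the trait."""
    span = max(values) - min(values)
    return [v / span for v in values]


# Hypothetical measurements for four individuals (illustrative only).
height = [5.0, 20.0, 35.0, 80.0]     # cm
ldmc = [150.0, 220.0, 400.0, 310.0]  # mg g-1

h_std = range_standardize(height)
l_std = range_standardize(ldmc)
n = len(height)

# Single-trait Euclidean distance: bounded between 0 and 1 after standardization.
max_single = max(abs(h_std[a] - h_std[b]) for a in range(n) for b in range(n))

# Two-trait Euclidean distance: can exceed 1 (its upper limit is sqrt(2) here).
max_multi = max(
    math.sqrt((h_std[a] - h_std[b]) ** 2 + (l_std[a] - l_std[b]) ** 2)
    for a in range(n)
    for b in range(n)
)

print(max_single, max_multi)
```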
Several alternatives exist to calculate dissimilarity between observations based on multiple traits, whose description is outside the scope of the present work. Dividing by the range of trait values can be unsafe when there are several outliers, which should either be removed manually or handled by standardizing by quantiles instead of the trait range (the two solutions lead to roughly the same results, the second involving a somewhat arbitrary choice of the quantiles applied). Ideally, the range of trait values considered for this standardization should reflect the whole range existing in nature, but this information is not available for most traits (see Botta-Dukat 2005 for further discussion). Other approaches based on multivariate analyses, for example principal coordinate analysis on trait matrices (Villeger, Mason, & Mouillot 2008; Laliberté & Legendre 2010), could also be applied. Although a consensus on calculating trait dissimilarity based on multiple traits is far from being achieved in the literature, we believe that the decomposition of community diversity into within- and between-species components should first be analysed based on single traits and only then, very cautiously, compared across traits (see case study below).
Case study 1
Field site and measurements
This first case study involves FD calculations for two meadows in the Czech Republic (see Lepš 2004; Klimeš et al. 2001 for details): one ‘dry’, characterized by lower soil moisture (Čertoryje, south Moravia), and one ‘wet’, characterized by higher soil moisture (Ohrazení, South Bohemia). In both meadows, individuals were selected randomly for trait measurement. As discussed in the following paragraphs (see ‘The selection of individuals’), random selection of individuals guarantees that the variability of our sample correctly reflects the variability of the sampled population, and thus also the potential effect of intraspecific trait variability on community structure.
A total of 22 species were sampled in the dry meadow and 18 species in the wet meadow by randomly selecting 15–25 individuals per species at the beginning of the growing season (for a total of 863 individuals). These species represent on average >80% of the total biomass in both meadows (see Garnier et al. 2007). Species relative abundance used to compute the indices was based on average species frequency in 50 × 50 cm quadrats, each divided into 25 subquadrats of 10 × 10 cm, randomly placed in these mown meadows. We measured vegetative height, as the distance between the top of the photosynthetic tissues of each individual and the soil surface, at the end of the growing season (in total, we measured 785 individuals). We expected variance in plant height to be linearly dependent on the mean, i.e. with the higher variance of larger species arising not from greater intraspecific trait variability but from scaling in measurement units. We therefore applied a log-transformation to vegetative height measurements, as it results in independence of mean and variance (not shown) while keeping the additivity of within- and between-species FD. We also measured leaf dry matter content (LDMC), the dry mass of a leaf divided by its fresh mass (Cornelissen et al. 2003), expressed in mg g−1, for 658 individuals at the end of the growing season (avoiding individuals with damaged leaves). For LDMC, log-transformation was required to improve normality (even though trait mean and variance per species were independent, not shown; the results for LDMC did not change considerably without log-transformation).
Results and implications
In this section, following Table S1, we first highlight the results regarding total community diversity partitioning for height and LDMC alone, and the potential of combining both traits. Then, we comment on the importance of species relative abundance in the partitioning of diversity and conclude by discussing the observed results against null expectations.
For both traits, the extent of within-species FD tended to be lower than between-species FD. As expected (Cornelissen et al. 2003), within-species FD was generally greater for height (reaching up to 52% of total FD in the dry site when not considering species abundance), reflecting size differences between individuals under varying growth microsite conditions. For height, the total and between-species FD were slightly greater in the wet site (compare the absolute observed diversity values across sites), where vegetation was taller (the community mean for height was around 37 cm vs. 27 cm when using species abundance and 26 cm vs. 19 cm when not). These patterns of height highlight different community vertical structures in the two meadows. At the dry site (lower and less dense vegetation), smaller plants such as rosette and prostrate species dominate. At the wet site (with taller and denser vegetation, suggesting higher competition for light), smaller and taller species tend to coexist more frequently, indicating differences in light acquisition strategies.
For LDMC, the extent of within-species, between-species and total FD was higher in the dry site. This suggests higher differentiation in resource acquisition patterns generally linked to LDMC (Cornelissen et al. 2003), which could be explained by a more heterogeneous, patchy resource availability in the dry site (Klimeš et al. 2001). Considering both traits together, one should note that the diversities of single traits add up to the combined FD. For example, without considering species relative abundance, the between-species diversity in the dry site was 0·0453, which is the sum of the between-species diversities based on height and LDMC alone, i.e. 0·0113 and 0·034, respectively (Table S1).
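This additivity follows from the use of squared Euclidean distances, because the squared distance across several traits is the sum of the squared single-trait distances; the values reported above behave accordingly (0·0113 + 0·034 = 0·0453). The sketch below illustrates the property with hypothetical standardized species means; the function `rao` and the data are assumptions introduced for illustration only.

```python
def rao(points, p):
    """Rao quadratic entropy on d^2 / 2, with d the Euclidean distance between points."""
    n = len(points)
    return sum(
        p[i] * p[j] * sum((a - b) ** 2 for a, b in zip(points[i], points[j])) / 2
        for i in range(n)
        for j in range(n)
    )


# Hypothetical standardized species means: (height, LDMC), illustrative only.
species = [(0.10, 0.20), (0.35, 0.90), (0.60, 0.40), (0.95, 0.55)]
p = [1 / len(species)] * len(species)  # pi = 1/Nsp

rao_height = rao([(h,) for h, _ in species], p)
rao_ldmc = rao([(l,) for _, l in species], p)
rao_both = rao(species, p)

print(rao_height, rao_ldmc, rao_both)  # rao_both equals rao_height + rao_ldmc
```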
This additivity of FD based on single traits has some key implications. First, in our case, one would be inclined to consider the differentiation in terms of LDMC, and its contribution to the total diversity with both traits, to be higher than in terms of height. Such comparisons should, however, be avoided because the absolute value of diversity of single traits depends strongly on the distribution of trait values across individuals (although both traits were standardized as mentioned above, a more even distribution leads to a higher dissimilarity, as in the case of LDMC here). As a consequence, the decomposition of absolute diversity based on multiple traits is of less interest than the decomposition based on single traits. By contrast, the percentage of variance explained by within- and between-species effects can be more safely compared across traits.
Another key observation highlighting the risks of combining multiple traits is given by the results including species relative abundances in FD estimation. As a matter of fact, the effect was opposite for height vs. LDMC (Fig. 1). For height, the observed total diversity values increased, in both sites, when using relative abundances compared with using only species presence/absence. For LDMC, the total diversity decreased. This implies, first, that at both sites dominant species are more dissimilar among themselves than are all species taken together in the case of height, and vice versa in the case of LDMC. Second, this incongruence across traits suggests, again, that different traits should be combined only very carefully, as these opposite effects on single traits would otherwise be overlooked.
It is also interesting to note that the use of abundance-weighted diversity modified the extent of within- and between-species trait variability (especially concerning the absolute values of variance; Table S1 and Fig. 1). This suggests different patterns of differentiation among dominant and subdominant species. A higher between-species FD when considering species relative abundance (as for height in the dry site) suggests, in principle, that different strategies between dominants can be important for coexistence and resource acquisition (Stubbs & Wilson 2004). A lower between-species variability when considering species relative abundance (as for LDMC) can suggest convergence in traits between dominants. Comparing variance partitioning with and without considering species relative abundance could, therefore, be applied to test the relevance of different sources of trait variability for community assembly (Mason et al. 2007; Swenson & Enquist 2009). Also, these results highlight the importance of methods that allow the inclusion, or not, of species relative abundance in calculations of FD.
Finally, as in PERMANOVA (Anderson 2001, 2005), we randomized dissimilarity across observations (individual trait values in our case) and assessed how many times the observed values were lower, or higher, than expected by chance. In all cases, the observed between-species diversity was higher than expected by chance (with 99 randomizations), while the within-species diversity was lower than expected by chance (Table S1). More heterogeneous patterns were found for the total diversity. Most importantly, these results underline the importance of between-species differentiation in community assembly. In addition, we used PERMDISP tests (Anderson 2006), which indicated here a significant difference in within-species diversity across species (i.e. some species have higher dissimilarity within them than others). This, as discussed later, has important implications for the way individuals are selected for trait measurements.
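A randomization test of this kind can be sketched as follows. The exact permutation scheme used in the analyses is not detailed here, so the sketch assumes the simplest variant, in which species labels are shuffled among individuals and the between-species variance (with pi = 1/Nsp) is recomputed each time. The function names, the toy data and the seed are hypothetical.

```python
import random


def between_species_variance(values, labels):
    """Between-species variance with equal species weights (pi = 1/Nsp)."""
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    means = [sum(g) / len(g) for g in groups.values()]
    x_com = sum(means) / len(means)
    return sum((m - x_com) ** 2 for m in means) / len(means)


def permutation_p_value(values, labels, n_perm=99, seed=0):
    """One-sided p-value: share of arrangements with a statistic >= the observed one."""
    rng = random.Random(seed)
    observed = between_species_variance(values, labels)
    shuffled = list(labels)
    n_ge = 1  # the observed arrangement counts as one possible permutation
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign species labels to individuals at random
        if between_species_variance(values, shuffled) >= observed:
            n_ge += 1
    return observed, n_ge / (n_perm + 1)


# Hypothetical trait values for three species with three individuals each.
values = [1.0, 2.0, 3.0, 4.0, 6.0, 5.0, 10.0, 11.0, 12.0]
labels = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]

observed, p_value = permutation_p_value(values, labels)
print(observed, p_value)
```

With 99 randomizations the smallest attainable p-value is 0·01, which is why the observed arrangement is counted in both the numerator and the denominator.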