1. Species-abundance distributions (SADs) are a convenient and common method for describing ecological communities. Despite their long history and the cornucopia of theoretical models that have been suggested to describe them, no agreement has been reached as to which models are best.
2. This lack of agreement is in part owing to the inherent differences in the abundance measure used. Discrete measures such as density and point quadrat cover produce a distinct veil line (positive skewness) when compared to continuous measures such as biomass or basal area. We compared two different sets of discrete and continuous abundance measures commonly used to estimate plant abundance: (i) cover (estimated from point quadrats) vs. biomass for 35 quadrats in garigue vegetation on serpentine soil in Tuscany, Italy; and (ii) density vs. basal area for the 2005 50 ha BCI (Panama) tree data. We used marginal plots (ordinary scatter plots with a dotplot of each variable along its own axis) to compare the shape of the SAD based on the two abundance measures.
3. The average of all 35 garigue plots gave a reasonably consistent description of the data. In contrast, when all 35 plots were concatenated, or when an individual plot was investigated, the discrete cover marginal plot, but not the continuous biomass plot, was truncated. For the BCI tree data, the discrete density marginal plot, but not the continuous basal area plot, was also truncated.
4. We highlighted the substantial effect that the species-abundance measure selected has on the shape of the SAD by comparing measures of skewness and kurtosis. This suggests that communities sampled using different abundance methods may fit different theoretical models best, not because they are fundamentally different but because the abundance measure is fundamentally different. Averaging over all the quadrats produced a better correspondence between the two abundance methods. Aside from the theoretical aspects of model fitting, our results clearly show that comparisons of communities or meta-analyses using SADs based on different measures of abundance should be treated with caution.
Species-abundance distributions (SADs), also called relative-abundance distributions (RADs), record the relative or absolute number of a set of species in a sample. As a description of an ecological community, they sit conveniently between a simple listing of the species present and multidimensional analysis (McGill et al. 2007) and so have been much studied. Generally, they are unimodal on a logarithmic scale of abundance, and that has led ecologists to seek a simple mathematical form for SADs. Despite this simplicity, McGill et al. (2007) list 27 models that have been suggested and note an absence of agreement about which models are best.
Chiarucci et al. (1999) calculated SADs for 35 individual plots, fitting broken-stick, geometric, lognormal and Zipf–Mandelbrot functions (models 20, 19, 6 and 11, respectively, in McGill et al. 2007) on point quadrat cover and biomass data. The most frequent (22 plots) best fitting model for the point quadrat cover data was the Zipf–Mandelbrot, and in contrast, the most frequent best fitting model for the biomass data (16 plots) was the lognormal. The Zipf–Mandelbrot model was the best fitting model for biomass data in 10 plots, whilst the lognormal was the best fitting model for the point quadrat cover data in only two plots.
There are two simple reasons for this lack of agreement. The first is that none of the models is additive (Williamson & Gaston 2005; Šizling et al. 2009) over either taxa or areas; indeed Šizling et al. (2009) show that such an additive model involves an indefinitely large number of parameters. This means that a model that fits at one scale or one set of species cannot also fit at another scale or over an enlarged set of species. The second, and the one we are concerned with here, is that the models are not invariant under different abundance measures; SADs change shape in relation to the abundance measure used. This has been recognised to some extent by reference to Preston’s veil line (Williamson 2010) or by classing data sets as fully censused vs. incompletely sampled (Ulrich, Ollik & Ugland 2010). But the veil line is an unsatisfactory approximation (Chisholm 2007; Williamson 2010) and all data sets are in some sense and to some degree samples, so sampling is a continuous (and universal) variable, not a discrete one. Chisholm (2007) has a thorough discussion of previous arguments, particularly Dewdney (1998). Williamson (2010) showed that the investigator’s choice of sampling individuals rather than biomass has a mathematical consequence, and described the phenomenon of differential veiling using marginal plots of the same community measured as individuals and as biomass. Much SAD work has been with taxa in which individuals are readily distinguished (e.g. Morlon et al. 2009, who studied trees, fishes, birds and mammals). For applied plant ecology, both biomass and density have serious drawbacks as abundance measures (Kershaw 1973). Collecting biomass data is destructive and time-consuming, and plant density has the inherent problems that individuals are often difficult to distinguish, if that is possible at all, and that size may vary greatly within a species (Jonasson 1988).
Here, we consider the effect of sampling plant communities using different species-abundance measures. Objective abundance measures commonly used with plant communities often attempt to estimate cover, for example, point quadrat cover or local frequency (Greig-Smith 1964). Neither of these measures individuals per se, but in common with density they are discrete variables, that is, counts of the number of pin hits or subsquares occupied. In contrast, both biomass and basal area are continuous variables. Williamson (2010) linked the differential veiling to individuals. In contrast, we show that whilst differential veiling does occur with individuals, it is not a property of individuals per se; rather, differential veiling is a consequence of using a discrete rather than a continuous abundance measure. Counting individuals is, along with a multitude of other sampling measures, discrete, and it is the discreteness, not the individualness, that produces the differential veil line. This is an important distinction, especially relevant to fields, such as plant sampling, where discrete abundance measures other than counting individuals are common.
Materials and methods
Point quadrat cover versus biomass data
The point quadrat cover versus biomass data come from garigue vegetation on serpentine soil in Tuscany, Italy, where 35 plots of 1 m × 1 m were surveyed in two ways (Chiarucci et al. 1999). Point quadrat cover estimates cover using the number of pins touching each species, in this case contacts out of 441 pins in each 1-m2 plot. Biomass was measured as the dry weight (after 48 h at 80°C) for each species in each plot. The plots were subject to seven different experimental treatments, but we ignore that here. We also ignore any species present in a plot but happening not to be touched by any pin, recorded as 0·5 pin in Chiarucci et al. (1999). We analysed three assemblage levels: (i) the average per plot for each species over all 35 plots; (ii) the concatenation of all the individual plots, that is, data for all 35 individual plots superimposed in one graph; and (iii) the results for a single plot, for which we used plot 31 as this is the plot set out in detail in Chiarucci et al. (1999).
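The three assemblage levels can be sketched as follows. This is only an illustration under stated assumptions: the `plots` dictionary, the species codes and the hit counts are hypothetical, and treating a species absent from a plot as zero when averaging is our assumption rather than a detail given in the text.

```python
# Hypothetical pin-hit counts: plots[plot_id][species] = pins touching that species.
# Species present but touched by no pin are simply left out, as in the text.
plots = {
    1: {"sp_a": 120, "sp_b": 3, "sp_c": 1},
    2: {"sp_a": 95, "sp_c": 2},
    3: {"sp_a": 140, "sp_b": 1, "sp_d": 1},
}


def average_per_plot(plots):
    """Level (i): mean abundance per plot for each species.

    Assumption: a species missing from a plot contributes zero to the mean.
    """
    species = sorted({sp for counts in plots.values() for sp in counts})
    n_plots = len(plots)
    return {
        sp: sum(counts.get(sp, 0) for counts in plots.values()) / n_plots
        for sp in species
    }


def concatenation(plots):
    """Level (ii): every (plot, species) record kept as a separate data point."""
    return [
        (plot_id, sp, hits)
        for plot_id, counts in sorted(plots.items())
        for sp, hits in sorted(counts.items())
    ]


def single_plot(plots, plot_id):
    """Level (iii): the records of one plot only."""
    return dict(plots[plot_id])
```

Level (i) yields one point per species, whereas level (ii) yields one point per species per plot, which is why the concatenated data set has many more points than the averaged one.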
Density versus basal area data
The density vs. basal area data come from the 2005 BCI data (50 ha plot on Barro Colorado Island, Panama) (https://ctfs.arnarb.harvard.edu/datasets/BCI/abundance). They are freely available, and 2005 is the first year in which all the individuals there were identified to species. We used only the trees with a diameter at breast height (d.b.h.) >10 cm, because the resolution of measurement means that some saplings have a recorded basal area of zero.
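As a minimal sketch of how the two abundance measures relate to stem records, the snippet below counts individuals (density) and sums basal area per species for stems above the d.b.h. threshold. The `stems` list and species codes are hypothetical, d.b.h. is assumed to be in centimetres, and the basal area of a stem is taken as the area of a circle with that diameter; none of this reproduces the actual BCI file format.

```python
import math
from collections import defaultdict

# Hypothetical stem records: (species code, d.b.h. in cm).
stems = [
    ("sp_a", 12.0),
    ("sp_a", 35.5),
    ("sp_b", 10.0),
    ("sp_b", 48.2),
    ("sp_c", 9.9),
]


def density_and_basal_area(stems, min_dbh_cm=10.0):
    """Return (density, basal_area) dictionaries keyed by species.

    density    : number of stems with d.b.h. > min_dbh_cm (a discrete count)
    basal_area : summed cross-sectional area in cm^2 (a continuous measure)
    """
    density = defaultdict(int)
    basal_area = defaultdict(float)
    for species, dbh in stems:
        if dbh > min_dbh_cm:  # strict ">" as in the text
            density[species] += 1
            basal_area[species] += math.pi * (dbh / 2.0) ** 2
    return dict(density), dict(basal_area)


density, basal_area = density_and_basal_area(stems)
```

With these made-up records, `sp_c` drops out entirely and `sp_b` is reduced to a single stem, which illustrates that density can fall no lower than one individual while basal area has no such floor.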
Instead of function fitting, the biomass vs. point quadrat cover data and the individuals vs. basal area data were investigated using marginal plots (Williamson 2010), followed by calculation of skewness and kurtosis for the four pairs of marginal distributions. Marginal plots are ordinary scattergrams for two variables with the addition of a dotplot of each variable along its own axis. Note that in the vertical dotplot, the low value (‘left-hand’) end is at the bottom. Dotplots were preferred to histograms as they bring out the singleton and doubleton values more clearly.
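For readers who want to reproduce the summary statistics, the following sketch computes skewness and excess kurtosis of log-transformed abundances. The text does not say which estimators or which logarithm base were used, so the simple moment estimators and base 10 here are assumptions, and the function name `log_moments` is ours. P-values are not computed.

```python
import math


def log_moments(abundances):
    """Return (skewness, excess kurtosis) of log10 abundances.

    Uses the plain moment estimators g1 = m3 / m2**1.5 and g2 = m4 / m2**2 - 3.
    Zero abundances are dropped because their logarithm is undefined.
    """
    logs = [math.log10(a) for a in abundances if a > 0]
    n = len(logs)
    mean = sum(logs) / n
    m2 = sum((x - mean) ** 2 for x in logs) / n
    m3 = sum((x - mean) ** 3 for x in logs) / n
    m4 = sum((x - mean) ** 4 for x in logs) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


# A sample dominated by singletons is right skewed on the log scale.
skew_truncated, _ = log_moments([1, 1, 1, 1, 10, 100])

# A sample symmetric on the log scale has zero skewness and is platykurtic.
skew_symmetric, kurt_symmetric = log_moments([1, 10, 100])
```

Negative excess kurtosis corresponds to the platykurtic shape and positive skewness to the right skew discussed in the text.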
The average over all 35 1-m2 plots (Fig. 1a) behaves as fully censused data in the sense of Ulrich, Ollik & Ugland (2010) even though it is still a small sample of garigue vegetation. Both cover (measured by points) and biomass (measured as dry weight) show reasonably symmetrical plots on logarithmic abundance scales. Table 1 shows that as usual in well-sampled data (Williamson 2010), both skewness and kurtosis tend to be negative, that is, the data are slightly left skew and platykurtic. But there are too few points, only 34, for either to be statistically significant.
Table 1. Skewness and kurtosis of the marginal plots for both biomass and cover for the three cover versus biomass assemblage levels: (A) the average of the 35 plots; (B) the concatenation of the 35 individual plots; and (C) an individual plot (plot 31), as in Fig. 1
Mean of all plots
Concatenation of all individual plots
N (number of data points)
P-values in brackets. Note that all signs are negative except for the skewness of individual cover plots (both all plots and plot 31). Those two positive values are shown in bold, as are significant probabilities.
The point quadrat cover data for individual 1-m2 plots (Fig. 1b,c) behave as ‘incompletely sampled’ in the sense of Ulrich, Ollik & Ugland (2010), an incomplete sample of the garigue vegetation. However, the biomass data for the same individual plots are much more symmetrical, and in the sense of Ulrich, Ollik & Ugland (2010), they should be considered ‘fully censused’. Realistically, both the 1-m2 plots and the set of 35 of them are samples, whether of biomass or point quadrat cover, at their respective scales, but neither is a complete community.
The seven different treatments across the plots cause a lot of scatter, but the dominance of singleton point quadrat cover estimates (i.e. a species sufficiently rare that it is hit by only one pin) and subdominance of doubleton point quadrat cover estimates is clear in the marginal dotplot (Fig. 1b). The scatter plot appears truncated at the singleton line. This is less clear when looking at an individual plot, such as plot 31 (Fig. 1c) because of the paucity of points, but can still be seen. Both the concatenation of all 35 plots (Fig. 1b) and the single plot 31 (Fig. 1c) show that the difference between biomass and point quadrat cover SADs comes from the point quadrat cover being measured discretely, with a cut-off at one sample pin, whilst biomass is a continuous variable. That difference leads to species with low cover being lost on sampling, whereas species with low biomass are lost only if they also have low cover.
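The mechanism can be illustrated with a small simulation. This is a sketch under strong assumptions, not a reanalysis of the garigue data: the community is hypothetical, biomass is drawn from a lognormal distribution, cover is taken as biomass multiplied by independent lognormal noise, and each of the 441 pins is assumed to hit exactly one species with probability proportional to its cover (real pins may touch several species or none).

```python
import math
import random


def simulate_plot(n_species=40, n_pins=441, seed=1):
    """Return (pin hits, biomass) pairs for species touched by at least one pin."""
    rng = random.Random(seed)
    # Continuous measure: hypothetical biomass per species.
    biomass = [rng.lognormvariate(0.0, 2.0) for _ in range(n_species)]
    # Cover is correlated with biomass but not identical to it (assumed noise).
    cover = [b * rng.lognormvariate(0.0, 1.0) for b in biomass]
    # Discrete measure: count how many pins land on each species.
    hits = [0] * n_species
    for sp in rng.choices(range(n_species), weights=cover, k=n_pins):
        hits[sp] += 1
    # Species with no pin contact are dropped, as in the text.
    return [(h, b) for h, b in zip(hits, biomass) if h > 0]


records = simulate_plot()
log_hits = [math.log10(h) for h, _ in records]
log_biomass = [math.log10(b) for _, b in records]
```

By construction `log_hits` can never fall below zero, the singleton line, whereas `log_biomass` has no floor. A retained species can have very low biomass provided at least one pin happened to touch it, which is the asymmetry described above.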
The different effects of sampling on (continuous) biomass and (discrete) point quadrat cover are clear in the skewness statistics. The skewness of biomass is nonsignificantly negative in all three graphs, but the skewness of point quadrat cover shifts from a left skew (−0·302) over the whole 35-m2 sample to a right skew (+0·406) over the set of individual plots. Only the latter, based on a much larger set of points, is statistically significant, but that makes the difference between the two estimates highly significant. The kurtosis values are negative in all six cases. Only one is formally, and highly, significant, but a set of six values all of the same sign has a probability of 1/32, which is significant at the 5% level.
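The probability of 1/32 is what a two-sided sign test gives for six values of the same sign. We assume that is the calculation intended, and that the six signs are treated as independent and equally likely to be positive or negative. The helper name below is ours.

```python
from fractions import Fraction


def same_sign_probability(n_values, two_sided=True):
    """Probability that n independent values all share one sign under a fair-coin null.

    One-sided: all negative, (1/2)**n. Two-sided: all negative or all positive.
    """
    p_one_sided = Fraction(1, 2) ** n_values
    return 2 * p_one_sided if two_sided else p_one_sided


p_six = same_sign_probability(6)
```

Here `p_six` equals 1/32, or about 0.031, which is below 0.05. The one-sided probability would be 1/64.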
As with the point quadrat cover vs. biomass data, the density vs. basal area data show an obvious truncation of the marginal plot of the discrete density data without a corresponding truncation in the continuous basal area data (Fig. 1d). Skewness and kurtosis values are given in Table 2 and behave as before.
Table 2. Skewness and kurtosis of the marginal plots for both basal area and density (individuals >10 cm d.b.h.) for the BCI 2005 data
N (number of data points)
P-values in brackets. Note that all signs are negative except for the skewness of the density plot (shown in bold along with significant probabilities).
Discussion and conclusion
Chiarucci et al. (1999), Connolly et al. (2005) and Morlon et al. (2009) all noted that different methods of sampling lead to mathematically different SADs. Williamson (2010) found a simple way of expressing and explaining this: marginal plots, as in Fig. 1, show that the distinction between sampling by discrete abundance measures such as point quadrat cover or individuals and by continuous abundance measures such as biomass or basal area necessarily leads to the difference in SADs. Williamson (2010) emphasised the ‘individuals’ part of that explanation but the data of Chiarucci et al. (1999) reanalysed here show that it is in fact the ‘discreteness’ not the ‘individualness’ of the data that creates the effect. Cover measured by point quadrats is discrete but what is measured is not a set of individuals but a set of pin hits approximating cover. Williamson (2010) also said ‘biomass SADs are different from individuals SADs and need not have the same mathematical form’. The Chiarucci et al. (1999) data show that the description of SADs measured by discrete counts should be a discrete mathematical function whilst those measured by continuous values require a continuous mathematical function, an important difference with a major effect on the sampling properties of the SADs.
This finding is confirmed by the BCI tree data and by another plant study, Guo & Rundel (1997), on postfire vegetation in Californian chaparral. Because of the vegetation type, Guo & Rundel could measure species abundance as individuals, as cover (assessed by eye) and as biomass. They fitted no functions, but their cumulative SADs show a clear dominance of singletons in their discrete individuals data but not in the other two, continuous, forms.
If avoiding truncation of the SAD is important to the study, then the abundance measure used must be continuous in nature, for example, biomass. On the other hand, if using a discrete abundance measure (for example, density, point quadrat cover or local frequency) is desirable or unavoidable, the truncation effect this will have on the SAD needs to be recognised as a sampling artefact. The effect will be most noticeable in small samples.