- Top of page
- Materials and methods
Macroecology is the study of major patterns shown by organisms (Brown 1995; Gaston & Blackburn 2000), and one fundamental pattern is the distribution of individuals among species, called the species abundance distribution (SAD). Interest in SADs was strong from the 1940s (Fisher, Corbet & Williams 1943; Preston 1948) to the 1960s (Preston 1962; Williams 1964), when controversy arose as to whether data fitted one statistical model better than another. MacArthur (1960) made an important advance by proposing biological rather than statistical models for the patterns observed. This trend was continued by the development of niche apportionment models (Sugihara 1980; Tokeshi 1990) and more recently by Hubbell (2001), who revived interest in SADs by deriving a general theory of biodiversity and biogeography based on zero-sum dynamics of species and on ecological equivalence among individuals of species within assemblages, the ZSM model. Yet McGill (2003) showed that a log-normal distribution fitted the data sets analysed by Hubbell (2001) better than the ZSM. Recently, Volkov et al. (2003) reanalysed the same data used by Hubbell and McGill, using a simplified version of the Hubbell model, and showed that data on SADs from a tropical forest at Pasoh, Malaysia (Manokaran & Swaine 1994) fitted the log-normal, and confirmed that data from Barro Colorado Island (BCI), Panama (Condit et al. 2002) did also.
One interesting aspect of the above studies is that they use remarkably few data sets to make ecological inferences. No marine (or freshwater) data sets have been used. Here we analyse a data set with a large spatial extent from the marine environment (an assemblage of coastal sediment benthos) and compare SADs of these data with the BCI tropical tree data sets, in order to see if there are any grounds for believing that marine and terrestrial systems may be structured in different ways. Rather than simply concentrating on the summed data over the whole area studied, as done by previous authors, we have analysed both the marine and BCI data at different spatial scales to see if SAD patterns change with scale.
In 1982 we proposed (Ugland & Gray 1982) that marine benthic assemblages comprised three groups of log-normally distributed species, rare, moderately common and common. We showed that in response to a disturbance (organic enrichment) the three groups moved apart and became distinct whereas in undisturbed assemblages they were more or less fused into a single log-normal distribution. This idea was studied further by Magurran & Henderson (2003) who showed that data from a 20-year study of a fish assemblage from the Severn estuary, UK comprised a group of rare and a group of common species. They suggested that the rare group fitted a log-series distribution, whereas the common group fitted a log-normal distribution.
In a recent paper Williamson & Gaston (2005) state that the log-normal is not an appropriate null model for SADs. Their argument is based on fitting a single log-normal distribution to three sets of data (breeding birds in Britain, the 50 ha BCI data and butterflies from Jatun Sacha, Ecuador) and testing a variety of fitting methods and goodness-of-fit tests. They did not, however, fit more than one log-normal distribution to the data. Here we use a statistical method for determining whether more than one log-normal group of species can be found in the data sets, and compare the statistical differences between a single-group and a two-group model. Finally, we consider the biological reasons underlying the two-group model.
Table 1 shows a summary of the data. The BCI data set used has a total of 225 species and 21 457 individuals, whereas the Oseberg data set has 702 species and 135 716 individuals. The data were collected in different ways. At BCI exact counts of all species were made and their georeferenced positions recorded. For the Oseberg data, samples were collected by grabs within localized areas, so the data are simply estimates of numbers of species rather than exact counts. Thus, while the BCI data covered only 0·5 km2, the Oseberg data covered a much larger area, as the seven fields studied were spread over a much wider area of seabed (28 km2). These differences have important consequences for interpreting SAD patterns, which will be discussed later.
Table 1. Summary of data used in the analyses
| |BCI 5 ha|BCI 10 ha|BCI 25 ha|BCI 50 ha|Oseberg (4 km2)|Oseberg (12 km2)|Oseberg (16 km2)|Oseberg (28 km2)|
|---|---|---|---|---|---|---|---|---|
|No. of species|161|182|208|225|246|408|447|702|
|No. of individuals|2228|4049|11 082|21 457|7429|47 506|71 654|135 716|
|No. of samples|5|10|25|50|16|49|66|114|
|Area sampled (m2)|50 000|100 000|250 000|500 000|8|24·5|33|57|
|Area covered (%)|10|20|50|100|14|43|58|100|
|Actual area covered (km2)|0·05|0·1|0·25|0·5|4|12|16|28|
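The contrast in sampling coverage between the two data sets can be made explicit with a little arithmetic on the Table 1 values. The following is a minimal sketch, not part of the original analysis; the function name `sampled_fraction` is hypothetical, and the only inputs are the largest-scale "Area sampled" and "Actual area covered" entries from Table 1.

```python
# Values below are copied from Table 1 (largest scale of each data set).
M2_PER_KM2 = 1_000_000


def sampled_fraction(area_sampled_m2, area_covered_km2):
    """Fraction of the covered area that was physically sampled."""
    return area_sampled_m2 / (area_covered_km2 * M2_PER_KM2)


# BCI: every tree in the 50 ha plot was counted.
bci_fraction = sampled_fraction(500_000, 0.5)

# Oseberg: 114 grab samples totalling 57 m2, spread over 28 km2 of seabed.
oseberg_fraction = sampled_fraction(57, 28)

print(f"BCI 50 ha: {bci_fraction:.0%} of the covered area")
print(f"Oseberg:   {oseberg_fraction:.6%} of the covered area")
```

Under these figures the BCI plot is a complete census of the area it covers, whereas the Oseberg grabs touch only a few millionths of the seabed they are spread over.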
The frequency distributions of the species across sites in the large marine benthic data set (Oseberg) and at BCI are shown in Fig. 1. Figure 1(a) shows that at Oseberg the assemblage has a large number of species that are represented by few individuals (rare species) yet the data cover a very large number of sites (114). While a similar general pattern occurs at BCI the dominance of rare species is not so complete. Nevertheless, rareness is a dominant characteristic of marine soft sediment and tropical tree assemblages.
Figure 2 shows the SAD plots for the marine (Oseberg) and terrestrial (BCI) data at four different scales. For the Oseberg data the distributions are truncated in all cases, and only in the complete data set is the mode exposed. In contrast, for the BCI data the mode is exposed even at the smallest scale (5 ha), and as sample size increases the mode becomes more exposed. Preston (1948) would call this ‘unveiling of the veil-line’, i.e. as a larger sample is taken one finds more species with abundances of one individual per sample. The single log-normal group model fitted to these data (lines) shows a good fit (Table 2). Hubbell (2001) used as an important argument for developing his ZSM model that there was an ‘overweight’ of rare species in the BCI data set, which could not be described by a log-normal distribution. [Note, however, that the data are not exactly similar to those reproduced in Hubbell's book. Note also that McGill's (2003) and Volkov et al.'s (2003) plots of the 1995b data record 11 species in octave 1 (i.e. half the number of species that actually occur with one individual per species). This method of plotting was first used by Williams (1964), yet Preston (1948) did not plot or utilize the data for this octave in any of his figures. We have tried different methods of binning the data and deriving logarithmic classes but are unable to reproduce Hubbell's figures from all the BCI data we have analysed.] However, we find that at neither 25 ha nor 50 ha (Fig. 2) is there a clear overabundance of rare species.
Figure 2. SADs for (a) benthos of the Oseberg field, Norwegian continental shelf, and (b) tropical trees at BCI. Data are plotted using the Preston (1948, 1962) method, where the number of species with between 0 and 1 individuals per species are not shown. Areas sampled: OSE = Oseberg oil field, Norway (1 = 1 field sampled, 2 = fields 1 + 2, etc.); BCI (05 = 5 ha area, 10 = 10 ha area, etc.).
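Because the text stresses that different ways of deriving logarithmic classes give different pictures of the rare end of the distribution, it may help to see one binning rule written out. The sketch below uses a simple convention that is an assumption here, not necessarily the one used for Fig. 2: octave k holds species with 2^k to 2^(k+1) - 1 individuals, so singletons form octave 0. Preston's original scheme instead splits species on class boundaries between adjacent octaves. The function name `octave_counts` is hypothetical.

```python
from collections import Counter


def octave_counts(abundances):
    """Count species per log2 abundance class (octave).

    Assumed convention: octave k holds species with
    2**k <= n < 2**(k + 1), i.e. octave 0 = 1 individual,
    octave 1 = 2-3, octave 2 = 4-7, and so on.
    """
    counts = Counter()
    for n in abundances:
        if n < 1:
            raise ValueError("abundances must be positive integers")
        # For a positive integer n, bit_length() - 1 equals floor(log2(n)).
        counts[n.bit_length() - 1] += 1
    if not counts:
        return []
    # Return a dense list so that empty octaves appear as zeros.
    return [counts.get(k, 0) for k in range(max(counts) + 1)]


# Invented abundances for eleven species, for illustration only.
example = [1, 1, 1, 2, 3, 4, 5, 7, 8, 20, 150]
print(octave_counts(example))
```

Changing the boundary rule mainly moves species between the first one or two octaves, which is exactly where the argument about an excess of rare species is decided.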
All the plots in Fig. 2, however, are characterized by their ‘bumpiness’, which suggested to us that more than one log-normal distribution underlies the data. Figure 3 shows two log-normal groups fitted to the same data, and the fits look reasonably good. Table 2 shows that at Oseberg 1 the single-group model was not a good fit, whereas all the remaining data fitted both models. Thus, in addition to the single log-normal, the two-group model also shows a good fit to the data. Figure 3 also shows that the BCI data at the smallest scale (5 ha) are similar to the largest sample from Oseberg, with a truncated distribution and the mode in the second octave.
Figure 3. The fits of the two-group model compared with observed number of species for (a) benthos of the Oseberg field, Norwegian continental shelf, and (b) tropical trees at BCI.
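The idea of two log-normal groups can be illustrated with a two-component normal mixture fitted to log2 abundances. The sketch below is only an illustration under stated assumptions: it uses a plain expectation-maximization (EM) fit, which is not claimed to be the method used for Fig. 3, it ignores truncation at the veil line, and it is run on synthetic data. The names `normal_pdf` and `fit_two_groups` and the simulated parameters are hypothetical.

```python
import math
import random


def normal_pdf(x, mu, sd):
    """Density of a normal distribution with mean mu and standard deviation sd."""
    z = (x - mu) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_groups(abundances, n_iter=200, min_sd=0.1):
    """EM fit of a two-component normal mixture to log2(abundance).

    Returns (weight_rare, (mu_rare, sd_rare), (mu_common, sd_common)).
    Needs at least two species; truncation of the data is ignored.
    """
    x = sorted(math.log2(n) for n in abundances)
    half = len(x) // 2
    mu, sd = [], []
    for part in (x[:half], x[half:]):  # crude start: lower and upper halves
        m = sum(part) / len(part)
        var = sum((v - m) ** 2 for v in part) / len(part)
        mu.append(m)
        sd.append(max(min_sd, math.sqrt(var)))
    w = 0.5  # mixing weight of the rare group
    for _ in range(n_iter):
        # E-step: probability that each species belongs to the rare group.
        resp = []
        for v in x:
            a = w * normal_pdf(v, mu[0], sd[0])
            b = (1.0 - w) * normal_pdf(v, mu[1], sd[1])
            resp.append(a / (a + b) if a + b > 0.0 else 0.5)
        # M-step: update mean and standard deviation of each group.
        for k, r in enumerate((resp, [1.0 - p for p in resp])):
            total = sum(r)
            if total < 1e-9:
                continue  # group has emptied out; keep its old parameters
            mu[k] = sum(p * v for p, v in zip(r, x)) / total
            var = sum(p * (v - mu[k]) ** 2 for p, v in zip(r, x)) / total
            sd[k] = max(min_sd, math.sqrt(var))
        w = min(1.0 - 1e-9, max(1e-9, sum(resp) / len(x)))
    return w, (mu[0], sd[0]), (mu[1], sd[1])


# Synthetic assemblage: 300 rare and 200 common species (invented values).
rng = random.Random(1)
synthetic = [max(1, round(2 ** rng.gauss(2.0, 0.5))) for _ in range(300)]
synthetic += [max(1, round(2 ** rng.gauss(8.0, 1.0))) for _ in range(200)]
weight, rare, common = fit_two_groups(synthetic)
print(f"rare group weight {weight:.2f}, means {rare[0]:.2f} and {common[0]:.2f}")
```

With real, truncated data the two groups overlap far more than in this synthetic case, so a fit of this kind should be checked against the observed octave counts with a goodness-of-fit test rather than taken at face value.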
Table 2. χ2 goodness-of-fit tests to one-group and two-group models. BCI and OSE labels as shown in Table 1
With the marine (Oseberg) data the rare group of species is dominant at all spatial scales (Fig. 3). However, in the tropical tree (BCI) data the rare group of species is only dominant at small scales (5 and 10 ha) and decreases in dominance at the larger scales sampled.
The Oseberg data (Fig. 3) suggest that there may be a third group of species at the higher octaves, but such a group cannot be fitted statistically because the numbers of species within each octave are too small to apply the χ2 test. Further evidence is that when data from all the octaves are included (not just octaves 0–8, as shown in Table 2), the two-group model at Oseberg 2 and 4 differs significantly from the expected distribution. Thus it is these higher octaves that are influencing the goodness-of-fit of the χ2 test.
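The point about sparse upper octaves can be made concrete with a small χ2 calculation. The sketch below is illustrative only: the observed and expected counts are invented, the rule of pooling adjacent classes until the expected count reaches 5 is a common rule of thumb rather than something stated in the text, and the function names are hypothetical. Degrees of freedom are taken as the number of pooled classes minus one minus the number of fitted parameters.

```python
def pool_small_classes(observed, expected, min_expected=5.0):
    """Merge adjacent classes until each pooled expected count >= min_expected."""
    obs, exp = [], []
    acc_o = acc_e = 0.0
    for o, e in zip(observed, expected):
        acc_o += o
        acc_e += e
        if acc_e >= min_expected:
            obs.append(acc_o)
            exp.append(acc_e)
            acc_o = acc_e = 0.0
    if acc_e > 0.0:
        # A sparse tail is left over: fold it into the last pooled class.
        if exp:
            obs[-1] += acc_o
            exp[-1] += acc_e
        else:
            obs.append(acc_o)
            exp.append(acc_e)
    return obs, exp


def chi_square(observed, expected, n_fitted_params):
    """Return the chi-square statistic and degrees of freedom after pooling."""
    obs, exp = pool_small_classes(observed, expected)
    stat = sum((o - e) ** 2 / e for o, e in zip(obs, exp))
    dof = len(obs) - 1 - n_fitted_params
    return stat, dof


# Invented species counts per octave; expected values from a hypothetical
# single log-normal fit with two estimated parameters (mean and variance).
observed = [30, 25, 20, 12, 6, 3, 2, 1, 1]
expected = [28.0, 26.0, 19.0, 13.0, 7.0, 3.5, 1.8, 1.0, 0.7]
stat, dof = chi_square(observed, expected, n_fitted_params=2)

CRITICAL_5_PERCENT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}
print(f"chi2 = {stat:.3f}, d.f. = {dof}, reject = {stat > CRITICAL_5_PERCENT[dof]}")
```

In this example the four sparsest octaves collapse into a single class, which shows why a small group of very abundant species cannot be tested separately: after pooling there are no degrees of freedom left to describe it.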
Patterns in site occupancy of species have been described for many terrestrial assemblages (see review by Gaston 1994) and for benthic data by Ellingsen & Gray (2002). In both marine and terrestrial assemblages rareness, as measured by the number of species occupying only one, two or three sites, is a common feature (Fig. 1). Yet analyses of the species found at the different scales showed rather different patterns between the tropical tree assemblage and the benthic fauna. At the smallest scale analysed (5 ha for tropical trees) 60% of the total number of species were found, whereas for the marine data sampling one single area at Oseberg gave only 30% of the total number of species. An analysis of the rare species group showed that rare species at BCI were consistently rare at all of the areas sampled. This contrasts with McGill's (2003) interpretation of structuring processes that ‘the rare species that make it into the local community very often disappear after just a few repeated samples because of their rarity.’ At Oseberg, as sample size increased, the number of species found with one individual per sample remained high, but the species that occurred at low abundances at the smallest scale showed highly variable abundances at larger scales and were randomly distributed among the octave classes. The difference between the BCI and Oseberg data probably relates to sampling effort, as all species of tree were identified within the 50 ha plot at BCI, whereas at the largest scale sampled at Oseberg < 0·02% of the total area of sediment was in fact sampled. At Oseberg, species that were rare in one area were different from those that were rare in another area, and patches of higher abundances of a species that was rare in one area were found in other areas. We find these results intriguing and are investigating these aspects in more detail.
The SAD patterns were analysed by fitting a single log-normal distribution (Fig. 2) and also a two-group log-normal model (Fig. 3) to both the marine benthic and tropical forest data. Recently, Williamson & Gaston (2005) argued that the log-normal is not an appropriate null model for SADs because, following Pielou (1975), the Central Limit Theorem is not applicable to assemblages of species, but only to pooled subpopulations of a single species. Yet the log-normal fits an extremely wide selection of ecological, geological and sociological data (Limpert, Stahel & Abbt 2001). A recent example is Longino, Coddington & Colwell's (2002) study of ants in a tropical rain forest. They concluded that the abundance distribution was log-normal, although it has to be added that they did not apply any goodness-of-fit tests. Limpert et al. (2001) state that ‘the sum of several independent normal variables is itself a normal random variable. For quantities with a log-normal distribution, however, multiplication is the relevant operation for combining them’. This has also been pointed out by Harte (2003), who stated that if birth and death rates are governed by products of random factors, with a different product chosen for each species, then the Central Limit Theorem states that the rates will be distributed log-normally. To birth and death rates we would add another major variable known to affect assemblage structure, namely the immigration of species to the sampled area from outside. Thus we reject Pielou's (1975) and Williamson & Gaston's (2005) argument and believe that the log-normal distribution can be derived from the simple assumption that the population sizes of the variety of species in a given assemblage are determined by the multiplicative effects of environmental and biological variables, which therefore lead to one (or more) log-normal SADs.
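The multiplicative argument can be checked numerically. In the sketch below, which is an illustration and not an analysis from this study, each hypothetical species' population size is the product of 30 independent random factors; the number of factors and their range (0·5 to 2) are arbitrary assumptions. If the Central Limit Theorem argument holds, the raw sizes should be strongly right-skewed while their logarithms should be close to symmetric.

```python
import math
import random
import statistics


def simulate_sizes(n_species, n_factors, rng):
    """Population size of each species as a product of independent random factors."""
    sizes = []
    for _ in range(n_species):
        size = 1.0
        for _ in range(n_factors):
            size *= rng.uniform(0.5, 2.0)  # arbitrary, hypothetical factor range
        sizes.append(size)
    return sizes


def skewness(values):
    """Population skewness (third standardized moment)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


rng = random.Random(42)
sizes = simulate_sizes(n_species=5000, n_factors=30, rng=rng)

raw_skew = skewness(sizes)
log_skew = skewness([math.log(v) for v in sizes])
print(f"skewness of sizes: {raw_skew:.2f}; skewness of log sizes: {log_skew:.2f}")
```

The logarithm turns the product into a sum of independent terms, which is why the log sizes approach a normal distribution and the sizes themselves a log-normal one. Nothing in the simulation depends on the factors representing births, deaths or immigration specifically.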
One of the major problems with fitting models, such as the log-normal, to SADs is that of testing goodness-of-fit to the theoretical distributions. McGill (2003) used eight different tests and Williamson & Gaston (2005) four additional ones. Both of these studies tested some of the BCI data, though not exactly the same data sets. McGill (2003) concluded that the log-normal fitted the BCI data, whereas Williamson & Gaston (2005) found that two tests, the Anderson–Darling test and the Ryan–Joiner test, indicated that the BCI data were significantly different from the log-normal, although with the Kolmogorov–Smirnov test there was no significant difference from the log-normal. Volkov et al. (2003) also tested the same BCI data as used by McGill (2003) and found that there was no significant difference between the fits of the ZSM model (Hubbell 2001) and the log-normal to the data, although they argued that the ZSM was a better fit as it had a slightly lower χ2. Yet they fitted only one single data set and, as Etienne & Olff (2004) show, SADs vary between surveys of the same 50 ha plots. Therefore, within the limits of type I statistical error, we believe that one cannot use such an argument to state that one model is better than another if both fit within the P = 0·05 criterion of risk of committing a type I error. A more recent and thorough analysis of the BCI data, using all years and taking a Bayesian statistical approach, has been done by Etienne & Olff (2004). They compared goodness-of-fit between the ZSM model and the log-normal and showed that the results ‘do not point decisively in the direction of one of these models’. It appears, therefore, that the log-normal has not been superseded by the ZSM as a model for the BCI data (see also Harte 2003).
Our analysis of SAD patterns at different scales for BCI and Oseberg showed (Table 2) that, within the accepted limits for type I statistical errors of the χ2 and Kolmogorov–Smirnov tests, both a single and a two-group log-normal model could be fitted to the data. Thus we propose that, in addition to the single log-normal model fitted by McGill (2003) and Volkov et al. (2003), a two-group model should also be considered as an alternative hypothesis to the ZSM (Hubbell 2001).
The ZSM is notoriously difficult to fit (McGill 2003; Volkov et al. 2003; Etienne & Olff 2004) and as the log-normal has not been discarded as an alternative model (Harte 2003) we saw no need to test the fit to the ZSM in the context of this paper. We have tested the ZSM with marine data in another context (in review) and have encountered major problems in obtaining results for the ZSM.
In reviewing SAD patterns in relation to rarity, Gaston (1994) suggested that bimodal patterns (as suggested for the benthos, Fig. 2) were uncommon, and he showed only one example, that of dung beetles in montane pastures and tropical forests (Hanski 1991; Hanski & Cambefort 1991). Hanski (1991) suggested that the bimodal pattern resulted from a mixture of a group of rare species that are nonlocal and a group of local species. Hubbell (2001) also regarded immigration from outside the area sampled as a key factor influencing SADs. Magurran & Henderson's (2003) comprehensive analysis of a 20-year data set of fish species in the Severn estuary, UK, also found a bimodal pattern. They argued that the log-series fitted the rare group, whereas a log-normal fitted the common group. However, a fit to a truncated log-normal is unlikely to be distinguishable statistically from a log-series. Thus it is not necessarily true, as argued recently by Chave (2004), that the rare species are neutral because they fit a log-series and the common species are non-neutral because they fit a log-normal.
Williamson & Gaston (2005) based their argument that the log-normal was not an appropriate null model on the assumption of fits of SADs to a single log-normal distribution, not two (or more) log-normal groups of species. In fact their fig. 4, which uses normal probability plots for data on British breeding birds, the BCI > 1 cm data set and butterflies from Jatun Sacha, Ecuador, suggests to us that there are at least two log-normal groups present in all three data sets. Likewise, Etienne & Olff (2004) found that the single log-normal did not fit extremely left-skewed data from a freshwater fish assemblage at Cato Maraca, Venezuela. This again is not at all surprising, as these data are extremely ‘bumpy’, which suggests that at least two log-normal distributions underlie the data. These fish data are remarkably similar to the log left-skewed (and bumpy) British breeding bird data presented in Williamson & Gaston (2005). Likewise, the complete sample for Longino et al.'s (2002) ants in a tropical forest clearly shows two log-normal groups rather than the single model fitted. These examples suggest that the two-group log-normal SAD is probably widespread in nature.
Earlier we (Ugland & Gray 1982) argued for a three-log-normal group model. The third group is that of the very common species. These are usually extremely abundant, with densities in hundreds or thousands per sampling unit, and yet there are usually relatively few such species. Within the limits of traditional tests such as the χ2 it is not possible to distinguish statistically such small numbers of species. Yet the SAD plots consistently show the presence of this group of abundant species (e.g. Fig. 3a for the Oseberg data). As mentioned in the Results section, evidence for the third group is also shown by the fact that including data from all the octaves (not just 0–8 as shown in Table 2) gave a significantly different fit from the expected two-group model. This divergence from the expected values in the χ2 goodness-of-fit test was shown to be due to the higher octaves. We interpret these findings as indicating that these species comprise a putative third group of common species. These common species are, of course, highly significant and have been the primary focus of studies of population dynamics in natural assemblages, and so should not be neglected.
Analyses of the changing SAD patterns with scale for the BCI data (Fig. 3b) suggest that for the tropical forest data the SAD patterns develop from a two-group structure at small sample size, where the rare group dominates and immigration from outside the sample area occurs, towards a single log-normal distribution in the complete 50 ha sample (although the two groups of rare and common species are still distinguishable with the methods developed here). We suggest that at BCI, as the sampled area is enlarged, immigration becomes relatively less important and the common group of species becomes dominant. In contrast, at all scales the benthic data (Fig. 3) show continuous dominance by the group of rare species, which are presumably immigrating from outside the sampled area.
Why are marine benthic assemblages always dominated by the group of rare species? We believe that it is most likely a sampling problem. It is simply not possible to sample all the benthos completely. Likewise it is not possible to sample all the fish in an estuary (Magurran & Henderson 2003), or all the species of insects in oak trees in a wood (Southwood 1996), or all the Macrolepidoptera caught in light traps (Williams 1964), with the result that such assemblages appear, in both space and time, to be dominated continuously by immigration of rare species. We believe that natural systems with such characteristics are likely to be very common. In contrast, for the 50 ha plots of tropical trees analysed at BCI (Hubbell 2001) and Pasoh (Manokaran & Swaine 1994; He & Legendre 2002) all trees have been counted and identified to species. One cannot, however, envisage sampling all the insects within the 50 ha of the BCI forest. Etienne & Olff (2004) repeat the statement attributed to Preston (1948) that the log-normal strictly applies only to the entire community and not to a sample. Yet even the BCI data from the 50 ha plot are only a sample from the forest, and so they too are not a complete sample. Likewise, the claim that British and North American breeding birds are among the few fully censused assemblages (see McGill 2003 for references to this statement) fails on this argument, because sampling the birds across the whole of the UK or North America does not mean that they are not also part of a larger assemblage. Gregory (1994) has shown that the British bird data in fact fit a log-normal.
Additional factors that may be significant in explaining the SAD patterns discussed in this paper are that in the marine assemblages the organisms are of relatively small size and biomass (< 20 cm in length for benthos), are relatively short-lived (usually < 20 years) and are mostly mobile. These properties also occur in the insects on oak trees studied by Southwood (1996) and in modern Macrolepidoptera, the SAD patterns of which were studied extensively by Williams (1964). Tropical tree assemblages are characterized by individuals of large biomass that are very long-lived (often more than 100 years) and are, of course, immobile. With the passage of time the chance of new immigrants establishing within the BCI forest is reduced, but not negligible. We suggest that systems such as tropical forests may have rather exceptional characteristics. In nature there are probably many systems, like the marine benthos and insect assemblages in woods (Southwood 1996), that are dominated by small-sized and relatively low-biomass species not occupying the whole habitat space available, so that the systems are open to constant immigration. This results in patterns where the rare group of species dominates at whatever scale (in space or time) the assemblage is sampled. Yet not all marine systems are structured in this way. In a fouling assemblage on a coastal hard substratum, species-rich systems were found to be more resistant to invasion by immigrants than species-poor systems (Stachowicz, Whitlatch & Osman 1999). In such systems we predict that a single log-normal SAD pattern will be the rule.
In developing a new framework for community ecology, Leibold et al. (2004) suggested that there were four basic patterns of dynamics in metacommunities: (1) the patch-dynamic paradigm; (2) the species-sorting paradigm; (3) the mass-effects paradigm; and (4) the neutral paradigm. The first three assume that species differ from each other in terms of niche dynamics and/or in their abilities to disperse to avoid local extinction. This is not so for the neutral paradigm, where there is assumed to be ecological equivalence among all individuals of every species in a given trophically defined community. This assumption is generally regarded as not being biologically realistic (Yu, Terborgh & Potts 1998; Bell 2001; Clark & McLachlan 2003; Alonso & McKane 2004). The patch-dynamic paradigm and the species-sorting paradigm are based on a separation of time scales between local dynamics and colonization–extinction dynamics. This clearly does not apply to the data sets considered here. The mass-effects paradigm is based on the idea that local dynamics are affected by dispersal from the regional species pool, which is clearly the appropriate model for the data considered here. However, rare species probably do not obey traditional metapopulation dynamic equations (Levins 1969; Hanski 1999), as they have stochastic population dynamics and yet persist in assemblages at low population sizes.
We suggest that the simple model proposed here, based on two log-normal distributions, may be a realistic and parsimonious alternative to the ZSM reflecting the structure of many natural systems. As stated above our interpretation of the log-normal is that it is merely a statistical descriptor of patterns. In our case the rare group of species and the common group of species probably result from the multiplicative effects of environmental and biological variables acting on all the species within each group.
Perhaps one of the main points to emerge from this study is that many data sets are not complete counts but estimates of the numbers of species occurring. Sample coverage greatly affects the shape, and therefore the interpretation, of patterns of rareness, as has been pointed out earlier by Gaston (1994). The consequence is that in making generalizations about patterns of SADs one must consider carefully the biological implications of varying sampling coverage, as noted earlier by Whittaker (1965), Tokeshi (1993) and McGill (2003).