Table 1 gives the results of a keyword search of Biological Abstracts (through 2007) for the largest and most ambitious tropical arthropod surveys that provide data on singletons. As these studies clearly show, high singleton frequencies characterize typical tropical arthropod surveys, averaging 32% of species across the 71 studies. Why are there so many singletons in those surveys? Clearly, community-level singletons (and the species they represent) would have no chance to reproduce and could play no significant ecological role. Although the tropics are said to harbour many rare species, presumably most are not so rare as to lack at least a few conspecific neighbours with whom to mate successfully. Hence, singletons in biological surveys are anomalies, and as such have attracted much attention. To explain them, an array of ad hoc hypotheses has been proposed. However, we propose that, particularly when singleton frequencies are high, undersampling as a null hypothesis should precede more biological ad hoc explanations (McGill 2003).
Table 1. Summary of tropical arthropod surveys. Arthropod surveys from tropical forest sites reporting total abundance (abun., or species presence per sample for ants, Agosti et al. 2000), species richness (spp.), and singletons (reported, calculated from figures given, or approximated as Fisher's α, noted in source column). Intensity is abun./spp. A search of Biological Abstracts (1986–2007) on the terms (species richness) and (Arthropoda) and (Oriental region or Australasian region or Neotropical region or Ethiopian region) produced 514 results, many of which did not provide the required inventory statistics or were not from wet tropical sites. Those meeting our criteria, in addition to those known to us personally, are listed below. References for this table are listed in the Appendix.

| Taxon | Study site | Abun. | Spp. | Singletons | Intensity | Percentage of Singletons | Source |
|---|---|---|---|---|---|---|---|
| Arthropods | Australia | 20 507 | 759 | 271 | 27 | 36 | Basset & Kitching 1991 |
| Insecta | Costa Rica (Area 1) | 488 | 142 | 91 | 3 | 64 | Janzen & Schoener 1968 |
| Insecta | Costa Rica (Area 2) | 1362 | 262 | 165 | 5 | 63 | ‘ ’ |
| Insecta | Costa Rica (Area 3) | 4857 | 404 | 254 | 12 | 63 | ‘ ’ |
| Insecta | Costa Rica (Area 4) | 1339 | 545 | 390 | 2 | 72 | ‘ ’ |
| Insecta (leaf-chewing + sap-sucking) | New Guinea | 80 062 | 1050 | 278 | 76 | 26 | Novotný & Basset 2000 |
| Insecta | Guyana | 27 735 | 604 | 229·5 | 46 | 38 | Basset et al. 2001 (singletons calculated) |
| Blattaria | Panama (BCI) | 3224 | 79 | 15 | 41 | 19 | Wolda 1983 (Fisher's α) |
| Coleoptera | Australia (Queensland) | 10 000 | 1514 | 612 | 7 | 40 | Monteith & Davies 1984 (approx. values) |
| Coleoptera: Curculionidae | Panama (BCI) | 28 521 | 703 | 131 | 41 | 19 | Wolda 1987 (Fisher's α) |
| Coleoptera: Pselaphidae, Anthicidae | Panama (BCI) | 6482 | 114 | 19·7 | 57 | 17 | ‘ ’ |
| Coleoptera | Panama (BCI) | 34 705 | 597 | 102·5 | 58 | 17 | ‘ ’ |
| Coleoptera | New Guinea | 4840 | 633 | 321 | 8 | 51 | Allison et al. 1993 |
| Coleoptera | New Guinea | 3977 | 418 | 199 | 10 | 48 | Allison et al. 1997 |
| Coleoptera | Peru (Tambopata) | 15 869 | 3429 | 1728 | 5 | 50 | Erwin 1997 |
| Coleoptera | Sulawesi | 18 000 | 1355 | 623 | 13 | 46 | Hammond et al. 1997 (approx. values) |
| Coleoptera | Brazil | 8454 | 993 | 446·9 | 9 | 45 | Didham et al. 1998 (singletons calculated) |
| Coleoptera: Curculionidae | Honduras | 26 891 | 293 | 38 | 9 | 13 | Anderson & Ashe 2000 |
| Coleoptera: Staphylinidae | Honduras | 7349 | 224 | 53 | 33 | 24 | ‘ ’ |
| Coleoptera | Malaysia | 8028 | 1711 | 823 | 5 | 48 | Chung et al. 2000 |
| Coleoptera | Uganda | 29 736 | 1433 | 596 | 21 | 42 | Wagner 2000 |
| Coleoptera | Ecuador | 2329 | 318 | 91 | 7 | 29 | Lucky et al. 2002 |
| Coleoptera: Scarabaeinae | Bolivia | 4050 | 73 | 7 | 55 | 10 | Spector & Ayzama 2003 |
| Coleoptera: Pselaphinae, Histeridae | Ecuador | 3465 | 385 | 155 | 9 | 40 | Carlton et al. 2004 |
| Coleoptera: Phytophagous | Panama | 3009 | 364 | 139 | 8 | 38 | Ødegaard 2004 |
| Coleoptera | Ecuador | 15 181 | 2001 | 397 | 8 | 20 | Erwin et al. 2005 |
| Coleoptera: Scarabaeinae | Colombia | 7894 | 101 | 20 | 78 | 20 | Escobar et al. 2005 |
| Coleoptera | Brazil (Parana) | 1883 | 518 | 266 | 4 | 51 | Ganho & Marinoni 2005 |
| Coleoptera: Alticini | Brazil (Parana) | 1891 | 106 | 32 | 18 | 30 | Linzmeier et al. 2006 |
| Coleoptera | Australia | 29 986 | 1473 | 526 | 20 | 36 | Stork & Grimbacher 2006 |
| Diptera: Muscidae | Brazil (Parana) | 7014 | 91 | 10 | 77 | 11 | Costacurta et al. 2003 |
| Diptera: Phoridae | Costa Rica | 3341 | 115 | 20 | 29 | 17 | Brown 2004 |
| Diptera: Syrphidae | Brazil (Parana) | 392 | 76 | 12 | 5 | 16 | Marinoni et al. 2004 |
| Ephemeroptera | Panama (Corriente Grande) | 7178 | 27 | 4 | 266 | 15 | Wolda & Flowers 1985 (Fisher's α) |
| Ephemeroptera | Panama (Miramar) | 29 120 | 33 | 4 | 882 | 12 | ‘ ’ |
| Ephemeroptera | Zaire | 29 892 | 21 | 2 | 1423 | 10 | ‘ ’ |
| Hemiptera | Australia | 6004 | 98 | 35 | 61 | 36 | Andrew & Hughes 2005 |
| Homoptera | Panama (BCI) | 22 046 | 458 | 82·1 | 48 | 18 | Wolda 1987 (Fisher's α) |
| Homoptera | Panama (Pipeline Rd.) | 1324 | 332 | 126 | 4 | 38 | Wolda 1979 |
| Hymenoptera: Parasitica | Sulawesi | 700 | 293 | 179 | 2 | 61 | Noyes 1989 |
| Hymenoptera: Formicidae | Costa Rica (Monteverde) | 3998 | 53 | 6 | 75 | 11 | Longino & Nadkarni 1990 |
| Hymenoptera: Formicidae | Costa Rica | 7904 | 437 | 51 | 18 | 12 | Longino et al. 2002 |
| Hymenoptera: Apidae | Brazil (Minas Gerais) | 1183 | 20 | 6 | 59 | 30 | Nemesio & Silveira 2006 |
| Lepidoptera: butterflies | Malaysia | 9031 | 620 | 118 | 15 | 19 | Corbet 1942 |
| Lepidoptera: moths | Malaysia | 9461 | 1048 | 538 | 9 | 51 | Barlow & Woiwod 1989 |
| Lepidoptera: butterflies | Ecuador | 6690 | 130 | 20 | 5 | 15 | DeVries et al. 1997 |
| Lepidoptera: butterflies | Ecuador | 883 | 91 | 22 | 10 | 24 | DeVries et al. 1999 |
| Lepidoptera: butterflies | Ecuador | 11 861 | 128 | 18 | 93 | 14 | DeVries & Walla 2001 |
| Lepidoptera | Borneo | 485 | 53 | 16 | 9 | 30 | Schulze et al. 2001 |
| Lepidoptera: butterflies | Thailand | 1936 | 53 | 4 | 37 | 8 | Ghazoul 2002 |
| Lepidoptera: Geometridae | Ecuador | 23 720 | 868 | 161 | 27 | 19 | Hilt et al. 2006 |
| Odonata | Peru | 1537 | 136 | 31 | 11 | 23 | Louton et al. 1996 |
| Orthoptera | Panama (BCI) | 1566 | 73 | 15·9 | 21 | 22 | Wolda 1987 (Fisher's α) |
| Psocoptera | Panama (BCI) | 10 092 | 148 | 20 | 68 | 14 | Broadhead & Wolda 1985 (Fisher's α) |
| Psocoptera | Panama (Fortuna) | 4301 | 84 | 10 | 15 | 12 | ‘ ’ |
| Araneae | Bolivia (50 m) | 875 | 191 | 89 | 5 | 47 | Coddington et al. 1991, 1996 |
| Araneae | Bolivia (1200 m) | 1109 | 329 | 147 | 3 | 45 | ‘ ’ |
| Araneae | Bolivia (2200 m) | 654 | 158 | 70 | 4 | 44 | ‘ ’ |
| Araneae | Brazil (Manaus) | 75 | 62 | 32 | 1 | 52 | Höfer et al. 1994 |
| Araneae | Tobago | 1777 | 98 | 27 | 18 | 28 | Hormiga & Coddington 1994 |
| Araneae | Peru (Samiria) | 5895 | 1140 | 520 | 5 | 46 | Silva 1996 |
| Araneae | Peru (Pakitza) | 2616 | 498 | 207 | 5 | 42 | Silva & Coddington 1996 |
| Araneae | Costa Rica | 7144 | 86 | 11 | 83 | 13 | Bodner 2002 |
| Araneae | Tanzania (understorey) | 9096 | 170 | 32 | 54 | 19 | Sørensen et al. 2002 |
| Araneae | Tanzania (canopy) | 5233 | 149 | 35 | 35 | 23 | Sørensen 2003 |
| Araneae | Malaysia | 6999 | 578 | 145 | 12 | 25 | Floren & Deeleman-Reinhold 2005, personal communication |
| Araneae | Mt. Cameroon (500 m) | 573 | 231 | 93 | 2 | 40 | Coddington et al., unpublished |
| Araneae | Mt. Cameroon (3000 m) | 1555 | 55 | 14 | 28 | 25 | ‘ ’ |
| Araneae | Peru (Tambopata) | 1821 | 635 | 341 | 3 | 54 | Coddington & Silva, unpublished |
| Araneae | Peru (Manu) | 222 | 123 | 78 | 2 | 63 | Erwin & Coddington, unpublished |
| Araneae | Guyana | 5964 | 351 | 101 | 17 | 29 | This study |
| Averages | | 9372 | 464 | 176 | 61·6 | 31·6 | |
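
The two derived columns of Table 1 follow directly from the caption's definitions: intensity is abundance divided by species richness, and the percentage of singletons is singletons divided by species richness. A minimal Python sketch, using three rows copied from the table (the function and variable names are illustrative and not part of the original study):

```python
def survey_stats(abundance, species, singletons):
    """Return (sampling intensity, percentage of singletons) for one survey."""
    intensity = abundance / species              # individuals per species
    pct_singletons = 100 * singletons / species  # share of species seen once
    return intensity, pct_singletons

# (taxon and site, abundance, species, singletons) copied from Table 1
rows = [
    ("Arthropods, Australia", 20507, 759, 271),
    ("Insecta, Costa Rica (Area 1)", 488, 142, 91),
    ("Araneae, Guyana (this study)", 5964, 351, 101),
]

for label, abun, spp, single in rows:
    intensity, pct = survey_stats(abun, spp, single)
    print(f"{label}: intensity {intensity:.0f}, singletons {pct:.0f}%")
```

Rounded to whole numbers, these three rows reproduce the tabulated values (27 and 36; 3 and 64; 17 and 29).
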
Singleton tropical arthropod species are anomalous for several reasons. First, minimum viable population sizes are conventionally at least 500 individuals (Gilpin & Soulé 1986). Second, many arthropods begin life clumped because eggs are clumped when laid – in spiders eggs are clustered within an egg sac. Most nonvolant arthropods are small and probably rarely travel hundreds or even dozens of metres to mate. Third, clumped distributions in nature are far more common than random or dispersed (Krebs 1999). While clumping certainly depends on scale, at hectare scales randomness is typical of canopy trees and jaguars, not small, nonflying, sedentary arthropods such as spiders.
Ad hoc explanations for singletons often invoke aspects of the biology of particular groups, such as host or food plant specificity (Price et al. 1995). In spiders, males of sedentary web-spinning species must wander to find females (potentially passing through atypical habitat patches, i.e., tourists), and are likely to be small and rare (Vollrath & Parker 1992). General explanations include source-sink phenomena or mass-effects (e.g. ‘ecological drift,’ Hubbell 2001) at both local (‘tourist’) and regional (‘waif’ or ‘vagrant’) scales (Schmida & Wilson 1985; Pulliam 1988; Southwood 1996; Stork & Hammond 1997; Novotný & Basset 2000; Magurran & Henderson 2003; Basset et al. 2004; Ødegaard 2004). Time, space, or method ‘edge effects’ are also frequent explanations. Adults outside their breeding seasons are scarce, and if only adults are identifiable (true for spiders), will be artefactually rare (Ulrich 2001; Longino, Coddington, & Colwell 2002; Scharff et al. 2003; Basset et al. 2004). Nocturnality or seasonal migration could produce similar effects. Space edge effects are usually microhabitat preferences. Species patches just trespassing on plot boundaries might produce many ‘false’ singletons. Method edge effects are the accidental sampling of a species by an inappropriate method, such as a canopy species in a pitfall trap (Longino et al. 2002; Scharff et al. 2003). Finally, singletons may be absolutely rare, i.e. sparse with large nearest-neighbour distances throughout their range. Perhaps, as is now recognized for tropical trees (Pitman et al. 1999; Kenfack et al. 2006), we drastically underestimate the scale at which many tropical arthropod species live and ought to be sampled.
Undersampling bias and biological explanations are not mutually exclusive. However, if repeated random sampling of communities modelled on statistical parameters estimated from the sample mimics the observed results, undersampling should serve as the initial null explanation for high singleton frequencies (McGill 2003), analogous to the use of null models in other fields (Harte et al. 2001; Hubbell 2001). Variation not explained by undersampling may then be attributed to more complex causes.
Statistical methods to assess undersampling bias are relatively recent; quantitative estimates of its magnitude have been historically difficult, if not impossible, to obtain. Observed richness values are traditionally used for descriptive or comparative purposes (Groombridge 1992; Heywood & Watson 1995; Levin 2001). If high singleton frequencies indicate undersampling, however, then tropical arthropod communities are substantially larger than measured, and comparisons based on observed numbers are misleading. This has important implications for conservation biology, and also implies that typical inventories are under-resourced and/or poorly designed.
Here we use the results of an intensive 1-ha survey of spiders to test various explanations for high singleton frequency. Although spiders are typical sedentary arthropod predators and these results may apply only to that guild, high singleton frequencies also characterize inventories of other tropical arthropods (Table 1). Specifically, we test four process hypotheses and the null hypothesis of undersampling bias: (1) singletons tend to be small and therefore missed; (2) singletons tend to be males because as adults they travel further than females; (3) nearest conspecific distances exceed 0·25–1 ha spatial scales (population structure is much larger than anticipated); (4) singletons are ‘cryptic’ and hard to detect; and (5), the null hypothesis, singletons are simply an artefact of undersampling because the scope of the survey exceeded sampling resources.
Results
The five collectors accumulated 300 samples over 10 days from the 1-ha plot (Table 3) containing a total of 5965 adults (and 6953 juveniles) of 352 species, of which 101 were singletons (29%) and 56 were doubletons. The most abundant species was represented by 412 individuals. Inventory completion (observed richness/Chao1 estimate) ranged from 15% to 71% among methods, and overall was 79%. Sampling intensity (no. of ind./no. of spp.) ranged from 1·4 to 10·5 among methods and overall was 17. The survey compares favourably to other large published efforts in intensity and numbers of species encountered, considering that most spider species cannot be trapped (Table 1). However, the continually rising accumulation curves and richness estimators indicate that the inventory was still incomplete by the end of sampling (Fig. 2). The 95% upper confidence limit of the Chao2 estimator (itself only a lower-bound estimate), for example, was 520 species, but clearly had not reached an asymptote. True species richness in the hectare almost certainly exceeded 500 species, probably by a wide margin.
Table 3. Collecting methods and results. AE, BE, CR, GR, PF, SW, D and N stand for aerial, beating, cryptic, ground, pitfall, sweeping, day, and night collecting methods, respectively (see text). Sample intensity is total individuals/total species. Inventory completion is total species/Chao1 estimate.

| | AED | AEN | BED | BEN | CRD | CRN | GRD | GRN | PF | SWD | SWN | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No. of samples | 12 | 76 | 36 | 19 | 28 | 20 | 28 | 32 | 46 | 2 | 1 | 300 |
| Total individuals | 102 | 2210 | 644 | 272 | 528 | 399 | 621 | 703 | 439 | 23 | 24 | 5965 |
| Total species | 45 | 210 | 138 | 95 | 69 | 72 | 69 | 115 | 57 | 15 | 17 | 352 |
| Singletons | 32 | 73 | 57 | 53 | 29 | 34 | 30 | 54 | 25 | 11 | 14 | 101 |
| Doubletons | 2 | 29 | 29 | 15 | 9 | 13 | 9 | 17 | 4 | 2 | 1 | 56 |
| Sample intensity | 2·3 | 10·5 | 4·7 | 2·9 | 7·7 | 5·5 | 9·0 | 6·2 | 7·7 | 1·5 | 1·4 | 17·0 |
| Percentage of Singletons | 71% | 35% | 41% | 56% | 42% | 47% | 43% | 47% | 44% | 73% | 82% | 29% |
| Chao1 estimate | 301 | 302 | 194 | 189 | 112 | 120 | 115 | 200 | 135 | 45 | n/a | 443 |
| Inv. completion | 15% | 70% | 71% | 50% | 62% | 60% | 60% | 57% | 42% | 33% | | 79% |
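
Inventory completion in Table 3 is observed richness divided by the Chao1 estimate. The sketch below assumes the classic (not bias-corrected) form, Chao1 = S_obs + F1²/(2·F2), where F1 and F2 are the numbers of singletons and doubletons. The text does not say which form was used, but this one reproduces the AEN and PF columns and, with the 56 doubletons reported in the text, the overall estimate of 443. Function names are illustrative.

```python
def chao1(s_obs, singletons, doubletons):
    """Classic Chao1 lower-bound richness estimate (assumed form)."""
    if doubletons == 0:
        raise ValueError("classic Chao1 is undefined without doubletons")
    return s_obs + singletons ** 2 / (2 * doubletons)

def completion(s_obs, singletons, doubletons):
    """Inventory completion: observed richness as a fraction of Chao1."""
    return s_obs / chao1(s_obs, singletons, doubletons)

# (method, species, singletons, doubletons)
examples = [
    ("AEN", 210, 73, 29),
    ("PF", 57, 25, 4),
    ("Total", 352, 101, 56),
]
for name, s, f1, f2 in examples:
    est = chao1(s, f1, f2)
    print(f"{name}: Chao1 = {est:.0f}, completion = {completion(s, f1, f2):.0%}")
```

Because the singleton count enters as a square, a sample dominated by singletons inflates the estimate quickly, which is why the methods with the highest singleton percentages show the lowest completion.
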
The mean and standard deviation of the body lengths of adults collected were 2·89 ± 2·85 mm (thus an estimate of the average size of an adult lowland tropical moist forest spider). The mean singleton body length was 5·30 ± 4·67 mm. Singletons are significantly larger, not smaller, than the average species.
The overall male : female sex ratio in the sample was significantly female biased (1:1·3, P < 0·01). The overall singleton sex ratio was as biased as the total sample (1:1·7, P = 0·18). Sedentary web-spinner singletons were equally biased (1:2·8, P = 0·12). Singletons, therefore, are not disproportionately males of sedentary web-spinning species.
The distribution of doubletons across subplots (Fig. 1) was random (P = 0·82), as was the incidence of singletons from the centre to the outermost subplot (P = 0·80). Tripletons also showed no tendency to clump within subplots. Conspecifics, therefore, are not clumped at the coarse 0·25- to 1-ha scales tested here.
Singletons showed no taxonomic pattern, occurring in families in proportion to the latter's relative abundance (P > 0·99). If undersampling bias varied according to lifestyle defined as family identity, the effect was not detectable at this level of sampling intensity.
The observed data fit the lognormal distribution well (Fig. 3, 0·9 > P > 0·5). The predicted number of species in the modal octave S0 was 76·4 ± 13 (µ = 6·2562), the variance term (‘a’) was 0·195 ± 0·210 (σ = 3·6262) and estimated community size 694 species.
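
These figures are mutually consistent under Preston's truncated-lognormal parametrization, S(R) = S0·exp(−a²R²), with R measured in octaves from the mode. That parametrization is an assumption here, since the text does not name it, but it implies a total richness of S0·√π/a and a standard deviation of 1/(a√2) octaves, which recover the reported community size and σ. A short check in Python (function names are illustrative):

```python
import math

def preston_total_richness(s0, a):
    """Area under S(R) = s0 * exp(-(a*R)**2): total species in the community."""
    return s0 * math.sqrt(math.pi) / a

def preston_sigma_octaves(a):
    """Standard deviation of the fitted curve, in octaves (log2 abundance classes)."""
    return 1 / (a * math.sqrt(2))

s0, a = 76.4, 0.195  # modal-octave richness and variance term reported for the fit
print(round(preston_total_richness(s0, a)))  # 694
print(round(preston_sigma_octaves(a), 4))    # 3.6262
```
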
Figure 4 shows the results of 1000 random draws of constant sampling effort (6000 individuals) from simulated lognormal distributions with the above parameter values for 500, 600, 700, and 800 total species, compared to the observed data (arrow). For clarity, only 25 randomly chosen samples from each community size are plotted, as otherwise the observed data point would have been completely obscured. Observed richness rises with total richness, and numbers and percentages of singletons and doubletons rise because, as richness increases, sampling intensity decreases. On these three statistics, the empirical sample falls between the 700 and 800 species model communities, roughly agreeing with the lognormal richness estimate in Fig. 3. Overall, it falls well within the stochastic variation seen in these random draws from ‘null’ lognormal distributions (Fig. 4). True singletons in the model lognormal communities averaged only 4% of the total.
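
The random-draw procedure can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' simulation code: relative abundances are drawn afresh from a lognormal for every replicate, individuals are sampled with replacement in proportion to those abundances (so the location parameter cancels and is set to zero), and σ = 3·6262 octaves is converted to natural-log units. Only 25 replicates per community size are run to keep the example fast.

```python
import math
import random
from collections import Counter

def simulate_sample(n_species, n_individuals, sigma_octaves, rng):
    """Draw one random sample of individuals from a lognormal community."""
    sigma_ln = sigma_octaves * math.log(2)  # octaves (log2) -> natural log
    # Relative abundances only; the lognormal location cancels on normalization.
    weights = [rng.lognormvariate(0.0, sigma_ln) for _ in range(n_species)]
    draws = rng.choices(range(n_species), weights=weights, k=n_individuals)
    counts = Counter(draws)                 # species index -> individuals sampled
    abundances = Counter(counts.values())   # abundance -> number of species
    return {
        "observed": len(counts),
        "singletons": abundances[1],
        "doubletons": abundances[2],
    }

rng = random.Random(1)  # fixed seed so repeated runs give the same draws
for total in (500, 600, 700, 800):
    runs = [simulate_sample(total, 6000, 3.6262, rng) for _ in range(25)]
    mean_obs = sum(r["observed"] for r in runs) / len(runs)
    mean_single = sum(r["singletons"] for r in runs) / len(runs)
    print(f"{total} species: mean observed {mean_obs:.0f}, "
          f"mean singletons {mean_single:.0f}")
```

Comparing the observed sample against the spread of such replicates, rather than against a single expected value, is what allows the empirical point to be judged as falling inside or outside the null distribution.
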
To assess how many more specimens would be required to enable richness estimators to cover true community richness under these circumstances, we sampled 60 000 (intensity 170) and 120 000 individuals (intensity 340) from the 700 species lognormal community, thus 10 and 20 times the actual sampling effort. At an intensity of 170, percentage of singletons was 14%, and the Chao and coverage estimators were 595–600 species with Chao upper confidence intervals of 636 species – still short of the true 700 species richness. At an intensity of 340, the Chao and coverage estimators were 650–663 species, with a Chao upper confidence interval of 702 species – thus just covering the true richness value – and percentage of singletons fell to 9%.
Figure 5 depicts the logarithmic decline in singletons with sampling intensity for the data of Table 1, and predicts zero singletons at sampling intensities of roughly 1100. Sampling the model community at that intensity yielded on average 4% singletons and 658 species observed.
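
A logarithmic decline of this kind can be fitted by ordinary least squares of percentage singletons on the natural log of intensity, with the zero-singleton intensity read off where the fitted line crosses zero. The sketch below assumes that model form (the text does not give the regression equation) and uses only six rows of Table 1 for illustration, so it is not expected to reproduce the published intercept of roughly 1100.

```python
import math

def fit_log_decline(intensities, pct_singletons):
    """Least-squares fit of pct = a + b * ln(intensity); returns (a, b)."""
    xs = [math.log(i) for i in intensities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(pct_singletons) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, pct_singletons))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def zero_singleton_intensity(a, b):
    """Intensity at which the fitted line predicts 0% singletons."""
    return math.exp(-a / b)

# (abundance, species, singletons) for a few Table 1 rows
rows = [(488, 142, 91), (20507, 759, 271), (80062, 1050, 278),
        (5964, 351, 101), (7178, 27, 4), (29892, 21, 2)]
intensity = [abun / spp for abun, spp, _ in rows]
pct = [100 * single / spp for _, spp, single in rows]

a, b = fit_log_decline(intensity, pct)
print(f"slope {b:.1f} per ln unit; "
      f"zero singletons near intensity {zero_singleton_intensity(a, b):.0f}")
```
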
We present what few data exist on tropical spider densities in Table 2. Ignoring differences due to locality and construed as a ground to canopy vertical 1 m2 column, the leaf litter contains most individuals, the canopy/subcanopy less, and the shrub/understorey layer least. Given the decrease in leaf area or other substrate with height above the ground, the decline is plausible. It predicts, extremely roughly, about 2 million total spiders per hectare of tropical forest (range 1·1–3·4 M). The modelled lognormal populations ranged between 1·2 and 3·3 million individuals, which agrees with Table 2.
Discussion
In this study, the empirical sample of 6000 individuals may have included only half the species present, with singletons comprising 29% of species observed. Nonparametric richness estimators suggested only 443–460 species, a shortfall of 35% compared to the lognormal estimate. While any singleton may have been due to any of the process explanations discussed above, the simplest explanation for the high frequency is undersampling. As sampling continues and singleton frequencies drop, biological explanations become more plausible.
Two ‘biological’ explanations were statistically significant, but neither in the direction hypothesized. Singletons were significantly larger (not smaller) than the average spider. Twenty-three singletons over 7 mm caused that difference. These were mostly large cursorial species (including ctenids, sparassids, and miturgids) for which absolute densities of one, or very few individuals per hectare are plausible. Singletons are also disproportionately females, not males, but the sample in general was female-biased, and singletons no more so, even among sedentary web-spinning species where the presumed bias towards singleton wandering males should have been most pronounced. Adult male spiders are relatively short lived and wandering males experience exceptionally high mortality (Vollrath & Parker 1992); both of these factors likely contribute to a female-biased sex ratio in the inventory data, even if the sex ratio at birth were even (as it is for most spiders examined to date; see Avilés & Maddison 1991; Avilés, McCormack, Cutter & Bukowski 2000).
The other explanations tested, lifestyle, spatial edge effects, and clumping of individuals at 1-ha scales and below, were not significant. Novotný & Basset (2000) and Ulrich (2001) also found that few biological explanations of singletons were supported. Magurran & Henderson (2003) used a 21-year data set on a temperate fish community of 80 species to show that about a third to a half of the species accumulated over that time-span were tourists or waifs. In any given short-term sampling event, however, presumably few of the rare species would have been tourists or waifs. In a spider inventory of a ‘known’ fauna, Scharff et al. (2003) hypothesized 58% of singletons as phenological, methodological, or spatial edge effects, but they did not test the null hypothesis of undersampling bias. For relatively instantaneous events such as this inventory, singleton frequencies are about what one would expect from random samples of a lognormally distributed community – in this case, of about 700 species. The null hypothesis of undersampling bias cannot be rejected.
This was an intense, short-term inventory (300 person-hours), designed to yield an ‘instantaneous’ richness estimate that avoided the confounding effect of phenological change. Especially in relatively aseasonal tropical habitats, sampling year round or for multiple years might yield a more complete inventory over and above the effect of greater sampling intensity (DeVries, Walla, & Greeney 1999; Scharff et al. 2003). Increasing the sampling area might also improve efficiency, especially if, as perhaps suggested by the significantly larger singleton size and the possibility that some true singletons occur in any given hectare, we underestimated the scale at which sedentary tropical arthropods should be sampled. Their lifetime ranges may encompass much larger areas. On the other hand, species richness increases logarithmically with area (Rosenzweig 1995), burdening species richness estimates. Regardless, the key point is that the scope of the inventory must be carefully matched to available resources.
What little we know of tropical spider communities broadly agrees with the predictions of the lognormal fit (Table 2). Our empirical sample included only nine of 23 predicted octaves, yet the implied community, when randomly sampled at the same intensity, compared well to empirical observations of total species, numbers of singletons and doubletons, maximum abundance, and total numbers of individuals (Fig. 4). None of the collecting methods used in Table 2 is completely efficient; therefore, the actual hectare abundance of spiders is probably higher.
When the modelled 700 species community was sampled at an intensity of 1100, on average 658 species and 4% singletons resulted. Lognormal distributions always predict some singletons (here on average 28 or 4%), and stochastic replicates never contain all 700 species (here on average 685). Practically speaking, sampling intensities of 1100 detect just about as many species as stochastic models provide.
For these data, a sampling intensity of 340 (20 times the actual sampling effort) was just sufficient to include the known richness within the upper bound of the Chao2 estimator. This implies that inventories, as a rule of thumb, should aim for intensities between that and 1100 to obtain realistic nonparametric estimates of species richness.
Richness estimators are relatively more efficient if they can report the true richness based on relatively few data. The efficiency of available nonparametric richness estimators is poor in the sense that roughly three quarters of the community must be observed before the estimator confidence interval actually covers the true value (Walther & Morand 1998). Chao estimators, moreover, have a maximum upper bound of about half the square of the observed richness (if the sample of n species contains n-1 singletons or uniques and one doubleton or duplicate), but in practice such efficiencies are never achieved because of the improbability of so biased a sample.
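
The bound quoted above follows from the classic Chao1 formula, S_obs + F1²/(2·F2), which is assumed here because the text does not specify the form. With n observed species, n − 1 singletons and one doubleton, the estimate is n + (n − 1)²/2 = (n² + 1)/2, about half the square of the observed richness. A short check in Python (names are illustrative):

```python
def chao1(s_obs, singletons, doubletons):
    """Classic Chao1 estimate (assumed form): S_obs + F1^2 / (2 * F2)."""
    return s_obs + singletons ** 2 / (2 * doubletons)

def chao1_max(n):
    """Largest Chao1 value for n observed species: n - 1 singletons, 1 doubleton."""
    return chao1(n, n - 1, 1)

# Compare the extreme-case estimate with the closed form (n^2 + 1) / 2.
for n in (10, 100, 352):
    print(n, chao1_max(n), (n ** 2 + 1) / 2)
```
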
The lognormal distribution can potentially result in higher richness estimates than nonparametric approaches (given the same data) because it assumes the relative abundance distribution is symmetric around the modal octave (Sugihara 1980; Longino et al. 2002), and therefore tends to at least double the observed richness. A number of authors argue that empirical communities show an asymmetric excess of rare species (Nee, Harvey, & May 1991; Nekola & Brown 2007), and Hubbell and co-workers argue from first principles that such is expected (Hubbell 2001; Volkov et al. 2003). However, McGill (2003) suggests that this observed skew in species abundance distributions may also be a sampling artefact. One might also point out that the lognormal even less realistically overestimates the abundant tail of the distribution (Fig. 3, Longino et al. 2002; Magurran & Henderson 2003). However, even if the lognormal slightly underestimates rare species, that error is small compared to the gross negative bias of nonparametric estimators at small sample sizes.
The stochastic variation in small samples drawn from the same lognormal population is impressive (Fig. 4). For the 700 species case, 1000 draws of 6000 individuals produced singleton counts of 62–134 and observed richnesses of 121–414, which comfortably cover the observed statistics of 101 and 351. The lognormal distribution therefore may still be a useful method to estimate species richness under circumstances where many data are available, yet not enough for nonparametric estimators to function well. Unlike the relative abundance distribution-based estimators of Ulrich (1999, 2005), it does not require an explicit ratio of sampled to total habitat area, and thus is more practical in the field.
If general, this result implies that even large survey efforts (Table 1, Fig. 5) continue to undersample tropical arthropod biodiversity by perhaps a factor of 2 if singletons average 32% of the total. In many surveys, the figure is much higher (Table 1). Undersampling is a serious issue even for large mammal and bird surveys, where singletons average 16% (Bernard & Fenton 2002; Shankar & Sukamar 2002; McCain 2004). Consequently, typical surveys will underestimate species richness, with obvious implications for our understanding of biodiversity, and for any conservation decisions based on such data.
In summary, it appears that most tropical arthropod biodiversity surveys have been severely under-resourced if their goal was to census or estimate species richness of a defined taxonomic community at a particular place and time. Reliable methods do exist to estimate how many data are required to estimate many ecological statistics (Krebs 1999; Magurran 2004), but species richness historically is an exception. One may hope that future statistical research will improve estimator efficiency, but in the meantime the use of existing estimators dramatically exposes the gap between inventory design as implemented and the minimum necessary to obtain reliable richness estimates. Here the lognormal was more efficient than nonparametric estimators, and perhaps should be used more frequently. Species richness estimators are increasingly used in basic research to detect undersampling bias; results thus far suggest that it is ubiquitous and severe. Rather than scaling back inventory goals, we suggest that inventory analyses continue to assess undersampling bias in order to justify the budgets required to obtain adequate data. Funding sources and consumers of these essential data can scarcely argue that inadequate results are acceptable. If results continue to demonstrate that much greater sampling intensities are required, such will eventually become the norm, rather than the exception.