Present address: Jan Beck, Department of Environmental Sciences (Biogeography & Applied Ecology), University of Basel, St. Johanns-Vorstadt 10, 4056 Basel, Switzerland. (Tel. +41-2670810, E-mail firstname.lastname@example.org).
1. The identification of spatial patterns of species richness at regional scales, such as biodiversity hotspots, is complicated in invertebrate taxa (particularly those from tropical regions) by incomplete and biased inventory data. Estimation techniques of regional species richness from incompletely sampled landscapes have recently become available, but their applicability to data from museum collections and local taxonomic checklists has not been investigated.
2. We use records of sphingid moths in grid cells of 1° latitude and longitude on 14 Malesian islands to estimate the total species richness of these islands, as well as that of the whole archipelago, by five parametric and nonparametric estimation techniques. We compare these values to figures based on GIS-supported estimates of geographical ranges of each species.
3. Our analyses suggest that the F3 estimator of regional species richness leads to least deviation from the GIS-based estimate, followed by Chao2 and ICE, which are often employed in local comparisons. Overestimation occurred more often than underestimation and estimates of well-sampled islands (with many available grid cells) are less deviant than those of poorly sampled islands. We did not obtain conclusive results as to whether strongly undersampled grid cells are better excluded from an analysis (at the cost of reduced grid cell number) or not.
4. Synthesis and applications. In agreement with some previously published assessments, we conclude that the F3 estimator has the greatest potential for predicting regional species richness in partially sampled landscapes. Sample-based methods for estimating regional species richness can provide an alternative to the much more work-intensive geographical modelling of species distributions, which may facilitate the inclusion of tropical invertebrate groups in documenting global diversity patterns. However, under conditions of incomplete, non-systematic sampling, which is typical for museum and checklist data, errors can still be large, particularly if the number of sampling units (e.g. grid cells) is low. Estimation values should not be interpreted uncritically when the data conditions that lead to biased values are not precisely defined.
Describing and understanding species diversity patterns is a major research challenge in biodiversity, biogeography and conservation (e.g. Rosenzweig 1995, 2003) that might generate the much-needed knowledge necessary for ecosystem-wide ‘diversity-management’ in the wake of the current anthropogenic extinction wave (e.g. Sodhi et al. 2004 for South-east Asia). The shortage of reliable and complete sampling data has limited most larger-scale analyses to temperate regions or to well-sampled taxa such as plants or vertebrates (e.g. Barthlott et al. 1996; Kier & Barthlott 2001; Jetz & Rahbek 2002; Hawkins et al. 2003; Gaston & Evans 2004), whereas the incompleteness of invertebrate sampling, particularly in hyper-diverse tropical rainforest regions, has mostly prevented such efforts in other taxonomic groups.
The analysis of local species diversity on the basis of incomplete samples has led to well-established methods such as diversity indices, rarefaction and local species richness estimators (see, e.g. Southwood & Henderson 2000; pp. 462–506; Ulrich & Ollik 2005 review recent developments). Methods to estimate regional species richness on the basis of incomplete local samples and incomplete coverage of local habitat differences (β-diversity) have only recently begun to be applied (Mawdsley 1996). The software WS2M (Turner et al. 2000) can be used to compute a number of parametric and nonparametric estimators, some of which are also used for local estimates (e.g. incidence-based estimator (ICE), asymptotic Michaelis-Menten (MM) functions; Chazdon et al. 1998). It also includes new methods specifically designed to fit the shape of species accumulation curves, and thus the spatial species turnover between sites (see Turner et al. 2000; Rosenzweig et al. 2003; for details; we use their acronyms for methods throughout this paper). Rosenzweig et al. (2003) showed that some methods (MM, F3, F5) performed very well in estimating continental species richness for North American butterflies on the basis of a relatively small number of included ecoregions, providing these were not spatially clustered. Other studies (e.g. Oertli et al. 2002; Dangles et al. 2004) have also applied estimation methods from WS2M, though mostly on relatively small spatial scales. It is unknown, however, to what degree these promising results might be dependent on local completeness of samples, which is unusual for most invertebrate samples. Most recently, Hortal et al. (2006) reported rather poor performance of these techniques using arthropod data from the Azorean islands.
Museum collections and taxonomic publications with local species checklists contain a huge and largely untapped amount of information on the large-scale distribution of biodiversity (Graham et al. 2004). Data that have accumulated over decades or centuries are far more comprehensive than those that can be collected in modern field studies. These data might, if properly analysed, allow comparisons of the spatial distribution of biodiversity (e.g. ‘hotspots’; Reid 1998; Myers et al. 2000) of less investigated taxa with the well-established patterns found in plants or vertebrates, similar to that attempted at local scales in the context of biodiversity indication (Lawton et al. 1998; Moritz et al. 2001; Schulze et al. 2004; Hilt & Fiedler 2006). Such knowledge would considerably substantiate management and conservation decisions, as well as allowing investigations of the coincidence, or divergence, of patterns between different groups. ‘Specimen label’ data, as provided by museums and checklists, however, have a number of biases (as discussed in Graham et al. 2004) with as yet unexplored effects on species richness estimates.
Here we analyse the species richness of hawkmoths (Lepidoptera: Sphingidae) in Malesia, a highly fragmented archipelago with distinct biogeographical subregions (Whitmore 1987; Hall & Holloway 1998; Beck et al. 2006a). Although sphingid moths are among the best-sampled tropical insect taxa, our data still feature all the problems typical for collection-based, presence-only data, such as incomplete local samples, incomplete coverage of habitat heterogeneity, and a clustering of sampling effort in logistically favourable spots, many of which violate the inherent assumptions of most estimation techniques (e.g. random sampling). We compare sample-based estimator techniques with figures based on geographical distribution estimates for each species on two spatial scales, estimating the sphingid diversity of several large islands from records in grid cells, as well as estimating the diversity across the whole archipelago from the records on those islands. Furthermore, we assess how results depend on the utilization of more, but less thoroughly sampled, locations, as the number of sites vs. sampling intensity per site is a common trade-off in biodiversity surveys (e.g. Novotny & Missa 2000).
We compiled approximately 10 000 sphingid records from Malesia (i.e. Malaysia, Indonesia, East Timor, Brunei, Philippines and Papua New Guinea excluding Bougainville), where 302 sphingid species (four as yet unnamed) are currently known. Data originate from museum collections, publications and unpublished lists (see Beck & Kitching 2004 for details; note that one ‘record’ refers to a unique combination of species, place, altitude, year and source, but may involve from one to several hundreds of specimens). Sample sites were allocated with a precision of at least one degree latitude and longitude. More precise localization of records that can be up to 185 years old was not feasible in most cases. A small number of records (c. 0·2%) were excluded from all analyses due to doubts about their credibility (possible misidentifications or mislabelling, see Beck & Kitching 2004). Fewer than 5% of records had no precise location information available. These records were set to the most likely grid cell (given logistic conditions at the time of sampling) and included in analyses. Analyses of 1°-grid cells (see below) are slightly biased as their actual surface area changes with latitude as well as topography, but with the ‘inner-tropical’ extent of our analyses (11°S to 20°N) this error can be neglected.
Records were entered into a Geographic Information System (GIS; ArcView 3·2, 2000). We estimated the geographical ranges of each species as described elsewhere (Beck & Kitching 2004). In short, we applied a mixture of a niche-based approach, underlying records with habitat maps (such as altitudinal relief, vegetation zones, precipitation or minimum winter temperature, cf. Cowley et al. 2000) to find potentially suitable habitats for each species, and a consideration of historical and species-specific constraints. The limits of many species within Malesia are determined to a large extent by historical dispersal limits rather than by habitat alone, so we did not extend species ranges beyond actual records across such known borders. Uneven sampling efforts in different regions (see also Fagan & Kareiva 1997; Soberón et al. 2000; Graham et al. 2004) and the likelihood of finding, correctly identifying or reporting different species were carefully assessed on a species-by-species basis. Taking all these factors into consideration, the best estimate of each species’ range was digitized. We did not use an explicit computer model to estimate ranges, as the analysis of presence-only data is still problematic for statistical habitat models (e.g. Zaniewski et al. 2002), particularly for rare species (Engler et al. 2004), despite successful application on smaller geographical scales (e.g. Iverson & Prasad 1998; Ray et al. 2002; Raxworthy et al. 2003). Overlaying species’ ranges produced estimated checklists for islands of the Malesian region. Original records, range maps and island checklists are presented by Beck & Kitching (2004).
We used the software WS2M (Turner et al. 2000) to estimate species richness of the larger islands of Malesia (we avoided small islands because of our relatively large grid cells) on the basis of recorded presences in 1°-grid cells (Fig. 1). Furthermore, we estimated the species richness of Malesia as a whole from checklists of these islands as well as from the grid cells within them. For some well-sampled islands, as well as Malesia as a whole, we repeated analyses using only grid cells with at least 10, 20, 30 or 100 records, respectively, to exclude data from severely undersampled grid cells. Sampling units were randomized 1000 times to remove entry order effects. We focused our investigation on three methods that are commonly used in estimations of local habitat diversity (Chao2, ICE, MM), but which are also applied occasionally to estimations at regional scales (e.g. Folgarait et al. 2005), and two methods (F3, F5) that were specifically designed and recommended for estimating regional diversity in unsampled habitats (Rosenzweig et al. 2003). These methods represent variations of a family of asymptotic species-accumulation curves that are all forced through the point (1,1) (one species at a sample size of one individual), but vary in their curvature (see Rosenzweig et al. 2003 for details; see Appendix S1, Supplementary material, for formulae of all estimators and details on program settings).
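As an illustration of how an incidence-based estimator of this kind works, the following is a minimal sketch of the bias-corrected Chao2 estimator computed from a presence/absence matrix of grid cells by species. It is not the WS2M implementation: the exact formulae and settings used in the analyses are those given in Appendix S1, and the function name `chao2` and the matrix layout are assumptions made for this example only.

```python
from typing import Sequence


def chao2(incidence: Sequence[Sequence[int]]) -> float:
    """Bias-corrected Chao2 estimate from a sampling-unit x species 0/1 matrix.

    Rows are sampling units (e.g. 1-degree grid cells), columns are species.
    Hypothetical helper for illustration; not the WS2M code.
    """
    m = len(incidence)                       # number of sampling units
    n_species = len(incidence[0])
    # Number of sampling units in which each species was recorded
    freq = [sum(row[j] for row in incidence) for j in range(n_species)]
    s_obs = sum(1 for f in freq if f > 0)    # observed species richness
    q1 = sum(1 for f in freq if f == 1)      # uniques: found in exactly one unit
    q2 = sum(1 for f in freq if f == 2)      # duplicates: found in exactly two units
    # Bias-corrected form, which stays defined when there are no duplicates
    return s_obs + ((m - 1) / m) * q1 * (q1 - 1) / (2 * (q2 + 1))


# Toy data (not from the study): 4 grid cells, 5 species
cells = [
    [1, 1, 0, 0, 1],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0],
]
print(chao2(cells))
```

The estimate rises with the number of species recorded in only one sampling unit, which is why estimators of this type can respond strongly to data with few grid cells and many singly recorded species.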
The software EstimateS (version 7.5, Colwell 2005) also allows calculation of Chao2, ICE and MM, using an analytically derived species-accumulation curve rather than randomized data (‘Mao Tau’ function; Colwell et al. 2004; Mao et al. 2005). As expected, scores from the two programs are identical for ICE and very similar for Chao2 and MM (median divergence < 3 species). Large deviations were only observed on an island that generally produced unreasonable estimates (Palawan, see below). For consistency we chose to present only WS2M-scores. Many more nonparametric or curve-fitting estimation techniques exist (e.g. Colwell & Coddington 1994; Colwell et al. 2004; Ulrich & Ollik 2005; Jiménez-Valverde et al. 2006), and some have been used in similar comparisons (e.g. Chazdon et al. 1998; Hortal et al. 2006). We chose to concentrate here on a handful of easily applied measures that appeared promising for applications at large geographical scales.
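The difference between a randomized and an analytically derived accumulation curve can be made concrete with a short sketch. The function below computes the expected number of species in h pooled sampling units directly from incidence frequencies, following the combinatorial expectation described by Colwell et al. (2004). It is a sketch under stated assumptions, not the EstimateS code; the name `mao_tau` and the input layout are chosen for this example.

```python
from math import comb
from typing import List, Sequence


def mao_tau(incidence: Sequence[Sequence[int]]) -> List[float]:
    """Expected species richness for h = 1..H pooled sampling units.

    Rows are sampling units, columns are species (0/1 incidence).
    Hypothetical helper for illustration; not the EstimateS code.
    """
    H = len(incidence)
    n_species = len(incidence[0])
    # Incidence frequency of each recorded species
    freq = [sum(row[j] for row in incidence) for j in range(n_species)]
    freq = [f for f in freq if f > 0]
    curve = []
    for h in range(1, H + 1):
        # Probability that a species present in f units is absent from
        # all h units drawn without replacement: C(H - f, h) / C(H, h)
        missed = sum(comb(H - f, h) / comb(H, h) for f in freq)
        curve.append(len(freq) - missed)
    return curve
```

For a single sampling unit the curve equals the mean richness per unit, and for all H units it equals the observed richness, so no repeated reshuffling of sampling order is needed to obtain a smooth curve.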
We used 7867 records from 194 grid cells on 14 islands to produce sample-based richness estimates. Grid cells and islands differ substantially in sampling effort, i.e. in the number of records (Fig. 1), which correlates with the number of recorded species (N = 194, Spearman's R = 0·98, P < 0·0001).
Table 1 gives results of the various estimation techniques for the islands. Most estimates are considerably higher than the recorded or GIS-estimated species richness (median difference across all estimators and islands: 17·4 and 16·1 species, respectively). Ranks of estimation methods correlate (N = 14, P < 0·001, Spearman's R > 0·90 for all pairs except GIS-MM, R = 0·84). Linear regressions of GIS-estimates with other methods are significant only for F3 (Pearson's r² = 0·69, P < 0·0001), Chao2 (r² = 0·36, P = 0·025) and MM (r² = 0·29, P = 0·048).
Table 1. Number of 1°-grid cells, records, recorded species (Sobs) and various estimates of species richness for 14 islands. Scores for the estimator F6, which was recommended against in Rosenzweig et al. (2003), are also presented, but were not considered in analysis. Estimator scores (except GIS) were calculated with WS2M, applying a 1000-fold randomization of sampling order, otherwise using recommended settings (see Appendix S1, Supplementary material, for details)
For further comparisons, we calculated residuals from the GIS-estimate for the sample-based estimates and normalized their absolute values by the expected number of species according to the GIS-estimate. Negative residuals are relatively small, whereas some very large positive deviations occur (Fig. 2). Estimates are closer to GIS-derived measures on islands with a large number of grid cells or records, but rank correlations with residuals are not significant except for grid cells and MM-scores (N = 14, Spearman's R = −0·62, P < 0·02). Surprisingly, normalizing grid cell count and records for island size (i.e. grid cells or records per km²) did not yield clear relations with estimate residuals (not shown) and produced non-significantly positive rank correlations.
Medians of absolute residuals (in per cent of GIS species richness) lead to a rank order of deviation from the GIS-estimate (Table 2), and island-specific differences in estimator performance are significant (Friedman ANOVA: N = 14, χ² = 26·11, P < 0·0001). F3 is judged closest to GIS-scores, with Chao2 and ICE closely following. In contrast, F5, and particularly MM, deviate considerably from GIS estimates. A consideration of mean deviations of the non-normal distributed data (not shown) confirms this rank order, but differences in estimator performance become evident even within the ‘best’ three methods, as gross overestimations occurred least commonly in F3. However, even after excluding data for Palawan (which did not produce reasonable results with any sample-based technique, Table 1) deviations from the GIS-estimate ranged from 11 (F3) to > 30 (MM) species, or 13 to > 50% (means across islands).
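The ranking procedure described above can be sketched in a few lines. The example below computes, for each estimator, the absolute residual from the GIS-estimate as a percentage of the GIS value on each island, and then sorts estimators by the median of these percentages. The function name and the dictionary layout are assumptions made for this illustration, and any input values used with it are the reader's own; no values from Tables 1 or 2 are reproduced here.

```python
from statistics import median
from typing import Dict, List, Tuple


def rank_by_median_deviation(
    estimates: Dict[str, List[float]], gis: List[float]
) -> List[Tuple[str, float]]:
    """Rank estimators by median absolute residual, in per cent of the GIS value.

    `estimates` maps an estimator name to its per-island estimates;
    `gis` holds the GIS-based richness of the same islands, in the same order.
    Hypothetical helper for illustration only.
    """
    ranked = []
    for name, values in estimates.items():
        # Absolute residual normalized by the GIS-expected species number
        pct = [abs(est - ref) / ref * 100.0 for est, ref in zip(values, gis)]
        ranked.append((name, median(pct)))
    # Smallest median deviation first
    return sorted(ranked, key=lambda pair: pair[1])
```

Using the median rather than the mean keeps a single extreme island from dominating the ranking, which matters here because the residuals are not normally distributed.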
Table 2. Species richness estimators, sorted by median residuals from GIS-estimates of 14 islands. Absolute residuals were normalized by each island's species richness (according to GIS-estimate). Medians were also calculated separately for nine well-sampled (Sobs/GIS > 0·9) and five poorly sampled (Sobs/GIS = 0·7–0·9) islands
Table 2 columns: Median (%) [all islands]; Median (%) [well-sampled]; Median (%) [poorly sampled]
On the basis of normalized residuals of the estimators, the species richness of New Guinea has been assessed relatively well (a counter-intuitive result, as this island contains huge stretches of virtually unexplored forests), whereas the data from Palawan led to particularly bad estimates. Small numbers of grid cells lead to erratic estimator values, which are least extreme in F3 (see Fig. 3 for some examples). The general assessment of estimator performance did not change if only the well sampled or only the poorly sampled islands were considered, although scores for some estimators varied (Table 2).
Estimates of the species richness of the whole Malesian archipelago, based on checklists for 14 large islands as well as the grid cells within them, are compared with the total number of currently known species from the region, including islands and biogeographical subregions that are not part of our data (Fig. 4). Among those based on grid cells, F3 and F5 produce stable estimates that are very close to the number of known species after about 100 grid cells, whereas the other estimators remained below that value. However, estimators based on island checklists are still rising after inclusion of all 14 islands and lead to estimates considerably higher than the number of known species.
The exclusion of severely undersampled sampling units (grid cells or islands) does not lead to unequivocal changes in estimator scores (Fig. 5). There are no consistent patterns that hold across estimator type or island. Even on Borneo, where, after exclusion of all grid cells with fewer than 30 records the data still contain all recorded species, estimators do not follow a common trend toward higher or lower scores.
Our analyses indicate that of the sample-based methods of estimating regional species diversity, F3 performed best overall, thereby corroborating the assessment of Rosenzweig et al. (2003). The estimates still have considerable deviations of up to more than 25% from supposedly true values (see below for discussion), even if scores for Palawan with its extremely poor estimations are excluded (Table 1). The Michaelis-Menten estimator performed worst in our comparison, never reaching values within 10% of the GIS assessment.
Estimation techniques that were designed for application to local samples did not perform any worse than those specifically formulated for assessments in a regional context, contrary to results of Rosenzweig et al. (2003). Furthermore, we did not find a general distinction in estimation quality between curve-fitting techniques (F3, F5, MM) and those that do not depend directly on the shape of the species accumulation curve (Chao2, ICE). On theoretical grounds, nonparametric estimators might be expected to be more robust to different relative abundance distributions (RADs) in samples (which would be advantageous, all else being equal), but a general sensitivity to RADs was nevertheless evident for all estimators in simulation studies (Brose et al. 2003).
However, a large number of randomizations were necessary to obtain smooth accumulation curves close to the analytically derived curves computed using EstimateS (not shown; Colwell 2005; an application of the ‘Mao Tau’ function would probably benefit all estimation techniques, but software is not yet available). Rosenzweig et al. (2003) presented the estimator F6 as theoretically suitable, but obtained poor results in their empirical test on North American butterflies (see below), and hence did not recommend it. However, we did not find any indication of an outstandingly bad performance except for its failure in one case only to compute a meaningful value (see Table 1 for scores, analysis not shown).
Although overestimation was common among all estimation techniques applied to the 14 islands in this analysis, the application of local estimating techniques such as Chao2 or MM to a substantial number of quantitatively sampled sites on Borneo led to an underestimation of the island's totally known species inventory (Beck et al. 2006b).
Brose et al. (2003) and Brose & Martinez (2004) have shown that the accuracy of various local estimators is dependent on the sample coverage. Indeed, F3 and Chao2 scored much more closely to the GIS estimate on five poorly sampled islands (Table 2), whereas other techniques performed better on well-sampled islands. However, the number of islands might be too small for the meaningful documentation of a pattern.
The reliability of geography-based estimates
Rosenzweig et al. (2003) tested butterfly data from an artificially rarefied number of North American ecoregions, which they reasonably assumed to have been completely sampled, against the total known species richness of the continent. Thus, although perfect to test the precision of estimation methods, their data are unrealistically good for the proposed purpose of producing estimates from incompletely sampled habitats. Furthermore, North American species richness is probably mostly determined by climate-mediated habitat differences (Kerr et al. 2001). In contrast, our data are strongly affected by historical biogeography (Beck et al. 2006a) and contain all the biases that may be found in distributional data of tropical invertebrates in a large spatial context: we cannot test estimates against values of known species richness. Rather, we have compared results from sample-based methods against our far more time-intensive GIS estimates, which we consider to be more reliable. Hence, the value of our analyses depends on the correctness of that assessment, which cannot be empirically confirmed or refuted, as large-scale, systematic sampling across broad areas of Malesia is unlikely to be undertaken in the foreseeable future. However, it appears very unlikely to us that a truly large number of unexpected taxa are present on the islands used in this analysis, although new records will certainly prove some GIS estimates to be incorrect by a few species.
With a large number of sampling units, sample-based methods estimated the number of species at or below that currently known across Malesia, whereas they produced considerably higher estimates for the same data using a spatially more coarse resolution (Fig. 4). This shows the dependency of estimator scores on the number of sampling units, which we also noticed in our comparison between islands (Fig. 3).
As an independent measure of estimator quality we correlated log-transformed island sizes with species richness estimates, expecting a positive relation (the ‘species-area relationship’, e.g. Rosenzweig 1995) despite additional historical and environmental impacts on species richness (Beck et al. 2006a). A significant correlation was found only for the GIS-estimate (N = 14, Pearson's r² = 0·44, P < 0·01), and not for any of the sample-based methods (all r² < 0·01 except F3, r² = 0·12), although after exclusion of the Palawan data all methods produced significant (P < 0·05), positive species-area curves, with r² > 0·50 for Chao2, GIS, F3 and ICE. This makes it seem likely that the high estimates for islands with a low number of grid cells, most particularly on Palawan, are indeed errors, rather than the result of the GIS method greatly underestimating the islands’ species richness. Furthermore, GIS estimates led to easily interpretable patterns of regional species richness (Beck et al. 2006a), and the frequency distribution of species’ range sizes follows the same pattern found in many other taxa (Beck et al. 2006c).
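A plausibility check of this kind is simple to reproduce for other data sets. The sketch below computes the squared Pearson correlation between log10-transformed island area and a set of richness estimates. The function name is an assumption made for this example, and significance testing is left to a statistics package. Because the function returns only r², the sign of the relation has to be checked separately.

```python
from math import log10
from typing import Sequence


def species_area_r2(areas_km2: Sequence[float], richness: Sequence[float]) -> float:
    """Squared Pearson correlation between log10(area) and species richness.

    Hypothetical helper for illustration; inputs must be of equal length,
    with positive areas and non-constant richness values.
    """
    x = [log10(a) for a in areas_km2]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(richness) / n
    # Sums of squares and cross-products around the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, richness))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in richness)
    return sxy * sxy / (sxx * syy)
```

Running the function once per estimator, with and without a suspect island, shows how strongly a single outlying estimate can weaken an otherwise clear species-area relation.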
Application of estimators
Estimators often performed very well on well-sampled islands, but the most serious problems arose in just those situations where estimates would be most needed, on islands where observed species richness is certainly incomplete. Erroneous estimates occur in many estimators of local or regional species richness at small sample sizes (Ulrich & Ollik 2005 and references therein). Underestimations are commonly described in the literature, whereas here we report mostly overestimation errors. This point warns particularly against the uncritical application of species richness estimators from specimen-label data to heavily under-sampled locations or taxa, especially considering that every island in our present analysis had at least 70% of its GIS-expected species inventory recorded. Only a better understanding of what data biases lead to erroneous estimates may allow a priori excluding such samples in the future. Similar problems are known from local species richness estimates, where estimates based on a small number of sampling units may be quite unreliable before reaching more or less asymptotic values at higher sample sizes (see, e.g. Brehm et al. 2005 for data). From our analyses we tentatively conclude that a large number of sampling units (grid cells) is a good guard against such effects. Many fine-scaled sampling units (grid cells) produced more stable, and probably also more reliable, estimates than few large-scaled samples (island checklists), but may lead to underestimations if biogeographical regions (or other sources of β-diversity) are excluded from sampling.
Unfortunately, our analyses allow no general conclusions to be drawn regarding the effects of very poorly sampled grid cells on species richness estimates. We tentatively suggest that if data are available, sample-based estimates may benefit from using many small rather than few large grid cells. When the necessary data are available, estimates should be checked for a realistic range of values by comparing them to independent (i.e. not sample-based) methods, such as rank–abundance distributions (Ulrich & Ollik 2005), species–area curves (Krishnamani et al. 2004), GIS-supported habitat models (e.g. Iverson & Prasad 1998; Wiley et al. 2003) or species description rates (e.g. Gaston et al. 1995; Scoble et al. 1995; Solow & Smith 2005).
Synthesis and application
The use of sample-based methods for estimating regional species richness can provide an alternative to the much more work-intensive geographical modelling of species distribution, if the aim is a comparison of species richness irrespective of species identity. This may facilitate the inclusion of tropical invertebrate groups in documenting global diversity patterns. For many taxa, the necessary distribution records may already be available in museums (e.g. Graham et al. 2004) and (often locally) published checklists (see, e.g. literature in Beck & Kitching 2004), but these data have to be processed to correct for their incompleteness. We concur with previously published assessments that the F3 estimator (Rosenzweig et al. 2003) has the greatest potential of predicting regional species richness in incompletely sampled landscapes. However, under conditions of locally incomplete, spatially non-systematic sampling, which is typical for museum and checklist data, estimate errors were still moderate to large (> 20% for many islands), particularly if the number of sampling units (e.g. grid cells) is low. Estimation values should not be used uncritically as long as the data conditions that lead to errors are not precisely defined.
We are very grateful to all those professional and non-professional collectors who supplied unpublished distribution records of species (full details are listed in Beck & Kitching 2004). We particularly thank Konrad Fiedler and Robert Colwell for helpful comments on earlier drafts of the manuscript. Parts of data preparation were carried out during J.B.'s PhD work at Würzburg University, supported financially by scholarships of the German Research Council (DFG), the German Academic Exchange Service (DAAD) and the Sys-Resource Program of the EU.