Arthropod diversity and the future of all-taxa inventories

Success or failure in the global challenge of cataloguing all life on Earth will depend to a large extent on our ability to survey and identify the countless millions of arthropod species that remain undescribed. Despite some of the more obvious taxonomic impediments limiting species description rates (Godfray & Knapp, 2004; Wheeler et al., 2004; Evenhuis, 2007; Zhang, 2012), May (2004, 2011) has been vocal in suggesting that the key rate-limiting step in cataloguing biodiversity will ultimately be the craft of collecting specimens in the field. As field ecologists, we naturally have a lot of sympathy with this perspective. Nevertheless, the All-Taxa Biodiversity Inventories (ATBIs) that have been spawned around the world in response to May's call for more intensive field cataloguing have met with comparatively limited success, and have struggled to maintain adequate funding. The reasons are twofold. First, as all field ecologists will know, the volume of arthropod specimens that can be obtained with modern sampling techniques vastly outstrips any capacity to identify this material (Gotelli, 2004); screening large numbers of common species to find the long tail of vanishingly rare species can be prohibitively time-consuming and expensive. Second, the criticism of a ‘stamp-collecting’ approach to ATBIs has been hard to shake, and has called into question the utility and sustainability of the approach.

An alternative approach initiated 10 years ago in Panama has recently ‘come of age’ in the first comprehensive analysis of the spatial distribution of arthropod biodiversity within a lowland tropical forest (Basset et al., 2012). Project IBISCA-Panama (Investigating the BIodiversity of Soil and Canopy Arthropods, Panama module) represents a large-scale research initiative conducted explicitly as a ‘joint venture’ between taxonomists and ecologists. The subtle but important difference from the ATBI approach is that the research is founded on a set of fundamental ecological questions about the scaling of arthropod species diversity across space and time (Basset et al., 2007), with species discovery and taxonomic inventory as collateral benefits. This is not to imply that taxonomy is in any way a secondary pursuit to ecology, or that it should be considered a service industry to science (Godfray & Knapp, 2004), but rather that the IBISCA approach fosters a dual purpose to the research by formalising a structured inventory within a hypothesis-driven framework. Ecologists clearly benefit from the credibility of taxonomic identification afforded by professional taxonomists and their expertise in compiling life history or distributional information for known species. Taxonomists, too, benefit from participating in a quantitatively structured sampling programme, which can generate new biogeographic or environmental hypotheses about species distributions, and where the relevance of the work to ecological theory and conservation practice is enhanced. We see this as very clearly meeting Gotelli's (2004) plea for greater collaboration between taxonomy and community ecology, which ultimately strengthens both disciplines.

In the case of IBISCA-Panama, a team of over 100 ecologists and taxonomists sampled a wide range of target groups across the phylogenetic breadth of arthropod taxa, using a comprehensive range of sampling approaches from ground- to canopy-level in 12 spatially distributed sites in a lowland forest (Basset et al., 2007). The structured design allowed extrapolation of total species richness estimates for multiple taxa, proportions of undescribed species in some groups, and predictions about the spatial scaling of species turnover within tropical forests (Basset et al., 2012). Surprisingly, a very large fraction (ca 60%) of the total estimated arthropod biodiversity in the region could be sampled from a spatially distributed set of sampling plots totalling as little as 1 ha – provided that these plots are widely spaced and representative of habitat heterogeneity in the region. Interestingly, patterns of spatial scaling of arthropod beta diversity showed a close correspondence to the spatial scaling of plant species turnover across the same area, and this result held not just for herbivorous arthropods but for non-herbivorous taxa as well (Basset et al., 2012).
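To make the idea of extrapolating total species richness from a structured sample concrete, the sketch below computes the bias-corrected Chao1 estimator, a widely used nonparametric method that infers the number of unseen species from the counts of species represented by exactly one or two individuals. It is an illustration only: the estimators actually used by Basset et al. (2012) are not described here, and the function name and the abundance values are hypothetical.

```python
def chao1(abundances):
    """Bias-corrected Chao1 estimate of total species richness.

    `abundances` holds the number of individuals recorded for each
    species in a sample (hypothetical data, not IBISCA results).
    """
    counts = [n for n in abundances if n > 0]   # ignore unobserved species
    s_obs = len(counts)                         # observed richness
    f1 = sum(1 for n in counts if n == 1)       # singletons
    f2 = sum(1 for n in counts if n == 2)       # doubletons
    # The bias-corrected form stays defined even when there are no doubletons.
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


# Hypothetical trap catch: 8 species observed, 4 singletons, 2 doubletons.
sample = [10, 5, 2, 2, 1, 1, 1, 1]
estimate = chao1(sample)
print(f"observed: {len(sample)}, estimated total: {estimate}")
```

The more singletons a sample contains relative to doubletons, the larger the estimated number of undetected species, which is why the long tail of rare species matters so much for inventory planning.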

Beyond the research outcomes of this one project, the IBISCA model has found broad appeal among ecologists and taxonomists, with four other similar programmes starting since IBISCA-Panama: IBISCA-Queensland (Australia, 2006–2008), IBISCA-Santo (Vanuatu, 2006), IBISCA-Auvergne (France, 2008–2009) and IBISCA-Niugini (Papua New Guinea, 2012), each with a differing hypothesis-driven framework, but with the same operating model (www.ibisca.net). The question is whether the IBISCA agenda can continue to deliver ‘big science’ outcomes, while at the same time fostering species discovery, when the number of taxonomists and the funding for taxonomy continue to dwindle. In IBISCA-Panama alone, the 24 000+ trap-days of sampling effort have generated many lifetimes of taxonomic work, and that is only for select target taxa. Almost 10 years of work were required just to synthesise the first results, and this is still admittedly an inadequate caricature of an ‘all taxa’ inventory. The scale of the task required to achieve this outcome for one rainforest in Panama exposes the sheer magnitude of assessing the spatial distribution of arthropod biodiversity in a meaningful ecological fashion that will be useful for conservation management of all taxa. Moreover, the same small subset of target taxa will continue to dominate arthropod biodiversity surveys simply because of the limited availability of taxonomic experts. Inevitably, this is going to lead to rapid saturation of global taxonomic capacity, swamping willing and able taxonomists with far too much material to handle. What is urgently needed is a genuine global commitment to support species discovery with high-throughput technologies for the molecular and morphological characterisation of arthropod biodiversity, which have been talked about for so long (Godfray, 2002; Smith, 2005; La Salle et al., 2009).

There are two major ways in which high-throughput technologies could dramatically increase sample processing rates, and the taxonomic identification and description of new species: through ‘accelerated genomics’ approaches to molecular characterisation and through ‘accelerated phenomics’ approaches to morphological characterisation of many individuals within pooled samples. In both instances, dramatic advances in high-throughput technologies are being made for organisms other than arthropods, and entomologists could benefit greatly from increased uptake of these approaches.

In the first instance, the emergence of microbial metagenomics (the direct sequencing of genomic material from environmental samples) has been a revolution in characterising the diversity (Keller & Zengler, 2004), community structure (Tyson et al., 2004) and functional roles (Zhou et al., 2012) of microbial assemblages invisible to the naked eye. In a recent article, Poole et al. (2012) consider that we are now on the cusp of similar advances in eukaryote metagenomics, which will revolutionise the way that the ecology of multicellular organisms is done in the future. Arthropod metagenomics would represent not just a quantitative increase in the speed of current single-gene molecular bar-coding approaches (with all their attendant limitations; Smith, 2005), but a step-change in the assembly of partial genomes for many individuals within a given sample, providing data from (potentially) a very large number of molecular markers at the same time. No doubt numerous technical challenges need to be overcome before this goal becomes a reality, but the potential advantages would be staggering. Not least among these advantages would be the ability to mass-screen large numbers of samples more rapidly than using conventional approaches, obtain equivalent coverage of taxonomically challenging versus taxonomically tractable groups, and identify males versus females or adults versus juveniles with equal ease. Naturally, there will be many valid concerns about the fate of valuable specimens in a metagenomic analysis, but the least-cost path of blending whole samples to extract pooled DNA need not be the only approach taken. A small tissue sample could be extracted from each individual, leaving the specimens largely intact, and then the tissue samples could be pooled for high-throughput analysis. In either case, it is important to remember that metagenomics can never be a replacement for integrated taxonomy (Johnson, 2011). It is better viewed as a genomic triage process with high breadth of coverage to point ecologists and taxonomists toward lineages of particular interest, but low depth of information content with which to validate species identities without external reference to a master taxonomy compiled by standard methods. On this latter point, we might even envisage some surprising benefits of metagenomics for the taxonomic community, in that the increased sequencing of ecological samples would drive intense demand for a more refined master taxonomy against which to validate genomic assignments. Moreover, the taxa of greatest ecological or applied interest may not be those for which there is currently adequate taxonomic revision, and metagenomics may thus provide reciprocal illumination of taxonomic priorities for the future.

The second major way in which high-throughput technologies could accelerate sample processing and species discovery is in the adoption of new approaches to the automated capture of morphological characters from individual specimens (La Salle et al., 2009). In the last 5 years there has been a revolution in the development of phenomics – the acquisition of high-dimensional phenotypic data on an organism-wide scale (Houle, 2010; Houle et al., 2010) – throughout many disciplines of biology. After the term ‘phenomics’ first appeared in the literature in 1997 there was at most only a scattering of references to the concept until around 2005, but since then interest in phenomics has increased exponentially, mostly in relation to genome-to-phenome association studies, and high-throughput phenotype imaging for accelerated plant breeding and selection (Houle et al., 2010). Arthropod research has featured prominently in several key aspects of phenomics research, such as the intensive work on mapping the Drosophila phenome, but extension to arthropod biodiversity research and species discovery has been superficial as yet (La Salle et al., 2009). Compare this with the proliferation of plant phenomics centres worldwide (e.g. the International Plant Phenomics Network, the Australian Plant Phenomics Facility, the National Plant Phenomics Centre, UK and the European Plant Phenomics Network, to name just a few). One might well ask where the arthropod phenomics centres are. We believe that a much more concerted effort is required from entomologists to adopt the latest available technologies (e.g. automated image capture, laser scanning confocal microscopy and three-dimensional reconstruction) and to ‘industrialize’ the acquisition of morphological data that will be of critical value to trait-based ecology (McGill et al., 2006), taxonomy (La Salle et al., 2009) and evolutionary systematics (Deans et al., 2012) alike. Already, moves in this direction are paying dividends for museum curators (Giles, 2005; La Salle et al., 2009; Johnson, 2011; Mantle et al., 2012), and have led to an explosion of online repositories for digital images of individual specimens (e.g. MorphBank, www.morphbank.net, and mirror sites around the world such as Morphbank-ALA in Australia, www.morphbank.ala.org.au; Mantle et al., 2012) and contributed to derivative compilations of species-level traits (e.g. Vieira et al., 2006). What is conspicuous by its absence is an equivalent drive to automate the acquisition of individual-level trait data from specimens collected in large-scale arthropod biodiversity surveys. For example, high-quality images of arthropod ‘soup’ samples from raw trap-catches would be a boon for online image analysis, as is already underway for bulk archived samples of invertebrates stored in ethanol at the Australian National Insect Collection. Naturally, this is no substitute for a professional entomologist looking at an actual specimen. However, just as with high-throughput genomics, what we are talking about here is more of a rapid initial triage process that defines the full scope of the data and narrows down the potential range of target groups of interest for further study, while at the same time producing invaluable morphological (or molecular) characters. This can only be a win-win strategy for ecology and taxonomy, and would almost certainly relieve many of the current impediments to all-taxa inventories.
