Progress in science is episodic and uneven; some fields get hot while others dwindle. Fungal ecology is undergoing an observationally driven boom, resulting from the application of pyrosequencing technology. Three papers in the current issue of New Phytologist (Buée et al., 449–456; Jumpponen & Jones, 438–448; Öpik et al., 424–437) illustrate the state of the science, using 454 sequencing of clone libraries derived from PCR-amplified ribosomal RNA (rRNA) genes to characterize fungal communities. The volume of data in these studies is staggering; Buée et al. generated 166 000 sequences, while Öpik et al. and Jumpponen & Jones analyzed 125 000 and 18 000 sequences, respectively. The flood of data from these and similar studies provides ecologists with opportunities to describe fungal communities in greater detail than ever before. At the same time, the new technology challenges the taxonomic community to provide resources for molecular identification and to consider how (or whether) the rapidly accumulating sequence data can be used for species discovery and description.
Öpik et al. targeted partial nuclear small subunit (18S) rRNA genes in arbuscular mycorrhizal fungi (Glomeromycota) in the roots of ten different plant species in a 10 × 10-m plot. The plants were divided into generalists, which can be found in a wide range of habitat types, and forest specialists. The fungal sequences were assigned to 48 Virtual Taxa (VT), defined as groups with sequence similarity of at least 97%. Forest specialist plants harboured much higher diversity of Glomeromycota than generalists; 22 VT were limited to forest specialists, while only one VT was restricted to generalist species. A phylogenetic tree of the VT suggests that there have been multiple transitions between association with forest specialists and generalists in the evolution of Glomeromycota (although Glomus group A appears to be particularly rich in taxa associated with generalists, while the Acaulosporaceae and Glomus group C are represented almost entirely by VT associated with forest specialists; see Fig. 1 in Öpik et al.). Similarly, cluster analysis of the fungal communities of plant hosts yielded a topology that separates VT associated with forest specialists and generalists into distinct groups, but that conflicts with current views of angiosperm phylogeny (e.g. the forest specialist Galeobdolon luteum and the generalist Veronica chamaedrys are in separate clusters, although both are members of the Lamiales; Fig. 3 in Öpik et al.). In sum, the work of Öpik et al. suggests that there are co-adapted pools of forest-specialist Glomeromycota and plants, and that habitat preference may be a more important driver of host–fungus associations than host or fungus phylogeny, at least over very broad evolutionary scales.
The other studies are largely in the realm of descriptive ecology and molecular natural history (this is not intended as a criticism). Jumpponen & Jones and Buée et al. both used sequences of the internal transcribed spacer (ITS) region of nuclear rRNA genes, which has become the de facto fungal barcode locus for most Dikarya (Basidiomycota and Ascomycota; Seifert, 2008). Jumpponen & Jones studied the fungal community in the phyllosphere (including epiphytes and endophytes) of a single tree species, Quercus macrocarpa, which was sampled in urban and rural environments. Richness and diversity of the phyllosphere fungal communities were greater in the rural habitats than in the urban habitats. Buée et al. characterized fungal diversity in forest soils associated with six different dominant tree species. An important finding of Buée et al. was that use of a custom-curated sequence database enabled many more sequences to be identified than a wholesale BLAST analysis of the notoriously error-prone GenBank database (Nilsson et al., 2006).
Ecologists are naturally excited about the promise of pyrosequencing methods for describing fungal communities, but systematists should also consider how the outpouring of data from molecular ecological studies impacts their discipline. At a basic level, taxonomy has two complementary goals: (1) discovery and description of new taxa (including clades), and (2) provision of resources for identification. The studies of Öpik et al., Jumpponen & Jones and Buée et al. prompted us to ask whether ecology or taxonomy is currently leading the way in the discovery of new taxa, and also to assess the growth of the molecular database for taxon identification.
To evaluate progress in species description, we surveyed new names for species (i.e. excluding infraspecific taxa and combinations) of Ascomycota, Basidiomycota and Glomeromycota recorded in the Index of Fungi (published every 6 months by CABI) from 2000 to 2009 (Table 1). The overall rate of species discovery has been fairly level for the last 10 yr, with an average of 1223 new species described per year from 2000 to 2008 (the records for 2009 are incomplete), mostly Ascomycota. In 2008, the last year for which complete data are available, 1009 species were described. These figures probably overestimate the rate of species discovery somewhat, because of redescriptions of previously published taxa, which have been estimated to occur at a rate of one synonym for every 2.5 truly novel descriptions (Hawksworth, 1991).
Table 1. New species of Ascomycota, Basidiomycota and Glomeromycota recorded in the Index of Fungi from 2000 to 2009. Each cell gives the number of new species, the number of those species represented by sequences in GenBank, and (in parentheses) the percentage represented. Data for 2009 are incomplete.

|Year|Ascomycota|Basidiomycota|Glomeromycota|All groups|
|---|---|---|---|---|
|2000|746, 142 (19)|427, 74 (17)|4, 2 (50)|1177, 218 (19)|
|2001|841, 198 (24)|439, 93 (21)|2, 0 (0)|1282, 291 (23)|
|2002|815, 121 (15)|387, 74 (19)|9, 0 (0)|1211, 195 (16)|
|2003|913, 162 (18)|459, 77 (17)|3, 2 (67)|1375, 241 (18)|
|2004|916, 279 (30)|545, 104 (19)|8, 4 (50)|1469, 387 (26)|
|2005|687, 220 (32)|283, 67 (24)|2, 1 (50)|972, 288 (30)|
|2006|779, 237 (30)|356, 58 (16)|4, 2 (50)|1139, 297 (26)|
|2007|947, 225 (24)|426, 95 (22)|2, 2 (100)|1375, 322 (23)|
|2008|709, 230 (32)|296, 41 (14)|4, 1 (25)|1009, 272 (27)|
|2009|260, 61 (23)|89, 18 (20)|6, 2 (33)|355, 81 (23)|
|Total|7613, 1875 (25)|3707, 701 (19)|44, 16 (36)|11 364, 2592 (23)|
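The average rate quoted above can be rechecked from the All groups column of Table 1. The following Python sketch is only an illustration, not the authors' analysis: the names `TABLE1_ALL`, `mean_new_species` and `novel_fraction` are ours, and the synonym adjustment assumes that "one synonym for every 2.5 truly novel descriptions" means that 2.5 of every 3.5 descriptions are novel.

```python
# All-groups column of Table 1: year -> (new species, species with GenBank sequences).
TABLE1_ALL = {
    2000: (1177, 218), 2001: (1282, 291), 2002: (1211, 195),
    2003: (1375, 241), 2004: (1469, 387), 2005: (972, 288),
    2006: (1139, 297), 2007: (1375, 322), 2008: (1009, 272),
    2009: (355, 81),  # incomplete year, excluded from the average below
}


def mean_new_species(table, first_year, last_year):
    """Mean number of new species per year over an inclusive range of years."""
    years = [y for y in table if first_year <= y <= last_year]
    return sum(table[y][0] for y in years) / len(years)


def novel_fraction(novel_per_synonym=2.5):
    """Share of descriptions that are novel, ASSUMING one synonym
    accompanies every `novel_per_synonym` novel descriptions."""
    return novel_per_synonym / (novel_per_synonym + 1.0)


mean_2000_2008 = mean_new_species(TABLE1_ALL, 2000, 2008)
adjusted_mean = mean_2000_2008 * novel_fraction()

print(f"Mean new species per year, 2000-2008: {mean_2000_2008:.0f}")
print(f"After the assumed synonym adjustment: {adjusted_mean:.0f}")
```

Under that reading of the synonym rate, roughly 870 of the 1223 names described per year would be truly novel species.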
It is very difficult to estimate how many undescribed species of fungi have been detected by molecular ecologists in recent years, but the data from Öpik et al., Jumpponen & Jones and Buée et al. provide a hint about the magnitude of taxon discovery that is occurring. Jumpponen & Jones assigned their sequences to 689 operational taxonomic units (OTUs) based on a 95% sequence identity criterion, including 329 singletons and 360 nonsingletons. A BLAST analysis of the nonsingletons found that 214 OTUs had no matches in GenBank at the 95% identity level, and the top hits for 26 of the remaining OTUs were not identified to species level (Suppl. Table S2, provided as part of Jumpponen & Jones). Thus, c. 240 OTUs could not be referred to a named species. This is a conservative figure, because it excludes the 329 singleton OTUs that evidently were not subjected to BLAST analysis. Buée et al. pooled their sequences into c. 1000 OTUs using a 97% sequence identity criterion and then used the program MEGAN to place the sequences at the least-inclusive level possible in the NCBI taxonomy (Huson et al., 2007). MEGAN placed about 76 000 sequences into 111 taxa identified at the species level, while the remaining 90 000 sequences were placed into 65 more inclusive groups (i.e. genera, families, etc.; Buée et al., Suppl. Table S2). If one assumes an approximate correspondence between species assigned by MEGAN and OTUs, based on a 97% similarity cut-off, then c. 890 of the OTUs found by Buée et al. could not be identified to the species level. Collectively, these two ecological studies detected at least 1130 potentially novel taxa, which exceeds the total number of species of Ascomycota, Basidiomycota and Glomeromycota described by the entire taxonomic community in 2008 (Table 1).
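The tally in the preceding paragraph can be laid out step by step. The Python sketch below simply restates the figures quoted in the text; the variable names are ours, and rounding the Buée et al. figure to the nearest ten is our assumption, made because the OTU count is itself approximate (c. 1000).

```python
# Jumpponen & Jones: nonsingleton OTUs clustered at 95% identity.
jj_no_genbank_match = 214   # no GenBank match at the 95% identity level
jj_top_hit_unnamed = 26     # top hit not identified to species level
jj_unnamed = jj_no_genbank_match + jj_top_hit_unnamed

# Buee et al.: c. 1000 OTUs clustered at 97% identity.
buee_otus = 1000            # approximate count given in the text
buee_species_level = 111    # taxa identified to species level
# ASSUMPTION (stated in the text): one species-level taxon corresponds to one OTU.
# The OTU count is approximate, so the difference is rounded to the nearest ten.
buee_unnamed = round(buee_otus - buee_species_level, -1)

total_unnamed = jj_unnamed + buee_unnamed
species_described_2008 = 1009  # All groups, Table 1

print(f"Jumpponen & Jones, unnamed OTUs: {jj_unnamed}")
print(f"Buee et al., unnamed OTUs (approx.): {buee_unnamed}")
print(f"Combined: {total_unnamed} vs {species_described_2008} species described in 2008")
```

The combined figure omits the 329 singleton OTUs of Jumpponen & Jones, which is why the text calls it a minimum.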
Öpik et al. compared their Glomeromycota 18S sequences with sequences in a reference database, containing 243 predefined VT, called MaarjAM (http://moritz.botany.ut.ee/maarjam/). Öpik et al. concluded that all 48 of the VT that they detected were already represented in MaarjAM (potentially novel VT were observed, but these were interpreted as sequencing artifacts). Nevertheless, the composition of MaarjAM clearly indicates that most new taxa of Glomeromycota are being discovered as environmental sequences. One hundred and eighty-four (76%) of the VT in MaarjAM are composed of sequences known only from molecular environmental surveys, which is > 4 times the total number of new species of Glomeromycota reported in the Index of Fungi over the last 10 yr combined (Table 1).
Of course, some of the unidentified sequences may simply represent described species that have not yet been subjected to molecular analysis, but it is unlikely that many of the unidentified OTUs are in this category. As of this writing, the GenBank taxonomy browser reports that 19 848 fungal species are represented in the database. This is almost 20% of the c. 100 000 species of fungi that have been described (Kirk et al., 2008), assuming no redundancy or error, but it is only 0.6–1.3% of the 1.5–3.5 million fungal species that have been estimated to exist (Hawksworth, 2001; O’Brien et al., 2005). Therefore, most of the unidentified molecular OTUs probably represent undescribed taxa.
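The coverage percentages above follow from three numbers given in the text. A minimal Python sketch of the arithmetic (the names `percent`, `genbank_species` and so on are ours):

```python
genbank_species = 19_848      # fungal species represented in GenBank
described_species = 100_000   # c. 100 000 described species (Kirk et al., 2008)
estimated_low = 1_500_000     # lower estimate of extant fungal species
estimated_high = 3_500_000    # upper estimate of extant fungal species


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


of_described = percent(genbank_species, described_species)
of_high_estimate = percent(genbank_species, estimated_high)
of_low_estimate = percent(genbank_species, estimated_low)

print(f"Share of described species: {of_described:.1f}%")
print(f"Share of estimated species: {of_high_estimate:.1f}% to {of_low_estimate:.1f}%")
```

Note that the smaller percentage corresponds to the larger estimate of total diversity, and vice versa.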
Growing the database for molecular identification should be a priority for fungal taxonomists (Kõljalg et al., 2005). We were interested in knowing what proportion of new species descriptions are coupled with deposition of reference sequences, so we created a Perl script to query the GenBank taxonomy for new species names reported in the Index of Fungi. Overall, 23% of the species recorded from 2000 to 2009 are represented by sequences in GenBank (Table 1). Twenty-six per cent of the names from 2005 to 2009 have sequences (1260 of 4850 in Table 1), compared with 20% from 2000 to 2004 (1332 of 6514). This suggests a modest increase in the rate of sequencing associated with species description over the last 10 yr. On the other hand, there is a growing backlog of relatively recent type material that has not been sequenced, now comprising 8772 species described since 2000.
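The Perl script itself is not reproduced here. As a rough illustration of the kind of check described, the Python sketch below matches a list of new species names against scientific names read from a file in the format of the NCBI taxonomy dump `names.dmp` (fields separated by tab, pipe, tab). This is an assumption about one possible offline approach, not the authors' method; the function names and all species names in the sample data are hypothetical placeholders.

```python
import io


def load_scientific_names(handle):
    """Collect scientific names from an NCBI names.dmp-style file.

    Each record has the fields tax_id, name, unique name and name class,
    separated by "\t|\t" and terminated by "\t|".
    """
    names = set()
    for line in handle:
        fields = line.rstrip("\t|\n").split("\t|\t")
        if len(fields) >= 4 and fields[3] == "scientific name":
            names.add(fields[1])
    return names


def sequenced_fraction(new_names, genbank_names):
    """Return the new names found in the taxonomy and their share of all new names."""
    found = [name for name in new_names if name in genbank_names]
    return found, len(found) / len(new_names)


# Hypothetical sample data; a real run would open a downloaded names.dmp instead.
SAMPLE_DUMP = (
    "1001\t|\tExamplea alba\t|\t\t|\tscientific name\t|\n"
    "1001\t|\tExamplea albus\t|\t\t|\tsynonym\t|\n"
    "1002\t|\tExamplea nigra\t|\t\t|\tscientific name\t|\n"
)
new_names = ["Examplea alba", "Examplea rubra", "Examplea nigra", "Fictomyces minor"]

genbank_names = load_scientific_names(io.StringIO(SAMPLE_DUMP))
found, fraction = sequenced_fraction(new_names, genbank_names)
print(f"{len(found)} of {len(new_names)} new names found ({fraction:.0%})")
```

Exact string matching of this kind would miss names held in the taxonomy only as synonyms or spelling variants, so a real survey would need to decide how to treat them.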
We conclude that molecular ecological studies, especially those using pyrosequencing, are now – or will soon be – detecting significantly more undescribed species of fungi than traditional taxonomic studies. Furthermore, molecular species discovery has the potential to accelerate dramatically, while taxonomy seems to have reached a plateau (Table 1). The disconnect between species description and deposition of sequences is particularly troubling. Even as the total number of named sequences in GenBank increases, the gap between the number of described species and reference sequences is widening. To narrow the gap, it will be necessary to intensify efforts to obtain sequences from reference collections (herbaria) and culture collections (Brock et al., 2008) and to urge authors of new names to deposit reference sequences. Journals and funding agencies could help in this regard, by insisting that new species descriptions be accompanied by sequence data whenever possible.
Until now, fungal molecular ecologists have operated largely as consumers of resources generated by taxonomists, specifically databases of named sequences, classifications and voucher materials. However, the taxonomic community is clearly challenged to provide adequate resources for molecular identification, and appears to be falling behind ecologists in the discovery of new taxa. Given this situation, it may be appropriate to consider inverting the traditional relationship between taxonomy and ecology, and to ask whether taxonomists, in their quest to document the global diversity of fungi, should not become consumers of the products of ecological studies. In short, it may be time to take a serious look at formal species description based on environmental sequences.
To an extent, this work has already begun; molecular ecologists routinely identify OTUs using sequences, but these entities are not being formally classified as species and therefore they do not enter Index Fungorum or other downstream taxonomic databases, such as MycoBank, GBIF, Species2000 and the Encyclopedia of Life. This is unfortunate because the concept of species, controversial though it may be, is deeply ingrained in the ways that humans perceive biodiversity, and species-based classifications impact such disparate fields as pathology (diagnosis), conservation biology (biodiversity hot spots) and comparative biology (key innovations). Tremendous discoveries are being made through molecular ecology, but the failure to classify molecular OTUs as species limits our ability to bring these discoveries to bear on disciplines outside fungal ecology.
Adoption of sequence-based species description would represent a major shift for fungal taxonomy. For some, the loss of morphological descriptions could be disconcerting. Techniques such as fluorescent in situ hybridization (FISH) could be used to visualize taxa in the environment, and phylogeny-based character optimizations could be used to predict diverse properties of unseen organisms, but many taxa might never be directly observed. Another potential problem with sequence-based species description is that the duality of sequence-based taxonomy and specimen-based taxonomy could create synonyms, but this is nothing new for mycologists, who are accustomed to dealing with parallel naming of anamorphs and teleomorphs.
We are not advocating adoption of a uniform global standard for species delimitation based on a particular gene or sequence similarity score. Indeed, the requirements for molecular species description will probably vary from group to group (Seifert et al., 2007) and should be based on the taxonomic judgement of experts. All we suggest is that the taxonomic community, together with ecologists and bioinformaticians, engage in a conversation about the possibility for sequence-based taxon description. If it is determined that this is desirable, then it will be necessary to propose language to modernize the International Code of Botanical Nomenclature (McNeill et al., 2006), which sets the rules by which fungi are named.