The future promises ever more automation in biology, enabling researchers to collect tremendous amounts of data in hours and days rather than months or years. But where should this great power be directed and to what should it be applied? One possibility is the pursuit of a simple question —‘What is this organism?’ — a question asked daily not only by every specialist taxonomist but also by most working biologists and a significant proportion of the general public. At the same time, there is little doubt that human activity is altering the evolution of all plant and animal forms on the planet and driving many to extinction. The preservation of biodiversity has been a popular conservation goal, yet despite efforts undertaken by the Parties to the Convention on Biological Diversity and others, biodiversity loss has largely continued, and often we have little idea of which species have gone and which remain. Regrettably, few people have the knowledge of the depth or extent of biodiversity, even in local areas, to address this issue.
In recognition of this gap in our knowledge, the Canadian Barcode of Life Network (and the large international consortium of which it is a part) aims to develop an accurate, rapid, cost-effective and universally accessible DNA-based system for species identification (Secretariat of the Convention on Biological Diversity 2005). For this goal to be accomplished, it is necessary that the DNA fragment being used is highly standardized and is the same for all organisms (or as many as possible). Without this standardization, any hope of species identification is lost: determining the sequence of the ADH gene from a group of unknown flies will not help to identify them unless this sequence is already known for all of the potential species. The sequence that has been identified by the Consortium for the Barcode of Life to act as the standard sequence for animal life is approximately 650 bp from the 5′-end of the mitochondrial gene for cytochrome c oxidase subunit I. It is known that this gene fragment will not be universal for all species, yet the goal is standardization with as few exceptions as is possible.
Despite some popular misconceptions, the goal of DNA barcoding is neither to determine the tree of life nor to carry out phylogenetic studies. The goal of DNA barcoding is also not molecular taxonomy, as it is not intended to replace classical taxonomy. Its purpose is to carry out species identifications so that even non-experts can determine what species might be at hand, and to do so in a rapid and inexpensive manner. This does not mean that barcodes lack phylogenetic information, or that the sequences do not contribute to taxonomic knowledge. Barcodes can provide evidence for cryptic species, and contribute to knowledge of phylogeny and biogeography. Each of these, however, requires corroboration from additional sources of information for robust support of the hypotheses generated by barcoding. For example, no one would attempt to reconstruct the phylogenetic history of the Diptera from 600 bp of mitochondrial sequence.
Central to the DNA barcoding enterprise is a database of previously identified reference specimens and their corresponding COI sequences. This requires taxonomists to apply their knowledge and to provide identifications of specimens that can then be barcoded. They must provide their intimate knowledge of the species ranges and morphologies to direct sampling strategies that would cover the greatest likely range of genetic variation. It is then these couplets of information that can be used to identify an unknown specimen. The Barcode of Life Database (BOLD; Ratnasingham & Hebert 2007) is a curated and searchable database that is increasingly able to give researchers and the general public this power.
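The idea of matching an unknown specimen against couplets of identified species and reference sequences can be illustrated with a minimal sketch. It assumes the sequences are already aligned to the same length, uses a simple proportion-of-differences distance, and applies an illustrative 2% cut-off. The function names, the toy sequences and the cut-off are hypothetical choices made for this example; they do not describe how BOLD itself performs identifications.

```python
from typing import Dict, Optional, Tuple


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must be non-empty")
    diffs = sum(1 for x, y in zip(a, b) if x != y)
    return diffs / len(a)


def identify(
    query: str,
    reference: Dict[str, str],
    max_distance: float = 0.02,  # illustrative cut-off, an assumption of this sketch
) -> Tuple[Optional[str], float]:
    """Return the closest reference species and its distance to the query.

    The species is None when even the closest reference sequence is
    further away than max_distance, i.e. no confident match exists.
    """
    best_species: Optional[str] = None
    best_dist = float("inf")
    for species, seq in reference.items():
        d = p_distance(query, seq)
        if d < best_dist:
            best_species, best_dist = species, d
    if best_dist > max_distance:
        return None, best_dist
    return best_species, best_dist


# Toy reference library: hypothetical species names and made-up 10-bp sequences.
reference_library = {
    "species_A": "ATGCATGCAT",
    "species_B": "ATGGTTGCAA",
}

match, distance = identify("ATGCATGCAT", reference_library)
print(match, distance)
```

A real reference library would hold several sequences per species, sampled across its range, so that the within-species variation described above is represented; a single fixed cut-off is a simplification.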
To maximize its potential, this database needs to be filled with couplets of known sequences and known species. Led by a national research network in Canada, work is under way to accomplish this task. The major species and the bulk of biodiversity can thus be recorded and hopefully preserved. This supplement to Molecular Ecology Resources is devoted to showcasing this work and providing a forum for the discussion of issues dealing with barcoding. The supplement grew out of the second Canadian Barcode of Life Network Scientific Symposium devoted to DNA barcoding, held at the Royal Ontario Museum (Toronto, Canada, 28–29 April 2008). Its goal is to record the successes as well as the problems. Some of the papers are deliberately provocative to ignite debate while others are reports of exciting preliminary data.
The first section of this volume begins with an exposé of what can be done with barcoding and with some goals for the future large-scale application of the technology. In our keynote article, Janzen et al. (2009) illustrate how an entire region's biodiversity can be examined using barcoding. Borisenko et al. (2009) and Ivanova et al. (2009) outline the practice of barcoding in a natural history collection context. The differences between barcoding and identification by traditional morphological methods are discussed by Packer et al. (2009), who propose that barcoding is, in some cases, less biased and more accurate than morphology. On a more applied note, Jakupciak & Colwell (2009) discuss other potential biological agent detection technologies. The use of oligonucleotide arrays led Zahariev et al. (2009) to examine methods to quickly find the most useful oligos for species identification.
While the COI gene has proven remarkably successful in arthropod and vertebrate species, its application to other groups of organisms is more uneven. Of particular interest is how well the marker might be able to identify lower eukaryotes. Two groups of organisms are examined here, diatoms by Moniz & Kaczmarska (2009) and trematodes by Moszczynska et al. (2009), both with only limited success using ‘universal’ primers. Additional obstacles need to be overcome before barcoding is easily accessible for these groups.
Seifert (2009) reviews efforts to create a barcoding system in fungi. In this group, the COI gene sequence is often interrupted by introns that make sequencing difficult. Not only are introns a problem in several fungi, but multiple copies of the gene can further complicate identification (Gilmore et al. 2009). As a result, ITS has been suggested as a primary marker for use with fungi. Similarly, Vialle et al. (2009) conclude that while some mitochondrial genes do have suitable properties, the existing use of ITS makes it the preferable choice for this group of organisms. Nevertheless, at a finer level of identification, down to the clade level, COI oligos arrayed on a membrane are a suitable identification tool for Penicillium according to the study reported by Chen et al. (2009).
As with some fungi, the COI gene of plants is known to evolve too slowly to be useful for species diagnosis. Indeed, the search for suitable replacement markers for the plant kingdom has proven sufficiently difficult to prompt Fazekas et al. (2009) to claim that plant species may be intrinsically harder to discriminate than animal species. The combination of the slow evolution of plant mitochondria and introgression of genes between distantly related species makes plants a challenging group. Illustrating these problems is the article by Starr et al. (2009) showing that combinations of potential barcodes fail to discriminate more than 50% of the sedge species they sampled. However, the barcoding approach still has much to offer: even COI works well in the red algae in data reported by Saunders (2009), and proposed barcoding sequences from the chloroplast can discriminate species within the genus Acacia (Newmaster & Subramanyam 2009). Furthermore, despite a lack of traditional scientific evidence for a distinct species, a culturally known distinction between grass species in India can be confirmed by DNA barcoding (Ragupathy et al. 2009).
Barcoding has been demonstrated to work well with many arthropods. These need not be restricted to insects, as indicated in the article by Radulovici et al. (2009) where its efficacy is demonstrated with marine crustaceans. Its utility within the Insecta is also demonstrated by Foottit et al. (2009) for hemipteran species, by Sheffield et al. (2009) for hymenopteran species, by Rivera & Currie (2009) for dipteran species and by Emery et al. (2009) for lepidopteran species. Together, these papers show that barcoding within the Insecta will continue to be a major tool for biodiversity detection. The article by Smith et al. (2009) takes a novel direction, using DNA barcoding together with accumulation curves to explore not only the diversity sampled but also the amount of biodiversity that remains to be discovered or described within a region.
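For readers unfamiliar with accumulation curves, the following minimal sketch shows the general idea: as specimens are added one at a time, the number of distinct taxa seen so far is recorded, and a curve that is still rising steeply suggests that much diversity remains unsampled. The function names and the toy labels are hypothetical, and averaging over random orderings is one common convention assumed here; this is not a reconstruction of the analysis in Smith et al. (2009).

```python
import random
from typing import List, Sequence


def accumulation_curve(samples: Sequence[str]) -> List[int]:
    """Cumulative count of distinct labels after each successive specimen."""
    seen = set()
    curve = []
    for label in samples:
        seen.add(label)
        curve.append(len(seen))
    return curve


def mean_accumulation_curve(
    samples: Sequence[str], n_permutations: int = 100, seed: int = 0
) -> List[float]:
    """Average the accumulation curve over random orderings of the specimens.

    Averaging removes the dependence on the arbitrary order in which
    specimens happened to be processed.
    """
    if n_permutations < 1:
        raise ValueError("n_permutations must be at least 1")
    rng = random.Random(seed)
    order = list(samples)
    totals = [0] * len(order)
    for _ in range(n_permutations):
        rng.shuffle(order)
        for i, count in enumerate(accumulation_curve(order)):
            totals[i] += count
    return [t / n_permutations for t in totals]


# Toy data: each entry is the (hypothetical) barcode cluster assigned to one specimen.
specimens = ["a", "b", "a", "c", "a", "b", "d", "a"]
print(accumulation_curve(specimens))
print(mean_accumulation_curve(specimens, n_permutations=200))
```

If the averaged curve flattens towards the end, most taxa present have probably been encountered; if it is still climbing, further sampling is likely to reveal more.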
There are several projects to make a complete census of a group of organisms and to determine the barcodes (and a great deal of other information) from this complete census. We encourage readers to organize their own group-specific or region-specific project. One of the groups for which a complete census is underway is the fish of the world (http://www.fishbol.org; Ward et al. 2009). As with other recent publications, Zemlak et al. (2009) report that barcodes have already revealed several overlooked marine fish species. Wong et al. (2009) show that the methodology is efficient even within the more slowly evolving sharks, which are cartilaginous rather than bony fish. Another very successful group project is the ‘All Birds Barcode Initiative’, and Baker et al. (2009) make use of well-known avian species to counter barcoding criticisms based on a rigid application of threshold distances.
This special volume illustrates some of the exciting work being conducted within the realm of DNA barcoding. From its conceptual birth in Canada, to its growing international footprint, barcoding furthers the democratization of biodiversity information in a significant way, yet an inventory of life on this planet will perhaps be its biggest legacy. This is evidenced by the broad-based support garnered for the International Barcode of Life (iBOL) project, a collaborative program which seeks to expand the Network's research on DNA barcoding into a global biodiversity science initiative — the largest of its kind.