N.G. is interested in various aspects of molecular evolution, and especially the relationship between species biology and genome evolutionary trends in animals. B.N. studies the evolutionary genomics of vertebrates, with a preference for birds and interests in phylogeny, molecular dating, polymorphism, influence of population size, and of mutation rate. S.G. combines theoretical and empirical population genomic approaches to analyse the influence of mating systems and other life-history traits on genome evolution, with some focus on plants. G.H. is an evolutionary geneticist interested in genomic conflicts, and especially the interaction between maternally-transmitted symbiotic microbes and their arthropod hosts.
Over the last three decades, mitochondrial DNA has been the most popular marker of molecular diversity, for a combination of technical ease-of-use considerations, and supposed biological and evolutionary properties of clonality, near-neutrality and clock-like nature of its substitution rate. Reviewing recent literature on the subject, we argue that mitochondrial DNA is not always clonal, far from neutrally evolving and certainly not clock-like, questioning its relevance as a witness of recent species and population history. We critically evaluate the usage of mitochondrial DNA for species delineation and identification. Finally, we note the great potential of accumulating mtDNA data for evolutionary and functional analysis of the mitochondrial genome.
Mitochondrial DNA (mtDNA) represents only a tiny fraction of organismal genome size, yet it has been by far the most popular marker of molecular diversity in animals over the last three decades. Following Avise et al. (1987) and Moritz et al. (1987), among others, population geneticists and molecular systematicists have adopted this tool with little reserve. Virtually every molecular study of animal species in the field involves mtDNA haplotyping at some stage. Not surprisingly, a mitochondrial fragment, COX1, was recently elected as the standardized tool for molecular taxonomy and identification (Ratnasingham & Hebert 2007).
The reasons for the adoption of mtDNA as marker of choice are well-known. Experimentally, mtDNA is relatively easy to amplify because it appears in multiple copies in the cell. Mitochondrial gene content is strongly conserved across animals, with very few duplications, no intron, and very short intergenic regions (Gissi et al. 2008). Mitochondrial DNA is highly variable in natural populations because of its elevated mutation rate, which can generate some signal about population history over short time frames. Variable regions (e.g. the control region) are typically flanked by highly conserved ones (e.g. ribosomal DNA), in which PCR primers can be designed. The only technical issues associated with the marker arise from illegitimate amplification of mitochondrial genes that have inserted into the nuclear genome (numts) in some species (Bensasson et al. 2001). Clearly, mtDNA is the most convenient and cheapest solution when a new species has to be genetically explored in the wild.
These practical issues presumably explain to a large extent the popularity of mtDNA in molecular ecology. The reasons most often invoked to justify this choice, however, are more fundamental (Ballard & Whitlock 2004; Ballard & Rand 2005). Mitochondrial DNA has a number of specific biological properties, which should make it an appropriate marker of molecular biodiversity. First, its inheritance is clonal (maternal), which means that the whole genome behaves as a single, nonrecombining locus – all sites share a common genealogy. This considerably simplifies the representation and analysis of within-species variation data. Secondly, mtDNA has been supposed to evolve in a nearly neutral fashion. Being involved in basic metabolic functions (respiration), mitochondrial-encoded genes have been considered as less likely than other genes to be involved in adaptive processes. Finally, and not independently, the evolutionary rate of mtDNA has been frequently assumed to be clock-like – in the absence of any mutations spreading through positive selection, only neutral (and slightly deleterious) mutations accumulate in time, so that mtDNA divergence levels should roughly reflect divergence times. Clonal, neutral and clock-like: mtDNA apparently stands as the ideal witness of population and species history.
Because of its popularity, mtDNA polymorphism and divergence data sets have grown at an impressive rate during the last 20 years; we now have the material to assess its properties a posteriori. Here we review a number of recent studies which questioned the basic assumptions about mtDNA evolutionary dynamics, namely clonality, near-neutrality and evolutionary rate constancy. We discuss the implications of these findings on the usage of mtDNA as a population and phylogenetic marker and its use in systematics.
It is widely considered that mtDNA is maternally transmitted, and therefore clonal, in animals (Birky 2001). Paternal mtDNA is eliminated before (as in crayfish, Moses 1961), during (as in Ascidia, Ursprung & Schabtach 1965) or after (as in mouse, Sutovsky et al. 1999) fertilization (see Xu et al. 2005 for review, and White et al. 2008 for exceptions). The transmission of mtDNA in the female germ line, furthermore, is characterized by a strong bottleneck, which reduces the within-individual diversity (Shoubridge & Wai 2007). Because of these two effects (absence of paternal leakage and germ-line bottleneck), distinct mtDNA lineages essentially never co-occur within a zygote, a condition which prevents effective recombination. The lack of genetic exchange has been considered as a useful feature, as it implies that the within-species history of mtDNA can be appropriately represented by a unique tree, which traces back the origins and geographic movements of maternal lineages (Avise et al. 1987). The whole field of phylogeography, therefore, heavily relies on the assumption of clonal mtDNA inheritance.
Firmly enough established from classical genetics, the clonality assumption was little questioned until 1999, when three quasi-simultaneous articles suddenly challenged the dogma in humans. Analysing 29 complete human mitochondrial haplotypes, Eyre-Walker et al. (1999) found an unexpectedly high amount of within-species homoplasy (i.e. phylogenetic conflict between sites), which could apparently not be explained by mutation hotspots – homoplasic sites were not particularly variable across hominid species. They concluded that these many homoplasies were most probably the consequence of recombination events. Using the same data, Awadalla et al. (1999) reported a negative correlation between linkage disequilibrium and physical distance for pairs of sites, again supporting the idea that recombination breaks down allelic associations between distant loci. Hagelberg et al. (1999), finally, discovered a mitochondrial point mutation shared by distantly related mtDNA haplotypes in a small Melanesian island, but virtually nowhere else in the world. Arguing that two independent occurrences of this rare mutation in the same island were unlikely, the authors concluded that recombination was the most plausible explanation.
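The linkage disequilibrium/physical distance approach can be illustrated with a minimal sketch. It is not the procedure of Awadalla et al. (1999): it assumes biallelic sites coded 0/1, uses r² as the measure of linkage disequilibrium, measures distance along a circular genome, and omits any significance test. The function names, the toy haplotypes and the genome length are hypothetical.

```python
from itertools import combinations
from math import sqrt


def r_squared(site_a, site_b):
    """Squared allelic correlation between two polymorphic biallelic sites (0/1 coded)."""
    n = len(site_a)
    pa = sum(site_a) / n
    pb = sum(site_b) / n
    pab = sum(1 for a, b in zip(site_a, site_b) if a == 1 and b == 1) / n
    d = pab - pa * pb  # linkage disequilibrium coefficient D
    return d * d / (pa * (1 - pa) * pb * (1 - pb))


def pearson(xs, ys):
    """Pearson correlation coefficient (both inputs must vary)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def ld_vs_distance(haplotypes, positions, genome_length):
    """Correlation between r^2 and physical distance over all pairs of sites."""
    sites = list(zip(*haplotypes))  # one tuple of alleles per site
    dists, r2s = [], []
    for i, j in combinations(range(len(sites)), 2):
        gap = abs(positions[i] - positions[j])
        dists.append(min(gap, genome_length - gap))  # shortest way round the circle
        r2s.append(r_squared(sites[i], sites[j]))
    return pearson(dists, r2s)


# Hypothetical data: four haplotypes, three polymorphic sites.
haplotypes = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]
positions = [100, 200, 8000]
corr = ld_vs_distance(haplotypes, positions, genome_length=16000)
```

In this toy data set the two neighbouring sites are in complete association and the distant site is independent of both, so `corr` is negative. Under strict clonality no relationship between r² and distance is expected; a real analysis would also need a permutation test to assess whether a negative correlation is significant.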
These exciting reports in humans stimulated the search for instances of mtDNA recombination in various animal species. Systematic surveys of within-species mtDNA data available from public databases revealed significant departure from the clonality assumption in several species, including primates (Piganeau et al. 2004; Tsaousis et al. 2005). The authors, however, carefully noted that part of these bioinformatically detected instances could correspond to artefacts – e.g. in vitro recombination – and called for experimental corroboration (Piganeau et al. 2004). Population evidence for mtDNA recombination was specifically reported in a mussel (Ladoukakis & Zouros 2001), a butterfly (Andolfatto et al. 2003), scorpions (Gantenbein et al. 2005), a lizard (Ujvari et al. 2007) and fish (Hoarau et al. 2002; Ciborowski et al. 2007), based either on linkage disequilibrium/physical distance analysis, or on the discovery of obvious recombinants between sufficiently distant parental haplotypes. The mussel case was the least surprising because this species, like several other bivalves (Breton et al. 2007), hosts a paternally transmitted mitochondrial genome, so that all male individuals are heteroplasmic. An instance of paternal leakage followed by recombination, finally, was discovered in a human patient (Kraytsberg et al. 2004).
In flowering plants, while mitochondria, like chloroplasts, are thought to be primarily maternally inherited (Reboud & Zeyl 1994), the occurrence of recombination is also debated. In Silene vulgaris, recent recombination events within and among genes, and between mitochondria and chloroplasts, have been detected (Houliston & Olson 2006), but not recovered in a subsequent analysis (Barr et al. 2007). On a larger evolutionary scale, phylogenetic conflicts involving mtDNA support the hypothesis of recurrent horizontal transfer of mitochondrial genes during the history of angiosperms (Bergthorsson et al. 2003).
Although some would probably deserve confirmation, these case studies are of great relevance to the field of molecular ecology. They prove that mitochondrial recombination is possible, and call for caution when building and interpreting within-species mtDNA genealogies (Hey 2000). Ironically enough, we now know that the three human studies which motivated this fruitful search for mtDNA recombination were actually questionable. The island-specific pattern (Hagelberg et al. 1999) resulted from an alignment error (Hagelberg et al. 2000). The decay of linkage disequilibrium with physical distance (Awadalla et al. 1999) was found to be sensitive to methodological settings and to the data set analysed (Ingman et al. 2000; Innan & Nordborg 2002; Piganeau & Eyre-Walker 2004). The excess of within-species homoplasy (Eyre-Walker et al. 1999), finally, was most probably created by short-lived, hard-to-detect mutation hotspots (Galtier et al. 2006).
Beyond the anecdote, one lesson to be drawn from this interesting debate is that peculiarities of the mitochondrial mutation process can generate recombination-like patterns of sequence variation. For this reason, and because of the lack of power of recombination detection methods (White & Gemmell 2009), the prevalence of mitochondrial recombination across animals is currently difficult to assess. Our reading of the current literature suggests that a large fraction of the within-species mtDNA homoplasy is caused by mutation hotspots (Innan & Nordborg 2002; Galtier et al. 2006; see below). At any rate, the existence of strong within-species homoplasy is an obvious practical problem when analysing mtDNA population data, whether it is caused by true recombination or mutation-induced convergences.
Effective neutrality is another major assumption underlying most analyses of mtDNA population data, especially when mtDNA diversity patterns are interpreted in terms of demography, migrations and founding events. Everybody knows that mtDNA is not functionally neutral, as it hosts very important genes. It is assumed, however, that the mitochondrial genome essentially undergoes neutral or deleterious mutations – adaptive mutations, which would spread through positive selection, are held to be very rare. Deleterious changes being rapidly removed by purifying selection without affecting much the diversity at linked sites, the observable variation would therefore reflect neutral processes only, in agreement with the neutralist theory of molecular evolution (Kimura 1983).
Early analyses of the synonymous/nonsynonymous ratio appeared consistent with the prevalence of purifying selection in mtDNA: McDonald–Kreitman tests (McDonald & Kreitman 1991) uncovered a general excess of nonsynonymous polymorphisms in 31 mtDNA data sets, as revealed by a higher-than-one average neutrality index (NI, Weinreich & Rand 2000). A number of exceptions, however, have been reported. Episodes of positive selection during mitochondrial protein-coding gene evolution were detected in several groups of animals including primates (Grossman et al. 2004), agamid lizards and snakes (Castoe et al. 2008, 2009).
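The neutrality index derived from a McDonald–Kreitman table can be computed as in the sketch below. It uses the standard definition NI = (Pn/Ps)/(Dn/Ds), where P and D are counts of polymorphic and fixed differences and the subscripts n and s denote nonsynonymous and synonymous changes. The function name and the counts are hypothetical and do not come from any of the cited data sets.

```python
def neutrality_index(pn, ps, dn, ds):
    """Neutrality index from the four cells of a McDonald-Kreitman table.

    pn, ps: nonsynonymous and synonymous polymorphisms (within species)
    dn, ds: nonsynonymous and synonymous fixed differences (between species)
    """
    if ps == 0 or dn == 0 or ds == 0:
        raise ValueError("Ps, Dn and Ds must be non-zero for NI to be defined")
    return (pn / ps) / (dn / ds)


# Hypothetical counts for a single mitochondrial data set.
ni = neutrality_index(pn=12, ps=30, dn=8, ds=60)
```

With these counts NI is 3, i.e. higher than one: nonsynonymous changes are over-represented among polymorphisms relative to fixed differences, the pattern expected when slightly deleterious mutations segregate but rarely fix. An NI below one would instead indicate an excess of nonsynonymous divergence, as expected under adaptive evolution.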
The neutrality assumption was recently challenged by a meta-analysis of >1600 animal species (Bazin et al. 2006). This study revealed that the average within-species level of mtDNA diversity is remarkably similar across animal phyla, including vertebrates and invertebrates. Nuclear loci, in contrast, show a higher average diversity in invertebrates than in vertebrates, in marine than in terrestrial species, in small than in large organisms, consistent with the theoretical relationship between population size and genetic diversity. Bazin et al. (2006) argued that this mitochondrial-specific lack of population size effect could only be explained by recurrent selective sweeps (adaptive evolution) in large populations. Recurrent selective sweeps are in agreement with Gillespie’s (2000) ‘genetic draft’ model, and with increased sensitivity to genetic hitch-hiking, associated with the largely nonrecombining nature of mtDNA. According to this hypothesis, recurrent selective sweeps would affect mtDNA evolution in species with large population sizes, leading to frequent drops in diversity at the whole genome level. Consistently, Bazin et al. (2006) reported a lower average NI in invertebrates’ than in vertebrates’ mtDNA, again suggesting that adaptive evolution might significantly impact mtDNA variation patterns.
Three major selective mechanisms could account for such adaptive-like patterns (Rand 2001). The most obvious one invokes selection at the host level. Mitochondrial variants might be advantageous, e.g. by inducing more efficient/flexible energetic metabolism (Ballard & Rand 2005; Dowling et al. 2008). Fitness differences between naturally occurring mitochondrial haplotypes have been reported in Drosophila, mostly in connection with maximal lifespan (Ballard et al. 2007), and in copepods, in which mitochondrial introgression between populations leads to reduced fitness (Ellison & Burton 2008). Adaptive introgression of mtDNA in response to particular environmental pressures has been suggested in charrs (Doiron et al. 2002), wild goats (Ropiquet & Hassanin 2006) and hares (Alves et al. 2008). In humans, finally, several recent articles have interpreted spatial and altitudinal mitochondrial variations in terms of local adaptation (e.g. Ruiz-Pesini et al. 2004), although this point of view remains controversial (Ingman & Gyllensten 2007).
The second potential source of positive selection in mitochondrial evolution is ‘selfish’ mitochondrial mutations, i.e. mutations that would favour the transmission of mtDNA to the next generation irrespective of the fitness of the host. For example, in Saccharomyces cerevisiae, the ‘petite’ phenotype (small colonies) results from the local fixation of selfish mitochondrial variants showing respiratory deficiency but high replication rate (MacAlpine et al. 2001). Uniparental inheritance is thought to have evolved to avoid such genetic conflict between the mitochondrial and nuclear genomes (Hurst & Hamilton 1992). Uniparental inheritance, however, creates a new kind of conflict, between females (which transmit mtDNA) and males (which do not). This is typically the case in gynodioecious plants, such as Thymus, Plantago and Silene (Budar & Pelletier 2001). Specific mitochondrial haplotypes, which include novel chimeric genes resulting from gene duplications and rearrangements, cause male-sterility in these species (Saumitou-Laprade et al. 1994). Individuals carrying cytoplasmic male-sterility (CMS) mutations allocate the entire reproductive resource to the female germ line, maximizing the transmission of mitochondria. The arms race between CMS mitotypes and nuclear restorers (Rf) can lead either to epidemic selective episodes (Ingvarsson & Taylor 2002), or to long-term balancing selection maintaining a CMS/Rf cyclical polymorphism (Gouyon et al. 1991; Couvet et al. 1998). CMS has been investigated in detail at the molecular level in the genus Silene, where the observed departure from neutral evolution is in agreement with long-term balancing selection (Houliston & Olson 2006). CMS has not been documented in animals. However, it should be noted that maternal inheritance can result in sexually antagonistic co-evolution in dioecious species as well (Zeh & Zeh 2005). Male-deleterious, female-neutral mutations (e.g. mutations affecting sperm competitive ability) can be maintained at high enough frequency (Frank & Hurst 1996), paving the way for the selection of nuclear mutations increasing male fitness.
The third potential source of mitochondrial selective sweeps is through hitch-hiking. Even under the condition that mtDNA variants are themselves neutral, mitochondria occur within the cell alongside other genetic elements that are also inherited maternally. Any selection upon these elements is felt across the maternally inherited genetic baggage, including the mitochondrion. The maternally inherited genome can comprise symbiotic bacteria that pass from a female to her progeny, and (in species that are female heterogametic) the W (female determining) chromosome.
Maternally inherited symbiotic microbes are common in arthropods. For Wolbachia, which is probably the most common inherited symbiont, the rarity of co-speciation of Wolbachia and host (Shoemaker et al. 2002) combined with the commonness of Wolbachia overall (possibly up to 65% of species: Hilgenboecker et al. 2008) indicates that infections spread and are lost from host species on a regular basis. This general inference is reflected in contemporary case studies where waves of spread of Wolbachia across a species range have been observed (Turelli & Hoffmann 1991; Hoshizaki & Shimada 1995).
Spread of these agents, presuming this is associated with maternal transmission alone (likely to be a fair assumption in most cases), will be a regular driver of selective sweeps on arthropod mitochondrial genomes. A Wolbachia strain will normally be associated initially with a single mtDNA haplotype, and as spread occurs, so then does the associated mtDNA type. This concept also has empirical support, both in terms of observed change in mtDNA haplotype frequency associated with spread of Wolbachia (Turelli et al. 1992), and in terms of the expected product of the sweep: very low mtDNA variability in species infected with Wolbachia (Jiggins 2003). The drive associated with the spread of maternally inherited agents can be very high, creating selection coefficients which are commonly in excess of 0.01 and may be of much greater magnitude than those considered typical of ‘adaptive evolution’. Indeed, a number of case studies indicate that the drive appears to be sufficient to move both infection and associated mtDNA haplotype across species boundaries following hybridization (Rousset & Solignac 1995; Ballard 2000; Jiggins 2003).
In addition to producing bouts of positive selection, inherited symbiont presence may also violate neutrality by producing balancing selection on different mtDNA variants, when different infections in a host species are under balancing selection themselves. When this occurs, the observation is of greater than expected intraspecific mtDNA diversity, and relatively deep nodes in mtDNA divergence within a species, with particular mtDNA haplotypes associated with particular infection variants (Schulenburg et al. 2002; Shoemaker et al. 2003; Charlat et al. 2009).
The ‘common’ symbiotic microbes themselves are rather diverse (Duron et al. 2008) – with the known microbial diversity growing – and this means one cannot ‘test against’ the presence of the agents with any certainty – absence of evidence of known inherited bacteria in a particular arthropod species does not mean that unknown ones are not present. Furthermore, the rate of gain and loss of these elements on an evolutionary timescale is such that current infection state is not a good indicator of past state. Thus, one can never know that mtDNA evolution has not been influenced by symbionts past or present.
Less work has been conducted on the impact of the W chromosome. However, in birds, Hill-Robertson interference between mtDNA and W chromosome has been shown to affect mitochondrial evolution (Berlin et al. 2007) in reducing neutral diversity (but see Hickey 2008; Nabholz et al. 2009) and altering the efficacy of selection.
The occurrence of recurrent mitochondrial selective sweeps, if confirmed, has strong practical consequences. Under recurrent adaptive evolution, the within-species level of mtDNA diversity and the age of mtDNA common ancestors primarily reflect the time elapsed since the last selective sweep, not demographic processes. In invertebrates, and especially in Wolbachia-infected arthropods, mtDNA should not be a priori considered as a ‘neutral’ phylogeographic marker. The pattern is less clear in species with lower population sizes, e.g. vertebrates, in which genetic draft is theoretically less efficient (Gillespie 2000), and maternally inherited symbionts are undocumented. Nabholz et al. (2008a) examined within-species mtDNA diversity patterns across mammals. No direct evidence for adaptive evolution was detected from McDonald–Kreitman tests, in agreement with Weinreich & Rand (2000), suggesting that mammals could be immune from the genetic draft effect invoked by Bazin et al. (2006). Nabholz et al. (2008a), however, failed to recover any significant relationship between within-species diversity and species population size, approached through a battery of life history and ecological variables: large, endemic, endangered mammalian species did not appear less polymorphic than small, widespread, healthy ones. A similar result was obtained in birds (Nabholz et al. 2009). The lack of relationship was interpreted as reflecting a lack of demographic equilibrium, e.g. through recurrent bottlenecks (Nabholz et al. 2008a), or an inverse relationship between the per-generation mutation rate and population size (Piganeau & Eyre-Walker 2009).
Whatever the underlying causes of the patterns observed, these studies demonstrate that the within-species level of mtDNA diversity per se is not a good marker of population size and species health, as observed both at the metazoa and mammalian levels. Nonequilibrium processes apparently dominate. The classical interpretation of genetic diversity as the product of mutation rate by population size, as expected at mutation-drift equilibrium, is strongly questionable as far as mtDNA data are concerned.
Constant mutation rate?
Because of its relatively high substitution rate, mtDNA has been extensively used as a phylogenetic marker at recent time scales, both for tree building and molecular dating. Brown et al. (1979) first made use of fossil data to calibrate the mitochondrial molecular clock in primates. Although these authors carefully indicated that this calibration was not necessarily general to other lineages (Moritz et al. 1987), their famous ‘2% per million years’ was often considered as a reasonable reference in the absence of relevant fossil data, in mammals, and more generally in vertebrates. The notion that the mtDNA evolutionary rate varies little across taxa was supported by large-scale phylogenetic analyses in vertebrates (Martin 1995; Gissi et al. 2000; Bininda-Emonds 2007), which reported a limited (two- or three-fold) difference in substitution rate between fast-evolving and slow-evolving lineages.
These studies, however, made use of a small number of species and/or fossils, and compared highly divergent sequences, for which rate estimation can be strongly affected by multiple substitutions. Analysing >1500 cytochrome b sequences, a large number of fossil calibration dates, focusing on recent divergences only, and on the neutrally evolving third codon positions, Nabholz et al. (2008b, 2009) demonstrated that the mtDNA substitution rate varies 30-fold among birds and 100-fold among mammals – third codon positions are substituted every 100 million years, on average, in the slow-evolving whales, but every million years, on average, in fast-evolving gerbils. In arthropods, Xu et al. (2006) also reported significant variations in substitution rate across species, correlated with variations in the rate of gene-order rearrangement. Extreme variations of mtDNA substitution rate were also reported in the generally slow-evolving plants. Various species of the genera Plantago, Pelargonium, and Silene have recently undergone extraordinary (up to 1000-fold) accelerations of the mtDNA substitution rate, for as yet unexplained reasons (Cho et al. 2004; Mower et al. 2007). Being observed at third codon positions, these variations of substitution rate must reflect variations in mutation rate, not in the intensity of natural selection.
The molecular clock, therefore, is certainly not a tenable assumption as far as mtDNA is concerned. Nonclock-like evolution is common, and the departure from homogeneous rates can be very strong. In mammals, the mitochondrial mutation rate appears more variable across lineages than the nuclear one (Nabholz et al. 2008b). One should refrain from propagating to taxon Y the absolute rate estimated in taxon X, and adopt molecular dating methods which do not assume constant rate across lineages. Mutation rate heterogeneity, finally, should not be neglected when comparing mtDNA variation patterns across species. Nabholz et al. (2008a, 2009), for instance, found that the level of within-species diversity observable within mammalian and bird species primarily reflects variations of mutation rate.
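The cost of transferring a rate from one taxon to another can be made concrete with a strict-clock calculation. The sketch below assumes that the two per-lineage rates follow from the figures quoted above (one substitution per third codon position every 100 Myr in whales, every 1 Myr in gerbils), that divergence is small enough for multiple substitutions to be ignored, and that T = d/(2r); the divergence value and the function name are hypothetical.

```python
def divergence_time_myr(pairwise_divergence, rate_per_lineage_per_myr):
    """Strict-clock divergence time: d substitutions/site accumulated along two lineages."""
    return pairwise_divergence / (2.0 * rate_per_lineage_per_myr)


# Per-lineage rates (substitutions / third codon position / Myr), assumed from the text.
WHALE_RATE = 0.01   # one substitution per site every 100 Myr
GERBIL_RATE = 1.0   # one substitution per site every 1 Myr

d = 0.10  # hypothetical pairwise divergence at third codon positions

t_if_whale_rate = divergence_time_myr(d, WHALE_RATE)    # about 5 Myr
t_if_gerbil_rate = divergence_time_myr(d, GERBIL_RATE)  # about 0.05 Myr
```

The same observed divergence thus translates into dates that differ by a factor of 100 depending on which rate is borrowed, which is why dating methods that allow the rate to vary across lineages are preferable.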
Castresana (2001) analysed the taxonomy of large vs. small mammals, and concluded from measures of cytochrome b divergence that groups made of large mammals tend to be over-split, and those made of small mammals over-lumped. The extreme between-species variation in mutation rate, however, is likely to impact mtDNA divergence. The nine mammalian families stressed as over-lumped by Castresana (2001) indeed show a significantly faster average rate of evolution (0.065 ± 0.034 substitution/third codon position/Myr) than the 12 supposedly over-split families (0.017 ± 0.013, P-value = 0.003, t-test, data from Nabholz et al. 2008b). We made use of the divergence dates from Nabholz et al. (2008b) to examine the typical level of divergence (in time) of congeneric species in various mammalian orders: the ‘depth’ of a genus was calculated as the median of all the divergence dates between the species it includes. This measure appears homogeneous enough between clades: 80% of the genera show a median divergence date between 3 and 14.3 Myr (n = 93, see Fig. 1). Genus ‘depth’, moreover, is not related to species body mass. Cetartiodactyla and Rodentia, for example, have very similar within-genus divergence dates despite a huge difference in body mass (Fig. 1, median of ∼70 kg vs. 11 g). This result shows that, at least in mammals, the arbitrarily defined genera correspond to clades sharing relatively similar evolutionary ages. This re-examination of Castresana’s (2001) work illustrates the dangers of disregarding the erratic nature of the mtDNA molecular clock and of using divergence thresholds when assigning species status.
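The genus ‘depth’ statistic defined above is simple to compute. The following sketch assumes that the input is a list of pairwise divergence dates between congeneric species; the record layout, the genus and species labels and the dates are hypothetical placeholders, not values from Nabholz et al. (2008b).

```python
from collections import defaultdict
from statistics import median


def genus_depths(records):
    """Median of all pairwise divergence dates (Myr) between species of each genus.

    records: iterable of (genus, species_a, species_b, divergence_date_myr)
    """
    dates_by_genus = defaultdict(list)
    for genus, _species_a, _species_b, date_myr in records:
        dates_by_genus[genus].append(date_myr)
    return {genus: median(dates) for genus, dates in dates_by_genus.items()}


# Hypothetical pairwise dates for two genera.
records = [
    ("GenusA", "sp1", "sp2", 2.0),
    ("GenusA", "sp1", "sp3", 6.5),
    ("GenusA", "sp2", "sp3", 6.5),
    ("GenusB", "sp1", "sp2", 9.0),
]
depths = genus_depths(records)
```

Because the statistic is expressed in time rather than in sequence divergence, it can be compared across clades whose substitution rates differ widely.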
The causes of the strong variations of mutation rate across lineages are still unclear. Generation time probably explains a substantial portion of the variation – phylogenetic analyses measure the substitution rate per year, not per generation. Relating substitution rate variation to life history traits in mammals and birds, Nabholz et al. (2008b, 2009) and Welch et al. (2008) concluded that generation time alone could not explain the data. They found a significant effect of species longevity, and suggested that a low (somatic) mutation rate could be required to achieve long lifespan, in agreement with the mitochondrial theory of ageing (Galtier et al. 2009).
The proximal reasons for mutation rate variations are also largely unknown. Are mitochondrial mutations essentially replication-dependent (i.e. because of errors of the DNA polymerase), or replication-independent (i.e. because of mutagenic species produced as by-products of oxidative phosphorylation)? What are the influence and relative intensity of mutation hotspots? Clarifying these issues would be of great relevance to the interpretation of mtDNA sequence variations within/between species. A proper description of the mitochondrial mutation process could help resolve the paradox raised by Ho et al. (2005), who consistently reported a faster mtDNA molecular clock at recent vs. ancient time scales. These controversial results (Emerson 2007; Haag-Liautard et al. 2007; Ho et al. 2007) could perhaps be explained by the existence of undetected, short-lived mutation hotspots (Stoneking 2000; Galtier et al. 2006; Pulquério & Nichols 2007).
The worst marker?
This quick review of the evolutionary properties of mtDNA makes clear that it is not the ideal marker of molecular diversity it has been supposed to be. Mitochondrial DNA is immune from neither recombination, nor positive selection, nor erratic evolutionary rates. Being gene dense, sensitive to hitch-hiking, in linkage disequilibrium with selected selfish genetic elements, and prone to genomic conflicts, mtDNA has many reasons for badly representing population history. Being located in a metabolically active, highly oxidative environment, mtDNA undergoes a complex mutation process, highly variable in space and time. Most importantly, the prevalence of these many confounding effects is largely unknown, and obviously variable across taxa. For all these reasons, mtDNA is perhaps intrinsically the worst population genetic and phylogenetic molecular marker we can think of.
The use of mtDNA has, in fact, recently become resurgent, in particular in association with DNA barcoding for taxonomic identification and investigation of biodiversity. In DNA barcoding, a short stretch of DNA (‘barcode’) is commonly used to allocate an unidentified individual to a species. Conversely, high levels of variation between specimen barcodes from previously accepted species are held to indicate that cryptic species are probably present, and differentiation beyond a threshold between anonymous samples likewise indicates that they belong to different species. The most common stretch of DNA used in barcoding is the COX1 gene of mtDNA. It is thus pertinent to ask how the evolutionary genetic processes of mtDNA fit/disrupt attempts to barcode.
In favour of mtDNA barcoding is the existence of fairly common selective sweeps in mtDNA. Whilst selective sweeps are the enemy of ‘classic’ interpretations of genomic diversity and population history from mtDNA diversity, these sweeps make attempts to barcode species more robust – they reduce intraspecific variation in mtDNA, and make the ‘gap’ between inter- and intra-specific diversity more pronounced. However, various evolutionary processes (related earlier) may either homogenize distinct species for mtDNA through hybrid introgression (making inter-specific mtDNA diversity very low), or produce balancing selection (that maintains high intraspecific diversity). Such effects can seriously erode the distinction between intra- and inter-specific diversity, creating errors in mtDNA driven systematics, and leading to problems in using barcodes in specimen identification. Beyond processes that alter levels of intra/inter-specific variation, we have noted that the evolutionary rate of mtDNA is heterogeneous. The consequence of this is that universal distance-based ‘thresholds’ for delineation of taxa into species do not exist. Rates of evolution may vary even in quite recent clades, compromising the quest to produce ‘rules’ for species delineation based on mtDNA divergence.
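The ‘gap’ between inter- and intra-specific diversity can be illustrated with a toy calculation. The sketch below assumes aligned barcode fragments of equal length and uses the uncorrected proportion of differing sites (p-distance); the sequences, species labels and function names are hypothetical, and real barcoding analyses typically rely on corrected distances and far larger samples.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def barcode_gap(samples):
    """Return (largest intraspecific distance, smallest interspecific distance).

    samples: list of (species_label, aligned_sequence); at least one within-species
    pair and one between-species pair are required.
    """
    intra, inter = [], []
    for (sp_a, seq_a), (sp_b, seq_b) in combinations(samples, 2):
        (intra if sp_a == sp_b else inter).append(p_distance(seq_a, seq_b))
    return max(intra), min(inter)


# Hypothetical 10-bp fragments from two species.
samples = [
    ("species_A", "ACGTACGTAC"),
    ("species_A", "ACGTACGTAT"),
    ("species_B", "TCGAACGGAC"),
    ("species_B", "TCGAACGGAT"),
]
max_intra, min_inter = barcode_gap(samples)
has_gap = min_inter > max_intra
```

Here the largest within-species distance is 0.1 and the smallest between-species distance is 0.3, so a gap exists. Introgression would pull the between-species minimum down and balancing selection would push the within-species maximum up, and either process can close the gap.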
Despite the many concerns we express about the use of the mitochondrial marker in molecular ecology, we are confident that scientists will keep on analysing mtDNA variation in the wild. The reason why we think mtDNA will keep its popularity is its practical convenience. Targeting mtDNA is by far the cheapest way to get a first idea of the genetic structure of a yet uncharacterized species, and represents an easy method of gaining some insight into the species present. Moreover, although mtDNA is unfit for some of its stated purposes, the acquisition of new mtDNA data has serendipitously allowed insight into many aspects of mitochondrial genome evolution. The mitochondrion is involved in major cellular processes, including apoptosis and ageing (Lane 2005). It could be involved in various adaptive events, and has been found to be associated with a number of diseases in humans (Wallace 2008). While its relevance as a nearly neutral marker has been declining, there is a growing interest in functional, selected processes associated with mtDNA variations. Comparative mitogenomics is a promising research area, at the boundary between evolutionary biology and medicine, for which population data from a variety of species are more welcome than ever.
This work was supported by Agence Nationale de la Recherche project MitoSys, European Research Council project PopPhyl and the Natural Environment Research Council.