• Bastien Boussau,

    1. Department of Integrative Biology, University of California—Berkeley, 3060 Valley Life Sciences Building, Berkeley, California, 94720–3140
    2. Laboratoire de Biométrie et Biologie Evolutive, Université de Lyon, Université Lyon 1, CNRS, UMR5558, Villeurbanne, France
  • Jeremy M. Brown,

    1. Department of Integrative Biology, University of California—Berkeley, 3060 Valley Life Sciences Building, Berkeley, California, 94720–3140
  • Matthew K. Fujita

    1. Museum of Comparative Zoology, Harvard University, 26 Oxford St., Cambridge, Massachusetts, 02138


Genomes vary greatly in size and complexity, and identifying the evolutionary forces that have generated this variation remains a major goal in biology. A controversial proposal is that most changes in genome size are initially deleterious and therefore are linked to episodes of reduced effective population size. Support for this hypothesis comes from large-scale comparative analyses, but vanishes when phylogenetic nonindependence is taken into account. Another approach to testing this hypothesis involves analyzing sequence evolution among clades where duplications have recently fixed. Here we show that episodes of fixation of duplications in mitochondrial genomes of the gecko Heteronotia binoei (two independent clades) and of mantellid frogs (five distinct branches) coincide with reductions in the ability of selection to purge slightly deleterious mutations. Our results support the idea that genome complexity can arise through nonadaptive processes in tetrapods.

Lynch and Conery (2003) proposed that gene duplication events and other changes that alter genome sizes are initially deleterious even if they ultimately become adaptive. The initial survival of duplicate genes is thus dependent on small effective population size, in which selection is weakened. Support for this hypothesis originally came from the observation of a strong negative correlation between genome size and Neu, a product of effective population size (Ne) and mutation rate (u) that can be estimated from silent-site diversity. However, this strong negative correlation may have been confounded by the phylogenetic relatedness between species. Although the first phylogenetically corrected analysis supported Lynch and Conery's hypothesis (based on 33 fish species, Yi and Streelman 2005, results later disputed by Gregory and Witt 2008), the most recent studies found no significant correlation between genome size and Neu when accounting for phylogenetic structure in a dataset of 205 seed plants (Whitney et al. 2010) and in Lynch and Conery's (2003) original dataset (30 taxa, Whitney and Garland 2010). Such negative results may arise from the lack of a substantive linear relationship between genome size and Neu, from a lack of power in Lynch and Conery's original dataset (Whitney and Garland 2010), or from the inability of Brownian motion based models used in standard comparative methods to adequately capture the true process of genome size evolution.

An alternate avenue to test this hypothesis is to examine the ratio of nonsynonymous to synonymous substitutions (Dn/Ds) among recently diverged taxa that differ in genome size. According to Lynch and Conery's (2003) hypothesis, Dn/Ds should increase as genome size changes. In bacteria, Kuo, Moran and Ochman (2009) found that Dn/Ds increased as genome size decreased. It appears that, in prokaryotes, a strong deletion bias leads to the fixation by drift of a large excess of deletions compared to insertions. Therefore, as the effective population size decreases, Dn/Ds increases, but genome size decreases. Because this deletion bias is not as strong in eukaryotes as it is in prokaryotes (Kuo and Ochman 2009), it remains to be seen whether Lynch and Conery's (2003) hypothesis can explain increases in genome size among other groups of organisms.
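To make the distinction between synonymous and nonsynonymous changes concrete, the following minimal Python sketch classifies codon differences between two aligned coding sequences under the vertebrate mitochondrial genetic code (NCBI translation table 2). It is an illustration only: the function name and example sequences are hypothetical, the raw counts are not normalized by the number of synonymous and nonsynonymous sites, and no correction for multiple substitutions is applied, so the output is not a Dn/Ds estimate.

```python
from itertools import product

BASES = "TCAG"
# NCBI translation table 2 (vertebrate mitochondrial), with codons ordered
# TTT, TTC, TTA, TTG, TCT, ... (first base varies slowest).
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_differences(seq_a, seq_b):
    """Return (synonymous, nonsynonymous) counts of single-site codon differences.

    Hypothetical helper: codons with gaps or ambiguous bases, and codons
    differing at more than one position, are skipped.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, and of equal length")
    synonymous = nonsynonymous = 0
    for start in range(0, len(seq_a), 3):
        codon_a = seq_a[start:start + 3].upper()
        codon_b = seq_b[start:start + 3].upper()
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue  # gap or ambiguous base
        n_diff = sum(x != y for x, y in zip(codon_a, codon_b))
        if n_diff != 1:
            continue  # identical codons, or multiple changes (pathway unclear)
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Made-up sequences: ATG->ATA (Met->Met in table 2), CTA->CTG (Leu->Leu),
# GGA->GCA (Gly->Ala).
print(classify_codon_differences("ATGCTAGGA", "ATACTGGCA"))
```

When selection is less effective at purging slightly deleterious amino acid changes, the nonsynonymous class is expected to grow relative to the synonymous class, which is the signal a Dn/Ds comparison looks for.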

Tetrapod mitochondrial genomes can be used to test Lynch and Conery's (2003) hypothesis due to their dense sampling among some recently diverged taxa. Small, tandem duplications in the control region of mitochondrial genomes have often occurred along the branches of the tetrapod phylogeny. Larger duplications in other parts of the mitochondrial genome are much less frequent and are often found on terminal branches, suggesting that they are generally deleterious. Such mitochondrial duplications occurred recently in the gecko Heteronotia binoei (Moritz 1991) and in mantellid frogs of Madagascar (Kurabayashi et al. 2008), enabling us to address how such changes evolve on short timescales.

There are several mechanisms that can generate duplications in mitochondrial genomes. Duplications in parthenogenetic Heteronotia likely occurred by a slipped-strand mispairing mechanism, whereby the leading strand dissociates during DNA replication and re-anneals at a downstream location, causing the re-synthesis (and therefore duplication) of a portion of the genome (Fujita et al. 2007). In the mantellid frogs, several of the duplications are thought to have occurred by intramolecular recombination, which buds off a mini-circle that can reincorporate at any position back into the genome, creating nontandem duplications (Kurabayashi et al. 2008; see also Mueller and Boore 2005 for examples in salamanders).

In H. binoei, two independent lines of evidence support the idea that increases in mitochondrial genome size arose nonadaptively. First, large duplications are restricted to two independent parthenogenetic lineages (Fujita, Boore and Moritz 2007; Fig. 1A, B), where smaller mitochondrial effective population size reduces the ability of selection to purge these large duplications (the theory behind changes in mitochondrial Ne for parthenogenetic lineages is explained further in Paland and Lynch 2006). More generally, such duplications have occurred frequently in multiple asexual, but not sexual, lineages across squamates (Fujita, Boore and Moritz 2007). Second, in line with Paland and Lynch's (2006) result in Daphnia, we find a significantly higher Dn/Ds in the two independent parthenogenetic lineages than in sexual lineages (Table 1). This pattern exists both in a thoroughly sampled dataset of 215 sequences of the mitochondrial gene NAD2 and in a large concatenated alignment of 10 genes sampled in 13 individuals (see the Methods section). Together, these two lines of evidence indicate that large genomic duplications are restricted to lineages where the efficiency of selection has been reduced.

Figure 1.

Maximum likelihood phylogenies used in the analyses. (A) Heteronotia dataset 1, containing 215 NAD2 sequences. Sizes of duplicated regions are indicated after sequence names where duplications have been found. (B) Heteronotia dataset 2, from an alignment of 10 mitochondrial protein-coding genes. Sizes of duplicated regions are indicated after sequence names where duplications have been found. (C) Mantellid frogs, with branches labeled with the estimated sizes of inferred duplication events, in bases, based on the scenario from Kurabayashi et al. (2008). Red branches correspond to asexual lineages (A and B) or predicted duplication events (C).

Table 1. Datasets and results of the tests for relaxation of selection on focal branches. The first two rows correspond to two different sets of Heteronotia binoei sequences. The first is an alignment of nad2 sequences from 215 individuals, whereas the second is an alignment of 10 concatenated gene sequences from 13 individuals.

| Taxon | Number of species | Number of individuals | Number of genes | Log likelihood (M0)¹ | Log likelihood (M1)² | LRT P-value | Dn/Ds (M1)³ |
| --- | --- | --- | --- | --- | --- | --- | --- |
| H. binoei | 1 | 215 | 1 | −18323 | −18320 | 0.02 | 0.18 and 0.35 |
| H. binoei | 1 | 13 | 10 | −22724 | −22717 | 9×10⁻⁵ | 0.09 and 0.14 |
| Mantellid frogs | 17 | 17 | 4 | −47045 | −47040 | 5×10⁻⁴ | 0.10 and 0.13 |

¹ All branches constrained to evolve under one Dn/Ds.

² One Dn/Ds for sexual lineages (Heteronotia binoei) or for branches without duplications (mantellids), and another for asexual lineages (H. binoei) or for branches where a duplication is predicted to have occurred (mantellids). Model M1 has one more parameter than model M0.

³ In each case, Dn/Ds is significantly higher in asexual lineages or on branches with duplications.

To explore the generality of the relationship between mitochondrial duplications and selective efficacy, we analyzed mitochondrial genomes from a separate, distantly related tetrapod group for which appropriate data were available: sexually reproducing mantellid frogs of Madagascar. As in H. binoei, duplications tend to co-occur with decreases in selection efficiency, in five branches along the mantellid phylogeny (Fig. 1C). Across the mantellid phylogeny, we find a higher Dn/Ds on branches where a duplication is predicted to have occurred (based on the scenario outlined in Kurabayashi et al. 2008) than on branches where no duplication is predicted to have occurred (Table 1; Fig. 1C). The genes included in the analysis were not themselves involved in the duplication events (Kurabayashi et al. 2008), so these increased Dn/Ds cannot be linked to neo- or subfunctionalization events. Additionally, the association between duplications and increased Dn/Ds cannot simply be a byproduct of a change in reproductive mode (e.g., altered mutational or ecological pressures associated with parthenogenesis), because all these mantellids are sexual.

Our analyses show that genomic duplications occur concomitantly with lower estimated selection strength in the mitochondria of two distantly related tetrapod clades. To our knowledge, this is the first demonstration of a link between genome structure evolution in tetrapods and selective efficiency over short evolutionary timescales. These findings support a view of tetrapod genome evolution where major alterations first fix nonadaptively, but may ultimately contribute to an increase in genomic complexity (Lynch and Conery 2003).


We analyzed mitochondrial genome evolution in clades with sufficient sequence data, large duplications, and low divergence (i.e., Ds is small enough to accurately estimate). Two tetrapod clades were found to meet these criteria. (1) Heteronotia binoei is a lizard species complex with multiple chromosomal lineages, including two asexuals (3N1 and 3N2). We used two datasets to investigate sequence evolution. Dataset 1 contains nad2 gene sequences from 688 individuals, including 320 from the parthenogenetic lineages (3N1: 231, 3N2: 89). These sequences are available from GenBank, or were newly collected according to protocols in Fujita et al. (2010) (new GenBank accession numbers HQ132796-HQ132922). Dataset 2 contains 10 mitochondrial protein-coding genes from 13 individuals, including 10 CA6-type individuals (nine 3N1 and one CA6 sexual; Fujita, Boore and Moritz 2007), one 3N2, and one SM6 sexual. The 3N2 and SM6 sequences were collected according to methodology described in Fujita, Boore, and Moritz (2007), and have been deposited into GenBank (accession numbers HQ153035 and HQ153036). Mitochondrial genomes for sexual individuals have a size of 17 kb, but several parthenogenetic lineages contain large duplications (Fujita, Boore, and Moritz 2007). More specifically, all 3N2 parthenogens possess a duplication of approximately 5 kb in length (Zevering et al. 1991), whereas only some lineages of 3N1 experienced independent and recent duplications that range in size from 1.2 to 10.4 kb (Moritz 1991; Fujita et al. 2007). In certain cases, one copy of a duplicated gene has become a pseudogene (Fujita, Boore, and Moritz 2007). When pseudogenization was detected, we only used the functional copy. (2) Mantellid frog sequence data used in this study come from a group of 17 closely related species, with genome sizes ranging between 9.5 kb and 19 kb (median 13 kb, Kurabayashi et al., 2008). 
Available mitochondrial sequences were downloaded using accession numbers provided by Kurabayashi et al. (2008), who also provide great detail on putative duplication events across sampled mantellids. Protein-coding DNA sequences of cytochrome b and nad subunits 1, 2, and 5 were extracted and clustered as orthologs. In both Heteronotia and mantellid frogs, duplications are up to several kilobases and involve several genes, which can include tRNA, rRNA, and protein-coding genes, as well as the control region in the frogs.

Sequences were translated and aligned at the amino acid level in Seaview (Gouy, Guindon and Gascuel 2010) using the MUSCLE algorithm (Edgar 2004) with default parameters. For multigene datasets, individual gene alignments were then concatenated. Identical sequences were removed from dataset 1 using bppPhySamp (Dutheil and Boussau 2008), leaving a total of 215 sequences for subsequent analyses, including 10 from 3N1 and 9 from 3N2.
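The concatenation and deduplication steps can be sketched in a few lines of Python. This is not bppPhySamp or the Seaview workflow used here; it is a minimal illustration under the assumptions that each gene alignment is held as a dictionary mapping sequence names to aligned sequences, that only individuals present in every gene are kept, and that "identical" means exact string identity. The function names and toy data are hypothetical.

```python
def concatenate_alignments(gene_alignments):
    """Concatenate per-gene alignments (dicts of name -> aligned sequence).

    Assumption: only names present in every gene alignment are retained,
    and genes are joined in the order given.
    """
    if not gene_alignments:
        raise ValueError("at least one alignment is required")
    shared = set.intersection(*(set(aln) for aln in gene_alignments))
    return {
        name: "".join(aln[name] for aln in gene_alignments)
        for name in sorted(shared)
    }


def drop_identical_sequences(alignment):
    """Keep the first occurrence of each distinct sequence, in input order."""
    seen = set()
    kept = {}
    for name, seq in alignment.items():
        if seq not in seen:
            seen.add(seq)
            kept[name] = seq
    return kept


# Toy data: two genes, three individuals; ind1 and ind2 are identical.
genes = [
    {"ind1": "ATGCTA", "ind2": "ATGCTA", "ind3": "ATACTA"},
    {"ind1": "GGATGA", "ind2": "GGATGA", "ind3": "GGATGA"},
]
concatenated = concatenate_alignments(genes)
print(drop_identical_sequences(concatenated))
```

Removing exact duplicates shrinks the dataset without discarding any substitution information, since identical sequences contribute zero-length branches to the tree.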

Maximum likelihood (ML) phylogenetic trees were inferred from nucleotide sequences with PhyML (Guindon and Gascuel 2003), using a GTR + I + Γ model. The inferred ML topology for mantellid frogs was congruent with the published estimate (Kurabayashi et al. 2008). Trees and alignments are available from Treebase (

Ratios of nonsynonymous to synonymous substitution rates (Dn/Ds) were estimated using bppml (Dutheil and Boussau 2008). Topologies were fixed (Fig. 1), but branch lengths and Dn/Ds were reestimated by bppml. Two analyses were performed per topology with either one (M0) or two (M1) free Dn/Ds across branches. For H. binoei, different ratios corresponded to asexual and sexual branches. For mantellids, different ratios corresponded to branches with, and without predicted gene duplications, according to the scenario described by Kurabayashi et al. (2008).
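Because M1 has one more free parameter than M0, the two nested models can be compared with a likelihood ratio test (LRT). The sketch below assumes the standard approach of comparing twice the log-likelihood difference to a chi-square distribution with one degree of freedom; the function name is hypothetical, and the log-likelihoods are the rounded values reported in Table 1, so the resulting P-values are only approximate and need not match the published ones exactly.

```python
import math


def lrt_p_value(lnl_null, lnl_alt):
    """P-value of a likelihood ratio test with one degree of freedom.

    Assumption: 2 * (lnL_alt - lnL_null) follows a chi-square distribution
    with 1 df under the null model. For 1 df the survival function has the
    closed form erfc(sqrt(x / 2)).
    """
    statistic = 2.0 * (lnl_alt - lnl_null)
    if statistic <= 0.0:
        return 1.0  # alternative model fits no better than the null
    return math.erfc(math.sqrt(statistic / 2.0))


# Rounded log-likelihoods (M0, M1) taken from Table 1.
table_1 = {
    "H. binoei, dataset 1": (-18323.0, -18320.0),
    "H. binoei, dataset 2": (-22724.0, -22717.0),
    "Mantellid frogs": (-47045.0, -47040.0),
}
for dataset, (lnl_m0, lnl_m1) in table_1.items():
    print(f"{dataset}: P = {lrt_p_value(lnl_m0, lnl_m1):.2g}")
```

A small P-value indicates that allowing a separate Dn/Ds on the focal branches (asexual lineages or branches with predicted duplications) fits the data significantly better than a single ratio across the tree.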

Associate Editor: H. Innan


We thank J. Bull, C. Moritz, the associate editor, and two anonymous reviewers for helpful comments on the manuscript. BB was supported by a postdoctoral fellowship from the Human Frontier Science Program and the CNRS. JMB and MKF were supported by National Science Foundation postdoctoral research fellowships in biology (Award Nos. DBI-0905867 and DBI-0905714, respectively).