Genomic resources of various organisms have been analyzed to study speciation genetics and the genomic consequences of speciation. Several methods relying on divergence estimates, single nucleotide polymorphism (SNP) data, or estimates of nonsynonymous and synonymous divergence have been developed to identify outlier loci in the genome (Fig. 1). Measures that reflect divergent adaptive evolution include the fixation index Fst, Tajima's D, statistics based on the site frequency spectrum, and, for coding sequences, estimates of nonsynonymous and synonymous divergence or polymorphism.
Figure 1. Signatures of natural selection identified as outlier loci in genome alignments. In this graphic example, the genomes of four species are compared. One gene shows an excess of adaptive mutations in species 2, shown as a high value of the selection measure. The gene is conserved in species 1, 3, and 4, which suggests that the adaptive mutations are associated with selection imposed only in the environment of species 2.
Divergent selection is reflected as significantly increased levels of sequence differentiation. When coding sequences of more than two species are compared, it is possible to test the strength of positive selection that has affected a particular locus in the different species. Such branch-specific estimates can elucidate the different selection pressures that have acted on the divergent species. Within a population, strong selection of a particular trait or an advantageous allele can cause a selective sweep and a depletion of variation within the genomic region spanning the selected locus. Genetic hitchhiking will cause allelic variants linked to the selected locus to increase in frequency. Comparative population genomics approaches have the power to contrast variation within and between species and thereby uncover the present effect of natural selection in current populations and the past effect of natural selection during species divergence. Ultimately, such studies can unravel speciation scenarios and provide candidate genes involved in ecological speciation and reproductive isolation. Such comparative genome approaches have been explored in nonfungal model systems and demonstrate the power of the use of genomic data to study speciation. For example, in the flowering plant Mimulus guttatus, a hybrid lethality allele hitchhiked with a strongly selected allele for copper tolerance (Fig. 2) (Wright et al., 2013). The linkage between copper tolerance and hybrid lethality confers reproductive isolation between copper-tolerant and copper-nontolerant populations. In M. guttatus, hybrid incompatibility is thereby evolving as an incidental by-product of divergent adaptation in distinct environments.
Figure 2. Lineage-specific positive selection and a selective sweep in Mimulus guttatus. In a comparative population genomics study, variation within a population and between populations can be compared. The example shows the distribution of polymorphisms and the frequency of alleles at two loci in the populations of lineages 1 and 2. Different colored circles represent different alleles at the two loci Hl (hybrid lethality) and Tol (metal tolerance). Lineage 1 occurs in a metal-polluted environment, and an allele for metal tolerance at the Tol locus has been fixed in this population (filled red circles). Lineage 2 occurs at an unpolluted site, and the metal-tolerance allele does not exist in this population (orange and open red circles). The strong selection imposed at the Tol locus in the population of lineage 1 has resulted in a selective sweep. At a neighboring locus, an allele conferring lethality in hybrids of lineage 1 and lineage 2 (filled blue circles) has hitchhiked with the metal-tolerance allele and has also become fixed in lineage 1. Lineage 2 has not experienced a similar selective sweep in the Tol region, and different polymorphisms and alleles are present at the Tol and Hl loci. The divergent selection between lineages 1 and 2 at the Tol locus is reflected in an increased divergence, here measured as the fixation index Fst (gray line). The loss of variation mediated by the selective sweep in lineage 1 is shown as a strong decrease in nucleotide variability, here measured as nucleotide diversity π (purple line). The figure was inspired by Wright et al. (2013).
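The two measures plotted in Fig. 2 can be illustrated with a minimal Python sketch. It assumes aligned haplotypes of equal length and uses a Hudson-style estimator of Fst (one minus the ratio of within-population to between-population diversity). This is one of several possible estimators and not necessarily the one used by Wright et al. (2013). The haplotypes and the names `lineage1` and `lineage2` are invented toy data in which lineage 1 has lost all variation, as expected after a selective sweep.

```python
from itertools import combinations


def nucleotide_diversity(haplotypes):
    """Mean number of pairwise differences per site (pi) within one sample."""
    n_sites = len(haplotypes[0])
    pairs = list(combinations(haplotypes, 2))
    diffs = sum(sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs)
    return diffs / (len(pairs) * n_sites)


def between_diversity(pop1, pop2):
    """Mean number of differences per site between haplotypes of two samples."""
    n_sites = len(pop1[0])
    diffs = sum(sum(a != b for a, b in zip(h1, h2)) for h1 in pop1 for h2 in pop2)
    return diffs / (len(pop1) * len(pop2) * n_sites)


def hudson_fst(pop1, pop2):
    """Hudson-style Fst: 1 - (mean within-population pi) / (between-population pi)."""
    pi_within = (nucleotide_diversity(pop1) + nucleotide_diversity(pop2)) / 2
    pi_between = between_diversity(pop1, pop2)
    return 1 - pi_within / pi_between if pi_between > 0 else 0.0


# Toy data (hypothetical): lineage 1 is monomorphic, as after a sweep;
# lineage 2 still segregates several variants in the same region.
lineage1 = ["ACGTACGT"] * 4
lineage2 = ["TCGAACGA", "TCGTACGA", "ACGAACGA", "TCGAACGT"]

pi1 = nucleotide_diversity(lineage1)
pi2 = nucleotide_diversity(lineage2)
fst_value = hudson_fst(lineage1, lineage2)
print(pi1, pi2, fst_value)
```

In real data these statistics are computed in sliding windows along the genome, so that a local drop in π together with a local peak in Fst stands out against the genomic background.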
1. Lessons from stick insects, primates, and fruit flies
The genomic data of 161 individuals of the stick insect Timema cristinae sampled from 28 populations have been used to elucidate the processes of population divergence and the evolution of reproductive isolation during incipient speciation (Nosil et al., 2012). Timema cristinae feeds on two distinct host species. The populations of T. cristinae feeding on these two host species show strong patterns of divergent selection and increased reproductive isolation. Reproductive isolation is expected to evolve as a consequence of ecological specialization. However, in addition to ecological specialization, allopatric and sympatric populations show varying levels of reproductive isolation, which suggests a role of reinforcement selection. Reinforcement selection acts to increase reproductive isolation between species. Among the 28 populations, Nosil and colleagues identified outlier loci – defined as loci with an exceptionally high level of between-population divergence (Nosil et al., 2012). By conducting pairwise comparisons between populations, the authors were able to show that the number of outlier loci increased with geographic distance as a consequence of weaker gene flow and demographic variability between geographically separated populations. Furthermore, some outlier loci could be associated directly with host-specific selection, which supports the importance of ecological adaptation in population divergence. Finally, a considerable number of outlier loci were exclusively present in sympatric or adjacent populations, between which the level of reproductive isolation was stronger. The authors propose that these outlier loci could play a role in the observed reproductive isolation between sympatric populations, and they conclude that incipient speciation in T. cristinae is a product of multiple factors, including divergent host selection, geographic isolation between allopatric populations, and reinforcement between sympatric populations.
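The notion of an outlier locus as one with exceptionally high between-population divergence can be sketched as a simple empirical-quantile filter. The example below is an illustration only and does not reproduce the statistical approach of Nosil et al. (2012). The function name `flag_outliers`, the 0.95 cutoff, and the per-locus Fst values are all hypothetical.

```python
def flag_outliers(fst_by_locus, quantile=0.95):
    """Return loci whose Fst lies at or above an empirical quantile cutoff.

    fst_by_locus maps a locus name to its Fst for one population pair.
    The quantile cutoff is an arbitrary illustrative choice.
    """
    values = sorted(fst_by_locus.values())
    # Index of the cutoff value in the sorted list, clamped to the last element.
    idx = min(len(values) - 1, int(quantile * len(values)))
    threshold = values[idx]
    return {locus for locus, fst in fst_by_locus.items() if fst >= threshold}


# Hypothetical data: 19 weakly differentiated loci and one strongly
# differentiated locus for a single pair of populations.
toy_fst = {f"locus_{i}": 0.01 * i for i in range(1, 20)}
toy_fst["locus_20"] = 0.85

outliers = flag_outliers(toy_fst)
print(outliers)
```

Repeating such a scan for every pair of populations gives one set of outliers per comparison, which can then be related to geographic distance or host use as described above.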
For decades, a large number of studies have aimed to unravel the speciation history of humans. The availability of full genome sequences of several primate species has allowed extensive analyses of genome evolution and evolutionary relationships between species (Locke et al., 2011; Prüfer et al., 2012; Scally et al., 2012). These large genomic data sets provide information on patterns of speciation, including speciation times and demographic changes, such as alterations in effective population sizes and levels of introgression. While these parameters can be assessed as averages for the species, they can also be inferred locally across genome alignments. Local estimates of divergence, effective population size, recombination rate, and incomplete lineage sorting (ILS) supply information on the selection that acted on the genomes in the past and was associated with speciation (Dutheil & Hobolth, 2012). ILS has been used to explore evolutionary forces during the speciation period of human, chimpanzee, and gorilla (Scally et al., 2012). The amount of ILS is a function of the ratio of the ancestral effective population size to the time between two speciation events. For a given time-span between two speciation events, a large amount of ILS reflects a large ancestral effective population size (Dutheil & Hobolth, 2012). Genomic regions that have experienced strong selection during speciation can be recognized as regions devoid of ILS. Species-specific traits and genes that have been affected strongly by natural selection during speciation are expected to be located in such regions where ancestral variation is absent. In fungi, genome-wide estimates of ILS have so far only been obtained in the wheat pathogen Zymoseptoria tritici (synonym Mycosphaerella graminicola) and its closest relatives, as will be described below in section V.3 ‘Host-driven divergence and ecological speciation’ (Stukenbrock et al., 2011).
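The dependence of ILS on ancestral effective population size and on the time between speciation events can be made concrete with the standard neutral coalescent result for three species. Assuming a constant diploid ancestral population of size Ne, no gene flow after the split, and a time span of Δt generations between the two speciation events, the expected proportion of the genome with a gene tree that conflicts with the species tree is (2/3)·exp(−Δt / (2Ne)). The sketch below encodes this relationship. The function name and the numerical values are illustrative assumptions, not estimates from the studies cited above.

```python
import math


def ils_probability(delta_generations, ancestral_ne):
    """Expected proportion of incongruent gene trees for three species.

    Standard neutral coalescent result, assuming a constant diploid
    ancestral effective population size and no gene flow:
        P = (2/3) * exp(-delta_t / (2 * Ne))
    """
    if delta_generations < 0 or ancestral_ne <= 0:
        raise ValueError("time span must be >= 0 and Ne must be > 0")
    return (2.0 / 3.0) * math.exp(-delta_generations / (2.0 * ancestral_ne))


# Hypothetical values: the same time span between speciation events,
# combined with a small and a large ancestral population.
span = 100_000  # generations between the two speciation events
small_ne = ils_probability(span, 10_000)
large_ne = ils_probability(span, 100_000)
print(small_ne, large_ne)
```

For a fixed time span the function increases with Ne, which matches the statement that a large amount of ILS reflects a large ancestral effective population size. A region with locally reduced Ne, for example through selection on linked sites, is correspondingly expected to show little or no ILS.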
Another important model in speciation genetics is Drosophila melanogaster and closely related species. Earlier studies aimed to identify single speciation genes, while more recent studies have focused on the impact of chromosome rearrangements on reproductive isolation between closely related species.
Chromosomal inversions are present in the closely related parapatric species Drosophila pseudoobscura and Drosophila persimilis (Kulathinal et al., 2009). Hybrids of D. pseudoobscura and D. persimilis are rarely found in nature, and male hybrids are sterile. Kulathinal and colleagues have investigated the genomic consequences of three large paracentric chromosomal inversions and their role in speciation (Kulathinal et al., 2009). Inverted chromosomal regions are characterized by significantly higher divergence and strong suppression of recombination, which suggests that gene flow during the initial species divergence affected the inverted regions to a much smaller extent (Kulathinal et al., 2009; McGaugh & Noor, 2012). Recombination is suppressed in these regions as a result of the fatal consequences of meiotic crossing over in heterokaryotypes, as 50% of the products of recombination will be either acentric or dicentric (Fig. 3). Consequently, genes located inside the inversions are ‘protected’ from gene flow with other populations and will diverge to a greater extent. In D. pseudoobscura and D. persimilis, genes located in the inverted regions include those involved in male sterility, male mating success, female species preferences, and hybrid inviability. Higher divergence in these genes may indeed promote reproductive isolation between the two Drosophila species.
Figure 3. Chromosomal rearrangements have considerable impact on meiotic products. The synteny plot on the left illustrates the alignment of chromosome 1 from species 1 and species 2. The red diagonal line shows the homologous match between the two sequences, and the black oval indicates the position of the centromere. The synteny is broken by an inversion (green line) that is visualized as a change in slope of the diagonal line from positive to negative. A paracentric inversion (white and blue areas on the chromosome) that does not span the centromere has negative consequences for the outcome of sexual recombination between the two species. The meiotic products include two normal chromosomes, but also one dicentric and one acentric chromosome. The acentric chromosome cannot survive, and the dicentric chromosome may be pulled apart during meiosis, leading to a random loss or gain of genetic material.
To what extent chromosomal rearrangements can lead to hybrid inviability of fungal pathogens is an important open question. Unexpectedly high levels of structural variation within sexual species show that individuals can mate despite variation in their chromosomal composition. Recombination may, however, be suppressed in the affected genomic regions, as found in Drosophila, leaving these regions more prone to accelerated evolution.