Speciation may be promoted in hybrid zones if there is an interruption to gene flow between the hybridizing forms. For hybridizing chromosome races of the house mouse in Valtellina (Italy), distinguished by whole-arm chromosomal rearrangements, previous studies have shown that there is greater interruption to gene flow at the centromeres of chromosomes that differ between the races than at distal regions of the same chromosome or at the centromeres of other chromosomes. Here, by increasing the number of markers along race-specific chromosomes, we reveal a decay in between-race genetic differentiation from the centromere to the distal telomere. For the first time, we use simulation models to investigate the possible role of recombination suppression and hybrid breakdown in generating this pattern. We also consider epistasis and selective sweeps as explanations for isolated chromosomal regions away from the centromere showing differentiation between the races. Hybrid breakdown alone is the simplest explanation for the decay in genetic differentiation with distance from the centromere. Robertsonian fusions/whole-arm reciprocal translocations are common chromosomal rearrangements characterizing both closely related species and races within species, and this fine-scale empirical analysis suggests that the unfitness associated with these rearrangements in the heterozygous state may contribute to the speciation process.
Areas of contact and hybridization between genetically distinct forms within species are known as “hybrid zones” and are of particular interest for their possible role in speciation (Barton and Hewitt 1985; Harrison 1990; Jiggins and Mallet 2000). One way the hybridizing forms may speciate is if genetic exchange is interrupted in certain regions of the genome, allowing those regions to accumulate genetic incompatibilities. The possibility of divergence with gene flow via these “genomic islands of speciation” (Turner et al. 2005) has sparked considerable recent interest (Pinho and Hey 2010; Smadja and Butlin 2011; Feder et al. 2012; Nachman and Payseur 2012).
A partial interruption to gene flow may occur in hybrid zones between “chromosome races” (forms that differ by chromosomal rearrangements). The F1 hybrids are chromosomal heterozygotes, for which meiotic chromosomal pairing near the rearrangement breakpoints may be incomplete or nonhomologous, inhibiting recombination (Searle 1993), or recombination may be unproductive through the generation of unviable products (Rieseberg 2001). Given that genetic interchange across hybrid zones can only occur via hybrids, genomic regions where recombination fails in hybrids will also fail to show gene flow. That localized failure of recombination in hybrids (recombination suppression) may promote speciation between chromosomal forms was suggested by Rieseberg (2001), and supported theoretically by Navarro and Barton (2003) and empirically by Noor et al. (2001).
Chromosomal rearrangements may also interrupt gene flow in hybrid zones through hybrid breakdown: the reduced fitness of chromosomal heterozygotes relative to homozygotes. Such unfitness may characterize hybridizing races differing by whole-arm chromosomal rearrangements, such as Robertsonian fusions (telocentrics fusing centromerically to form metacentrics) or WARTs (whole-arm reciprocal translocations swapping chromosome arms among metacentrics or between metacentrics and telocentrics). Heterozygotes for these rearrangements can show substantial infertility on chromosomal grounds (Searle 1993; Hauffe and Searle 1998; Castiglia and Capanna 2000; Wallace et al. 2002; Jadwiszczak and Banaszek 2006; Fedyk and Chętnicki 2010; Sans-Fuentes et al. 2010; Nunes et al. 2011), and gene flow across hybrid zones is expected to be most interrupted in genomic regions close to the chromosomal rearrangement breakpoints, which are equivalent to unfitness loci in genic systems with heterozygous disadvantage (Panithanarak et al. 2004; Franchini et al. 2010). Therefore, for whole-arm rearrangements, reduced gene flow near the centromere of the rearranged chromosomes (site of the rearrangement breakpoint) can be explained by either recombination suppression or hybrid breakdown, the two alternative models of chromosomal speciation (Rieseberg 2001; Faria and Navarro 2010; Jackson 2011). Reduced gene flow between hybridizing forms that differ by whole-arm rearrangements has been found in shrews (Sorex spp.), with greater differentiation for rearranged than colinear chromosomes (Basset et al. 2006; Yannic et al. 2009).
Here, we consider the impact of whole-arm chromosomal rearrangements on gene flow across a hybrid zone between chromosome races of the house mouse (Mus musculus). The western subspecies (domesticus) is subdivided into numerous such races, each characterized by a different set of metacentric chromosomes that derive from the ancestral complement of 40 telocentrics (Piálek et al. 2005; White et al. 2010). Abnormalities of meiotic pairing (incomplete or nonhomologous) in the vicinity of the centromere of heterozygous chromosomes have been observed (Searle 1993; Wallace et al. 2002) and could result in reduced recombination or reduced fertility due to germ cell death (resulting from inappropriate gene expression associated with incomplete pairing: Searle 1993; Sans-Fuentes et al. 2010). In males, germ cell death may also follow sex chromosome gene expression abnormalities resulting from interactions between the sex bivalent and unpaired autosomal regions (Johannisson and Winking 1994; Forejt 1996). A further source of reduced fertility in chromosomal heterozygotes is the high frequency of anaphase I nondisjunction with resulting unviable aneuploid embryos (Searle 1993; Hauffe and Searle 1998).
Our study site has been Valtellina, an alpine valley with five chromosome races (Fig. 1; Hauffe and Searle 1993). Two races (Poschiavo and Mid Valtellina: CHPO and IMVA) have metacentric 8.12 and telocentrics 2 and 10 (hereafter referred to as the “8.12 races”), whereas two others (Lower and Upper Valtellina: ILVA and IUVA) have metacentrics 2.8 and 10.12 (“10.12 races”). Additionally, standard all telocentric mice (40ST) are found in two villages (Fig. 1).
Hybrids between house mouse chromosome races can be chromosomally heterozygous in two different ways (Searle 1993). Where races differ by a metacentric versus the homologous telocentrics, the hybrids are “simple heterozygotes” that produce a chain-of-three meiotic configuration, whereas races differing by metacentrics with monobrachial (single arm) homology form hybrids that are “complex heterozygotes,” with longer meiotic chain or ring configurations. Metacentrics with monobrachial homology are formed by independent accumulation of different Robertsonian fusions in two races, or by WARTs. The “8.12” and “10.12” races in Valtellina generate hybrids with a meiotic chain-of-five configuration (2–2.8–8.12–12.10–10; Hauffe and Searle 1998). These complex heterozygotes show substantial infertility (Hauffe and Searle 1998) and are expected to suffer disrupted meiotic chromosomal pairing near the centromeres in the heterozygous configuration (Searle 1993), and hence recombination suppression.
Therefore, by comparing the “8.12” and “10.12” races, we reflect the major chromosomal difference among chromosomal forms in Valtellina. Within both the 8.12 and 10.12 races, chromosomes 7 and 18 occur as telocentrics or as a metacentric. This difference is unlikely to be a major hindrance to gene flow because crosses between mice with telocentrics 7 and 18 and metacentric 7.18 generate only single simple heterozygotes (with one chain-of-three configuration at meiosis). As previously (Panithanarak et al. 2004), by simplifying Valtellina to a two-taxon rather than a four-taxon system, statistical power is maximized.
That previous study of the Valtellina hybrid zone showed differentiation of the 8.12 and 10.12 races at microsatellite loci close to the centromeres of chromosomes 10 and 12, but not distally on these chromosomes nor centromerically on other chromosomes common to the 8.12 and 10.12 races (Panithanarak et al. 2004). This suggests that the centromeric differentiation on chromosomes 10 and 12 can be attributed to chromosomal rearrangements with breakpoints at the centromeres of those chromosomes, and not due to some generalized “centromere effect” (Carneiro et al. 2009; Neafsey et al. 2010; Pinho and Hey 2010; Nachman and Payseur 2012). Our interpretation that chromosomal rearrangements with centromeric breakpoints in the house mouse have an impact on gene flow is also supported by a small-scale study on another hybrid zone between mouse chromosome races (Franchini et al. 2010). Therefore, the results of Panithanarak et al. (2004) indicate that gene flow is interrupted between hybridizing chromosome races in Valtellina as a result of the chromosomal rearrangements that distinguish them, but the exact extent of this effect along the chromosome arms was not determined.
In the present study, we have used the same specimens from the Valtellina zone as Panithanarak et al. (2004), and focused on the same chromosomes (10 and 12), but substantially increased the number and distribution of loci. Thus, our analysis of gene flow across the Valtellina zone is based on a total of 17 microsatellites along chromosome 10, and 23 along chromosome 12 (adding 30 loci to those studied by Panithanarak et al. 2004). The aim of our study is to extend the analysis of those particular chromosomes shown previously to have strong genic differentiation between chromosomal races in the Valtellina zone. All mice were karyotyped and sampled as intensively as possible from relatively low density populations in a small geographic area (a 20 km river valley) over three years (Hauffe and Searle 1993).
Furthermore, to interpret these results, we conducted computer simulations of gene flow between chromosome races in a hybrid zone. We simplified the hybridization to a situation where one chromosome was present as a telocentric in one race and as one arm of a metacentric in the other, and examined the impact of hybridization on levels of differentiation for loci along that chromosome arm. These results can be generalized to the individual heterozygous chromosomes of any Robertsonian or WART heterozygote, and we were able to assess the effect of hybrid breakdown and recombination suppression on locus-by-locus differentiation. As well as exploring the impact of hybrid breakdown and recombination suppression on differentiation close to the centromere, we also examined the role of selective sweeps and epistasis in modifying this pattern. Selective sweeps that spread through the 8.12 and 10.12 hybridizing races (relating to origin and spread of a universally advantageous allele) could locally reduce differentiation within any otherwise differentiated region. Epistasis between centromeric and distal loci on the same chromosome could have the reverse effect to cause distal markers to show differentiation.
Our simulations followed as closely as possible the biological features of house mice, building on data on ecology and hybrid fitness as determined in Valtellina (Hauffe and Searle 1998; Hauffe et al. 2000; Piálek et al. 2001). Test simulations reflecting the apparently demic structure of house mouse populations recorded in Valtellina (Hauffe et al. 2000; Piálek et al. 2001) gave unrealistic results under a range of migration rates (0.1 ≤ m ≤ 0.3). A much better fit with the observed findings was obtained when the 8.12 and 10.12 races were treated as simplified large panmictic populations, and we present those results here. There are other situations where ecological parameters measured over relatively short time periods relate poorly to genetic structure; for example, field estimates of dispersal tend to be too low in hybrid zone analysis (Barton and Hewitt 1985).
The house mouse is a primary model for “chromosomal speciation” (Rieseberg 2001; Coyne and Orr 2004) and the process of hybridization between the chromosome races is critical (Capanna and Corti 1982; Baker and Bickham 1986; King 1993; Piálek et al. 2001; Franchini et al. 2010). Here we present the most fine-scale empirical analysis of gene flow across a chromosomal hybrid zone in the house mouse coupled with the first simulations comparing hybrid breakdown and recombination suppression as competing models of chromosomal speciation (Rieseberg 2001) in house mouse hybrid zones. As Nachman and Payseur (2012) have pointed out, simulation modeling is a much-needed addition to molecular studies to help understand speciation with gene flow.
To compare microsatellites between the 8.12 and 10.12 races in Valtellina, the 99 specimens analyzed similarly in Panithanarak et al. (2004) were used here, following the same methods of molecular typing. The entire set of microsatellite loci analyzed is illustrated in Figure 2, including the 30 new loci and D10Mit75, 246, 51, 180, and 103 and D12Mit145, 182, 11, 30, and 8 for which data were available from Panithanarak et al. (2004); one other locus from that study (D12Mit44) was excluded because Panithanarak et al. reported conflicting data on its mapping position.
The individual house mice typed for the 30 new microsatellite loci were distributed over 14 villages in Valtellina and microsatellite allele frequencies were calculated on a village-by-village basis for comparison with the cytogenetic data collected by Hauffe and Searle (1993). This is illustrated with an example in Figure 1. We were able to genotype all 99 individuals for all new loci except for D12Mit153 in the following cases (numbers of individuals that failed in parentheses): Sondalo (6), Villa di Tirano (1), Lovero (2).
To test for differentiation between the 8.12 and 10.12 races, locus-by-locus analyses of molecular variance (AMOVAs) were conducted using Arlequin 3.5 (Bern, Switzerland) (Excoffier and Lischer 2010), generating fixation indices at three levels (among races, among villages within races, and among villages). Migiondo, Grosio, Grosotto, Farm Via Prada, and Vione were villages characterized by the 8.12 races and Sontiolo, Tiolo, Lago, Lovero, Biolo, Sernio and Villa di Tirano had the 10.12 races (Fig. 1). Sondalo and Sommacologna were villages with both 8.12 and 10.12 races. However, not all individuals within a village had a fully homozygous 8.12 or 10.12 karyotype due to introgression between the races or involving the 40ST mice. Because of the occurrence of introgression and villages with mixed racial characteristics, we performed four different AMOVAs (hereafter called analyses W–Z) to examine the data from different perspectives. For the analyses W and X, we limited ourselves to homozygous pure race individuals, taking the view that these are the best representatives of the races and therefore the best reflection of differentiation between them. For analysis W (n = 88), we included the Sondalo and Sommacologna populations, and divided them into 8.12 and 10.12 subpopulations. For analysis X (n = 73), we excluded Sondalo and Sommacologna on the grounds that these mixed populations were not equivalent to the others.
For analyses Y and Z, we took a different approach considering all individuals within the villages as representing the races associated with those villages, even if some of the individuals are not fully homozygous for the 8.12 and 10.12 races. Thus, this represents the view that each village used to be composed entirely of such fully homozygous individuals and that it is through introgression that some individuals no longer have such fully homozygous karyotypes. To ignore that introgression could be viewed as failing to reflect the full extent of gene flow between the races. The mixed populations of Sondalo and Sommacologna were treated in different ways from the other villages. In analysis Y (n = 79) they were excluded (so that we only compared “8.12” and “10.12” populations), in Z (n = 99) they were treated as a third type of population (a mixed “8.12+10.12” population).
These four types of analysis look at introgression in different ways. The Sondalo and Sommacologna populations containing both the 8.12 and 10.12 races can either be excluded or analyzed together with the single race populations; hence the options W versus X and Y versus Z. To consider just homozygous individuals for each race (W, X) focuses on proven genetic infiltration of the chromosomes 10 and 12 between the 8.12 and 10.12 races, whereas considering individuals at the population level (Y, Z) increases the sample sizes and reveals the full set of alleles that have introgressed into the populations in a manner more typical of hybrid zone analysis.
For each of the four analyses, the among race FST values for individual loci from the AMOVA were plotted against cM distance along the relevant chromosome. The effect of distance from the centromere on FST was determined by fitting a linear regression model with square-root arcsine transformed FST values. A second, maximal, model was fitted with both distance from centromere and square-root arcsine transformed mean expected heterozygosity (calculated in Arlequin) as the independent variables, and whether heterozygosity made a significant contribution to the model was assessed by comparing the two models using a likelihood ratio test. Heterozygosity at microsatellite loci is thought to be positively correlated with mutation rates (Amos et al. 1996), so this factor was introduced to control for differences in the unobserved mutation rates between loci. Models were fitted and compared using R.
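The model comparison described above can be illustrated with a minimal Python sketch. It is not the R code used in the study. The data values, the function names (`arcsine_sqrt`, `ols_rss`, `lrt_pvalue`), and the clipping of negative FST estimates to zero before transformation are all assumptions made for illustration. The likelihood ratio statistic is taken as n·ln(RSS_reduced/RSS_full) for nested Gaussian linear models, compared with a chi-square distribution with one degree of freedom.

```python
import math

def arcsine_sqrt(p):
    """Arcsine square-root transform; values are clipped to [0, 1] (assumption)."""
    p = min(max(p, 0.0), 1.0)
    return math.asin(math.sqrt(p))

def ols_rss(X, y):
    """Least-squares fit via the normal equations.

    Returns (coefficients, residual sum of squares)."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    rss = sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2
              for r, yi in zip(X, y))
    return beta, rss

def lrt_pvalue(rss_reduced, rss_full, n):
    """Likelihood ratio test for nested Gaussian models differing by one term."""
    stat = n * math.log(rss_reduced / rss_full)
    # Upper tail of the chi-square distribution with 1 degree of freedom
    return stat, math.erfc(math.sqrt(max(stat, 0.0) / 2.0))

# Hypothetical values, NOT data from the study
distance_cm = [0.0, 2.0, 5.0, 10.0, 20.0, 40.0, 60.0]
fst = [0.30, 0.25, 0.22, 0.10, 0.05, 0.00, -0.01]
het = [0.70, 0.65, 0.72, 0.60, 0.68, 0.70, 0.66]

y = [arcsine_sqrt(v) for v in fst]
h = [arcsine_sqrt(v) for v in het]

# Reduced model: intercept + distance; maximal model adds heterozygosity
beta_reduced, rss_reduced = ols_rss([[1.0, d] for d in distance_cm], y)
beta_full, rss_full = ols_rss([[1.0, d, hi] for d, hi in zip(distance_cm, h)], y)
lr_stat, p_value = lrt_pvalue(rss_reduced, rss_full, len(y))
```

A negative slope (`beta_reduced[1]`) corresponds to FST declining with distance from the centromere, and a large `p_value` indicates that heterozygosity adds little once distance is in the model.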
If recombination suppression leads to elevated FST for a region near the centromere, this may be expected to result in a segmented curve, rather than a smooth decline in FST with distance from the centromere. Thus, the nonhomologous pairing that leads to recombination suppression is expected to be in a limited region close to the centromere, based on empirical results (e.g., Wallace et al. 2002). To test whether parts of our curves had different slopes, we used the davies.test function from the segmented R package (Muggeo 2008). A one-tailed test was used, as under recombination suppression we expect the slope to be lower nearer the centromere.
We conducted simulations to determine whether the observed patterns of differentiation along the two chromosomes (10 and 12) in house mice in Valtellina were consistent with a meeting of two races with a greater or lesser effect of recombination suppression in the centromeric region of heterozygous chromosomes in hybrids and/or hybrid breakdown. We also simulated specific conditions of epistasis and selective sweeps under circumstances where there were differing degrees of recombination suppression or hybrid breakdown. The full range of conditions are summarized in Table S1 and described in detail below.
In our model, we simulate two populations of constant size with 100 individuals. Each individual within a population has two chromosomes, each of which can be in one of two dispositions, M (where the chromosome is one arm of a metacentric) or T (where the chromosome exists as a telocentric), so that individuals can be MM, TM, MT, or TT. As an example, consider chromosome 16 in northern Italy, which may occur in an unfused state (as a telocentric) or in a fused state (as part of the metacentric 16.17): MM relates to individuals homozygous for 16.17, TM and MT relate to individuals that are heterozygous with telocentrics 16 and 17 and a metacentric 16.17, and TT relates to individuals with 16 and 17 in a homozygous telocentric state. In the simulations, we are only concerned with the behavior of the focal chromosome, that is, chromosome 16 in this example.
Further attributes of the simulation are as follows: along a chromosome arm are 80 microsatellite loci, at regular intervals of 1 centimorgan (cM). Initially, all the individuals in population 1 are homozygous TT, and all individuals in population 2 are homozygous MM. Microsatellite loci start with an allele of length 60 in the metacentric population, and 55 in the telocentric population. The model was run for 1000 generations with no gene flow to generate an approximate mutation-drift equilibrium. We used a strict stepwise model of microsatellite mutation and a mutation rate of 5 × 10⁻⁴ (Estoup and Angers 1998).
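As an illustration of the strict stepwise mutation model mentioned above, a minimal Python sketch follows. The function names are hypothetical and this is not the original simulation code. It assumes that gains and losses of one repeat unit are equally likely and that no lower bound on allele length is enforced.

```python
import random

MU = 5e-4  # per-locus, per-generation mutation rate quoted in the text

def mutate(allele_length, rng, mu=MU):
    """With probability mu, change the allele length by exactly one step.

    Assumption: +1 and -1 steps are equally likely."""
    if rng.random() < mu:
        allele_length += rng.choice((-1, 1))
    return allele_length

def mutate_gamete(alleles, rng, mu=MU):
    """Apply the stepwise model independently to every locus on a gamete."""
    return [mutate(a, rng, mu) for a in alleles]

# Example: 80 loci, all starting at length 60 as in the metacentric population
rng = random.Random(1)
gamete = mutate_gamete([60] * 80, rng)
```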
In the next generation, the inhabitants of each population are produced by picking parents at random. After the 1000-generation equilibrating phase, the model was run for a further 400 generations (chosen because the villages are thought to have been recolonized by house mice following a flood event approximately 400 generations ago: Piálek et al. 2001). Migration between populations occurred at a rate of 0.01. Parents were picked from within the current population with probability 0.99 or from the adjacent population with probability 0.01. If the parent picked was a chromosomal heterozygote, it was rejected with probability schrom (selection coefficient against chromosomal heterozygotes) and another parent was picked until a parent passed the schrom selection test. With equal probability, the gamete passed on by a parent may be one or the other of its own gametes. First, the chromosomal disposition of the selected parental gamete is copied into the new gamete. Then, the allele at locus 1 (the closest to the centromere) is copied. With probability 0.99, this is the cis allele (on the same parental gamete as the centromere just copied), or with probability 0.01, this is the trans allele. For locus 2 (the next closest to the centromere), the same procedure applies, but this time the point of reference for recombination is locus 1, rather than the centromere. Loci are spaced 1 cM apart, so, by definition each neighboring pair of loci experiences 1% recombination per generation. Therefore, with probability 0.99, the allele copied is on the same parental gamete as that of locus 1, or with probability 0.01, it is copied from the other parental gamete.
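The parent-sampling step described above (migration followed by fertility selection against chromosomal heterozygotes) might be sketched as follows. The data layout and the names `is_heterozygote` and `pick_parent` are hypothetical. The sketch also assumes that the source population is redrawn each time a candidate parent is rejected, which the text does not specify.

```python
import random

# An individual is a pair of gametes; each gamete is (disposition, alleles),
# where disposition is "M" (arm of a metacentric) or "T" (telocentric).

def is_heterozygote(individual):
    """True for TM or MT individuals."""
    return individual[0][0] != individual[1][0]

def pick_parent(home, neighbour, rng, m=0.01, s_chrom=0.25):
    """Draw one parent, allowing migration and rejecting heterozygotes.

    m: probability that the parent comes from the adjacent population.
    s_chrom: probability that a chromosomally heterozygous candidate is rejected."""
    while True:
        source = neighbour if rng.random() < m else home
        candidate = rng.choice(source)
        if is_heterozygote(candidate) and rng.random() < s_chrom:
            continue  # fails the fertility test; draw another candidate
        return candidate

# Example with single-locus gametes for brevity
TT = (("T", [55]), ("T", [55]))
MM = (("M", [60]), ("M", [60]))
TM = (("T", [55]), ("M", [60]))
rng = random.Random(7)
parent = pick_parent([TT, TM], [MM], rng)
```

With `s_chrom = 0` heterozygotes breed as freely as homozygotes; as `s_chrom` rises, heterozygotes contribute proportionally fewer gametes to the next generation.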
In chromosomal homozygotes, recombination occurs as specified above. However, in chromosomal heterozygotes, recombination may be suppressed. A recombination suppression effect, rs, can be set, which applies to the loci around the centromere in the region 0 to +10. A random number between 0 and 1 is drawn, and if this is less than rs, no recombination occurs in this region, that is, the cis allele is always inherited. For loci outside this region, recombination then occurs as normal. There were no clear empirical data on which to base the region of recombination suppression. However, we also carried out analyses using regions 5 and 20 cM from the centromere, and got qualitatively similar results as with the 10 cM region chosen.
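Gamete formation with locus-by-locus recombination and optional centromeric suppression could be sketched as below. This is an illustrative reading of the description, not the original code. The function name `make_gamete` and its parameters are hypothetical, and the "region 0 to +10" is assumed to correspond to the first 10 loci.

```python
import random

def make_gamete(parent, rng, r=0.01, rs=0.0, suppressed_loci=10):
    """Build one gamete from a parent ((disp, alleles), (disp, alleles)).

    r: recombination probability between neighbouring reference points (1 cM).
    rs: probability that recombination is blocked in the centromeric region
        of a chromosomal heterozygote (one draw per gamete)."""
    strand = rng.randrange(2)            # gamete supplying the centromere
    disposition = parent[strand][0]
    heterozygous = parent[0][0] != parent[1][0]
    block_suppressed = heterozygous and rng.random() < rs
    alleles = []
    for locus in range(len(parent[0][1])):
        in_block = locus < suppressed_loci
        # A crossover switches the strand being copied, unless this locus
        # lies in a suppressed centromeric block
        if not (block_suppressed and in_block) and rng.random() < r:
            strand = 1 - strand
        alleles.append(parent[strand][1][locus])
    return disposition, alleles

# Example: a TM heterozygote with 80 loci and strong suppression
parent = (("T", [55] * 80), ("M", [60] * 80))
rng = random.Random(0)
disposition, alleles = make_gamete(parent, rng, rs=0.99)
```

Because each locus uses the previous locus as its reference point, the probability that a distal allele remains in cis with the centromere decays with map distance, which is what allows distal loci to introgress more readily than centromeric ones.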
Parameter values chosen were recombination suppression rs = 0, 0.5, 0.9, 0.99, and selection against chromosomal heterozygotes schrom = 0, 0.25, 0.5. From pachytene studies of hybrids between chromosome races of the house mouse, nonhomologous pairing around the centromere appears to be almost universal for heterozygous chromosomes (Wallace et al. 2002), hence the justification for these high values for recombination suppression. However, there are also empirical data that suggest that recombination does, at least occasionally, occur in the centromeric region, so it would be inappropriate to have rs = 1 (Panithanarak et al. 2004). The selection against heterozygotes has previously been estimated as 0.221 (Piálek et al. 2001) based on empirical data (Hauffe and Searle 1998) and the values used here are close to, above, and below that value. For each parameter set, we ran 1000 replicates of the simulation.
At the end of each simulation run, 30 individuals were sampled from each population, and FST values were calculated for each locus following Weir and Cockerham (1984). The 95% quantiles of FST for each locus were calculated in R, and plotted against distance along the chromosome. For half of the simulation runs, only homozygous TT individuals were sampled from population 1 and only homozygous MM individuals were sampled from population 2. This corresponds approximately to our first two sampling approaches (W and X) to calculate AMOVAs for the empirical data. For the other half of the simulation runs, 30 individuals were sampled from each population without respect to karyotype. This corresponds approximately to our third and fourth empirical sampling approach (Y and Z), treating the 8.12 and 10.12 races as populations.
To help understand our empirical results where isolated groups of loci well-separated from the centromere show differentiation, we simulated representative examples of a selective sweep and an epistatic interaction. For the selective sweep, in the telocentric race 100 individuals were set to be heterozygous AA+ at locus 5. The allele A+ is favored, with selection coefficient s operating against alleles A and a. For epistasis, cis interactions were simulated between chromosome configurations and locus 25. The fitness scheme for this cis epistasis is shown in Table 1. Thus, we assume that although selection against chromosomal heterozygotes operates through differential fertility, selective sweeps and epistasis operate through differential viability. Thus, when individuals of the next generation are created, they must pass a selection test before they can enter the population. Individuals have a fitness parameter, which is one minus each source of unfitness. A random number between zero and one is drawn. If this is less than the fitness, then that individual is allowed to enter the population. If not, the process is repeated until an individual passes the selection test.
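The viability selection test described above amounts to rejection sampling, which can be sketched as follows. The function names are hypothetical. The sketch assumes that the sources of unfitness combine additively and that fitness is bounded below at zero; the text does not state either point explicitly.

```python
import random

def fitness(unfitness_terms):
    """One minus each source of unfitness (additive; floored at 0 by assumption)."""
    return max(0.0, 1.0 - sum(unfitness_terms))

def draw_viable_offspring(make_offspring, unfitness_of, rng):
    """Generate candidate offspring until one passes the viability test.

    make_offspring: callable returning a new candidate individual.
    unfitness_of: callable returning that candidate's list of unfitness terms."""
    while True:
        child = make_offspring()
        if rng.random() < fitness(unfitness_of(child)):
            return child

# Example: a genotype carrying the disfavoured allele pays s = 0.1
rng = random.Random(5)
genotypes = ["A+A+", "AA+", "AA"]
child = draw_viable_offspring(
    lambda: rng.choice(genotypes),
    lambda g: [0.1] if "AA" in g else [0.0],
    rng,
)
```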
Table 1. Fitness scheme for cis epistasis (see Methods). Paternally derived alleles/chromosomes are listed first. [Table body not recovered; surviving column header: genotype at locus 25.]
Tables S2–S5 show the full AMOVA results, and Table 2 brings together the among race FST values for all four analyses (W–Z: see Methods). The results vary between chromosomes and between the analyses.
Table 2. Among race differentiation revealed through analyses of molecular variance (AMOVAs), with four different analyses (W–Z, see Methods) for (A) 17 microsatellite loci on chromosome 10 and (B) 23 loci on chromosome 12
For chromosome 10, analyses W–Y give very similar results: high, usually significant FST values between 0 and 5.5 cM from the centromere, low and nonsignificant values between 5.5 and 12.0 cM, high and significant values at the neighboring loci at 14.2 and 15.3 cM, and then low and nonsignificant values for the more distal parts of the chromosome (Table 2). Different results are obtained with analysis Z (where all individuals from the target villages are included, and where the villages are in three categories: “8.12,” “10.12,” and “8.12 + 10.12”). Here all 12 loci between zero and 20.8 cM have high, usually significant FST values, except for single loci at 5.5 and 12.0 cM, whereas the four more distal loci had low and nonsignificant values.
For chromosome 12, the four analyses gave similar results: high, usually significant FST values between zero and 9.8 cM from the centromere, low and nonsignificant values between 9.8 and 10.9 cM, high and significant values at a locus at 12.0 cM and at 21.9 cM, whereas loci in between these last two and distal to the 21.9 cM locus showed low and nonsignificant values (Table 2). Again analysis Z generated a larger number of loci with significant FST values, but the discrepancy with the other analyses was not as marked as for chromosome 10.
Figure 3 shows graphs of the relationship between FST values and distance from the centromere for each of the analyses. For chromosome 12, irrespective of analysis, distance from the centromere has a significant negative effect on FST. For chromosome 10, it is only analysis Z that shows a strong relationship (highly significant).
Mean expected heterozygosity did not vary with locus position for any of the analyses, and heterozygosity had no additional negative effect on differentiation of loci (i.e., after accounting for distance from the centromere) in any of the analyses. There was no statistical support for a breakpoint in the regression line of FST versus distance along the chromosome in either chromosome 10 or chromosome 12 for any of the analyses, although we caution that the Davies test may not have sufficient power to reflect a weak positive result of this sort.
Figures 4 and 5 show the results of our simulations. In these graphs, locus 0 is the centromere, which has only two alleles, metacentric or telocentric.
For the simulation runs where individuals are sampled with respect to karyotype (Fig. 4A), recombination suppression does not prevent chromosomes from moving between populations, but we do not sample these. Recombination is reduced around the centromere, leading to higher linkage disequilibrium between the centromere and loci in the region of suppression, and as we are only sampling homozygotes of each race, this generates high FST values around the centromere. Recombination suppression has little effect on the value of FST unless recombination suppression is greater than 0.5. Unfitness of chromosomal heterozygotes alone, and particularly in synergy with recombination suppression, acts to enhance differentiation at loci close to the centromere.
A different picture emerges when the two races are sampled as populations without respect to karyotype (Fig. 5A). When there is recombination suppression but no selection against heterozygotes, recombination suppression has no effect. This is because the different chromosomes can pass freely from one population to the other along with their linked loci, and as FST is assessed between geographic populations here rather than between chromosome races, FST values decline very quickly on secondary contact. However, hybrid breakdown can slow the spread of chromosomes from one population to the other, and can generate regions of higher FST close to the centromere. This effect of hybrid breakdown is enhanced when recombination suppression is greater than 0.5.
Figure 4B shows the results of the simulation model when there is a selective sweep at locus 5 with s = 0.1, when only homozygous individuals are sampled. With low levels of recombination suppression, a selective sweep produces regions of low FST close to the centromere. With no selection against chromosomal heterozygotes, as recombination suppression increases there is more noise in the data as in many runs of the model there are insufficient numbers of homozygotes of both races. This is because the chromosome linked to the favored allele also tends to become fixed throughout both populations. This occurs less frequently when there is selection against chromosomal heterozygotes, as this prevents either chromosomal configuration from spreading into a population of the other configuration. These are the conditions when a localized signal of the selective sweep can be seen.
When individuals are sampled without respect to karyotype (Fig. 5B), at moderate hybrid breakdown but high recombination suppression, the chromosome that started with the favored allele is swept to fixation. As all the individuals in both populations will then have chromosomes TT, the FST values between them are very low.
Figures 4C and 5C show the results of the simulation model when there is epistatic selection between the chromosome type and locus 25 with s = 0.1. Results are similar for both sampling regimes. Under a wide range of conditions this produces a high FST value at the centromere and locus 25, with a trough in between.
In the hybrid zone between the 8.12 and 10.12 races of house mouse in Valtellina, both chromosomes 10 and 12 showed high differentiation at the centromere and a decay in that differentiation (significant in all analyses for chromosome 12) with distance along the chromosome. Thus, loci show a greater interruption to gene flow if they are closer to the centromere. This supports previous results in the same hybrid zone (Panithanarak et al. 2004) and on another hybrid zone between chromosome races in the house mouse (Franchini et al. 2010). These earlier studies showed a difference between loci in the immediate vicinity of the centromere in comparison with loci in the immediate vicinity of the distal telomere. In our study, we have been able to show that the region of differentiation at the centromere extends substantially into interstitial regions of the chromosome. This region extends approximately 5.5 cM for chromosome 10 and 9.8 cM for chromosome 12, which equate to physical distances of about 19 Mb and 33 Mb from the centromere, respectively, based on marker location in the house mouse genome (Mouse Genome Informatics, Jackson Laboratory: http://www.informatics.jax.org/). The total map and physical lengths of the chromosomes are 78 cM and 130 Mb for chromosome 10 and 64 cM and 121 Mb for chromosome 12. Thus, for chromosome 12, the region of differentiation extends from the centromere over a quarter of the physical length of the chromosome.
When certain parts of the genome are more differentiated between two populations than others, the cause may be something other than interruption to gene flow. Indeed, populations that are not interbreeding may show variable patterns of genetic differentiation across the genome due, for instance, to selective sweeps in regions of low recombination (Noor and Bennett 2009; Nachman and Payseur 2012). The populations we examined were all within a 20-km stretch of hybrid zone and all genetic studies show evidence of individuals moving between populations and interbreeding (Hauffe and Searle 1993; Fraguedakis-Tsolis et al. 1997; Hauffe et al. 2004; Panithanarak et al. 2004). Therefore, there is the opportunity for gene flow, which could potentially involve all parts of the genome. Levels of interbreeding undoubtedly do have an impact: where the 8.12 and 10.12 races occur in the same village (Sondalo and Sommacologna), the 8.12 and 10.12 individuals in the population are no longer differentiated in the centromeric regions of chromosomes 10 and 12 (Panithanarak et al. 2004).
In the Valtellina hybrid zone, the greatest differentiation detected between the 8.12 and 10.12 races is at the centromeres of chromosomes 10 and 12. The centromere itself is an area of reduced recombination in the house mouse and other species (Nachman and Churchill 1996) and when two differentiated taxa are in contact and hybridize, there are grounds for expecting the centromeric regions to be more differentiated than other parts of the genome (Carneiro et al. 2009; Pinho and Hey 2010; Nachman and Payseur 2012). However, the greater differentiation seen in the proximal rather than the distal regions of chromosomes 10 and 12 does not appear to be primarily due to a “centromere effect.” First, in addition to chromosomes that do differ by chromosomal rearrangements between races, both Panithanarak et al. (2004) and Franchini et al. (2010) examined chromosomes that did not differ between the races (i.e., that are colinear), and those did not show differentiation at the centromere. Panithanarak et al. (2004) and Franchini et al. (2010) found centromeric differentiation on the two and five rearranged chromosomes they examined, respectively, and none on the two and one colinear chromosomes they examined. Second, the region of differentiation that we find appears very large for a centromere effect. In their study of hybridization of two highly differentiated subspecies of rabbits, Carneiro et al. (2009) demonstrated a centromere effect in only three of five chromosomes examined, and a subsequent study on one of the affected chromosomes showed that the differentiation was highly localized to the immediate vicinity of the centromere (Carneiro et al. 2010); a marker about 5 Mb from the centromere did not show differentiation. Likewise, in Anopheles mosquitoes, the centromere effect is very restricted (see Fig. 1 in Neafsey et al. 2010).
It appears most reasonable, therefore, that the genic differentiation on chromosomes 10 and 12 as seen on hybridization of the 8.12 and 10.12 races in Valtellina reflects the chromosomal rearrangements that affect these two chromosomes. Hybrids between the 8.12 and 10.12 races form a meiotic chain-of-five configuration known to be associated with reduced fertility (Hauffe and Searle 1998) and may predictably show abnormal (including nonhomologous) pairing around the centromere (Searle 1993). So, the competing models of “chromosomal speciation”—hybrid breakdown and recombination suppression (in hybrids)—are both possible in this system.
These competing possibilities were examined in the simulation study. The simulation study did not use a demic structure because that gave unrealistic preliminary results. This failure is of interest because much emphasis has been placed on the demic structure to explain genetic features of mouse populations (Selander 1970; Boursot et al. 1993; Pocock et al. 2005). But over longer periods of time, which could be more important in terms of gene flow across a hybrid zone, mice may approximate to single undivided populations. This requires further investigation and the issue is of interest, given that the house mouse is such a key model in evolutionary genetics (Macholán et al. 2012). As a demic structure was used in previous simulations relating to hybrid race formation in Valtellina (Piálek et al. 2001), it would also be interesting to reexamine that using simulations based on single populations.
Considering the outcome of the simulation study, hybrid breakdown appears to explain the results remarkably well. Moderate levels of unfitness (s = 0.25) could account for the decay in differentiation along the chromosome observed (Figs. 4A, 5A). This degree of hybrid breakdown fits well with empirical results (s = 0.221: Piálek et al. 2001).
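The logic by which hybrid breakdown alone can produce a decay in differentiation with distance from the centromere can be illustrated with a minimal deterministic sketch. It is not the simulation model used in this study: it assumes two infinitely large demes exchanging gametes symmetrically, a single neutral biallelic marker rather than microsatellites, and it tracks the between-deme difference in marker allele frequency rather than FST. The migration rate `m = 0.02`, the 200 generations, and all function names are hypothetical choices for illustration; only s = 0.25 is taken from the text.

```python
from itertools import product

# A gamete is (chromosome type, marker allele), each coded 0 or 1.
GAMETES = list(product((0, 1), repeat=2))


def next_generation(freqs, s, r):
    """Random mating, selection against chromosomal heterozygotes, recombination.

    s: fitness reduction of chromosomal heterozygotes (hybrid breakdown).
    r: recombination rate between the centromere and the marker.
    """
    new = {g: 0.0 for g in GAMETES}
    for g1, g2 in product(GAMETES, repeat=2):
        (c1, a1), (c2, a2) = g1, g2
        fitness = 1.0 - s if c1 != c2 else 1.0
        weight = freqs[g1] * freqs[g2] * fitness
        # Non-recombinant gametes.
        new[(c1, a1)] += weight * (1 - r) / 2
        new[(c2, a2)] += weight * (1 - r) / 2
        # Recombinant gametes.
        new[(c1, a2)] += weight * r / 2
        new[(c2, a1)] += weight * r / 2
    total = sum(new.values())  # equals mean fitness
    return {g: v / total for g, v in new.items()}


def marker_difference(s, r, m=0.02, generations=200):
    """Between-deme difference in marker allele frequency after secondary contact."""
    # Each deme starts fixed for its own chromosome type and marker allele,
    # mimicking differentiation acquired in allopatry.
    deme1 = {g: 0.0 for g in GAMETES}
    deme2 = {g: 0.0 for g in GAMETES}
    deme1[(1, 1)] = 1.0
    deme2[(0, 0)] = 1.0
    for _ in range(generations):
        mixed1 = {g: (1 - m) * deme1[g] + m * deme2[g] for g in GAMETES}
        mixed2 = {g: (1 - m) * deme2[g] + m * deme1[g] for g in GAMETES}
        deme1 = next_generation(mixed1, s, r)
        deme2 = next_generation(mixed2, s, r)
    p1 = deme1[(0, 1)] + deme1[(1, 1)]
    p2 = deme2[(0, 1)] + deme2[(1, 1)]
    return p1 - p2


if __name__ == "__main__":
    for r in (0.001, 0.05, 0.5):
        print(f"r = {r}: marker difference = {marker_difference(0.25, r):.4f}")
```

Under these assumptions the retained difference should fall as `r` increases, that is, as the marker lies further from the centromere, and with `s = 0` the difference simply decays at the rate set by migration, whatever the value of `r`.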
Although recombination suppression could also maintain genetic differentiation in the absence of hybrid breakdown for one of the analyses (Fig. 4A), it had little effect unless values of recombination suppression were greater than 0.5, and the same result was not obtained when the two races were sampled as populations without respect to karyotype (Fig. 5A). In addition, although there is an expectation of reduced variation in regions of reduced recombination (Nachman 2002), we found no relationship between expected heterozygosity and distance from the centromere for either chromosome 10 or 12. Neither did we find a change in the slope of the relationship between FST and distance from the centromere, which might be expected if recombination was suppressed over a discrete region. Given that wild mice heterozygous for Robertsonian fusions show consistent, distinct, but rather short regions of nonhomologous pairing (Wallace et al. 2002), the region of recombination suppression that we simulated was also short and we allowed very high levels of suppression among the values examined.
Although hybrid breakdown alone could explain the observed results in Valtellina, it is notable from the simulations that unfitness and recombination suppression act synergistically to enhance differentiation in the centromeric region in heterozygotes (Figs. 4A, 5A). It could therefore be that both processes are contributing to the differentiation observed.
Navarro and Barton (2003) also modeled differentiation through recombination suppression associated with chromosomal rearrangements (in their case inversions). Their model showed that recombination suppression caused by chromosomal rearrangements favored differentiation and the buildup of genetic incompatibilities between races. Although their model was designed to remove the requirement of hybrid underdominance from chromosomal speciation, they also point out that “some sort of selection must be maintaining different frequencies of the arrangement in different locations.” In their model, each chromosomal rearrangement (or set of selected loci) was favored in one population and selected against in the other. Although hybrid breakdown and environmental selection need not be mutually exclusive (Polyakov et al. 2011), we know that chromosomal rearrangements cause hybrid breakdown, so in our case it is not necessary to invoke additional environmental selection as the starting basis for the separation of chromosomal rearrangements.
In visualizing the decay in differentiation of loci along chromosomes 10 and 12 from the centromere toward the distal telomere, there is not an entirely smooth progression in FST values (Fig. 3). Instead there are regions where there are loci with low, nonsignificant FST values in between loci with high significant FST values. For chromosome 10, these loci are D10mit281, D10mit51, D10mit282, D10mit2, and D10mit86, located between 5.5 and 12.0 cM from the centromere. For chromosome 12, these loci are D12mit85, D12mit171, and D12mit146, located between 9.8 and 10.9 cM from the centromere.
These could be examples of shared ancestral polymorphism in otherwise well-differentiated taxa, leading to low FST values. For example, in the Carneiro et al. (2010) study of differentiation between rabbit subspecies there is one locus in the proximal long arm of the X chromosome that is a clear outlier and may be best explained this way. However, for chromosomes 10 and 12 in the Valtellina hybrid zone there are groups of neighboring loci all with low FST values in a region otherwise dominated by high FST values.
One possible explanation is that the low FST values represent a selective sweep. In other words, there is an allele at one locus in the differentiated region that has a selective advantage in both the 8.12 and 10.12 races and that allele spreads to fixation in both races, with alleles at linked loci also spreading through the two races by genetic hitchhiking (Maynard Smith and Haigh 1974). We simulated this possibility and showed that the sweep could generate the type of results observed under conditions where the high FST values are a consequence of moderate hybrid breakdown (Figs. 4B, 5B).
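The effect of such a sweep on FST can be seen from the definition of the statistic. The sketch below uses a simple heterozygosity-based form, FST = (HT - HS)/HT, for two equally weighted populations; this is an assumption made for illustration and is not necessarily the estimator applied to the microsatellite data. The function names are hypothetical.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity, 1 - sum(p_i^2), from allele frequencies at one locus."""
    return 1.0 - sum(p * p for p in freqs)


def fst(pop1, pop2):
    """Heterozygosity-based FST for two equally weighted populations.

    pop1, pop2: allele frequencies at one locus, alleles in the same order.
    """
    hs = (expected_heterozygosity(pop1) + expected_heterozygosity(pop2)) / 2
    pooled = [(a + b) / 2 for a, b in zip(pop1, pop2)]
    ht = expected_heterozygosity(pooled)
    if ht == 0:
        # Both populations fixed for the same allele: no variation to partition.
        return 0.0
    return (ht - hs) / ht


# Races fixed for different alleles (differentiation retained from allopatry).
print(fst([1.0, 0.0], [0.0, 1.0]))
# Same locus after a favored allele has swept through both races.
print(fst([1.0, 0.0], [1.0, 0.0]))
```

Loci that hitchhike with the favored allele approach the second case, so a cluster of linked loci with low FST can sit inside a region where other loci remain highly differentiated.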
Another possibility is that the high FST values near the centromeres of chromosomes 10 and 12 reflect hybrid breakdown and/or recombination suppression and that high FST values elsewhere in the chromosome are the consequence of epistasis between centromeric loci and loci elsewhere in the chromosome. We simulated epistasis between the centromere itself and an interstitial locus. Under a wide variety of conditions of hybrid breakdown and recombination suppression, we obtained a peak of high FST values near the centromere, then a trough, and then high values again at an interstitial location in the chromosome.
In considering our simulation results overall, we should emphasize that we only considered a strict stepwise mutation model for microsatellites. More complex models of mutation may lead to different outcomes. However, such models would likely lead to more variance between loci, and hence make selecting between competing scenarios more difficult. Thus, we feel justified in our conservative model selection. With this proviso, we can conclude that the empirical data that we obtained for Valtellina mice, with high centromeric FST values on chromosomes 10 and 12 and their decay along the chromosomes, are best explained by hybrid breakdown, but with the possibility that other processes (recombination suppression, selective sweeps, epistasis) are involved. The simplest explanation is that hybrid breakdown is driving the results and there are clear data showing that hybrids between the 8.12 and 10.12 races do have reduced fitness (Hauffe and Searle 1998). From data for another hybrid zone between chromosome races of house mice, Franchini et al. (2010) also inferred a primary importance of hybrid breakdown in promoting reduced gene flow in centromeric regions. Their results did not fit with recombination suppression, applying knowledge of chiasmata distribution to data on genic divergence involving different populations in the vicinity of the hybrid zone examined.
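For readers unfamiliar with the strict stepwise mutation model mentioned at the start of this paragraph, the following minimal sketch shows the rule it implies for a single microsatellite allele: with some probability per generation, the repeat count changes by exactly one unit, up or down with equal chance. The mutation rate, the starting repeat count, the floor of one repeat, and the function name are all assumptions of this sketch, not values or code from our simulations.

```python
import random


def stepwise_mutate(repeats, mu, rng):
    """Return the repeat count after one generation under a strict stepwise model.

    repeats: current number of repeat units.
    mu: per-generation mutation probability (hypothetical value supplied by caller).
    rng: a random.Random instance, so that runs are reproducible.
    """
    if rng.random() < mu:
        step = rng.choice((-1, 1))
        # Floor at one repeat: an assumption of this sketch.
        return max(1, repeats + step)
    return repeats


rng = random.Random(1)
allele = 20  # hypothetical starting repeat count
for _ in range(1000):
    allele = stepwise_mutate(allele, mu=0.001, rng=rng)
print(allele)
```

More complex models would allow multi-step changes or size-dependent rates, which is the source of the additional between-locus variance referred to above.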
Shrews provide an interesting comparison to our results with mice. Here, two chromosomal forms (Sorex araneus and S. antinorii) make contact and hybridize sporadically. Microsatellite markers on rearranged chromosomes diverge more between the species than markers on colinear chromosomes (Basset et al. 2006; Yannic et al. 2009). The species differ by whole-arm rearrangements and F1 hybrids form a chain-of-eleven configuration. Here there may be more extreme hybrid breakdown maintaining the pattern of genic divergence than in the Valtellina case. The differentiation currently seen in the S. araneus–S. antinorii zone, as with the 8.12 and 10.12 races in Valtellina, is likely to have been acquired largely in allopatry, with the hybrid breakdown retaining this preexisting differentiation in certain genomic regions. This is of course also the situation that we have modeled in our simulation studies. Recombination suppression may also be involved in maintaining the differentiation, but there is no certainty of that, as there appears to be very little nonhomologous pairing in shrew chromosomal heterozygotes (Wallace and Searle 1990); although, as in house mice, long meiotic chain configurations have not been examined in sufficient detail.
Interestingly, there are also hybrid zones within S. araneus where the chromosomal differentiation is as great or nearly as great as for S. araneus and S. antinorii, and yet, in those zones, there are no differences in genetic differentiation between the rearranged and colinear chromosomes (Horn et al. 2012). This may reflect a lack of power in the analysis: only 16 microsatellite loci were used in the study, of which up to six were mapped to rearranged chromosomes, and their positions on those chromosomes are unknown and could by chance be relatively distal.
If we are right that hybrid breakdown is the primary factor maintaining genetic differentiation between chromosome races that differ by whole-arm rearrangements, it may promote speciation in two ways: first, by preserving genomic regions of genic divergence (as demonstrated here), which might accumulate genic incompatibilities that lead to reproductive isolation; and second, by creating a selection pressure for assortative mating (reinforcement), which again could lead to a complete cessation of reproduction between the races. The two processes may act together, in that alleles important for assortative mating may accumulate at loci in the proximal regions of the differentiated chromosomes; such a process has been suggested to explain an apparent reinforcement event in Valtellina (Hauffe and Searle 1992; Piálek et al. 2001). It should be noted that reinforcement has been demonstrated in the hybrid zone between the M. m. musculus and M. m. domesticus subspecies (Smadja and Ganem 2005; Ganem et al. 2008; Bimová et al. 2011).
Although there may be no necessity to invoke recombination suppression in reducing gene flow between chromosome races of house mice that differ by whole-arm rearrangements, an impact of recombination suppression closely associated with centromeres and inversions has been demonstrated in other systems, with particularly clear data for Anopheles mosquitoes (Neafsey et al. 2010). The importance of selective sweeps in determining patterns of genetic differentiation is also emphasized in the Anopheles system (Neafsey et al. 2010).
There is great interest in the possibility of divergence with gene flow, and a realization that chromosomal rearrangements may be involved in this process, but much of the discussion to date has related to chromosomal inversions (Rieseberg 2001). Here, we have used empirical data and modeling to show, at a finer scale than in previous studies of this hybrid zone, how whole-arm chromosomal rearrangements may also be involved.
We received support from Programme Alβan of the European Union (E03D08916AR to MDG), the Natural Environment Research Council (to JBS, HCH, and MDG), the Seventh European Community Framework Programme (Marie Curie FP7-PEOPLE-2009-IOF to TAW), the Thai government (to TP), and the Fondazione E. Mach (to HCH). We thank three anonymous reviewers for valuable comments.