• Leslie M. Turner,

    1. Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, August Thienemannstrasse 2, 24306 Ploen, Germany
    • Current address: Laboratory of Genetics, 2455 Genetics/Biotechnology, University of Wisconsin, Madison, WI 53706

  • Denise J. Schwahn,

    1. Research Animal Resources Center, University of Wisconsin, Madison, Wisconsin 53726
  • Bettina Harr

    1. Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, August Thienemannstrasse 2, 24306 Ploen, Germany


Barriers to gene flow between naturally hybridizing taxa reveal the initial stages of speciation. Reduced hybrid fertility is a common feature of reproductive barriers separating recently diverged species. In house mice (Mus musculus), hybrid male sterility has been studied extensively using experimental crosses between subspecies. Here, we present the first detailed picture of hybrid male fertility in the European M. m. domesticus–M. m. musculus hybrid zone. Complete sterility appears rare or absent in natural hybrids, but a large proportion of males (∼30%) have sperm count or relative testis weight below the range in pure subspecies, and likely suffer reduced fertility. Comparison of a suite of traits related to fertility among subfertile males indicates that reduced hybrid fertility in the contact zone is highly variable among individuals and ancestry groups in the type, number, and severity of spermatogenesis defects present. Taken together, these results suggest multiple underlying genetic incompatibilities are segregating in the hybrid zone, which likely contribute to reproductive isolation between subspecies.

Studies of natural hybrid zones have provided many important insights into speciation processes (e.g., Barton and Hewitt 1985, 1989; Harrison 1990; Rieseberg and Buerkle 2002; Buerkle and Lexer 2008; Payseur 2010). In contrast to studies of fully reproductively isolated species, where it is difficult to distinguish incompatibilities that initially contributed to speciation from incompatibilities that arose after reproductive barriers were complete, it is possible to directly identify mechanisms preventing gene flow between taxa that hybridize (Coyne and Orr 2004). Investigating reproductive isolation in an ecological context reveals additional barriers that may be operating in nature (Mallet 2006; Noor and Feder 2006). Patterns of spatial variation in phenotype and genotype frequencies across a hybrid zone can be used to estimate the strength of selection against hybrids and identify regions of the genome likely harboring reproductive isolation genes (Barton and Hewitt 1985; Szymura and Barton 1986).

House mice are an important model for understanding speciation because there are naturally occurring hybrid zones between subspecies and a wealth of available genetic tools and resources. The western and eastern European house mouse subspecies (M. m. domesticus and M. m. musculus, respectively) interbreed in a narrow contact zone running from Bulgaria to Denmark (Fig. 1A; Sage et al. 1986b; Boursot et al. 1993). Detailed genetic analyses of several geographically distant hybrid zone transects have identified genomic regions that show reduced introgression across the zone. These regions may contain genes that contribute to reproductive isolation (Sage et al. 1986b; Vanlerberghe et al. 1986, 1988a,b; Nance et al. 1990; Tucker et al. 1992; Payseur et al. 2004; Bozikova et al. 2005; Payseur and Nachman 2005; Macholan et al. 2007; Teeter et al. 2008, 2010). Loci on the X chromosome show the most striking patterns of reduced gene flow, consistent with the “large X-effect,” a disproportionately large role of factors on the X chromosome in causing hybrid defects, documented in numerous animal taxa (Coyne and Orr 1989, 2004).

Figure 1.

Sampling in the hybrid zone. (A) The European house mouse hybrid zone; the black square indicates the sampling area. (B) Locations of sampling localities. Circles indicate sites where mice bred to produce phenotyped males were captured; open triangles indicate other sampling sites. (C) Hybrid index and longitude of trapping site are plotted for each wild-caught mouse; parents of phenotyped males are indicated with circles and other individuals with open triangles. Names of sites where breeding mice were collected are indicated above the x-axis.

It is not yet clear which mechanism(s) of reproductive isolation maintains the hybrid zone. Assortative mating preference between M. m. domesticus and M. m. musculus (hereafter referred to as domesticus and musculus) varies among individuals and populations, but is often weak and asymmetric (Laukaitis et al. 1997; Smadja and Ganem 2002; Smadja et al. 2004; Bimova et al. 2005; Ganem et al. 2008). Experimental crosses between subspecies have demonstrated reduced fertility of both male and female hybrids (reviewed in Britton-Davidian et al. 2005; Good et al. 2008b). Mice from the hybrid zone have higher intestinal parasite loads than mice from pure populations, suggesting hybrids may suffer reduced immune function and viability (Sage et al. 1986a; Moulia et al. 1991, 1993; Derothe et al. 2001, L. M. Turner unpubl. data). Each of these phenotypes may contribute to reproductive isolation, but their relative strength and importance in maintaining the subspecies barrier are unknown.

Hybrid male sterility is the most well studied of these potential isolating mechanisms; it has been documented in offspring from crosses between musculus and domesticus using individuals caught in the wild, wild-derived inbred strains, standard laboratory strains (primarily domesticus in origin, Yang et al. 2007), or some combination of these (Forejt and Ivanyi 1974; Vanlerberghe et al. 1986; Chubb and Nolan 1987; Yoshiki et al. 1993; Alibert et al. 1997; Britton-Davidian et al. 2005; Vyskocilova et al. 2005; Good et al. 2008b). The type and severity of fertility defects observed depends on the geographic origin of the strains and also varies among individuals within regions. This variability suggests that multiple genetic incompatibilities contribute to hybrid male sterility. Studies of congenic or consomic strains produced by introducing a chromosomal region from one subspecies into an inbred strain of another genetic background have identified several candidate sterility loci (Storchova et al. 2004; Trachtulec et al. 2005; Good et al. 2008a; Gregorova et al. 2008). Recently, one of these loci was identified as the gene Prdm9, a histone H3 methyltransferase that activates transcription of genes essential for meiosis (Mihola et al. 2009). Prdm9, the first mammalian speciation gene identified, causes spermatogenic arrest in hybrids between the musculus strain PWD and some classical laboratory mouse strains (primarily domesticus in origin, Yang et al. 2007).

Most evidence for hybrid male sterility in house mice is based on F1 hybrids or congenic/consomic individuals derived from experimental crosses between pure subspecies in the laboratory. Individuals in natural hybrid zones, however, show a range of admixture between subspecies and intercrossing over many generations has produced a diversity of mosaic genotypes (Teeter et al. 2010). F1 hybrids are very rare or absent because the hybrid zone is wide relative to dispersal distance, thus pure musculus and domesticus rarely come into contact. The genetic incompatibilities causing hybrid defects are thought to arise from negative epistasis, a type of genetic interaction where the deleterious effect of a mutation depends on (or is modified by) a locus elsewhere in the genome (Dobzhansky 1937; Muller 1942; Coyne and Orr 2004). To understand the role hybrid male sterility plays in maintaining the species barrier, it is important to characterize reduced fertility in the diverse genetic backgrounds present in natural hybrid zones.

We investigate hybrid male sterility in natural populations from the European house mouse contact zone. We measured several reproductive phenotypes to determine the number, type, severity, and prevalence of fertility defects present. To account for environmental and age-related variation, we phenotyped first generation offspring of wild-caught hybrids, bred using a scheme designed to mimic matings that occur in the wild. We find that reduced fertility is common in natural hybrids. Variation in the form and frequency of hybrid sterility suggests multiple incompatibilities are segregating in the hybrid zone.

Materials and Methods


In 2008, we trapped 234 mice in farm buildings and stables in Bavaria, Germany (Fig. 1 and Table 1), in the same region of the hybrid zone sampled by R. D. Sage in 1984, 1985, and 1992 (Sage et al. 1986b; Payseur et al. 2004). Ninety-four mice (52 females, 42 males) from sites chosen to sample the full range of admixture were transported to the Max Planck Institute for Evolutionary Biology in Plön for breeding. To mimic matings that occur in the wild as closely as possible, we set up breeding pairs (N = 90) between a male and female from the same or nearby sites. Males and females were mated multiple times in different combinations to maximize genetic variation in hybrid offspring. For most successful pairs (68 pairs that produced one or more litters), we did not remove the male before the female gave birth. The exceptions (6/68 pairs) occurred when males were needed to pair with a different female because no other male from the same site was available to breed. The 22 unsuccessful pairs were maintained for 17–82 days (mean 49.5), with some pairs maintained for a relatively shorter time because one individual of the pair was needed as a mate elsewhere. We weaned litters at 28 days and housed males individually to prevent any effect of social dominance interactions between males on fertility.

Table 1.  Sampling localities.
| Site | Latitude (°N) | Longitude (°E) | Town | N¹ | Mean hybrid index² (range) | Admixture category³ |
|---|---|---|---|---|---|---|
| KI | 48.359 | 11.502 | Milbertshofen | 3 | 0.05 (0.03–0.07) | D |
| DR | 48.393 | 11.538 | Dürnbach | 2 | 0.05 (0.05–0.05) | D |
| SO | 48.400 | 11.539 | Pelka | 15 (13) | 0.05 (0–0.23) | D |
| MO | 48.389 | 11.569 | Hohenbercha | 6 | 0.05 (0.02–0.07) | D |
| SC | 48.373 | 11.570 | Appercha | 1 | 0.03 | D |
| RO | 48.315 | 11.587 | Deutenhausen | 5 (3) | 0.29 (0.28–0.30) | HD |
| KL | 48.382 | 11.614 | Giesenbach | 7 | 0.16 (0.05–0.73) | D |
| KR | 48.402 | 11.617 | Kranzberg | 4 | 0.12 (0.09–0.15) | D |
| OB | 48.363 | 11.647 | Giggenhausen | 4 | 0.10 (0–0.17) | D |
| ST | 48.375 | 11.658 | Pallhausen | 14 | 0.10 (0.03–0.17) | D |
| FS | 48.335 | 11.665 | Neufahrn bei Freising | 27 | 0.42 (0.32–0.55) | HD |
| NE | 48.314 | 11.677 | Neufahrn bei Freising | 1 | 0.28 | HD |
| HO | 48.399 | 11.693 | Hohenbachern | 24 (23) | 0.31 (0.23–0.43) | HD |
| AP | 48.364 | 11.703 | Pulling | 1 | 0.52 | HM |
| KA | 48.458 | 11.708 | Palzing | 13 | 0.74 (0.66–0.83) | HM |
| HA | 48.440 | 11.721 | Haindlfing | 20 | 0.75 (0.62–0.83) | HM |
| MS | 48.451 | 11.731 | Moos | 7 | 0.77 (0.68–0.83) | HM |
| PZ | 48.276 | 11.747 | Zengermoos | 1 | 0.77 | HM |
| TU | 48.429 | 11.748 | Tüntenhausen | 21 (20) | 0.73 (0.64–0.82) | HM |
| GO | 48.315 | 11.761 | Goldach | 11 | 0.68 (0.08–0.80) | HM |
| RF | 48.235 | 11.767 | Finsingermoos | 7 | 0.84 (0.78–0.90) | M |
| AL | 48.340 | 11.922 | Altham | 3 | 0.82 (0.79–0.87) | M |
| LK | 48.362 | 11.951 | Lohkirchen | 1 | 0.87 | M |
| GL | 48.321 | 11.965 | Emling | 31 | 0.87 (0.82–0.95) | M |
| RN | 48.275 | 11.975 | Windshud | 1 | 0.91 | M |
| RE | 48.259 | 11.994 | Neufahrn bei Erding | 2 | 0.86 (0.83–0.88) | M |
| SR | 48.306 | 11.996 | Bergarn | 2 | 0.93 (0.93–0.93) | M |

¹ Number of individuals captured. Number genotyped is indicated in parentheses if different.
² Hybrid index = proportion of alleles with M. musculus ancestry at 37 SNP loci.
³ Admixture categories: D = “domesticus,” mean hybrid index <0.2; HD = “hybrid domesticus,” mean hybrid index 0.2–0.5; HM = “hybrid musculus,” mean hybrid index 0.5–0.8; M = “musculus,” mean hybrid index >0.8.


We extracted DNA from liver, spleen, or ear samples using salt extraction or DNeasy kits (Qiagen, Hilden, Germany). We genotyped wild-caught individuals at 48 single nucleotide polymorphism loci (SNPs; Table S1A). Forty-six SNPs (1–3 per chromosome) showed fixed differences between the seven wild-derived inbred domesticus and eight wild-derived inbred musculus strains in the Wellcome Trust-CTC Mouse Strain Genotype Set (Harr 2006) and strong differentiation between subspecies in a geographically diverse sample of 70 wild-caught mice (B. Harr, unpubl. data). Two additional SNPs, located in one region, had fixed differences between subspecies identified in our laboratory for another project. Multiplex genotyping was performed by the Cologne Center for Genomics (Cologne) using the SNPstream genotyping system (Beckman Coulter, Brea, CA). We genotyped 13 pure domesticus individuals (Massif Central, France and Cologne/Bonn, Germany) and 16 musculus individuals (Námest nad Oslavou, Czech Republic and Almaty, Kazakhstan) to confirm that alleles were diagnostic. For logistical reasons, genotyping was performed in two batches. For one batch (nine loci) or both batches (10 loci), genotyping failed or genotypes were excluded from analysis because of high failure rates (>10% within a batch), a missing or unusually low genotype class, or because the loci were not fixed (diagnostic allele >0.8) in pure domesticus or musculus individuals (Table S1). Two SNPs located near each other were tightly linked; the SNP with the lower call rate was excluded. Thirty-seven loci were retained in the final dataset for at least one batch (30 loci for 208 mice, 35 loci for 24 mice), with an average call rate of 82.0%.
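The two numeric exclusion criteria above (failure rate and diagnostic allele frequency) can be sketched as a simple per-locus, per-batch filter. This is only an illustration under stated assumptions: the function name and argument names are hypothetical, and the third criterion (a missing or unusually low genotype class) is not modeled because the text gives no threshold for it.

```python
def passes_locus_qc(n_failed, n_total,
                    diag_freq_domesticus, diag_freq_musculus,
                    max_failure_rate=0.10, min_diag_freq=0.8):
    """Return True if a locus passes the two numeric criteria in one batch.

    n_failed / n_total        : failed genotype calls and attempted calls
    diag_freq_domesticus      : frequency of the domesticus-diagnostic allele
                                in the pure domesticus reference sample
    diag_freq_musculus        : frequency of the musculus-diagnostic allele
                                in the pure musculus reference sample
    All names are hypothetical; thresholds follow the text (>10% failure
    excludes a locus; diagnostic allele must exceed 0.8).
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    failure_rate = n_failed / n_total
    if failure_rate > max_failure_rate:
        return False
    return (diag_freq_domesticus > min_diag_freq
            and diag_freq_musculus > min_diag_freq)
```

A locus would then be retained for a batch only if this check passes in that batch.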

Following Teeter et al. (2010), we assigned each individual a hybrid index, which is the proportion of SNP alleles from musculus. We used this index to classify individuals into four “admixture categories” for statistical comparisons (see section Statistical Analysis): “D”—domesticus (hybrid index <0.2), “HD”—predominantly domesticus hybrids (hybrid index 0.2–0.5), “HM”—predominantly musculus hybrids (hybrid index 0.5–0.8), and “M”—musculus (hybrid index >0.8). For some analyses, we lumped HD and HM into a single hybrid (H) category. We chose the limits for each category based on the observed distribution of hybrid indices among sites (Fig. 1C). It was not possible to define nonoverlapping categories using individual hybrid index values such that all individuals from each site were assigned the same category. Therefore, we assigned admixture categories to each collection site based on the mean hybrid index of wild-caught individuals from that site. Each breeding individual was assigned the admixture category of its site, such that mating pairs composed of individuals from the same site were always assigned to the same category. As a result, there is some overlap in the ranges of individual hybrid indices for breeding individuals between the M and HM categories (Table 2).
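The hybrid index and the site-based category assignment described above can be sketched as follows. This is a minimal illustration, not the code used in the study: genotypes are assumed to be coded as the number of musculus alleles per SNP (0, 1, or 2, with None for a failed call), hemizygous X-linked loci in males are not treated specially, and the handling of values falling exactly on 0.2, 0.5, or 0.8 is an assumption because the text does not specify it.

```python
from statistics import mean

def hybrid_index(genotypes):
    """Proportion of musculus alleles among called genotypes.

    genotypes: iterable of 0, 1, 2 (musculus allele counts) or None (no call).
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes")
    return sum(called) / (2 * len(called))

def admixture_category(site_mean_index):
    """Map a site's mean hybrid index onto D, HD, HM, or M.

    Boundary values are assigned to the hybrid categories (an assumption).
    """
    if site_mean_index < 0.2:
        return "D"
    if site_mean_index <= 0.5:
        return "HD"
    if site_mean_index <= 0.8:
        return "HM"
    return "M"

def categorize_site(genotypes_by_individual):
    """Return (mean hybrid index, category) for all mice caught at one site."""
    site_mean = mean(hybrid_index(g) for g in genotypes_by_individual)
    return site_mean, admixture_category(site_mean)
```

Because the category is assigned from the site mean, an individual whose own index falls in a neighboring range still receives the category of its site, which is why individual ranges can overlap between categories.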

Table 2.  Breeding success of wild-caught mice.
| Admixture category¹ | Mean hybrid index (range)² | Pairs | Fertile pairs (%) | Litters (survived) | Males³ (phenotyped) | Females³ (%) | Total³ | Mean (SD) litter size³ |
|---|---|---|---|---|---|---|---|---|
| D | 0.06 (0–0.14) | 13 | 11 (85) | 12 (11) | 28 (17) | 31 (53) | 59 | 5.4 (2.0) |
| HD | 0.38 (0.24–0.50) | 31 | 20 (65) | 28 (25) | 47 (38) | 58 (55) | 105 | 4.2* (1.5) |
| HM | 0.74 (0.63–0.83) | 33 | 28 (85) | 41 (35) | 116 (95) | 88 (43) | 206 | 5.9 (2.2) |
| M | 0.87 (0.82–0.95) | 13 | 9 (69) | 18 (15) | 34 (16) | 42 (55) | 76 | 5.1 (1.9) |

¹ Admixture category defined as in Table 1.
² Weighted mean of hybrid indices (defined as in Table 1) for mated individuals and range of individual hybrid indices in category.
³ Number of offspring and litter size are reported for litters that survived to weaning.
* Significantly lower than HM (P = 0.002, Mann–Whitney U Test, alpha = 0.008 with Bonferroni correction for six pairwise comparisons).

Male offspring were assigned to the same admixture category as their parents. We validated category assignments using high-density genotype data for 149 laboratory-bred males generated for future genetic mapping studies. Atlas Biolabs (Berlin, Germany) performed genotyping using Mouse Diversity Genotyping Arrays (Affymetrix, Santa Clara, CA). We estimated overall genome composition based on 270 autosomal SNPs (at ∼5 cM intervals, Table S1B) fixed between domesticus (Massif Central, France and Cologne/Bonn, Germany) and musculus (Almaty, Kazakhstan; F. Staubach, unpubl. data). Hybrid indices for offspring were highly correlated with the average parental hybrid index (r = 0.973, N = 149), indicating that the SNPstream and Affymetrix Mouse Diversity Array genotyping methods were consistent. Twenty-three males with pure domesticus or musculus parents were genotyped using SNPstream, as described above for wild-caught individuals. Hybrid index ranges for male offspring assigned to each admixture category do not overlap (D: 0.01–0.12; HD: 0.16–0.47; HM: 0.67–0.77; M: 0.78–0.96).

We genotyped a subset of laboratory-bred males for the t haplotype, a variant of the t complex on chromosome 17 segregating in wild house mouse populations, characterized by four inversions (Silver 1985; Ardlie and Silver 1996). The t haplotype harbors mutations that cause transmission ratio distortion and sterility in males (reduced/abnormal sperm). To distinguish between reduced fertility caused by the t haplotype versus hybrid incompatibilities, we compared the frequency of the t haplotype between a set of low-fertility males (40 males with relative testis weight and/or sperm count outside the pure subspecies range, see Table 4) and a set of males with fertility parameters within the pure subspecies range (4 M, 4 D, 8 HD, 9 HM). We identified individuals carrying the t haplotype using a PCR fragment-length assay for the Tcp1 locus (Planchart et al. 2000). Three individuals (1 M, 2 low-fertility males) failed to amplify.

Table 4.  Hybrid males with fertility traits outside the pure subspecies range.
| Trait | Minimum value in domesticus/musculus | HD¹ below No. (%) | HM¹ below No. (%) | H² below No. (%) | Maximum value in domesticus/musculus | HD¹ above No. (%) | HM¹ above No. (%) | H² above No. (%) |
|---|---|---|---|---|---|---|---|---|
| Relative testis weight (mg testis/g body weight) | 6.14 | 9 (24) | 4 (4) | 13 (10) | 13.14 | 0 (0) | 2 (2) | 2 (2) |
| Sperm count (million) | 1.99 | 14 (37) | 22 (23) | 36 (27) | 9.92 | 0 (0) | 1 (1) | 1 (1) |
| Relative testis weight and sperm count | | 7 (18) | 2 (2) | 9 (7) | | 0 (0) | 0 (0) | 0 (0) |
| Relative testis weight or sperm count | | 16 (42) | 24 (25) | 40 (30) | | 0 (0) | 3 (3) | 3 (2) |
| Head shape ratio³ (length/width) | 1.75 | 1 (3) | 1 (1) | 2 (2) | 2.27 | 0 (0) | 0 (0) | 0 (0) |
| VCL (μm/sec)³ | 195.0 | 0 (0) | 5 (6) | 5 (4) | 302.3 | 0 (0) | 3 (3) | 3 (2) |
| VAP (μm/sec)³ | 118.0 | 3 (9) | 9 (10) | 12 (10) | 160.7 | 2 (6) | 7 (8) | 9 (7) |
| VSL (μm/sec)³ | 62.3 | 0 (0) | 15 (17) | 15 (12) | 112.0 | 0 (0) | 0 (0) | 0 (0) |
| ALH (μm/sec)³ | 10.7 | 1 (3) | 6 (7) | 7 (6) | 19.5 | 1 (3) | 2 (2) | 3 (2) |
| BCF (Hz)³ | 17.8 | 3 (9) | 0 (0) | 3 (3) | 32.8 | 0 (0) | 1 (1) | 1 (1) |

¹ Admixture category of parents (as defined in Table 1).
² H = hybrid – combined HD and HM.
³ Sperm morphology and motility measures defined in Figure 2.


We assigned males identification numbers at weaning so that phenotyping was performed blind with respect to genotype. We sacrificed males by CO2 asphyxiation at age 9–12 weeks. The left testis of each male was fixed in Bouin's solution (Sigma-Aldrich, Munich) for histological analysis. The left epididymis was sliced with 18-gauge needles in 1.0 mL mT-B25 buffer (Shi and Roldan 1995) and incubated for 90 min at 37°C and 5% CO2 to allow for capacitation, the process that prepares sperm to fertilize eggs (a suite of changes occurring in vivo in the female reproductive tract). An aliquot of the sperm suspension was heat-killed at 60°C for 10 min and counted in duplicate using a Neubauer hemacytometer. When possible (>500,000 sperm/mL), we counted a minimum of 200 sperm per chamber. The maximum acceptable difference between two counts was determined using Appendix I of the NAFA-ESHRE Manual on Basic Semen Analysis (2002); counts are acceptable if $|N_1 - N_2| / \sqrt{N_1 + N_2} < 1.96$, that is, if the two estimates fall within the 95% confidence interval (Poisson distribution) of the total count. If the difference between counts was too high, both were discarded and two new counts were performed.
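The acceptance rule for duplicate counts can be written as a short check. The sketch below assumes the criterion is applied to the absolute difference between the two raw counts; the function name is hypothetical.

```python
import math

def counts_agree(n1, n2, z=1.96):
    """Return True if two duplicate sperm counts are acceptably close.

    Implements |n1 - n2| / sqrt(n1 + n2) < z, the Poisson-based 95%
    criterion quoted in the text (z = 1.96).
    """
    total = n1 + n2
    if total <= 0:
        raise ValueError("at least one sperm must be counted")
    return abs(n1 - n2) / math.sqrt(total) < z
```

For example, counts of 200 and 210 give a statistic of about 0.49 and are accepted, whereas counts of 200 and 250 give about 2.36 and would be discarded and repeated.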

To assess sperm morphology and motility, we performed computer-assisted sperm analysis (CASA) using a CEROS Sperm Analyzer (Hamilton Thorne, Beverly MA). The sperm suspension was diluted to ∼1 million/mL and ∼25 μl loaded into 100 μm depth slide chambers (Leja, Nieuw-Vennep, The Netherlands). CASA measurements include sperm head elongation (width/length), track velocity (VCL), smoothed path velocity (VAP), straight line velocity (VSL), amplitude of lateral head displacement (ALH), and beat cross frequency (BCF). A diagram illustrating these measures is presented in Figure 2. Sperm head elongation may be relevant to fertility because it will reflect differences in the length of the apical hook of the sperm head. Hooked heads are characteristic of murid sperm, and variation in hook morphology is correlated with degree of sperm competition (Immler et al. 2007). Reduced hook size has been previously associated with hybrid sterility (Good et al. 2008a). We report the inverse of the elongation measurement (sperm head shape ratio = head length/width) to be consistent with other measures, for which lower values are associated with reduced fertility.

Figure 2.

Computer-assisted sperm analysis (CASA) measures of sperm morphology (A) and sperm motility (B).

Following Goossens et al. (2008), we used the following CASA parameter settings: 30 frames recorded/field; recording rate, 60 frames/sec; minimum contrast, 60; minimum cell size, 6 pixels; minimum progressive velocity VAP, 75 μm/s; slow sperm (<25 μm/s) not counted as motile. We collected data for a minimum of four fields, 50 motile sperm and 400 total sperm for each individual. Fifteen individuals had fewer than 50 motile sperm present in the chamber. We included sperm motility and morphology data for four individuals with >25 motile sperm and excluded the rest.

We performed histological analyses of testes to identify the stage at which spermatogenesis was disrupted in males with fertility defects. Fixed testes were embedded in paraffin, sectioned (6 μm) and stained with hematoxylin-eosin following standard protocols. We performed quantitative analysis of sections from five musculus males, five domesticus males, 10 hybrids with relative testis weight and sperm count within the pure range, 10 hybrids with sperm count below the pure range, three hybrids with relative testis weight below the pure range, and one hybrid with both sperm count and relative testis weight below the pure range. We measured the width and length of 10 stage VII (Russell et al. 1990) seminiferous tubules per male. Sections from an additional 10 hybrids (eight with relative testis weight and sperm count below the pure range, one with low sperm count, and one with low relative testis weight) were examined in detail, but either had fewer than 10 stage VII tubules, or abnormalities were too severe for accurate staging.

We estimated the tubular area and perimeter based on the dimensions of an ellipse with the same width and length: $\text{area} = \pi ab$ and $\text{perimeter} \approx \pi(a + b)\left(1 + \frac{3x^2}{10 + \sqrt{4 - 3x^2}}\right)$ (Ramanujan 1914), where a = width/2, b = length/2, and x = (a – b)/(a + b). We counted pachytene spermatocytes and round spermatids in five stage VII tubules from each male to calculate the spermatid to spermatocyte ratio (SSR), which is expected to be 4:1 as a result of normal meiosis. We calculated the number of spermatocytes per 100 μm tubule perimeter to compare the abundance of meiotic cells between individuals.
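The tubule measurements above reduce to a few arithmetic steps, sketched here. The perimeter uses the standard form of Ramanujan's approximation, with 10 + √(4 − 3x²) in the denominator; function names are hypothetical and all lengths are assumed to be in μm.

```python
import math

def tubule_area_perimeter(width, length):
    """Area and approximate perimeter of an ellipse with the given axes."""
    a, b = width / 2, length / 2
    x = (a - b) / (a + b)
    area = math.pi * a * b
    # Ramanujan's approximation for the perimeter of an ellipse.
    perimeter = math.pi * (a + b) * (
        1 + 3 * x**2 / (10 + math.sqrt(4 - 3 * x**2))
    )
    return area, perimeter

def spermatid_spermatocyte_ratio(round_spermatids, pachytene_spermatocytes):
    """SSR; about 4 is expected after normal meiosis."""
    return round_spermatids / pachytene_spermatocytes

def spermatocytes_per_100um(pachytene_spermatocytes, perimeter_um):
    """Spermatocyte count scaled to 100 um of tubule perimeter."""
    return 100 * pachytene_spermatocytes / perimeter_um
```

For a circular cross-section (width equal to length) x is zero and the perimeter reduces to the circumference 2πa, which gives a quick sanity check on the formula.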


We performed statistical analyses using R (R Development Core Team 2010) and the R package nlme (Pinheiro et al. 2011). To test for biased sex ratios in offspring of wild-caught mating pairs, we simulated 1000 datasets matching our breeding pair data (68 pairs with the observed number of offspring/pair), assigning sex to offspring by drawing from a binomial distribution with probability of success = 0.5. We determined the number of trials with the observed number of male- and female-biased pairs or more extreme values, in either direction (two-tailed test).
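One possible reading of the sex-ratio simulation is sketched below in Python rather than R. It is an assumption-laden illustration, not the authors' script: the test statistic is taken to be the number of male-biased families, "more extreme" is interpreted as at least as far from the simulated mean as the observed value, and the function names, seed, and example inputs are hypothetical.

```python
import random

def count_male_biased(family_sizes, rng, p_male=0.5):
    """Simulate offspring sex for each family; count male-biased families."""
    biased = 0
    for n in family_sizes:
        males = sum(rng.random() < p_male for _ in range(n))
        if males > n - males:
            biased += 1
    return biased

def simulate_bias_p_value(family_sizes, observed_male_biased,
                          n_sims=1000, seed=0):
    """Two-tailed simulation P value for the number of male-biased families."""
    rng = random.Random(seed)
    sims = [count_male_biased(family_sizes, rng) for _ in range(n_sims)]
    expected = sum(sims) / n_sims
    observed_dev = abs(observed_male_biased - expected)
    extreme = sum(abs(s - expected) >= observed_dev for s in sims)
    return extreme / n_sims
```

The same function could be applied to female-biased families by counting the complementary outcome; family sizes would be the observed number of offspring per pair.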

For comparisons of fertility between hybrids and pure subspecies, we treated ancestry as a categorical variable (admixture category, defined above) for simplicity. We expect reduced fertility to be more common on average in hybrids with more intermediate ancestry; however, due to the epistatic and potentially asymmetric nature of hybrid incompatibilities, modeling the relationship between fertility and a continuous measure of admixture is not straightforward. To account for family structure, we fit linear mixed models with dam and sire as crossed random effects.

We accounted for a significant association between testis weight and body weight (r = 0.25, N = 166, P = 0.001) in two ways. First, for ease of interpretation and comparison with previous studies of hybrid male sterility (Britton-Davidian et al. 2005; Good et al. 2008b), we report relative testis weight (combined testes weight/body weight) in addition to raw testis weight. Second, we included body weight as a covariate in the model.

Sperm counts <500,000/mL could not be measured precisely (±10%) and we set counts of less than 250,000 to 250,000 and counts 250,000–500,000 to 500,000 in our analyses. All sperm counts were square-root transformed. Sperm count was significantly correlated with testis weight (rho = 0.47, Table 5). To compare sperm count independent of variation in testis weight, we performed additional statistical comparisons including testis weight and body weight as covariates.
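The handling of imprecise low counts and the square-root transformation can be sketched as follows. Counts are assumed to be expressed as sperm per mL of suspension; the function names are hypothetical.

```python
import math

def censor_sperm_count(count_per_ml):
    """Replace imprecise low counts with the fixed values used in the text."""
    if count_per_ml < 250_000:
        return 250_000
    if count_per_ml < 500_000:
        return 500_000
    return count_per_ml

def transform_sperm_count(count_per_ml):
    """Square-root transform applied after censoring low counts."""
    return math.sqrt(censor_sperm_count(count_per_ml))
```

Setting low counts to fixed values keeps severely affected males in the analysis while avoiding false precision below the measurable range.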

Table 5.  Correlations between fertility traits. Spearman's correlation coefficients (rho, top half) and P values (bottom half) are indicated. Significant values (P < 0.05) are in bold.
| | Testis weight | Testis weight partial¹ | Sperm count | Sperm count partial² | Head shape ratio | VCL³ | VAP³ | VSL³ | ALH³ | BCF³ |
|---|---|---|---|---|---|---|---|---|---|---|
| Testis weight | - | - | 0.47 | - | 0.18 | 0.17 | 0.06 | 0.04 | 0.11 | 0.05 |
| Testis weight partial¹ | - | - | 0.45 | - | 0.22 | 0.13 | 0.03 | 0.02 | 0.08 | 0.09 |
| Sperm count | <0.001 | <0.001 | - | - | 0.20 | 0.30 | 0.24 | 0.22 | 0.08 | 0.02 |
| Sperm count partial² | - | - | - | - | 0.14 | 0.26 | 0.24 | 0.22 | 0.04 | 0.00 |
| Head shape ratio | 0.022 | 0.005 | 0.013 | 0.090 | - | 0.07 | 0.06 | 0.14 | −0.02 | 0.04 |
| VCL | 0.040 | 0.118 | <0.001 | 0.001 | 0.387 | - | 0.81 | 0.44 | 0.61 | 0.08 |
| VAP | 0.455 | 0.744 | 0.002 | 0.002 | 0.426 | <0.001 | - | 0.42 | 0.38 | −0.09 |
| VSL | 0.624 | 0.834 | 0.006 | 0.004 | 0.080 | <0.001 | <0.001 | - | 0.13 | −0.40 |
| ALH | 0.160 | 0.297 | 0.311 | 0.636 | 0.839 | <0.001 | <0.001 | 0.109 | - | −0.07 |
| BCF | 0.535 | 0.287 | 0.762 | 0.961 | 0.595 | 0.324 | 0.249 | <0.001 | 0.386 | - |

¹ Partial correlation with testis weight, controlling for body weight.
² Partial correlation with sperm count, controlling for testis weight.
³ Sperm motility measures (VCL, VAP, VSL, ALH, BCF) defined in Figure 2.
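The partial correlations in Table 5 control one trait for a third variable. The text does not state how they were computed; a common approach, sketched here as an assumption, applies the first-order partial correlation formula to Spearman rank correlations. All function names are hypothetical and only the standard library is used.

```python
import math

def ranks(values):
    """Ranks starting at 1, with tied values given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def partial_spearman(x, y, z):
    """Rank correlation of x and y, controlling for z (first-order formula)."""
    rxy, rxz, ryz = spearman(x, y), spearman(x, z), spearman(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz**2) * (1 - ryz**2))
```

Under this reading, the "Testis weight partial" column would correspond to calls such as `partial_spearman(trait, testis_weight, body_weight)`.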

We compared quantitative histological measures among admixture categories, and further subdivided HD and HM into normal and low fertility (sperm count and/or relative testis weight below pure subspecies range) groups. Groups were compared using nonparametric Kruskal–Wallis tests.



We found a steep transition in allele frequencies across the transect (Fig. 1B); from an average hybrid index of 0.05 at the westernmost site to an average hybrid index of 0.93 at the easternmost site, over a distance of 36.5 km (Table 1). The genotype data confirm that mice chosen for breeding represent the range of genetic admixture present in the hybrid zone (Fig. 1C). Despite intense sampling effort in the center of the hybrid zone, few individuals were exactly intermediate in genetic background; only 10 of 228 individuals had a hybrid index of 0.45–0.55. The maximum percentage of heterozygous alleles in an individual was 60%, thus no F1 individuals were sampled. These results are consistent with previous genetic studies in this region of the hybrid zone (Sage et al. 1986b; Tucker et al. 1992; Payseur et al. 2004; Payseur and Nachman 2005; Teeter et al. 2008, 2010).


Poor reproductive performance of both male and female F1 domesticus–musculus hybrids has been documented previously in backcrosses (Britton-Davidian et al. 2005). We are unable to perform a detailed analysis of hybrid fecundity using the data collected for this study, because our breeding scheme was designed to maximize numbers of hybrid offspring for phenotyping rather than to assess fertility. However, the breeding data do show that hybrid pairs have similar fertility to pure subspecies pairs (Table 2). The combined success rate for hybrid pairs (75%, 48/64) was similar to the success rate for pure subspecies pairs (77%, 20/26). All males and 45 of 47 females (96%) that were paired at least twice produced offspring, indicating that completely sterile individuals are rare or absent.

Some evidence suggests reduced fecundity in those hybrids that are predominantly domesticus. The two unsuccessful females were from the same HD site (HO), which had the lowest success rate overall (45% of pairs reproduced, Table S2). Moreover, HD pairs had smaller litters on average than HM or pure subspecies pairs. Litter size varied significantly among admixture categories (P = 0.02, Kruskal–Wallis test), and in pairwise tests the HD–HM comparison was significant (P = 0.002, Mann–Whitney U test; Table 2).

We tested for sex ratio distortion, which has been reported previously in the European house mouse hybrid zone (Macholan et al. 2008). The combined sex ratio for all offspring was male biased in HM and female biased in HD and pure subspecies (Table 2); however, the number of biased families differed significantly from expectations under an equal sex ratio only for musculus (P = 0.015, 1000 binomial simulations).


We measured fertility parameters in 133 laboratory-bred hybrid males and 33 laboratory-bred pure subspecies (assigned to domesticus or musculus categories) males. Relative testis weight (testis weight divided by body weight) and sperm count show a large range of variation in hybrids (Fig. 3). Both traits (and raw testis weight) are lower on average in hybrids than in pure domesticus and musculus (Table 3). Predominantly domesticus hybrids are more severely affected than predominantly musculus hybrids. First, trait means are lower in HD than HM, and second, both sperm count and relative testis weight were significantly lower in HD, but only sperm count was significantly lower than pure subspecies in HM.

Figure 3.

Histograms of relative testis weight (A) and sperm count (B) in musculus, domesticus and hybrids. Dashed lines indicate median values.

Table 3.  Fertility phenotypes in M. m. domesticus, M. m. musculus, and hybrids.
| Trait | D¹ mean (SD), N = 17/17² | HD¹ mean (SD), N = 38/32² | HM¹ mean (SD), N = 95/90² | M¹ mean (SD), N = 16/16² | P³ | Sig. pairwise comparisons |
|---|---|---|---|---|---|---|
| Testis weight (mg)⁴ | 186 (38) | 153 (55) | 176 (36) | 205 (45) | 0.081 | HD–D |
| Relative testis weight⁵ (mg testis/g body weight) | 9.63 (1.46) | 7.45 (2.56) | 9.50 (1.90) | 10.28 (1.91) | 0.020 | HD–D |
| Sperm count (million) | 5.28 (2.29) | 2.74 (2.18) | 3.72 (2.32) | 4.86 (2.02) | 0.001 | HD–D |
| Sperm count, controlling for testis weight and body weight | | | | | <0.001 | HD–D |
| Sperm morphology and motility | | | | | | |
| Head shape ratio⁶ (length/width) | 1.93 (0.11) | 1.87 (0.08) | 1.94 (0.10) | 2.02 (0.11) | <0.001 | D–M |
| VCL⁶ (μm/sec) | 229.5 (25.7) | 241.8 (26.4) | 240.4 (31.4) | 265.4 (23.5) | 0.015 | D–M |
| VAP⁶ (μm/sec) | 132.3 (9.7) | 137.0 (15.1) | 139.2 (17.3) | 142.4 (13.1) | 0.442 | |
| VSL⁶ (μm/sec) | 75.0 (8.5) | 82.0 (11.5) | 71.3 (9.3) | 80.5 (14.5) | 0.021 | HD–HM |
| ALH⁶ (μm/sec) | 14.1 (2.6) | 14.5 (2.2) | 13.9 (2.2) | 14.7 (1.8) | 0.631 | |
| BCF⁶ (Hz) | 22.8 (2.3) | 21.4 (2.7) | 24.0 (2.8) | 25.5 (2.7) | <0.001 | D–M |

¹ Admixture category of parents (as defined in Table 1).
² Total sample size/sample size for sperm morphology and motility traits measured by CASA.
³ Linear mixed model with admixture category as fixed effect, and dam and sire as crossed random effects.
⁴ Combined testes weight. Body weight was included as a covariate for statistical analysis.
⁵ Combined testes weight divided by body weight.
⁶ Sperm motility measures defined in Figure 2.

In both groups of hybrids, a substantial proportion of individuals had relative testis weight and/or sperm count below the range found in pure subspecies, suggesting they have reduced fertility (Table 4). Trait values in hybrids may fall outside the pure range in the absence of hybrid incompatibilities, due to inheriting complementary additive alleles from each pure subspecies (transgressive segregation) (Rieseberg et al. 1999). However, the large difference in the percentage of hybrids with values outside the pure subspecies range on the lower (30%) vs. higher side (2%) suggests that transgressive segregation is an unlikely explanation for the observed pattern. The t haplotype is another potential cause of sterility in wild house mice unrelated to hybrid incompatibilities. The t haplotype was present at low frequency (2.4%) in the subset of laboratory-bred males genotyped (N = 61), and showed no association with reduced fertility; 1 of 38 low-fertility males was heterozygous for the t haplotype and 2 of 23 males with fertility parameters within the pure subspecies range were heterozygotes.
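The comparison against the pure subspecies range, as summarized in Table 4, amounts to counting hybrids below the pure minimum and above the pure maximum for each trait. A minimal sketch follows; the function name and the example values in the usage note are hypothetical and are not data from this study.

```python
def outside_pure_range(hybrid_values, pure_values):
    """Count hybrid trait values below and above the pure subspecies range."""
    if not hybrid_values or not pure_values:
        raise ValueError("both samples must be non-empty")
    lowest, highest = min(pure_values), max(pure_values)
    below = sum(v < lowest for v in hybrid_values)
    above = sum(v > highest for v in hybrid_values)
    n = len(hybrid_values)
    return {
        "below": below,
        "above": above,
        "pct_below": 100 * below / n,
        "pct_above": 100 * above / n,
    }
```

With made-up relative testis weights, `outside_pure_range([5.0, 7.0, 9.0, 14.0], [6.14, 9.0, 13.14])` reports one value below and one above the pure range. A strong asymmetry between the two tails, as observed here, is what argues against transgressive segregation alone.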

Most individuals in the groups we categorized as pure domesticus and musculus are slightly introgressed; thus using minimum values in these groups is a conservative threshold for defining a fertile range. Comparisons with previous fertility studies in house mice provide additional evidence that the hybrids in the low range of testis weight or sperm count are likely to be sterile or subfertile. For example, relative testis weights and sperm counts below the minimum pure value (<6.14 mg/g relative testis weight, <1.99 million sperm) are within the range reported by Good et al. (2008) in sterile F1 hybrids (maximum value of relative single testis weight ∼4 mg/g and maximum sperm count ∼6 million in 2 epididymides, equivalent to ∼8 mg/g relative combined testes weight and ∼3 million sperm as measured in this study). Searle and Beechey (1974) found that mice with epididymal sperm counts reduced to <10% of normal values had low fertilization rates. If we use the lower pure subspecies mean (4.86 million, musculus) as the “normal” value, 18 hybrids (14%) would likely have impaired fertility on the basis of that criterion.
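The comparison with earlier studies rests on simple unit conversions, sketched here in Python. All numbers are taken from the paragraph above; the variable names are illustrative, and the doubling and halving steps assume (as the stated equivalences imply) that two testes weigh about twice as much as one and that the sperm count measured in this study corresponds to half of a two-epididymis total.

```python
# Approximate maxima reported for sterile F1 hybrids (Good et al.).
f1_single_testis_rel_weight = 4.0   # mg/g, one testis
f1_sperm_two_epididymides = 6.0     # million sperm, both epididymides

# Convert to the units used in this study (assumptions noted above).
f1_combined_rel_weight = 2 * f1_single_testis_rel_weight   # mg/g
f1_sperm_equivalent = f1_sperm_two_epididymides / 2        # million

# Minimum values observed in the pure subspecies groups.
pure_min_rel_weight = 6.14   # mg/g
pure_min_sperm = 1.99        # million

# Hybrids below the pure minima fall inside the sterile F1 range
# whenever the pure minima lie below the converted F1 maxima.
weight_within_f1_range = pure_min_rel_weight < f1_combined_rel_weight
sperm_within_f1_range = pure_min_sperm < f1_sperm_equivalent

# Searle and Beechey criterion: sperm count below 10% of "normal",
# using the lower pure subspecies mean (musculus) as normal.
normal_sperm_count = 4.86    # million
low_fertilization_threshold = 0.10 * normal_sperm_count

print(f1_combined_rel_weight, f1_sperm_equivalent)
print(weight_within_f1_range, sperm_within_f1_range)
print(f"10% threshold: {low_fertilization_threshold:.3f} million")
```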


To determine whether sperm produced by subfertile hybrid males are lower in quality as well as in quantity, we measured sperm morphology and motility traits using CASA. CASA measurements are not reliable for samples with few or no motile sperm; thus, these results do not include 11 (8.3%) hybrid males with severely reduced sperm counts. We first investigated variation in sperm head shape, which reflects differences in apical hook length. In HM, mean head shape ratio was significantly lower than in pure musculus, but the mean and range were similar to pure domesticus. Head shape ratio was significantly lower in HD than in both pure subspecies (Table 3). However, head shape ratio differed significantly between musculus and domesticus; thus shorter sperm heads in hybrids may be caused by an incompatibility or may solely reflect subspecies differences in this trait. A modest but significant positive correlation between head shape ratio and testis weight (rho = 0.22, partial correlation, Table 5) suggests males with impaired fertility produce sperm with shorter hooks.
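For readers unfamiliar with partial rank correlations such as the rho reported above, the following Python sketch shows one common way to compute a Spearman correlation between two traits while controlling for a third. It is a minimal illustration only: the covariate (body weight), the function names, and the data values are hypothetical, and the exact procedure behind Table 5 may differ.

```python
from statistics import mean

def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson product-moment correlation."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def partial_spearman(x, y, z):
    """Spearman correlation of x and y, controlling for z."""
    rxy, rxz, ryz = spearman(x, y), spearman(x, z), spearman(y, z)
    return (rxy - rxz * ryz) / ((1 - rxz ** 2) * (1 - ryz ** 2)) ** 0.5

# Hypothetical values for six males (not data from this study).
head_shape_ratio = [1.85, 1.90, 1.88, 1.95, 2.00, 1.93]
testis_weight_mg = [150, 170, 175, 160, 200, 190]
body_weight_g = [18.0, 19.5, 17.5, 19.0, 21.0, 20.0]

rho_raw = spearman(head_shape_ratio, testis_weight_mg)
rho_partial = partial_spearman(head_shape_ratio, testis_weight_mg, body_weight_g)
print(f"Spearman rho: {rho_raw:.2f}, partial rho: {rho_partial:.2f}")
```

A positive partial rho means that, at a given body weight, males with lighter testes tend to have lower head shape ratios, which is the direction of effect described in the text.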

Consistent with previously reported subspecies differences in sperm swimming speed (Dean and Nachman 2009), mean values for all sperm motility traits were higher in pure musculus than in domesticus, with significant differences in track velocity (VCL) and beat cross-frequency (BCF) (Table 3). Hybrid means were intermediate between the two subspecies for some traits, and none were significantly lower than both pure subspecies. However, a small proportion of hybrids measured (3–12%, Table 4) did have motility trait values below the pure range. For VCL, VAP, ALH, and BCF, a similar proportion of hybrids fell above and below the pure range, suggesting the larger range of variation in hybrids may be due to transgressive segregation. In contrast, 17% of HM individuals had VSL (straight-line velocity) below the pure range and none were above. Lack of a consistent direction of movement in low VSL sperm might impair fertility in HM males. Significant correlations between swimming speed (VAP, VSL, VCL, Table 5) and sperm count in hybrids suggest males producing few sperm also produce lower quality sperm.
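The reasoning in this paragraph compares how many hybrids fall below versus above the range of the pure subspecies: roughly symmetric tails are what transgressive segregation would produce, whereas a strongly one-sided excess points toward a defect. A minimal Python sketch of that tabulation follows; the function name and the example values are hypothetical and are not data from this study.

```python
def fractions_outside_pure_range(hybrid_values, pure_values):
    """Return (fraction below, fraction above) the pure-subspecies range."""
    low, high = min(pure_values), max(pure_values)
    n = len(hybrid_values)
    below = sum(v < low for v in hybrid_values) / n
    above = sum(v > high for v in hybrid_values) / n
    return below, above

# Hypothetical straight-line velocities (um/sec), for illustration only.
pure_vsl = [62.0, 75.0, 80.5, 95.0]
hybrid_vsl = [55.0, 60.0, 70.0, 71.3, 82.0, 90.0]

below, above = fractions_outside_pure_range(hybrid_vsl, pure_vsl)
print(f"below pure range: {below:.0%}, above pure range: {above:.0%}")
```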


We performed histological analyses of testes from hybrids and pure subspecies males to gain insight into the mechanisms underlying hybrid sterility. We observed several abnormalities in hybrids indicating testicular degeneration and atrophy, including multinucleated syncytial cells, thin spermatogenic epithelia, and vacuolated Sertoli cells (Fig. 4B–F). Additionally, we observed evidence of reduced or arrested meiosis, including increased apoptosis (chiefly of round spermatids) with subsequent depletion of round spermatids and decreased numbers of meiotic spermatocytes in severely atrophic tubules (Fig. 4E,F). We also observed postmeiotic defects in spermiogenesis, including reduced numbers of elongating spermatids, degeneration of round/elongating spermatids, and retention of spermatids (Fig. 4E), which is often associated with abnormal sperm parameters and subfertility.

Figure 4.

Histological testis sections. (A) Normal stage VII tubule from a domesticus male (200× magnification). (B–F) Examples of abnormalities in hybrids (200×). (B) The tubule on the left has disorganized epithelium, several hypereosinophilic and individuated (apoptotic) round spermatids (arrows), and a multinucleated syncytium (arrowhead), which indicate degeneration; there are also markedly decreased numbers of round spermatids. The tubule on the upper right is segmentally atrophic, with decreased numbers of spermatocytes and round spermatids, but it is producing sperm. (C) The central tubule (asterisk) has markedly atrophic epithelium, but spermatogenesis is occurring. Note the disorganized tubule on the lower left and the atrophic tubule on the upper right. The tubule on the right is mildly atrophic and segmentally lacks orderly spermiogenesis. (D) A moderately affected tubule demonstrates rare spermatocytes (S), and multiple round spermatids are undergoing apoptosis (arrow). A multinucleated syncytium is also present (arrowhead). (E) Two severely atrophic tubules (asterisks) completely lack round spermatids; the epithelium consists of only Sertoli cells, spermatogonia, rare spermatocytes, and retained spermatids of the last generation. Note the variable atrophy of adjacent tubules, where small numbers of round spermatids are present in lower left and upper right tubules. (F) The testis and tubules have undergone advanced and severe atrophy. Severely vacuolated Sertoli cells remain, as do small numbers of spermatogonia and rare meiotic cells. There is a marked decrease in tubule number and diameter (compare to A), and consequently an apparent increase in the number of interstitial Leydig cells. Scale bar is 50 μm. (G–H) Lower magnification (80×) views of testis from hybrid males with low sperm count and relative testis weight below (G) vs. within (H) the pure subspecies range. Scale bar in H is 100 μm.

The type and severity of defects present varied among individuals and among seminiferous tubules within an individual (e.g., Fig. 4B,C). For example, individuals with very low testis weight frequently showed marked reductions in germ cell abundance (Fig. 4G) and/or reduced size and number of tubules. In contrast, in individuals with low sperm counts but testis weight in the pure subspecies range, the overall abundance of germ cells appeared relatively normal, but cells had abnormal morphology, the epithelial structure was disorganized, and the relative abundance of different cell types was altered (Fig. 4H).

We performed quantitative histological analysis of a subset of males; however, accurate comparisons were only possible for seminiferous tubules with relatively normal morphology. The most severely affected males were excluded because they had few or no stage VII tubules. The tubules measured in the other hybrids with low testis weight or sperm count were those most normal in appearance; defective tubules were frequently present in the same individual. There were no significant differences among groups in the histological measures; however, we had low power due to small sample sizes. Mean seminiferous tubule area was lower in domesticus than in musculus, suggesting there may be a subspecies difference in this trait (Table 6). There was a trend toward decreased abundance of meiotic cells (pachytene spermatocytes per 100 μm tubule perimeter) in hybrids compared to pure subspecies, consistent with our qualitative observations of decreased meiosis in more abnormal tubules in some individuals. In contrast, spermatid-to-spermatocyte ratios were similar among groups, and perhaps higher in low-fertility HD males; thus there is no indication of meiotic arrest in the measured tubules.

Table 6.  Quantitative testis histology.

Admixture category1  Subfertile?2  N  Mean seminiferous tubule area (μm2) (SD)  Range seminiferous tubule area  Mean spermatocyte abundance (spermatocytes/100 μm tubule perimeter) (SD)  Range spermatocyte abundance  Mean spermatid-to-spermatocyte ratio5 (SD)  Range spermatid-to-spermatocyte ratio

    1 Admixture category of parents (as defined in Table 1).
    2 Hybrids with relative testis weight and/or sperm count below the pure subspecies range.

D  –  5  30,866 (6,198)  (24,573–40,731)  9.13 (1.06)  (7.65–10.23)  2.74 (0.42)  (2.13–3.31)
HD  No  5  27,865 (2,254)  (26,295–31,852)  8.05 (1.02)  (6.88–9.69)  2.54 (0.36)  (1.91–2.82)
HD  Yes  6  29,203 (4,829)  (24,212–35,938)  7.96 (1.19)  (6.78–9.52)  3.22 (0.27)  (2.92–3.58)
HM  No  5  34,055 (3,827)  (29,989–39,086)  8.79 (0.28)  (8.55–9.27)  2.99 (0.35)  (2.58–3.55)
HM  Yes  8  30,551 (4,850)  (25,138–37,538)  8.35 (0.98)  (6.27–9.39)  2.88 (0.29)  (2.40–3.36)
M  –  5  35,605 (5,500)  (29,231–41,833)  9.31 (0.95)  (7.76–10.21)  3.04 (0.33)  (2.65–3.44)


Much of the research on speciation processes in house mice has focused on two topics: patterns of introgression across natural hybrid zones and the genetic basis of hybrid male sterility. Evidence linking hybrid male sterility to reduced gene flow in natural contact zones between subspecies, however, has been lacking. Our study reveals that low fertility phenotypes are prevalent in the hybrid zone but the type, number, and severity of defects vary widely among individuals and between groups of hybrids with different ancestry. This variation suggests multiple underlying genetic incompatibilities are segregating in the Bavarian hybrid zone.

There are two possible alternative causes for the observed fertility defects in the hybrid zone. We can likely rule out one of these: we found no evidence that the t haplotype causes reduced hybrid fitness in the Bavarian hybrid zone. The second alternative is karyotypic variation; frequent chromosomal rearrangements (usually Robertsonian fusions) within the subspecies domesticus (reviewed in Pialek et al. 2005) can cause partial or complete sterility when present in a heterozygous state (e.g., Hauffe and Searle 1998; Castiglia and Capanna 2000). Currently no information is available concerning the karyotype status of mice in or near our study populations in Bavaria; however, distant populations in southern Germany harbor multiple Robertsonian fusions (Pialek et al. 2005). The reduced hybrid fertility phenotypes we report are qualitatively similar to some phenotypes observed in animals heterozygous for Robertsonian fusions (Sans-Fuentes et al. 2010); thus, this potential alternative cause of sterility needs to be ruled out in future studies.

The low fertility phenotypes we report are less severe, on average, than those previously described in F1 hybrids from crosses between subspecies (Britton-Davidian et al. 2005; Vyskocilova et al. 2005; Good et al. 2008b), perhaps because we have included individuals with several generations of hybrid ancestry. Characterization of these more varied and subtle phenotypes present in genetically diverse natural hybrids represents an important step toward understanding the role of hybrid male sterility in the house mouse speciation process.


The pattern of correlations among fertility-related traits suggests multiple forms of hybrid sterility, that is, the suite of fertility parameters we measured are not simply different aspects of a single low fertility phenotype. Although sperm count and testis weight are significantly correlated, variation in testis weight explains only ∼20% of the variation in sperm count (Table 5) and many hybrid individuals (20%) had low sperm counts but testis weight within the range of the pure subspecies, particularly in the HM group (Table 4). Correlations among fertility traits were modest, in general, and several trait pairs were not significantly correlated (Table 5). Principal components analysis of the fertility traits reveals complex variation (Table S4); the first two principal components (PCs) only explain ∼50% of the variation. The first axis of variation represents several sperm motility traits, but no clear pattern is apparent from variable loadings of the remaining PCs.

Testis histology revealed additional variation in the type of abnormalities present in hybrids that had low sperm counts and/or relative testis weight (Fig. 4). Defects were chiefly meiotic or postmeiotic, sparing the spermatogonia and the Sertoli cells (until the tubule was severely affected and lacked the functional capacity to produce sperm). A wide range of histologic lesions was identified, and the severity of the lesions was highly variable, both within individual testes and between testes from different individuals. This variation suggests low-fertility phenotypes apparent upon gross examination (i.e., low testis weight, low sperm count) may in fact result from multiple defects in spermatogenesis with similar outcomes.

The prevalence of several reduced fertility traits differed between predominantly domesticus hybrids and predominantly musculus hybrids (Table 4); relative testis weight, sperm count, and BCF were more frequently below the pure subspecies range in HD, whereas two measures of sperm swimming speed (VCL, VSL) were more frequently below the pure subspecies range in HM. Several factors likely contribute to differences between HD and HM. First, the higher frequency and severity of low fertility in HD males overall may be explained in part by the fact that HD males were more intermediate in ancestry than HM males (mean hybrid indices 0.38 and 0.71, respectively). Second, the frequency of incompatible domesticus alleles in pure domesticus populations might differ from the frequency of incompatible musculus alleles in pure musculus populations. Variable outcomes in crosses between subspecies, even using wild-caught individuals from a single geographic area, have shown that hybrid sterility alleles are often polymorphic and spatially vary in frequency (Britton-Davidian et al. 2005; Vyskocilova et al. 2005, 2009; Good et al. 2008b). Finally, reduced hybrid fertility phenotypes due to dominant or X-linked alleles will occur even when the frequency of introgressed alleles is low. Repeated observation of F1 sterility in crosses indicates some incompatibilities between musculus and domesticus are dominant. Furthermore, sterility has been repeatedly linked to the musculus X chromosome (Good et al. 2008b). Introgression of the musculus X might contribute to the higher prevalence of reduced fertility in HD.

Patterns of phenotypic variation suggest reduced hybrid fertility is unlikely to be explained by a single two-locus Dobzhansky–Muller incompatibility. Sterility may be caused by a complex multilocus epistatic interaction, with less-severe phenotypes found in hybrids with some but not all interacting reduced fertility alleles. Alternatively, variation may reflect multiple independent incompatibilities. Genetic mapping of the male fertility traits in the hybrid zone, which is currently underway in our laboratory, is needed to verify the presence of multiple incompatibilities, test for interactions between loci, and identify candidate sterility genes. In the future, more in-depth histological and immunostaining analyses of reproductive tissues from subfertile hybrids may aid in pinpointing the timing and cause of spermatogenesis defects.


There is a striking concordance in patterns of genetic admixture between the samples we collected in the Bavarian hybrid zone and samples collected in the same region 16–24 years earlier (Table S3, Teeter et al. 2008). The stability of the hybrid zone suggests that isolation mechanisms act efficiently to maintain subspecies distinctness over time, and house mice are indeed incipient species. What is the role of hybrid male sterility in maintaining barriers to gene flow?

The reduced fertility phenotypes we observed in natural hybrids are milder on average than phenotypes previously described in sterile F1s from crosses between subspecies (Britton-Davidian et al. 2005; Vyskocilova et al. 2005; Good et al. 2008b), suggesting subfertility is common but complete sterility is rare. Moreover, our breeding efforts with wild-caught hybrids were generally successful (Table 2). Taken at face value, these observations might suggest that late-generation hybrids are quite fit and that fertility is not an important barrier.

However, multiple mating of female house mice is common in nature, suggesting sperm competition is an important element of sexual selection (Dean et al. 2006; Firman and Simmons 2008). In a competitive context, even small reductions in sperm number and quality might have a severe impact on male reproductive success (Snook 2005). Noncompetitive heterosubspecific matings between domesticus and musculus are successful and litter sizes are generally comparable to consubspecific matings (Britton-Davidian et al. 2005), yet fertilization is faster in consubspecific than heterosubspecific matings (Dean and Nachman 2009), indicating the presence of postmating, prezygotic barriers. Hybrid males may suffer lower reproductive success through a combination of reduced sperm number, defects in sperm form or function, and decreased competitive ability relative to pure males.

Despite the prevalence of hybrid sterility and other hybrid defects among recently diverged species pairs, it has often been argued that the role of postzygotic barriers in speciation is negligible compared to premating barriers, which are more effective at preventing gene flow because they act earlier in the reproductive process (reviewed in Coyne and Orr 2004; Schemske 2010). Current barriers to gene flow between species, however, may differ from those that initially caused reproductive isolation. Postzygotic barriers can be critical in initial species divergence if they evolve before prezygotic barriers. Weak evidence for assortative mating between subspecies and the persistence of the hybrid zone suggest premating barriers are insufficient to maintain isolation between house mice in the early stages of speciation. In addition to hybrid male sterility, evidence for high parasite loads and reduced female fertility in hybrids suggests several forms of postzygotic barriers may be important (Sage et al. 1986a; Moulia et al. 1991, 1993; Derothe et al. 2001; Britton-Davidian et al. 2005). The effects of different barriers are likely to be enhanced in combination. Hybrids in poor condition due to high parasite loads might have low fecundity or be unable to compete effectively for mates. It is probable that subspecies distinctness is maintained by the cumulative effect of multiple mechanisms, each of which would be an incomplete barrier to gene flow in isolation. In-depth analyses of female fertility, parasite load, and immune function in natural hybrids are needed to assess the contribution of each potential isolating mechanism.

In opposition to barriers to gene flow that act to maintain genetic isolation between subspecies, other mechanisms may act to promote gene exchange in the hybrid zone, and potentially reduce isolation. One example is the recently reported case of genetic conflict in the Czech region of the house mouse hybrid zone (Macholan et al. 2008). The musculus Y chromosome and mitochondrial haplotype are more introgressed into domesticus territory than the rest of the genome. The authors argue that an association between introgression of the Y and census sex ratios indicates there is a male-biasing sex ratio distorter on the musculus Y, and genetic conflict between the sexes explains the pattern of increased Y introgression. Genetic conflict thus promotes increased gene flow between subspecies, overcoming effects of heterogametic incompatibilities. We searched for evidence of a driving, sex-ratio biasing musculus Y chromosome in our data: the combined sex ratio at weaning for HM was male-biased, in contrast to other ancestry categories, which were slightly female biased (Table 2). The number of male-biased HM families, however, was not significantly different from expectations under parity. All HM males had musculus Y chromosomes, as inferred from the genotypes of their sons, thus the trend in our data is consistent with a sex ratio distorter on the musculus Y which is effective only on a partially domesticus background. However, we found no evidence for increased introgression of the musculus Y; all HD males had domesticus Y chromosomes. It seems unlikely that a driving Y is increasing gene flow between subspecies in the Bavarian hybrid zone.
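The comparison of male-biased family counts against parity can be illustrated with an exact two-sided binomial (sign) test, sketched here in Python. This is an assumption about the form of the test, which the text does not name, and the family counts in the example are hypothetical placeholders rather than values from Table 2.

```python
from math import comb

def binomial_two_sided(k, n, p=0.5):
    """Exact two-sided binomial test of k successes in n trials.

    Sums the probabilities of all outcomes that are no more likely
    than the observed one under the null success probability p.
    """
    probs = [comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)]
    p_obs = probs[k]
    return min(1.0, sum(q for q in probs if q <= p_obs * (1 + 1e-9)))

# Hypothetical example: 7 of 10 families with more sons than daughters.
male_biased_families = 7
total_families = 10
p_value = binomial_two_sided(male_biased_families, total_families)
print(f"two-sided p-value under parity: {p_value:.3f}")
```

With a small number of families, even a fairly strong apparent bias is compatible with parity, which is why a trend of this kind cannot be distinguished from chance.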


In conclusion, we find that reduced hybrid male fertility in house mice, previously documented in detail only in crosses between subspecies, is indeed common in a natural contact zone. A complex pattern of variation in a suite of fertility-related traits among individuals and admixture categories suggests multiple hybrid sterility loci are present in the hybrid zone. Incompatibilities causing sterility likely act in combination with each other and with other incompatibilities affecting hybrid fitness to form the strong barrier to gene flow evident from the abrupt geographic transition between subspecies in the contact zone. Future studies of sterility and other types of reduced fitness in natural hybrids are needed to determine the strength of each isolating barrier and how they interact.

Associate Editor: C. Peichel


We thank R. Sage, M. White, and B. Payseur for useful discussion, and D. Tautz for advice and logistical support. Comments from T. Price, C. Peichel, and two anonymous reviewers greatly improved this manuscript. A. Cerwenka, J. Döring, E. Hardouin, and N. Hess assisted with fieldwork. H. Harre and H. Krehenwinkel provided technical assistance. C. Pfeifle provided guidance with breeding and management of mice. M. Bartosek helped with drawing maps. This research was supported with funds from the Deutsche Forschungsgemeinschaft to B. Harr (SFB-680) and the Max Planck Society to D. Tautz.