HALDANE'S RULE IN AN AVIAN SYSTEM: USING CLINE THEORY AND DIVERGENCE POPULATION GENETICS TO TEST FOR DIFFERENTIAL INTROGRESSION OF MITOCHONDRIAL, AUTOSOMAL, AND SEX-LINKED LOCI ACROSS THE PASSERINA BUNTING HYBRID ZONE
Matthew D. Carling,
Museum of Natural Science, Louisiana State University, Baton Rouge, Louisiana 70803
Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana 70803
Using cline fitting and divergence population genetics, we tested a prediction of Haldane's rule: autosomal alleles should introgress more than z-linked alleles or mitochondrial haplotypes across the Passerina amoena/Passerina cyanea (Aves: Cardinalidae) hybrid zone. We screened 222 individuals collected along a transect in the Great Plains of North America that spans the contact zone for mitochondrial (two genes), autosomal (four loci) and z-linked (two loci) markers. Maximum-likelihood cline widths estimated from the mitochondrial (223 km) and z-linked (309 km) datasets were significantly narrower on average than the autosomal cline widths (466 km). We also found that mean coalescent-based estimates of introgression were larger for the autosomal loci (0.63 genes/generation, scaled to the mutation rate μ) than for both the mitochondrial (0.27) and z-linked loci (0.59). These patterns are consistent with Haldane's rule, but the among-locus variation also suggests many independently segregating loci are required to investigate introgression patterns across the genome. These results provide the first comprehensive comparison of mitochondrial, sex-linked, and autosomal loci across an avian hybrid zone and add to the body of evidence suggesting that sex chromosomes play an important role in the formation and maintenance of reproductive isolation between closely related species.
Hybrid zones, formed after the secondary contact of partially reproductively isolated taxa (Hewitt 1988; Harrison 1990; Arnold 1997; Mallet 2005), are often used to investigate Haldane's rule (Haldane 1922) in nonmodel organisms for which laboratory crosses are impractical (Hagen and Scriber 1989; Brumfield et al. 2001; Saetre et al. 2003). The dominance theory of Haldane's rule allows for the development of testable hypotheses related to the differential introgression of different loci across these hybrid zones. Based on the Dobzhansky-Muller (D-M) incompatibility model (Dobzhansky 1937; Muller 1940, 1942), the dominance theory states that inviability or sterility arises from the interaction between two genes that evolved incompatible alleles in allopatry. If the alleles causing hybrid incompatibility are recessive, their impact will be much larger if one or both of the genes are located on the sex chromosomes.
Support for the dominance theory comes from a variety of empirical sources, including direct tests of a simple prediction: if genes causing hybrid sterility or inviability are recessive (thus being masked in F1 hybrids), it should be possible to find the causative alleles if they can be made homozygous or hemizygous. Such alleles have been found, mainly in Drosophila (Sawamura et al. 2000; Presgraves 2003; Tao and Hartl 2003). The dominance theory also is supported by comparative contrasts; researchers have found that Haldane's rule evolved more quickly in Drosophila with larger X chromosomes than in those with smaller X chromosomes (Turelli and Begun 1997). This “large-X effect” likely plays a role in non-Drosophila systems as well. It predicts that if the sex chromosomes play a large role in reproductive isolation, sex-linked alleles will show less introgression than autosomal alleles across hybrid zones. Such reductions in sex-linked introgression have been found in studies of natural hybrid zones between species of butterflies, mice, and birds (Hagen and Scriber 1989; Tucker et al. 1992; Saetre et al. 2003).
Mechanisms other than the dominance theory have been proposed to explain Haldane's rule (Coyne and Orr 2004), but they either do not apply to organisms with ZW sex determination (faster-male theory; Wu and Davis 1993; Wu et al. 1996), rely partially on the dominance theory (faster-x theory; Charlesworth et al. 1987), or are difficult to test in natural systems (meiotic drive theory; Frank 1991; Tao et al. 2003). Thus, our work focuses on testing introgression predictions based on the dominance theory.
In birds, females are the heterogametic sex (ZW) and thus, female hybrids are expected to show greater fitness reductions than male hybrids. Haldane's rule predicts that mitochondrial haplotypes, because they are maternally inherited, would show patterns of reduced introgression compared to autosomal alleles. Therefore, in taxa with heterogametic females, Haldane's rule predicts autosomal alleles will show greater levels of introgression than either mitochondrial or sex-linked alleles (z-linked hereafter).
Tests of Haldane's rule predictions in hybrid zones often involve investigating the relative selective forces acting on different loci by employing cline analyses to explain the changes in allele frequencies occurring along a geographic transect (Porter et al. 1997; Brumfield et al. 2001; Blum 2002; Payseur et al. 2004; Macholan et al. 2007). Cline theory predicts a relatively straightforward relationship between natural selection against hybrids and cline width—stronger selection against introgression results in a narrower cline (Slatkin 1973; Slatkin and Maruyama 1975; Endler 1977). Accordingly, loci (or linked loci) having alleles with narrower clines are thought to be more important in maintaining reproductive isolation between the hybridizing taxa than loci with wider clines. It is expected then, that mitochondrial haplotypes (in taxa with heterogametic females) and z-linked alleles should show narrower clines than autosomal alleles.
More recently, the increased sophistication of coalescent-based analytical techniques (Wakeley and Hey 1997; Hey 2005; Putnam et al. 2007; Rosenblum et al. 2007) has provided a new manner in which to investigate the selective forces acting on the movement of alleles across hybrid zones. Such methods use information contained in the genealogy of population samples to estimate a variety of population genetic parameters, including effective population size, introgression rates and divergence time, that can be used to explore the evolutionary history of populations. For example, these techniques have been used to reject strict allopatric speciation models in a number of taxa (Machado et al. 2002; Osada and Wu 2005). Additionally, by comparing gene flow parameters estimated from different loci, such coalescent-based methods can be used to investigate patterns of differential introgression. Loci with reduced rates of introgression may contribute more to the maintenance of reproductive isolation than loci with greater rates. Again, according to Haldane's rule, mitochondrial and z-linked alleles are predicted to show decreased rates of introgression compared to autosomal alleles.
In this study we had three principal objectives. The first was to provide the first genetic characterization of the hybrid zone between Passerina amoena and Passerina cyanea. Our second objective was to test the prediction that, according to Haldane's rule, both mitochondrial haplotypes and z-linked alleles should introgress less, on average, than autosomal alleles. Because two-thirds of z-linked alleles are carried by males, mitochondrial haplotypes should also show a reduction in introgression when compared with z-linked alleles. Our third objective was to investigate the utility of integrating cline-based and coalescent-based analyses of multilocus datasets to test the above predictions.
STUDY SYSTEM: PASSERINA BUNTINGS
Passerina amoena (Lazuli Bunting) and P. cyanea (Indigo Bunting) are closely related species (Carling and Brumfield 2008) that form a hybrid zone in which their breeding ranges overlap. The Passerina bunting hybrid zone (Fig. 1) is one of many hybrid zones clustered in a “suture zone” (Remington 1968) located near the junction between the western edge of the Great Plains and the eastern foothills of the Rocky Mountains (Swenson and Howard 2004, 2005). Although a diversity of taxa have contact zones in the same geographical location, the clustering of avian hybrid zones in the Northern Rocky Mountain zone is particularly striking (Rising 1983). Besides Passerina buntings, the distributions of 13 pairs of avian taxa meet in the Great Plains and form 10 additional hybrid zones (Rising 1983).
Previous morphological studies of the Passerina hybrid zone reported differences in the extent of introgression between P. amoena and P. cyanea. Sibley and Short (1959) reported widespread hybridization—nearly 70% of specimens collected from South Dakota, Wyoming, Nebraska, and Colorado were classified as hybrids. In contrast, two later studies reported much lower levels. Of 153 buntings collected across North Dakota and eastern Montana, only 35 were identified as hybrids (Kroodsma 1975), and ∼5% of the 53 buntings collected in Nebraska and Wyoming were identified as hybrids (Emlen et al. 1975). These three studies differed in the location of the transect sampled as well as in the methods used to classify individuals as P. amoena, P. cyanea, or hybrids. Although the discrepancies in the reported frequency of hybridization may reflect different hybrid zone dynamics among the three transects, at least a portion of the difference is due to methodological differences. Both Kroodsma (1975) and Emlen et al. (1975) used a more conservative hybrid index score than Sibley and Short (1959), and a reassessment of the specimens collected by Sibley and Short suggested only ∼7% of their specimens were hybrids (Emlen et al. 1975).
Behavioral studies of the Passerina bunting hybrid zone provide some support for assortative mating and interspecific male territoriality. Using captive individuals, Baker (1991) showed that P. amoena and P. cyanea males from sympatric populations respond antagonistically to each other's songs and that sympatric females give more copulation solicitation displays when exposed to conspecific male traits than when exposed to heterospecific male traits (Baker and Baker 1990; Baker 1996). Subsequent field-based research suggested male plumage may be a better predictor of conspecific versus heterospecific status when females are choosing mates (Baker and Boylan 1999). Because male buntings are able to alter their song phrases from one breeding season to the next (Payne 1981), females may opt to rely on male plumage patterns, which are likely less plastic than song characteristics. Field research also provided some support for selection against hybrids. Pairings that involved at least one hybrid individual (either the male or the female; there were no instances in which both sexes were hybrids) resulted in fewer nestlings and fledglings when compared to pairings involving nonhybrid individuals (Baker and Boylan 1999).
SAMPLING, AMPLIFICATION AND SEQUENCING
During the summers of 2004–2007, we collected vouchered population samples of P. amoena, P. cyanea, and P. amoena×P. cyanea hybrids from 21 localities spanning the contact zone (Fig. 1, Table 1 and Supporting Information Table S1). For reference parental samples, we also acquired tissues from allopatric populations of P. amoena and P. cyanea west and east of the contact zone, respectively (Fig. 1, Table 1 and Supporting Information Table S1), resulting in 248 individuals collected from 27 focal populations. Four additional localities represented by a single individual (Supporting Information Table S1) were included in the coalescent-based analyses, but not in the cline-based analyses. All skin and tissue specimens collected for this project were deposited in the Louisiana State University Museum of Natural Science.
Table 1. Sampling localities, sample sizes, and distance along sampling transect of individuals analyzed in this study.
Distance from locality 1 (km)
1Access to these localities generously provided by private landowners.
2Individuals from these localities were excluded from cline analyses; see Supporting Information Table S1 for exact sampling localities.
MT: Custer National Forest #1
WY: Bighorn National Forest #1
WY: Bighorn National Forest #2
CO: Roosevelt National Forest
MT: Custer National Forest #2
WY: Medicine Bow National Forest
ND: Little Missouri National Grassland
SD: Custer National Forest
WY: Sand Creek1
NE: White River1
SD: Black Hills National Forest
NE: Ponderosa State Wildlife Management Area
SD: The Nature Conservancy Whitney Preserve
NE: Nebraska National Forest
SD: Ft. Meade National Recreation Area
NE: The Nature Conservancy Niobrara Valley Preserve
SD: Carpenter Game Production Area
ND: The Nature Conservancy Pigeon Point Preserve
NE: Wiseman State Wildlife Area
SD: Newton Hills Game Production Area
We extracted genomic DNA from ∼25 mg of pectoral muscle using either standard phenol/chloroform methods or a DNeasy Tissue Kit (QIAGEN Inc., Valencia, CA). Each individual was amplified at a suite of eight loci (two mitochondrial, four autosomal, and two z-linked markers; Table 2) using the following PCR conditions in a 25 μl reaction: ∼40 ng template DNA (2 μl of DNA extracts), 1 μl of 10 mM dNTPs (2.5 mM each dATP, dTTP, dCTP, dGTP), 1 μl of each primer (10 mM, Table 2), 2.5 μl 10 × Buffer with MgCl2 (15 mM), 0.1 μl Taq (5 U/μl of either AmpliTaq DNA Polymerase, Applied Biosystems Inc., Foster City, CA or Taq DNA Polymerase, New England Biolabs, Ipswich, MA), and 17.4 μl sterile diH2O. The thermocycling profile was as follows: an initial 95°C denaturation for 2 min followed by 35 cycles consisting of a 30 sec, 95°C denaturation step, a 30 sec, locus-specific temperature primer annealing step (Table 2), and a 2 min, 72°C extension step, and a final extension of 5 min at 72°C. To check for amplification we electrophoresed 2.5 μl of each PCR product on a 1% agarose gel.
Table 2. PCR annealing temperatures, sequence lengths, primer sequences, diversity measures, and GenBank accession numbers for mitochondrial, autosomal, and z-linked loci.
Internal sequencing primers3
1Length of longest independently segregating block.
4ND3 and Cont. Region were analyzed together as a single mtDNA dataset.
F: TAC CTA GGA GGT GGG CGA AT
R: CCC AAA CAT TAT CTC CAA AA
F: AGC ACC TTT GAA CAG TGG TT
R: TAC TTT ATG GAG ACG ACG GA
We cycle-sequenced both strands of all PEG-purified PCR amplicons in a 7 μl reaction using 1.5 μl of 5 X sequencing buffer (Applied Biosystems), 1 μl of 10 mM primer, 2.5 μl of template, 0.15–0.25 μl Big Dye Terminator Cycle-Sequencing Kit v 3.1 (A), and 1.85–1.95 μl sterile diH2O. We cleaned cycle-sequencing products on Sephadex (G-50 fine) columns and electrophoresed the cleaned products on a 3100 Genetic Analyzer (Applied Biosystems). All sequences were edited and assembled using Sequencher ver. 4.7 (GeneCodes Corp., Ann Arbor, MI). When direct sequencing of purified PCR amplicons of autosomal and z-linked loci revealed more than one heterozygous site within a sequence, we resolved haplotypes probabilistically using PHASE (Stephens et al. 2001; Stephens and Donnelly 2003).
To identify the largest independently segregating block of sequence data for all nuclear loci (autosomal and z-linked), we tested for intralocus recombination using the four-gamete test as implemented in DnaSP ver. 4.10 (Rozas et al. 2003). We also used DnaSP to calculate haplotype diversity (Hd; Nei 1987) and nucleotide diversity (π; Tajima 1983) within each species. Species identification was based on plumage pattern (see below). To assign haplotypes as belonging to either P. amoena or P. cyanea, we used TCS ver. 1.21 (Clement et al. 2000) to build parsimony-based haplotype networks that included all phased haplotypes. Individual haplotypes with frequencies ≥ 0.80 in the “allopatric” P. amoena population (WA) were classified as P. amoena haplotypes. Alternatively, haplotypes with frequencies ≥ 0.80 in the “allopatric” P. cyanea population (LA) were designated as P. cyanea haplotypes. The most common MYO2 haplotype could not be assigned as either P. amoena or P. cyanea based on the above criterion. For the cline analyses (see below) we estimated cline shape parameters for the most common MYO2 haplotype.
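The core idea of the four-gamete test can be sketched in a few lines. This is only an illustration of the pairwise check, not DnaSP's implementation: the function name `four_gamete_violations` and the toy haplotypes are hypothetical, and the step that derives the largest non-recombining block from the flagged pairs is not shown. Under an infinite-sites assumption, two biallelic sites that show all four possible two-site combinations imply at least one recombination event between them.

```python
from itertools import combinations

def four_gamete_violations(haplotypes):
    """Return pairs of site indices at which all four two-site gametes occur.

    `haplotypes` is a list of equal-length strings (phased, gap-free).
    """
    n_sites = len(haplotypes[0])
    # Only biallelic sites are informative for the test.
    biallelic = [i for i in range(n_sites)
                 if len({h[i] for h in haplotypes}) == 2]
    violations = []
    for i, j in combinations(biallelic, 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:  # all four combinations observed
            violations.append((i, j))
    return violations

# Hypothetical toy alignment: sites 0 and 1 show all four gametes.
toy_haplotypes = ["AAT", "AGT", "CAT", "CGC"]
print(four_gamete_violations(toy_haplotypes))
```

An empty result for a block of sequence is consistent with no detectable recombination inside that block, which is the property the coalescent analyses require.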
For the cline analyses we excluded those individuals for which PHASE was unable to assign haplotypes at a probability greater than 0.75 (range: 4–21 individuals per locus), and the 26 individuals from Louisiana, because they did not fall on a natural linear transect from west to east. With the remaining individuals (Table 3), we used a procedure similar to that of Macholan et al. (2007) to generate a linear transect spanning the hybrid zone. First, for localities 3–23 we averaged the frequency of P. cyanea alleles across all seven loci (Fig. 1C). In the case of MYO2, we included only those alleles that could be unambiguously assigned to either P. amoena or P. cyanea; thus, the most common allele was not included in this calculation. Next, using the Contour Plot option in JMP 5.1, we plotted the 0.5-isocline using x and y coordinates and average allele frequencies of localities 3–23, which are those populations within the previously described area of overlap between P. amoena and P. cyanea. Treating this 0.5-isocline as the center of the zone, we measured the shortest straight-line distance between each sampling locality and the 0.5-isocline. Because the individuals from Washington (locality 1), Minnesota (24), Illinois (25), and Michigan (26) were not all from the same sampling locality within their respective states, we plotted all sampling localities in Google Earth and then picked a point at the estimated centroid of the plotted localities in each state. In this way, the samples from Washington, Minnesota, Illinois, and Michigan were collapsed into a single locality for each state (Table 1). The farthest sampling locality (WA) was then set to 0 km and the location of all other localities was recalculated accordingly, resulting in a linear transect from WA to MI (Table 1).
Table 3. Cline shape parameters (two-unit likelihood support limits) for locus-specific and combined loci datasets.
c (km from locality 1)
1Number of individuals (mtDNA) or number of chromosomes (all other loci) sampled for cline analyses.
2Combined dataset of ND3 and Cont. Region haplotypes.
3Combined dataset of ALDOB3 and BRM15 alleles.
4Combined dataset of GADPH, MC1R and RHO1 alleles.
Along this transect, we investigated introgression patterns using the methods developed by Szymura and Barton (1986, 1991) and implemented in the program ClineFit (Porter et al. 1997). These methods estimate cline shape parameters using three equations that model the relationship between allele frequency data within a sampling locality and the geographic location of those localities. The first equation (1) describes a symmetrical, sigmoidal shape in the center of the cline and the other two equations (2 and 3) describe the exponential change in allele frequencies on the left and right sides of the cline:
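The equations themselves are not reproduced in this copy of the text. As a hedged reconstruction, assuming the form in which the Szymura and Barton model is usually written for ClineFit analyses (the sign conventions for z_L and z_R in particular are an assumption), and writing p for the frequency of the focal allele at position x:

```latex
\begin{align}
p &= \frac{1}{2}\left[1 + \tanh\!\left(\frac{2\,(x - c)}{w}\right)\right] \tag{1}\\
p &= \exp\!\left(\frac{4\,[x - (c + z_L)]\,\sqrt{\theta_L}}{w}\right) \tag{2}\\
p &= 1 - \exp\!\left(\frac{-4\,[x - (c - z_R)]\,\sqrt{\theta_R}}{w}\right) \tag{3}
\end{align}
```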
In these equations, c represents the location of the center of the cline (measured in kilometers from the WA locality), w is the cline width (1/max slope), and x is the geographic location along the sampling transect (kilometers from WA locality). In equations (2) and (3), zL and zR describe the distance from the center (c) of a vertical asymptote for the exponential decay of allele frequencies on the left and right side of the zone, respectively. The parameters θL and θR are the exponential decay values relative to the shape of the central cline (eq. 1) on the left and right sides. The three equations are related in that as the parameters z and θ approach 0 and 1, respectively, equations (2) and (3) approach the shape of equation (1) on their respective sides of the zone. The center of the zone (c) is the point along the transect at which allele frequencies change most rapidly and the width of the zone (w) provides an estimate of the geographic distance over which that rapid change in allele frequencies occurs. The parameters θL and θR describe the exponential rate of change in allele frequencies in the western and eastern tails of the cline and zL and zR provide information on the geographic distance over which the exponential decay in the tails occurs.
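The relationships among these parameters can be checked numerically. The sketch below assumes the tanh form for the central cline and exponential forms for the two tails that are commonly written for the Szymura and Barton model. The function names are hypothetical, the sign conventions for the z parameters are an assumption, and this is not ClineFit's code. It illustrates two statements from the paragraph above: the maximum slope of the central cline is 1/w, and with z = 0 and θ = 1 the tails converge on the central shape.

```python
import math

def central_cline(x, c, w):
    """Sigmoid (tanh) center of the cline; slope at x = c equals 1/w."""
    return 0.5 * (1.0 + math.tanh(2.0 * (x - c) / w))

def left_tail(x, c, w, z_left, theta_left):
    """Exponential tail on the left (western) side; assumed form."""
    return math.exp(4.0 * (x - (c + z_left)) * math.sqrt(theta_left) / w)

def right_tail(x, c, w, z_right, theta_right):
    """Exponential tail on the right (eastern) side; assumed form."""
    return 1.0 - math.exp(-4.0 * (x - (c - z_right)) * math.sqrt(theta_right) / w)

# Center and width reported in the text for the mitochondrial cline (km).
c, w = 1262.0, 223.0

# Central finite difference around the center approximates the maximum slope.
slope_at_center = central_cline(c + 0.5, c, w) - central_cline(c - 0.5, c, w)
print(slope_at_center, 1.0 / w)

# Two cline widths from the center, tails with z = 0 and theta = 1
# are nearly indistinguishable from the central curve.
print(left_tail(c - 2 * w, c, w, 0.0, 1.0), central_cline(c - 2 * w, c, w))
print(right_tail(c + 2 * w, c, w, 0.0, 1.0), central_cline(c + 2 * w, c, w))
```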
For each locus, as well as combined autosomal loci and combined z-linked loci datasets (n= 9 datasets), we tested the fit of two different models using likelihood-ratio tests: the two-parameter model, which includes only the center and width parameters, and the six-parameter model, which estimates the six shape parameters described (Porter et al. 1997; Brumfield et al. 2001). In the majority of tests (seven of nine), the six-parameter model was a significantly better fit to the data (likelihood-ratio tests, P < 0.05). Thus, to facilitate comparisons between the different datasets, we used the six-parameter model to describe the cline shape for every dataset. The following search parameters were used in each analysis: burn-in: parameter tries per step—300; sampling for support: replicates saved ≥ 2000, and 30 replicates between saves. Differences in parameter estimates between different loci were assessed using the two-unit likelihood support limits, which are analogous to 95% confidence limits (Edwards 1992), and by comparing the log likelihood score of the cline parameter estimates from an unconstrained model to the log likelihood score when the width was constrained to equal the width estimated from an unconstrained search of another dataset. Significance was assessed using a likelihood-ratio test. Because an initial inspection suggested the change in MYO2 allele frequency data across the transect was not clinal (Fig. 2), it was excluded from the estimates of combined autosomal cline shape parameters.
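The likelihood-ratio comparisons described above reduce to a chi-square tail probability. The following is a minimal standard-library sketch, not the procedure built into ClineFit; the function names are hypothetical, and the chi-square survival function is implemented only for df = 1 and for even df, which covers a single constrained width (df = 1) and the two- versus six-parameter comparison (df = 4).

```python
import math

def likelihood_ratio_statistic(lnl_unconstrained, lnl_constrained):
    """Twice the drop in log likelihood caused by the constraint."""
    return 2.0 * (lnl_unconstrained - lnl_constrained)

def chi2_sf(stat, df):
    """Upper-tail probability of a chi-square variate (df = 1 or even df)."""
    if stat < 0:
        raise ValueError("test statistic must be non-negative")
    half = stat / 2.0
    if df == 1:
        return math.erfc(math.sqrt(half))
    if df > 0 and df % 2 == 0:
        # Closed form for even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    raise NotImplementedError("only df = 1 or even df are handled in this sketch")

# A two-unit drop in log likelihood gives a statistic of 4 with df = 1,
# which is why two-unit support limits behave like ~95% confidence limits.
print(chi2_sf(likelihood_ratio_statistic(0.0, -2.0), 1))

# Statistics reported in the text for constrained-width comparisons (df = 1).
print(chi2_sf(32.1, 1))   # well below 0.0001
print(chi2_sf(2.85, 1))   # approximately 0.09
```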
If introgression of autosomal alleles is greater than that of mitochondrial and z-linked markers, autosomal loci should show increased levels of interspecific gene flow. We assessed this using the coalescent-based nonequilibrium Isolation with Migration model implemented in the software program IM (Hey and Nielsen 2004; Hey 2005). By combining coalescent theory with Bayesian methodologies, IM simultaneously estimates multiple population genetic parameters for two populations. These parameters, which are scaled to the neutral mutation rate, μ, include: θ1=4N1μ, θ2=4N2μ, θA=4NAμ, t=tμ, m1=m1/μ, and m2=m2/μ. The two migration rates, m1 and m2, allow for inferences of asymmetric introgression, and can be used to compare gene flow rates between different marker classes (e.g., mitochondrial vs. autosomal, and z-linked vs. autosomal).
Because IM assumes no intralocus recombination and requires fully resolved haplotypes (i.e., no ambiguous sites) with no gaps, we had to pare down the nuclear DNA sequence datasets. First, we removed all gaps in the alignments. Second, we included only those individuals for which PHASE was able to assign haplotypes with a probability of greater than 0.75. We tested for recombination as described above. Lastly, we used the dominant plumage motif (e.g., presence of white wing bars and rusty breast band are characteristic of P. amoena males) of each specimen to assign individual birds to either the P. amoena or P. cyanea population. Because females of the two species are difficult to distinguish morphologically, we excluded eight females collected in the contact zone. We also excluded two males (Supporting Information Table S1) for which definitive population assignment was difficult (Table 4). Because all individual classifications were the same in each dataset, these exclusions are unlikely to systematically bias the results.
Table 4. Smoothed highpoint coalescent-based estimates (90% Highest Posterior Densities) of per generation asymmetric introgression rates between P. amoena and P. cyanea.
N1 P. amoena
N1 P. cyanea
Introgression from P. amoena into P. cyanea
Introgression from P. cyanea into P. amoena
1Sample sizes (N) are the number of individuals (mtDNA) or the number of chromosomes (all other loci) designated as belonging to either P. amoena or P. cyanea.
To facilitate comparisons of introgression estimates between different loci, we combined the sequence data for all loci into a single dataset. From this dataset we estimated one effective population size (we forced θ1=θ2=θA), divergence time (t), and asymmetric introgression rates (m1 and m2) for each locus. Additionally, we included inheritance scalars in the input file (mitochondrial: 0.25, autosomal: 1.0, z-linked: 0.75) to account for the difference in effective population sizes due to different ploidy and inheritance modes of the three marker types. We initially ran IM using the HKY finite substitution model with wide, uninformative priors for > 2 million steps and used these initial runs to identify more appropriate priors (Won and Hey 2005). These adjusted priors (θ1= 10, m1= 10, m2= 10, t= 10) were then used in two replicate “final” analyses that differed only in starting random number seed. All “final” analyses were run with a burn-in of 200,000 steps and were allowed to continue until the Effective Sample Size (ESS) values for each parameter were > 100 (Hey 2005). Convergence was also assessed by inspecting the plots of parameter trend lines and by comparing the results of the two replicate runs. Because all parameter estimates for the two replicate runs were qualitatively similar, we only present the estimates from the longest “final” run of each dataset (number of steps greater than 5 × 10^7). The 14 directional introgression rate estimates (two mitochondrial rates, one from P. cyanea into P. amoena and one in the opposite direction; eight autosomal rates, two for each of four loci; and four z-linked rates, two for each of two loci) from the longest “final” run were used in comparisons of differential introgression.
The introgression estimates are scaled to the mutation rate, which is approximately 10 times faster for mitochondrial than for nuclear loci (Graur and Li 2000). We assumed the following mutation rates: 3.6 × 10^−9 substitutions/site/year for autosomal and 3.9 × 10^−9 substitutions/site/year for z-linked loci (Axelsson et al. 2004). Because each estimate is a migration rate divided by the mutation rate, a 10-fold higher mutation rate makes the mitochondrial estimates 10-fold smaller for the same underlying rate; we therefore multiplied the introgression estimates for the mitochondrial data by 10 to make the estimates more readily comparable across loci.
No heterozygous sites in the two presumed z-linked loci (ALDOB3 and BRM15) were found in any females. For four of seven loci, both haplotype diversity (Hd) and nucleotide diversity (π) were larger in P. cyanea than in P. amoena (Table 2). In two datasets, RHO1 and GADPH, both diversity measures were larger in P. amoena; in the mitochondrial dataset, haplotype diversity was larger in P. cyanea, but nucleotide diversity was larger in P. amoena.
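For readers unfamiliar with the two diversity measures compared here, the sketch below computes them from a list of aligned haplotypes. It assumes the usual textbook definitions (Hd as n/(n − 1) times one minus the sum of squared haplotype frequencies; π as the mean number of pairwise differences per site) and is not DnaSP's code. The function names and the toy haplotypes are hypothetical.

```python
from collections import Counter
from itertools import combinations

def haplotype_diversity(haplotypes):
    """Hd = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(haplotypes)
    freqs = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1.0 - sum(f * f for f in freqs))

def nucleotide_diversity(haplotypes):
    """Mean pairwise differences per site across all pairs of sequences."""
    n = len(haplotypes)
    length = len(haplotypes[0])
    total_differences = sum(
        sum(a != b for a, b in zip(h1, h2))
        for h1, h2 in combinations(haplotypes, 2)
    )
    n_pairs = n * (n - 1) / 2
    return total_differences / n_pairs / length

# Hypothetical toy sample: two copies each of two haplotypes differing at one site.
toy_sample = ["AA", "AA", "AG", "AG"]
print(haplotype_diversity(toy_sample), nucleotide_diversity(toy_sample))
```

The two measures can rank populations differently, as in the mitochondrial dataset, because Hd counts how many distinct haplotypes there are while π also reflects how different they are from one another.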
Males that were defined as P. amoena based on plumage patterns (see section “Coalescent Analyses”) were found as far east as population 18 and P. cyanea males were found as far west as population 11 (Fig. 1). We refer to this region, which is approximately 250-km wide, as the “contact zone.” No genetically pure individuals of the wrong type (e.g., an individual with all P. amoena alleles but identified as P. cyanea based on plumage) were found outside of the contact zone. After excluding the MYO2 data, there were no individuals collected on the east side of the contact zone with a frequency of P. amoena alleles greater than 0.273. A similar pattern was found on the west side: no single individual had a frequency of P. cyanea alleles greater than 0.364.
CLINE SHAPE ESTIMATES
The centers of all locus-specific clines, with the exception of MYO2, were coincident and fell within a 64 km range (extremes: GADPH, 1259 km east of locality 1, and ALDOB3, 1353 km east of locality 1; Fig. 2 and Table 3). This 64 km range begins just east of locality 17 and encompasses the midpoint between localities 11 and 18, which form the boundaries of the contact zone (see above). Similarly, the cline widths of all locus-specific clines were concordant (Table 3). Widths ranged from 175 km (GADPH) to 461 km (MC1R). Cline widths of the other four loci (mtDNA, ALDOB3, BRM15, and RHO1) were all between 223 and 268 km, approximately the same width as the contact zone (∼250 km). Overall, the estimates of cline center were less variable (coefficient of variation = 0.028) than the estimates of cline width (c.v. = 0.368). There was no association between the locus-specific estimates of cline center and width (ANOVA P > 0.4).
The cline centers of the combined autosomal (1333 km east of locality 1; minus MYO2) and z-linked (1361 km east of locality 1) datasets were coincident (Table 3), but shifted east from the mitochondrial cline center (1262 km east of locality 1; Fig. 2 and Table 3). The autosomal cline width was 466 km (two-unit likelihood support limits: 344–545 km), significantly greater than the mitochondrial width (223 km; two-unit likelihood support limits: 158–305 km), and greater than the z-linked cline width (309 km). The difference in cline widths between the autosomal and z-linked clines was almost significant using the nonoverlapping support limits criterion (autosomal two-unit likelihood support limits: 344–545 km; z-linked: 241–375 km). Forcing the z-linked width to equal the autosomal width (466 km) resulted in a significantly worse fit of the data to the model (χ2= 32.1, df = 1, P < 0.0001). The difference between the mitochondrial and z-linked cline widths was not significant by either method (two-unit likelihood support limits; Table 3), although forcing the mitochondrial width to be 309 km (the z-linked width) resulted in a worse fit that was close to significant (χ2= 2.85, df = 1, P= 0.09). As with the locus-specific clines, the cline center estimates for the combined datasets were less variable (c.v. = 0.015) than the cline width estimates (c.v. = 0.286).
With the exception of divergence time (t), the distributions of the posterior probabilities of all parameter estimates showed a single peak. The smoothed highpoint estimate of θ1 was 2.285 (90% HPD: 1.763–2.895). Smoothed highpoint estimates along with the 90% highest posterior density intervals (90% HPD) are given for all introgression parameters in Table 4 and Supporting Information Figure S1. Although none of the introgression parameter estimates were significantly different by this criterion, as in the cline shape estimates, the variation in locus-specific introgression rates was considerable (Table 4 and Supporting Information Fig. S1). Introgression from P. amoena into P. cyanea estimated from the MYO2 data was more than 100-fold greater than mitochondrial introgression in the same direction (Table 4 and Supporting Information Fig. S1). No pairwise comparisons of locus-specific introgression estimates from P. cyanea into P. amoena differed by more than a factor of seven (Table 4).
For all appropriate pairwise comparisons, IM tallies which parameter is larger during each recorded step (every 10 steps in the chain), and this tally can also be used to assess differences in parameter estimates. For any given locus the proportion of steps in which the estimate of introgression from P. cyanea into P. amoena is greater than in the opposite direction is indicative of asymmetric introgression at that locus; the opposite pattern is also possible. Using this criterion, mitochondrial introgression from P. cyanea into P. amoena is significantly greater than mitochondrial introgression from P. amoena into P. cyanea (proportion of recorded steps in which the introgression estimate into P. amoena was larger than in the opposite direction: 0.979). Although not significant at α = 0.05, introgression estimates at two other loci were asymmetric. Like the estimates for the mitochondrial data, introgression estimates for GADPH were greater from P. cyanea into P. amoena (proportion: 0.946). Introgression of ALDOB3 alleles was also asymmetric, but in the opposite direction. In most steps the estimate from P. amoena into P. cyanea was larger than from P. cyanea into P. amoena (proportion: 0.902). In all other locus-specific comparisons of asymmetric introgression estimates, the proportion of steps in which introgression in one direction was larger than in the other direction was never greater than 0.643.
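The tally described above amounts to a paired comparison over recorded steps of the chain. A minimal sketch follows, assuming the recorded values of the two directional rates are available as two equal-length lists. The function name and the numbers are hypothetical placeholders, not IM output.

```python
def proportion_larger(rates_into_amoena, rates_into_cyanea):
    """Fraction of recorded steps in which the first rate exceeds the second."""
    if len(rates_into_amoena) != len(rates_into_cyanea):
        raise ValueError("both series must cover the same recorded steps")
    larger = sum(a > b for a, b in zip(rates_into_amoena, rates_into_cyanea))
    return larger / len(rates_into_amoena)

# Hypothetical recorded values for one locus (five recorded steps).
into_amoena = [0.9, 1.4, 0.7, 2.1, 1.8]
into_cyanea = [0.2, 0.3, 0.9, 0.5, 0.1]
print(proportion_larger(into_amoena, into_cyanea))
```

A proportion near 0.5 indicates no directional signal, whereas values approaching 1 (or 0) indicate that one direction dominates across the posterior sample.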
To compare the coalescent-based introgression estimates with the average mitochondrial, autosomal, and z-linked cline widths, we calculated a mean introgression rate for each class of genetic marker (mitochondrial, autosomal, z-linked) from the IM parameter estimates. These means were calculated as the sum of all smoothed highpoint introgression parameter estimates within a marker class, divided by the number of parameters. The average autosomal introgression estimate was the largest (0.63 genes per generation, scaled to the mutation rate μ), followed by the z-linked estimate (0.59), and the mitochondrial estimate (0.27).
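The class means described above are simple arithmetic averages, sketched below. The function name is hypothetical and the input values are placeholders rather than the estimates in Table 4; each list is assumed to hold every smoothed highpoint introgression estimate (both directions, all loci) for one marker class.

```python
from typing import Dict, List


def class_means(estimates: Dict[str, List[float]]) -> Dict[str, float]:
    """Mean introgression estimate per marker class.

    Each mean is the sum of the estimates in a class divided by the
    number of estimates in that class.
    """
    return {marker_class: sum(values) / len(values)
            for marker_class, values in estimates.items()}


# Placeholder values for illustration only (not the Table 4 estimates).
example = {
    "mitochondrial": [0.1, 0.5],
    "autosomal": [0.2, 0.9, 0.4, 1.1],
    "z-linked": [0.3, 0.7],
}
print(class_means(example))
```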
Using DNA sequence data from a suite of mitochondrial, autosomal, and z-linked loci, we conducted the first detailed genetic characterization of the Passerina bunting hybrid zone. Our results demonstrate that, on average, mitochondrial and z-linked markers exhibit reduced introgression relative to autosomal markers (Fig. 2 and Table 3). This finding is consistent with theoretical predictions of Haldane's rule, stemming from the dominance theory, as well as with empirical results in other systems, but is the first of its kind in an avian system. Within each marker class (i.e., mitochondrial, autosomal, and z-linked), we found considerable among-locus variance in introgression estimates (Fig. 2 and Table 3). Because patterns of differential introgression are due to the amount of genetic drift and the strength of selection acting on each locus, our results suggest that many more independently segregating markers than were used here are required (1) to adequately quantify mean levels of introgression across the genome and (2) to help disentangle the relative roles of genetic drift and natural selection on each locus.
We also found some support for the hypothesis that mitochondrial haplotypes should introgress less than z-linked alleles. The coalescent-based estimate of mitochondrial introgression was lower (0.27 genes per generation, scaled to the mutation rate μ) than for z-linked loci (0.59). The estimates of cline width suggest that mitochondrial haplotypes (223 km) introgressed less than z-linked alleles (309 km), but the difference was not significant. To our knowledge, there has been only one other attempt to systematically compare introgression patterns of mitochondrial, autosomal, and z-linked loci between taxa with a ZW sex-determination system. Cianchi et al. (2003) found less introgression of mitochondrial and sex-linked loci than of autosomal loci across a hybrid zone between the swallowtail butterflies Papilio machaon and P. hospiton, but they did not statistically test the significance of the reductions.
Other hybrid zone studies between taxa with heterogametic females, particularly between Lepidopteran species, have found reductions in mitochondrial introgression relative to autosomal alleles (Hagen and Scriber 1989; Jiggins et al. 1997; Dasmahapatra et al. 2002; Kronforst et al. 2006), and a few have reported lower introgression of sex-linked alleles than autosomal alleles (Hagen 1990; Putnam et al. 2007). Although these studies were not designed to explicitly test the different hypotheses for the genetic mechanism of Haldane's rule, they do, along with our study, strongly support the predictions of the dominance theory and are also consistent with the dominance-requiring faster-x theory (Charlesworth et al. 1987; Orr 1997). As noted above, the faster-male theory (Wu and Davis 1993; Wu et al. 1996) cannot explain Haldane's rule in Lepidopteran and avian taxa. Thus it is clear that the dominance theory of Haldane's rule plays an important role in causing the fitness reductions seen in hybrids between taxa with heterogametic females.
MAINTENANCE OF PASSERINA HYBRID ZONE
Multiple lines of evidence suggest that the narrow clines (cline analysis) and low migration rates (coalescent analysis) observed across most markers sampled in this study cannot be explained solely by neutral diffusion following secondary contact. In the absence of selection, the neutral diffusion model (Endler 1977; Barton and Gale 1993) can be used to approximate the expected width (w) of a cline using estimates of root-mean-square (RMS) dispersal distance (σ) and time (t), in generations, since contact (w = 2.51σ√t; Barton and Gale 1993). In Passerina buntings reliable estimates of both RMS dispersal and time since contact are unavailable, but we can place limits on these variables to explore how the hybrid zone might appear under the neutral diffusion hypothesis.
Representative estimates of RMS dispersal in other North American passerines range from ∼30 km (Dendroica occidentalis and D. townsendi; Rohwer and Wood 1998) to ∼150 km (Catharus ustulatus; Ruegg 2008). Although based on only two individuals, recapture data of P. cyanea banded as nestlings suggest that the dispersal distance from natal sites can be quite large (52 and 350 km; Payne 1992). Given the lack of any RMS dispersal estimates in Passerina buntings, we use the estimates from other species as a rough guide. Individuals of P. amoena were reported to be breeding in eastern South Dakota (near population 22) in 1888 (Sibley and Short 1959), which indicates that the two species have likely been in contact for a minimum of 120 generations (assuming generation time = 1 year). Using conservative limits of 30 km for RMS dispersal and 120 years for time since contact, the neutral diffusion model predicts the Passerina hybrid zone should be ∼830 km wide. The widths of all clines were narrower (range: 165–523 km) than even the most conservative estimate provided by the neutral diffusion model (∼830 km); thus selection appears to be maintaining the width of the hybrid zone, at least at some loci.
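The neutral-diffusion expectation can be checked with a few lines of arithmetic. The sketch below assumes the commonly used approximation w ≈ 2.51σ√t (the constant is roughly √(2π)), which reproduces the ∼830 km figure to within rounding; the function name is hypothetical.

```python
import math


def neutral_cline_width(sigma_km: float, generations: float) -> float:
    """Expected cline width (km) under neutral diffusion: w = 2.51 * sigma * sqrt(t).

    sigma_km    -- RMS dispersal distance per generation, in km
    generations -- time since secondary contact, in generations
    """
    return 2.51 * sigma_km * math.sqrt(generations)


# Lower and upper RMS dispersal limits taken from the text (30 and 150 km),
# with 120 generations since contact.
for sigma in (30.0, 150.0):
    width = neutral_cline_width(sigma, 120)
    print(f"sigma = {sigma:.0f} km -> expected width of about {width:.0f} km")
```

Because the expected width scales linearly with σ and with the square root of t, larger dispersal distances or an earlier date of contact only widen the neutral expectation, so the 30 km, 120 generation case gives the narrowest neutral cline considered here.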
Because neutral diffusion is an unlikely explanation for the introgression patterns, alternative explanations for the maintenance of the Passerina hybrid zone are required. The general geographic location of the hybrid zone coincides with the Great Plains–Rocky Mountain ecotone (Swenson and Howard 2004, 2005), raising the possibility that the bounded-superiority hypothesis (Moore 1977), wherein hybrid individuals have increased fitness over parental types in the intermediate environment occurring at the transition between two ecosystems, may explain the Passerina hybrid zone maintenance. However, ecological niche modeling suggests that the fundamental niche of the Passerina hybrid zone is much wider (∼750 km from eastern Wyoming to western Iowa; Swenson 2006) than estimated in this study. If hybrids are indeed more fit than parentals in transitional habitats, the hybrid zone should cover the entire area of transitional habitat, but that is not the case. Additionally, numerous “pure” parentals of both species are found within the hybrid zone (Baker and Boylan 1999), which is in the area of transitional habitat (Swenson 2006), and field studies conducted in the hybrid zone found that pairings involving at least one hybrid individual fledge fewer young than pairings between nonhybrid individuals (Baker and Boylan 1999).
The available data, from our genetic work, field studies of reproductive success in hybrids, and ecological niche modeling, all point to selection against hybridization playing a prominent role in the maintenance of the narrow Passerina hybrid zone. What the selection pressure is remains unknown, but previous studies suggest a number of possibilities. One interesting postmating hypothesis implicates differences in the molt schedules of P. amoena and P. cyanea (Rohwer and Johnson 1992; Voelker and Rohwer 1998). Passerina amoena individuals begin their prebasic molt before leaving breeding areas in August and migrate to a molting “hot spot” in southern Arizona and New Mexico or in southern Baja California (Greene et al. 1996). They generally stay at these molting “hot spots” for one month while they complete the prebasic molt before continuing on to their wintering grounds in western Mexico. In contrast, P. cyanea individuals leave the breeding areas approximately one month later, having already undergone their prebasic molt, and migrate directly to their wintering grounds in Central and northern South America (Payne 1992). It is possible, then, that hybrid individuals could undergo two prebasic molts (instead of the usual single prebasic molt), one on the breeding grounds, as seen in P. cyanea, and another sometime later, as in P. amoena. Alternatively, there may be delays in either the timing of molt or migration that jeopardize hybrid survival or future reproductive success (Helbig 1991; Ruegg 2008). Although our data cannot address the role that differences in the timing of molt and migration may play in reducing hybrid fitness, these potential problems seem unlikely to be the sole selective agent. At least some hybrid individuals do return to the breeding grounds (Sibley and Short 1959; Emlen et al. 1975; Kroodsma 1975; Baker and Boylan 1999; M. D. Carling, pers. obs.).
A premating hypothesis is that the intermediate plumages and songs of hybrid males limit their ability to acquire territories and mates. Females of both species routinely give more copulation-solicitation displays to conspecific males than to heterospecific males in experiments with captive birds (Baker 1996). Interestingly, females appear to take more cues from plumage than song, perhaps because male Passerina buntings are capable of altering their song from one year to the next (Payne 1981), which reduces the reliability of song as an indicator of a male's genetic makeup. In sympatry, pure-plumaged parentals mate assortatively but hybrid females mate randomly, which also implies sexual selection against hybrid males (Baker and Boylan 1999). Careful field studies that directly examine pre- and postmating reproductive isolating mechanisms in this system are needed.
COMPARING CLINEFIT AND IM RESULTS
In general, the cline-based and coalescent-based analytical methods produced the same results—introgression of mitochondrial and z-linked markers was less, on average, than that of autosomal markers. The differences seen can likely be attributed to the differences in what the two techniques are estimating. ClineFit describes the sigmoidal change in allele frequencies across the geographic transect. Alleles found on the “wrong” side of the hybrid zone (e.g., a P. cyanea GADPH allele found in a P. amoena individual in population 7) extend the width of the hybrid zone. As such, there is a chance that if alternate alleles are not fixed at each end of the transect, cline widths may be overestimated because the presence of the “wrong” allele is assumed to be introgression when it may in fact be retained ancestral polymorphism. This may explain the difference between the wide MC1R cline (Table 3) and the fairly limited rates of introgression of MC1R estimated using IM (Table 4 and Supporting Information Fig. S1). The presence of P. cyanea alleles on the west end of the transect resulted in a wide cline, but IM likely attributed the presence of those alleles to the retention of ancestral polymorphisms that have not yet completely sorted.
The geographic location of the samples can also influence cline-based and coalescent-based inferences of introgression differently. No geographic information is provided as input for IM analyses, with individual DNA sequences simply classified as belonging to one of two populations. Therefore, differences like those seen for GADPH, which had the narrowest cline of any single locus dataset (Table 3), but the second highest coalescent-based estimate of introgression (Table 4), might be best explained by geography. There were a relatively large number of individuals with P. amoena plumage that possessed a P. cyanea GADPH allele, and vice-versa, leading to the high IM estimates of introgression, but most of those individuals were clustered in the contact zone, leading to the narrow cline width. For some loci, the two methods are quite consistent. For example, the mitochondrial cline was relatively narrow and IM estimates of introgression were low, because few individuals possessed the “wrong” mitochondrial haplotype for their plumage type, but those that did were contained within a narrow geographic region.
Despite the potential for inconsistencies, combining cline-based and coalescent-based methods offers a great deal of promise for investigating the role of post-divergence introgression in the formation and maintenance of species. Certainly, cline-based analyses are more limited in their ability to separate introgression from the retention of ancestral polymorphisms and cannot be used to estimate divergence times of loci with interesting introgression patterns. Although coalescent-based methods are continually improving (Putnam et al. 2007; Rosenblum et al. 2007), they are not precisely designed to explore the impact of introgression in a geographical setting. That said, it seems possible that researchers could set up a geographically structured experimental design to take advantage of the benefits of coalescent-based methods. By hierarchically analyzing two populations at increasing distances from each other, levels of introgression across a landscape may be estimated using current coalescent-based methods.
In Ficedula flycatchers, phylogenetic analyses of mitochondrial, autosomal, and z-linked loci suggest sex chromosomes play an important role in reproductive isolation between Ficedula albicollis and F. hypoleuca (Saetre et al. 2003), two species that hybridize where their breeding ranges overlap in Europe (Saetre et al. 1997, 1999, 2001, 2003; Veen et al. 2001). Although introgression patterns of z-linked loci along a geographic sampling transect have yet to be explored in the Ficedula hybrid zone, that research is the only other study, to our knowledge, to have used z-linked loci to investigate the genetics of reproductive isolation between hybridizing avian species. As the availability of genetic and genomic resources continues to grow (Backstrom et al. 2008), it is likely that more researchers will investigate introgression patterns of z-linked loci across avian hybrid zones. Such studies will be critical to comparative investigations of avian speciation, and will help address whether genes of the same functional class are routinely involved in reproductive isolation.
Associate Editor: M. Webster
We thank S. Birks (University of Washington Burke Museum of Natural History and Culture), A. Capparella (Illinois State University), D. Dittmann (Louisiana State University Museum of Natural Science), J. Hinshaw (University of Michigan Museum of Zoology), and M. Westberg (University of Minnesota Bell Museum) for generously providing tissue samples for this project. S. Carling and J. Maley were outstanding field assistants. C. Burney, Z. Cheviron, A. Cuervo, E. Derryberry, R. Eytan, J. Klicka, I. Lovette, J. Maley, and M. Webster all provided helpful comments on the manuscript. J. Hey and A. Porter assisted greatly with their software packages IM and ClineFit, respectively. Scientific collecting permits were granted by the Colorado Division of Wildlife, Montana Fish, Wildlife and Parks, Nebraska Game and Parks, North Dakota Game and Fish, South Dakota Game, Fish and Parks, Wyoming Game and Fish, and the United States Fish and Wildlife Service. Louisiana State University Institutional Animal Care and Use Committee Protocol: 08–025. The map data were provided by NatureServe in collaboration with R. Ridgely, J. Zook, The Nature Conservancy—Migratory Bird Program, Conservation International—CABS, World Wildlife Fund—US, and Environment Canada—WILDSPACE. This work was supported by grants from NSF (DEB-0543562, DEB-0808464, and DBI-0400797) to RTB, as well as by grants from the AMNH Chapman Fund, AOU, Explorer's Club, LSUMNS Birdathon and Prepathon Funds, LSU Department of Biological Sciences, and Sigma-Xi.