Variable selection pressures across heterogeneous landscapes can lead to local adaptation of populations. The extent of local adaptation depends on the interplay between natural selection and gene flow, but the nature of this relationship is complex. Gene flow can constrain local adaptation by eroding differentiation driven by natural selection, or local adaptation can itself constrain gene flow through selection against maladapted immigrants. Here we test for evidence that natural selection constrains gene flow among populations of a widespread passerine bird (Zonotrichia capensis) that are distributed along an elevational gradient in the Peruvian Andes. Using multilocus sequences and microsatellites screened in 142 individuals collected along a series of replicate transects, we found that mitochondrial gene flow was significantly reduced along elevational transects relative to latitudinal control transects. Nuclear gene flow, however, was not similarly reduced. Clines in mitochondrial haplotype frequency were strongly associated with transitions in environmental variables along the elevational transects, but this association was not observed for the nuclear markers. These results suggest that natural selection constrains mitochondrial gene flow along elevational gradients and that the mitonuclear discrepancy may be due to local adaptation of mitochondrial haplotypes.
When species are distributed across heterogeneous landscapes, divergent selection pressures can result in adaptive evolution to local environmental conditions. Gene flow among populations experiencing different selection regimes can influence local adaptation in a number of ways. If the strength of selection is weak relative to the level of gene flow, genetic variation will be homogenized among populations, preventing or retarding the attainment of local adaptive optima. When selection is strong, however, genetic differentiation can be maintained by a balance between gene flow and selection against maladapted immigrants, and local adaptation may occur in the face of gene flow. Under a simple model of migration-selection balance where an allele A is advantageous in one habitat, but deleterious in another, the specific conditions under which local adaptation may be expected can be expressed as: m/s < α/(1 − α), where m is the migration rate between habitats, s is the intensity of selection, and α is the ratio of selection coefficients set such that α ≤ 1 (Bulmer 1972; Lenormand 2002). Thus the potential for local adaptation is greatest not only when selection is strong, but also when the intensity of selection is roughly equivalent between habitats (i.e., as α approaches 1).
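The inequality above can be checked numerically. The following is a minimal sketch in Python, not part of the original analysis; the function name and the example values of m, s, and α are hypothetical and chosen only for illustration.

```python
def local_adaptation_expected(m: float, s: float, alpha: float) -> bool:
    """Return True when m/s < alpha / (1 - alpha).

    m     : migration rate between habitats
    s     : intensity of selection (must be positive)
    alpha : ratio of selection coefficients, with 0 < alpha <= 1
    """
    if s <= 0:
        raise ValueError("s must be positive")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must satisfy 0 < alpha <= 1")
    if alpha == 1:
        # The threshold alpha / (1 - alpha) diverges as alpha approaches 1,
        # so any finite m/s satisfies the condition.
        return True
    return m / s < alpha / (1 - alpha)


# Illustrative (hypothetical) parameter values.
symmetric_case = local_adaptation_expected(m=0.01, s=0.05, alpha=0.5)   # m/s = 0.2, threshold = 1
asymmetric_case = local_adaptation_expected(m=0.01, s=0.05, alpha=0.1)  # m/s = 0.2, threshold ~ 0.11
```

With the same m and s, the condition holds when selection is balanced between habitats (α = 0.5) but fails when it is strongly asymmetric (α = 0.1), which mirrors the verbal statement that local adaptation is most likely as α approaches 1.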
Given these theoretical predictions, an understanding of the interplay between natural selection and gene flow among populations in different environments is fundamental to the study of local adaptation, but disentangling their relative effects is challenging (Nosil and Crespi 2004; Garant et al. 2007; Räsänen and Hendry 2008). For example, low levels of adaptive divergence may reflect either high levels of gene flow or weak divergent selection (Moore and Hendry 2005). Thus, inverse correlations between adaptive phenotypic divergence and gene flow alone cannot be used to infer the cause of such associations (Räsänen and Hendry 2008). Inferences of the causal relationships between gene flow and adaptive divergence require tests that can separate the effects of gene flow from those of natural selection. Being able to quantify the amount of gene flow occurring among populations that occupy different environments is a critical component of such tests. If divergent natural selection constrains gene flow between populations that are adapted to different environments, then the amount of gene flow should be negatively associated with the degree of environmental contrast between populations (Räsänen and Hendry 2008).
The primary objective of this study was to examine the interaction between divergent natural selection and gene flow in the genetic differentiation of Rufous-collared Sparrow (Zonotrichia capensis) populations that occur along an extreme environmental gradient. In the Peruvian Andes, Z. capensis is distributed continuously from sea level to over 4500 m above sea level (asl) (Schulenberg et al. 2007). The cold, hypoxic conditions of high elevation habitats impose severe physiological stress on endothermic vertebrates, and populations of Z. capensis occurring at different elevations along the Pacific slope in Peru differ in physiological parameters that are likely to be adaptive. Individuals collected at 4500 m asl have significantly higher metabolic rates and reduced lower critical temperatures (the temperature at which metabolic resources must be used to maintain constant body temperature) than those collected at sea level, consistent with local adaptation to cold, high elevation habitats (Castro 1983; Castro et al. 1985; Castro and Wunder 1990).
Here, we use genetic data from population samples collected along a series of replicate transects along the elevational gradient to test two interrelated predictions of the hypothesis that divergent natural selection constrains gene flow among populations occupying different altitudinal environments. We predicted that we would find (1) reduced gene flow along elevational transects compared to latitudinal control transects, and (2) concordance and coincidence between clines in allele frequencies and environmental variables along the elevational gradient.
Our primary experimental design was composed of two elevational test transects and three latitudinal control transects on the Pacific slope of the Peruvian Andes (Fig. 1, Table 1). Along the transects, we sampled a total of 142 individuals from 11 localities (Table 2). Voucher specimens and tissue samples (Table S1) are deposited in the ornithological collections of the Louisiana State University Museum of Natural Science (Baton Rouge), the Museo de Historia Natural, Universidad Nacional Mayor de San Marcos (Lima, Peru) and the Centro de Ornitología y Biodiversidad (Lima, Peru). Elevational transects spanned nearly 4000 m in elevational gain over relatively short geographic distances (mean elevational gain = 3800 m, mean linear distance = 135.2 km), whereas the control transects spanned a much greater geographic distance over a relatively uniform elevation (mean elevational gain = 175 m, mean linear distance = 407.1 km) (Fig. 1, Table 1). The greater overall distance of the control transects allowed us to distinguish the effects of elevation from those of geographic distance on patterns of gene flow and population genetic structure. Zonotrichia capensis is an abundant species that is distributed more or less continuously along the two elevational transects (T1 and T2 in Fig. 1) (Z. A. Cheviron, pers. obs., Schulenberg et al. 2007), as well as control transect C3 that traverses the high puna grasslands of the central Andes (Z. A. Cheviron, pers. obs., Schulenberg et al. 2007). In contrast, Z. capensis is restricted to desert oases and irrigated agricultural fields along the arid coast of Peru (control transects C1 and C2), resulting in a patchy coastal distribution along these control transects. The greatest physical barriers to dispersal within the study area occur, therefore, along the coastal control transects.
Table 1. Summary of test (T) and control (C) transects. Note that population samples located at the ends of each test transect were also included in the control transects. Thus, those sites and the individuals collected there are counted twice.
Table 1 column headings: Linear distance (km); Elevational gain (m).
Table 2. Sampling locality information. Distance refers to the distance from the lowest sampling locality along an elevational transect. Climate PC1 scores are scaled from zero to one. Voucher specimen information is given in the supporting information. Note that population samples located at the ends of each test transect were also included in the control transects. Thus, those sites and the individuals collected there are counted twice.
Sequences of one mitochondrial gene (NADH subunit 3 [ND3]–384 bp), one nuclear autosomal intron (β-actin intron 3 [βact3]–370 bp), and one Z-linked intron (Aldolase B intron 6 [AldoB6]–550 bp) were collected from each individual. Total genomic DNA was extracted from pectoral muscle using DNeasy tissue extraction kits (Qiagen, Valencia, CA) or standard phenol–chloroform protocols. All polymerase chain reactions (PCRs) were performed in 25-μl volumes, with 16.9 μl of nuclease-free water (Sigma-Aldrich, St. Louis, MO), 0.1 μl AmpliTaq DNA polymerase (Applied Biosystems, Foster City, CA), 2.5 μl 1× Tris buffer with MgCl2 (Applied Biosystems), 1.5 μl dNTPs (each dNTP 50 μM), 1.5 μl (10 μM) of each primer, and 50 ng of template DNA. The thermocycling profile was as follows: an initial denaturation step at 94°C for 2 min followed by 35 cycles of a 30-sec, 94°C denaturation step, a 30-sec annealing step at a marker-specific annealing temperature (see below), and a 45-sec extension step at 72°C. A final extension step was carried out at 72°C for 5 min. We used the following locus-specific PCR primers and annealing temperatures: ND3–primers from Chesser (1999) with a 46°C annealing temperature; βact3–primers from Carling and Brumfield (2008) with a 62.5°C annealing temperature; AldoB6–primers designed for this study were ALDOB6 F 5′ AAGATCACCAGCACAACACCCTCT 3′ and ALDOB6 R 5′ AGGCTGCTGTGGAAAGACAGCTTA 3′, which were used with a 60°C annealing temperature. We ran negative controls for all PCRs and confirmed amplification by electrophoresing 2.5 μl of each amplicon on a 1.5% agarose gel. Amplicons were purified using 20% polyethylene glycol (PEG) precipitation.
All PEG-purified amplicons were sequenced in both directions in 7 μl reaction volumes, with 1.75–1.8 μl of nuclease-free water, 1.5 μl of 5× sequencing buffer (Applied Biosystems, Foster City, CA), 1 μl each of the 10 μM PCR primers, 2.5 μl of template DNA and 0.20–0.25 μl of Big Dye Terminator Cycle-Sequencing Kit version 3.1 (Applied Biosystems). Sequencing products were purified using Sephadex (G-50 fine) columns and visualized using an ABI 3100 Genetic Analyzer. Sequences from both strands were aligned using SEQUENCHER, version 4.6 (Gene Codes, Ann Arbor, MI) and differences between the strands were resolved by eye. Mitochondrial sequences were examined to confirm the absence of double peaks in chromatograms, stop codons, indels, or other anomalies that are indicative of mitochondrial pseudogenes, none of which were detected. If direct sequencing of autosomal or Z-linked loci revealed more than one heterozygous site within a sequence, we resolved haplotypes probabilistically using the program PHASE (Stephens et al. 2001; Stephens and Donnelly 2003). Most haplotypes were resolved with posterior probabilities greater than 95% for both introns (βact3–90.7%, AldoB6–90.5%), and over 95% of the haplotypes were resolved with posterior probabilities that were greater than 75%. The sex of 17 individuals (11.8%) could not be determined by visual examination of specimen gonads, so we determined the sex of these individuals molecularly using a PCR-based assay (Griffiths et al. 1998). All sequence data generated for this study were deposited in GenBank (FJ628777-FJ629178).
Individuals were genotyped at four nuclear autosomal microsatellite loci. Primers for two loci (Gf01 and Gf06) were designed for Darwin's Finches (Petren 1998) and were previously shown to be polymorphic in Z. capensis (Moore et al. 2005). The other two loci (Mme12 and Lox1) were designed for Melospiza melodia and Loxia scotica, respectively (Piertney et al. 1998; Jeffery et al. 2001).
All microsatellite PCRs were performed in 10 μl volumes, with 5.9 μl of nuclease-free water, 1 μl 10× Tris buffer without MgCl2 (Applied Biosystems), 1 μl of MgCl2 (25 mg/mL, Applied Biosystems), 0.5 μl of dNTPs (each dNTP 50 μM), 0.5 μl of each primer (10 μM), 0.1 μl AmpliTaq DNA Polymerase, and 20 ng of template DNA. Forward primers were fluorescently labeled with one of three dyes (6-FAM, HEX, or NED) allowing for multiplexing of amplicons during fragment analysis. Thermocycling conditions were as follows: an initial denaturation step at 94°C for 3 min, followed by 35 cycles of a 40-sec, 94°C denaturation step, a 40-sec annealing step at a marker-specific annealing temperature (see below), and a 40-sec extension step at 72°C. A final extension step was carried out at 72°C for 5 min. Marker-specific annealing temperatures were as follows: 50°C for Gf01 and Gf06, 54°C for Lox 1, and 60°C for Mme 12. Following amplification, PCR products were diluted 10–12 fold with nuclease-free water. Diluted amplicons (2.5 μl) were mixed with 0.15 μl of ROX 400 HD size standard and 12.5 μl of HiDi formamide (Applied Biosystems). Electrophoresis was performed using an ABI 3100 Genetic Analyzer.
Microsatellite genotypes were scored using the program GeneMapper (version 3.7, Applied Biosystems). We tested for departures from Hardy–Weinberg equilibrium (HWE) for each locus and each locality using Arlequin 3.11 (Excoffier et al. 2005). There was no evidence of departures from HWE in any of the 11 sampling localities for Mme 12, Gf06, and Lox 1. Gf01 deviated from HWE expectations at two sampling sites, but these deviations were not significant after Bonferroni correction.
Geographic structure and genetic variation
For all analyses of population genetic structure, we analyzed the three marker classes (mtDNA, introns, and microsatellites) separately to test for congruence among them. We used the program Geneland 1.0.7 (Guillot et al. 2005, 2008) to infer the number of populations (K) present in the study area and to map the spatial distributions of these populations. Geneland implements a Bayesian Markov Chain Monte Carlo (MCMC) clustering algorithm that identifies genetic discontinuities while taking into account the spatial distribution of population samples. For each estimate of K we allowed the possible number of populations to vary from 1 to 11 (the total number of sampling sites) and used the following run parameters: 2.5 × 10^5 MCMC iterations with a thinning parameter of 10. All estimates of K were repeated five times to assess convergence. The most probable value for K was identical for each replicate run for each dataset. MCMC runs were postprocessed using a burn-in of 2500 iterations to obtain posterior probabilities of population membership for each individual.
We generated summary diversity statistics for each marker and sampling locality. For the sequence-based markers (ND3, βact3, and AldoB6), we calculated nucleotide diversity (π) and haplotype diversity (h) using DnaSp version 4.10 (Rozas et al. 2003) for both the total dataset and each sampling site. For the microsatellite loci, we calculated observed (Ho) and expected (He) heterozygosity using Arlequin for both the total dataset and each sampling site. We calculated linearized Fst values following Slatkin (1995) for all pairwise comparisons of the 11 sampling localities using Arlequin. Significance of Fst values was assessed using 10,000 random permutations of individuals among sampling sites. We tested for correlations between Fst and either the linear distance between sampling sites or the elevational difference between sites using the 10 sites where more than five individuals were sampled using Mantel tests in Arlequin. Significance of the Mantel tests was assessed using 10,000 random permutations.
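The two computations described above, linearizing Fst and testing matrix correlations by permutation, can be sketched as follows. This is an illustrative Python sketch and not the Arlequin implementation: the one-tailed permutation test, the function names, and the toy site coordinates are all assumptions made for the example, and no values from this study are used.

```python
import random
from math import sqrt


def linearized_fst(fst):
    """Slatkin's linearized Fst: Fst / (1 - Fst)."""
    return fst / (1.0 - fst)


def _pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def _upper(matrix, order):
    """Upper-triangle entries of a square matrix after relabeling rows/columns."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel_test(genetic, geographic, permutations=10000, seed=1):
    """Return (r, one-tailed P) for the correlation between two distance matrices.

    Rows and columns of `geographic` are permuted jointly, which keeps the
    dependence structure of pairwise distances intact under the null.
    """
    identity = list(range(len(genetic)))
    x = _upper(genetic, identity)
    r_obs = _pearson(x, _upper(geographic, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)
        if _pearson(x, _upper(geographic, order)) >= r_obs - 1e-12:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Toy example (hypothetical sites on a line; not data from this study).
coords = [0, 1, 3, 7, 12, 20]
geo = [[abs(a - b) for b in coords] for a in coords]
gen = [[0.5 * d for d in row] for row in geo]  # perfectly distance-driven structure
r_demo, p_demo = mantel_test(gen, geo, permutations=999)
```

In the toy case the genetic matrix is an exact multiple of the geographic matrix, so the correlation is 1 and almost no permutation matches it. Real pairwise Fst matrices would be passed in the same way, once for linear distance and once for elevational difference.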
Gene flow along the test and control transects
We used the Bayesian MCMC method implemented in the program IM (Hey and Nielsen 2004) to estimate migration (gene flow) rates along each elevational and control transect. Because IM assumes no intralocus recombination, we first tested for recombination in βact3 and AldoB6 using the four-gamete test (Hudson and Kaplan 1985) as implemented in DnaSp. Recombination was detected in both markers, so we retained for the IM analyses only the largest independently segregating block of sequence for both loci (βact3–129 bp, AldoB6–132 bp). To limit the number of parameters estimated in the IM analyses, we only examined populations from the ends of each transect (sampling sites T1–1, T1–5, T2–1, T2–5, and C2–2; see Table 2). This allowed us to estimate gene flow rates along the length of each transect, while avoiding the transition zone between high and low altitude populations along the elevational transects (see Results). For each transect, we initially estimated the parameter m (rate of migration or gene flow between two populations of interest, scaled by the neutral mutation rate μ) for each of the three marker classes. In a second set of analyses, we estimated m for each locus separately. We used an inheritance scalar to adjust IM parameter estimates for the differences in effective population size among markers (ND3 = 0.25; AldoB6 = 0.75; all other markers = 1). We constrained m to be symmetric; this reduced the number of parameters estimated and simplified interpretation of patterns of gene flow along the sample transects. Allowing for asymmetric migration rates did not affect the overall results (not shown).
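The logic of the four-gamete test mentioned above can be illustrated with a short sketch. It is written in Python for illustration only and is not the DnaSp implementation; the function name and the toy haplotypes are hypothetical. Under an infinite-sites assumption, observing all four two-site combinations at a pair of biallelic sites implies at least one recombination event between them.

```python
from itertools import combinations


def four_gamete_violations(haplotypes):
    """Return pairs of site indices at which all four gametes are observed.

    `haplotypes` is a list of equal-length strings, one per phased sequence.
    Only pairs of biallelic sites are considered.
    """
    n_sites = len(haplotypes[0])
    violations = []
    for i, j in combinations(range(n_sites), 2):
        alleles_i = {h[i] for h in haplotypes}
        alleles_j = {h[j] for h in haplotypes}
        if len(alleles_i) != 2 or len(alleles_j) != 2:
            continue  # skip invariant or multiallelic sites
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            violations.append((i, j))
    return violations


# Toy phased haplotypes (hypothetical, two variable sites each).
recombinant_set = ["AC", "AT", "GC", "GT"]   # all four gametes present
compatible_set = ["AC", "AT", "GC"]          # only three gametes present
pairs_with_recombination = four_gamete_violations(recombinant_set)
pairs_without = four_gamete_violations(compatible_set)
```

A nonrecombining block for downstream analysis would then be chosen so that it contains no violating pair of sites.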
For all IM analyses, we performed initial runs with large, flat priors for each parameter (Won and Hey 2005). Based on the results of these preliminary runs, we defined upper bounds for each prior that encompassed the entire posterior distribution for each parameter estimate. These upper bounds were used to define a uniform prior distribution for each parameter. Using these priors, we performed two replicate runs for each transect comparison to assess convergence in parameter estimates. All runs were performed using a burn-in of 10^5 steps and the program was allowed to run until the effective sample size (ESS) for each parameter estimate was at least 150 (> 10^6 steps). Replicate runs were performed with identical priors but different random number seeds to ensure that results were similar among replicate runs. Parameter estimates were highly similar between replicate runs, so we only present results from the longer of the two.
Congruence between population genetic and environmental clines
Because we observed an abrupt shift in mitochondrial haplotype frequency along the elevational transects, we tested for an association between transitions in haplotype frequency and climatological attributes of the elevational transects by using cline-fitting analyses. Climate GIS layers for the study area were downloaded from the online WORLDCLIM database (http://www.worldclim.org). The BIOCLIM algorithm implemented in DIVA-GIS (Hijmans et al. 2004) was used to derive 19 climatic variables from three input parameters (mean monthly values of maximum temperature, minimum temperature, and precipitation) (Table S2). Climatic profiles for each sampling site were calculated using an interpolation procedure in DIVA-GIS. Because many of these climatic variables are likely to be correlated, we used JMP to perform a principal component analysis (PCA) on a correlation matrix of the 19 climatic variables. The first PCA axis (PC1) accounted for 69.8% of the climatic variation along the elevational transects and variables associated with temperature, precipitation, and seasonality loaded heavily on this axis (Table S2).
To assess the relationship between clines in haplotype frequency and climatic variables (PC1 scores), we used the program ClineFit (Porter et al. 1997) to estimate the centers and widths of haplotype frequency and climate clines along both elevational transects (T1 and T2). Cline center (c) represents the point along a transect where haplotype frequency (or climate PC1 score) changes most rapidly and cline width (w) represents the geographic distance over which this change occurs (Szymura and Barton 1986). For mitochondrial haplotype cline analyses, we used the frequency of the SNP that defined the high elevation haplogroup (see Results) at each sampling site along the elevational transects (Fig. 2A). We fit maximum-likelihood clines to the haplotype frequency data to estimate four parameters: c, w, and the haplotype frequencies at the ends of the cline (pmin and pmax). We used the following search parameters in ClineFit: a burn-in of 300 parameter tries per step, after which 2000 replicates were saved with 30 replicates run between saves. For climate cline analyses, we used climate PC1 scores from each sampling site that were scaled to values between 0 and 1, with the low elevation score set to zero. For the climate clines, we only estimated two parameters, c and w, using the same search parameters as in the haplotype cline analyses. We assessed cline concordance (equal cline widths) and coincidence (equal cline centers) statistically using two log-likelihood support limits (ln Lmax − 2), which are analogous to 95% confidence intervals (Edwards 1972), for the haplotype cline parameter estimates. Log-likelihood (ln L) values were not calculated for the climate clines because the likelihood equations implemented in ClineFit assume a genetic model and binomial variance.
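To make the roles of the four cline parameters concrete, the sketch below evaluates a sigmoid (tanh) cline of the kind commonly used in hybrid zone studies. That ClineFit uses exactly this parameterization is an assumption here, and the function name and the example center and width are hypothetical. In this form, the width w equals the total frequency change divided by the maximum slope, which occurs at the center c.

```python
import math


def cline_frequency(x, c, w, p_min=0.0, p_max=1.0):
    """Expected frequency at distance x along a transect under a tanh cline.

    c     : cline center (same units as x)
    w     : cline width, (p_max - p_min) / maximum slope
    p_min : frequency at the low end of the cline
    p_max : frequency at the high end of the cline
    """
    return p_min + (p_max - p_min) * 0.5 * (1.0 + math.tanh(2.0 * (x - c) / w))


# Hypothetical cline: center at 100 km, width 8 km.
center_km, width_km = 100.0, 8.0
freq_at_center = cline_frequency(center_km, center_km, width_km)       # midpoint of p_min and p_max
freq_well_below = cline_frequency(center_km - 40.0, center_km, width_km)
freq_well_above = cline_frequency(center_km + 40.0, center_km, width_km)
```

Far from the center the curve flattens toward p_min and p_max, so a narrow width relative to the transect length produces the abrupt frequency shift that the cline-fitting analysis is designed to locate.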
We assessed whether haplotype and climate cline parameter estimates were significantly different by determining if the parameter estimates for the climate cline fell within the log-likelihood support limits of the haplotype cline. For instance, if the estimate for the climate cline center estimate fell outside the log-likelihood support limits of the haplotype cline center, the centers were considered to be significantly different.
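The decision rule described above amounts to an interval check, sketched here in Python. The function name is hypothetical; the cline centers and support limits are those reported for transects T1 and T2 in the Results.

```python
def differs_significantly(climate_estimate, support_lower, support_upper):
    """True if the climate cline estimate lies outside the two log-likelihood
    support limits of the corresponding haplotype cline estimate."""
    return not (support_lower <= climate_estimate <= support_upper)


# Cline centers (km) reported in the Results.
# T1: haplotype center 97.2 km, support limits 85.5-123.8 km; climate center 84.2 km.
t1_centers_differ = differs_significantly(84.2, 85.5, 123.8)
# T2: haplotype center 101.3 km, support limits 94.6-146.3 km; climate center 95.7 km.
t2_centers_differ = differs_significantly(95.7, 94.6, 146.3)
```

The check reproduces the reported outcome: the centers differ along T1 and are coincident along T2.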
All seven loci were polymorphic and revealed varying degrees of population structure among the sampling sites (Table 3). Despite multiple attempts, not all individuals could be genotyped at each locus, so final sample sizes varied among markers (mean 134, range 113–142; Table 3). We observed a sharp transition in mitochondrial haplotype frequency along the elevational transects that was strongly associated with transitions in climatic variables (Fig. 2B, Table 4). Of the 26 unique haplotypes in the 142 individuals sampled, one (haplotype 1) was extremely abundant, occurring in 69.7% of the total individuals sampled (Fig. 2A). This haplotype was the dominant haplotype at low altitude sites, but its frequency declined with elevation along both elevational transects (Fig. 2B). Along elevational transects T1 and T2, haplotypes belonging to the high-elevation haplogroup (defined by the presence of SNP 368C/T, Fig. 2A) were most frequent at the highest elevations. SNP 368C/T is a noncoding substitution in the flanking Arginine tRNA corresponding to position 11,145 in the Gallus gallus mitochondrial genome (Desjardins and Morais 1990).
Table 3. Summary of the genetic data. For each of the sequence-based markers, nucleotide (π) and haplotype diversity (h) were calculated by pooling all of the sampled individuals. For the microsatellites, both the observed (Ho) and expected heterozygosities (He) were calculated by pooling all of the sampled individuals. Asterisks indicate deviations from Hardy–Weinberg equilibrium.
Table 4. Maximum-likelihood estimates of center (c) and width (w) for clines in mitochondrial haplotype frequencies and climatological variables along the two elevational transects. pmin is the estimated minimum frequency of high elevation haplotypes at the bottom of the elevational transects and pmax is the estimated maximum frequency of high elevation haplotypes at the top of the elevational transects. Likelihood support values (ln Lmax) and two log-likelihood support limits (ln Lmax − 2; in parentheses) are reported for the haplotype clines.
Centers for the mitochondrial haplotype and climate clines were coincident along transect T2 (mitochondrial cline center = 101.3 km [94.6–146.3 km], climate cline center = 95.7 km) suggesting that the sharpest transitions in both occur at approximately the same place along this transect (Table 4). The center of the mitochondrial cline was located a few kilometers upslope of sampling site T2–4, corresponding to an elevation of roughly 3300 m asl (Table 4). Along transect T1, the mitochondrial cline center was shifted significantly upslope of the climate cline center (mitochondrial cline center = 97.2 km [85.5–123.8 km], climate cline center = 84.2 km) (Table 4). However, the centers were only 13 km apart. The mitochondrial cline center was located a few kilometers upslope of sampling site T1–4, corresponding to an elevation of approximately 3900 m asl (Table 4). The widths of both mitochondrial clines were estimated to be narrower (mean mitochondrial cline width = 7.4 km) than the climate clines (mean climate cline width = 33.1 km), suggesting a sharper transition for mitochondrial haplotypes. However, the log-likelihood support limits for the mitochondrial cline width estimates were large and the differences were not statistically significant (Table 4).
We did not observe similar clines in allele frequencies for either nuclear intron (βact3 and AldoB6). Instead, allele frequencies varied haphazardly among sampling sites along the elevational transects. Clines of the most common microsatellite alleles at low elevation sites were also not clearly associated with climatic variables, and there was considerable variation in allele frequencies among the sampling sites for all of the microsatellite markers.
THE EFFECT OF ELEVATION ON POPULATION GENETIC STRUCTURE
For the mitochondrial dataset, the Geneland analyses revealed two distinct populations (Fig. 3). Individuals sampled at the highest elevations of both elevational transects (sampling sites T1–5 and T2–5, see Table 2) formed a distinct population cluster to the exclusion of all other individuals in the study area. Similarly, pairwise mitochondrial Fst estimates revealed significant structure among sampling sites along the elevational transects. Along the control transects C1–C3, however, Fst values were not significantly different from zero, despite the fact that the control transects were, on average, over twice as long as the elevational transects (Table S3). Mitochondrial Fst was significantly associated with elevation (correlation coefficient = 0.39, P = 0.006), but not distance (correlation coefficient = −0.09, P = 0.51; Table 5).
Table 5. Results of the Mantel tests for correlations between geographic (linear distance and elevation difference) and genetic (Fst) distance matrices. Significant correlations (P < 0.05) are in bold.
For the nuclear datasets, Geneland inferred a single population (Fig. 3) and significant Fst values occurred haphazardly along both of the elevational transects (Table S3). In contrast to the mitochondrial dataset, Fst estimates for all control transect comparisons were significantly greater than zero for the microsatellite dataset and at least one control transect comparison was significant for both introns (AldoB6–C1 and C2; βact3–C3) (Table S3). Also, in contrast to the mitochondrial dataset, microsatellite Fst tended to be strongly associated with distance, but not elevation (Table 5). For both nuclear introns, neither distance nor elevation was significantly associated with Fst (Table 5). It should be noted, however, that differences in allelic diversity among markers may have affected the power to detect these correlations.
THE EFFECT OF ELEVATION ON GENE FLOW
Mitochondrial effective gene flow was greatly reduced along the elevational transects compared to the control transects. Along both elevational transects, IM estimates of mitochondrial migration rate (m) were statistically indistinguishable from zero. In contrast, estimates of m were over 160 times greater, on average, along the control transects than the elevational transects (mcontrol:melevational = 164.5; Table 6). We did not observe a similar discrepancy for either of the pooled nuclear datasets (microsatellites–mcontrol:melevational = 0.82; introns–mcontrol:melevational = 1.29), suggesting that nuclear rates of effective gene flow are not appreciably decreased along elevational gradients.
Table 6. IM estimates of symmetrical migration rates (m) (90% highest posterior density [HPD] intervals). Estimates for loci pooled according to marker type are indicated in bold. Gf06 was invariant at the ends of two transects (T2 and C1), thus m was not estimated for this locus on these transects. Estimates of 0.005 are effectively zero.
Results were similar when the nuclear loci were analyzed separately. Ratios of m along elevational and control transects (melevational:mcontrol) for the microsatellites ranged from 0.11 (Lox1) to 1.76 (Gf01) (Table 6). For the nuclear introns, ratios ranged from 1.06 (βact3) to 1.57 (AldoB6) (Table 6). Thus, the reduction in mitochondrial migration rate along the elevational transects is far more severe than that observed for any of the nuclear markers.
The posterior probability distributions for some of the estimates of m produced a peak with a long nonzero tail (i.e., there was a plateau in the upper bounds of the posterior probability distribution) (Figs. S1–S3). This was most common for transects and loci with very large estimates of m. We interpreted this to mean that beyond a certain point, a broad range of migration rates were equally probable. In these cases, m was estimated as the point at which the posterior probability distribution began to plateau. It should be noted that this approach is conservative in that we chose the lowest estimate of m from a broad range of equally probable values.
THE ROLE OF NATURAL SELECTION IN POPULATION GENETIC STRUCTURE
Divergent selection pressures along environmental gradients can drive population differentiation by reducing gene flow among locally adapted populations. Despite the much greater linear distance of the control transects, we found that the rates of mitochondrial gene flow were over 160 times higher along the control transects than along the elevational transects (Table 6). The effect of elevation on population genetic structure was also demonstrated by the steep mitochondrial clines and by the significant correlation between mitochondrial Fst and the elevational difference between sampling sites. In addition, mitochondrial haplotype clines along both elevational transects were associated with environmental variables, suggesting they are maintained by environmentally mediated selection pressures. Because of the role of the mitochondrion in oxidative phosphorylation, the mitochondrial genome has been shown to be a target of natural selection in organisms inhabiting cold, hypoxic environments (Ehinger et al. 2002; Mishmar et al. 2003; Ruiz-Pesini et al. 2004; Fontanillas et al. 2005). Functional variation among mitochondrial haplotypes may be under strong selection at high altitude in Z. capensis.
Although steep genetic clines can also be formed by neutral diffusion of alleles following the secondary contact of differentiated populations (Haldane 1948; Endler 1977), this seems an unlikely explanation for the mitochondrial clines. All of the individuals examined in this study belong to the same subspecies, so secondary contact is not suggested by morphological differences between montane and coastal populations. Regardless, if one assumes the mitochondrial clines reflect secondary contact of montane and coastal populations, one can calculate the expected cline width under a neutral model by using the diffusion equation: w = 2.51σ√t (Barton and Gale 1993), where σ is the root mean square (RMS) dispersal distance and t is the time since secondary contact in generations. RMS dispersal distances are not available for Z. capensis, but estimates for other passerines with similar life histories can be used to explore the expected cline shapes under a neutral diffusion model. RMS dispersal distances for birds are typically estimated using either detailed studies of breeding populations or from banding recovery data. Estimates for RMS dispersal distances in eight passerine birds ranged from 0.3 km (Melospiza melodia) to 1.7 km (Thryomanes bewickii), with a mean of 1 km (Barrowclough 1980). More recent estimates of RMS dispersal distance for passerines range from 30 km (Dendroica occidentalis and D. townsendi–Rohwer and Wood 1998) to 150 km (Catharus ustulatus–Ruegg 2008). For Z. capensis, the available information suggests that an estimate of 1 km could be used to place a very conservative bound on cline shape under a neutral diffusion model.
Assuming the montane and coastal populations of Z. capensis are in secondary contact, it is unclear when that contact would have first occurred. Areas of nonglaciated habitat at the top of transect T1 have been available for approximately 20,000 years (Seltzer et al. 2002), which places bounds on the period of time that Z. capensis could have continuously occupied the length of the elevational gradient. Using a conservative RMS dispersal distance of 1 km and assuming that secondary contact occurred 10,000 generations ago (with a generation time of 1 year), the neutral diffusion model predicts a cline width of 251 km, which is nearly twice the total length of elevational transects T1 and T2. Mean dispersal distances for Z. capensis may be greater than 1 km given its large geographic distribution and the range of dispersal estimates available for passerine birds. Similarly, 10,000 years is only half of the time that nonglaciated high elevation habitats may have been available in the study area (Seltzer et al. 2002). Thus, by even conservative estimates, the mitochondrial cline widths are much narrower than would be predicted by a neutral diffusion model, which suggests that they are maintained by natural selection.
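The neutral expectation above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not part of the original analysis: the function name is hypothetical, and the only inputs are the values stated in the text (σ = 1 km, t = 10,000 generations), plus the 20,000-generation upper bound implied by the deglaciation date under a 1-year generation time.

```python
import math


def neutral_cline_width(sigma_km: float, generations: float) -> float:
    """Expected cline width (km) under neutral diffusion: w = 2.51 * sigma * sqrt(t)."""
    return 2.51 * sigma_km * math.sqrt(generations)


# Values stated in the text: sigma = 1 km, t = 10,000 generations.
w_conservative = neutral_cline_width(1.0, 10_000)
print(f"sigma = 1 km, t = 10,000: w = {w_conservative:.0f} km")

# Assumption for illustration: contact as old as the nonglaciated habitat
# (about 20,000 years, i.e. 20,000 generations at 1 year per generation).
w_older = neutral_cline_width(1.0, 20_000)
print(f"sigma = 1 km, t = 20,000: w = {w_older:.0f} km")
```

Because w scales linearly with σ and with the square root of t, any larger dispersal distance or older contact time only widens the neutral expectation, which is why the 1 km and 10,000-generation values act as a conservative bound.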
The strength of selection maintaining genetic clines can be estimated given an estimate of RMS dispersal distance and cline width (Haldane 1948; Endler 1977; Mullen and Hoekstra 2008). Under an environmental gradient model, where environmental conditions are assumed to change gradually and dispersal distances are less than the environmental transition, the strength of selection can be estimated as: s = σ²(2.4/w)³, where s is the selection coefficient, σ is the RMS dispersal distance, and w is the cline width (Endler 1977). Assuming a dispersal distance of 1 km, estimates of s are 0.048 for transect T1 and 0.026 for transect T2. Our cline analyses of climatic variables, however, demonstrate that environmental conditions change abruptly instead of gradually along the elevational transects. Under these conditions, the strength of selection can be estimated as: s = (σ/w)² (Haldane 1948; Slatkin 1973). Again assuming a dispersal distance of 1 km, estimates of s are 0.023 for T1 and 0.015 for T2. This model assumes that dispersal distance is similar to or greater than the environmental transition. Climate cline widths on transects T1 and T2 are 37.6 and 28.5 km, respectively, and although this distance is much greater than the 1 km dispersal distance assumed here, these distances are within the range of dispersal estimates for passerine birds. Given the large confidence intervals around the mitochondrial cline widths (Table 5), selection coefficients under both models should be viewed cautiously. However, they are similar in magnitude to those estimated from clines in pigmentation traits known to be under natural selection for crypsis in oldfield mice (Peromyscus polionotus) (Mullen and Hoekstra 2008), but are lower than selection coefficients estimated from hybrid zones in Bombina toads (Szymura and Barton 1986) and Heliconius butterflies (Mallet et al. 1990).
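The two selection estimators can be compared directly. The sketch below is illustrative only and uses hypothetical function names. The mitochondrial cline widths are reported in Table 5, which is not reproduced here, so the widths are back-calculated from the rounded step-model coefficients (0.023 and 0.015) by inverting s = (σ/w)² with σ = 1 km. Agreement with the reported gradient-model values (0.048 and 0.026) should therefore be expected only to within rounding error.

```python
import math


def s_gradient(sigma_km: float, width_km: float) -> float:
    """Gradient model (Endler 1977): s = sigma^2 * (2.4 / w)^3."""
    return sigma_km ** 2 * (2.4 / width_km) ** 3


def s_step(sigma_km: float, width_km: float) -> float:
    """Abrupt-transition model (Haldane 1948; Slatkin 1973): s = (sigma / w)^2."""
    return (sigma_km / width_km) ** 2


def width_from_s_step(sigma_km: float, s: float) -> float:
    """Invert the step model to recover the cline width implied by s."""
    return sigma_km / math.sqrt(s)


SIGMA_KM = 1.0  # dispersal distance assumed in the text

# Rounded step-model coefficients reported in the text.
reported_step = {"T1": 0.023, "T2": 0.015}

for transect, s in reported_step.items():
    # Back-calculated width: an approximation, not the Table 5 estimate.
    w = width_from_s_step(SIGMA_KM, s)
    print(
        f"{transect}: implied w = {w:.1f} km, "
        f"step s = {s_step(SIGMA_KM, w):.3f}, "
        f"gradient s = {s_gradient(SIGMA_KM, w):.3f}"
    )
```

For cline widths greater than about 2.4³ σ (roughly 13.8 km when σ = 1 km) the step model returns the larger coefficient, whereas for the narrower widths implied here the gradient model does, which matches the ordering of the estimates in the text.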
MITONUCLEAR DISCORDANCE AND MIGRATION-SELECTION BALANCE
One striking result of this study was the discrepancy between the mitochondrial and nuclear markers in the patterns of gene flow and genetic structure. For both nuclear sequence and microsatellite datasets, gene flow was not substantially reduced along the elevational transects compared to the control transects and there was no relationship between clines in allele frequencies and environmental variables along the elevational transects. Moreover, in contrast to the mitochondrial dataset, microsatellite Fst was strongly correlated with distance. Thus, for the nuclear datasets, the elevational gradient appears to have little effect on patterns of gene flow and population genetic structure. Instead, an isolation-by-distance model better explains nuclear population genetic structure in Z. capensis.
This mitonuclear discrepancy supports the hypothesis that natural selection maintains the mitochondrial haplotype clines, but several alternative hypotheses could explain the mitonuclear discordance. First, the effective population size (Ne) of mitochondrial markers is one quarter of that of nuclear autosomal markers (Hudson 1990). Thus, the geographic structuring of genetic variation is expected to occur, on average, more rapidly for mitochondrial than for nuclear markers. Assuming differences in Ne among marker classes explain the mitonuclear discordance, we would also expect the Z-linked locus (AldoB6) to show greater population genetic structure than the autosomal locus βact3. This was not the case. For example, (1) there was no association between shifts in AldoB6 allele frequencies and environmental variables, as was the case for other nuclear loci, (2) Fst estimates along the elevational transects were similar for both nuclear introns, and (3) AldoB6 Fst was not significantly associated with elevation or distance. The available evidence suggests that differences in Ne are unlikely to explain the mitonuclear discordance, but data from many additional nuclear loci are needed to provide a stronger test of this possibility.
Differences between mitochondrial and nuclear patterns could also reflect limited elevational dispersal of females relative to males. This seems unlikely, because females of the closely related species, Zonotrichia leucophrys, tend to disperse greater distances, on average, than males (Morton 1997), as is the case with several other passerines (Greenwood and Harvey 1982). This dispersal pattern should lead to increased, rather than reduced, mitochondrial gene flow. In this study, the levels of mitochondrial gene flow along the lengthy control transects were extremely high, presumably reflecting the ability of females to disperse long distances. It is difficult to envision a scenario in which females would disperse great distances along the control transects, but not along the elevational transects. Moreover, the disparity in mitochondrial Fst along the elevational and control transects was greater for males than females when the sexes were analyzed separately (males Fst elevational:Fst control = 13.02; females Fst elevational:Fst control = 8.76), suggesting that mtDNA patterns are not the result of reduced elevational dispersal of females relative to males. Thus, intersexual differences in natal dispersal distance also seem insufficient to explain the mitonuclear discrepancy.
A final alternative explanation is related to the analytical approaches we used. Our application of IM violates an assumption that the populations being compared are sister taxa that do not exchange genes with other unsampled populations. However, this violation should affect all loci equally and is thus unlikely to systematically bias the results toward reduced mitochondrial migration rates along the elevational transects. The IM results were also consistent with those of the cline analyses and the Mantel tests. A similar approach was used to examine differential introgression among loci across multiple species (over 6000 pairwise comparisons; Linnen and Farrell 2007).
We suggest that the discrepancy between mitochondrial and nuclear markers stems from the functional importance of the mitochondrion at high altitude. The vertebrate mitochondrial genome contains 13 protein-coding genes that are involved in oxidative phosphorylation. Differences among mitochondrial haplotypes that affect this process may be under strong selection at high altitudes, where cold, hypoxic conditions interact to create an aerobically challenging environment (Ehinger et al. 2002; Mishmar et al. 2003; Ruiz-Pesini et al. 2004; Fontanillas et al. 2005).
In the mitochondria of mammalian brown fat, the proton gradient of the electron transport chain can be dissipated as heat rather than being used for ATP production, resulting in nonshivering thermogenesis (Blank 1992). Mitochondrial haplotypes that reduce the coupling efficiency of oxidative phosphorylation and increase the efficiency of nonshivering thermogenesis have been hypothesized to be favored in cold climates (Ruiz-Pesini et al. 2004; Fontanillas et al. 2005). Nonshivering thermogenesis is known to occur in avian pectoral muscle (Bicudo et al. 2002), and it is possible that similar selection pressures exist for Z. capensis at high altitude. Comparative gene expression studies of high and low altitude populations of Z. capensis showed large-scale upregulation of genes involved in oxidative phosphorylation at high altitude, suggesting that this pathway is important in high-altitude stress compensation in this species (Cheviron et al. 2008). The sharp clines in haplotype frequency may represent local adaptation of mitochondrial haplotypes to different altitudinal environments. Because the mitochondrial genome is inherited as a single linkage group, selection could be acting on any of the 13 protein-coding genes. Although we cannot rule out the possibility that mitochondrial patterns reflect natural selection acting on loci located on the W chromosome, which is coinherited with the mitochondrion in female birds, the role of the mitochondrial genome in oxidative phosphorylation makes it a more likely candidate locus for local adaptation. It should be possible to test this hypothesis using mitochondrial genomic sequences and functional assays of high- and low-elevation mitochondrial haplotypes. Under this hypothesis, the mitonuclear discrepancy would reflect the greater movement of neutral nuclear alleles along the elevational gradients relative to the flow of locally adapted mitochondrial haplotypes.
These results are consistent with a migration-selection balance model in which natural selection maintains mitochondrial differentiation in the face of gene flow along the gradient.
Associate Editor: H. Hoekstra
We wish to thank the Instituto Nacional de Recursos Naturales (INRENA) and the Peruvian government for issuing scientific collecting permits (INRENA Permit Nos. 042-2004-INRENA-IFFS-DCB, 012C/C-2005-INRENA-IANP, 004-2007-INRENA-IFFS-DCB). P. Benham, M. Brewer, M. Carling, T. Chesser, C. Cheviron, D. Lane, and T. Valqui assisted with fieldwork. M. Carling, C. Cheviron, M. Hellberg, H. Hoekstra, J. V. Remsen, A. Whitehead, and two anonymous reviewers provided useful comments on the manuscript. This work was supported by grants from the American Ornithologists' Union (AOU), the Buffett Foundation, the Explorers Club, the LSUMNS Birdathon, the LSU Dept. Biological Sciences and Sigma Xi to ZAC, as well as grants from National Science Foundation (DEB-0543562 and DBI-400797) to RTB. Experimental procedures were in accordance with Louisiana State University IACUC standards (protocol no. 06-124). The map data were provided by NatureServe in collaboration with R. Ridgely, J. Zook, The Nature Conservancy-Migratory Bird Program, Conservation International–CABS, World Wildlife Fund–US, and Environment Canada–WILDSPACE.