A HIGH-DENSITY SCAN OF THE Z CHROMOSOME IN FICEDULA FLYCATCHERS REVEALS CANDIDATE LOCI FOR DIVERSIFYING SELECTION

Authors

  • Niclas Backström,

    1. Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Norbyvägen 18D, SE-752 36 Uppsala, Sweden
  • Johan Lindell,

    1. Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Norbyvägen 18D, SE-752 36 Uppsala, Sweden
  • Yu Zhang,

    1. Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Norbyvägen 18D, SE-752 36 Uppsala, Sweden
    2. College of Animal Science and Technology, China Agricultural University, No. 2 Yuanmingyuan Xi Lu, Haidian, Beijing 100094, China
  • Eleftheria Palkopoulou,

    1. Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Norbyvägen 18D, SE-752 36 Uppsala, Sweden
  • Anna Qvarnström,

    1. Department of Animal Ecology, Evolutionary Biology Centre, Uppsala University, Norbyvägen 18D, SE-752 36 Uppsala, Sweden
  • Glenn-Peter Sætre,

    1. Center for Ecological and Evolutionary Synthesis (CEES), Department of Biology, University of Oslo, P. O. Box 1066 Blindern, N-0316 Oslo, Norway
  • Hans Ellegren

    1. Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Norbyvägen 18D, SE-752 36 Uppsala, Sweden
    2. E-mail: Hans.Ellegren@ebc.uu.se

Abstract

Theoretical and empirical data suggest that genes located on sex chromosomes may play an important role both for sexually selected traits and for traits involved in the build-up of hybrid incompatibilities. We investigated patterns of genetic variation in 73 genes located on the Z chromosomes of two species of the flycatcher genus Ficedula, the pied flycatcher and the collared flycatcher. Sequence data were evaluated for signs of selection potentially related to genomic differentiation in these young sister species, which hybridize despite reduced fitness of hybrids. Seven loci were significantly more divergent between the two species than expected under neutrality and they also displayed reduced nucleotide diversity, consistent with having been influenced by directional selection. Two of the detected candidate regions contain genes that are associated with plumage coloration in birds. Plumage characteristics play an important role in species recognition in these flycatchers suggesting that the detected genes may have been involved in the evolution of sexual isolation between the species.

A major goal in evolutionary biology is to reveal the genetic basis of traits that affect the fitness of individuals in natural populations (Ellegren and Sheldon 2008). That knowledge is essential for addressing a number of general problems in ecology and evolution, including the fitness distribution of new mutations (Eyre-Walker and Keightley 2007), the relative effects of mutation, selection, and drift on genetic and phenotypic diversity (Mitchell-Olds et al. 2007), and the relative importance of changes in gene expression and protein structure for phenotypic evolution (Hoekstra and Coyne 2007; Carroll 2008). Given that directional selection is an influential force in driving speciation (Schluter 2009; Schluter and Conte 2009), understanding which genes cause differences in reproductive success among individuals within specific populations should also provide insight into the genetics of differentiation and the build-up of reproductive isolation between populations (Coyne and Orr 2004; Gavrilets 2004; Price 2007). Recently, several examples of the identification of genes, regulatory sequences, or genomic regions that underlie critical phenotypes in “ecological model species” have started to appear (e.g., Abzhanov et al. 2004; Colosimo et al. 2005; Hoekstra et al. 2006; Prud’homme et al. 2006; Linnen et al. 2009). Additionally, there are a few cases in which the genes governing hybrid inviability and sterility have been successfully identified (Coyne and Orr 2004; Noor and Feder 2006; Orr et al. 2007; Phadnis and Orr 2009). However, this progress is mainly restricted to the identification of genes involved in intrinsic postzygotic isolation in model species. Finding causative loci associated with adaptation and/or speciation remains a grand challenge for studies of natural populations. For practical reasons, candidate gene approaches, or other means of limiting the search for causative loci to particular regions of the genome, are therefore desirable where possible.

A traditional view holds that many genes with small additive effects are important in generating differentiation between populations (Fisher 1930). According to this way of reasoning, speciation should proceed through a gradual build-up of genetic incompatibilities between the genomes of diverging populations (e.g., Dobzhansky 1936; Muller 1942). Somewhat contrary to this view, recent empirical data suggest that a few genes with large and epistatic effects can play an important role in the diversification process (Coyne and Orr 2004; Orr et al. 2004). Yet another possibility is that incompatibilities are built up at different rates in different regions of the genome. The rate of gene flow for loci that evolve under the influence of divergent selection, for example, genes involved in local adaptation, is likely to be reduced compared to the rate at neutrally evolving regions (Smadja et al. 2008). Hence, if divergent populations develop isolating mechanisms that reduce levels of introgression (Qvarnström and Bailey 2009), interspecific recombination around loci experiencing divergent selection will be strongly reduced. This can result in large genomic blocks of elevated between-population divergence, a phenomenon termed “divergence hitchhiking” (Payseur et al. 2004; Harr 2006; Rogers and Bernatchez 2007; Via and West 2008). The size of the regions affected by hitchhiking will depend on the rate of migration and the effective population size of each of the incipient species, and can be extensive, especially if multiple loci are involved (Feder and Nosil 2010).

Studies on the genetic basis of reproductive isolation suggest that sex chromosomes play a large role in causing low fitness in hybrids (Qvarnström and Bailey 2009). The Drosophila X chromosome, for example, shows a disproportionately large effect in genetic analyses of hybrid sterility, an observation known as the “large X-effect” (Coyne 1992; Presgraves 2008). On a similar note, the avian Z chromosome (in birds, males are ZZ and females are ZW) has been shown to contain an overrepresentation of loci subject to adaptive evolution (Ellegren 2009). In addition, recent work in Ficedula flycatchers has established that species-specific male plumage traits, female mating preferences, and genes causing low hybrid fitness are all linked to the Z chromosome (Sætre et al. 2003; Sæther et al. 2007). Tight genetic coupling of loci affecting both prezygotic isolation and postzygotic isolation would allow differentiation to proceed despite some gene flow and could facilitate genetic differentiation via reinforcement (Servedio and Sætre 2003). Together, these data suggest that a focus on sex chromosomes in the search for genomic regions involved in divergent selection and speciation is warranted.

We have previously shown that avian sex-linked (Z-chromosomal) genes display an accelerated rate of sequence evolution compared to autosomal genes (Mank et al. 2007), a “fast-Z effect” analogous to the well-known “fast-X” phenomenon described for XY systems (Charlesworth et al. 1987). The tendency for fast evolution of sex-linked genes thus seems independent of the form of heterogamety. Although it has been suggested that “fast-X” is due to the exposure of recessive beneficial mutations in the heterogametic sex, and thereby a consequence of selection, we recently showed that the pattern of polymorphism and divergence in avian sequence data is also consistent with a neutral explanation for the “fast-Z” effect (Mank et al. 2010).

Birds are important model organisms for understanding the speciation process (Price 2007). Yet, it has been difficult to study the genetics of adaptation and population divergence in birds based on large-scale genetic marker analysis due to a shortage of genomic resources. There is now a draft genome sequence available for the chicken Gallus gallus (ICGSC 2004) and this has recently also become the case for the zebra finch Taeniopygia guttata (Warren et al. 2010). Moreover, we have developed a platform for genomic analyses of nonmodel bird species by the establishment of a conserved gene-based marker set evenly spread across the avian genome (Backström et al. 2008a). We have now extended this approach by developing a high-density, gene-based marker resource for the Z chromosome and apply these markers in a “genome scan” of two closely related sister species of the Old World flycatcher genus Ficedula, the pied flycatcher (F. hypoleuca) and the collared flycatcher (F. albicollis). A major advantage of this study system is that questions concerning all three major sources of reproductive isolation (i.e., ecological divergence, sexual isolation, and genetic incompatibilities) can be addressed (Qvarnström et al. 2010; Sætre and Sæther 2010). The main objective of our study was to investigate patterns of diversification along the Z chromosome of pied flycatcher and collared flycatcher, and to search for loci evolving under directional selection. We find evidence for several regions of elevated between-species divergence and reduced within-species nucleotide diversity, and discuss their relevance to adaptive population divergence.

Methods

STUDY SPECIES

The pied flycatcher and the collared flycatcher have a mitochondrial DNA divergence of approximately 3% and are estimated to have diverged from each other during the last one-and-a-half to two million years (Sætre et al. 2001). The distribution range of the pied flycatcher encompasses large areas of western and northern Eurasia, whereas the collared flycatcher occupies regions of central and eastern Europe. Their common ancestor should have been affected repeatedly by the cyclic climate conditions of the Pleistocene, and it has been suggested that the pied flycatcher was largely restricted to refugia on the Iberian Peninsula during glacial periods, whereas the collared flycatcher was restricted to a refugium on the Apennine Peninsula (Sætre et al. 2001). The two species currently live sympatrically in central and eastern Europe, and on the Baltic Sea islands of Öland and Gotland. Hybridization occurs in these regions (Sætre et al. 1999, 2001; Veen et al. 2001), but females prefer males of their own species as mates (Sætre et al. 1997; Sæther et al. 2007), and both song (Qvarnström et al. 2006) and plumage (Wiley et al. 2005) play important roles in species recognition. Female hybrids are sterile, whereas male hybrids are fertile but have severely reduced fitness (Svedin et al. 2008). The reduction in fitness remains for several backcrossed generations (Wiley et al. 2009).

SAMPLING AND DNA EXTRACTION

Blood samples from male pied and collared flycatchers were collected from four different populations representing both allopatric and sympatric locations. Allopatric individuals were sampled in Lingen, Germany (pied flycatcher, n= 10) and near Budapest, Hungary (collared flycatcher, n= 10). Sympatric individuals were collected on Öland, Sweden (pied flycatcher, n= 12 and collared flycatcher, n= 9). One red-breasted flycatcher (F. parva), collected in the Jeseník Mts, Czech Republic, was included as an outgroup. DNA was extracted by incubation with proteinase K (0.05 mg/mL final concentration) in Laird's buffer at 37°C overnight, after which DNA was purified by two rounds of phenol-chloroform extraction and one round of pure chloroform treatment, followed by precipitation with cold 96% ethanol and NaAc. DNA was rinsed once with cold 70% EtOH, dried, and dissolved in ddH2O.

MARKER DEVELOPMENT

We have previously shown that chicken and collared flycatcher Z chromosomes are essentially syntenic and largely colinear (Backström et al. 2006). The Z chromosome is 75 Mb in the most recent version of the chicken genome assembly (http://www.ncbi.nlm.nih.gov/projects/genome/guide/chicken/) and our aim was to have Z-linked flycatcher markers corresponding to a density of one marker per Mb along the homologous chicken Z chromosome. We used 23 previously published Z-linked markers (Backström et al. 2006) and developed additional markers by aligning chicken exon sequences from target locations (i.e., every Mb) to the draft assembly of the zebra finch genome (http://www.ncbi.nlm.nih.gov/genome/seq/BlastGen.cgi?taxid=59729). PCR primers were designed to anneal to conserved regions of exons (i.e., showing a high degree of sequence identity between chicken and zebra finch) flanking introns 500–1000 bp in size. Following optimization of PCR and sequencing conditions using flycatcher DNA, the final set of markers consisted of 74 introns from 73 different genes, which in chicken are evenly spread along the entire Z chromosome (Fig. 1, Table 1).

Figure 1.

Position of loci along the Z chromosome according to the chicken physical map. Gene names are given to the left and physical positions (Mb) to the right. The bracketed interval has been affected by chromosomal rearrangements so the gene order is different in flycatchers (Backström et al. 2006). Gene names starting with UNKN are uncharacterized genes that lack a name.

Table 1.  Summary of diversity and divergence estimates. Pos = position on the chicken Z chromosome (Mb), Gene = gene name, Length = length of aligned intronic sequences, Fix = number of fixed differences between species, Share = number of shared polymorphisms between species, Fhpol = number of polymorphisms segregating in the pied flycatcher only, Fapol = number of polymorphisms segregating in the collared flycatcher only, S = number of segregating sites, π = nucleotide diversity, FST = level of differentiation (Weir and Cockerham's FST), Fh = pied flycatcher and Fa = collared flycatcher, * = loci included in the subsequent zoom-in analysis. UNKN = uncharacterized gene with no known name.
Pos  Gene  Length  Fix  Share  Fhpol  Fapol  Fh S  Fh π  Fa S  Fa π  FST
 0.25NARS227003130.001010.00030.0087
 0.70MALT3900551590.0053180.01160.0509
01.32MADH2403105450.001640.00280.5163
01.76LOXHD174800109100.003290.00450.0911
03.79PI3K870007860.001180.00200.2971
03.82PI3K435312630.000770.00480.4886
07.07KIF24460041050.001640.00350.5098
07.41UNKN1206003530.003450.00900.1308
07.93FANCG360021330.002350.00140.0320
08.46KIAA0258372011420.001250.00250.4859
09.08MTMR12*606103550.000370.00310.5263
09.47TARS737014750.001580.00230.3056
09.72SLC45A2*100902126140.002880.00170.6153
09.75UNKN9*671107170.001810.00010.6529
09.88RAI14*18000020020.00110.0360
09.92BRIX* 95002220.003720.00310.6567
09.93DNAJA5*34000020020.00050.0360
09.95AGT2798005450.000640.00150.6652
10.29LMBR1*200010110.000820.00210.6991
10.35UPF0465*791202620.001360.00110.4392
10.36UNKN10*387001110.001410.00030.4406
10.39RANBP3L*332010210.000110.00180.2361
10.78NIPBL*653005550.002450.00220.4121
10.86UNKN11*552105250.001720.00110.5250
10.88UNKN12*295001410.000240.00210.1431
10.91NUP155*6340101110.0001120.00420.4952
10.93NUP155327403230.001320.00150.6514
11.36EGF574010010.000210.00260.6017
11.38EGFLAM*374008380.004930.00120.2891
11.40LIFR*632002420.000240.00210.5715
11.75DOC2627001210.000120.00090.3111
12.31PRKAA1838000200.000220.00050.7227
13.16NNT7551031130.0018110.00420.2427
14.32PARP8879000010.003370.01170.3982
15.12VLA1693005550.001850.00210.2957
16.88GPBP147704712110.0050160.01160.0917
18.88IPO1178403711100.0027140.00550.1364
19.91ADAMTS6418001310.000630.00390.4945
19.98PPWD15780131240.0003130.00740.5472
21.31ZTL1361015260.003630.00440.1552
22.93IQGAP2342003530.000350.00590.5623
23.07SV2C354003430.000640.00260.1012
23.68LNR42360117480.003550.00330.2735
25.91MIP652020220.000340.00360.2795
26.59DOCK8678032450.003070.00250.2883
27.48RFX3575205950.001690.00570.2836
28.57GLDC415211420.001750.00600.3598
29.41PTPRD705027590.002970.00290.2432
30.56TYRP1531113030.002700.00000.6704
31.46FRAS1569007770.003870.00390.3066
31.79UNKN2596002420.000440.00300.2379
33.21ADAMTSL3624013330.000830.00060.6858
33.54ASAH3L576017480.004150.00170.1890
34.37UNKN3632006460.001940.00240.4121
34.77MLSN2519003630.002360.00360.0965
35.44ZFAND5383002320.000330.00360.5948
36.32TRPM67342031430.0006140.00510.3634
37.20VPS13A599015460.001550.00350.3886
37.54PSAT991001510.000850.00650.4517
38.13TLE4644104640.001060.00190.4877
39.20RASEF89802129140.0043110.00180.2385
40.25MAK10967105850.000780.00260.4226
42.64SPINZ959120320.002450.00180.2887
43.33UNKN4429002620.001060.00450.2981
45.20APC565006360.001530.00110.0813
47.43EF5407003130.002210.00140.4326
48.90HISPPD1374010410.002350.00250.0443
50.19CHD1Z557103030.000900.00000.7636
50.99UDP4520031030.0003100.00390.4958
52.37GAF611008470.002140.00190.1812
52.84NRG1506004240.001420.00130.5358
53.60ABCA1231015260.011330.00390.1800
56.59MCTP1356002020.001710.00020.2142
58.48GPR98419213240.000630.00120.7359
59.65CCNH354002120.002610.00100.0897
59.70RASGAP731307070.000800.00000.7851
61.31UNKN58970061360.0014130.00370.1439
62.03RPS23262014050.003910.00070.3089
62.74UNKN6377005550.002150.00550.2921
64.41SNX306510071470.0030140.00750.2294
65.03SMC2433001310.000230.00110.1230
66.80PLAA433101210.001020.00040.6994
67.01UNKN7684002720.000370.0012−0.0144
68.68SMU16292001120.0009110.00240.2993
69.74DMXL1395103430.000540.00180.6323
71.23MCCC2536305250.002520.00050.5185
72.31ALDH7A1613005450.003540.00250.3386
73.07ZNF608525015260.001430.00130.3847
73.86UNKN8227000100.000010.00200.6571
74.59MELK274100400.000040.00320.8738

In the absence of a flycatcher genome assembly, or a detailed physical or genetic map, we use the physical location of markers on the chicken Z chromosome when graphically showing the chromosomal distribution of estimates of flycatcher population genetic parameters. It should be noted, however, that the gene order differs between collared flycatcher and chicken Z chromosomes in a region spanning positions 33.5 Mb to 53.6 Mb in the most recent assembly of the chicken genome (http://www.ncbi.nlm.nih.gov/projects/genome/guide/chicken/) (Backström et al. 2006). Several inversions appear to have occurred in this region, and it is difficult to resolve the precise chromosomal homologies from sparse marker maps (Backström et al. 2006).

MOLECULAR METHODS

All PCRs were conducted in 15–20 μl reactions with approximately 50 ng of template DNA, 0.20 μM of each primer, 50 μM dNTP, 0.025 U TaqGold polymerase (Applied Biosystems, Carlsbad, CA), and 2.5 mM MgCl2. The general temperature profile was an initial 5-min activation step at 95°C followed by 40 cycles of 30 s denaturation at 95°C, 40 s annealing at an optimized locus-specific annealing temperature and 45–60 s elongation at 72°C. PCR-products were cleaned with ExoSAP (USB Corp., Cleveland, OH) and 2 μl of purified PCR-product were used in sequencing reactions using BigDye cycle sequencing terminator chemistry and temperature profile settings according to the manufacturer's recommendations (Applied Biosystems, Carlsbad, CA). Sequencing reactions were cleaned with the XTerminator System following the manufacturer's recommendations and sequencing was performed on an ABI3730xl DNA Analyzer (Applied Biosystems, Carlsbad, CA). All sequences included in the analyses have been deposited in GenBank under accession numbers HQ207834-HQ211489 (Table S1).

DATA ANALYSIS

Raw sequence data were edited in Sequencher 4.7 (Gene Codes Corp., Ann Arbor, MI) and locus-specific contigs were created and used for polymorphism screening. Sequences from individual birds were exported and realigned using ClustalW as implemented in MEGA version 4.0 (Tamura et al. 2007), and MEGA was also used to estimate the nucleotide diversity (π). Single nucleotide polymorphisms in individual bird sequences (diploid) were randomly assigned to single chromosome sequences (haploid) using DAMBE version 5.0.23 (Xia and Xie 2001) and intraintronic haplotypes were estimated for each individual by applying the Bayesian ELB algorithm as implemented in Arlequin version 3.11 (Excoffier et al. 2005). Point estimates of FST-values were calculated between species for allopatric, sympatric, and combined populations, and between allopatric and sympatric populations within species, using the method of Weir and Cockerham (1984) as implemented in fdist2 (Beaumont and Nichols 1996). Intraspecific estimates of FST were low (mean = 0.027 and 0.047 for the pied and the collared flycatcher, respectively) and we therefore restrict the figures and tables mainly to the interspecific FST-values for allopatric and sympatric populations combined. The number of alleles and the expected heterozygosity for different loci were estimated in FSTAT version 2.9.3.2 (Goudet 1995).
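As an illustration of the diversity statistic used throughout, nucleotide diversity (π) can be sketched as the average number of pairwise differences per site among aligned haploid sequences. The function name and the toy alignment below are hypothetical, and the sketch assumes gap-free, equal-length sequences; it is not the MEGA implementation, which offers several options for handling gaps and missing data.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site (pi) for aligned sequences.

    Assumes all sequences have the same length and contain no gaps
    (an illustrative simplification, not MEGA's gap handling).
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned and non-empty")

    pairs = list(combinations(seqs, 2))
    # Count mismatching sites for every pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


# Hypothetical toy alignment: three haploid sequences of four sites.
toy = ["ACGT", "ACGA", "ACGT"]
pi = nucleotide_diversity(toy)
```

For the toy alignment, two of the three pairs differ at one site, so π is 2 differences over 3 pairs × 4 sites.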

We used the Bayesian method implemented in BAYESFST (Beaumont and Balding 2004) to identify loci in which FST-estimates were indicative of positive selection. The method uses a Markov chain Monte Carlo approach to estimate locus, population, and locus-by-population effects. The posterior distribution of the locus effect is used to identify whether any of the empirical FST-values show signs of positive or balancing selection. The method appears to be relatively robust to variation in mutation rates and less sensitive to different demographic histories of the included populations (Beaumont and Balding 2004) than the fdist2 method (Beaumont and Nichols 1996). BAYESFST was run with default settings for all individuals within each species, and also for allopatric and sympatric populations separately. A 5% significance level was applied to the statistical tests. Because we were mainly interested in loci putatively evolving under the influence of diversifying selection, we did not consider outlier loci with a lower than expected degree of differentiation (a negative value of the 97.5% quantile, indicative of balancing or stabilizing selection). All runs were conducted twice to verify that the same outlier loci were detected in independent runs. Applying corrections for multiple testing is not straightforward in Bayesian posterior analyses (Beaumont 2008). However, simulations indicate that the false positive rate of neutral loci appearing as positively selected is low when applying the BAYESFST method. Only 25 of 6800 neutral loci were erroneously detected as positively selected by Beaumont and Balding (2004), giving a false positive rate of 0.0037. As the number of loci included in the analysis is 74, we expected to find approximately 0.27 outliers (0.0037 × 74 = 0.27) per analysis by pure chance even if all loci evolve neutrally. It should be noted, however, that certain demographic scenarios could potentially affect the rate at which false positives are discovered. Hence, it is possible that our estimate of expected false positives is naïve, because it is likely that both species in this study have been subjected to population size changes and perhaps also population subdivision since their divergence. Regardless of the effect of these factors on the patterns of genetic differentiation, our main aim was to identify loci putatively evolving under the influence of selection, and to avoid the risk of discarding possible candidates we include all significant outliers from the Bayesian analysis in the subsequent analyses and in the discussion.
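The expected number of chance outliers quoted above can be reproduced with a few lines of arithmetic. The sketch below uses only the figures given in the text (25 false positives among 6800 simulated neutral loci, 74 loci in the scan); the function names are hypothetical. The second function additionally assumes that loci behave independently, which is an illustrative simplification and not something the analysis itself relies on.

```python
def false_positive_rate(false_positives, neutral_loci):
    """Per-locus false positive rate from simulation counts."""
    return false_positives / neutral_loci


def expected_false_positives(rate, n_loci):
    """Expected number of neutral loci flagged by chance."""
    return rate * n_loci


def prob_at_least_one(rate, n_loci):
    """Chance of one or more false positives.

    ASSUMPTION: loci are independent (illustrative only).
    """
    return 1.0 - (1.0 - rate) ** n_loci


rate = false_positive_rate(25, 6800)          # rounds to 0.0037
expected = expected_false_positives(rate, 74)  # rounds to 0.27
print(round(rate, 4), round(expected, 2))
```

Under these numbers, fewer than one false positive is expected per analysis, which is why the seven outliers are unlikely to be explained by chance alone if the simulated false positive rate applies.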

We assessed the variance within populations as well as among populations of partitions of loci by analysis of molecular variance (AMOVA) of haplotype data as implemented in Arlequin version 3.11 (Excoffier et al. 2005). Recent analyses have indicated that false positive results can be obtained when applying a model that does not take hierarchical population structure into account and we therefore also ran the analysis using a hierarchical structure model (Excoffier et al. 2009). With the exception of KIF24, all genes detected to be subject to diversifying selection in any of the Bayesian analyses were also significant or nearly significant outliers in the hierarchical analysis (data not shown). Therefore we restrict the presentation of results to only include the results from the Bayesian analysis.

Avian substitution rates at presumably neutral sites vary on a regional scale, suggesting variation in underlying mutation rate (Berlin et al. 2006). The neutral theory predicts that mutation rate and level of genetic diversity should be positively correlated. Low nucleotide diversity of a mutation cold-spot region could therefore mimic the low diversity seen after a selective sweep in a region with higher mutation rate. To investigate the effect of variation in mutation rate on nucleotide diversity levels, we estimated locus-specific average pairwise divergence between the pied flycatcher and the red-breasted flycatcher in MEGA, using the Jukes–Cantor model to correct for multiple hits (the comparison of the collared flycatcher and the red-breasted flycatcher gave essentially the same result).
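For reference, the Jukes–Cantor correction converts an observed proportion of differing sites, p, into an estimated number of substitutions per site, d = −(3/4) ln(1 − 4p/3). A minimal sketch follows; the function names are hypothetical and the sequences are assumed to be aligned and gap-free. It is meant to show the formula, not to replicate MEGA's handling of gaps or ambiguous bases.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of sites that differ between two aligned sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned and non-empty")
    diffs = sum(a != b for a, b in zip(seq_a, seq_b))
    return diffs / len(seq_a)


def jukes_cantor_distance(p):
    """Jukes-Cantor corrected distance: d = -(3/4) * ln(1 - 4p/3).

    Undefined for p >= 0.75, where the sequences are saturated.
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Hypothetical example: 2 differences over 100 aligned sites.
d = jukes_cantor_distance(p_distance("A" * 98 + "CC", "A" * 98 + "GG"))
```

At divergences of a few percent, the correction changes the estimate only slightly, because multiple hits at the same site are still rare.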

Estimates of Tajima's D (Tajima 1989) and Fay and Wu's H (Fay and Wu 2000) were obtained in DnaSP version 4.50 (Rozas et al. 2003). Confidence limits (95%) were generated in DnaSP by resampling 1000 genealogies with the expected population mutation rate equal to the empirical average over all loci, assuming an intermediate rate of recombination (R= 2 per gene) and a sample size similar to the average number of sampled individuals over all loci. In addition, a multi-locus HKA test (Hudson et al. 1987) as implemented in the software HKA (http://genfaculty.rutgers.edu/hey/software#HKA) was applied to the collared flycatcher and the pied flycatcher data to evaluate whether any locus showed evidence for an excess or deficiency of polymorphisms compared to expectations from divergence between the species.
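To make the first of these statistics concrete, the sketch below computes Tajima's D from a set of aligned haploid sequences using the standard constants from Tajima (1989). The function name and toy alignments are hypothetical, gaps and missing data are assumed absent, and the code is an illustration of the formula and not the DnaSP implementation (in particular, it does not generate the coalescent-based confidence limits).

```python
import math
from itertools import combinations


def tajimas_d(seqs):
    """Tajima's D for aligned, gap-free haploid sequences.

    Returns None when there are no segregating sites (D is undefined).
    Requires at least four sequences, since the variance term is zero
    for smaller samples.
    """
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least four sequences")
    if any(len(s) != len(seqs[0]) for s in seqs):
        raise ValueError("sequences must be aligned")

    # S: number of segregating (polymorphic) sites.
    S = sum(len(set(column)) > 1 for column in zip(*seqs))
    if S == 0:
        return None

    # k: mean number of pairwise differences (not per site).
    pairs = list(combinations(seqs, 2))
    k = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    return (k - S / a1) / math.sqrt(e1 * S + e2 * S * (S - 1))


# Hypothetical toy data: only singletons (excess of rare variants).
rare = ["AAAA", "TAAA", "ATAA", "AATA"]
# Hypothetical toy data: variants at intermediate frequency.
balanced = ["AA", "AA", "TT", "TT"]
```

An excess of rare variants, as in the first toy alignment, gives a negative D, whereas intermediate-frequency variants, as in the second, give a positive D.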

Results

LEVELS OF POLYMORPHISM

We sequenced 74 introns from 73 sex-linked genes in allopatric and sympatric populations of pied flycatcher and collared flycatcher, corresponding to a total of 40.5 kb of unique sequence data from each species (Table 1). In chicken, the orthologous genes are evenly distributed on the entire Z chromosome with an average spacing of 1.0 Mb (Fig. 1). There were 34 fixed differences between the two species and 46 shared polymorphisms. The pied flycatcher showed 265 species-specific single nucleotide polymorphisms (SNPs) (π= 0.00182 ± 0.0009 SE) and the collared flycatcher showed 352 (π= 0.00316 ± 0.00137). Only one of the SNPs that was fixed between allopatric populations was shared among sympatric populations. Levels of genetic variability of noncoding sex chromosome sequences differed significantly between species (Wilcoxon's test, w= 1767, P= 0.000196) and were about 1.7 times higher in the collared flycatcher than in the pied flycatcher. There was no difference in nucleotide diversity when comparing allopatric and sympatric populations within each species (pied flycatcher, Wilcoxon's test, w= 2612, P= 0.630; collared flycatcher, w= 2823, P= 0.747). The number of haplotypes varied from 2 to 35 among loci (1–15 for the pied flycatcher and 1–22 for the collared flycatcher) and the average number of haplotypes per locus was 5.7 for the pied flycatcher and 6.3 for the collared flycatcher.

POPULATION DIFFERENTIATION ANALYSIS

The average FST between the two species was 0.362 ± 0.217 SD, with no significant difference between the between-species comparisons of sympatric (0.387 ± 0.230) and allopatric (0.365 ± 0.232) populations (Wilcoxon's test, w= 2585, P= 0.559). For individual loci, FST ranged from −0.014 to 0.874 (Fig. 2). Figure 3 shows how FST varies along the Z chromosome, as given by the location of markers in the chicken genome. There are several regions with particularly high FST-estimates, although in most cases these are single markers rather than peaks of several adjacent markers. Between-species Bayesian analyses involving (1) allopatric populations, (2) sympatric populations, and (3) all individuals combined gave statistical support for seven outlier loci evolving under the influence of directional selection. Specifically, in the analysis of the allopatric data, the genes NUP155 and GPR98 were outliers, and in the analysis of the sympatric data, the genes KIF24, CHD1Z, and MELK were outliers. In the analysis of the combined dataset, the genes TRP1, CHD1Z, PLAA, and MELK were outliers (Fig. 4). Hence, there was some overlap among datasets in the genes showing significantly higher than expected levels of differentiation, as CHD1Z and MELK were outliers both in the analysis of the sympatric populations and in the analysis of all individuals taken together.

Figure 2.

Distribution of FST-values between the pied flycatcher and the collared flycatcher for 73 Z-linked genes.

Figure 3.

Plot of FST-values along the Z chromosome. Genes putatively evolving under positive selection are indicated by arrows.

Figure 4.

Plots of the outlier analysis with BAYESFST for allopatric (top), sympatric (middle) and all samples (bottom), respectively. Significant and marginally significant (top 5) loci are indicated with arrows. Locus specific FST-estimates are plotted against the P-values. The vertical lines indicate the 95% confidence limit.

There was extensive heterogeneity in levels of nucleotide diversity, although there was no clear indication of a “diversity valley” with several adjacent markers showing reduced diversity (Fig. 5). However, the seven FST-outliers identified with the Bayesian analysis had marginally significantly lower nucleotide diversity in the collared flycatcher than the rest of the loci (π= 0.00178 and 0.00330, respectively; Wilcoxon's test w= 109, P= 0.047); no corresponding difference was found in the pied flycatcher (w= 171, P= 0.498). There were five invariable loci in one species and they included three of the positively selected genes (MELK in the pied flycatcher and TRP1 and CHD1Z in the collared flycatcher). Furthermore, an AMOVA revealed a lower ratio of within- to between-species molecular variance for loci assumed to be under diversifying selection than for putatively neutral loci (Table 2).

Figure 5.

Plot of nucleotide diversity (π) along the Z chromosome for the pied flycatcher (above) and the collared flycatcher (below). Genes putatively evolving under positive selection are indicated by arrows.

Table 2.  Average FST-values and results from the analyses of molecular variance for loci presumably evolving neutrally or under influence of positive selection according to the FST-outlier analyses and for all loci combined. The 95% confidence interval (CI) for FST values was obtained by bootstrapping 20,000 iterations over sites.
                               Neutral (n= 58)   Positive selection (n= 7)   All loci (n= 74)
Within species variance (%)    60.5              34.7                        60.3
Between species variance (%)   39.5              65.3                        39.7
Average FST over loci          0.395             0.653                       0.397
FST 95% CI over loci           (0.349–0.440)     (0.529–0.749)               (0.348–0.446)

An outgroup sequence (the red-breasted flycatcher) was obtained for 56 of the genes and the mean intronic sequence divergence between outgroup and pied flycatcher was 0.021 ± 0.0059. There was no correlation between outgroup-pied flycatcher divergence and FST between pied flycatcher and collared flycatcher (Pearson's r2= 0.102, t=−0.740, df = 55, P= 0.464; essentially identical values are obtained when using outgroup-collared flycatcher divergence). However, species divergence and nucleotide diversity were marginally significantly correlated (pied flycatcher: r2= 0.241, t= 1.841, df = 55, P= 0.071; collared flycatcher: r2= 0.309, t= 2.412, df = 55, P= 0.019), as expected under neutrality. We therefore scaled nucleotide diversity by species divergence (Fig. 6) and found that the seven FST-outliers still had significantly lower nucleotide diversity than the putatively neutrally evolving loci in the collared flycatcher (Wilcoxon's test w= 76.5, P= 0.040). As before, there was no corresponding difference in the pied flycatcher (w= 95.0, P= 0.124).

Figure 6.

Plot of nucleotide diversity (π) scaled by divergence along the Z chromosome for the pied flycatcher (above) and the collared flycatcher (below). Genes putatively evolving under positive selection are indicated by arrows. Positions where loci are not connected by lines indicate genes for which no sequence data could be obtained from the outgroup species.

NEUTRALITY TESTS

Tajima's D-values were on average negative in the pied flycatcher (−0.521) and ranged from −1.904 to 2.453 (Table S2). In the collared flycatcher, Tajima's D was on average positive (mean = 0.144, range =−1.667–2.564) and significantly higher than in the pied flycatcher (Wilcoxon's test, w= 1568, P= 0.0001). For the 56 loci that could be sequenced in the outgroup species, it was possible to calculate Fay and Wu's H (Table S2). The mean for this statistic was −0.251 in the pied flycatcher and −0.144 in the collared flycatcher. The distribution of values ranged between −4.221 and 2.165 in the pied flycatcher and between −3.297 and 1.600 in the collared flycatcher. There was no difference in Tajima's D-values when comparing allopatric and sympatric populations, neither within the pied flycatcher (mean ± SD; allopatric =−0.201 ± 0.956 and sympatric =−0.450 ± 1.011), nor within the collared flycatcher (allopatric = 0.173 ± 1.035 and sympatric = 0.247 ± 1.036) and this was also the case for the Fay and Wu's H statistic (pied flycatcher; allopatric =−0.249 ± 1.133 and sympatric =−0.244 ± 1.168 and collared flycatcher; allopatric =−0.050 ± 0.984 and sympatric =−0.121 ± 1.020). The HKA test did not reveal any signs of deviation from the expected ratio of polymorphism to divergence for any locus (Sum of deviations = 102.3, df = 146, P= 0.998).
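For readers less familiar with the statistic, the following is a minimal sketch of how Tajima's D is computed for a single locus from the number of sampled sequences, the number of segregating sites, and the mean number of pairwise differences, following the standard formulation of Tajima (1989). The input values in the example are hypothetical and do not come from Table S2.

```python
import math


def tajimas_d(n, seg_sites, mean_pairwise_diff):
    """Tajima's D for one locus.

    n                  -- number of sampled sequences (at least 4 here,
                          because the variance term is zero for n < 4)
    seg_sites          -- number of segregating sites, S
    mean_pairwise_diff -- mean number of pairwise differences, k
                          (per locus, not per site)
    """
    if n < 4:
        raise ValueError("need at least 4 sequences")
    if seg_sites < 1:
        raise ValueError("need at least one segregating site")

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    # Difference between the two estimators of theta, divided by the
    # estimated standard deviation of that difference.
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (mean_pairwise_diff - seg_sites / a1) / math.sqrt(variance)


if __name__ == "__main__":
    # Hypothetical locus: 10 sequences, 16 segregating sites.
    print(round(tajimas_d(10, 16, 3.888889), 3))
```

Negative values reflect an excess of low-frequency variants relative to neutral expectation, whereas positive values reflect an excess of intermediate-frequency variants.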

EXTENDED MARKER ANALYSIS IN A CANDIDATE REGION

To test for evidence of divergence hitchhiking in the vicinity of loci potentially evolving under the influence of diversifying selection we focused on a region including the FST outlier NUP155. Specifically, we resequenced one intron in all 40 birds for 16 additional genes which in chicken are located within a 2.3 Mb region including NUP155. The average FST for these new markers was slightly higher than the average FST for all markers in the initial scan (0.45 ± 0.20, Wilcoxon's test, w= 1011, P= 0.086; Table 1, Fig. 7). Moreover, the diversity levels (π) for both the pied flycatcher (0.0012 ± 0.0013) and the collared flycatcher (0.0018 ± 0.00099) were significantly lower in the NUP155 region compared to the average for all loci in the initial scan (Wilcoxon's test, w= 578, P= 0.040, and w= 537, P= 0.016, respectively). However, there was no obvious “FST-peak” within this region (Fig. 7, Table 1).

Figure 7.

Plot showing the FST-values around the NUP155 gene initially detected to be under positive selection. The range shown comprises the genes in the dense scan (open), together with the most closely located flanking genes on each side and interspersed genes from the initial scan (filled). Loci within the gene NUP155, which was detected as an FST-outlier, and within SLC45A2, which is known to affect plumage coloration in birds, are indicated by arrows.

Discussion

We performed a Z-chromosome scan for genomic regions potentially evolving under directional selection in two sister species of birds: the pied flycatcher and the collared flycatcher. Seven out of 73 gene markers were significantly more divergent than expected under neutrality (Fig. 4). The Z chromosomes of chicken and collared flycatcher are highly syntenic and for large parts colinear (Backström et al. 2006). According to map information from the chicken genome assembly, the 73 gene markers that we sequenced in the flycatchers have a mean marker interval of 1.0 Mb and correspond to about 10% of the total number of genes (734) currently annotated on the chicken Z chromosome. The use of gene-based markers should increase our possibilities for finding functionally relevant patterns of diversity and/or differentiation. This is not only because adaptive divergence can be caused by changes in protein structure but also because changes in gene expression can be mediated by nearby cis-regulatory elements (Wittkopp et al. 2004, 2009; Hoekstra and Coyne 2007; Carroll 2008; Rebeiz et al. 2009; Chan et al. 2010). The proportion of FST-outlier loci suggested to evolve under the influence of directional selection in our study (7/74 = 0.095, Fig. 3) is in the upper range of what has been found in studies of other species (Stinchcombe and Hoekstra 2008). The use of gene markers may very well be part of the explanation for this observation. In a previous scan in Atlantic salmon (Salmo salar), using microsatellites from untranslated regions of genes, a similar proportion of loci presumed to be affected by directional selection was found (Vasemägi et al. 2005). Moreover, a comparison of gene-based markers and anonymous markers in oak (Quercus robur) revealed higher differentiation in the former than in the latter (Scotti-Saintagne et al. 2004).
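The proportions quoted in this paragraph follow from simple arithmetic on the marker counts, as the short sketch below shows. The variable names are illustrative; because the text gives the marker count as both 73 and 74, the outlier proportion is computed for each.

```python
def proportion(part, whole):
    """Fraction of `whole` represented by `part`."""
    return part / whole


GENES_SEQUENCED = 73            # gene markers sequenced in the flycatchers
GENES_ANNOTATED_CHICKEN_Z = 734  # genes annotated on the chicken Z chromosome
OUTLIERS = 7                    # FST-outlier loci

if __name__ == "__main__":
    coverage = proportion(GENES_SEQUENCED, GENES_ANNOTATED_CHICKEN_Z)
    print(f"share of annotated Z genes sampled: {coverage:.3f}")
    for n_markers in (73, 74):
        print(f"outlier proportion with {n_markers} markers: "
              f"{proportion(OUTLIERS, n_markers):.3f}")
```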

There was no clear evidence for divergence hitchhiking effects in the form of FST-peaks involving several adjacent markers (Fig. 3). This was true both for the whole-chromosome scan and for the denser scan in the region around one outlier at the NUP155 locus. The average level of differentiation was slightly higher for markers in the denser marker scan compared to the chromosome average but in neither species was the diversity level significantly reduced in this region compared to the chromosome average. Neither Tajima's D nor Fay and Wu's H values deviated from values expected under neutral evolution (Table S2) and the HKA test was nonsignificant. This may indicate that the effects of selection have been rather small, or that selection events occurred a long time ago so that recombination has broken up associations to loci in the vicinity of the selected loci. For divergence hitchhiking to make a significant signature on the patterns of differentiation, species probably need to be in an earlier stage of separation (cf. Via and West 2008) than is the case for the flycatcher species included in this study, which are estimated to have diverged about 1.5–2 million years ago (Sætre et al. 2001). Selection may have occurred at any time point and/or with different intensity in the different lineages. It could, for example, have occurred in allopatry during cold periods in the Pleistocene or in the form of reinforcement during secondary contact in warm periods. Hence, although our data, with the resolution given by this marker set, provide no indication of strong ongoing diversifying selection, this does not rule out the possibility that it happened in earlier stages of their divergence. Because the seven outlier loci showed significantly reduced nucleotide diversity compared to neutrally evolving loci only in the collared flycatcher, it is tempting to suggest that selection might have been more intense or occurred more recently in the collared flycatcher lineage than in the pied flycatcher lineage. However, nucleotide diversity is generally lower in the pied flycatcher than in the collared flycatcher (Table 1) and this could have affected the possibility to detect a difference in diversity among putatively selected and neutrally evolving loci in the former species.

Recent developments in speciation research suggest that some genomic regions may contain higher numbers of genes that underlie the traits that cause reproductive isolation than other regions. Such “genomic islands of speciation” will experience reduced introgression rates (i.e., reduced interspecific recombination) compared to the rest of the genome (Wu 2001; Turner et al. 2005). The reduced interspecific recombination of genes in these regions may then gradually expand to include loci of other regions through physical or epistatic linkage, finally resulting in reproductive isolation across the whole genome (Hawthorne and Via 2001; Wu and Ting 2004). It has been shown that gene flow occurs between pied flycatcher and collared flycatcher in contemporary sympatric populations; however, this is mainly restricted to the autosomes (Sætre et al. 2003). This finding is confirmed in our study. We found only one site out of ≈ 40 kb that represented a fixed difference between species when comparing allopatric populations that was at the same time polymorphic when comparing the sympatric populations. Linkage on the Z chromosome of genes affecting reproductive isolation may have facilitated speciation due to reduced recombination rates given that the Z chromosome does not recombine in female meiosis. This should be particularly true if traits including both pre- and postzygotic isolation are tightly linked (Servedio and Sætre 2003; Lemmon and Kirkpatrick 2006; Sæther et al. 2007). However, the idea of facilitated linkage of traits related to reproductive isolation when these are sex-linked is somewhat challenged by recent empirical data on avian recombination rates. In collared flycatcher, the sex-averaged recombination rate of the Z chromosome is similar to the rate in larger autosomes (Backström et al. 2006, 2008b) and the same is true for zebra finch (Backström et al. 2010). It may be that reduced interspecific recombination rate on the flycatcher Z chromosome (Sætre et al. 2003) has evolved for other reasons than limited recombination on the Z chromosome. Reduced interspecific recombination could, for example, develop via chromosome rearrangements (Hoffman and Rieseberg 2008). From this perspective, it should be valuable to obtain comparative gene maps for the pied flycatcher and collared flycatcher Z chromosomes.

There is an increasing amount of evidence suggesting that color polymorphisms and the underlying pigmentation genes are important in adaptive population divergence (Hoekstra and Nachman 2003; Hoekstra et al. 2004, 2006; Hoekstra 2006). The demonstrated role of plumage characteristics in flycatcher pre- (Sætre et al. 1997; Wiley et al. 2005) and postzygotic isolation (Svedin et al. 2008) raises the possibility that pigmentation genes have been important also in population divergence in this system. There is an extensive list of genes identified to affect coloration in animals (see, e.g., the mouse genome informatics database (MGI), http://www.informatics.jax.org/) and 15 of these are reported to be associated with variation in plumage characters in chicken (http://apr2007.archive.ensembl.org/Gallus_gallus/goview?acc=GO:0048071). Two of these 15 genes are located on the Z chromosome, tyrosinase-related protein 1 (TYRP1) and solute carrier 45 A2 (SLC45A2). Strikingly, both genes appear close to regions in which we detected significant FST-outliers. In fact, TYRP1, which was included in our marker set, is one of the significant outliers itself (Fig. 3) and is one of only three markers that are completely invariable in the collared flycatcher (Table 1). In addition, the Tajima's D-value is significantly positive for TYRP1 in the pied flycatcher (Table S2), indicative of the gene evolving under balancing selection in this species. TYRP1 is a melanogenic enzyme required for the production of eumelanin, making it important for pigmentation and presumably also for secondary sexual plumage characteristics of birds (Nadeau et al. 2007). In Japanese quail, a reddish-brown phenotype (roux) is perfectly associated with homozygosity of a recessive nonsynonymous mutation at nucleotide position 845 in TYRP1 (Nadeau et al. 2007) and, importantly, in pied flycatchers different splice variants of TYRP1 are associated with plumage blackness in males (Buggiotti 2007). Because black pied flycatcher males have higher fitness in allopatric regions and brown pied flycatcher males have higher fitness in sympatric regions (Sætre et al. 1997; Buggiotti 2007), an explanation for a high positive Tajima's D-value could be that TYRP1 evolves under balancing selection in the pied flycatcher.

The other Z-linked pigmentation gene, SLC45A2, was not included in our initial marker set. However, in chicken it is located 1 Mb upstream from NUP155, which was identified as an FST-outlier in our scan (Fig. 3). SLC45A2 plays an important role in vesicle sorting in melanocytes by directing tyrosinase to premelanosomes (Costin et al. 2003). Mutations in this gene are associated with plumage color variation in chicken and Japanese quail, including “sex-linked imperfect albinism” (Gunnarsson et al. 2007). To investigate whether the outlier signal of NUP155 could be associated with selection on SLC45A2 we sequenced 16 additional genes located close to NUP155, including SLC45A2. However, the lack of a distinct FST-peak within this region (specifically, a peak that would have covered NUP155 as well as SLC45A2) indicates that the outlier signal of NUP155 discovered in the initial scan was not caused by indirect hitchhiking effects of directional selection at SLC45A2. Interestingly, though, an intron located within SLC45A2 also shows a relatively high degree of differentiation.

We found no significant difference in the level of genetic differentiation for interspecific comparisons of sympatric and allopatric populations, and neither Tajima's D nor Fay and Wu's H differed between allopatric and sympatric populations within either species. The sympatric region including the islands Öland and Gotland probably represents a very young contact zone (Sætre et al. 1999) and time may have been too short for reinforcement to affect the patterns of genetic variability. The marked character displacement observed in the presumably much older sympatric region in central Europe (Sætre et al. 1999) is not as pronounced in the Baltic Sea island populations (Borge et al. 2005).

To conclude, by scanning 73 Z-linked genes for patterns of genetic differentiation and nucleotide diversity we found seven putative regions for genes evolving under directional selection. Two of these regions contained genes associated with plumage coloration in birds but a targeted dense marker scan in one of these regions, including sequence analysis of an additional 16 genes, did not reveal any patterns of long range divergence hitchhiking. Potentially, the time since the initial phase of reproductive isolation may have been too long to allow the detection of hitchhiking around any “speciation gene” with the resolution given by our marker set. However, the seven outliers constitute obvious targets for more detailed analyses and further work will clearly be needed to establish the role of these in the overall divergence of the two Ficedula flycatchers. For instance, it would be interesting to investigate patterns of sequence divergence also in the Central European flycatcher hybrid zone, which shows stronger marks of contemporary reinforcement selection (Sætre and Sæther 2010).


Associate Editor: L. Moyle

ACKNOWLEDGMENTS

We thank L. Excoffier for providing access to analyze our data with a hierarchical model of population differentiation and three anonymous reviewers for helpful comments on an earlier version of the article. This work was funded by grants from the Swedish Research Council, the Knut and Alice Wallenberg Foundation, and the European Research Council to HE.
