Keywords:

  • AFLP;
  • genetic barrier;
  • hybrid zone;
  • introgression;
  • Mytilus;
  • outlier;
  • selection

Abstract

  1. Top of page
  2. Abstract
  3. Introduction
  4. Materials and methods
  5. Results
  6. Discussion
  7. Conclusion
  8. Acknowledgments
  9. References
  10. Supporting Information

Scanning genomes for loci with high levels of population differentiation has become a standard approach in population genetics. FST outlier loci are most often interpreted as signatures of local selection, but outliers might arise for many other reasons that are too often left unexplored. Here, we sought to further characterize the history and genetic basis underlying strong differentiation at FST outlier loci in a marine mussel. A genome scan of genetic differentiation was conducted between Atlantic and Mediterranean populations of Mytilus galloprovincialis. The differentiation was low overall (FST = 0.03), but seven loci (2%) were strong FST outliers. We then analysed DNA sequence polymorphism at two outlier loci. The genetic structure proved to be the consequence of differential introgression of alleles from the sister-hybridizing species Mytilus edulis. Surprisingly, the Mediterranean population was the most introgressed at these two loci, although the contact zone between the two species is nowadays localized along the Atlantic coasts of France and the British Isles. A historical contact between M. edulis and Mediterranean M. galloprovincialis must therefore have occurred during glacial periods. It proved difficult to disentangle two hypotheses: (i) introgression was adaptive, implying edulis alleles have been favoured in Mediterranean populations, or (ii) the genetic architecture of the barrier to edulis gene flow is different between the two M. galloprovincialis backgrounds. Five of the seven outliers between M. galloprovincialis populations were also outliers between M. edulis and Atlantic M. galloprovincialis, which would support the latter hypothesis. Differential introgression across semi-permeable barriers to gene flow is a neglected scenario to interpret outlying loci that may prove more widespread than anticipated.


Introduction

Understanding evolutionary processes that maintain genetic diversity in natural populations is an essential goal of population genetics (Lewontin, 1974; Rockman, 2012). Genome scans now make it possible to identify loci, or genome regions, with greatly increased differentiation between populations, the so-called ‘FST outlier loci’ or ‘genomic islands of differentiation’ (Luikart et al., 2003; Storz, 2005; Nosil et al., 2009). FST outliers are usually interpreted as signatures of spatially heterogeneous selection (Beaumont, 2005), but various alternative hypotheses have also been proposed (Bierne et al., 2011). Identifying the genetic basis (Storz & Wheat, 2010; Barrett & Hoekstra, 2012) and reconstructing the history (Pogson, 2001; Colosimo et al., 2005; Faure et al., 2008) of candidate loci identified in FST scans are therefore required before drawing conclusions about the evolutionary processes underlying genetic differentiation. Strongly differentiated loci between populations often show very deep coalescences (Schulte et al., 1997; Pogson, 2001; Colosimo et al., 2005; Wood et al., 2008) that imply polymorphisms have been maintained for long periods. During such long periods, populations have likely changed in size and spatial distribution as a consequence of oscillatory climate changes (Hewitt, 1996, 2011) and might have evolved partial reproductive isolation mechanisms. Although local adaptation is likely to contribute to the differentiation, it is also likely that multifarious processes jointly act in generating semi-permeable genetic barriers to gene flow (Bierne et al., 2011).

Marine species with a pelagic larval phase are associated with low genetic differentiation over long distances (Palumbi, 1994; Hellberg, 2009), but genetic differentiation is often observed at some specific loci affected by selection (Koehn et al., 1976; Lemaire et al., 2000; Schmidt & Rand, 2001; Riginos et al., 2002; Johannesson & Andre, 2006; Murray & Hare, 2006; Faure et al., 2008). Surprisingly, these loci are not that difficult to uncover. As soon as a sufficient number of loci are analysed, a few loci often reveal genetic structure at key geographical locations that are increasingly being well characterized (Pelc et al., 2009; Bierne et al., 2011; Riginos et al., 2011), such as for instance the Oresund between the Baltic Sea and the North Sea (Johannesson & Andre, 2006). In recent years, increased availability of methods for the simultaneous analysis of large numbers of loci has facilitated a population genomics approach to this problem (Guinand et al., 2004; Nielsen et al., 2009). However, the focus on local adaptation as the exclusive explanation of FST outlier loci might be excessive (Bierne, 2010; Bierne et al., 2011; Roesti et al., 2012), especially because marine species have demographic characteristics conducive to generating discrepancies with simple theoretical models (Eldon & Wakeley, 2009; Bierne, 2010; Hedgecock & Pudovkin, 2011; Tice & Carlon, 2011).

The Almeria–Oran front (AOF), an oceanographic front that separates Atlantic from Mediterranean water masses, is a recognized hotspot of genetic structure in the sea (Borsa et al., 1997; Patarnello et al., 2007). Although it certainly acts as a natural barrier to larval dispersal, examples exist of species with a planktonic dispersal phase that, from the panel of markers analysed, do not exhibit any genetic break at the AOF (Launey et al., 2002; Patarnello et al., 2007). This suggests that the oceanographic front alone is not sufficient to explain the genetic structure observed in many marine species. When differentiation is observed, it is often highly heterogeneous across loci (Quesada et al., 1995c; Lemaire et al., 2005; Kasapidis et al., 2012), suggesting the barrier to gene flow is not only natural but also genetic (Bierne et al., 2011). Indeed, genetic barriers to gene flow are often semi-permeable (Harrison, 1993) and are expected to coincide with natural barriers to gene flow (Hewitt, 1975; Barton, 1979). The mussel Mytilus galloprovincialis is one emblematic species that exhibits such a genetic break at the AOF. The genetic differentiation across the AOF is especially strong at the allozyme locus ODH (Sanjuan et al., 1994; Quesada et al., 1995c) and the mitochondrial genome (Quesada et al., 1995a; Sanjuan et al., 1996) whereas it is modest, although detectable, at other loci (Sanjuan et al., 1994; Quesada et al., 1995c; Diz & Presa, 2008).

We investigated the genomic architecture of differentiation between Atlantic and Mediterranean populations, and the history and genetic basis underlying the differentiation of high FST outlier loci in the mussel M. galloprovincialis. We first conducted an FST scan with a combination of Amplified Fragment Length Polymorphism (AFLP) and nuclear codominant markers (allozymes, microsatellites and length polymorphisms). Secondly, we analysed DNA sequence polymorphisms at two outlier loci. Although we focus on the differentiation across the AOF and FST outliers identified between Atlantic and Mediterranean populations of the species M. galloprovincialis, we also analysed samples of the sister-hybridizing species M. edulis with the same loci. We show that the genetic differentiation at five of the seven identified outliers is the consequence of a stronger introgression of heterospecific alleles into the Mediterranean than the Atlantic background of M. galloprovincialis.

Materials and methods

Sampling sites

We used five geographical samples of 48 individuals that, on the basis of previous publications, are representative of each of five panmictic patches (Skibinski et al., 1983; Coustau et al., 1991; Quesada et al., 1995c; Daguin et al., 2001; Bierne et al., 2003; Kijewski et al., 2011; Hilbish et al., 2012). (i) Samples from RO (Roscoff, Brittany, France), FA (Faro, Algarve, Portugal) and ST (Sète, Languedoc-Roussillon, France) are representative of M. galloprovincialis populations of Brittany, Atlantic coast of the Iberian Peninsula and Mediterranean Sea, respectively. Samples FA and ST were used for outlier detection. (ii) Samples from WS (Wadden Sea, Holland) and LU (Lupin, Poitou-Charente, France) are representative of M. edulis populations of the North Sea and Bay of Biscay, respectively (Fig. 1). WS, RO, LU and FA samples were described in Faure et al. (2008) and were collected in the same sites as samples Helgoland, Primel, Faro and Brouage described in Bierne et al. (2003).

Figure 1. Sampling locations of Mytilus spp. mussels. Patches of Mytilus galloprovincialis are indicated with black lines, and patches of Mytilus edulis are indicated by black and white lines. White and black dots represent M. edulis and M. galloprovincialis samples, respectively. The three hybrid zones described by Bierne et al. (2003) and the Almeria-Oran genetic break described in Quesada et al. (1995c) are indicated by dashed lines: HZ1, between M. galloprovincialis of the Atlantic coast of the Iberian Peninsula and M. edulis of the Bay of Biscay; HZ2, between M. edulis of the Bay of Biscay and M. galloprovincialis of Brittany; HZ3, between M. galloprovincialis of Brittany and M. edulis of the North Sea; and HZAOF between M. galloprovincialis of the Atlantic coast of the Iberian Peninsula and M. galloprovincialis of the Mediterranean Sea. WS: Wadden Sea, Holland; RO: Roscoff, Brittany, France; LU: Lupin, Poitou-Charente, France; CB: Combarro, Iberian Peninsula; AV: Aveiro, Iberian Peninsula; FA: Faro, Algarve, Portugal; BA: Banyuls-sur-mer, Languedoc-Roussillon, France; ST: Sète, Languedoc-Roussillon, France; PA: Palavas, Languedoc-Roussillon, France.

DNA extraction and AFLP scoring

Total DNA was extracted from gills using the DNeasy Tissue Kit (Qiagen, Venlo, The Netherlands) following the instructions of the manufacturer, and extraction quality was checked on agarose gel. DNA concentration was measured for each sample using a NanoDrop 8000 Spectrophotometer (Thermo Scientific, Wilmington, DE, USA) and standardized to a DNA concentration of 100 ng μL−1. All AFLP reactions were performed in 96-well plates, containing four different sampling sites and at least 30% of replicates from other plates to measure the repeatability of AFLP fingerprints and to perform an error rate analysis. The AFLP procedure followed the original protocol by Vos et al. (1995), which employs the combination of enzymes EcoRI and MseI. Standardized genomic DNA was first digested with 2 U of EcoRI and 6 U of MseI at 37 °C for 2 h 30 min. Double-stranded EcoRI and MseI adapters (Table S1) were ligated to restriction fragments for 2 h at 22 °C. The preselective PCR amplification parameters were as follows: 5 min at 94 °C, 20 cycles of 30 s denaturing at 94 °C, 1 min annealing at 56 °C and 1 min extension at 72 °C, ending with 5 min at 72 °C for complete elongation, using two PCR primer pairs: Eco + A, Eco + C and Mse + C (see Table S1). The selective amplifications were performed with four chosen selective primer pairs Eco + ACG, Eco + CAG, Mse + CGA and Mse + CTC. Touchdown cycles were performed to increase PCR specificity by initiating annealing 30 s at 65 °C, then reduced by 0.7 °C for the next 12 cycles and maintained at 56 °C for the remaining 20 cycles. Selective PCR products were separated and detected using an Applied ABI Prism 3130XL (Applied Biosystems, Foster City, CA, USA) and sized by GeneScan™ 500 ROX™ Size Standard (Applied Biosystems).

GeneMapper® v4.5 software (Applied Biosystems) was used to read the resulting chromatograms. Fragment length classes (bin sets) were created automatically and manually revised for each primer combination in the interval 50–500 base pairs. Only unambiguously identifiable fragments were translated into a binary matrix of peak intensity values for each detected locus in each sample. This matrix was subsequently translated into a phenotype matrix of presence (1) and absence (0) of the fragments with AFLPScore v1.4b (Whitlock et al., 2008). Nomenclature for AFLP-scoring thresholds follows Whitlock et al. (2008), where ‘Locus-calling’ and ‘Phenotype-calling’ are thresholds used to exclude error-prone loci from the AFLP genotypes table and to determine fragment phenotype (band presence or absence) based on raw chromatogram peak height data, respectively. Error rates were measured using the method of Hadfield et al. (2006), modified for dominant data as implemented in the R package MasterBayes, which estimates the rate of error applying to each of the two alleles that underlie the genotypes scored using AFLP via a Gibbs sampler (Whitlock et al., 2008). Under this method, error rates ε1 and ε2 determine the probability of misscoring an AFLP fragment presence ‘1’ or absence ‘0’ allele, respectively. We also determined a widely used measure of error (the mismatch error rate) that compared the dissimilarity of pairs of AFLP phenotype profiles that had originated from the same genetic individual (replicated samples).
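
The mismatch error rate described above can be illustrated with a minimal Python sketch. It is not the AFLPScore or MasterBayes implementation: the function names, the dictionary layout and the primer-combination labels are hypothetical, and the pooling across primer combinations is only one plausible way of weighting by the number of loci.

```python
def count_mismatches(profile_a, profile_b):
    """Number of loci scored differently (1 vs 0) in two replicate profiles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("replicate profiles must cover the same loci")
    return sum(1 for a, b in zip(profile_a, profile_b) if a != b)


def mismatch_error_rate(profile_a, profile_b):
    """Fraction of loci whose presence/absence call differs between replicates."""
    if not profile_a:
        raise ValueError("profiles are empty")
    return count_mismatches(profile_a, profile_b) / len(profile_a)


def weighted_mismatch_rate(replicate_pairs_by_combo):
    """Pool mismatches over primer combinations (assumed weighting scheme).

    Each combination contributes in proportion to its number of locus
    comparisons, so combinations with more loci weigh more.
    """
    total_mismatches = 0
    total_comparisons = 0
    for pairs in replicate_pairs_by_combo.values():
        for a, b in pairs:
            total_mismatches += count_mismatches(a, b)
            total_comparisons += len(a)
    return total_mismatches / total_comparisons


# Hypothetical replicate profiles for two primer combinations.
example = {
    "combo_1": [([1, 0, 1, 1], [1, 0, 0, 1])],
    "combo_2": [([0, 0, 1, 1, 1, 0], [0, 0, 1, 1, 1, 0])],
}
print(weighted_mismatch_rate(example))
```

In this toy input, one call out of ten differs between replicates, so the pooled rate is dominated by the combination that contributes more loci.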

We estimated allelic frequencies at each AFLP marker with AFLPsurv v1.0 (Vekemans, 2002; Vekemans et al., 2002), assuming Hardy–Weinberg equilibrium in each population and using a Bayesian method with nonuniform prior distribution of allele frequencies (Zhivotovsky, 1999).
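
To make the logic of allele frequency estimation from dominant markers concrete, here is a hedged Python sketch. Under Hardy–Weinberg equilibrium, band-absent individuals are null homozygotes with expected frequency q², so q can be estimated from the fraction of absences. The Bayesian variant below places a Beta prior on q² and returns the posterior mean of q; it uses a uniform Beta(1, 1) prior purely for illustration, whereas the analysis above used a nonuniform prior, and it is not the AFLPsurv code. All function names are assumptions.

```python
import math


def log_beta(x, y):
    """Natural log of the Beta function B(x, y)."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)


def null_freq_sqrt(n_absent, n_total):
    """Square-root estimator of the null allele frequency q under HWE."""
    return math.sqrt(n_absent / n_total)


def null_freq_bayes(n_absent, n_total, a=1.0, b=1.0):
    """Posterior mean of q = sqrt(R), where R = q^2 has a Beta(a, b) prior.

    With n_absent band-absent individuals out of n_total, the posterior of R
    is Beta(n_absent + a, n_total - n_absent + b), and
    E[sqrt(R)] = B(alpha + 1/2, beta) / B(alpha, beta).
    """
    alpha = n_absent + a
    beta = n_total - n_absent + b
    return math.exp(log_beta(alpha + 0.5, beta) - log_beta(alpha, beta))


# Hypothetical locus: band absent in 12 of 48 individuals.
print(null_freq_sqrt(12, 48), null_freq_bayes(12, 48))
```

One practical difference is visible when no individual lacks the band: the square-root estimator returns exactly zero, whereas the Bayesian estimate stays positive.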

Nuclear codominant markers

We compiled data on 14 allozyme loci from Quesada et al. (1995c), six microsatellite loci from Diz & Presa (2008) and eight length polymorphisms from Daguin et al. (2001), Bierne et al. (2002) and Boon et al. (2009), to which we have added new results for the Mediterranean sample (see details in Table S2). Sampling localities for allozymes and microsatellites were slightly different from those used for AFLPs and other codominant nuclear markers (see Table S2 and Fig. 1) but within the attested panmictic patches. We also analysed three new intron-length polymorphisms. Primers were designed to amplify portions of the endo-beta-1,4-glucanase gene (Xu et al., 2001), the endo-beta-1,4-mannanase gene (Xu et al., 2002) and the elongation factor 2 gene (from an EST library, Tanguy et al., 2008). An analysis of a sample of DNA sequences allowed us to identify indel polymorphisms in each of the three genes, and internal primers were redesigned in conserved regions on each side of an indel polymorphism (see primer list in Table S3).

Outlier loci detection

Following Pérez-Figueroa et al. (2010), we performed three different methods to identify loci that depart from the expected neutral distribution of FST. We used the method of Beaumont & Nichols (1996) to obtain the average and 95% confidence interval of single locus FST as a function of heterozygosity. Simulations were performed with the fdist2 program (Beaumont & Nichols, 1996) for codominant markers and dfdist program, an extension of fdist software that allows the use of dominant markers, for the AFLP data set. Null allele frequencies were first estimated at each locus of the empirical AFLP data set using Zhivotovsky's (1999) Bayesian approach, enabling FST values to be estimated for each locus (Weir & Cockerham, 1984). For both data sets, the outlier threshold was defined by an envelope delimited by the 0.05 and 0.95 quantiles of simulated FST. A mean ‘neutral’ FST value supposedly uninfluenced by selected loci was then calculated after an iterative procedure to remove loci outlying the generated 0.95 quantile.
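
The envelope logic of this first method can be sketched in a few lines of Python. This is not fdist2 or dfdist: the neutral (heterozygosity, FST) pairs are assumed to have been simulated elsewhere, the binning by heterozygosity is a simplification, and every name below is hypothetical.

```python
def quantile(sorted_values, p):
    """Linear-interpolation quantile of an already sorted list (0 <= p <= 1)."""
    if not sorted_values:
        raise ValueError("no values")
    pos = p * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def bin_index(het, n_bins, max_het=1.0):
    """Index of the heterozygosity bin that contains het."""
    return min(int(het / max_het * n_bins), n_bins - 1)


def upper_envelope(simulated, n_bins=10, p=0.95):
    """Upper FST quantile per heterozygosity bin from simulated (het, fst) pairs."""
    bins = [[] for _ in range(n_bins)]
    for het, fst in simulated:
        bins[bin_index(het, n_bins)].append(fst)
    # Bins without any simulated locus get no limit (None).
    return [quantile(sorted(b), p) if b else None for b in bins]


def flag_outliers(observed, envelope):
    """Names of observed loci whose FST exceeds the envelope of their bin."""
    flagged = []
    for name, het, fst in observed:
        limit = envelope[bin_index(het, len(envelope))]
        if limit is not None and fst > limit:
            flagged.append(name)
    return flagged


# Toy data: simulated neutral loci and two hypothetical observed loci.
simulated = [(0.35, i / 100) for i in range(101)]
observed = [("locus_A", 0.32, 0.99), ("locus_B", 0.32, 0.50)]
print(flag_outliers(observed, upper_envelope(simulated)))
```

The iterative step described above would then correspond to removing the flagged loci, recomputing the mean FST from the remaining loci, and rerunning the simulations with that value until no further locus is removed.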

The second method used was developed to identify markers that show deviation from neutral expectation in pairwise comparisons of diverging populations. The detsel program (Vitalis et al., 2001, 2003; Vitalis, 2012) relies on a model in which a common ancestor population splits into two daughter populations that diverge independently by random drift after a possible bottleneck event (Vitalis et al., 2001, 2003). This method provides a parameter of divergence for each population, whose joint distribution under neutral expectations is generated using coalescent simulations. Loci detected outside the resulting confidence envelope of 95% were identified as potentially under selection.

The main difference between fdist2 and detsel lies in the underlying demographic model. fdist2/dfdist considers an island model of population structure, where a set of populations with constant and equal deme sizes are connected by gene flow (Wright, 1931), whereas detsel considers a pure divergence model, in which an ancestral population splits into two daughter populations.

The third method is a Bayesian approach implemented in bayescan v2.01 (Foll & Gaggiotti, 2008; Foll et al., 2010) to estimate directly the posterior probability that each locus is subject to selection. This method is an extension of the hierarchical Bayesian approach of Beaumont & Balding (2004) and is based on a logistic regression model. The posterior probability of one locus to be under selection is estimated by defining two alternative models. Selection is introduced by decomposing locus-population FST coefficients into a population-specific component (β), shared by all loci and a locus-specific component (α) shared by all the populations using a logistic regression. Departure from neutrality at a given locus is assumed when the locus-specific component is necessary to explain the observed pattern of diversity (α significantly different from 0). The respective posterior probabilities of these two models are estimated using a reversible jump Markov chain Monte Carlo (RJMCMC) approach that takes all loci into account in the analyses through the prior distribution, resolving the problem of multiple testing of a large number of genomic locations.
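
For readers who prefer notation, the decomposition described above is usually written as follows, with i indexing loci and j indexing populations (the index names are our choice for illustration):

```latex
\log\!\left(\frac{F_{ST}^{ij}}{1 - F_{ST}^{ij}}\right) = \alpha_i + \beta_j
```

Under the neutral model the locus-specific term is dropped (α_i = 0), so that differentiation depends only on the population-specific term β_j; the model including α_i is the one whose posterior probability is used to call a locus an outlier.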

Recent simulation studies have advocated using two or more outlier detection methods to avoid false conclusions (Luikart et al., 2003; Vasemagi & Primmer, 2005; Pérez-Figueroa et al., 2010). Additionally, using a multiple test correction (Storz & Nachman, 2003; Galindo et al., 2009) is encouraged but has been criticized for being too conservative and reducing the power of outlier detection methods (Murray & Hare, 2006; Galindo et al., 2009). Ultimately, the measures taken to control the false discovery rate depend on the goal of the study. If the purpose of a genome scan is to target strongly differentiated loci for further research, the most appropriate approach is probably to use more than one method and to combine the outlier lists to avoid the risk of losing interesting candidates.

DNA polymorphism and sequence analysis at outlier loci

DNA sequences were obtained for two outlier loci, EFbis and EF2. For this analysis, we used different primer pairs than those used for length polymorphism in order to amplify longer fragments. The new loci were named EFseq (Faure et al., 2007) and EF2seq. EFseq was amplified with the same reverse primer as EFbis, EFbis-R (5′-CTCAATCATGTTGTCTCCATGCC-3′) and a newly designed forward primer, EFseq-F (5′-AGGCTCCTTCAAGTACGCCTGGG-3′). EF2seq was amplified with the newly designed primer pair EF2seq-R (5′-TTATTGTTGTAGAAATTGTCACCCC-3′) and EF2seq-F (5′-TAATCAACTTGATTGATTCACCTGG-3′). Loci were amplified and cloned following the Mark–Recapture (MR) method (Bierne et al., 2007; Faure et al., 2007, 2008; Boon et al., 2009), the ancestor protocol of the tagging technique now widely used for multiplexing libraries in high-throughput parallel sequencing (Binladen et al., 2007; Galan et al., 2010). Every individual was PCR-amplified separately with different sets of 5′-tagged primers. PCR products were mixed together (equimolar) and cloned into a pGEM-T vector using the pGEM-T cloning kit (Promega, Madison, WI, USA). We used a capture effort of two in MR-cloning: 96 clones per population and per gene were sequenced at the Genoscope (Evry; http://www.genoscope.cns.fr/) with the universal primers sp6 and T7 flanking the insert on the plasmid. Sequence alignment was performed with ClustalW (Thompson et al., 1994) in the BioEdit interface (Hall, 1999) and verified by eye. Alignment gaps were excluded from the analyses. To give a representation of gene genealogy, phylogenetic reconstructions were obtained with mega 5.1 (Kumar et al., 2001) using the neighbour-joining algorithm with number of nucleotide differences.

Results

AFLP screening

A total of 422 AFLP loci were scored, of which 357 were polymorphic in FA and ST populations. The mismatch error rate, weighted to take into account the different numbers of loci arising from different primer combinations, was 3.6% (Table S4). A summary of the scoring parameters used, the error rates and the final number of loci retained for each selective primer combination is given in Table S4.

Genetic structure and outlier identification between Atlantic and Mediterranean populations of Mytilus galloprovincialis

The analysis of genetic structure was conducted separately with the two types of molecular markers (codominant nuclear markers and AFLPs), but the results proved to be very similar. Overall, the genetic differentiation was weak, although significant, between Atlantic and Mediterranean populations of M. galloprovincialis (global FST = 0.03, FST = 0.031 for AFLPs, FST = 0.025 for codominant markers). Figure 2 shows FST at individual loci as a function of their heterozygosity together with the neutral envelope of FST obtained from simulations with the method of Beaumont & Nichols (1996) for codominant markers (Fig. 2a) and AFLPs (Fig. 2b). The results obtained with the F-mtDNA ND2-COIII marker (Kijewski et al., 2011) were also added in Fig. 2a to illustrate that this marker also exhibited an outlying pattern of population differentiation. However, the F-mtDNA ND2-COIII marker was not included in FST outlier detection methods because of its singular characteristics (haploid, uniparentally inherited, non-recombining). FST outliers were detected with the three methods of outlier detection, but the results were not totally consistent. The method of Beaumont & Nichols (1996) identified seven outliers (2%), the method of Vitalis et al. (2003) identified three outliers that were also identified by the method of Beaumont & Nichols (1996) and the method of Foll & Gaggiotti (2008) identified only two outliers that were also detected with the two previous methods. Four loci were detected by only one method (EF2, ODH, AFLP_1-187 and AFLP_1-111), one locus by two methods (AFLP_1-218) and two loci by the three methods (EFbis and AFLP_3-363). Individual FST values between Atlantic and Mediterranean populations of M. galloprovincialis at these seven loci were all above 0.2 whatever the type of marker.

Figure 2. FST values between Mytilus galloprovincialis of the Atlantic coast of the Iberian Peninsula and of the Mediterranean Sea plotted against heterozygosity for (a) 31 codominant nuclear loci (14 allozymes, six microsatellites and 11 length polymorphisms); the cross represents the result of F-mtDNA ND2-COIII marker (Kijewski et al., 2011) and (b) 357 AFLP loci. Average (dotted line) and 95% confidence envelope (bold lines; the superior limit of the 95% confidence interval for data excluding microsatellite loci is represented in grey) are the results from simulations performed with the fdist2 program (a) and the dfdist program (b). Outlier loci are annotated and represented by black dots (when identified by only one method), squares (when identified by two methods) and diamonds (when identified by three methods, including bayescan). The observed FST distribution is shown on the right side of each scan.

Combining loci with different types and rates of mutation can sometimes be problematic because this can affect FST estimates under some conditions (Rousset, 1996, 2004; Balloux & Lugon-Moulin, 2002; Estoup et al., 2002). Although the effect is expected to be minimal under a simple migration–drift equilibrium model (Estoup et al., 2002; Rousset, 2004), FST being constructed to be insensitive to locus diversity (Wright, 1951), one never knows how much natural populations deviate from such an equilibrium and when mutation can really be neglected in the face of gene flow. Here we found a low average differentiation, and similar differentiation with every type of marker (FST = 0.04 for microsatellites, FST = 0.02 for allozymes and FST = 0.01 for length polymorphism markers). Because microsatellites are the category of markers most likely to behave differently from other markers, we conducted additional FST outlier tests without microsatellite loci. We found the exact same results: EFbis was an outlier with the three methods, and EF2 and ODH were outliers with the method of Beaumont & Nichols (1996). As the estimated average of FST was slightly lower without microsatellites, the neutral envelope estimated with fdist2 simulations was even narrower (Fig. 2a).

Allele frequencies in M. galloprovincialis and M. edulis

Frequencies of the G allele (defined as being more frequent in M. galloprovincialis samples than in the reference M. edulis sample of the North Sea) at the seven outlier loci in each of the five panmictic patches of Mytilus from the Mediterranean Sea to the North Sea (see Fig. 1) are presented in Fig. 3. Five outlier loci, among which the two loci identified by every outlier detection method (EFbis and AFLP_3-363), revealed a decrease of the G allele in the Mediterranean population. AFLP_3-363 also revealed a surprisingly high frequency of the G allele in the M. edulis patch of the Bay of Biscay that might deserve further investigation. The two other outlier loci (ODH and AFLP_1-218) did not exhibit this pattern of a lower G allele frequency in the Mediterranean sample. ODH is a tri-allelic locus, and the shift in allele frequency relied on two alleles that are more frequent in M. galloprovincialis populations whereas the frequency of the third allele, which was more frequent in M. edulis populations, did not change between Atlantic and Mediterranean samples of M. galloprovincialis (Fig. 3). AFLP_1-218 was the only outlier locus that did not show strong allele frequency differences between M. edulis and M. galloprovincialis, so that the assignment of each allele to a species background was unreliable.

Figure 3. Frequency of the G allele, defined as the compound of alleles more frequent in Mytilus galloprovincialis samples than in the Mytilus edulis reference sample of the North Sea, at the seven outlier loci, in the five panmictic patches of the studied area. White: M. edulis populations; light grey: M. galloprovincialis populations of Brittany; dark grey: M. galloprovincialis populations of the Iberian Peninsula; black: M. galloprovincialis populations of the Mediterranean Sea.

Sequence analysis and phylogeny at outlier loci

Two of the seven outlier loci were chosen for subsequent analysis of DNA sequence polymorphisms. These two loci (EFbis and EF2) were those for which we had reference sequences to design primers and amplify a sufficiently long fragment (EFseq and EF2seq, respectively). For both loci, the reconstructed gene phylogeny (Fig. 4) revealed two highly divergent clades: the first composed of sequences mostly sampled in M. edulis populations (EF2seq A-clade and EFseq A-clade) and the second composed of sequences mostly sampled in M. galloprovincialis populations (EF2seq B-clade and EFseq B-clade). For these two loci, the analysis of DNA sequences allowed us to validate the edulis origin of A-clade alleles sampled in M. galloprovincialis populations and to conclude that the strong differentiation between Atlantic and Mediterranean populations is the consequence of a stronger introgression of edulis alleles in the latter population.

Figure 4. Allele phylogeny reconstructed by the neighbour-joining algorithm on the number of nucleotide differences at (a) EFseq and (b) EF2seq. Sequences sampled in Mytilus edulis populations of the North Sea and the Bay of Biscay are represented by white dots. Sequences sampled in Mytilus galloprovincialis populations of the Atlantic coast of the Iberian Peninsula and Brittany are represented by grey squares and sequences sampled in M. galloprovincialis populations of the Mediterranean Sea by black diamonds.

Discussion

The present study aimed to identify further the history and the genetic basis of high FST outlier loci across the AOF in the marine mussel M. galloprovincialis. We first identified seven loci with an outlying level of differentiation between Atlantic and Mediterranean M. galloprovincialis. At five of the seven outlier loci, among which the two loci identified as outliers with every outlier detection method, the genetic structure proved to be the consequence of differential introgression of heterospecific alleles from the sister-hybridizing species M. edulis. Surprisingly, the Mediterranean population was more introgressed than the Atlantic population. We discuss the possible evolutionary processes that can underlie such a pattern of differential introgression.

The genomic architecture of differentiation across the AOF

The AOF is positioned at the easternmost edge of the Alboran Sea, a 300 km transition zone between superficial Atlantic waters and deep Mediterranean Sea waters (Tintoré et al., 1988). Gradients of temperature (1.4 °C) and salinity (2 psu), water currents (40 cm s−1) and the multiple eddies and gyres across a 2 km zone from the south-east Iberian Peninsula (Almeria) to Algeria (Oran) create a hydrogeographical barrier with biological, geological and chemical consequences (e.g. Sardà et al., 2004). Although the AOF is recognized as a hotspot of genetic structure (Quesada et al., 1995c; Pérez-Losada et al., 2002; Duran et al., 2004; Lemaire et al., 2005), examples exist of species that do not exhibit any genetic shift at the AOF (e.g. Launey et al., 2002) whereas their population sizes and dispersal capabilities are similar to those exhibiting such a shift. Several authors have therefore concluded that the AOF itself cannot be directly responsible for genetic shifts (Borsa et al., 1997; Lemaire et al., 2005; Patarnello et al., 2007; Bierne et al., 2011). Tension zones are expected to be trapped by natural barriers to gene flow (Barton, 1979) and/or environmental boundaries (Bierne et al., 2011), and the AOF is both. It has therefore been hypothesized that the AOF has trapped secondary contact tension zones in many marine species (Bierne et al., 2011).

In M. galloprovincialis, the Atlantic–Mediterranean phylogeographic split has been described with mtDNA-Restriction Fragment Length Polymorphism (RFLP) and sequence data (Quesada et al., 1995a, 1998a,b; Sanjuan et al., 1996; Hilbish et al., 2000), allozymes (Sanjuan et al., 1994; Quesada et al., 1995c), nuclear marker mac-1 (Daguin et al., 2001), microsatellites (Diz & Presa, 2008) and a combination of mitochondrial and nuclear markers (Śmietanka et al., 2010; Kijewski et al., 2011). The Atlantic–Mediterranean split suggests the existence of two main entities in M. galloprovincialis. Quesada et al. (1995b) argued that the genetic shift is a consequence of a secondary contact between allopatric populations and that the genetic divergence between Atlantic and Mediterranean populations predates the origin of the AOF, explaining the geographical coincidence of genetic breaks observed with different markers. However, the secondary contact hypothesis alone does not seem sufficient to explain why the genetic differentiation has not dissipated as gene flow has been restored between the two entities. In fact, the genetic differentiation was mainly visible with mitochondrial loci and the allozyme locus ODH. Our genome scan reveals that these two loci exhibit an outlying level of differentiation (Fig. 2a) and that the average differentiation, although significant, is modest (FST = 0.03). The semi-permeable nature of the barrier to gene flow may well sustain the hypothesis that a genetic barrier is superimposed on the natural and environmental barrier at the AOF and calls for a better understanding of the processes that maintain the genetic differentiation at these two and the newly discovered outlying loci.

Differential introgression explains most outlier loci

The shift in allele frequency at six of the seven outlier loci was associated with a strong decrease of galloprovincialis alleles in the Mediterranean sample (Fig. 3). The phylogenetic analysis of DNA sequences at EFbis and EF2 (Fig. 4) allowed us to validate secondary introgression as the explanation of the pattern at least for these two loci. Unexpectedly, the Mediterranean population proved to be the most introgressed although it is geographically isolated from M. edulis populations nowadays.

The first issue is to understand how edulis alleles could have reached the Mediterranean population. They could have transited at low frequency through Atlantic populations. They could also have been introduced following farming of M. edulis in Mediterranean waters or incidental introduction by boats. In these two cases, selection must have favoured these heterospecific alleles in Mediterranean populations for their frequency to increase quickly and out of proportion to the rest of the genome. Alternatively, introgression could be historical and imply a direct contact between M. edulis and Mediterranean M. galloprovincialis. This might well have occurred during glacial periods, when population distributions were shifted southward, making a contact possible somewhere at the entrance of or within the Mediterranean Sea. Evidence for this latter hypothesis can be found in the results obtained at the EFbis locus in M. edulis populations (Faure et al., 2008; Bierne, 2010). Indeed, the genetic signature of a selective sweep has been observed in M. edulis populations in the form of a star-shaped genealogy (the A1 clade in Fig. 4, see Faure et al., 2008). Fitting a hitchhiking model to the results obtained on consecutive SNPs distributed along a 5 kb-long region around EFbis, Bierne (2010) inferred that A1 alleles increased from a frequency of 33% before the selective sweep to a frequency of 97% in M. edulis populations of the North Sea and a frequency of 57% in populations of the Bay of Biscay. A1 alleles represent 37% of the edulis alleles found in Mediterranean M. galloprovincialis (Fig. 5), which corresponds roughly to the frequency of A1 before the selective sweep in M. edulis. One might thus speculate that introgression between M. edulis and Mediterranean M. galloprovincialis happened before the intraspecific sweep in M. edulis and that hitchhiking subsequently modified allele frequencies in M. edulis when the two species were no longer in contact. As the selective sweep dates back to about 25 000 years ago, the contact should have happened before. Using complete female and male mitochondrial genomes, Śmietanka et al. (2010) identified three historical introgression events between M. edulis and M. galloprovincialis, at 350, 200 and < 100 thousand years ago. It is therefore likely that several contacts have occurred between these two species, probably during glacial periods, in a geographical context that cannot be inferred from the present distribution of the two species. It is interesting to emphasize that the youngest event is a complete capture of the edulis female mitochondrial genome in Atlantic M. galloprovincialis (Quesada et al., 1998a), which illustrates that differential introgression need not always be in the same direction across genome regions.
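The frequency argument above can be illustrated with a simple likelihood comparison. The sketch below asks which of the three A1 frequencies quoted in the text (33% before the sweep, 57% in the Bay of Biscay, 97% in the North Sea) makes an observed proportion of 37% A1 among introgressed edulis alleles most probable under binomial sampling. It is a minimal sketch under stated assumptions: the sample size `N_EDULIS_ALLELES` is hypothetical (the actual counts are those given in Fig. 5), and the comparison ignores drift and selection after introgression.

```python
import math


def binom_loglik(k: int, n: int, p: float) -> float:
    """Log-probability of observing k successes in n binomial draws (0 < p < 1)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log(1.0 - p)


# Hypothetical number of edulis alleles sampled in the Mediterranean population.
N_EDULIS_ALLELES = 100
# Observed proportion of A1 among them, as reported in the text (37%).
k_a1 = round(0.37 * N_EDULIS_ALLELES)

# Candidate source frequencies of A1 in M. edulis, as quoted in the text.
scenarios = {
    "pre-sweep (33%)": 0.33,
    "Bay of Biscay, post-sweep (57%)": 0.57,
    "North Sea, post-sweep (97%)": 0.97,
}

logliks = {name: binom_loglik(k_a1, N_EDULIS_ALLELES, p) for name, p in scenarios.items()}
best = max(logliks, key=logliks.get)

for name, ll in logliks.items():
    print(f"{name}: log-likelihood = {ll:.2f}")
print("best-supported source frequency:", best)
```

With this hypothetical sample size the pre-sweep frequency is favoured, in line with the verbal argument; how strongly it is favoured depends on the real number of alleles sampled.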

Figure 5. Allele frequencies of clade A1 alleles (white), clade A2 alleles (striped) and clade B alleles (black) at the EFbis locus deduced from size allele frequencies. The number of sampled alleles is given within brackets above each histogram.

Heterospecific alleles have therefore probably introgressed Mediterranean M. galloprovincialis populations during a historical contact between these two entities, but the second issue is to understand why they have remained confined in the Mediterranean background at some genomic regions although the differentiation is modest on the rest of the genome. Secondary introgression has been detected at other loci with a similar frequency of edulis alleles in Atlantic and Mediterranean populations (Boon et al., 2009), and we detected no tendency for the Mediterranean population to be more introgressed at the genome level – it is only at outlier loci that this pattern is observed. We offer two hypotheses to explain differential introgression: (i) Introgression could be adaptive, implying edulis alleles were favoured in the Mediterranean Sea (the ‘adaptive introgression’ hypothesis) and (ii) the genetic architecture of the barrier to gene flow could be different between the two M. galloprovincialis backgrounds, implying the Mediterranean background was permeable to edulis introgression at the genome regions considered whereas the Atlantic background is not (the ‘semi-permeable barrier’ hypothesis). To obtain support for the latter hypothesis, we verified if the outlier loci identified between the two M. galloprovincialis populations were also outliers, or in the upper bound of the differentiation distribution, between Atlantic M. galloprovincialis and M. edulis. Except for one outlier locus (AFLP_1-218) and the F-mtDNA ND2-COIII marker, outliers between M. galloprovincialis populations were also outliers between M. edulis and Atlantic M. galloprovincialis (Fig. 6), suggesting that the genome of Atlantic M. galloprovincialis resists introgression from both M. edulis and Mediterranean M. galloprovincialis at most of the regions identified in our genome scan. 
Although this observation is in accordance with the ‘semi-permeable barrier’ hypothesis, it does not provide sufficient proof. Local adaptation could well have favoured the same alleles in M. edulis and Mediterranean M. galloprovincialis in response to a similar selective pressure, or have repeatedly targeted the same genes at different timescales, during species divergence and during population divergence within species. The ODH locus, for instance, is an outlier in both comparisons, but the differentiation within M. galloprovincialis is independent of introgression. Distinguishing the hypotheses on the basis of an FST scan alone seems difficult. A more decisive test would require walking along the chromosome towards the direct target of selection: under the adaptive introgression hypothesis, we should find a genome region fixed for edulis alleles near the outlier loci identified in this study, whereas under the semi-permeable barrier hypothesis introgressed alleles should not necessarily reach fixation in the M. galloprovincialis background.
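The cross-comparison of outliers described above amounts to classifying each locus against two FST thresholds, one per comparison. The following sketch shows that bookkeeping. All locus names, FST values and thresholds are hypothetical placeholders, and `classify_loci` is an illustrative helper, not the procedure used to define outliers in this study.

```python
from typing import Dict


def classify_loci(
    fst_within: Dict[str, float],
    fst_between: Dict[str, float],
    thr_within: float,
    thr_between: float,
) -> Dict[str, str]:
    """Label each locus by its outlier status in the two comparisons.

    fst_within:  FST between the two populations of the same species.
    fst_between: FST between the sister species and one of those populations.
    A locus is called an outlier when its FST exceeds the relevant threshold.
    """
    labels = {}
    for locus in fst_within:
        within = fst_within[locus] > thr_within
        between = fst_between[locus] > thr_between
        if within and between:
            labels[locus] = "outlier in both comparisons"
        elif within:
            labels[locus] = "within-species outlier only"
        elif between:
            labels[locus] = "between-species outlier only"
        else:
            labels[locus] = "not an outlier"
    return labels


# Hypothetical FST values and thresholds, for illustration only.
fst_atl_med = {"locus_A": 0.45, "locus_B": 0.40, "locus_C": 0.02, "locus_D": 0.01}
fst_edu_atl = {"locus_A": 0.80, "locus_B": 0.10, "locus_C": 0.75, "locus_D": 0.05}

labels = classify_loci(fst_atl_med, fst_edu_atl, thr_within=0.20, thr_between=0.60)
for locus, label in labels.items():
    print(locus, "->", label)
```

Loci labelled as outliers in both comparisons are those compatible with the semi-permeable barrier hypothesis, although, as noted above, such overlap alone cannot rule out parallel local adaptation.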

Figure 6. Scatterplot of the genetic differentiation between Mytilus edulis (North Sea) and Atlantic Mytilus galloprovincialis as a function of the differentiation between Atlantic M. galloprovincialis and Mediterranean M. galloprovincialis. Dashed lines represent outlier FST thresholds for both comparisons. Dots represent AFLP markers and codominant nuclear markers, and the cross represents the result for the F-mtDNA ND2-COIII marker (Kijewski et al., 2011). Outliers identified in the two comparisons are in black.

Conclusion

Our study highlights the importance of history, and probably also of the genetic architecture of barriers to gene flow, in the genesis of strong genetic structure at specific loci and genome regions. It is increasingly documented that adaptive polymorphisms can easily cross species boundaries (Arnold, 2004; Anderson et al., 2009; Vekemans, 2010; Dasmahapatra et al., 2012; Staubach et al., 2012), and this source of new adaptation might be more common than traditionally recognized (The FroSpects Gregynog Workshop, 2012). Furthermore, natural populations are often subdivided into partially isolated genetic backgrounds (Charlesworth et al., 2003). These backgrounds can sometimes be recognized as taxonomic entities (ecotypes, races, subspecies or species) but not always. They might have remained invisible to preliminary phylogeographic analysis with a handful of molecular markers if the genetic barrier to gene flow affects unsampled portions of the genome (Bierne et al., 2011). Ignoring these subdivisions and focusing on a local scale without analysing outlying polymorphisms in sister (sub)species might lead to imprecise interpretations. Gagnaire et al. (2011) have recently shown that FST outliers identified within a Pacific eel species were also strongly differentiated between species. The slow leaking of heterospecific alleles in genomic islands of differentiation revealed a within-species population structure that was otherwise invisible with neutral markers. We here provide another example of how hybridization with a foreign genome can influence within-species population structure, and we concur with the recommendation of widening the spatial and temporal scales at which one should investigate and interpret FST scans.

Acknowledgments

We acknowledge Michel Cantou for sampling and Pierre-Alexandre Gagnaire and two anonymous reviewers for useful comments on the manuscript. This study was supported by the Agence Nationale de la Recherche (Hi-Flo project ANR-08-BLAN-0334) and the project Aquagenet (SUDOE, INTERREG IV B). This is article 2012-204 of Institut des Sciences de l'Evolution de Montpellier.

References

Supporting Information


Filename: jeb12046-sup-0001-TableS1-S4.docx
Format: Word document
Size: 24K
Description:

Table S1 Sequences (5′–3′) of oligonucleotide adaptors and primers used for AFLP genotyping.

Table S2 List of codominant markers used in this study.

Table S3 Sequences (5′–3′) of oligonucleotide primers used for the elongation factor 2 gene, the endo-beta-1,4-glucanase gene and the endo-beta-1,4-mannanase gene.

Table S4 Summary of AFLP selective primer combinations used during genotyping.

Please note: Wiley Blackwell is not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing content) should be directed to the corresponding author for the article.