Does natural selection explain the fine scale genetic structure at the nuclear exon Glu-5′ in blue mussels from Kerguelen?

The Kerguelen archipelago, isolated in the Southern Ocean, shelters a blue mussel Mytilus metapopulation far from any influence of continental populations or any known hybrid zone. The finely carved coast leads to a highly heterogeneous habitat. We investigated the impact of the environment on the genetic structure in those Kerguelen blue mussels by relating allele frequencies to habitat descriptors. A total sample comprising up to 2248 individuals from 35 locations was characterized using two nuclear markers, mac-1 and Glu-5′, and a mitochondrial marker (COI). The frequency data from 9 allozyme loci in 9 of these locations were also reanalyzed. Two other nuclear markers (EFbis and EFprem's) were monomorphic. Compared to Northern Hemisphere populations, polymorphism in Kerguelen blue mussels was lower for all markers except for the exon Glu-5′. At Glu-5′, genetic differences were observed between samples from distinct regions (FCT = 0.077), as well as within two regions, including between samples separated by <500 m. No significant differentiation was observed in the AMOVA analyses at the two other markers (mac-1 and COI). Like mac-1, all allozyme loci genotyped in a previous publication, displayed lower differentiation (Jost's D) and FST values than Glu-5′. Power simulations and confidence intervals support that Glu-5′ displays significantly higher differentiation than the other loci (except a single allozyme for which confidence intervals overlap). AMOVA analyses revealed significant effects of the giant kelp Macrocystis and wave exposure on this marker. We discuss the influence of hydrological conditions on the genetic differentiation among regions. In marine organisms with high fecundity and high dispersal potential, gene flow tends to erase differentiation, but this study showed significant differentiation at very small distance. 
This may be explained by the particular hydrology and the carved coastline of the Kerguelen archipelago, together with spatially variable selection at Glu-5′.


Introduction
In marine benthic organisms, a long planktonic larval stage generally allows gene flow between remote populations and consequently neutral genetic differentiation increases only slightly with geographical distance (Launey et al. 2002). Physical isolation (e.g., large distances, oceanic fronts, gyres) enhances genetic differences among populations. Differentiation may also arise locally through adaptation to localized environmental conditions (Maynard Smith 1966; Barton and Hewitt 1985). However, detecting adaptation through natural selection is difficult, mainly because gene flow counters its effects at each generation (Kawecki and Ebert 2004; Sanford and Kelly 2011). Also, large variance in reproductive success (see Hedgecock's sweepstake reproduction hypothesis: Hedgecock 1994; Hedgecock and Pudovkin 2011) can generate transient chaotic patterns of genetic structure, known as 'chaotic genetic patchiness' (Johnson and Black 1984; Broquet et al. 2013), that sometimes resemble local adaptation. Differentiation between neighboring populations may be initiated by localized spatial heterogeneity in the environment such as hydrological characteristics (currents, exposure to wave action, salinity, temperature) or a complex topography (coastal shape, depth). Under conditions of fine-grained environmental heterogeneity, genetic differentiation at a selected locus may be higher than at other loci between populations that differ environmentally, even over short distances (Kawecki and Ebert 2004; Gagnaire et al. 2012).
The present work takes place in the Kerguelen archipelago, isolated in the southern Indian Ocean 4100 km southeast of South Africa and 4000 km west of Australia. The Kerguelen plateau is an obstacle to the eastward flow of the Antarctic Circumpolar Current and creates a large wake zone where water masses strongly mix (Park et al. 2008a,b). The cold superficial Antarctic Waters reach the west coast of the archipelago and separate into two parts drifting southward and northward (Murail et al. 1977; Edgar 1987; Blain et al. 2001). The morphology of the Kerguelen Archipelago results from volcanic activity combined with glacial erosion, which led to a carved coast with protected bays and fjords, and a large enclosed bay with particular environmental conditions, the Gulf of Morbihan. In the immediate coastal perimeter, salinity decreases drastically because of the extensive hydrographic network, and decreases even more strongly in the shallow waters of the gulf or deep inside bays and fjords such as the Fjord des Portes Noires and the Fjord Henri Bossière (Fig. 1) (Arnaud 1974; Murail et al. 1977). This archipelago is also characterized by a high level of endemism (Briggs 1966; McDowall 1968; Poulin and Féral 1995; Hennion and Walton 1997; Brandt et al. 1999; Frenot et al. 2001; Emerson 2002). Given their discrete geographical nature (Emerson 2002) and the spatial heterogeneity of the environment, the Kerguelen Islands seem particularly suited to investigating the association between the environment and population differentiation in a marine species with a long planktonic larval stage (and thereby a putatively high dispersal potential), such as smooth-shelled Mytilus (L.) mussels. The blue mussels from Kerguelen are mainly distributed in the intertidal zone from 0 to 2 m depth (Arnaud 1971, 1974), where the environmental conditions are the most variable. Kerguelen blue mussels have been described as M. desolationis (Lamy, 1936).
However, their current taxonomic status, determined from morphology, allozymes, and nuclear and mitochondrial DNA sequences, is M. edulis platensis, the Southern-Hemisphere subspecies of M. edulis L. (McDonald et al. 1991; Borsa et al. 2012). Genetic differentiation between smooth-shelled Mytilus spp. mussels (M. edulis L., M. galloprovincialis Lmk., M. trossulus Gould) and the phylogeography of these species have been studied extensively (e.g., McDonald et al. 1991; Sanjuan et al. 1997; Daguin 2000; Daguin and Borsa 2000; Hilbish et al. 2000; Gérard et al. 2008), with particular focus on areas of hybridization (Skibinski et al. 1978; Skibinski 1983; Väinölä and Hvilsom 1991; Viard et al. 1994; Gardner 1996; Bierne et al. 2002a,b, 2003). The most famous example of genetic differentiation linked to the environment in blue mussels is the gradient at the lap locus correlated with the salinity gradient along the eastern coast of North America (Koehn 1978). The physiological and selective roles of the lap locus have been highlighted (Hilbish et al. 1982; Hilbish and Koehn 1985a,b) and further studied in blue mussels from Kerguelen and New Zealand (Gardner and Kathiravetpillai 1997; Gardner and Palmer 1998). In Kerguelen blue mussels, genetic differences between populations were apparent at three allozyme loci (lap, pgm, pgd), and the structure was reported to be related to salinity, wave exposure and, to a lesser extent, the maximum shell length (as a proxy of fitness). However, no statistical analyses were conducted to support this conclusion. Theoretically, genetic differentiation may be due to physical barriers to gene flow but also to local adaptation under selective constraints (Williams 1966; Kawecki and Ebert 2004; Perrin et al. 2004). In cases of barriers to gene flow, the differentiation will affect a majority of loci, whereas in cases of local adaptation only a few loci are concerned.
To determine whether the genetic polymorphism of the blue mussel population of Kerguelen is driven by neutral and/or adaptive forces, we (i) investigated the influence of the water circulation around Kerguelen, first on the total genetic structure and second within differentiated groups, and (ii) tested the influence of the habitat type at a smaller scale. To meet these objectives, we used two nuclear markers polymorphic in Kerguelen blue mussels, Glu-5′ (Inoue et al. 1995; Rawson et al. 1996) and mac-1 (Ohresser et al. 1997), and we also considered the sequence polymorphism at the mitochondrial DNA locus COI (Gérard et al. 2008). We tested the polymorphism at EFbis (Bierne et al. 2002a) and EFprem's (this study), two introns of the elongation factor 1 alpha gene, which are physically linked. We collected blue mussel samples from all around the Kerguelen Archipelago, from contrasted habitats roughly described by five qualitative environmental variables. At a finer scale, a dense network of sites in the complex of islands of the Gulf of Morbihan was sampled to explore the distribution of allele frequencies, taking into account the environmental changes over short distances.

Molecular markers
The locus Glu-5′ (Inoue et al. 1995; Rawson et al. 1996) is located at the 5′ extremity of exon Glu coding for an adhesive foot protein (Waite 1992). This locus contains an insertion/deletion (indel) zone, whose amplification reveals three alleles (T, E, and G) that, respectively, distinguish M. trossulus, M. edulis, and M. galloprovincialis in the Northern Hemisphere (Borsa et al. 1999). The locus mac-1 is the first intron of the Mytilus actin gene (Ohresser et al. 1997). Among the 49 size-alleles described in the entire range of Mytilus spp., 22 alleles occur in the Southern Hemisphere, 8 of which have not yet been sampled in the Northern Hemisphere (Daguin and Borsa 2000). The four alleles encountered in the Kerguelen blue mussel population are all shared with Northern-Hemisphere populations (Daguin and Borsa 2000; Borsa et al. 2007). The polymorphism at locus EFbis has been tested in one location in Kerguelen (Mayes Island in the Gulf of Morbihan; Mayes is the location of the sample 'KER' in Borsa et al. 2007) and was low, with two alleles detected (160: frequency 0.01; 161: 0.99) (Daguin 2000). We scored locus EFbis in other samples from around the archipelago. A new EPIC marker (EFprem's) in the second intron of EF1α was also scored. Both EFbis and EFprem's loci were monomorphic in our samples (see Results). Table S1 summarizes primer names, sequences, and annealing temperatures required for the amplification of nuclear and mitochondrial DNA loci.
The genotypes at Glu-5′ and mac-1 were determined from fragment-length variation on, respectively, 2 and 3% agarose gels. The amplification of the Glu-5′ exon by primers Me-15 and Me-17 produced 210-bp (allele E) and 160-bp (allele G) fragments, typical of, respectively, M. edulis and M. galloprovincialis from the Northern Hemisphere (Fig. 2A). At locus mac-1, fragments of 400 and 370 bp were revealed. According to Daguin (2000) and Borsa et al. (2007), the 400-bp fragment at locus mac-1 corresponds to allele c4, whereas the 370-bp fragment corresponds to either allele a2 or a3, which differ from one another by one base pair and cannot be distinguished on agarose gels (Fig. 2B). Consequently, the 370-bp fragment is here noted 'a'. The denomination of COI haplotypes (KERF1 to KERF16) follows Gérard et al.

Genetic analysis
Heterozygosity was estimated by Nei's (1978) nonbiased heterozygosity index (Hn.b.). FIS and FST values were estimated according to Weir and Cockerham (1984) using the FSTAT procedure in the program GENETIX 4.02 (Belkhir et al. 2000). The significance of FIS (respectively, FST) values was assessed after 5000 permutations of alleles (resp. individuals) within (resp. between) samples, thus obtaining the distribution of FIS (resp. FST) pseudovalues under the null hypothesis of panmixia (resp. a nonstructured population). The probability (P) value associated with each FIS or FST estimate was the proportion of pseudovalues generated by 5000 random permutations that were larger than or equal to the observed value. Mantel tests were used to assess the correlation between pairwise FST values and geographical distances (along the coastline) between samples by computing the association statistic Z (Mantel 1967). The P-value of Z was the proportion of pseudovalues, generated by permutations under the null hypothesis of independence of genetic and geographical distances, that were larger than or equal to the observed value of Z. Mantel tests and permutations were computed by the program GENETIX 4.02. The false discovery rate (FDR) correction for multiple comparisons was used to adjust levels of statistical significance (Benjamini and Hochberg 1995).
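The Mantel permutation procedure described above can be illustrated with a minimal sketch. It is not the GENETIX implementation: the function names, the default of 5000 permutations, and the random seed are illustrative assumptions, and the matrices are assumed to be square, symmetric, and given in the same sample order.

```python
import random


def mantel_z(a, b):
    """Mantel's Z: sum of products of matching off-diagonal entries (i < j)."""
    n = len(a)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(i + 1, n))


def mantel_test(genetic, geographic, n_perm=5000, seed=0):
    """Return (Z observed, one-sided permutation P-value).

    Hypothetical helper: sample labels of the genetic matrix are shuffled,
    which permutes its rows and columns jointly, and Z is recomputed.
    """
    rng = random.Random(seed)
    n = len(genetic)
    z_obs = mantel_z(genetic, geographic)
    hits = 0
    for _ in range(n_perm):
        order = list(range(n))
        rng.shuffle(order)
        permuted = [[genetic[order[i]][order[j]] for j in range(n)]
                    for i in range(n)]
        # Count pseudovalues larger than or equal to the observed Z.
        if mantel_z(permuted, geographic) >= z_obs:
            hits += 1
    return z_obs, hits / n_perm
```

Calling `mantel_test(fst_matrix, distance_matrix)` with a pairwise FST matrix and a matrix of coastline distances (both hypothetical inputs here) returns the observed Z and the proportion of permuted values at least as large.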
Analyses of molecular variance (AMOVA; Excoffier et al. 1992) were carried out on Glu-5′ and mac-1 genotype data, as well as on COI sequences (conventional FST based on frequencies) for comparative purposes, using ARLEQUIN 3.0 (Excoffier et al. 2005). Following the results of pairwise FST (see Results), we defined and tested a geographical structure of three groups based on the origin of the samples (North + East, Gulf, South + West; Table 1). In the AMOVA grouping, the unique sample from the west coast (PCu) was lumped with the South group, from which it was not differentiated (see pairwise FST results; Table 4). The exact test of sample differentiation (Raymond and Rousset 1995) was run (20,000 Markov chain steps, 1000 dememorization steps) using this same software, based on mitochondrial haplotype frequencies in the three groups. So-called neutrality tests (Fu's Fs and Tajima's D) were run to check whether the double hypothesis of demographic stability and selective neutrality of the COI marker could be rejected.
We reanalyzed gene frequency data of nine allozyme loci published in  and . Genotypes were not available, but from the gene frequency data we could compute FST using the relationship FST = 1 − Ho/He, Jost's differentiation estimates (see below and Table 2), and exact tests of differentiation based on the contingency tables of the numbers of each allele in each population (Table S2).
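A minimal sketch of the frequency-based relationship FST = 1 − Ho/He follows. The text does not spell out how Ho and He were obtained from frequencies alone, so the sketch assumes that Ho is the mean within-population heterozygosity expected under Hardy-Weinberg proportions and He is the heterozygosity expected from the unweighted mean allele frequencies; the function names are hypothetical.

```python
def expected_heterozygosity(freqs):
    """Gene diversity 1 - sum(p_i^2) for one set of allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def fst_from_frequencies(pop_freqs):
    """FST = 1 - Ho/He from per-population allele frequencies.

    pop_freqs: list of allele-frequency lists, one per population, with
    alleles in the same order. Assumes the pooled locus is polymorphic
    (He > 0) and gives equal weight to each population.
    """
    n_pops = len(pop_freqs)
    n_alleles = len(pop_freqs[0])
    # Assumption: "Ho" is the mean within-population expected heterozygosity.
    h_within = sum(expected_heterozygosity(f) for f in pop_freqs) / n_pops
    mean_freqs = [sum(f[k] for f in pop_freqs) / n_pops
                  for k in range(n_alleles)]
    # Assumption: "He" is the heterozygosity of the pooled frequencies.
    h_total = expected_heterozygosity(mean_freqs)
    return 1.0 - h_within / h_total
```

Because only frequencies are used, sampling variance is ignored; this is the main difference from the Weir and Cockerham (1984) estimator applied to the genotype data.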
Two approaches were used to compare differentiation levels among markers or data sets with contrasted sample sizes and polymorphism levels. (i) Jost's differentiation parameters (D) and confidence intervals were computed using the program SPADE (Jost 2008; Chao and Shen 2010). This was carried out for the common subset of seven populations analyzed both in the present study (for Glu-5′ and mac-1) and in  and  for nine allozyme markers. The interest of this approach is that Jost's D is much less affected by polymorphism level than FST and provides confidence intervals. (ii) The POWSIM application (Ryman and Palm 2006) was used to compare results among markers which did not display similar sample sizes and allele frequency distributions. This application uses simulated data sets corresponding to a model of diverging populations (no migration): a given FST level is chosen by the user by selecting an appropriate pair of values for effective size and divergence time (in number of generations). We thus checked whether the small sample sizes or the reduced polymorphism for the COI and mac-1 data sets, respectively, relative to those of Glu-5′, could have affected our ability to find significant differentiation among regions. The simulations were run at the differentiation level found with the Glu-5′ genotypic data (we used a value similar to both the overall FST among all populations and the global FCT among regions), using the global frequency distributions of COI haplotypes and mac-1 alleles and their respective sample sizes in the three groups of populations (North + East, Gulf, South + West). The output of the program provides the proportion of cases in which significant differentiation is found. A median-joining parsimony network (Bandelt et al. 1999) of COI haplotypes was built using NETWORK 4.1.0.7 (available at www.fluxus-technology.com/).

Figure 2. Kerguelen blue mussels. Individual phenotypes scored on agarose gels at nuclear loci. The left lane is a 100-bp DNA ladder.

Table 2. Kerguelen blue mussels. Jost's D differentiation estimates and their confidence intervals (CI), calculated using a set of seven populations common to the allozyme study of  and  and the present study. Populations were PMt and PCx from the North region, PAF, IS, HdS, and BOS from the Gulf of Morbihan (all samples within the Bossière Fjord were pooled), and BT (Glu-5′ and mac-1) or "Larose" (allozymes) from the South region. Bold values have confidence intervals that do not include zero. Columns: Locus, D, CI.
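To illustrate why Jost's D in approach (i) is less sensitive to polymorphism level than FST, here is a minimal plug-in sketch of D from allele frequencies (Jost 2008). It is not the SPADE estimator, which also corrects for sampling bias and provides confidence intervals; the function name and the equal weighting of populations are assumptions.

```python
def jost_d(pop_freqs):
    """Plug-in Jost's D = [(Ht - Hs) / (1 - Hs)] * [n / (n - 1)].

    pop_freqs: list of allele-frequency lists, one per population, with
    alleles in the same order; n is the number of populations.
    No sampling-bias correction is applied (unlike SPADE).
    """
    n = len(pop_freqs)
    n_alleles = len(pop_freqs[0])
    # Mean within-population gene diversity.
    hs = sum(1.0 - sum(p * p for p in f) for f in pop_freqs) / n
    # Gene diversity of the unweighted mean allele frequencies.
    mean_freqs = [sum(f[k] for f in pop_freqs) / n for k in range(n_alleles)]
    ht = 1.0 - sum(p * p for p in mean_freqs)
    return (ht - hs) / (1.0 - hs) * n / (n - 1)


# Two populations sharing no allele, each with four equally frequent alleles.
pop_a = [0.25, 0.25, 0.25, 0.25, 0.0, 0.0, 0.0, 0.0]
pop_b = [0.0, 0.0, 0.0, 0.0, 0.25, 0.25, 0.25, 0.25]
d_no_shared_alleles = jost_d([pop_a, pop_b])
```

In this toy case D equals 1 (complete differentiation), whereas 1 − Hs/Ht is only about 0.14 because within-population diversity is high.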

Environmental factors
The habitat at each sampling site was described by five qualitative variables: (i) Substrate (rock, blocks, gravels, or sand); (ii) Wave Exposure (sheltered or exposed); (iii) Slope (flat, steep or hangover); (iv) Salinity (oceanic or influenced by freshwater); (v) Macrocystis (presence or absence). The Region (North, South, Gulf, and West) was also considered as a factor in the following statistical analyses. Correlations among environmental factors and the frequency of allele G were assessed by pairwise Spearman's ρ values (Spearman 1904).
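As an illustration of the pairwise Spearman correlation used here, the following sketch ranks both variables (giving tied values their average rank, which matters for two-category habitat variables) and computes the Pearson correlation of the ranks. The numeric coding of categories and the example values are hypothetical, not data from this study.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of average ranks.

    Assumes neither variable is constant (otherwise the denominator is zero).
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical coding: Macrocystis absent = 0, present = 1, against the
# frequency of allele G in six made-up samples.
macrocystis = [0, 0, 1, 1, 0, 1]
freq_g = [0.30, 0.35, 0.55, 0.48, 0.41, 0.60]
rho = spearman_rho(macrocystis, freq_g)
```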
We also used AMOVAs on Glu-5′ and mac-1 genotypic data and COI haplotype data, within each geographical region. For each environmental factor, we grouped samples according to the modality of the variable in order to test the effects of the environment within regions (FCT, Va), independently of the effects of population differentiation between regions. The AMOVA, although it is restricted to investigating nested factors, has two important advantages over parametric analyses (ANCOVAs were performed using the proportion of the G allele at Glu-5′ as the variable to explain, but are not shown): (i) it does not rely upon statistical conditions on the distribution of the data, as the P-value is assessed via permutations (Excoffier et al. 2005), and (ii) it takes into account the statistically important information of the number of individuals in each population.
We detected no polymorphism at loci EFbis and EFprem's scored on agarose gels. At locus Glu-5′, G and E allelic frequencies were 41.6 and 58.4% in the total sample. In the Gulf of Morbihan, the G allele occurred at higher frequencies, sometimes over 50% (samples PR1, IGn, Ar1, Ar2, BoCRD, Bo100am, Bo200am, BoCentre; Table 3, Fig. 3). At mac-1, the allele a had a frequency of 91.9% in the total sample. Allele c4 had the lowest frequency in all samples (from 4 to 19.7%; Table 3, Fig. 3). Average Hn.b. values were 0.453 and 0.152, respectively, for Glu-5′ and mac-1. FIS values at the two nuclear loci were generally not significantly different from 0 (Table 3), except for the sample IH (northern part of the Gulf), which showed a heterozygote excess (FIS = −0.186; P = 0.0362), but this significance level was far from passing the correction for multiple tests. At locus COI, 16 haplotypes were found, and haplotype diversity was about Hn.b. = 0.86 within regions (it was not computed within populations because of small sample sizes). Neutrality tests (Tajima's D and Fu's Fs) within each region were nonsignificant, and the haplotype network appeared balanced (Fig. 4).
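For reference, Nei's (1978) nonbiased heterozygosity for one diploid sample can be sketched as below. The pooled Glu-5′ frequencies come from the text, but the sample size of 50 individuals is a hypothetical value chosen only for illustration, and the function name is an assumption.

```python
def nei_unbiased_heterozygosity(allele_freqs, n_individuals):
    """Nei's (1978) estimator: 2n / (2n - 1) * (1 - sum(p_i^2)).

    allele_freqs: allele frequencies in one sample (summing to 1).
    n_individuals: number of diploid individuals in that sample.
    """
    two_n = 2 * n_individuals
    return (two_n / (two_n - 1)) * (1.0 - sum(p * p for p in allele_freqs))


# Pooled Glu-5' frequencies (G = 0.416, E = 0.584); n = 50 is hypothetical.
h_glu = nei_unbiased_heterozygosity([0.416, 0.584], n_individuals=50)
```

A value computed from pooled frequencies is expected to exceed the average of per-sample values when samples differ in allele frequency, which is consistent with the reported average of 0.453 being below the pooled gene diversity.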

Differentiation among populations
Overall FST values at both nuclear DNA loci were significant (Glu-5′: FST = 0.0627 ± 0.0176, P ≤ 0.0001; mac-1: FST = 0.00945 ± 0.00592, P ≤ 0.004), establishing the presence of significant genetic structure for these loci in Kerguelen. By contrast, the exact test of global differentiation at the COI locus was not significant (P = 0.59856 ± 0.08481), and the single non-negative estimate of FST among regions (pooling individuals from different populations) was 0.007 (nonsignificant).
At Glu-5′, pairwise FST values revealed a differentiation between samples depending on the region to which they belong. Three groups of genetically differentiated samples may be identified (Tables 1 and 3): (i) the northern group presented the highest frequencies of the E allele; (ii) the Gulf group presented the highest frequencies of the G allele; (iii) the southern group was intermediate. The unique sample from the west coast (PCu) was less differentiated from the southern group than from the northern group and the Gulf. At locus Glu-5′, after the FDR correction, only P-values lower than 0.021 remained significant, mainly those concerning pairwise FST between samples from the north coast and the Gulf of Morbihan. At locus mac-1, fewer pairwise FST values were significant (Table 4). However, the samples RdA (East Coast) and PR2 (Gulf) were significantly differentiated from the majority of the other samples, due to their high frequency of allele c4 (17 and 19.4%, respectively). Only PR2 remained differentiated from the remaining samples after FDR correction.
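The FDR correction behind such a P-value cutoff (Benjamini and Hochberg 1995) is a step-up procedure: with m tests, the largest rank k whose sorted P-value satisfies p(k) ≤ αk/m is found, and all tests up to that rank are declared significant. A minimal sketch, with a hypothetical function name and an assumed α of 0.05:

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the test passes the FDR step-up.

    Sort the P-values, find the largest rank k with p_(k) <= alpha * k / m,
    then reject every hypothesis whose P-value has rank <= k.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            largest_k = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= largest_k:
            rejected[idx] = True
    return rejected
```

The effective cutoff therefore depends on the whole set of P-values in the table, which is why a threshold such as 0.021 can emerge rather than a fixed fraction of α.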

Within-region differentiation
Genetic structure was also detected within the three groups of samples genetically differentiated at Glu-5′ (North, Gulf, South). In the northern group, sample AJ was significantly differentiated from all other northern samples (due to its higher frequency of allele G at Glu-5′). In this group, the sample RdA was also differentiated from all northern samples (except I3B) and had the highest frequency of allele E of the whole data set. In the Gulf group, the sample Ar1 was differentiated from all samples of the Gulf except BoCentre, Bo100am, and IGn. Consequently, a significant differentiation was highlighted between samples separated by no more than 500 m: Ar1 and Ar2. At locus mac-1, similarly, samples PR1 and PR2 appeared differentiated but only before FDR correction (Fig. 1, Table 4). At the scale of the archipelago, no correlation was detected between genetic differentiation and geographical distance at any locus, except along the north coast at locus Glu-5′, for samples PCh to RdA (from northwest to east) (P ≤ 0.04).

Confidence intervals of differentiations and power analyses
Jost's D values were computed for a set of seven populations from the North, Gulf, and South regions, for Glu-5′, mac-1 and nine allozymes. The maximum value (D = 0.049) was obtained at Glu-5′, and its confidence interval only overlapped that of the enzyme PGD (D = 0.032), which also appeared particularly differentiated (Table 2). FST values were all lower than those at Glu-5′. Additionally, whereas Jost's D confidence intervals of all loci except Glu-5′ and PGD included 0.000, the P-values of the exact tests of overall differentiation were generally significant or highly significant (Table S2). For mac-1 and COI, we used POWSIM to simulate three populations with sample sizes and global allele frequencies corresponding to the three regions for these markers, with an FST of 0.07, because Glu-5′ displayed an overall FST of 0.067 in Kerguelen and an FCT of 0.077 in the AMOVA with regional groups. This value of FST was obtained by simulating a fission of three populations of Ne = 1000 each that occurred 145 generations ago, parameters which allowed maintaining the observed polymorphism. Sample sizes corresponded to sample sizes for mac-1 and COI in each region: 181, 725, 195 and 27, 19, 37, respectively. The allele frequencies used were 0.919 and 0.081 for mac-1, and 0.012, 0.012, 0.012, 0.012, 0.012, 0.012, 0.012, 0.012, 0.012, 0.012, 0.100, 0.144, 0.170, 0.060, 0.130, and 0.276 for COI simulations.
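The link between the simulation parameters and the target FST can be checked with the standard expectation for isolated populations diverging by drift alone, FST = 1 − (1 − 1/(2Ne))^t. The sketch below assumes this is the relationship used to choose Ne and t; the function name is hypothetical.

```python
def expected_fst_under_drift(ne, generations):
    """Expected FST after `generations` of pure drift without migration,
    for populations of effective size `ne`: 1 - (1 - 1/(2 Ne))^t."""
    return 1.0 - (1.0 - 1.0 / (2.0 * ne)) ** generations


# Parameters quoted in the text: Ne = 1000 and t = 145 generations.
fst_target = expected_fst_under_drift(1000, 145)
```

With Ne = 1000 and t = 145 the expectation is close to 0.07, matching the differentiation level chosen for the simulations.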
POWSIM simulations indicated that an overall FST of 0.07 would generate significant differences in more than 99.5% of the cases for the COI data set and in more than 92.9% of the cases for the mac-1 data set. When there is polymorphism within populations, the maximum value of fixation indices such as FST does not reach one even when no allele is shared among populations (Jost 2008; Meirmans and Hedrick 2011). Glu-5′ and mac-1 both have only two alleles; thus, when comparing only two populations, FST values can in theory reach one and the level of intrapopulation diversity should not affect the range of possible FST values. For COI, however, the maximum possible value of FST was lower than at Glu-5′, thus a given FST value corresponds to less differentiation in Glu-5′ than in COI (Jost 2008; Meirmans and Hedrick 2011). Thus, the value of 99.5% of significant FST given by POWSIM for COI is an overestimate.

Table 4. Kerguelen blue mussels. Pairwise FST (Weir and Cockerham 1984) values at Glu-5′ (above diagonal) and mac-1 (below diagonal) loci.

Statistical analyses with environmental variables
For each environmental variable, the genetic differentiation between samples grouped by category was assessed by pairwise FST for each nuclear marker (Table 5). None was significant at mac-1, but many were significant at Glu-5′. For the variable "Substrate", samples collected on rocks, blocks, gravels, or sand were not significantly differentiated from each other. Regarding "Slope", only samples collected on flat shores and hangovers were differentiated (P ≤ 0.0001), whereas steep-shore samples were not significantly differentiated from either flat or hangover locations. For the three remaining factors, "Macrocystis", "Wave exposure", and "Salinity", the samples grouped by category were highly differentiated (P ≤ 0.0001). Concerning the variable "Region", the single western sample was differentiated from northern samples only (P ≤ 0.004). At Glu-5′, AMOVAs performed at the scale of the archipelago for each environmental variable, grouping the 35 samples by categories (Table 5), revealed a significant differentiation between samples with and without Macrocystis (FCT = 0.0239, P = 0.00684). The groupings were not significantly differentiated for the other factors at Glu-5′, nor for any factor at the mac-1 and COI loci. As we tested five environmental variables, the P-values must be corrected for multiple tests. The effect of "Macrocystis" presence remained significant after this correction.
Environmental variables "Substrate", "Wave exposure", "Slope", "Salinity", and "Macrocystis" were significantly correlated with one another, except the pair "Substrate/Macrocystis". The frequency of allele G was correlated with "Macrocystis" and "Region" only, whereas the variable "Region" was correlated with none of the other variables (Table 6).
AMOVAs by environmental variables, restricted to the 22 Gulf samples, revealed a significant effect of "Wave Exposure" (accounting for 1.23% of the molecular variance between groups; P = 0.0088) and also of the presence of "Macrocystis" (FCT = 0.0125, P = 0.0489), but neither passed the FDR correction for five tests. The presence of Macrocystis in a population was correlated with wave exposure in the Gulf, as most populations where Macrocystis occurs are exposed to waves (except PR2, which is sheltered). The effect of slope was nearly significant (P = 0.058 ± 0.007). At locus mac-1, samples from the Gulf were differentiated (FCT = 0.0168, P = 0.0289) only when grouped by "Substrate" category, but this significance level did not pass the FDR correction. In the regions North (eight samples) and South (four samples), the environmental groupings did not reveal any significant differentiation at Glu-5′ or mac-1. At locus COI, none of the environmental groupings of samples was significant.

Discussion
Polymorphism at the three loci and possible departures from neutral expectations
The locus Glu-5′ has traditionally been considered as diagnostic between smooth-shelled Mytilus species in the Northern Hemisphere (Inoue et al. 1995; Rawson et al. 1996; Borsa et al. 1999; Daguin and Borsa 2000; Daguin; Luttikhuizen et al. 2002; Gilg and Hilbish 2003a,b; Hilbish et al. 2003), although low frequencies of heterospecific alleles have been reported (Hamer et al. 2012). In Kerguelen, Glu-5′ is polymorphic for heterospecific alleles and at Hardy-Weinberg equilibrium, which was unexpected in a genetic context other than the M. edulis/M. galloprovincialis hybrid zone in the Northern Hemisphere (Borsa et al. 2007). Mitochondria of Kerguelen blue mussels belong to the S1 clade, which is endemic to the Southern Ocean (Gérard et al. 2008). The Kerguelen archipelago thus shelters the only wild and stable population (i.e., outside a hybrid zone) of Mytilus known so far whose polymorphism at Glu-5′ is not in linkage disequilibrium with any of the typical genomes of northern M. edulis, M. galloprovincialis, or M. trossulus. Unexpected genetic structure was here revealed at Glu-5′ not only at the scale of the archipelago, but also at a much smaller geographic scale, down to a few hundred meters. There was a clear break in allelic frequency at Glu-5′ between samples from the Gulf and the north coast. The highest frequencies of the allele G occurred in the western part of the Gulf, far from the influence of outer marine waters, and reached 60% near Mayes Island (Fig. 1) (Daguin 2000; Borsa et al. 2007).

Table 5. FST (Weir and Cockerham 1984) and FCT (AMOVA) calculated between samples of Kerguelen blue mussels grouped by categories for each environmental variable at the three loci (Glu-5′, mac-1, COI), considering either the whole archipelago or the Gulf of Morbihan. Bold values are significant after FDR correction for multiple tests per column.
Some COI haplotypes in Kerguelen blue mussels also occur in southern South America (Gérard et al. 2008). Here, we confirm the homogeneity of COI haplotype frequencies across the four regions of the archipelago. The shape of the haplotype network is compatible with a stable effective size of the Kerguelen blue mussel population and with selective neutrality at this locus.
To summarize, in Kerguelen the polymorphism at Glu-5′ is higher than anywhere else, whereas the polymorphism at all other nuclear loci tested (mac-1, EFbis and EFprem's) is lower (Bierne et al. 2002b; this study). The haplotype diversity at the mitochondrial locus COI is also lower in Kerguelen than in Patagonia (Gérard et al. 2008), and allozyme loci are also less polymorphic in Kerguelen than in Northern-Hemisphere populations of M. edulis (Blot et al. 1988). The smaller size of the Kerguelen metapopulation, compared to other less isolated populations worldwide, may explain its lower polymorphism (except at locus Glu-5′). Local adaptation appears to be a plausible cause for the maintenance of alleles at balanced frequencies at Glu-5′ in the heterogeneous environment of the Kerguelen archipelago.

The three markers revealed distinct patterns of differentiation between samples
The level of differentiation is much higher at Glu-5′ than at mac-1, COI and eight allozyme loci out of nine. One can hypothesize that allele differences at locus mac-1 may have escaped detection because of the low resolution of agarose gels and that the power to detect possible differences at locus COI may have been hampered by insufficient sample sizes. However, these hypotheses were ruled out by analyses of Jost's D and its confidence intervals, as well as by the POWSIM analyses, suggesting that Glu-5′ is subject to different constraints.
Thus, Glu-5′ actually reveals highly significant genetic differentiation at all levels, among and within regions, and between environments. Three possible explanations arise: (i) the power analyses might be unreliable, because POWSIM uses a model of fission which may not represent the actual situation well (but Jost's D and confidence intervals are not subject to such doubts); (ii) larvae may preferentially settle (by habitat choice) in certain environments according to their genotype at Glu-5′ or other physically linked loci; (iii) mortality or fecundity may vary among locations according to genotype at Glu-5′ or physically linked genes (i.e., differential selection). Marine species may be subjected to high variance of reproductive success (Hedgecock's sweepstake reproduction hypothesis), which together with collective dispersal of related individuals can generate complex patterns of genetic structure known as chaotic genetic patchiness (Broquet et al. 2013). A skewed offspring distribution also generates departure from the standard Kingman's coalescent and an increased heterogeneity in differentiation levels (Eldon and Wakeley 2009). Glu-5′ therefore seems to be an outlier displaying particularly high genetic differentiation among Kerguelen populations. However, this observation alone is not sufficient to support hypotheses of natural selection. We will thus use an additional prediction that is not well explained by purely neutral processes: an association between genetic differentiation and environmental distance (Coop et al. 2010).

Geographic pattern of genetic differentiation associated with ocean circulation
Patterns of genetic differentiation among Kerguelen blue mussels from different groups (North + East, South + West, and Gulf of Morbihan) similar to those revealed here at locus Glu-5′ have been previously reported at allozyme loci.
After the FDR correction for multiple tests, the most significant differentiation was observed between the north coast and the Gulf. Indeed, the boundary between these regions, located between samples RdA and PAF, displays the strongest break in allele frequencies at Glu-5′. Sample RdA is also differentiated from all others at mac-1, suggesting restricted gene flow toward the easternmost point of Kerguelen. As in earlier work, we relate this restricted gene flow to the hydrology and water-mass circulation around the archipelago (Murail et al. 1977). All samples are located in the 'Coastal Hydrological Region', which has the most variable physical parameters, even at fine scale, and overall a lower salinity than offshore oceanic waters. However, at a wider scale, the south coast and the northern point of the archipelago receive the same water mass coming from the west (the ACC) but remain isolated from each other, leading to genetic differentiation between samples from these two regions. The water masses flowing along the north and south coasts only mix far offshore, in the northeastern wake zone of the archipelago (Murail et al. 1977). The presence of eddies that retain larvae on the shelf and then carry them from one site to another over relatively short distances may explain the pattern of isolation by distance observed along the north coast at Glu-5′. Koubbi et al. (2000) suggested that Lepidonotothen squamifrons larvae are retained by a coastal gyre in the Golfe des Baleiniers (the open area off Port Couvreux (PCx), north coast) and also noted the lability of this gyre and the consequent mixing of coastal and oceanic waters during the winter period, when winds are strongest (Razouls et al. 1996; Koubbi et al. 2000). Thus, at the inter-regional scale, hydrological characteristics can account for the main genetic differentiation observed through their effect on migration (i.e., without the need to invoke selection).

Very fine scale differentiation does not support selective neutrality
In the Gulf, a particularly enclosed area, genetic differentiation was observed at very fine scale: between the two samples from the Armor locality, Ar1 and Ar2, which are separated by a very short distance (500 m) relative to the dispersal potential attributed to Mytilus mussels. No such differentiation is observed at the locus mac-1. In Armor (Ar), the habitat differs markedly between samples 1 and 2. Ar1 is located near an important freshwater source, where Macrocystis is lacking, and has a higher frequency of allele G than its neighbor Ar2. Outside the Gulf, at FPN, a comparable habitat (fjord with a freshwater source), we observed the same trend: a higher frequency of allele G compared with other south coast samples (see Table 3). This trend suggests an influence of these protected, low-salinity, sandy habitats on the blue mussels, expressed as a higher frequency of allele G. However, a third sample from a comparable habitat shows the opposite trend: RdA has the lowest frequency of allele G in the whole data set (9%; see Table 3).
Genetic differentiation caused by selective pressure from the environment?
At Glu-5′, at the scale of the archipelago, the differentiation between groups and between categories of samples with and without Macrocystis was significant. Typical habitats of protected areas with flat sandy bottoms and low-salinity waters, which are more frequent in the Gulf of Morbihan, lack Macrocystis kelp beds. Conversely, the open coasts are mostly exposed rocky shores bordered by Macrocystis beds. Consequently, testing for differentiation between samples from the Gulf and those from the south and north coasts amounts to testing for differentiation between samples from habitats without and with Macrocystis kelp beds, respectively. The genetic differentiation among the three main geographic regions may therefore mask the environmental effect on the genetic data (or the reverse). Analyzing the environmental effect within groups avoids this 'region' effect. At the within-group scale, the results differed among groups, mainly because of contrasting sampling. More precisely, the absence of a significant effect of any environmental factor on the Glu-5′ data in the north and south coasts may be due to the low number of samples (8 and 5, respectively) and/or to a lower power of Glu-5′ in these regions compared with the Gulf. In the Gulf, Hn.b. is the highest and both alleles have similar frequencies, which allows better detection of small differences. Indeed, within the Gulf, the substantial effect of the presence/absence of Macrocystis beds on sample differentiation was recovered, and an effect of wave exposure was also revealed (see AMOVA results). A result that remains significant after the FDR correction cannot be considered an artifact of the number of AMOVAs that were carried out.
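The logic of the FDR correction invoked above can be illustrated with a short Python sketch. It assumes the Benjamini-Hochberg step-up procedure, which is one common way to control the false discovery rate; the text does not state which FDR procedure was applied, so this choice, the function name `benjamini_hochberg`, the threshold `alpha`, and the example p-values are all illustrative assumptions rather than the analysis actually performed.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a list of booleans: True where the test is declared
    significant under the Benjamini-Hochberg step-up procedure."""
    m = len(pvalues)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])

    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            k_max = rank

    # Reject every hypothesis ranked at or below k_max.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            significant[idx] = True
    return significant


# Hypothetical p-values from four tests (not results from this study).
example_p = [0.001, 0.04, 0.03, 0.2]
example_flags = benjamini_hochberg(example_p)
```

In this hypothetical example only the first test remains significant: the thresholds for ranks 1 to 4 are 0.0125, 0.025, 0.0375 and 0.05, and only the smallest p-value falls under its threshold. The more tests are run, the stricter the threshold for the smallest p-values becomes, which is why a result surviving the correction is not attributable to the number of tests alone.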
The environmental effects found by the AMOVAs (even within the Gulf of Morbihan) do not necessarily reflect habitat choice or differential selection linked to Glu-5′ genotypes. Geographically close populations tend to share environmental characteristics even within a region (for instance, the numerous samples from the Henri Bossière Fjord are all similar). Thus, if fine-scale structure exists because of any other factor, an indirect correlation may produce a statistical effect of the environment even in the absence of a causal relationship.
To conclude, three independent lines of evidence suggest that Glu-5′ is affected by selection (or habitat choice): (i) the high polymorphism at this locus in Kerguelen, (ii) higher and more significant FST and FCT at Glu-5′ than at the other loci, and (iii) the significant effects of environmental factors in the AMOVAs, even within regions. However, none of these is sufficient proof of selection by itself.

Supporting Information
Additional Supporting Information may be found in the online version of this article: Table S1. Molecular markers: locus name, source, primer sequence, annealing temperature (T°C) and fragment length (L) in base pairs. Table S2. Kerguelen blue mussels.