Seascape genomics of eastern oyster (Crassostrea virginica) along the Atlantic coast of Canada

Abstract Interactions between environmental factors and complex life‐history characteristics of marine organisms produce the genetic diversity and structure observed within species. Our main goal was to test for genetic differentiation among eastern oyster populations from the coastal region of Canadian Maritimes against expected genetic homogeneity caused by historical events, taking into account spatial and environmental (temperature, salinity, turbidity) variation. This was achieved by genotyping 486 individuals originating from 13 locations using RADSeq. A total of 11,321 filtered SNPs were used in a combination of population genomics and environmental association analyses. We revealed significant neutral genetic differentiation (mean F ST = 0.009) between sampling locations, and the occurrence of six major genetic clusters within the studied system. Redundancy analyses (RDAs) revealed that spatial and environmental variables explained 3.1% and 4.9% of the neutral genetic variation and 38.6% and 12.2% of the putatively adaptive genetic variation, respectively. These results indicate that these environmental factors play a role in the distribution of both neutral and putatively adaptive genetic diversity in the system. Moreover, polygenic selection was suggested by genotype–environment association analysis and significant correlations between additive polygenic scores and temperature and salinity. We discuss our results in the context of their conservation and management implications for the eastern oyster.

As a result, marine species have been understudied in relation to spatial and environmental (i.e., seascape) factors driving their evolution (Selkoe et al., 2016). Recently, advances in genomic tools, with sufficient resolution to study population genetic processes in marine systems, have resulted in considerable interest in the fast-growing field of marine and seascape genomics (Gagnaire & Gaggiotti, 2016;Selkoe et al., 2016) and the ability for genomics to enhance management of marine resources . These studies have improved our understanding of how genomic variation in marine and coastal ecosystems is spatially distributed, and how these patterns are influenced by environmental conditions DiBattista et al., 2017;Diopere et al., 2017;Lal, Southgate, Jerry, Bosserelle, & Zenger, 2017;Metivier, Kim, & Addison, 2017;Sandoval-Castillo, Robinson, Hart, Strain, & Beheregaray, 2018;Van Wyngaarden et al., 2018;Xuereb, Benestan, et al., 2018;Xuereb, Kimber, Curtis, Bernatchez, & Fortin, 2018). While marine species are typically characterized by low absolute levels of genetic differentiation relative to their terrestrial counterparts, these studies have revealed complex interactions between spatial and environmental processes, mediated by a species' particular life-history traits, operating at a different scale than in familiar terrestrial systems.
While many marine species have important ecological and economic roles, management decisions are often made without critical population genetic information . This is particularly true for invertebrate species, which in general have received much less attention than marine vertebrates. Bivalves in particular occupy important ecological and economic roles in coastal ecosystems throughout the world, and have been the object of intense anthropological influence (Dumbauld, Ruesink, & Rumrill, 2009;Martinelli, Soto, González, & Rivadeneira, 2017;National Research Council, 2010;Shumway, 2011).
Oysters are the most economically important groups of bivalves (FAO, 2018), as well as being valued for the important ecosystem services they provide (zu Ermgassen, 2013). In North America, the eastern oyster (Crassostrea virginica Gmelin) has historically been the most important bivalve species. This species occupies the shallow waters of bays, lagoons and estuaries along the east coast of North America (Jackson, 2001;Policy & Economics Branch, Gulf Region, 2003). It has a planktonic larval stage of approximately three to four weeks (Booth & Sephton, 1993;Policy & Economics Branch, Gulf Region, 2003;Vercaemer, St-Onge, Spence, Gould, & McIsaac, 2010), during which the larvae can perform active vertical movement and thus bias their drift horizontally using horizontal and tidal currents (Abbe, 1986;Andrews, 1983;Mann, 1988;Seliger, Boggs, Rivkin, Biggley, & Aspden, 1982). After this period, the larvae settle themselves on hard surfaces by ejecting an adhesive, become "spat" (juvenile) and remain sessile for the rest of their lives (Eastern oyster Biological Review Team, 2007;Policy & Economics Branch, Gulf Region, 2003). Crassostrea virginica is naturally distributed from the Gulf of Mexico to New Brunswick, Canada, where it historically formed reefs comprised of millions of individuals (Wilberg, Livings, Barkman, Morris, & Robinson, 2011), providing physical structure for a range of marine species and important filtration services (Buroker, 1983;National Oceanic and Atmospheric Administration (NOAA) 2018a). Many of these large oyster reefs have been disrupted by overfishing in the 19th-20th centuries, and contemporary oyster populations are thought to be at less than 1% of their historical abundance (Mackenzie, 2007;Wilberg et al., 2011). The eastern oyster is still the object of an important wild-capture fishery and aquaculture industry throughout its range (FAO, 1992), and several major restoration projects are currently underway in the United States (Hudson River Foundation, 2018;NOAA, 2018b;Oyster Restoration Workgroup, 2018;Virginia Department of Environment Quality, 2018).
In Canada, oysters have been an important resource for indigenous peoples and colonists (Lavoie, 1978). As in the rest of its range, Canadian oyster beds were subject to intense exploitation, to which they are especially vulnerable due to their slower growth and maturation at this northern range limit (Comeau, Mayrand, & Mallet, 2012). In an effort to replenish the fisheries, oyster spat capture and translocation has been long-standing features of the fishery practices which are thought to have contributed to the introduction and spread of a major epizootic event, now known as Malpeque disease.
In 1915, mass mortality events were noted in Malpeque Bay, Prince Edward Island (PEI), Canada, and are thought to have been caused by transfer of oyster spat from New England in 1913-1914, in an effort to rebuild overharvested stocks (McGladdery & Zurbrigg, 2006;McGladerry & Bower, 1999;Medcof, 1961). Despite regulations prohibiting the transfer of oysters from Malpeque Bay after mortality was observed, the disease spread to other PEI bays, with characteristic mass mortalities of >90% across all age classes.
Eventually, the Malpeque stock appeared to acquire resistance to Malpeque disease and disease-resistant broodstock were transferred to other bays in an effort to accelerate the recovery of oyster populations (McGladdery & Zurbrigg, 2006). By the 1950-1960s, Malpeque disease appeared in New Brunswick (NB) and Nova Scotia (NS), presumably from prohibited transfers of asymptomatic disease carriers from PEI, causing the same widespread mortalities and decimating oyster beds.
The recovery strategy was to replace susceptible oysters from NB and NS with PEI disease-resistant oysters through mass transplantation of adult oysters and seed (Drinnan, 1967;Drinnan & England, 1965;Drinnan & Medcof, 1961;Found & Logie, 1957;McGladdery & Zurbrigg, 2006). The strategy appeared to be successful in re-establishing oyster populations, and subsequently, all oyster populations that were subject to these massive transplantations were assumed to be genetically homogeneous (Vercaemer et al., 2010). In addition to these massive historical transfers, spat collection and translocation at all geographic scales (within and between bays, regions and provinces) have been a continuing practice among oyster fishers and aquaculturists (Policy & Economics Branch, Gulf Region, 2003), which is expected to continue to exert a homogenizing influence.
Despite these large historic disturbances, oyster populations in the Maritimes may be genetically differentiated. In the northern part of their range, eastern oysters are found in semi-closed environment like estuaries, lagoons and bays, and oyster populations from different bays are not contiguous. Coastal habitats in the southern Gulf of Saint Lawrence are considerably fragmented and diversified ecologically (Dutil et al., 2012). Accordingly, selective forces could be large enough to maintain differentiation in the face of gene flow. Additionally, sweepstakes reproductive success (SRS), which refers to a large variance in individual reproductive success (Cushing, 1990;Waples, 1998), has been documented in several shellfish species (including oysters) and could imply an important effect of genetic drift (Hedgecock & Pudovkin, 2011). Yet, a low-resolution microsatellite study (Vercaemer et al., 2010)  Lawrence. On the other hand, empirical studies have found population-of-origin effects on oyster phenotype for these same populations (Mallet & Haley, 1983;Newkirk, 1978).
Given the increasing importance of eastern oysters as an aquaculture species (Atlantic Canada Opportunities Agency, 2012; Government of New Brunswick, 2018) and the availability of modern, high-resolution genomic methods, it is important to revisit the population genetic portrait of eastern oyster populations in the Maritime provinces of Canada. We used wild oysters collected from PEI, NB and NS to test the prediction of structured populations of the eastern oyster against that of homogenized genetic structure.
We also investigated the influence of spatial and environmental variation on observed patterns of genetic differentiation. This is a critical preliminary step towards improving oyster management and production in Atlantic Canada.

| Sampling
Oysters were collected in the summer of 2014 by a combination of hand-picking, snorkelling and dredging in 13 bays, with 11 being situated along the eastern coast of NB, one in NS and one in PEI (Canada) (Figure 1 and

| Library preparation and sequencing
Genomic DNA was extracted using a modified low-salt CTAB extraction protocol (Arseneau, Steeves, & Laflamme, 2017)

| Bioinformatics and genotyping
Read quality was evaluated using FastQC v0.11.5 (Andrews, 2010 We generated two data sets: (a) a haplotype data set for analyses of genetic diversity and (b) a SNP data set for analyses of population genetic structure. For the haplotype data set, we excluded loci for which there were more than two haplotypes in five individuals or more (we discarded individuals with more than two haplotypes) and a maximum of 20% missing haplotypic data.  (Danecek et al., 2011), PLINK software v1.90b5 (Purcell et al., 2007) and custom scripts (https://github.com/enormandeau). Patterns of identity by missingness (IBM) were examined using a multidimensional scaling analysis implemented in PLINK to evaluate clustering bias that could be related to missing data shared between sequencing lanes or sampling sites.

| Detecting SNPs putatively under selection and defining data sets
Loci under selection are expected to behave differently from neutral loci in terms of population-related patterns of diversification and are thus inappropriate to infer population demographic history (Beaumont & Nichols, 1996;Holderegger, Kamm, & Gugerli, 2006;Luikart, England, Tallmon, Jordan, & Taberlet, 2003). We were used to define a putatively neutral SNP data set, as well as a data set of SNPs putatively under divergent selection based on outliers detected using all three methods. Finally, as some analyses assume that genetic markers are not strongly linked, we tested the SNP data set for linkage disequilibrium (LD) (between each pair of SNPs within single chromosomes, n = 10) using VCFTOOLS.
For each pairwise comparison with an associated R 2 > 0.8, we randomly removed one linked SNP until no R 2 value ≥ 0.8 was observed.

| Missing data imputation
As some analyses cannot handle missing data, we performed an imputation of missing genotypes (4.7% missing genotypes overall), using the default options of the "on-the-fly" Random Forest algorithm (Ishwaran & Kogalur, 2015;Ishwaran, Kogalur, Blackstone, & Lauer, 2008), implemented in the R package RADIATOR v0.0.11 (Gosselin, 2018a). This approach is an efficient machine-learning method known for its excellent prediction performance and capacity to address interactions and nonlinearity, in addition to avoiding overfitting and measures variable importance for variable selection (Breiman, 2001;Tang & Ishwaran, 2017). This was achieved by first evaluating overall genetic differentiation using the putatively neutral SNPs based on estimates of F ST (Weir & Cockerham, 1984) between sampling sites using the R package ASSIGNER v0.5.0 (Gosselin, Anderson, & Bradbury, 2017), with 95% confidence intervals (5,000 bootstraps). The results from this analysis were used to subsequently infer groups required for missing data imputation. Pairwise comparisons between sampling sites with 95% confidence intervals (CIs) including zero were defined as a single group for imputation.

| Genetic diversity
The haplotype data set and the R package STACKR v2.0.4 (Gosselin, 2018b) were used to calculate number of monomorphic and polymorphic loci for each sampling site, in addition to nucleotide diversity (P i , calculated individually) and the number of consensus reads (i.e., reads with no polymorphism throughout the whole data set).
The average P i values were compared between sampling sites using a t test in R.
The SNP data set was used to calculate observed (H o ) and expected (H e ) heterozygosity using GENODIVE 2.0b27 (Meirmans & Tienderen, 2004), as well as deviations from the Hardy-Weinberg equilibrium (HWE) with F IS inbreeding coefficient (10,000 permutations) using the R package HIERFSTAT (Goudet, 2005;Goudet & Jombart, 2015). These statistics were evaluated for each sampling site using the full set of SNPs and were compared between the nonimputed and imputed data sets.

| Genetic differentiation, population structure and population assignment
All analyses related to neutral population genetic structure were performed using the imputed data set of putatively neutral SNPs and excluding SNPs with strong LD (i.e., R 2 value ≥ 0.8). Differentiation between sampling sites was first evaluated using pairwise estimates of F ST (Weir & Cockerham, 1984)  between sampling sites using ADEGENET (Jombart, Devillard, & Balloux, 2010). Geographic distances were estimated by using the shortest marine distance between the average sampling site coordinates of each bay. ADEGENET was also used to assess population clustering with a discriminant analysis of principal components (DAPC). We performed DAPC using the optimal number of clusters as determined based on the Bayesian information criterion (BIC) (Fraley & Raftery, 1998;Jombart et al., 2010;Lee, Abdool, & Huang, 2009). A second DAPC was also conducted using the sampling sites as a prior (K = 13). The optimal α-score was used to choose the optimal number of principal components (n = 45), and we used all discriminant functions (n = 12) for both analyses.
A hierarchical analysis of molecular variance (AMOVA) was performed using GENODIVE. Sampling sites were first grouped according to pairwise F ST values (pairs of sites with F ST < 0.001 were considered as a single population). A second grouping (hereafter "clusters") was based on the groups observed using the combined results of the dendrogram and DAPC approaches (see Results).
Finally, an individual assignment test was conducted using the approach developed by Paetkau, Calvert, Stirling, and Strobeck (1995) and implemented in GENODIVE. To avoid the problem of high-grading bias (see Anderson, 2010;Waples, 2010), we used all markers from the SNP data set (one SNP per locus, imputed data set) as recommended by Benestan, Gosselin et al. (2016). The null distribution of likelihood values was generated using a Monte Carlo test (10,000 permutations; Cornuet, Piry, Luikart, Estoup, & Solignac, 1999). As all possible source populations may have not been sampled, we used the home likelihood criterion (L H ) to detect putative migrants, and a threshold alpha value of 0.002 was used to determine whether individuals were tagged as migrants (see Berry, Tocher, & Sarre, 2004). We performed the assignment analysis by sampling sites and then by clusters. In order to assess the influence of unbalanced sample sizes, for each cluster, we used randomly chosen individuals to form equal sample sizes (n = 33, smallest sample size).

| Association between spatial structure, environmental factors and genetic variation
We used a spatial eigenfunction approach to evaluate spatial structure between sampling sites based on distance-based Moran's ei-  Dutil et al., 2012). The data set describes the coastal and epipelagic (0-30 m) habitats of the estuary and Gulf of Saint Lawrence in a grid divided into 6.25 km cells. Of the available variables, we chose the ones that we considered the most likely to be important for oysters (eight temperature, three salinity and three turbidity variables). We considered surface sea temperature only, as the available data were more detailed and it tends to be correlated with bottom temperatures in the studied region (Brickman & Drozdowski, 2012;Drinkwater & Gilbert, 2004;Dutil et al., 2012). Correlation between environmental variables was evaluated using the Pearson correlation coefficient.
When two variables were highly correlated (|r| ≥ 0.7), only one variable was retained. The resulting set of variables included seven variables related to temperature (mean surface temperature, minimum surface temperature, number of weeks with temperature between 2 and 6°C, 6 and 10°C, 10 and 14°C, 14 and 18°C, and above 18°C), two to turbidity (maximum monthly turbidity and minimum monthly turbidity) and one to salinity (mean surface salinity). Missing values (0.4%) for environmental data were replaced with the median using the function na.roughfix in the R package RANDOMFOREST (Liaw & Wiener, 2002).
We evaluated the influence of spatial distribution of environmental factors on putatively neutral and adaptive genetic variation throughout the studied system using redundancy analyses (RDAs) (Buttigieg & Ramette, 2014;Legendre & Legendre, 2012 We also performed a RDA as a multilocus genotype-environment association method to detect loci putatively under selection based on correlations with environmental variables. This approach can detect (even weak) multilocus signatures of selection for multiple environmental predictors, especially compared to differentiationbased outlier detection methods (Forester, Lasky, Wagner, & Urban, 2018;Rellstab, Gugerli, Eckert, Hancock, & Holderegger, 2015).
We used genotypes for all SNPs (one SNP per locus) and the same environmental variables as in the previous RDAs. The significance (alpha ≤ 0.05) of the global RDA and significance of each RDA axis were assessed using an ANOVA with 1,000 permutations, as above.
Outlier SNPs were defined using the distribution of their loadings on each significant RDA axis, where SNPs showing a loading located in the tail of the distribution are more likely to be under selection. We used a cut-off of ±3 SD from the mean loading of each axis to identify candidate SNPs, as suggested by Forester et al. (2018) to minimize false-positive and false-negative results. Thereafter, the correlation between each candidate SNP and the environmental variables was evaluated using the Pearson correlation coefficient.
As a final step for the multilocus approach, we calculated a polygenic score to assess the individual cumulative adaptive genetic variation associated with each environmental variable tested (Babin, Gagnaire, Pavey, & Bernatchez, 2017;Gagnaire & Gaggiotti, 2016).
For each individual, we used the genotypes (0, 1, 2) to generate a score for each outlier SNP, for which we evaluated the relationship with their correlated environmental variable. If the slope of the relationship was negative, scores were inverted (2, 1, 0) to obtain a positive relationship. Then, at the individual level, we summed the score of each SNP correlated with a particular variable, giving an individual polygenic score for each variable independently. Finally, we tested whether a linear or quadratic model better fit the relationship between the polygenic score and the associated environmental variable according to the lowest Akaike information criterion (AIC) value.

| Gene ontology
To gain insight into possible targets of selection, we performed a gene ontology (GO) annotation of SNPs with nonsynonymous mutations. First, as we used only a single SNP for each locus, we extracted all SNPs having passed our quality filters located on loci for which a SNP has been identified as being putatively under selection by at least one method (BAYESCAN, ARLEQUIN, OutFLANK, RDA).
The flanking regions (100 bp) of these SNPs were extracted from the eastern oyster reference genome that we previously used for the bioinformatic pipeline and BLASTed (minimum of 80% of similarity on 30 amino acids) on the protein sequences of C. virginica.
For variants that resulted in the same protein result, we evaluated whether the amino acid sequence was the same or not. If amino acid sequences were different, we conducted a search on the SWISS-PROT database (Bairoch & Apweiler, 2000) using the protein name.  Information Table S1, for details). SNPs were distributed along all 10 chromosomes (Supporting information Figure S1). Preliminary analyses did not show clustering patterns based on IBM (Supporting information Figures S2 and S3).

| Detecting SNPs putatively under selection and defining data sets
To identify putatively neutral and selected loci, we used the filtered SNP data set (
The average P i values ranged from 5.46 x 10 -4 (RIC) to 5.99 x 10 -4 (MAL) between sampling sites (  (Table 2). Estimations performed with imputed (using the previously described procedure) and nonimputed data were similar (Table 2).  Table S3). All 95% CIs excluded zero, and all corrected p-values were <0.0001 except for CRQ-MIS (p = 0.0246). The results of the Mantel test showed a weak but significant pattern of IBD (adjusted R 2 = 0.021, p = 0.039;

| Genetic differentiation, population structure and population assignment
Supporting Information Figure S4). The dendrogram revealed significant groupings of sampling sites (R 2 = 0.90, p = 0.001) that were highly supported with bootstrap support varying between 65% and 100% (median = 99.5%) (Figure 2 Table 1 Results from the DAPC analysis revealed a clustering pattern concordant with that based on genetic differentiation. Six genetic clusters were identified based on the lowest BIC value (Supporting Information Figure S5 Table 3).
Individual assignment success was highly variable between sampling sites. The proportion of correct assignment ranged from 7% (SSI) to 77% (RIC), and 52% of individuals was correctly assigned to their sampling sites ( Figure 3). However, a higher proportion (77%) of individuals were correctly assigned when considering the six genetic clusters (Supporting Information Figure S7). Standardizing sampling size to n = 33 increased the assignment success to 81%, and a more uniform assignment success ranging from 73% (RIC) to 94% (CRB) (Supporting Information Figure S7). In all analyses, the majority of the incorrectly assigned individuals were tagged as migrants ( Figure 3, Supporting Information Figure S7).

| Association between spatial structure, environmental factors and genetic variation
For the first RDA using putatively neutral genetic variation (8,246 SNPs), following the ordistep procedure, a single spatial vector (dbMEM-3) and two temperature variables (number of weeks with temperature between 10 and 14°C and mean surface temperature) were selected for the RDA. The global RDA was highly significant and explained 7.7% of the genetic variation (p <0.001; Table 4; Numbers in circles represent the percentage of individuals from a sampling sites assigned to another. Circle diameter is proportional to percentage. Numbers in parenthesis represent the percentage of individuals from a "current population" tagged as migrants from a "inferred population" Figure 4). When we controlled for the effect of environmental variables in a partial RDA, the spatial dimension of the RDA was marginally significant and explained 3.1% of the genetic variation (p <0.083; Table 4). The environmental dimension was significant and explained 4.9% of the genetic variation after controlling for the spatial variable (p <0.015; Table 4).

Source of variation
For the RDA based on the six SNPs identified as being putatively under divergent selection, following the ordistep procedure, a single geographic vector (dbMEM-3; Table 4) and a single temperature variable (mean surface temperature) were selected for the RDA. The global RDA was highly significant and explained 52.4% of the genetic variation at these six outliers (p <0.001; Table 4; Figure 4). When we controlled for the effect of the environmental variables in a partial RDA, the spatial dimension of the RDA was marginally significant and explained 12.1% of the genetic variation (p = 0.054; Table 4), and the environmental dimension explained 38.6% of the genetic variation after controlling for the spatial distribution (p <0.001; Table 4). Note. Significance of the global model and significance of each variable in the partial RDA were evaluated using an ANOVA (10,000 permutations).
F I G U R E 4 Redundancy analysis (RDA) for analyses performed on the 8,246 neutral SNPs (a) and the six loci putatively under divergent selection (b). Arrows represent significant spatial (dbMEMs) and environmental variables on axes 1 and 2. Sampling sites are represented by black filled circles. PC factors are positioned according to the top and right axes. Abbreviations of sampling sites are presented in Table 1 temperature above 18°C, 8 with maximum monthly turbidity, 23 with minimum monthly turbidity, and 17 with mean surface salinity.
Correlations between polygenic scores and the correspond-

| Gene ontology
Of the markers identified as putatively under divergent selection, three SNPs were found using at least one genome scan method only, 258 using the RDA approach only and 20 using at least one genome scan method and RDA approach simultaneously. These 281 F I G U R E 5 Redundancy analysis (RDA) for polygenic adaptation analyses performed using the 11,321 SNPs on significant axes, (a) axes 1 and 2, (b) axes 3 and 4, and (c) axes 5 and 6. Arrows represent environmental variables (MSS: mean surface salinity, NW2-6: number of weeks with temperature between 2 and 6°C, NW6-10: number of weeks with temperature between 6 and 10°C, NW10-14: number of weeks with temperature between 10 and 14°C, NW14-18: number of weeks with temperature between 14 and 18°C, NW18: number of weeks with temperature above 18˚C, MST: mean surface temperature, MIST: minimum surface temperature, MAMT: maximum monthly turbidity, MIMT: minimum monthly turbidity). Large coloured circles and small grey circles represent sampling sites and SNPs, respectively. Abbreviations of sampling sites are presented in Table 1 | 599

BERNATCHEZ ET Al.
SNPs were distributed across the 10 chromosomes. Of all the sequences that were BLASTed on the protein sequences of the eastern oyster genome, 122 (40 loci, 122 SNPs) had significant hits and 36 SNPs, located on 21 loci and nine chromosomes, were nonsynonymous mutations (Supporting information Table S4). Results of GO on SWISS-PROT showed that several genes containing nonsynonymous mutations were related to ion (metal and nonmetal) binding, ATP (adenosine triphosphate) binding, immunity and integral components of the cell. Of the proteins hit, six were categorized as "uncharacterized protein" (Supporting information Table S4).

| D ISCUSS I ON
Despite its economic and ecological importance, we know relatively little about levels of genetic structure and diversity in the eastern oyster. As a first step towards addressing this gap, we used RADSeq to document patterns of population structure between natural populations across its northern range limit. Contrary to our expectation that oyster populations in the study area would be genetically homogeneous due to larval dispersal, mass historical transfers and ongoing juvenile oyster translocations, our findings reveal a considerable degree of population genetic structure. Using genome scans and multivariate methods based on environmental associations, we identified putative targets of "monogenic" and "polygenic" selection, indicating that at least some of this variation is likely to be adaptive.
In addition, our results also are consistent with an influence of spatial distribution and environmental factors, especially temperature, in shaping both neutral and adaptive genetic variation.

| Genetic diversity of oysters in the studied region
Genetic diversity of the studied populations was not excessively low despite the historical decline of populations in our study area.

| Evidence for significant population structure
Despite the expected genetic homogeneity of oysters due to historical translocations in the studied region and the high dispersal potential of oyster larvae (Vercaemer et al., 2010), we found a clear pattern of population structure reflected by both significant geographic clustering of sampling sites and significant isolation by distance. Indeed, the extent of genetic differentiation is relatively pronounced in comparison with other marine invertebrates from the same region.
Within a similar geographic region, previous studies revealed average F ST values (neutral loci) of 0.0018 and 0.003 in American lobster  and in Sea scallop (Van Wyngaarden et al., 2017), respectively. These values are considerably lower than the F I G U R E 6 Correlations between additive individual polygenic score based on (a) mean surface salinity and 17 SNPs, (b) number of weeks with temperature between 2 and 6°C and 10 SNPs, (c) number of weeks with temperature between 6 and 10°C and 30 SNPs, (d) number of weeks with temperature between 10 and 14°C and 60 SNPs, (e) number of weeks with temperature between 14 and 18˚C and 27 SNPs, (f) number of weeks with temperature above 18°C and 18 SNPs, (g) mean surface temperature and 47 SNPs, (h) minimum surface temperature and 38 SNPs. Only significant correlations are presented. Correlation coefficient (R 2 ) and p-values of the linear (a, b) or quadratic (c, d, e, f, g, h) models are presented for each variable average F ST value of 0.009 that we observed in our populations but comparable to that reported for the acorn barnacle (Balanus balanoides) in the same region (0.010) (Dufresne, Bourget, & Bernatchez, 2002). Significant population structuring in the system was also supported by the relatively strong assignment success of 77% among the six resolved genetic clusters. Using balanced sampling sizes led to a higher and more uniform assignment success between clusters, corroborating the importance of sample size in assignment analyses. While each regional cluster was formed of geographic neighbours, the overall pattern of genetic structure does not have a simple relationship to physical distance. For example, the northern NB cluster and the southern NB cluster form a group with Malpeque

| Environmental effects on genetic variation
Using monogenic and polygenic approaches, we identified environmental factors that are correlated with changes in allele frequencies for both putatively neutral and adaptive loci in our study area.
In particular, our results suggest that temperature has a significant impact on both neutral and putatively adaptive genetic variation throughout the system, with two temperature variables (i.e., number of weeks between 10 and 14°C and mean surface temperature) being significantly correlated with neutral genetic variation using the RDA approach. For adaptive genetic variation, both temperature and salinity appear to be important based on the results of the monogenic and polygenic approaches.
Environmental variables can shape neutral variation through their impact on gene flow, for example, by acting as barriers to long-distance dispersal. Oysters disperse uniquely during the larval period, a particularly vulnerable life-history stage that is characterized by extremely high mortality rates (Dekshenieks, 1996;He et al., 2012). Larval traits such as development time, propensity to settle and swimming speed are directly related to temperature (Devakie & Ali, 2000;Hidu & Haskin, 1978;Johnson, 1957;Shumway, 1996). At their northern range limit, oysters become quiescent for almost half the year, from mid-November to early May (Comeau, Mallet, Carver, Nadalini, & Tremblay, 2017), so newly settled spat have a short window of time to accumulate energy reserves before the winter. As late recruitment has been suggested to cause high winter mortality (Mallet & Haley, 1983), migrants arriving later from neighbouring bays may be at a disadvantage.

| Contemporary demographic history and anthropogenic influences on population structure
In addition to the spatial and environmental factors uncovered in our analyses, the particular demographic history (i.e., recent mass mortality due to Malpeque disease) and contemporary anthropogenic influences (e.g., movement of adults and spat by industry) of oysters in our study area may have impacted extant population genetic structure. Indeed, Malpeque disease is the primary reason oyster populations in the Gulf of Saint Lawrence were assumed to be genetically homogeneous. In New Brunswick, Malpeque disease occurred in the 1950s, with massive transplantation of PEI oysters during the 50s and 60s. Oysters in our study area can spawn in the year following settlement, although the number of gametes is very small initially and increases with oyster size/age (Choi, Lewis, Powell, & Ray, 1993). Certain features of oyster biology may lead to relatively rapid diversification, especially sweepstakes reproductive success (SRS), which has often been suggested as a major feature of highly fecund marine species (including oysters) subject to high larval mortality (reviewed in Hedgecock and Pudovkin (2011)). It implies a large effect of genetic drift and lower genetic diversity than expected given the large census populations size (Hedgecock & Pudovkin, 2011), and has been suggested as a characteristic of certain oyster species (Boudry, Collet, Cornette, Hervouet, & Bonhomme, 2002;Hedgecock, 1994;Lallias, Taris, Boudry, Bonhomme, & Lapegue, 2010;Sun & Hedgecock, 2017). Some studies failed to reveal strong events of SRS in eastern oyster (He et al., 2012;Rose, 2008;Rose, Paynter, & Hare, 2006); however, SRS is difficult to test as it may fluctuate spatially and temporally, making it difficult to reveal locally or to reject at the species level (Hedgecock et al., 2007;Hedgecock & Pudovkin, 2011;Taris, Boudry, Bonhomme, Camara, & Lapègue, 2009). Nevertheless, SRS seems unlikely to account for the magnitude of population differentiation we observed, especially as there is no obvious difference between the source population (MAL) and other populations with respect to genetic diversity parameters, which would be expected if pervasive genome-wide population bottlenecks were a major contributor to diversification. While variance in the reproductive success of broadcast spawning oysters is undoubtedly high, the large-scale nature of the transfers (several tons per bay) may have prevented strong founder effects, even if the effective population size (Ne) may be much smaller than census size (Nc) (Hauser & Carvalho, 2008). As a reproductive-mediated form of random drift, SRS is also not a plausible candidate for the apparent pattern of adaptive differentiation in our study area.
Rather, it is plausible that extant oyster populations reflect a mixture of "ancestral" populations and introgressed alleles from PEI oysters. In northern and southern NB, oyster populations seem to be more related to MAL (in PEI) and thus probably more related to the resistant oysters used for repopulation. However, the populations from central NB (i.e., TAB, MIR and RIC) and NS (CRB) do not cluster with the "Malpeque clade." A possible explanation for this result is that these oysters may have independently acquired resistance to Malpeque disease or that there was limited introgression of resistance-related genes of PEI oysters in these populations, due either to the presence of a larger number of oysters having survived the Malpeque outbreak (e.g., in low-salinity refugia) or to countervailing local selection pressure. As our results suggest that environmental conditions are at least partially responsible for the observed neutral and adaptive genetic variation, it is possible that a part of the "ancestral" genetic variation was preserved through selection.
Evaluating the impact of ongoing human-induced movement is also very difficult with our data set. Within NB, transfer permits are issued with an emphasis on mitigating the risks associated with aquatic invasive species and diseases, but there is no tracking of oyster movements by industry. The oyster aquaculture industry is spread out throughout NB, with aquaculture leases in all of the defined regional clusters. Oyster movements frequently occur between the northern and southern NB clusters, and between the Richibucto Estuary and southern NB, though most oysters tend to be moved within a cluster due to the logistics of wild spat collection (Sylvio Doiron, personal communication). Importantly, the scope of activities carried out by the oyster aquaculture industry in NB has been relatively limited until the last 10 years. We were careful to avoid aquaculture leases in our sampling and sampled mature oysters, most of which are likely to be more than 5 years old, so it is also distinctly possible that insufficient time has passed for widespread hybridization to be detectable.

| Highlights from gene ontology
Of the loci identified to be putatively under divergent selection, 21 (mostly associated with temperature variables) were found to be located within genes with at least one nonsynonymous mutation.
In summary, despite the possible influence of Malpeque disease and population translocations, it appears that eastern oysters are still characterized by relatively pronounced population structure (in relative terms compared to other marine species), which could be associated with differential selection across temperature and salinity regimes. However, further investigation is clearly needed to elucidate the physiological basis for putative adaptive divergence in the eastern oyster.

| Implications for eastern oyster management and future research directions
Considering the ecological and economical importance of the eastern oyster in Canada, studying the extent of genetic divergence between populations and the factors responsible for the observed differentiation is a crucial step towards better management. Despite massive mortalities and human-mediated translocations, our results demonstrate a clear pattern of genetic divergence related, at least partially, to spatial proximity in oyster populations within the studied region. Moreover, we found evidence that environmental conditions (primarily temperature, but also salinity) were associated with putatively adaptive genetic variation in the system. The presence of local adaptation may argue for management measures aimed at limiting introgression from divergent populations (Conover, 1998;do Prado et al., 2018), including limiting the scope of transfer activities carried out by industry, and therefore, a major outstanding question for oyster management is determining to what extent wild populations are locally adapted.
While we found correlative evidence that environmental variables were linked to changes in allele frequencies at both putatively adaptive and neutral loci, much work remains on elucidating the causative links and relevant scales for selection. For example, we relied on a government database (Dutil et al., 2012) for the environmental variables that provided environmental data in the coastal waters surrounding our study populations. We did not have direct access to environmental data within the estuaries themselves. Estuaries are known to be extremely heterogeneous environments, and oysters are found in a broad range of habitats, spanning large gradients in temperature and salinity (see Eastern oyster Biological Review Team, 2007). Finer resolution environmental data and within-bay sampling may reveal further patterns of selection and differentiation, which may impact the broad-scale correlation observed here. For example, Newkirk (1978) found evidence of differentiation with the Richibucto (RIC) Estuary between high-and low-salinity sites. Importantly, these differences persisted into the F2 indicating that parental environmental effects were not the sole driver of performance differences.
This study aimed to provide a first step to understanding patterns of genetic diversity and population structure in eastern oysters from the Maritime provinces of Canada, and future research can elaborate on this work. Genetic patterns of differentiation in relation to environmental variables should be investigated at smaller spatial scales and with an increased sample size to determine whether genetic substructure exists. While this work may not be feasible across all bays, the presence of regional genetic clusters may suggest future empirical work should at least aim to include representatives from multiple genetic clusters. Temporal replication and single-cohort sampling would be powerful tools in deciphering the contribution of SRS to shifts in allelic frequencies and diversification, as well as the presence of temporal fluctuations in selection. Admittedly also, RADSeq methods as used here only allow the screening of a subsample of the genome-wide genetic variation within a given species. Therefore, future population genomic studies on this species would benefit from using a whole genome resequencing approach (Fuentes-Pardo & Ruzzante, 2017).
It is important to elucidate the precise selective mechanism causing the observed pattern of differentiation before considering a change in the current management of oyster movements. Largescale historical transfers were successful in re-establishing oyster populations, suggesting that current stocks are at least tolerably well adapted to their present environmental conditions. In addition, it is important to recall that the extant pattern of population differentiation is present despite ongoing movements by industry, though it may be too early to detect a genomic signature from the recent expansion of aquaculture activities in our data set. Ultimately, our data provide an intriguing snapshot of population genomic diversity in eastern oysters, which has uncovered patterns of genetic differentiation that warrant further investigation.
Understanding spatial patterns of adaptive genetic variation and associations with environmental variables has important implications for predicting how eastern oyster populations might respond to environmental change in the future. With climate change, the selective seascape for eastern oysters is likely to dramatically shift in the coming decades. Warmer and more acidified waters are expected to have a host of complex impacts, both direct (e.g., temperature stress) and complex (e.g., changing host-disease interactions, multiple environmental stressors). A better understanding of genotype-environment associations and the targets of selection in Maritime populations of the eastern oyster will contribute to mitigating these impacts and ensuring the long-term viability of fishery and aquaculture activities.

ACK N OWLED G EM ENTS
We are grateful to E. Normandeau for his help with the analyses and Bar. S. Bernatchez was supported by the Engage project grant.

CO N FLI C T O F I NTE R E S T
None Declared.

DATA ACCE SS I B I LIT Y
Genomic data (all filtered markers, putatively neutral markers and markers putatively under divergent selection) and environmental