Population structure, genetic connectivity, and adaptation in the Olympia oyster (Ostrea lurida) along the west coast of North America

Abstract Effective management of threatened and exploited species requires an understanding of both the genetic connectivity among populations and local adaptation. The Olympia oyster (Ostrea lurida), patchily distributed from Baja California to the central coast of Canada, has a long history of population declines due to anthropogenic stressors. For such coastal marine species, population structure could follow a continuous isolation‐by‐distance model, contain regional blocks of genetic similarity separated by barriers to gene flow, or be consistent with a null model of no population structure. To distinguish between these hypotheses in O. lurida, 13,424 single nucleotide polymorphisms (SNPs) were used to characterize rangewide population structure, genetic connectivity, and adaptive divergence. Samples were collected across the species range on the west coast of North America, from southern California to Vancouver Island. A conservative approach for detecting putative loci under selection identified 235 SNPs across 129 GBS loci, which were functionally annotated and analyzed separately from the remaining neutral loci. While strong population structure was observed on a regional scale in both neutral and outlier markers, neutral markers had greater power to detect fine‐scale structure. Geographic regions of reduced gene flow aligned with known marine biogeographic barriers, such as Cape Mendocino, Monterey Bay, and the currents around Cape Flattery. The outlier loci identified as under putative selection included genes involved in developmental regulation, sensory information processing, energy metabolism, immune response, and muscle contraction. These loci are excellent candidates for future research and may provide targets for genetic monitoring programs. Beyond specific applications for restoration and management of the Olympia oyster, this study lends to the growing body of evidence for both population structure and adaptive differentiation across a range of marine species exhibiting the potential for panmixia. Computational notebooks are available to facilitate reproducibility and future open‐sourced research on the population structure of O. lurida.

tinuous isolation-by-distance model, contain regional blocks of genetic similarity separated by barriers to gene flow, or be consistent with a null model of no population structure. To distinguish between these hypotheses in O. lurida, 13,424 single nucleotide polymorphisms (SNPs) were used to characterize rangewide population structure, genetic connectivity, and adaptive divergence. Samples were collected across the species range on the west coast of North America, from southern California to Vancouver Island. A conservative approach for detecting putative loci under selection identified 235 SNPs across 129 GBS loci, which were functionally annotated and analyzed separately from the remaining neutral loci. While strong population structure was observed on a regional scale in both neutral and outlier markers, neutral markers had greater power to detect fine-scale structure. Geographic regions of reduced gene flow aligned with known marine biogeographic barriers, such as Cape Mendocino, Monterey Bay, and the currents around Cape Flattery. The outlier loci identified as under putative selection included genes involved in developmental regulation, sensory information processing, energy metabolism, immune response, and muscle contraction. These loci are excellent candidates for future research and may provide targets for genetic monitoring programs. Beyond specific applications for restoration and management of the Olympia oyster, this study lends to the growing body of evidence for both population structure and adaptive differentiation across a range of marine species exhibiting the potential for panmixia. Computational notebooks are available to facilitate reproducibility and future open-sourced research on the population structure of O. lurida.

| INTRODUC TI ON
Coastal marine ecosystems provide important services such as carbon sequestration, food production, and recreation (Luisetti et al., 2014), yet contain some of the most exploited and threatened species on earth. As evidence for the direct impacts of human activities (e.g., overharvesting, increasing atmospheric CO 2 , and nutrient runoff) on these species grows, there has been increased focus on restoring depleted abundances, recovering ecosystem services, and determining which species are capable of adapting to environmental change (Granek et al., 2010). Effective management of threatened and exploited species requires an understanding of both the genetic connectivity among populations and adaptation across environmental gradients (Baums, 2008;Miller & Ayre, 2008;Palumbi, 2003).
Anthropogenic movement of individuals between populations, either for aquaculture or for restoration purposes, can confound signals of population structure and should be evaluated when drawing conclusions about genetic connectivity (David, 2018). For the numerous coastal marine species with planktonic dispersal, high connectivity among populations can obscure population boundaries and oppose the diversifying effects of natural selection through gene flow (Lenormand, 2002). A growing body of evidence indicates that both limited effective dispersal and local adaptation may be more common in marine species than previously hypothesized (Cowen, Lwiza, Sponaugle, Paris, & Olson, 2000;Hauser & Carvalho, 2008;Sanford & Kelly, 2011;Weersing & Toonen, 2009). Neutral molecular markers (e.g., microsatellites) have traditionally been used exclusively to identify the geographic structure of subpopulations and estimate the genetic connectivity between them (Funk, McKay, Hohenlohe, & Allendorf, 2012); however, these do not give insight into the scale or magnitude of adaptive divergence. For populations connected by dispersal, adaptation to local conditions can still occur if the strength of selection overcomes the homogenizing effect of gene flow (Hellberg, 2009). Advances in genomic and computational techniques, such as genotype-by-sequencing (GBS), have facilitated the detection of genomic regions that may be influenced by natural selection ("outlier loci") (Stapley et al., 2010). Although often referred to as adaptive markers, these outliers may only be linked to loci that are under natural selection rather than confer an adaptive advantage themselves. Outlier loci have provided increased spatial resolution of population structure compared to neutral loci alone for some marine species (Drinan et al., 2018;Milano et al., 2014;Van Wyngaarden et al., 2016), but not all (Moore et al., 2014). In addition to potentially resolving fine-scale genetic differentiation, outlier loci may be useful for characterizing the adaptive potential of a species or population to future environmental conditions (Eizaguirre & Baltazar-Soares, 2014). Inferences from both neutral and adaptive markers should be combined when making management recommendations (Funk et al., 2012).
The Olympia oyster (Ostrea lurida, Carpenter 1864) is a native estuarine bivalve found from Baja California to the central coast of Canada, patchily distributed over strong environmental gradients (Chan et al., 2017;Schoch et al., 2006). Oysters are ecosystem engineers in estuaries, providing structured habitat and removing suspended sediments (Coen, Dumbauld, & Judge, 2011;zu Ermgassen, Gray, Langdon, Spalding, & Brumbaugh, 2013). Unlike other oysters where both males and females spawn gametes (e.g., Crassostrea), the females fertilize eggs with sperm from the water column and initially brood larvae in the mantle cavity. After release, the larvae have been reported to be planktonic from 7 days to 8 weeks before settling on a hard substrate (Baker, 1995). The impact of maternal brooding on population structure in Osterideae has not been examined.
Following devastating commercial exploitation in the 19th and early 20th centuries, recovery of Olympia oyster populations has been stifled by other anthropogenic threats (e.g., water quality issues, habitat loss, and possibly ocean acidification (Blake & Bradbury, 2012;Hettinger et al., 2013;Sanford et al., 2014)). The last 15 years has seen increased interest in the Olympia oyster, with restoration projects underway by both government and nongovernment agencies across its range (Pritchard, Shanks, Rimler, Oates, & Rumrill, 2015). Current knowledge about the population genetic structure of O. lurida comes primarily from an unpublished 2011 dissertation, which sampled from San Francisco, CA, to Vancouver Island, BC, and found regional population structure using microsatellites (Stick, 2011). Two phylogeographic studies using two mitochondrial loci identified a phylogeographic break north of Willapa Bay, WA, and established the southern boundary divide between O. lurida and its sister species Ostrea conchaphila (Polson, Hewson, Eernisse, Baker, & Zacherl, 2009;Raith, Zacherl, Pilgrim, & Eernisse, 2016). Future and ongoing management plans would benefit greatly from thorough analysis of the fine-scale genetic structure using modern genomic techniques and rangewide sampling (Camara & Vadopalas, 2009).
The objective of this study was to characterize the spatial population structure of the Olympia oyster across the majority of its range using both neutral and adaptive markers derived from genome-wide single nucleotide polymorphisms (SNPs). I specifically tested whether patterns of genetic variation suggest a smooth continuum of allele frequency shifts consistent with isolation-bydistance (IBD) (Malécot, 1968), regional blocks of genetic similarity that correspond to physical barriers (Hare & Avise, 1996), or the null model of no significant genetic differentiation (Grosberg & Cunningham, 2001). SNPs produced from high-throughput sequencing have led to the identification of previously undetected population structure in a number of marine and terrestrial species K E Y W O R D S connectivity, genotype-by-sequencing, management recommendations, Olympia oyster, outlier loci, population genomics (Everett et al., 2016;Reitzel, Herrera, Layden, Martindale, & Shank, 2013;Van Wyngaarden et al., 2016). Compared to the Atlantic coast of North America (Hoey & Pinsky, 2018), studies utilizing genomewide SNPs for marine taxa from the Pacific coast are far fewer in number and have been limited to regional spatial scales (De Wit & Palumbi, 2013;Drinan et al., 2018;Gleason & Burton, 2016;Larson et al., 2014;Martinez, Buonaccorsi, Hyde, & Aguilar, 2017) or in the number of sampling sites (Pespeni, Oliver, Manier, & Palumbi, 2010;Tepolt & Palumbi, 2015). This study is the first of my knowledge to utilize thousands of SNPs to extensively survey the rangewide population structure of a marine species along this coast.
A secondary aim of this study was to produce a reproducible computational pipeline to go from raw data to results and figures, using Jupyter Notebooks. Jupyter Notebooks are interactive documents that integrate text, code, and analysis results (Kluyver et al., 2016). A major issue for genomic analyses today is how to clearly explain the computational methods used in order to allow for reproducibility (Kanwal, Khan, Lonie, & Sinnott, 2017). This open access pipeline is intended to provide an example template to improve reproducibility in future studies and function as an instructional tool for biologists and early-career scientists who wish to apply these methods to their own study organisms.

| Sample collection
A total of 20-25 adult Ostrea lurida over 2 cm in length were col-

| Genotype-by-sequencing analysis
Library preparation for genotype-by-sequencing (GBS) followed the protocol by Elshire et al. (2011)   Step 1) used sample-specific barcode sequences, allowing one mismatch in the barcode sequence. Base calls with a Phred quality score under 20 were converted to Ns, and reads containing more than 5 Ns were discarded. Adapter sequences, barcodes, and the cutsite sequences were trimmed from filtered reads, with only reads greater than 35 bp retained (Step 2). Reads were then clustered using a sequence similarity threshold of 85% both within (Step 3) and between samples to genotype polymorphisms (Steps 4,5) and identify orthologous loci (Step 6) with a minimum of 10× read coverage. Replicate samples were assembled separately and then compared using custom Perl scripts by Mikhail Matz (Wang, Meyer, McKay, & Matz, 2012). The replicate with the largest number of GBS loci after final filtering (Step 7) was retained. Samples were removed if they had fewer than 200,000 raw sequencing reads, fewer than 15,000 assembled clusters of at least 10× read depth, and were missing data for over 55% of loci assembled across at least 75% of samples, with Steps 4-7 rerun using the remaining individuals.
The final assembly was then filtered for excess heterozygosity based on deviations from Hardy-Weinberg equilibrium (HWE) in at least two populations, sample coverage of 75%, and an overall minor allele frequency (MAF) of 2.5%, retaining only GBS loci found in at least one individual from all populations. Preliminary analyses conducted on datasets allowing more or less missing data showed that the inferred population structure was robust to missing data up to 40%. Population genetic summary statistics, with the exception of F ST , did change quantitatively due to missing data but not qualitatively (not shown) (Cariou, Duret, & Charlat, 2016). Filtering steps were conducted using VCFtools (Danecek et al., 2011), custom Python code, and code adapted from Jon Puritz's laboratory (Puritz, Hollenbeck, & Gold, 2014). Input files and formats for subsequent analysis of population structure were created using a combination of custom Python code, custom R code, and the radiator R package (Gosselin, 2017). Every step of the assembly, filtering process, and creation of input files can be reproduced through Jupyter Notebooks. loci which had SNPs identified as outliers in at least two of the approaches were classified as putative adaptive GBS loci. From these GBS loci, any SNP that had been identified as an outlier by at least one approach was separated from the full SNP dataset to create an "outlier" SNP dataset. Subsequent analyses of population structure were conducted on three SNP datasets: all SNPs (combined), outlier SNPs, and neutral SNPs-which excluded any SNP found on a putative adaptive GBS locus.

| Detection of loci under putative selection
BayeScan uses a Bayesian approach to apply linear regression to decompose F ST coefficients into population-and locus-specific components and estimates the posterior probability of a locus showing deviation from Hardy-Weinberg proportions (Foll & Gaggiotti, 2008). BayeScan analysis was based on 1:100 prior odds, with 100,000 iterations, a burn-in length of 50,000, a false discovery rate (FDR) of 10%, and default parameters. Results were visualized in R.
OutFLANK is an R package that identifies F ST outliers by inferring a distribution of neutral F ST using likelihood on a trimmed distribution of F ST values. Because of its likelihood method, OutFLANK calculates F ST without sample size correction when inferring the neutral distribution. Simulation studies have shown that this approach has lower false positive rates compared to other F ST outlier methods (Whitlock & Lotterhos, 2015). OutFLANK was run using default parameters and a q-value threshold of 0.1, which can be considered a false discovery rate ( Putative adaptive GBS loci were functionally annotated through Blast2GO. Sequences were compared against molluscan sequences in GenBank's nr database using the BLASTx algorithm with default parameters and a e-value hit filter of 10 −3 , and against EMBL-EBI InterPro signatures. Gene ontology terms were mapped to annotations with default parameters except for an e-value hit filter of 10 −3 (Götz et al., 2008). Minor allele frequency was plotted against latitude individually for each outlier SNP in order to identify clinal patterns of allele frequency shifts.

| Summary statistics, population differentiation, and spatial structure
Population genetic summary statistics were calculated on the combined, neutral, and outlier datasets to describe and compare overall and population-specific genetic diversity. Observed heterozygosity (H o ), expected heterozygosity (H e ), overall F ST , and F IS were calculated using the basic.stats function in the R package hierfstat (Goudet & Jombart, 2015). Confidence intervals for population-specific F IS were determined using the boot.ppfis function in hierfstat with 1,000 bootstrap replicates. Pairwise F ST following Weir and Cockerham (1984) was calculated using the genet.dist function in hierfstat. Heatmaps of pairwise F ST values were created using ggplot2 (Wickham, 2016). A Mantel test of coastal water distance (calculated by drawing routes between all sites on Google Earth) and F ST /1−F ST as implemented in adegenet tested for evidence of isolation-by-distance (Sokal, 1979

| Estimating connectivity and historical relationships
Spatial variation in gene flow and genetic diversity was calculated and visualized using the program EEMS (estimated effective migration surfaces) (Petkova, Novembre, & Stephens, 2016). This method identifies geographic regions where genetic similarity decays more quickly than expected under isolation-by-distance based on sampling localities and a pairwise genetic dissimilarity matrix derived from SNP data. These regions may be interpreted as having reduced gene flow. A dissimilarity matrix was calculated for the neutral dataset using a variant of the bed2diffs R code included in the EEMS package that takes input from a genind R object. An outer coordinate file for defining the potential habitat of O. lurida was produced using the polyline method in the Google Maps API v3 tool (http://www. birdtheme.org/useful/v3tool.html). The habitat shape followed the shape of the coastline and excluded land regions that O. lurida larvae would not naturally be able to cross (e.g., the Olympic peninsula separating outer coast populations and those in Puget Sound, WA).
The EEMS model is recommended to be run for various numbers of demes, which establishes the geographic grid size and resulting potential migration routes. Three independent analyses were run for each deme size (200, 250, 300, 350, 400, 500, 600, and 700) for a total of 24 runs, with a burn-in of 1,000,000 and MCMC length of 5,000,000 iterations. The convergence of runs was visually assessed and results were combined across all analyses and visualized using the Reemsplots R package-producing maps of the effective diversity (q) and effective migration rate (m).
To infer the evolutionary relationship among sampling sites, including population splits and migration events, I reconstructed population graph trees using the software TreeMix (Pickrell & Pritchard, 2012). This method uses the covariance of allele frequencies between populations to build a maximum likelihood graph relating populations to their common ancestor, taking admixture events ("migration") into account to improve the fit to the inferred tree. The population graph was rooted with the two southernmost O. lurida population (San Diego, CA, and Mugu Lagoon, CA), then run allowing between 0 and 10 migration events. For each value of migration events, I calculated the proportion of variance in relatedness between populations that is explained by the population graph to evaluate model fit (Wang, 2017). GBS loci (the "combined" dataset).

| GBS and outlier detection
Three different methods were employed to identify putative SNPs under selection. The number of outliers detected by each program and the overlap between programs is illustrated in Supporting Information Appendix S1: Figure D1. OutFLANK, as the most conservative of the programs used (Whitlock & Lotterhos, 2015), had the lowest number of outlier markers detected with 31 SNPs across 16 GBS loci. Twenty-nine SNPs found across 16 GBS loci were identified as outliers by all three programs. A total of 129 GBS loci contained SNPs identified as outliers by at least two approaches, with 235 SNPs included in the outlier dataset for subsequent population structure analyses. The neutral dataset, with 13,073 SNPs across 6,057 GBS loci, excluded any SNP found on a GBS locus with an outlier SNP.

| Spatial structure
Bayesian population structure analysis from STRUCTURE differed slightly between the neutral and outlier datasets. For neutral SNPs,

| Population differentiation and isolation-bydistance
Pairwise population-specific F ST was higher for outlier SNPs, but both datasets qualitatively illustrated roughly six geographic "re-

| Connectivity and historical relationships
TreeMix produced a population graph that supported a major phylogeographic split between the outer coast of Washington and   (Nei & Chesser, 1983); H e : expected heterozygosity averaged across loci; H o : observed heterozygosity averaged across loci; pairwise F ST : average of all pairwise F ST values (Weir & Cockerham, 1984).
TA B L E 1 Overall summary statistics for the combined (13,424 SNPs), neutral (13,073 SNPs), and outlier (235 SNPs) datasets mapped the genetic diversity parameter q, which is an estimate of the expected within-deme coalescent time and is proportional to average heterozygosity (H e ). Populations from Oregon northwards had much lower genetic diversity than those in California. A linear regression of population-specific H e and latitude using the neutral dataset shows a strong relationship between genetic diversity and latitude (R 2 = 0.862; Figure 6), as did the outlier dataset (R 2 = 0.834).

| Functional annotation of outliers
The 129 GBS loci containing outlier SNPs were functionally annotated using Blast2GO. Eighteen of these mapped to protein-coding genes in the GenBank database, primarily from Crassostrea virginica and Crassostrea gigas. One mapped to the O. lurida mitochondrial NADH dehydrogenase subunit 5 gene (nad5), which exhibits high variability in oyster species and is commonly used for metazoan phylogenetics (Xiao, Wu, Li, & Yu, 2015). Annotated genes have potential roles in developmental regulation (glyoxalase 3, DNA N6-methyl adenine demethylase-like, transcriptional regulator ERG, and serine/ threonine-protein kinase), sensory information processing (serine/ threonine-protein kinase, sodium-dependent phosphate transport, and vesicular glutamate transporter), immune or stress response (nad5, E3 ubiquitin-protein ligase, Ty3-G Gag-Pol, and helicase domino), energy metabolism (carnitine palmitoyltransferase and glucose dehydrogenase), heavy metal binding (heavy metal-binding protein HIP), and muscle contraction (myosin heavy chain-striated muscle, myosin-XVIIIa) ( Table 2)  T o m a le s _ C A N _ S a n F r a n _ C A S _ S a n F r a n _ C A M u g u L a g o o n _ C A S a n D ie g o _ C A E lk h o r n _ C A T o m a le s _ C A N _ S a n F r a n _ C A S _ S a n F r a n _ C A M u g u L a g o o n _ C A S a n D ie g o _ C A E lk h o r n _ C A (c) either Coos Bay, OR, or San Francisco Bay, CA, to the north, and the other allele increases in frequency toward the south (Supporting Information Appendix S1: Figure D2).

| D ISCUSS I ON
Reduced-representation genomic methods, such as GBS, can greatly inform reintroduction efforts for threatened and exploited species by resolving fine-scaled population structure, providing estimates of genetic connectivity, and identifying informative markers for characterizing adaptive variation (Allendorf, Hohenlohe, & Luikart, 2010;Gagnaire et al., 2015). Using 13,424 GBS-derived SNPs, I characterized the rangewide population structure of the Olympia oyster from southern California to British Columbia and further identified 235 SNPs across 129 GBS loci potentially associated with local adaptation. Contrary to studies in some other marine species, neutral markers had greater power to detect fine-scale population structure compared to outliers. However, outlier loci did provide evidence for adaptive divergence among some populations with high inferred admixture and are informative as candidate loci involved in local adaptation. This study highlights the importance of using both neutral and outlier markers for conservation and management applications.

| Regional population structure and gene flow
Significant population structure was observed across the range of O. lurida in both the neutral and outlier markers, with sampling locations structured into six distinct regions. Notably, most of these regions fit well within previously described biogeographic provinces based on marine species distributions (Fenberg, Menge, Raimondi, & Rivadeneira, 2015;Hall, 1964;Valentine, 1966). In addition to de- Below I discuss these phylogeographic regions and potential barriers in more detail, as well as provide some recommendations for management at local scales.

| Southern California (SoCal)
The SoCal region, containing San Diego Bay, CA; Mugu Lagoon, CA; and Elkhorn Slough, CA, extends across both the Southern Californian and the Montereyan biogeographic provinces as defined by Hall (1964), with Monterey Bay as the northern boundary.
Monterey Bay is a known biogeographic barrier for some marine algae (Abbott & Hollenberg, 1976) and has been proposed as a potentially important barrier to gene flow in marine invertebrates as well (Dawson, 2001). This region extends across Point Conception, which is a well-known site of species turnover (Valentine, 1966) and a phylogeographic barrier for some taxa (Marko, 1998;Wares, Gaines, & Cunningham, 2001). This finding is consistent with meta-analyses demonstrating that strong population structure across Point Conception is the exception rather than the rule for many marine invertebrates (Dawson, 2001;Kelly & Palumbi, 2010). Finer-scaled sampling of populations on either side of Point Conception may provide evidence for slight genetic clines undetected by the current study.
SoCal exhibits the highest genetic diversity of any region, for which I propose three nonexclusive mechanisms. (a) The southward direction of the California Current results in asymmetric gene flow and an accumulation of genotypes in the south (Wares et al., 2001).
This hypothesis is supported by the inferred directionality of migration events in TreeMix (Figure 4). (b) Northern populations exhibit lower genetic diversity due to repeated extirpation or population bottlenecks from glaciation cycles (see Puget+BC) (Marko, 2004).

| Northern California (NoCal)
San Francisco Bay, Tomales Bay, and Humboldt Bay constitute the

| Oregon and Willapa
Both the Oregon region, comprised of Netarts Bay and Yaquina Bay, and the Willapa region with Willapa Bay, WA, and Coos Bay, OR, fall within the Mendocinian biogeographic province, which is usually demarcated by either Cape Flattery (Blanchette et al., 2008) or Vancouver Island (Fenberg et al., 2015) to the north. Evidence The separation of these two regions from those to the south corroborates previous evidence from mitochondrial loci of a strong phylogeographic divide (Polson et al., 2009). Although Cape Flattery and Puget Sound itself have both been classified as biogeographic barriers due to a bifurcation in ocean currents (Kelly & Palumbi, 2010;Valentine, 1966), there are surprisingly few studies evaluating the genetic structure of species found both within Puget Sound and on the outer coast of Washington. Those that do focus on species with much longer dispersal times than O. lurida (Buonaccorsi, Kimbrell, Lynn, & Vetter, 2002;Cunningham, Canino, Spies, & Hauser, 2009;Iwamoto et al., 2015;Jackson & O'Malley, 2017;Siegle, Taylor, Miller, Withler, & Yamanaka, 2013). To my knowledge, this is the first study in a marine mollusk to evaluate and identify significant population differentiation among Puget Sound populations and the outer coast.
More studies are required to fully characterize the importance of this barrier across marine taxa.  (Dyke & Prest, 1987

| Anthropogenic influences on population structure
The evidence for reduced effective migration, low differentiation within most of the phylogeographic regions, and external estimates of effective dispersal (Carson, 2010) suggests that long-distance  (Baker, 1995;Woelke, 1959 Figure D2). Many clinal GBS loci with functional annotations are associated either directly or indirectly with development. Glyoxalase 3 expression has been linked to developmental competence in female oyster gametes (Pauletto et al., 2017), and DNA N6-methyl adenine demethylase has been linked to developmental timing in oysters (Riviere et al., 2013). ERG transcriptional regulator, kinase B-raf, and Fez family zinc finger protein are likely to be directly involved with developmental regulation (Epelboin et al., 2016;Gaitán-Espitia & Hofmann, 2017). Olympia oysters exhibit latitudinal variation in gonad development and spawning, with California oysters initiating spawning up to 6°C warmer than those from Puget Sound, WA (Coe, 1932;Hopkins, 1937). Recent evidence suggests that there is heritable, adaptive variation in reproductive timing, even among populations of oysters within the same phylogeographic region (Barber, Dexter, Grossman, Greiner, & Mcardle, 2016;Silliman et al., 2018).

| Local adaptation
Another clinal locus of interest mapped to the mitochondrial gene carnitine Ol-palmitoyltransferase, which has been strongly associated with regulation of glycogen content (and therefore, tastiness) in C. gigas (Li et al., 2017). Other clinal genes have putative functions in sensory information processing and muscle contraction.

| Potential limitations
Although genomic methods such as GBS have been proven useful for evolutionary biology and conservation genetic studies (Andrews, Good, Miller, Luikart, & Hohenlohe, 2016), several potential limitations of GBS and the present study should be addressed. Nonrandom missing data due to polymorphisms in the restriction enzyme cut site ("allelic dropout") can bias population genetic analyses by underestimating genomic diversity and overestimating F ST ; however, the impact of these biases on F ST may be limited if effective population size (N e ) is small and if loci with large amounts of missing data are removed from analyses (Cariou et al., 2016;Gautier et al., 2013). Due to large variation in reproductive success every generation, N e is likely small for Ostrea species (Hedgecock, 1994;Lallias, Taris, Boudry, Bonhomme, & Lapègue, 2010). Loci with >25% missing data were removed from population genetic analyses, and preliminary analyses allowing 40%-10% missing data still resulted in the same regional population struc-  (Andrews et al., 2016). Second, these libraries were made and sequenced in-house as opposed to a dedicated commercial GBS facility. The protocol learning curve may be why a disproportionate number of individuals failed or had low sequencing depth in the first few prepared libraries. This filtering resulted in 4-9 individuals per population in the final dataset, which is sufficient for estimating F ST when >1,000 SNPs are used (Willing, Dreyer, & van Oosterhout, 2012). While these small population sizes may limit the power to detect outlier loci (Foll & Gaggiotti, 2008), the probability of false positives is reduced by comparing across multiple outlier methods (Rellstab et al., 2015). Lastly, while methods like EEMS and PCA can characterize genetic differentiation, they cannot distinguish between the different demographic scenarios that may result in these patterns (Petkova et al., 2016).

| CON CLUS IONS
This study provides the first comprehensive characterization of both neutral and adaptive population structure in the Olympia oyster, an ecologically important coastal species in North America. These results have direct implications for management policies and ongoing restoration efforts, and a future sustainable fishery. Putative adaptive loci identified here are excellent candidates for future research and may provide targets for genetic monitoring programs. Beyond these specific applications, this study contributes to the growing body of evidence for both population structure and adaptive differentiation in marine species. In particular, it is one of the first to utilize thousands of SNPs to characterize population structure from southern California to Vancouver Island. All analyses conducted for this study can be replicated using annotated Jupyter Notebooks, al-

CO N FLI C T O F I NTE R E S T
None decalred.

DATA ACCE SS I B I LIT Y
Genomic data (all filtered markers, putative neutral markers, and putative outliers) and sample metadata are available on Dryad (