Relative contributions of neutral and non-neutral genetic differentiation to inform conservation of steelhead trout across highly variable landscapes

Mounting evidence of climatic effects on riverine environments and adaptive responses of fishes have elicited growing conservation concerns. Measures to rectify population declines include assessment of local extinction risk, population ecology, viability, and genetic differentiation. While conservation planning has been largely informed by neutral genetic structure, there has been a dearth of critical information regarding the role of non-neutral or functional genetic variation. We evaluated genetic variation among steelhead trout of the Columbia River Basin, which supports diverse populations distributed among dynamic landscapes. We categorized 188 SNP loci as either putatively neutral or candidates for divergent selection (non-neutral) using a multitest association approach. Neutral variation distinguished lineages and defined broad-scale population structure consistent with previous studies, but fine-scale resolution was also detected at levels not previously observed. Within distinct coastal and inland lineages, we identified nine and 22 candidate loci commonly associated with precipitation or temperature variables and putatively under divergent selection. Observed patterns of non-neutral variation suggest overall climate is likely to shape local adaptation (e.g., potential rapid evolution) of steelhead trout in the Columbia River region. Broad geographic patterns of neutral and non-neutral variation demonstrated here can be used to accommodate priorities for regional management and inform long-term conservation of this species.


Introduction
Management strategies implemented for species conservation are highly contingent on a host of correlated life history and demographic information. In concert with these data, genetic structure is vital for characterizing population distinctions and limitations on productivity related to the decline of many species (e.g., conifer Cathaya argyrophylla pineceae, Ge et al. 1998; Chinese cobra Naja atra, Lin et al. 2012; gopher tortoise Gopherus polyphemus, Clostio et al. 2012; Chinook salmon, Moran et al. 2013). The genetic differentiation of populations across a species range is often determined on the basis of phylogenetic origins (Wagner et al. 2005), and historical and contemporary demography (Ruokonen et al. 2004). More recently, genetic variation has been viewed from the perspective of physical landscapes or environmental variation (Manel et al. 2003;Schoville et al. 2012). A landscape genetics approach reveals population variation relative to the influences or features in an organism's environment (Segelbacher et al. 2010;Sepulveda-Villet and Stepien 2012), including natural or human erected barriers and local climate. Most often it has been described on the basis of neutral divergence (Dionne et al. 2008;Narum et al. 2008), where restricted gene flow is explained in the context of a heterogeneous, patchwork environment (Latch et al. 2011).
Conservation units, such as a distinct population segment (DPS), are established based on a core set of criteria including population ecology and viability, ancestry and descent, reproductive isolation, and local adaptation (Fraser and Bernatchez 2001;Fraser et al. 2011). Local adaptation may be inferred from neutral genetic structure coincident with habitat or life-history variability (Olsen et al. 2011;Blankenship et al. 2011). However, direct evaluations of non-neutral population differentiation (i.e., putatively adaptive divergence) are likely to reveal stronger, more correlative relationships (Limborg et al. 2011). Nevertheless, conservation is not commonly informed by non-neutral variation, and inferences on adaptation based exclusively on neutral differentiation risk incorrectly identifying the underlying processes affecting population distinctions (Funk et al. 2012;Landguth and Balkenhol 2012).
Although full genome sequence data are typically not available for nonmodel species, evaluations of adaptive variation have recently been addressed using analyses of single nucleotide polymorphism (SNP) loci (Willing et al. 2010;Matala et al. 2011;Hohenlohe et al. 2010). Because SNP loci are commonly found within or adjacent to coding and regulatory regions of a genome, their allele frequencies may be influenced by selection (i.e., non-neutral). Techniques, such as association testing and detection of outlier loci, allow evaluation of differentiation that provides an improved understanding (over neutral loci) of the relationship between signatures of adaptive variation and the physical environment, even without direct interpretations of phenotypic variation, or interrogation of specific functional genes (Narum et al. 2010b;Matala et al. 2011;Bourret et al. 2013).
Understanding the distribution of adaptive variation across landscapes will be crucial in establishing conservation priorities (see Crandall et al. 2000;Fraser and Bernatchez 2001), and for anticipating how populations might be affected by local and regional changes in climate (Holderegger and Wagner 2008;Isaak et al. 2012a). The effects of global climate changes (e.g., rising temperatures) have increasingly altered habitats of myriad species, garnering the attention of a broad spectrum of researchers (Hickling et al. 2005;Milner et al. 2008;Winfield et al. 2010). Climate changes can prompt organisms to alter their behavior through range expansions (Loarie et al. 2009), or adapt over short time periods (rapid evolution; Hoffmann and Sgr o 2011; Barrett et al. 2010). Some of the most profound examples are found among organisms limited by confined habitats such as those of fishes, whose habitats are prescribed by water routes (e.g., networks of streams and lakes). Because of this limitation, they are particularly susceptible to environmental changes including water quality and temperature (Hari et al. 2006;Rieman et al. 2007;Wenger et al. 2011). Fish adapt variably to thermal conditions in their migratory environment (thermal optimum for aerobic scope), and even populations within the same subbasin may be affected disproportionately by dramatic thermal shifts (Farrell et al. 2008).
In the Columbia River Basin (CRB), steelhead trout (Oncorhynchus mykiss) occur as two evolutionarily divergent lineages, delineated east (inland) and west (coastal) of the Cascade Mountain Crest. The inland redband trout (O. m. gairdneri) are typically a stream maturing, summerrun type, while coastal rainbow trout (O. m. irideus) are dominated by an ocean maturing, winter-run type (Busby et al. 1996;Behnke 2002;Currens et al. 2009;Blankenship et al. 2011). Some populations of O. m. irideus also have a summer-run life history, though not necessarily genetically differentiated from sympatric winter-run populations (Busby et al. 1996). Owing to persistent steelhead trout population declines throughout the region, managers have implemented extensive monitoring and evaluation efforts (Busby et al. 1996;Chilcote 1998;ICTRT 2003;Scott 2008;Fryer et al. 2012). Five steelhead trout DPSs have been delineated within the CRB, and each is currently recognized for protection under the Endangered Species Act (ESA): the Upper Willamette River, Lower Columbia River, Middle Columbia River, Upper Columbia River, and the Snake River Basin (U.S. Office of the Federal Register 2006). Rivers in proximity to the Columbia River estuary lie within the Southwest Washington DPS. These conservation demarcations are largely contingent on adjacency of watersheds within stream networks, coupled with life history distinctions, and neutral genetic population structure. To date, there has been little to no direct interpretation of non-neutral variation for conservation assessment (Beacham et al. 1999;ICTRT 2003;Good et al. 2005;USOFR 2006;Nielsen et al. 2009).
Studies that describe distinctions among particular steelhead trout populations are plentiful (Chilcote et al. 1986;Zimmerman and Reeves 2000;Hendry et al. 2002;Matala et al. 2008), and address some local conservation concerns. However, more extensive evaluations are necessary to accommodate broad conservation management priorities in a regional context (Beacham et al. 1999;Winans et al. 2004;Currens et al. 2009;Blankenship et al. 2011). In this study, we investigate patterns of neutral genetic variation in contrast to non-neutral (putatively adaptive) genetic variation. We employed a multi-phased test approach to categorize non-neutrality (selection candidacy) of loci that was procedurally similar to several previous studies (Narum et al. 2010b;Matala et al. 2011). Our primary objective was to provide an extensive characterization of genetic variation of steelhead trout throughout the entire Columbia River drainage to inform conservation management of this species. Study questions are threefold: (i) Is neutral population structure using SNPs consistent with previous studies that utilized other marker types [Winans et al. 2004;Currens et al. 2009;Blankenship et al. 2011]?, (ii) Is there significant evidence for candidate SNPs under selection and indications of putative adaptive variation?, and (iii) How do patterns of neutral and non-neutral genetic variation compare and contrast among populations that occupy highly variable environments across the landscape. Lastly, we discuss how non-neutral differentiation in relation to climate change may improve our understanding of population viability, and promote informed conservation that will complement existing methods implemented for the practical management of many diverse and often imperiled species.

Methods
Sampling, genotyping, and descriptive statistics Our final data set consisted of 9011 steelhead trout samples, representing 145 collections spanning the sampling years 1996-2011 (Table 1; Table S1). Collections will hereafter be referred to as populations. The data set was primarily comprised of natural-origin populations (n = 133), but some hatchery exceptions are included (n = 12). Both the coastal and inland lineages were represented but in skewed numbers of populations: coastal (n = 24), and inland (n = 121; Table S1; Fig. 1). Sample sizes genotyped per population and by major population group (MPG) had minimum numbers ranging from n = 18 to n = 90, and maximums ranging from n = 30 to n = 164; Table 1). Genomic DNA was extracted from fin or opercle tissues of juvenile and adult fish preserved dry on Whatman paper (LaHood et al. 2008) or stored in individual vials containing 100% nondenatured ethanol. For DNA extraction, we used a standard Qiagen â DNeasyTM protocol, or NexttecTM Genomic DNA Isolation Kits from XpressBio (Thurmont, MD, USA) following the manufacturer's standard protocol.
A total of 191 SNP loci were genotyped with Taqman assays (Applied Biosystems, Grand Island, NY, USA). Our locus panels were comprised of SNPs developed by multiple sources (Table S2). All loci were ascertained from a broad coast-wide sample of populations, including Alaska, Washington, British Columbia, California, and Russia. Most SNPs were developed in a process of mining expressed sequence tags (ESTs) from GeneBank (https:// www.ncbi.nlm.nih.gov/genbank/), followed by sequencing via primers developed for EST sequence flanking regions. Generally, the coding or noncoding nature of specific ESTs, and associated functions were unknown (see references for additional information, Tables S2 and S3). Forward and reverse primer/probe sequences for Taqman assays are available in Ackerman and Campbell (2012). Polymerase chain reaction (PCR) amplification of loci followed the protocol of Hess et al. (2012) and included an initial preamplification step. Successful genotyping for a given sample was defined proportionally as <10% missing data (i.e., fewer than 19 of 191 SNP genotypes per individual). Three species-specific SNP markers (Ocl_gshpx-357, Omy_my-clarp404-111, and Omy_Omyclmk438-96) were used to screen for species ID and hybridization between O. mykiss and O. clarkii subspecies (Hess et al. 2012). All individuals identified as hybrids and congeners (n = 15 coastal, n = 79 inland) were subsequently removed from the data set prior to analyses. Following this screening exercise, the three species-diagnostic loci were removed from the data set. Eight SNPs in our pared panel of 188 SNPs have been previously identified as a priori candidate loci for selection related to environmental variables. Specifically, two loci are putatively associated with thermal stress-induced mortality (Narum et al. 2013): Omy_hsp47-86 and Omy_OmyP9-180. Five SNP loci (Omy_aldB-165, Omy_gdh-271, Omy_Ogo4-212, Omy_stat3-273, and Omy_tlr5-205) have been previously identified as associated with temperature variation in desert versus montane environments, and one locus (Omy_hsf2-146) was putatively associated with precipitation in the same study (Narum et al. 2010b). Two additional loci have been shown to differentiate anadromous and resident life-history types ): Omy_ndk-152 and Omy_LDHB-2_i6. In the following sections, these 10 loci are referred to as having 'precedence' in association tests (Table S3).

Population structure within and among lineages
Throughout our methods and analytical approach, the coastal and inland lineages were evaluated separately. First, we verified the resolving power of our SNP markers to discretely differentiate the two steelhead lineages that have been characterized in previous studies based on other genetic markers (e.g., allozymes, Currens et al. 2009;microsatellites, Blankenship et al. 2011), while confirming lineage of origin for each population. This test was conducted irrespective of classification of loci as neutral or non-neutral. The program STRUCTURE version 2.3.4 (Pritchard et al. 2000) was used to estimate the mean admixture proportions (Q), or fractional group fidelity among individuals in each of 145 populations, setting k = 2 (two inferred groups; coastal and inland). Default settings were used for the ancestry model (i.e., admixture) and frequency model (i.e., correlated allele frequencies), with a burn-in of 50 000 and 250 000 MCMC repetitions. Locus-specific and global pairwise F ST (h of Weir and Cockerham 1984) were calculated within each lineage using GENEPOP. A multivariate principle coordinates analysis (PCoA) was performed in GENALEX version 6.2 based on matrices of pairwise F ST to quantify amounts of variation.

Identifying putatively neutral loci
In classifying the nature of SNP loci, we define three possible outcomes or distinctions. A locus may be (i)unaffected by selection (neutral), (ii)directly under selection (affecting one or more traits under selection), or (iii)indirectly affected by the process of selection (selection at a linked locus). In all subsequent discussion, the term 'nonneutral' variation will be used to represent the latter two distinctions. We used an outlier approach based on simulation methods initially proposed by Beaumont and Nichols (1996) to identify outlier SNP loci putatively under selection. We conducted outlier tests as implemented in ARLE-QUIN version 3.5 (Excoffier et al. 2005) to test for excessively higher or lower F ST than would be expected under the assumptions of neutrality (Beaumont and Nichols 1996;Beaumont and Balding 2004). Tests were performed using 100 simulated demes and 50,000 coalescent simulations to generate a null distribution under neutral expectations around observed F ST values (with confidence intervals). Due to the potential high error rate of the hierarchical island model , a finite island model was assumed for outlier tests. Average locus heterozygosity versus F ST was plotted to represent the 1% and 99% quantiles, and loci lying below or above these quantiles were outliers putatively under balancing or directional selection, respectively. Outlier loci can be neutral or non-neutral and therefore relying solely on outlier tests to characterize the selection candidacy of loci risks a resulting high rate of false positives. For example, populations may be highly differentiated because of demography, or skewed patterns of isolation by distance (Akey 2009;Hermisson 2009;Narum and Hess 2011). In fractal landscapes, such as stream networks, patterns of genetic structure can be coincident with patterns or complexity of stream branching, likely resulting in elevated variance around neutral F ST (i.e., false-positive outliers; Fourcade et al. 2013). For our exploration of nonneutral differentiation, outlier methods were used in combination with regression models with control variables to test environmental associations. This reduces the risk of false-positive conclusions , and aids in identifying the influences of environmental or habitat heterogeneity on the spatial distribution of genetic diversity (inferred selection gradients).

Defining physical and control variables in an association test framework
Physical habitat variables used to gauge environmental heterogeneity were chosen to represent regional and local climate regimes. Latitude and longitude coordinates for each population were obtained from field data or were estimated using ARCMAP version 10 (copyright © 2010 ESRI) and were used to gather all physical variable measurements (Tables  S4 and S5). Migration distance from the Columbia River estuary and pairwise stream distances (kilometers) were calculated using the GIS application previously described. Elevation was obtained using Google Earth (Image © 2012 TerraMetrics) and verified for accuracy in ARCMAP version 10. Monthly averages for temperature and total precipitation measurements (rain and melted snow) were generated using parameter-elevation regressions on independent slopes model (PRISM) of the Oregon Climate Service; http://www.prism.oregonstate.edu/. Monthly average maximum and minimum air temperatures were simulated at 800-m cell resolution from a model based on climate normals from a 30-year period  in PRISM. Water temperature readings were unavailable, and therefore, we used air temperature readings as a proxy for in-stream temperature. In the Pacific Northwest region, stream temperature trends closely with air temperature, particularly on temporal scales (Isaak et al. 2012b). However, changes in stream temperatures occur more slowly and are affected by groundwater and riparian buffering, glaciation, and complex topography. In total, five physical variables were used to characterize local and regional climate: precipitation, temperature, elevation, geographic coordinates, and migration distance. We expected autocorrelation between all five variables. The precipitation and temperature variables we evaluated on the basis of annual and seasonal measurements. Geographic coordinates and migration distance add directional elements related to regional weather patterns.
The highly dependent nature of F ST on demographic history may confound the ability to accurately identify selection candidate loci (Beaumont 2005). When isolating forces appear correlated with variation across the landscape , it can be difficult to confidently differentiate demographic influences from non-neutral (putatively adaptive) associations. Prior to conducting association tests, we assembled panels of putatively neutral SNP loci (defined by exclusion of outliers) for each lineage, and as a conservative measure, we established control variables for underlying neutral population structure. Using STRUCTURE version 2.3.4 (Pritchard et al. 2000), we evaluated k ranging from 1-8 clusters for each lineage, and the most likely number of distinct populations (k) was selected based on five iterations for each potential k value. We used program default settings for the Monte Carlo Markov Chain procedure, with 30 000 burn-in followed by 150 000 MCMC repetitions. The (Dk) statistic derived from the second order rate of change of the likelihood function (Evanno et al. 2005) provided an improved estimate of the mode of true (k), and the mean Q values for each population were established using the full search algorithm in CLUMPP (Jakobsson and Rosenberg 2007). Secondly, a pairwise population F ST matrix was generated in GENEPOP to conduct PCoA in GENALEX version 6.2. The resulting (Q) inferred admixture proportions from STRUCTURE and the first three Eigen vectors (EV) from PCoA analysis were used to control for underlying neutral population differentiation in subsequent association tests.

Identifying candidate loci: association with environment
We used two regression methods to identify potential candidate markers in association with the previously described environmental or climate-related variables. First, population MAF at each locus was plotted against each of five physical variables (Table S2) in univariate linear regression analysis. The P-values for the correlation coefficient were calculated in Microsoft Excel. Statistical significance (a = 0.05) was adjusted for the number of simultaneous tests as a conservative measure using a Bonferroni correction to exclude false-positive results (Rice 1989). Eighteen populations from the genetically distinct Clearwater River are located in a geographic region characterized by high precipitation. All 18 ranked in the top 20 for greatest amount of spring and summer precipitation among 121 inland populations (variable nonindependence). In initial univariate regression analyses, 86 loci (48%) showed significant correlation with either spring or summer precipitation. Therefore, as a precautionary measure to guard against what appeared to be a spurious geographical influence, regression coefficients were recalculated in the absence of the Clearwater River group. Then, of the original 86 loci only those that maintained a significant correlation with precipitation were considered 'true' correlations.
Second, we used DISTLM forward (McArdle and Anderson 2001) to perform multivariate multiple regression with permutation tests on locus-specific pairwise F ST distance matrices versus pairwise matrices of values for physical variables (Carmichael et al. 2007;Olsen et al. 2010;Matala et al. 2011). Conditional tests that permute residuals under a reduced model (Anderson and ter Braak 2003) were performed in DISTLM forward using a stepwise forward selection procedure that fits individual variables or sets of variables sequentially in the linear model. Once the most informative variable or set of variables (explaining greatest amount of variation) is established, the remaining variables are fit to the model, while those selected in previous steps are held as constants. Specific associations were determined on the basis of highest rank (P < 5%) variable for a given locus, and the method accounts for correlations that are likely present between the variables used to characterize climates (Tables S4-S7). The temperature, precipitation, and neutral control (Q and EV) variables were evaluated in respective 'sets' as demonstrated by Anderson et al. (2004), where seasonal averages for the former two were defined as: winter (January-March), spring (April-June), summer (July-September), and fall (October-December).
In summary, the first two criteria used to flag loci for further consideration as candidates in our multitest process of categorization were as follows: (i) significant F ST outliers at a 99% confidence threshold or (ii) loci with precedence of association [see Table S3]. All 180 loci were tested for correlation based on linear regression (criterion #3). Lastly, all loci flagged as candidates based on the first three criteria were scrutinized on the basis of multivariate multiple regression with permutation tests (criterion #4). Note that as the final discriminatory step, when multivariate regression tests ranked a neutral control variable (Q or EV) as highest in significance among all tested variables, the locus in question was precluded from categorization as non-neutral (selection candidacy) regardless of results based on the other three criteria (e.g., linear correlation with environment). This approach resulted in three subpanels of SNP loci for genetic analysis: candidate locithose meeting stipulations outlined previously, neutral locithose showing strongest correlation with neutral structure, and 'ambiguous' loci. The latter category of SNP was primarily comprised of F ST outlier loci or loci having precedence of association which ultimately failed to reach non-neutral status based on regression analysis. Ambiguous loci (by definition) could not be confidently characterized as neutral or non-neutral given conflicting test results and were therefore excluded from subsequent evaluations of genetic differentiation. Candidate loci putatively under selection were subsequently used to evaluate non-neutral or adaptive variation. Putatively neutral loci were used to evaluate underlying neutral (demographic) population structure.

Comparing neutral and putatively adaptive variation
Using the newly established neutral and candidate SNP panels, comparisons were drawn based on multiple analyses to show the corresponding amount of variation described by both neutral and non-neutral variation (Nosil et al. 2007). Nonparametric Mantel tests for isolation as a function of environment were conducted using 9999 permutations of pairwise matrices of F ST (within lineages) against absolute pairwise difference in values for environmental variables. Nei's standard genetic distance (Nei 1972) was calculated for each lineage, and distance was displayed in the topology of an un-rooted neighbor-joining (NJ) tree using the analysis program PHYLIP version 3.68 (Felsenstein 2008). The SEQBOOT option was implemented to generate 1000 simulated data sets, and a consensus topology with bootstrap support was generated using the CONSENSE option. The program TREEVIEW version 1.6.6 (Page 1996) was used to graphically display the trees. Different trees were generated using either neutral or candidate SNP panels for each lineage.
A ranking method was employed based on averages for five climate variables to compare climate similarities that occurred in the clustering of populations within tree topologies. Specifically, populations were ranked in descending order (warmest to coolest) using corresponding mean maximum temperatures for both spring and summer independently. Elevation was ranked in ascending order assuming lowest elevation equals warmest climate. Lastly, populations were ranked in ascending order for mean precipitation (least equal to driest/warmest climate) in both spring and summer independently. Following independent ranking of populations for the five variables, the average rank across all five was used to order populations from most hot and dry to most cold and wet, and to compare relative climates in the tree topologies.

Results
Descriptive statistics and population differentiation Data quality analyses indicated only minor issues concerning locus scoring accuracy, nonrandom association of loci, and population admixture. We observed departures from expected genotypic proportions in 208 of 27 260 tests (188 loci 9 145 populations) at an adjusted significance threshold of P = 0.0046. Generally, the HWE deviations were not specific to any population or locus, spanning 100 of 145 populations, and 109 of 188 loci. Exceptions occurred in both Abernathy Creek (Ref. #12) and Canyon Creek (Ref. #9), each with 10 population-specific departures. There were also 12 HWE departures at locus OMS00087, which was therefore removed from all subsequent analyses. Tests for linkage disequilibrium revealed five pairs and one trio of loci that remained significantly out of equilibrium in at least 10% of populations after adjustment for multiple tests (P-value < 0.0001). Linked SNP pairs were as follows: (OMS00133 and Omy_rapd-167), (Omy_CRBF1-1 and Omy_crb-106), (Omy_Il-1b_.028 and Omy_Il1b-198), (Omy_SECC22b-88 and OMS00169), (Omy_ndk-152 and Omy_u09-52.284), and the trio (Omy_GHSR-121, OMS00176 and Omy_mapK3-103). In each pair, only the locus with the highest MAF was retained in the data set (Table S2).
Following paring of eight loci for HWE and linkage disequilibrium, the final data set included 180 loci for use in subsequent analyses. No SNP locus exhibited fixed allele frequencies, and variability ranged widely both within and between steelhead trout lineages.  In Bayesian clustering analyses, the mean admixture proportion (Q) for two inferred populations was 95.3% in Q1 for coastal populations, and conversely, 93.8% for Q2 among inland populations (Fig. 2). Mean values would likely have been higher, but populations adjacent to the crest of the Cascade Mountains (demarcating range limits of coastal and inland types) appeared admixed between lineages to varying degrees, substantially lowering inferred group fidelity. The genetic characterization of the Big White Salmon River population   Fig. 1; c) was significantly more similar to the coastal lineage despite its location among the middle Columbia River DPS and was therefore evaluated throughout these analyses as a coastal population.

Controlling for underlying neutral differentiation in association testing
Outlier tests revealed eight loci putatively under directional selection in the coastal lineage and 10 in the inland lineage (Tables S3, S6, and S7). No influence of balancing selection was observed. The remaining putative neutral loci in the inland lineage (n = 170) and coastal lineage (n = 172) were used in PCoA and Bayesian cluster analyses to establish neutral control variables. For both lineages, the most likely number of inferred populations was k = 2. However, this was likely a result of the relatively deep divergence between the Willamette River DPS and remaining lower Columbia River populations in the coastal lineage (ICTRT 2003), and similar divergence between the genetically distinct Klickitat River subbasin and remaining inland populations. To evaluate structure at a finer scale, we chose the appropriate number of inferred groups from the next peak in the second order rate of change (Δk) for log e [Pr{K}], which occurred at k = 6 (Q1-Q6) within each lineage independently (Table S4).
For the coastal lineage, mean Q partially distinguished summer-run populations, a less common life-history type among this lineage (Q5 = 65%), two groups in the upper Willamette DPS (Q2 = 58% and Q4 = 61%), the Washington coast population from the Quinault River (Q1 = 89%), and the Big White Salmon population on the Cascade Crest (Q6 = 81%). Most of the remaining lower Columbia populations had greatest proportions in Q3, ranging from 34% to 77%. For the inland lineage, the six inferred populations partially distinguished major subbasins, where the middle and south forks of the Salmon River were both dominant in Q1 (74%), the south and middle forks of the Clearwater River in Q4 (53%) and Q3 (76%) respectively, the Klickitat River in Q2 (68%), and the upper Salmon River in Q5 (56%). Regions with highest mean admixture proportion in Q6 included the majority from the upper and middle Columbia, and the Grande Ronde and Imnaha rivers in the Snake River Basin. Most individual populations within these regions did not exhibit definitive or strong fidelity in Q6 (ranging 27-68%, with a mean of 44%), and the highest admixture proportion by watershed occurred in the Yakima and John Day Rivers, with mean Q6 of 59% and 54%, respectively (Table S4). Differentiation was slightly higher within the inland lineage, with a larger range of pairwise population F ST (0.0002-0.1889, mean 0.0421) compared to values within the coastal lineage (range 0.0001-0.1098, mean 0.0413). Results of PCoA based on pairwise F ST corroborate primary distinctions revealed from STRUCTURE analyses. Among the inland lineage, observed distinctions coincide well with several of the MPGs within the Snake River DPS.
Association tests to identify non-neutral loci: putatively adaptive variation The adjusted probability threshold for determining significant association using linear regression was P < 0.00029. On the basis of linear regression alone, we identified 12 loci in the coastal lineage that were significantly correlated with one or more environmental variables (Table 2; Table S3). In the inland lineage, 59 loci were identified as significantly correlated, including 17 loci for spring precipitation, and 21 loci for summer precipitation. From combined results using outlier tests, association precedence, and linear regression, we ultimately flagged 26 loci in the coastal lineage and 62 loci in the inland lineage for further examination of environmental correlation. Following the final ranking phase based on multivariate regression, the pared locus classifications included nine candidate loci putatively under selection in the coastal lineage and 22 candidates in the inland lineage (Table 2). All remaining loci were considered either neutral or ambiguous (Table S3). The group of loci ultimately classified as ambiguous were typically comprised of outlier loci for which the highest ranked variable was one of our control variables for neutral differentiation (Q or EV; Table S4). The SNPs deemed ambiguous may indeed be under selection, but we failed to identify associations with the suite of environmental variables tested here.
Of the eight SNP loci showing precedence as candidates for local climate from previous studies, seven exhibited significant association with climate variables in at least one lineage in our broad-scale evaluation across many populations. Two loci previously identified as associated with survival under thermal stress (Omy_hsp47-86 and Omy-P9-180; Narum et al. 2013), were significantly associated with either temperature or precipitation (Fig. 3). Two loci previously identified as candidates for temperature (Omy_stat3-273 and Omy-gdh-271; Narum et al. 2010b) were found to be highly correlated with precipitation in at least one lineage (Tables 2 and S7), but not for temperature. Previous associations of locus Omy_hsf2-146 with precipitation, and locus Omy_aldB-165 with temperature (Narum et al. 2010b) were corroborated within the coastal lineage. Finally, locus Omy_tlr5-205 previously associated with temperature (Narum et al. 2010b) was significant for temperature and precipitation in the inland lineage. Notably, two of these loci (Omy-P9-180, Omy_stat3-273) were candidates for climate association in both lineages.
Additional novel candidate loci were detected that have not previously been identified as loci putatively under selection.
The two loci that were previously determined to be associated with differentiating resident from anadromous life-history forms (Omy_ndk-152 and Omy_LDHB-2_i6; Narum et al. 2011) could not be directly tested for association with life-history forms because most populations in our analyses were field identified as anadromous (e.g., juvenile smolt or adult steelhead phenotypes). However, Omy_ndk-152 was a significant F ST outlier. A single locus (Omy_Ogo4-212) with precedence of putative association for climate or life history had no confirmed associations for any habitat variable in our analyses and was deemed ambiguous as it may be a candidate at smaller, more local scales.
Among coastal lineage populations, precipitation and distance from the ocean (i.e., migration distance, and lat/long coordinates) were equally the most commonly correlated environmental factors. To clarify, the com-mon point of origin for all measured migration distances was the Columbia River estuary, and therefore, distance associations were not necessarily an example of IBD gene flow which relates to direct distance between populations. In the inland lineage, environmental correlations were dominated by precipitation, then temperature. Specifically, spring, summer and total annual precipitation were the physical variables most often associated with genetic variation, and several significantly correlated loci spanned both lineages (Fig. 3). In some cases, the ranking of variables in multivariate regression was inconsistent between tests of individual variables versus sets of variables. For example, while an individual variable (e.g., Q1, summer precipitation) may have ranked highest for a given locus, the corresponding variable sets (e.g., Q1-Q6, total precipitation) may have ranked low for that same locus (Table S6 and S7). An explanation for this outcome, centered on differential environmental influences among life-history stages, follows in the discussion.  (Narum et al. 2013); ' †'. Temperature or precipitation association (Narum et al. 2010b). Climate associations are: P, precipitation; T, temperature; E, elevation; D, distance.

Comparing neutral and non-neutral differentiation
The NJ tree topologies based on neutral SNP panels generally showed genetic distance relationships among populations that accurately aligned with the five DPS delineated under the ESA (Figs 4A and 5A) Figure 3 Linear regression plots, representing results for eight of the most highly correlated SNP loci (see Table 2). Plots A-C are temperature associations in the coastal lineage, while D-H are precipitation associations in the inland lineage. Symbols correspond with DPS: diamondlower Columbia and Southwest Washington, starupper Willamette, squaremiddle Columbia, circleupper Columbia, and triangle -Snake.
populations in the upper Willamette River DPS, while populations in the upper Willamette River west side tributaries were distinctly partitioned from upper Willamette east slope populations (Fig. 4A). The known summer-run populations in the coastal lineage cluster together with significant bootstrap support, rather than clustering with winter-run populations from the same tributaries (i.e., Kalama and Hood rivers). Middle and Upper Columbia MPGs and five MPGs in the Snake River (Ford 2011) are well differentiated within the inland lineage, although bootstrap support was minimal in some instances. Finer scale definition observed in the Salmon and Clearwater rivers indicates significant genetic distinctions between middle and south fork populations from both subbasins (within MPGs), and between each of those groups and corresponding populations in the lower sections of both subbasins (Fig. 5A). These within-watershed distinctions in both the Clearwater River and Salmon River subbasins are in agreement with previous reports (Nielsen et al. 2009), but the level of resolution that differentiates the Clearwater River subbasin from the Salmon River subbasin has not been previously reported. In the tree topology for neutral structure in the inland lineage, populations within MPGs or subbasins were also frequently charac-terized by climate similarity. However, deviations from ESA-or regionally based (e.g., distance) clustering were rare regardless of differences or similarities in climate.
The NJ tree topologies based on non-neutral structure were comprised of all candidate SNPs in the coastal (n = 9 loci) and inland (n = 22 loci) lineages. Trees presented population clustering patterns that often did not correspond with delineations of DPS or MPG (Figs 4B and 5B), albeit with limited bootstrap support. This was presumably a function of topologies based on small numbers of loci. Climate based clustering patterns were more apparent in the inland lineage, where several of the relationships depicted in the non-neutral tree topology conformed to warm or cool climate similarity irrespective of subbasin or MPG distinction. For example, the neutrally distinct Klickitat River and Yakima River groups were each split to show some association to climate, most Clearwater and Salmon river groups with similar climates were combined on the same basal node or primary branch of the tree, and the five lower Clearwater River populations were differentiated by climate distinctions. Overall, NJ results based on non-neutral SNPs for the coastal lineage make it difficult to discern any basis for population similarities within branching patterns. In particular, climate rankings for coastal populations were noninformative for understanding genetic    (Table S1).
distance relationships associated with physical variables ( Fig. 4A; climate ranks not shown). Allele frequencies were most commonly correlated with the precipitation variables in association tests. This was further verified with Mantel tests of isolation by precipitation (IBP). Tests were based on a five-locus subset of nine candidate loci in the coastal lineage and a 15-locus subset of 22 candidate loci in the inland lineage; subsets were chosen based on significant associated with precipitation specifically (Table 2; Table S6 and S7). For the coastal lineage, a test based on all five loci indicated significant IBP for mean annual precipitation (R 2 = 0.235; P = 0.0009), mean spring precipitation (R 2 = 0.201; P = 0.0006), and mean summer precipitation (R 2 = 0.296; P = 0.0004). The panel of 158 neutral loci for the coastal lineage was also tested to account for potentially confounding neutral patterns of IBP, but none were observed. Mantel tests of IBP based on 15 precipitation-associated loci in the inland lineage (pairwise comparisons between 121 populations) indicated significant IBP for mean spring precipitation (R 2 = 0.039; P = 0.0001), and for mean summer precipitation (R 2 = 0.037; P = 0.0001); no correlations were observed between seasonal precipitation and the panel of inland lineage neutral SNPs (n = 146).

Discussion
This study demonstrates patterns of neutral and non-neutral variation in O. mykiss at a broad geographic scale that will be a valuable contribution to improved conservation management of this species in the CRB. Neutral structure was complementary to preceding studies (Winans et al. 2004;Currens et al. 2009;Blankenship et al. 2011) and confirms the existence of two deeply divergent steelhead trout lineages (i.e., coastal and inland) across the species range, along with finer scale population structure. We identified neutral divergence among steelhead trout populations that coincides substantially with current   (Table S1).
DPS delineations among lineages, as well as MPGs (ICT-RT 2003;Good et al. 2005;Ford 2011) demarcated within the Snake River Basin (five MPGs), Middle Columbia River (four MPGs) and Upper Columbia River (one MPG). However, using our SNP panel, fine-scale resolution of neutral differentiation was detected in the Snake River DPS at a level previously unreported based on other marker types. Similar to other studies (Nielsen et al. 2009;Campbell et al. 2012), we found no evidence for multiple evolutionary lineages in the Snake River, but local environments may influence their differentiation. For reference, six Snake River hatchery stocks were genetically similar to their natural-origin counterparts that distinguish the Snake River DPS. In the coastal lineage, neutral differentiation among populations from east and west side tributaries in the upper Willamette River does not appear to fit DPS unit distinctions and may warrant further investigation to define conservation boundaries. From our analyses, the Big White Salmon river population is currently the only population within the Middle Columbia River DPS that is more consistent with a coastal lineage origin, suggesting further evaluation of its classification may be necessary. Additionally, we show clear evidence for non-neutral (putatively adaptive) variation that is significantly associated with climate in the region. Specifically, several candidate markers were primarily associated with precipitation and temperature. This study demonstrates that candidate markers can be applied at broad geographic scales to describe the extent of potential local adaptations across highly variable climates. To evaluate non-neutral differentiation, our designation of candidate SNP loci was applied conservatively to reduce false-positive results. Conclusions of climate association were based on a large and diverse number of populations and supported by a framework of multiple test criteria and strict likelihood thresholds (Balkenhol et al. 2009;Schoville et al. 2012). De Mita et al. (2013 suggest using multiple robust methods, and emphasize numbers of populations over numbers of individuals per population for improved confidence in determining selection candidacy of loci. A greater number of candidate loci were discovered in the inland lineage, represented by a larger number of populations than were observed in the coastal lineage. In addition, the interior region of the CRB is larger, characterized by a highly variable environment relative to the coastal environment (greater range of wet/dry and warm/cool conditions). Several loci were identified in association with tested environment or landscape variables, but precipitation and temperature proved to be the most common (# loci) and strongest factors in non-neutral population differentiation that spanned both lineages (e.g., at SNP Omy_stat3-273, Omy_OmyP9-180).
Although our association tests indicated relationships between environmental variation and genetic heterogeneity (i.e., allele frequency variation), it is challenging to decipher the biological relevance of those correlations. Climate variables such as temperature may affect behaviors and phenotypes alike (Perry et al. 2001;Zydlewski et al. 2005), sometimes relatively rapidly in response to perturbations (Kovach et al. 2012). Previous landscape genetics studies in salmonids have shown significant allele frequency correlation with precipitation that have been described as neutral influences on population structure (Narum et al. 2008;Blankenship et al. 2011;Olsen et al. 2011). Olsen et al. (2011) for example, present a compelling discussion on possible correlations between precipitation and gene flow. This may occur if increased flooding results in decreased stream stability, which in turn may affect fish dispersal. Alternatively, the similarity of outcomes across study locales may suggest that the distribution of precipitation across the landscape elicits, or is indicative of common adaptive responses. For example, streambed scour related to rain-on-snow events may have lesser impact on fish that adapt by burying eggs at deeper depth (Goode et al. 2013). Thus, climatic variables such as precipitation may impart either 'neutral' landscape effects, 'non-neutral' (putatively adaptive) landscape effects, or both. In either case, one could argue that adaptation and selection play a key role in shaping the genetic landscape, which is more apt to be revealed through non-neutral genetic variation (Limborg et al. 2011).
The genetic population structure we observed differed markedly depending on whether our evaluations were based on neutral or non-neutral differentiation. Neutral differentiation generally reflected a pattern of distancerestricted gene flow, and in the context of demographic factors, population clustering was relatively intuitive within and among regions (clustering by subbasins, DPS, etc.). In contrast, population-clustering patterns identified using candidate loci (non-neutral differentiation) were presumably based upon environmental variation and frequently did not align with current steelhead trout DPS delineations. Moreover, clustering similarities based on non-neutral variation did not necessarily coincide with geographic proximity, nor were patterns among populations always transparent in regard to biology (e.g., coastal lineage migration timing). Thus, the juxtaposition of genetic signals show that neutrally dissimilar populations may exhibit non-neutral similarity (and vice versa) related to environment, and irrespective of geographic distance. Compared with neutral variation, candidate loci revealed some novel distinctions between populations, presumably reflective of environmental differences between regions and locales, particularly for the inland lineage. For the coastal lineage, less variable environments may produce moderate selective pressure for local adaptation among surveyed portions of the species range. More extreme environments in the southern portion of the coastal lineage range (e.g., the Sacramento River system) might be expected to provide stronger selective pressure.
The relative influences of climate or environment on genetic variation (e.g., putative adaptive responses) may occur during particular life-history stages of an organism, which can be difficult to discern. Temperature for example has been shown to have variable effect on different life stages of fish (Fowler et al. 2009). In our multivariate regression analysis, we noted inconsistencies between tests on individual physical variables and corresponding sets of seasonal variables. Seasonal environmental variation may impart a disproportionate selective influence coincident with age related behaviors (e.g., emergence time or outmigration time; Coleman and Fausch 2007). Our results of non-neutral differentiation indicate that precipitation during specific juvenile rearing or adult spawning periods may be more effectual or correlative than average annual fluctuations in precipitation.
If correlations between loci and physical variables are indicative of an adaptive influence (Bonin et al. 2009), they are not necessarily representative of direct causal relationships (e.g., selection for specific phenotypes). Note that from among our panel, the locus Omy_stat3-273 was previously identified among desert and montane resident O. mykiss populations as being associated with temperature (Narum et al. 2010c). We demonstrated in our evaluation that this locus was correlated with precipitation but not with temperature, yet in actuality it is likely associated with overall climate and thus affected by myriad aspects of habitat variability. We identified patterns of putatively adaptive variation associated with climate, but corresponding phenotypic trait variation was not measured. However, distinguishing whether phenotypic changes are genetically based or the result of phenotypic plasticity has proven difficult (Merila and Hendry 2014). Often many genes are involved in adaptive responses to specific environments and/or climates (Kassahn et al. 2007) and controlled experiments would be necessary to make direct inferences on interactions between environments, phenotypes, and specific genes. Rather than providing unequivocal evidence of adaptation on the basis of phenotypic attributes, our study identifies locus associations that can be seen as indicators of related but undetermined causative environmental forces, such as early seasonal onset (Bradshaw and Holzapfel 2008). For example, distinct populations of steelhead trout in Oregon's Hood River occupy either glacial fed or spring fed tributaries (Underwood et al. 2003;Matala et al. 2009). Distinctions likely arose due in part to selection in variable environments, characterized by in-stream flow rate, thermal stability, and other stressors (Lytle and Poff 2004). When altered by the forces of climate change (e.g., lengthening seasons) those environments may in turn affect phenotypes such as spawn timing or migration timing (Crozier et al. 2011;Reed et al. 2011). Nevertheless, it cannot be stated unequivocally that climate associations are indicative of loci under direct selection.
Global climate change and effects of climate on ecosystems has earned the attention of the scientific community and freshwater fish are expected to be negatively impacted (McCullough et al. 2009). There is growing consensus that habitats occupied by salmonid species in North America and Europe will or have already experienced climate related alterations (Crozier and Zabel 2006;Battin et al. 2007;Winfield et al. 2010;Nielsen et al. 2013). Most of the CRB has been identified as habitat at high risk of thermal stress in salmonids (Wu et al. 2012), and conservation of many populations is already warranted (Busby et al. 1996;ICTRT 2003;USOFR 2006). Climate-altered habitats may lead to rapid evolution (Barrett et al. 2010) or shifting range margins of myriad species (Hickling et al. 2005;Chen et al. 2011). In fishes, this may manifest as range contractions in cold environments or expansions in warm environments, and adaptive versus neutral genetic divergence is likely to occur at differing spatial and temporal scales (Conover et al. 2006). Therefore, conservation strategies based heavily on neutral genetic variation, with an under-emphasis on the distribution of non-neutral variation, risk detrimental impacts to locally adapted population segments and should be scrutinized (Pearman 2001;Schwartz et al. 2009). We concur with Funk et al. (2012) that neutral and non-neutral elements of differentiation are not mutually exclusive and should be used in concert to provide a cautious and conscientious description of species diversity in a management framework. Monitoring trends between neutral and nonneutral differentiation and the corresponding degree of disparity observed over time may be fundamentally important for addressing impacts of climate change (i.e., adaptive responses). This will conceivably have a potential role in reshaping conservation boundaries to safeguard species diversity. The design of our steelhead trout study follows this perspective; identifying selection candidate loci and environmental associations, then drawing comparisons with patterns of neutral population divergence. We observed non-neutral genetic heterogeneity of populations in association with environment. However, in our results, neutral diversity encapsulated overall steelhead trout diversity with better clarity and finer resolution across established DPSs. Nevertheless, monitoring of historical and/or contemporary non-neutral differentiation of populations adds a unique dimension to the characterization of these populations (Willing et al. 2010). The candidate markers identified in our study are expected to be useful for modeling population level responses to climate change, and future population genomics approaches with thousands of steelhead SNPs should provide improved estimation and resolution of adaptive differentiation in the CRB.
There is a palpable consensus among researchers and managers cautioning against broad scale, rigid approaches to conservation, and acknowledging the need to address productivity limitations of myriad organisms in need of protection (Clostio et al. 2012). However, conflicting opinions on the role of genetics in conservation still pervade management agendas (Waples 1995;Fraser and Bernatchez 2001;Garcia de Leaniz et al. 2007;Allendorf et al. 2010). Maintaining overall conservation unit viability must necessarily account for the viability of all demographically important population components within those units (Crandall et al. 2000;Latch et al. 2011), particularly where demographic instability (e.g., genetic drift in small populations) may reduce overall adaptive variation or adaptive potential within regions (Kawecki and Ebert 2004;Schoville et al. 2012). It is likely that the relevance and contribution of non-neutral variation is frequently overlooked or underutilized in conservation planning, but given recent calls for the incorporation of climate science in application of the ESA (e.g., McClure et al. 2013), the utility of such information should be highlighted. In the absence of efforts to regularly evaluate putatively adaptive population differences, there is presumably a greater risk of the loss of genetic diversity as climates and habitats continue to change through time. Over the long-term, the adaptive potential of many species across taxa will need to be further explored and considered in conservation planning. The real effects of a changing environment, including shifting ranges, may not be uniformly realized or fit tightly into predefined units (e.g., ESU, DPS, MPG).

Supporting Information
Additional Supporting Information may be found in the online version of this article: Table S1. Collection names and reference numbers, DPS, and MPG, for 145 steelhead trout populations. Table S2. List of 191 SNP markers assayed for O. mykiss in the Columbia River Basin. Table S3. List of 191 SNP markers assayed for O. mykiss in the Columbia River Basin. Table S4. Values for predictor variables (by population) to control for underlying population structure in association tests (i.e., neutral variation). Table S5. Values for environmental predictor variables used in association tests to evaluate landscape genetics (see Table S6 and S7). Table S6. List of final locus classifications based on significance in association tests. Table S7. List of final locus classifications based on significance in association tests. The list includes only loci that met at least one of the four candidate criteria (see methods section), and neutral loci listed are only those that were initially flagged as possible candidates.