Estimates of gene flow and dispersal in wild riverine Brook Trout (Salvelinus fontinalis) populations reveal ongoing migration and introgression from stocked fish

Abstract As anthropogenic impacts accelerate changes to landscapes across the globe, understanding how genetic population structure is influenced by habitat features and dispersal is key to preserving evolutionary potential at the species level. Furthermore, knowledge of these interactions is essential to identifying potential constraints on local adaptation and for the development of effective management strategies. We examined these issues in Brook Trout (Salvelinus fontinalis) populations residing in the Upper Hudson River watershed of New York State by investigating the spatial genetic structure of over 350 fish collected from 14 different sampling locations encompassing three river systems. Population genetic analyses of microsatellite data suggest that fish in the area exhibit varying degrees of introgression from nearby State‐directed supplementation activities. Levels of introgression in these populations correlate with water‐way distance to stocking sites, although genetic population structure at the level of individual tributaries as well as their larger, parent river systems is also detectable and is dictated by migration and influenced by habitat connectivity. These findings represent a significant contribution to the current literature surrounding Brook Trout migration and dispersal, especially as it relates to larger interconnected systems. This work also suggests that stocking activities may have far‐reaching consequences that are not directly limited to the immediate area where stocking occurs. The framework and data presented here may aid in the development of other local aquatic species‐focused conservation plans that incorporate molecular tools to answer complex questions regarding diversity mapping, and genetically important conservation units.


| INTRODUC TI ON
A major goal of conservation biology is to determine how anthropogenic influences are shaping wild populations and their genetic structure under altered habitat regimes (Bushar et al., 2014;Cornille et al., 2015;Inoue, Lang, & Berg, 2015). Centuries of human mitigated translocations have led to intraspecific hybridization across the landscape, often leading to a reduction in genetic variability while increasing genetic homogenization (Laikre, Schwartz, Waples, & Ryman, 2010;Ozerov et al., 2016). These changes in genetic structure have the potential to negatively impact native populations by transforming a population's natural genetic constitution, which may be synonymous with local adaption, in the form of allele replacement and/or gene-complex disruption (Edmands, 2007;Evans, Mekel-Bobrov, Vallender, Hudson, & Lahn, 2006;Laikre et al., 2010). In addition to potentially negative genetic effects, the introduction of nonnative conspecifics produces risks associated with the introduction of pathogens, parasites, and disease (Adlard, Miller, & Smit, 2015;Cunningham, 1996;Gaughan, 2001). Understanding how the intentional transference of nonsympatric individuals is influencing wild populations is therefore key to executing successful management plans at the species level (Teixeira, Azevedo, Mendl, Cipreste, & Young, 2007). In addition to considerations related to translocations and introgressive hybridization, understanding how gene flow and migration patterns are affected by landscape features is also imperative to addressing concerns related to reproduction, dispersal, and population viability (Couvet, 2002;Young, Boyle, & Brown, 1996).
Habitat fragmentation, the division of habitat into smaller and more isolated patches divided by a matrix of unnatural barriers, has the potential to greatly reduce gene flow and connectivity between individuals or populations of the same species (Fahrig, 2003). In situations where populations have become small and selection pressures are heavy, gene flow from human mediated transference of conspecifics has the potential to increase population sizes even if the resulting phenotypes are unsuited to the environment, in turn leading to increased genetic variation which over time may potentially allow for new adaptations (Holt & Gomulkiewicz, 1997;Sexton, Hangartner, & Hoffmann, 2014). Given the complexities associated with translocations and habitat connectivity, understanding how landscape structure and genetic structure are linked is key to making decisions about proper management strategies under altered habitat regimes.
The Upper Hudson River watershed is located in the Northeastern United States, and is a major feeder of the Hudson River, which flows south through New York State, terminating at the tip of Manhattan in New York City where it meets the Atlantic Ocean. The Upper Hudson River watershed is located in the Adirondack Park, and is surrounded by dense protected wild forest, making large areas of the watershed inaccessible, and it is therefore one of the most pristine remote watersheds in the State. Among the many fish species found in the Upper Hudson River watershed, wild Brook Trout (Salvelinus fontinalis; Figure 1) are among the most likely to be negatively impacted by habitat alterations (Merriam, Fernandez, Petty, & Zegre, 2017). Currently, over 300 lakes and ponds in New York State are actively managed as Brook Trout habitat, with records of wild reproduction in over a hundred (Baker et al., 1990). Despite this fact, the number of wild, unstocked, self-sustaining populations in New York is considered to be far lower, and is projected to make up only 5% of the total number of water bodies that have been sampled (Baker et al., 1990). In addition, New York was one of the first states in the United States to supplement wild populations for recreational fisheries enhancement. State-based stocking of Brook Trout began in the late 1800s (Daniels, 2011;Emery, 1985), and New York State currently maintains six strains of Brook Trout for stocking on an annual basis (NYSDEC, 2017).
Given that New York Brook Trout have been manipulated with supplemental stocking, populations have been disconnected due to habitat fragmentation, and water quality declines have reduced viable habitat, sorting out present-day genetic structure has become exceedingly complex (Perkins, Krueger, & May, 1993a, 1993bBruce, Hare, Mitchell, & Wright, 2018). Based on current climate projections, the United States Environmental Protection Agency (EPA) has predicted a 50%-100% decline in Brook Trout abundance for the region by the year 2100 (E.P.A., 2015). These predictions are compounded by myriad studies published over the past several decades that have elucidated the potential negative effects of stocking on top of native populations (Araki, Berejikian, Ford, & Blouin, 2008;Fraser, 2008;Laikre & Ryman, 1996;Laikre et al., 2010;Rhymer & Simberloff, 1996;Ryman & Laikre, 1991).
The degree to which Brook Trout populations have been, and are predicted to be, impacted by direct and indirect anthropogenic factors makes it critical that regions exhibiting heightened levels of effective population size and comparatively distinct genetic structure be distinguished and preserved (Ficke, Myrick, & Hansen, 2007;Gao et al., 2012). Understanding these patterns of biogeographic structure is therefore essential to maintaining viability and evolutionary potential, not just for Brook Trout, but for all species across changing landscapes (Abdul-Muneer, 2014;Bruce et al., 2018).
In the Upper Hudson River watershed, three main river systems act as the major hydrological feeders to the downstream Hudson River. These rivers include the Boreas River, the Schroon River, and the Upper Hudson itself (Figure 2). These river systems are substantially larger than their feeder tributaries, and provide habitat for a wide range of nonnative competitors and potential predators (Greeley, 1955;Loukmasa & Perryb, 2015). In addition, all three of these river systems experience summer high temperatures in the range of 20-25°C, making them a potential thermal stressor for wild Brook Trout and their offspring (SHEDS Development Team, 2017).
Nevertheless, the river systems that comprise the Upper Hudson drainage are unique in that they are fed by a number of tributaries that possess ideal Brook Trout habitat (DeWeber & Wagner, 2015).
The Upper Hudson drainage has, however, been subjected to decades of state-sanctioned stocking activities in the main stem of all F I G U R E 1 Brook Trout (Salvelinus fontinalis) three of these systems (Van Offelen, Krueger, & Schofield, 1993;Webster & Flick, 1981). Despite the long-term interest in, and attempted management of, this fishery, little is currently known about Brook Trout population genetic structure in this region (but see Bruce et al., 2018;Perkins et al., 1993aPerkins et al., , 1993b or the potential impacts, genetic or otherwise, of these stocking activities on naturally occurring Brook Trout populations. In addition to considerations related to supplemental stocking, a number of recent studies have shown relatively short dispersal distances for Brook Trout, suggesting that the median stream channel distances between pairs of individuals belonging to the same families are as little as 100-250 m, with significant correlations between the locations of parents and their offspring (Hudy, Coombs, Nislow, & Letcher, 2010;Kanno, Vokoun, & Letcher, 2011). Nevertheless, little information is available on rates and magnitude of dispersal and/or gene flow between Brook Trout populations that are potentially connected by such larger riverine systems. If nonnative competitors, habitat fragmentation or water quality issues in this watershed dictate Brook Trout migration, we would expect strong genetic population structure based on limited gene flow between feeder tributaries, with a strong signal of isolation by distance. We would also expect minimal signs of in- and if so, is the level of introgression consistent with their proximity to stocking sites? (b) What level of population structure is present in the area, is ongoing migration occurring between tributaries, or do potential barriers to gene flow dictate population structure? (c) What do comparative levels of effective population size and genetic diversity tell us about Brook Trout genetic population structure throughout the region? To address these questions, we compared microsatellite data from our study samples to those of various stocked strains to estimate the relationship between introgression and water-way distance from stocking sites using beta regression. We assessed population structure in the area by employing current techniques for landscape genetics, and compared measures of effective population size and genetic variation throughout the region, to elucidate site-by-site differences in reproductive success and genetic diversity. We then conclude this study by discussing the management implications of this work as well as how the methods used here may help to inform local conservation strategies related to species-based conservation planning, and the identification of well-defined genetic management  where Brook Trout were previously extirpated due to acidification, but where evidence of recovery has warranted their reintroduction, while pure Temiscamie strain fish are only used to create the F1 hybrids, and do not see stocking in the study region.

| Neutrality testing and summary statistics
All 13 microsatellite loci included in this study were subjected to outlier tests of neutrality using the LOSITAN workbench (Antao, Lopes, Lopes, Beja-Pereira, & Luikart, 2008). We ran an initial simulation to remove potentially nonneutral loci before computing the genomic mean F ST for downstream analysis, while also running a bisection algorithm over repeated simulations to approximate a desired F ST . We treated each sampling location as a putative population employing a drift with migration model, assuming stepwise mutation across one million simulations. In addition, all samples were subjected to exact tests of Hardy-Weinberg equilibrium (Guo & Thompson, 1992) using the ARLEQUIN 3.5 software package (Excoffier & Lischer, 2010).
Deviations from Hardy-Weinberg equilibrium were tested against 1,000,000 random permutations. Tests for linkage disequilibrium were also carried out between all pairs of loci, for all sample groupings, using the log-likelihood ratio test as implemented in the program GENEPOP (Rousset, 2008

| Genetic population structure and migration
All individuals in this study were initially subjected to cluster analysis, to infer any potential introgression from hatchery fish commonly stocked in the region by the NYSDEC. Inferred ancestry for each sampling location was determined using the Bayesian clustering approach executed by the program STRUCTURE (version 2.3; Pritchard, Stephens, & Donnelly, 2000). Each sampling locality was run individually with all six strains used for supplemental stocking, assuming an admixture model with uncorrelated allele frequencies, employing an initial alpha value of 0.02 to account for differences in sample size across putative populations (Wang, 2017). We ran the analysis with a burn-in step of 250,000 Markov Chain Monte Carlo (MCMC) iterations, followed by 500,000 MCMC iterations. Five replicates for each K-value were performed assuming K = 4 through K = 8 to examine population structure across potentially different groupings.
The program CLUMPAK (Clustering Markov Packager Across K) (Kopelman, Mayzel, Jakobsson, Rosenberg, & Mayrose, 2015) was then used to permute clusters identified by STRUCTURE across independent runs for the purpose of producing bar plots for visualization as well as Q-values (individual level ancestry estimates) associated with stocked strains across sampling locations.
In order to assess the number of distinct groupings across all of the scenarios tested, we used the Evanno method (Evanno, Regnaut, & Goudet, 2005) of evaluating the best supported K-value, as well as the value where the mean likelihood Ln(K) plateaus across increasing K (Earl & vonHoldt, 2012;Evanno et al., 2005) using the web program STRUCTURE HARVESTER (Earl & vonHoldt, 2012). Mean Q-values associated with stocked strains from each sampling location were then plotted against their water-way distance from the nearest stocking site to determine any correlation between introgression and proximity to State-based stocking activities using a beta regression approach in the R package BetaReg (Ferrari & Cribari-Neto, 2004), given that the level of hatchery ancestry for each sampling site assumes values in the standard unit interval (0, 1). Beta regression also allows for the incorporation of different link functions (log-log link and logit link) into model selection to improve model fit in instances where extreme proportions are observed in the data (Cribari-Neto & Zeileis, 2009). Pairwise waterway distances between stocking sites and sampling locations were calculated using the R package RIVERDIST (Tyers, 2016).
RIVERDIST is able to read river network shape files and provides tools for distance calculation along river networks. Sampling areas that appeared to be significantly influenced by stocked strains were then removed from downstream analysis to avoid potential errors in determining natural population genetic structure and migration estimates across the landscape.
Cluster analysis was then performed a second time, excluding the strains used for supplemental stocking as well as two sampling areas that exhibited comparatively heightened levels of mean hatchery ancestry (Q-value ≤0.70), both exhibiting estimates associated with stocked strains at levels more than twice that of any other sampling location examined in this study. We again ran the analysis with a burn-in step of 250,000 Markov Chain Monte Carlo (MCMC) iterations, followed by 500,000 MCMC iterations with five replicates for each K-value, but this time assuming K = 1 through K = 15 to examine population structure across the wild-collected sample set with an initial alpha value of 0.08, based on the number of putative populations in the data set. The same downstream analysis was applied to determine the number of distinct populations. In addition to cluster analysis, we examined isolation by distance (the relationship between genetic differentiation and water-way distance) between our sampling areas. F ST values previously produced using the program FSTAT were incorporated into Rousset's equation (Rousset, 1997) and graphed against pairwise water-way distance, which was again calculated using the R package RIVERDIST.
Finally, recent estimates of migration between wild populations that showed minimal influence from stocked strains were estimated using the program BAYESASS (Wilson & Rannala, 2003

| RE SULTS
Results of neutrality testing using the LOSITAN workbench suggested that all loci were selectively neutral and therefore suitable for further analysis (Supporting Information Figure S1).  These results indicate weak population structure at the level of the individual river system and sampling sites, with substantial mixing between individuals at sampling sites both within and between river systems. The isolation-by-distance plot suggests a statistically significant linear correlation between genetic differentiation and waterway distance, although this correlation is somewhat weak, with water-way distance explaining approximately seven percent of the variation in genetic differentiation between sample sites ( Figure 6).

Migration estimates produced by BAYESASS mirror the results
of the STRUCTURE analysis (Figure 7, Supporting Information Table   S5). While ongoing migration (M) was detected to some extent across all sampling sites, it was estimated to be the highest within Migration estimates were generally much lower between river systems than within, consistent with the admixture patterns produced by STRUCTURE.   (Odell, 1932 of the study questions, and likely more costly. There is currently a general consensus among researchers and fisheries managers regarding the need for species-focused conservation plans, although the role of molecular applications in such plans continues to be disputed (Allendorf, Hohenlohe, & Luikart, 2010;Garcia de Leaniz et al., 2007). Nevertheless, we hope that the methods presented here may offer insight into how genetic analyses can answer complex questions related to species conservation not only in New York, but in any location where aquatic species of local conservation concern face similar circumstances and challenges.