Combining population genomics and forward simulations to investigate stocking impacts: A case study of Muskellunge (Esox masquinongy) from the St. Lawrence River basin

Abstract Understanding the genetic and evolutionary impacts of stocking on wild fish populations has long been of interest as negative consequences such as reduced fitness and loss of genetic diversity are commonly reported outcomes. In an attempt to sustain a fishery, managers implemented nearly five decades of extensive stocking of over a million Muskellunge (Esox masquinongy), a native species in the Lower St. Lawrence River (Québec, Canada). We investigated the effect of this stocking on population genetic structure and allelic diversity in the St. Lawrence River in addition to tributaries and several stocked inland lakes. Using genotype by sequencing, we genotyped 643 individuals representing 22 locations and combined this information with forward simulations to investigate the genetic consequences of long‐term stocking. Individuals native to the St. Lawrence watershed were genetically differentiated from stocking sources and tributaries, and inland lakes were naturally differentiated from the main river. Empirical data and simulations within the St. Lawrence River revealed weak stocking effects on admixture patterns. Our data suggest that the genetic structure associated with stocked fish was diluted into its relatively large effective population size. This interpretation is also consistent with a hypothesis that selection against introgression was in operation and relatively efficient within the large St. Lawrence River system. In contrast, smaller populations from adjacent tributaries and lakes displayed greater stocking‐related admixture that resulted in comparatively higher heterozygosity than the St. Lawrence. Finally, individuals from inland lakes that were established by stocking maintained a close affinity with their source populations. This study illustrated a benefit of combining extensive genomic data with forward simulations for improved inference regarding population‐level genetic effects of long‐term stocking, and its relevance for fishery management decision making.


| INTRODUC TI ON
Many native species are undergoing steep population declines due to rapid and globally changing environments and overexploitation (Allendorf, 2017). In response, supplementation programs continue to be extensively applied to enhance natural populations to sustain exploitation and more recently for biodiversity conservation (Laikre, Schwartz, Waples, Ryman, & GeM Working Group, 2010).
Fish populations, in particular, are subjected to intense commercial and recreational exploitation (Dunham, 2011). This leads to species transfers, sometimes over large distances, to supplement genetically and ecologically divergent populations. As a consequence, understanding how supplementation practices from nonindigenous sources may impact the evolutionary potential of wild populations remains a major concern (Araki, Cooper, & Blouin, 2007Laikre & Ryman, 1996;Ryman & Laikre, 1991;Waples, Hindar, Karlsson, & Hard, 2016).
Conversely, positive effects of supplementation have also been reported. Several threatened populations recently isolated due to habitat fragmentation often display low genetic divergence and have not accumulated genetic incompatibilities. These small populations, however, are exposed to genetic drift whereby mildly deleterious mutations are not efficiently removed by selection, as would be the case in larger populations (Whitlock & Bürger, 2004). These mildly deleterious mutations may increase in frequency and rise to fixation (Glémin, 2003;Lynch et al., 1999;Wang, Hill, Charlesworth, & Charlesworth, 1999). This random drift load, however, becomes important when the selection coefficient is >50% of the effective population size (Whitlock, 2000) and can ultimately lead to extinctions of very small populations (Lynch, Conery, & Bürger, 1995).
Similarly, inbreeding depression is of concern in these small populations (Bataillon & Kirkpatrick, 2000;Wang et al., 1999). In the long term, this may pose major threat to the adaptability of these populations to changing environment (Carlson, Cunningham, & Westley, 2014;Ralls et al., 2018).
While admixture and introgression have been extensively documented in salmonid species (Finnegan & Stevens, 2008;Hansen, Fraser, Meier, & Mensberg, 2009;Létourneau et al., 2018;Perrier, Baglinière, & Evanno, 2013;Perrier, Guyomard et al., 2013), less emphasis has been placed on other taxonomic groups. Here, we examine genetic implications of long-term stocking in Muskellunge Esox masquinongy, a representative of the Esocidae, the sister family of all salmonids (Rondeau et al., 2014). Muskellunge are distributed within particular temperate rivers and lakes from midwestern to eastern North America (Crossman, 1978), including the Great Lakes where it often occurs in sympatry with its congener, Northern Pike Esox lucius. In the St. Lawrence River, Muskellunge reaches sexual maturity between 5 and 7 years for males and 6 and 8 years for females (Farrell et al., 2007) and can grow up to more than 1.5 m and 20 kg that makes it a highly prized species for recreational trophy fishing. Given its typically low population density (~<1.0 fish/ha; Cloutier, 1987;Simonson, 2010) that limits sampling programs, more information is needed regarding its biology, genetic diversity, and population dynamics (Crane et al., 2015;Kapuscinski, Sloss, & Farrell, 2013).
In response to Muskellunge management challenges, widespread and sometimes intensive and long-term stocking is used to rehabilitate native populations and supplement existing ones. Stocking was also performed as species introductions to support new recreational fisheries and as a predator to control invasive fish species (Crane et al., 2015;Jennings et al., 2010;Kapuscinski, Belonger, Fajfer, & Lychwick, 2007;Wingate, 1986). Studies regarding Muskellunge genetic divergence and stocking effects primarily used microsatellite markers and mainly focused on the Laurentian Great Lakes and Upper St. Lawrence River (Kapuscinski et al., 2013;Miller, Mero, & Younk, 2009, 2012Turnquist et al., 2017;Wilson, Liskauskas, & Wozney, 2016). These studies revealed pronounced genetic differentiation among weakly connected populations at small spatial scales. They also demonstrated a generally reduced genetic diversity putatively associated with recent bottlenecks and/or strong genetic drift in populations of small effective size and restricted dispersal due to spawning site fidelity (Crossman, 1990;Farrell et al., 2007;Jennings, Hatzenbeler, & Kampa, 2011;Margenau, 1994;Miller, Kallemeyn, & Senanan, 2001). The impact of stocking has been documented in several populations where modest levels of admixture and a partial homogenization of population genetic structure have been reported (Scribner et al., 2015). In contrast, no study has documented native Muskellunge population structure further east, in the Lower St. Lawrence River basin of Québec (Canada), downstream of the Great Lakes. Moreover, no population genomic studies using thousands of markers distributed throughout the genome have been performed on this species to date. Similar to populations throughout its distribution range, Lower St.
Lawrence Muskellunge experienced a prolonged decline attributed to habitat loss (Robitaille & Cotton, 1992) and commercial overfishing that occurred until 1936 (Crossman, 1986;Dymond, 1939). Therefore, from 1951 to 1998 the Lower St. Lawrence was supplemented by over 1 million young Muskellunge (Brodeur et al., 2013;De La Fontaine, unpublished;Dumont, 1991;Mongeau, Leclerc, & Brisebois, 1980;Vézina, 1977;Vincent & Legendre, 1974). In particular, during 1951In particular, during -1965, fish from both southwestern New York State (Chautauqua Lake, Ohio River basin) and Ontario's Kawartha Lakes (represented in our study by Pigeon Lake) were transferred to Lachine hatcheries in Québec to support a massive stocking program in the St. Lawrence River (De La Fontaine, unpublished;Dumont, 1991). Muskellunge from the Ohio basin are thought to rise from a distinct regional lineage (Koppelman & Philipp, 1986), but phylogeographic and systematic relationships of these two stocking sources are incompletely understood . Fish from those sources were also stocked in numerous lakes including Lake Joseph and Lake Tremblant, which themselves became widely used as a brood source from 1965 for hatcheries until the end of stocking in 1998 (De La Fontaine, unpublished;Dumont, 1991;Vézina, 1977;Vincent & Legendre, 1974).
Besides stocking in those waterbodies, Muskellunge were also introduced in several lakes and rivers (details on stocking history and translocations are provided in Supporting Information Figure S1).
An improved understanding of the effects of admixture following stocking can also be advanced by application of individual-based forward simulations (Hoban, 2014;Hoban, Bertorelle, & Gaggiotti, 2012). Simulated outcomes of population divergence from a variety of demographic scenarios (e.g., in terms of population size and differential mortality) may be used to identify the one that best explains empirically documented patterns (Guillaume & Rougemont, 2006;Hernandez, 2008;Lawrie, 2017;Messer, 2013). These methods, however, are rarely applied to investigate stocking effects on the genetic composition of wild fish (but see Perrier, Guyomard et al., 2013). It is also possible to generate increasingly complex and biologically realistic simulations for comparison with empirical data. Such simulations can aid in understanding supplementation efficacy and how it may generate admixture, while also helping to understand the limits of simple clustering tools (e.g., Alexander, Novembre, & Lange, 2009;Pritchard, Stephens, & Donnelly, 2000) in detection of genome-wide effects of admixture on local ancestry.
Here, we combined empirical population genomics using a genotype-by-sequencing approach with forward simulations to document the extent and patterns of genetic admixture that resulted from past stocking events in the St. Lawrence River wa-

| Sampling
A total of 662 Muskellunge were sampled from 22 sites within the St. Lawrence River drainage, mainly from Québec (Canada) with a median sample size of 24 individuals per site ( Figure 1, Table 1).
This sampling represents the Upper and Lower St. Lawrence River, its main tributaries, and inland lake populations. Two sampling sites (Chautauqua Lake, CHQ, and Pigeon Lake, PIG) represented the original stocking sources used from 1951 to 1965 and two more recent secondary sources (Lake Joseph, JOS, 1965, and Lake Tremblant, TRE, 1986-1997both

| Molecular methods and SNP calling
A salt-extraction protocol adapted from Aljanabi and Martinez (1997) was used to extract genomic DNA. Sample quality and concentration were checked on 1% agarose gels and a NanoDrop 2000 spectrophotometer (Thermo Scientific). Concentration of DNA was normalized to 20 ng/µl. Libraries were constructed following a double-digest RAD (restriction-site-associated DNA sequencing; Andrews, Good, Miller, Luikart, & Hohenlohe, 2016) protocol modified from Mascher, Wu, Amand, Stein, and Poland (2013). Genomic DNA was digested with two restriction enzymes (PstI and MspI) by incubating at 37°C for 2 hr followed by enzyme inactivation by incubation at 65°C for 20 min. Sequencing adaptors and a unique individual barcode were ligated to each sample using a ligation master mix including T4 ligase.
The ligation reaction was completed at 22°C for 2 hr followed by 65°C for 20 min for enzyme inactivation. Samples were multiplexed (n = 48 individuals) to insure fish from each sampling location were sequenced on a minimum of six different multiplexes to avoid pool effects. Libraries were size-selected using a BluePippin prep (Sage Science), amplified by PCR, and sequenced for 96 individuals on a Ion Proton P1v2 chip that generated approximately 80 million reads per chip. Based on the number of reads for each individually barcoded sample, the prepared DNA was re-pooled into a new library where the representation of samples with low reads counts was increased.
This new library was sequenced on two additional Ion Proton chips to normalize the number of reads per sample.
They were aligned to the Northern Pike (Esox lucius) reference genome Eluc_V3 (Rondeau et al., 2014) using bwa-mem (Li, 2013) with default parameters. Here, 19 randomly distributed individuals with <2.5 million reads were removed, and 643 of 662 Muskellunge were retained for subsequent analysis. Then, aligned reads were processed with Stacks v.1.44 for SNPs calling and genotyping. The "pstacks" module was used with a minimum depth of 3, and up to three mismatches were allowed in catalog assembly. We then ran the "populations" module to produce a vcf file that was filtered with a custom python script. To control for paralogs and HWE disequilibrium, SNPs were retained with a read depth higher than 5, and presence in 70% of each individual at each sampling location and had heterozygosity <0.60.
We removed any SNPs that lacked presence in at least 70% of  A haplotype file was exported from the "populations" module in Stacks providing information on each RAD locus instead of single SNPs. These loci were then used to estimate a range of statistics described below and to perform FineRADstructure analysis.

| Genetic diversity analysis
Hierfstat (Goudet, 2005) was used to compute patterns of observed heterozygosity and gene diversity after excluding monomorphic markers from each sampling location. We then computed the proportion of polymorphic loci in each sampling location using custom R scripts. Contemporary effective population size (N e ) was estimated using the bias-corrected version based on linkage disequilibrium and assuming random mating (Waples, 2006;Waples & Do, 2008) and implemented in NeEstimator (Do et al., 2014). We kept one SNP per locus and filtered the dataset for Notes. Code = acronyms for each sampling site, n = number of individuals genotyped, H o = observed heterozygosity, H s = gene diversity, P = proportion of polymorphic loci. Pi = nucleotide diversity (Tajima, 1989), N e = effective population size with 95% CI obtained from jackknifing over individuals. Effective population size was computed by merging the major group within the St. Lawrence together. Similarly, all populations from the same stocking group (JOS, TRE, and FRO) were merged in order to increase the sample size. Effective population size could not be estimated for Traverse Lake when implementing a correction for linkage and resulted in an estimated of N e = 3 [95% CI = 2-4] without correction.
physical LD using plink (Chang et al., 2015). Windows of 50 SNPs were shifted by five SNPs each iteration and the routine removed any SNP with a variation inflation factor >2. Nonpolymorphic loci were removed in each population as well as singletons. Effective population size (N e ) was computed after merging sampling locations into major genetic clusters and applying equations from Table 2 of Waples (2006). All burrow estimates for pairs of SNPs located on the same chromosome were removed prior to N e estimation. Finally, we estimated nucleotide diversity (Tajima, 1989) and differentiation statistics (D xy , D a , F st ) for each RAD locus

| Genetic differentiation and population admixture
Levels of genetic differentiation between sampling locations were computed using Weir and Cockerham's F st estimator θ (Weir & Cockerham, 1984) in vcftools v0.1.15, and resulting values were used to construct a heatmap and a dendrogram in R using custom script.
Confidence intervals around F st were computed using Hierfstat with 1,000 bootstraps. Isolation by distance (IBD) was tested between pairwise genetic distances as F st /(1 − F st ) (Rousset, 1997) and the pairwise distance following the river watershed network using a Mantel test in R with 10,000 permutations.
To better understand whether low genetic differentiation was due to high population connectivity, we also measured the Our simulation pipeline was also applied to examine whether strong genetic differentiation (as measured by F st ) translated into a long divergence time. We estimated the divergence time between Chautauqua and Pigeon lakes, the two original sources of stocking, under the assumption that no gene flow existed between these contemporary isolated lakes. Under this hypothesis, the divergence history can be summarized by a model of strict isolation (SI) characterized by the effective population size of the two daughter populations (N1 and N2) that diverged from a common ancestral population of size Nanc at time Tsplit. Coalescent simulations were again used to generate genomic data and to compare these theoretical patterns to our empirical data using an ABC framework.
Full details are provided in Supplementary Materials: Appendix S1. This analysis was replicated five times by comparing a randomly chosen site within the St. Lawrence to either Chautauqua or Pigeon Lake.
Next, ancestry and admixture proportions were inferred for each individual using Admixture (Alexander et al., 2009) with K-values ranging from one to 25. Then the snmf function implemented in the LEA R package (Frichot & François, 2015) to estimate ancestry coefficients levels of K. Other DAPC model-based clustering methods (Jombart, Devillard, & Balloux, 2010) produced highly congruent results to those obtained using Admixture or LEA. All admixture analyses were performed by maintaining a single SNP per locus.
Keeping either a random SNP or the SNP with the highest minor allele frequency produced similar results (not shown), and we report only the results of analyses performed with SNPs showing the highest minor allelic frequency (MAF) at each locus. We then correlated the individuals "stocking source" ancestry coefficient to individual levels of observed heterozygosity (Spearman's rho) and tested it for significance with a t test.
Finally, we used a modification of the fineSTRUCTURE package (Lawson, Hellenthal, Myers, & Falush, 2012) implemented in FineRADstructure (Malinsky, Trucchi, Lawson, & Falush, 2018) to infer levels of population genetic structure and ancestry from haplotype data derived from RAD loci. Less than one percent of missing data were allowed. First, RADpainter was used to compute the co-ancestry matrix. Then individuals were assigned to populations with fineSTRUCTURE using 100,000 MCMC iteration for burn-in and the same number of sampled iteration with a thinning interval of 1,000.

| Demographic simulations of historical stocking
Individual forward simulations were used to investigate the effect of stocking on admixture patterns and the contemporary genetic makeup of populations in the fluvial system of the Lower St.
Lawrence River (a series of connected lakes with the mainstem).
Tributaries were not simulated as data on their stocking intensity were not consistently available. We simulated a neutral 10,000 kb long chromosome assuming a uniform mutation rate µ of 1e-8 per bp per generation and a recombination rate r of 1e-8 per bp per generation using slim v2.6 (Haller & Messer, 2009). First, we simulated an ancestral ideal population of size N = 2,400 for 80,000 generations to reach equilibrium. The ancestral population was then split into three populations corresponding to the stocking source (SRC), the LDM, and the SLR with a reduced initial size of N = 800 (a range of different sizes were tested, see below). Given observed data, we allowed for a constant rate of migration between LDM and SLR with m SLR->LDM = 0.0005 and m LD-M → SLR = 0.00025 (we explored parameters producing data similar to empirical observations). We implemented asymmetric dispersal to reflect the expected downstream biased dispersal (e.g., Paz-Vinas, Loot, Stevens, & Blanchet, 2015), which was also observed in our data from LDM into SLR (see admixture results). Initially, no migration was allowed between the stocking sources (considered as a single unit based on patterns of shared ancestry) and either LDM or SLR. These populations kept diverging for 2,685 generations roughly corresponding to the postglacial divergence period (assuming a generation time of 6 years based on age at first spawning; Farrell et al., 2007). To reproduce initial stocking event about 15 generations ago, we introduced migration between stocking sources and LDM and SLR. Since more admixture was observed in LDM compared to SLR (see Section 3), migration into LDM was set 1.5 times higher than in SLR. We tested a range of migration rate from 1e-6 to 0.1.

| Population differentiation
The (OTT) and Champlain Lake (CHP) was less clear as they appeared close to the source of stocking albeit the clustering was weakly supported in the tree (Figure 2b). There was a modest signal of IBD when all sites were included (mantel test r = 0.369 p = 0.0129) and considering only St. Lawrence R. sites (from TIN to LSP) a stronger IBD pattern was revealed (r = 0.799, p = 0.0046, Figure 2).
Finally, pairwise D xy and net divergence (D a ) did not revealed major difference between compared pairs of populations (Supporting Information Table S2).

| Evidence for asymmetric gene flow
ABC estimates of migration rates and effective population size were performed using 1 million coalescent simulations. We chose the site LSP and LSP, located at the opposite ends on the Lower St. Lawrence and did not use the more upstream site (e.g., TIN) located upstream of dams artificially restricting upstream migration. Results indicated that posterior distribution of the migration rate was highly different from the prior distribution, resulting in narrow credible intervals (Supporting Information Figure S5).  Figure S5) This provided support for the hypothesis that low genetic differentiation was partly due to strong downstream directed dispersal. Relatively large effective population size may also contribute to this pattern (Supporting Information Figures S5 and S6).

| Population structure and admixture
Analyses of admixture cross-validation consistently produced low cross-validation scores for K = 8 and K = 13 indicating the number of clusters likely lies between these values (Supporting Information Figure S3). Similar results were obtained from LEA cross-entropy criterion with minimum values obtained around K = 12 to 13 while the DAPC provided the same results. For simplicity, only results from the admixture analysis for K = 8 and K = 13 are detailed below (see also  ranging from 0.22 to 0.88). Among those, nine (36%) displayed q-value of belonging to the "stocking group" ranging from 0.10 to 0.617 with no individual of apparent pure "stocking" origin ( Figure 3a). Six (37.5%) and four individuals (57%), respectively, from YAM and OTT were admixed (q-value > 0.10) with the stocking source (Figure 3a). None of these individuals was of pure stocking origin as they displayed q-values ranging from 0.10 to 0.211 and from 0.10 to 0.365 for YAM and OTT, respectively. In OTT, all seven sampled Muskellunge were admixed and none was assigned to a particular cluster. The Achigan R. (ACH) and YAM populations clustered together (mean q-value = 0.915) whereas TRA and CHP populations formed two separate clusters (q-value = 0.96 and 0.81, respectively). Finally, the Champlain L. showed indicators of significant admixture from the stocking source for six individuals (30%). There were 2 F 0 immigrants and 5 introgressed individuals with q-values ranging from 0.10 to 0.525.
F I G U R E 3 (a) Levels of admixture for K = 8 and K = 13. Each color represents a distinct cluster, and each bar represents an individual. See Figure 1 for the labels of each dot in the graph; (b) co-ancestry matrix inferred by FineRADstructure. Each pixel represents individual co-ancestry coefficient inferred based on short haplotype loci. The labels summarize the major groups according to names of sampling sites.
The higher values of co-ancestry coefficient sharing are depicted in darker colors, whereas lower values of co-ancestry coefficient sharing appear in yellow colors Sites from the St. Lawrence River from the uppermost (TIN) to the most downstream site (LSP) tended to form a single group with the exception of LDM that clearly clustered separately (q-value = 0.84).
In contrast with inland lakes and tributaries, weak evidence was detected for admixture with the stocking source population. Thus, 21 individuals (8%) from the St. Lawrence R. (from TIN to LSP) displayed introgression with the stocking source (q-value ranging from 0.10 to 0.54) and no individual was of pure stocking origin. However, we observed more fish (n = 19, 33%) from LDM that were admixed with the stocking source (q-value ranging from 0.10 to 0.50) but none were of pure stocking origin. While there was limited admixture with the stocking source, we found evidence for admixture among fish from different locations within the system. Here, 6% of individuals from LSL were classified as F0 immigrants (q-value > 0.9) from the nearby LDM population and 42% were admixed with LDM ( Figure 3a). It is noteworthy that considering the spatial location of individuals in LSL (all collected fish within the SLR were georeferenced by GPS), we observed a tendency for the most admixed individuals (78%) to be preferentially found in the northern part of LSL whereas 87% of pure individuals were located preferentially in the southern part of LSL. Lawrence River from TIN to LSP were now separated into two admixed groups. In particular, individuals from the uppermost TIN site were assigned to this cluster (mean q-value = 0.87) while 31%, 26%, 12%, and 7% of individual, respectively, from LSF, LSW, LSL, and LSP were now assigned (q > 0.90) to this new group. In these same sites, respectively, 31%, 27%, 22%, and 37% of individuals, as well as 3.5% in LDM were admixed (Figure 3a, top panel).

| Simulations of admixture
Results indicated that over all scenarios, N = 600 (closer to estimated N e in the SLR) systematically displayed lower RMSE (root mean squared errors) than those with higher N e (Supporting Information  Table 2), none of the scenarios were able to reproduce the observed divergence between SLR and LDM (mean F ST for the 20 best scenarios = 0.011 vs. 0.036 in our data set; Table 2).
The difference between simulated and observed differentiation was significant (p < 0.005), an indication that some processes not considered in the model, such as a long isolation period or a smaller N e in LDM or that two populations originating from evolutionary divergent lineages, may have contributed to their divergence. Overall, scenarios with the mortality-based filter produced the lowest RMSE, with the best scenario (RMSE = 0.05) where 50 individuals randomly died (Table 2). However, scenario M9 where no individual died also produced a low RMSE of 1.60. Several scenarios with mortality of 100-400 individuals presented intermediate RMSE values (Table 2), F I G U R E 4 (a) Boxplot of "stocking source" ancestry coefficient (q-values) in the single nonsupplemented population and in the different supplemented populations separated according to whether they occur in the St. Lawrence River itself (SLR) or its tributaries. (b) Proportion of individuals assigned to different classes of ancestry: individual assigned as "resident" if q-value >0.9, individual assigned as of "admixed domestic ancestry" if q-value of belonging to a "stocking" source cluster was above 0.1, individual assigned as of "wild ancestry" if q-value of belonging to another foreign population was above 0.10. Notes. SRC = source population used for stocking. Given the shared patterns of ancestry between the populations used for stocking (i.e., Chautauqua, Joseph, and Tremblant lakes), individuals were merged together and modeled as a single unit. Given that no sign of ancestry from Pigeon Lake was found in the St. Lawrence (SLR) or Deux-Montagnes L. (LDM), this source was not included here. Based on our empirical results, the migration rate (m) was set 1.5 times higher in LDM than in SLR in which we included LSL, LSF, LSP, MSC, MSG, and MSV. Admixture and F ST values in bold are those observed in our empirical data. The group 1 (grp1), group 2 (grp2), and group 3 (grp3) represent admixture membership probability of individuals from LDM, SLR, and SRC, respectively, for K = 3. Polym = number of polymorphic sites with a MAF equal to 1% or more. RMSE = root-mean-squared error. The lower the RMSE, the closer a given model is to the empirical observation. Here, the 20 best models in terms of RMSE are displayed. Further details can be found in Supporting Information Table S3.
indicating that varying rates of mortality for N = 600 could explain empirical data. Increasing mortality rate to 500 individuals (i.e., 83%) resulted in RMSE of 1.73.

| Genetic structure between distant lakes and isolation by distance within the St. Lawrence River
A common issue in population genetics is delineation of population structure confounded with isolation by distance (IBD), where genetic differentiation increases with geographic distance due to genetic drift under limited dispersal (Wright, 1943). Here, our data indicated both the occurrence of clusters of genetically differentiated groups (i.e., among distant and isolated water bodies) and significant isolation by distance within the St. Lawrence River, as reported in Muskellunge (Kapuscinski et al., 2013;Miller et al., 2017) and other species (Meirmans, 2012;Sexton et al., 2014). The behavior of the Muskellunge, especially its known spawning site fidelity (Farrell et al., 2007) and possible natal homing (Crossman, 1990;Jennings et al., 2011;Margenau, 1994;Miller et al., 2001) are factors that can contribute to both IBD and the establishment of population structure. Regarding the major clusters, a strict interpretation of F ST values and admixture results indicates that each lake and each tributary could be classified as a distinct population (Supporting Information Figure S3). However, finer analyses revealed shared co-ancestry between many samples, especially between some stocked tributaries/lake with population stocking source. Moreover, coalescent analyses suggested a recent divergence of studied populations, and there is evidence for ample shared polymorphism between all populations. We therefore hypothesize that tributaries represent a bottlenecked subset of the larger St. Lawrence R. populations following their colonization by a small number of founders. Then, isolation of each individual group enhanced the effect of genetic drift (e.g., Turnquist et al., 2017), resulting in pronounced population-level genetic differentiation. In the case of the Chaudière River, we further note that the connectivity with the St. Lawrence is already restricted by natural waterfall. In this particular case, the river display stronger footprint of introgression with individuals of stocking origin." The two clusters observed within the St. Lawrence R. are likely due to isolation by distance confounding the clustering methods (Bradburd, Coop, & Ralph, 2018;Meirmans, 2012). As expected, our coalescent analysis supported downstream biased dispersal, which likely explains the high connectivity within the St. Lawrence R. (Morrissey & de Kerckhove, 2009). Moreover, the presence of two dams on the St. Lawrence certainly contributes to restrict upstream dispersal and associated gene flow.
In tributaries, a variable proportion of fish from sites between mainstem LSL and LSP displayed shared ancestry with fish from the LDM tributary. Admixed individuals were preferentially located in the northern shore of LSL, whereas those on the southern shore where generally not admixed. Interestingly, this dichotomy between the northern and southern shores of LSL was also reported in the Northern Pike (Ouellet-Cauchon, Mingelbier, Lecomte, & Bernatchez, 2014) and Yellow Perch Perca flavescens (Leclerc, Mailhot, Mingelbier, & Bernatchez, 2008). In both species, fish from the south shore of the LSL were more similar to those from elsewhere within the St. Lawrence R. than those from the north shore. LSL is characterized by contrasted water masses flowing from the St. Lawrence R. on the south shore and from the Ottawa River on the north shore (Hudon & Carignan, 2008;Leclerc et al., 2008). Therefore, it would be of interest to distinguish the respective contribution of ecological (e.g., temperature, pH, light extinction, etc.) and historical (e.g., glacial refugia) factors responsible for this repeated pattern of divergence observed between the north and south shores of LSL among different species.
Moreover, coalescent estimates of N e (supplementary analyses) may be more relevant (Sjodin, 2005)  uals. This would lead to a decline of the "stocked" genome as 1/2 g generation (Phillips & Baird, 2015). This could quickly result in low levels of admixture that may not be detectable with global ancestry inference approaches. On the other hand, in the smaller stream populations with putative low effective population size, footprints of admixture are still apparent. In these small populations, stocking could potentially have some positive assets (Frankham, 2015). Given that all populations are likely postglacially derived, they are likely to be genetically compatible with small probability of outbreeding depression (Ralls et al., 2018). Therefore, stocking from slightly diverged populations may contribute to the maintenance of their genetic diversity and evolutionary potential (Carlson et al., 2014;Frankham, 2015).
A second demographic scenario fitting our empirical data involved differences in effective population size, but requires the action of selection. For instance, stocked individuals may display low survival, low reproductive success (hence resulting in little introgression), and/or selection could act against introgression, as supported by several of our models. Under this second scenario, the higher effective population size and pronounced connectivity of the Lower St. Lawrence R. may play a critical role in selection against introgression (Glémin, 2003). Alternatively, genetic drift may overcome selection against introgression in smaller, isolated populations (Frankham, Ballou, & Briscoe, 2010) leading to fixation of "foreign" alleles and higher footprints of admixture. Such variable outcomes of stocking on admixture have been documented in salmonids (Perrier, Guyomard et al., 2013). Low survival and/or reproductive success of stocked fish has been reported mostly in salmonid species characterized by putatively high rate of local adaptation (Araki, Cooper, & Blouin, 2007;Araki et al., 2009;Araki & Schmid, 2010;Christie, Ford, & Blouin, 2014;Thériault, Moyer, & Banks, 2010 (Becker, 1983), small effective population sizes are expected (Romiguier et al., 2014).
Second, there was a significant and positive correlation be-  (Sacherri et al., 1998;Wang et al., 1999). Stocking could contribute to reduce the probability of inbreeding depression by maintaining higher levels of genetic diversity (Frankham, 2015). Alternatively, it is unclear whether selection against introgression removed non-native (e.g., stocked) alleles within the

| Conservation and management applications
Knowledge of the genetic structure of fish populations is essential for proper and sustainable stock management since it allows identification of groups of individuals that are genetically and often geographically unconnected, therefore implying at least some demographic independence. Our results indicate that Muskellunge samples studied here can be separated into the following groups: The first being the St. Lawrence watershed, characterized by an upstream-to-downstream pattern of dispersal and gene flow.
The second group is des Deux-Montagnes Lake with a shared co-ancestry with the St. Lawrence R. population, especially fish from the north shore of Saint-Louis L. The third group comprises tributaries draining into the St. Lawrence River. Each tributary has differentiated by genetic drift and display lower levels of polymorphism and lower effective population size possibly due to founder events. Therefore, ensuring the maintenance of connectivity of tributaries with the mainstem would contribute to maintenance of an overall large effective population size at the meta-population scale, limiting the risks of inbreeding depression (Frankham, 2015).
The fourth group is represented by lakes where Muskellunge has been introduced (Joseph L., Tremblant L. and Frontière L.). These populations exhibit high genetic similarity with the Chautauqua L.
source of stocking. The fifth group includes stocked lakes where Muskellunge was initially present (Maskinongé and Champlain lakes) with each of these lakes representing a genetically distinct population. Finally, Traverse Lake represents the sixth group corresponding to a wild unstocked population with low genetic diversity.
In isolated lacustrine and river systems originally unoccupied or with presumably low Muskellunge abundance, stocking efforts appear to have been successful in enhancing populations and contributed to support angling activity. This corroborates Dumont (1991) who reached similar conclusions following an analysis of data from the first half period of the stocking program in St. Louis L. located in the Lower St. Lawrence R.
In the smallest lakes, as well as in locations where a massive dieoff was observed (e.g., Thousand Islands; Farrell et al., 2017), supplementation may be necessary, and in such case, the most genetically diversified source should be favored.
In conclusion, our results show that Muskellunge populations are spatially structured into a set of different groups, whose differentiation was affected by the levels of stocking, natural connectivity, or geographic isolation. Our results highlight the necessity to go beyond the simple measurement of genetic differentiation and genetic structure to define subsequent management practice. We further suggest that fishery management and habitat protection measures should be applied globally in order to ensure high connectivity of the population and maintain their evolutionary potential at the highest. We also found that the genetic consequences of stocking on admixture and genetic diversity affected differentially Muskellunge from the St. Lawrence River, its tributaries, and inland lakes, whereby stocking apparently had little impact on the St. Lawrence R. and des Deux-Montagnes L.
populations but apparently led to more important admixture elsewhere. Disentangling the long-term evolutionary consequences of this history of stocking would require further investigations from denser genome-wide data.

ACK N OWLED G EM ENTS
We express our gratitude to Québec's muskies anglers who col-

CO N FLI C T O F I NTE R E S T
None declared.

DATA AVA I L A B I L I T Y
Raw sequence data are available at NCBI BioProject ID PRJNA512459. Pipeline for forward simulation is available on gitub at https://github.com/QuentinRougemont/fwd_sims.