Spontaneous hybridization and introgression between walleye (Sander vitreus) and sauger (Sander canadensis) in two large reservoirs: Insights from genotyping by sequencing

Abstract Anthropogenic activities may facilitate undesirable hybridization and genomic introgression between fish species. Walleye (Sander vitreus) and sauger (Sander canadensis) are economically valuable freshwater species that can spontaneously hybridize in areas of sympatry. Levels of genomic introgression between walleye and sauger may be increased by modifications to waterbodies (e.g., reservoir development) and inadvertent propagation of hybrids in stocking programs. We used genotyping by sequencing (GBS) to examine 217 fish from two large reservoirs with mixed populations of walleye and sauger in Saskatchewan, Canada (Lake Diefenbaker, Tobin Lake). Analyses with 20,038 (r90) and 478 (r100) single nucleotide polymorphisms clearly resolved walleye and sauger, and classified hybrids with high confidence. F1, F2, and multigeneration hybrids were detected in Lake Diefenbaker, indicating potentially high levels of genomic introgression. In contrast, only F1 hybrids were detected in Tobin Lake. Field classification of fish was unreliable; 7% of fish were misidentified based on broad species categories. Important for activities such as brood stock selection, 12 of 173 (7%) fish field identified as pure walleye, and one of 24 (4%) identified as pure sauger were actually hybrids. In addition, two of 15 (13%) field‐identified hybrids were actually pure walleye or sauger. We conclude that hybridization and introgression are occurring in Saskatchewan reservoirs and that caution is warranted when using these populations in stocking programs. GBS offers a powerful and flexible tool for examining hybridization without preidentification of informative loci, eliminating some of the key challenges associated with other marker types.


| INTRODUC TI ON
Hybridization and introgression, resulting in gene flow between species, can have conservation and management implications. Natural hybridization is an important evolutionary process that occurs in a variety of contexts (Caniglia et al., 2020;Hedrick, 2013;Marques et al., 2019;Mitchell et al., 2019). Conversely, anthropogenic hybridization is the result of human disturbances, such as intentional admixture, translocations, habitat modifications, and climate change, which can lead to hybridization that would not normally occur (reviewed by Caniglia et al., 2020;McFarlane & Pemberton, 2019).
Anthropogenic hybridization can lead to introgression, which can be a concern because it can impact local and global biodiversity (Gilman & Behm, 2011). Further, human-mediated hybridization can cause problems for native species, such as outbreeding depression, genetic swamping, and potentially extinction or extirpation (Gilman & Behm, 2011;McFarlane & Pemberton, 2019;Rhymer & Simberloff, 1996;Todesco et al., 2016). Anthropogenic hybridization has frequently been studied in fish as a result of concern generated by stocking programs (Dierking et al., 2014;Harbicht et al., 2014;Huuskonen et al., 2017), introduction of non-native species (Boyer et al., 2008;Lamaze et al., 2012), and habitat alterations (Huuskonen et al., 2017). It is important to understand where and how humans artificially influence hybridization to help maintain biodiversity and fisheries productivity.
Sauger are often found in more turbid, riverine environments compared with the lacustrine habitats preferred by walleye, but overlap in spawning times and locations occurs in some regions and can lead to the spontaneous generation of wild hybrids (Bozek, Baccante, et al., 2011;Bozek, Haxton, et al., 2011;Clayton et al., 1973;Nelson & Walburg, 1977;Stroud, 1948). Hybridization is more common in artificial or altered systems (e.g., reservoirs) and where parental species did not naturally co-occur (reviewed by Billington & Sloss, 2011).
First-generation (F 1 ) hybrids have been found in a wide variety of systems, but second-generation hybrids (F 2 ) are rare due to the low reproductive success of F 1 × F 1 crosses; however, F 1 hybrids can successfully backcross with either parental species (Bingham et al., 2012;Fiss et al., 1997). In addition to naturally occurring hybrids, F 1 saugeye have been artificially generated in hatcheries for targeted stocking programs for the purposes of creating putand-take recreational fisheries, due to their faster growth rates in suboptimal habitats (Boxrucker, 2002;Galiant et al., 2002;Lynch et al., 1982;Malison et al., 1990;Pope et al., 1996;Siegwarth & Summerfelt, 1993;Stahl et al., 1996). However, little is known about the factors that influence hybridization frequency, or introgression between the species (multigenerational hybrids).
In situations where walleye and sauger are sympatric or syntopic, identification of saugeye and estimation of hybridization rates are important for fisheries managers. A variety of markers have been used to identify saugeye and have estimated hybridization frequencies ranging from 0% to 39% in various waterbodies (reviewed by Billington & Sloss, 2011). The most readily available tool is field identification based on morphological features; however, the possibility for saugeye to present a mosaic of phenotypic features renders field markers unreliable Flammang & Willis, 1993;Van Zee et al., 1996;Ward & Berry, 1995;White et al., 2005).
Genetic identification tools used previously include protein electrophoresis (allozymes; Billington et al., 1997;Fiss et al., 1997;Flammang & Willis, 1993;Graeb et al., 2010;Krueger et al., 1997;Van Zee et al., 1996), mtDNA restriction fragment length polymorphisms (RFLP; White et al., 2005), fluorescent randomly amplified polymorphic DNA markers (FRAPD; Sovic et al., 2012), and microsatellites (Bingham et al., 2012;White et al., 2005White et al., , 2012. However, the limited number of genetic loci assessed with these methods may reduce the ability to correctly assign species categories and limit the potential to identify hybrids with low levels of admixture Bingham et al., 2012;Boecklen & Howard, 1997;Twyford & Ennos, 2012;Vaha & Primmer, 2006). Further, the diagnostic nature of these markers may be population-specific, and some can be difficult to replicate across different laboratories (e.g., microsatellite DNA; Fernandez et al., 2013;Stott et al., 2010;Vignal et al., 2002). Overall, the methods currently used to identify saugeye have some potentially important limitations and our understanding of spontaneous hybridization would be enhanced by a more powerful genomics approach.
Here, we use GBS and SNPs to examine spontaneous hybridization between walleye and sauger in two large reservoirs (Lake Diefenbaker and Tobin Lake) in the Saskatchewan River system of Saskatchewan, Canada. Little is known about the geographical distribution and frequency of hybridization in Saskatchewan; however, there is extensive range overlap between walleye and sauger, and saugeye have been identified morphologically in both reservoirs. These species are of principal interest given the high economic and cultural value of recreational walleye fisheries in the region (Fisheries & Oceans Canada, 2018;Saskatchewan Ministry of Environment, 2010). In addition, until 2013, the main source of walleye broodstock for the provincial stocking program came from Lake Diefenbaker, raising concern about the genetic integrity of stocked walleye. In 2004, hybridization frequencies in Lake Diefenbaker were estimated to be as high as 22% based on allozymes; however, broodstock collection was still considered feasible in certain areas of the lake (Billington et al., 2005). Given limited knowledge regarding spontaneous hybridization, our overarching goal was to harness the power of GBS and SNPs to identify saugeye. Our specific objectives were to: (a) use GBS and SNPs to classify fish from mixed populations of walleye and sauger without prior identification of informative loci; (b) assess our ability to correctly classify species based on morphology; and (c) apply the GBS approach to understand the extent of hybridization and introgression in two reservoirs in Saskatchewan, Canada. Our work provides insight into hybridization as a biological phenomenon, and the use of GBS as a powerful and flexible approach for identifying different levels of admixture among hybridizing species.  Figure 1). The reservoir is characterized by a transition from shallow, riverine habitat at the upstream end to a deep lacustrine habitat in the Thomson Arm of the downstream end, with a sandy shoreline consisting of numerous tributaries (North et al., 2015;Sadeghian et al., 2015). The reservoir is a critical water supply for Saskatchewan and provides water allocation for irrigation, power, and municipal and industrial uses (Saskatchewan Water Security Agency, 2012). Lake Diefenbaker supports at least 26 native and stocked fish species, including self-sustaining populations of both walleye and sauger (Saskatchewan Water Security Agency, 2012). In addition to its purpose for water allocation, Lake Diefenbaker is managed as a recreational fishery and is one of the most popular destinations for domestic and international tourism in the province.

| Study sites
The Saskatchewan Ministry of Environment has been actively stocking millions of walleye fry annually into lakes across a variety of Saskatchewan water systems since 1950 (Wallace, 2004). This stocking effort generally targets lakes that do not naturally sustain walleye, but in some circumstances, supplemental stocking has been used to augment natural productivity. Until 2012, Coteau Bay on Lake Diefenbaker (Figure 1c) was the main source for walleye eggs, with up to 100 million eggs collected (Wallace, 2004). In 2004, concern for potential hybridization between walleye and sauger in Lake Diefenbaker and the spread of hybrids through stocking prompted the Saskatchewan Ministry of Environment to commission a study on hybridization based on allozymes. Overall hybridization frequency based on fish collected in a gill net survey was estimated to be 21% based on four diagnostic allozyme loci, although with this few markers there is a 6.25% chance that introgressed alleles were missed (Billington et al., 2005). The highest rate of hybridization was detected in the western upstream riverine area of Lake Diefenbaker (i.e., Saskatchewan Landing). Based on uncertainty about the genetic status of fish in Lake Diefenbaker and specifically Coteau Bay, broodstock collections for the stocking program were switched to alternate locations containing only walleye (no sympatric sauger). However, the Coteau Bay location remains of interest for future stocking efforts, and there is a need to understand hybridization between walleye and sauger more comprehensively in Lake  & Tisdale, 1988). Tobin Lake is also characterized by a transition of shallow remnant riverine habitat upstream to deeper lacustrine habitat downstream; however, the reservoir is much shallower (mean depth = 8 m, max depth = 26.4 m) overall than Lake Diefenbaker and is primarily used for hydroelectric power production. Tobin Lake supports at least 25 different species of fish, including walleye and sauger, and is considered one of the most valuable trophy fisheries in Saskatchewan (Orr, 1993). The recreational fishery on Tobin Lake is a major draw for anglers from both Canada and the USA.
In a population assessment in 2018, the Saskatchewan Ministry of Environment fisheries staff morphologically identified six suspected saugeye, approximately 1% of the Sander spp. collected (R. Eberts, Saskatchewan Ministry of Environment, unpublished data); however, no genetic survey has been completed for that lake. The frequency of hybrids and backcrosses in the Tobin Lake system is of interest as a comparison to Lake Diefenbaker, and for a basic understanding of conditions that facilitate spontaneous hybridization. Both Lake Diefenbaker and Tobin Lake are large reservoirs where riverine and lake systems merge as a result of anthropogenic water management, which may be ideal conditions for hybridization of walleye and sauger (Bellgraph et al., 2008;Gangl et al., 2000).

| Sample collection, DNA isolation, and sequencing
Walleye, sauger, and saugeye tissue samples were collected from both Lake Diefenbaker and Tobin Lake during competitive fishing CFEs, angling teams weighed in a total of five fish each at the conclusion of fishing (up to 160 teams per event). These fish were presented at the weigh station in plastic tubs; we intercepted a random subset of tubs at each CFE and collected fin clips from live fish. In each case, we inspected the tub of fish and sampled two walleye; if sauger or intermediate fish were present, we also sampled from that individual or individuals (major field markings used described by Billington et al. (1997)). Correspondingly, at each CFE we collected fin clips from hundreds of walleye, but only a small number of sauger or phenotypic intermediates, which were weighed-in much less frequently. It is important to note that because fish were collected from CFEs, the data are based solely on individuals susceptible to angling. Consequently, hybridization and introgression levels may not be completely representative of the general populations in each reservoir.
During our field sampling, we accumulated a large tissue archive from CFEs on both study lakes. From these, a subset of 217 fish was prepared for GBS (Table 1). Samples were chosen for sequencing based on DNA quality (largest fragments present) and sample site to ensure that multiple individuals were sequenced from each location.

DNA was extracted from fin clips using a Genomic DNA Isolation
Kit following the manufacturer's guidelines (Norgen Biotek Corp.).
Proteinase K digestion was extended to 12 hr at 55°C, and fin clips were treated with RNAse A (Qiagen Inc.). DNA concentration was then quantified using a Qubit 2.0 fluorometer (Life Technologies Inc.), and the quality was assessed using a 1% agarose gel (E-Gel; Invitrogen). Although fin clips were collected from live fish and immediately preserved, there was still evidence of mild to moderate levels of shearing in the extracted DNA. To target an optimal number of SNPs per individual (see Graham et al., 2015), samples with DNA fragments ranging between 3 and 10 kb were retained for sequencing and those with higher levels of degradation were discarded.
Samples were prepared for sequencing by the Genomics

| Genotyping and basic statistics
Raw reads were first visualized in FASTQC (Andrews, 2010) and then processed using STACKS version 2.3e (Catchen et al., , 2013. The process_radtags script was used to remove any reads with uncalled bases, discard reads with an average quality score below Q10, and truncate the reads to 100 base pairs. Following quality filtering, the optimal distance allowed between stacks (-M in ustacks), the minimum sequencing depth (-m in ustacks), and the number of mismatches allowed between sample loci (-n in cstacks) were optimized using the r80 rule as recommended by Paris et al. (2017). Optimized parameters for this study were determined to be M = 1, m = 3, and n = 2. The ustacks script was run using a minimum sequencing depth of 3 (-m), a maximum distance of 1 bp allowed between stacks (-M), and a maximum distance of 3 bp allowed to align secondary reads.
The deleveraging algorithm was also enabled to resolve overmerged tags, while gapped alignments were disabled. Following ustacks, a catalog of loci was generated using 27 walleye and 17 sauger from across all of the sample sites with two mismatches allowed between sample loci, as determined above. Samples with an average number of reads were selected to be included in the catalog in order to reduce complexity, while still capturing the genetic diversity of each species. Following generation of the catalog, individual stacks were searched against the catalog using the sstacks script. The tsv2bam script was then run to transpose the data, followed by the gstacks script. Finally, the populations script was run to export SNPs. In order to avoid bias in missing data (see Graham, Boreham, et al., 2020),  Note: Fish were characterized in the field based on phenotypic markings.
TA B L E 1 Collection data for 217 walleye (Sander vitreus), sauger (Sander canadensis) and putative hybrid saugeye from Lake Diefenbaker and Tobin Lake of the individuals (-r). The r90 dataset was used to investigate population structure both within and between species, while the r100 dataset was used to examine differentiation between species while generating a smaller group of candidate loci for assessment of genetic status (hybrid vs. pure).
The influence of missing data on the r90 dataset was examined using the missing_visualization() function within the grur R package (Gosselin & Archer, 2019;R Core Team, 2019). An isolation-by-missingness (IBM) plot was generated based on the presence/absence of genotypes within individuals across sample sites. Following the analysis of missing data, loci were checked for conformation to Hardy-Weinberg equilibrium (HWE; p < 0.05) using the filter_hwe() function in the radiator package (Gosselin & Archer, 2019). Loci that did not conform to HWE in two of the three species categories were used to create an exclusion list for analyses requiring HWE as an assumption.
Following quality control, basic population and species-level statistics were calculated using both datasets. These basic statistics were calculated using groupings based on morphological identification for species. The nucleotide diversity (π) and the number of private alleles were calculated using the pi() and pri-vate_alleles() functions in radiator, respectively. The observed heterozygosity within subpopulations (H O ), the expected heterozygosity within populations assuming HWE (H S ), and the inbreeding coefficient (G IS ) were calculated using GENODIVE (Meirmans & Van Tienderen, 2004). The expected heterozygosity (H S ) within subpopulations is also known as the gene diversity and includes corrections for sampling bias (Nei, 1987

| Population subdivision analyses
Three different population structure analyses were then performed on each dataset: (a) fixation indices (F ST ); (b) a maximumlikelihood approach (ADMIXTURE); and (c) ordination (DAPC).
The approaches using fixation indices and maximum likelihoods have underlying assumptions of HWE, so loci identified in the exclusion list mentioned above were removed from these analyses, while all loci were used in the ordination analysis. The F ST analysis was conducted in GENODIVE by first computing a distance-based matrix for all sites using AMOVA and then computing pairwise F ST between sampling sites using 5000 permutations (Excoffier et al., 1992;Meirmans & van Tienderen, 2004). F ST analysis requires a priori user-defined groups for comparison; we used groups based on phenotype for this purpose. Significance values were adjusted for multiple comparisons using a false discovery rate (FDR) correction. The maximum-likelihood approach was performed using ADMIXTURE (Alexander et al., 2009;Zhou et al., 2011).
This program estimates the ancestry coefficient of each individual using a maximum-likelihood approach followed by cross-validation to determine the distinct populations (K; Alexander et al., 2009;Zhou et al., 2011). The R package pophelper was used to visualize ADMIXTURE results (Francis, 2017). The ordination analysis was conducted using discriminant analysis of principal components (DAPC) in adegenet (Jompart, 2008;Jompart & Ahmed, 2011). The DAPC plot was generated using the optim_a_score() function to determine the optimal number of principal components in order to avoid overfitting the data. ADMIXTURE and DAPC do not require a priori groupings for analysis.

| Identification of hybrids and genomic introgression
Our general approach involved the use of two unsupervised clustering programs, ADMIXTURE and NEWHYBRIDS, to group fish based solely on genotypes. No prior information about species (e.g., phenotype) was used in either analysis. We then compared genetic classifications to those based on identification using field markers.
In ADMIXTURE, we used a value of K = 2 to identify pure individuals in two groups, and those that were intermediate to varying degrees. The power and accuracy of ADMIXTURE to identify pure and mixed individuals from differentiate hybrid classes was assessed by generating simulations of parental, F 1 , F 2 , and backcrosses using HYBRIDLAB v1.1 (Nielsen et al., 2006). Simulated allele frequencies were generated using walleye and sauger from the r100 dataset by Walleye had more private alleles in both datasets compared with sauger (Table 2). This is likely a result of sample size, where more private sauger alleles would likely be uncovered with more saugers sequenced. The average observed heterozygosity in saugeye was 4.0× and 7.6× greater in the r90 data and 5.2× and 6.2× greater in the r100 data than walleye and sauger, respectively (Table 2; Figure 2).
Both walleye and sauger had very small G IS values in both datasets, indicating neither an excess nor deficiency of heterozygotes (Table 2; Waples, 2015). The nucleotide diversity of saugeye was 2.80× and 5.45× larger in the r90 and 3.54× and 4.26× larger in the r100 dataset than that for walleye and sauger, respectively (Table 2).

| Population subdivision analyses
We used GENODIVE to determine pairwise F ST values between each parental species category and sample site using both datasets (Tables S1 and S2

F I G U R E 2
Individual observed heterozygosity (Ho) of each species. Heterozygosities were measured from each species in the dataset generated with 90% of the individuals (r90; a) and 100% of the individuals sampled (r100; b). The line in the middle of each box denotes the median, the box encompasses the first and third quartile of the data, while the horizontal lines above and below represent the maximum and minimum, respectively. The letters denote species or populations that are statistically different the r100 dataset (Tables S1 and S2) (Table S1) (Table S2).
The DAPC analysis was run using six and nine principal components in the species-level analyses with the r90 and r100 dataset, respectively, as determined using the optim.a.score() function ( Figure 3a; Figure S1a). In the site-specific analysis, 26 principal components were used with the r90 analysis and 13 were used with the r100 dataset (Figure 3b; Figure S1b). The inclusion of the optimal number of principal components resulted in 70.1% and 81.0% of the total variation retained in the r90 and r100 species-level datasets, respectively (Figure 3a; Figure S1a). The sample site-level analysis retained 75.0% in the r90 and 82.6% of the total variation in the r100 datasets (Figure 3b; Figure S1b). The first two discriminant functions represented 98.8% and 1.18% of the variation in the r90 dataset and 99.6% and 0.41% of the variation in the r100 datasets in the species-level analysis (Figure 3a; Figure S1a). Similarly, the first two discriminant functions represented 61.0% and 19.4% of the variation in the r90 dataset and 77.7% and 18.6% of the variation in the r100 in the site-level analyses (Figure 3b; Figure S1b). The assignment proportion was 0.967 in the r90 dataset and 0.972 in the r100 dataset at the species level. In the r90 dataset, five individuals morphologically identified as walleye were found to group with saugeye (W6442, W6511, W6723, W7012, and W7142), one saugeye grouped with walleye (H6383), and one saugeye grouped with sauger (H6631; Figure 3a). Using the r100 dataset, four individuals identified morphologically as walleye grouped with saugeye (W6442, W6723, W7012, and W7142), one saugeye grouped with walleye (H6383), and one saugeye grouped with sauger (H6631; Figure S1a).
The assignment proportions for the sample site-level analyses were 0.788 and 0.759 for the r90 and r100 datasets, respectively, at the sample site level (Figure 3; Figure S1).

F I G U R E 3 Discriminant analysis of
principal components (DAPC) of the different species (a, b) and sample sites (c, d) in the r90 dataset (a, c) and r100 dataset (b, d). The DAPC analysis was run with 3 and 11 principal components in the species and sites analyses, respectively. Distinct ellipses indicate population differentiation. Site abbreviations can be found in Table 1 ADMIXTURE was run on all sample sites and individuals using both r90 and r100 datasets (Figure 4; Figure S2). K = 3 had the

| Identification of hybrids and genomic introgression
ADMIXTURE output for K = 2 was used to resolve hybrids and parental species. The simulated data generated with HYBRIDLAB using pure individuals from NEWHYBRIDS indicated that with the  Table 3; Data S1). One field-identified sauger (S6462) caught at Riverhurst was genetically identified as a saugeye (Figure 4; Table 3). Two field-identified saugeye were actually incorrect, with one from Riverhurst genetically identified as a walleye (H6383) and one from Saskatchewan Landing as a sauger (H6631; Figure 4; Table 3; Data S1). Twelve fish identified phenotypically as pure walleye in the field were genetically identified as saugeye: nine from Riverhurst, one from Saskatchewan Landing, and two from Tobin Lake (Figure 4; Table 3; Data S1). One of these walleye was genetically identified as an F 1 /F 2 hybrid (W7012; Table 3; Data S1). Introgression was detected in eight of the phenotypically misidentified walleye from Riverhurst, with three of these individuals (W6412, W6442, and W6471) identified genetically as first-generation backcrosses, and five (W6405, W6445, W6464, W6465, and W6472) as second-generation backcrosses (Table 3; Data S1). One walleye from Saskatchewan Landing (W6511) and two from Tobin Lake (W6723 and W7142) were genetically identified as F 1 /F 2 hybrids (Table 3; Data S1).
Run independently, the NEWHYBRIDS analysis alone identified the same misidentified individuals as ADMIXTURE, except for one F I G U R E 4 ADMIXTURE analysis across species from all sample sites in the (a) r90 and (b) r100 datasets. Each line represents an individual from the corresponding sample site. Site abbreviations can be found in Table 1.

Asterisks above each figure represent misidentified individuals
individual (W6405) identified as a pure walleye in NEWHYBRIDS and a potential second-generation backcross in ADMIXTURE. A total of 14 of 212 (7%) fish were misidentified in the field, resulting in 93% correct assignment ( Figure 5; Table 3; Data S1). One field-identified sauger from Riverhurst (S6462) was genetically identified as a backcross ( Figure 5; Table 3; Data S1). Two fish field identified as saugeye were misidentified, with one from Saskatchewan Landing genetically confirmed to be a sauger (H6631), and one from Riverhurst identified as a walleye (H6383), both of which were also identified in ADMIXTURE ( Figure 5; Table 3; Data S1). Interestingly, one fish from Riverhurst field identified as a hybrid (H6384) was identified as an F 2 hybrid in the NEWHYBRIDS analysis and one fish (H6435) was also identified to be a backcrossed sauger (Table 3; Data S1). A total of 11 fish identified as walleye in the field were actually hybrids. Four of the 11 fish were identified as F 1 : one from Riverhurst (W7012), one from Saskatchewan Landing (W6511), and two from Tobin Lake (W6723, W7142; Figure 5; Table 3; Data S1). Seven fish identified morphologically as walleye from Riverhurst were actually backcrossed walleye (W6412, W6442, W6445, W6464, W6465, W6471, and W6472; Figure 5; Table 3; Data S1).

Comparing the combined results of both analyses, ADMIXTURE
and NEWHYBRIDS detected 17 of 214 (8%) individuals with incongruent identification in the field, with 16 of those present in both analyses (Table 3). It is important to note that any individual identified as belonging to a hybrid category based on either analysis method was considered a hybrid. There were two incongruencies across the different programs. One field-identified walleye (W6405) TA B L E 3 Individual fish that were misidentified in the field based on morphological features, and their actual species category based on two genetic analyses, ADMIXTURE and NEWHYBRIDS Note: The ADMIXTURE column indicates what the individual was identified as using the ADMIXTURE program with the corresponding Q-value in the specified column. The potential ADMIXTURE category is based on Q-values with cutoffs from a simulation run in HYBRIDLAB, with F 1 /F 2 indicating the individual could belong to either hybrid category. The NEWHYBRIDS column is based on Bayesian posterior probabilities with the fullconditional probability of assignment in the probability column. F 1 represents a cross between a pure walleye and a pure sauger, F 2 represents the breeding of two F 1 individuals, BC represents backcrossed individuals where an F 1 breeds with a pure individual, and BC2 represents the breeding of a backcrossed individual with a pure parental individual. Site abbreviations can be found in Table 1. W6471 did not converge in Bayesian analyses and was therefore removed from the NEWHYBRIDS analysis.

F I G U R E 5
Results from the NEWHYBRIDS analysis using the r100 dataset. Each line represents an individual's posterior probability to a hybrid class. Pure 1 represents sauger, and pure 2 represents walleye. F 1 individuals are filial 1 hybrids generated by pure 1 × pure 2. F 2 individuals are filial 2 hybrids created by F 1 × F 1 , and BC1 and BC2 represent backcrossed individuals where F 1 mate with pure 1 or pure 2, respectively was identified as a second-generation backcross with ADMIXTURE but was identified as a pure walleye with NEWHYBRIDS. One field-identified hybrid (H6384) was identified as an F 2 hybrid with NEWHYBRIDS, but as an F 1 based on Q-values in ADMIXTURE.
NEWHYBRIDS uses allele frequencies to determine specific hybrid classes, including designation of F 1 , F 2 , and backcrossed individuals. This approach resulted in better resolution when distinguishing between first-and second-generation hybrids but was not able to resolve backcross generations (Table 3). In contrast, the simulations from HYBRIDLAB were unable to resolve F 1 and F 2 individuals but provided clear Q-values associated with pure walleye and sauger, and first-and second-generation backcrosses. Individuals identified as F 1 in NEWHYBRIDS had average Q-values of 0.50884 (SD ≅ 0.00778) in ADMIXTURE, while the F 2 individual had a Qvalue of 0.53253. Overall, the analyses indicate that pure walleye were correctly identified 93% of the time based on morphological identification in the field. Tobin Lake had 97.8% correct assignment of pure walleye, compared with only 87.7% from Lake Diefenbaker.
Combining the results from all analyses, the overall hidden introgression level in walleye based on our limited sampling was 4.6% across both reservoirs, and 9.9% and 0% in Lake Diefenbaker and Tobin Lake, respectively. Within Lake Diefenbaker, Riverhurst had a 19.5% level of hidden introgression, while zero hidden introgression was detected at Saskatchewan Landing and Coteau Bay.

| D ISCUSS I ON
The GBS approach we used worked well with unsupervised clustering methods to resolve different hybrid categories in two admixed populations of walleye and sauger without a priori knowledge of diagnostic loci. The power of GBS lies in the ability to genotype large numbers of loci across the genome, providing high resolution to detect hybridization and introgression (Ackiss et al., 2020;Allendorf et al., 2001;Boecklen & Howard, 1997;McFarlane & Pemberton, 2019;Melville et al., 2017;Miller, 2000;Randi, 2008;Young et al., 2001). This detection power offers a substantial improvement over more traditional markers, which likely underestimated introgression levels (Allendorf et al., 2001;Boecklen & Howard, 1997;Grabenstein & Taylor, 2018;Hohenlohe et al., 2013;Vaha & Primmer, 2006). Previous studies of hybridization have used similar RRL approaches for SNP genotyping, but took additional steps to identify a subset of informative loci to be genotyped using a more targeted approach (e.g., Hand et al., 2015;Hohenlohe et al., 2013;Pritchard et al., 2012;Pritchard et al., 2016). This strategy may limit the application of the identified SNPs to specific populations, similar to the limitations mentioned for other marker types (Hand et al., 2015;Pritchard et al., 2016;Wringe et al., 2019).
Thus, rather than list a panel of specific loci for additional studies of walleye and sauger, we recommend that others apply the general approach of GBS for characterizing hybrids. GBS is very flexible across species and does not require initial investment in the development and quality control of diagnostic loci (Davey et al., 2011;Shafer et al., 2016). In addition, SNPs based on sequencing do not present the same challenges with scoring of alleles and are much more readily shared across laboratories and via archived data than previously used marker types (e.g., microsatellites, allozymes;Davey et al., 2011). The GBS approach should work broadly across different taxa and other species pairs, provided that the parental species are sufficiently divergent to offer clear resolution.
The ability to reliably resolve hybrids beyond first generation (F 1 ) is critical for understanding introgression in areas with spontaneous hybridization. Unsupervised clustering approaches based on maximum likelihood (e.g., ADMIXTURE) and Bayesian probability (e.g., NEWHYBRIDS) are commonly used to identify and classify hybrids. ~10 -20 microsatellite loci using thresholds between 0.8 and 0.9 (Burgarella et al., 2009;do Prado et al., 2017;Marie et al., 2011;May-McNally et al., 2015;Sacks et al., 2011;Sanz et al., 2009;Vaha & Primmer, 2006). The incorporation of more markers through a GBS approach reduces the error and maximizes detection of introgression (reviewed by McFarlane & Pemberton, 2019). However, Qvalue ranges have not been thoroughly investigated with GBS data without creating a panel of diagnostic loci (Randi, 2008;Vaha & Primmer, 2006). In our study, GBS enabled classification of parental species with very high confidence (Q-values > 0.99), and our simulation in HYBRIDLAB suggested a range of 0.05 < Q < 0.95 to classify hybrids, similar to other GBS studies (Ackiss et al., 2020;Lavretsky et al., 2016). Importantly, this Q-value range resulted in 94% agreement between ADMIXTURE and NEWHYBRIDS for identification of hybrids. However, there were some differences in several specific hybrid classifications, which has also been shown in previous SNP studies (Elliott & Russello, 2018;Gramlich et al., 2018;Pujolar et al., 2014). These differences result from the model used to determine hybrid classes, allowing NEWHYBRIDS to distinguish between F 1 , F 2, and backcrossed individuals (Anderson, 2009;Anderson & Thompson, 2002). Nevertheless, any individual identified as a hybrid using either program should be treated with caution when it comes to broodstock management in stocking programs.
Spontaneous hybridization of walleye and sauger is occurring in both of the major Saskatchewan River reservoirs we studied. In the case of Lake Diefenbaker, our finding confirms previous allozyme work (Billington et al., 2005), but genetically verified saugeye are a novel finding for Tobin Lake. Anthropogenic disturbances can break down reproductive isolation between species and may increase the rate of hybridization (reviewed by Rhymer & Simberloff, 1996;Todesco et al., 2016;McFarlane & Pemberton, 2019). Hydroelectric dams alter natural flow regimes, which cause significant alterations to native habitat, including the generation of artificial lakes (reservoirs) that replace reaches of the original river system (Carr et al., 2015).
Over time, the reservoirs lose environmental heterogeneity and native habitat patterns, which can remove reproductive isolation between species (Hall et al., 2011;Hasselman et al., 2014;Seehausen et al., 2008). Dams have altered the habitat in the Saskatchewan River, resulting in the isolation of populations of walleye and sauger into smaller geographical sections. Further, reservoirs are characterized by a merging of riverine and lacustrine regions, creating areas of habitat transition, which likely influence interspecific interactions and facilitate hybridization (reviewed in Grabenstein & Taylor, 2018;Mandeville et al., 2019). The creation of reservoirs from the construction of dams in the Saskatchewan River has likely led to an increased level of hybridization due to the changes in habitat across the system.
The occurrence of hybridization and introgression varied substantially between reservoirs and was much more common in fish sampled from Lake Diefenbaker than Tobin Lake. Importantly, a single putative F 2 hybrid and multiple generations of backcrosses with both walleye and sauger were detected in Lake Diefenbaker, while only F 1 hybrids were found in Tobin Lake. It is important to note that although the F 2 hybrid was only detected in one of the analyses, the high posterior probability in NEWHYBRIDS (1.000) likely means that it is a true second-generation saugeye. This is only the second F 2 saugeye reported in nature (see Fiss et al., 1997) and demonstrates viable F 1 × F 1 crosses in Lake Diefenbaker. Thus, F 1 saugeye are reproducing in Lake Diefenbaker, causing introgression of the sauger genome into walleye, and vice versa. In contrast, we found no evidence of introgression in Tobin Lake. Multiple factors could contribute to the reservoir differences that we observed, including differences in relative abundance of parental species, overlap in habitat and spawning locations, and anthropogenic influences in each reservoir (Gilman & Behm, 2011;Grabenstein & Taylor, 2018;Scribner et al., 2001). Butt et al. (2017) found that ecological niche overlap between walleye and sauger differed substantially within the reservoirs, indicating there are underlying factors causing distinct interactions in these lakes. On a broader scale, these reservoirs are impoundments of distinct regions of the Saskatchewan River with different headwaters, habitat and productivity characteristics, and stressors, which may influence geographical patterns of hybridization. The South Saskatchewan River in particular (containing Lake Diefenbaker) is heavily altered due to its multiuse purpose (e.g., drinking water, power, and agricultural and industrial use), human population density, and associated land-use activities in the Canadian prairies (Carr et al., 2015;Corkal et al., 2011;North et al., 2015). However, the factors that are most relevant for understanding hybridization (e.g., spawning event overlap ;Hasselman et al., 2014;Mulfeld et al., 2009) are not well characterized in either reservoir or host river.
Hybridization and introgression in Lake Diefenbaker may vary based on location within the reservoir. Previous data based on allozymes from gill-netted fish showed an east-to-west gradient of F 1 hybrids, with the lowest level of hybridization detected in the northeast near Coteau Bay, and the highest near Saskatchewan Landing at the western end of the lake (Billington et al., 2005). The samples we collected from CFEs also showed within-reservoir differences, with higher levels of hidden introgression (19.5%) in field-identified walleye from the Riverhurst CFE, and zero hidden introgression at both the Saskatchewan Landing and Coteau Bay events. Although there was no hidden introgression detected at Saskatchewan Landing, multiple first-generation hybrids were detected, whereas multiple generations of backcrosses and a second-generation hybrid, F 2 , were detected at Riverhurst. There are major habitat differences among these areas of the lake, but it is uncertain whether apparent hybridization differences reflect the location of sampling, the time of year, and/or the characteristics of the particular fish susceptible to angling in the area at that time. Other studies have shown spatial gradients in hybridization frequency between lakes across large geographical regions, usually with high frequencies near the source reservoir (Albert et al., 2006;Graeb et al., 2010;Hargrove et al., 2019;Hasselman et al., 2014;Mandeville et al., 2017Mandeville et al., , 2019Rudbridge & Taylor, 2005;Sotola et al., 2019). However, very few studies have detected hybridization differences based on sample site within a lake. Additional research is required to understand potential spatial structuring of introgression in Lake Diefenbaker.
The results of our study and previous work suggest that the population of walleye in Lake Diefenbaker is in a hybrid zone, with a range of hybrid types, including those where introgression may render individuals indistinguishable from their parental species (Allendorf et al., 2001;Barton & Hewitt, 1985;Rhymer & Simberloff, 1996).
This could lead to a hybrid swarm, which can alter the genetic composition of the parental species and reduce population and species differentiation, possibly leading to outbreeding depression (Allendorf et al., 2001;Mandeville et al., 2019;McFarlane & Pemberton, 2019;Rhymer & Simberloff, 1996;Scribner et al., 2001;Todesco et al., 2016;Weigel et al., 2003). In contrast, hybrid zones can also lead to novel genotypes and an increase in diversity within the species (Barton, 2001;Hamilton & Miller, 2015;Harrison & Larson, 2014;Seehausen, 2004). This uncertainty makes it important to understand the genomic impacts of hybridization to help mediate long-term effects, particularly when hybridization is the result of anthropogenic habitat disturbances (Grabenstein & Taylor, 2018;Hasselman et al., 2014). This is especially true for fish in reservoirs, which are particularly susceptible to environmental change in combination with heavy anthropogenic impact (Hayes et al., 1999). Importantly, the impact of the hybridization documented in these lakes, if any, is unknown. Nevertheless, the presence of a hybrid swarm in Lake Diefenbaker indicates that hybridization is a pervasive and well-established biological phenomenon in that waterbody. Reservoirs such as Lake Diefenbaker often support important recreational fisheries and/or provide broodstock for stocking programs; a decline in population productivity due to outbreeding depression may not only deteriorate the population, but also have larger socio-economic impacts.
This study shows that it is crucial to include genetic assessments in fisheries management activities that require correctly identifying walleye, sauger, and their hybrids. Such activities may include broodstock collection, hybrid zone identification, and general population surveys. Overall, 7.0% of fish were misidentified morphologically based on broad species and hybrid categories, with levels of misidentification varying by reservoir. Most concerningly, several fish presumed to be pure walleye were identified genetically as F 1 hybrids. F 1 individuals should be phenotypically intermediate to parental species; however, our results and previous work suggest this is not always the case Flammang & Willis, 1993;Van Zee et al., 1996;Ward & Berry, 1995;White et al., 2005).
Further, parental species were misidentified as hybrids in the field, illustrating that natural phenotypic variation and introgressed markers in pure parental species may also be misleading. Our analyses of field-identified walleye revealed hidden sauger introgression in 8% of fish based on crosses beyond F 1 in Lake Diefenbaker. This finding indicates introgression within the walleye genome, which is common with backcrosses, especially after multiple generations, as individuals with low levels of admixture may not have intermediate morphol- ogy (Allendorf et al., 2001;McFarlane & Pemberton, 2019). Due to the hidden introgression detected in this study, fisheries managers should take caution, or entirely avoid sourcing broodstock from waterbodies where spontaneous hybridization is occurring. This is especially important when the purpose of the broodstock collection is for supplementing natural reproduction and/or stocking in locations with connectivity to waterbodies with self-sustaining populations.
However, the negative consequences of potential hybrid stocking in hydrologically isolated waterbodies, or waterbodies that lack suitable conditions for natural production (i.e., put-take fisheries) are inherently lower. In Saskatchewan, this means that Lake Diefenbaker, Tobin Lake, and possibly other locations in the Saskatchewan River may not be suitable broodstock sources for all target waterbodies.
Ultimately, it is vital to routinely monitor the genetic integrity of samples where spawn is collected in order to mitigate the risk of genetic introgression into broodstock and other systems. In addition, monitoring the genetic integrity of populations which were originally sourced from broodstock with potential introgression will aid in understanding the dispersal of hybrids through stocking activities and potential consequences. We are grateful to the anglers and organizers of competitive fishing events for working with us to provide access to fish, and to the anglers that made these events successful. We also thank J. Finally, we would like to thank the Associate Editor and two anonymous reviewers for their helpful comments on earlier versions of the manuscript.

CO N FLI C T S O F I NTE R E S T
The authors declare no conflict of interest.

DATA AVA I L A B I L I T Y S TAT E M E N T
All relevant data are available from Dryad at https://doi.org/10.5061/ dryad.8cz8w 9gnx .