Long‐term evaluation of fitness and demographic effects of a Chinook Salmon supplementation program

Abstract While the goal of supplementation programs is to provide positive, population‐level effects for species of conservation concern, these programs can also present an inherent fitness risk when captive‐born individuals are fully integrated into the natural population. In order to evaluate the long‐term effects of a supplementation program and estimate the demographic and phenotypic factors influencing the fitness of a threatened population of Chinook Salmon (Oncorhynchus tshawytscha), we genotyped tissue samples spanning a 19‐year period (1998–2016) to generate pedigrees from adult fish returning to Johnson Creek, Idaho, USA. We expanded upon previous estimates of relative reproductive success (RRS) to include grandparentage analyses and used generalized linear models to determine whether origin (hatchery or natural) or phenotypic traits (timing of arrival to spawning grounds, body length, and age) significantly predicted reproductive success (RS) across multiple years. Our results provide evidence that this supplementation program with 100% natural‐origin broodstock provided a long‐term demographic boost to the population (mean of 4.56 times in the first generation and mean of 2.52 times in the second generation). Overall, when spawning in nature, hatchery‐origin fish demonstrated a trend toward lower RS compared to natural‐origin fish (p < 0.05). However, when hatchery‐origin fish successfully spawned with natural‐origin fish, they had similar RS compared to natural by natural crosses (first‐generation mean hatchery by natural cross RRS = 1.11 females, 1.13 males; second‐generation mean hatchery by natural cross RRS = 1.03 females, 1.08 males). While origin, return year, and body length were significant predictors of fitness for both males and females (p < 0.05), return day was significant for males but not females (p > 0.05). These results indicate that supplementation programs that reduce the potential for genetic adaptation to captivity can be effective at increasing population abundance while limiting long‐term fitness effects on wild populations.

Owing to habitat degradation and fragmentation, dam construction, overfishing, and climate change, multiple species and populations of Pacific salmon are at risk of extirpation (Gustafson et al., 2007). To reduce the likelihood of extirpation, hatchery supplementation programs have been implemented throughout the Pacific Northwest (USA) to increase population abundance (Naish et al., 2007;Paquet et al., 2011). However, previous research shows that hatchery-origin fish released into the river systems can affect the fitness of wild stocks . Although supplementation programs, such as hatchery programs, may cause deleterious fitness effects (Araki, Cooper, & Blouin, 2009;Christie et al., 2012;Ryman & Laikre, 1991), they can also provide a large demographic boost (DB) to natural populations through an increase in the reproductive success (RS) of the population as a whole (Cuenco, 1994).
Despite increases in salmonid abundance from supplementation programs (Paquet et al., 2011), the use of hatcheries remains controversial. Therefore, there is a growing need for empirically based scientific evaluation of the long-term effects of hatchery fish on the wild population (Flagg, 2015). Recent evidence suggests that integrating natural-origin individuals into the hatchery breeding pairs (i.e., broodstock) can boost natural population abundance with minimal negative fitness impacts to the wild population (Hess et al., 2012;Schroder et al., 2008). However, it remains unclear whether these effects are sustained across multiple generations. This study extends results from Hess et al. (2012) to evaluate long-term fitness effects between hatchery-origin fish that spawn with natural-origin fish in nature across multiple generations and examines factors influencing highly variable RS that is observed among individuals.
We utilized a 19-year (1998-2016) span of genetic pedigrees from a Chinook Salmon (Oncorhynchus tshawytscha) population that integrates 100% natural-origin individuals into the hatchery broodstock to specifically (a) expand upon previous work by Hess et al. (2012) and evaluate whether supplementation continued to provide a longterm DB to the population, (b) test for overall differences in fitness (i.e., RS) between hatchery-and natural-origin fish across two generations, and (c) test the significance of other potential factors (e.g., body length and return timing) affecting fitness in this population.

| Study site and sample collection
Over the period of 1998-2016, approximately 14,500 caudal fin samples were collected from Chinook Salmon returning to spawn in Johnson Creek, Idaho, as part of a supplementation program implemented by the Nez Perce Tribe (NPT) in 1998 (see Table S1 for sample sizes). Johnson Creek is a spawning aggregate of the East Fork South Fork Salmon River spring/summer Chinook Salmon population, representing one of 32 evolutionarily significant units of spring/summer Chinook Salmon populations in the Snake River (see Supporting Information Figure S1 for map of the South Fork Salmon River basin). The majority of the tissue samples and biological data used in our analyses were collected from fish trapped at the NPT's Johnson Creek weir, a temporary but highly efficient trap located downriver from where approximately 90% of all redds in Johnson Creek are enumerated (Rabe, Nelson, & Covel, 2016). We also collected tissue from carcasses encountered during spawning ground surveys. Combined, the two approaches were estimated to have sampled a yearly mean of 93% of the entire population.
During sampling, NPT staff recorded data including sex, body length, origin, sampling date, and weir arrival date. Sex was determined by physical morphology. Origin was determined by the presence/absence of coded wire tags, visual implant elastomer tags, or a clipped adipose fin; hatchery-origin (further referred to as "HOR") fish in this system have a coded wire tag and/or a visual implant elastomer tag, while natural-origin (further referred to as "NOR") fish do not have any markings. Fish reared in this supplementation program assist with recovery of a nearly extirpated population, so no harvest is intended and these HOR fish are not adipose-clipped. Thus, any adipose-clipped fish that return to Johnson Creek represent strays from other hatchery programs and are removed at the weir from the spawning population. Arrival time for PIT-tagged fish was determined through the use of an array mounted on the weir, while arrival time for non-PIT-tagged fish was based on weir collection date. For the purposes of this study, data recorded in the field were entered into a shared database and combined with parentage-informed data.
If there were any discrepancies between field-collected and parentage-informed data, we deferred to parentage-informed data. For example, if a fish was listed as a 4-year-old from the field-collected data, but parentage analysis assigned that individual as a 5-year-old, we chose the parentage-informed data to infer age.
As per protocol for this supplementation program, only NOR returns were used as broodstock, which consisted of up to 40 pairs of NOR adults annually. All remaining NOR returns and all Johnson Creek HOR returns were released above the weir for natural spawning. After fertilization of gametes from broodstock pairs, eggs were transferred to the McCall Fish Hatchery, where juveniles were reared for approximately 19 months. Supplementation juveniles were then direct-released into Johnson Creek. Upon maturation, adults returned to spawn at Johnson Creek at either 3 (male "jacks"), 4, or 5 years old. Female 3-year-olds along with 2-and 6-year-olds returning adults represent a small proportion of returning fish (<1%) and were therefore not included in any analyses due to low sample sizes.

| Parentage analysis
The current study extends the parentage results of Hess et al. (2012) that presented genetic-based pedigrees for adults sampled be- We extracted genomic DNA from fin tissue using two separate methods over the course of the 19-year study: a standard Qiagen DNeasy protocol paired with a Qiagen 3000 robotic pipetting system (Qiagen Inc., Valencia, CA) and a Chelex 100 method (Sigma-Aldrich, St Louis, MO). Genotyping methods transitioned from 13 microsatellite markers to 298 single-nucleotide polymorphisms (SNPs) throughout the course of the 19-year study period, but the transition was done in a manner such that no gaps existed in parentage (Table S1). Microsatellite genotyping was completed following methods described previously (Hess et al., 2012), while SNP genotyping was done for a panel of 298 markers with genotyping in thousands by sequencing (GT-seq; Campbell, Harmon, and Narum (2015); SNP markers described in Hess, Campbell, Matala, Hasselman, & Narum, 2015). In order to transition between marker panels, there were 5 years (2008-2012) of adult returns that were genotyped with both microsatellite and SNP marker panels (Table S1) (Table S1). Single-nucleotide polymorphism loci with <90% coverage across individuals (i.e., any locus that had ≥10% missing data) were removed prior to running parentage analyses. Since parentage analyses were conducted on a yearly basis, the analyses were split, thereby removing a distinct, data-specific number of SNPs for each analysis. Therefore, the number of SNPs used for parentage analyses ranged from 276 to 293. In addition, we removed any individual with ≥10% missing genotypes. We then used the package idfgen in R (RCore, 2016) to format data for parentage analysis and remove duplicate samples (https://github.com/mack-erman44/idfgen) and the program SNPPIT to assign parentage for adult offspring (Anderson, 2010). The program SNPPIT identifies parent pairs with biallelic markers using a likelihood-based categorical assignment method and Monte Carlo simulations to assess confidence in assignments. We allowed for a 1% per locus genotyping error rate as recommended by the author (Anderson, 2010), but observed genotyping error based on concordance of quality control tests (i.e., repeated genotyping) was estimated to be <0.1%. Therefore, the estimate used in our analyses was a conservative overestimate of observed error. We did not include sex of the parent in the parentage analysis.
We then removed trio parentage assignments at a false discovery rate (FDR) >0.05 and >2 trio mismatches (Anderson & Garza, 2006;Benjamini & Hochberg, 1995). Additionally, we removed assignments in which the trio having the highest posterior probability of parentage did not involve the two true parents of the offspring (e.g., the candidate parents may be related to the true parents but are not the actual true parents themselves; Anderson, 2010;Hess, Campbell, Docker, et al., 2015). We also removed HOR offspring assigning to nonbroodstock parents, offspring assigning to one broodstock and one nonbroodstock parent, and offspring with spurious assignments to two same-sex parents. Finally, we determined the overall assignment success for offspring from return year 2013 and 2014, as all parental brood years that produced fish returning in 2013 and 2014 were sampled.

| Demographic boost
To determine whether the supplementation program provided a DB to the naturally spawning population, we compared the RS of F 0 fish taken from the wild and used as hatchery broodstck (RS BS ) and the RS of F 0 NOR natural spawners (RS NS ) F 1 natural spawners that originated from F 0 broodstock parents were classified as HOR, while F 1 natural spawners that originated from F 0 NOR natural spawners were considered NOR. In order to evaluate the DB through the first generation of NOR spawners, it was necessary to account for the F 1 NORs that returned to Johnson Creek but were removed and used for broodstock. We did this by subtracting the accordant number of F 1 fish that were used for broodstock and then determining the RS of HOR and NOR fish for the appropriate years This was a more conservative estimate (lower values) of F 1 DB than if we had not accounted for F 1 fish that were used for broodstock. Overall, this estimate allowed us to measure the RS of grandparents and subsequent second-generation DB. Males (including jacks) and females were combined for both estimates of F 0 DB and F 1 DB.

| Relative reproductive success
To measure lifetime RS, we estimated the number of returning adult offspring for each parent. Females (4-and 5-year-old adults), males (4-and 5-year-old adults), and jacks (3-year-old males) were analyzed separately. We then compared RS between naturally spawning HOR and NOR fish in 2006-2011 following the methods described in Hess et al. (2012), also including any updates of estimates from previously published results (years 2002-2005). Briefly, we calculated relative reproductive success (RRS) by dividing the average RS of HOR fish by the average RS of NOR fish. RRS estimates were analyzed using two approaches. For the first approach, we included all potential candidate spawners in the population, regardless of whether they were assigned as parents to returning adult offspring. This approach has been used across multiple studies of RRS in salmonids (Araki, Ardren, et al., 2007;Berntson et al., 2011;Milot, Perrier, Papillon, Dodson, & Bernatchez, 2013;Theriault, Moyer, Jackson, Blouin, & Banks, 2011;Williamson et al., 2010). However, there are numerous reasons that individuals do not produce returning adult offspring, including but not limited to prespawn mortality or staying of offspring (Reviewed in Bowerman, Keefer, & Caudill, 2016;Keefer & Caudill, 2014). Because those individuals do not pass on their alleles to the next generation, we used a secondary approach to estimate the RS of those individuals that did pass their alleles on to the next generation, thereby affecting the long-term fitness of the natural population. For this secondary approach, we included only those spawners that successfully produced returning adult progeny, removing those individuals that produced zero returning adult offspring (Hess et al., 2012). We found that the difference in genotyping failure rates for HOR versus NOR parents was not significant (ANOVA; p = 0.13), and thus, differences in RS were expected to be biological in nature.
Finally, to evaluate fitness effects of HOR fish mating with NOR fish in nature (Araki et al., 2008;Ford et al., 2016;Hess et al., 2012), RS of HOR fish spawning in nature with NOR fish (HOR♀ × NOR♂ or NOR♀ × HOR♂) was compared to NOR × NOR matings. This comparison of cross types allowed us to generate a separate RRS value for the effect of the female parent and the male parent having been reared in the hatchery (Figure 1). We also used assignments to grandparents to compare the RS of HOR × NOR matings to

RS NOR
NOR × NOR matings across two full generations for a subset of return years (2002)(2003)(2004)(2005)(2006). Similar to our analysis of DB, we subtracted the accordant number of F 1 fish that were used for broodstock when determining the RS of grandparents. Additionally, we compared HOR × HOR to NOR × NOR matings following similar procedures.
Due to low sample sizes, jacks were not included in any estimates involving crosses.

| Statistical analyses and phenotypic variation
For each return year and sex, we used ANOVAs to test the null hypothesis that the mean RS was equal for NOR versus HOR fish. We then tested for differences in RS for all four types of crosses H × H, H × N, N × H, and N × N (where female is listed first in each cross type). We also used delta-method-based 95% confidence intervals to test for differences in RRS (Bowerman et al., 2016;Ford et al., 2016;Ford, Murdoch, & Howard, 2012;Franz, 2007 length were transformed to mean absolute deviation estimates to account for variation in return timing and growth among years. Using a negative binomial distribution model and a log link function, we considered eight candidate models consisting of two to four predictor variables and used AIC model selection, with the best fit model being the one with the lowest AIC score (Akaike, 1992;Burnham & Anderson, 2003). Results of a preliminary exploratory data analysis revealed that quadratic terms did not improve model fit and were therefore not included in the candidate model set. Likewise, interaction terms did not improve model fit and were not included due to limited sample sizes. Additionally, age and body length were highly correlated and not included together in candidate models. Therefore, the full models appear as follows: RS = Return Year + origin + body length + return day or RS = Return Year + origin + age + return day. All GLMs were run in R version 3.3.3 using the glm.nb function as part of the MASS package (RCore, 2016;Venables & Ripley, 2002). Graphics were generated using ggplot2, and data manipulation was conducted using dplyr, tidyr, and readr as part of the tidyverse package (Ross, Wickham, & Robinson, 2017;Wickham, 2009).    dataset (return years 2008-2016). All parental brood years that produced fish returning in 2013 and 2014 were sampled. Therefore, we used multiple criteria to filter potential false assigned offspring and found a range of assignment success from 76% (2014 offspring) to 77% (2013 offspring). In this study, average LOD of 25.9 and FDR of 0.0002 provided very high confidence in parentage assignments.

| Parentage analyses
Within the filtered dataset, the average number of Mendelian mismatches within the trio was 0.30 loci.
As stated previously, our evaluations of fitness effects of HOR fish mating with NOR fish were sex-specific and involved only 4and 5-year-olds in RRS estimates (  Figure 4). When measuring the fitness effects of cross types across two generations, we did not find any significant differences for males or females (Table S2; Figure 5). The RRS values across one generation for H × H versus N × N crosses ranged from 0.84 to 1.44 for females (mean = 1.04; Table S2; Figure 4a) and 0.97 to 1.81 for males (mean = 1.20; Table S2; Figure 4b).

| Factors affecting reproductive success
Prior to running GLM analyses, we used exploratory data analy-  Figure S2). In addition, return day demonstrated a bimodal distribution that remained consistent across origin type (HOR vs. NOR) and sex (Supporting Information Figure S3).
Out of the eight models that we tested, the best fitting model to the data included return year, origin, body length, and day of return for female and male spawners, although three other models had similar AIC scores (Table S3). Overall for females, the estimated coefficients of return years 2004-2011, origin, and body length for predicting RS were significantly different than zero (p < 0.05).
Return year 2003 and return day were not significant predictors in the model (p > 0.05; Table 1). For males, all of the estimated coefficients for predicting RS were significantly different than zero (p < 0.01) (  Figure 6).
Although we demonstrate that RS of HOR and NOR was significantly different for only a small subset of return years (Table S2; Figure 3), we found a general trend of fewer returning adult offspring produced by HOR compared to NOR males (Table S2; Figure 3a).
We also show that when accounting for other factors (e.g., return timing, body length), origin predicted RS, with NOR fish demonstrating higher RS than HOR fish (Table 1). These results are consistent with similar findings in other populations (Araki et al., 2008;Christie, Ford, & Blouin, 2014;Ford et al., 2012Ford et al., , 2016. However, negative fitness effects were generally found in populations that incorporated HOR fish into hatchery broodstock. Alternatively, studies utilizing populations that incorporate 100% NOR fish into hatchery broodstock repeatedly demonstrate reduced genetic risks (i.e., domestication selection) associated with supplementation Hess et al., 2012;Schroder et al., 2008;Waters et al., 2015Waters et al., , 2018. Therefore, the lack of consistent fitness differences between HOR and NOR fish in this study may reflect the positive effects of a broodstock program that uses only NOR fish. In addition, the majority of other studies assessing fitness differences between HOR and NOR fish have not incorporated other sources of variation that could account for the decrease in fitness of HOR fish (Araki et al., 2008;Christie et al., 2014).
We show that origin, return year, body length, and return day significantly predicted RS for males. For females, return year (with the exclusion of 2003), origin, and body length significantly predicted RS. Overall, body length demonstrated a positive relationship with both male and female RS, a finding that is consistent across salmonid species (Berntson et al., 2011;Ford et al., 2012Ford et al., , 2016Neff, Garner, Fleming, & Gross, 2015;Williamson et al., 2010). In female salmonids, body size is correlated with greater fecundity and egg size (Dickerson et al., 2002), and in males, large body size is predictive of higher RS, greater access to mates, and increased likelihood of engaging in male-male competition (Berejikian et al., 2009). However, we find that in a subset of return years (e.g., 2007 females), HOR fish exhibit a general trend toward smaller body length compared to NOR fish which may, in part, explain lower HOR RS that same year ( Figure 6). Previous research has also found a size difference between HOR and NOR fish that may be reflective of seasonally mediated effects on growth or parental age of maturity through broodstock selection (Hankin, Nicholas, & Downey, 1993;Knudsen et al., 2006;Larsen et al., 2004).
The drivers of size differences between HOR and NOR fish were not measured in this study, but interannual differences in environmental conditions could provide a possible explanation and should be addressed in future studies.
Return day demonstrated a significant negative relationship with RS for males, providing evidence that males returning later in the season produce fewer returning offspring. We did not see a significant effect of return day for females in this study. However, the return dates for both sexes show a bimodal distribution with returns either early or late in the season (with a distinct decrease in returns in between the two modes; Supporting Information Figure   S3). Although we did not measure selection gradients in the current study, other salmonid studies have demonstrated large, genetically based, heritable variation in return timing that is under distinct selective patterns such as stabilizing or disruptive selection (Anderson, Faulds, Atlas, Pess, & Quinn, 2010;Ford, Hard, Boelts, LaHood, & Miller, 2008;Quinn, McGinnity, Reed, & Bradford, 2016;Quinn, Unwin, & Kinnison, 2000;Seamons et al., 2007). Moreover, other studies have provided support that return date predicts RS, with distinct optimum return dates (Berntson et al., 2011;Kodama, Hard, & Naish, 2012). Therefore, it is possible that return timing in this population is under stabilizing or disruptive selection (Narum, Genova, Micheletti, & Maass, 2018).
Finally, we demonstrate that return year significantly predicted RS, a finding that has been shown in other studies (Berntson et al., 2011;Ford et al., 2016). This finding is also reflected in the large variation in yearly RRS and RS estimates found both in our study and in previous studies (Garant, Dodson, & Bernatchez, 2001;Hess et al., 2012;Jones & Hutchings, 2002  Body length (mm) Predicted reproductive success (95% CI) this study, the interaction between annual environmental conditions and phenotypic factors may help to uncover the sources of unexplained variation in RS across years. Other studies in Pacific salmonids have demonstrated an effect of river temperatures on survival and fitness (Hinch et al., 2012). Further, fluctuating ocean conditions, such as upwelling of cold water, can impact fitness in salmonids (Bi, Peterson, Lamb, & Casillas, 2011;Emmett, Krutzikowsky, & Bentley, 2006;Fisher & Pearcy, 1988;Mantua, Hare, Zhang, Wallace, & Francis, 1997;Peterson & Schwing, 2003). Therefore, future studies should be aimed at addressing the specific environmental factors that are associated with variation in RS estimates.
While we show a reduction in fitness of HOR compared to NOR fish for a subset of return years, crosses involving either one or two successful HOR parents demonstrated RS that was not significantly lower than those crosses involving two successful NOR parents (Table S2; Figure 4). These results provide support that fitness generally did not decrease for NOR fish when mating with HOR fish. When we extend RRS estimates of cross types to two full generations, we continue to see a nonsignificant difference between crosses involving either one or two HOR grandparents compared to those involving two NOR grandparents (Table S2; Figure 5).
It is important to note, however, that crosses involving jacks were removed due to low sample sizes. Low sample sizes can result in both imprecise RRS estimates and statistical power that is too low to detect a fitness difference, particularly between crosses (Christie et al., 2014). Although sample sizes in this study varied across years, Hess et al. (2012) provided evidence that an increase in sample size (by combining sexes across years) did not change results of their study nor did removal of years with low sample sizes. In addition, we provide multiple replicates (i.e., return years) with sample sizes that are equal to or greater than previous RRS studies (Araki, Ardren, et al., 2007;Berntson et al., 2011;Milot et al., 2013;Theriault et al., 2011). Future studies will aim to include single-parent assignments to increase overall offspring assignment success and subsequent sample sizes for RRS estimates.
While we demonstrate lower RS of HOR fish overall, we do not find a short-or long-term fitness reduction when HOR fish interbreed with the wild population. Other factors in addition to origin, such as return timing and body length, also predict RS in this population. By broadening our understanding of the specific factors affecting fitness of threatened species, we can more efficiently tailor conservation management strategies to focus on maintaining genetic diversity and increasing supplementation as a means of reducing the likelihood of extirpation.

ACK N OWLED G EM ENTS
We would like to thank the many NPT employees who helped collect the field data. Laboratory-based work was completed by Stephanie Harmon, Nick Hoffman, Amanda LaGrange, Lori Maxwell, and Megan Moore. We would also like to thank Mike Ackerman for the custom R script used to format data for SNPPIT and Daniel Hasselman for making valuable modifications to the scripts. Special thanks to Jason Vogel, Bill Young, and the associate editor and reviewers for providing valuable and insightful feedback on our manuscript. Finally, we would like to acknowledge Bonneville Power Administration for funding of this research.

CO N FLI C T O F I NTE R E S T
None declared.

DATA ACCE SS I B I LIT Y
GT-seq primer sequences, the R script used to analyze data, and both phenotype and genotype data of all fish used in the analysis are available on the Dryad Digital Repository https://doi.org/10.5061/ dryad.q6c9891.