Population structure and gene flow in the Sheepnose mussel (Plethobasus cyphyus) and their implications for conservation

Abstract North American freshwater mussel species have experienced substantial range fragmentation and population reductions. These impacts have the potential to reduce genetic connectivity among populations and increase the risk of losing genetic diversity. Thirteen microsatellite loci and an 883 bp fragment of the mitochondrial ND1 gene were used to assess genetic diversity, population structure, contemporary migration rates, and population size changes across the range of the Sheepnose mussel (Plethobasus cyphyus). Population structure analyses reveal five populations, three in the Upper Mississippi River Basin and two in the Ohio River Basin. Sampling locations exhibit a high degree of genetic diversity, and contemporary migration estimates indicate that migration within river basins is occurring, although at low rates, whereas no migration is occurring between the Ohio and Mississippi river basins. No evidence of bottlenecks was detected, and almost all locations exhibited the signature of population expansion. Our results indicate that although anthropogenic activity has altered the landscape across the range of the Sheepnose, these activities have yet to be reflected in losses of genetic diversity. Efforts to conserve Sheepnose populations should focus on maintaining existing habitats and fostering genetic connectivity between extant demes to conserve remaining genetic diversity for future viable populations.

are an increasing concern because of their crucial role providing ecosystem services such as environmental nutrient recycling, structural habitat for other species, food resource, and biofiltration (USFWS, 2012a;Vaughn, 2018;Vaughn et al., 2008). Their ecosystem services and intrinsic value warrant the development of comprehensive conservation strategies to preserve them. To achieve this, researchers have started to examine the ecological and genetic conditions of imperiled freshwater mussel species at the population level. Successful conservation of imperiled species, such as freshwater mussels, must include a measurement of available genetic diversity as it represents the raw material for adaptation to environmental changes. Lack of genetic diversity can cause populations to become genetically fixed and intolerant to a constantly changing environment (Frankel & Soulé, 1981). Population connectivity is important for maintaining genetic diversity and can be heavily impacted by anthropogenic activity. In this manuscript, we estimate the distribution of genetic diversity and population connectivity to inform conservation decisions of an imperiled freshwater mussel species, the Sheepnose mussel (Plethobasus cyphyus).
There are three species within the genus Plethobasus, all of which are listed as endangered. Of the three species, the Sheepnose currently occupies the broadest distribution (Hove et al., 2015; Turgeon et al., 1998). All three species of Plethobasus have exhibited rangewide population declines, presumably as a result of anthropogenic changes to their habitat, which are thought to be the most prominent factor in the decline of Plethobasus (Stein & Flack, 1997; USFWS Service, 1984; US ACOE, 2011). Unlike its much rarer congeners, the Sheepnose is relatively abundant and widespread, which provides an opportunity to conserve this species.
Sheepnose usually occur in shallow shoals with moderate-to-swift currents over coarse gravel and sand (Oesch, 1984). However, other habitat features may include mud, cobble, and boulders in deeper large river runs (Parmalee & Bogan, 1998). In general, freshwater mussels are long-lived species with life spans ranging from two years to decades (Mutvei et al., 1994). Sheepnose are estimated to live ~20-30 years and become reproductively mature around 5 years of age (Hove et al., 2015). The Sheepnose, like all unionid species, uses a mating strategy in which males expel sperm into the water column; the sperm are then taken in by females to fertilize their eggs, which are held in modified portions of the gills. After fertilization, the mature larvae, glochidia, are released into the water and attach to the gills or fins of fish, where they complete their development. After development is complete, the juvenile mussels drop off the host fish and establish themselves in the substrate (Parmalee & Bogan, 1998). Originally, the only known natural host fish for the Sheepnose was the Sauger (Sander canadensis) (Surber, 1913); however, a more recent study has shown that the Sheepnose appears to be a cyprinid host specialist (Hove et al., 2015). Gene flow is thought to be achieved through the dispersal of sperm and the glochidia larval stage (Ferguson et al., 2013; Hove et al., 2015), so conservation efforts must consider host species availability so that the life cycle can be completed.
FIGURE 1 Sampling locations of the seven sites from which Sheepnose mussels were collected for genetic analysis. The gray shaded area in the inset map indicates the approximate historic range of the Sheepnose (NatureServe Explorer)
Historically, the Sheepnose mussel occurred throughout much of the Mississippi River system (Figure 1) (USFWS, 2002). According to a status report conducted in 2002 by the USFWS, of the 77 streams that were historically occupied by Sheepnose populations, only 26 streams are thought to still be occupied. This decline in range and abundance has been attributed to human impacts such as land development, dams, and pollution (Haag & Williams, 2014). However, published and unpublished records since the 1800s indicate that although Sheepnose was historically widespread, the species was often described as uncommon (USFWS, 2002). Archaeological evidence of discovered shell fragment locations indicates that this species may have been uncommon or rare for centuries (Parmalee & Bogan, 1998). The goals of this study are to describe the genetic diversity and population structure across the majority of the extant range of the Sheepnose. The loss of Sheepnose populations across its range indicates the possibility that a loss of genetic diversity and population connectivity has also occurred. Under this scenario, populations would possibly show the signature of genetic bottlenecks. If historical evidence is true and the Sheepnose has always been rare, analysis of population genetic data would not show signs of a loss of diversity or evidence of a bottleneck, and extant populations would be more resilient to the effects of isolation. Conservation implications will differ depending on the level of current connectivity and genetic diversity available within populations.
If the isolated nature of the Sheepnose populations has resulted in the erosion of genetic diversity, then reestablishing habitat connectivity and implementing translocations could be prudent. If, however, the Sheepnose appears to have maintained genetic diversity despite population loss and isolation, then maintaining available habitat and improving connectivity would likely be a higher priority management objective. This study is intended to provide knowledge about the extent of isolation and genetic diversity of the Sheepnose by estimating contemporary population connectivity and structure to inform conservation decisions.

| Study area and sampling
A combination of microsatellite markers and mitochondrial DNA sequences was used to estimate genetic diversity, population structure, contemporary migration, and population size changes in the Sheepnose. Samples (N = 164) for DNA extraction were collected from seven different localities (Table 1). Collection efforts were focused on the Mississippi and Ohio river basins (Figure 1). Mussels were collected by snorkeling or SCUBA at various locations. Samples for DNA extraction were collected by taking a small (~1 mm) biopsy of mantle tissue (Berg et al., 1995) or by using cytology brushes that were swabbed over the mantle tissue of mussels to accumulate mucus and sloughed cells (Henley et al., 2006). Biopsy samples were stored in 95% ethanol, and DNA was extracted from mantle tissue samples using the Qiagen DNeasy® Blood and Tissue Kit (Qiagen #69506) according to the kit instructions. Cytology brush samples were stored in the lysis buffer provided with the Puregene Buccal Cell Core Kit B (Qiagen), and DNA was extracted following the kit instructions. Extracted DNA was quantified using a Nanodrop ND1000 spectrophotometer and stored at 4°C.

| Microsatellites
Sixteen species-specific polymorphic microsatellite loci were used to genotype samples. These markers were developed by Genetic Identification Services, Chatsworth, CA (Appendix S1). Polymerase chain reactions (PCR) were performed using 10 µL reactions (~2 ng of genomic DNA was used in each reaction). The standard M13 protocol (Schuelke, 2000) was used with the fluorescent dye labeled with HEX (Applied Biosystems). Reagents for a 10 µl reaction included: cycles; 72°C/4 min). Negative controls were performed with every reaction to detect potential contamination. PCR products were visualized on 1% agarose gels to confirm that the reaction was successful and that the negative control showed no contamination. PCR products were then sent to the Iowa State DNA Facility where they underwent capillary electrophoresis to determine allele sizes. The raw data were then scored using GeneMarker® v1.85 (Hulce et al., 2011) and converted into the desired software input format using base RStudio (RStudio Team, 2016).
TABLE 1 Numbers of Sheepnose mussels sampled from seven study sites for microsatellite and mitochondrial genotyping

| Mitochondrial sequences
Mitochondrial DNA sequences of an 883 base pair fragment of the first subunit of the NADH dehydrogenase gene (ND1) were also generated for samples. DNA sequence data for the ND1 gene were gen- primers (Serb et al., 2003). For a 25 µl reaction: 1 µl LEU UURF primer, 1 µl LoGlyR primer, 9.5 µl H2O, 12.5 µl MyTaq polymerase (Bioline), and were then analyzed and edited using GENEIOUS v8.1.6 (Kearse et al., 2012). Results were also converted into amino acids to confirm that sequences were aligned properly before exporting the matrix for further analyses. DNA sequence data have been submitted to GenBank (Accession Number: MH853483).
Adjusted FST (G'ST) (Hedrick, 2005) and Jost's D (Jost, 2008) values were calculated using the package DEMEtics (Gerlach et al., 2010) in RStudio in order to account for a potential depression of the standard FST measure due to high allelic diversity (many loci had between 15 and 45 alleles). A permutation test was performed on the FST values using base RStudio to determine whether the degree of genetic differentiation differed among sampling locations located within versus between drainage basins. The test used 100,000 permutations, and p-values were assessed at a .05 significance level.
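To illustrate the logic of a within- versus between-basin permutation test, a minimal Python sketch follows. It is not the authors' R code. The site names, basin labels, and pairwise values are hypothetical placeholders, and both the test statistic (mean between-basin value minus mean within-basin value) and the scheme of shuffling basin labels among sites are assumptions, because the text does not specify them.

```python
import itertools
import random
from statistics import mean


def basin_permutation_test(fst, basin, n_perm=10000, seed=1):
    """One-sided permutation test on pairwise differentiation values.

    fst   : dict mapping frozenset({site_a, site_b}) -> pairwise value
    basin : dict mapping site -> basin label
    Returns (observed statistic, permutation p-value).
    """
    sites = sorted(basin)
    pairs = list(itertools.combinations(sites, 2))

    def statistic(labels):
        # Assumed statistic: mean between-basin minus mean within-basin value.
        within = [fst[frozenset(p)] for p in pairs if labels[p[0]] == labels[p[1]]]
        between = [fst[frozenset(p)] for p in pairs if labels[p[0]] != labels[p[1]]]
        return mean(between) - mean(within)

    observed = statistic(basin)
    rng = random.Random(seed)
    labels = [basin[s] for s in sites]
    hits = 0
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)  # reassign basin labels to sites at random
        if statistic(dict(zip(sites, shuffled))) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical example: six sites in two basins (not the study's data).
basin = {"S1": "X", "S2": "X", "S3": "X", "S4": "Y", "S5": "Y", "S6": "Y"}
fst = {
    frozenset(p): (0.02 if basin[p[0]] == basin[p[1]] else 0.10)
    for p in itertools.combinations(sorted(basin), 2)
}
observed, p_value = basin_permutation_test(fst, basin)
```

With only a handful of sites the number of distinct label assignments is small, so the smallest attainable p-value is limited by the design rather than by the number of permutations.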

Population structure
Clustering of Sheepnose collection sites into distinct genetic groups was conducted using the program STRUCTURE v2.3.4 (Pritchard et al., 2000). STRUCTURE analyses consisted of a burn-in of 100,000 Markov chain Monte Carlo (MCMC) iterations followed by 1,000,000 iterations using the admixture model and correlated allele frequencies. Each run had 1-8 possible K values (n collection sites +1) and 10 replicates of each run. After an initial analysis detected two populations (K = 2), subsequent analyses to detect substructure were also conducted. STRUCTURE runs were conducted on the two populations identified in the initial run using the same procedures as previously, but with possible K values of 1-4 and 1-5. The web application POPHELPER (Francis, 2016) was used to determine the most probable value of K utilizing the Evanno method (Evanno et al., 2005) to determine the second-order rate of change in the distribution of L(K). POPHELPER was also used to merge the 10 replicates together and to graphically display results. An analysis of molecular variance (AMOVA) with 999 permutations was conducted using GenAlEx to further examine Sheepnose population structure.
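The Evanno calculation referred to above can be sketched in a few lines of Python. This is an illustration only, not the POPHELPER implementation: the log-likelihood values are hypothetical, and the second-order difference is taken on the replicate means of L(K), which is one common way of computing the statistic and is an assumption here.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp):
    """Compute delta K from replicate log-likelihoods.

    lnp : dict mapping K -> list of replicate L(K) values
    Returns dict mapping K -> delta K for every K that has both neighbours.
    """
    avg = {k: mean(v) for k, v in lnp.items()}
    delta = {}
    for k in sorted(lnp):
        # Delta K is undefined for the smallest and largest K tested.
        if k - 1 not in avg or k + 1 not in avg:
            continue
        second_order = abs(avg[k + 1] - 2 * avg[k] + avg[k - 1])
        sd = stdev(lnp[k])
        if sd > 0:
            delta[k] = second_order / sd
    return delta


# Hypothetical replicate values for K = 1..4 (not the study's output).
example = {
    1: [-5000.0, -5002.0, -4998.0],
    2: [-4500.0, -4501.0, -4499.0],
    3: [-4480.0, -4490.0, -4470.0],
    4: [-4475.0, -4495.0, -4455.0],
}
delta_k = evanno_delta_k(example)
best_k = max(delta_k, key=delta_k.get)
```

Because the statistic needs both K - 1 and K + 1, it cannot be evaluated at either end of the tested range, which is why the range of K is set wider than the number of clusters expected.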

Estimation of migration
The program BAYESASS (Wilson & Rannala, 2003) was used to estimate asymmetrical migration rates. Rates were estimated among collection sites because of the high degree of genetic differentiation observed between sites based on the FST values. BAYESASS estimates gene flow among sites as a migration rate (m), which can be interpreted as the fraction of migrants per generation in one population that is derived from a source population. These estimates are calculated using a Bayesian approach and MCMC sampling to generate values for m over the last few (<5) generations (Wilson & Rannala, 2003). Given a generation time of approximately 5 years for the Sheepnose (Hove et al., 2015), BAYESASS estimated m values over the past ~25 years. Run lengths and parameters were optimized to ensure convergence, and delta parameters were adjusted to achieve acceptance rates of 40-60%. BAYESASS appeared to reach convergence after five runs, each with a different initial seed, and a Bayesian deviance metric (Spiegelhalter, 2002) was used to select the run that best fit the dataset. TRACER v1.7 (Rambaut et al., 2018) was also used to visualize mixing, suitable burn-in values, and convergence problems. The final run used the parameters from the best selected run with a run length of 5 × 10^7 iterations, sampling every 100 iterations. The burn-in period consisted of 2 × 10^7 iterations.
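The arithmetic behind these run settings can be made explicit with a short Python sketch. The function names are hypothetical, and the calculation assumes that the burn-in is counted within the total run length, which the text does not state.

```python
def retained_samples(total_iterations, burn_in, interval):
    """Number of MCMC samples kept after discarding the burn-in.

    Assumes the burn-in iterations are part of total_iterations.
    """
    if burn_in >= total_iterations:
        raise ValueError("burn-in must be shorter than the run length")
    return (total_iterations - burn_in) // interval


def time_window_years(generations, generation_time_years):
    """Approximate time span covered by a given number of generations."""
    return generations * generation_time_years


# Settings reported in the text: 5 x 10^7 iterations, 2 x 10^7 burn-in,
# sampling every 100 iterations; ~5 generations of ~5 years each.
n_samples = retained_samples(5 * 10**7, 2 * 10**7, 100)
window = time_window_years(5, 5)
```

Under that assumption the posterior for each migration rate rests on 300,000 retained samples, and the estimates describe roughly the last 25 years.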

Changes in population size
Considering apparent declines in the number of Sheepnose populations and the potential genetic bottlenecks associated with habitat fragmentation and isolation (Andersen et al., 2004), a test for genetic bottlenecks at each site was conducted using BOTTLENECK (Piry et al., 1999) (Cornuet & Luikart, 1996; Luikart et al., 1998). This analysis was conducted with 10,000 replications under the stepwise mutation model (SMM) and the two-phase model (TPM), the latter including 95% single-step mutations and 5% multi-step mutations with a variance of 12, as recommended by Piry et al. (1999). All collection sites were tested separately for a bottleneck, and the p-values estimated by the Wilcoxon signed-rank test were assessed at a 0.05 significance level. (Leigh & Bryant, 2015) was used to create a minimum spanning network (Bandelt et al., 1999) of all haplotypes.

Population expansion
A mismatch distribution analysis using ARLEQUIN was performed on all sampling sites. This analysis estimates pairwise differences among all the sequences. A population that has not changed in effective size over a long period of time will display a ragged distribution of pairwise distances, whereas a population that has been growing generates distributions that are smoother (Harpending, 1994). A raggedness index is estimated based on this distribution and can be assessed to interpret whether population expansion has occurred (Harpending et al., 1993). Estimates of demographic expansion were generated over 1,000 bootstrap replicates. p-values for the raggedness index and sum of square deviations (SSD) were assessed at a 0.05 significance level.
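As an illustration of the quantities involved, the following Python sketch computes a mismatch distribution and a raggedness index from aligned sequences. It is not ARLEQUIN's implementation and omits the bootstrap and the SSD test. The toy sequences are hypothetical, and the formula used, r = Σ (x_i − x_{i−1})² summed over i = 1 to d + 1 with x_{d+1} = 0, is our reading of the index as commonly stated, where x_i is the relative frequency of pairs differing at i sites and d is the largest observed difference.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(seqs):
    """Relative frequency of each number of pairwise differences (0..d)."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter(
        sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)
    )
    n_pairs = sum(counts.values())
    d_max = max(counts)
    return [counts.get(i, 0) / n_pairs for i in range(d_max + 1)]


def raggedness_index(freqs):
    """Sum of squared differences between adjacent mismatch frequencies."""
    padded = list(freqs) + [0.0]  # x_{d+1} = 0 closes the distribution
    return sum((padded[i] - padded[i - 1]) ** 2 for i in range(1, len(padded)))


# Hypothetical aligned sequences (not the study's ND1 data).
toy = ["AAAA", "AAAT", "AATT", "ATTT"]
freqs = mismatch_distribution(toy)
r = raggedness_index(freqs)
```

Smaller values of r correspond to smoother distributions, the pattern the text associates with a growing population.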

| Genetic diversity
Two of the microsatellite loci (C109A and C115) amplified poorly and were subsequently dropped from the analysis. Only one locus (D113) exhibited an excess of homozygosity and potential null alleles at multiple sampling locations, and it was also dropped from analysis. Microsatellite analysis results described below are from the remaining 13 microsatellite loci (Appendix S1). MICROCHECKER analysis revealed that 5 loci were out of Hardy-Weinberg proportions at 1-2 sampling sites. There were 295 alleles across the 13 loci with 11 to 47 alleles per locus (Appendix S1). Observed heterozygosity was high (>0.70) at all sites (

| Population structure
The STRUCTURE analysis including all sampling locations indicated that the most likely value of K was 2 (Evanno et al., 2005) (Figure 2a). An AMOVA also indicated significant genetic differentiation between the Mississippi and Ohio river basins (p ≤ .0001). Although sampling sites clustered by major river basin, with few exceptions each sampling location showed a high degree of differentiation from other sites (Table 2). Because the sample sites exhibited substantial genetic differentiation, and populations may be structured hierarchically, with coarser structure obscuring more fine-scale structure, additional STRUCTURE analyses were conducted separately on the two initial clusters. STRUCTURE analysis of the Upper Mississippi River Basin indicated a K value of 3 (Figure 2b), with the MER and CHIP sites each forming their own cluster and the WIS and MISS sites clustering together.
Analysis of the Ohio River Basin indicated a K value of 2 with the ALL site clustered separately from the TIPP and TN sites (Figure 2c).

| Estimation of migration
Contemporary migration rates estimated by BAYESASS indicated that migration between most sample sites was low. Only five of the 42 pairwise comparisons exhibited rates greater than 0.1 (Table 3).
The permutation test indicated that there was significantly (p = .008) more migration occurring among sampling sites within the same drainage basin (Mississippi and Ohio basins) than among sites in different drainage basins. All instances of migration rates >0.1 were asymmetrical, with migrants moving in only one direction. Three of these were in the Upper Mississippi River Basin and two were in the Ohio River Basin (Table 3, Figure 3).

| Changes in population size
No significant values indicating bottlenecks were obtained based on the Wilcoxon signed-rank test. All sampling sites also exhibited an L-shaped distribution, characterized by a high proportion of low-frequency alleles and a smaller proportion of alleles of intermediate frequencies, indicating that no recent bottlenecks had occurred.

| Sequence diversity
Analyses were performed on 157 aligned DNA sequences of 883 nucleotide base pairs that had no missing data or ambiguous sites.
Thirty-nine mtDNA haplotypes were detected across all sample sites. The number of haplotypes ranged from 5 to 12 per sampling site ( only three haplotypes were shared across most sampling sites with a star-like pattern stemming from these common haplotypes consistent with population expansion (Slatkin & Hudson, 1991) (Figure 4).

| Population expansion
Evidence for a population expansion was found for all sites except WIS. The sum of square deviations and raggedness index indicated that these sites did not significantly differ from the population expansion model (p ≥ .05) (Table 4). The detected population expansion is also indicated by the star-like pattern seen in the minimum spanning network (Slatkin & Hudson, 1991).

| DISCUSSION
This study found that despite the lack of gene flow and the isolation of populations, there is still a high degree of genetic diversity among sampling sites at both the microsatellite and mitochondrial loci. Similar levels of genetic diversity have been found in other studies of both rare and common freshwater mussel species (Elderkin et al., 2007; Geist & Kuehn, 2005; Inoue et al., 2014; King et al., 1999; Zanatta & Murphy, 2007), possibly indicating that some freshwater mussels are able to maintain genetic diversity despite isolation. Isolated Sheepnose populations may be large enough to maintain a high level of genetic diversity and buffer populations against the erosive effects of genetic drift (Elderkin et al., 2007; Lande & Barrowclough, 1987). Additionally, Sheepnose are estimated to have lifespans of up to 30 years (Hove et al., 2015), and such longevity can also buffer populations from the loss of genetic diversity due to drift (Hoffman et al., 2017). If this is true, efforts and resources aimed at conservation strategies such as propagation and translocations would be better directed toward regaining habitat suitability and connectivity (Olson & Vaughn, 2020).

TABLE 3 Asymmetrical pairwise contemporary migration rates and associated 95% confidence intervals generated by BAYESASS
Jost's D (Tables 2 and 5). It is tempting to attribute the genetic differentiation detected between Sheepnose demes to habitat degradation that has fragmented the species' range and isolated these groups from each other. Dams are considered highly detrimental to unionoid populations because they disrupt dispersal of host fishes (Watters, 1995) and feeding ability (Bates, 1962; Negus, 1966) and alter stream flow and depth (Salmon & Green, 1983). The United States has about 75,000 dams and almost half (~30,000) can be found in the Mississippi River system (Graf, 1999). The direct impacts of dams on mussel health and their indirect impacts on dispersal may be contributing factors to the isolation between Sheepnose demes detected at the contemporary timescale. The construction of dams, river channelization, increased pollution, and invasive species are collectively contributing to the loss of habitat and changes in the distribution of freshwater mussels (Williams et al., 1989). However, the contemporary fragmentation of populations and demes due to anthropogenic barriers and habitat loss is not likely what is being detected in the analysis of the data for the Sheepnose. For long-lived species with long generation times and large Ne, like most unionids, it would most likely take centuries for any genetic signature of these anthropogenic issues to be detected using microsatellites (Haag, 2012; Hoffman et al., 2017). Instead, we propose that the differentiation observed is the result of changes in climate and on the landscape that occurred during and subsequent to the Pleistocene other studies of freshwater mussels (Elderkin et al., 2007; Hewitt, 1996; Inoue et al., 2014; Jones et al., 2015; Tomilova et al., 2020).
Together these studies indicate that for many freshwater mussels, population genetic structure is more reflective of long-term factors related to changes since the Pleistocene than of recent anthropogenic causes.
The amount of migration estimated between sampling locations was low overall, and very low rates were estimated between the Ohio and Mississippi river basins (Table 3). In the Mississippi River Basin, the migration appears to be unidirectional, from tributaries to the Mississippi River, which could be due to several factors including dispersal via fishes (glochidia), sperm-mediated gene flow, or downstream displacement during floods. In the Ohio River Basin, the Allegheny River appears to be receiving migrants from the Tennessee and Tippecanoe rivers. This result is harder to interpret, as it seems unlikely that host fishes carrying glochidia larvae could effectively travel between these sites. These results in the Ohio basin may instead be due to the absence of samples from the mainstem Ohio River. Host fish vagility may also be an important factor in the population structure observed in the Sheepnose. Previous studies on freshwater mussels have invoked host fishes as contributing to or reinforcing the population structure (Chong & Roe, 2018; Zanatta & Wilson, 2011). The most recent information indicates that hosts of the Sheepnose include cyprinids (Hove et al., 2015), which may lack the dispersal capabilities of larger riverine fishes (Comte & Olden, 2018) and therefore contribute to the isolation between populations and demes.
Despite the detected isolation, high levels of diversity were still observed within demes. With the exception of the WIS deme, the analysis of mitochondrial data revealed a pattern consistent with expanding populations. Such a pattern may be a result of responses of the Sheepnose to climatic oscillations during the Pleistocene (Alberdi et al., 2015). North American freshwater systems were heavily impacted by the expansion and contraction of Pleistocene glaciers, and the population structure of freshwater organisms often reflects these events (Berendzen et al., 2010; Inoue & Berg, 2017; Inoue et al., 2014; Jones et al., 2015; Mathias et al., 2016; Pielou, 2008) (Hewitt, 1996, 2000; Inoue et al., 2014; Pielou, 2008; Stewart & Lister, 2001).
Our results indicate that the contemporary pattern of low gene flow and isolation occurring among populations is the result of prehistoric changes to the landscape that have eliminated populations and introduced barriers to gene flow. Although anthropogenic influences may be too recent to explain the observed patterns, they may be reinforcing existing genetic differences. It appears that the long lifespan of the Sheepnose may be delaying the reduction in genetic diversity typically associated with isolation. However, if isolation persists, it is possible that genetic diversity in these demes will start to erode. Our results suggest that efforts should be made to reestablish gene flow among demes to support the maintenance of genetic diversity. For example, Sheepnose could be re-established within basins to facilitate connectivity between demes and within populations. The isolation of mussels into demes makes them more sensitive to stochastic events (Fagan et al., 2005); however, managers should be cautious about disrupting any local adaptations that may have been acquired by these demes (Fitzpatrick et al., 2015; Lean et al., 2017). Propagation and re-introduction operations for reestablishing Sheepnose within their historical ranges (Geist, 2010; Jones et al., 2006; Minckley, 1995) should aim to avoid disrupting localized adaptations; therefore, translocations between different populations are not recommended (Fitzpatrick et al., 2015; Lean et al., 2017).
Overall, the Sheepnose appear to have maintained a surpris- degree at Iowa State University. We also thank John Nason and Julie Blanchong for direction and advice during data analysis and for their input on this project and the anonymous reviewers for their helpful comments.

CONFLICT OF INTEREST
The authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest, or nonfinancial interest in the subject matter or materials discussed in this manuscript.

OPEN RESEARCH BADGES
This article has earned an Open Data Badge for making publicly available the digitally-shareable data necessary to reproduce the reported results. The data is available at https://doi.org/10.5061/dryad.gxd2547mm.

DATA AVAILABILITY STATEMENT
Mitochondrial DNA sequence data have been submitted to GenBank