Defining relictual biodiversity: Conservation units in speckled dace (Leuciscidae: Rhinichthys osculus) of the Greater Death Valley ecosystem

Abstract The tips in the tree of life serve as foci for conservation and management, yet clear delimitations are masked by inherent variance at the species–population interface. Analyses using thousands of nuclear loci can potentially sort inconsistencies, yet standard categories applied to this parsing are themselves potentially conflicting and/or subjective [e.g., DPS (distinct population segments); DUs (Diagnosable Units‐Canada); MUs (management units); SSP (subspecies); ESUs (Evolutionarily Significant Units); and UIEUs (uniquely identified evolutionary units)]. One potential solution for consistent categorization is to create a comparative framework by accumulating statistical results from independent studies and evaluating congruence among data sets. Our study illustrates this approach in speckled dace (Leuciscidae: Rhinichthys osculus) endemic to two basins (Owens and Amargosa) in the Death Valley ecosystem. These fish persist in the Mojave Desert as isolated Plio‐Pleistocene relicts and are of conservation concern, but lack formal taxonomic descriptions/designations. Double digest RAD (ddRAD) methods identified 14,355 SNP loci across 10 populations (N = 140). Species delimitation analyses [multispecies coalescent (MSC) and unsupervised machine learning (UML)] delineated four putative ESUs. F ST outlier loci (N = 106) were juxtaposed to uncover the potential for localized adaptations. We detected one hybrid population that resulted from upstream reconnection of habitat following contemporary pluvial periods, whereas remaining populations represent relics of ancient tectonism within geographically isolated springs and groundwater‐fed streams. Our study offers three salient conclusions: a blueprint for a multifaceted delimitation of conservation units; a proposed mechanism by which criteria for intraspecific biodiversity can be potentially standardized; and a strong argument for the proactive management of critically endangered Death Valley ecosystem fishes.


| INTRODUC TI ON
Species represent the currency of biodiversity, and as such are focal points for conservation and management. Yet, they often lack those distinct demarcations expected of their categorization and represent instead mere waypoints along evolutionary pathways (Sullivan et al., 2014). Our capacity to discriminate is further confounded by an observed variance in intraspecific diversity that also contributes to the difficulties in unambiguously defining taxonomic units. Furthermore, a multiplicity of species concepts (Zachos, 2018) not only adds to, but also underscores this difficulty. "What precisely is a species?" and "How can it be delineated?" are questions with both philosophical and practical ramifications (de Queiroz, 2007). As a result, the lack of resolution promotes confusion among managers tasked with deciding what should be conserved, and how best to accomplish the task (Douglas, Douglas, Schuett, Porras, & Thomason, 2007;Holycross & Douglas, 2007). It also represents what we now define as the "species problem" (Freudenstein, Broe, Folk, & Sinn, 2017;Garnett & Christidis, 2017). Taxonomic ranks above the species level are commonly agreed upon and represent human constructs arbitrarily delimited by systematists (Coyne & Orr, 2004), with entities shuffled and re-shuffled according to morphological and molecular perspectives (and the limitations thereof). Taxa are assigned to these ranks based upon the knowledge and opinion of practitioners (see, e.g., Yang et al., 2015).
Phylogenomic data have been especially useful in this regard by refining classifications at familial and ordinal levels for cases where taxonomic assignments have been contentious or challenging to resolve (Seago, Giorgi, Li, & Ślipiński, 2011;Xi et al., 2012).
Surprisingly, this arbitrary delimitation of diversity has been widely adopted for categorizations below the species rank. In some cases, labile physical characteristics such as coloration have been utilized to diagnose categorical boundaries (Burbrink, Lawson, & Slowinski, 2000;Sullivan et al., 2014), which fail to be validated by genetic data (Phillimore & Owens, 2006;Zink, 2004). Genome-scale data and novel analytical methods have been useful in resolving these issues; however, the application of these methods to conservation unit delineation remains challenging without a blueprint for this purpose (Stanton et al., 2019). Furthermore, these issues represent an extension of the species problem, where terminology has not only become confusing but also conflicting. We address these issues below by outlining an approach to conservation unit delineation using modern analytical techniques for genome-scale data.

| A focus on intraspecific categorization
Intraspecific lineages often stem from a single ancestral source, potentially one population or a cluster thereof. Their components are relatively contiguous geographically, albeit with temporal dynamics, such as demographic expansions, contractions, and even potential stasis within refugia as repercussions of climatic fluctuation (Levin, 2019). They have been variously labeled, from more traditional frameworks such as subspecies or ecological races (Braby, Eastwood, & Murray, 2012), through more contemporary concepts including evolutionarily significant units (ESUs), management units (MUs) (Coates, Byrne, & Moritz, 2018), and uniquely identified evolutionary units (UIEUs) (Trotter et al., 2018) to those promoted by government regulations, such as distinct population segments (DPS: USFWS & NMFS, 1996) or diagnosable units (DUs: COSEWIC, 2012). The focus for all these categorizations remains unitary: the recognition and conservation of intraspecific diversity.
The emergence of molecular techniques to quantify "genetic diversity" was initially heralded as a potential solution for the demarcation of biodiversity units, but traditional methods provided scant applicability with regard to identifying intraspecific boundaries, and thus served to extend classificatory confusion (Phillimore & Owens, 2006). The advent of next-generation sequencing (NGS) techniques has markedly improved our ability to evaluate questions surrounding intraspecific variation, although they too present a double-edged sword: Discrete population-level patterns can indeed be identified, but an overinterpretation of these patterns is a management issue in that populations are often inappropriately categorized. These fears are borne out as well through complex analytical techniques commonly used to delimit species (Campillo, Barley, & Thomson, 2020;Sukumaran & Knowles, 2017). In addition, sophisticated analytical methods are now emerging, such as machine learning, that employ pattern recognition as a mechanism to identify biodiversity, and which allow systems to automatically learn and improve from experience without being explicitly programmed for a specific application.
We outline in this manuscript a multifaceted approach for the delimitation of conservation units and intraspecific diversities. We first: (a) define the questions to be addressed, (b) detail different analytical approaches that quantify inter-and intraspecific diversity using NGS data, (c) place detected patterns within a spatial and temporal framework so as to understand underlying evolutionary and ecological drivers, and (d) combine our data with those available from prior publications and gray literature reports to evaluate diversity within the context of adaptive potential and ecological boundaries (Cornetti, Ficetola, Hoban, & Vernesi, 2015;Stanton et al., 2019). We offer this multifaceted approach (Figure 1) as a potential blueprint from which to delimit biologically distinct entities and as a mechanism to classify intraspecific biodiversity via standardized criteria for recognition as conservation units.
To illustrate this approach, we offer a case study using an endemic group of desert fishes whose conservation is clearly hampered by taxonomic ambiguity. The advantages of our study are its various levels of complexity, both intra-and interspecific, manifested within a simple spatial design (i.e., populations within isolated habitats), within a region well understood with regard to biogeography and paleohydrology. We use our framework to develop a strong argument for the proactive management of critically endangered fishes in one of the world's most unique environments, the Death Valley ecosystem of arid southwestern North America.
Our study species is a small leuciscid (speckled dace: Rhinichthys osculus) from the Death Valley ecosystem of southwestern Nevada and eastern California (Appendix S1; Table 1; Figure 2).
While both basins are endorheic (with no connection to other basins or oceans), their hydrology differs. The high desert of the Owens Basin exists in the rain shadow of the eastern Sierra Nevada and accumulates surface water through snowmelt runoff from the surrounding mountains. However, available habitat for speckled dace has been systematically depleted by aqueducts constructed in 1913 and 1970 to supply the metropolis of Los Angeles. This has forced considerable reliance upon local groundwater resources (Danskin, 1998) which poses an existential threat to speckled dace persistence as aquifers are depleted. In contrast, the Amargosa Basin has been historically arid (Belcher, Sweetkind, Hopkins, & Poff, 2019), with intermittent flows in the Amargosa River a residual of rare, large-scale precipitation (Grasso, 1996). Here, speckled F I G U R E 1 Conceptual workflow to delineate conservation units at tips in the Tree of Life. A standardized mechanism that applies molecular data to categorize inter-and intraspecific diversity should address three Questions about patterns in genetic data that must be integrated with other available data (Concordance). Metrics depict genetic diversity and are quantified singly or in combination via Analyses to derive Inferences regarding how ecological and evolutionary processes have shaped these patterns. Genetic results and other public data (e.g., phenotypic, ecological, and biogeographical) are combined in a comparative framework so as to depict conservation units  (Robbins, 2017).
This study provides a comprehensive, population-level evaluation of speckled dace as a case study for conservation unit delineation in a system with confused intraspecific variability. We  with additional information on the taxonomic history of these OTUs as detailed in Appendix S1.

| Data collection and filtering
Whole genomic DNA was extracted using several methods: Gentra  (Eaton, 2014) was used for de novo assembly of ddRAD loci in pyrad v3.0.66.
Reads with > 4 low-quality bases (Phred quality score < 20) were removed. A minimum of 15 reads was required to call a locus for an individual. A filter was applied to remove putative paralogs using standard methods for their identification in ddRAD data (Eaton, 2014;McKinney, Waples, Seeb, & Seeb, 2017) by discarding loci with heterozygosity > 0.6 and those containing > 10 heterozygous sites. The resulting data were filtered (BcFtools; Li, 2011) as a means of retaining a single biallelic SNP from each locus, as present in at least 33% of individuals (hereafter referred to as "SNP-all").
The 33% cutoff was designed to minimize potential bias in missing data for ingroup samples, given the unbalanced basin sampling (e.g., Owens N = 50 vs. Amargosa N = 80). Our desire in this effort was to prevent the more numerous Amargosa samples from dictating which SNPs were recovered during alignment, genotyping, and filtering (Eaton, Spriggs, Park, & Donoghue, 2017;Huang & Knowles, 2016).

| Loci under selection
To identify putative local adaptations, all loci were subjected to F ST outlier analysis (i.e., F ST outliers indicate loci under selection).
Conditions included 100,000 total simulations, assuming a "neutral" forced mean F ST , 95% confidence interval (CI), and FDR of 0.1. Those SNPs deemed to be under positive selection by both Bayescan and lositan were extracted for downstream analysis (=SNP-select).

| Population structure analysis
A Maximum Likelihood approach (Alexander, Novembre, & Lange, 2009) was utilized to assess population structure in both datasets.

| Hybridization
Preliminary evaluations indicated high levels of admixture at one site, Amargosa Canyon, the most downstream site in the Amargosa River drainage (AMC; Figure 2). Three approaches were used to determine whether hybridization occurred among putative OTUs (Oasis Valley; Ash Meadows), and if so, its relative timing. Data from multiple sampling years (1993, 2004, and 2016) in Oasis Valley (AMA; backcross with Ash Meadows. The program was run for 1,000,000 generations of burn-in followed by 3,000,000 generations of data collection. Our second approach was to calculate a hybrid index (Buerkle, 2005) using the est.h function in the introgress R package (Gompert & Buerkle, 2010  sVdQuartets (Chifman & Kubatko, 2014, 2015, as implemented in PAUP* (Swofford, 2003), was used to construct a phylogeny using a multispecies coalescent approach. This method evaluates all combinations of four populations (quartets) at each locus, then calculates a single value decomposition (SVD) score (Golub & Van Loan, 1996) for each possible quartet tree. The topology with the lowest SVD score is selected as the true quartet topology. The full species tree is then assembled using a quartet assembly algorithm (Reaz, Bayzid, & Rahman, 2014). Exhaustive quartet sampling was conducted, and significance was assessed using 100 bootstrap replicates.  Rannala, & Yang, 2018), and parse populations rather than speciation events (Sukumaran & Knowles, 2017). This stems from the many assumptions implicit to MSC, such as random mating, neutral markers, a lack of postspeciation gene flow, and no within-locus recombination or linkage disequilibrium (Degnan & Rosenberg, 2009 (Mussmann, 2018). Thus, we do not suspect that the low sample size (N = 5 per lineage) negatively impacted the final results.

| Conservation unit delimitation
We estimated the prior value for the population mutation rate (Θ) in BFD* using the mean pairwise sequence divergence (7.99 × 10 -3 ) be- Bayescan recovered 210 loci under positive selection, whereas lositan found 632. We cross-referenced them to acquire a consensus of 106 loci (the "SNP-select" dataset) representing a subsample of the 14,355 locus "SNP-all" data.

| Population structure
Genetic diversity within the Owens and Amargosa basins was best represented by seven genetic clusters representing SNP-all

| Hybridization
A hybrid index was calculated for crosses between Ash Meadows (ASH) and the two Oasis Valley populations (AMA and RFO). Data represented fixed differences between the two populations [i.e., 61 SNPs (AMA × ASH: Figure 4a) and 83 SNPs (RFO × ASH: Figure 4b)].
The genomic composition of each Amargosa Canyon sample is an approximate 50/50 representation of each parent, with hybridization ranging from recent (F2) to historic based upon observed interspecific heterozygosity (i.e., interspecific heterozygosity < 0.5).
newHyBrids demonstrated a similar trend using the 390-locus dataset filtered in genepopedit (Figure 4c). All Amargosa Canyon samples were classified with high probability as F2 hybrids (p > .97).
Results of Hyde (Table 3)

| Phylogenetic analyses
pomo and sVdQuartets results converged on the same tree topology. While the five Amargosa Basin sites formed a clade, Long Valley speckled dace (LVD) forced a paraphyletic Owens Basin group ( Figure 5). Most nodes were well supported, with the greatest

| Bayes factor delimitation
The BFD* analysis ( Figure 6) decisively split the dataset into eight unique lineages, with seven distributed between the Owens and Amargosa basins (BF = 501.85). In this regard, BF ≥ 10 is considered very strong support (Kass & Raftery, 1995

| Unsupervised machine learning
The machine learning methods recovered a minimum of three

| D ISCUSS I ON
The need for a mechanism to categorize the tips in the tree of life resonates politically, ethically, and biologically (Garnett & Christidis, 2017). Yet, our capacity to do so has languished, not only with regard to the aforementioned species problem, but also with a multiplicity of conservation units at the intraspecific level.
The latter represent confusing, and frequently conflicting categorizations that have most often been parsed using phylogeographic analysis (Avise, 2009). However, a more fine-grained phylogenomic  PopulaƟons F I G U R E 7 Results of eight unsupervised machine learning (UML) algorithms compared against Bayes Factor Delimitation (BFD*) results for eight populations of speckled dace (Rhinichthys osculus). These include four random forest (RF) methods with mixtures of classical (c) and isotonic (iso) multidimensional scaling (MDS) as well as hierarchical and partition around medoids (PAM) clustering. The t-distributed stochastic neighbor embedding (t-SNE) algorithm was employed with two clustering methods. Finally, a discriminant analysis of principal components (DAPC) and variational autoencoder (VAE) were applied. All UML algorithms were applied to 130 individuals genotyped at 200 SNPs. The raw number of divisions (=Clusters) for several methods included groups of individuals that shared a high proportion of missing data (*). Other cases divided individuals from two localities among clusters that did not follow any pattern ( †). In one instance, Ash Meadows populations were subdivided according to springs ( ‡). All three scenarios were interpreted as "oversplitting" and ignored in the final interpretation (  PopulaƟons perspective is now possible, given the current capacity to generate voluminous data sets encompassing thousands of nuclear loci, and this in turn has subsequently shifted focus and intent (Freudenstein et al., 2017). However, the abilities of researchers to interpret results of phylogenomic studies into discrete conservation units have not kept pace, much to the detriment of adaptive management. A framework is needed to consistently ask if apparent intraspecific boundaries represent factual guides for conservation units, and if so, can they indeed be diagnostically defined (Rannala, 2015)?
One potential mechanism to interpret results of molecular studies that evaluate intraspecific diversity, and to translate them into management solutions would be to place them within a comparative framework with other available data. Here, the difficulty is to (a) know that such data exists, and (b) locate and access it for integrated analyses. In the present situation, this would provide the platform from which results of individual studies could be compiled in a consistent format, compared, and potentially categorized. A recent example is a platform that accommodates a trait-based approach to classification by delineating taxa from which ancestral morphologies and their functions are reconstructed (Gallagher et al., 2020). A similar platform could be built so as to accommodate molecular studies of intraspecific diversity in an attempt to seek consensus among results. As a template, it could mirror a recent study (Wieringa et al., 2019) that gathered results of papers published on historical phylogeography in Southeastern North America (N = 57), as a mechanism by which new questions relating to intraspecific genetic diversity could be addressed.

| The current study
The genetic structure of speckled dace diagnosed through our analyses reveals their evolutionary history within the Death Valley ecosystem. Their distribution is intimately tied to the prehistoric lakes and rivers of the region, with diversifications occurring within modern drainages. This pattern clearly reflects a relictual biodiversity with high endemicity, with patterns driven by Plio-Pleistocene tectonism and hydrology (i.e., dispersal of speckled dace from Owens Valley to the Amargosa Basin during fluvial events: Jayko et al., 2008;Knott et al., 2008). However, the current taxonomy is incompatible with these results, and must consequently be adjudicated prior to the delineation of conservation units for management purposes.

| Morphological and ecological support for conservation units
Next, we applied a comparative framework to solidify support for our conservation units. While speckled dace lineages were statistically significant on a genetic basis, they should also be validated morphologically, ecologically, and with additional life history data.
This process acts to confirm them as distinct biological entities warranting conservation unit status. A morphological overview of putative subspecies in the Death Valley ecosystem focused on broad, regional trends, but it also found "... highly significant differences among all populations for all meristic and mensural characters" (Sada et al., 1995). Ordination also revealed two qualitatively unique body shapes that are typical responses to hydrological conditions (Brinsmead & Fox, 2005;Collin & Fumagalli, 2011). Meristic counts are another type of phenotypic data often applied to diagnose species, but their specificity must be carefully interpreted. Ranges frequently overlap among subspecific divisions, and data for other taxa are either lacking or conflated (Moyle et al., 2015). This is especially true for Oasis Valley speckled dace, initially lumped with Ash Meadows speckled dace (Gilbert, 1893;La Rivers, 1962), but with morphological details for subspecific status lacking (Deacon & Williams, 1984;Williams et al., 1982).
Likewise, few details are available for Amargosa Canyon speckled dace (Scoppettone, Hereford, Rissler, Johnson, & Salgado, 2011), other than a series of qualitative descriptors when compared with other speckled dace subspecies (i.e., smaller head depth, shorter snout-to-nostril length, greater length between anal and caudal fins, greater numbers of pectoral rays, and fewer vertebrae: Moyle et al., 2015). A diagnosis for Ash Meadows speckled dace is qualitative as well (i.e., incomplete lateral line, relatively large head, small eye, short and deep body, and dark stripe along entire length: Gilbert, 1893).
These taxa meet all three criteria for ESU designation: geographic isolation, genetic differentiation, and local adaptation (as detected via F ST outlier SNPs).
Endemicity resulting from habitat isolation provides a unique challenge for conservation. While isolated habitats are often prone to human disturbance, they are readily identified and monitored as well (Arthington, Dulvy, Gladstone, & Winfield, 2016). Springs, for example, exhibit high levels of genetic structure across small geographic areas (Echelle et al., 2015). They frequently reflect low within-system species richness (α-diversity) but high diversity when compared to other systems (β-diversity) (Gibert et al., 2009). In other words, greater diversity exists between groundwater-dependent systems rather than within (Gibert & Deharveng, 2002). Thus, groundwater depletion invokes dire consequences for aquatic fauna, ranging from the depletion of faunal assemblages (Perkin et al., 2017) to complete extinction (Miller, Williams, & Williams, 1989). The latter scenario provides an elevated risk for these populations since they lack redundancy to protect from catastrophic loss.
Our validation of anecdotal subspecies has a conservation imperative, in that it underscores the necessity of a management strategy that sustains and protects these unique entities. Practical considerations require an adherence to current conservation policy, and this remains species-centric . While it has benefits (Runge et al., 2019), ecosystem-level factors are also necessary conservation aspects (Franklin, 1993). They are particularly important in regions containing multiple narrowly endemic species, as those rare and sympatric can suffer unintentional consequences through a strict species-centric approach (Casazza et al., 2016 Hybridization has been commonplace among desert fishes (Bangs, Douglas, Mussmann, & Douglas, 2018;Dowling & DeMarais, 1993) and has served as a mechanism of speciation (Gerber, Tibbets, & Dowling, 2001), but can also erode species boundaries (Chafin et al., 2019). Additionally, anthropogenic climate change has induced hybridization among divergent species (Canestrelli et al., 2017;Muhlfeld et al., 2014), and thus represents a post-Pleistocene evolutionary mechanism inherent to western North America (Woodhouse, Meko, MacDonald, Stahle, & Cook, 2010).
Hybridization is a contentious conservation topic (Fitzpatrick, Ryan, Johnson, Corush, & Carter, 2015), particularly when one pa-  (Allendorf et al., 2001). The situation parallels that of the red wolf (Canis rufus), deemed a hybrid between endangered gray wolf (Canis lupus) and coyote (Canis latrans) (Hohenlohe et al., 2017;vonHoldt et al., 2016), although recent genomic analyses revealed it to be a distinct species of its own with a convoluted history of introgression . Given this precedence, Amargosa Canyon speckled dace could therefore be listed as a DPS under the ESA (Waples, Kays, Fredrickson, Pacifici, & Mills, 2018).

| CON CLUS IONS
Distinct speckled dace lineages within the Death Valley ecosystem represent Plio-Pleistocene hydrological connections among basins.
They are narrowly endemic and relictual components of more pluvial Plio-Pleistocene ecosystems that now persist as small pockets within desert oases. The majority represent entities previously identified anecdotally, yet without formal description. Despite their academic recognition, legal protection is absent and their existence remains manifestly precarious, save for one described entity (R. o. nevadensis).
Our results demonstrate the necessity of using multiple approaches in a comparative framework to diagnose conservation units (Figure 1). as distinct ESUs, and argue that Amargosa Canyon speckled dace is eligible for protection as a DPS under existing environmental laws.

ACK N OWLED G M ENTS
This research represents partial fulfillment of the Ph.D. degree (SMM) in writing -original draft (supporting); writing -review and editing (equal).

DATA AVA I L A B I L I T Y S TAT E M E N T
Raw fastq files for each individual have been uploaded to the NCBI Sequence Read Archive (BioProject ID PRJNA598959) and will be released to the public upon publication. All alignments used in this study are available on the Dryad Digital Repository (https://doi. org/10.5061/dryad.51c59 zw62).