Defining Relictual Biodiversity: Conservation Units in Speckled Dace (Cyprinidae: Rhinichthys osculus) of the Greater Death Valley Ecosystem

The tips in the tree of life serve as foci for conservation and management, yet clear delimitations are masked by inherent variance at the species-population interface. Analyses using thousands of nuclear loci can potentially sort inconsistencies, yet standard categories applied to this parsing are themselves potentially conflicting and/or subjective [e.g., DPS (distinct population segments); DUs (Diagnosable Units-Canada); MUs (management units); SSP (subspecies); Evolutionarily Significant Units (ESUs)]. One potential solution for consistent categorization is to create a comparative framework by accumulating statistical results from independent studies and evaluating congruence among data sets. Our study illustrates this approach in speckled dace (Cyprinidae: Rhinichthys osculus) endemic to two basins (Owens and Amargosa) in the Death Valley ecosystem (DVE). These fish persist in the Mojave Desert as isolated Pleistocene-relicts and are of conservation concern, but lack formal taxonomic descriptions/designations. Double-digest RAD (ddRAD) methods identified 14,355 SNP loci across 10 populations (N=140). Species delimitation analyses [multispecies coalescent (MSC) and unsupervised machine learning (UML)] delineated four putative ESUs. FST outlier loci (N=106) were juxtaposed to uncover the potential for localized adaptations. We detected one hybrid population that resulted from upstream reconnection of habitat following contemporary pluvial periods, whereas remaining populations represent relics of ancient tectonism within geographically-isolated springs and groundwater-fed streams. Our study offers three salient conclusions: A blueprint for a multi-faceted delimitation of conservation units; a proposed mechanism by which criteria for intraspecific biodiversity can be potentially standardized; and a strong argument for the proactive management of critically-endangered DVE fishes.


Abstract 22
The tips in the tree of life serve as foci for conservation and management, yet clear delimitations are masked by inherent variance at the species-population interface. Analyses using thousands of 24 nuclear loci can potentially sort inconsistencies, yet standard categories applied to this parsing are themselves potentially conflicting and/or subjective [e.g., DPS (distinct population 26 segments); DUs (Diagnosable Units-Canada); MUs (management units); SSP (subspecies); Evolutionarily Significant Units (ESUs)]. One potential solution for consistent categorization is 28 to create a comparative framework by accumulating statistical results from independent studies and evaluating congruence among data sets. Our study illustrates this approach in speckled dace 30 (Cyprinidae: Rhinichthys osculus) endemic to two basins (Owens and Amargosa) in the Death Valley ecosystem (DVE). These fish persist in the Mojave Desert as isolated Pleistocene-relicts 32 and are of conservation concern, but lack formal taxonomic descriptions/designations. Doubledigest RAD (ddRAD) methods identified 14,355 SNP loci across 10 populations (N=140). 34 Species delimitation analyses [multispecies coalescent (MSC) and unsupervised machine learning (UML)] delineated four putative ESUs. F ST outlier loci (N=106) were juxtaposed to 36 uncover the potential for localized adaptations. We detected one hybrid population that resulted from upstream reconnection of habitat following contemporary pluvial periods, whereas 38 remaining populations represent relics of ancient tectonism within geographically-isolated springs and groundwater-fed streams. Our study offers three salient conclusions: A blueprint for 40 a multi-faceted delimitation of conservation units; a proposed mechanism by which criteria for intraspecific biodiversity can be potentially standardized; and a strong argument for the proactive 42 management of critically-endangered DVE fishes.

44
Species represent the currency of biodiversity, and as such are focal points for conservation and management. Yet, they often lack those distinct demarcations expected of their categorization, 46 and represent instead mere waypoints along evolutionary pathways (Sullivan et al., 2014). Our capacity to discriminate is further confounded by an observed variance in intraspecific diversity 48 that also contributes to the difficulties in unambiguously defining taxonomic units. Furthermore, a multiplicity of species concepts (Zachos, 2018) not only adds to, but also underscores this 50 difficulty. 'What precisely is a species?' and 'How can it be delineated?' are questions with both philosophical and practical ramifications (de Queiroz, 2007). As a result, the lack of resolution 52 promotes confusion among managers tasked with deciding what should be conserved, and how best to accomplish the task (Douglas, Douglas, Schuett, Porras, & Thomason, 2007;Holycross & 54 Douglas, 2007). It also represents what we now define as the 'species problem' (Freudenstein, Broe, Folk, & Sinn, 2017;Garnett & Christidis, 2017). 56 Taxonomic ranks above the species are commonly agreed upon and represent human constructs arbitrarily delimited by systematists (Coyne & Orr, 2004), with entities shuffled and 58 re-shuffled according to morphological and molecular perspectives (and the limitations thereof).
Taxa are assigned to these ranks based upon the knowledge and opinion of practitioners (see for 60 example, Yang et al., 2015). Surprisingly, this approach has also been widely adopted for categorizations below the species rank. This represents an extension of the species-problem, 62 where terminology has not only become confusing but also conflicting. We address these issues below. 64

| A focus on intraspecific categorization 66
Intraspecific lineages often stem from a single ancestral source, potentially one population or a cluster thereof. Their components are relatively contiguous geographically, albeit with temporal 68 dynamics, such as demographic expansions, contractions, and even potential stasis within refugia as repercussions of climatic fluctuation (Levin, 2019). They have been variously labeled, from 70 more traditional frameworks such as subspecies or ecological races (Braby, Eastwood, & Murray, 2012), through more contemporary concepts including evolutionarily significant units 72 (ESUs) and management units (MUs) (Coates, Byrne, & Moritz, 2018), to those promoted by government regulations, such as distinct population segments (DPS: USFWS & NMFS, 1996)  The emergence of molecular techniques to quantify 'genetic diversity' was initially heralded as a potential solution for the demarcation of biodiversity units, but traditional methods 78 provided scant applicability with regard to identifying intraspecific boundaries, and thus served to extend classificatory confusion (Phillimore & Owens, 2006;Zink, 2004). The advent of next-80 generation sequencing (NGS) techniques has markedly improved our ability to evaluate questions surrounding intraspecific variation, although they too present a double-edged sword: 82 Discrete population-level patterns can indeed be identified, but an over-interpretation of these patterns is a management issue in that populations are often inappropriately categorized. These 84 fears are borne out as well through complex analytical techniques commonly used to delimit species (Campillo, Barley, & Thomson, 2019;Sukumaran & Knowles, 2017). In addition, 86 sophisticated analytical methods are now emerging, such as Machine Learning (ML), that employ pattern recognition as a mechanism to identify biodiversity, and which allow systems to 88 automatically learn and improve from experience without being explicitly so programmed.
We outline in this manuscript a multifaceted approach for the delimitation of species and 90 intraspecific diversities. We first: (1) define the questions to be addressed, (2) detail different analytical approaches that quantify inter-and intra-specific diversity using NGS data, (3) place 92 detected patterns within a spatial and temporal framework so as to understand underlying evolutionary and ecological drivers, and (4) combine our data with those available from prior 94 publications and grey literature reports to evaluate diversity within the context of adaptive potential and ecological boundaries (Cornetti, Ficetola, Hoban, & Vernesi, 2015;Stanton et al., 96 2019). We offer this multifaceted approach. (Figure 1) as a potential blueprint from which to delimit biologically distinct entities, and as a mechanism to classify intraspecific biodiversity via 98 standardized criteria for recognition as conservation units.
To illustrate this approach, we offer a case study using an endemic group of desert fishes 100 whose conservation is clearly hampered by taxonomic ambiguity. The advantages of our study are its various levels of complexity, both intra-and interspecific, manifested within a simple 102 spatial design (i.e., populations within isolated habitats), within a region well understood with regards to biogeography and paleohydrology. We use our framework to develop a strong 104 argument for the proactive management of critically endangered fishes in one of the world's most unique environments, the Death Valley Ecosystem (DVE) of arid Southwestern North 106 America. 108

| Study species and its biogeography
The issues introduced above globally transcend regions, ecosystems and organisms. We focus in 110 this study on a desert ecosystem (Mojave Desert) with aquatic organisms largely restricted to freshwater springs (Craig, Kollaus, Behen, & Bonner, 2016;Devitt, Wright, Cannatella, &Hillis, 112 constructed in 1913 and1970 to supply the metropolis of Los Angeles. This, in turn, has forced 136 considerable reliance upon local groundwater resources (Danskin, 1998). In contrast, the Amargosa Basin has been historically arid (Belcher, Sweetkind, Hopkins, & Poff, 2019), with 138 intermittent flows in the Amargosa River a residual of rare, large-scale precipitation (Grasso, 1996). Here, SPD persists in ground water seeps (Faunt, Blainey, Hill, & D'Agnese, 2019) that 140 have remained relatively consistent despite overexploitation of local water resources (Robbins, 2017). 142 This study provides a comprehensive, population-level evaluation of SPD as a case study for conservation unit delineation in a system with confused intraspecific variability. We 144 sequenced thousands of nuclear loci (SNPs) so as to: (1) Gauge gene flow within and among populations; (2) Identify nuclear loci under selection as a proxy for local adaptation; (3) Generate 146 a phylogeny using species tree methods; and (4) Employ MSC and machine learning algorithms to delimit intraspecific diversity. As a means to validate conservation units and underscore 148 management, we then (5) interpreted the genetic data in the context of morphological and ecological descriptions available from published sources. This comparative framework also 150 allowed us to interpret the evolutionary drivers that shaped this unique and endemic biodiversity.

| Data collection and filtering
Whole genomic DNA was extracted using several methods: Gentra Puregene DNA Purification 166 Tissue kit; QIAGEN DNeasy Blood and Tissue Kit; QIAamp Fast DNA Tissue Kit; and CsClgradient. Extracted DNA was visualized on 2.0% agarose gels and quantified with a Qubit 2.0 168 fluorometer (Thermo Fisher Scientific, Inc.). Library preparation followed a double digest Restriction-Site Associated DNA (ddRAD) protocol (Peterson, Weber, Kay, Fisher, & Hoekstra, 170 2012). Barcoded samples (100 ng DNA each) were pooled in sets of 48 following Illumina adapter ligation, then size-selected at 375-425 bp (Chafin, Martin, Mussmann, Douglas, & 172 Douglas, 2018)  Libraries were de-multiplexed and filtered for quality using PROCESS_RADTAGS (STACKS 180 v1.48;Catchen, Hohenlohe, Bassham, Amores, & Cresko, 2013). All reads with uncalled bases or Phred quality scores < 10 were discarded. Reads with ambiguous barcodes that otherwise 182 passed quality filtering were recovered when possible (= 1 mismatched nucleotide). A clustering threshold of 0.85 (Eaton, 2014) was used for de novo assembly of ddRAD loci in PYRAD 184 v3.0.66. Reads with > 4 low quality bases (Phred quality score < 20) were removed. A minimum of 15 reads was required to call a locus for an individual. A filter was applied to remove putative 186 paralogs by discarding loci with heterozygosity > 0.6 and those containing > 10 heterozygous sites. The resulting data were filtered (BCFTOOLS; Li, 2011) as a means of retaining a single 188 biallelic SNP from each locus, as present in at least 33% of individuals (hereafter referred to as 'SNP-all'). Filtering was designed to minimize potential bias in missing data for ingroup 190 samples, given the unbalanced basin sampling (e.g., Owens N=50 versus Amargosa N=80) (Eaton, Spriggs, Park, & Donoghue, 2017;Huang & Knowles, 2016). 192

| Loci under selection 194
To identify potential for local adaptation, all loci were subjected to F ST outlier analysis (i.e., F ST outliers indicate loci under selection). BCFTOOLS-filtered SNPs were analyzed in BAYESCAN 196 v2.1 (Foll & Gaggiotti, 2008), using recommended settings [20 pilot runs (5,000 generations each) followed by 100,000 Markov Chain Monte Carlo (MCMC) generations (including 50,000 198 burn-in)]. Data were thinned by retaining every 10 th sample, equating to 5,000 total MCMC samples. Outlier status was determined by a false discovery rate (FDR) of 0.05. 200 BAYESCAN has the lowest Type I and II error rates among comparable software, yet a single outlier-detection method elicits some level of uncertainty (Narum & Hess, 2011). Thus, 202 cross-validation was conducted using the FDIST2 method in LOSITAN (Antao, Lopes, Lopes, Beja-Pereira, & Luikart, 2008;Beaumont & Nichols, 1996). Conditions included 100,000 total 204 simulations, assuming a 'neutral' forced mean F ST , 95% confidence interval (CI), and FDR of 0.1. Those SNPs deemed to be under positive selection by both BAYESCAN and LOSITAN were 206 extracted for downstream analysis (=SNP-select).

| Population structure analysis
A Maximum Likelihood (ML) approach (Alexander, Novembre, & Lange, 2009) was utilized to 210 assess population structure in both datasets. A minor allele frequency filter (ADMIXPIPE v2.0;Mussmann, Douglas, Chafin, & Douglas, 2020) was applied to remove SNPs at a frequency < 212 0.01. The number of distinct gene pools in the data set was explored in ADMIXTURE using clustering (K) values of 1 to 20, each with 20 replicates. Cross-validation (CV) values were 214 calculated following program instructions.
The output was evaluated using a Markov clustering algorithm to identify different 216 modes calculated by ADMIXTURE within a single K-value so as to automate the process of summarizing multiple independent ADMIXTURE runs (CLUMPAK; Kopelman, Mayzel, Jakobsson, 218 Rosenberg, & Mayrose, 2015). Major clusters were identified at a similarity threshold of 0.9 and summarized in ADMIXPIPE. The best interpretation of population structure was the K-value 220 associated with the lowest CV score.
To further assess population structure and distribution of genetic diversity, an analysis of 222 molecular variance (AMOVA; Excoffier & Lischer, 2010) was performed for the full SNP dataset (SNP-all), and again for SNP-select. In both cases, pairwise F ST values served to evaluate 224 genetic isolation between basins and among localities within basins. For sites with temporal collections, AMOVA was also used to test for genetic variance among sampling events within 226 localities.

| Hybridization
Preliminary evaluations indicated high levels of admixture at one site, Amargosa Canyon, the 230 most downstream site in the Armagosa River drainage (Figure 2). Three approaches were used to determine whether hybridization occurred, and if so, its relative timing. First, the Bayesian 232 clustering program NEWHYBRIDS v1.1 beta 3 (Anderson & Thompson, 2002) was employed to assess Amargosa Canyon samples. SNPs were filtered in GENEPOPEDIT to obtain a set of 234 unlinked loci that maximized F ST among populations (Stanley, Jeffery, Wringe, DiBacco, & Bradbury, 2017). The 'z' option in NEWHYBRIDS was used to assign two 'pure' parental gene backcross with pure Oasis Valley; or, F1 backcross with pure Ash Meadows. The program was run for 1,000,000 generations of burn-in followed by 3,000,000 generations of data collection. 242 Our second approach was to calculate a hybrid index (Buerkle, 2005) using the est.h function in the INTROGRESS R package (Gompert & Buerkle, 2010). Data were filtered so as to 244 acquire only fixed, unlinked SNPs among parental populations. Interspecific heterozygosity was calculated using the calc.intersp.het function in INTROGRESS, and results were visualized using 246 the triangle.plot function.
Finally, HYDE (Blischak, Chifman, Wolfe, & Kubatko, 2018) was used to test whether 248 Amargosa Canyon was of hybrid origin. This method differs from the previous two in that it tests for hybrid lineages by employing phylogenetic invariants that arise under the coalescent model. 250 Bonferroni adjustment (α=0.0045) was used to test for significance, and 500 bootstrap replicates were performed. 252

| Phylogenetic Analyses 254
To identify distinct evolutionary lineages, species tree methods were used in phylogenetic analyses using all samples (N=135), including the outgroup taxa [R. atratulus and R. o. robustus 256 (PRC)], but excluding a population from a private pond (RUP) because of its uncertain origin.
To account for incomplete lineage sorting, the reversible polymorphism-aware phylogenetic 258 model (POMO) was applied in IQ-TREE v1.6.9 (Nguyen, Schmidt, von Haeseler, & Minh, 2014;Schrempf, Minh, De Maio, von Haeseler, & Kosiol, 2016). This method allows polymorphic 260 states to occur within populations, rather than following the traditional assumption in DNA substitution models that taxa are fixed for a specific nucleotide at a given locus. To model 262 genetic drift, a virtual population size of 19 was assumed. Mutations followed a GTR substitution model, and rate heterogeneity was modeled using four categories. An ultrafast 264 bootstrap algorithm (Hoang, Chernomor, von Haeseler, Minh, & Le, 2018) performed 1,000 bootstrap replicates. 266 SVDQUARTETS (Chifman & Kubatko, 2014;Chifman & Kubatko, 2015), as implemented in PAUP* (Swofford, 2003), was used to construct a phylogeny using a multispecies coalescent 268 approach. This method evaluates all combinations of four populations (quartets) at each locus, then calculates a single value decomposition (SVD) score (Golub & Van Loan, 1996) for each 270 possible quartet tree. The topology with the lowest SVD score is selected as the true quartet topology. The full species tree is then assembled using a quartet assembly algorithm (Reaz, 272 Bayzid, & Rahman, 2014). Exhaustive quartet sampling was conducted, and significance was assessed using 100 bootstrap replicates. 274 276

| Conservation unit delimitation
Two approaches were employed to determine the number of discrete conservation units in the 278 Death Valley region: Multispecies coalescent (MSC) methods and unsupervised machine learning (UML) algorithms. Machine learning algorithms have been proposed as alternatives to 280 MSC methods, which seemingly over-split taxa under certain conditions (Barley, Brown, & Thomson, 2018;Leaché, Zhu, Rannala, & Yang, 2018), and parse populations rather than 282 speciation events (Sukumaran & Knowles, 2017). This stems from the many assumptions implicit to MSC, such as random mating, neutral markers, a lack of post-speciation gene flow, 284 and no within-locus recombination or linkage disequilibrium (Degnan & Rosenberg, 2009). Furthermore, UML methods reduce subjectivity in that they do not rely upon user-defined 286 models (Derkarabetian, Castillo, Koo, Ovchinnikov, & Hedin, 2019).
A MSC-based method (BFD*: Leaché et al. 2014) was first employed to test if observed 288 genetic diversity could be divided into discrete, well-supported units (i.e., subspecies, populations, or geographic subdivisions). Data were filtered in the PHRYNOMICS R package 290 (Leaché, Banbury, Felsenstein, de Oca, & Stamatakis, 2015) to remove invariant sites, nonbinary SNPs, and loci appearing in >95% of individuals. However, these conditions demanded 292 impractical compute times for each model, so we randomly subsampled the filtered data to retain 200 SNPs and five samples from each of eight sample sites (N=40). These sites represented the 294 most recent temporal collections of genetic clusters identified by ADMIXTURE.
We estimated the prior value for the population mutation rate (Θ) in BFD* using the 296 mean pairwise sequence divergence (7.99 x 10 -3 ) between R. osculus and its sister taxon R. cataractae (Longnose Dace). This value was set as the mean of a gamma-distributed prior. The 298 lineage birth rate (λ) of the Yule model was fixed using PYULE (https://github.com/joaks1/pyule).
A λ -value of 181.49 assumed tree height as one-half of the maximum observed pairwise 300 sequence divergence. Path sampling was set to 48 steps of 500,000 MCMC generations, with 100,000 discarded as burn-in. Bayes factors (BF) were calculated from normalized marginal 302 likelihoods (Leaché et al., 2014) using the current taxonomy (i.e., five subspecies) as a reference point. Models were evaluated for statistical significance via BF (Kass & Raftery, 1995). 304 Five UML algorithms were also applied (Derkarabetian et al., 2019), to include:

| Population structure
Genetic diversity within the Owens and Amargosa basins was best represented by seven genetic 328 clusters representing SNP-all ( Figure 3A) and SNP-select data ( Figure 3B), with Lahontan SPD (PRC) forming an additional 8 th group. In (A), all proposed subspecies (Table 1)  The proportion of genetic variance distributed among ADMIXTURE-identified clusters was 342 greatest for SNP-select (SNP-all =48.02%; SNP-select =93.82%) whereas for SNP-all it was within sampling localities (SNP-all =49.93%; SNP-select =4.43%). The proportion of variance 344 distributed among localities within clusters was very low for both (SNP-all =2.05%; SNP-select =1.75%) indicating that stochastic temporal sampling variance is unlikely the cause of the 346 observed strong genetic differences among localities.
Pairwise F ST values revealed significant isolation among sampling localities for both 348 datasets ( Table 2)

| Hybridization 362
A hybrid index was calculated for crosses between Ash Meadows (ASH) and the two Oasis Valley populations (AMA and RFO). Data represented fixed differences between the two [i.e., 364 61 SNPs (AMA x ASH: Figure 4A) and 83 SNPs (RFO x ASH: Figure 4B)]. The genomic composition of each Amargosa Canyon sample is an approximate 50/50 representation of each 366 parent, with hybridization ranging from recent (F2) to historic based upon observed interspecific heterozygosity (i.e., interspecific heterozygosity < 0.5). NEWHYBRIDS demonstrated a similar 368 trend using the 390-locus dataset filtered in GENEPOPEDIT ( Figure 4C). All Amargosa Canyon samples were classified with high probability as F2 hybrids (P>0.97). 370 Results of HYDE (Table 3)

| Bayes Factor Delimitation 388
The BFD* analysis ( Figure 6) decisively split the dataset into eight unique lineages, with seven distributed between the Owens and Amargosa basins (BF=501.85). In this regard, BF≥10 is 390 considered very strong support (Kass & Raftery, 1995).

426
The need for a mechanism to categorize the tips in the tree of life resonates politically, ethically, and biologically (Garnett & Christidis, 2017). Yet, our capacity to do so has languished, not only 428 with regard to the aforementioned species problem, but also with a multiplicity of conservation units at the intraspecific level. The latter represent confusing, and frequently conflicting 430 categorizations that have most often been parsed using phylogeographic analysis (Avise, 2009).
However, a more fine-grained phylogenomic perspective is now possible, given the current 432 capacity to generate voluminous data sets encompassing thousands of nuclear loci, and this in turn has subsequently shifted focus and intent (Freudenstein, Broe, Folk, & Sinn, 2017). 434 However, the abilities of researchers to interpret results of phylogenomic studies into discrete conservation units have not kept pace, much to the detriment of adaptive management. A 436 framework is needed to consistently ask if apparent intraspecific boundaries represent factual guides for conservation units, and if so, can they indeed be diagnostically defined (Rannala,438 2015)?
One potential mechanism to interpret results of molecular studies that evaluate 440 intraspecific diversity, and to translate them into management solutions would be to place them within a comparative framework with other available data. Here, the difficulty is to (a) know that 442 such data exists, and (b) locate and access it for integrated analyses. In the present situation, this would provide the platform from which results of individual studies could be compiled in a 444 consistent format, compared, and potentially categorized. A recent example is a platform that accommodates a trait-based approach to classification by delineating taxa from which ancestral 446 morphologies and their functions are reconstructed (Gallagher et al., 2020). A similar platform could be built so as to accommodate molecular studies of intraspecific diversity in an attempt to 448 seek consensus amongst results. As a template, it could mirror a recent study (Wieringa et al.,

2019) that gathered results of papers published on historical phylogeography in Southeastern 450
North America (N=57), as a mechanism by which new questions relating to intraspecific genetic diversity could be addressed. 452

| The current study 454
The genetic structure of SPD diagnosed through our analyses reveals their evolutionary history within the DVE. Their distribution is intimately tied to the prehistoric lakes and rivers of the 456 region, with diversifications occurring within modern basins. This pattern clearly reflects a relictual biodiversity with high endemicity, with patterns driven by Pleistocene tectonism and 458 hydrology (i.e., dispersal of SPD from Owens Valley to the Amargosa Basin during fluvial events: Jayko et al., 2008;Knott et al., 2008). However, the current taxonomy is incompatible 460 with these results, and must consequently be adjudicated prior to the delineation of conservation units for management purposes.

| Machine learning and subspecific designations 472
MSC-based species delimitation methods have been successful in delineating taxonomic units within problematic groups (Hedin, 2014;Hedin, Carlson, & Coyle, 2015;Herrera & Shank, 474 2016). However, caution is a key element in that over-splitting can occur under certain conditions (Sukumaran & Knowles, 2017). Despite these caveats, species delimitation tools 476 remain useful (Leaché et al., 2018) particularly when taxonomic groups are defined according to non-genetic attributes (Barley et al., 2018 Our consensus allowed for a relatively robust delimitation of conservation units within 488 the DVE. Despite this, questions still remain with regard to two lineages, Owens and Oasis valleys, in that ADMIXTURE indicated differentiated populations within each. Species 490 delimitation methods similarly concurred, with a minority (4/9; 44.4%) splitting Roberts Field (RFO) from the rest of Oasis Valley, whereas a majority (5/9; 55.6%) separated Benton Valley 492 (HAR) from Owens River. The latter is the most intriguing candidate for designation as a separate conservation unit, due largely to its elevated divergence from other local populations, as 494 measured via F ST . However, a previous phylogenetic reconstruction based on restriction-site mapping of mtDNA failed to distinguish it from Owens River (Oakey et al., 2004), and our study 496 provides moderate support for this same relationship. Unfortunately this population was last

| Hybridization and subspecies 502
Introgressed populations are another management conundrum (Allendorf, Leary, Spruell, & Wenburg, 2001) (Allendorf et al. 2001) that often have negative consequences for conservation 504 efforts (Rhymer & Simberloff, 1996). However, recent insights from genomic data suggest hybridization is much more common than previously thought (Bangs, Douglas, Brunner, & 506 Douglas, 2020), and in fact reticulation is an important evolutionary process (Chafin, Douglas, Martin, & Douglas, 2019). In our data set, one population, Amargosa Canyon (AMC), appears to 508 be of hybrid origin. Analyses suggest it derived from admixture between upstream Ash Meadows and Oasis Valley SPD, and is likely a rather recent event (i.e., approximately two generations). 510 This timing estimate may be skewed due to a specific analytical requirement of NEWHYBRIDS, where individuals must be classified into pre-defined generational categories. Based upon 512 observed interspecific heterozygosity, our hybrid index contrasting Oasis Valley samples (AMA) with R. o. nevadensis (Ash Meadows: ASH) reflects a spread from historical to approximately 514 F2-hybrid status. The HYDE results indicate AMA as being the more probable Oasis Valley parental population than RFO, in that comparisons between the latter and R. o. nevadensis were 516 not as well supported. The Amargosa Canyon samples were collected 10-months following a substantial flood that temporarily reconnected this area with upstream populations (i.e., Ash 518 Meadows and Oasis Valley). Interestingly, no pure parentals or F1 offspring were detected, and 10 months is inadequate time for SPD to yield F2 offspring. Thus, we interpret these results as an 520 older hybridization event of indeterminate age.
Given its hybrid origin Amargosa Canyon is our most challenging population to classify. 522 The majority of species delimitation methods (6/9; 66.67%) recognized it as discrete, while those that clustered it with other lineages did so inconsistently. The two RF isoMDS methods clustered 524 Amargosa Canyon with Oasis Valley, while the t-SNE Hierarchical method classified it with Owens River. Neither is consistent with the placement of Amargosa Canyon as sister to R. o. 526 nevadensis in the phylogenetic tree. These inconsistencies are likely a reflection of its hybrid ancestry. 528

| Morphological and ecological support for conservation units 530
Next, we applied a comparative framework to solidify support for our conservation units. While SPD lineages were statistically significant on a genetic basis, they should also be validated 532 morphologically, ecologically, and with additional life history data. This process acts to confirm them as distinct biological entities warranting conservation unit status. A morphological 534 overview of putative subspecies in the DVE focused on broad, regional trends, but it also found "... highly significant differences among all populations for all meristic and mensural characters" 536 . Ordination also revealed two qualitatively unique body shapes that are typical responses to hydrological conditions (Brinsmead & Fox, 2005;Collin & Fumagalli, 2011). These 538 are: A slender and elongated form typical for flowing streams versus shorter and deeper-bodied associated with still water such as lakes or spring. In our study, two populations occur in springs, Meristic counts are another type of phenotypic data often applied to diagnose species, but 546 their specificity must be carefully interpreted. Ranges frequently overlap among subspecific divisions, and data for other taxa are either lacking or conflated . This is 548 especially true for Oasis Valley SPD, initially lumped with Ash Meadows SPD (Gilbert, 1893;La Rivers, 1962), but with morphological details for subspecific status lacking (Deacon & 550 Williams, 1984;Williams et al., 1982). Likewise, few details are available for Amargosa Canyon SPD (Scoppettone, Hereford, Rissler, Johnson, & Salgado, 2011), other than a series of 552 qualitative descriptors when compared with other SPD subspecies (i.e., smaller head depth, shorter snout-to-nostril length, greater length between anal and caudal fins, greater numbers of 554 pectoral rays, and fewer vertebrae: Moyle et al., 2015). A diagnosis for Ash Meadows SPD is qualitative as well (i.e., incomplete lateral line, relatively large head, small eye, short and deep 556 body, dark stripe along entire length: Gilbert, 1893).
Owens River SPD is also locally variable. Meristic counts  are 558 summarized across four populations, to include Benton Valley, thus preventing within-basin comparisons. However, the presence of maxillary barbels distinguishes it from conspecifics in 560 surrounding basins. Furthermore, Benton Valley populations have qualitatively longer pelvic fins, and lower counts for lateral line and pore scales relative to others within-basin (Moyle et al., 562 2015). In contrast, Long Valley SPD has a higher pectoral and pelvic fin ray count, elevated lateral line scale count, and fewer lateral line pores. 564 Ecological, life history, and morphological data are thus inconclusive, and fail to delimit conservation units in the DVE. However, this is due to a deficiency of data (rather than 566 homogeneity), and subspecies do seemingly segregate morphologically, albeit without statistical tests as confirmation. In contrast, multiple lines of genetic data provide clear signals of distinct 568 entities, and inferences from modern genomic techniques also reinforce observed gaps in ecological data (Crandall, Bininda-Emonds, Mace, & Wayne, 2000;Funk, McKay, Hohenlohe, 570 & Allendorf, 2012). F ST outlier loci, for example, diagnosed potential ecological adaptation among SPD lineages, in that loci under selection have the potential to reveal cryptic signals of 572 adaptive divergence (Tigano, Shultz, Edwards, Robertson, & Friesen, 2017). While these loci do not replace traditional field observations (Funk et al., 2012), they do underscore in our situation 574 the juxtaposition of neutral variation and adaptive variation among DVE units. Isolation in different habitat types (i.e., springs versus rivers) therefore highlights the need for conservation 576 strategies that differ yet are linked via an ecosystem-oriented focus.

| Conservation implications with regard to subspecies
Modern genomic tools offer an in-depth view of population histories and allow unique genetic 580 lineages to be discriminated. In our study, they phylogenetically validate a rare situation where the majority (80%) of anecdotal subspecific designations (our OTUs) are recognized as 582 genetically discrete units. We deem this as a response to the extreme geographic isolation imposed upon each DVE lineage. Until formal descriptions can occur (i.e., morphological 584 analyses so as to formally describe each lineage), we thus recognize Long Valley SPD, Owens Valley SPD, and Oasis Valley SPD as ESUs (Moritz, 1994;Ryder, 1986;Waples, 1991), and 586 substantiate Ash Meadows SPD (R. o. nevadensis) as a distinct taxon and conservation unit.
Endemicity resulting from habitat isolation provides a unique challenge for conservation. 588 While isolated habitats are often prone to human disturbance, they are readily identified and monitored as well (Arthington, Dulvy, Gladstone, & Winfield, 2016). Springs, for example, 590 exhibit high levels of genetic structure across small geographic areas (Echelle et al., 2015). They frequently reflect low within-system species richness (α-diversity) but high diversity when 592 compared to other systems (β-diversity) (Gibert et al., 2009). In other words, greater diversity exists between groundwater-dependent systems rather than within (Gibert & Deharveng, 2002). 594 Thus, groundwater depletion invokes dire consequences for aquatic fauna, ranging from the depletion of faunal assemblages (Perkin et al., 2017) to complete extinction (Miller, Williams, & 596 Williams, 1989). The latter scenario provides an elevated risk for these populations since they lack redundancy to protect from catastrophic loss. 598 Our validation of anecdotal subspecies has a conservation imperative, in that it underscores the necessity of a management strategy that sustains and protects these unique 600 entities. Practical considerations require an adherence to current conservation policy, and this remains species-centric (Smith et al., 2018). While it has benefits (Runge et al., 2019), 602 ecosystem-level factors are also necessary conservation aspects (Franklin, 1993). They are particularly important in regions containing multiple narrowly endemic species, as those rare and 604 sympatric can suffer unintentional consequences through a strict species-centric approach (Casazza et al., 2016). Importantly, climate change impacts are also relevant for ecosystem 606 conservation (Prober, Doerr, Broadhurst, Williams, & Dickson, 2019;Wilkening, Pearson-Prestera, Mungi, & Bhattacharyya, 2019). 608 In addition, the evolutionary history of these lineages adds complications. In the DVE, each lineage of SPD is narrowly endemic and scant evidence of contact zones, with the exception 610 of the hybrid population in Amargosa Canyon (AMC). Hydrological changes underlying these patterns occurred on considerably different timescales in the two basins. Loss of surface water in 612 the Owens Basin is directly tied to anthropogenic activities during the past century, with available habitat being severely restricted (Buckmaster & Parmenter, 2019). The full extent of 614 how much this depleted genetic diversity is unknown, but our data document the probable contemporary loss of (at least) one genetically distinct lineage from this system (i.e., RFO in the 616
Hybridization is a contentious conservation topic (Fitzpatrick, Ryan, Johnson, Corush, & 626 Carter, 2015), particularly when one parental species is afforded protection under the US Endangered Species Act (ESA), as is the case with Ash Meadows SPD (R. o. nevadensis). The 628 issue is further compounded by a lack of policy explicitly addressing hybrids (vonHoldt, Brzeski, Wilcove, & Rutledge, 2017). The historic and ongoing lineage mixing that yielded Amargosa 630 Canyon SPD (AMC) appears natural and thus should not preclude protection (Allendorf et al., 2001). The situation parallels that of the red wolf (Canis rufus), deemed a hybrid between 632 endangered grey wolf (Canis lupus) and coyote (Canis latrans) (Hohenlohe et al., 2017;vonHoldt et al., 2016), although recent genomic analyses revealed it to be a distinct species of its 634 own with a convoluted history of introgression . Given this precedence, Amargosa Canyon SPD could therefore be listed as a DPS under the ESA (Waples, 636 Kays, Fredrickson, Pacifici, & Mills, 2018).

CONFLICT OF INTEREST
None Declared. 676

AUTHOR CONTRIBUTIONS 678
All authors participated in study design; SMM prepared DNA, generated ddRAD libraries, and completed data analysis; all authors contributed in drafting the manuscript and all approved its 680 final version.

DATA ACCESSIBILITY
Raw fastq files for each individual as well as all alignments used in this study are available on 684 the Dryad Digital Repository (address added after acceptance of article). 686 Table 1: Speckled dace (Rhinichthys osculus) sample groups analyzed in this study, including the ingroup from the Death Valley Ecosystem (DVE) and an outgroup from the Lahontan Basin (OUT). Latin trinomials are provided for described taxa. Those lacking formal description are provided subspecies (ssp) numbers reflecting those assigned by fisheries managers for unnamed R. osculus taxa in the Great Basin and surrounding areas. Common names, basins, regions within basins (Area) and specific site names (Site) are provided. Conservation units reflect assignments designated by multispecies coalescent and machine learning algorithms (ESU = Evolutionarily Significant Unit; DPS = Distinct Population Segment).     Figure 1: Conceptual workflow to delineate conservation units at tips in the Tree of Life. A standardized mechanism that applies molecular data to categorize inter-and intra-specific diversity should address three Questions about patterns in genetic data that must be integrated with other available data (Concordance). Metrics depict genetic diversity and are quantified singly or in combination via Analyses to derive Inferences regarding how ecological and evolutionary processes have shaped these patterns. Genetic results and other public data (e.g., phenotypic, ecological, biogeographical) are combined in a comparative framework so as to depict conservation units.     Figure 7: Results of eight unsupervised machine learning (UML) algorithms compared against Bayes Factor Delimitation (BFD*) results. These include four random forest (RF) methods with mixtures of classical (c) and isotonic (iso) multidimensional scaling (MDS) as well as hierarchical and partition around medoids (PAM) clustering. The t-distributed stochastic neighbor embedding (t-SNE) algorithm was employed with two clustering methods. Finally, a discriminant analysis of principal components (DAPC) and variational autoencoder (VAE) were applied. All UML algorithms were applied to 130 individuals genotyped at 200 SNPs. The raw number of divisions (=Clusters) for several methods included groups of individuals that shared a high proportion of missing data (*). Other cases divided individuals from two localities among clusters that did not follow any pattern ( †). In one instance, Ash Meadows populations were subdivided according to springs ( ‡). All three scenarios were interpreted as 'oversplitting' and ignored in the final interpretation (=Corrected