Integrative taxonomy refutes a species hypothesis: The asymmetric hybrid origin of Arsapnia arapahoe (Plecoptera, Capniidae)

Abstract Molecular tools are commonly directed at refining taxonomies and the species that constitute their fundamental units. This has been especially insightful for groups for which species hypotheses are ambiguous and have largely been based on morphological differences between certain life stages or sexes, and has added importance when taxa are a focus of conservation efforts. Here, we examine the taxonomic status of Arsapnia arapahoe, a winter stonefly in the family Capniidae that is a species of conservation concern because of its limited abundance and restricted range in northern Colorado, USA. Phylogenetic analyses of sequences of mitochondrial and nuclear genes of this and other capniid stoneflies from this region and elsewhere in western North America indicated extensive haplotype sharing, limited genetic differences, and a lack of reciprocal monophyly between A. arapahoe and the sympatric A. decepta, despite distinctive and consistent morphological differences in the sexual apparatus of males of both species. Analyses of autosomal and sex‐linked single nucleotide polymorphisms detected using genotyping by sequencing indicated that all individuals of A. arapahoe consisted of F1 hybrids between female A. decepta and males of another sympatric stonefly, Capnia gracilaria. Rather than constitute a self‐sustaining evolutionary lineage, A. arapahoe appears to represent the product of nonintrogressive hybridization in the limited area of syntopy between two widely distributed taxa. This offers a cautionary tale for taxonomists and conservation biologists working on the less‐studied components of the global fauna.

for a particular hypothesis. But given the central role of species to biology, these names profoundly influence how we think about the elements and conservation of biodiversity.
Although most taxonomies of organisms are based on morphological characters, genetic tools are essential to refining these taxonomies by parsing genotypic variation across demographic, geographic, and taxonomic scales, discerning recent and ancient introgression, revealing cryptic taxa, and synonymizing dubious ones (Kjer, Simon, Yavorskaya, & Beutel, 2016). Perhaps most straightforward has been the genetic assignment of specimens to known lineages by linking their genotypes to those of named and phenotypically identified voucher specimens. This is the province of DNA barcoding (assigning individuals to species based on their genetic sequences, generally of the cytochrome c oxidase subunit 1 [COI] mitochondrial region), and it has been applied to taxa across all of life, with an emphasis on animals (Hebert, Cywinska, & Ball, 2003). One of the first demonstrations of its efficacy involved insects (Hebert, Penton, Burns, Janzen, & Hallwachs, 2004), and they have been the focus of initiatives to associate DNA barcodes with individual species across higher taxonomic categories and throughout particular geographies (Smith et al., 2008; Zhou et al., 2016). These efforts have been broadly successful (Blagoev et al., 2016; Webb et al., 2012), in part because they permit linking morphologically cryptic larvae or females with their more easily recognized adult male counterparts (Gamboa & Monaghan, 2014). For some groups of insects, however, these broad assessments are in their infancy and concordance between morphological and genetic identifications of forms has been uneven (Geiger et al., 2016). In part, this reflects taxonomies erected on morphological grounds that have not yet been evaluated from a molecular perspective (Schlick-Steiner et al., 2010). This pattern also derives, however, from discord in the phylogenetic signal among different genes, especially for taxa of recent origin or exposed to introgressive hybridization (Funk & Omland, 2003).
Taxonomic instability is expected given that all names are hypotheses about evolutionary lineages, and their revision is straightforward, albeit nontrivial, from a scientific perspective (Valdecasas, Williams, & Wheeler, 2008). But getting the names right is more than an academic exercise; society makes outsize investments in some of these hypotheses, particularly when they are associated with at-risk taxa that are the focus of conservation (Mace, 2004; Schwartz & Boness, 2017).
An exemplar of many of these issues is the Arapahoe snowfly, Arsapnia arapahoe Nelson & Kondratieff (Plecoptera: Capniidae), one of eight species of stoneflies in the western North American genus Arsapnia that were formerly assigned to the Capnia decepta Banks group (Murányi, Gamboa, & Orci, 2014; Nelson & Baumann, 1989). This species was originally described from single male specimens collected in two streams in 1986 and 1987 in the Cache la Poudre River basin in north-central Colorado (Nelson & Kondratieff, 1988), and not observed again in these streams until March 2009, when the first putative females were also collected (Heinold & Kondratieff, 2010). More recent surveys for adults extended this distribution to 19 additional sites as far south as the South Platte River basin in the Colorado Front Range (Fairchild, Belcher, Zuellig, Vieira, & Kondratieff, 2017); larval forms remain unknown. Where present, this species was outnumbered by orders of magnitude by two sympatric capniids, A. decepta and Capnia gracilaria Claassen (Fairchild et al., 2017; Heinold, Gill, & Belcher, 2014). Its restricted range and apparent rarity led to a petition and subsequent candidacy for its listing under the U.S. Endangered Species Act (U.S. Fish and Wildlife Service, 2012).
The life history, morphology, and systematics of capniid stoneflies, however, make assessing the conservation status of many species, including A. arapahoe, a challenge. Members of this family tend to emerge as adults in late winter and early spring and can be locally abundant (Baumann, Gaufin, & Surdick, 1977), but the mating period is brief and synchronized at any particular location. Larvae apparently occupy hyporheic habitats that make them difficult to capture in benthic surveys, and this life stage can rarely be identified to species (Stewart & Stark, 1988). This is also true of adult females; for example, differences between A. arapahoe and A. decepta are thought to be subtle at best (Heinold & Kondratieff, 2010), and females of different species in this genus may be easily mistaken for one another and for females in confamilial taxa (Nelson & Baumann, 1989). Even male identification can be problematic.
The shape, size, and ornamentation of the male reproductive organ often constitute the basis for describing and identifying species, yet this structure can exhibit substantial local and range-wide variation within taxa (Baumann & Stark, 2017). Male A. arapahoe, however, are readily identifiable because the epiproct lacks the mesal bulbous projections typical of all other members of this genus (Nelson & Kondratieff, 1988). This characteristic is so distinctive that it led Nelson and Kondratieff (1988) to speculate that A. arapahoe was the sister taxon to all other members of Arsapnia, and perhaps most closely related to A. sequoia Nelson & Baumann and A. utahensis Gaufin & Jewett, not the sympatric A. decepta. The position of these species within Capniidae is also ambiguous; systematists have long regarded Capniidae as a synthetic, paraphyletic assemblage in need of revision (Murányi et al., 2014; Nelson & Baumann, 1989).
Confusion about the taxonomic position of A. arapahoe grew when an attempt to use genetic tools to identify it (Heinold, Gill, Belcher, & Verdone, 2014) produced unexpected results: The interspecific distance between sympatric male A. arapahoe and A. decepta (0.10%) was typical of variation found within species of stoneflies (0.35%-0.53%; Gill et al., 2013; Morinière et al., 2017; Zhou, Jacobus, DeWalt, Adamowicz, & Hebert, 2010, but see Gill, Sandberg, & Kondratieff, 2015 for much higher estimates), not between species (generally ≥2%; Zhou, Adamowicz, Jacobus, DeWalt, & Hebert, 2009). Paradoxically, a putative female allotype of A. arapahoe shared the haplotype of a distantly related species in a different genus, Capnura wanica. The incidence of interspecific haplotype sharing and low interspecific divergence is rare in many arthropods (e.g., <2% among Canadian spiders (Blagoev et al., 2016) and virtually nonexistent in North American mayflies (Webb et al., 2012)), making assignment of different sexes to different species highly unusual and difficult to ascribe to a single mechanism. More likely is that a combination of factors is responsible, among them misidentification of voucher specimens, incomplete lineage sorting between recently diverged species, interspecific haplotype sharing caused by infection by the bacterium Wolbachia, or hybridization (Funk & Omland, 2003; Smith et al., 2012). Heinold et al. (2014) favored incomplete lineage sorting, arguing that the dramatic morphological differences between males were reliable and obvious evidence of speciation, haplotypes were not those of the markedly divergent Wolbachia sequences (and selective sweeps driven by Wolbachia infection are extraordinarily rare; Smith et al., 2012), and hybridization was not likely because of the lack of morphological intermediates between male A. arapahoe and A. decepta.
But this interpretation did not account for the differences among male and female genotypes and failed to satisfy the requirement that an integrative taxonomy provides an evolutionary explanation for all aspects of morphological and molecular discord (Schlick-Steiner et al., 2010).
A simple explanation for haplotype sharing among these taxa would be hybridization, but this phenomenon has been regarded as rare among stoneflies and other aquatic arthropods (Dijkstra, Monaghan, & Pauls, 2014; Hughes, Finn, Monaghan, Schultheis, & Sweeney, 2014), and until recently was thought to be confined to a few species pairs of eastern North American Allocapnia (Ross & Ricker, 1971). The idea that hybridization might be more prevalent is discouraged by the biological species concept, with its predisposition to view hybridization as a rare accident (Mallet, 2005), and by notions that behavioral or anatomical differences constitute intrinsic reproductive barriers. For example, conspecific drumming signals used by male and female stoneflies for mate recognition are regarded as species-specific and likely to ensure pre-zygotic reproductive isolation (Boumans & Johnsen, 2014; Stewart, 2001). Nonetheless, there are many examples of arthropod taxa in which elaborate pre-mating and putatively species-specific displays still result in attempted mating between species (Masly, 2012), including between stoneflies in different genera (Zeigler, 1990). Even if hybridization does not ensue, assuming species identity based on temporary male-female association could lead to misidentification of the less morphologically distinct females. There may be a stronger argument for the rarity of hybridization based on the "lock and key" hypothesis (Sota & Kubota, 1998), which posits that anatomical complementarity of male and female terminalia is required for successful reproduction, but again there are a host of examples demonstrating hybridization between anatomically disparate taxa (Shapiro & Porter, 1989).
Nonetheless, the success of heterospecific crosses may be asymmetric because of pre- or post-zygotic reproductive incompatibilities; that is, crosses between a male of one species and a female of the other may exhibit lower female survival, likelihood of insemination, or fitness of offspring than does the opposite pairing (Masly, 2012).
Our goal was to resolve the taxonomic ambiguity surrounding A. arapahoe via an iterative approach to integrative taxonomy (Yeates et al., 2011). We treated the morphological identifications as authoritative hypotheses to be evaluated in light of molecular data from A. arapahoe and related capniid stoneflies. To that end, we analyzed sequences of two mitochondrial regions and one nuclear gene. Because these results were inconclusive, we used genotyping by sequencing to more thoroughly explore the evolutionary origin and taxonomic validity of A. arapahoe.

| Sample collection and sequence selection
Specimens for sequencing were collected for this study or drawn from the collections at the C.P. Gillette Museum of Arthropod Diversity, Colorado State University, Fort Collins, Colorado (Figure 1; Supporting Information Table S1). All individuals were identified to species by taxonomic experts at this facility. Furthermore, we reexamined every specimen under a dissecting microscope to confirm the sex of the individuals being genetically sequenced, and of all A. arapahoe specimens to ensure they were of this species. We   Baumann 1987), and three species of Capnia (C. elongata Claassen, C. gracilaria, and C. promota Frison), and we restrict presentation of our results to these taxa (hereafter, Arsapnia group), except where references to additional taxa are pertinent.

| DNA sequencing
We used the QIAGEN DNeasy Blood and Tissue kit (QIAGEN Inc.) to extract genomic DNA from whole hind legs or the thorax of individual specimens, following the manufacturer's instructions for tissue.
We sequenced COI and cytochrome b (cyt b) to facilitate comparisons with sequences in existing databases (Ratnasingham & Hebert, 2007) and to increase the taxonomic resolution of the mitochondrial data (Hillis, Pollock, McGuire, & Zwickl, 2003). We sequenced the ribosomal internal transcribed spacer (ITS1) to permit comparison of a nuclear phylogeny to those from the mitochondrial genes and to assess whether hybridization or sharing of mitochondrial genes associated with Wolbachia infection were evident. We amplified COI using the standard primers (LCO1490/HCO2198 (Folmer, Black, Hoeh, Lutz, & Vrijenhoek, 1994) or LepF1/LepR1 (Hebert et al., 2004)), cyt  (2000) and appended to the nucleotide sequences using FastGap 1.2 (Borchsenius, 2009).

| Genotyping by sequencing
To produce a much larger dataset from across the nuclear genome to refine our understanding of the identity of A. arapahoe, we used  Information Table S1).
Initial sequencing produced ~501.7 M raw reads. After trimming reads with bases having PHRED quality scores of <15, ~440.6 M reads (mean length, 115 bp) remained. Consensus sequences were generated for A. arapahoe because of the lack of a suitable reference genome for alignment and SNP calling. Trimmed sequence reads from all samples were combined and normalized to a maximum of 50× coverage using diginorm (Brown, Howe, Zhang, Pyrkosz, & Brom, 2012). The sequencing errors in the reads were then corrected using Fiona (Schulz et al., 2014). The coverage-normalized and error-corrected reads were condensed using CD-HIT-454 (Fu, Niu, Zhu, Wu, & Li, 2012) with ≥96% identity to form consensus clusters. Clusters with fewer than 10 component reads and shorter than 50 bp were discarded. Trimmed reads were aligned to the consensus reference sequence using GSNAP (Wu & Nacu, 2010). Reads were considered confidently mapped if each mapped uniquely (≤2 mismatches every 36 bp and <5 bases for every 75 bp as tails), and only these reads were used for subsequent analyses.
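As an aside on the quality threshold mentioned above, a PHRED score Q corresponds to a base-call error probability of 10^(-Q/10) under the standard definition. The following is a minimal sketch of that relationship, together with a deliberately simplified 3'-end trimming rule. The function names and the trimming rule are assumptions made for illustration and do not reproduce the trimming software used in this study.

```python
def phred_to_error_prob(q):
    """Convert a PHRED quality score to a base-call error probability."""
    return 10 ** (-q / 10)


def trim_low_quality_tail(seq, quals, min_q=15):
    """Remove bases from the 3' end until one with quality >= min_q is found.

    Hypothetical rule for illustration only; production trimmers typically
    use sliding windows or other, more elaborate criteria.
    """
    end = len(seq)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return seq[:end], quals[:end]


if __name__ == "__main__":
    # Error probabilities implied by the two thresholds used in the text.
    for q in (15, 20):
        print(q, phred_to_error_prob(q))
    print(trim_low_quality_tail("ACGTAC", [30, 32, 28, 20, 12, 9]))
```

Under this definition, the Q15 trimming threshold tolerates roughly a 3% chance of error per base, whereas the Q20 threshold applied later to polymorphic bases corresponds to a 1% chance.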
Any site that was polymorphic (homozygous or heterozygous) relative to the reference genome sequence in at least one sample was considered a SNP. Putative homozygous and heterozygous SNPs were retained if the most common allele (or two alleles in heterozygotes) was supported by at least 80% of all the aligned reads covering that position, at least five unique reads supported the most common allele (or two most common alleles), and each polymorphic base had a PHRED base quality value ≥20. Polymorphisms in the first and last 3 bp of each read were ignored. Polymorphic sites were filtered further based on a minimum allele frequency of 1%, a constrained heterozygosity rate (ranging from zero to twice the product of the frequency of the two most common alleles), and a minimum call rate of 20%; that is, each locus was genotyped in at least 20% of all specimens.
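The site-level filters described above can be sketched as follows. This is an illustrative reading of the criteria, not the pipeline used in the study. The data layout (one 2-tuple of alleles per specimen, or None for a missing call), the function name, and the interpretation of "minimum allele frequency" as the frequency of the second most common allele are all assumptions.

```python
from collections import Counter


def passes_site_filters(genotypes, min_maf=0.01, min_call_rate=0.20):
    """Apply call-rate, allele-frequency, and heterozygosity filters to one site.

    genotypes: list with one entry per specimen, either None (no call) or a
    2-tuple of alleles such as ("A", "G"). Hypothetical layout for illustration.
    """
    called = [g for g in genotypes if g is not None]
    if not genotypes or len(called) / len(genotypes) < min_call_rate:
        return False

    counts = Counter(allele for g in called for allele in g)
    top_two = counts.most_common(2)
    if len(top_two) < 2:
        return False  # assumption: sites monomorphic among specimens are dropped
    total = sum(counts.values())
    p = top_two[0][1] / total  # frequency of the most common allele
    q = top_two[1][1] / total  # frequency of the second most common allele
    if q < min_maf:
        return False

    # Observed heterozygosity must not exceed twice the product of the
    # frequencies of the two most common alleles (2pq).
    het_rate = sum(1 for a, b in called if a != b) / len(called)
    return het_rate <= 2 * p * q


if __name__ == "__main__":
    site = [("A", "A")] * 6 + [("A", "G")] * 2 + [("G", "G")] * 2
    print(passes_site_filters(site))
```

A ceiling of 2pq on observed heterozygosity is a common guard against collapsed paralogs, which tend to appear heterozygous in nearly every specimen.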

| Phylogenetic analyses
Our initial assessment of the validity and evolutionary position of First, we used TCS 1.21 (Clement, Posada, & Crandall, 2000) to construct 95% maximum parsimony networks based on COI sequences from field samples and public sequence libraries. Independent networks using this threshold have been regarded as representing single species, although this approach can underestimate species diversity because of the greater tendency to combine distinct taxa than to split a single taxon (Chen et al., 2010;Hart & Sunday, 2007).
Sequences with ambiguous nucleotides were excluded from maximum parsimony networks to avoid spurious networks (Joly, Stevens, & van Vuuren, 2007). Second, we developed maximum-likelihood phylogenetic trees for the ITS1 sequences and for concatenated sequences of COI and cyt b. Analyses were restricted to unique haplotypes which we identified using DAMBE version 6 (Xia, 2017).
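The two preprocessing steps just described, excluding sequences with ambiguous nucleotides and collapsing the remainder to unique haplotypes, can be sketched in a few lines. This is not how TCS or DAMBE work internally. The function names and specimen IDs are hypothetical, and the sketch assumes pre-aligned sequences of equal length in which any character other than A, C, G, or T counts as ambiguous.

```python
def has_ambiguity(seq):
    """Return True if the sequence contains anything other than A, C, G, or T."""
    return any(base not in "ACGT" for base in seq.upper())


def collapse_to_haplotypes(records, drop_ambiguous=True):
    """Group specimens by identical sequence.

    records: dict mapping specimen ID -> aligned sequence string.
    Returns a dict mapping each unique sequence -> sorted list of specimen IDs.
    """
    haplotypes = {}
    for specimen_id, seq in records.items():
        if drop_ambiguous and has_ambiguity(seq):
            continue
        haplotypes.setdefault(seq.upper(), []).append(specimen_id)
    return {seq: sorted(ids) for seq, ids in haplotypes.items()}


if __name__ == "__main__":
    # Hypothetical specimen IDs and toy sequences.
    toy = {"s1": "ACGT", "s2": "acgt", "s3": "ACGN", "s4": "ACGA"}
    for seq, ids in collapse_to_haplotypes(toy).items():
        print(seq, ids)
```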
We used PartitionFinder 2.0 (Lanfear, Frandsen, Wright, Senfeld, & Calcott, 2016) to select the best-fitting partitioning scheme as measured by AICc, constrained to the suite of evolutionary models considered by RAxML (Stamatakis, 2014) and excluding all outgroups, that is, nonmembers of the Arsapnia group. There were six data subsets for the concatenated mitochondrial sequences based on codon position and gene, and two subsets for ITS1 based on nucleotides and recoded gap characters. Because RAxML will only consider a single evolutionary model for the entire suite of partitions, we compared AICc scores among maximum-likelihood models using the GTR, GTRGAMMA, and GTRGAMMAI evolutionary models to choose a best model. We then ran RAxML version 8.
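For reference, the small-sample corrected Akaike information criterion used in such comparisons is AICc = -2 ln L + 2k + 2k(k + 1)/(n - k - 1), where k is the number of free parameters and n is the sample size. The sketch below shows how three candidate models could be ranked on this score. The log-likelihoods, parameter counts, and the use of alignment length as n are placeholder assumptions, not values from this study.

```python
def aicc(log_likelihood, n_params, n_sites):
    """Small-sample corrected AIC for a fitted model.

    n_sites is used as the sample size n, which is a common but not
    universal convention in phylogenetics (an assumption here).
    """
    if n_sites - n_params - 1 <= 0:
        raise ValueError("AICc is undefined when n - k - 1 <= 0")
    aic = -2.0 * log_likelihood + 2.0 * n_params
    return aic + (2.0 * n_params * (n_params + 1)) / (n_sites - n_params - 1)


def best_model(candidates, n_sites):
    """Return the name of the candidate with the lowest AICc.

    candidates: dict mapping model name -> (log_likelihood, n_params).
    """
    return min(candidates, key=lambda name: aicc(*candidates[name], n_sites))


if __name__ == "__main__":
    # Placeholder log-likelihoods and parameter counts for illustration only.
    fits = {
        "GTR": (-1000.0, 8),
        "GTRGAMMA": (-980.0, 9),
        "GTRGAMMAI": (-979.9, 10),
    }
    print(best_model(fits, n_sites=500))
```

In this toy comparison the extra parameter of the third model improves the likelihood too little to offset its penalty, which is the trade-off the criterion is designed to capture.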

| SNP analyses
Results of these analyses were not consistent with the prevail-  (Huang & Knowles, 2014). Next, we used principal coordinate analysis in GenAlEx 6 (Peakall & Smouse, 2006) and inferred potential parental taxa from their position relative to specimens of A. arapahoe in two-dimensional coordinate space (Payseur & Rieseberg, 2016). We restricted these analyses to a single SNP at each locus with a minimum call rate of 90% (n = 1,788) to avoid issues with linkage and to minimize the influence of missing data on potential patterns in hybridization. For the likely parental taxa identified in these analyses, we examined SNPs present in every specimen (minimum call rate = 100%) that were fixed for alternate alleles in each taxon and thus potentially diagnostic to permit precise estimates of the levels of introgression and heterozygosity within individuals (Hohenlohe, Amish, Catchen, Allendorf, & Luikart, 2011). We examined the distribution of alleles at these loci only in male A. arapahoe because the two putative A. arapahoe females were assigned by mitochondrial sequences to members of other, morphologically similar species (Nelson & Baumann, 1989). Another female phenotypically identified as A. decepta, however, had a mitochondrial and nuclear genotype matching that of male A. arapahoe and was considered a female representative of this taxon (see below). This led to identification of sex chromosome-linked SNPs, the allelic patterns of which differed from those in autosomal loci (Carmichael et al., 2012) but were consistent with the interpretation of the origin of A. arapahoe.
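The selection of potentially diagnostic SNPs (loci genotyped in every specimen and fixed for alternate alleles in the two parental taxa) can be illustrated with a short sketch. The data layout, function names, and locus labels are hypothetical. For brevity the sketch enforces the 100% call rate only within the two parental samples, whereas the text applies it to every specimen.

```python
def fixed_allele(genotypes):
    """Return the single allele carried by all specimens, or None.

    genotypes: list of 2-tuples of alleles, with None marking a missing call.
    Any missing call disqualifies the locus (100% call rate required).
    """
    alleles = set()
    for g in genotypes:
        if g is None:
            return None
        alleles.update(g)
    return alleles.pop() if len(alleles) == 1 else None


def diagnostic_loci(parent_a, parent_b):
    """Find loci fixed for different alleles in two parental taxa.

    parent_a, parent_b: dict mapping locus name -> list of genotypes.
    Returns a dict mapping locus -> (allele fixed in A, allele fixed in B).
    """
    result = {}
    for locus in parent_a.keys() & parent_b.keys():
        a = fixed_allele(parent_a[locus])
        b = fixed_allele(parent_b[locus])
        if a is not None and b is not None and a != b:
            result[locus] = (a, b)
    return result


if __name__ == "__main__":
    # Toy genotypes with hypothetical locus names.
    taxon_a = {"L1": [("A", "A"), ("A", "A")], "L2": [("C", "C"), ("C", "T")]}
    taxon_b = {"L1": [("G", "G")], "L2": [("T", "T")]}
    print(diagnostic_loci(taxon_a, taxon_b))
```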

| RESULTS
The 563-nucleotide COI dataset consisted of 556 sequences con- Analyses of 95% maximum parsimony networks of the Arsapnia group did not support recognizing A. arapahoe as a distinct taxon.
Although 13 phenotypically identified species were included, the analysis produced only three separate networks (Figure 2, Supporting Information Figure S1). In the network with A. arapahoe, all specimens of that species shared haplotypes with A. decepta, and that network included four other species of Arsapnia, one species of Sierracapnia, and two species of Capnia. This network also included two specimens of A. decepta from Arizona, which were closely related to but distinct from those in Colorado. Interspecific patterns of relationships, however, were largely concordant with those in other sequence-based analyses.
Both the mitochondrial and ITS1 phylogenies strongly supported the Arsapnia group (Figure 3). The mitochondrial phylogenies provided greater resolution at lower taxonomic levels, as would be expected because mitochondrial genes have smaller effective population sizes and thus diverge more rapidly relative to nuclear genes. In both analyses, Capnia gracilaria was supported as a member of (ITS1) or sister to (COI + cyt b) the remainder of the Arsapnia group.
The terminal clades in the mitochondrial trees were not always concordant with the phenotypic identifications of specimens, especially of females (also see Supporting Information Table S1). This included two females phenotypically assigned to A. arapahoe that shared COI mitochondrial haplotypes with species in other genera: specimen 222 with Capnia gracilaria and specimen 225 with Capnura wanica.
These sequences were unlikely to represent paralogous nuclear  decepta (mean 53.6%, range, 48.5%-59.4%) and specimens were heterozygous at the majority of these SNPs (mean 85.8%, range 73.5%-96.8%). No individual of A. arapahoe for which we had SNP data had a genotype indicative of any level of introgression resulting from backcrosses to either parental species (in which case an individual would have ≥75% of the diagnostic alleles of one parent) or mating between hybrid individuals (in which case levels of heterozygosity would be ≤25%; Figure 5). Finally, all SNPs fixed for a single allele in A. arapahoe (n = 1,275) were shared with A. decepta and Capnia gracilaria, that is, no SNPs were diagnostic for A. arapahoe.
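The decision rule implied by these thresholds can be written down explicitly. The sketch below computes, for one individual, the fraction of diagnostic alleles derived from one parent and the fraction of heterozygous diagnostic loci, then applies the ≥75% and ≤25% cutoffs quoted above. The function names, data layout, and class labels are assumptions for illustration and are not the procedure used to produce Figure 5.

```python
def hybrid_summary(genotypes, diagnostic):
    """Summarize one individual at diagnostic loci.

    genotypes: dict mapping locus -> 2-tuple of alleles.
    diagnostic: dict mapping locus -> (allele of parent A, allele of parent B).
    Returns (fraction of alleles from parent A, fraction of heterozygous loci).
    """
    a_alleles = 0
    het_loci = 0
    n_loci = 0
    for locus, (allele_a, allele_b) in diagnostic.items():
        g = genotypes.get(locus)
        if g is None or any(x not in (allele_a, allele_b) for x in g):
            continue  # skip missing or non-diagnostic calls
        n_loci += 1
        a_alleles += sum(1 for x in g if x == allele_a)
        het_loci += g[0] != g[1]
    if n_loci == 0:
        raise ValueError("no usable diagnostic loci for this individual")
    return a_alleles / (2 * n_loci), het_loci / n_loci


def classify(parent_a_fraction, heterozygosity):
    """Apply the thresholds quoted in the text; labels are illustrative."""
    if max(parent_a_fraction, 1 - parent_a_fraction) >= 0.75:
        return "parental or backcross-like"
    if heterozygosity <= 0.25:
        return "hybrid x hybrid-like"
    return "consistent with F1"


if __name__ == "__main__":
    # Mean values reported in the text for A. arapahoe.
    print(classify(0.536, 0.858))
```

An F1 individual is expected to carry one allele from each parent at every diagnostic locus, so its parental fraction should sit near 0.5 and its heterozygosity near 1.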
TABLE 1 Mean pairwise genetic differences (%) among haplotypes within (on the diagonal) and between (below the diagonal) Arsapnia arapahoe and related or sympatric capniid stoneflies for two mitochondrial genes and one nuclear gene

| DISCUSSION
Collectively, the genetic analyses of specimens from nine separate locations demonstrated that A. arapahoe consisted of individuals that were the first-generation progeny of female A. decepta and male Capnia gracilaria. The phylogenetic analyses of two mitochondrial genes and one nuclear gene did not delineate A. arapahoe as a taxon distinct from its more common, widespread, and sympatric congener, A. decepta. Specimens of A. arapahoe often shared haplotypes with A. decepta and always occupied the same terminal clades.
Interspecific differences between A. arapahoe and A. decepta were typical of differences found within, not between, other species of stoneflies, and were comparable to intraspecific mitochondrial variation throughout much of the animal kingdom (Goldstein & DeSalle, 2010), providing no evidence of any degree of lineage sorting.
Although analyses of sequences of ITS1 were less informative, they did reveal monophyly among the Arsapnia group, which included analyses have patterns that strongly parallel those that were indicative of hybridization (Figures 2 and 3). For example, male Sierracapnia palomar have a long, narrow epiproct markedly different from other members of Sierracapnia (Bottorff & Baumann, 2015), the species is a local endemic that is geographically disjunct from other congeners, and its haplotypes cluster with species of Arsapnia. Evaluating whether hybridization influences this species hypothesis, however, will require focused sampling of additional capniid stoneflies in and around its small range in central California.
Zoogeographic patterns suggest that the hybrids constituting A. arapahoe may be relatively restricted in their distribution. Arsapnia decepta is primarily a species of small and sometimes intermittent streams of the southwestern United States and adjacent Mexico that makes its most northerly advance along the Colorado Front Range.
In contrast, Capnia gracilaria appears to occur in larger streams and the majority of its range is from South Dakota to Oregon and north through the Rocky Mountains to Alaska, with scattered observations in Arizona, New Mexico, and the Great Basin (Baumann, Sheldon, & Bottorff, 2017; DeWalt, Maehr, Neu-Becker, & Stueber, 2018; Kondratieff & Baumann, 2002; Nelson & Baumann, 1989 Nevertheless, such collections are the foundation of ecological, genetic, and taxonomic exploration for many taxa. Our interrogation of the status of A. arapahoe was motivated by the conflict between morphological and molecular interpretations of this species and made possible by a comprehensive museum collection. Despite recent calls to bolster the ranks of traditional taxonomists and the collections they steward (Morrison, Sillett, Funk, Ghalambor, & Rick, 2017), the taxonomic impediment remains. There are too few taxonomists, and they are confronted by waves of genetic data simultaneously suggesting candidate species and challenging long-standing species hypotheses. Technological advances that facilitate recovering genome-wide data for many species, including from environmental samples for which detected taxa are never observed (Deiner et al., 2017), and the increasingly sophisticated algorithms for genetically driven species delimitation (Luo, Ling, Ho, & Zhu, 2018), make it likely that species discovery and revision will increasingly be crowdsourced to nontaxonomists such as ecologists and geneticists. The lack of consensus among species concepts and the variation in how genetic data are interpreted (Sukumaran & Knowles, 2017), however, suggest that an integrative taxonomy using multiple data sources (Padial, Miralles, De la Riva, & Vences, 2010) should be the standard, and emphasize that expertise in morphological assessment remains indispensable (Zhou et al., 2016).
Robust taxonomies also rely on the taxonomist's exploration of novel habitats and the ecologist's systematic inventory of representative ones (Sheldon, 2016) to produce the comprehensive zoogeographical sampling that is paramount to delimiting and describing biodiversity (Young, McKelvey, Pilgrim, & Schwartz, 2013).
Such an intensive approach may only be practical for taxa, or the hypotheses they represent, that are of intense conservation interest. This level of attention is being directed at many groups of freshwater species, including stoneflies, because they are disproportionately represented in lists of imperiled taxa (Strayer & Dudgeon, 2010; Williams et al., 2011). In the U.S., federal listing under the Endangered Species Act is typically reserved for those rare and declining taxa, or the distinct population segments thereof, that appear most at risk. The highly restricted range and limited abundance of A. arapahoe, in light of the existing and proposed developments in this region, met those requirements and elevated this species to candidacy for listing (U.S. Fish and Wildlife Service, 2012). This status, however, rests on the notion that it represents a valid taxon to which a name may be applied. With respect to animal taxa of hybrid origin, the International Commission broadly, these individuals are further evidence of the ubiquity of hybridization even between species thought to be reproductively isolated by morphology and behavior, and a reminder to consider this phenomenon as a potential source of variation in taxonomic and phylogenetic studies. We have little doubt that further instances of unrecognized hybridization, and its taxonomic consequences, will become apparent as genomic exploration of understudied groups continues.

ACKNOWLEDGMENTS
We thank Boris Kondratieff, Chris Verdone, and staff at the

CONFLICT OF INTEREST
None declared.