Genomic, ecological, and morphological approaches to investigating species limits: A case study in modern taxonomy from Tropical Eastern Pacific surgeonfishes

Abstract A wide variety of species are distinguished by slight color variations. However, molecular analyses have repeatedly demonstrated that coloration does not always correspond to distinct evolutionary histories between closely related groups, suggesting that this trait is labile and can be misleading for species identification. In the present study, we analyze the evolutionary history of sister species of Prionurus surgeonfishes in the Tropical Eastern Pacific (TEP), which are distinguished by the presence or absence of dark spots on their body. We examined the species limits in this system using comparative specimen‐based approaches, a mitochondrial gene (COI), more than 800 nuclear loci (Ultraconserved Elements), and abiotic niche comparisons. The results indicate there is a complete overlap of meristic counts and morphometric measurements between the two species. Further, we detected multiple individuals with intermediate spotting patterns suggesting that coloration is not diagnostic. Mitochondrial data recovered a single main haplotype shared between the species and all locations resulting in a complete lack of structure (ΦST = 0). Genomic analyses also suggest low levels of genetic differentiation (F ST = 0.013), and no alternatively fixed SNPs were detected between the two phenotypes. Furthermore, niche comparisons could not reject niche equivalency or similarity between the species. These results suggest that these two phenotypes are conspecific and widely distributed in the TEP. Here, we recognize Prionurus punctatus Gill 1862 as a junior subjective synonym of P. laticlavius (Valenciennes 1846). The underlying causes of phenotypic variation in this species are unknown. However, this system gives insight into general evolutionary dynamics within the TEP.


| INTRODUC TI ON
Species are the fundamental unit of biology, and as such their proper identification is critical for a variety of disciplines, including phylogenetics, biogeography, population genetics, and conservation (De Queiroz 2005). Traditionally species are diagnosed by one or more morphological differences (either fixed or in combination) between groups of organisms. In groups that generally display vibrant coloration patterns, such as tropical coral reef fishes, many species have been delimited through subtle color differences (Leray et al., 2010;Rocha, 2004;Taylor & Hellberg, 2005). For many reef fishes, color or squamation patterns have been used to identify genetic breaks between major biogeographic provinces (DiBattista et al. 2013;Coleman et al. 2016), and to detect areas with high rates of endemism, such as Hawaii (Randall & Rocha, 2009) and the Marquesas (Gaither et al. 2015). However, a number of studies have shown that differences in color patterns are not always indicative of reduced gene flow (Ramon, Lobel, & Sorenson, 2003;Lin, Sanchez-Ortiz & Hastings, 2009;Schultz et al. 2007), and can be discordant with patterns of genetic structure (DiBattista et al., 2015;Gaither et al., 2014;Leray et al., 2010). Taken together, these studies indicate that color patterns alone are not well-suited for defining species limits, but should be used in concert with other measurements to ensure an accurate reflection of evolutionary history.
The tropical Eastern Pacific (TEP) is a marine biogeographic region that spans 29° of latitude from Magdalena Bay, Mexico, to the Gulf of Guayaquil, Ecuador (Robertson & Cramer, 2009). Numerous studies have categorized the TEP into three to five biogeographic provinces based on the distribution records of fishes (Briggs, 1974;Briggs & Bowen, 2012;Hastings, 2000;Robertson & Cramer, 2009;Spalding et al., 2007). This region has been partially isolated from the Indo-Pacific since the Miocene, and completely separated from the Atlantic since the closure of the Isthmus of Panama. The biodiversity of the TEP pales in comparison to that of its neighboring Central/West Pacific region, and it has consequently been discussed as having "reduced speciation capacity," particularly in several iconic reef-fish families (Cowman & Bellwood, 2013). Still, speciation within the TEP is facilitated by the limited connectivity between the offshore islands and the continental coast (Allen & Robertson, 1994). Examples include the high rates of fish endemism of the Galapagos (~17% endemic species), Clipperton atoll (~7% endemic species), Cocos Island (~4%), and the Revillagigedos (~8%; Cortés, 2012;Robertson & Cramer, 2009). Many of these offshore endemics are distinguished by coloration differences from their continental congeners, and for some groups, multiple offshore islands have their own endemic species. For example, in Holocanthus angelfishes,
The present study focuses on two species of Prionurus surgeonfishes distributed latitudinally throughout the TEP: P. punctatus occurs from the Gulf of California to Costa Rica, while P. laticlavius extends from Costa Rica to Ecuador, also occupying offshore islands of the TEP (Figure 1; Robertson & Allen, 2015). This pattern of distribution is somewhat unexpected, as surgeonfishes have extremely high dispersal potentials (Doherty, Planes, & Mather, 1995), and several species lack population structure across entire ocean basins (Dibattista, Wilcox, Craig, Rocha, & Bowen, 2011;Eble, Rocha, Craig, & Bowen, 2011;Eble, Toonen, & Bowen, 2009). In fact, while seven surgeonfish species regularly occur in the TEP (Allen & Robertson, 1994), the two species of Prionurus are the only surgeonfishes in the region that are not also present in the Indo-Pacific. Furthermore, these two species are nearly identical phenotypically. In the description of P. punctatus, Gill notes that "it widely differs from the previously known [P. laticlavius] by its spotted body; in other respects it is most nearly allied to the Prionurus laticlavius from the Galapagos Islands" (Gill 1862). The situation is further complicated by a recent phylogenetic analysis of the genus, where a multilocus approach did not recover these two species as reciprocally monophyletic . However, that particular study was based on three individuals of P. punctatus and two of P. laticlavius, and it is possible that the loci did not provide the resolution needed to distinguish shallow divergences . Considering their distribution across the continental waters of the TEP, as well as their morphological and phylogenetic similarities, it would be interesting to explore potential differences at the genomic level that could diagnose P. punctatus and P. laticlavius. This would clarify the status of these species, while providing insight into the patterns of genomic divergence of closely related species in the region.
Here, we expand upon the results of  by including individuals from several locations across the TEP and by adding genomic analyses between the two species. In addition to genetic data, we gathered traditional morphological and meristic data for both species across their ranges and compared them to original species descriptions and type material. We then examined if ecological factors may be responsible for any divergences between these species in order to assess possible speciation drivers along the coastal TEP.

| Phenotypic and morphological comparisons
To assess species limits in this system, both molecular and specimen-based approaches were used. An in-depth morphological comparison of these two species has never been conducted and could reveal more characters consistent with species diagnoses than just squamation patterns. For this purpose, specimens for P. punctatus and P. laticlavius were examined from across their distributions for phenotypic and morphological variation. Standard measurements and meristic counts were taken for each specimen following those reported in Randall (2001). This included counting the spines and rays of the dorsal, anal, and pectoral fins, and measuring the body depth, predorsal length, pelvic-fin and anal-fin lengths in proportion to standard length. The two species mainly differ in the presence or absence of dark spots covering the body; thus, photographs of all specimens were taken to determine how consistent spotting pattern is as a character across the entire TEP. All measurements were made with digital calipers, and averages were calculated for each species.

| Molecular sampling and extraction
To have a better understanding of the genetic divergence between P. punctatus and P. laticlavius along the mainland TEP, we sampled at three localities: Baja California, Mexico; Guanacaste, Costa Rica; and Las Perlas Islands, Panama. This sampling scheme targets two extreme locations, where only a single species is reported in the literature (Mexico for P. punctatus, and Panama for P. laticlavius), as well as one location where the two species overlap in their recorded distributions (Guanacaste, Costa Rica). Samples were obtained between 2012 and 2015 using either nets along the shore or pole spears while SCUBA diving. Tissue samples were taken from pectoral fins, gills, or muscle tissue and stored in 95% EtOH. Once in the laboratory, tissue samples were stored in a −80°C freezer prior to sample preparation. When possible, voucher specimens were fixed in formalin and deposited at the Louisiana State University Museum of Natural Science.
Genomic material was extracted from each sample using the Qiagen DNeasy Blood and Tissue extraction kit following manufacturers protocols. Extracts were then quantified using a Qubit 2.0 fluorometer with a dsDNA BR Assay Kit (Life Technologies). Quality of genomic extractions was assessed via gel electrophoresis, with a 1% agarose gel using SYBR Safe DNA gel stain (Invitrogen) and 6x blue/ orange loading dye (Promega). All extracts were then kept at −20ºC prior to library preparation and amplification.

| Mitochondrial sequencing and analysis
To determine if our increased sampling effort was enough to resolve the relationships of these two species, we amplified all samples for the mitochondrial COI barcoding region. Primers and PCR reactions protocols were identical those described in  and can be found in the appendix. All samples were purified and sequenced in both forward and reverse directions using the Genomic Sequencing and Analysis Facility at the University of Texas at Austin. Sequencing was performed on an Applied Biosystems 3730 sequencer. All sequences were edited and aligned using Geneious 6.0.5 (Biomatters), and all alignments were checked manually. Haplotype networks were created using the TCS networks option in PopART (Clement, Posada, & Crandall, 2000). Summary statistics (haplotype and nucleotide diversities, Φ ST ), and Fu's F statistic (Fu, 1997) were calculated using Arlequin 3.5 (Excoffier, Laval, & Schneider, 2005). An AMOVA was conducted to test for population structuring between the two species, as well as between sampling localities, using 50,000 permutations in Arlequin. These summary statistics were calculated for both species and for all sampling locations.

| Genomic library preparation, sequencing, and analysis
For each sample, ~0.5-1ug of DNA was sonicated to ~600 bp using an Episonic 1000E sonicator with 15-s pulse intervals. Fragmentation was verified on a 1% agarose gel, and the process was repeated as necessary. Library preparation was conducted using a KAPA Hyper Library Prep Kit (KAPA Biosciences) using 10 bp TruSeq-style oligonucleotide dual-indexing barcodes . Library The sequenced libraries were demultiplexed, and barcodes, lowquality base calls, and reads shorter than 40 bp were removed using Trimmomatic (Bolger, Lohse, & Usadel, 2014) as part of the program Illumiprocessor (Faircloth, 2013). Sequences were then assembled into de novo contigs using Trinity 2.0.6 with default parameters (Grabherr et al., 2011), and these were mapped to UCE probes using the Phyluce 1.5 pipeline (Faircloth, 2015). Sequence data were then processed in two ways optimized for phylogenomic or population genomic analyses.
For phylogenomic analyses, contigs were first aligned in the Phyluce pipeline using Mafft (Katoh & Standley, 2013) with the notrim option. Internal trimming using gblocks (Castresana, 2000) was then conducted on this alignment prior to outputting a final 70% complete data matrix. These alignments were then concatenated, and a maximum-likelihood phylogenomic tree was then constructed using RAxML v8.1.24 (Stamatakis, 2014) on the CIPRES scientific gateway portal (Miller, Pfeiffer, & Schwartz, 2010). Two samples of P.
biafraensis were included as outgroups for rooting the tree, as a previous study indicates this is the sister clade to the TEP species . All analyses were completed using the GTRGAMMA model for bootstrapping, with 1,000 bootstrap iterations, and the rapid bootstrapping option (−x) selected. All nodes with a bootstrap value <50 were then collapsed.
Meanwhile, for the population genomic analyses, a reference dictionary was created to assist in SNP alignment using Picard (http://broadinstitute.github.io/picard/). This dictionary was created using the sample that recovered the most UCE loci. The reference was then indexed using SAMtools ). All samples were then aligned to this reference using BWA

| Ecological comparisons
Considering the broad geographic range occupied by these sister species, it is quite possible that they are occupying ecologically distinct habitats, which could promote divergence even in the presence of gene flow (Bernardi, 2013;Rocha & Bowen, 2008;Rocha, Robertson, Roman, & Bowen, 2005). To test this, niche equivalency and similarity tests were conducted to determine if these two species are occupying similar habitats in the TEP (Broennimann et al., 2012). This approach uses kernel density smoothing to compare the density of species occurrence in environmental space using occurrence and environmental data. Occurrence data for both species was acquired from the Global Biodiversity Information Facility (GBIF) using the R package RGBIF (Chamberlain et al., 2017). Locality information was checked manually for errors, verifying species assignments with vouchered museum specimens or photographs. Eleven environmental layers that summarize bathymetry and annual properties of sea surface salinity (SSS) and sea surface temperature (SST) for the TEP were downloaded from the MARSPEC database (Sbrocco & Barber, 2013; http://www.marspec.org). These included: distance to shore, depth, mean annual range, and annual variance of SSS and SST, as well as the SSS of the wettest and driest months, and SST of the coldest and warmest month of the year. The comparison tests used here are bivariate, thus a principal components analysis was conducted using all 11 environmental layers, and the top two axes were kept for subsequent analyses. Niche equivalency and similarity tests were conducted in the R package ENMTools (Warren, Glor, & Turelli, 2010).

| Phenotypic and morphological data
In total 169 vouchered museum specimens ( in P. laticlavius), but the ranges of these counts overlapped between the two species (Table 1).

| Mitochondrial COI and sampling
In total, 53 individuals were collected, including 25 P. punctatus, 23 P. A portion of the mitochondrial COI gene (546 bp) was successfully amplified for all individuals. Regardless of how the data were analyzed, all results revealed low haplotype and nucleotide diversity.
In total, nine haplotypes were recovered: one main haplotype shared between 45 individuals and eight singleton haplotypes ( Figure 2).
There was no genetic structure between either species or between any of the localities (Φ ST = 0, for all comparisons

| UCE phylogenomics and population genomics
UCEs were successfully sequenced for 49 individuals: 23 P. punctatus, 24 P. laticlavius, as well as two individuals of P. biafraensis used as outgroups. The average number of sequencing reads per individual was 2.8 million and ranged from ~941,000-4.7 million. A data TA B L E 1 Averages and ranges of meristic and morphological measurements of the two species matrix with a completeness of 70% was assembled for phylogenomic analyses, which contained 866 UCE loci, with an average UCE locus length of 963 bp. The resulting phylogenomic hypothesis failed to recover the two species as reciprocally monophyletic, with overall low support throughout the tree (Figure 3a).
Meanwhile, after filtering, the population genomic approach identified a total of 8,757 SNPs in the data, which was reduced to 864 SNP-  Table 3.
DAPC analyses that included P. biafraensis suggested the most likely number of clusters to be two, with the sister-species pair P. punctatus and P. laticlavius together in a single group. This pattern could be driven by large genetic divergence between P. biafraensis and both TEP species, which could mask any subtle differences between the two TEP species. However, when the outgroup P. biafraensis is removed, the most likely number of clusters recovered is one, with both TEP species clustering together. This result can also be seen in a PCA of the SNP dataset, which reveals both species completely overlapping in 95% confidence intervals (Figure 3b). These results are mirrored by our STRUCTURE analyses. When testing between K = 1-5, a comparison of model outputs with the Evanno method recovered K = 2 as the most likely result, with K = 1 the second most likely number of clusters (Supporting information Appendix S1: Table S2).
However, the two clusters recovered do not correspond to the two TEP species, but rather differences in allele frequencies for particular sets of loci (Figure 3c). Examining the distribution of individual locus  Appendix S1: Figure S3)   This study represents the most comprehensive morphological analysis for P. laticlavius (including the former P. punctatus), as it includes historical specimens of both phenotypes, as well as individuals collected from offshore islands (Galapagos and the Revillagigedos).

| D ISCUSS I ON
Overall, our results are very clear in showing complete overlap of all meristic counts and measurements between the two phenotypes.
Perhaps the most unique observation is that the spotting pattern is not discrete, as suggested by the type specimens of these species. Several individuals display faint spots on parts of their bodies (Supporting information Appendix S1: Figure S1), and while these phenotypic traits could be interpreted as evidence of hybridization without any other information, the lack of any genetic structuring between the species suggests that this is merely an intermediate phenotype between two populations.
Mitochondrial analyses revealed a single main haplotype distributed across the entire coastline of the TEP, resulting in low haplotype and nucleotide diversities. This genetic signature is typically observed in groups that have recently experienced a population bottleneck, or recent founder events (Grant & Bowen, 1998).
A founder event seems unlikely given that the TEP Prionurus are the sister group to P. biafraensis from the eastern Atlantic and must have had a common ancestor in the Central American Seaway prior to the closure of the Isthmus of Panama .  . These climatic shifts also correspond with the appearance of upwelling areas and ENSO oscillations in the TEP (Cortes, 1997;Cortés, 2003). All of these changes contributed to a period of rapid community turnover in the reef structure of the TEP from a community composed of Atlantic-related corals, to a community of sparsely distributed Pacific corals (Leigh, O'Dea, & Vermeij, 2014;López-Pérez, 2017).
This turnover could easily result in population fluctuations and potential population bottlenecks.
In our comparison of nuclear loci, which are gathered from SNPs distributed throughout the entire genome, a similar pattern of little to no differentiation between the phenotypes was recovered. This dataset failed to reveal any alternatively fixed alleles between the two phenotypes. However, a low, but significant, F ST was found between spotted and nonspotted individuals. This value was comparable to F ST estimates of different collection sites (e.g., Mexico vs. Central American localities), and these results suggest a possible signature of isolation by distance, which has been previously reported for other fishes of the TEP (Bernal, Gaither, Simison, & Rocha, 2017;Lessios & Baums, 2017). It would be tempting to suggest that SNPs gathered from UCEs lack sufficient signal to detect differentiation at this time scale given the conserved nature of these genomic regions. However, only the cores of these loci are conserved, and variation increases in the regions flanking this core Gilbert et al., 2015). In fact, SNPs gathered from UCEs have been proven informative in detecting population structure at shallow timescales for various taxa (e.g., in birds: Harvey, Aleixo, Ribas, & Brumfield, 2017;Oswald et al., 2016;Smith, Harvey, Faircloth, Glenn, &Brumfield, 2013, andfishes: Burress et al., 2018). Thus, it is likely that the similarities between the mitochondrial and UCE loci reflect an actual shared history, and that this situation echoes one in which a single species displays color variation across its range.
We compared the abiotic habitats that these phenotypic variants occupy to test whether ecology could be a driving factor in the divergence of these two groups. Using locality data from across the entire range of this species, we failed to detect any significant differences in the abiotic habitats that the two phenotypes occupy.
However, these data are all associated with the abiotic habitat of the region (e.g., temperature, salinity), and they do not take into account any biotic factors (e.g., coral cover, species interactions, productivity), which could differ throughout the range of the focal species. Despite this limitation, Prionurus appears to traverse multiple ecotypes of the region. This is well illustrated by the Central American faunal gap, an ~1000km stretch of coastal habitat lacking coral reef ecosystems that has been suggested as a barrier between the Mexican and Panamic provinces of the TEP, influencing the connectivity and distribution of multiple species (Briggs, 1974;Hastings, 2000;Robertson & Cramer, 2009;Springer, 1959 this scenario could also explain the modal differences observed in the pectoral-fin and dorsal-fin ray counts between the two phenotypes. In species that are more dispersal limited, or that have more rapid turnover rates with shorter generation times, these environmental fluctuations and corresponding population bottlenecks could result in isolated populations that ultimately form new species, suggesting a mechanism in which TEP in situ speciation can occur in allopatry (Hastings, 2000). However, this study also highlights why in situ speciation along the TEP coastline may be uncommon in large-bodied fishes, as these surgeonfishes are perhaps some of the best dispersers among reef fishes, and have long generations times (~45 years for other species of this genus; Choat & Axe, 1996) allowing populations to regain connectivity after population bottleneck events.
Additionally, such severe population crashes could also easily result in high extinction rates, contributing to the reduced diversification rates previously observed for this region (Cowman & Bellwood 2013). Ultimately, an extended genomic approach that targets whole genomes, including samples from oceanic islands, could reveal the molecular underpinnings of the squamation patterns of P. laticlavius.
Further studies including a diverse set of endemic taxa in the TEP are needed to shed light on how speciation occurs in one of the most distinctive tropical marine regions of the world.

ACK N OWLED G M ENTS
The authors would like to thank the following people, in no particular order, for assistance during field work: Jagjit Chadha, Eric Garcia,

CO N FLI C T O F I NTE R E S T
None declared.