DNA barcoding of economically important freshwater fish species from north‐central Nigeria uncovers cryptic diversity

Abstract This study examines the utility of morphology and DNA barcoding in species identification of freshwater fishes from north‐central Nigeria. We compared molecular data (mitochondrial cytochrome c oxidase subunit I (COI) sequences) of 136 de novo samples from 53 morphologically identified species alongside others in GenBank and BOLD databases. Using DNA sequence similarity‐based (≥97% cutoff) identification technique, 50 (94.30%) and 24 (45.30%) species were identified to species level using GenBank and BOLD databases, respectively. Furthermore, we identified cases of taxonomic problems in 26 (49.00%) morphologically identified species. There were also four (7.10%) cases of mismatch in DNA barcoding in which our query sequence in GenBank and BOLD showed a sequence match with different species names. Using DNA barcode reference data, we also identified four unknown fish samples collected from fishermen to species level. Our Neighbor‐joining (NJ) tree analysis recovers several intraspecific species clusters with strong bootstrap support (≥95%). Analysis uncovers two well‐supported lineages within Schilbe intermedius. The Bayesian phylogenetic analyses of Nigerian S. intermedius with others from GenBank recover four lineages. Evidence of genetic structuring is consistent with geographic regions of sub‐Saharan Africa. Thus, cryptic lineage diversity may illustrate species’ adaptive responses to local environmental conditions. Finally, our study underscores the importance of incorporating morphology and DNA barcoding in species identification. Although developing a complete DNA barcode reference library for Nigerian ichthyofauna will facilitate species identification and diversity studies, taxonomic revisions of DNA sequences submitted in databases alongside voucher specimens are necessary for a reliable taxonomic and diversity inventory.

However, over the years, reports have shown a decline in the number of fish caught from most Nigerian inland waters (Oguntade, Oketoki, Ukenye, Usman, & Adeleke, 2014). This could be attributed to inadequate management of fisheries, climate change, pollution, and degradation of water bodies (Odo, Nwani, & Eyo, 2009). The impact of environmental pollution and other human activities on fish diversity cannot be overestimated. Hence, improved management plans and conservation approaches will aid in preventing loss of Nigerian fish diversity.
Accurate identification of species is a pivotal component of conservation efforts. The use of traditional methods (morphological characters) in species identification is common in Nigeria. In fact, about 48% of Nigerian freshwater fish species have been characterized using this method (Nwani et al., 2011). Although the morphological approach can yield incorrect identifications (Ward, Hanner, & Hebert, 2009), its accuracy has not yet been tested for Nigerian fishes. The main challenge of morphology lies in discriminating closely related organisms (Rasmussen, Morrissey, & Hebert, 2009). This has paved the way for the development of improved molecular approaches for the identification of fish species (Abdullah & Rehbein, 2017; Nazarov et al., 2012; Nwani, Eyo, & Udoh, 2016; Ratnasingham & Hebert, 2007).
To date, there are no studies on the DNA barcoding of freshwater fishes in the north-central region of Nigeria. Herein, we explored the use of DNA barcoding as a reliable molecular tool for the identification of fish species obtained from north-central Nigeria. We evaluated and compared the GenBank and BOLD databases for use in species identification. Furthermore, we compared the taxonomic reliability of the morphological method against DNA barcodes. Finally, we examined the usefulness of DNA barcode reference data in uncovering cryptic lineage diversity in fish species from north-central Nigeria.

| Sample collection
We collected one hundred thirty-six (136) freshwater fish samples belonging to 53 species between 2016 and 2017 (Table 1).
Our sampling covered nine (9) inland water bodies, among them Oyun and Asa.

KEYWORDS
Biodiversity, conservation policy, geographic variation, integrative taxonomy, mitochondrial DNA, population divergence

TABLE 1 List of species, including voucher specimen number, species name, locality information and GenBank accession number

| DNA extraction, polymerase chain reaction (PCR) amplification and sequencing
We used proteinase K to digest the ethanol-preserved tissues and followed the standard phenol-chloroform extraction procedure to extract the total genomic DNA (Sambrook & Russell, 2001).

| Sequence assembly and data analyses
The nucleotide sequences were viewed and confirmed by eye using SeqMan™ II (DNASTAR Lasergene 7). They were aligned in MEGA 7.0 using ClustalW (Kumar, Stecher, & Tamura, 2016). [...] and maximum intraspecific distance, average genetic distance to the nearest neighbor and the nearest neighbor member for each species.
We used MEGA v. 7.0 to create a neighbor-joining (NJ) tree based on the Kimura 2-parameter (K2P) distance (Kimura, 1980) and estimated the intergeneric, inter- and intraspecific sequence divergences. For the NJ tree, we considered bootstrap values of 95% and above as strongly supported. Following Decru, Van Ginneken, Verheyen, and Snoeks (2016), an identification was considered successful if the query sequence and its match were conspecific and failed if they were allospecific.
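The K2P distance named above can be illustrated with a minimal sketch. It assumes two pre-aligned sequences of equal length and skips any site with a gap or ambiguous base; the function name `k2p_distance` is hypothetical, and this is not a reproduction of how MEGA computes the value internally. With P the proportion of transitional differences and Q the proportion of transversional differences, the distance is d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)).

```python
import math

PURINES = {"A", "G"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Assumption: both sequences come from the same alignment (equal length).
    Sites where either sequence has a gap or ambiguous base are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    # math.log raises ValueError if the sequences are too divergent (saturated)
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For two sequences that differ by a single transition over ten sites, P = 0.1 and Q = 0, so the sketch returns -0.5 ln(0.8), roughly 0.112, slightly above the uncorrected proportion of 0.1.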
Upon discovery of deeply divergent lineages within species, further genetic analysis was carried out to investigate the possibility of cryptic lineage diversity. To infer this, we downloaded additional related sequences of such species from GenBank (Table S1).
The Bayesian Inference (BI) analysis was rooted with a closely related species as out-group taxon. We partitioned the COI gene into codon positions 1, 2 and 3. Evolutionary model testing for each codon partition was performed using JMODELTEST (Posada, 2008). The following models were selected: GTR + G for the first and third codon positions, and F81 for the second codon position.
Phylogenetic relationships were evaluated using a Bayesian framework as implemented in BEAST v1.6.1 (Drummond & Rambaut, 2007). The analysis was run for 20 million generations, sampling every 1,000th generation. Two independent runs with four Markov chain Monte Carlo (MCMC) chains were performed. We discarded the first 25% of the sampled trees as burn-in, that is, the samples drawn before the log-likelihood scores stabilized. A 50% majority rule consensus of the remaining sampled trees was constructed and visualized using FigTree v1.4.2 (Rambaut, 2012).
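The codon partitioning described above can be sketched as follows. This is only an illustration under the assumption that the alignment starts at the first position of a codon; the function name `split_codon_positions` and the `frame_offset` parameter are hypothetical and are not taken from the study's workflow.

```python
def split_codon_positions(alignment, frame_offset=0):
    """Split aligned coding sequences into three codon-position partitions.

    alignment: dict mapping a sequence label to its aligned sequence.
    frame_offset: number of leading bases to drop so that the first
        retained base is codon position 1 (an assumption of this sketch).
    Returns a list of three dicts: positions 1, 2 and 3.
    """
    partitions = [{}, {}, {}]
    for label, sequence in alignment.items():
        in_frame = sequence[frame_offset:]
        for position in range(3):
            # every third base, starting at this codon position
            partitions[position][label] = in_frame[position::3]
    return partitions
```

Each partition can then be written to its own file and passed to a model-testing program, so that a separate substitution model is chosen per codon position.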
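The run settings above imply a fixed number of retained trees, which the short sketch below works out. It assumes that the burn-in is applied to the count of sampled trees and ignores the initial state that some programs also log; the function name `retained_samples` is hypothetical.

```python
def retained_samples(generations, sample_every, burnin_fraction):
    """Return (sampled, discarded, retained) tree counts for one MCMC run."""
    sampled = generations // sample_every
    discarded = int(sampled * burnin_fraction)
    return sampled, discarded, sampled - discarded


# Settings stated in the text: 20 million generations, sampling every
# 1,000th generation, first 25% discarded as burn-in.
sampled, discarded, retained = retained_samples(20_000_000, 1_000, 0.25)
```

Under these assumptions each run samples 20,000 trees, discards 5,000 and retains 15,000, so the two independent runs together contribute 30,000 trees to the consensus.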

| Morphology-based species identification
All 136 specimens collected (100%) were identified by morphology and comprised 53 species belonging to 28 genera and 18 families (Table 2). Of these, 46 (86.80%) were identified to species level, and seven (13.20%) could not be assigned to species level and were thus referred to genus.

| Amplification success and sequence statistics
We obtained 130 sequences (all >500 bp) belonging to 53 morphologically identified species (Table 1). The COI sequences and related information for each specimen were also made publicly accessible via the BOLD Systems website within the "Diversity studies and DNA barcoding of Nigerian freshwater and marine fishes" project, as part of the international Fish Barcode of Life project.

| DNA sequence similarity-based species identification
All 130 successfully amplified sequences were cross-referenced against the GenBank and BOLD databases. One hundred twenty-seven se-

| Mismatch in taxonomy
Of the 53 morphologically identified species, 34 (64.20%) received the same species name from the morphological approach and from the GenBank database. Using the BOLD database, only 19 (35.80%) received the same species name from both the morphological approach and the database (

| Tree-based identification
We used HM883007 (Pellonula leonensis) and AP009231 (Pellonula vorax) as out-group taxa to root the NJ tree for the pooled COI sequences of freshwater fishes from north-central Nigeria (Figure 1). [...] were clearly separated from their sister species (Figure 1). Thus, the NJ tree revealed that species identifications based on morphological evidence and on molecular methods are broadly consistent in most cases.

| Identification of unknown fish tissue samples
TABLE 4 Intergeneric pairwise genetic distance (%) of COI sequence data of freshwater fishes from north-central Nigeria using Kimura-

All sequences of the four unknown fish samples collected from fishermen were successfully amplified. Our query search of COI sequences of unidentified species 1 and 2 in GenBank showed 100% sequence [...] (Figure 2a). Furthermore, unidentified species 2 clustered with H. niloticus from River Asa (Figure 2b). Unidentified species 3 and 4 showed, respectively, 99% and 100% DNA sequence similarity with Mormyrops anguilloides (AP011576) and S. intermedius (HM882935). In addition, unidentified species 3 and 4 clustered with M. anguilloides from Rivers Moro and Niger (Figure 2c), and S. intermedius from River Asa (Figure 2d), respectively. Therefore, DNA barcoding could aid in the identification of unknown tissue samples. These unknown tissue samples could possibly be from fish collected from rivers across Nigeria.
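The similarity-based matching of unknown samples can be illustrated with a minimal sketch that applies the ≥97% cutoff stated in the abstract. It assumes the query and reference sequences are already aligned to equal length and uses a plain site-by-site identity; it is not the search algorithm used by GenBank or BOLD, and the names `percent_identity` and `best_match` are hypothetical.

```python
def percent_identity(query, reference):
    """Percentage of identical sites between two aligned sequences.

    Sites with a gap or ambiguous base in either sequence are ignored.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for q, r in zip(query.upper(), reference.upper()):
        if q not in "ACGT" or r not in "ACGT":
            continue
        compared += 1
        matches += q == r
    if compared == 0:
        raise ValueError("no comparable sites")
    return 100.0 * matches / compared


def best_match(query, references, cutoff=97.0):
    """Return (label, identity) of the closest reference sequence.

    The label is None when the best identity falls below the cutoff,
    meaning the query is left unidentified at species level.
    """
    label, identity = max(
        ((name, percent_identity(query, seq)) for name, seq in references.items()),
        key=lambda pair: pair[1],
    )
    return (label if identity >= cutoff else None), identity


# Toy sequences for illustration only, not data from the study.
toy_references = {"reference_A": "ACGTACGTAC", "reference_B": "ACGTTTTTAC"}
hit = best_match("ACGTACGTAC", toy_references)
```

Here `hit` is `("reference_A", 100.0)`. A query whose closest reference falls below the cutoff returns `None` as its label, which corresponds to a sample that cannot be assigned to species from the reference data.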

| Uncovering cryptic diversity
Our NJ tree-based analyses of COI sequences of freshwater

| DISCUSSION
In our study, the DNA barcoding approach was highly effective for species identification. The success rate of the DNA barcoding approach in our study (95.60%) was higher than the 93% success rate reported for Canadian freshwater fish (Hubert et al., 2008) (Lakra et al., 2015). In most cases, COI sequences effectively clustered conspecific and congeneric species. This was also observed in similar studies of fishes from the Upper Parana River Basin (Pereira, Hanner, Foresti, & Oliveira, 2013), freshwater fishes from southeastern Nigeria (Nwakanma et al., 2015; Nwani et al., 2011) and freshwater fishes from southwestern Nigeria (Falade, Opene, & Benson, 2016

| Application of DNA barcoding reference data
We reported two applications of DNA barcoding: identifying unknown samples from fishermen and uncovering cryptic diversity.
In the case of unknown samples, DNA barcode reference data were very useful in identifying the unknown fish samples. Hence, the acquisition of DNA barcoding data will aid species identification, which in turn will help in the conservation and management planning of Nigerian fishery resources. [...] There is the possibility that some of the identified lineages exhibit minute morphological differences that may have been overlooked in the past. Moreover, given the high rate of biodiversity loss, the distinct lineages uncovered in our study require consideration in conservation strategies and fishery management practice (Fraser & Bernatchez, 2001).
Comparison of our COI sequences with others from GenBank revealed the existence of several more complexes of potentially cryptic lineages within S. intermedius. Contrary to a previous study (Nwani et al., 2011) that hypothesized two lineages of S. intermedius in Nigeria, our study revealed the presence of more than two lineages within this species in Nigeria. Increasing the sample size and geographic sampling range may uncover more cryptic diversity within S. intermedius. However, our data are insufficient to explore the hypothesis of speciation within S. intermedius; doing so requires sampling this species across a broad geographic range.
Careful examination of possible morphological variation and further genetic analyses would help determine whether the detected cryptic lineages warrant species status. Thus, our study emphasizes the need for more complete reference DNA barcode data across Nigeria for the detection of more cryptic diversity in freshwater fish.

| Reliability of DNA barcode reference data
The success of using DNA barcoding approach for species identifica-

| CONCLUSION
Our study demonstrates the usefulness of DNA barcoding for identifying fish species in north-central Nigeria and for uncovering lineage diversity. It contributes to the construction of DNA barcode reference data for the Nigerian fish fauna and provides data for species identification, which in turn will aid the management of freshwater fishes in Nigerian inland water bodies. Furthermore, it has provided additional data to the major databases GenBank and BOLD. We also confirm that

CONFLICT OF INTEREST
None declared.

[...] critically revised the manuscript. All authors read and approved the final manuscript.

DATA ACCESSIBILITY
DNA sequences: GenBank Accession Nos MG824552-MG824685; for each individual, details on locality information and GenBank Accession no. for its sequence data are shown in Table 1.