Phylogeographic structure of the dwarf snakehead (Channa gachua) around Gulf of Tonkin: Historical biogeography and pronounced effects of sea‐level changes

Abstract Geological events, landscape features, and climate fluctuations have shaped the distribution of genetic diversity and evolutionary history in freshwater fish, but little attention has been paid to that around the Gulf of Tonkin; therefore, we investigated the phylogeographic structure of the dwarf snakehead (Channa gachua) on Hainan Island and mainland China, as well as two populations in Vietnam. We attempted to elucidate the origins of freshwater fish in South Hainan by incorporating genetic data from DNA markers on both the mitochondrial cytochrome b gene (cyt b) and the nuclear recombination‐activating gene 1 (RAG‐1). Mitochondrial phylogenetic analysis identified two major lineages (lineages A and B), which may represent separate species. Divergence data suggested that C. gachua populations diverged between 0.516 and 2.376 myr. The divergence of the two cryptic species is congruent with sea‐level rise, which subsequently isolated Hainan from the mainland. During the Pleistocene glaciations, the entire region of the Gulf of Tonkin and the Qiongzhou Strait became part of the coastal plain of the Asian continent, which might have resulted in the current distribution patterns and dispersal routes of C. gachua populations. The formation of three sublineages in lineage A indicated that the Gulf of Tonkin was a geographical barrier between Hainan Island and mainland China but not between Vietnam and Hainan Island. The results of this study may help to elucidate the origins of freshwater fish in South Hainan and the phylogeographic structure of C. gachua.


| INTRODUC TI ON
Continental islands provide excellent opportunities for examining phylogeographic patterns in freshwater fish, with their biogeographic relationships that reflect historical rather than present-day drainage connections (Han et al., 2019). The dispersal of primary freshwater fish will be strongly affected by geological events. Islands are typically separated from continents by the sea, and fluctuating sea levels may cause continental islands to connect and separate from the mainland repeatedly and provided a suitable route for species to migrate. Combined with climatic oscillations, local geological events play a role in driving genetic patterns leading to diversification, speciation, and biogeography. The South China Sea has fluctuated greatly and experienced considerable climatic changes during the Pleistocene when glacial cycles acted as a major driver shaping the present-day diversity and distribution patterns of species.
Hainan Island is located off the south coast of China and is separated from mainland China to the north by the Qiongzhou Strait and from northern Vietnam to the west by the Gulf of Tonkin (Zhu, 2016).
The topography of Hainan Island is diverse, with the Wuzhishan and Yinggeling Mountain Range (WY Range) approaching elevations of 1,800 m and serving as the core of the uplift. The topology of the island rises steeply from the central and southern regions and extends north to a wide plain. The four largest rivers on the island, the Nandu, Changhua, Wanquan, and Linshui Rivers, originate from the central mountainous area. A comparison of the freshwater fish species of Hainan Island, northern Vietnam, and mainland China around the Gulf of Tonkin (Beibu Gulf) showed that they were quite similar to each other. The withdrawal of the South China Sea during several inter-ice ages, Gulf of Tonkin was exposed due to the decrease in sea level, might have led to the colonization and divergence of some fish populations in the coastal rivers of northern Vietnam and mainland China (Voris, 2000;Zhao et al., 2007;Zhou et al., 2017).
According to previous phylogeographic studies, freshwater fishes as migrants probably moved between mainland China and island populations during low sea level via water system confluence (Chen et al., 2007;Yang et al., 2016). Similar phenomena were also found in the phylogeographic studies of many freshwater fishes in the drainages of Hainan Island or on both sides of the Qiongzhou Strait (e.g., Glyptothorax hainanensis, see Chen et al., 2007; Garra orientalis, see Yang et al., 2016; genus Opsariichthys, see Lin et al., 2016;and Opsariichthys hainanesi, see Zhang et al., 2020).
In addition, some studies have shown that freshwater fish on Hainan Island may have originated in northern Vietnam in the west, rather than in southern China in the north (Lin et al., 2010). Our previous studies have shown that some freshwater fish in southwestern Hainan Island are more closely related to freshwater fish from the Red River in mainland China (Zhang et al., 2020). Moreover, Lin et al. (2010) examined the population structure of Reeves's butterfly lizard (Leiolepis reevesii) and proposed the migration route of L. reevesii from Vietnam to Hainan followed by a second wave of dispersal from Hainan to mainland China. Accordingly, the species of Hainan Island might originate from the north (China) or west (Vietnam). However, to the best of our knowledge, there is no research on the population relationship between North Vietnam and Hainan Island. Therefore, we propose investigating whether the freshwater fish in southwestern Hainan Island and the freshwater fishes of the Red River in mainland China belong to the same fish zoogeographical region. The above studies reflect that the freshwater fishes of Hainan Island may not have a single origin, but due to the lack of information on freshwater fishes in northern Vietnam, the relationship between freshwater fish in Vietnam and Hainan has not been studied. Therefore, the phylogenetic process of freshwater fish on Hainan Island has not been effectively and reasonably explained. To date, there have been no comparable data on freshwater fish from geographically separated populations between Hainan Island and the Red River, although such data will provide additional insights into the phylogeographic history of the species and contemporary factors shaping the population genetic structure and supporting the implementation of effective management and conservation strategies for freshwater fish.
Channa gachua (Hamilton, 1927) belongs to the order Anabantiformes, family Anabantiformes, and genus Channa, and is a freshwater fish restrictively distributed in various water systems of the Hainan Island zoogeographical region and Nujiang-Lancangjiang zoogeographical region in China but is primarily distributed in Nepal, the Indian subcontinent, Afghanistan, and Southeast Asia abroad (Chu & Chen, 1990). This species is usually found in hill streams, while adults inhabit medium to large rivers, brooks, rapid-running mountain streams, and stagnant water bodies, including sluggish flowing canals (Kottelat et al., 1998;Taki et al., 1978). The species is not distributed in the coastal waters of southern China. According to this distribution pattern, C. gachua is an ideal fish for inferring the biological consequences of historical biogeography in this area.
In the present study, to address the abovementioned problems, we used the mitochondrial DNA (mtDNA) cytochrome b gene (cyt b) and nuclear gene rag 1 region (RAG-1) to establish the phylogeographic patterns in South China and northern Vietnam. There are two major questions investigated by our study: (1) What is the genetic diversity and genetic structure of C. gachua? and (2) How did C. gachua colonize the rivers of different geographical districts on Hainan Island?
Whether the freshwater fish in southwestern Hainan Island and the freshwater fishes of the Red River in mainland China belong to the same fish zoogeographical region has not been elucidated. The results of this study may help to elucidate the population genetic structure of C. gachua that may contribute to conservation, management, and sustainable utilization of freshwater fish on Hainan Island and may also help to provide context for other threatened endemic fish species in the same or similar river systems.

| Sampling and molecular methods
A total of 336 individuals were collected from twelve localities on Hainan Island and mainland China and two populations in Vietnam between July 2017 and December 2019 (Table 1; Figure 1). Samples were collected from the field sites with seines, fatally anesthetized with St. Louis,MO), and fixed and stored in 95% ethanol. Sampling information and localities are provided in Table 1  were assessed using CHROMAS software (Technelysium Pty Ltd, Australia), and the sequences were manually edited using BIOEDIT 6.0.7 (Hall, 1999).

| Sequence alignment, phylogenetic analyses, and Bayesian species delimitation
The phase of the alleles was resolved computationally using PHASE v. 2.1.1. For the RAG-1 gene, phylogenetic inference of independent alleles was also conducted on gamete phases by phase testing with DnaSP v.5.0 software (Librado & Rozas, 2009) with a probability threshold of 0.9 to resolve alleles. Sequences of the entire cyt b gene and RAG-1 gene were aligned with Clustal X v2.1 (Thompson et al., 1997), and the evolutionary substitution models for mitochondrial (cyt b) and nuclear genes (RAG-1) were the GTR + G + I model using the Akaike information criterion (AICc) in SMS (Smart Model Selection in PhyML) (Lefort et al., 2017).
A phylogenetic analysis was performed using the program BEAST 1.8.2 (Drummond et al., 2013) with 10 7 MCMC steps, taking the first 10% as burn-in and estimating the divergence times of the major lineages from the most recent common ancestor (TMRCA) by running 10 6 generations. We employed TreeAnnotator v.2.2.1 (Rambaut & Drummond, 2015) in the BEAST package to summarize trees and displayed them using FigTree 1.4.3 (Rambaut, 2016) to display posterior probabilities (PP) and mean node ages for each node.
A molecular clock in the cyt b gene was calibrated using a divergence rate of 1.05% per million years (Dowling et al., 2002). The minimumspanning tree was created via the MINSPNET algorithm, as implemented in Arlequin v3.5 (Excoffier & Lischer, 2010). We utilized the Bayesian phylogenetics and phylogeography (BPP) method to test the cryptic lineages of C. gachua (Yang, 2015). We performed the Bayesian phylogenetics and phylogeography (BPP) method under the multispecies coalescence model (MSC), as implemented in bpp 4.1.3 software (Yang, 2015). To examine whether the two lineages inferred from Cyt b trees represent different species, we used BPP with both cyt b and nuclear genes. We set the inverse-gamma distribution G (3, 0.01) for theta and G (3, 0.004) for tau. We ran the MCMC analyses for 500,000 generations, sampled every five generations, and discarded samples from the first 50,000 generations as burn-in. Each analysis was repeated at least twice to confirm consistency between runs. Topology based on Cyt b was used as a guide tree. We used the A10 model (species delimitation = 1, species tree = 0) and the species tree of the BEAST analyses (see below) as a userspecified guide tree (Rannala & Yang, 2013;Yang & Rannala, 2010) treating the two distinct evolutionary lineages.

| Population genetic structure
The genetic diversity of each population was estimated by using the number of haplotypes (N), haplotype diversity (h) (Nei & Tajima, 1983), and nucleotide diversity (π) (Jukes & Cantor, 1969) using DnaSP v5.0 software (Rozas et al., 2003). The existence of a phylogeographic structure was examined with two genetic differentiation indices (G ST and N ST ) following the method of Pons and Petit (1996) in software DnaSP v5.0 (Rozas et al., 2003). To quantify the genetic structure of C. gachua populations, we performed the pairwise F ST values and AMOVA (analysis of molecular variance) by Arlequin version 3.5 (Excoffier & Lischer, 2010). AMOVA was used to partition variation among samples into within-population (F ST ), within-group (F SC ), and among-group (F CT ) components to evaluate the most likely population configuration and geographical subdivision with K 2 P distance and 20,000 permutations. The table was prepared using R v.3.2.4.   (Dupanloup et al., 2002) was also employed to delineate the best clustering of population groups with the maximum extent of genetic differentiation. These analyses TA B L E 1 Sampling localities, abbreviations, latitude and longitude, sample size, haplotype diversity, and nucleotide diversity for mitochondrial cyt b and nuclear RAG-1 data of Channa gachua Locality Abb.

Latitude and longitude
Mitochondrial cyt b employed 500 simulated annealing steps and compared maximum indicators of differentiation (F CT ) when the program was instructed to identify K = 2-6 potential population units.

| Historical demography, divergence time estimation, and biogeographic analysis
We implemented neutrality tests (Tajima's D test (Tajima, 1989) and Fu's Fs test (Fu, 1997)) and mismatch distributions to assess signatures of recent historical demographic events using DnaSP v5.0 software (Librado & Rozas, 2009). Bayesian skyline plot (BSP) analyses for the effective population size changes over time and the time to the most recent common ancestor (TMRCA) for each lineage were evaluated in BEAST v1.8.2 for C. gachua (Drummond et al., 2013).
Two independent MCMC simulations ran for 200,000,000 generations to ensure convergence of all parameters (ESSs > 200); the first 10% of samples for each chain were discarded as burn-in. Bayesian skyline plots (BSP) and TMRCA analysis were plotted using Tracer v1.6 (Rambaut et al., 2014). A 1.05%/MY divergence rate has been calibrated for the mtDNA cyt b genes in multiple bony fishes for population expansion (Dowling et al., 2002).

| Phylogenetic reconstruction and Bayesian species delimitation
A comparison between the N ST and G ST fixation indices demonstrated that N ST was larger than G ST (0.879 and 0.352, respectively), demonstrating the presence of phylogeographic structure in C. gachua (Pons & Petit, 1996). Island and Vietnam, and lineages Ⅱ and Ⅲ included specimens from mainland China ( Figure S1).
Species delimitation analysis in BPP delimited 2 divergent mitochondrial and nuclear lineages of C. gachua as separate species.
Bayesian phylogenetics and phylogeography (BPP) and minimumspanning networks were largely concordant. Lineage A was distributed on Hainan Island and Vietnam as a separate species, and the specimens from mainland China (lineage B) were delimited as a separate putative species (Figure 2).

| Population genetic structure of C. gachua
The value of overall F ST was 0.711, and the range of pairwise F ST values between populations varied from 0.020 (between MC3 and MC4 populations) to 1.000 (between NN2 and V1 populations) in mtDNA cyt b gene. Moreover, the pairwise F ST values between populations from different regions were relatively high and significant, indicating that high differentiation existed among these populations (  (Table 3).

This result indicated that the Gulf of Tonkin was a geographical barrier between Hainan Island and mainland China but not between
Vietnam and Hainan Island (Table 3)

| Historical demographic expansion, molecular dating, and historical biogeography
According to the neutral test, the values of Tajima's D and Fu's Fs tests were positive, and multimodal mismatch distributions were observed for all three clades (Table 1)

| High population differentiation and low diversity of Channa gachua
For conservation planning for freshwater fishes, the application of genetic diversity to reflect the ability of populations to adapt to environmental factors has been presented over the last several decades (Bernatchez, 2016 (Zhang et al., 2020) and G. orientalis (θπ = 0.77%) (Yang et al., 2016), but higher than that of the mainland China and Vietnam populations (0.341% and 0.524%, respectively). In general, refugial populations could harbor considerably higher levels of genetic diversity and private haplotypes than migratory populations (Avise, 2000). Therefore, the population on Hainan Island, which exhibited the highest genetic diversity, may have served as a genetic reservoir for C. gachua, from which complex colonization scenarios could have originated. Our study also suggested that C. gachua possesses a low level of genetic diversity in a single population (θπ = 0.04%-1.39%), which is suspected to cause genetic drift by effective population size decline and a high level of genetic diversity in the total population (θπ = 3.18%). As in previous studies, the reason for effective population size decline in freshwater fish has occurred not only due to overfishing but also due to massive loss of their natural habitat in mainland China (Kang et al., 2014).
Similarly, overfishing and anthropogenic disturbance were also suggested to be major factors leading to population declines in C. gachua on Hainan Island and mainland China (Kang et al., 2014).  was achieved relatively recently, given that populations had weak genetic differentiation between Hainan Island and Vietnam, supporting the occurrence of enabling mixing of populations across this ocean barrier (Gulf of Tonkin). To the best of our knowledge, this report describes the first time that the Gulf of Tonkin has not represented an important geographical barrier between Hainan Island and Vietnam for freshwater fish.

| History demography
Glacial cycles during the Pleistocene significantly influenced the demographic history of freshwater fishes in mainland China, such as Rhodeus ocellatus (Yang et al., 2018), Squalidus argentatus (Yang et al., 2012), G. orientalis (Yang et al., 2016), and O. hainanensis (Zhang et al., 2020). However, neutral tests, mismatch distribution analyses, and Skyline plots indicated that range expansion did not occur in C. gachua. Moreover, a Bayesian skyline plot (BSP) indicated a decline in effective population size 75,000 years ago (75 Kya), coinciding with the last glacial period in relation to sea-level rise in the Gulf of Tonkin. We suggest that Hainan Island is located in the transitional zone between the subtropical and tropical zones and that C.
gachua exhibits a more restricted habit in the headwaters of a drainage system. During glacial cycles leading to sea-level fluctuations, the habitat area will not increase for C. gachua, which indicates that the effects of population expansion are more significant for downstream fishes than for upstream fishes. Therefore, the results of demographic analysis of O. hainanensis in downstream fish displayed a pattern of population expansion (Zhang et al., 2020), but that of C. gachua did not.

| Phylogeography of Channa gachua
According to all results, our study suggests that the ancestral populations of C. gachua were distributed in mainland China, Vietnam, and Hainan Island before the Pleistocene (2.376 myr). During glaciations after the warmer Miocene and Pliocene, marine transgressions were extensive, and most of the continental shelves of the South China Sea, including the Gulf of Tonkin, were underwater (Duan & Yongyang, 1989;Liu, 1994;Voris, 2000;Zhao et al., 2007). However, the ancestral populations of C. gachua were distributed in mainland China and Hainan Island before the Pleistocene, and mainland China and Hainan Island were connected as the Hainan Peninsula (Zeng & Zeng, 1989). Thus, the gene flow of C. gachua between mainland China and the southern Hainan Peninsula (South China Sea) was absent during glaciations.
Approximately between 2 and 2.5 myr, Hainan was first isolated from the mainland due to volcanism and sinking land in the current Gulf of Tonkin (e.g., Zeng & Zeng, 1989;Zhao et al., 2007).  (Duan & Yongyang, 1989;Xie et al., 2012)  These results also showed that the WY Ranges were raised before the Qiongzhou Strait formed.
Previous studies (e.g., Zhang et al., 2020) (Zhang et al., 2020). According to habitat and ecology, C. gachua occurs on the bottom of fast-flowing mountain streams. Thus, although the Gulf of Tonkin was exposed to glaciations, the populations of C. gachua were not distributed. Only the floods may have created temporary confluences allowing dispersal. Thus, our study suggests that the phylogeographic pattern of C. gachua was different from that of other fishes due to its habits.

| Molecular species delimitation
Several molecular studies suggest two cryptic species in C. gachua.
Phylogenetic analyses of the mtDNA Cyt b gene demonstrated that C. gachua is composed of two distinct lineages that exhibit deep genetic divergence, which is a well-established criterion for species delimitation. The results obtained for the nuclear RAG-1 gene that we examined were in keeping with the mtDNA finding, indicating that lineages A1-A3 and B are two cryptic lineages/species. Based on these nuclear and mitochondrial data, our study strongly suggests that there are two cryptic species distributed on Hainan Island (belonging to the Hainan Island zoogeographical region) and mainland China (belonging to the Nujiang-Lancangjiang zoogeographical region). Four well-supported evolutionary lineages were identified in the mtDNA trees, which may include two cryptic species (lineages A1-A3 and lineage B) (Figure 2). Moreover, the nuDNA trees also presented two divergent lineages, which was consistent with the mtDNA trees ( Figure S1). The haplotype networks of C. gachua demonstrated the same pattern as the phylogenetic analysis with 55 mutation steps between the two major lineages (lineages A and B) (Figure 3), implying deep divergence between them. The species delimitation method (BPP) was used to infer cryptic species in C. gachua ( Figure 2). The genetic differentiation (F ST ) between the two cryptic species was 0.916, which is also indicative of high differentiation between them (Balloux & Lugon-Moulin, 2002).

| Conservation implications
According to the model proposed by Moritz, evolutionarily significant units (ESUs) are recommended on the basis of reciprocal mtDNA monophyly, while management units (hereafter MUs) have been defined as geographical areas with restricted interchange in allele frequency distributions in mitochondrial or nuclear loci (Moritz, 1994).
Understanding genetic variability and phylogeography can contribute to the development and management of effective conservation programs. The importance of fish diversity protection is generally considered in South China, but there are few studies on the basis of fish diversity. In this study, two unique evolutionary lineages (A and B) within C. gachua were represented and should be considered two cryptic species (Figure 2). The importance of recognizing cryptic C. gachua species has implications for the assessment of conservation. The two cryptic species may represent a means of improving future conservation plans, which should be aimed at managing them independently; furthermore, they should be considered two ESUs. We suggest that protection is required for synchronization with the genetic uniqueness of the different MUs. In our study, some populations in the Changhua and Wanquan Rivers (lineage A3) were the first to differentiate as independent sub-branches, and their F ST was significantly high, suggesting that they should be regarded as different MUs. We also suggest that two lineages (A1-A2) should be regarded as different MUs distributed in the other population of Hainan Island and the population in Vietnam.

ACK N OWLED G M ENTS
We thank the Wuzhishan National Nature Reserve, Yinggeling

CO N FLI C T O F I NTE R E S T
The authors declare no competing interests in the preparation or execution of this study.

DATA AVA I L A B I L I T Y S TAT E M E N T
The data that support the findings of this study are available in GenBank (https://www.ncbi.nlm.nih.gov/), reference number (cyt b: MW233899-MW233949 and RAG-1: MW233950-MW234070).
Details regarding individual samples are available in Table 1.