Mitochondrial DNA analyses revealed distinct lineages in an alpine mammal, Siberian ibex (Capra sibirica) in Xinjiang, China

Abstract Maternal lineages of mitochondrial DNA (mtDNA) are recognized as important components of intra and interspecific biodiversity and help us to disclose the phylogeny and divergence times of many taxa. Species of the genus Capra are canonical mountain dwellers. Among these is the Siberian ibex (Capra sibirica), which is regarded as a relic species whose intraspecific classification has been controversial so far. We collected 58 samples in Xinjiang, China, and analyzed the mtDNA genes to shed light on the intraspecific relationships of the C. sibirica populations and estimate the divergence time. Intriguingly, we found that the mtDNA sequences of C. sibirica split into two main lineages in both phylogenetic and network analyses: the Southern lineage, sister to Capra falconeri, consisting of samples from Ulugqat, Kagilik (both in Xinjiang), India, and Tajikistan; and the Northern lineage further divided into four monophyletic clades A–D corresponding to their geographic origins. Samples from Urumqi, Sawan, and Arturk formed a distinct monophyletic clade C within the Northern lineage. The genetic distance between the C. sibirica clades ranges from 3.0 to 8.6%, with values of F ST between 0.839 and 0.960, indicating notable genetic differentiation. The split of the genus Capra occurred approximately 6.75 Mya during the late Miocene. The Northern lineage diverged around 5.88 Mya, followed by the divergence of Clades A–D from 3.30 to 1.92 Mya during the late Pliocene and early Pleistocene. The radiation between the Southern lineage and C. falconeri occurred at 2.29 Mya during the early Pleistocene. Our results highlight the importance of extensive sampling when relating to genetic studies of alpine mammals and call for further genomic studies to draw definitive conclusions.


| INTRODUC TI ON
The genus Capra inhabits restricted mountain areas in southern Europe, the Middle East, and Central Asia of the Palearctic region.
Gene introgression between ancestral bezoar and ibex types was likely occurred and may have contributed to the genetic makeup of now-living Capra species (Pidancier et al., 2006).
The Siberian ibex, also known as Asiatic or Asian ibex, is the genus's largest and heaviest-built species, with the longest horns on males (Fedosenko & Blank, 2001). It has the broadest distribution than any other Capra species, ranging from the Himalayas mountains in northern India and China through the Pamir plateau in eastern Pakistan, Afghanistan, Kyrgyzstan, Kazakhstan, and China to the Altai mountains in Mongolia, Russia, and China ( Figure 1). This species is listed as "Near Threatened" both in the red list of IUCN and in the Red Book of Kyrgyzstan (Ingeborg et al., 2018;Reading et al., 2020), and one of the category II key protected wildlife in China (National Forestry and Grassland Administration, National Park Administration, 2021). Its population is decreasing due to trophy hunting in most of its range (Ingeborg et al., 2018), poaching, habitat degradation or fragmentation, conflicts with domestic animals, and natural diseases (Reading et al., 2020). As such, there are scarce genetic studies, especially in China.
The intraspecific classification of this species is one of the most complex problems and is still controversial. According to the characteristics of body shape, coat color, and horns, Fedosenko and Blank (2001) and Wilson and Reeder (2005) classified C. sibirica into four subspecies, namely: those in the Altai mountains in Russia, Northwest China, Western Mongolia, and Northeastern F I G U R E 1 Distribution of Siberian ibex (Capra sibirica) in Central Asia (www.iucnr edlist.org, downloaded on 23, March, 2022). The border lines are approximate. The round marks represent sampling sites from GenBank (out of Chinese territory) and this study (in Chinese territory). The different colors indicate different C. sibirica lineages: the blue, pink, yellow, red, and green correspond to clades A-E in the phylogenetic trees and networks, respectively. The numbers in parentheses represent the number of samples collected in this study.
India and Pakistan as C. s. sakeen Blyth, 1842. Smith and Xie (2008) and Wang and Jie (2004) regarded C. sibirica in the Kunlun and Karakoram mountains as a separate subspecies, although some disagree with this classification (Abdukadir, 2003;Gao, 2005). Grubb et al. (2005), however, believed that C. sibirica should be defined as a monotypic species without dividing any subspecies.
Topographic uplifts contribute to biodiversity evolution and have elevated speciation in mammals (Igea & Tanentzap, 2021). C. sibirica is a species of mountain dweller that inhabits the Altai, Tianshan, and Himalayan Mountains. These mountains are geographically remote and divided by the Junggar and Tarim basins. C. sibirica populations in the Altai and Himalaya mountains showed different external morphology (Sobanskiy, 1992). In light of these facts, we hypothesize that C. sibirica populations in different alpine habitats are likely genetically differentiated and may consist of several distinct populations.
Mitochondrial markers have been widely used in the phylogenetic studies of many taxa, including mammals (Abduriyim et al., 2022;Abduriyim, Nabi, & Halik, 2018;Abduriyim, Zibibulla, et al., 2018), and mitochondrial lineages were accepted as an important component of intra and interspecific biodiversity (Hu et al., 2021;Huang et al., 2023). Hence, to test our speculation, we analyzed mitochondrial cytochrome b (cytb) gene and control region (CR) sequences derived from feces, skin, or tissue samples collected in Xinjiang, China, together with sequences retrieved from GenBank. We aimed to elucidate the phylogeny and genetic structure of C. sibirica populations. Our findings are significant in terms of identifying conservation management units and providing effective protection.

| Sampling
In Xinjiang, 58 samples were collected, including 49 feces, six muscle, two skin, and one liver sample: 19 from Urumqi, 21 from Arturk, one from Sawan, 14 from Ulugqat, and three from Kagilik ( Figure 1, Appendix 1). The tissue samples, including muscle, liver, and skin, were taken from individuals that died of natural causes or individuals that were poached. For fecal samples, we first observed C. sibirica individuals in the distance, took their positions, and only their fresh feces were collected after they had left to ensure the individual identity of fecal samples collected during field work. Identities were also determined based on relative stool and urine positions, as well as stool size and distances (Abduriyim, Nabi, & Halik, 2018;Abduriyim, Zibibulla, et al., 2018). All fecal and tissue samples were preserved in 96% ethanol at −80°C, while skin samples were frozen in plastic bags.

| DNA extraction, amplification, and sequencing
The total genomic DNA of fecal samples was extracted using an Omega stool DNA extraction kit (D4015; Omega Bio-tec), and that of muscle and skin samples was extracted using a Tiangen tissue/blood DNA extraction kit (DP328-02; Tiangen), following the manufacturer's instructions. After electrophoresis detection, DNA concentration and purity were measured by the Thermo Nanodrop 1000 and stored at 4°C for later use. PCR products were run on 1% agarose gel electrophoresis to check relative purity and concentration and then were sent to Sangni Biotechnology (Shanghai) for sequencing. Raw sequences obtained were manually edited in SeqMan, aligned with MEGA v.6.0 (Tamura et al., 2013) using MUSCLE, and BLAST searched at NCBI to confirm their gene identity.
The phylogenetic analyses for cytb and cytb + CR sequences were performed in MEGA v.6.0 (Tamura et al., 2013) using the neighborjoining (NJ) method with the Kimura two-parameter model and in IQ-TREE v.1.6.8 (Minh et al., 2020) using the maximum likelihood algorithm (ML). Nodal supports for both the NJ and ML trees were calculated using bootstrapping with 1000 replications. Medianjoining network analyses were conducted separately for the different dataset in PopART v.1.7 (Leigh & Bryant, 2015). The optimal model for Bayesian inference (BI) was chosen based on the Bayesian Information Criterion (BIC) by using Kakusan v.4 (Tanabe, 2011). The BI phylogenetic trees were constructed in MrBayes v.3.1.2 (Ronquist & Huelsenbeck, 2003). Two independent runs were performed for 2 million generations each, with Markov chains sampled every 100 generations. The convergence of parameter values for two runs was checked in Tracer v.1.7.2 (Rambaut et al., 2018), prior to discarding the first 25% trees as burn-in.  (Rambaut et al., 2018). TreeAnnotator v.2.6.7 was used to summarize the representative summary tree with the mean ages of all the nodes and the corresponding 95% highest posterior density (HPD) ranges and to calculate posterior clade probability for each node (Bouckaert et al., 2019). The maximum clade credibility tree visualization and geological timescale were plotted using the R packages ggtree v.3.2.1 and deeptime v.0.2.2 in R v.4.1.2 (Gearty, 2022;R Core Team, 2021;Yu, 2020).

| Sequence variation
Since the reliability of the beginning and ending sequence reads was very low, the final dataset of 54 cytb sequences we obtained was 1095 bp long, and the final dataset of 43 CR sequences was 935 bp long, with indels in some sequences. PCR amplification from a few samples failed (Appendix 1), probably owing to the longer storage time or low quality of the sample. Because sample size or haplotype number was limited for some localities (e.g., one sample from Sawan; one haplotype in the cytb gene from Urumqi, Kagilik, and Ulugqat-f; and one haplotype in the CR gene from Ulugqat-s), some genetic parameters were not calculated. Due to the fact that the sequences from Ulugqat were split into two distinct lineages in phylogenetic trees and networks, we calculated genetic parameters accordingly, namely separated into two groups (Ulugqat-s is grouped with C. sibirica, Ulugqat-f with C. falconeri) in Table 1 F S and Tajima's D of the Siberian ibex population were both positive, indicating that the C. sibirica population is shrinking and the rare alleles in the population exist at low frequency (Table 1).

| Phylogenetic relationships
In the phylogenetic analyses, we also included cytb sequences of C.  Figure 2). The genetic distance between the five clades ranged from 3.0% to 8.6%. The maximum value within the Northern lineage was between clades B and C (4.7%), while the minimum value (3.0%) was between clades A and B (

| Divergence time estimations
The divergence times of Capra species were inferred based on mtDNA cytb 679 bp sequences using BEAST2 (Figure 4, Table 3). sequences, although estimates from 376 bp cytb sequences were marginally higher than those from the other two datasets (Table 3).

| DISCUSS ION
Identifying valid species/evolutionary significant units and/or management units is a priority for effective conservation measurements and policies and plays a pivotal role in systematics and evolutionary biology. We found two major lineages in the phylogenetic tree and network analyses of cytb and CR gene sequences (Figures 2 and 3), one containing sequences from the northern range (clades A-D) that are sister lineages to other Capra species and the other containing sequences from the southern range that are sister to C. falconeri, indicating paraphyly of C. sibirica. The monophyletic separation of the Northern lineage from other Capra species is in line with the previous studies (Joshi et al., 2020;Kazanskaya et al., 2007;Pidancier et al., 2006;Zheng et al., 2020). Moreover, the genetic distances between clades A and E (Table 2) were greater than those between Capra species (Kazanskaya et al., 2007), and they were highly structured and divergent genetically. The divergence time of these clades' dates back much earlier (3.30-1.92 Mya). The grouping pattern of clades within the Northern lineage corresponds to their geographic origins. Based on the traffic-light system for designating taxonomic certainty in mammals (Kitchener et al., 2017(Kitchener et al., , 2022, it is highly likely that these clades represent distinct species. Different geographical populations' coat colors and body sizes support this notion (Fedosenko & Blank, 2001), and the nature of species' altitudinal migration may cause significant genetic differentiation of populations (Reading et al., 2020;Sobanskiy, 1992;Xu, 1995). Likewise, five different species of Ovis ammon were identified in different mountains in Northwestern China (Jiang et al., 2016), where they are sympatric to Clades A-D, respectively. However, the morphological and biogeographical data, including modeling of species distributions based on habitat usage, for example, Capra walie (Gebremedhin et al., 2009), will be helpful in confirming whether populations are valid species.
Southern lineage (Clade E) clustering with C. falconeri is consistent with Y-chromosomal and genomic phylogenies of C. falconeri and C.
Many studies on adaptive introgression in mammals may be useful as references for our study (Chen et al., 2018;Hedrick, 2013;Hu et al., 2019;Miao et al., 2017;Wu et al., 2018). According to the to- The trees rooted with Pseudois nayaur were bootstrapped with 1000 replicates in ML and NJ, and nodal supports of ≥80 (ML and NJ) and 0.7 (BI) were given. Haplotypic sequences for the C. sibirica species were used, and the numbers in the brackets next to them indicate the number of sequences of that particular haplotype. The GenBank accession numbers for the rest of the Capra species were provided on the right of their Latin names within brackets.
the Southern lineage being sister to other Capra species seems to be due to ILS. However, given that the clustering pattern of the Northern lineage and C. falconeri in our mtDNA trees (Figures 2   and 3) was in line with the topology of the WGS and Y chromosome (Zheng et al., 2020), and the divergence time between lineages was much earlier (Table 3), we cannot exclude the possibility of hybridization or introgression between these two species, owing to their geographic proximity, as reported in other sympatric bovine species (Fadakar et al., 2020). Thus, population genomic studies with dense geographical sampling are needed to unveil this phenomenon, as sufficiently many unlinked genes/loci always return the correct topology (Mossel & Roch, 2010).  Our divergence time estimation results for Caprinies ( Figure 4, Table 3) were similar to those of Ropiquet and Hassanin (2005) and Bibi (2013), suggesting that the Capra diversification after adaptive radiation of the tribe Caprini during the late Miocene (Gebremedhin et al., 2009) followed major climatic shifts that occurred during the late Miocene, Pliocene, and Pleistocene, similar to other terrestrial mammals (Ge et al., 2013). We deduced that radiation of the Northern lineage occurred concurrently with that of the other Capra species, resulting in the diversification of clades A-D, Clade E, and C. falconeri between 3.30 and 1.92 Mya. This is in agreement with the large number of species in several subfamilies that originated during this period, in which the diversification of C4 plants accelerated (Christin et al., 2009) and was followed by the replacement of a large area of forest on the earth by open grassland in "Nature's Green Revolution" (Osborne & Beerling, 2006). The subsequent tectonic changes in the Pamir and Tibetan plateaus (Cao et al., 2013;Thompson et al., 2015) and glacial changes (Ge et al., 2013)  (equal); validation (equal); visualization (equal). Daisuke Hirata: Data curation (equal); formal analysis (equal); methodology (equal); validation (equal); visualization (equal); writing -original draft (supporting); writing -review and editing (equal). Shamshidin Abduriyim: Conceptualization (lead); data curation (equal); formal analysis (supporting); funding acquisition (lead); investigation (equal); methodology (equal); project administration (lead); resources (lead); supervision (lead); validation (supporting); visualization (supporting); writing -original draft (lead); writing -review and editing (equal).

ACK N OWLED G M ENTS
We thank Saypulla Suba in the Forest and Grassland Bureau of Hami city and Abdugini Kerim in the Forest and Grassland Bureau of Arturk county for their help in field sampling. This study was funded by the National Natural Science Foundation of China (No. 32260328).

FU N D I N G I N FO R M ATI O N
This study was funded by the National Natural Science Foundation of China (No. 32260328).

CO N FLI C T O F I NTER E S T S TATEM ENT
The authors declare no conflict of interest.

DATA AVA I L A B I L I T Y S TAT E M E N T
The mtDNA sequences we obtained have been deposited in the DDBJ databases under accession numbers LC706311-LC706364 for cytb and LC711033-LC711075 for CR.