Diversity of endemic cold‐water amphipods threatened by climate warming in northwestern China

Climate change threatens freshwater faunal diversity. To prioritize areas for conservation, patterns in the distribution of species must be understood. We apply genetic analysis and species distribution models to identify patterns in the distribution of freshwater amphipods around Xinjiang, China, and project the impact of climate change on endemic species.


| INTRODUC TI ON
Climate warming is dramatically altering freshwater ecosystems, and displacing species to higher latitudes or altitudes (Hou et al., 2022;Lindholm et al., 2012).Anthropogenic disturbance is accelerating the increase in average air temperatures, which increase water temperatures globally (Wanders et al., 2019).These changes may adversely affect the availability of habitat for and reproduction of aquatic animals (Barbarossa et al., 2021;Domisch et al., 2013;Warren et al., 2012), particularly cold-water species.
Because freshwater amphipods typically brood eggs and lack free-swimming larvae, their ability to disperse is poor and their thermal tolerances and geographical distributions are generally limited (Hou et al., 2022).Xinjiang, northwestern China, has a highly endemic freshwater amphipod fauna (Tong et al., 2022), largely because regression of the Tethys Ocean and uplift of the Tibetan Plateau has driven diversification in this region (Hou et al., 2011;Zheng et al., 2020).As ancient Tethyan relicts, amphipods from Xinjiang retain characteristics of their cold-water ancestors (Copilaș-Ciocianu et al., 2020) and are more closely related to northern Eurasia Gammarus lacustris Sars, 1863 andG. balcanicus Schäferna, 1922 clades than they are to those of oriental Gammarus (Hou et al., 2011).However, comprehensive studies on the diversity of freshwater amphipods in Xinjiang have not been conducted.
Phylogenetic diversity (PD) (Faith, 1992) is widely used to assess patterns in species diversity, and an understanding of it is important for conservation (Antonelli et al., 2022;Hu et al., 2021).
PD uses phylogenetic trees to demonstrate biodiversity from the perspective of evolutionary history (Rosauer et al., 2009).Rapid developments in statistical analysis and phylogeny have resulted in more biodiversity indicators based on PD being proposed, including calculations of endemic species diversity patterns such as weighted (Crisp et al., 2001) and phylogenetic endemism (Rosauer et al., 2009).The study of niches also contributes to an improved understanding of the distribution patterns of species and those environmental factors that affect them (Rödder et al., 2017;Wang et al., 2021;Zheng et al., 2021).
We compile and interrogate a comprehensive dataset of genetic, geographic, environmental, and ecological information for freshwater amphipods from around Xinjiang, China, to identify patterns in their PD, and the impact that a changing environment may have on them.Via construction of a time-scaled molecular phylogeny, we explore areas where diversity is concentrated, evaluate those environmental factors that most explain differences in species niches, and project the nature, extent, and implications of changes in freshwater amphipod habitat that might occur with ongoing climate warming (Figure 1a).

| Sample collection and DNA extraction
From 2001 to 2022, approximately 2000 individuals identified to 11 species (a Sarothrogammarus and ten Gammarus) were collected in surveys at 67 localities across Xinjiang, China (Figure 1b; Table S1).Of these specimens, 37 were selected from representative localities, encompassing morphological variability, for molecular sequencing (Table S2).Additionally, four individuals of G. pisinnus Hou et al., 2014, one individual of S. cataractae Stock, 1995, andfive individuals of Gammarus from Lake Ohrid (Hou et al., 2011), and five individuals belonging to Jesogammarus and Niphargus were selected for phylogenetic reconstruction and divergence-time estimation (Hou et al., 2022).
Genomic DNA was extracted from the legs of amphipods using a Tiangen Genomic DNA kit.Fragments of mitochondrial cytochrome oxidase subunit I (COI) and nuclear 28S rRNA were amplified at 45-49°C annealing temperatures.Detailed information about the primers are provided in Table S3.PCR products were sent to Sangon Biotech (Beijing) for Sanger sequencing, then assembled and aligned in MEGA11 (Tamura et al., 2021).COI sequences were translated into amino acids using invertebrate mitochondrial genetic code.The final 52-tip matrix was generated by concatenating COI and 28S together in PhyloSuite v1.2.2 (Zhang et al., 2020).To ensure accuracy in species delimitation, pairwise p-distances within and between species were calculated using MEGA 11 based on COI sequences.
For multiple specimens of a single species, pairwise distances of specimens were averaged before comparisons were made among species.

| Phylogenetic analysis and divergence time estimation
The best-fit substitution model was selected using Partitionfinder2 (Lanfear et al., 2017) based on the corrected Akaike information criterion (AICc).Maximum likelihood (ML) analyses were carried out in RAxML v8.2.10 (Stamatakis, 2014) using the substitution model GTRGAMMAI according to Partitionfinder2 results.The best-scoring ML tree and support values were estimated by executing 1000 rapid bootstrap inferences, followed by a thorough ML search.Polytomies of the phylogenetic tree were pruned to ensure that divergence time estimation could be performed.Three points were used for calibrations.(1) The crown age of G. pisinnus was set at 2.6-10 Ma based on gammarid fossils from the Late Neogene of Shanxi, China (Wei et al., 2021).Although fossils are poorly preserved and their taxonomic position is unclear, our morphological assessment indicates that they may belong to Gammarus.Specimens of G. pisinnus were collected near the type locality of these fossils; this species diverged from sister species approximately 6 Ma (Hou et al., 2014), which corresponds well with fossil ages.(2) The timing of the intralacustrine radiation of Gammarus species in Lake Ohrid was constrained at 1.36 Ma, based on the formation of an ancient lake ecosystem (Wilke et al., 2020).(3) Fossils of Niphargus were reported from Baltic ambers, approximately 45-50 Ma (Jażdżewski & Kupryjanowicz, 2010).Because a time-calibrated multilocus phylogeny revealed that Niphargus originated in the Middle Eocene, ~45 Ma (Borko et al., 2021), we constrain the ancestral age of Niphargus at 44-46 Ma.
Divergence time estimation was calculated using MCMC tree and BASEML programs under the GTR + G substitution models (model = 7) in the 'PAML' v4.9j package (Yang, 2007).A strict clock model was selected.The time unit was set to 100 Ma, and a maximum bound for the root was set to 130 Ma based on a fully annotated chronogram of Amphipoda (Copilaș-Ciocianu et al., 2020).
The setting of rgene_gamma was set to 1 1.668 according to the overall substitution rate.The independent rate model (clock = 2) was used to specify the rate priors for internal nodes.The MCMC run was first executed for 100,000 generations as burn-in and then sampled every 50 generations until 20,000 samples were collected.Two MCMC runs were compared for convergence and checked in Tracer v1.7 (Rambaut et al., 2018) to ensure that all parameters had effective sample sizes (ESS) > 200.A jackknife analysis was performed to test the sensitivity of time calibration.Calibration analyses were repeated three times; for each repetition, a different calibration point was removed to estimate divergence times.

| PD and endemism analyses
To assess patterns in the distribution of freshwater amphipods, biodiversity based on records from the 67 surveyed locations (Table S1) throughout the Xinjiang region was divided into 204 grid cells of 100 × 100 km spatial resolution.Genetic data available for paratype specimens were selected as representatives of that species for PD and phylogenetic endemism (PE) analysis based on a time-calibrated tree.Of genetic data available for G. lacustris, we randomly selected an individual as a representative for this analysis.For each grid cell, PD was calculated following Faith (1992), and is represented by the sum of the phylogenetic branch lengths spanning a set of taxa to the root of the tree.If only one species occurred within a grid cell, PD represents the length of the branch connecting the species to all other branches in the time-calibrated tree.To further identify if PD is significant after controlling for the influence of species richness, we used an independent swap algorithm (Gotelli, 2000) to calculate the standardized effect size of PD (sesPD), and to improve accuracy, further estimated the sesPD pattern after eliminating grid cells with fewer than two species.The randomization procedure was repeated 9999 times to compare if a significant difference existed between observed and random PD (Borko et al., 2022).Values of sesPD >1.96 indicate a significantly higher PD than expected (random PD) at a given level of species richness, and values below −1.96 indicate a significantly lower PD (Molina-Venegas et al., 2022).Faith's PD and sesPD were implemented in the 'picante' R package (Kembel et al., 2010).
Excepting G. lacustris, freshwater amphipods in Xinjiang are range-restricted (endemics).Traditional diversity indexes do not identify areas where spatially restricted evolutionary diversity are concentrated.To better understand patterns in the distribution of species diversity, we calculated weighted species endemism, and PE in the 'phyloregion' R package (Daru, Farooq, et al., 2020;Daru, Karunarathne, & Schliep, 2020).For weighted endemism (WE), we summed the weight coefficient of all species in each grid cell -the reciprocal of the total number of each species occupying the grid cell in the studied area -using a variant of Laffan and Crisp's weighted endemism (Daru, Farooq, et al., 2020;Daru, Karunarathne, & Schliep, 2020;Laffan & Crisp, 2003).The smaller a species' distribution range, the higher its weight value, so, areas where WE is high tend to have a higher concentration of endemic species.PE was measured following Rosauer et al. (2009).We regard the number of grid cells in which a species occurred to represent its geographic range.A higher PE value indicates that the grid cells contain phylogenetically distinct and spatially rare species (Gudde et al., 2013).Finally, we used natural a breaks (jenks) approach within ArcMap (Spatial Analyst module) to map diversity patterns in PD, sesPD, WE, and PE.

| Ecological niche analyses
To explore niche differentiation, species pairs were selected according to the criteria: (i) the species pair occurred near areas with high PE values; (ii) habitats between species were closely spaced; and (iii) each pair contained at least one range-restricted species.High PE areas contain geographically rare and most evolutionarily unique lineages (Murali et al., 2021;Rosauer & Jetz, 2015), and the species within them are most vulnerable to climate change.
In high PE areas, species pairs aid the identification of environmental factors that explain niche differentiation at local scales, the sensitivity of rare and unique species to environmental perturbations, and the threats that a changing climate may pose to them.
Because G. lacustris is widely distributed, its many distribution records can obscure the effect that some environmental factors have on niche differences between species in a small range.Because this may reduce the explanatory power of some environmental factors, niche differentiation between species might be unapparent.To minimize this possibility, we only considered records of G. lacustris when they surrounded another species in the same pair.G. zhouqiongi Zhang, Wang, Ge, Ma & Zhou and G. simplex) were quantified using n-dimensional hypervolumes (Blonder et al., 2014).
Niche differentiation within each pair was explored based on PE patterns.We estimated n-dimensional hypervolume using the hyper-volume_gaussian function in the 'hypervolume' R package (Blonder et al., 2018); hypervolumes were delineated using a Gaussian kernel density estimator (Blonder et al., 2018).After hypervolume construction, centroid distance and Jaccard dissimilarity were calculated to evaluate environmental niche overlaps within each species pair.The Jaccard dissimilarity index represents the level of overlap between two hypervolumes; the index ranges from 1 (full niche overlap) to 0 (no niche overlap) (Mammola, 2019).The centroid distance is the Euclidean distance between the centroids of two hypervolumes (Mammola, 2019).A relatively low Jaccard dissimilarity index and a high centroid distance in a species pair indicates a low level of niche overlap.
For hydrological variables we included annual land surface runoff (to represent water flow) from HydroBASIN database (Lehner & Grill, 2013), and average annual water temperature, based on averaged annual data from 1979 to 2000 (Wanders et al., 2019).
Environmental factors (temperature seasonality and average annual water temperature) that great differed in n-dimensional hypervolume analysis were retained.For species pairs, we performed principal components analysis (PCA) to explain niche differentiation using our retained environmental variables.
To examine the effects of climate change on species range from 1960 to 1990, and projected to 2050, species distribution models were built using MaxEnt v3.4.1 algorithm implemented in the 'ENMeval' R package (Kass et al., 2021).For the year 2050, we used the CCSM4 model, which performs well among those tested in the 5th Coupled Model Intercomparison Project (CMIP5) experiment (McSweeney et al., 2015).Models were run in representative concentration pathways of 2.6 (the most 'benign' scenario, RCP2.6) and 8.5 (the most extreme scenario, RCP8.5) (Tang et al., 2018).
Temperature seasonality and average annual water temperature had high explanatory power in isolating species in Xinjiang, combined with the result of n-dimensional hypervolumes.Temperature seasonality data for present and future climate scenarios were downloaded at a resolution of 2.5 arc-min.Because simulation data for average annual water temperature in the future (2050) scenario model did not exist, we considered that the increase in water temperatures in 2050 for both scenarios would be approximately half of that projected for future lake surface water temperatures (2071-2100).With lake surface water temperatures expected to increase by 1.0 [0.7, 1.7] °C under the RCP2.6 scenario, and 2.2 [1.8, 3.3] °C under the RCP8.5 scenarios for 2071-2100 (Wang et al., 2023), we added 0.2°C to each present-day annual water temperature grid value to represent expected water temperatures for the year 2050 in RCP2.6, and 1°C for RCP8.5 scenarios.
We used all available species records from Xinjiang and adjacent areas to consider all possible climate areas covered by these species in projection.To reduce spatial autocorrelation, to ensure that all records in 2.5-min raster grid cells differed from one another, species records were thinned using the 'spThin' package v0.S1).
To test the performance of our models, we partitioned the localities into testing and training bins using the 'jackknife' method for small data sets (e.g., < ca. 25 records) (Kass et al., 2021).About 10,000 background points were randomly selected for models from a defined buffer area (~500 km around occurrence points) to avoid areas where congeners were absent.We cropped a partial boundary of distributional maps based on known biogeographical barriers to further refine the projected distributions.A suite of models was created using regularization multipliers from 0 to 5 (increasing in increments of 0.25) with five combinations of feature classes (L = linear; LQ = linear and quadratic; H = hinge; LQH = linear, quadratic, and hinge; LQP = linear, quadratic, and product).The best model with the lowest AIC value among all models was selected.The true skill statistic (TSS) and the area under the receiver operating characteristic curve (AUC) of the best model were evaluated to ensure that it exhibited excellent predictive capacity.All models exhibited good predictive capacity, with TSS > 0.70 and AUC > 0.85 (Bellard et al., 2013;Gallardo et al., 2017).
To compare changes in potential future ranges, continuous habitat suitability results were converted to binary predictions using maximum training sensitivity and the specificity Cloglog threshold (Liu et al., 2013(Liu et al., , 2016)).To show change in range of G. brevipodus and G. takesensis, the Spatial Analyst module in ArcMAP was used to calculate area and latitude (raster cells) of suitable habitats in different scenarios.We also tested for a significant change in latitude of suitable habitats through present and two future scenarios using two-sample Wilcoxon tests.

| Phylogenetic structure and divergence times
Seventeen newly generated COI and 28S sequences were deposited in GenBank (Table S2).Pairwise p-distance analyses revealed genetic distances among the 11 species exceeded 12% for COI (Figure 2b; Table S5).We constructed a phylogenetic tree from the 52-tip data set (Figure S1).Divergence time estimates of major clades with the three calibration points correspond well with the calibration obtained after deleting the Shanxi fossil point (Table S6).Calibrations deleting the point of the genus Niphargus produced younger estimations, while calibration deleting the point of Lake Ohrid was older (Table S6).Accordingly, to analyse diversity patterns, divergence times were estimated using all three calibration points.
Phylogenetic placements of most clades in the ML analyses had similar topologies to those of Hou et al. (2011) andTong et al. (2022).The time-calibrated phylogeny (Figure 2a) reveals the genus Gammarus first diversified in Xinjiang during the Eocene, approximately 41 Ma, and that it then split into geographically distinct species during the Miocene.

| PD patterns
PD analysis indicates that grid cells with high PD values occurred mostly in the Tian Shan mountain region (Figure 3a).The sesPD and PD patterns are similar (Figure 3b; Figure S2).Most grid cells did not deviate from random expectation.Three grid cells (red) around the Tian Shan mountains had significantly high sesPD, two grid cells (red) around the Irtysh River in northernmost Xinjiang had significantly high sesPDs, and four grid cells (blue) had significantly low sesPD around the Junggar Basin.Cells with high WE values occurred along the western boundary of Xinjiang, suggesting that most species in in this region were range-restricted (Figure 3c).Grid cells with high PE are scattered among grid cells with high WE (Figure 3d).PE was high in the north of Irtysh River, Tian Shan mountains, and the eastern margin of Pamir, indicating that most spatially restricted PD occurred in these areas.

| Ecological niche comparison
Niche characterization with n-dimensional hypervolumes for the four species pairs revealed divergence in each pair across their distribution range (Figure 4, Table 1).The greatest niche differentiation occurred between G. lacustris and G. simplex (Table 1).Long distances between the niche centroids for temperature seasonality and average annual water temperature for the species pairs G. lacustris and G. simplex, G. lacustris and G. liuruiyui + S. yiiruae, and G. zhouqiongi and G. simplex, indicate dissimilarity in niches in multidimensional space (Figure 4bd).Niche dissimilarity between G. tianshan and G. brevipodus was most apparent in temperature seasonality (Figure 4a).PCA plots for temperature seasonality and average annual water temperature are consistent with hypervolume results (Figure 4).
All species distribution models performed well on TSS (mean 0.789) and AUC (mean 0.914), suggesting that each was highly robust (Table S7).Habitat suitability projections for G. brevipodus indicate range contraction and a slight northerly shift in future scenarios (Figure 5a-e).Suitable habitats reduced towards the west in future climate scenarios.The area of suitable habitat for both RCP2.6 and RCP8.5 scenarios reduced by ~39% in 2050 (Figure 5d).For G. takesensis (Figure 5f-h), only a small contraction in the range of suitable habitats occurs in the RCP2.6 scenario (Figure 5b), but future latitudinal shifts are apparent under both scenarios (a slight shift southward in RCP2.6 and a large shift northward in RCP8.5) (Figure 5j). of PD, the G. lacustris species complex has spread and colonized areas surrounding the Tian Shan mountains, having been influenced by tectonic events and the Paratethys Sea (Hou et al., 2022).
Although the number of species in Xinjiang affects assessment of PD patterns, sesPD analyses of different datasets (one dataset including F I G U R E 4 Pairplots representing niche differentiation for four species pairs.Clouds of coloured points for each species comprise 10,000 stochastically sampled points from the inferred hypervolume.The colour represents species for PCA analysis; the species distribution map is the same as for the hypervolume analysis.Significant differences in species niches within species pairs are denoted by asterisks and bold fonts.WT, annual water temperature; run_mm_syr, annual land surface runoff.

TA B L E 1
Pairwise niche differentiation between hypervolumes, estimated through centroid distance and Jaccard dissimilarity.all grid cells with species (Figure 3b) and the other including only grid cells with two or more species (Figure S2)) consistently reveal high PD in similar areas (Figure 3b).these endemic freshwater amphipods provides more detailed but easy overlooked information (i.e., it reveals the current restricted ranges of long-diverged lineages) (Rosauer et al., 2009;Rosauer & Jetz, 2015).
Endemic amphipods are particularly vulnerable to changes in climate and hydrology.Species with high dispersal ability can occupy more habitat and adapt to a wide range of climates (Sandel et al., 2011).However, range-restricted species, especially cold-freshwater amphipods, are unable to locate suitable habitat within short periods of time.Because their dispersal ability is reduced, the likelihood of their extinction is increased without the added influence of geographic factors (e.g., the mountain building movements, and waterway linkages).Therefore, an understanding of patterns in PE aids in the design of appropriate conservation strategies to protect endemic amphipod diversity in Xinjiang, and serves as an additional metric of diversity.

| Effects of climate change on freshwater amphipods
Areas with high PE values can indicate appropriate areas to protect to conserve freshwater amphipods.Identifying those environmental factors that most influence the distribution of species in these areas can assist with predicting future changes in their distributions.We report niche differentiation between geographically related species to be mainly affected by temperature seasonality and average annual water temperature (Figure 4, Table 1).Temperature seasonality have a significant effect on the geographic distribution of insects in temperate regions (Neff et al., 2022), freshwater fish (Carvajal-Quintero et al., 2019), fungal endophytes (Oita et al., 2021) and high-altitude birds (Thom et al., 2021).In a changing climate, temperature seasonality negatively affects the area of suitable habitat for cold-adapted species (Neff et al., 2022).Because increased water temperature reduces dissolved oxygen levels, respiration of aquatic organisms is also negatively affected (Dibble et al., 2018;Ficke et al., 2007;Ganser et al., 2015).
For cold-water amphipods, changes in seasonal temperatures and those in water temperatures negatively affect habitat suitability.Although most aquatic species are affected by climate change, their responses to it can vary.Environmental stress caused by climate warming causes trout to migrate to cooler altitudes (Hari et al., 2006).For mussels with limited dispersal ability, climate warming reduces habitat in southern Europe (Bolotov et al., 2018).
Habitat reduction caused by temperature has also been reported for range-restricted fish species (McMahan et al., 2020).Demersal fish species also experience habitat contraction and shift northward in a warming climate (Lamine et al., 2022).
Climate change, conservation, diversity, endemism, freshwater amphipods, temperature F I G U R E 1 Overview of analysis and freshwater amphipod sampling map, Xinjiang.(a) Workflow of areas for consideration for priority conservation.(b) Amphipod sampling sites.G., Gammarus; PD, phylogenetic diversity; PE, phylogenetic endemism; S., Sarothrogammarus; sesPD, standardized effect size of PD; WE, weighted endemism.

F
Divergence time and pairwise distance of freshwater amphipods, Xinjiang.(a) Time-calibrated tree inferred from MCMCtree analysis with outgroup removed.Species names on the right side of the tree are those from Xinjiang.(b) Pairwise p-distances of COI sequences among species.Shading indicates the Miocene, during which time most Xinjiang species diverge; blue bars on nodes represent 95% posterior credibility intervals of the divergence times of major clades.

F
I G U R E 3 PD patterns of freshwater amphipods, Xinjiang.White grid cells contain no species records.(a) PD for grid cells with amphipod species; (b) Standardized effect size of PD.Red cells indicate significantly higher PD than expected; blue grid cells indicate significantly lower PD than expected; yellow grid cells are not significant.(c) Weighted endemism for grid cells with amphipod species; (d) Phylogenetic endemism for grid cells with amphipod species.4 | DISCUSS ION 4.1 | Diversity patterns A comparison of PD and PE reveals different patterns in the distributions of freshwater amphipod species in Xinjiang.Results of PD and sesPD analyses indicate that areas of high PD occur mainly near the Tian Shan mountains (Figure 3a,b), suggesting the existence of some ancient lineages in this area.Consistent with our results in patterns Comparable different datasets of PD patterns were used in a study of groundwater amphipods in the western Balkans, with robust results (Borko et al., 2022).Both PD and sesPD indicate phylogenetic overdispersion around the Tian Shan mountains, possibly because of species diversification driven by tectonic events.After adding the influence of geospatial factors, PE analyses reveal three significant areas with unique evolutionary lineages along the western border of Xinjiang (Figure 3d).The Tian Shan mountains have both high PD and high PE, while the areas north of Irtysh River and the eastern margin of Pamir have low PD and high PE.Inconsistent patterns reveal that PD and PE are not necessarily linked.Understanding the distribution of PE for

F
Binary projections of habitat suitability for two Gammarus species: (a, f) present day, and (b, g) RCP 2.6, and (c, h) RCP 8.5 climate warming scenarios.Colours: grey, altitude, and unsuitable range (0); red, suitable range (1); green, distribution records of two Gammarus species.Area changes for G. brevipodus (d) and G. takesensis (i) under present day and two RCP (2050) scenarios.Numbers above columns represent percentages of range loss for different RCP scenarios compared with present day values.Significance (p < 0.001) of suitable habitat changes in latitude (raster cells) for G. brevipodus (e) and G. takesensis (j) are denoted by asterisks.