Biodiversity hotspots and conservation efficiency of a large drainage basin: Distribution patterns of species richness and conservation gaps analysis in the Yangtze River Basin, China

The Yangtze River Basin (YRB) is a key biodiversity area and a major economic zone in China, and its biodiversity faces severe challenges. In this paper, we analyzed species distribution patterns based on a dataset of 18,538 seed plant species to identify hotspots and to evaluate the conservation effectiveness and gaps of the YRB. We calculated distribution patterns of species richness with the top 5% richness algorithm and the complementary algorithm, reconstructed a phylogenetic tree, and elucidated spatial phylogenetics. There were 214 hotspot grid cells covering only 6.8% of the study region but containing 88% of the seed plant species. We then conducted a correlation analysis on the distribution patterns of different algorithms and species categories, which showed moderate or weak correlations between the top 5% richness algorithm and the complementary algorithm. Conservation effectiveness analysis indicated that 116 hotspot grid cells (accounting for 73.6% of the total species) were protected by current conservation networks and that many conservation gaps remain. Finally, we used the maximum entropy model (MaxEnt) to predict suitable habitat areas of threatened species under current and future climate scenarios. The predictions indicated future expansion of suitable habitat to the southeast and reduction in the central and western portions of the study area. When we considered anthropogenic areas, the suitable habitat decreased severely, indicating the importance of optimizing the layout of conservation networks and of comprehensive biodiversity conservation planning in the YRB.


| INTRODUCTION
Biodiversity is not only the material basis for human existence, but also represents the wealth of natural systems (Butchart et al., 2010;Hooper et al., 2005). Due to the complexity of topography, landforms and microclimates, the ecological environment of river basins is often highly heterogeneous, giving rise to interesting vegetation types, and floras with high richness of endemic, distinctive and rare plants (Naiman & Décamps, 1997;Naiman et al., 1993;Sabo et al., 2005). The abundant natural and water resources are necessary for human survival, and humans tend to settle along river areas. The expansion of human settlements and the intensification of human activities have led to numerous environmental problems in many river basins, resulting from the unreasonable utilization and excessive consumption of the land resources (Mensing et al., 1998;Zhang et al., 2014).
The Yangtze River Basin (YRB) is home to many threatened and endemic species and has high biodiversity and abundant germplasm resources, which makes it a crucial region for biodiversity conservation globally (Chen et al., 1997;Chen et al., 2001;Zhang et al., 2014; Figure S1). The area contains one world-class biodiversity hotspot (Myers et al., 2000) and two of the three Chinese endemism centers (Ying & Zhang, 1984). Among these, the Eastern Sichuan-Western Hubei center is the modern distribution center of many relic plants (Huang et al., 2015;Wang et al., 2005). However, as an important economic zone and population center in China, the YRB is also the region where the conflict between environmental protection and economic development is most prominent (Li & Chen, 2018). The construction of dams (Liu et al., 2013), overgrazing (Su et al., 2019), logging (Cui et al., 2007;Houghton & Hackler, 2003) and other human activities are the leading factors fostering biodiversity loss and ecosystem degradation; in addition, the region has long been seriously affected by agricultural production. Water conservancy projects and farmland surrounding lakes are the most representative artificial systems in the YRB and are much less common in other regions of China (Chen et al., 1997). Few studies on the spatial distribution pattern of biodiversity in this area exist, and studies focusing on plant species are especially lacking. The few previous studies were mostly based on very limited species and distribution data, and thus did not fully reflect the status of biodiversity and conservation in the region (Xie, 2003;Xu et al., 2010;Zhang et al., 2014). The conservation status of biodiversity and the efficacy of the current conservation networks in the YRB are unknown, and the area faces huge challenges.
Due to the uneven distribution of biodiversity, the identification of biodiversity hotspots and conservation gap analysis have become important and popular methods that serve as a baseline for conservation priority planning and for promoting biodiversity conservation efficacy (Huang et al., 2016;Liu et al., 2018;Myers et al., 2000;Zhang et al., 2014;Zhang et al., 2017). Studies of the distribution patterns of biodiversity have provided important insights for conservation priority planning in biodiversity hotspots (Brummitt & Lughadha, 2003;Myers et al., 2000;Orme et al., 2005;Reid, 1998). Many previous studies found mismatches between the distribution patterns of different plant categories, and differences between the most popular algorithms used to assess biodiversity patterns have also been observed (Ceballos & Ehrlich, 2006;Orme et al., 2005;Tang et al., 2006). Therefore, when biodiversity hotspots are identified based on a single group or index (Xie, 2003;Xu et al., 2010;Zhang & Ma, 2008), other important aspects of biodiversity, such as species complementarity and spatial phylogenetics, may be ignored (Chi et al., 2017;Huang et al., 2012;Luo et al., 2015). Given these shortfalls, comprehensive biodiversity indicators are highly important for the determination of biodiversity hotspots, providing better baseline information and achieving higher representativeness for the planning of protected areas, the adjustment of protected area systems and the formulation of protection policies (Chi et al., 2017;Huang et al., 2016;Mittermeier et al., 2011;Xu et al., 2019;Yang et al., 2021).
In addition to identifying biodiversity hotspots and conservation priorities in the current context, full consideration should be given to the fact that global climate change is a constant and broad challenge to systematic conservation efforts (Carroll et al., 2010;Ihlow et al., 2012). Studies have shown that climate change is already affecting the distribution patterns of species, and that global warming will drive some species to migrate to colder zones in search of more suitable habitats (Colwell et al., 2008;Walther et al., 2015). If the rate of plant migration cannot keep up with the changing climate, population sizes and species distributions will be affected (Ackerly et al., 2010;Barber et al., 2016). Previous studies point out that most of the YRB will be among the areas of China most affected by climate change by the end of the 21st century (Gu et al., 2015). In many areas of the YRB, such as Qinghai-Tibet and the karst area in southern China, plants are sensitive to climate change and ecosystems are fragile. Therefore, it is not enough to consider only the current biodiversity conservation measures for future conservation priority planning. In this context, predicting and analyzing suitable plant habitats under climate change with species distribution models can support biodiversity conservation priority planning (Dawson et al., 2011;Dillon et al., 2010;Lawler et al., 2020).
This study attempts to fill the large knowledge gap on plant species distribution patterns and conservation efficacy in the YRB, considering data on all plant species and using a comprehensive approach for species distribution modeling. This study aims: (1) to elucidate the spatial distribution patterns of species richness in the YRB and identify the biodiversity hotspots for conservation priority planning; (2) to assess the conservation effectiveness of the current conservation networks in the YRB and detect possible conservation gaps; (3) to predict suitable plant habitat areas for conservation priority planning under current and future climate scenarios, and to present conservation strategies and countermeasures against biodiversity loss.

| Plant species inventory and occurrence database
We compiled a regional plant inventory of the YRB, covering an area of 180 × 10⁴ km², accounting for 18.8% of China's land area and containing 11 provinces or regions (Xu et al., 2010;Figures S2-S4). The inventory was compiled based on the relevant regional floras and plant checklists with full consideration of all subspecies and varieties, treating them as separate items. Cultivated species and nonnative species were excluded. We updated all scientific names according to Catalogue of Life China (Species 2000 China Node, http://www.sp2000.org.cn/) to replace synonyms with corresponding accepted names. Species endemic to China were identified according to the website database of Catalogue of Life China. According to the distribution information available in the occurrence database, we identified species confined to the YRB as regional endemic species, to compare with China endemic species. The threatened species were categorized as vulnerable (VU), endangered (EN), and critically endangered (CR) according to Qin et al. (2017). Plants nationally protected in China were determined according to the List of Wild Fauna and Flora under State Key Protection (issued by Ministry of Forestry and Ministry of Agriculture in 1989).
The inventory contained 18,538 species of seed plants, belonging to 2181 genera and 235 families, of which 9846 were endemic species (accounting for 53.11% of the total species), 2217 regional endemic species (11.96%), 1146 threatened species (6.18%), and 99 national protected species (0.53%; Tables 1 and S1). An occurrence database was built based on specimen records accessed from the Chinese Virtual Herbarium (http://www.cvh.ac.cn/) and the website of the Global Biodiversity Information Facility (GBIF, https://www.gbif.org/; all download DOIs are listed in Table S12). Specimen records lacking precise localities were rejected. All specimen records were precisely georeferenced, using the geographical coordinates according to China gazetteers. The final occurrence database contained 489,132 georeferenced points, each including data on the status as endemic, regional endemic, threatened, and national protected species according to the plant inventory. Among these, we found 35 species with more than 400 georeferenced occurrence points, 315 species ranging between 200 and 399 georeferenced points, and 18,208 species ranging from 1 to 199 georeferenced points (Table S2). For further analysis we divided the map of the YRB into 3159 grid cells with a resolution of 25 km × 25 km, using the occurrence data to analyze distribution patterns of different plant categories: endemic species, regional endemic species, threatened species, and national protected species.

| Phylogeny reconstruction and spatial phylogenetics
Phylogenetic trees play an important role in biodiversity and evolution studies (Véron et al., 2019), and phylogenetic information reflects the evolutionary process and has important implications in hotspot studies (Forest et al., 2007;Isaac et al., 2007;Mooers & Atkins, 2003;Redding & Mooers, 2006;Rosauer et al., 2009). Here, we used the "V.PhyloMaker" package (Jin & Qian, 2019), which includes the largest dated phylogeny for vascular plants, as a backbone to generate a species-level phylogeny of seed plants in the YRB, and used the function build.nodes.1 to extract the information on root and basal nodes of the genera to generate the phylogenetic hypothesis for our list. A total of 17,864 plant species, belonging to 1927 genera (accounting for 88.4%) and 231 families (98.3%), were sampled, a sound representation at the family and generic level (Table S3). To study spatial phylogenetics, we calculated two metrics, phylogenetic diversity (PD) and phylogenetic endemism (PE), using Biodiverse V2.0 (Laffan et al., 2010). We also calculated weighted endemism (WE) with the same software to study differences in the area size of species distributions (Crisp et al., 2001). We standardized the PD, PE, and WE indices in each grid cell by calculating the ratio of each index in the grid cell to its maximum across grid cells. Then, we summed the three ratios to measure the phylogenetic diversity level of each grid cell (Table S4).
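The standardization described above can be written compactly as follows. The symbol H_i for the combined index of grid cell i is our own notation, introduced only for illustration; the maxima are taken over all grid cells j.

```latex
H_i \;=\; \frac{PD_i}{\max_j PD_j} \;+\; \frac{PE_i}{\max_j PE_j} \;+\; \frac{WE_i}{\max_j WE_j},
\qquad 0 \le H_i \le 3
```

Each term lies between 0 and 1, so a grid cell reaches the upper bound of 3 only if it holds the maximum of all three indices at once.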

| Spatial distribution patterns of species richness and hotspots
To analyze distribution patterns and hotspots, two algorithms were employed: the top 5% richness algorithm and the complementary algorithm. The top 5% richness algorithm considers the top 5% of land area with the highest species richness as hotspots (Prendergast et al., 1993;Tang et al., 2006). In contrast, the complementary algorithm defines biodiversity hotspots by selecting the smallest area that can accommodate the largest number of species (Dobson et al., 1997). The complementary algorithm first selects the grid cell with the highest species richness and removes all species occurring in this grid cell from the database. This process is then repeated until all species are included in the selected grid cells (Dobson et al., 1997).
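The iterative selection of the complementary algorithm can be sketched as a greedy loop. This is a minimal illustration under our own assumptions, not the implementation used in the study: the function name `complementarity_ranking`, the dictionary input, the alphabetical tie-breaking rule, and the toy cells are all hypothetical.

```python
def complementarity_ranking(cell_species):
    """Greedy complementarity: repeatedly pick the cell that adds the most
    species not yet represented, then remove those species everywhere."""
    remaining = {cell: set(spp) for cell, spp in cell_species.items()}
    selected = []
    while True:
        # Richest cell among species still unrepresented; ties are broken
        # by cell id (an assumption, the source does not state a tie rule).
        best = max(sorted(remaining), key=lambda c: len(remaining[c]), default=None)
        if best is None or not remaining[best]:
            break  # every species is already covered
        covered = remaining.pop(best)
        selected.append((best, len(covered)))
        for cell in remaining:
            remaining[cell] -= covered
    return selected


# Hypothetical toy data: four grid cells and five species.
cells = {
    "A": {"sp1", "sp2", "sp3"},
    "B": {"sp2", "sp3"},
    "C": {"sp4"},
    "D": {"sp1", "sp4", "sp5"},
}
ranking = complementarity_ranking(cells)
print(ranking)  # [('A', 3), ('D', 2)]
```

In the toy data, two cells are enough to represent all five species, which is why complementarity hotspots tend to be more scattered than richness hotspots.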
First, we calculated the distribution patterns and hotspots of all species, endemic species, regional endemic species, threatened species, and national protected species according to the top 5% richness algorithm and the complementary algorithm, respectively. Next, for each plant category, we calculated the ratio of the species richness (species number) in each grid cell to the maximum species number of that category. Finally, we summed the ratios of the different plant categories in each grid cell as an index of hotspots for each algorithm and selected the top 5% hotspot grid cells by sorting the index in descending order (Tables S5 and S6).
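The ratio-and-sum index described above can be illustrated with a short sketch. The function names, the nested-dictionary layout, and the richness values are hypothetical and only show the arithmetic.

```python
def hotspot_index(richness_by_category):
    """Per grid cell, sum richness / maximum richness over all categories."""
    index = {}
    for counts in richness_by_category.values():
        top = max(counts.values())
        for cell, n in counts.items():
            ratio = n / top if top else 0.0
            index[cell] = index.get(cell, 0.0) + ratio
    return index


def top_cells(index, n):
    """Return the n cells with the highest index (ties broken by cell id)."""
    return sorted(index, key=lambda c: (-index[c], c))[:n]


# Hypothetical richness counts for two categories in three grid cells.
richness = {
    "all": {"g1": 200, "g2": 100, "g3": 50},
    "threatened": {"g1": 5, "g2": 10, "g3": 0},
}
idx = hotspot_index(richness)
best = top_cells(idx, 2)
print(idx)   # {'g1': 1.5, 'g2': 1.5, 'g3': 0.25}
print(best)  # ['g1', 'g2']
```

Because each category is scaled by its own maximum, a category with few species (such as national protected species) carries the same weight in the index as the category of all species.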
Overall, 2675 grid cells contained occurrence data, so the top 5% of the land area corresponded to about 134 grid cells. To compare the results of the top 5% richness algorithm, the complementary algorithm and spatial phylogenetics, we selected 134 hotspot grid cells for each of them to determine the final hotspots (Tables S4, S5, and S6). We defined the grid cells detected as hotspots by any of the three approaches as the final hotspot grid cells. Finally, we visualized the distribution patterns of each plant category and the hotspots using ArcGIS 10.6 for further conservation efficiency analysis.
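A brief sketch of the two steps in this paragraph: the number of cells that make up 5% of the occupied grid, and the union rule for the final hotspots. The cell count 2675 comes from the text; the rounding rule, the helper name `final_hotspots`, and the toy cell ids are assumptions for illustration.

```python
occupied_cells = 2675  # grid cells with occurrence data (from the text)

# 5% of 2675 is 133.75; rounding to the nearest cell gives 134 (assumed rule).
n_hotspots = round(occupied_cells * 5 / 100)


def final_hotspots(*hotspot_sets):
    """A cell is a final hotspot if any approach flags it (set union)."""
    final = set()
    for cells in hotspot_sets:
        final |= set(cells)
    return final


# Hypothetical cell ids flagged by the three approaches.
richness_hot = {1, 2, 3}
complementarity_hot = {3, 4}
phylogenetic_hot = {2, 5}
final = final_hotspots(richness_hot, complementarity_hot, phylogenetic_hot)
print(n_hotspots, sorted(final))  # 134 [1, 2, 3, 4, 5]
```

Because the three sets overlap, the union is smaller than the sum of their sizes; in the study, three sets of 134 cells yield 214 final hotspot grid cells rather than 402.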

| Conservation effectiveness and gaps analysis of current conservation networks
We performed conservation effectiveness and gap analysis based on the distribution patterns of hotspots and current conservation networks at national and provincial levels. To evaluate the conservation effectiveness and gaps of current conservation networks focusing on hotspots in the YRB, we constructed geodata documents for protectedplanet.net/), using ArcGIS 10.6. We then superimposed the layers of NNRs and PNRs on the layer of final hotspots to generate maps of conservation effectiveness and gaps on national and provincial level by recognizing overlapped or nonoverlapped hotspot grids, respectively (Hou et al., 2010;Zhang et al., 2017). We also evaluated conservation effectiveness and gaps of different plant categories confined to hotspots under current conservation networks.
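The overlay logic can be sketched with sets of grid-cell ids. This is a simplified illustration that assumes the reserve layers have already been converted to the ids of the grid cells they overlap; the study itself performed the overlay on map layers in ArcGIS 10.6, and the function name and toy ids below are hypothetical.

```python
def gap_analysis(hotspots, nnr_cells, pnr_cells):
    """Split hotspot cells into protected cells and conservation gaps."""
    hotspots = set(hotspots)
    in_nnr = hotspots & set(nnr_cells)
    in_pnr = hotspots & set(pnr_cells)
    protected = in_nnr | in_pnr
    return {
        "nnr": in_nnr,
        "pnr": in_pnr,
        "protected": protected,
        "gaps": hotspots - protected,
    }


# Hypothetical cell ids.
result = gap_analysis(
    hotspots={1, 2, 3, 4, 5},
    nnr_cells={1, 2, 9},
    pnr_cells={2, 3, 8},
)
print(sorted(result["protected"]), sorted(result["gaps"]))  # [1, 2, 3] [4, 5]
```

Species-level effectiveness follows the same pattern: the species lists of the protected cells and of the gap cells can be pooled separately and compared.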

| Correlation analysis on distribution patterns and statistics analysis on species composition
Because the distribution patterns of different plant categories and algorithms differ, we used correlation analysis to evaluate these differences and provide a more comprehensive reference for conservation priority planning. To detect correlations between the distribution patterns of different plant categories and algorithms, we conducted a correlation analysis of the distribution patterns of all, endemic, regional endemic, threatened and national protected species generated according to the top 5% richness algorithm, the complementary algorithm, and spatial phylogenetics (PD, PE, and WE; Table S7), employing the "corrplot" package in R (version 4.0.2; Wei & Simko, 2017). For the correlation analysis on spatial distribution patterns, we calculated Pearson's correlation coefficient and graded the degree of correlation between spatial distribution patterns by the value of |r| (|r| ≥ .8 for very strong correlation, .6 ≤ |r| < .8 for strong correlation, .4 ≤ |r| < .6 for moderate correlation, .2 ≤ |r| < .4 for weak correlation, and 0 ≤ |r| < .2 for very weak correlation or absence of correlation; Jain & Chetty, 2019). We filled empty grid cells with zero so that the correlation analysis could run on all grid cells. We then conducted a statistical analysis of the species composition of different groups included in hotspots, protected areas and gaps of NNRs and PNRs, as well as of the species contained in NNRs and PNRs. Finally, we employed the "circlize" package (Gu et al., 2014) and the "tidyverse" package (Wickham et al., 2019) in R (version 4.0.2), using chord diagrams and circular barplots to illustrate the statistical results.
To depict the results more intuitively, we divided all species into four groups according to conservation priority, that is, threatened species, regional endemic species (excluding threatened species), endemic species (excluding threatened species and regional endemic species), and the rest of the species (all species excluding threatened, regional endemic and endemic species; Table S8).
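The correlation grading used in this section can be sketched in a few lines. The study ran the analysis in R with the "corrplot" package; the Python version below is only an illustration, with hypothetical function names and toy per-cell values.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def strength(r):
    """Grade |r| with the thresholds quoted in the text."""
    a = abs(r)
    if a >= 0.8:
        return "very strong"
    if a >= 0.6:
        return "strong"
    if a >= 0.4:
        return "moderate"
    if a >= 0.2:
        return "weak"
    return "very weak or absent"


# Hypothetical per-cell values of two patterns; empty cells are filled with 0.
pattern_a = [10, 0, 4, 7]
pattern_b = [20, 0, 8, 14]
r = pearson_r(pattern_a, pattern_b)
print(round(r, 3), strength(r))
```

Note that filling empty grid cells with zero adds many shared zero values to both patterns, which can raise the coefficient compared with an analysis restricted to occupied cells.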

| Prediction of suitable habitat areas of threatened plants in YRB
Previous studies indicated that threatened species are target species for hotspot determination and conservation priority setting and are sensitive to environmental and climate change (Abolmaali et al., 2018;Li & Chen, 2014). Here, we selected 385 threatened plant species as prediction objects (Table S9) to evaluate the impact of climate change on their current and future suitable habitat areas. We employed the maximum entropy method (MaxEnt, version 3.4.1) to predict current and future suitable habitat areas. MaxEnt is considered one of the most reliable methods for modeling species distributions with presence-only data (Pearson et al., 2007;Tang et al., 2018;Yan et al., 2020) and has been widely applied to predict species ranges and vegetation shifts under climate change (Abolmaali et al., 2018;Tang et al., 2018;Warren et al., 2013;Yang et al., 2021).
We accessed a set of 19 bioclimatic variables with a resolution of 10 arc min for the current climate (1950-2000; WorldClim, version 1.3) and for the future climate (2100, 2 × CO2 climate scenario, CCM3 model, from the DIVA-GIS database, www.diva-gis.org/climate; Govindasamy et al., 2003), which have been used in many studies (Jiang et al., 2015;Rodríguez-Nunez et al., 2021;Yang et al., 2021). To minimize overfitting of the models, we used SPSS 13.0 to calculate the intercorrelations among the 19 bioclimatic variables and removed one of the two variables whenever a correlation coefficient ≥.85 was obtained. Finally, eight variables were selected to predict the suitable habitat areas of threatened species: BIO2 (mean diurnal temperature range), BIO4 (temperature seasonality), BIO7 (temperature annual range), BIO8 (mean temperature of wettest quarter), BIO11 (mean temperature of coldest quarter), BIO12 (annual precipitation), BIO15 (precipitation seasonality), and BIO19 (precipitation of coldest quarter).
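The variable screening step can be sketched as a greedy filter. The source does not state which variable of a correlated pair was removed, so keeping the variable listed first is an assumption made here; the function name and the correlation values are hypothetical and do not reproduce the study's SPSS output.

```python
def drop_correlated(names, corr, threshold=0.85):
    """Keep a variable only if |r| with every already-kept variable is
    below the threshold. Order of `names` decides which of a pair stays."""
    kept = []
    for name in names:
        if all(abs(corr[frozenset((name, k))]) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical pairwise correlations among three bioclimatic variables.
corr = {
    frozenset(("BIO11", "BIO1")): 0.93,
    frozenset(("BIO11", "BIO2")): 0.25,
    frozenset(("BIO1", "BIO2")): 0.30,
}
selected = drop_correlated(["BIO11", "BIO1", "BIO2"], corr)
print(selected)  # ['BIO11', 'BIO2']
```

In practice the order of the candidate list would reflect ecological relevance, so that the more interpretable variable of each correlated pair is retained.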
Species with fewer than five distribution points were rejected, given that the predictive ability of models with sample sizes below five may be insufficient (Pearson et al., 2007;Tang et al., 2018). We used two methods of replication to construct MaxEnt models: jackknife for species with 5-29 occurrence records (the recommended method for species with low sample sizes), and cross-validation for those with more than 30 occurrences. For the cross-validation approach, we ran 10 replicates to obtain more robust modeling results (Tang et al., 2018). After importing information on species distribution and bioclimatic variables under current and future climate scenarios, we used the default settings of the MaxEnt software: maximum number of background points = 10,000, convergence threshold = 0.00001, maximum iterations = 500, and default prevalence = 0.5 (Phillips & Dudík, 2008). We set 75% of the distribution data as training data, with the remaining 25% as testing data. The logistic output format was used in this study.
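The sample-size rule for choosing a replication method can be written as a small helper. The function name is hypothetical, the returned labels are descriptive strings rather than MaxEnt option names, and the treatment of species with exactly 30 records is an assumption, since the text specifies 5-29 and more than 30.

```python
def replication_method(n_records):
    """Choose how a species model is replicated from its record count."""
    if n_records < 5:
        return None  # too few records, species is not modeled
    if n_records < 30:
        return "jackknife"
    # Assumption: exactly 30 records is grouped with the larger samples.
    return "cross-validation (10 replicates)"


for n in (3, 5, 29, 30, 120):
    print(n, replication_method(n))
```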
Each threatened species had a suitability index in each grid cell (ranging from 0 to 1) obtained by the prediction analysis. We used ArcGIS 10.6 to extract and combine the suitability indices of each threatened species in the different grid cells. The values of the suitability index were regrouped into four classes of potential habitats: high potential (>0.6), good potential (0.4-0.6), moderate potential (0.2-0.4), and low potential (<0.2). We selected threshold values of more than 0.5 and more than 0.75 to represent the good and high potential suitable habitat areas, respectively. Then, the final suitability index for each grid cell was calculated by summing the suitability indices of the different species in that grid cell. We also subtracted the sum of suitability indices under the current climate scenario from the sum of suitability indices under the future climate scenario in each grid cell, to reflect the change in the quality of suitable habitat areas. Finally, we visualized the prediction results using ArcGIS 10.6 and evaluated the impact of climate change on biodiversity hotspots, current conservation networks and conservation gaps according to their distribution patterns, combined with the land use situations in different periods (http://www.resdc.cn).
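The summation and change calculation can be sketched as follows. How exactly the 0.5 and 0.75 thresholds were applied is not spelled out in the text; the sketch assumes that only per-species values strictly above the threshold are summed. Function names and suitability values are hypothetical.

```python
def summed_suitability(per_species, threshold=0.0):
    """Sum per-species suitability in each cell, counting only values
    above the threshold (assumed interpretation of the cut-off)."""
    total = {}
    for cells in per_species.values():
        for cell, value in cells.items():
            total.setdefault(cell, 0.0)
            if value > threshold:
                total[cell] += value
    return total


def suitability_change(current, future):
    """Future minus current summed suitability for every grid cell."""
    cells = set(current) | set(future)
    return {c: future.get(c, 0.0) - current.get(c, 0.0) for c in cells}


# Hypothetical suitability indices for two species in two grid cells.
current = {"spA": {"g1": 0.75, "g2": 0.25}, "spB": {"g1": 0.5, "g2": 0.5}}
future = {"spA": {"g1": 0.5, "g2": 0.5}, "spB": {"g1": 0.25, "g2": 0.75}}

change = suitability_change(summed_suitability(current), summed_suitability(future))
print(change["g1"], change["g2"])  # -0.5 0.5
```

A negative value marks a grid cell whose summed suitability declines under the future scenario, and a positive value marks an improvement.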

| Distribution patterns of different algorithms and plant categories
According to the top 5% richness algorithm, the distribution patterns of different plant categories showed that a majority of grid cells with high species richness were confined to the Hengduan Mountains and areas surrounding the Sichuan Basin (western portions of the YRB), with fewer high species richness grid cells radiating to the middle and lower reaches of the YRB (Figures 1 and S6-S10). Under the top 5% richness algorithm, the distribution patterns of the different plant categories were very strongly correlated, except for national protected species (r = .83-.94, p < .01). The highest correlation coefficient was shared by endemic species and threatened species (r = .94, p < .01; Figure S5 and Table S7). In contrast, there were strong correlations (r = .73-.74, p < .01) between the distribution patterns of national protected species and those of endemic and threatened species, and moderate correlations between the distribution patterns of national protected species and regional endemic species (r = .51-.54, p < .01; Figure S5 and Table S7).
The results of the complementary algorithm showed that grid cells with high species richness were scattered and mostly confined to the southwest portions of the study area, that is, areas surrounding the Sichuan Basin and the middle reach of the Yangtze River (Figures 1 and S6-S10). Compared with other categories, regional endemic species were more concentrated in the Hengduan Mountains and Minshan Mountains (Figure S8C). The national protected species grid cells were mainly distributed along the mainstream and tributaries of the Yangtze River, and there was high species irreplaceability in the source area of the Ganjiang River (Figure S10C). According to the complementary algorithm, the distribution patterns of different plant categories showed great variation in the value of the correlation coefficient (Figure S5 and Table S7). There were very strong correlations (r = .94-.98, p < .01) between the distribution patterns of all, endemic, and threatened species, and the highest correlation (r = .98, p < .01) was shared by all species and endemic species (Figure S5). There were also strong correlations (r = .7-.78, p < .01) between the distribution patterns of national protected species and those of all, endemic, and threatened species, whereas the correlations between regional endemic species and all, threatened, and endemic species were moderate (r = .43-.47, p < .01; Figure S5).
The grid cells with high PD values were mostly distributed in the areas surrounding the Sichuan Basin, Mufu Mountains, Qinling Mountains, as well as the source areas of the Xiangjiang and Ganjiang rivers (Figure 3a). The distribution patterns of PE and WE appeared similar to the distribution pattern of PD, although their high-value grid cells were scattered (Figures 1 and S11). Correlation analysis on spatial phylogenetics indicated a very strong correlation between the distribution patterns of PE and WE, and strong correlations between the distribution patterns of PD and PE and of PD and WE (Figure S5 and Table S7). The correlation analysis also indicated moderate or weak correlations (r = .38-.72, p < .01) between the top 5% richness and complementary algorithm results (Figure S5 and Table S7). There were very strong correlations (r = .765-.96, p < .01) between the results of the top 5% richness algorithm and spatial phylogenetics, a very strong correlation (r = .96, p < .01) between the distribution patterns of regional endemic species and PE, and a moderate correlation (r = .45-.53, p < .01) between national protected species and spatial phylogenetics. There were strong, moderate or weak correlations (r = .3-.76, p < .01) between the results of the complementary algorithm and spatial phylogenetics (Figure S5 and Table S7).

| Biodiversity hotspot distribution pattern
We identified 134 hotspot grid cells using the top 5% richness algorithm, which were mainly distributed in the Hengduan Mountains, Wumeng Mountains, Bashan-Wushan Mountains, the eastern parts of Guizhou and the source area of the Xiangjiang River (Figure 1b). The hotspots contained 15,459 plant species, accounting for 83.4% of all plant species recorded in the study area. These included 8255 endemic species (accounting for 83.8% of the endemic species of the study area), 1636 regional endemic species (73.8%), 887 threatened species (77.4%), and 85 national protected species (85.9%) confined to the hotspot grid cells (Table 1).
The hotspot grid cells identified with the complementary algorithm were scattered and mainly confined to the upper reaches of the Yangtze River and the southeastern portion of the basin (Figure 1d). These hotspots contained 15,535 plant species (accounting for 83.8% of all plant species), of which 8273 were endemic species (84.0%), 1616 regional endemic species (72.9%), 930 threatened species (81.2%), and 99 national protected species (100%; Table 1). Hotspot grid cells of the spatial phylogenetics analysis were confined mostly to the area surrounding the Sichuan Basin, Wumeng Mountains, Qinling Mountains, and the source area of the Ganjiang River (Figure 1f). These contained 15,581 plant species (84.0%), including 8300 endemic species (84.3%), 1644 regional endemic species (74.2%), 911 threatened species (79.5%), and 86 national protected species (86.9%; Table 1).
FIGURE 1 Distribution patterns of top 5% richness algorithm, complementary algorithm and spatial phylogenetics, and hotspots of the top 5% richness algorithm, complementary algorithm and spatial phylogenetics. Distribution patterns of (a) all species according to the top 5% richness algorithm, (b) hotspots according to the top 5% richness algorithm, (c) all species according to the complementary algorithm, (d) hotspots according to the complementary algorithm, (e) phylogenetic diversity (PD) and (f) hotspots according to the spatial phylogenetics

| Conservation effectiveness and gaps of current conservation network
The conservation effectiveness analysis indicated that 81 out of 214 hotspot grid cells were included in the NNRs, and most of them were distributed in the northern parts of the Hengduan Mountains, the eastern parts of the Qinling Mountains, Minshan Mountains, Bashan-Wushan Mountains and the source area of the Xiangjiang River (Table 2 and Figure 3a). These protected hotspot grid cells contained only 11,388 species (accounting for 61.4% of all species), of which 6118 species were endemic (62.1%), 971 species were regional endemic (43.8%), 575 species were threatened (50.2%) and 55 species were national protected ones (55.6%; Table 1). Conservation effectiveness analysis of PNRs showed that there were 57 hotspot grid cells in the PNR networks, which were mostly confined to the Hengduan Mountains, Minshan Mountains, and Mufu Mountains (Table 1 and Figure S14B). The hotspots covered by PNRs contained 11,316 species (61.0%), including 6285 endemic species (63.8%), 1028 regional endemic species (46.4%), 570 threatened species (49.7%), and 64 national protected species (64.7%; Table 1). Taking NNRs and PNRs together, a total of 116 out of 214 hotspot grid cells were protected in the current conservation networks (Figure 5b; Table 1). Overall there were 13,650 species (73.6%) protected by NNRs and PNRs, including 7514 endemic species (76.3%), 1380 regional endemic species  Table 1).

| Species composition among hotspots, NNRs and PNRs
Statistics on species composition indicated that high proportions of endemic, regional endemic and threatened species were contained in hotspot grid cells, accounting for 88.1%, 79.4%, and 85.4% of the corresponding plant categories, respectively (Table 1). The conservation gaps of NNRs and PNRs comprised 98 hotspot grid cells with 15,951 species, whereas the protected hotspots comprised 116 grid cells with only 13,650 species (Table 1). Furthermore, more threatened, endemic, and regional endemic species were contained in the gaps of NNRs and PNRs than in the protected hotspots (Figure 4 and Table S8). In addition, 973 species (including 78 threatened species, 198 regional endemic species and 500 endemic species) were distributed in hotspots beyond the coverage of NNRs and PNRs. Overall in the YRB, statistics on species composition in NNRs and PNRs indicate that the networks of NNRs covered 791 (25% of all) grid cells, and PNRs 626 (19.8%) grid cells, containing 13,459 (72.6%) and 13,552 (73.1%) species, respectively (Figure 5 and Table 3). The current conservation networks covered 1248 (39.5%) grid cells containing 15,679 species (84.6%), 630 fewer species than the overall hotspots (Figure 5 and Table 3). The NNR and PNR networks contained more grid cells than the hotspots themselves, but included fewer threatened, regional endemic or endemic species than the hotspots (Figure 5). In terms of the conservation status of different plant categories, endemic species had higher conservation efficiency in the current conservation networks than other groups (Figure 5 and Table 3), whereas threatened species were clearly insufficiently protected in the current protected area network compared to other plant categories (Figure 5 and Table 3). All data are provided in Tables S1 and S2.

| Suitable habitat areas of threatened plants in the YRB
The analysis of suitable habitat areas indicated that almost all models for the 385 threatened species performed well, with high average values of the area under the receiver operating characteristic curve (AUC) at a 10% cumulative threshold value. Overall, 98% of the models had high test AUC values (0.900-1.000), with an average AUC value of 0.98 and a lowest value of 0.86 (Table S9 and Figure S15), indicating that the models achieved high accuracy. Prediction results with a suitability index of more than 0.5 showed that the suitable habitat areas are mainly located in the western parts of the area, and the distribution patterns showed no obvious changes between the current and future scenarios (Figure S16). Results with a suitability index of more than 0.75 showed a decreasing distribution pattern of suitable habitat areas in the future, especially in the Hengduan Mountains, western parts of the Qinling Mountains, and the Sichuan Basin. In addition, under this threshold, changes in suitable habitat areas in the YRB were more evident. The southeast part of the YRB, for example, is expected to be a good suitable habitat area in the future (Figure S16). After removing the species with a suitability index of less than 0.75, the suitable habitat areas reflect the changes more clearly and could be more easily used for further analysis. The quality of suitable habitat areas showed an improvement mostly in the eastern part of the study area, while in the western part only the Sichuan Basin and the western part of the Qinling Mountains showed improving habitat quality (Figure S17). We combined the prediction analysis with land use in different periods; the distribution pattern and area of land use in the different periods (2000, 2010, and 2018) changed little in the YRB (Figure S18).
These results showed that many suitable habitat areas were occupied by anthropogenic areas, which are not suitable for the survival of threatened species, and that suitable habitats in the Sichuan Basin, central portions of the YRB, and the Yangtze River Delta Region were greatly reduced (Figure S19).
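The comparison described above (thresholding the suitability index at 0.75, contrasting current and future scenarios, and excluding anthropogenic land) can be illustrated with a minimal sketch. It is not the workflow used in this study: the function name `classify_change`, the cell identifiers, and all suitability values below are hypothetical and serve only to show the logic.

```python
from collections import Counter


def classify_change(current, future, anthropogenic, threshold=0.75):
    """Label each grid cell by how its suitability status changes.

    current, future: dicts mapping a cell id to a suitability index in [0, 1].
    anthropogenic: set of cell ids treated as unsuitable regardless of climate.
    A cell counts as suitable when its index is greater than the threshold.
    """
    labels = {}
    for cell, now in current.items():
        if cell in anthropogenic:
            labels[cell] = "anthropogenic"
            continue
        suitable_now = now > threshold
        suitable_later = future[cell] > threshold
        if suitable_now and suitable_later:
            labels[cell] = "stable"
        elif suitable_now:
            labels[cell] = "lost"
        elif suitable_later:
            labels[cell] = "gained"
        else:
            labels[cell] = "unsuitable"
    return labels


# Hypothetical toy values, not data from this study.
current = {"a": 0.9, "b": 0.8, "c": 0.4, "d": 0.85, "e": 0.2}
future = {"a": 0.8, "b": 0.5, "c": 0.9, "d": 0.9, "e": 0.1}
anthropogenic = {"d"}

labels = classify_change(current, future, anthropogenic)
summary = Counter(labels.values())
print(dict(summary))
```

In this toy case, cell "d" is climatically suitable in both scenarios but is still excluded because it lies on anthropogenic land, which mirrors the reduction in suitable habitat seen once land use is taken into account.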
The spatial distribution patterns of suitable habitat areas with the land use situation in 2000 and the final hotspots indicated that 91.6% of the hotspot grid cells were wholly or partly located in current suitable habitat areas, especially in the Hengduan Mountains, areas surrounding the Sichuan Basin, the western part of the Qinling Mountains, and southeast parts of the YRB (Figure 3e,f and Table S11). However, 18.7% of the hotspot grid cells may be at risk of losing their suitable habitat areas in the future; in particular, the Wumeng Mountains, eastern parts of the Qinling Mountains, and eastern Guizhou will no longer be suitable habitats for some threatened species (Figure 3e,f). In contrast, many hotspot grid cells in the southeast of the YRB will become more important suitable habitats for threatened plants in the future (Figure 3e,f). Most of the identified conservation gap areas in the NNR and PNR networks will still be suitable for plant growth in the future, including the Hengduan Mountains, Bashan-Wushan Mountains, western parts of the Qinling Mountains, and the source area of the Ganjiang River (Figure 3f).
TABLE 3 The statistics of species composition of the YRB, final biodiversity hotspots and current nature reserves

| Significance of hotspots in conservation priority of the YRB
Based on the conservation planning principles of irreplaceability and vulnerability, bringing an end to global biodiversity loss would require that the limited available resources be directed to those regions that need them most (Mittermeier et al., 2011). In this study, we identified 214 hotspot grid cells, accounting for only 6.8% of the land area of the YRB but containing 88% of the species found in the YRB. These hotspots also harbored 88.1% of endemic and 85.4% of threatened plant species (Table 1). Therefore, the identified hotspots are of especially high significance for conservation planning. Compared with the hotspots, NNRs and PNRs occupied much more land area (39.5%) but contained fewer plant species (Table 3). Unfortunately, the current protected areas cover the identified hotspots highly insufficiently: only 37.9% of the hotspot grid cells are covered by NNRs and 26.6% by PNRs, and 54.2% of the hotspot grid cells are covered by the two networks combined (Table 1), while little attention is given to the conservation of the remaining hotspot areas.
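The coverage shares discussed above follow from simple set operations on grid cells. The sketch below shows the arithmetic under stated assumptions: the function `coverage_report` and the toy cell identifiers are hypothetical, and the sets do not reproduce the data of this study.

```python
def coverage_report(hotspots, nnr, pnr):
    """Return the percentage of hotspot cells covered by each network.

    hotspots, nnr, pnr: sets of grid cell ids.
    "any" is the union of both networks, "both" is their overlap,
    and "gap" is the share of hotspot cells covered by neither.
    """
    covered_nnr = hotspots & nnr
    covered_pnr = hotspots & pnr
    covered_any = covered_nnr | covered_pnr
    covered_both = covered_nnr & covered_pnr
    gaps = hotspots - covered_any

    def pct(cells):
        return round(100 * len(cells) / len(hotspots), 1)

    return {
        "nnr": pct(covered_nnr),
        "pnr": pct(covered_pnr),
        "any": pct(covered_any),
        "both": pct(covered_both),
        "gap": pct(gaps),
    }


# Hypothetical toy grid: 10 hotspot cells, two partly overlapping networks.
hotspots = set(range(10))
nnr = {0, 1, 2, 3, 20}   # cell 20 lies outside the hotspots
pnr = {2, 3, 4, 30}      # cell 30 lies outside the hotspots

report = coverage_report(hotspots, nnr, pnr)
print(report)
```

The same inclusion-exclusion relation (any = nnr + pnr - both) applies to the reported figures. If 54.2% (116 of 214 cells) is read as the combined coverage of the two networks, then 37.9% + 26.6% - 54.2% would imply an overlap of roughly 10.3% of hotspot cells, and the remaining 45.8% (98 cells) are conservation gaps.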
Our study also highlighted the important role of species complementarity and evolutionary processes (spatial phylogenetics) in hotspot identification, casting new light on the distribution patterns of hotspots. The 14 plant diversity hotspot areas identified largely coincided with hotspot areas identified in previous studies, for example, the Hengduan Mountains, Bashan-Wushan Mountains, Dalou Mountains, and eastern parts of Guizhou (Xie, 2003; Xu et al., 2010; Yang et al., 2021; Zhang et al., 2014). However, three new hotspots were identified: the Three-Rivers Headwater Region, the Wumeng Mountains, and the source area of the Ganjiang River (Figure 2). The first was identified due to its high species complementarity, while the last two were notable not only for high species richness and species complementarity but also for their distinctive spatial phylogenetics (Figure 1b,d,f). The Three-Rivers Headwater Region hotspot area (consisting of only 12 hotspot grid cells) contained 1057 plant species, including 554 endemic species and 26 threatened species. For example, Corydalis leucanthema and Oxygraphis tenuifolia are endemic, regional endemic, and nationally protected species, and therefore warrant especially high conservation priority.
We also found that some previously described hotspots (the source of the Yangtze River, the Dabie Mountains, and the Huangshan Mountains) were not detected in our study, given that these areas showed only a few scattered plant hotspot grid cells. The source areas of the Yangtze River have been treated as hotspot areas in previous studies mostly due to the conservation value of mammals rather than species richness or complementarity (Xu et al., 2010; Zhang et al., 2014), and the identification of the last two areas as hotspots in previous studies may reflect the limited number of species sampled, for example, only 127 threatened plant species and 568 plant species (Xu et al., 2010; Zhang et al., 2014). Thus, we believe that, owing to the incorporation of a large dataset and comprehensive algorithms, our identified hotspots are more accurate.

| Optimization of conservation network in the YRB
Biodiversity conservation in the YRB has received much attention in the past few decades, and the government has taken numerous protection measures (Zhang et al., 2014), greatly increasing conservation efficacy in the region. In China, NNRs have been shown to be the most effective measure for wildlife protection and biodiversity maintenance (Wu et al., 2011). However, conservation effectiveness analysis indicated that the conservation efficiency of NNRs in the YRB is limited. Only 37.9% of all hotspot grid cells are currently protected by NNRs, leaving more than half of the hotspot grid cells as conservation gaps. Many large nature reserves are located mainly in the upstream areas of the YRB, for example, the Chang Tang Nature Reserve and the Three-Rivers Headwater NNR, and aim to protect important endangered mammals rather than plants (Su et al., 2019). The Three-Rivers Headwater Region, Hengduan Mountains, Wumeng Mountains, Bashan-Wushan Mountains, source area of the Ganjiang River, Mufu Mountains, and the Yangtze River Estuary, identified in this study as NNR conservation gaps, should therefore be included in future conservation and management efforts (Figure 3a). In plant diversity conservation, the PNRs played an important complementary role in the YRB, covering fewer hotspot grid cells but protecting more endemic and regional endemic species (Figure 5 and Table 1). Hotspot areas including the north and south Hengduan Mountains and the Bashan-Wushan Mountains contain only small and scattered nature reserves (Figure 3b), indicating the insufficient conservation effectiveness of the current conservation networks; more conservation actions need to be implemented in these areas.
Although endemic species received decent protection under the NNR and PNR networks, threatened and regional endemic plant species are much less protected (Figure 4 and Table 3). Therefore, efforts to optimize the conservation network should pay close attention to threatened and regional endemic plant species of the YRB. Many biodiversity hotspot areas, such as the Dalou Mountains, Mufu Mountains, and the northern parts of the Hengduan Mountains, have been seriously divided by anthropogenic areas and are now highly fragmented (Figure 3 and Table S11), highlighting the need to establish conservation corridors. Distribution patterns of hotspots also indicated that many conservation gaps were located at the junctions of provinces, likely because nature reserves in China generally do not cross administrative boundaries (Su et al., 2019; Xu et al., 2017). Increased efforts to remove such provincial boundary restrictions on nature reserves should be made in the future.

| Specific conservation planning focusing on target plant categories
Biodiversity distribution patterns in the YRB were apparently zonal due to the distinct variations in landform, habitat, and climate from west to east (Chen et al., 1997; Zheng, 2004). Our correlation analysis also indicated an apparent mismatch between different distribution algorithms, especially between the complementarity algorithm and other algorithms, raising the question of how to reasonably utilize hotspots in prioritizing conservation (Figure S5 and Table S7). The complementarity algorithm and spatial phylogenetics provided a good reference for hotspot determination and, combined with high species complementarity and PD, provided new data for conservation priority planning. Furthermore, correlation analysis indicated apparent incongruences between the distribution patterns of different plant categories, so specific conservation efforts should be advocated for certain plant categories, and conservation priority planning should focus on the plant categories with insufficient conservation status. In particular, threatened and regional endemic species were identified as insufficiently protected compared with other groups. This is particularly problematic given that these groups often have low abundance, which may result in greater susceptibility to environmental stochasticity (Rejmánek, 2018; Wilsey & Polley, 2004).
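The mismatch between richness-based and complementarity-based selection can be illustrated with a minimal sketch. It uses a generic greedy complementarity heuristic, which is an assumption made for illustration and not necessarily the algorithm implemented in this study; the function names and the toy species lists are hypothetical.

```python
def top_richness(cell_species, n_cells):
    """Select the n_cells grid cells with the most species (ties by cell id)."""
    ranked = sorted(cell_species, key=lambda c: (-len(cell_species[c]), c))
    return ranked[:n_cells]


def greedy_complementarity(cell_species, n_cells):
    """Greedily select cells that each add the most species not yet covered.

    Ties are broken by cell id so that the result is reproducible.
    Selection stops early once no remaining cell adds a new species.
    """
    selected = []
    covered = set()
    remaining = dict(cell_species)
    while remaining and len(selected) < n_cells:
        best = max(sorted(remaining), key=lambda c: len(remaining[c] - covered))
        if not remaining[best] - covered:
            break
        selected.append(best)
        covered |= remaining.pop(best)
    return selected, covered


# Hypothetical toy data, not data from this study.
cells = {
    "c1": {"sp1", "sp2", "sp3", "sp4"},
    "c2": {"sp1", "sp2", "sp3"},
    "c3": {"sp5"},
    "c4": {"sp6", "sp7"},
}

rich_cells = top_richness(cells, 2)
rich_covered = set().union(*(cells[c] for c in rich_cells))
comp_cells, comp_covered = greedy_complementarity(cells, 2)

print(rich_cells, len(rich_covered))
print(comp_cells, len(comp_covered))
```

In this toy case the two richest cells share most of their species, so the richness-based choice covers fewer species in total than the complementarity-based choice, even though both select the same number of cells. This is one way such algorithms can yield only weakly correlated hotspot patterns.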
The geographic range of plant species is important in the identification of hotspots (Huang et al., 2012, 2013, 2016), because narrow-ranged species are concentrated in particular geographic areas and can potentially be replaced by widespread species (Xu et al., 2019). This highlights the need to take the biological characteristics of different plant categories into account in conservation priority planning. Because many plant hotspots in the YRB are surrounded by anthropogenic and residential areas and habitat fragmentation is serious (Zhang et al., 2014; Figure 3e,f), we suggest establishing plant microreserves around these areas, as has been successfully promoted previously (Fos et al., 2014; Laguna et al., 2016). In the upstream area of the YRB, the flora, which plays an important role in water source ecology, is often seriously damaged by anthropogenic pressure exacerbated by economic underdevelopment. Therefore, we suggest conducting eco-compensation projects, which would not only protect the local endemic flora but also benefit the local population (Wang et al., 2016).

| Long-term protection of plant diversity under climate change in YRB
In recent decades, extreme climate events have occurred frequently in the YRB and habitat fragmentation has intensified, indicating a significant response to climate change (Gu et al., 2015). In this study, the prediction of suitable habitat areas indicated an apparent shift of suitable habitat areas between current and future climate change scenarios, resulting in decreasing habitat quality in the western part of the YRB and increasing habitat quality in the eastern parts (Figure S17). More hotspot grid cells will likely see reduced suitable habitat areas in the future in the central and western parts of the YRB (Figure 3c,d). In addition to suitable habitat areas, we also considered the land use situation, which was often overlooked by previous studies (Xie, 2003; Xu et al., 2010; Yang et al., 2021). Land use change causes strong habitat loss, leading to biodiversity loss (Powers & Jetz, 2019; Winkler et al., 2021); therefore, land use needs to be fully integrated into future protection planning. When the expansion of anthropogenic land was considered, the suitable habitat areas shrank sharply under both current and future climate change scenarios (Figure 3e,f). Climate change will similarly endanger suitable habitat areas in NNRs and PNRs confined to the central and western parts of the YRB (Figure 3e,f).
The results of the suitable habitat area analysis can provide a basis for the optimization and rearrangement of conservation networks, helping to avoid conservation gaps. In particular, the Hengduan Mountains, Bashan-Wushan Mountains, and western parts of the Qinling Mountains were highlighted not only as biodiversity hotspots but also as especially stable suitable habitat areas under current and future climate change scenarios (Figure 3e,f). Therefore, these areas are key to biodiversity protection. A pronounced decline of suitable habitat areas was found in the middle reaches of the YRB, especially the southwestern part of Sichuan, the eastern part of the Qinling Mountains, and eastern parts of Guizhou (Figure 3e,f), indicating a pressing need to establish migration corridors. New suitable habitat areas emerged in the southeast of the YRB (Figure 3e,f), which would be of interest for the establishment of new nature reserves. In addition, there are many conservation gaps in stable suitable habitat areas, for example, west Sichuan and east Chongqing (Figure 3e,f). Because these areas are both biodiversity hotspots and important suitable habitat areas under current and future scenarios, reasonable measures and appropriate conservation programs should prioritize them.

| CONCLUSION
Our study showed that big occurrence data and multiple algorithms are effective tools in hotspot determination. Here, the hotspots identified were mostly confined to the southwest of the YRB and areas surrounding the Sichuan Basin. Correlation analysis indicated an apparent mismatch among the distribution patterns of different algorithms and plant categories. Although the current conservation networks play an important role in biodiversity conservation, they fail to cover all hotspot areas, leaving numerous conservation gaps, especially in the southwestern YRB, including the Hengduan Mountains and Wumeng Mountains, and the dominant advantage of NNRs in conservation is not apparent. More conservation efforts are needed in these areas. Prediction analysis indicated that suitable habitat areas face large uncertainties under both current and future climate change scenarios. Hotspots located in stable suitable habitat areas, including the Hengduan Mountains, Bashan-Wushan Mountains, and western parts of the Qinling Mountains, are of the highest importance for implementing conservation priority planning. Future conservation efforts need to treat hotspots as a conservation priority, optimize the layout of existing conservation networks, and take the influence of climate change on biodiversity into account.