Predicting the distribution of plant associations under climate change: A case study on Larix gmelinii in China

Abstract Association is the basic unit of plant community classification. Exploring the distribution of plant associations can help improve our understanding of biodiversity conservation. Different associations depend on different habitats and studying the association level is important for ecological restoration, regional ecological protection, regulating the ecological balance, and maintaining biodiversity. However, previous studies have only focused on suitable distribution areas for species and not on the distribution of plant associations. Larix gmelinii is a sensitive and abundant species that occurs along the southern margin of the Eurasian boreal forests, and its distribution is closely related to permafrost. In this study, 420 original plots of L. gmelinii forests were investigated. We used a Maxent model and the ArcGIS software to project the potential geographical distribution of L. gmelinii associations in the future (by 2050 and 2070) according to the climate scenarios RCP 2.6, RCP 4.5, and RCP 8.5. We used the multi‐classification logistic regression analysis method to obtain the response of the suitable area change for the L. gmelinii alliance and associations to climate change under different climate scenarios. Results revealed that temperature is the most crucial factor affecting the distribution of L. gmelinii forests and most of its associations under different climate scenarios. Suitable areas for each association type are shrinking by varying degrees, especially due to habitat loss at high altitudes in special terrains. Different L. gmelinii associations should have different management measures based on the site conditions, composition structure, growth, development, and renewal succession trends. Subsequent research should consider data on biological factors to obtain more accurate prediction results.

Typically, forests play an important role in the global carbon cycle (Bonan, 2008;Pan et al., 2011;Schlosser et al., 2003), and the dominant effect of climate change on forest ecosystems is evident at low and high altitudes (He et al., 2005). The Chinese boreal forests are on the southern margin of the Eurasian boreal forests (Jia et al., 2021). Larix gmelinii is commonly found in the boreal forests of subalpine coniferous forests in Northeast China and contributes to the forests' high carbon storage capacity (Fang et al., 2001;He et al., 2019). The range of L. gmelinii extends almost to the permafrost region (Larionova et al., 2004). A particular concern is that the northern boundary of the broad-leaved forest is moving northwest (Chen, 2000). Li et al. (2006) found that the geographical distribution of L. gmelinii forests is decreasing and may even move northward from China. Yang et al. (2014) indicated that suitable high-altitude areas for larch forests are not available in China.
Dominant species (especially constructive species) coexist with the community, are important builders of the community and create a specific community environment (Zhou, 1991). In this paper, the L. gmelinii associations of different dominant shrub and grass species were taken as the research object. Biodiversity is indispensable for stabilizing biological communities (Loreau & de Mazancourt, 2013;Ma et al., 2017;Mougi & Kondoh, 2012). Species in ecological communities reflect the interactions among organisms and between organisms and their abiotic environments (Cardinaux et al., 2018;Koffel et al., 2021;Walther et al., 2002). Many researchers have focused on the response of communities to global changes, and an indepth understanding of species interactions can help to predict their responses to climate change (Enquist, 2002;Gilman et al., 2010;Ovaskainen et al., 2013;Santos-Hernández et al., 2021). Climate change can lead to inconsistencies in the phenology of species, which in turn leads to community changes (Ovaskainen et al., 2013).
Through long-term observations, it has been found that with climate change, cold mountain habitats and the biological communities in high mountains are gradually decreasing (Gottfried et al., 2012).
Therefore, conserving habitats and maintaining the living conditions of this species is vital, given that larch habitats support a wide range of organisms, including endemic species, and that any habitat change can affect their distribution (Rivas et al., 2020).
The Chinese vegetation classification system is separated into three levels, namely vegetation, alliance, and association Wang et al., 2020), with the association being the basic unit of plant community classification (Jennings et al., 2009;Tansley, 1920). This study addressed the following research questions: (1) Which climatic factors have the power to distribute the L. gmelinii associations more strongly? (2) Which association types control the movement of L. gmelinii forests under different climate change scenarios?
Compared to field surveys, the study of plant communities using remote sensing methods does not provide sufficiently comprehensive results. For example, the spectral signal changes between communities are not evident when using remote sensing, and the ability to interpret complex local terrains is limited (Chang et al., 2004;Westman et al., 1989).In this study, 420 original plots of L. gmelinii forests were investigated. The forest plot area was set to 30 × 30 m, and the sample plot survey data included the basic condition of the tree, shrub, and herb species in the plot. The Maxent model and the ArcGIS software can help determine the future (by 2050 and 2070) potential geographical distribution of different associations based on the three different climate scenarios of RCP 2.6, RCP 4.5, and RCP 8.5 (Dyderski et al., 2018;Tapiador et al., 2019).
The reasons for the changes in spatial distribution can be analyzed using multinomial logistic regression analysis (Fagerland et al., 2008;Friedman et al., 2010;Kwak & Alan, 2002). Through this study, we seek to understand the current and future changes in the distribution of L. gmelinii associations to provide a scientific basis and useful reference for medium and long-term management, biodiversity protection, and regional ecological planning.

| Study area
The study area is located in Northeast China, with a geographical range of 43°25′N-53°33′N and 115°31′E-135°05′E encompassing an area of 0.723 million km 2 . The northern part of the Greater Khingan Mountains is the only high-latitude cold temperate region and is the second largest permafrost region in China (Duan et al., 2017(Duan et al., , 2020. (Figure 1).

| Data analysis
2.2.1 | Sample plot data Based on regional distribution data from Northeast China, 420 plots of L. gmelinii forests were selected for this study. Zhou (1991) divided the association based on the same layer structure; the dominant species or co-dominant species of each layer are the same plant community. The data were classified using two-way indicator species analysis (Hill et al., 1975) combined with traditional community classification (Zhou, 1991(Zhou, , 1994(Zhou, , 1997 to remove transitional associations and were assigned names. For example, Ass. Carex callitrichos, Rhododendron davuricum, Larix gmelinii (LRC1) and Ass. Vaccinium vitisidea, Rhododendron davuricum, Larix gmelinii (LRV3) have the same association group but LRC1 is the association where C. callitrichos is the dominant species, and LRV3 is where Vaccinium vitisidea is the dominant species. We set a buffer radius of 1 km to screen the distribution points of the plots to avoid the influence of overfitting caused by excessive correlation.
Subsequently, 13 association types were determined (Table 1). We then processed all the association distribution points with a 1 km buffer to obtain the points at the level of the L. gmelinii alliance, with 182 distribution points for the L. gmelinii alliance being available.

| Environmental data
The WorldClim database (http://world clim.org) can describe climatic conditions by specifying annual and seasonal changes in temperature and precipitation. We used the "WorldClim 2" dataset at a spatial F I G U R E 1 The geographical location of the research area. The blue part represents the research area, the red points represent the sampling points, and the black lines represent the provincial boundaries. TA B L E 1 Association type resolution of 30 arcs, commonly referred to as "1-km" spatial resolution (Fick & Hijmans, 2017). We then separated the future period into 2050 and 2070. CMIP5 implemented four representative concentration pathways (RCP 2.6, RCP 4.5, RCP 6.0, and RCP 8.5) which describe the change curves of different greenhouse gas concentrations in response to different levels of increase in radiative forcing (IPCC, 2013).

Code
CMIP6 employed the shared socioeconomic pathways (SSPs), working in harmony with RCPs via shared policy assumptions (The CMIP6 Landscape, 2019). Our study used field survey data, which requires close-to-natural simulations, so the policy-oriented CMIP6 scenario was not selected. In terms of global warming, RCP 8.5 showed the most pessimistic condition, RCP 2.6 showed the most optimistic condition, whereas RCP 4.5 showed moderate conditions. Based on previous research (Dyderski et al., 2018;Tapiador et al., 2019;Yu et al., 2019), we selected three climate change scenarios (RCP 2.6, RCP 4.5, and RCP 8.5) for prediction and analysis using Maxent.
Variables such as the soil and terrain are difficult to predict but can be regarded as static variables and input into the maxent model to obtain more accurate results (Stanton et al., 2012). The soil data were obtained from the World Soil Database at a spatial resolu-  (Warren et al., 2021) to conduct Pearson's correlation analysis. Based on the environmental contribution rate of the initial model, if the correlation coefficient of the two variables was greater than 0.8, the environmental variable with a larger contribution rate was selected, the actual distribution of species was determined, and relevant research results were examined (Yang et al., 2013(Yang et al., , 2014. The factors such as the soil moisture content (Yang et al., 2017) and the annual mean temperature (Jia et al., 2021) with ecological significance were saved by referring to the relevant research results. Finally, six climatic variables, three topographic factors, and five soil factors were selected (Table A1 in Appendix 1 and Table 2). Environmental variables have been proven to affect the distribution and physiology of plant species across different spatial extents (from local to global scales) and are widely used to project the distributions of plant species.

| Model analysis
Species distribution models (SDMs) provide comprehensive distribution statements of possible future occurrences by connecting the existence of species with condition predictors (Despland & Houle, 1997;Zhao et al., 2020;Zhong et al., 2021). Maxent shows higher performance and accuracy than other SDM tools (Carnaval & Moritz, 2008). It also has a good prediction ability for small sample datasets (Elith et al., 2011;Pearson et al., 2006;Phillips et al., 2006).
It can also be used to identify areas where sensitive species currently exist or may exist (Li et al., 2020;Qin et al., 2017). The Maxent model indirectly describes how ecological processes shape ecological communities in the form of constraints (Bertram et al., 2019), and simulates the sample and environmental data of vegetation at the local and regional scale (Comino et al., 2021;Merow et al., 2013;Phillips et al., 2017;Radosavljevic & Anderson, 2014).
Considering the association as a species, we used the Maxent model to quantitatively prove its association with environmental factors and explore the response of association distribution to climate change. Given that the quantity of L. gmelinii associations is different, we operate the model according to the following rules (Elith et al., 2011). By default (auto features), when the sample is greater than 80, all features were used. If the quantity range of the sample was 2-9, "Linear features" was selected. If the quantity range of the sample was 10-14, "Linear features" and "Quadratic features" were selected. If the sample's quantity range was 15-79, "Linear features", "Quadratic features," and "Hinge features" were selected. We randomly selected 75% of the distribution data as the training set to establish a prediction model, and the remaining 25% were used as the test set for model validation (Zhang et al., 2016).
The maximum number of iterations was 1000, and the number of model repetitions was 10 (Salako et al., 2019). Jackknife analysis using Maxent was performed to determine the weight of each variable (Zhang et al., 2016). The receiver operating characteristic curve analysis method was used to verify the accuracy of the Maxent model prediction results (Hanley & McNeil, 1982). In this method, the prediction accuracy of the model is determined by calculating the area under the curve (AUC) value (Swets, 1988). When the AUC value is greater than 0.9, the prediction accuracy is high, and the prediction results can be used. The Maxent model outputs the existence probability of L. gmelinii alliance and associations for each grid point in ASCII format, and uses ArcGIS to convert the data into Raster format to give the potential distribution map of the alliance and associations. We selected the "minimum training presence logistic threshold" (Itzel Montemayor et al., 2016), which is the 10time average of the maxent output, to distinguish between suitable and unsuitable regions for species and to visualize the model re- sults. To further quantitatively analyze the changes in the spatial pattern of the L. gmelinii alliance and their associations, we defined four types of conditions: suitable areas increased, unsuitable areas unchanged, suitable areas unchanged, and suitable areas decreased.
The SDM toolbox (Brown, 2014; http://www.sdmto olbox.org/ downl oads) was used to determine the spatial pattern change of the L. gmelinii alliance and associations under different future climate scenarios (Figures 2 and 3). This is based on the current distribution simulated by maxent, with suitable areas as 1 and unsuitable areas as 0. The simulated distribution under the future climate scenario was compared with the current situation; range expansion (suitable areas increased) was considered as −1, no occupancy (unsuitable areas unchanged) as 0, no change (suitable areas unchanged) as 1, and range contraction (suitable areas decreased) as 2.

| Multinomial logistic regression analysis
The dependent variable consisted of disordered multi-classification data, which were suitable for the multinomial logistic regression model (Fagerland et al., 2008;Friedman et al., 2010;Kwak & Alan, 2002). We used the multi-classification logistic regression analysis method to obtain the response of the suitable area change for the L. gmelinii alliance and associations to climate change under different climate scenarios.
We selected 182 distribution points for the L. gmelinii alliance in the study area, and the distribution changes and climatic factor changes at the 182 points were extracted as modeling data.

| Current and potential future geographical distribution of Larix gmelinii alliance and associations
The prediction accuracy was tested, and the mean AUC value of the test dataset was greater than 0.9, which showed that the simulation accuracy of the potentially suitable area using Maxent was high, and the prediction results were reliable ( Figures A3 and A4 in . It is predicted that the southern boundary will move northward, the eastern boundary will move slightly westward, the western and northern boundaries will not substantially change, and the These results further highlight that the areas suitable for L. gmelinii associations in its future distribution will also decrease. However, there were significant differences in the distribution of different association types. Some of the L. gmelinii association distribution results are used as an example (the remainder of the results are in the Appendix 1, please refer to Figure A1) below. The examples selected are L. gmelinii associations with different habitats and additional sampling points, including LRC1 and LRV3 which are important in mesogenic drought habitats, LH5 and LCC6 for mesogenic habitats, LLV9 for mesogenic wet habitats, and LBC11 for wet habitats ( Figure 3). The northern area suitable for LRC1 has considerably increased. The main distribution in the scattered areas for LRV3 will decrease further. Although the LH5 suitable habitat will decrease, it will also show a pronounced increase in the Northwest. The habitat loss rate of LCC6 will be relatively high, but this type will occupy a new northward habitat. The shrinkage of suitable areas for LLV9 and LBC11 was relatively less than that of the other types, and there was also an increase in suitable areas.

| Importance of environmental factors in the Larix gmelinii alliance and associations
The output results of the three climate scenario models for the two periods were analyzed, and the contribution rates of each environmental factor involved in the modeling were statistically analyzed according to the jackknife method provided by the model. The statistical results for the contribution rates are shown in Figure 4 (please refer to Figure A2 in Appendix 1

| Response of the spatial distribution of Larix gmelinii alliance and associations to climate change
In this study, we used the entire suitable area as a reference and

| DISCUSS ION
Dominant species can alter the living conditions of other species and affect the entire community (Hickler et al., 2012). At the community scale, forests are a mixture of tree species with different functional characteristics and growth behaviors that respond to different light, moisture, and nutrient regimes (Pan et al., 2013).  Table 2 for an explanation of the environmental factors.
conditions, and the analysis accuracy from the perspective of the alliance is not enough. The associations in our study were divided into four categories based on the vegetation type, namely mesogenic drought, mesogenic, mesogenic wet, and wet association. the different association types were inconsistent. Our study found that the temperature type was the most important factor influencing L. gmelinii, followed by the soil type. Soil type considerably influences the distribution of LV10, which is related to the growth of such associations in brown taiga soils (Zhou, 1991).
In theory, only two datasets are needed to run the Maxent model. The first is the geographic distribution points displayed for the target species in the form of latitude and longitude. The second is the actual distribution area of the species and the environmental variables of the target area, which are predominantly climate data, terrain data, and soil data. Each association has its own specific habitat, which can be used for predicting the spatial geographic distribution of associations. We have the existence data for the distribution points and the environmental data associated with the distribution points, thus, meeting the two necessary conditions. Therefore, the Maxent model has applicability for association prediction. Regarding the environmental data selection, our study considered climate factors as dynamic variables and terrain and soil factors as static variables. However, the terrain and soil factors will also change under future climate scenarios (Richter & Markewitz, 2003). Moreover, subsequent research should consider data on biological factors to obtain more accurate prediction results.
Our study found that temperature was the most important factor affecting the distribution of L. gmelinii forests and most of Increasing temperature, precipitation increase, and precipitation seasonal dispersion are the main causes for the reduction in the suitable area for the Larix gmelinii alliance.
In terms of clump physiology, L. gmelinii has strong drought resistance and can grow under mild drought conditions (Sugimoto et al., 2002). Under climate warming, the distribution of plant species tends to shift to habitats at high latitudes or altitudes (He et al., 2019). In the future, suitable areas for each association are expected to shrink by varying degrees. Using a logistic regression model, Leng et al. (2006)  is limited to L. gmelinii. When the general living conditions support multiple species with similar functions, or some species contribute less to the general living conditions, or when the characteristics are controlled mainly by the abiotic environment, the characteristics of the ecosystem will be insensitive to species loss (Hooper et al., 2005).
For different L. gmelinii associations, different management measures are required for each association because of different habitat conditions, composition structure, growth, development, and renewal succession trends (Estrada Valdés et al., 2021;González de Andrés et al., 2018;Jia et al., 2016). Areas with good site conditions should be selected for performing thinning, and the forest spatial structure should be adjusted and optimized through thinning (Zhang et al., 2018). The mesogenic drought habitat associations have the characteristics of a dry and cold climate and mainly contain leafy plants but no big leafy plants (Zhou, 1991). Considering  Zhou, 1991). Although LH5 is more common, in the future it is also necessary to focus on protecting the reduced areas of LH5 suitable areas, implementing artificial promotion updates, and increasing protection and attention. The natural regeneration of LCC6 L. gmelinii forest is poor, but the trees grow lush and accumulate large amount; this is a forest type with high economic value in the northern part of Xiao Hinggan Ling (Zhou, 1994). The suitable area of LCC6 and LCC7 under the prediction of future climate scenarios changes greatly, and the protection of wild resource plants in the area should be strengthened when using forest grassland for the production of related economic sideline industries. Furthermore, the recovery of forests after a fire is generally due to undamaged or slightly damaged trees (Oreshkova et al., 2013). Therefore, it is necessary to increase the number of LPV8 samples that face the most serious fire loss (Chen et al., 2015;Makoto et al., 2011). LPV8 is also critical as an important net resource for maintaining the economic development of the community. LLV9 Natural regeneration of L. gmelinii is poor due to the shade and wet forest, thick moss and lichen (Zhou, 1991). For LLV9, attention should be paid to the combination of artificial and natural regeneration, and the growth of larch should be promoted by rational utilization of moss and lichen.
Understory natural regeneration of LV10 is good, and Vaccinium is widespread under such L. gmelinii with a frequency of 100%, and the soil of this kind of association is moist and has good drainage (Zhou, 1991). In the future, we should pay attention to water conservation and minimize soil erosion. The suitable area of LBC11 is the largest in our study, and the adaptation of this association to climate change is relatively good in future climate scenarios. In the future, we should pay attention to the impact of human factors on this cluster and protect the existing habitats as far as possible.
Previous studies have demonstrated the effects of fixed conditions on plants and animal communities, such as the long-term absence of rain (Vicente-Serrano et al., 2020). As far as possible, continuous conditions should be artificially created for climate-sensitive association types, such as creating wet and moist conditions for wet association groups (LBV12 and LBV13).

| CON CLUS IONS
The

CO N FLI C T O F I NTE R E S T
The authors declare that they have no known competing financial interests or personal relationships that could have influenced the work reported in this study.

DATA AVA I L A B I L I T Y S TAT E M E N T
The data of this paper is stored in https://doi.org/10.6084/m9. figshare.20438595.
Predicting   Table 2 for an explanation of the environmental factors.
F I G U R E A 3 Average omission and predicted area for Larix gmelinii alliance (QX) and associations. The picture shows the training omission rate and predicted areas as a function of cumulative threshold, averaged over the 10 replicate runs.

F I G U R E A 4
The receiver operating characteristic (ROC) curve for Larix gmelinii alliance (QX) and associations.  Table 2 for a detailed introduction to environmental factors.