Meta‐replication, sampling bias, and multi‐scale model selection: A case study on snow leopard (Panthera uncia) in western China

Abstract Replicated multiple scale species distribution models (SDMs) have become increasingly important to identify the correct variables determining species distribution and their influences on ecological responses. This study explores multi‐scale habitat relationships of the snow leopard (Panthera uncia) in two study areas on the Qinghai–Tibetan Plateau of western China. Our primary objectives were to evaluate the degree to which snow leopard habitat relationships, expressed by predictors, scales of response, and magnitude of effects, were consistent across study areas or locally landcape‐specific. We coupled univariate scale optimization and the maximum entropy algorithm to produce multivariate SDMs, inferring the relative suitability for the species by ensembling top performing models. We optimized the SDMs based on average omission rate across the top models and ensembles’ overlap with a simulated reference model. Comparison of SDMs in the two study areas highlighted landscape‐specific responses to limiting factors. These were dependent on the effects of the hydrological network, anthropogenic features, topographic complexity, and the heterogeneity of the landcover patch mosaic. Overall, even accounting for specific local differences, we found general landscape attributes associated with snow leopard ecological requirements, consisting of a positive association with uplands and ridges, aggregated low‐contrast landscapes, and large extents of grassy and herbaceous vegetation. As a means to evaluate the performance of two bias correction methods, we explored their effects on three datasets showing a range of bias intensities. The performance of corrections depends on the bias intensity; however, density kernels offered a reliable correction strategy under all circumstances. This study reveals the multi‐scale response of snow leopards to environmental attributes and confirms the role of meta‐replicated study designs for the identification of spatially varying limiting factors. Furthermore, this study makes important contributions to the ongoing discussion about the best approaches for sampling bias correction.

Typically, SDMs are developed using all the predictors measured at a fixed scale. However, single-scale modeling risks incorrectly describing a species' responses to features of the environment (Mateo-Sánchez, Cushman, & Saura, 2013;McGarigal et al., 2016;Timm, McGarigal, Cushman, & Ganey, 2016). Each species will experience their environment at a range of different scales (Levin, 1992) in relation to life history traits and ecological requirements (Addicott et al., 1987;Johnson, 1980;Wiens, 1989). The correct identification of scales at which animals perceive and respond to landscape features should therefore be an important focus of ecological and distribution studies (Levin, 1992;McGarigal et al., 2016;Wiens, 1989).
Previous research has shown that multi-scale optimization is critical in producing reliable predictions of carnivore habitat (e.g., Mateo-Sánchez et al., 2013;Vergara et al., 2015;Wasserman, Cushman, Wallin, et al., 2012) and felids in particular (e.g., Ashrafzadeh et al., 2020;Elliot, Cushman, Macdonald, & Loveridge, 2014;Hearn et al., 2018;Krishnamurthy et al., 2016;Macdonald et al., 2018Macdonald et al., , 2019. For example, Ashrafzadeh et al. (2020) employed a multiscale, multi-species approach to model habitat suitability and connectivity for six felids across Iran, finding that each species' habitat use was influenced in a scale-dependent manner by different sets of environmental variables. Similarly, Hearn et al. (2018) modeled multi-scale habitat suitability of four felids across Sabah, Borneo, and found species-specific differences in the scale of habitat associations, with most species associated with broad scales of environmental variation. Therefore, multi-scale SDMs, which describe how the contribution of each variable varies across scales, produce more accurate, organism-centered, distribution models .
With this study, we assess the performance of multi-scale models versus single-scale models, in terms of accuracy and predictive ability, with data from a wide-ranging top predator, the snow leopard (Panthera uncia). The snow leopard is a species of conservation concern, listed as Vulnerable by IUCN (McCarthy, Mallon, Jackson, Zahler, & McCarthy, 2017), and is regarded as a flagship species for the mountainous habitats of Central Asia. The global population is estimated to be 2,710-3,386 mature individuals (McCarthy et al., 2017), though there is substantial uncertainty. Estimates of local abundance and studies on distribution remain scarce McCarthy, Mallon, Sanderson, Zahler, & Fisher, 2016;Robinson & Weckworth, 2016). To understand population distribution, status, and trend, it is important to identify areas whose characteristics are most favorable to snow leopard presence and persistence, which might result in strengthened survey efforts and conservation measures.
We investigated snow leopard habitat relationships in two landscapes of Western China: the Qilian Mountains (Gansu and Qinghai Provinces) and in the Himalayas of the Tibetan Autonomous Region. Of the few published studies from these areas, most have focused on site occupancy, density estimation, and human perception toward snow leopards (Alexander, Chen, et al., 2015;Alexander, Gopalaswamy, Shi, Hughes, & Riordan, 2016;Alexander, Gopalaswamy, Shi, & Riordan, 2015;Alexander, Shi, Tallents, & Riordan, 2016;Alexander, Zhang, Shi, & Riordan, 2016;Chen et al., 2016Chen et al., , 2017. Bai et al. (2018) produced the first habitat suitability model for snow leopard using the MaxEnt algorithm in the Qomolangma National Nature Reserve, in the Chinese Himalayas.
Comparing species-habitat relationships across meta-replicated study areas can provide more reliable and generalizable information about the factors and scales that drive species occurrence and distribution patterns (e.g., Cushman et al., 2011;Shirk et al., 2012;Short Bull et al., 2011;Wan et al., 2017), and the effects of landscape patterns on ecological processes generally (e.g., McGarigal & Cushman, 2002).
Earlier, Li (2012) modeled the range-wide distribution of this species. Further SDMs described range expansion and contraction under past (Li et al., 2016) and future (Aryal et al., 2016;Li et al., 2016) climate change scenarios. Recently, Li et al. (2020), modeled range-wide snow leopard habitat as a means to infer a resistance map to guide management recommendations. None of these studies, however, explicitly considered scale issues (sensu McGarigal et al., 2016), nor adopted a formal meta-replication framework (sensu Shirk et al., 2012).
With this study, we focus on the multi-scale habitat relationships of snow leopards in two study areas characterized by different topography, land cover, and climatic attributes to: (a) identify the landscape-specific predictors of snow leopard relative habitat suitability; (b) assess the influence of scale on snow leopard habitat relationships and identify the scales at which their effect is most pronounced; (c) assess the performance of multi-scale models versus single-scale models, comparing relative accuracy and predictive ability; (d) provide a framework for model selection and correction of biased occurrence records; (e) create predictions from ensembles of competing distribution models, built with different variables, to probabilistically infer relative suitability. As a further objective (provided as Appendix material), we (f) assess the efficacy of two correction methods under different bias intensities. To our knowledge, this is the first SDM on snow leopard adopting a multi-scale and meta-replicated approach.

| Presence data
We used two sources of presence data: photographic captures from remote camera trapping, and genetically verified fecal samples. In QMLNR, camera trap locations and positions were reported in Bai et al., (2018), and cover the period of 2014-2016.
In QNR, Qinghai, occurrence data from Tianjun, Qilian, and Meiyuan Hui Autonomous Counties were exclusively from photographic captures (WIBFU, unpublished). Details on fecal sample collection, preservation, and laboratory methods are reported in Bai et al. (2018).

| Environmental layers
We selected 27 variables, divided into five categories (Table 1).
We reclassified the ESA GlobCover 2009 v2.3 landcover raster (Arino et al., 2008) from 22 to 10 cover classes (Table 2) We downloaded annual mean temperature (TEMP) from WorldClim Version 2 (Fick & Hijmans, 2017) at 30 arc-seconds resolution. We used the National Roads and Highways of China layer (Berman, 2009) to calculate density of major traffic routes in the two landscapes. Density of rivers was calculated using inland water layers of China, downloaded from www.DIVA-GIS.org (accessed 15/03/2018). OpenStreetMap (www.opens treet map.org through http://downl oad.geofa brik.de/asia/, accessed 15/03/2018) was used to download layers of human settlements, from which we retained the categories "city", "town", "village", and "hamlet", which we weighted using a scale from 4 to 1, to compute a density raster.
We resampled all variables to a UTM projection, with a 90 m cell size. Each variable was calculated at nine scales, with radii (in m) of 300,600,1,200,2,400,4,800,9,600,14,400,19,200,28,800. We chose the increments based on the original resolution of GlobCover 2009 v2.3 raster layer (300 m). We set the limit to 28,800 m to approximate a plausible daily distance moved by snow leopards (27.9 km in McCarthy, Fuller, & Munkhtsog, 2005), in absence of telemetry information from our study locations.

| Univariate scaling
We conducted a univariate scaling for each variable (Mateo-Sánchez et al., 2013;Vergara et al., 2015) to identify the scales most strongly related to snow leopard presence, using MaxEnt v.3.4.1 (Phillips, Anderson, Dudík, Schapire, & Blair, 2017;Phillips et al., 2006). The choice of background (pseudo-absence points) should be based on previous ecological knowledge of the focal species (Phillips et al., 2009) and should reflect the geographic space accessible to a species in a given amount of time (Barve et al., 2011;Merow et al., 2013). Therefore, we used SDMtoolbox 2.2 (Brown, Bennett, & French, 2017) to create a buffer around occurrences with a radius of 28,800 m within which background points were selected, which is approximately the radius of snow leopard home ranges (McCarthy et al., 2005).
Where occurrences are clustered, the performance of the model can be increased by limiting the background to the fraction of the area in which presence points occur (Acevedo et al., 2012;Chefaoui & Lobo, 2008;Phillips et al., 2009). Models trained in this way tend to show better environmental potential when projected beyond the calibration areas, suggesting reduced tendency toward overfitting (Acevedo et al., 2012;Chefaoui & Lobo, 2008).
Following Vergara et al. (2015) and Mateo-Sánchez et al. (2013), we ran MaxEnt with 20,000 background points, 5,000 iterations, linear and quadratic features, default regularization multiplier,  (Warren & Seifert, 2011). AUC diff is calculated by subtracting the evaluation AUC from the calibration AUC and represents a measure of overfitting, since overfitted models tend to discriminate with great accuracy when using the training partition, but performs poorly on the test fraction of the data (Warren & Seifert, 2011). To avoid multi-collinearity, we ran pairwise Pearson's correlations on the set of best performing scales for each variable, using the Band Collection Statistics tool in ArcGIS.

| Multivariate models and scales comparison
The predictors retained were included in the multivariate models at the scale identified in the univariate scaling step. We built models composed of five predictors, including one variable from each category (e.g., Mateo-Sánchez et al., 2013) (Table 1). We retained a maximum of five predictors in order to avoid overfitting of the models by adding potentially spurious variables (Mateo-Sánchez et al., 2013;Vergara et al., 2015) and to allow comparability of the models in terms of variance explained by each of the predictors (Mateo-Sánchez et al., 2013;Vergara et al., 2015).
To estimate how the multi-scale optimization affected the predictive performance of each model, we compared the top performing multivariate models with models built using the same environmental predictors at each of the nine scales considered (Mateo-Sánchez et al., 2013;Timm et al., 2016;Vergara et al., 2015). We retained the ten best multivariate models for each study area (based on AUC) and ran them in MaxEnt with the same parameters as before, with

| Sampling bias correction-simulated and real data
Simulating habitat relationships of a virtual species is often used in SDMs to assess the predictive power of modeling settings and/ or bias correction methods (Fourcade et al., 2014;Hijmans, 2012;Kramer-Schadt et al., 2013). We implemented a simulation experiment to evaluate the effects of bias intensities and performance of correction methods. We used the uncorrected top five models in each study area, at their best performing scales, as starting scenarios, considering them provisional "reference" models.
These reference models provided a relative suitability value at each pixel which we used to generate simulated occurrence points.
This provides training data to rebuild models, given a known probability of occurrence, and to assess the performance of the models in correctly identifying the variables and scales that drive the occurrence probability, the accuracy and/or bias of the resulting predicted probability map, and performance of alternative bias correction maps.
To implement this simulation experiment we first created a uniform random raster for each study area, with values ranging from 0 to 1, and subtracted this raster from the probability surface of each of the best five reference models. We then overlaid the subtracted outputs creating a cumulative potential surface for the five models combined.
On each of these raster layers, we created a cloud of 50,000 random points on the whole extent of both study areas. As these points were placed randomly by the algorithm, we selected a subset of only those occurring on pixels with positive values (representing probabilistic potentially suitable sites with values bigger than zero), creating a set of potential occurrences (4,602 in QLNSP, 555 in QMLNR). From these points, we randomly selected an equal number of presence points as the original datasets, using SDMtoolbox 2.2 (Brown et al., 2017). We thereby created two full random unbiased simulated sets of pseudo-occurrences, representing the whole suite of potential habitats for snow leopards in the two landscapes (QLSNP_FR and QMLNR_FR).
We also produced a situation in which the pseudo-occurrences were created with the same geographic bias as the original datasets, but were more spatially uniformly distributed (simulated-biased datasets, QLNP_SB and QMLNR_SB). To do this, we clipped the original cloud of 50,000 random occurrences to an extent delimited by a buffer of 28,800 m radius built around the original occurrences (real datasets, QLNSP_RD and QMLNR_RD). This produced a set of 2047 points in QLSNP and 271 points in QMLNR. As described previously, we used SDMtoolbox 2.2 (Brown et al., 2017) to randomly select as many presence points as the original datasets.
To assess the effect of bias correction in three sampling scenarios, characterized by decreasing bias intensity, we used SDMToolbox 2.2 (Brown et al., 2017) to apply spatial rarefactions (SR) and Gaussian density kernel surfaces (GK) at scales of 1,200, 2,400, 4,800, and 9,600 m. Spatial rarefaction and density kernels are commonly used and highly effective correction methods in SDMs (Fourcade et al., 2014;Kramer-Schadt et al., 2013;Mateo-Sánchez et al., 2013;Veloz, 2009;Vergara et al., 2015). To ensure consistency with the previous steps, and place optimizations across the same training areas, we applied the two correction categories to all datasets, constraining the background to 28,800 m from the full sets of points.
We ran the RD datasets at each of the eight corrections. We further ran the top five models in each study area using the two new simulated sets of occurrences (FR and SB), with and without corrections. All MaxEnt parameters were set as described for the multivariate models' evaluation. We assessed the performance of these simulations and bias corrections using threshold independent measures (AUC and AUC diff ) for discrimination accuracy and overfitting proxies, and through the maximum training sensitivity plus specificity logistic threshold (MTSS) omission rate (Liu, White, & Newell, 2013;Syfert et al., 2013;Vergara et al., 2015), which is a preferable evaluation metric in presence-only and presence-background frameworks.
We averaged all values across the five models for each bias situation (corrected and uncorrected) and dataset. Results and discussion on the simulated occurrences are provided as Appendix material.

| Sampling bias correction-niche overlap in geographic space
We anticipated that models built with different variables may respond differently to dataset type and bias correction (Randin et al., 2006), yielding different predictions (Guisan et al., 2017), even when they show similar performances based on evaluation metrics (Burnham, Anderson, & Burnham, 2002). In these circumstances, it is difficult to unequivocally rely on one single model to predict species distribution (Araújo & New, 2007;Marmion, Parviainen, Luoto, Heikkinen, & Thuiller, 2009). Since models are an approximation of a true underlying relationship, there can be many candidate models whose evaluation criteria meet the conditions required to be considered a likely representation of reality (Araújo & New, 2007;Marmion et al., 2009).
We therefore report results from an ensembling strategy to assess which correction, across five models, would most improve the uncorrected biased scenarios (RD_RAW). For the RD datasets, we overlaid the top five models according to their correction (SR and GK) at the four radii analyzed, or absence of correction (RAW). We normalized the outputs to values ranging from 0 to 1, using the Geomorphometric and Gradient Metrics Toolbox 2.0 (Evans et al., 2014;Figures 2 and 3). We used ENMtools 1.3 (Warren, Glor, & Turelli, 2010) to calculate Schoener's D index of niche overlap (Schoener, 1968) to assess, for each ensemble, which correction would give, on average, the highest overlap with respect to a simulated, unbiased ensemble of models (FR_RAW).
Before this step, raster layers in QLSNP were upscaled to 300 m cell size, in order to prevent memory failure caused by large extent and small pixel size.
The D statistic (Schoener, 1968), describes the probability distribution of the absolute difference on the geographic space between each pixel i of two SDMs x and y, and ranges from 0 to 1, which is the highest degree of overlap (Warren, Glor, & Turelli, 2008). After Fourcade et al. (2014), we evaluated the performance of bias correction using the indicator ΔD geo , which expresses the degree of how an uncorrected raw model is improved after correction, with 1 corresponding to the correction yielding a model identical to an unbiased (or reference) one. We selected the best correction as the one maximally increasing ΔD geo , (indicating actual correction), while at the same time reducing the average MTSS omission rate with respect to the average of the uncorrected raw models (RD_RAW).

| Presence data
We collected 464 and 475 fecal samples in QMLNR and QLSNP, respectively. We were able to genetically identify 68.3% and 77.4% of samples in the two study areas (Table S1). We found 134 snow F I G U R E 2 Uncorrected and corrected ensembles in Qilianshan National Park using, real occurrence data (QLSNP_RD). RAW = uncorrected model. SR = Spatial Rarefaction, followed by radius in meters. GK = Gaussian Kernel, followed by radius in meters. Inland water layers have been overlaid on top, using the color scheme indicating the lowest suitability leopard samples in QMLNR (28.8% of the total, 42.2% of identifications) and 230 in QLSNP (48.4% of the total and 62.5% of identifications) (Table S1). In QMLNR, snow leopards were detected in 120 out of 286 cameras in three counties (Bai et al., 2018

| Predictors of snow leopard occurrence
We evaluated 243 univariate models per study area, assessing each of the 27 variables at the nine scales analyzed. In QMLNR, the majority of variables (85.1%) were selected at scales ≥14,400 m, while QLSNP scale selection exhibited more heterogeneity, with roughly 48.1% of variables at scales ≥14,400, and 33.3% from 4,800 to 9,600 m. (Table 3, Table S2).
Differences in scales were more pronounced among topographic descriptors, selected from fine to medium scales in QLSNP and at large scales in QMLNR (Table 3). ELEV was the only topographic metric for which this trend is reversed (Table 3). There were also scale differences in the selection of area (AREA_AM) and extensiveness (GYR_AM) class-level metrics associated with Grassland (Gr), Bare Land (Br), and Snow cover (Sn), with fine -medium scales in F I G U R E 3 Uncorrected and corrected ensembles in Qomolangma National Nature Reserve, using real occurrence data (QMLNR_RD). RAW = uncorrected model. SR = Spatial Rarefaction, followed by radius in meters. GK = Gaussian Kernel, followed by radius in meters. Dashed white lines represent National Borders. Inland water layers have been overlaid on top, using the color scheme indicating the lowest suitability QLSNP and large scales in QMLNR. The same metrics for Shrubland (Shr) and needle-leaved forest (NLF) showed a general agreement between the areas, being selected at coarse scales. We observed broad similarities between the two landscapes in the scales describing the response to the percentage of landscape (PLAND) relative to all landcover types at large scales (Table 3). The main difference in lansdcape-level metrics was in terms of the scale at which contrast-weighted edge density (CWED) was selected (medium scale in QLNSP (4,800 m) and at the largest in QMLNR (28,800 m), while aggregation (AI) and patch density (PD) metrics were selected consistently at the broadest scales across areas (28, (Table 3). We assessed only one climatic variable (TEMP) which was selected at the smallest scale in both areas (Table 3). TA B L E 3 AUC values of the variables selected after the univariate scale selection, conducted on different predictors at each of the nine scales considered density of water sources (DENS_riv_19200) was selected in seven out of ten models, with the remaining built with DENS_set_9600.

| Multivariate SDMS
Among topographic descriptors, DISS_600 (two models) and SLP_2400 (eight models) were included in QLSNP, and CTI_14400 (eight models) and SLP_28800 (two models) in QMLNR (Table 4). PD was included in only one model in each study area. The remaining top models were built with AI_28800 in QLSNP and CWED_28800 in QMLNR. Among landscape categories, metrics associated with Gr (GYR_AM_Gr_4800, PLAND_Gr_14400) were selected in nine out of ten models in QLSNP, with the remaining built with AREA_AM_ Shr_19200. In QMLNR, five models were built with metrics associated with Gr (PLAND_Gr_14400, GYR_AM_ Gr_19200) four with Br (PLAND_Br_28800, GYR_AM_Br_28800) and one with NLF (GYR_ AM_NLF_14400) (

| Multi-scale versus single-scale models
We assessed the ten best multivariate models at each of the nine scales considered and compared their performances with those of their multi-scale equivalents. This produced 100 models per study area. We observed that multi-scale models had the highest discrimination ability in almost all cases (

| Bias correction and overlap with reference unbiased models
Here, we present results from real datasets (RD), with details of correction methods across the three different bias situations provided in Appendix S4. We ran 40 models per study area, evaluating the eight correction methods (Figures 2 and 3; Table 6, Appendix S4) on real datasets (RD) for each of the top five models. These correction methods consisted of four radii of spatial rarefaction (SR) and four radii of Gaussian density kernels (GK), as explained in the Methods section.
AUC values decreased consistently across all radii of rarefaction in the two study areas ( Table 6). The opposite was true for GKs, reaching similar performance to raw models (QMLNR) or improving the discrimination ability (QLSNP) ( Table 6). AUC diff values were not maximal for the raw models (Veloz, 2009;Boria et al., 2014, Radosavljevic andVergara et al., 2015), as we applied a restricted background in order to increase accuracy of the prediction (Acevedo et al., 2012;Chefaoui & Lobo, 2008;Phillips et al., 2009). On average though, large radii of SR consistently led to an increase of this metric, while small SR radii and GKs caused minimal change (Table 6). MTSS omission rate was always increased by rarefaction, although SR1200 in QLSNP caused a minimal increase with respect to the average of raw models. GKs on average were more robust in reducing MTSS omission rate at all radii in the two areas, except for the largest radius in QLSNP (Table 6).
In QLSNP_RD there was a steady increase in overlap based on D and ΔD geo as the radius of SR increased, reaching the highest value at a distance of 9,600 m (QLSNP_RD_SR9600, D = 0.921, ΔD geo = 0.584), but characterized by a high omission rate (0.407) ( achieved highest overlap and optimal improvement for the second correction type (D = 0.860, ΔD geo = 0.267) and an average omission rate lower than the average of the uncorrected models (0.255) (Figures 2 and 4, Table 6). In the QMLNR_RD dataset, SR4800 (D = 0.844, ΔD geo = 0.636, MTSS om = 0.190) and GK9600 (D = 0.768, ΔD geo = 0.457, MTSS om = 0.081) maximized the overlap in geographic space with respect to the ensemble of QMLNR_FR_ RAW models, with QMLNR_RD_GK9600 representing the best average correction (Figures 3 and 4, Table 6).

| Environmental predictors of snow leopard occurrence
Based on the results of the top performing corrected models (QLSNP_RD_GK4800 and QMLNR_RD_GK9600), temperature (TEMP_300) in both landscapes showed a unimodal peak of support, representing a key factor for habitat suitability (28.3%-34.8% contribution in QLSNP, 37.4%-50.8% in QMLNR), and emphasizing the influence of abiotic and climatic gradients in determining the distribution of snow leopards (Tables S5 and S6, Figures 5 and   6). Other important predictors were associated with topography.

| Overview of main results
Our study addressed four main issues. (a) What is the scale-dependent relationship between snow leopard occurrence and habitat variables in western China, (b) are those relationships stable and comparable between two different study areas with different limiting factors, (c) where they differ between study areas, can the differences be explained by different limiting factors, and (d) what is the performance of commonly used spatial bias correction methods in improving prediction of snow leopard habitat selection. Here we briefly describe the main implications of our results for each of these points, with elaboration of specific details in the following paragraphs. First, our models showed strong perfor-  Mateo-Sánchez et al., 2013;Wasserman, Cushman, Wallin, et al., 2012). Fourth, we found a relatively high degree of agreement between the two study areas in terms of variables and scales, with temperature and large extents of grass and sparsely vegetated conditions in upland and ridge topographic settings important for snow leopards in both landscapes. Fifth, the differences we did observe between study areas seemed to be related to differences in the limiting factors in those particular landscapes (e.g., Cushman et al., 2011;Shirk et al., 2014;Short bull et al., 2011).
Specifically, variables tended to be selected at broader scales in the landscape that was more homogeneous, and at finer scales in the more topographically and ecologically complex landscape, suggesting that scale of variable selection is related to the scale at which each variable is heterogeneous and thus potentially limiting, as also seen by previous studies (e.g., Short Bull et al., 2011). Sixth, our results clearly show there is potentially serious impact of spatial bias in presence-only models, and similar to past findings (e.g., Vergara et al., 2015), we found that Gaussian kernel methods of bias correction out performed other spatial filtering or rarefaction approaches in all scenarios we evaluated. However, in our two study areas these variables are not major limiting factors to occurrence patterns for snow leopards, but may still be critical components of habitat (e.g., Cushman et al., 2011;. Failure to account for this would result in models with lower discriminatory power and possibly incorrect assessment of key habitat components determining the true occurrence patterns, as well as the misspecification of true ecological interactions (Tables S3 and S4).

| Scales affecting snow leopards distribution
Our results show the emergence of snow leopard habitat selection at multiple scales for different variables in the same landscape, and for the same variables across different landscapes, with locally specific topographic and landcover heterogeneity factors determining those scales, and the vast majority of variables selected at mediumcoarse scales across the areas (Table 3). Large territorial carnivores are expected to select habitat features at mostly broad scales, reflecting their mobility (Elliot et al., 2014;Krishnamurthy et al., 2016) and home range requirements (Ashrafzadeh et al., 2020;Hearn et al., 2018;Khosravi et al., 2019;Macdonald et al., 2018Macdonald et al., , 2019Mateo-Sánchez et al., 2013). In the case of the snow leopards, the overall broad-scale response to habitat features is further driven by the low productivity of the cold xeric environments typical of the mountainous habitats in their range, determining scales of effect

F I G U R E 4
Top five models in each study area, shown at their best average correction. For details about variables used in each model, and best average correction, refer to the main text and Tables 4-6. QLSNP = Qilianshan National Park; QMLNR = Qomolangma National Nature Reserve. Dashed white lines represent National borders in QMLNR. Inland water layers have been overlaid on top, using the color scheme indicating the lowest suitability TA B L E 6 Performances of correction methods, reported as average of the five top models, for the real datasets (RD), in Qilianshan National Park (QLSNP) and Qomolangma National Nature Reserve (QMLNR). MTSS = Maximum training sensitivity plus specificity logistic threshold; MTSS om = omission rate for MTSS threshold. D represents Schoener's D (Schoener, 1968)  Note: ΔD geo calculated using D overlap with FR_RAW ensemble of models (ΔD geo =(D corrected -D biased )/(1-D biased ); Fourcade et al., 2014). Values in bold represent improvement or equal performance with respect to the raw models. The model selected as the best correction (maximizing D, with positive ΔD geo while reducing MTSS omission rate) is reported in bold and italic.

F I G U R E 5
Response curves for each of the unique variables included in the top five models, in Qilianshan National Park (QLSNP), shown at their best correction. These curves represent snow leopard response when each variable is tested without interactions with other predictors F I G U R E 6 Response curves for each of the unique variables included in the top five models, in Qomolangma National Nature Reserve (QMLNR), shown at their best correction. These curves represent snow leopard response when each variable is tested without interactions with other predictors greater than would be expected upon a relationship with body size alone (Fisher, Anholt, & Volpe, 2011;Tucker & Rogers, 2014).
As a recurring pattern in studies of habitat selection by large carnivores (Ashrafzadeh et al., 2020;Elliot et al., 2014;Hearn et al., 2018;Khosravi et al., 2019;Macdonald et al., 2018Macdonald et al., , 2019Mateo-Sánchez et al., 2013) we observed medium-broad scales negative association with variables expressing human footprint (Table 3), either represented by density of settlements or transport infrastructures. Snow leopards select their habitats restricting their use of topographic features to ridges and dry uplands to minimize human disturbance ( Figures 5 and 6). These patterns have been seen in other felids, which generally select higher elevation areas and rugged terrain as a way to minimize the risk of conflicts with humans Krishnamurthy et al., 2016;Macdonald et al., 2018, Reddy et al., 2017Reddy, Puyravaud, Cushman, & Segu, 2019).
We found selection of riverine features at medium-coarse scales, with finer scale of selection indicative of increased topographic complexity, likely resulting in a denser array of seasonal streams, and the coarser scale ultimately dependent on major hydrological processes operating at landscape level (Table 3). Where rivers are more abundant, snow leopards are associated with a slightly higher density of such features ( Figures 5 and 6), whose riverbeds might offer access to easier dispersal routes (besides ensuring ambush opportunities) in a terrain with increased topographic ruggedness and dissection (Table 3). Where rivers are less abundant, the coarse scale association might be indicative also of a secondary relationship with human disturbance, as many settlements in QMLNR are found in the proximity of these major watercourses (Table 3).
Broad ecological associations are highlighted not only by the scale at which landscape composition metrics are selected (Table 3), but also by their magnitude (Figures 5 and 6), which confirm snow leopard preference for largely connected contiguous habitats. The steady cross-area coarse scale response with regards to the relative abundance of habitat types (PLAND) and similarity of their scales between study areas (Table 3), suggest that snow leopards consistently select an optimal amount of such habitat features at broad scales.
This possibly implies ecological domains (Levin, 1992;Wiens, 1989) and confirms also the role of the area and extensiveness of key habitat patches (Tables 3 and 4 (Table 3).

| Performance of multi-and single-scale models
This study provides further evidence that, when habitat selection for different features operates at several scales (Levin, 1992;McGarigal et al., 2016;Wiens, 1989) (Table 3), the incorporation of landscapespecific scales provides a more accurate (and ecologically realistic) description of snow leopard habitat, compared to any approach in which the suite of predictors included in a model is held at a fixed scale (Mateo-Sánchez et al., 2013;Shirk et al., 2012;Timm et al., 2016;Vergara et al., 2015;Wan et al., 2017;Wasserman, Cushman, Wallin, et al., 2012). The multi-scale modeling approach was superior to the unscaled counterparts in almost all cases (Table 5), being thus able to identify the magnitude of effect of locally influential factors (Appendix S3), and their role in determining model performance (Table 5, Appendix S3).
There can be cases however in which unscaled models models might reach a performance similar to multi-scale models (Elliot et al., 2014;Graf, Bollmann, Suter, & Bugmann, 2005;Krishnamurthy et al., 2016;Martin & Fahrig, 2012). Reasons for such situations have been extensively investigated by Martin and Fahrig (2012) and are intrinsically dependent on the effect that a given variable has on species ecology at any of its focal grains, in relation to the species' life cycle.
Models built with fixed-scale variables might provide a surrogate description of habitat, performing almost as well as the true multiscale models (Table 5 Therefore, as supported by this and other studies (Ashrafzadeh et al., 2020;Hearn et al., 2018;Khosravi et al., 2019;Macdonald et al., 2018Macdonald et al., , 2019Mateo-Sánchez et al., 2013;Timm et al., 2016;Vergara et al., 2015;Wan et al., 2017) it is more effective to employ a multiple scale optimization as a description of species habitat, as it will identify the best multiple or single scales, as the case may be.

| Snow leopards ecological associations across study areas
Our best multivariate models (Tables 4 and 5), at their best correction (Tables S5 and S6, Figures 2-4), showed the emergence of landscape-specific scales and factors driving snow leopard distribution in our study areas. However, we observed important analogies related to snow leopard ecological requirements.
In both study areas, our results for CTI and SLP are consistent in predicting highest snow leopard occurrence probability in areas of the landscape on ridges and uplands away from large valley bottoms, but show landscape-specific differences in how topography limits occurrence based on the grain and heterogeneity of the topographic structure of the landscape.
A striking similarity was observed with regards of the general composition of the landscape. Differences in the patch mosaic across areas might cause ecological associations being described by different metrics, as a consequence of the ecosystem complexity in the areas, which is more pronounced in the Himalayas due to larger altitude gradients (Bai et al., 2018). However, our results were consistent in revealing snow leopard preference for landscapes with high aggregation of a few dominant land cover class types ( Figures 5   and 6), facilitating dispersal and ability to integrate large territorial home ranges (Johansson et al., 2018).
In both landscape, snow leopard habitat suitability was consistently associated with the extent of grassland patches, and the effect at which they locally influence occurrence is revealed by their different magnitudes (Figures 5 and 6). The steady selection toward optimal amounts of habitat types (Table 3) is confirmed by the identification of PLAND_Gr in both study area as a characteristic driver of snow leopard habitat suitability (Tables 4 and 5; Figures 5 and 6), which is consistent not only with respect to the identified scale of selection (Table 3) but most importantly for its magnitude (Figures 5   and 6). This suggests the key role of such landscape attribute in driving general occurrence for the species throughout its range.
The identification of PLAND_Gr and of GYR_AM_Gr as recurring landcover metrics in the top five models (Tables 4 and 5), across the two study areas, might be related to habitat choices in function of predatory behavior (Hayward, Hayward, Tambling, & Kerley, 2011;Lyngdoh et al., 2014). Grassland and sparse vegetation are important components of the landscape for wild ungulates, small mammals, and birds, as well as for livestock, on which snow leopards are known to occasionally prey (Bagchi & Mishra, 2006;Chen et al., 2016;Lyngdoh et al., 2014). Therefore, the selection of a landscape with a total optimum amount of sparse vegetation in a given radius (as highlighted by PLAND_Gr), regardless of its extensiveness or continuity (revealed by GYR_AM_Gr) might be indicative of habitat choices intended to maximize hunting opportunities by foraging in preys' feeding grounds, and to balance tradeoffs in energy expenditures to locate and chase them (Hayward et al., 2011;Hayward, Jędrzejewski, & Jêdrzejewska, 2012;Lyngdoh et al., 2014).
The similar values of river density in the two areas might depend on general hydrological attributes characterizing mountainous environments across snow leopard range, but suggest domains of habitat characteristics (Wiens, 1989), as density is constrained within a similar range of values (Figures 5 and 6). These selection patterns might again be related to foraging behavior, as snow leopards are known to ambush their preys along ravines and river bluffs (Riordan, pers. comm.;Jackson & Ahlborn, 1984).
Annual average temperature represented the strongest determinant of habitat across the two study areas (Tables S5 and S6). However, it is not possible from this study to separate the effects of climate per se with those of the correlated topographic features.
Lower values of mean annual temperature in QLSNP reflects the species' association with higher elevation mountainous areas in the landscape, while the slightly higher values in QMLNR are indicative of the moderately rugged plateaus, relatively distant from the higher Himalayas mountains, in which the snow leopard presence points occur (Figures 5 and 6).

| Effect of landscape-specific limiting factors
Our study confirms observations from McGarigal and Cushman (2002) on the utility of meta-replicated landscape-level analyses in studies of species distribution, to uncover local habitat associations and provide generalizations based on species ecology.
Our results show a pattern of snow leopard habitat selection at a finer scale when those environmental predictors vary at fine scales across the landscape or are widely distributed through the landscape. Environmental factors that are not highly variable, or that vary at broad scales with low local variation within landscapes, tended to be selected at coarser scales. Although snow leopards have a consistent response to landscape topography and composition, the extent to which habitat components vary, in relation to local attributes, lead primarily to a differential scale of effect of such predictors, and secondly to the inclusion of different limiting factors Shirk et al., 2014;Short Bull et al., 2011) as strongest descriptors of habitat in different areas (Tables 3-5, Tables   S3 and S4).
The effect of a predictor will be detected in a model only if it bears enough variability, such that it is able to differentially affect the modeled response (Reddy et al., 2019;Shirk et al., 2012Shirk et al., , 2014Short Bull et al., 2011;Vergara, Cushman, & Ruiz-González, 2017).
The effect of variables instead will not be identified in a model if they are too homogeneous or bear not enough variability. This does not imply they lack ecological importance, rather that they do not possess sufficient power to structure the response variable (Short Bull et al., 2011).
We observed a reversed pattern of selection for ELEV, dependent on its variability across the entire extents. Finer scales are thus associated with higher local elevation differences, especially notable considering the altitude of the high Himalayas compared to the plateaus with snow leopard occurrence, while a landscape with low local variation (as the Qilian Mountains) produced a coarse scale of response (Table 3).
In contrast to this, we highlight the role of the complexity of the mountain texture in determining scales of response for other derived metrics. The topographic homogeneity of uplands with snow leopards presence in QMLNR (although presenting landscape attributes locally favorable for their occurrence) causes less local landscape variation, and the major landscape topographic gradients are between extreme mountain peaks and plateaus, driving the selection of derived topographic descriptors at coarse scales (Table 3).
Although the best multivariate models in the two areas (Tables 4   and 5, Tables S3 and S4) agreed in the general preference for high elevation dry areas (Figures 5 and 6), it is interesting to observe how different aspects of topography emerge in different contexts as a consequence of what is locally limiting snow leopards occurrence.
CTI then is the only metric in QMLNR able to frame the large hydrological gradients between upland and lowland conditions, and SLP in QLSNP is associated with a fine-scale high degree of slope, consequence of the topographic texture of the area. These descriptors are the best factors in their respective study area possessing enough variability to achieve a higher discriminatory power.
Complexity of the patch mosaic in each area translates into different scales of selection for the most abundant landcover types.
Smaller scales are indicative of a less complex landscape with wider and larger patches, driving the selection of area, extensiveness and contrast metrics at finer scales, representing thus landscape-specific factors (Table 3). These landscape properties emerge in the multivariate context where we observe AI in QLSNP and CWED in QMLNR as best landscape-level descriptors for the two areas (Tables 4 and   5, Tables S3 and S4).
In situations in which a landscape is composed of few main classes, with patches having low edge and large extent (as in QLNSP), occurrence patterns for snow leopard will be better described by a metric of aggregation. When the landscape mosaic is more heterogeneous, showing an alternation of patches with higher contrast (as in QMLNR), habitat suitability will be associated with metrics revealing density of, and contrast among, different land types, indicative of the presence, and possibly the avoidance, of nonoptimal and suboptimal habitats (Tables 4 and 5; Figures 5 and 6). This allows also the inclusion of more classes as best habitat discriminants (Tables 4 and   5, Table S4, Figure 6).
Optimal scales for hydrological and anthropogenic features were selected as a function of their relative abundance, with smaller scales indicative of more homogeneous patterns and fine-scale variation (Table 3). Their performance in top multivariate models in the two study areas (Tables 4 and 5, Tables   S3 and S4)  Qinghai Lake and on the northern foot of Qilian mountains), and by landscape-level patterns of hydrological heterogeneity in QMLNR, which is consistent with the general topographic properties of the whole area, already described for CTI.
Summarizing, we found that natural history of the snow leopard might dictate a range of scales for several predictors. However generalizations on scales pertinent to the same species in replicated study areas should be made after careful evaluation, as each scale could be optimal within the landscape-specific context in which it has been observed, given variation in limiting factors between different landscapes Shirk et al., 2014;Short Bull et al., 2011;Wan et al., 2017). This does not imply, however, that it is impossible to generalize habitat associa- sible to predict a priori which variables will be limiting occurrence in a given landscape, and also, to some extent, at which scales they will be most limiting, based on the structure and composition of the meta-replicated landscapes (Tables 3-5, Tables S3 and S4).
This is critical to understanding the habitat niche of a species and when different dimensions (variables) of that niche become limiting to its pattern of occurrence.

| Performance of bias correction methods
Our bias correction framework confirmed the impact of sampling bias in distribution modeling (Figures 2 and 3), as the impacts of the radius of correction and correction type (  (Figures 2 and 3, Table 6). This is achievable by balancing simultaneously correction radii, improvement in overlap, and omission rate (Figures 2 and 3, Table 6, There are caveats in the use of Schoener's D (Schoener, 1968) and ΔD geo (Fourcade et al., 2014) alone as a proxy for successful correction. We also warrant against the use of MTSS omission rate alone, the reasons for which are expounded on in Appendix S4, together with additional considerations on our modeling approach.

| CON CLUS IONS
This study emphasized the key influences of scale dependency on the identification of optimal relationships between the occurrence of a focal species and environmental gradients, offering further evidence for the need to integrate scale selection into species distribution and habitat suitability modeling . If scales of effect are not optimized, this may lead to misinterpretation of ecological determinants, resulting in suboptimal management strategies based upon ecologically unrealistic distribution models (Bellamy et al., 2013;Mateo-Sánchez et al., 2013;Shirk et al., 2014;Timm et al., 2016;Vergara et al., 2015;Wan et al., 2017;Wasserman, Cushman, Wallin, et al., 2012)  We observed area-specific differences in the relative importance of human footprint and hydrological network, different limiting landscape-level attributes and a differential influence of topographic metrics in describing snow leopard habitat in the two areas.
Accounting for these differences, we have been able to generalize ecological associations based on the interpretation of response curves from the top models. Our results identified consistent finescale association with temperature, medium/broad-scale associationwith sparse vegetation on ridges and uplands, and broad-scale association with aggregated and low-contrast landscapes. The consistent values of relative abundance of sparse vegetation and thresholds of watercourse density, possibly related to predatory behavior, were the strongest limiting factors related to snow leopard ecology, implying ecological domains related to the magnitude of those attributes. The consistently strong relationship to temperature highlights the climatic sensitivity to snow leopard and strongly suggests additional work evaluating the effects of climate change on its habitat suitability and population connectivity (e.g., Wasserman, Cushman, Littell, Shirk, & Landguth, 2013;Wasserman, Cushman, Shirk, et al., 2012).
In our simulations, multi-scale models outperformed single-scale counterparts in most cases. Even when single-scale models perform equivalently, identifying the best single-scale requires scale optimization . Therefore, we suggest as a general approach employing a multi-scale modeling strategy where each covariate is allowed to vary independently in an analysis that optimizes their individual scales.
This study provides guidelines for selection of model optimization strategies across a range of different occurrence configurations reflecting bias intensities (Appendix S4). We presented a framework in which we coupled a restricted background with spatial filtering or Gaussian density kernels and created reference models directly related to an ensemble of the best raw suitability surfaces. We ensembled competing models in their raw and corrected versions to capture a whole range of potential suitability scenarios in a single output and evaluated their performances based on average metrics. This approach yielded a description of habitat which framed the effect of several different predictors, assessed without interactions with variables describing the same underlying landscape characteristic (i.e., topography). This ensembling strategy, when coupled with maximization of overlap with respect to a hypothetical reference model (Fourcade et al., 2014;Veloz, 2009), allowed us to select the best average correction that further minimized the MTSS omission rate, which is desirable for model accuracy (Liu et al., 2013;Boria et al., 2014, Radosavljevic & Anderson, 2014Vergara et al., 2015).
We found that Gaussian density kernels are the best optimization for heavily clustered occurrences, but perform well under each bias circumstance. Spatial rarefaction on average is effective when the presence points are more uniformly distributed in space (Appendix S4). While accounting for overfitting, we deem desirable selecting a threshold to minimize omission errors, and to create a reference model which is probabilistically related (but not as biased) to the real data, upon which a metric of niche overlap (and its improvement after correction) must be selected.
We providing help and support with data collection. We finally thank two anonymous reviewers, whose comments greatly improved the focus of the manuscript.

CO N FLI C T O F I NTE R E S T
The Authors declare no competing interests.

DATA AVA I L A B I L I T Y S TAT E M E N T
Datasets of occurrences in both study areas cannot be made available online before completion of National Forestry and Grassland Administration of China (NFGA) project. Specific requirements can be addressed directly to the corresponding author by interested third parties.