Mapping the ghost: Estimating probabilistic snow leopard distribution across Mongolia

Snow leopards are distributed across the mountains of 12 countries spread across 1.8 million km2 in Central and South Asia. Previous efforts to map snow leopard distributions have relied on expert opinions and modelling of presence‐only data. Expert opinion is subjective and its reliability is difficult to assess, while analyses of presence‐only data have tended to ignore the imperfect detectability of this elusive species. The study was conducted to prepare the first ever probabilistic distribution map of snow leopards across Mongolia addressing the challenge of imperfect detection.


| INTRODUC TI ON
Large carnivores, which often require large areas and intact prey populations, are globally experiencing dramatic declines in population and geographical range (Ripple et al., 2014;Wolf & Ripple, 2016. Research suggests that habitat specific species with large home ranges are particularly vulnerable to range contractions and possibly extinctions (Ceballos et al., 2005;Johansson et al., 2016).
Our understanding of the species ecology and status will guide longterm efforts to conserve these species (Ripple et al., 2014). In particular, it is useful to assess distribution at local and regional scales, and track changes in distribution over time in order to assess the impact of overarching and wide ranging threats such as climate change, poaching and poorly planned infrastructure (Burton et al., 2012;Ghoshal et al., 2017;Karanth et al., 2011;Sunarto et al., 2012;Taubmann et al., 2015). Studies of species distribution also inform the design of protected area networks and the planning of assisted re-colonization where necessary and possible (Chapron et al., 2014;Müller et al., 2014;Thorn et al., 2011). Finally, they play a crucial role in estimating and monitoring regional, country-specific and global populations of wildlife (Taubmann et al., 2015, Ghoshal et al., 2017. At appropriate scales, range contractions and expansions can provide reliable surrogates for the status of a species (MacKenzie & Nichols, 2004a) where estimating abundance is often expensive and resource intensive (Alexander et al., 2015).
Detecting wildlife can seldom be achieved perfectly (MacKenzie et al., 2002). Few situations exist when a wild animal can be detected with 100% certainty, especially in its natural habitat. In the case of cryptic species, where indirect signs are used to detect a species, detection is influenced by two factors. These include the species of interest leaving its signs in a defined grid-cell and that of the observer finding it (MacKenzie et al., 2002). With species that roam across large ranges, detection probabilities often vary (Karanth et al., 2011). This poses a particular challenge in studies that attempt to estimate species distribution with varying levels of effort and skill sets.
Assessing the status and distribution of carnivores such as the snow leopard Panthera uncia is particularly challenging, given its cryptic nature and that it is found over large areas of mountainous terrain Ghoshal et al., 2017;Taubmann et al., 2015). The species is distributed across the mountains of 12 countries in Central and South Asia, covering an estimated 1.8 million km 2 (McCarthy & Chapron, 2003). Snow leopards occur at low densities and have large home ranges of the order of a few hundred square kilometres (Johansson et al., 2016;Sharma et al., 2014). It is estimated that <10% of the Protected Areas across the snow leopard range are large enough to support a viable snow leopard population (Johansson et al., 2016). Conserving this charismatic predator shall ultimately require in-depth knowledge of the species' whereabouts and population trends, of its prey, and of its interactions with surrounding livestock and humans (McCarthy & Chapron, 2003).
Apart from a small number of surveys that have used occupancy based estimators (Alexander et al., 2015;Ghoshal et al., 2017;Taubmann et al., 2015), previous efforts that mapped snow leopard distributions have relied on either expert opinion from scientists and conservationists with experience of working in the specific areas (Mccarthy & Chapron, 2003;Riordan et al. 2015;Aryal et al., 2016;Li et al. 2016;Kalashnikova et al., 2019), or on the modelling of presence-only data (e.g. with MAXENT; Li et al., 2014Li et al., , 2016Aryal et al., 2016;Bai et al. 2018;Holt et al. 2018;Kalashnikova et al., 2019;Shi et al. 2019;Hameed et al. 2020). Both approaches have specific limitations. Expert opinion is subjective and its reliability is difficult to assess without collecting the very data it is intended to substitute for. Presence-only analyses assume uniform effort in collecting data and assume perfect or constant detection probability across space and time, all of which are unlikely to hold in most snow leopard surveys. Although methods such as spatial filtering and background manipulation with bias files are recommended methods to correct for non-uniform effort (Kramer-Schadt et al., 2013), these are still subjective measures and only arbitrarily address the bias.
Here, we address these limitations by reporting the first largescale assessment of snow leopard distribution incorporating both presence and absence data, and accounting for imperfect detection and variable survey effort using occupancy modelling approaches (MacKenzie et al., 2002(MacKenzie et al., , 2003(MacKenzie et al., , 2018. Specifically, we sought to: (a) rigorously assess snow leopard distribution across Mongolia, accounting for imperfect detection and variable survey effort, (b) examine how snow leopard distribution varies across the landscape in relation to a range of spatial covariates and (c) compare the results with previously known distribution maps and maps generated with the help of presence-only techniques. Given the species' ecology, the scale of the study and the mixture of survey methods employed, we hypothesized that a wide range of covariates would affect the probability of snow leopards using a particular grid-cell, and that of detecting it. | 2443 BAYANDONOI et Al.

| Study area
Mongolia is home to the second largest population of the snow leopards in the world (Munkhtsog et al., 2016). It is estimated that there are approximately 1,000 snow leopards across the Mongolian Altai, Gobi Altai, Khangai mountain ranges, and isolated mountains of Trans-Altai Gobi and Khuvsgul Mountains containing around 103,000 km 2 area (McCarthy, 2000;Munkhtsog et al., 2016).
The snow leopard is protected and listed as a rare species in the Mongolian Wildlife Law and Mongolian Red Book (Shiirevdamba et al., 2016). Snow leopards in Mongolia typically occupy elevations of 900-3500 m above mean sea level. The snow leopard distribution in Mongolia represents the eastern-most population of wild snow leopards . Mongolia's climatic conditions and low human population density of 2 persons per km 2 (Mongolian Statistical Information Service 2020, www.1212.mn/stat.aspx?LIST_ ID=976_L03) can potentially offer a refugium to the species, should it continue to decline in most parts of its range due to climate change and human activities. Mongolia plays an important role in regional efforts to conserve snow leopards as it borders China in the south, where approximately 60% of the world' snow leopard population occurs (McCarthy & Chapron, 2003). Little is known about the snow leopard populations close to the border in Inner Mongolia, but predictive models indicate availability of possible suitable habitat within a few hundred kilometres from the Mongolia-China border (Li et al., 2020). Mongolia also borders Russia to the north which has a relatively small snow leopard population (McCarthy & Chapron, 2003).  The PAWS methodology (Sharma et al., 2019) follows a two-step process where occupancy-based surveys (MacKenzie et al., 2018) are conducted across large landscapes (e.g. Ghoshal et al., 2017;Taubmann et al., 2015) to develop reliable species distributions and a stratification surface with variable probability of use by snow leopards. The second step involves intensively surveying specific sampling frames for abundance estimation using spatial capture recapture methods (Borchers & Efford, 2008;Efford et al., 2009). The workshop helped build an understanding of basic sampling theory, methods of occupancy estimation and spatial capture recapture modelling. Workshop participants included practitioners who were to lead teams to conduct occupancy surveys across Mongolia.

| Field data collection
Following the workshop, we designed the occupancy survey and overlaid a grid with 20 x 20 km cells (400 km 2 , roughly the same area as a typical snow leopard home range) across the potential snow leopard range in Mongolia. Occupancy or site use probabilities are sensitive to the choice of grid-cell size, and tend to increase with bigger grid-cells, so that any estimates should be interpreted with the size of the grid-cells in mind (e.g. MacKenzie et al., 2018). Snow leopards are known to be selective in their habitat use. They live almost exclusively in mountainous habitats, even though they are known to use steppe, rivers, roads and sometimes visit settlements as a means to move between habitats. We created a terrain ruggedness index (Riley et al., 1999) map for the entire country using the digital elevation model downloaded from Shuttle Radar Topography Mission, SRTM (http://srtm.csi.cgiar.org/) to demarcate areas of mountainous habitat (defined by slope > 5%) and non-habitat (defined by slope < 5%) at a resolution of 90 m. We excluded grid-cells that had <5% of the total area as mountainous habitat without forests, or were separated from known snow leopard distribution range by 400 km, assuming these were only used for transit purposes. A total of 1,200 grid-cells (480,000 km 2 ) fit the criteria and these units were considered to hold potential snow leopard habitat. of segment vertices, topography and terrain characteristics for each segment were collected on data forms (Appendix S1). We recorded the location of each scrape, pugmark, scat or direct observation detected during the survey, other micro-habitat characteristics and the expected age of the sign. We only recorded and retained signs that were clearly visible and unambiguously identified as belonging to snow leopards. We only used scrapes and pugmarks as evidence of presence in the analysis as these are considered to be least likely to be misidentified or confused with signs of other species. In general, it is not possible to age snow leopard signs accurately, and these could be anywhere between a few days to several weeks old. This likely violates the assumption of closure of sites during surveys, and so in later modelling we interpret our occupancy estimates as probability of sites being used instead of sites being occupied by snow leopards at the time of the survey (MacKenzie et al., 2004b).

| Occupancy analysis
In May 2019, we organized a workshop by bringing together all survey team leaders, including in-country as well as international experts. The participants identified key covariates that could improve inference on detecting snow leopard signs and estimating probability of site use.
Probability of detecting snow leopard signs can be affected by a range of factors. We considered the possible effects of seven detection covariates: survey mode (on foot or from a vehicle), transect segment length, field team category, number of persons conducting the transect, average terrain ruggedness (TRI) (Riley et al., 1999) of the particular transect segment, average slope of the particular segment and mean altitude of the transect segment above mean sea level as potential covariates. Presence or use of a particular area by a species is often correlated with one or more variables. Snow leopards are known to have strong preferences for mountainous habitats. They typically inhabit rugged terrain with low tree cover and typically occupy an altitudinal range that varies across latitudes.
Although as a predator, one is tempted to use prey availability as a covariate, we refrained from using it because of the uncertainty in estimating these with any confidence, while accounting for imperfect detection. Instead, we use habitat characteristics such as the normalized differential vegetation index (NDVI) and ruggedness as proxies of habitat used by snow leopard's primary prey, ibex and argali (see, e.g., Odonjavkhlan et al. 2021, Hamel et al. 2009). We considered the possible effects of 5 site use covariates, including average elevation, average ruggedness, average ruggedness of mountainous habitat, proportion of the grid-cell covered by forest and average differential vegetation index (NDVI) on snow leopard site use. We used SRTM data for elevation recorded at 90 m resolution for the entire Mongolia and averaged it for each grid-cell. We also included a quadratic altitude effect to allow for intermediate elevations to be preferred. We used the SRTM data to calculate terrain ruggedness index (Riley et al., 1999) and averaged it for each grid-cell, and the mountainous habitat (defined by slope > 5%) within each grid-cell. We also tested the effect of vegetation cover by using Normalized Differential Vegetation Index estimated from Landsat imagery (https://lpdaac.usgs.gov/produ cts/mod13 q1v00 6/) and by estimating the proportion of the grid-cell covered by forest (source: Each model was ranked by Akaike's information criterion (AIC) to choose the best model balancing between likelihood (fit) and overparameterization (number of parameters). We also tested for the possibility of correlated detections between segments using the extended occupancy model of Hines et al. (2010).
To assess the effective coverage and the sufficiency of our sampling effort, we plotted occupancy as a function of different covariates and marked the covariate values represented by grid-cells ( Figure 1). We used AIC weights of models to define variable importance. We used the top model to predict the probability of sites being used in unsampled areas. For ease of interpretation, we divided the probability of site use, ranging between 0 and 1, into four equal-width categories, denoting low, low-medium, medium-high and high probability of site use. We estimated c-hat on the global model fit to a trimmed dataset where only the first 3 replicate surveys were used to assess how the model fit the data. The dataset was reduced as the number of replicate surveys varied between 1 and 40, and the assessment procedure can produce unreliable results in such circumstances (MacKenzie et al., 2018).

| Comparing distribution maps using Occupancy and other methods
A snow leopard distribution map with confirmed, probable and possible distribution was developed from expert opinion in the 2008 International Conference on Range-wide conservation planning for snow leopards. Given the ambiguity in determining the three classes, and how they could have represented an absence of researcher knowledge rather than an absence of the species, we transformed it into a binary map, considering all three categories to represent areas that were considered potentially used by snow leopards. Another popular approach for generating snow leopard distribution maps, despite known limitations (Royle et al., 2012;Yackulic et al., 2013), is to use presence-only data to model animal presence as a function of a suite of covariates using MAXENT, interpreting the output as a distribution map (e.g. Aryal et al., 2016;Li et al., 2014). We therefore also created a distribution map using MAXENT based on modelling animal presence We transformed the maps to the same resolution (20 x 20 km) as the occupancy maps for the sake of comparison.
Outputs from expert opinion, MAXENT and occupancy models are not directly comparable because they either measure different quantities or use different measurement scales. There are many reasons why maps produced by these methods might differ, chiefly that the methods use different inputs and produce different outputs.
Nevertheless, these approaches are all used in practice to produce maps that are interpreted as indicating of snow leopard distributions.
Here, we compare maps not to demonstrate the existence of a difference, but to argue that (a) in this case, the differences are large enough that different policies and priorities might arise, (b) that it is therefore important that the method used to generate the map is theoretically justified, and (c) that here occupancy models hold a clear advantage.
To illustrate qualitative differences between the outputs of each method we present these outputs in three different ways. Firstly, we show the outputs of each method without any transformation.
Then, we show the percentiles of both cardinal outputs (occupancy, MAXENT), which illustrates areas of high and low values within each map. Percentiles are also invariant to the choice of MAXENT output format, as these are all monotonically related. Finally, we use a generalized additive model to estimate a function that provides a flexible one-to-one mapping between MAXENT and occupancy outputs. Our intention here is to transform MAXENT outputs into quantities that match the output of the occupancy model as closely as possible, in the sense of maximizing the percentage of variance in the occupancy probabilities that is explained by the transformed MAXENT outputs.
This makes the two maps as similar as mathematically possible. We used a beta distribution for the error terms and a logit link function, with a smoothing term using cubic regression splines with six knots (k = 6) capturing nonlinearities in the relationship between the predictor and response variable. The exact number of knots is not critical F I G U R E 1 Predicted snow leopard occupancy (probability of site use, Psi) from the model with the minimum AIC value, shown as a function of altitude, ruggedness, NDVI and forest cover. Tickmarks on x-axis show the covariate values that were sampled (n = 1,017). Detection probability (p) from the top model is predicted as a function of interaction between ruggedness and if the surveys were conducted while walking or in a vehicle. Shaded grey bands denote 95% confidence intervals around the predicted occupancy/detection probability but was chosen conservatively to prioritize smoothness and monotonicity. Spatial autocorrelation was not included in the model because our aim was to produce a mapping between MAXENT and occupancy outputs that treat grid-cells as independent.

| RE SULTS
In total, our teams covered 19,924 km in transects (13,130 km on vehicles, 6,794 km on foot). We recorded 1,421 snow leopard signs (scrapes, pugmarks or spray markings, excluding putative scats) in a total of 235 grid-cells out of 1,017, thus providing a naive occupancy estimate of 0.22 (i.e. 235/1017). We tested the effect of correlated detections using Hines et al. (2010), but these did not perform as well as models without correlation (delta AIC = 75.25, see Appendix S2).
We therefore proceeded with the remainder of the analysis using However, this could be attributed to the removal of much of the original data to estimate c-hat reliably.
The pairwise correlation was <0.5 for all covariates used in the modelling. The AIC table (Table 1) indicated that the top model had the greatest level of support among all models that were run, with 56% of the total AIC model weight. The cumulative AIC weight of the top 3 models was 0.99. We determined the relative importance of covariates in explaining the variation in data for probability of site use and detection of snow leopards by using summed AIC weights of the top models (Table 2). Our top models suggest that mean ruggedness of mountainous habitat, normalized differential vegetation index, and elevation were the most important covariates for probability of site use by snow leopards, whereas the ruggedness of the transect segment, the mode of survey and the interaction of these two effects influenced the probability of detecting snow leopards using indirect signs. The interaction between forest and NDVI likely did not explain much of the additional variation in site use, indicating that it could have been a pretender variable that appears in a highly ranked model because the model structure is similar to other highly ranked models (Anderson, 2008). Probability of grid-cells being used by snow leopards increased with the ruggedness of available habitat; reduced with an increase in vegetation index and forest cover; and responded according to a quadratic relationship with altitude in which intermediate altitudes were preferred (Figure 1). Detection probability was higher for segments walked on foot; and those that were more rugged, even though the relationship was interactive where p increased more rapidly with ruggedness in segments surveyed on vehicles ( Table 2).
Given that our sampling represented the entire range of available covariate values (Figure 1), we predicted the probability of site use by snow leopards for the entire country (Figure 2). Our assessments indicate that only 78,000 km 2 (5%) of Mongolia's total land TA B L E 1 Models with cumulative weight of 1.0, ranked on the basis of AIC  mass has a high probability (Psi>0.75) of being used by snow leopards, followed by 135,200 km 2 (8%) as a moderately high (0.5<Psi ≤0.75); 230,400 km 2 (14%) as moderately low (0.25<Psi≤0.5); and the remaining 73% to be least likely (Psi≤0.25) to be used by snow leopards.
Comparing occupancy maps with both expert maps and those constructed from presence-only data showed broadly similar patterns but also some notable differences (Figure 3). The expert map did not identify an area in the north where site use was likely (Figure 3a-c, denoted i), and showed greater fragmentation than the occupancy map, in particular the identification of two relatively isolated islands of snow leopard potential presence in the eastern and north-western reaches of the Altai range (Figure 3a,c, ii-iii). The presence-only maps also did not identify the same area in the north as in (i) above (Figure 3d,f, denoted iv), also indicated an island of isolated activity in the northern Altai that was not reflected in the occupancy results (Figure 3d

| Snow leopard distribution across Mongolia
Our study provides the first probabilistic distribution map of snow leopards from across Mongolia while correcting for possible "falseabsences." Under the assumption that covariate relationships established for our survey region hold true across Mongolia, the results indicate that <12% of entire Mongolia (42% of the total potential snow leopard habitat in the country represented by the 1,200 grid-cells) is used by snow leopards with a probability of >0.5. Snow leopards have historically been represented by expert opinion as occurring across 90,000-130,000 km 2 (Mallon, 1984;McCarthy, 2000;Schaller et al., 1994) that were represented by 36% of the total grid-cells used for the occupancy survey. Furthermore, there were stark differences between models using presence-only data and occupancy data for snow leopards. These suggest that although previous estimates may have missed snow leopards from certain areas, they also included areas with relatively low probability of being used by snow leopards.
Methods that do not take into account imperfect detection not only underestimate the true probabilistic habitat use (MacKenzie et al., 2018), but can also bias the estimates and assume the species is not present in settings where it is yet to be detected. We detected the possible presence of snow leopards in areas where they had not Methods such as MAXENT provide useful tools to predict species distributions across large landscapes using presence-only data (Elith et al., 2011). However, recent studies highlight several shortcomings of such simplistic modelling approaches that ignore nondetection data; assume uniform effort; are scale dependent; and do not provide a framework to compare models for fit and parsimony (Renner & Warton, 2013;Yackulic et al., 2013). Our study's occupancy distribution maps showed some important areas of disagreement with maps constructed from expert opinion or presence-only data. In addition, the occupancy based distribution maps provide a far more nuanced understanding about a species as compared to binary maps. This highlights that, although presence-only maps are valuable for preliminary planning, occupancy maps are to be preferred for conservation planning and monitoring purposes.

| Distribution covariates
This study is the first nation-wide distribution assessment of snow leopards that accounts for detection probability. It therefore provides the most reliable and up to date snow leopard F I G U R E 2 (a) Predicted distribution of snow leopard across Mongolia estimated from the occupancy model with the minimum AIC developed from the 1,017 sampled units. (b-e) represent the distribution and true variation of covariates, viz. NDVI, Altitude, Forest cover and ruggedness of available habitat, respectively, that were used in estimating occupancy (probability of site use) distribution map in Mongolia. The study also provided us with the first opportunity to empirically identify variables that determine snow leopard presence across the country. Snow leopards are widely considered to be habitat specialist, selecting mountainous areas (Fox et al., 1991;Johansson et al., 2016). Mongolia is also known to have reported snow leopards at altitudes as low as 800 m above mean sea level (McCarthy & Chapron, 2003).
Our results highlighted the importance of key habitat features in predicting snow leopard site use at the country scale. The effect of overall ruggedness in the grid-cell did not greatly affect the probability of use by snow leopards (Table 2). Instead, ruggedness interacted strongly with habitat availability, indicating that the probability of a grid-cell being used by snow leopards was higher for those with available rugged habitat ( Table 2). The probability of site use by snow leopards first increased and then declined as a function of altitude, thus supporting the hypothesis that snow leopards do not use mountains uniformly, but instead thrive along an altitudinal band. Vegetation index (NDVI) and area under forest cover were negatively related to probability of grid-cells being used by snow leopards, thus clearly indicating an aversion of snow leopards for forested habitats. This finding supports previous hypotheses that snow leopards are also considered to avoid forested and non-mountainous habitats (Johansson et al., 2016;Lovari et al., 2013).
It is evident that how surveys are planned and conducted play an important part in our ability to detect snow leopards. In our study, surveys that took place on foot and in more rugged areas were much more likely to detect snow leopards. Our best models predicted the average detection probability of snow leopard signs within 1km segments to be only 0.1 (0.09-0.11), even though all top models predicted detection probability to be higher in rugged areas surveyed on foot. We suspect that although surveys in vehicles allowed greater distances to be covered in short duration, surveys on foot allowed field teams to look for snow leopard signs more intensively. Also, snow leopards prefer moving about rugged terrain, thus increasing the chances that these are where they will leave signs to be detected by the survey teams. Additionally, snow leopard signs are less likely to be disturbed by human and livestock movement in rugged terrain, thus making it more likely for detection probability of snow leopard sign to be greater in more rugged terrain.

F I G U R E 3
Snow leopard distribution maps obtained over the surveyed region using occupancy data, presence-only data (MaxEnt) and subject assessment by experts (Expert). Maps show (a) predicted occupancy probabilities from the best-fitting model (Model 1, see Table 2); (b) MAXENT outputs showing cumulative probabilities; (c) expert judgements on areas of likely or possible snow leopard presence (purple) and absence (light yellow); (d) occupancy probability percentiles; (e) MAXENT output percentiles; (f) MAXENT outputs transformed using a fitted generalized additive model (see plot inset) that maximizes the degree to which the transformed outputs match the predicted occupancy probabilities in (a) (see text for details). Inset letters i, ii, etc., refer to specific regions of disagreement (see text for details)

| CON CLUS ION
This study is the first national-level occupancy survey of snow leopards in Mongolia. The sign surveys conducted across the entire potential snow leopard range in Mongolia are also a first of its kind at this scale. The study highlights robust methodological opportunities that can be taken to scale and support national-level conservation planning. Difference between previous predictions based on expert opinion or presence-only maps is significant and highlights how such estimates should be used with care for conservation planning. In the case of Mongolia, previous estimates were potentially biased. We hope the granulated methods such as camera trapping and genetic surveys will help validate the findings further.
The work presented here also provides a case study as part of the Population Assessment of the World's Snow leopard (PAWS) initiative, a unique effort towards empirically estimating the abundance and distribution of the snow leopards from its entire range.
The PAWS guidelines (Sharma et al., 2019) provide a two-step process to achieve this, (a) occupancy based distribution surveys and (b) spatial capture recapture surveys. This study is an example of the first step. We hope that this study will contribute to planning and designing of similar studies underway across the snow leopard range.

ACK N OWLED G EM ENTS
This work is a part of the nation-wide snow leopard population as-

CO N FLI C T O F I NTE R E S T
We have no conflict of interest to disclose.

PE E R R E V I E W
The peer review history for this article is available at https://publo ns.com/publo n/10.1111/ddi.13412.

DATA AVA I L A B I L I T Y S TAT E M E N T
The full dataset and code used to run our analyses are openly avail-