Integrating animal tracking datasets at a continental scale for mapping Eurasian lynx habitat

The increasing availability of animal tracking datasets collected across many sites provides new opportunities to move beyond local assessments to enable detailed and consistent habitat mapping at biogeographical scales. However, integrating wildlife datasets across large areas and study sites is challenging, as species' varying responses to different environmental contexts must be reconciled. Here, we compare approaches for large‐area habitat mapping and assess available habitat for a recolonizing large carnivore, the Eurasian lynx (Lynx lynx).


| INTRODUC TI ON
Habitat suitability maps are important for informing ecological and biogeographical research, wildlife management and conservation planning (Guisan et al., 2013).Today, an increasing wealth of wildlife and environmental data provide new opportunities for assessing habitat suitability at higher detail and across larger extents than ever before (He et al., 2015;Oeser et al., 2020).Animal tracking technology provides a growing and powerful data source in this context (Kays et al., 2015).As tracking datasets collected across an increasing number of individuals and study sites are integrated and harmonized, they become robust sources of information for characterizing habitat suitability across large areas (Fraser et al., 2018).Yet, combining heterogeneous wildlife datasets sampled across study sites and large environmental gradients is far from trivial.As wildlife datasets are often geographically restricted and unrepresentative of species' current or potential ranges, large-area habitat assessments typically require extensive model transfers (i.e.predicting models to geographical areas or environments not covered by training data), posing a major challenge for habitat models (Sequeira et al., 2018;Yates et al., 2018).Moreover, differences in population density, species interactions or local adaptations to varying environmental conditions might cause habitat selection to differ across study sites (Aarts et al., 2013;Avgar et al., 2020), meaning that regionally varying responses should be considered when combining wildlife datasets for large-area habitat mapping.
Regional variation in habitat selection (i.e. the variation of a species' response to an environmental factor across space) forces a trade-off between specificity and generality when developing habitat models.Local, site-specific models typically capture local responses well, but often show limited transferability to other regions or environmental contexts.Conversely, global models that pool datasets from several sites provide more general characterizations of habitat selection but may fail to capture relevant local variation (Bamford et al., 2009;Paton & Matthiopoulos, 2016).Which strategy is optimal might thus depend on the modelling goal.Both global and local model training strategies have been used for assessing wildlife habitat across large areas (DeCesare et al., 2012;Muhly et al., 2019;Scharf & Fernández, 2018), but have rarely been compared (Olson et al., 2021;Reisinger et al., 2021).How to best integrate wildlife datasets covering several study sites and large environmental gradients for large-area habitat mapping therefore remains a largely unresolved question.
Regionally varying habitat selection is commonly ignored in habitat assessments (Paton & Matthiopoulos, 2016), yet some potential solutions have been proposed.For example, in global models that pool data from different sites, regionally varying responses can be included via interaction effects, treating variation in habitat selection either as a function of space (Hothorn et al., 2011) or of regional-scale environmental conditions (i.e.habitat availability; Matthiopoulos et al., 2011;Paton & Matthiopoulos, 2016).Such interaction models can either be fit by specifying interactions in generalized linear or additive models (Aarts et al., 2013;Matthiopoulos et al., 2011), or via more flexible machine-learning algorithms that learn interactions from data (Aldossari et al., 2022).Similarly, different approaches for combining local, site-specific models may be used to describe regionally varying habitat selection, such as model weighting based on geographical distance or environmental dissimilarity between training and prediction sites (DeCesare et al., 2012).
While some of these modelling strategies for large-area habitat assessments have been tested or studied in isolation, systematic assessments of modelling approaches for large-area habitat mapping, across training strategies, approaches for incorporating varying habitat selection, and modelling algorithms are lacking (Reisinger et al., 2021).
performance was the highest using flexible machine-learning algorithms and when incorporating variation in habitat selection as a function of environmental variation.
Our best-performing model used a weighted combination of local, site-specific habitat models.Our habitat maps identified large areas of suitable, but currently unoccupied lynx habitat, with many of the most suitable unoccupied areas located in regions that could foster connectivity between currently isolated populations.

Main Conclusions:
We demonstrate that global and local modelling strategies can achieve robust habitat models at the continental scale and that considering regional variation in habitat selection improves broad-scale habitat mapping.More generally, we highlight the promise of large wildlife tracking databases for large-area habitat mapping.Our maps provide the first high-resolution, yet continental assessment of lynx habitat across Europe, providing a consistent basis for conservation planning for restoring the species within its former range.

K E Y W O R D S
animal tracking, Eurasian lynx, habitat suitability, large carnivore, large-area mapping, Lynx lynx One group of species that could particularly benefit from improved large-area habitat predictions are large carnivores.Large carnivores have long been focal species for conservation, due to their charisma, their declining populations in many parts of the world (Ceballos et al., 2005), their role as umbrella species (Sergio et al., 2006) and their key roles in ecosystems (Ripple et al., 2014).
At the same time, large carnivore conservation is often particularly challenging.Large carnivores often come into conflict with humans due to their large home ranges and food requirements (Treves & Karanth, 2003), and commonly remain only in small and fragmented populations due to historical persecution and/or habitat loss (Cardillo et al., 2005).Together, this translates into a need for coordinated metapopulation conservation and management, often across national borders and jurisdictions (Bonn Lynx Expert Group, 2021;Trouwborst, 2015).Such efforts, in turn, require habitat information that is consistent across large areas but still provides sufficient detail to inform decision-making at local scales (Guisan et al., 2013).
The Eurasian lynx (Lynx lynx; hereafter: lynx) provides an interesting case for comparing large-area habitat mapping approaches.After being extirpated from much of their former distribution in Europe, lynx have recently expanded their range through reintroductions as well as recolonizations of nearby habitat patches (Linnell et al., 2009;Molinari-Jobin et al., 2010).This resulted in highly fragmented populations across Europe, with limited success of natural recolonization through dispersal, making them susceptible to loss of genetic diversity and jeopardizing their long-term survival (Mueller et al., 2022;Schmidt et al., 2011).Better data on lynx habitat suitability across Europe are therefore key to plan and manage towards viable lynx metapopulations (Bonn Lynx Expert Group, 2021;Hemmingmoore et al., 2020), particularly in terms of identifying unoccupied but suitable habitat.Yet, while some local assessments for lynx habitat in Europe have been made (Cristescu et al., 2019;Hemmingmoore et al., 2020), consistent, up-to-date and detailed habitat information at broader extents is lacking.
We here compare approaches for large-area habitat mapping using a large collection of tracking data from lynx in Europe.Our dataset encompasses 450 individual lynx tracked at 14 sites, representing 8 of 11 extant European lynx populations (Kaczensky et al., 2021).More specifically, we compared global and local model building strategies, three modelling algorithms including mixed effects models and two machine-learning algorithms, and different approaches for incorporating regional variation in habitat selection, We used GPS and VHF tracking datasets collected at 14 sites across Europe (Figure 1).Our dataset was collected through the research network EUROLYNX, which provides a collaborative, bottom-up platform for bringing together datasets and expertise of lynx researchers across Europe (Heurich et al., 2021).Prior to all analyses, tracking datasets were harmonized through a standardized procedure of quality checks (Urbano et al., 2021); see Appendix S1 in Supporting Information for an overview of all datasets).We filtered the tracking datasets by manually removing dispersing individuals, which often differ strongly in their habitat selection compared to resident animals (Hemmingmoore et al., 2020).To reduce spatial autocorrelation, we down-sampled our tracking data to a maximum sampling frequency of 6 h.To ensure robust estimates of individual-level home ranges for sampling habitat availability, we removed individuals with fewer than 50 available locations.While VHF data (225/442 remaining individuals) offered lower positional accuracy compared to GPS data (location errors up to ca. 500 m; Nagl et al., 2022;White et al., 2015), we included them to improve the representativity of our datasets for characterizing site-and continental-scale habitat selection by lynx.

| Habitat suitability modelling
We collected 13 predictor variables for modelling lynx habitat suitability, which we selected based on previous studies on lynx habitat selection (Basille et al., 2009;Filla et al., 2017;Hemmingmoore et al., 2020).Variables included information on land-cover, topography, human pressure, as well as satellitebased metrics characterizing fine-scale habitat variability (Oeser et al., 2020); see Appendix S2 for a full overview of variables and details on their selection, data sources and processing).
We compared three modelling algorithms: Maxent (Phillips et al., 2006), random forests (Breiman, 2001) and generalized additive mixed effects models (GAMMs).Maxent and random forests are flexible machine-learning algorithms that are considered among the best-performing presence-only species distribution modelling algorithms (Valavi, Guillera-Arroita, et al., 2021).Moreover, these algorithms are well suited for detecting and describing interactions among predictor variables, which could be useful for characterizing regional variation in habitat selection (Aldossari et al., 2022).
Following recommendations for modelling presence-only data with random forests, we used down-sampled random forests, in which sample sizes of presence and background records are balanced in each classification tree (Valavi, Elith, et al., 2021).As our third algorithm, we used nonlinear mixed effects models that allow to account for the sampling structure of our tracking dataset (repeated sampling at the level of study sites and individuals (Muff et al., 2020).Specifically, we fitted GAMMs, including random intercepts and (linear) random slopes for either lynx individuals (site-specific local models) or study sites (global models pooling tracking data across sites).
We analysed habitat selection of lynx at two hierarchical levels: selection of home ranges within the landscape (hereafter landscape level, also referred to as second-order selection (Johnson, 1980;Mayor et al., 2009); and within-home range selection (hereafter home range level, also referred to as third-order selection).We used a nested  (Breitenmoser et al., 2015;Kaczensky et al., 2021Kaczensky et al., ). et al., 2012)).We derived home ranges for each lynx individual by calculating 95% minimum convex polygons (MCPs).We chose 95% MCPs since our goal was to define approximate home ranges at the landscape level while creating a comprehensive sample of available habitats at the home range level (Fattebert et al., 2018;Holbrook et al., 2017).
To match the scale of the predictor variables (i.e. the neighbourhood of values considered) with the scale of the habitat selection process (McGarigal et al., 2016;Remm et al., 2017), for landscape-level models we used predictor variables derived through moving window averages at the scale of lynx home ranges (22 km diameter = 380 km 2 area ≈ average of site-wise median home range = 392 km 2 ).For home range-level models, we used variables at a fine scale of 100 m, the target resolution of our habitat suitability maps.
At the landscape level, we characterized habitat use by sampling as many random points within the home ranges of each individual lynx as available tracking locations.As available locations, we sampled random points in a buffer area of 80 km around the home ranges, representing likely dispersal distances of lynx (Samelius et al., 2012;Zimmermann et al., 2005).At the home range level,

| Comparison of modelling approaches
We compared a total of six modelling approaches: two strategies for model training (global vs. local) and, within each of them, two approaches for incorporating regionally varying habitat selection plus one stationary approach (Figure 2).First, we established baseline models assuming no regional variation in habitat selection by using a simple global model (global stationary) and a simple average of local model predictions (local stationary).We contrasted these with approaches explicitly allowing for regionally varying responses.
In global models, we let responses vary as a function of space (global spatial) or environmental conditions (global environment).For combining local model predictions, we used weights based on the F I G U R E 2 Schematic overview of the different modelling approaches tested.In the global modelling strategy, a single model is fitted using all data.In the local modelling strategy, site-specific models are built and then combined into an ensemble prediction.In each strategy, we tested two different ways to incorporate regional variation and contrasted these with a stationary model (for more explanation see Table 1).environment).An overview of all modelling approaches is given in Figure 2 and Table 1.
For the global spatial approach, we implemented spatially varying responses for Maxent and random forest models by including spatial X and Y coordinates as model predictors, thereby allowing the algorithms to fit interactions with spatial coordinates (Osborne et al., 2007).In GAMMs, we added spatially varying smooth effects for each predictor variable to our models (Hothorn et al., 2011).For the global environment approach, we added predictor variables derived at broader spatial scales as predictors, explicitly specifying (linear) interaction effects between variables of different scales in GAMMs following Matthiopoulos et al. (2011).Allowing for interactions between fine-and broad-scale variables allows capturing functional responses in habitat selection (i.e.variation of habitat selection with habitat availability; Paton & Matthiopoulos, 2016).In addition to the regular scales of predictor variables (22 km and 100 m at the landscape-and home range level respectively), we added variables at scales reflecting the dispersal buffer distance (80 km scale) and home range sizes (22 km scale) at the landscape and home range level, respectively, applying circular moving averages to derive rescaled variables.
In the local spatial approach, we implemented regional variation of selection by weighting the predictions of local models based on spatial distance on the distance of training sites to the prediction site (i.e.inverse distance weighting; DeCesare et al., 2012).We defined the extent of study sites as the intersection of all lynx home ranges from that site.In the local environment approach, we weighted local models based on the environmental dissimilarity of the training data to the prediction site.To ensure the comparability of variables, we mean centred and standardized predictors across our entire study area before calculating dissimilarity measures.Then, we calculated environmental dissimilarity using three distance measures: the Euclidean distance (in predictor variable space) to the nearest neighbour in the training dataset (Meyer & Pebesma, 2021), the median Euclidean distance to the training dataset, and finally, the Mahalanobis distance to the centre of the predictor distribution at training points (Mesgaran et al., 2014).
To reflect different contributions of predictor variables in habitat models, we also calculated weighted versions of the three distance measures, in which variables were weighted by their relative variable importance before distance calculation (Meyer & Pebesma, 2021).We calculated relative importance scores based on permutation importance in Maxent and random forest models (Smith & Santos, 2020) and used χ 2 values of smooth terms for GAMMs as a proxy for variable importance.Based on cross-validation, we selected the weighting scheme that optimized model transferability (i.e.performance at external evaluation).Weighting models improved the transferability using only one (median Euclidean distance between datasets) of the three measures of environmental dissimilarity in comparison to a simple averaging of local habitat models (local stationary approach), which we therefore used in the local environment approach (see Appendix S6).

| Comparison of habitat model outputs
We compared the six habitat modelling approaches (Figure 2; Table 1) in terms of their predictive performance.We assessed the performance of all models using two types of cross-validation: (1) testing our models' ability to predict lynx occurrence at sites used for model training (internal evaluation), as well as (2) transferring models to sites not used for model training (external evaluation).For external evaluation, we left out one study site at a time, resulting in 14 cross-validation folds.For internal evaluation, we created five crossvalidation folds based on lynx individuals (Roberts et al., 2017), using random samples of 70% of lynx individuals at all sites to train models, and testing models on the held-out 30% of individuals, one site at a time.As a measure of predictive performance, we combined two complementary performance metrics, the area under the receiveroperating-curve (AUC) and the continuous Boyce index (CBI; Hirzel et al., 2006), indicating model discrimination and calibration (Phillips & Elith, 2010) respectively.We combined both metrics into a relative performance score by normalizing AUC and CBI values on a ∈ [0, 1] scale and calculating their mean (hereafter: model performance).
We created habitat suitability maps for all six modelling approaches, computing an ensemble prediction by averaging across algorithms in each approach.We predicted landscape-level and home range-level habitat maps using the most recent layers of environmental predictors and multiplied ensemble predictions from landscape and home range-level models to obtain level-integrated habitat suitability maps (DeCesare et al., 2012).Then, we compared maps of the best-performing global and local approaches with one another.To assess the similarity of predicted suitability patterns, we calculated the Spearman rank-correlation coefficient between habitat maps.Furthermore, we derived habitat patches by thresholding maps, using the 90%, 95% and 97.5% suitability values at lynx locations used for training habitat models as low, medium and high thresholds respectively.Although binary habitat maps are sometimes required for research or conservation planning applications, selecting thresholds in presence-only habitat modelling introduces an arbitrary decision into the mapping process.We compared different thresholds to evaluate the impact of threshold selection on mapped habitat areas, choosing thresholds based on percentile suitability values at tracking locations given their clear underlying assumptions.For example, a 90% threshold assumes that 10 % of lynx tracking locations are recorded outside areas classified as suitable.
After applying thresholds, we compared patch maps by calculating the per cent overlap of habitat patches across threshold values.
We identified suitable but unoccupied areas based on the most recent, continental-level survey of lynx distribution carried out for the reference period 2012-2016 (Kaczensky et al., 2021).Using the same grid as the survey (EEA reference grid used for Flora-Fauna-Habitat reporting by the European Union), we calculated the suitability per 10 km 2 grid cell in all countries fully covered by the distribution survey (all European countries but Russia, Ukraine, Belarus and Moldova).
For calculation, we used an unweighted ensemble (i.e.mean value) of the best-performing global and local approaches, summing total suitability scores within each grid cell.We then compared suitability for unoccupied grid cells with areas of extant lynx populations (Boyce & McDonald, 1999;Hebblewhite et al., 2012).Therefore, we divided summed suitability values per unoccupied grid cell through the median of sums within grid cells featuring permanent lynx occurrence.
Finally, we identified the most suitable unoccupied areas as the unoccupied grid cells with the highest 5% of summed suitability values, excluding all islands except the United Kingdom.

| Performance of habitat modelling approaches
Overall, performance of global and local approaches was similar (mean AUC/CBI values across all models 0.69/0.695and 0.70/0.66respectively; Figure 3).Comparing modelling algorithms, the performance of GAMMs was lower than of the machine-learning algorithms (Maxent and random forest), which was consistent across levels of selection (landscape and home range level), and evaluation types (external and internal evaluations; average difference in combined performance score of GAMM models compared to machine-learning models: −0.146).Differences were particularly large at the landscape level (mean difference: −0.226).In addition, model tuning (i.e.optimization of algorithm parameters) had a large impact on the performance of Maxent and random forest models, yielding particularly large performance improvements in global models (see Appendix S3).Since we integrated predictions at both scales for creating habitat suitability maps, in the following, we mainly focus on combined performance scores across scales.For an overview of absolute performance scores (AUC and CBI values) at individual levels of selection, see Appendix S7.Across all six tested modelling approaches, performance was higher at internal evaluation (predicting to known sites used during model training) than at external evaluation (transferring models to sites not used during training).Yet, performance differences between internal and external evaluation were moderate (average across approaches: −0.039), indicating overall good transferability of the modelling approaches.On average, local modelling approaches (i.e.combining separately built, site-specific habitat models) performed slightly better during internal evaluation, while global modelling approaches (i.e.pooling all study sites for model training) showed higher transferability (i.e.performing better during external evaluation).
Comparing individual approaches, both local approaches considering regional variation in habitat selection (local spatial and local environment) performed best at internal evaluation, slightly improving performance over the local stationary approach (mean internal performances of 0.748, 0.752 and 0.738 respectively; only machine-learning models: 0.804, 0.801 and 0.786).Among global approaches, the nonstationary approaches (global spatial and global environment), did not lead to clear internal performance improvements over the global stationary approach, although for machine-learning models, slight performance increases were discernible (internal performance of 0.782, 0.781 and 0.772).
Focusing on model transferability (external evaluation), the local spatial approach implemented by weighting models based on spatial proximity performed considerably worse than all other approaches (average difference −0.064 vs. other approaches), while other approaches performed similarly.Considering only the better-performing machine-learning algorithms, the most transferable approaches were those describing variation in selection as a function of environmental variability (global environment and local environment), although improvements over stationary approaches were small (0.764 and 0.761 global environment vs. global stationary, 0.770 and 0.766 local environment vs. local stationary).Overall, the local environment approach thus achieved the highest average predictive performance across both levels of selection and both evaluation types.

| Comparison of habitat suitability maps
Given considerably lower model performance, we excluded GAMMs from our algorithm ensemble used to create habitat suitability maps and averaged predictions only across Maxent and random forest models.We created suitability maps using the global environment TA B L E 1 Overview of the six modelling approaches.

Global stationary Single model pooling all tracking datasets None
Global spatial Single model pooling all tracking datasets Addition of latitude and longitude as model predictors (random forest, Maxent) or spatially varying effects (GAMMs)

Global environment
Single model pooling all tracking datasets Addition of predictors characterizing regional-scale habitat availability (random forest, Maxent) or interaction effects with these predictors (GAMMs) and local environment approaches, which showed the best model transferability (see Section 3.1).Suitability maps of these best-performing global and local approaches showed highly similar spatial patterns (Spearman rank correlation = 0.93; Figure 4).The integration of habitat models fitted at different levels of habitat selection (i.e.landscape level and home range level) allowed to filter out habitat patches too small for the establishment of lynx home ranges (see Appendix S8).Threshold selection for binarizing suitability maps had a large impact on the total habitat area identified, with 48% and 43% more habitat areas mapped in the global and local approach, respectively, when comparing maps using the low versus high threshold.
Overall, the global environment approach identified more suitable lynx habitat (13%-19% more across thresholds; see Appendix S9).

| Identification of unoccupied habitat areas
Both model approaches predicted large areas of currently unoccupied lynx habitat across our study area.In countries fully covered by the most recent continental-scale assessment of lynx occurrence in Europe (excluding Russia, Ukraine, Moldova and Belarus; Kaczensky et al., 2021), binary habitat patch maps predict 677,000-1,420,000 km 2 of suitable lynx habitat, depending on the modelling approach and threshold used, which corresponds to 56%-117% of currently occupied areas in this region.Comparing suitability values with areas of extant lynx populations at the level of 10 km 2 grid cells, we identified a large number of grid cells with equal or larger suitability than the average suitability of grid cells with confirmed, permanent lynx occurrence (N = 1541, corresponding to 4.0% of unoccupied grid cells and 16.4% of occupied grid cells; Figure 5a).Unoccupied grid cells with the highest suitability were widely distributed across the continent and mainly located in mountainous and continuously forested regions.Importantly, large parts of the most suitable unoccupied habitats were located in regions lying between extant populations, such as large parts of the Southern and Eastern Alps, the Black Forest and Thuringian Forest in Germany, as well as large areas on the Balkan peninsula, including the Balkan Mountains, the Rhodope Mountains, as well as parts of the Dinaric Mountains (Figure 5b).

| DISCUSS ION
As animal tracking is entering the big data era (Kays et al., 2015), the growing availability of tracking data provides new opportunities to assess habitat consistently, at high resolution, and across large Model performance of different habitat modelling approaches, summarized across two levels of habitat selection (landscape and home range level).We tested six modelling approaches (rows), comparing internal evaluation (prediction to known sites) against external evaluation (prediction to unknown sites).Model performance was calculated by combining two performance metrics (AUC and CBI) into a relative performance score.Black triangles indicate mean performance values.
geographical extents.A key challenge in this context is how to best integrate tracking datasets across study sites and environmental gradients, while appropriately accounting for regional variation in habitat selection (Paton & Matthiopoulos, 2016).Here, we systematically assessed different approaches for addressing this challenge and used them to map habitat for a recovering large carnivore at fine resolution at a continental scale.Our analyses yielded three key insights, which we discuss further in the below.First, global and local approaches achieved similar predictive performance and resulted in similar suitability maps, indicating that both model building strategies can yield robust and transferable habitat maps across large areas.Second, our analyses underscore the importance of considering regionally varying habitat selection in large-area habitat mapping, as approaches describing variation in lynx' responses as a function of environmental variation across study sites performed best in our assessment.Third, we here provide the first consistent, fine-resolution habitat assessment for the Eurasian lynx across Europe, highlighting large areas of suitable, currently unoccupied habitat and therefore considerable potential for continued range expansion.
Global and local model building strategies have been applied for large-area habitat mapping (Bluhm et al., 2023;Scharf & Fernández, 2018), yet comparisons of their effectiveness are scarce (Olson et al., 2021;Reisinger et al., 2021).On average, global and local models performed better at extrapolation and intrapolation (i.e.model transfer vs. prediction at observed sites), respectively, in line with the trade-off between generality and specificity associated with pooling and splitting datasets for model training (Bamford et al., 2009;Paton & Matthiopoulos, 2016).Yet, differences between the two model building strategies were overall small and other factors had more substantial effects on model performance and habitat maps.Specifically, three factors were particularly important.First, the choice of modelling algorithm was key, as Maxent and random forest consistently outperformed GAMMs.The good performance of Maxent and random forests for habitat modelling, including models using animal tracking data, is in line with previous research (Aldossari et al., 2022;Shoemaker et al., 2018;Valavi, Guillera-Arroita, et al., 2021).Second, the predictive performance of Maxent and random forest models varied greatly over different parameter settings.This underscores the need for carefully tuning parameters instead of relying on default settings, which should be particularly important when relying on large and heterogeneous datasets as in our case (Radosavljevic & Anderson, 2014).Finally, while habitat maps were highly similar between the two best-performing global and local approaches, the selection of thresholds introduced considerable uncertainty, with mapped habitat areas varying around 45% across thresholds.Although binary habitat maps are sometimes needed for conservation planning, the large uncertainty-associated threshold selection remains an unsolved challenge for habitat maps based on presence-only datasets (Muscatello et al., 2021;Norris, 2014).We thus recommend using continuous habitat suitability maps whenever possible or relate predicted habitat suitability in currently unoccupied regions to areas with known occurrences, as we did for identifying the most suitable unoccupied habitat areas.
In sum, we highlight that both global and local modelling strategies can be suitable for combining animal tracking datasets at a continental scale, while other commonly discussed issues of modelling techniques (i.e.modelling algorithms, parameter tuning and threshold selection), seem to have a larger effect on modelling results.
We achieved the highest predictive performance with approaches that consider regional variation in lynx habitat selection.
Approaches describing varying habitat selection as a function of environmental variation yielded the most transferable models, pressure and landscape composition (Oeser et al., 2023).While previous studies have highlighted the usefulness of including predictor variables at multiple scales for capturing functional responses (Aldossari et al., 2022;Matthiopoulos et al., 2011;Paton & Matthiopoulos, 2016), as in our global environment approach, our comparison showed that combining habitat maps from local models can also achieve good performance.Indeed, our local environment approach overall performed best among all approaches we tested.
Thus, our results highlighted that strategies based on the combination of site-specific local models, although used less frequently, can outperform global approaches and thus deserve more research attention in the context of large-area habitat mapping (Reisinger et al., 2021).More generally, our results confirm the importance of considering regional variation in habitat selection when integrating animal tracking datasets across large areas and environmental gradients.A growing body of research suggests that functional responses in habitat selection are practically ubiquitous among large mammals (Godvik et al., 2009;Herfindal et al., 2009;Holbrook et al., 2019).Therefore, accounting for regionally varying habitat selection in large-area habitat assessments is likely important in the case of many other species as well.
Our analyses provide the most comprehensive and detailed assessment of European lynx habitat so far, highlighting large areas of suitable, but currently unoccupied habitat.For example, at the level of 10-km 2 grid cells, we mapped areas corresponding to 16% could make them key building blocks in the creation of lynx metapopulations at the continental scale.Yet, despite the availability of suitable habitat, the success of recolonization through natural dispersal has been limited, highlighting the importance of considering other limiting factors for range expansions, such as barriers to dispersal, traffic mortality and poaching (Arlettaz et al., 2021;Heurich et al., 2018;Port et al., 2021).In addition, we caution that several factors might disqualify areas identified as environmentally suitable by our maps as good candidate sites for fostering lynx comebacks, such as potential overlap with recovery areas of the critically endangered sister species Iberian lynx (Lynx pardinus; Garrote et al., 2020).
Finally, evaluating the human acceptance of lynx in regions containing patches of suitable habitat is important (Behr et al., 2017;Drouilly & O'Riain, 2021).Bearing these constraints in mind, our habitat suitability maps can provide an important basis for informing lynx conservation and management at a continental level.
While we used a very large tracking dataset and tested a range of state-of-the art modelling approaches, some limitations must be kept in mind.First, other factors we did not explicitly incorporate in our models are important causes of regional variation in habitat selection.For example, differences in population density, interactions with other species or genetic variation have all been linked to variation in habitat selection (Avgar et al., 2020;Fletcher, 2007).How to account for such sources of variation in habitat models is an active field of research but might be integrated with approaches capturing environmentally related variation (Smith et al., 2019;Tikhonov et al., 2017).In addition, we could not include information on prey availability in our models due to a lack of continuous spatial data.
However, given the wide distribution and generally high levels of abundance of most prey species of lynx across Europe (e.g.roe deer Capreolus capreolus; Carpio et al., 2021;Linnell et al., 2020), prey availability rarely should be a limiting factor for lynx recovery in  et al., 2021).
Despite these limitations, our study highlights the considerable potential of integrating animal tracking data across large areas for habitat assessments.Robust habitat maps are crucial for informing the transboundary conservation of wide-ranging species, such as large carnivores, yet consistent large-area habitat maps providing levels of spatial detail required for guiding conservation and management decisions are typically lacking.Our results highlight that the integration of animal tracking datasets across animal populations through curated and openly accessible databases can provide for a step change in our ability to map wildlife habitat consistently, robustly, at high resolution, and across large geographical extents (Heurich et al., 2021;Urbano et al., 2021).Importantly, as we showed here, large-area mapping does not have to come at the cost of losing

A FFI LI ATI O N S
allowing for varying selection either as a function of space or of environmental variation.Our analyses addressed the following overarching research questions: 1.How do global and local habitat model building strategies, different modelling algorithms and different approaches for incorporating varying habitat selection influence the predictive performance of broad-scale habitat models?2. How similar are habitat maps derived from global versus Local modelling approaches? 3. What are patterns of lynx habitat suitability across Europe and where are the most suitable but currently unoccupied habitat areas?
use/availability sample for training habitat models, which allows combining the outputs of models fitted at different levels of selection into a single, level-integrated prediction of habitat suitability (DeCesare F I G U R E 1 Overview of the study area and extent of the lynx tracking data used in this study.Coloured polygons show areas covered by lynx home ranges.Abbreviations of the study site names: BA-Baltic, BF-Bohemian Forest, BI-Białowieża Forest, DM-Dinaric Mountains, EA-Eastern Alps, HM-Harz Mountains, JU-Jura Mountains, NN-Norway North, NS-Norway South, SN-Scandinavia North-Central, SS-Scandinavia South-Central, SW-Sweden South, WA-Western Alps, WC-Western Carpathians.Light grey areas indicate the current distribution of Eurasian lynx according to IUCN data we used the recorded tracking observations as used locations and sampled available locations randomly inside home ranges of lynx individuals.Based on initial tests on the convergence of model coefficients with increasing background sampling sizes, we used a sampling ratio of 1:12 of used: available locations (Northrup et al., 2013; see Appendix S4).Finally, to balance the numbers of observations between individuals, we limited the maximum number of used locations included per individual lynx to 200 through random sampling, keeping 12 times as many available locations as used locations.Since tests on the performance of model predictions at the individual level indicated little improvements beyond 200 locations (see Appendix S5), we assumed that this down-sampling did not represent a consequential loss of information.Our final datasets used for building habitat models contained 740,018 locations from 442 lynx individuals, with an average tracking period of 735 days per individual (range 248-1617 across sites).
distance to training sites (i.e.inverse distance weighting, local spatial) or based on the environmental dissimilarity to training sites (local

F
I G U R E 4 Habitat suitability maps of the best-performing global (left) and local (right) modelling approaches.while weighting local models based on their spatial proximity performed much worse when transferred to unobserved study sites.Variation in environmental conditions more closely reflects the ecological mechanisms causing animals to adjust their habitat selection (i.e.adjustments of habitat use with varying habitat availability also termed functional responses; Aarts et al., 2013), whereas spatial proximity alone is often a poor proxy of model transferability (Yates et al., 2018).Our results thus indicate that functional responses might be a main factor explaining variation in habitat selection by lynx across Europe, which is well in line with recent results highlighting adjustments by lynx across gradients of human of the current lynx distribution that feature suitability values equal or larger than the average of currently occupied areas.Importantly, many of the most suitable unoccupied areas are located in regions lying between extant populations.Key regions identified by our assessment, some of which already feature sporadic lynx occurrence, include parts of the Eastern and Southern Alps, the Black Forest and Thuringian Forest in Germany, as well as the Balkan Mountains, the Rhodopes and parts of the Dinaric Mountains.The lack of functional connectivity between extant populations constitutes one of the central challenges for lynx conservation and management in Central and Southern Europe (Bonn Lynx Expert Group, 2021).Fostering connectivity for lynx in these regions, for example through the establishment of functional transboundary migration corridors or the creation of stepping-stone populations through reintroductions F I G U R E 5 Suitability of unoccupied habitat areas at the level of 10 × 10 km grid cells across countries fully covered by the most recent continental-scale assessment of lynx distribution (Kaczensky et al., 2021).(a) Suitability relative to the median suitability of grid cells with permanent lynx occurrence.(b) Unoccupied grid cells featuring the highest 5% of summed suitability values (shown in red).Grey areas correspond to grid cells with permanent lynx occurrence.
Europe.As our tracking datasets mainly cover relatively remote and undisturbed habitats and do not include parts of the species' range in which lynx are known to differ in their prey preference and habitat selection (e.g.specialization on lagomorphs and stronger use of open habitats in Anatolia;Mengüllüoğlu et al., 2018), our habitat models might underestimate suitable habitat areas in some regions.As a more general limitation, we did not explicitly address several challenges relating to the representativity of heterogeneously collected tracking datasets for assessing population-or species-level habitat selection.As site-level tracking datasets are collected with different preconditions and goals, they often differ in several parameters potentially affecting inferences on habitat use, such as the number of collared individuals, sex and age ratios, tracking intervals and location errors.We here chose to maximize the number of individuals in our dataset by including all individuals with more than 50 available tracking observations.Yet, we acknowledge that our habitat models, particularly local models built on site-level data, could be biased by factors not accounted for in our analysis, such as uneven sex ratios, differences in location errors and tracking intervals between GPS and VHF data, or low numbers of individuals coupled with strong individual-level variation in habitat selection.More generally, when using continental-scale suitability maps such as ours, it is important to consider that environmental predictor variables derived over large areas might miss relevant local differences in habitat conditions (e.g.traffic volume on roads or recreational activity levels inside forested areas).Therefore, decision-making for conservation planning might require follow-up analyses based on datasets better capturing local habitat conditions.Finally, while we here were able to rely on a large dataset comprising many individuals from almost all extant lynx populations in Europe and focused on comparing modelling approaches that can be applied once such continental-scale databases are available, further research investigating how many (and which) individuals and study sites are required to ensure representatively characterizing habitat selection is important to better understand under what conditions tracking data allows robust inferences in large-area habitat mapping(Street regional context, when local variations in habitat selection across gradients of human modification, climate or landscape composition are properly taken into account.A new generation of habitat models harnessing the enormous potential of the emerging big data in animal tracking will increasingly allow us to overcome the traditional trade-offs between local, fine-scale habitat assessments and broadscale, biogeographical analyses.