Snow information is required in subcontinental scale predictions of mountain plant distributions

Aim: To examine how snow cover and permafrost affect plant species distributions at a subcontinental extent. Fennoscandia, northern Europe. Time period: Species data from 1 January 1990– 25 February 2019. Major taxa studied: Arctic- alpine and boreal vascular plants. Methods: We examined the effect of snow persistence and permafrost occurrence on the distributions of arctic- alpine and boreal plant species while controlling for climate, topography and geological factors. Data comprised 475,811 observations from 671 species in the Fennoscandian mountains. We investigated the relationships between species distributions and environmental variables using four modelling methods and ensemble modelling building on both non- spatial and spatial models. Results: Snow persistence was the most important driver of plant species distributions, with the greatest variable importance for both arctic- alpine (38.2%) and boreal (49.9%) species. Permafrost had a consistent minor effect on the predicted distributions. Arctic- alpine plants occur in areas with long snow persistence and permafrost, whereas boreal species showed the opposite habitat preferences. conclusions: Our results highlight the importance of snow persistence in driv-ing the distribution of vascular plant species in cold environments at a subcontinental scale. The notable contribution of the cryosphere to plant species distribution models indicates that the inclusion of snow information in particular may improve our understanding and model predictions of biogeographical patterns in cold regions.

However, there are special elements in various ecosystem types or geographical regions that may, in certain circumstances, actually outcompete the "universal" climatic or environmental drivers in shaping biogeographical patterns (Luoto & Heikkinen, 2008;Mod, Scherrer, et al., 2016). The cryosphere, that is, the frozen part of the Earth's system, is a crucial element in cold environments. Through its connections with water, energy and carbon exchange, the cryosphere supports unique habitats for biota in high-latitude and mountain areas (Aalto et al., 2018;Hock et al., 2019;Virtanen et al., 2016). As the cryosphere is highly interconnected with the climate and is sensitive to its changes (Hock et al., 2019), cryogenic conditions may possibly be important mediators shaping vegetation patterns Schuur et al., 2007).
Snow cover and permafrost control multiple aspects of the abiotic conditions in terrestrial ecosystems, both in winter and during the growing season (French, 2013). In terms of extent, snow is the largest component of the cryosphere (Fountain et al., 2012), and its uneven distribution is a pivotal driver of vegetation patterns at fine spatial scales . In addition to snow, the cryosphere also affects species occurrences belowground due to spatial variation in permafrost (Fountain et al., 2012;Westermann et al., 2015). Snow cover and permafrost occurrence control freeze and thaw cycle-related disturbances (e.g., cryoturbation and solifluction) of the ground and may also drive spatio-temporal variation in soil moisture and nutrient contents Rowland et al., 2010). Therefore, variation in cryogenic conditions will alter the functions and fitness of plant individuals, thus also affecting species distributions (Hock et al., 2019;Williams et al., 2015). However, species tend to respond differently to warming and melting depending on their thermal niche, and thus, their biogeographical range (Felde et al., 2012;Lesica & McCune, 2004).
Furthermore, species responses to environmental conditions and their survival ability depend on their life strategies .
Many arctic-alpine species can tolerate extreme conditions and high environmental stress, whereas boreal species are often stronger competitors in more productive habitats (Mod, Heikkinen, et al., 2016).
Reduced snow cover and permafrost melting have already led to shrinking of the cryosphere (Aalto et al., 2018;Hock et al., 2019;Liston & Hiemstra, 2011). The land area suitable for arcticalpine plants is decreasing globally, as the boreal tree line is moving northward and to higher elevations, and shrubification of the tundra continues (Myers-Smith et al., 2011;Reichle et al., 2018;Wilson & Nilsson, 2009). Species growing in snowy habitats in particular are shifting their optimal distribution ranges up-and northwards to locate more suitable environmental conditions (Felde et al., 2012;Lesica & McCune, 2004;Randin et al., 2009). Thus, boreal and temperate vegetation may prosper in high-latitude mountain regions, changing the present arctic-alpine ecosystem drastically (Niskanen et al., 2019;Nogués-Bravo et al., 2007;Stewart et al., 2016).
Although snow cover and frozen ground are acknowledged to be important determinants of cold ecosystems, they have only rarely been included in species distribution models (SDMs). Incorporating landscape-scale snow persistence data into SDMs has substantially improved distribution predictions and thus our understanding of current and future species patterns (Carlson et al., 2015;. However, the effects of cryogenic features on biogeographical patterns at broader scales are not currently understood. Here, we examine the effect of snow persistence and permafrost occurrence on two major biogeographical groups of vascular plant species (arctic-alpine and boreal) of tundra vegetation at subcontinental scale in the mountain environment of Fennoscandia. We aim to characterize plant species with regards to their cryospheric niche and to answer the following questions:

| Research area
Our study is located in the arctic-alpine region of Fennoscandia, northern Europe (approx. 55-72° N, 5-32° E, Figure 1a). The region covers wide environmental gradients due to varying geology, climate and topography. Geologically, all of Fennoscandia is part of the Precambrian Baltic shield, though differences in bedrock quality occur due to past orogenies and glaciations (Lidmar-Bergström & Näslund, 2005), which are reflected in soil nutrient content variations (Virtanen, 2003). Two climatic gradients can be detected in the study area: the climatic conditions change from west to east from oceanic to continental and from south to north the temperate climate shifts towards the arctic (Aalto et al., 2014). Additionally, variations in precipitation and temperature conditions are caused by the strong orographic effect of the Scandes mountain range (Karlsen et al., 2006). The wide environmental gradients are also reflected in the cryogenic features of the area. The southernmost parts of Fennoscandia are, on average, free of snow throughout the year, whereas snow cover may persist in the coldest parts until late summer ( Figure 1b). Permafrost occurs mainly within the mountain range and in the most continental parts of the study area in three zones, which reflect the spatial continuity of the permafrost (Figure 1c).
In Fennoscandia, tundra vegetation is controlled by both altitude and latitude, though the alpine area of the Scandes mountains covers the majority of the arctic-alpine realm (Virtanen et al., 2016).
At sea level, the biogeographically defined arctic area covers only a small section of northernmost Norway (European Environment Agency, 2015). We defined the arctic-alpine realm of Fennoscandia based on the biogeographical regions of Europe, from which we delineated the "Alpine" and "Arctic" regions (Supporting Information Appendix S1: Figure S1.1a; European Environment Agency, 2015).
Due to the wide topographical relief from the fjord bottoms at sea level to mountain summits (2,469 m a.s.l.), the vegetation of the area is a combination of species with very different ecologies and distributional ranges. To capture this heterogeneity, we included both boreal and arctic-alpine species in our study.

| Species data
Species occurrence data of boreal and arctic-alpine vascular plant species include observations from the national databases of Finland (https://laji.fi/en), Sweden (https://www.artpo rtalen.se/) and Norway (https://www.artsd ataba nken.no/), the Global Biodiversity Information Facility (GBIF; https://www.gbif.org/) and field observations by the authors' research groups. We only included georeferenced observations from 1 January 1990-25 February 2019 with location accuracy better than or equal to 100 m. The original data cover all of Fennoscandia, including over 6.2 million observations from 833 species. We only used species observations within the arctic-alpine realm. To decrease the effect of single observations and thus misleading predictions for species that were too rare in the data, we only used species that were present in > 20 grid cells (407 m × 407 m) within the modelling area (Titeux et al., 2017). We also removed observations of species with distributions showing strong direct anthropogenic influence such as crop species. Based on their distributional ranges in Fennoscandia, we divided the species into arctic-alpine and boreal species using the classification by Niskanen et al. (2019) complemented with observation maps from Den nya nordiska floran (Mossberg & Stenberg, 2003

| Environmental data
To consider fundamental environmental factors, we used climatic, geological and topographical variables in addition to the cryogenic  (Table 1). Climatic data for the study were acquired from CHELSA (Climatologies at high resolution for the earth's land surface areas) version 1.1., which is a global high-resolution climate data set for land surfaces. The data include bioclimatic variables derived from monthly and yearly average temperatures and precipitation from the period 1979-2013 (Karger et al., 2017). The data set's original resolution is 30 arcsec (30 arcsec c. 1,000 m, WGS84), and was resampled to a 1-km × 1-km grid (projection: Transverse Mercator Finland Uniform Coordinate System, EPSG: 2393). To keep our model structure parsimonious and avoid collinearity problems we included only three climatic variables. Growing degree days (GDD3), freezing degree days (FDD) and water balance (WAB) were chosen to represent average growing season, overwintering and moisture conditions, respectively (Supporting Information Appendix S1: Figure S1.1b-d).
A threshold of 3 °C in the mean daily temperature was used to calculate growing degree days (Karlsen et al., 2006). Chosen climatic variables are ecologically relevant as they represent known physiological limitations for vegetation (Mod, Scherrer, et al., 2016).

To represent topographical heterogeneity (Supporting Information
Appendix S1: Figure S1.1e), we used elevation difference (ELE) within the modelling grid cells calculated from combined national digital elevation models provided by the Land Surveys of Finland, Sweden and Norway (50-m resolution). To account for the geochemistry of the geological substrates, we used five bedrock classes (BR, Supporting Information Appendix S1: Figure S1.1f). The most nutrient-rich areas contain for example, limestone and dolomite (assigned to class 5), whereas the nutrient-poorest regions consist of for example, sandstone and quartzite (class 1). Bedrock data were reclassified from the geological data set of the Fennoscandian shield obtained from the Geological Surveys of Finland, Sweden and Norway (see Niskanen et al., 2019 for more details). The resolution for both topographical and rasterized bedrock data is 407 m × 407 m (projection: WGS84/ UTM Zone 34N).
The cryosphere was incorporated into our models utilizing snow persistence and permafrost occurrence data (Supporting Information Appendix S1: Figure S1.1g-h). Snow persistence (SNOW) is defined using day of year (DOY) of average snowmelt. A snow persistence map was constructed from two moderate resolution imaging spectroradiometer (MODIS) imagery-based daily snow cover products (MOD10A1 and MYD10A1) from January-September 2001-2018 (Hall et al., 2002). We downloaded the daily imagery, masked cloudy pixels, averaged the two MODIS products and binarized the snow cover ("snow" when the fractional snow cover was at least 10% and "no-snow" when below it). From the daily binary snow maps, we determined the melting DOY per pixel using a binomial generalized linear model (for more detailed information see  for each year separately and then averaging the melting DOY over the years. In mid-winter, the polar night prevents passive satellite-sensor observations in the northernmost parts of Fennoscandia and total snow cover was assumed for these days. The resolution of the snow data is 407 m × 407 m (WGS84/UTM Zone 34N). Permafrost occurrence (PF) was modelled for the North Atlantic region by Westermann et al. (2015) using mean annual ground temperatures (MAGT) from remotely sensed land surface temperatures and land cover data on a 1-km 2 scale (WGS84). Based on MAGT, the probability of permafrost occurrence (i.e., ground that remains frozen for at least 2 years) can be divided into four classes: continuous (> 90% of modelled MAGT realizations < 0 °C), discontinuous (50-90%), sporadic (10-50%) and no permafrost (< 10%) (Westermann et al., 2015). Glaciers were removed from the data to prevent misleading distribution predictions. The sizes of the permafrost classes varied greatly (Supporting Information Appendix S1: Table S1). To balance the sizes and thus increase variable TA B L E 1 Statistics and descriptions of the environmental predictors in the study area performance, we reclassified the data combining areas where the probability for permafrost occurrence was > .5 (i.e., continuous and discontinuous permafrost areas). Areas with sporadic or no permafrost were used as they were in the original data.

| Statistical analyses
All statistical analyses and spatial data manipulations were carried out using R version 3.5.3 (R Core Team, 2019). Before model fitting, we tested for possible collinearity between environmental predictors. The correlation coefficient did not exceed |.7| for any pairwise variable combination, thus all variables were retained for subsequent analyses (Supporting Information Appendix S1: Figure S1.2). For producing distribution predictions, environmental data were resampled to the resolution of the snow variable (407 m × 407 m, WGS84/ UTM Zone 34N) using the nearest neighbour method. Prior to modelling, we also combined our species presence records with pseudoabsences (PAs). We created PAs randomly for the cells in which a modelled species was absent, but which had at least one observation of some other study species. The number of PAs for each species was at least 2,000 or the same number as the presence records.
Non-spatial models were fitted by only using environmental explanatory variables as follows: Spatial structure of the data, that is, spatial autocorrelation (SAC), was considered by adding a residual autocovariate term (RAC) into the models (Crase et al., 2012). We calculated RAC from the residuals of non-spatial models following the framework by Bardos et al. (2015), using function "autocov_dist" in package "spdep". As our species occurrence data consist of irregular points instead of a uniform grid, the neighbourhood of residuals was calculated based on the distance of statistically significant SAC calculated with Moran's I.
Spatial models were fitted using the following form: To create more robust distribution predictions and gain the benefits of the different modelling approaches, we combined the four models to build an ensemble model (Araújo & New, 2007). We used weighted averaging based on cross-validated true skills statistics (TSS) as the utilized ensemble method, with a threshold criterion of max (Sensitivity + Specificity) or max (TSS) of the test data set (Naimi & Araújo, 2016). In spatial distribution predictions, RAC was incorporated into the prediction data using two approaches. First, to describe spatial structure as realistically as possible, we interpolated RAC values from species-specific RAC values over the entire modelling area using inverse distance weighting. Second, to standardize spatial structure, a median RAC value from species-specific RAC values was used for the entire modelling area. Ensemble models were used to visualize distributions spatially, but all analytical results were derived from the individual modelling methods. For a more detailed modelling framework, see Supporting Information Appendix S1: Figure S1.3.
We detected species responses to environmental conditions in both the non-spatial and spatial models using variable importance and response curves. Variable importance was calculated with the correlation test within "sdm" for each species in each modelling method. In the correlation test, variable importance is based on predicted values using an original and randomly permutated variable (Naimi & Araújo, 2016;Thuiller et al., 2009). The formula for the calculation is: Response curves were derived from each modelling method See Supporting Information Appendix S1: Figure S1.4 for examples of various response types in different methods. Distribution predictions were evaluated with area under receiver operating characteristics curve (AUC) and TSS (Allouche et al., 2006;Fielding & Bell, 1997). Evaluation statistics were calculated using fourfold crossvalidation for each model method along with the ensemble models (both non-spatial and spatial). To increase the reliability of the results we included only species models performing at least reasonably in terms of AUC (Swets, 1988). Only species models with AUC > .7 in all four modelling methods were included in the analytical results concerning response effects and variable importance.

| RE SULTS
The distributional ranges of boreal and arctic-alpine plants in Looking at the SDMs, the spatial models revealed similar results in terms of both response curve trends and variable importance as the non-spatial models. Thus, they are presented in the Supporting Information (Appendix S1: Figures S1.5 and S1.6, Table S1.2) and all results in the main text concern those of non-spatial models.
Furthermore, results concern species that had model AUC > .7 in all four methods (152 arctic-alpine and 339 boreal species). In the models, snow persistence was the most important predictor for species distributions in both groups over all modelling methods followed by FDD and GDD3 (arctic-alpine species, Figure 3a), or FDD and elevation difference (boreal species, Figure 3b) According to species response curves over all modelling methods, SNOW had a positive or unimodal effect on the occurrence probability of the majority of arctic-alpine species (72.0%) and only 6.4% of species showed negative responses. The species' preferences for snowier conditions are clearly visible in the case of two arcticalpine herbs Micranthes stellaris and Ranunculus glacialis, the latter of which favours areas with the longest snow persistence found in our study area (Figure 4b and c; Supporting Information Appendix S1: Figure S1.6). However, this positive response trend does not mean that species would habit areas with permanent snow, as our snow gradient does not reach its uppermost limit in glaciated areas. In contrast, approximately half of the boreal species (49.0%) had negative responses to increasing snow persistence. However, their optimal snow conditions were not in areas with complete snow absence, as seen in the response of Oxalis acetosella (Figure 4a). A unimodal response was detected for 17.8% of boreal species and certain species (5.0%) even favoured longer snow persistence. SNOW caused a detectable response for the majority of all species (74.0%), with fewer "no trend" responses than any other explanatory variable.
The importance of permafrost occurrence to species distributions was marginal in both groups (Figure 3a and  Species responses to snow and permafrost conditions had some variation between different modelling methods, as seen in the response curves of example species (Figure 4 and Supporting Information Appendix S1: Figure S1.5), though the main trends were  Predictive performance of the models was slightly higher for arctic-alpine species than for boreal species in all modelling methods (Supporting Information Appendix S1: Figure S1.7). In arcticalpine models, average AUC was .82 and TSS .54 (mean calculated over all four methods), compared to .78 and .49, respectively, in boreal ones. The difference in the means of AUC and TSS between the two species groups was statistically significant (t test, p-value < .001). Species-specific evaluations are presented in Supporting Information Appendix S2: Table S2.2. Snow persistence proved to be the most important explanatory variable for species distributions overall, whereas permafrost occurrence played a minor role. However, the importance of snow persistence was even more pronounced among boreal species than among arctic-alpine species. The opposite responses discovered and the predicted occurrences of arctic-alpine and boreal species regarding snow and permafrost support their known ecological differences: arctic-alpine plants are adapted to short growing seasons and cold conditions, whereas many boreal species avoid long-lasting snow cover (Billings & Mooney, 1968;Jonasson, 1981). However, in regard to response curves we acknowledge that they cannot capture the entire environmental gradient of the species in Fennoscandia. In terms of total distribution, the range of arctic-alpine species is covered almost entirely as we study the arctic-alpine realm, whereas boreal species occur mainly in lower and temperate parts of Fennoscandia. Moreover, the range of species inhabiting the harshest conditions (e.g., R. glacialis) is not fully covered as the species data do not cover glaciated areas. Thus, the responses are ecologically relevant, yet they need to be interpreted in the context of the study setting. The importance of snow was also highlighted in spatial models, and the results were overall similar between non-spatial and spatial models. Thus, we presume that the predicted distributions were mostly controlled by the environmental factors rather than spatial structures in the data (Crase et al., 2012). Furthermore, our results indicate the ecological relevance of snow persistence as a F I G U R E 4 Predicted distributions, snow and permafrost responses and variable importance of three study species from non-spatial models. Distribution predictions are based on ensemble models, response curves are from generalized boosted regression models (GBMs) and variable importance is calculated over all four modelling methods. For abbreviations, see Figure 3. Responses from other modelling methods are presented in Supporting Information Appendix S1: Table S1. At a landscape scale, biological systems in cold environments are highly driven by uneven snow distributions (Schmidt et al., 2019;Walker et al., 1993), and according to our results, the effect seems to also hold true at larger spatial scales. Variation in snow cover accumulation and thus in the snow period length are locally driven by the interplay of local topography and wind conditions creating a mosaic ranging from deep snow depressions to snowless windblown ridges (Billings & Mooney, 1968). But here, ecologically relevant variation in snow conditions controlled by much larger-scale processes also appears to occur. Snow cover affects local habitat conditions by regulating the ground thermal state, which is linked to permafrost occurrence, soil moisture conditions and nutrient availability, defining both growing season and overwintering conditions of the vegetation (French, 2013). Through its connections with the atmosphere and biosphere at multiple scales, snow cover information may represent environmental heterogeneity more accurately than traditional climatic and soil variables alone. Therefore, at broader scales, it can capture variation that occurs within the same climatic niche, increasing the realism of species distribution predictions. Furthermore, snow offers a more process-based pathway to understanding climate change effects on cold ecosystems, which makes it a crucial part of upcoming SDM studies, especially as more remotely sensed snow information is becoming available .

| D ISCUSS I ON
Even though our snow variable was derived from a coarser resolution remote sensing product (MODIS) compared to landscape-level studies  it captured ecologically relevant variation in snow conditions producing similar results to local studies. More broadly, the importance of snow in our results highlights the advantages of using accurate remotely sensed or otherwise real observation based, environmental data in SDMs, as they have the capability to overcome the effects of the variables derived from global models or interpolations such as traditional climate variables (Randin et al., 2020). In addition to accurate snow information, broader scale SDMs would benefit from real observations of other biophysical factors (e.g., soil thermal and moisture conditions), which appeared to be of high importance in local scale studies (Kemppinen et al., 2019;Niittynen, Heikkinen, Aalto et al., 2020). With the growing global data sets, (see e.g., Lembrechts et al., 2020) these factors could be better addressed in future studies.
The reasons why permafrost occurrence only had a negligible effect on species distributions remain speculative at present, but we propose two possibilities. First, the occurrence of permafrost in Fennoscandia is restricted to a small area, especially in terms of continuous permafrost. Thus, the majority of suitable habitats for vegetation are defined by the absence rather than the presence of permafrost. In addition, the soil active layer is generally very thick, buffering the effects of permafrost on vegetation (Körner, 2003;Seppälä, 2005). Therefore, the interaction between permafrost occurrence and vegetation distribution may be more difficult to interpret in Fennoscandia than in for example, Alaskan or Siberian arctic areas, where permafrost is a more prominent feature of the environment and the active layer is thin (Walker et al., 1993). Second, permafrost is difficult to detect remotely due to its high level of heterogeneity in the landscape (Rowland et al., 2010). The probability of permafrost occurrence derived from modelled ground temperature and divided into categorical classes was probably at an overly coarse scale to efficiently capture the heterogeneity in permafrost occurrence. More detailed information about permafrost occurrence could be obtained from measurements of ground surface temperature and active layer thickness (Christensen et al., 2004;Giaccone et al., 2019), but aggregating that information to cover large geographical areas remains challenging.
In light of our results, the future of boreal plants looks somewhat brighter than that of arctic-alpine species due to their positive response to shortening snow periods. As snow cover duration diminishes and ground temperature rises, the timing and length of the growing season change, possibly providing better habitat conditions for boreal species (Aalto et al., 2018;Hock et al., 2019). The decrease in species occurrences and increase in local extinctions of vascular plants, in particular arctic-alpine species, may be due to a decrease in snow cover rather than an increase in temperature itself . As our results demonstrate, arctic-alpine species have narrower ranges of occurrence in relation to snow and their average snow conditions occur on areas with longer snow persistence compared to boreal species, which possibly makes them more vulnerable to changing snow conditions (Felde et al., 2012). Diminishing snow persistence and higher temperatures might also allow more competitive boreal species to expand their distribution to arctic-alpine areas threatening the survival of less competitive but stress tolerant tundra species (Dullinger et al., 2012;Pauli et al., 2003). Generally, our results support former studies indicating that the arctic-alpine realm of the Northern Hemisphere, also in Fennoscandia, may become more boreal (Niskanen et al., 2019;Nogués-Bravo et al., 2007).
Furthermore, our study increases the reliability of these predictions by also including the cryosphere, specifically snow persistence information, in addition to climatic and edaphic factors.
To conclude, our results have two important implications. From a theoretical standpoint, we show that cryogenic features, especially snow persistence, are at least as influential on northern plant species as most of the traditionally utilized climatic variables (Austin & Van Niel, 2011;Mod, Scherrer, et al., 2016). Therefore, from an applied perspective, snow cover and possibly permafrost data should be included when analysing current plant species distributions and predicting their future ranges in cold regions. Including cryogenic features may be particularly important when analysing the impacts of changing climatic conditions, as shifts in climate are associated with major changes in the cryosphere (Fountain et al., 2012). These changes may represent key mechanisms through which alterations in temperature and rainfall regimes can affect plant communities in cold environments , with arctic-alpine and boreal species potentially responding differently, as our results suggest. Therefore, more collaborative research is required on global, regional and local scales to increase our understanding of how high-latitude or high-altitude vegetation will be impacted by various environmental conditions.

ACK N OWLED G M ENTS
The authors thank Babak Naimi for his help with SDMs, Juha Aalto for his help with the environmental data and Stella Thompson for checking the language. The authors also thank all current and former members of the BioGeoClimate Modelling Lab for their assistance with collecting the species observation data. The authors are grateful for financial support provided by the Doctoral School of Geosciences (GeoDoc, University of Helsinki), Arctic Avenue (spearhead research project between the University of Helsinki and Stockholm University) and Kone Foundation.

CO N FLI C T O F I NTE R E S T
The authors declare no conflict of interest.

AUTH O R CO NTR I B UTI O N S
T.R., M.L. and J.S. designed the study. T.R. and P.N. processed the data and conducted the analyses. T.R. compiled the results and wrote the first draft of the manuscript. All authors designed the methodology, interpreted and discussed the results and contributed to writing the final version of the manuscript.