Global distribution modelling of a conspicuous Gondwanian soil protist reveals latitudinal dispersal limitation and range contraction in response to climate warming

The diversity and distribution of soil microorganisms and their potential for long‐distance dispersal (LDD) are poorly documented, making the threats posed by climate change difficult to assess. If microorganisms do not disperse globally, regional endemism may develop and extinction may occur due to environmental changes. Here, we addressed this question using the testate amoeba Apodera vas, a morphologically conspicuous model soil microorganism in microbial biogeography, commonly found in peatlands and forests mainly of former Gondwana. We first documented its distribution. We next assessed whether its distribution could be explained by dispersal (i.e. matching its climatic niche) or vicariance (i.e. palaeogeography), based on the magnitude of potential range expansions or contractions in response to past and on‐going climatic changes. Last, we wanted to assess the likelihood of cryptic diversity and its potential threat from climate and land‐use changes (e.g. due to limited LDD).


| Soil microbial diversity patterns and conservation
Biodiversity conservation requires up-to-date knowledge of the diversity of organisms and their distribution patterns to prioritise areas for protection and/or restoration (Bragazza, 2009; Tittensor et al., 2010; Whittaker et al., 2005). This information is primarily based on empirical species distribution and biodiversity data, or on model-derived predictions (Rondinini et al., 2006). Climate-based species distribution models (SDM) are thus useful tools to potentially support conservation activities (Araújo et al., 2011, 2019; Franklin, 2013; Guisan et al., 2013). These models have often been based on current climate data, but palaeoecological data (Nogué et al., 2017) are complementary in our goal to understand species responses to past environmental changes (Maiorano et al., 2013), including past human influence (Phelps et al., 2020). Thus, conservation biogeography can build on past climate data and models to provide guidance for present and future conservation (Barnosky et al., 2017), illustrating the value of biogeography in applied ecology (Yannic et al., 2014).
Results: Our models show that favourable climatic conditions for A. vas currently exist in the British Isles, an especially well-studied region for testate amoebae where this species has never been found. This demonstrates a lack of interhemispheric LDD, congruent with the palaeogeography (vicariance) hypothesis. Longitudinal LDD is, however, confirmed by the presence of A. vas in isolated and geologically young peri-Antarctic islands. Potential distribution maps for past, current and future climates show favourable climatic conditions existing on parts of all southern continents, with shifts to higher land from LGM to current in the tropics and a strong range contraction from current to future (global warming IPSL-CM6A-LR scenario for 2071-2100, SSP3-7.0 and SSP5-8.5) with favourable conditions developing on the Antarctic Peninsula.
Main Conclusions: This study illustrates the value of climate niche models for research on microbial diversity and biogeography, along with exploring the role played by historical factors and dispersal limitation in shaping microbial biogeography. We assess the discrepancy between latitudinal and longitudinal LDD for A. vas, which is possibly due to contrast in wind patterns and/or likelihood of transport by birds. Our models also suggest that climate change may lead to regional extinction of terrestrial microscopic organisms, thus illustrating the pertinence of including microorganisms in biodiversity conservation research and actions.

KEYWORDS
Apodera vas, climate change, conservation biogeography, cosmopolitanism, endemism, free-living protists, Gondwana, microbial biogeography, palaeogeography, soil biodiversity, species distribution modelling, testate amoebae

Singer, Metz, et al., 2019). The amount and quality of data generated allows for a better assessment of microbial biodiversity distribution patterns, potential threats such as changes in climate or soil conditions (Mod et al., 2021), and cascading impacts on ecosystem functioning (Heleno et al., 2020).
Changes in microbial biodiversity and associated ecosystem functioning will affect the resilience of other organisms and potentially their ability to respond to climate change. Concerns about biodiversity loss, as well as its possible impact on ecosystem functioning (Heleno et al., 2020), require that microorganisms be considered in conservation strategies. However, like biogeography in general (Hortal, 2011), conservation biogeography is still largely focussed on macroscopic plants and animals (Ladle & Whittaker, 2011). A further paradox is that microorganisms are crucial in regulating climate change, yet they are rarely the focus of climate change studies. Their diversity and response to environmental change and fluctuation in climate make determining their role in ecosystems challenging. Thus, we need an improved understanding of microbial processes and their response to climate change to ensure an environmentally secure future. Finally, in addition to being essential for soil functioning and natural soil fertility and hence to plant health and agricultural production, soil microorganisms also have intrinsic value as elements of biodiversity worthy of preservation (Averill et al., 2022; Cotterill et al., 2008). We therefore have moral as well as practical and economic reasons to better document soil microbial diversity as a basis for its conservation and understanding its functions. Such moral arguments are, however, currently applied mainly to a small fraction of biodiversity, for example plants and animals (O'Malley, 2007).

| Microbial species distribution data and models with Apodera vas as an ideal test case
New molecular methods now make it possible to obtain reliable data on diversity patterns of microorganisms at regional (e.g. Mod et al., 2020) to global scales (Robeson et al., 2011). However, their taxonomic resolution is often limited due to methodological approaches such as high-throughput sequencing of short markers such as the V4 region of the SSU rRNA gene (Seppey et al., 2020). Data at species-level resolution are needed to inform about effects of global warming and other impacts of human activities on specific microbial species. But distribution data are rare for soil microscopic organisms at the global scale (Fontaneto et al., 2007). Existing records are patchy due to highly uneven and nonsystematic sampling, with regions such as Europe and North America typically better covered than other parts of the world (Burdman et al., 2021; Geisen et al., 2018). Such geographical sampling bias is a known problem in biogeography (Meyer et al., 2015; Troudet et al., 2017), and it is no surprise that the less studied microorganisms suffer from it as much if not more than macroscopic organisms (Yang et al., 2013).
Long-distance dispersal (LDD) of microorganisms requires a capacity for passive transport and survival during the time needed for the transport. Measurements and estimates of the potential long-term survival of testate amoebae are rare. In a study conducted in a Canadian aspen woodland, estimated life expectancy of soil testate amoebae was short, ranging from ca. 6-10 days (Lousier, 1974a). However, as most soil testate amoebae are able to encyst, they can survive during long periods of drought and frost (Bonnet, 1964), which explains their presence also in hot and cold deserts (Bamforth, 2004; Bamforth et al., 2005; Fernández, 2015; Pérez-Juárez et al., 2017). Very long-term survival seems possible as attested by the finding of viable protists including amoebae in 30,000-year-old permafrost (Shatilovich et al., 2005).
If survival is not a limiting factor for their LDD, size and a limited capacity to remain airborne may be more critical. Indeed, testate amoebae do not produce diaspores, such as spores, that are specifically adapted for passive aerial transport and that explain the observed anisotropic LDD of bryophytes, lichens and ferns (Muñoz et al., 2004). In line with this, empirical evidence suggests that testate amoeba species larger than ca. 150 μm do not travel far (Smith & Wilkinson, 1987; Wilkinson, 2001). Furthermore, a modelling study comparing the dispersal potential of virtual particles ranging from 9 to 60 μm in size showed that while smaller particles (9-20 μm) could easily be transported over long distances, albeit only within a hemisphere, the dispersal potential of particles larger than 60 μm rapidly declined and dropped to very short distances (Wilkinson et al., 2012). Given the large size of A. vas by microbial standards (130-170 μm), LDD by wind seems unlikely.
Wind may not be a likely transport mechanism, but animals, and especially birds, could easily transport microscopic organisms (Green et al., 2023), including testate amoebae. However, the lack of reports for A. vas from mid-high latitudes of North America and Eurasia suggests that interhemispheric LDD is rare or absent.
A. vas is common in forests and peatlands with well-developed humid humus and moss cover of the Southern Hemisphere and intertropical zone (Smith & Wilkinson, 2007). But, although such environments are common in northern temperate regions, protistologists have so far failed to find it there (Figure 2a). This is in clear contradiction with the once popular idea that all free-living microorganisms should potentially be cosmopolitan ('everything is everywhere, …'), their occurrence in a given place being determined only by the local environmental conditions ('… but, the environment selects') (Baas Becking, 1934; Canfield, 2016; de Wit & Bouvier, 2006).
The abundance of records for A. vas in the literature makes it possible to model its potential distribution, thus providing a unique test case to use a predictive bioclimatic niche-based SDM in soil microbial biogeography. Studies on larger organisms have indicated that the performance of such models is independent of the trophic level (Huntley et al., 2004), suggesting they are also applicable to soil microorganisms (Schroder, 2008). Still, although this approach has been widely used in studies of multicellular taxa, there are only a few SDM microbial studies, and the existing ones (e.g. Mod et al., 2020, 2021) are based on metabarcoding data (i.e. OTUs or ASVs rather than direct observations of specimens).

| Conservation of microbial diversity
Due to their large population sizes, it used to be considered unlikely that any microbial species may be endangered (Finlay et al., 2004), but this view has been challenged (Cotterill et al., 2008). Extinction threat increases with decreasing population size and geographic range (MacArthur & Wilson, 1967) and it is now demonstrated that at least some soil microorganisms also have limited geographical ranges (Beyens & Bobrov, 2016; Boenigk et al., 2006; Foissner, 2008). However, it remains difficult to demonstrate that a microbial species may indeed be threatened and to forecast the effect of its potential loss on ecosystem functioning.
Climate-niche models are useful to compare the predicted and realised distribution of taxa, and thus to assess the possible role of historical factors such as palaeogeography or glaciations in shaping current patterns. For example, if these models are used to predict the potential distribution under different climatic scenarios, in the past (e.g. Schorr et al., 2013) and in future (e.g. Mod et al., 2021), it is then possible to infer the magnitude of potential range expansions or contractions over glaciation cycles and the likely impact of future climate warming. SDM are also potentially useful to guide sampling in regions that are suitable for the species but where it has not yet been observed. Finally, these models can be used to predict phylogeographical patterns and cryptic species that could have evolved in isolated regions. A. vas is also a good model organism in this respect.
Note that although it is considered as a single species, morphological (Penard, 1911; Zapata & Fernandez, 2008) and genetic evidence (Duckert et al., 2021) suggest the existence of a species complex.

| METHODS
Our modelling approach aimed at predicting the past, present and future potential distributions of A. vas based on high-resolution climatic and topographic variables by using N-SDM v1.0.1 (Adde et al., 2023), an end-to-end high-performance computing pipeline for species distribution modelling. Model building and analyses were reported following the ODMAP protocol (Zurell et al., 2020) (Appendix S1). All analyses were performed in the R environment (v4.2.2, R Core Team, 2022), with the full R code provided in Supporting Information. The N-SDM code can be found here: https://github.com/N-SDM/N-SDM. The supplementary R code related to this study can be found here: https://github.com/estellebruni/ApoderaVas-nsdm.

| Species occurrences and background absences data
We compiled 401 occurrence points from all known published records of A. vas worldwide and from our own unpublished data (Figure 2a, Table S1). When the exact occurrence location was not available, we estimated it given the information provided in the companion texts, especially based on the habitat type (i.e. forests or peatlands) and elevation, resulting in an estimated geographical accuracy of less than 2 km from the original sampling location for most samples.
A dubious unconfirmed record from Antarctica (an apparent albeit surprising confusion with Difflugia vas, now either Lagenodifflugia vas or Pontigulasia spectabilis (Murray, 1910; Penard, 1902, 1911)) was excluded, but the record from King George Island, 120 km off the coast of the Antarctic Peninsula (Zapata & Matamala, 1987), was retained as valid despite being from lake sediments, which is not the typical habitat of A. vas. Records from Vancouver, Canada (Penard, 1911), Iceland (Decloitre, 1965), Nepal (Bonnet, 1977), Hawaii and Japan (Richters, 1908) likely corresponded to misidentifications. Finally, records with insufficient information available to infer a location with the required degree of certainty were also excluded (see Text S1 for a more detailed discussion of the critical data points). A set of 10,000 background absences was generated to contrast the occurrence observations by using a random-stratified sampling strategy in the six ecoregions considered accessible to A. vas (i.e. Afrotropics, Antarctica, Central Neotropics, Indomalaya, Oceania, and South America; Table S2, ODMAP).
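The random-stratified draw of background points can be illustrated with a short Python sketch. It is not the N-SDM implementation (which is in R and documented in the ODMAP protocol): the allocation rule (points proportional to the number of candidate grid cells per ecoregion), the function name `stratified_background` and the cell counts in the usage note are assumptions made for illustration only.

```python
import numpy as np

def stratified_background(cells_per_region, n_total=10_000, seed=0):
    """Draw background cell indices per region (illustrative sketch).

    cells_per_region: dict mapping a region name to its number of
    candidate grid cells (hypothetical values supplied by the caller).
    Allocation is proportional to region size, which is an assumption.
    """
    rng = np.random.default_rng(seed)
    names = list(cells_per_region)
    sizes = np.array([cells_per_region[k] for k in names], dtype=float)

    # Proportional quota per region, rounded with the largest-remainder
    # rule so that the allocations sum exactly to n_total.
    quota = sizes / sizes.sum() * n_total
    alloc = np.floor(quota).astype(int)
    remainder = int(n_total - alloc.sum())
    order = np.argsort(-(quota - alloc))
    alloc[order[:remainder]] += 1

    # Within each region, sample distinct cell indices at random.
    return {
        name: rng.choice(int(size), size=int(n), replace=False)
        for name, size, n in zip(names, sizes, alloc)
    }
```

For instance, calling `stratified_background({"Afrotropics": 220_000, "Antarctica": 30_000, "South America": 180_000})` with these made-up cell counts returns 10,000 distinct cell indices split across the three strata.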

| Variable selection
Based on expert knowledge, we preselected 19 bioclimatic and one topographic candidate variables to model and project the potential distribution of A. vas. All variables were retrieved at a 30 arcsec resolution. The 19 bioclimatic variables (bio1-bio19; Table S3) related to air temperatures and precipitations were extracted from the CHELSA v2.1 data set (Karger et al., 2017, 2021) for current (i.e. 1981-2010)  The Topographic Position Indexes were calculated using the terra R library (v1.7-3; Hijmans, 2023) and digital elevation models (DEM): CHELSA v2.1 DEM (Karger et al., 2017, 2021) for current climate and future scenarios, and CHELSA v1.2 PMIP3 DEM for the past climate (Karger et al., 2017, 2018). To evaluate whether soil temperature predicted the distribution of A. vas better than CHELSA air temperature, we built a second set of variables, in which the CHELSA air temperature variables bio1-bio11 were replaced by the soil temperature variables SBIO1-SBIO11 from Lembrechts et al. (2022), version 2, soil depth: 0-5 cm. Since these soil temperature data were available for the current climate only, past and future spatial projections were made using air data only. To select the best subset of variables to model A. vas potential distribution among the 20 candidates, we used the automated procedure included in the N-SDM workflow with default settings (see Adde et al., 2023 and ODMAP for more details).
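As an illustration of the topographic predictor, the sketch below computes a Topographic Position Index on a small elevation grid in Python, using the common definition of TPI as the elevation of a cell minus the mean elevation of its eight neighbours. The function name `tpi` and the handling of edge cells (left as missing values) are choices made for this example; the values used in the study were produced with the terra R library.

```python
import numpy as np

def tpi(dem):
    """Topographic Position Index for the interior cells of a DEM grid.

    TPI = cell elevation minus the mean elevation of its 8 neighbours.
    Edge cells have an incomplete neighbourhood and are returned as NaN.
    """
    dem = np.asarray(dem, dtype=float)
    n_rows, n_cols = dem.shape
    out = np.full(dem.shape, np.nan)

    # Sum of the full 3x3 window around every interior cell, built from
    # nine shifted views of the grid.
    window_sum = sum(
        dem[1 + di : n_rows - 1 + di, 1 + dj : n_cols - 1 + dj]
        for di in (-1, 0, 1)
        for dj in (-1, 0, 1)
    )
    centre = dem[1:-1, 1:-1]
    # Remove the centre cell from the window sum, then average over 8.
    out[1:-1, 1:-1] = centre - (window_sum - centre) / 8.0
    return out
```

Positive values indicate cells higher than their surroundings (ridges, summits), negative values indicate depressions, and values near zero indicate flat land or constant slopes.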

| Model fitting and evaluation
Generalised Linear Model (GLM) (McCullagh & Nelder, 1989), Maxnet (MAX) (Phillips et al., 2017), and light Gradient Boosted Machine (GBM) (Ke et al., 2017) models were fitted using their default values for hyperparameter tuning (see ODMAP protocol). Model accuracy was evaluated using a split-sample approach repeated 100 times with 30% of the data kept for validation. For each model, the best combination of hyperparameters was identified using the average 'Score' of three evaluation metrics: (i) MaxTSS (Allouche et al., 2006), (ii) Somers' D (AUC′ = AUC × 2 − 1, where AUC is the Area Under the Curve; Somers, 1962), and (iii) the Continuous Boyce Index (CBI; Hirzel et al., 2006). To account for class imbalance, occurrences and background pseudo-absences were equally weighted in the models. For each algorithm, the variable importance was calculated using algorithm-specific measures (see Adde et al., 2023 for more details). Response curves were drawn for each variable and for each algorithm to show how predicted values changed along each variable gradient while keeping all other variables at their mean value (Elith et al., 2005).
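The composite 'Score' can be made concrete with a minimal Python sketch. It computes MaxTSS and Somers' D (AUC × 2 − 1) from validation labels and predicted probabilities and averages them with a Continuous Boyce Index that is assumed to have been computed elsewhere and is simply passed in as `cbi`. The function names are illustrative, and the sketch is unweighted, so it does not reproduce the equal weighting of occurrences and background points used in N-SDM.

```python
import numpy as np

def max_tss(y_true, y_prob):
    """Maximum over thresholds of sensitivity + specificity - 1."""
    y_true = np.asarray(y_true, dtype=bool)
    y_prob = np.asarray(y_prob, dtype=float)
    best = -1.0
    for t in np.unique(y_prob):
        pred = y_prob >= t
        sensitivity = (pred & y_true).sum() / y_true.sum()
        specificity = (~pred & ~y_true).sum() / (~y_true).sum()
        best = max(best, sensitivity + specificity - 1.0)
    return best

def somers_d(y_true, y_prob):
    """Somers' D rescaled from the AUC: D = 2 * AUC - 1."""
    y_true = np.asarray(y_true, dtype=bool)
    y_prob = np.asarray(y_prob, dtype=float)
    pos, neg = y_prob[y_true], y_prob[~y_true]
    # AUC as the probability that a presence outranks a background
    # point, counting ties as one half.
    diff = pos[:, None] - neg[None, :]
    auc = ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size
    return 2.0 * auc - 1.0

def score(y_true, y_prob, cbi):
    """Average of MaxTSS, Somers' D and a precomputed CBI value."""
    return (max_tss(y_true, y_prob) + somers_d(y_true, y_prob) + cbi) / 3.0
```

All three components range from −1 to 1, so their mean is on the same scale, with values close to 1 indicating good discrimination and calibration on the validation split.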

| Mapping past, present and future predictions of Apodera vas
For each algorithm, projected probability values for past, present and future periods were mapped over a 30 arcsec resolution (i.e. ca. 1 km at the equator) grid covering the world and containing more than 9 million cells. Ensemble maps were calculated by averaging individual algorithm projections. Algorithmic uncertainty was evaluated using the coefficient of variation.
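The ensemble step amounts to a cell-wise mean across algorithms, with the coefficient of variation (standard deviation divided by the mean) as the uncertainty measure. The Python sketch below illustrates this on arrays standing in for the GLM, MAX and GBM projections; the function name `ensemble_and_cv`, the use of the sample standard deviation and the treatment of cells with a zero mean (returned as missing) are assumptions of this example.

```python
import numpy as np

def ensemble_and_cv(projections):
    """Cell-wise ensemble mean and coefficient of variation.

    projections: sequence of equally shaped arrays of predicted
    suitability, one per algorithm.
    """
    stack = np.stack([np.asarray(p, dtype=float) for p in projections])
    mean = stack.mean(axis=0)
    sd = stack.std(axis=0, ddof=1)  # sample SD across algorithms (assumption)
    # The CV is undefined where the ensemble mean is zero; mark as NaN.
    with np.errstate(divide="ignore", invalid="ignore"):
        cv = np.where(mean > 0, sd / mean, np.nan)
    return mean, cv
```

A low coefficient of variation means the algorithms agree on the suitability of a cell, whereas high values flag cells where the ensemble mean hides diverging predictions.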

| Multivariate environmental similarity surface (MESS) analyses
For each projection, the climatic dissimilarity of each grid cell compared with the range of environmental values suitable to A. vas (i.e. the area considered accessible to A. vas) was measured using multivariate environmental similarity surfaces (MESS) analyses (Elith et al., 2005). We performed MESS analyses for all time periods and climate scenarios using the dismo R library (v1.3-9; Hijmans et al., 2023). Grid cells with MESS values <0 were set to 0, and grid cells with MESS values >0 were set to 1. Ensemble forecasting maps were then adjusted by multiplying each grid cell by its corresponding MESS value to create the final prediction maps reported in the present manuscript.
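The MESS filter can be sketched in Python as follows. The function `mess` follows the usual per-variable similarity definition (based on the percentage of reference points below the value of the projection cell, with negative values outside the reference range) and takes the minimum across variables; `apply_mess_mask` then binarises the result and multiplies it with the ensemble map. Both function names are illustrative, cells with a MESS value of exactly 0 are treated as masked (the text does not specify this case), and the study itself relied on the dismo R library rather than on this code.

```python
import numpy as np

def mess(reference, projection):
    """Multivariate environmental similarity (minimum across variables).

    reference:  array (n_ref, n_var) of conditions in the calibration area.
    projection: array (n_cells, n_var) of conditions to be assessed.
    Assumes every variable varies in the reference set (max > min).
    """
    reference = np.asarray(reference, dtype=float)
    projection = np.asarray(projection, dtype=float)
    sims = np.empty(projection.shape)
    for j in range(reference.shape[1]):
        ref, p = reference[:, j], projection[:, j]
        lo, hi = ref.min(), ref.max()
        # f: percentage of reference points with a value below p.
        f = (ref[None, :] < p[:, None]).mean(axis=1) * 100.0
        sims[:, j] = np.select(
            [f == 0, f <= 50, f < 100],
            [(p - lo) / (hi - lo) * 100.0, 2.0 * f, 2.0 * (100.0 - f)],
            default=(hi - p) / (hi - lo) * 100.0,
        )
    return sims.min(axis=1)

def apply_mess_mask(ensemble, mess_values):
    """Set ensemble suitability to 0 where MESS is not positive."""
    return np.asarray(ensemble, dtype=float) * (np.asarray(mess_values) > 0)
```

Negative MESS values identify cells where at least one variable lies outside the range sampled in the calibration area, so masking them removes predictions that would rest on extrapolation.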

| Shifts in habitat range
To explore shifts in habitat range owing to climate changes, we calculated suitable habitat loss and gain for A. vas between (i) past and present distribution, (ii) present and the two future scenarios, respectively, and (iii) present using CHELSA air temperatures and present using soil temperatures. We then created maps with values ranging from −100% to +100%.
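One simple way to obtain such a map, assuming that suitability is expressed on a 0-1 scale and that change is the plain cell-wise difference between two periods, is sketched below in Python. The exact formula used for the published maps is not given in the text, so this is an illustration of the idea rather than the authors' computation, and the function name `suitability_change` is hypothetical.

```python
import numpy as np

def suitability_change(earlier, later):
    """Cell-wise change in suitability, in percentage points.

    Both inputs are arrays of suitability in [0, 1]; the result lies
    in [-100, 100], negative for habitat loss and positive for gain.
    """
    earlier = np.asarray(earlier, dtype=float)
    later = np.asarray(later, dtype=float)
    return (later - earlier) * 100.0
```

Under these assumptions, a cell moving from a suitability of 0.8 to 0.3 is mapped as a loss of about 50%, and a cell that becomes fully suitable from zero is mapped as +100%.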

| Environmental condition ranges and relationship to elevation
We performed a principal component analysis (PCA) to analyse the range of climatic conditions found in the six ecoregions considered accessible to A. vas (i.e. Afrotropics, Antarctica, Central Neotropics, Indomalaya, Oceania and South America) compared with the climatic conditions of the 401 locations in which A. vas occurred. To do so, a set of 100,000 points was randomly selected on all continents worldwide and values of the CHELSA variables included in the model (i.e. bio1, bio2, bio4, bio15, bio16 and bio19) were extracted for points located in the above-mentioned ecoregions. Values of the CHELSA variables included in the model were also extracted for the 401 occurrence points of A. vas. The relationship of A. vas to elevation was assessed by plotting the absolute latitude against the elevation of the 401 occurrence points of A. vas.
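The comparison of occurrence conditions with the available climate space can be sketched in Python as a PCA fitted on standardised background values, onto which the occurrence points are then projected. Whether the published PCA was fitted on the background points alone or on pooled data is not stated, so fitting on the background is an assumption of this example, as are the function names `fit_pca` and `project`.

```python
import numpy as np

def fit_pca(background):
    """PCA on standardised variables via singular value decomposition.

    background: array (n_points, n_variables), e.g. six bioclimatic
    variables extracted at random points of the accessible ecoregions.
    """
    X = np.asarray(background, dtype=float)
    mean, sd = X.mean(axis=0), X.std(axis=0, ddof=1)
    Z = (X - mean) / sd
    _, s, vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)  # share of variance per component
    return {"mean": mean, "sd": sd, "loadings": vt.T, "explained": explained}

def project(model, points, n_components=2):
    """Scores of new points on the first principal components."""
    Z = (np.asarray(points, dtype=float) - model["mean"]) / model["sd"]
    return Z @ model["loadings"][:, :n_components]
```

Plotting the scores of the occurrence points over those of the background points then shows which part of the available climate space the species occupies.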

| Updated distribution map of Apodera vas
The 401 validated occurrences of A. vas (Figure 2a, Table S1) show that this species has a broad distribution in the Southern Hemisphere and intertropical zone. This confirms its absence from mid- to high latitudes of the Northern Hemisphere above 20° N. A. vas reaches higher elevation at low latitudes (20° N-20° S; average 2331 m, SE 127 m, median 2103 m) than at high latitude (>35° S; average 536 m, SE 29 m, median 414 m) (Figure 2b).
Comparing the importance of variables revealed that mean annual air temperature (bio1) and mean annual soil temperature (SBIO1) were the dominant variables in the models (Figure S1). Looking at response curves (Figure S2), the occurrence of A. vas was correlated with all automatically selected variables. The response curves also suggest that this taxon is mainly found in temperate to cold climates. This is also evident from the relationship between latitude and elevation in our records (Figure 2b).
Single algorithm and ensemble model performances were high for both variable sets (ensemble average 'Score': 0.86 for the 'air' and 0.84 for the 'soil' temperatures variables sets, respectively; see Table S3 for single algorithm and other evaluation metrics). The coefficient of variation of habitat suitability maps for A. vas shows generally low uncertainty of model predictions where A. vas is known to occur (Figures S4 and S6B). The areas with the highest uncertainty of predictions were mostly located outside the area considered accessible to A. vas (Figures S4 and S6B), and stand out as being mainly unsuitable for A. vas in the ensemble projections (Figures S3 and S6A).
To obtain the final maps, the ensemble projections (Figures S3 and S6A) were filtered with MESS results (Figures S5 and S6C). The MESS map for the LGM shows some areas (the East African Mountains, much of the southern half of South America, most of Australia and most of southern Africa with the exception of the eastern coastal area) as nonanalogous. This explains the projected unsuitability for LGM, and thus the major range shifts between LGM and present in these areas (Figure S8). Regardless of the set of variables or the climate scenario, the final weighted maps (Figure 4, Figure S7) differ only slightly from the raw ensemble projection, indicating that our models did not extrapolate outside the range of climatic conditions contained in the calibration area (Elith et al., 2010).

| Current, past and future potential distributions of Apodera vas
The PCA using the six bioclimatic variables included in the air temperature models (i.e. bio1, bio2, bio4, bio15, bio16 and bio19) shows a general overlap of environmental condition ranges found in the major geographical zones (Figure 3). South America has the largest range of environmental conditions, while Antarctica is on the margin of the overall distribution. The environmental conditions in which A. vas was found (represented by points) cluster homogeneously and cover only a part of all climatic conditions occurring across the different ecoregions, which is in line with the modelled distribution maps.
Our prediction maps show that A. vas could, under present climatic conditions, potentially occur on all continents with highest predicted
FIGURE 3 Biplot of principal component analysis (PCA) of bioclimatic variables included in the air temperature models (i.e., bio1, bio2, bio4, bio15, bio16, and bio19). The coloured points correspond to the environmental conditions found at the 401 occurrence points of Apodera vas included in the models. Coloured lines delimit the range of environmental conditions in the six ecoregions considered accessible to A. vas. The grey area corresponds to the range of environmental conditions in the whole area considered accessible to A. vas (i.e., the overall environmental condition range in the six ecoregions). Correlations among the six variables included in the PCA are shown in the correlation circle.
The comparison between current and future climate scenarios shows further strong range reduction across all regions, with only minor contrast between SSP3 and SSP5 projections (Figure 4b vs. 4c, Figure S7B). While A. vas is still predicted to occur on all continents and major regions, the suitability of climatic conditions decreases almost everywhere and the connection between suitable areas is lost or strongly reduced. Indeed, A. vas occurrences only increase at very high latitudes: Svalbard, Greenland, Iceland, Alaska and the Antarctic Peninsula.

| Updated distribution map of Apodera vas: taxonomy as possible caveats
The 401 validated occurrences of A. vas (Figure 2a) represent a substantial increase in comparison to the 46 sites included in the analysis of Smith and Wilkinson (2007), which also included several errors. The most critical points are discussed in Text S1. Three possible caveats should be mentioned here.
The first caveat is that some published records may be wrong due to misidentification. For example, A. vas resembles Lagenodifflugia vas (Finlay et al., 2004), an unrelated species (Mitchell & Meisterfeld, 2005), in general shape, and we suspect this confusion could explain the unconfirmed record of A. vas from mainland Antarctica (Murray, 1910; Penard, 1911).
The second caveat is that it is impossible to account for cryptic or pseudocryptic diversity when using morphology-based records. Cryptic or pseudocryptic diversity is commonly reported within broadly defined morphological species of protists and other micrometazoa (de Vargas et al., 1999; Foissner et al., 2001; Fontaneto et al., 2011; Fucikova & Lahr, 2016; Kosakyan et al., 2012; Leasi et al., 2013; Singer et al., 2015; Skaloud & Rindi, 2013). It has long been suspected that A. vas is a species complex (Penard, 1911; Zapata & Fernandez, 2008), and this is supported by the recent finding of a long-forgotten yet highly conspicuous species within genus Apodera in New Zealand and the associated high genetic variability within morphotypes identified as A. vas in New Zealand and Macquarie Island (Duckert et al., 2021).
Thus, relying on morphological identification and partly unverifiable sources carries the risk of overestimating the ecological or climatic niche of a species. This caveat suggests that, if our predicted distributions for A. vas were biased, this bias would be towards an overestimation of its distribution, with each (pseudo)-cryptic species potentially having a smaller potential geographical distribution as well as a more restricted ecological niche. However, as shown in Figure 3, there is a general overlap in the climatic characteristics of the records in the different major geographical zones. It is nevertheless likely that within each zone several species exist, each of which may have a somewhat different climatic niche. We therefore regard our results as being valid for A. vas as a species complex.
The third caveat is that, despite the fact that A. vas is arguably the best documented terrestrial protist, many regions remain unexplored. We are nevertheless confident that the 401 occurrences cover most of the climatic niche of the species.

| Model parameters and performance
As soil moisture is a key factor controlling the productivity (Lousier, 1974a, 1974b) and community structure (Koenig et al., 2018) of soil protists, we expected bioclimatic variables related to precipitation to significantly explain a high fraction of the distribution of A. vas for both variable sets. Also, A. vas being commonly found in peatlands, which only develop in flat land or shallow slopes, we expected topography (i.e. TPI) to emerge as a strong predictor in our models.
However, this was not the case as temperature emerged as the most important variable (bio1 and SBIO1, respectively) (Figure S1).
This apparent paradox could be explained in several ways: (1) Precipitation only partly predicts soil moisture (Mod et al., 2016; Piedallu et al., 2013; Scherrer & Guisan, 2019), while temperature, in addition to directly influencing biological activity, also partly controls soil moisture through its effect on evaporation and evapotranspiration (Seneviratne et al., 2010). Indeed, soil moisture may remain high even when rainfall is low, under lower temperatures, which could explain the relationship between latitude and elevation in the occurrences of A. vas (Figure 2b). (2) Soil moisture is also determined by local factors such as microtopography (Lembrechts et al., 2019, 2020), soil texture, shading, or interception by tree foliage, all of which vary substantially at the local scale (i.e. at a finer scale than captured by the resolution of the models), decreasing the pertinence, and hence the predictive power, of precipitation for soil biotic communities (le Roux et al., 2013; Lembrechts et al., 2020). (3) The relatively low importance of precipitation-related variables and TPI may be linked to the spatial resolution of the data, which is too coarse to take microclimate and microtopography into account. Indeed, even the finer 1 km² spatial resolution used in this study may fail to capture the precise conditions associated with an occurrence in topographically contrasted regions. A small peatland may for example develop on a small flat surface along a mountain slope. Similarly, but to a lesser extent, precipitation or temperature data for a given 1 km² grid cell may reflect the conditions occurring mostly at the base of a mountain, which may be quite different from those of the forest growing 500 or 1000 m higher along the slope but still within the same 1 km² grid cell. Alternatively, conditions may be cooler (allowing higher soil moisture) in deep gorges in a relatively arid region, or wetter than would be predicted by climate in groundwater-dependent ecosystems (Kløve et al., 2011). (4) Finally, precipitation data are more difficult to model than temperature data (Karger et al., 2017). As a result, the predictive power of precipitation variables may be reduced, which could partly explain their lesser importance in our models.
Despite this unexpected result, the model performance was high (ensemble 'Score' ≥ 0.84 for each variable set; Table S3) and allowed us to test the hypotheses of this specific study.

| Current potential distribution of Apodera vas
Both potential current distribution maps of A. vas match well the known occurrences of A. vas in the Southern Hemisphere and part of the tropics (Figures 2a and 4a,b).The differences between the prediction maps based on current models using soil and air temperature (respectively Figure 4a,b) confirm the usefulness of including soil climate data to determine which data set is most appropriate for soil microorganisms.Indeed, soil and air temperature can differ substantially: (1) Mean annual soil temperature is 3°C warmer on average with differences among biomes being 3.6°C warmer in cold and/or dry biomes and 0.7°C cooler in warm and humid environments, and (2) in the temperate forest biome, soil temperature was lower (on average −0.8°C) in forested habitats, but warmer (on average +1.8°C) in non-forested habitats (Lembrechts et al., 2022).
As a result, while the general patterns at the global scale were similar, some clear differences can be seen when analysing specific regions. When comparing the two results, the model based on soil temperature is ecologically more plausible: while the map based on air temperature suggested that some coastal and arid regions of the Mediterranean, Atlantic coast of North Africa, Southern and Baja California were more suitable, the map based on soil temperature instead showed that the most suitable regions were the mountains where indeed more favourable habitats such as mixed forests exist (or could potentially exist without human impact), which better reflects the known ecological preferences of A. vas. The fact that differences between the two models are especially marked in the temperate regions of the Northern Hemisphere may be due to the very strong impact of land use on current vegetation, which the soil temperature data account for, while the air temperature data do not.
However, a current limitation of this database, and possible cause for the lower contrast between the two models for the Southern Hemisphere, is the imbalance in data between regions.Indeed, the dataset of Lembrechts et al. (2022) is strongly dominated by data from the USA and Western Europe.There are no points in New Zealand, Madagascar, very few in Asia and none in SE Asia, only two in (Eastern) Australia, six in Africa and two in South America excluding Chile and Argentina.Despite these limitations, the comparison of the two maps clearly confirms the usefulness of the soil temperature database as soil temperatures better reflect soil conditions compared to air temperatures.From this we conclude that, even if the evaluation parameters for the model based on soil temperatures were slightly lower than those for the model based on air temperatures (ensemble average 'Score': 0.86 for the 'air' and 0.84 for the 'soil' temperatures variables sets, respectively; see Table S3), predictions of suitable habitats based on soil temperatures seem to take better account of the ecological preferences of soil microorganisms (e.g. in terms of soil moisture, see also Section 4.2).
Our results are valuable for targeting potential sampling sites for A. vas, especially the sub-Saharan African mountain ranges (i.e. from the highlands of Ethiopia to the mountains of Southern Africa and the Cape region, and the highlands around the Congo basin; Figure S9), the (remnants of the) Atlantic Forest in Brazil, the mountains from Central America to Tierra del Fuego and the mountains of SE Asia (Figure 4a).
Sampling in these places would first allow testing the validity of our models and, if the species is found, isolating specimens for DNA barcoding and phylogeographical analysis. As other microbial taxa may exhibit a distribution similar to that of A. vas, sampling targeting this taxon would allow us to assess phylogeographical patterns in several microbial groups, such as other Hyalospheniidae (e.g. genera Nebela, Padaungiella, Alocodera and Certesella), ciliates (Kumar & Foissner, 2016) or bdelloid rotifers (Fontaneto & Ricci, 2006). Such a multigroup approach would be useful to clarify the extent of microbial diversity and biogeography more generally.

| Discrepancy between potential and documented distribution of Apodera vas and likely dispersal mechanisms
The absence of documented records for A. vas in well-studied regions of the Holarctic with a high probability of occurrence, such as Britain, Ireland, Iceland, and the Pacific Coast of Canada, is remarkable (Figure 2a). Indeed, favourable habitats such as peatlands and forests with well-developed humus are widespread in these regions (Figure S10). The absence of A. vas from these regions can therefore be interpreted as evidence for limited LDD. The British Isles are arguably the most intensively studied region for the current and past (palaeoecology) diversity of testate amoebae, but A. vas has never been reported there, except from horticultural Sphagnum mosses imported from New Zealand to be used in decorative hanging baskets (Wilkinson, 2010). This absence was interpreted as a sign of limited dispersal (Smith & Wilkinson, 2007), and our modelling results bring further support to this interpretation through an expanded compilation of occurrence data and clear, quantitative predictions.
Evidence for LDD by birds of bryophytes, and of a wide range of organisms including eight families of terrestrial vascular plants, insects and invertebrates (Lewis et al., 2014), suggests that LDD of testate amoebae and other protists through zoochory should at least be possible, if not common. But propagule size matters, as does the degree to which birds are specific to a certain habitat type (e.g. freshwater only) (Figuerola & Green, 2002; Green et al., 2023). Indeed, dispersal may be most likely at a local to regional scale and for organisms living in aquatic ecosystems and wetlands. For example, a high abundance and diversity of diatoms were recovered from snipe (Gallinago gallinago) (Wuthrich & Matthey, 1980). Yet, even for aquatic and wetland diatoms, a high degree of endemism was observed in New Zealand (Kilroy et al., 2007). This result was interpreted as evidence for the dynamic equilibrium model, according to which the maintenance of endemism is more likely in stable and unproductive environments (Huston, 1979), which are also characteristics of the environments in which A. vas is found.
Dispersal by birds may be much less likely for forest soil protists, as it would require a bird species to trap soil protists in its feathers while foraging on the ground, which is less likely than for a bird standing in water, and then to migrate over long distances. Forest birds have nevertheless been shown to transport viable bryophyte propagules (Chmielewski & Eppley, 2019). Furthermore, disjunct distributions of terrestrial tardigrades matching patterns of bird migration between North and South America suggest long-distance transport by birds. The presence of tardigrades in bird nests built from lichen supports this idea (Mogle et al., 2018). To our knowledge, no such study exists for protists, and the available data clearly show that A. vas is absent from intensively studied, favourable habitats in the temperate regions of the Northern Hemisphere.
The presence of A. vas on many peri-Antarctic islands, most of which are young and of volcanic origin, contrasts with the lack of latitudinal LDD and is evidence for either wind dispersal due to the extremely strong winds occurring at these latitudes, bird dispersal, or transport on rafts of terrestrial vegetation. Global wind patterns indeed make it more likely for microscopic organisms to be transported over long distances in the 40-70° S latitudes, where strong winds blow almost constantly with low seasonality (median wind speed of 10-12 m s⁻¹ and a 90th percentile wind speed of 15-18 m s⁻¹; Derkani et al., 2021), than across the equator, where the intertropical convergence zone and trade winds converging towards the equator strongly limit the potential for aerial dispersal between the Northern and Southern Hemispheres. A modelling study, however, suggests that even in the Southern Ocean, LDD of particles of 60 μm or larger such as A. vas is very unlikely (Wilkinson et al., 2012).
The birds most likely to carry testate amoebae to distant islands are seabirds nesting on islands. However, this would imply that they stop on several islands, which may not be common given the low rate of hybridization in seabirds (Phillips et al., 2018), and that the amoebae would remain viable even if immersed in sea water. This seems unlikely given empirical evidence showing the strong negative effect of sea salt on the abundance and diversity of terrestrial testate amoebae (Whittle et al., 2022). The latter observation also makes transport on rafts of terrestrial vegetation unlikely.
Alternatively, and perhaps more likely, A. vas could have been carried by terrestrial or freshwater wetland birds, which rarely fly to islands but, at least for the former, cannot land on the sea to rest. Indeed, even isolated islands such as Kerguelen and Macquarie have native duck species, and several strictly terrestrial species are occasional vagrants (Catard, 2001), South Georgia even having a native pipit species.

| Current, past and future potential distribution of Apodera vas and implications for microbial biogeography and phylogeography
As the soil temperature data are currently only available for the present (Lembrechts et al., 2022), comparison of past, present and future suitability maps can only be made with the air temperature data. While keeping in mind the limitations of these models, some clear patterns emerge and seem convincing enough. Our models predict substantial range contraction and expansion over glacial-interglacial cycles, despite the limitations of the LGM model (Figure 4b, Figure S7A). Such changes likely drive complex phylogeographical patterns, as reported for Hyalosphenia papilio, a Sphagnum-dwelling hyalospheniid testate amoeba taxon common in Northern Hemisphere peatlands and similar in size to A. vas (Heger et al., 2013; Singer, Mitchell, et al., 2019). Further marked range contraction is predicted in response to on-going climate warming (Figure 4b,c, Figure S7B): potentially favourable areas in South America, Africa and SE Asia may move outside of the climatic niche space of A. vas. As these regions have been subject to intense deforestation in the past decades, the potential surface of habitats favourable for A. vas has already strongly declined. Our modelling results show that climate change will contribute to further strong reduction, similar to findings reported, at a more regional scale, for soil microorganisms in the Swiss Alps (Mod et al., 2020, 2021). This predicted contraction suggests that A. vas may become rare or may even go extinct in some regions under both future scenarios (Figure 4c, Figure S7B).

| Linnean and Wallacean shortfalls in the study of soil microbial biodiversity
The study of soil biodiversity in general, but especially that of microscopic organisms including protists, suffers from two main curses: the Linnean shortfall (most species remain undescribed) and the Wallacean shortfall (the geographic distribution of known species is incomplete) (Hortal et al., 2015). The Linnean shortfall is illustrated by general estimates of the 'known and unknown' diversity (Chao et al., 2006; Decaens, 2010; Foissner, 1999), as well as by studies reporting new protist species, often from specific habitats (Foissner, 2010; Pérez-Juárez et al., 2017). Clearly, much remains to be done to clarify the taxonomy of most microbial groups (Cotterill & Foissner, 2010).
Taxonomic uncertainty in turn hinders biogeographical inference (Caron, 2009; Heger et al., 2009) and, together with very patchy sampling, contributes to the Wallacean shortfall, the scarcity of geographic distribution data (Hortal et al., 2015). Many species are known from only a small number of localities, often within a given region (Heger, Booth, et al., 2011), but sometimes there is a puzzling disjunct distribution with reports from distant regions (Bourland, 2017; Nicholls, 2015). As most of these records are based on morphological data only, strange patterns may hide cryptic or pseudo-cryptic diversity, emphasising the need for more taxonomic studies combining morphological and molecular approaches (Foissner, 2008; Foissner et al., 2001; Heger et al., 2014; Singer et al., 2018).

| SDMs to predict cryptic diversity, distribution, invasion risk and threat for biodiversity conservation
As was shown for Hyalosphenia papilio, a ubiquitous hyalospheniid testate amoeba common in Northern Hemisphere Sphagnum-dominated peatlands (Foissner et al., 2001; Heger et al., 2013; Singer, Mitchell, et al., 2019), a single morphospecies may correspond to numerous distinct species. As each of these may occur only in a fraction of the whole range of the morphospecies, some of this diversity may indeed be threatened and, as such, worthy of concern to the same degree as macroscopic plants or animals.
There is no reason to believe that the pattern of geographically structured cryptic diversity observed for H. papilio would not also be found in A. vas. Quite the contrary: the distribution of A. vas is disjunct due to the distribution of land masses in the Southern Hemisphere and, within continents, it is much patchier than that of H. papilio due to the ecological preference of this taxon. The predicted distribution of this morphospecies therefore suggests the existence of a diverse species complex. As our models show that many currently favourable locations will become unsuitable for A. vas in a warmer future, A. vas should be considered at risk of becoming locally extinct, and if these local populations correspond to genetically distinct species, this would mean a net loss of species diversity (Bickford et al., 2007). If we are to understand the true morphological and molecular diversity within this taxon, it is therefore urgent to study such relict isolated populations.
Global warming is a recognised major threat to plant (Harter et al., 2015) and animal (Díaz et al., 2019) biodiversity, especially in places such as oceanic islands from where migration to more suitable habitats is limited. Our data suggest that it may also cause a loss of diversity in free-living protists, and this study therefore contributes to ongoing discussions about the possible impacts of global warming on the diversity of microscopic organisms (Averill et al., 2022). Considering other components of global change, it is likely that many microbial species have already disappeared and many more are currently threatened. Such a loss has only very rarely been documented for protists. One such example is Nebela carinatella, a highly conspicuous species described as subfossil from Subatlantic peat deposits in Belgium but absent from the communities living at the surface (Beyens & Chardez, 1982). A record from China from an aquatic habitat, for which we could not find any illustration, is not considered here (Yang et al., 2004). The loss of microbial diversity represents a potential threat to ecosystem functioning through disruptions of food webs (Heleno et al., 2020). It is therefore now urgent to invest in taxonomy to better document this mostly unknown diversity, evaluate how much of it is threatened (i.e. create red lists for soil organisms including microorganisms) and determine what the ecological consequences of their disappearance would be.
Our modelling results also have implications for the possible invasion risk of free-living microbial species. Indeed, no living specimen of A. vas has yet been observed in the British Isles, but should some specimens survive the transport and conditioning of the horticultural peat, A. vas could likely colonise the British Isles and other places such as Japan, which also imports Sphagnum from New Zealand (Wilkinson, 2010). Our models clearly show that the climate would be favourable for its development (e.g. maximal temperature during the warmest quarter below ca. 20°C). This suggests that importation of mosses and soil, including through horticultural products, should be more strongly regulated to prevent the spread of potentially invasive soil microorganisms, a topic that has not yet received much attention (Thakur et al., 2019).
Protists and microorganisms in general are rarely considered to be of any concern for biodiversity conservation despite their major ecological roles and huge diversity (O'Malley, 2014; Wilkinson & Smith, 2006). The almost complete exclusion of microorganisms from conservation efforts is partly due to a historical focus on macroscopic organisms, which ignores large parts of biodiversity and has resulted in a general lack of expertise on microbial biodiversity (Wilkinson, 1998). But it is also likely due to the belief that, owing to their high abundance and dispersal potential, it is all but impossible for any microbial species to go extinct (Fenchel & Finlay, 2003). However, even within the microbial world, most recent attention has been on bacteria and fungi, compared with the relatively understudied protists (Caron, 2009; Geisen et al., 2018). Reaching a better balance in the study of different groups of organisms is a clear challenge, as is bringing microorganisms and especially protists to the attention of conservationists. Apodera vas is not just a scientific curiosity; the

Pictures and microphotographs of Apodera vas. (a) Drawing by Adrian Certes in the original description of Nebela vas in 1889 (Certes, 1889), (b) light microscopy of A. vas from Chile (DIC), modified from Fernández et al. (2015), (c) light microscopy of A. vas from New Zealand (DIC, extended depth focus image), (d, e) scanning electron microscopy of A. vas from Tanzania (Mitchell & Meisterfeld, 2005). Scale bars: b and c = 20 μm, d and e = 50 μm. Images b-e by E. Mitchell. Image b reproduced (modified) from Fernández et al. (2015); image c unpublished; images d and e reproduced from Mitchell and Meisterfeld (2005).
401 curated A. vas occurrences (Figure 2a), we built bioclimatic niche-based SDMs and determined its potential distribution worldwide and at high resolution (30 arcsec) according to current climate, future climate scenarios (IPSL-CM6A-LR predictions for 2071-2100, shared socio-economic pathways (SSP) 3 and 5) and past climatic conditions during the last glacial maximum (LGM, i.e., 21 ± 3 ka BP; PMIP3 IPSL-CM5A-LR). Our general goal was to assess the usefulness of climate niche models for research on terrestrial microbial biodiversity and biogeography, with possible applications in ecosystem functioning and conservation. Our specific objectives were to compile an updated distribution map of A. vas, build a climate niche model and, using this model, predict the current, past and future potential distribution of A. vas in the absence of any dispersal barriers. Based on these results, we then aimed to (i) assess whether the distribution of A. vas is best explained by dispersal (i.e. matching its climatic niche) or vicariance (i.e. palaeogeography), (ii) evaluate the magnitude of potential range expansions or contractions in response to past and on-going climatic changes and (iii) predict phylogeographic patterns, the likelihood of cryptic diversity and possible threats to this yet undescribed diversity from climate and land-use changes.
FIGURE 2 (a) Geographical position of 401 validated occurrences of Apodera vas compiled from the literature and personal observations. Grey areas correspond to the potential distribution in which background points were randomly selected for the modelling. (b) Scatterplot of the absolute value of latitude versus elevation of 401 A. vas geographical records. The black line shows a linear regression and confidence intervals.
and future (i.e. 2071-2100) periods. Two future climate scenarios representing alternative global change projections were considered: IPSL-CM6A-LR SSP3 and SSP5 (O'Neill et al., 2017). Past climate data were obtained from CHELSA v1.2 PMIP3 IPSL-CM5A-LR (bio1-bio19; Karger et al., 2017, 2018), which corresponds to ca. 21,000 years BP, that is, the Last Glacial Maximum period. The Topographic Position Index (TPI) was included as a topographical variable. By comparing the elevation of a central grid cell to the mean elevation of a predefined neighbouring area, the TPI provides information on terrain classification (e.g. ridge or hilltop, middle slope, valley bottom and flat areas) (De Reu et al., 2013).
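The TPI calculation described above can be sketched as follows. This is only an illustration: the function name, the square moving window, the `radius` parameter and the truncation of the window at grid edges are assumptions, since the neighbourhood actually used for the models is not specified here.

```python
def topographic_position_index(dem, radius=1):
    """Return TPI for each cell of a 2-D elevation grid (list of rows).

    TPI = elevation of the cell minus the mean elevation of the other cells
    in a square window of half-width `radius` (assumed window shape).
    The window is truncated at the grid edges (assumed edge handling).
    The grid must contain at least two cells.
    """
    n_rows, n_cols = len(dem), len(dem[0])
    tpi = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            # Collect every cell of the window except the central one.
            neighbours = [
                dem[rr][cc]
                for rr in range(max(0, r - radius), min(n_rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(n_cols, c + radius + 1))
                if (rr, cc) != (r, c)
            ]
            row.append(dem[r][c] - sum(neighbours) / len(neighbours))
        tpi.append(row)
    return tpi


# Illustrative elevations (m): a single summit surrounded by lower ground.
example_dem = [
    [100, 100, 100],
    [100, 180, 100],
    [100, 100, 100],
]
example_tpi = topographic_position_index(example_dem)
```

Positive values indicate cells higher than their surroundings (ridges, hilltops), negative values indicate cells lower than their surroundings (valley bottoms), and values near zero indicate flat areas or constant slopes.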
The MESS analyses result in an index evaluating the uncertainty related to extrapolation in models. Positive MESS values represent climatically analogous areas compared to the calibration zone. In contrast, negative MESS values indicate climatically non-analogous areas compared with the calibration zone, that is, grid cells for which one or more environmental predictors fell outside the range of environmental values encountered in the calibration zone.
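To make the sign convention concrete, the sketch below implements a commonly used formulation of the multivariate environmental similarity surface (MESS) for a single grid cell. It is not necessarily the implementation used for the analyses reported here; the function names and the example values are hypothetical, and each predictor is assumed to vary within the calibration zone.

```python
def mess_similarity(value, reference):
    """Similarity of one predictor value to the calibration (reference) values."""
    lo, hi = min(reference), max(reference)
    # Percentage of calibration values strictly smaller than the cell value.
    f = 100.0 * sum(r < value for r in reference) / len(reference)
    if f == 0:
        # At or below the calibration minimum: zero or negative.
        return (value - lo) / (hi - lo) * 100.0
    if f <= 50:
        return 2.0 * f
    if f < 100:
        return 2.0 * (100.0 - f)
    # Above the calibration maximum: negative.
    return (hi - value) / (hi - lo) * 100.0


def mess(cell, reference):
    """MESS of a grid cell: the minimum similarity over all predictors."""
    return min(mess_similarity(cell[name], reference[name]) for name in reference)


# Hypothetical calibration values for two bioclimatic predictors.
calibration = {
    "bio1": [0, 10, 20, 30, 40],
    "bio12": [500, 1000, 1500, 2000, 2500],
}
analogous_cell = mess({"bio1": 20, "bio12": 1500}, calibration)
non_analogous_cell = mess({"bio1": 50, "bio12": 1500}, calibration)
```

Because the index takes the minimum over predictors, a single predictor outside the calibration range is enough to make the value for a grid cell negative, which matches the interpretation given above.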
Figure S7A) reveals range contractions of A. vas, particularly clear in NE Brazil, the Congo basin, the lowlands of SE Asia, and eastern Australia, and some increases, mostly in mountain regions. A caveat here is that the predicted past total absence of A. vas in many large stretches of land (e.g., most of the Southern half of South America, Eastern African mountains, much of Southern Africa) is due to limitations of the model and should therefore not be interpreted. The predicted ranges nevertheless appear significantly reduced between LGM and current conditions.
Our results illustrate how useful SDMs are for identifying areas where taxa are potentially threatened by direct impacts (e.g. land-use changes) and climate change. However, it should be remembered that predictions from SDMs are nothing more than well-informed hypotheses (Lee-Yaw et al., 2022) that should be tested by sampling in areas identified as potentially favourable or not in different regions. If such ground truthing confirms the model predictions for the present, we will gain more confidence that the projections for the future are valid.