The role of dispersal limitation in the forest biome shifts of Europe in the last 18,000 years

Aim: How the ability of plants to move towards newly favourable habitats (dispersal limitation) impacts the change of biome distribution and transition under fast climate warming is still debated. Analysing vegetation change in the past may help to clarify the relative importance of underlying ecological processes such as climate, biotic interactions, and dispersal. In this study, we investigated how dispersal limitation affected the distribution of European forests in the last 18,000 years. Location: Southern and Central Europe. Taxon


| INTRODUC TI ON
Climate is typically considered the main driver of species' distribution where individual bioclimatic tolerances determine the potential range of occupancy (potential niche) (Pearson & Dawson, 2003;Williams et al., 2021).Additionally, a number of ecological processes may bring species disequilibrium with the climate forcing and cause temporal and spatial lags in the occupancy of the potential niche.These processes include evolutionary adaptation (Davis et al., 2005), biotic interactions (both competition and facilitation) (Sirén & Morelli, 2020), and migration (Urban et al., 2012), which in turn may be affected by soil conditions, species-specific dispersal ability, and population growth rate and size, as well as competing species.
By studying the paleo-vegetation change in response to past climatic fluctuations, we can gain insights into the relative importance of different drivers of species distribution.While we are not expecting equal rates of vegetation change in the future, the ecological mechanisms underlying range shift dynamics in the past will also regulate species' response to future climate change (Fordham et al., 2020).Therefore, analysing vegetation response to past climatic periods that experienced similar warming rates to forecasted temperature changes across the 21st century will improve our ability to predict the effect of future climate change.
Paleo-reconstructions through fossil records and dynamic modelling show pronounced fluctuations in both vegetation cover and temperature regime spanning the period from the Last Glacial Maximum (LGM; 21-18 ka BP, ka BP: 1000 years before present, where 'present' corresponds to the year 1950 CE) until the current time (0 ka BP), especially across the global warming after the Last Deglaciation (LD; 20-10 ka BP).Following the ice sheet retreat from high latitudes (~17 ka BP), the Northern Hemisphere experienced two rapid temperature rises with the beginning of the Bølling-Allerød warming event (BA; 14.8-12.8ka BP) and by the end of the cold period of the Younger Dryas (YD; 12.8-11.7 ka BP) (Osman et al., 2021).Warming rates at these two time periods reached up to 10°C in a few decades, similar to what is projected for polar areas under the RCP 8.5 future scenario (Fordham et al., 2020;Renssen & Isarin, 2001).Archives of fossil pollen and macrofossils record a shift of vegetation boundaries across the Last Deglaciation, mostly as an expansion of forests into newly available open habitats after ice sheet retreat in the Northern Hemisphere (Cao, Tian, Dallmeyer, & Herzschuh, 2019;Dallmeyer et al., 2019Dallmeyer et al., , 2022;;Githumbi et al., 2022;Harrison, 2017).
A common trend that has been observed in both dynamic vegetation models and climate models including vegetation feedback (Dallmeyer et al., 2022;Harrison et al., 2015;Liu et al., 2014;Tian & Jiang, 2013) is that the onset of post-glacial forests was simulated earlier than expected by paleo-records.It has been argued that the lack of migration processes in the model structure may explain the abrupt expansion of simulated post-glacial forests after climate amelioration (Dallmeyer et al., 2022).Models in which plant establishment is unconstrained by seed availability assume climate-vegetation equilibrium, where plants can establish as soon as climatic conditions become favourable.Thus, a vegetation model assuming no dispersal limitation will be unable to reproduce a potential multi-millennial temporal delay in the forest expansion of the Last Deglaciation.
Here we investigate how dispersal limitation impacted the biome distribution in Europe during the last 18,000 years.For this purpose, we simulated the vegetation change (as represented by 20 major European tree species plus 3 shrubs and generic grass) across southern and central Europe at a fine spatial scale (0.01° longitude/ latitude) from the Last Deglaciation to the current time (18-0 ka BP).
We ran simulations using two dispersal modes, with plant establishment depending on local climatic conditions, inter-specific competition, and assuming either (1) an unlimited availability of seeds at all locations (free dispersal mode) or (2) availability determined by seed dynamics, i.e., germination, survival in the soil bank and dispersal from mature individuals (dispersal limitation mode).To do so we used the migration module of the dynamic vegetation model LPJ-GM 2.0, where seed dynamics are explicitly simulated to account for dispersal limitation during species' spread along with interspecific competition (Lehsten et al., 2019;Zani et al., 2022).Finally, we compared these simulations against the biomes reconstructed from pollen data.

| ME THODS
2.1 | Paleo-vegetation simulations with LPJ-GM 2.0 2.1.1 | The dynamic vegetation model LPJ-GM 2.0 LPJ-GM 2.0 (Lehsten et al., 2019;Zani et al., 2022) couples a dynamic migration module to the dynamic global vegetation model, LPJ-GUESS 4.0 (where LPJ-GM is short for LPJ-GUESS-MIGRATION), thus allowing species to migrate simultaneously while interacting with each other.LPJ-GUESS employs a gap model approach for the simulation of population dynamics.Physiological processes of plants (e.g., photosynthesis, carbon allocation) and biogeophysical processes (e.g., evapotranspiration, soil hydrology) are linked by water, carbon and nitrogen exchanges in response to input climate forcing (Smith et al., 2001(Smith et al., , 2014)).Simulated plant functional types (PFTs) based on growth form (tree, shrubs, grasses), attributes of life-history strategy (shade tolerance), phenology (evergreen or summergreen), physiology, and bioclimatic limits (e.g., minimum temperature for survival) represent plant species (e.g., Abies alba) or biogeographically distinct groups of taxa (e.g., boreal needle-leaved evergreen trees, BNE).Establishment, growth and mortality are subsequently modelled annually for each PFT as cohorts of interacting species in local communities (patches), which are contained in areas experiencing the same climate forcing (grid cells).These dynamics are influenced by both bioclimatic limits for establishment and survival and resource competition among species cohorts.Competition for light and nutrients is simulated asymmetrically based on plant biometrics, i.e., individuals with taller canopies and bigger root systems have an advantage in resource-limited environments.On the contrary, water is distributed equally among individuals based on water availability and the species' root fraction for each soil layer.Specifically, the soilwater-vegetation dynamics in LPJ-GM 2.0 follows the description by Smith et al. (2014) and the water scheme in Gerten et al. (2004), where soil is defined in multiple layers and according to specific traits (clay/silt/sand fraction, organic matter, C: N ratio and bulk density), and the proportion of water uptaken by plants is given by water availability (derived from input precipitation), evapotranspiration and species-specific rooting profiles.Interspecific competition for light is determined by the degree of shade tolerance of each species, whereas physiological tolerances are defined by PFT-specific bioclimatic limits, which in turn are estimated based on contemporary plant distributions.
Seeds are assumed to be always present across all simulated cells (free dispersal mode) in the LPJ-GUESS model.Thus, PFTs can establish given suitable environmental conditions as defined by PFT-specific bioclimatic limits and shade tolerance.LPJ-GM 2.0 implements a dispersal mode, which allows to simulate of explicit seed dynamics and dispersal across space to limit seed availability (dispersal limitation mode).For the initialization, it simulates a period of free establishment by specifying a species-and cell-specific starting year of dispersal limitation.The dispersal mode simulates migration annually through four processes: (1) seed production, (2) seed dispersal via a dispersal kernel, (3) seed bank dynamics and (4) seedling establishment based on the available seeds in the soil bank.Dispersal traits are represented by migration parameters as follows: generation time by the size to maturity (the tree height at which a species starts to produce seeds); growth rate by the seed fecundity and germination rate; dispersal efficiency and mechanism (e.g., wind-or animal-dispersed) by the dispersal kernel shape (Zani et al., 2022).

| Simulation settings and climatic input
The spatial domain of our simulations is divided into grid cells of 0.01° longitude/latitude resolution (ca. 1 km) and spanning southern and central Europe (Longitudes: −10.75-26°;Latitude: 36-54°), thus including the Pyrenees and the Alps as glacial refugia, barriers, and areas of (de)glaciation (Figure S1 and Supplement S1.1).Generic vegetation disturbances, representing wind throw, avalanches or insect outbreaks, were simulated by introducing a stochastic, local stand-destroying event with a return interval of 400 years.This interval value is an estimate of the natural disturbance interval of contemporary forests (Pugh et al., 2019).Due to uncertainties of fire patterns in the past, no specific fire module was used (although the effects of fire are partially covered by the generic disturbance).
Simulations were performed for a total of 18,500 years with 500 years of spin-up time, starting from 18.5 ka BP, which roughly corresponds to the coldest period of the Last Glacial Maximum (LGM) before the ice sheet retreat from northern Europe (Armstrong et al., 2019).The simulation time includes the Late Holocene (4.25-0 ka BP; LH), the Middle Holocene (8.236-4.25 ka BP; MH), the Early Holocene (11.7-8.236ka BP; EH), the Younger Dryas (12.9-11.7 ka BP; YD), the Late Glacial Interstadial or Bølling-Allerød interstadial (15.0-12.9ka BP; BA), and the Oldest Dryas (18.5-15.0ka BP; OD) (Walker et al., 2012).Following the default version of LPJ-GUESS (Sitch et al., 2003;Smith et al., 2001), we set an initialization period of 100 years (18.5-18.4ka BP), during which all plant species could establish and grow with no nitrogen (N) limitation to equilibrate the N pools of the soil (Smith et al., 2014).After the initialization period, all individuals were killed so that plant succession could start from the bare soil with equilibrated N and C pools.Each species was allowed to establish freely (i.e., with no dispersal limitation) in grid cells with suitable bioclimatic conditions for a further 400 years .This cold period of free establishment allowed species to occupy their potential range (i.e., glacial refugia) according to their bioclimatic limits.Global model parameters are listed in Table S1.1.
We simulated 24 PFTs, including 20 tree species, two Mediterranean shrub species, one Mediterranean shrub PFT, one alpine/boreal shrub PFT, and one generic grass PFT (Table A1 and Supplement S1.2).This species list corresponds to the representative European species selected by Hickler et al. (2012), with the addition of Alnus glutinosa, Acer campestris and Larix decidua, which were significantly present in the pollen data.Species-specific parameters of the selected taxa were provided by Hickler et al., 2012, with the exception of Alnus glutinosa and Acer campestris, which follow the values from Allen et al. (2010), and Larix decidua, which were parameterised by Scherstjanoi et al. (2014).Species-specific parameters describing the bioclimatic limits for survival and establishment (see Table S1.2) were newly estimated based on current distributions (both for natural and introduced occurrences) given by Caudullo et al. (2017).
We simulated two alternative dispersal modes following the spin-up phase (18-0 ka BP): (1) free dispersal and (2) dispersal limitation.The grass was assumed to have no dispersal limitation.The migration parameters of all other species were calibrated following the approach by Zani et al. (2022) based on the maximum migration rates estimated by the pollen analysis of Huntley (1983).All PFTspecific migration parameters are listed in Table S1.3.We used the paleo-climatic grid (monthly surface air temperature, monthly precipitation, monthly shortwave radiation and annual land mask) developed by Armstrong et al. (2019), with a spatial resolution of 0.5 × 0.5° longitude latitude and spanning back 60 ka BP until present (Supplement S1.3).
For the analysis of our simulations, we used 50-year averages of leaf area index (LAI) for each species simulated by LPJ-GM.The simulations were run on the Swiss and Swedish supercomputers operated by CSCS and LUNARC, respectively.

| Palso-vegetation data from pollen datasets
To assess how well our simulated paleo-vegetation reproduces patterns of vegetation change, we compiled 68,906 pollen chronologies from 1036 sites across central and southern Europe starting from the Last Glacial Maximum (Figure S1).Pollen records were compiled from the Neotoma Paleoecology Database (https:// www.neoto madb.org) (Fyfe et al., 2009;Giesecke et al., 2014) using the R package 'neotoma' (Goring et al., 2015).In total, 335 pollen taxa were selected and amalgamated into 60 taxa by keeping tree species as in the original extracted taxonomy while merging shrubs and herbs to the family or genus level (Cao, Tian, Li, et al., 2019).Further details on the pollen selection and analysis are provided in Appendix A1.

| Biome assignment for pollen data and model simulations
We reconstructed biomes from pollen data and model simulations by applying a score-based biomization approach.Following Prentice et al. (1996), we calculated the affinity scores A i based on the abundance of characteristic PFTs of each biome i (Table 1), and assigned the dominant biome with the highest affinity score to each record (time window and location).Next, we counted the occurrences of biomes per time window and calculated the percentages by weighting occurrences based on the area of the Thiessen polygons overall pollen sites or aggregated neighbouring cells (for both dispersal modes) to correct for the unequal spatial distribution of data points.By comparing simulated and pollen-derived biomes, we aimed to evaluate the performance of our model (and the different dispersal modes) to capture vegetation changes.Further details and an illustrative example of the bionization routine are provided in Appendix A2.

| Estimates of climate and forest cover change
Yearly temperature anomalies over the simulation time (18.5-0ka BP) and across the spatial domain were calculated based on the mean temperatures used in the simulation.We calculated climate velocities from the same climatic input by applying the gradient-based approach of the R package 'VoCC' (García Molinos et al., 2019) where s is the temporal trend calculated eevery100 years with yearly samples (°C year −1 ), and g is the spatial temperature gradient based on a 3 × 3 cell neighbourhood (°C km −1 ).We then searched for peaks of temperature velocity (i.e., time-points of rapid climate warming) across simulation time.
To estimate the forest cover change, the fraction of temperate forest cover was calculated every 500 years as the ratio between the number of pollen sites (or aggregated neighbouring simulated cells) dominated by either CMIX or TEDE (i.e., biomes with temperate forests) and the sum of all 'Vegetated' sites.'Unvegetated' cells were defined based on the land mask as inactive, i.e., submerged by the sea or under ice.Next, for each simulation and pollen reconstruction, we extracted the first time window after which temperate forests slowed down their expansion, i.e., reached a relatively stable vegetation cover comparable to the modern distribution.Thus, we used the function find_curve_elbow() (R package 'pageview') (Baliga et al., 2021) to locate the 'elbow' of the forest fraction curve (change of temperate forest fraction across time) for each model and pollen reconstruction.The 'elbow' of a curve is defined as the point at which the slope decreases rapidly, i.e., the curve bends.In the case of the forest fraction curve, this point corresponds to the time window at which the expansion of European temperate forests slowed down abruptly or stopped altogether.

| Comparison of simulated vegetation with pollen data
To compare biomes reconstructed from pollen data and model simu- We additionally evaluated the distribution of species (presence and absence, P/A) composing the biomes.We used two types of metrics to compare the point data of pollen sites with the gridded data of model simulations: (1) accuracy as point-to-point-metric (Appendix B2), and (2) fractional scores (FS) as point-to-grid metric (Appendix B3).As the pollen in lake sediment usually captures not only the local signal but also the extra-regional pollen components due to long-distance dispersal (MacDonald, 1993), an averaging of the simulated vegetation within a radius from the pollen site (as Codes of pollen plant functional types (pPFTs) allocated to each biome are indicated in parenthesis, where pPFTs allocation follows Table S5 by Cruz-Silva et al. (2022).Full names of pPFTs and constituent taxa can be found in Table A2.2).European averages of temperature velocities peak around these two time periods, namely at the start of BA around 14.7 ka BP (P-BA, 1.8 km year −1 ) and at the end of YD around 11.7 ka BP (P-YD, 1.3 km year −1 ), respectively (Figure 2).We observe pronounced differences in the distribution of biomes simulated with the free dispersal and dispersal limitation mode immediately after both peaks of climate warming (i.e., around 14 and 11 ka BP) (Figure 3).In our model reconstructions with dispersal limitation, boreal forests (CENF) have higher cover across southern and central Europe compared to reconstructions with free dispersal, where temperate forests (CMIX and TEDE) rapidly replace boreal forests, especially after P-YD (Figure 3).In the simulations with free dispersal, temperate forests reach their maximum cover abruptly around 11 ka BP, whereas the expansion of temperate forests in the simulation with limited dispersal occurs gradually and slows down around 7 ka BP, though it continues across the Holocene (Figure 2).

| Comparison of simulated biomes and species distribution with pollen data
The slower post-glacial forest expansion simulated under dispersal limitation seems to reflect more accurately the biome change obtained from pollen data (with a gradual expansion ending around 8 ka BP; Figure 2).Similarly, the overall accuracy of biome agreement with simulations using the free dispersal mode is dropping around P-BA and P-YD, whereas the model accuracy for simulations with dispersal limitation tends to be more stable, especially during BA (Figure 4b).Metrics of biome disagreement (EMDs) offer similar results.We observe lower values (higher agreement) for simulations using the dispersal limitation mode during periods of rapid warming (14-11 ka BP) though followed by a decrease of agreement across the Holocene, especially for unweighted EMDs, with ecologically-informed EMDs showing a higher agreement compared to unweighted values (Figure 4a).Overall, model and pollen data are more consistent in vegetation composition (i.e., dominant biomes and biome score distributions) with the dispersal limitation mode across the warmest time period after P-BA (Figure 4).
Biome-specific accuracies indicate that the drop of agreement for simulations using the free dispersal mode is mainly due to a decrease in accuracy in boreal and temperate mixed forests (CENF and CMIX) around P-BA and P-YD (Figure 4b and Table S3).More specifically, comparisons between simulated and observed dominant biomes 500 years after P-BA and P-YD show that simulations with free dispersal incorrectly predict CMIX and TEDE instead of CENF (Figure S3.2).This indicates that the model assuming free dispersal is simulating a sudden expansion of temperate forests when the pollen records suggest a longer persistence of boreal biomes around both P-BA and P-YD (Figures 1 and 3).Table S3) than for forested biomes (e.g., accuracy FreeDispersal = 0.68 and accuracy DispersalLimitation = 0.67 for TEDE; Table S3), which would account for the lower accuracy of predictions during OD (18.5-15.0ka BP) (Figure 2), when open-vegetation biomes, in particular TUND, were more abundant (Figure 1).Differences in average agreement (i.e., accuracy across space and time) between the two dispersal modes are relatively small, with the highest improvements being the agreement of boreal forests (+3%, CENF) and mixed forests (+2%, CMIX) under the simulation with limited dispersal (Table S3).However, we observe a relatively high decrease in accuracy (−22%) for warm temperate forests (WTFS) in the simulation with the dispersal limitation mode (Table S3).
Also at the species level, the model performance confirms the importance of including dispersal limitation during periods of rapid climate change (Figure S4.1).This was especially pronounced for temperate summergreen tree species (e.g., Fraxinus excelsior, the deciduous Quercus spp., Tilia cordata and Ulmus glabra; Figure S4.1), for instance, the late-successional species Fagus sylvatica, which had significantly lower prediction accuracy without dispersal limitation (−24%; Table S4).
On the contrary, the few species whose accuracy was not affected by the inclusion of dispersal limitation were early-successional and coldtolerant species, namely, Betula spp.and Pinus sylvestris (Figure S4.1).S1).Relevant time-slices were selected: 500 years after time-points of rapid climate warming (14 ka BP for P-BA and 11 ka BP for P-YD) and at a stable climatic point in the mid-Holocene (6 ka BP) (see Figure 2).Biome codes are: TUND, tundra; GRAM, graminoids with forbs; ENWD, evergreen needle-leaf woodland; CENF, cold evergreen needle-leaf forest; CMIX, cool mixed evergreen needleleaf and deciduous broadleaf forest; TEDE, temperate deciduous malacophyll broadleaf forest; WTFS, warm-temperate evergreen needleleaf and sclerophyll broadleaf forest; XSHB, xeric shrubland.(c.3) competition with early-successional species (e.g., Fagus sylvatica competing with Betula and fast deciduous spreaders like Corylus avellana) (Giesecke & Brewer, 2018;Roberts, 2014).Our model accounted largely for processes underlying these drivers.It incorporated topography through dynamic land masks with temporal changes of ice cover to free terrain for plant establishment (and vice-versa).Additionally, soil-water vegetation dynamics and species-specific dispersal ability were implemented in LPJ-GM 2.0 as described in Section 2.1.1.

| DISCUSS ION
Overall, these dynamic processes of dispersal and competition may have considerably delayed the establishment and expansion of some species after the LGM, thus delaying a climate-vegetation disequilibrium across centuries (MacDonald et al., 2008;Pennington, 1986) and possibly millennia in the case of low species-specific migration rates.grazing activities (Kalis et al., 2003;Nascimento et al., 2022).The expansion of the late-successional Fagus in Europe, which occurred later in the Holocene compared to other deciduous tree species (Magri, 2008), has often been linked to human disturbances, specifically landscape clearance and alteration of the fire regime (Bradley et al., 2013;Bradshaw & Lindbladh, 2005).In this sense, the inclusion of human activity in the simulations could potentially speed up the expansion of late colonisers by eliminating already established trees (competitors) with fire or other anthropogenic disturbances.To account for human impact across the Holocene, databases on past anthropogenic land cover (e.g., HYDE 3.2) (Klein Goldewijk et al., 2017) could be used as inputs for vegetation simulations in LPJ-GM.
(d.2) Despite the higher accuracy in periods of rapid climate change, our simulation with limited dispersal resulted in the extinction of six species: Abies alba (~15-14 ka BP), Alnus glutinosa (~17-16.5 ka BP), Juniperus oxycedrus (~17-16.5 ka BP), Larix decidua (~17-16.5 ka BP), Pinus halepensis (~17-16.5 ka BP) and Mediterranean Raingreen Shrubs (~17-16.5 ka BP).All of the listed species have narrow bioclimatic limits (Table S1.2), and half of them belong to Mediterranean biogeographical ranges (Juniperus oxycedrus, Pinus halepensis and MRS), which would explain the low performance of the simulation using the dispersal limitation mode in predicting warm-temperate forests (Table S3).(d.2.1)The cause of these species' disappearances may be linked to the parametrization of their bioclimatic limits, which used current species distributions to reconstruct species-specific climatic niches for historical simulations.This approach is problematic since it assumes that the current realised niche, which is strongly affected by biotic interactions as well as (a)biotic barriers, is a reliable predictor for the potential climate niche.This interpretation is consistent with the time of disappearance of the majority of species, namely by the end of the LGM (~17-16.5 ka BP), which was characterised by strong temperature (Figure 2) and drought anomalies.
Besides the lack of climate analogues, other factors that can alter the realised niche from present to past include: changes in biotic interactions, niche evolution and plasticity, and a vegetation-climate disequilibrium in the present distribution, which can be caused by post-glacial migration lags (Gavin et al., 2014).In the simulation with free dispersal, no species went extinct since the model allows species to reappear as soon as bioclimatic conditions are suitable, contrary to the simulation with dispersal limitation.(d.2.2) There is also the possibility that our spatial domain of simulation did not include the climatic cells that could have buffered the species from extreme climatic fluctuations, i.e., glacial refugia for the species that went extinct during or immediately after the LGM.These refugia are generally located on mountain chains, which were not fully included in our simulation domain due to constraints in computational power (Figure S1) and lack of fine spatial resolution in the climatic input (Appendix C2, b.1).We suggest optimise bioclimatic limits by taking into account past climatic conditions, e.g., by overlaying pollen data of species presence and absence with the paleo-climate at timepoints when the vegetation-climate disequilibrium was at a minimum (e.g., not during BA or after YD).Furthermore, grid cells with a suitable (micro-)climate to act as glacial refugia (e.g., Nunatak regions) should be included in the simulation domain.

| Post-glacial disequilibrium of temperate forests driven by dispersal limitation
We interpret the differences between the two simulations (free establishment and dispersal limitation) as post-glacial migration lags driven by dispersal limitation, i.e., the delayed occupancy of a climatically suitable biome due to the limited dispersal capacity of some tree taxa.This has been demonstrated by our reconstructions based on the simulations with dispersal limitation showing a delayed onset of temperate forest expansion compared to the simulation with free dispersal (Figure 2), which is closer to pollen reconstructions (Figure 4).In particular, we observed marked mismatches in the distribution of warm-demanding and slow-spreading tree species between the two dispersal modes during periods of rapid warming (P-BA and P-YD; Figure S4.1).This suggests that a number of less cold-tolerant trees were not able to track the rapidly ameliorating (i.e., warming) climate after the deglaciation of Europe and thus likely experienced migration lags immediately after P-BA and P-YD.

| Implications for future biome changes
If compared with pollen-based reconstructions of vegetation change during the Early Holocene, the temporal mismatch of the temperate forest expansion between our simulated dispersal modes (Figure 2) suggests that the European vegetation might have lagged behind the ameliorated climate after the YD for some thousands of years due to dispersal limitation and other migration processes (e.g., soil formation and local population build-up).This result questions the assumption of unlimited seed availability used in models to predict future plant distributions (Engler & Guisan, 2009;Seaborn et al., 2020).There are still a number of differences between the landscape configuration and relevance of migration processes between past and contemporary/future environments.
For instance, post-glacial plant colonization was characterised by species spreading from isolated refugia with ice sheet retreat, soil build-up and order of arrival imposing limits on species establishment (MacDonald et al., 2008;Pennington, 1986).On the contrary, biome changes within the contemporary landscape of widespread plant distribution and human presence happen mostly in zones of (northern) ecotone transition (Vissault et al., 2020) and are mainly impacted by anthropogenic activities, e.g., human-driven land use change and forest management.Nevertheless, as future warming rates are predicted to approach or even exceed the two time periods of rapid climate change after deglaciation (i.e., P-BA and P-YD) (Fordham et al., 2020), we expect that a number of plant species will lag behind the climate signal across the 21st century.Indeed, migration lags have already been observed in poleward and upward shifts of some terrestrial species' range during the last decades (Lenoir et al., 2020) and are expected to influence the transition of forest biomes in the temperate-boreal ecotone (Vissault et al., 2020).Thus, a model failing to account for dispersal limitation might predict an excessive extent of the vegetation cover and/or a higher proportion of temperate biomes, which in turn would lead to misinterpretations of the potential terrestrial carbon storage and vegetation feedback to the climate in future projections.Acknowledging these potential biases, a number of recent studies have sought to include natural dispersal in their modelling framework, most commonly by using dispersal potential to constrain the range of climatic suitability given by species distribution models (e.g., MigClim applied by Mauri et al. (2022) for future European tree species).Additionally, dynamic vegetation models including migration processes (such as LPJ-GM 2.0) can be included in earth system models for the prediction of more realistic plant range shifts, along with future climate and other ecosystem dynamics (Hanbury-Brown et al., 2022).
Finally, future projections comparing species distributions with and without dispersal limitation can be used to select slow-spreading and lagging species for assisted migration (Iverson et al., 2019;Iverson & McKenzie, 2013), or to support forest conservation and invasion risk management (Urban, 2020).

CO N FLI C T O F I NTE R E S T S TATE M E NT
The contact author has declared that neither they nor their coauthors have any competing interests.

site ('neighbouring cells')
. Next, we used the biomisation method as described above to assign a biome based on the PFT-specific LAI averaged across all neighbouring cells.The biome i × taxon m matrix can be derived from Tables 1 and A1.The affinity score of the simulated PFT m for the simulated time-point lis is calculated as: where the LAI l is the LAI for a time window (500 years average) and for the average of all neighbouring cells within a 0.5° radius.
We present as an illustrative example the biomization routine for the first site (site ID: 16; site name: Ahlenmoor; Lon: 8.73, Lat: 53.69) and uppermost time window (500 ya BP) available in the extracted pollen data.The ecological distance between biomes d i,j can be defined by two weighting schemes: (1) 'uniform', where the replacements from one biome to another are given the same ecological weight (EDMuni), (2) 'ecologically informed', where distances are given different weights (1, 2, 3) depending on the macro-environments of the biomes as defined in Table 1 (EDMw).We followed the distance matrix provided by Chevalier et al. ( 2023) (Table B1).EMD metrics were calculated with the R package 'paleotools' (Chevalier et al., 2023).

B2 | Calculation of accuracy: evaluation of species and biome distributions
The overall accuracy between reference and simulated classes is defined as the proportion of correct predictions over the total number of predictions across all classes.In binary classification such as predictions of species presence and absence (P/A), accuracy is defined as follows: where TP, TN, FP and FN are the number of true positives (or correctly assigned presences), true negatives (or correctly assigned absences), false positives (or incorrectly assigned presences) and false negatives (or incorrectly assigned absences), respectively.In the case of multiclass classification such as biome predictions, the model performance is evaluated as the average accuracy across all biome classes.
where TP i , TN i , FP i and FN i are true positive, true negative, false positive and false negative counts, respectively, for each biome class i, for a total of l = 9 number of biomes.
Counts of correct and incorrect predictions used in Equations ( B3) and (B4) were obtained from matrices built with pairs of pollen sites (reference) and neighbouring grid cells (predicted) to compare for each time window.Reference data at each pollen site were derived as described in Section 2.2 for P/A and Appendix A2 for biomes.Predicted biomes were obtained from simulations with free dispersal and dispersal limitations as described in Appendix A2.Similarly, predicted P/A for each 500year time-window (P ∕ A tw ) were obtained by applying a cut-off threshold of 0.01 to LAI averaged across the neighbouring cells c (LAI neig,c ) of each pollen site l : Accuracy metrics were calculated with the R package 'caret'.

B3 | Calculation of fractional scores: evaluation of species distributions
Predicted P/A were derived according to Equation (B3) for each neighbouring cell and time window.Next, at a pollen site l with either the presence or absence of species j, we calculated the local fractional score of presence FSP l,j and absence FSA l,j as the ratio between the number of neighbouring cells with species presence (P c,j ) and absence (A c,j ), respectively and the total amount of neighbouring cell n: Thus, the pollen site-specific fractional scores range from 0 to 1, where 0 indicates that all neighbouring cells were incorrectly assigned to either presence or absence, whereas 1 indicates perfect agreement.2023), with ENWD being a transitional biome between TUND and CENF.The upper and lower triangles represent the 'ecologically informed' and 'uniform' weighting scheme, respectively.Biome codes and related macro-environments are defined in Table 1.
lations, we used the recently developed evaluation metric Earth Movers' Distance (EMD) and test scheme provided by Chevalier et al. (2023).Compared to other evaluation metrics (e.g., the kappa statistic) (Cohen, 1960) that require a reduction of accuracy scores inferred for each biome to a single category (dominant biome; see Appendix A2), EMD has the advantage of taking the whole distribution of biome scores into consideration for each spatial and temporal sample.The EMD can be interpreted as the ecological 'cost' (or weighted distance) of replacing a vegetation composition (as a set of biome scores) with another, where 0 would indicate a perfect match between two (pollen data vs. model) biome score distributions.Furthermore, ecologically informed weights can be applied in the calculation of EMD to further penalise the prediction of more ecologically different biomes (Appendix B1).Additionally, we calculated the overall accuracy statistic for dominant biomes as a more traditional method of data-model comparison (Appendix B2).The overall accuracy [0-1] is a good index to assess the ability of a model (or dispersal mode) to simulate the distribution of dominant biomes across time, with 0 and 1 indicating no better agreement than by chance and perfect agreement, respectively.Furthermore, time series of biome-specific accuracies can detect which dominant biome the model (with the corresponding dispersal mode) can simulate with the highest (or least) agreement and at which point in the simulation time.
Downloaded from https://onlinelibrary.wiley.com/doi/10.1111/jbi.14836by Paul Scherrer Institut PSI, Wiley Online Library on [26/03/2024].See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions)on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License | 5 ZANI et al. required for the calculation of accuracy; Appendix B2) might partially result in including the extra-local signal.In this case, it might be more appropriate to consider each neighbouring cell individually using statistics for site-to-gridded-data comparison.Thus, we formulated the pollen site-specific fractional score (FS) for species P/A (FSP and FSA, respectively; Appendix B3) by modifying the fractional skill score used by Dallmeyer et al. (2019) in the evaluation of biome predictions.Fractional scores for species P/A can be interpreted similarly to model sensitivity and specificity as metrics to evaluate the model (and dispersal mode) ability to predict true presences and absences, respectively.Additionally, site-specific fractional scores can provide both the time window and coordinates (i.e., pollen sites) where false assignments occur.As dispersal involves movement in space, this additional spatial information can be useful in comparing the different performances of the two dispersal modes.3| RE SULTS3.1 | Post-glacial climate change and simulated biome distribution under free and limited dispersalBiomes inferred from pollen-and model-based reconstructions have similar relative abundances, with no desert biome (DESE) and significantly less open-vegetation types (TUND, GRAM, XSHB; cf.Table1) being assigned than forested biomes (Figure1and FigureS3.1).For all three reconstructions (pollen-based and model-based reconstructions with two dispersal modes), boreal biomes (TUND and CENF) are more abundant during the Oldest Dryas (OD, 18.05-15.0ka BP) with a progressive increase of temperate mixed (CMIX) and deciduous (TEDE) forested biomes towards the Holocene (start ~12 ka BP) (Figures1 and 2).In agreement with Northern Hemisphere climatic reconstructions(Osman et al., 2021), the mean temperature anomaly extracted TA B L E 1 Biome to macro-environment and climatic zone assignments, with descriptions for each biome followingCruz-Silva et al. (2022).
The simulation with dispersal limitation better captures this persistence (Figures 2-4; Figures S3.2 vs. S3.3),though it tends to over-predict the presence of biomes dominated by grasslands (GRAM) in the western and northern regions of our simulation domain (mostly Ireland) as far in time as the Late Holocene (Figure 3; Figure S3.7).Overall, the agreement of both dispersal modes with the pollenbased data tends to be lower for open-vegetation biomes (e.g., accuracy FreeDispersal = 0.56 and accuracy DispersalLimitation = 0.58 for TUND;

4. 1 |
Post-glacial vegetation disequilibrium: dispersal limitation and other potential causes 4.1.1| Simulated biome change in agreement with the post-glacial forest conundrum Comparisons of pollen data with LPJ-GM 2.0 simulations using two dispersal modes showed that the simulation implementing tree establishment as dependent on seed availability (dispersal limitation) F I G U R E 2 Post-glacial temperature and temperate forest change in central and southern Europe.Temperate forest fraction is the sum of CMIX and TEDE versus all biomes, where dominant biomes were reconstructed either by model simulations with free dispersal (yellow) or dispersal limitation (dark green), or by pollen samples (black).Vertical dashed lines indicate the time when the post-glacial expansion of temperate forests slowed-down or stopped (Free Dispersal: 11 ka BP; Dispersal Limitation: 7 ka BP; Pollen: 7.5 ka BP).Temperature anomaly (°C, light grey dashed line) and gradient-based temperature velocity (km year −1 , light grey solid line; calculation in Section 2.5) are reported as the average across the simulation domain ± SD (shaded areas).was better in capturing the post-glacial expansion of non-boreal forests in southern and central Europe than the simulation assuming free dispersal.Results obtained with the free dispersal mode were in agreement with previous vegetation modelling results that ignored migration processes from the dynamics of tree expansion (Dallmeyer et al., 2022).In particular, the recent transient MPI-ESM1.2simulation by Dallmeyer et al. (2022) modelled a 4000 years advance of post-glacial forest onset with respect to pollen reconstructions in the Northern Hemisphere.Similarly, we observed a multi-millennial mismatch in the free dispersal simulation for the onset of temperate forest expansion, though not for the degree of openness (i.e., from open to forested biomes).This is likely due to the simulation domain being confined to southern and central Europe where the major shift of biomes after the end of YD was the change of forest types (from boreal to temperate), whereas the forest expansion into open land would have been more pronounced in northern Europe.Indeed, we recorded a similar delay as Dallmeyer et al. (2022) though for different biome types.Specifically, the maximum temperate forest cover simulated under our free dispersal mode peaked after the end of YD (11 ka BP), while the pollen-derived biome reconstruction indicated a maximum during the mid-Holocene, around 7.5 ka BP (Figure2).Dallmeyer et al. (2022) defined the temporal mismatch between modelled and pollen-reconstructed vegetation with regard to the climate signal as the 'postglacial forest conundrum' (PFC) caused by (a) errors in the biomisation procedure and/or imbalanced pollen data (for more details, see Appendix C1), (b) biases in the input paleo-climate (for more details, see Appendix C2), and (c) missing ecological processes of plant re-colonisation in the simulation routine or model structure.4.1.2| Simulated processes of plant re-colonisation (c) Climate-vegetation disequilibrium has been attributed to a number of drivers, including (c.1) abiotic factors (soil development, F I G U R E 3 Maps of biome distributions reconstructed from LPJ-GM 2.0 using the free dispersal (left panels) and dispersal limitation (central panels) modes and pollen samples (right panels).Maps of model reconstructions are made using categorical interpolation (k-nearest neighbours algorithm, k = 1000) of dominant biomes predicted from simulate grid cells (Figure

| 9 ZANI
et al.topographical barriers), (c.2) species-specific physiological and demographic characteristics of plants (physiological tolerance to the environment, traits of dispersal ability) and (c.3) biotic factors (interspecific competition)(Tomiolo & Ward, 2018; Williams et al., 2021).(c.1)As summarised byGiesecke and Brewer (2018), the post-glacial forest expansion in Europe happened through a succession of dominant tree species and a parallel evolution of soil types, starting with the formation of thin soil layers after ice retreat.According to the reconstructions of post-LGM plant successions in northern Europe by Roberts (2014), the primary condition for plant colonization was the passage from ice fields to periglacial landforms, which were rapidly occupied by a tundra-steppe biome.In these northern regions, ice retreat was driven by global climate warming, while the pioneer herbaceous plants of the tundra-steppe subsequently prepared the ground for early-successional trees by forming a thin layer of mineral soil (Roberts, 2014).(c.2) In pollen reconstructions, these early pioneer tree species (Betula spp., Pinus sylvestris, and Populus) colonised the newly available terrain first likely due to their high dispersal ability (given by fast growth, short generation time, abundant seed production and long-distance dispersal of seeds) and a broad tolerance to limiting soil and climatic conditions as long as there is sufficient light(Giesecke & Brewer, 2018).Following the pioneer open woodland that prepared soil conditions for more demanding plants, the colonization of late-successional species was further slowed down by a narrower bioclimatic tolerance that limited population expansion even after the first arrival at a site (e.g., Tilia and Picea), by a slower generation time and/or F I G U R E 4 Evaluation metrics of (a) dissimilarity (EMD) and (b) similarity (accuracy) of biome predictions against pollen data across simulation time (18-0 ka BP).Biome predictions were obtained with LPJ-GM 2.0 using free dispersal [(a) yellow and (b) dashed lines] and dispersal limitation [(a) green and (b) solid lines].(a) Dashed (solid) lines indicate EMDs calculated with uniform (ecologically informed) weighting.(b) Black lines indicate the overall accuracy of biome prediction, whereas coloured lines indicate the balanced accuracies of the three most abundant biomes: CMIX, cool mixed evergreen needleleaf and deciduous broadleaf forest; TEDE, temperate deciduous malacophyll broadleaf forest; CENF, cold evergreen needleleaf forest.Vertical grey bands indicate two significant time points of rapid climate change (P-BA: 14.7 and P-YD: 11.7 ka BP; see Figure 2).Calculations of EMDs and accuracy are described in Appendices B1 and B2, respectively.

4. 1
.3 | Challenges and missing processes: paleo-climate, parametrization and anthropogenic disturbances Challenges that we encountered in our study are (d.1) a lower model performance since the mid-Holocene and (d.2) the extinction of six species at the start of the simulation.(d.1)A decrease in performance across the Holocene (after 10 ka BP) is observed in both dispersal modes (Figure 4) and can be attributed in part to (d.1.1)the wrong assignment of closely related forested biomes, i.e., prediction of simulated TEDE instead of observed pollen-derived CMIX (Figure S3.4).This is related to an intrinsic bias of the biomization procedure as discussed in Appendix C1 (a.3) and can be addressed either by analysing vegetation patterns at the mega-biome level (see Figure 2 where temperate forests are the sum of TEDE and CMIX) or by evaluating biome assignment with ecological weights (see Appendix B1 and ecologically-informed EMD in Figure 4(a).(d.1.2)Additionally, our input climate does not capture the Holocene Thermal Maximum (HTM) (see Appendix C2), which might have prompted a further expansion of the temperate vegetation by speeding up the northward migration of more warmdemanding species.Thus, a warmer input climate could partially solve the excessively slow spread of late-successional and temperate trees observed under the dispersal limitation mode during the Holocene (Figures S3.4b,d and S4.3).This bias can be additionally attributed to (d.1.3)a missing driver of Holocene vegetation dynamics, i.e., anthropogenic disturbances.(d.1.3)Human activities that impacted the mid-Holocene (7-6 ka BP, 'Neolithic Revolution') (Bocquet-Appel, 2011) vegetation of Europe includes fire disturbances, vegetation clearance for human settlements, agriculture and in pollen data and biomizationAs is common with paleo-records, our pollen data suffered from a temporal imbalance in sampling with pollen records being increasingly abundant 1s recent times (FigureS2.1).Following the approach by Cao, Tian, Dallmeyer, and Herzschuh (2019), we temporally(B2) Accuracy = TP + TN TP + TN + FP + FN , i TP i +TN i +FP i +FN i l , (B4) LAI tw,l = ∑ n c=1 LAI neig,c n ,(B5) P ∕ A tw = {1 (P), LAI tw,l > 0.01 0 (A), LAI tw,l ≤ 0.01.Distance matrices used in the study adapted from Chevalier et al. (

Veiko
Lehsten has a long experience in simulating climate change effects on vegetation and related ecosystem properties such as carbon fluxes and wildfires.He has been working on disentangling the effects of climate change, increased CO 2 emissions and land use change on vegetation and migration lags are important, and as of now mostly neglected, part of this.
13652699, 0, Downloaded from https://onlinelibrary.wiley.com/doi/10.1111/jbi.14836by Paul Scherrer Institut PSI, Wiley Online Library on [26/03/2024].See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions)on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License 13652699, 0, Downloaded from https://onlinelibrary.wiley.com/doi/10.1111/jbi.14836by Paul Scherrer Institut PSI, Wiley Online Library on [26/03/2024].See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions)on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License 13652699, 0, Downloaded from https://onlinelibrary.wiley.com/doi/10.1111/jbi.14836by Paul Scherrer Institut PSI, Wiley Online Library on [26/03/2024].See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions)on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License 13652699, 0, Downloaded from https://onlinelibrary.wiley.com/doi/10.1111/jbi.14836by Paul Scherrer Institut PSI, Wiley Online Library on [26/03/2024].See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions)on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License Giesecke et al., 2017.ntaxapresent in this record, i.e., List of taxa (either species or PFTs) implemented in the model LPJ-GM 2.0 with relative characteristics and pollen equivalents.PFT characteristics (geographic range and form) are defined by the model parameters (e.g., range is given by optimum temperature for photosynthesis and base respiration rate) according toHickler et al. (2012)(see also TableS1.2 for species-specific parameters).Pollen thresholds for taxa detection follows Table1byGiesecke et al., 2017.For unspecified taxa, the default value of 0.5 was chosen.Downloaded from https://onlinelibrary.wiley.com/doi/10.1111/jbi.14836byPaulScherrer Institut PSI, Wiley Online Library on [26/03/2024].See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions)onWileyOnline Library for rules of use; OA articles are governed by the applicable Creative Commons License List of pollen taxa present in the pollen site Ahlenmoor (ID 16) in the time window (0-500 ka BP) with relative biome affinity scores (A).Assignments of each pollen taxa to pollen PFT and biome (either 0 or 1) follow TableA2 and Table 1, respectively.For details on the calculation of adjusted pollen percentage (%) see Section 2.2., 0, Downloaded from https://onlinelibrary.wiley.com/doi/10.1111/jbi.14836by Paul Scherrer Institut PSI, Wiley Online Library on [26/03/2024].See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions)on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License m im √ 0.01, LAI l , TA B L E A 1 13652699