Harnessing paleo‐environmental modeling and genetic data to predict intraspecific genetic structure

Abstract Spatially explicit simulations of gene flow within complex landscapes could help forecast the responses of populations to global and anthropological changes. Simulating how past climate change shaped intraspecific genetic variation can provide a validation of models in anticipation of their use to predict future changes. We review simulation models that provide inferences on population genetic structure. Existing simulation models generally integrate complex demographic and genetic processes but are less focused on the landscape dynamics. In contrast to previous approaches integrating detailed demographic and genetic processes and only secondarily landscape dynamics, we present a model based on parsimonious biological mechanisms combining habitat suitability and cellular processes, applicable to complex landscapes. The simulation model takes as input (a) the species dispersal capacities as the main biological parameter, (b) the species habitat suitability, and (c) the landscape structure, modulating dispersal. Our model emphasizes the role of landscape features and their temporal dynamics in generating genetic differentiation among populations within species. We illustrate our model on caribou/reindeer populations sampled across the entire species distribution range in the Northern Hemisphere. We show that simulations over the past 21 kyr predict a population genetic structure that matches empirical data. This approach looking at the impact of historical landscape dynamics on intraspecific structure can be used to forecast population structure under climate change scenarios and evaluate how species range shifts might induce erosion of genetic variation within species.

The climates of the glacial cycles of the Pleistocene, in particular the cold period of the last glacial maximum (LGM, 21 kyr BP;Clark et al., 2009), reshaped the distribution of ecosystems (Williams, & Jackson, 2007), species ranges (Hampe & Jump, 2011;Hewitt, 1999), and intraspecific genetic structure (Yannic, Pellissier, Ortego, et al., 2014). Climate dynamics and its control over the distribution of the continental ice sheet caused dramatic species range expansions and contractions in the Northern Hemisphere (Alsos et al., 2012).
Species dispersal is a central determinants of range shifts, but also has consequences on the genetic structure within a species range (Davis & Shaw, 2001;Szűcs et al., 2017). In particular, repeated isolation of populations during successive glacial cycles has generated complex genetic differentiation within species (Hewitt, 2004;Hofreiter & Stewart, 2009;Lister, 2004). Moreover, genetic drift during population range shift and range contraction can erode the genetic diversity of populations, as theoretically predicted (Arenas, Ray, Currat, & Excoffier, 2011;Garnier & Lewis, 2016;McInerny, Turner, Wong, Travis, & Benton, 2009), empirically shown (Alsos et al., 2007) or forecasted under climate change (e.g., Collevatti, Nabout, & Diniz-Filho, 2011). The current genetic structure of species is thus expected to be intimately related to the historical spatial and temporal variation in their distribution ranges, which in turn has shaped the pattern and frequency of population genetic exchanges or the degree of genetic differentiation (Espíndola et al., 2012;Pellissier et al., 2016).
Landscape genetics as a discipline integrates spatially explicit data to investigate the influence of landscape heterogeneity on contemporary gene flow (Balkenhol, Cushman, Storfer, & Waits, 2016). Landscape features and habitat characteristics can have a profound impact on genetic structure of populations, by either restricting or enhancing individual movements and populations connectivity (Taylor, Fahrig, Henein, & Merriam, 1993;Taylor, Fahrig, & With, 2006). A central tenet of landscape genetics is to identify patterns of suitable habitats or sets of features that promote or hinder connectivity among patches and shape the genetic structure of species (Zeller, McGarigal, & Whiteley, 2012). Landscape resistance to gene flow is parameterized using different approaches: (a) expert opinions (Murray et al., 2009), (b) optimization and parameterization methods (e.g., Peterman, 2018;Spear, Balkenhol, Fortin, McRae, & Scribner, 2010), or (c) occurrence, presence-data, or satellite-collar relocations coupled with species distribution modeling (Shafer et al., 2012;Yannic, Pellissier, Le Corre, et al., 2014;Zeller et al., 2018). The advantage of data-based inferences in relation to environmental variables is that it allows constructing more objective models of habitat use and can identify corridors based on true locations or species' preferred habitat (Fattebert, Robinson, Balme, Slotow, & Hunter, 2015;Panzacchi et al., 2016). Landscape feature shown to impact connectivity and genetic distinctiveness today could have had a similar effect in the past. Hence, the study of intraspecific genetic structure requires the identification of historical landscape elements that shaped gene flow through the analysis of paleo-environmental maps coupled with species ecological information.
The climate distribution within landscapes was largely dynamic over the past millennia (Batchelor et al., 2019). Originally, the distributions of paleoclimate were reconstructed using indicators such as pollens, environmental DNA (eDNA), oxygen isotopes, or other features (Koch, 1998;Lyman, 2017;Parducci et al., 2017;Willerslev et al., 2007Willerslev et al., , 2014. The position of moraines was, for example, used for reconstructing the extent of glaciated area in the past millennia and its effect on landscape connectivity (Nesje, Bakke, Dahl, Lie, & Matthews, 2008). Fossilized indicators and eDNA from lake sediments were used for reconstructing past vegetation patterns and informed on species range segregation .
Vegetation reconstruction using fossil records showed that the tree line largely shifted during glacial periods (Binney et al., 2017;Payette & Lavoie, 1994), possibly shaping connectivity for other organisms.
The development and refinement of global climate models (GCMs) provide another source for the reconstruction of past landscape dynamics (Haywood et al., 2019). The downscaling of GCMs from coarse to fine resolution can generate the changes in abiotic conditions that constrain the distribution of organisms over time (Latombe et al., 2018). Combining different sources for reconstructing past species range dynamics can help understand how past dynamics shaped present intraspecific genetic structure of species (Fordham et al., 2016;Fordham, Brook, Moritz, & Nogués-Bravo, 2014;Gavin et al., 2014).
Beyond current landscape configuration, the genetic structure of populations is the result of an intermingling of past climatic effects, geography, and anthropogenic pressures on species (e.g., Lorenzen TA B L E 1 Detailed information on spatially explicit landscape genetics simulators. We provide information on (a) the type of simulators between forwards-in-time simulations (also known as individual-based simulations) and backwards-in-time simulations (also known as coalescent simulations) and (b) the level of program between individual-based and population-based simulations. We also provide general descriptions of programs based on direct citations from original journal articles Program Simulators Level Description Ref.

Splatche 3
Backward Population Simulate the demography of populations and the genetic data (e.g., DNA sequences, SNPs, STRs) are simulated under a serial coalescent-based approach. The simulation framework is spatially explicit and accounts for the heterogeneity and the dynamic of landscapes. see also Aquasplatche (Neuenschwander 2006), extension of Splatche for river habitats.

Individual
Simulate evolution of life history/phenotypic traits and population genetics in a (meta-) population framework. Allows for patch-specific carrying capacities, dispersal rates, stochastic extinction/ harvesting rates, and demographic stochasticity. Allow population bottlenecks, patch fusion/fission, population expansion, etc. during the simulation process. Guillaume and Rougemont (2006) CDMetaPOP Forward

Individual
Simulate landscape demographic and genetics (defined herein as landscape demogenetics). Simulate changes in neutral and/or selection-driven genotypes through time as a function of individual-based movement, complex spatial population dynamics, and multiple and changing landscape drivers.

Individual
Simulate spatially and temporally explicit scenarios with chromosome-scale data. Features such as spatially and temporally variable selection coefficients are incorporated in a flexible manner. This software allows for large-scale analyses and comparisons of these complex, stochastic models.

McManus (2015)
SimAdapt Forward Individual A spatially explicit, individual-based, forward-time, landscape-genetic simulation model, combined with a landscape cellular automaton to represent evolutionary processes of adaptation and population dynamics in changing landscapes, using the NetLogo environment. Rebaudo, et al. (2013) EcoGenetics Forward

Individual
Simulate spatially explicit metapopulations. Allows defining all kinds of landscape configurations, from real landscapes (e.g., stemming from Geographic Information Systems) to theoretical structures (e.g., island model, circles, grid). Consider various life history, including sex-dependent or age-dependent behaviors and generation overlap Jaquiéry, et al., (2011) (Continues) et al., 2011). Disentangling effects of different drivers on the current genetic structure of populations is however challenging and requires the use of integrative modeling tools (Epperson et al., 2010;Landguth, Cushman, & Balkenhol, 2015). We present an overview of the simulation approaches that were used to explore the determinants of extant genetic diversity and structure. We further illustrate a new modeling approach, conceptually simpler, which provides an expectation of intraspecific genetic structure as a result of the dynamic landscape suitability and connectivity for species.

| INTEG R ATING S IMUL ATI ON S AND MODELING IN A G LOBAL L ANDSC APE G ENE TI C A PPROACH
Genetic data, increasingly available (e.g., Lorenzen et al., 2011), can be used to infer past demographic history of species (a) directly, using genetic models, in the light of geographic information data, or (b) indirectly, using process-based spatial simulation models (e.g., Campos et al., 2010;Drummond, Rambaut, Shapiro, & Pybus, 2005;Shapiro et al., 2004). Moreover, advanced analyses from genomic data can support the quantification of past demographic events such as estimations of past population sizes or the occurrence of demographic bottlenecks (Hansen et al., 2018;Nadachowska-Brzyska, Li, Smeds, Zhang, & Ellegren, 2015;Pilot et al., 2014;Stoffel et al., 2018).
Through the comparison of spatially structured populations, genetic analyses can inform about ancestral population connectivity (e.g., temporal changes in the level of isolation-by-distance among ancient DNA samples; Lorenzen et al., 2011) and demographic history such as genetic bottlenecks driven by population disconnection (Broquet et al., 2010).
A variety of spatial data can be used to investigate the origin of intraspecific genetic structure of species, which include contemporary landscape descriptors (Sork et al., 2013;Storfer et al., 2007) and historical species distribution reconstruction (Nogués-Bravo, 2009).
Various landscape habitat characteristics are expected to influence the connectivity for populations and determine their genetic structure (Balkenhol et al., 2016). Methods to measure connectivity using cost-weighted distance allowed refined quantification of landscape barriers to gene flow (Balkenhol et al., 2016). Another complementary spatial information is provided by species distribution models (SDMs), which by estimating species habitat suitability, can help understand the historical dynamics that shaped intraspecific genetic structure of populations (Carnaval, Hickerson, Haddad, Rodrigues, & Moritz, 2009;Lorenzen et al., 2011;Razgour et al., 2013;Yannic, Pellissier, Ortego, et al., 2014). Hindcasted SDMs were used to identify the geographic position of refugia (but see Davis, McGuire, & Orcutt, 2014), and assess visually if they match a similar structure found in the geographic distribution of genetic clusters (e.g., Waltari et al., 2007). The availability of more temporal steps for climate reconstructions between the LGM and the present allowed tracking climatic suitability area over time (Hijmans,

Forward Population
Simulation framework that combines a powerful, fast engine for forward population genetic simulations with the capability of modeling a wide variety of complex evolutionary scenarios. Supports models that occupy continuous spatial landscapes, including built-in support for spatial maps that describe environmental characteristics. Possible to model the explicit movement of individuals over a continuous landscape, life cycles with overlapping generations, individual variation in reproduction, density-dependent population regulation, individual variation in dispersal or migration, local extinction and recolonization, mating between subpopulations, age structure, fitness-based survival and hard selection, emergent sex ratios, and more.

Forward and backward
Individual Individual-based, genetically explicit stochastic simulation program. Developed to investigate the effects of selection, mutation, recombination, and drift on quantitative traits in structured populations connected via migration in a heterogeneous landscape Neuenschwander, Michaud, and Goudet (2018) MEMGENE Sideway Population Method and software for identifying spatial neighborhoods in genetic distance data. Use a multivariate technique developed for spatial ecological analyses. Using simulated genetic data, allow recover patterns reflecting the landscape features that influenced gene flow. Galpern, Peres-Neto, Polfus, and Manseau (2014) TA B L E 1 (Continued) models (Engler & Guisan, 2009), following the geographic range dynamics of isolated populations (Espíndola et al., 2012;Nobis & Normand, 2014). For example, Yannic, Pellissier, Ortego, et al. (2014) found two main refugia for the caribou/reindeer Rangifer tarandus, which explain current structure, whose dynamic until the present matches the current genetic structure of the species. Habitat suitability values can be used in more complex process-based model to simulate intraspecific genetic structure in forward-going simulations (Hoban, Bertorelle, & Gaggiotti, 2012).
The adoption of spatial models including genetic mechanisms can inform the process that shapes current intraspecific genetic diversity and structure (Landguth et al., 2015). Generally, the best available confirmation of the understanding of the causes of a phenomenon is the ability to build a model from the expected underlying mechanisms and reproduce realistic emergent patterns (Leprieur et al., 2016). Processbased models place causal hypotheses on a leading role, as an a priori abstraction of the inner workings of the system that is used to build the model (Grimm et al., 2005). More specifically, process-based models in landscape genetics are primarily built to evaluate analytical issues, for example, the effects of spatial and temporal scale in landscape genetics (Jaquiéry, Broquet, Hirzel, Yearsley, & Perrin, 2011), or to investigate theoretical questions, for example, quantifying the lag time between the emergence of a barrier to movement and its effects on spatial genetic data . The expectation of those models can be then compared to empirical data qualitatively and quantitatively (Jeltsch et al., 2013;Landguth et al., 2015). Based on a limited number of principles and in combination with the reconstructed dynamics of landscape suitability for species, process-based models are able to produce expectations related to intraspecific genetic diversity and structure that matches empirical data (Landguth et al., 2015). A variety of spatial process-based models of species genetics have been developed and include a variety of processes (Table 1; Box 2), such as genetic drift (CDPop; , geographic isolation (IBDSim; Leblois, Estoup, & Streiff, 2006), selection, and adaptation (Nemo; Guillaume & Rougemont, 2006). The importance of simulation studies for this specific research has been emphasized in a number of recent articles (e.g., Balkenhol, Waits, & Dezzani, 2009;Epperson et al., 2010).
We present a parsimonious model allowing inference on population genetic structure given the suitability of species occupancy of the landscape, the species dispersal capacity, and the landscape struc- and rely on the computation of matrices as classically performed in landscape ecology (Balkenhol et al., 2016;Sork et al., 2013;Storfer et al., 2007). Our framework is presented on Figure 1. We illustrate our approach using a previously published dataset on the case study of reindeer/caribou across North America and Eurasia (Yannic et al., 2018;Yannic, Pellissier, Ortego, et al., 2014).

| Genetic data
We used a set of 1,297 caribou and reindeer genotyped at 16 nuclear microsatellite markers (Yannic et al., 2018;Yannic, Pellissier, Ortego, et al., 2014). Samples were obtained from 57 locations located across the entire Holarctic native species' range, including Alaska, Canada, Greenland, Svalbard, Norway, Finland, and the Russian Federation ( Figure 2). Samples were collected over a decade (2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010). We tried to cover as much as possible the known circumpolar distribution of the species, although some areas are not evenly represented (e.g., few populations from Russia versus  Figure 2). In these previous works, using multivariate and Bayesian analysis, we identified that the genetic structure of Rangifer was split into two main phylogeographic clusters at the uppermost hierarchical level (Yannic et al., 2018;Yannic, Pellissier, Ortego, et al., 2014). Using climatic niche modeling, we further identified refugia occupied by the two lineages at the LGM (i.e., 21 kyr BP), as defined by discontinuous suitability for the species, that is, suitable areas located south and northwest of the Laurentide Ice Sheet in North America (see panel "21 ka" in Figure 2; Yannic, Pellissier, Ortego, et al., 2014). We finally used a dispersal model (Engler & Guisan, 2009) to track the species range shift overtime (between 21 kyr BP to present) from these distinct refugia that lead to the current observed genetic structure of Rangifer populations.

| Box A geographic model of genetic beta diversity
Process or hybridization (e.g., PhyloGeoSim, Splatche; see Table 1 for details on simulation models). A third line of abstraction is the numerical simplification of the landscape over which agents and mechanisms play out. The landscape can be modeled with more or less complexity, for instance how different features (e.g., water, forested areas) influence the permeability to the dispersal of a species or using the outputs of SDMs.
Simulation studies require simplifying assumptions tailored to the specific research question, and those should correspond to the foreseen comparison to empirical data (Landguth et al., 2015). In order to study the variety of empirical patterns, relying on a unique general modeling approach is unrealistic and a diversity of more specific model implementations should be preferred (Landguth et al., 2015).
Specific models could be implemented in agreement with the research question and the property of interest (Landguth et al., 2015). A model generally represents a simplification of the complexity of natural processes and can be developed to specifically investigate one or a limited number of specific properties of a system (e.g., neutral genetic diversity rather than genetic structure, Grimm et al., 2005). The implementation of a model depends on very specific assumptions that are made when translating biological processes into a computer algorithm (Grimm & Railsback, 2012). The complexity of the models largely depends on data availability (e.g., species dispersal capacity, population demographic parameters). Hence, even when investigating a specific empirical property, it is recommended to rely on models developed independently.
As a result, understanding the spatial variability in population genetic data can be tackled from different modeling angles.

| Box Scientific curiosity and expertise know no boundaries
This word cloud was picked up on the web page of Louis Bernatchez ( Figure Box 1). This figure illustrated the main research topics of Louis' laboratory, that is, fish and genes. It would however be restrictive to resume the Louis' research activities to the population genetics and genomics of fishes, and more generally, of freshwater and marine organisms, as highlighted by some seminal papers not directly related to fishes, for example, on MHC in nonmodel vertebrates (Bernatchez & Landry, 2003) or on the definition of conservation units (Fraser & Bernatchez, 2001). Louis also regularly hosts in his laboratory some outliers. I was one of them, working mainly on questions related to mammal and bird population genetics (e.g., in collaboration with Louis on black bear Ursus americanus (Roy, Yannic, Côté, & Bernatchez, 2012) or caribou (Yannic, Pellissier, Ortego, et al., 2014;Yannic et al. 2016)). My few years as a post- Scientific curiosity and expertise know no boundaries.

| Paleoclimate
We created a centennial time series for the times from 21'000

| Species distribution model
Species distribution models were calibrated using ensembles of four  year where a cell is covered with snow). We evaluated the model performance using the area under the ROC plot curve (AUC), Kappa, and true skill statistics (TSS). We used a split sample approach (70% calibration data and 30% evaluation data) with 20 repetitions. Models are considered to have a reliable performance with AUC scores > 0.7 and TSS values > 0.4 (Allouche, Tsoar, & Kadmon, 2006;Guisan, Thuiller, & Zimmermann, 2017). We applied a binary classification of the suitability from each model output using the maximum TSS approach to define the threshold with the presenceAbsence R package. Areas covered by glaciers Clark, 2012) were classified as nonsuitable habitat. To apply binary classification to the ensembles, we classified cells with predictions of at least two models as suitable climate. We used the threshold maximizing the TSS value to convert occurrence probabilities in presences/absences from the PresenceAbsence R package (Freeman & Moisen, 2008). The distribution was predicted with 100-year time steps till the LGM, 20,000 years BP. We validated the projection to the past with dated fossil records from Matthew T. Boulanger (Pers. Comm.), Lorenzen et al. (2011), Sommer, Kalbe, Ekström, Benecke, andLiljegren (2014), and all references therein.

| Computation of landscape connectivity through time
We precomputed matrices of connectivity between all the cells that are suitable for the species for all 100-year time steps from the LGM to the present. Cost distance and electric circuit are common tool in landscape genetics to investigate the connectivity among populations (McRae, Dickson, Keitt, & Shah, 2008) and can be directly compared to genetic distance metrics such as F ST , chord distance, or distance-based principal components analysis (see Shirk, Landguth, & Cushman, 2017 for an overview). We computed connectivity matrices based on cost functions but generalized the computation to multiple time steps from the LGM to the present. The connectivity matrix computation is based on the gdistance R package (van Etten, 2018) and can accommodate various dispersal costs of the different features of the landscape.
Nonsuitable habitat increases the cost of crossing the particular cell(s) according to their classification (e.g., glacier or sea) and thus decreases the possible geographic distance over which a population can disperse.
A population can disperse to another cell as long as the least cost path between them is cheaper than the dispersal distance of that population. We prepared matrices with different costs: Crossing unsuitable habitat and glaciers had a cost factor of 1.1 to 1.25, and crossing the (frozen) sea had a cost factor of 1.1 to 3.

| Simulating differentiation among populations through time
The model components are defined as the following steps.
Step 1: Habitat suitability change: The new binary classification of the SDMs discriminating suitable habitat from nonsuitable habitat is loaded.
Step analyses (data not shown), we found that during the colonization process, genetic differentiation by drift occurred faster than genetic homogenization by gene flow. Therefore, we set the accumulation of divergence to occur three times as fast as the reduction.
Step 3: Dispersal allows gene flow among connected cells at a distance d as presented above, but further allows the colonization of uncolonized cells. Species can disperse to all connected cells, which have a least cost path less than the dispersal distance d. The cells occupied by the species will change over time together with the genetic distance matrix quantifying pairwise differentiation between all the occupied cells.
As a burn-in at the start of the simulation, we replicated the oldest time step (20,000 years BP) for 50 time steps to accumulate genetic differentiations between the isolated populations before the dynamic simulation starts. As parameters in the illustration simulations, we randomly draw dispersal distances for each cell from a Weibull distribution with shapes of 1 and 2.5 with median dispersal ranges from 2.1 km to 26 km per year (95% quantile ranging from 4.7 to 89 km/ year). Values were selected according to the reported changes in calf ground location which reach a maximum speed of 15 km/year (Taillon, Festa-Bianchet, & Côté, 2012) to 25 km/year (Couturier, Jean, Otto, & Rivard, 2004) over nearly 20 years. Together with the sets of distance F I G U R E 2 Distribution of caribou and reindeer Rangifer tarandus sampled herds across the Holarctic distribution of the species. Sampling locations are colored according to population membership of each herd to the North American clade, considering two genetic clusters (from blue for North American clade to red for Euro-Beringian clade) as inferred in Yannic, Pellissier, Ortego, et al. (2014). For more details on sampling locations, see Yannic, Pellissier, Ortego, et al. (2014); Yannic et al., 2018) Sampling locations and genetic clustering

| Comparison between model and genetic data
We applied a principal coordinate analysis (PCoA) on the pairwise distance matrices obtained from our model as well as the Cavalli-Sforza chord distance Dc (Cavalli-Sforza & Edwards, 1967) and F ST value from 1,297 caribou individuals sampled from 57 locations around their circumpolar distribution (Yannic et al., 2018;Yannic, Pellissier, Ortego, et al., 2014). The PCoA was computed with the vegan R package (Oksanen et al., 2019) and the phylogenetic trees with the ape R package (Paradis, Claude, & Strimmer, 2004;Paradis & Schliep, 2018). We mapped the resulting ordinated values and compared simulated ordination and empirical data.
We also compared simulated and empirical pairwise genetic distance matrices using Spearman's correlation, and significance was tested with Mantel tests implemented in the vegan R package.
Because not all currently viable populations have been predicted with the model (and vice versa), we removed all populations which are not included in both the model and the genetic dataset for the comparison (see Results). We applied a buffer of 200 km around the genetic sampling location to assign the location to the simulation cluster.

| Species distribution models
The which were classified as suitable habitat, were predicted by all models, 87% by at least three and 95% by at least two models. The congruence between the models generally decreased over time. The average congruence between all four models between the present and 5'000 years F I G U R E 3 Simulated present genetic distinctiveness for five different simulation parameter settings of dispersal and landscape costs. Upper row (a)-(c): costs for crossing water equal costs for crossing glaciers. They are increased by a factor of 5/4 compared to suitable habitat. Lower row (d)-(e): costs for crossing water are increased by a factor of 10/3 compared to suitable habitat and costs for glaciers by 5/4. The dispersal distances are increasing from left to right; 1st column (a): median dispersal distance (mdd): 2.5km/year (95% quantile: 4.7km/year); 2nd column (b)-(d): mdd: 4.3km/year (95%: 7.8km/year); 3rd column (c)-(e): mdd 17km/year (95%: 31km/year). (b) Exhibits the output matching mostly to the observed genetic structure. Increasing the dispersal distance leads to only two genetic clusters, while increasing clearly the costs for crossing water results in two distinct populations on the side of the Bering Strait. The difference between (c) and (e) both with a high dispersal rate, but with a lower cost to cross the sea for (c) in comparison with (e), results in lower genetic difference between the Eurasia-Beringia population and the one from northeastern America Dispersal low m edium h igh

Cost s to cross sea
BP was 76%, while it decreased to 64% between 15,000 years BP to 20,000 years BP. All SDMs showed two distinct refugia in northern America during the LGM. The northeastern refugia located in Alaska close to the Bering Strait and the southern one from the plains in the Midwest toward the Atlantic (see also Figure 3). Fifty out of 57 (88%) sampling locations were predicted as suitable habitat by the models (incl. 200-km buffer; 45/57, 79%, with a 100km buffer).

| Formation of the clusters. Effect of the cost distance
The cost distance for different habitat types strongly influenced the formation of population clusters (Figure 3). Simulations with lower costs for nonsuitable habitat showed more limited population structure in Northern America when the dispersal capacity was high or disconnected populations between Russia and Eastern North America in case of very low dispersal capacities. Exponential dispersal (Weibull shape of 1) resulted in events of long-distance dispersal, which largely reduced the genetic structure among populations. The landscape and the dispersal parameters modulate the simulation outputs, which generate alternative expectations for the present genetic structure among populations that can be compared to the data.

| Comparison of the simulations with the data
Simulation outputs can be displayed in the form of genetic dendrograms and the corresponding distribution maps of the genetic clusters through time (Figures 4 and 5). southeastwards. The two populations became connected around 11,000 years BP. This formed a contact zone between the two genetically differentiated groups.

| D ISCUSS I ON
We provide a parsimonious method to simulate intraspecific genetic structure over a temporally dynamic landscape, which can be compared to empirical data. As a proof of concept, our illustrative application show the formation of two genetic clusters in the reindeer/ caribou associated with two main different refugia during the LGM 20 kyr BP, that is, in Eurasia and central North America south of the ice sheet (see also Yannic, Pellissier, Ortego, et al., 2014).   In simulations, the largest dispersal distance explored best corresponded to the empirical data and agrees with measures of calf ground displacement, that is, 15-25 km/year (Couturier et al., 2004;Taillon et al., 2012). Second, properties of the landscape facilitate or hinder the movement of species and can further influence the gene flow among populations. Reconstructing vegetation types in the past might have provided more structure for the formation of genetic differentiation within each of the genetic clusters.
The levels of details in the simulations outputs depend on the complexity of the landscape features that are accounted for in the computation as well as the resolution of the SDMs. In our illustration, the simulated genetic structure showed more limited complexity that in empirical data ( Figure 5). For instance, the lineage that had a refugia South of the North American ice sheet contained only three subgroups, which is lower than ~20 distinct lineages found in the empirical data.  (Yannic et al., 2018;Yannic, Pellissier, Ortego, et al., 2014). As a result, some lineages with significant differentiation are not reproduced in the simulations. To calibrate the SDMs, we used the IUCN range map rather than accurate locations of species observation. As such, the calibrated climatic niche of the species is not accurate and only captures the most obvious dimensions of the species ecological preferences. More accurate high resolution, but also computationally demanding simulations could generate more structured outputs with better agreement with the data. In the future, higher landscape complexity can be included in order to investigate whether with increased landscape features more complex genetic structure can emerge in the simulations, beyond the two main groups associated with the distinct refugees (Yannic, Pellissier, Ortego, et al., 2014).

| Management implications and future directions
We provide a parsimonious simulation model that generates expectations on intraspecific genetic differentiation as a result of landscape dynamics illustrated with the case of the caribou/reindeer. We propose to simulate how the interactions between, (a) dispersal ability, (b)

F I G U R E 6
Violin plots depicting the correlation between pairwise empirical (F ST and Chord distance) and simulated genetic distances. According to simulations, all populations that belong to the same cluster have a genetic distance equal to zero. Red dots correspond to mean values and error bars to standard deviation Simulated distance Chord distance habitat suitability from SDMs, and (c) landscape connectivity shape the gene flow between populations and the generation of intraspecific structure over time. Our approach remains simple in nature and further properties can be considered. In particular, our approach simulates beta diversity, but not alpha diversity: considering continuous suitability as a proxy of carrying capacity within cells can provide the means to integrate all diversity facets within the same framework. In a future development, the strength of our framework could be reinforced by the use of ancient DNA that could, when available, simultaneously validate past distribution modeling as well as genetic lineage occurrences. Furthermore, the simulation model provides expectation for neutral genetic markers and it remains to be seen how processes can be integrated in a simplified form to study adaptive loci (e.g., Rebaudo, et al., 2013). The simulation model further allows forecasting species range shifts and genetic beta diversity under climate change scenarios and evaluates how species range shifts will induce genetic diversity erosion of species. García-Díaz et al. (2019) show that quantitative models have already served an important role in generating effective conservation, policy, and management actions. We expect that a better understanding of the genetic changes as a result of past climate changes will help predict the future shifts of intraspecific genetic variation under ongoing climate changes.

ACK N OWLED G EM ENTS
We Joint call BiodivERsA-Belmont Forum call (project "FutureWeb"). We thank the Editor and two anonymous reviewers for comments on an earlier version of the manuscript.

CO N FLI C T O F I NTE R E S T
None declared.