Spatial areas of genotype probability: Predicting the spatial distribution of adaptive genetic variants under future climatic conditions

In a context of rapid global change, one of the key components for the survival of species is their genetic evolutionary potential for adaptation. Many methods have been developed to identify genetic variants underpinning adaptation to climate, but few tools were made available to integrate this knowledge into conservation management. We present here the SPatial Areas of Genotype probability (SPAG), a method to transpose the results of genotype–environment association studies into an evolutionary potential spatial prediction framework. We define a univariate model predicting the spatial distribution of a single‐locus adaptive genotype and three multivariate models allowing the integration of several adaptive loci in a composite genotype. Unlike existing methods, SPAGs provide (a) a flexible approach to combine loci under different types of intergenic relationships and (b) a cross‐validation framework to assess the pertinence of evolutionary potential predictions. SPAGs can be integrated with climate change projections to forecast the future spatial distribution of genotypes. The analysis of the mismatch between current and future SPAGs (“genomic offset”) makes it possible to identify vulnerable populations potentially lacking the adaptive genotypes necessary for future survival. We tested the SPAG approach on a simulated population and applied it to characterize the evolutionary potential of 161 Moroccan goats to bioclimatic conditions. We identified seven regions of the Moroccan goat genome strongly associated with the precipitation seasonality and used the SPAG approach to predict the evolutionary potential. We then forecasted the shift in SPAGs under a strong climate change scenario and uncovered the goat populations likely to be threatened in future conditions. The SPAG methodology is an efficient and flexible tool to characterize the evolutionary potential across a landscape and to transpose evolutionary information into conservation frameworks.


| INTRODUC TI ON
Climate change is causing a shift away from the favourable conditions necessary for the survival of many animal and plant species (IPBES, 2019). In order to avoid extinction under these conditions, threatened species can either move to more favourable areas or adapt to their new environment (Waldvogel et al., 2020). Due to limitations in dispersal capacity, loss of favourable habitats and increased landscape fragmentation, the possibilities for dispersal to new areas are often limited (McGuire et al., 2016;Opdam & Wascher, 2004). Tolerance to stressful environments relies on phenotypic changes, which can be induced by genetic evolution (Fox et al., 2019;Merilä & Hendry, 2014).
The evolutionary potential for adaptation (hereafter referred to as evolutionary potential) of a given population can be defined as the presence of the genetic variants conferring resistance against an environmental constraint (Harrisson et al., 2014). Information on evolutionary potential can have major implications in the design of efficient conservation strategies Nicotra et al., 2015;Sgrò et al., 2011;Shafer et al., 2015). For instance, it could complement models on species distribution and provide a more pertinent view of biodiversity richness (Waldvogel et al., 2020). In addition, the current distribution of adaptive genetic variants could be compared with climate change projections to identify populations lacking the genetic characteristics necessary for future survival (notion defined as "genomic offset," or "genomic vulnerability"; Bay et al., 2018;Capblancq et al., 2020;Fitzpatrick & Keller, 2015). The characterization of the evolutionary potential of a population follows a two-step framework: first, the genetic variants involved in local adaptation are uncovered, and then, their distribution across an area of interest is predicted (Peterson et al., 2019).
The fields of population and landscape genomics have produced a wide spectrum of approaches to uncover genetic variants involved in local adaptation of natural populations (Luikart et al., 2003;Rellstab et al., 2015). In outlier tests (or genome scans), loci deviating from the neutral distribution of genetic variation (outliers) are considered as potentially adaptive (Luu et al., 2017). Such analyses are often complemented by genotype-environment association (GEA) methods, which investigate how the genetic variation of a population matches the environmental variability of the landscape where the population is living (Rellstab et al., 2015). For instance, redundancy analysis (RDA) can be used to assess how the distribution of a group of loci can be explained by a set of environmental descriptors (i.e. multi-loci adaptive genotypes; Legendre & Legendre, 2012). Other GEA methods investigate the association between environmental variables and single loci (i.e. single-locus adaptive genotypes).
These methods are implemented in software like Bayenv (Günther & Coop, 2013), LFMM (Frichot et al., 2013), gINLAnd (Guillot et al., 2014) or Samβada (Stucki et al., 2017). The latter is designed for high-performance computation and is therefore attractive for whole genome-scale genotyping. In addition, Samβada has recently been complemented with an R-package facilitating the uptake of the method (R. Samβada, Duruz et al., 2019, with dedicated guidelines to boost statistical power via sampling strategy as proposed in Selmoni et al., 2020).
Over the last decade, different methods have been proposed to transpose the results of GEA analyses into evolutionary potential spatial prediction frameworks (Waldvogel et al., 2020). One of the most straightforward approaches defines the evolutionary potential as the sum of the adaptive genotypes observed in an individual (polygenic scores; Gagnaire & Gaggiotti, 2016). For instance, Babin et al. (2017) applied this method to adaptive genotypes (uncovered using RDA and Bayenv) in American eel. Linear and generalized linear models were then employed to evaluate how the polygenic scores of sampled individuals matched the environmental conditions at the sampling locations. A similar approach was applied to loci adaptive for aridity (identified using LFMM and gINLAnd) in sugar beet (Manel et al., 2018). Importantly, the study is one of the few that cross-validated the predictions on evolutionary potential on a separated dataset. There are also more sophisticated methods to predict evolutionary potential at population-level and across multivariate environments. For instance, Razgour and colleagues (2019) recently assessed the frequency of adaptive genotypes in distinct populations using LFMM and RDA, and then employed the RDA ordination axes to project the expected genotype frequencies across the environmental space. Another framework based on genotype frequencies at population-level is the one proposed by Fitzpatrick and Keller (2015). This framework first employed outlier tests and Bayenv to identify adaptive loci in balsam poplar populations and then alternatively used a nonlinear extension of matrix regression method (generalized dissimilarity modelling) or a nonparametric machine learning algorithm (Gradient Forest) to predict evolutionary potential.
Despite the usefulness of the existing approaches for predicting the evolutionary potential, there are still important gaps that need to be filled. One of the main issues is that evolutionary potential is often described as a sum of the effects of single adaptive genotypes (Gagnaire & Gaggiotti, 2016). Such a view of additive genetic effects is well established, but neglects the fact that epistatic genetic structures can also underpin adaptation (Hansen, 2013). Under an epistatic view of adaptation, the selective advantage provided by an adaptive genotype depends on the presence or absence of other adaptive genotypes. For instance, two adaptive genotypes can be considered as equivalent when the presence of one genotype, regardless of the other, is sufficient to confer the selective advantage (Thompson & Jiggins, 2014). This situation can also concern loci implicated in adaptation to different environmental constraints, since the molecular basis of stress responses can converge towards common physiological mechanisms (Pandey et al., 2015). Understanding how the combined effects of different loci contribute to the evolutionary potential requires (a) a flexible framework to combine adaptive loci in different additive/epistatic relationships when computing the evolutionary potential and (b) a rigorous cross-validation approach to assess how pertinent evolutionary potential predictions are under the different additive/epistatic relationships.
We propose here a novel approach to predict the frequency of adaptive genotypes called SPatial Areas of Genotypes Probabilities (SPAG). The SPAG approach is based on logistic genotypeenvironment associations (Joost et al., 2007) and takes as input the results of the GEA software Samβada (Stucki et al., 2017). By relying on conditional probability theory, the SPAG approach allows for the combination of adaptive genotypes in different types of intergenic relationships. The resulting SPAGs can be used to portray the evolutionary potential across an area of interest or to identify vulnerable populations that may be threatened by climate change. Importantly, the SPAG approach features a cross-validation step to assess the accuracy of the predictions on evolutionary potential. We introduce the theoretical bases of SPAGs and validate the approach with a simulated dataset. We then apply our approach to a case study on Moroccan goats to characterize their evolutionary potential to current climatic conditions and to predict genomic offset to climate change.

| SPAG approach
The SPAG approach is based on the spatial analysis mMethod (SAM; Joost et al., 2007) implemented in the GEA software Samβada to uncover candidate adaptive single-locus genotypes (Stucki et al., 2017).
We first describe the SAM method and then show how the results of SAM can be used to predict spatial areas of genotype probabilities. Four different types of SPAGs are described (univariate SPAG, Intersection-SPAG, Union-SPAG and K-Percentage-SPAG), mirroring different possibilities of intergenic relationships between adaptive loci.

| Logistic regressions (SAM)
The SAM method assumes a linear response of the genotype to the environmental variable and uses logistic regressions (Formula 1) to assess the probability of presence of a (single-locus) genotype G1 as a function of an environmental variable (x1), where β0 and β1 are the parameters of the regression to be fitted.
Independent univariate logistic regressions can be computed between each genotype and each environmental predictor, and significant associations can be identified using statistical tests. Joost et al. (2007) suggested the combined use of a likelihood ratio (G score) and a Wald test. The likelihood ratio (G) compares the likelihood of a model with the likelihood of a null model without the environmental variable of interest. The null hypothesis is that the model considered does not explain more variance than the null model, that is the environmental variable considered does not help in predicting the probability of presence of the genotype. The Wald test is a common statistic used to estimate whether a parameter is equal to a given value. In our case, it is used to reject the null hypothesis that the β parameter associated with an environmental variable is equal to 0, which would also indicate that it has no effect on the probability of finding the genotype.

| Univariate SPAG
Once a single-locus genotype involved into a significant association with an environmental variable has been identified, Formula (1) enables the estimation of the probability of presence of the genotype for any value of the environmental variable (x1). We consequently used it to estimate and delimit on a map the probability of presence of a genotype over the whole region of interest (Joost, 2006;Rochat et al., 2016). We named such a delimited surface univariate Spatial Area of Genotype Probability (SPAG).

| Intersection-SPAG (I-SPAG)
The Intersection model (I-SPAG) is used to compute the probability that a set of single-locus genotypes are all simultaneously present.
This corresponds to the situation where the evolutionary potential is defined by the additive effect of complementary loci (Gagnaire & Gaggiotti, 2016). Following the theory of conditional probability (Kolmogorov, 1956), the probability of simultaneous presence of n adaptive single-locus genotypes G i , i = 1:n can be computed using is a conditional probability that can be estimated using a logistic regression where ⋂ n − 1 i = 1 G i is integrated as a covariate (Formula 3).
However, as we would like to use this model to predict the probability of presence of the genotypes for any point of the region of interest, that is also where G i values are unknown, we suggested to Using the associative property of the intersection operator, the intersection of n genotypes can be computed by starting with the univariate model p(G1), which is used as a covariate to compute p(G1 ∩ G2), itself used to compute p(G3 ∩ (G1 ∩ G2)), etc. Formula (3) can thus be implemented with a recursive model based on the univariate formula in which covariates are added (see File S1 for more details). We implemented this model as an R function available following the link given in the section "code availability" at the end of the manuscript.

| Union-SPAG (U-SPAG)
The union model (U-SPAG) is used to compute the probability of finding at least one of the adaptive single-locus genotypes of interest. This view of evolutionary potential corresponds to the epistatic situation where the presence of an adaptive single-locus genotype is sufficient for securing the selective advantage, while additional adaptive genotypes do not confer additional advantages (Hansen, 2013). We implemented it with the inclusion-exclusion principle (e.g. for two genotypes: . We implemented the generalized formula for n adaptive genotypes (Formula 4), as an R function based on the intersection model previously described (see code availability and File S1).

| K-percentage SPAG (K-SPAG)
Finally, we developed a K-percentage model (K-SPAG) to estimate the probability that an individual carries K% of n adaptive single-locus genotypes. This view of evolutionary potential is at the crossroads between the additive situation, described in the I-SPAG, and the epistatic situation of U-SPAG. Here we assume that a higher number of adaptive single-locus genotypes increases the selective advantage (additive effect), but also that when a sufficient number of adaptive genotypes is present, there is no additional selective advantage associated with additional genotypes (epistatic effect). This probability can be computed by combining formulas from the union and intersection models (Formula 5, explained in more detail in File S1). Again, this formula was implemented as an R function (see code availability).
Note that all multivariate models allow the integration of adaptive genotypes associated with various environmental variables since the environmental variable xi used to compute p(Gi) can be different for each i.

| Simulation study
In order to test the SPAG approach, we first computed a simulated dataset using the individual-based population genetics model software CDPOP 1.3 (Landguth & Cushman, 2010;Landguth et al., 2020). We simulated individual genetic exchanges and natural selection across 300 non-overlapping generations among 200 individuals randomly located in a 500 × 500 gridded landscape.
For the breeding parameters, we considered a sexual reproduction, with random mating, both male and female with replacement, no selfing, no philopatry, no multiple paternity, equal sex ratio and each mated pair producing three offspring. The movement of the individuals was linearly restricted as a function of the Euclidean distance, with a maximum dispersal corresponding to 25% of the entire landscape. We simulated 50 diallelic loci, with three loci under selection (L0, L1 and L2). The selection was implemented using three 500 × 500 raster gradients, the first from north to south (X0), the second from east to west (X1) and the third from northwest to southeast (X2; see Figure 1a). We set the average effects bL0A0A0 = 10 and bL0A1A1 = −10 for the locus L0 with the environmental variable X0, which indicates that the genotype A0A0 from locus L0 will be favoured in the south (where X0 = 1), whereas A1A1 will be favoured in the north (where X0 = −1). We set similar effects for the locus L1 with the environmental variable X1 and L2 with X2. All other beta effects were set to 0, indicating no influence of the environmental variable to the distribution of genotypes. All genotypes were randomly initialized at the beginning of the simulations. The exact list of simulation parameters used is provided in File S2.
Univariate logistic regressions were then applied to the genetic data of individuals at the 300th generation in order to identify the most significant associations, which should highlight loci under selection. Univariate and multivariate SPAGs were then applied to estimate the probability to find these genotypes across the simulated landscape. The results were validated using a cross-validation procedure presented below.

| Genetic data
The real case study focussed on goats (Capra hircus) from Morocco. The genetic dataset was produced in the context of the NEXTGEN project (Alberto et al., 2018). The NEXTGEN project produced whole genome sequences data for 161 Moroccan goats from 6 different local breeds. Since goat production system in Morocco is mainly free range, these goats are living from 8 to 12 months outdoors (Boujenane, 2005) and are confronted to contrasting environmental conditions, from the Sahara desert (4) Univariate and multivariate spatial areas of genotypes probability (SPAG) for the simulated dataset. The identifiers of the presented models (S1, S5, S8) refer to Table 1. Please refer to Box 1 to interpret the validation graphs shown on the right of each map 2.3.2 | Environmental data The climatic conditions of the sampling locations were characterized using the 19 bioclimatic variables (File S4) from the WorldClim database (https://www.world clim.org/), representative of the period 1960-1990(Hijmans et al., 2005. Each variable was retrieved as a raster layer with a spatial resolution of 30 arc-seconds (approx. 1 km 2 ), and values were extracted for all sampling locations using the extract function from the R-package raster (Hijmans & van Etten, 2012). In order to get a comparable scale of values for all bioclimatic variables, we applied a standardization by subtracting the mean and dividing by the standard deviation. Some of the bioclimatic variables are highly correlated. However, we choose to keep all of them to be able to identify a posteriori which variable had the strongest effect. Since no models computed involved more than one environmental variable simultaneously, this collinearity will not impact the results.

| Population structure
We ran a preliminary analysis of population structure to evaluate the possible confounding role of neutral genetic variation on the detection of adaptive genotypes (Li et al., 2012). The genetic population structure was estimated with a principal component analysis (Price et al., 2006;Reich et al., 2008) computed with the function snpgd-sPCA from the SNPRelate R-package (Zheng et al., 2012). In order to avoid a strong influence of SNP clusters on this analysis, we used here a pruned set of SNPs that are in approximate linkage equilibrium with each other. The pruning was performed with the function snpgdsLDpruning from the SNPRelate package, with a threshold D′ = 0.2. The resulting datasets contain 59,224 SNPs.
The cumulated variance explained by the 10 first PCA components on the SNP markers represents only 8.1% of the total variance, and the increase in variance explained is almost proportional to the number of components, which highlights that there is no clear substructure (File S5). We thus performed the GEA analysis without any covariates for population structure.

| Logistic regressions and SPAGs
Logistic models were computed for all the combinations of singlelocus genotypes with the 19 bioclimatic variables. The statistical significance of the model was assessed using Wald test and loglikelihood ratio (G), both corrected for the false discovery rate due to multiple comparisons using the procedure proposed by Benjamini and Hochberg (1995), under an expected false discovery rate (FDR) of 0.05 (i.e. 5% of the results expected to be false positives). An association was considered as significant if both the models without covariates and with population covariates were significant.
In order to identify potential functions of the SNPs involved into the significant associations, we used the NCBI Genome Data

| Validation procedure
SPAGs were validated with a cross-validation procedure, using 25% a custom validation graph presented in Box 1. The cross-validation procedure was repeated 10 times. In order to avoid bias due to rare genotypes occurrences, we applied this cross-validation procedure only when at least 10 individuals carried the genotype of interest.

| Simulated dataset-logistic regressions
The 10 most significant genotype-environment associations obtained with the simulated dataset ranked on the basis of the likelihood ratio (G) are presented in Table 1. We observed that the three loci simulated as under selection (L0, L1, L2) were coherently identified as the most significantly associated with the environmental variables under study.

| Simulated dataset-SPAGs
We focussed on the three single-locus genotypes under selection (corresponding to models S1, S5 and S8), and we evaluated how using different types of SPAGs modulated the predictive power of the approach. Figure 1 displays an example for every type of

SPAG.
Univariate SPAGs resulted in an average AUC of 0.84 ± 0.05 (Table S1). Figure 1b presents the univariate SPAG for the model S1 (Locus L1, genotype A0A0 associated with the environmental variable X1). Since this locus was simulated as under selection with the east-west gradient X1, the resulting SPAG coherently shows a sim-  Table S1). Figure 1c Table S1). Figure 1d presents a U-SPAG of three single-locus genotypes under selection. It shows that the probability of finding at least one of them is higher than 0.5 in most of the area. The validation graph indicates that the model failed to estimate the

BOX 1 Validation graph
SPAGs indicate the probability of finding one or more genotypes of interest in a territory (panel below). For a given probability threshold value (e.g. th = 0.6), we can thus use the SPAG to delimit the area where the probability of finding the genotype(s) of interest is predicted to be greater or equal to this threshold (e.g. p ≥ 0.6). If the SPAG is valid, the frequency of the genotype(s) observed among the testing individuals located within the thresholded SPAG should effectively be greater or equal to the threshold value, whereas it should be less outside.
The goal of the validation graph presented in this box is to evaluate the predictive power of SPAGs for different levels of genotype probability. We thus calculated the observed genotype frequencies inside and outside the thresholded SPAGs for each threshold value between 0 and 1 (with a step of 0.1), and we presented the results on a graph (panel below). The green line indicates the genotype frequency observed inside the thresholded SPAG, whereas the red line shows the genotype frequency observed outside it. A black line indicates the limit case where the observed genotype frequency is equal to the threshold value. For a given level of genotype probability, the SPAG is thus validated if the green line remains above the black line and the red line remains below it. The green and red areas around the lines indicate the 95% confidence intervals for each line, computed on the basis of the 10 cross-validation runs. We also presented on the graph the percentage of individuals located within the thresholded SPAG (grey line) and outside it (dotted grey line). The hatched grey areas indicate ranges of testing values where there was less than 5 individuals remaining inside or outside the thresholded SPAG, which was therefore considered not to be usable for the validation.
For a threshold value th = 0.4, the SPAG is validated since 50% of the testing individuals located within the area SPAG ≥ 0.4 carry the genotype of interest, whereas only 26% carry it outside (SPAG < 0.4). Inversely, the model is not validated for th = 0.6, since only 54% of individuals carry the genotype of interest within the area where the SPAG predicted a probability of at least 0.6 (SPAG ≥ 0.6). The model is also not valid for a value th = 0.2 since 23% of individuals carry the genotype of interest in the area SPAG < 0.2.
probability of finding the genotypes for threshold values below 0.2 and above 0.8.
The AUC of the K-SPAG was higher than the univariate and U-SPAGs (AUC = 0.87 ± 0.02). Figure 1e depicts the probability of finding at least two of the three single-locus genotypes under selection, which is high in the southeast of the map. The validation graph indicates that the predictive power of the K-SPAG was usable for any level of genotype probability.

| Moroccan goats-logistic regressions
More than 483 million logistic association models were computed.
After correction for false discovery rate with a significant threshold of 5%, no model was significant according to the Wald score, but seven models were significant according to the G score (Table S2a). Among them, three models were strongly associated with the precipitation seasonality (bio15, Figure 3a), which is the coefficient of variation CV = SD mean of the variation of monthly precipitation over the year. Following this initial result, we investigated in more details the adaptation to this bioclimatic variable.
When considering only the associations involving bio15 (25,447,348 models), 78 models were significant after FDR correction of G score, with a significant threshold of 5% (Table S2b). The SNPs involved in these models are located on seven different genomic regions (Table 2), corresponding to four annotated genes F I G U R E 2 SPAG-Moroccan dataset. Univariate and Multivariate spatial areas of genotypes probability (SPAG) for the Moroccan dataset. The identifiers of the presented models (M1, M3, M4) refers to Table 2. The maps show the average genotype(s) frequency(ies) based on the 10 runs computed with different random selection of training sets containing 25% of the total number of individuals. AUCtest indicates the mean value of the AUC computed with the testing dataset over the 10 runs. Please refer to Box 1 to interpret the validation graphs shown on the right of each map TA B L E 1 Most significant models-Simulated datasets. 10 most significant models obtained for the genotype-environment association (GEA) analysis of the simulated datasets. The simulated dataset sets an association between locus L0 and environmental variable X0, locus L1 and environmental variable X1 and locus L2 and environmental variable X2. GEA models are ranked based on the likelihood ratio. Env, environmental predictor; Geno, genotype; pG, p-value associated with the G score; pW, p-value associated with the Wald score; β0 and β1 are the parameters of the logistic regression (DSG4, CDH2, KCTD1 and WRN) on the reference genome CHIR 1.0.

| Moroccan goats-SPAGs
The seven significant single-locus genotypes uncovered in the GEA were employed to compute univariate SPAGs, I-SPAGs, U-SPAGs (both bi-and trivariate) and K-SPAGs (trivariate only). We assessed the predictive power of the approach under different types of SPAGs using AUC, and we show some examples of SPAGs in Figure 2.
Univariate SPAGs resulted in an average of AUC of 0.79 ± 0.03 (Table S3; Figure 2a shows the univariate SPAG for the genotype of model M1 presented in Table 2 (see File S6 for the other uni-variate SPAGs). The predicted probability of presence of the genotype is the highest in the extreme southwest of the country. In this region, the precipitation seasonality is the greatest (>100%, Figure 3a) and all goats carry the genotype of interest. In northern and western coastal areas, the predicted probability of finding the genotype is close to 0.5, since the precipitation seasonality is relatively high (>70%). Finally, in the eastern regions of the country, the probability of finding the genotype of model M1 is much lower (<0.2 in most areas). In these regions, variations of precipitation SPAGs. Figure 3b shows the I-SPAG for the genotypes involved in GEA models M1 and M4 (Table 2). This I-SPAG indicates that the simultaneous presence of the two single-locus genotypes is very unlikely (p < 0.2) for most of the territory, with the exception for the southwest of the country.

F I G U R E 3 Moroccan goats-
Union SPAGs (U-SPAGs) generally resulted in lower AUC, compared to I-SPAGs (bivariate U-SPAGs: AUC = 0.81 ± 0.03; trivariate U-SPAGs: 0.82 ± 0.03). Indeed, we found only one combination of single-locus genotypes with higher AUC under U-SPAG (0.84 ± 0.01) than under I-SPAG (0.81 ± 0.02). This U-SPAG employed genotypes from M1 and M3 and is shown in Figure 3c. The probability of finding at least one of these two genotypes is close to 1 for most of the country except in the north-eastern region.
Percentage SPAGs (K-SPAGs) showed AUC values generally lower than I-SPAGs (AUC = 0.82 ± 0.05). One exception was the K-SPAG shown in Figure 3d, employing at least two of the genotypes from M1, M3 and M4. For these three loci, the K-SPAG showed an higher AUC (0.86 ± 0.01) than the I-SPAG (0.83 ± 0.02) and the U-SPAG (0.84 ± 0.03). Unlike the I-SPAG and U-SPAG shown in Figure 3b,c, the validation plot shows that this K-SPAG is validated for any level of probability of presence of the adaptive genotype. Figure 3 shows the genomic offset, which represents the differences between the current SPAGs (presented in Figure 2) and the SPAGs corresponding to their projections for 2070. In Morocco, precipitation seasonality (bio15) is predicted to increase in the north of the country, with a maximum increase of 5%-10% in the extreme northwestern region, and to decrease (from −10% to −20%) in other areas, especially in the Atlas Mountains (centre of the country, in the eastern region) and near the Sahara (South region; Figure 3a,b).

| Moroccan goats-genomic offset
The genomic offsets of the univariate SPAG for model M1 have to be slightly higher (up to +10%) for individuals to be adapted to 2070 conditions. However, the multi-locus genotypes described by the U-SPAG and the K-SPAG were already found in most of the sampled goats, with the exception of the north-eastern regions.

| Mapping genotype probabilities
The different SPAGs computed in this work generally showed an AUC above 0.8 on both simulated and real data, stressing the predictive power of the approach. While previous works had already described methods to transpose GEA results into predictions of evolutionary potential (Babin et al., 2017;Fitzpatrick & Keller, 2015;Gagnaire & Gaggiotti, 2016;Razgour et al., 2019), the cross-validation of such predictions is rare (Manel et al., 2018). In this regard, it is noteworthy to mention that the SPAG approach showed a generally high predictive power despite using a limited number of training samples (i.e. 50 simulated individuals, 41 Moroccan goats).
As we compared different types of SPAGs, we noticed that the I-SPAGs generally showed a higher predictive power. Indeed, for a same combination of single-locus genotypes, the I-SPAGs resulted in higher AUCs than the corresponding U-SPAGs and univariate SPAGs.
We expected to observe this in the simulated dataset, because the three adaptive genotypes were set as independent from each other  (Phillips, 2008). Unfortunately, such interpretation is hampered by the fact that both loci are located in intergenic regions of the genome and therefore possibly act as regulatory regions on distant genes (Brodie et al., 2016).
Nevertheless, previous research expects that the evolutionary potential is shaped by the co-existence of adaptive loci underpinning both additive and epistatic relationships (Hansen, 2013;Thompson & Jiggins, 2014). The existing methods to predict evolutionary potential overlook the epistatic component of adaptation either because (a) they explicitly focus on additive effects (e.g. "polygenic scores"; Gagnaire & Gaggiotti, 2016); (b) they work at population-level (Fitzpatrick & Keller, 2015). Indeed, in population-level approaches the presences/absences of single-locus genotypes are averaged as genotype frequencies by population. Consequently, it is impossible to assess whether two loci co-occur (or not) in the same individual and to establish whether they are complementary (or alternative) to each other.
The K-SPAG overcomes these issues, by providing a flexible approach to use information at individual-level to combine adaptive loci underlying different types of intergenic relationships. For instance, in the Moroccan case study we constructed a composite genotype combining alternative loci (M1 and M3) with additive loci (M1 and M4), and the resulting K-SPAG showed higher predictive power than the corresponding I-SPAGs and U-SPAGs alone.

| From SPAG to conservation: The Moroccan goat example
Information on the evolutionary potential can have major implications in predicting species response to climate change (Razgour et al., 2019). One of the main application in conservation is the prioritization of areas based on the information on evolutionary processes (Orr & Unckless, 2008 in outdoor pastoralist systems, therefore exposed to climatic constraints (e.g. heat stress, limited water availability, presence of parasites; FAO, 2015). Increase in precipitation seasonality is a threat for such populations, as it implies a more heterogeneous distribution of rainfall across the year, exacerbating the risk of drought and floods (Nardone et al., 2010). Climate forecasts expect an increase in precipitation seasonality in the northern regions of Morocco (Tramblay et al., 2012). We estimated the genomic offset using different types of SPAGs, and all the resulting predictions agree that in the northern regions of Morocco an increase in the evolutionary potential for adaptation to precipitation seasonality will be necessary by 2070.
Importantly, such increase in the evolutionary potential depends This mismatch in the results underlines the importance of assessing how complex relationships between adaptive loci contribute to the evolutionary potential (Hansen, 2013). In this regard, it is recommendable to complement the use of SPAGs in conservation plans with other information on phenotypic or physiological characteristics associated with the different adaptive genotypes (Waldvogel et al., 2020).

| Limitations and perspectives
The SPAG approach presented appears to be powerful for mapping the probability of finding locally adapted genetic variants in a landscape. However, the adaptation process is complex and can involve large numbers of loci with small individual effects (Pritchard & Rienzo, 2010), for which the detection power of the single-locus GEA methods may be reduced (Harrisson et al., 2014;Villemereuil et al., 2014). In this case, it may be advisable to use multi-loci GEA methods to detect adaptive genotypes Legendre & Legendre, 2012) and to use multi-loci prediction frameworks to predict evolutionary potential (Fitzpatrick & Keller, 2015).
In addition, since the genomic offset calculations may be highly dependent on the climate change scenario considered, computations should be performed with various scenarios and less weight should be given to the conclusions not consistent within scenarios (Reside et al., 2018). It is also possible that the forecasted environmental conditions exceed the climatic ranges observed today, leading to extrapolation (and the related uncertainty) in the calculation of future evolutionary potential (Elith et al., 2010). Such uncertainty should always be considered in the interpretation of the genomic offset calculated using the SPAGs.
Furthermore, in order to assess the real vulnerability of populations, an analysis of connectivity should be carried out to highlight the potential of natural gene flow to increase the probability of finding favourable genotypes in threatened populations. Such information on connectivity could be used to develop a statistical framework assessing the likelihood of the shift from a current SPAG to a future SPAG.
The SPAG approach can be extended to the output of other types of GEA methods, assuming that such output is convertible into a binary measure of genotype presence/absence. This might be relatively straightforward for GEA methods that are individual-based and focus on single-locus genotypes. With LFMM (Frichot et al., 2013) for instance, the adaptive genotype (or allele) can be easily identified by comparing the variation of the genetic variant of interest across the environmental gradient driving adaptation. For population-level GEA methods (e.g. RDA; Legendre & Legendre, 2012), the use of SPAGs is more complex and less pertinent. In fact, with this type of GEA methods the frequencies of single-locus genotypes are generally averaged by population and grouped via ordination techniques, and it is therefore difficult to assess in which individuals the adaptive genotype is present (or absent).
Finally, GEA and SPAGs could also be used to predict the presence of genotype(s) associated with other pressures showing a spatial distribution, such as the presence of a parasite (Vajana et al., 2018) or a predator (Cousyn et al., 2001) or the urbanization level (Harris & Munshi-South, 2016). Including such selective pressure will portray a more realistic view of how different species are shaping their evolutionary potential in response to climate change.

ACK N OWLED G M ENTS
We thank Paolo Ajmone Marsan and Elia Vajana for the fruitful discussions we had about the subject of this research. The authors declare that there is no conflict of interest regarding the publication of this paper.

Pe e r Rev iew
The peer review history for this article is available at https://publo ns.com/publo n/10.1111/ddi.13256.

DATA AVA I L A B I L I T Y S TAT E M E N T
The Moroccan goat dataset is available at https://proje cts.ensem bl.org/nextg en/ (population MOCH