Climate change impact on cultivated and wild cacao in Peru and the search of climate change‐tolerant genotypes

Cacao (Theobroma cacao L.) is expected to be vulnerable to climate change. The objectives of this study were to (a) assess the future impact of climate change on cacao in Peru and (b) identify areas where climate change‐tolerant genotypes are potentially present.


| INTRODUC TI ON
Climate change is expected to severely impact agricultural production. Global average temperature has already increased by about 0.85°C during the period 1880-2012 and is projected to further increase by 1.4-3.1°C towards the end of this century (Stocker et al., 2013). Rising temperatures in combination with changes in precipitation patterns and higher frequencies of extreme droughts and floods (Cai et al., 2014;Stocker et al., 2013) are expected to reduce crop yields and threaten farmers' livelihoods around the world (Parry et al., 2004;Porter et al., 2014). Impacts are likely to disproportionately affect agricultural systems in the tropics, both because negative effects on yields are expected to be larger compared with temperate zones (Parry et al., 2004;Porter et al., 2014) and because smallholder systems have lower resilience due to their limited available resources (Morton, 2007). As climate change will have differential impacts on individual crop species or varieties and growing areas (Porter et al., 2014), urgent action is needed to better understand the expected impacts of climate change on specific crops, particularly in the tropics, to guide the development of adaptation strategies.
Cacao (Theobroma cacao L.), one of the most important cash crops in the tropics, is expected to be vulnerable to climate change Lahive et al., 2019;Medina & Laliberte, 2017).
Evolved as an understory tree native to the Amazon basin, cacao is a drought-sensitive species and high temperatures and low precipitation can negatively impact its growth, productivity and yield quality (Daymond & Hadley, 2008;Moser et al., 2010;Zuidema et al., 2005).
Increasing evidence indicates that climate change is already affecting cacao cultivation around the world, resulting in higher mortality, declines in yield quality and quantity (Medina & Laliberte, 2017), and increased incidence of diseases (Gateau-Rey et al., 2018).
Habitat suitability models-also known as species distribution or ecological niche models-are useful tools to both assess the future impact of climate change and to guide the selection of climate change-tolerant genotypes. These models can estimate current and future suitability distributions of species and describe their environmental range and can therefore be used to identify regions where genotypes adapted to climatic extremes are potentially present.
In the case of cacao, recent suitability modelling studies have predicted a reduction in the growing area and a shift towards higher elevations. However, these studies have focussed on West Africa Schroth et al., 2016) and Central America (Eitzinger et al., 2015;de Sousa et al., 2019), and until today research in the centre of origin of cacao remains limited. To date, suitability modelling has not been used to support genotype selection yet (Medina & Laliberte, 2017).
The objectives of this paper were to (a) assess the future impact of climate change on the distribution of suitable habitat of cultivated and wild cacao in Peru and (b) identify areas where climate changetolerant genotypes are potentially present. Peru has great potential for cacao genotype selection as it presents high levels of genetic diversity and is the country of origin of multiple genetic groups recognized within the species (Motamayor et al., 2008;Thomas et al., 2012). Because exchange of materials between countries is still restricted, and in order to identify locally adapted genotypes in the country, the search of genotypes is limited within Peru. We used an ensemble modelling approach to assess the distribution of suitable habitat under current and future climatic conditions considering three time horizons (2030s, 2050s, 2070s) and two greenhouse gas emission scenarios (RCP4.5 and RCP8.5). Because model selection can significantly influence model results (Araújo & New, 2007;Thuiller et al., 2019), we assessed the impact of different approaches of model assembly on current and future suitability distributions. Furthermore, we investigated the potential emergence of novel climates within the cacao suitable area and determined how cacao ecogeographical zones (representative for particular sets of growth conditions) might change under future climates. Finally, we carried out an outlier analysis based on the environmental variables most relevant for climate change adaptation to identify areas where climate change-tolerant genotypes are potentially present, which will be target of collection missions.

| Presence data
Cacao presence points were compiled from several sources (Text S1 in Appendix S1) and divided into cultivated and wild cacao according to the indication in the source. For cultivated cacao, we retained only presence points within Peru (Figure 1a). For wild cacao,  Figure 1b). Cultivated cacao in Peru spans at least seven of the 10 genetic groups currently described in cacao (Motamayor et al., 2008) and includes also many introduced cacao varieties. Therefore, in our dataset, we considered both national and international cultivated cacao varieties present in Peru (Table S1 in Appendix S1). For cultivated cacao, presence points from the coastal regions and other sites with <600 mm of annual precipitation were omitted as cacao F I G U R E 1 Prediction of suitable habitat for all the ensemble models constructed with different pseudo-absence points selection approaches and resolutions of spatial filtering for a) cultivated and b) wild cacao in Peru. The best-performing models for each of the pseudo-absence point selection approaches are in red boxes: the base model at 5 arcmin filtering and the target group model at 30 arcsec filtering for cultivated cacao, and the base model at 30 arcsec filtering and the target group model at 30 arcsec filtering for wild cacao. Green areas represent the suitable area for cacao predicted by each model in the left maps. Presence points used for the modelling are shown in red in the right maps can only be grown in these regions under irrigation, which would confound the identification of naturally suitable areas. Accordingly, there are no records of wild cacao in these regions . The compiled   presence points comprised 19,685 points for cultivated and 1,182 for wild cacao.

| Environmental variables
The explanatory variables included 34 climate, soil and terrain variables commonly used in habitat suitability modelling (Table S2 in Appendix S1). These variables were selected because climate, soil and terrain are the main factors influencing species distributions.
Collinear variables under present conditions were removed based on stepwise calculations of variance inflation factors (VIF), retaining only variables with VIFs <10. As correlations between variables can change both in strength and direction in the future (Braunisch et al., 2013;Dormann et al., 2013), we also assessed collinearity under future conditions and retained all variables resulting from the stepwise VIF procedure in at least one time horizon-by-emission scenario combination (see below). Although this resulted in a final set of variables of which some had a VIF >10, this does not outweigh the risk of excluding variables with diverging collinearity patterns under future conditions, which may lead to misleading predictions of future habitat suitability (Braunisch et al., 2013). Multicollinearity analyses were based on the values of the explanatory variables at the locations of all presence and pseudo-absence points used to calibrate each of the ensemble models (see below).

| Habitat suitability modelling
We generated multiple ensemble models (see below) for both cultivated and wild cacao using different approaches to select pseudoabsence points, and different resolutions of spatial filtering. We selected the pseudo-absence points of both approaches from the area enclosed by a convex hull polygon around all presence points extended with a buffer corresponding to 10% of the polygon's largest axis (Acevedo et al., 2012). We further restricted the area to sites below 2500 masl (hereafter called the "convex hull area") to improve the ability of the suitability models to distinguish between presence and absence at lower altitudes.
We used two approaches for the selection of pseudo-absence points: random selection (from hereon called the "base approach") and target group selection ("target group approach"). We separately selected pseudo-absence points for the presence-absence algorithms in our ensembles (see below) and background points for Maxent algorithm. In the base approach, background points and pseudo-absences were randomly selected from the convex hull area. In the target group approach, pseudo-absence and background points were selected from the convex hull area according to the method outlined by Phillips et al., (2009) and Mateo et al., (2010), which involves the selection of points from grid cells with presences of species belonging to a similar group as the target species. These locations are expected to reflect a similar sampling bias compared with the target species, thus reducing the effects of spatially biased presence points on model calibration (Phillips et al., 2009).
The target group grid to select pseudo-absence points was based on the farm locations in the Peruvian national agricultural census of 2012 for cultivated cacao and on the presence points of all tree and shrub species occurring in the national checklists of Peru, Ecuador and Colombia for wild cacao (see Text S2 in Appendix S1 for more details).
Both pseudo-absence point selection approaches for cultivated and wild cacao were crossed with different resolutions of spatial filtering. Spatial filtering is used to reduce the effects of spatially biased presence points on model calibration (Kramer-Schadt et al., 2013) and was implemented by randomly retaining one presence point per grid cell for each of the selected resolutions of spatial filtering.
We carried out spatial filtering at four resolutions, ranging from 30 arcsec (ca. 0.9 km at the equator) to 15 arcmin (ca. 27 km). For the target group approach, we also filtered the pseudo-absence points to the respective resolutions of the models to reflect the spatial bias of the filtered presences. For the base approach, we kept the original resolution of pseudo-absence points as randomly selected points should not show spatial sorting bias. As such, we calibrated eight ensemble models for cultivated cacao and eight ensemble models for wild cacao (two pseudo-absence points selection approaches combined with four resolutions of spatial filtering).
Suitability modelling was performed using ensembles composed of up to 19 modelling algorithms, using the BiodiversityR package for R (Kindt, 2018). The list of the 19 modelling algorithms is presented in Table S3 in Appendix S1. To carry out an initial selection of the algorithms to include in the ensemble and remove spatial sorting bias (Hijmans, 2012), we performed a 20-fold cross-validation for each algorithm using the presence and pseudo-absence points of each of the combinations of pseudo-absence selection and spatial filtering, calculated the calibrated Area Under the receiver-operator Curve (cAUC) for each of the cross-validation folds and retained the algorithms with cAUC values significantly higher than the null model according to Mann-Whitney tests, following Thomas et al., (2014).
The retained algorithms in the ensemble models were crossvalidated for 10 folds using spatial blocks with the blockCV package for R (Valavi et al., 2019), following Fremout et al., (2020). With the spatial block cross-validation, presence and pseudo-absence points were partitioned in training and testing points using a set of spatial blocks, with each fold consisting of one or more 200 km wide squared blocks. AUC values cross-validated with spatial blocks (cvAUC) provide a better measure of model transferability, which is crucial when projecting species distributions to future climates (Wenger & Olden, 2012). The weights of the algorithms in each of the ensemble models were optimized using the ensemble.tune function in BiodiversityR (Kindt, 2018). The ensemble models were then calibrated again using all presence and pseudo-absence points and the previously optimized weights. For each of the ensemble models, we calculated four evaluation metrics (cvAUC, False Negative Rate  The importance of the environmental variables retained in the four best-performing ensemble models was calculated using the biomod2 package for R (Thuiller, 2003) Suitability maps for present and future climates were converted in presence-absence maps using the maximum training sensitivity plus specificity threshold (Liu et al., 2005). Suitable conditions under future climatic conditions were predicted in grid cells where at least 75% of the GCMs coincide. For each of the six time horizon-by-RCP combinations, we defined novel climates (i.e., climatic conditions outside the range between the 5% and 95% percentile of the conditions used to calibrate the distribution models) using the ensemble.
novel function in BiodiversityR. For simplicity, the figures in the manuscript focus on the projections for 2050s and RCP4.5, but we present the main findings for all the time horizon-by-RCP combinations in the text and include figures for all combinations in the appendix.

| Ecogeographical zones and niche overlap analysis
To assess changes in growth conditions within the suitable distribution range of cultivated and wild cacao under climate change, we subdivided the current cacao distribution in homogeneous ecogeographical zones (Parra-Quijano et al., 2012) and projected these zones to future climates, following Fremout et al., (2021).
We determined these zones using the CLARA (Clustering Large Applications) method implemented in the cluster package for R (Maechler et al., 2019). As input data, we used the first 10 principal components from a principal component analysis using values for all the 34 environmental variables mentioned above within the total current distribution range of cacao (both cultivated and wild), which together explained more than 95% of variance. The potential number of clusters (i.e., ecogeographical zones) was varied between five and 25, and the optimal number was determined using the cluster silhouette criterion through the optCluster package for R (Sekula et al., 2020), resulting in nine zones. Variables were normalized prior to clustering. For future projections, each grid cell was assigned to the closest cluster in ordination space based on the values of the principal components resulting from the environmental variables under future climate scenarios, using the cl_predict function from the clue package for R (Hornik & Böhm, 2020).
To investigate whether cultivated and wild cacao occupy different environmental niches, we performed a niche overlap analysis and niche hypervolume analysis. We calculated niche overlap, niche equivalency and niche similarity following the methodology described by Broennimann et al., (2012) and implemented in the ecospat package for R (Broennimann et al., 2012). We calculated the niche hypervolume occupied by cultivated and wild cacao and their intersection using the hypervolume package for R (Blonder et al., 2014). Further methodological details are presented in Text S4 in Appendix S1.

| Outlier analysis for climate changetolerant genotypes
We carried out an outlier analysis of the environmental conditions within the modelled distribution ranges to identify the areas where climate change-tolerant genotypes are potentially present. For this analysis, we selected twelve variables predicted to be important for climate change adaptation of cacao cultivation based on variable importance, suitability response curves and the expected direction of climate change (Table 2, Fig. S6, Fig. S7 and Fig. S8 in Appendix S1).
These variables include flooding risk (tolerance to flooding), altitude (tolerance to occasional chills) and ten climatic variables: maximum temperature of warmest month, mean temperature of driest quarter, annual precipitation, precipitation of wettest quarter, precipitation F I G U R E 2 Future distribution of suitable habitat according to all the ensemble models constructed with different absence point selection approaches and resolutions of spatial filtering for a) cultivated and b) wild cacao in Peru. Future projections show all areas identified as suitable by at least 75% of the subset of five well-performing and dissimilar General Circulation Models (GCMs) for the Representative Concentration pathway (RCP) 4.5 and the 2050s time period. Ensemble models with sufficient predictive accuracy (cvAUC >0.75) used for estimate future uncertainty are in black boxes while the best models for each approach are in red boxes: the base model at 5 arcmin filtering and the target group model at 30 arcsec filtering for cultivated cacao, and the base model at 30 arcsec filtering and the target group model at 30 arcsec filtering for wild cacao. The first four maps present the areas expected to become suitable (light green), the area expected to remain suitable (dark green) and the area expected to become unsuitable (red) as predicted by each of the four models. As a summary, the consensus maps on the right present the number of ensemble models predicting suitability (from 1 model in yellow up to 4 models in dark purple), including only the ensemble models with sufficient predictive accuracy of driest quarter, annual potential evapotranspiration, aridity (tolerance to heat and drought), mean diurnal range, temperature seasonality and precipitation seasonality (tolerance to altered diurnal and seasonal variation). The flooding risk map was retrieved from the National Service of Meteorology and Hydrology of Peru (SENAMHI, Spanish acronym) (Aybar et al., 2019). Altitude was included among the selected in the view of potential expansion of cacao at higher elevations where occasional chills can occur. Apart from flooding risk and altitude, among the remaining ten climatic variables, annual precipitation, precipitation of wettest quarter and aridity were colinear with other variables and not retained by the models and therefore do not have a response curve. Nevertheless, they were used in the outlier analysis because annual precipitation is one of the most important factors influencing yield (Zuidema et al., 2005), precipitation of the wettest quarter is an important predictor of flooding risk, to which cacao can be sensitive (De Almeida & Valle, 2007;Wood & Lass, 2008), and aridity is a fundamental indicator for climate change assessments (Berg et al., 2016;Fu & Feng, 2014).
For each variable, we calculated the outlier threshold as the 1% percentile for the lower outliers and the 99% percentile for the upper outliers. The outliers were calculated based on all grid cell values within the total cacao distribution range (both cultivated and wild).
To maximize the chance of including areas under the most extreme environmental conditions at the margins of habitat distribution, we defined the distribution range for wild cacao using the minimum training presence (i.e. the lowest predicted suitability value at the training presence points) as threshold for suitability. Coastal regions where wild cacao does not occur were excluded from the obtained distribution for wild cacao by excluding the respective terrestrial ecoregions. The areas where climate change-tolerant genotypes are potentially present were identified as those with environmental variable scores above or below the respective outlier thresholds. Finally, we maximized the probability of actual presence by overlaying these areas with 5 km buffers around cacao presence points.

| RE SULTS
The calibrated suitability models resulted in different suitability distributions, depending on the pseudo-absence point selection approach and resolution of spatial filtering (Figure 1). Table S4 in Appendix S1 shows the number of presence and pseudo-absence points, modelling algorithms and environmental variables used in the different ensemble models. In general, models generated with the target group approach and at lower resolutions of spatial filtering resulted in broader and less overfit distributions compared with models constructed using the base approach and higher resolutions of spatial filtering (Figure 1). According to the evaluation metrics ( Fig. S1 in Appendix S1), the best-performing models for each of the pseudo-absence points selection approaches were the base model at 5 arcmin filtering and the target group model at 30 arcsec filtering for cultivated cacao, and the base model at 30 arcsec filtering and the target group model at 30 arcsec filtering for wild cacao. For cultivated cacao, we were interested in the model that provides the closest fit to the current range of cultivated cacao (i.e. base model at 5 arcmin filtering) to obtain more reliable indications of climate change impacts in currently cultivated areas. By contrast, for wild cacao the target group model at 30 arcsec provides a more realistic approximation of the present distribution of wild cacao compared with the base approach models which tended to overfit the spatially biased presence data. The selected models for cultivated and wild cacao presented distinct suitability distributions, with cultivated cacao extending along the Andes foothills and wild cacao covering most of the Peruvian Amazon (Figure 1). In both models, the coastal regions were not predicted as suitable.
To estimate the uncertainty of future predictions related to model selection, we generated projections for all the ensemble models and built consensus maps including only the models with sufficient predictive accuracy (cvAUC >0.75) ( Figure 2). As the predictions of the four best-performing ensemble models using the full set of GCMs (Figure 3 and S2) were similar to the predictions obtained with the subset of five well-performing and dissimilar GCMs  (Figure 3 and S2 in Appendix S1). A comparison of current and future suitability models showed that the 99% quantile for altitude shifts from 1,700 masl under present climate conditions up to 2,100 masl for the expected future distribution range ( Fig. S4 in Appendix S1). However, according to at least 75% of the GCMs, the predicted expansion areas are smaller than the contraction areas, resulting in a net decrease in suitable area (Table 1 and Table S5). Similar trends are expected under the worst-case (RCP8.5) and more optimistic emission scenario (RCP4.5), and different time horizons (Fig S5). For wild cacao, most of the current distribution range is foreseen to remain suitable by at least 75% of GCMs and is predicted to further expand towards the Amazon region and the Andes (Figure 3 and Figure S2). Higher gains in suitable area of wild cacao are predicted at more distant time horizons and under the worst-case emission scenario (RCP8.5) ( Table 1 and Table S5).
The single best suitability models for cultivated (base model at 5 arcmin filtering) and wild cacao (target group model at 30 arcsec filtering) retained similar environmental variables, although these variables showed different importance scores (Fig. S6 in Appendix S1) and different trends in the response curves (Table 2, Fig. S7 in Appendix S1). The niche overlap and niche hypervolume analyses showed that cultivated and wild cacao occupy different environmental niches, with cultivated cacao occurring in areas with higher mean diurnal temperature range, lower mean temperature of wettest quarter and higher precipitation of the warmest quarter compared with wild cacao (Text S4).
In the area where cacao is currently cultivated, temperatures and potential evapotranspiration are predicted to increase across the whole distribution range, while precipitation changes show varying patterns depending on the season and region (Fig. S8 in Appendix S1). For cultivated cacao, expansion of suitable range seems to occur in areas that will have higher precipitation of wettest month/quarter and coldest quarter, and similar precipitation of driest month/quarter and warmest quarter compared with current precipitation regimes in currently suitable areas (Fig. S8 in Appendix S1).
Similar changes in climate are expected within the distribution range of wild cacao (Fig. S9 in Appendix S1).
Ecogeographical zones identified within the cacao distribution range are expected to change 8%-16% in area (Fig. S10, Table S6 in Appendix S1), meaning that local climatic conditions are expected to F I G U R E 3 Future distribution of suitable habitat for a) cultivated and b) wild cacao in Peru showing all areas identified as suitable by at least 50%, 75% and 90% of 31 different future General Circulation Models (GCMs). The maps present the areas expected to become suitable in light green, the area expected to remain suitable in dark green and the area expected to become unsuitable in red. Here, we present the maps for the selected ensemble models (i.e. the base model at 5 arcmin filtering for cultivated cacao and the target group model at 30 arcsec filtering for wild cacao), and for the Representative Concentration Pathway (RCP) 4.5 and the 2050s time horizon, but similar results were found for the projections with combinations of other models, emission scenarios and time horizons (Figs. S2 and S5, Table 1

| D ISCUSS I ON
In this study, we modelled the current and future distribution of cultivated and wild cacao in Peru by comparing multiple ensemble models constructed using different pseudo-absence selection approaches and resolutions of spatial filtering. Both the procedures had pronounced impacts on model predictions (Figure 1). In general, the target group approach and spatial filtering at lower resolutions reduced overfitting by reducing spatial sample bias (Phillips et al., 2009). However, spatial filtering strongly impacted future predictions for cultivated cacao, ranging from net negative impacts at higher spatial resolution of filtering to net positive impacts at lower resolution. And this in spite of the fact that models calibrated with data filtered at different spatial resolutions had a similar accuracy in terms of cvAUC and produced similar distribution ranges under current climate conditions. Two lessons can be learned from this. First, that model selection must be tailored to the characteristics of available data, the modelled species and the intended use of the suitability models (Derville et al., 2018;Segurado & Araujo, 2004). In our case, for cultivated cacao we had very detailed agricultural census data and were interested in the current (instead of potential) range of cultivated cacao to obtain more reliable indications of impacts of climate change in currently cultivated areas. By contrast, for wild cacao our data showed strong spatial sampling bias and we were interested in a model that best predicts the current realized distribution also in poorly sampled areas. Hence, the random pseudo-absence point selection worked better for cultivated cacao, while the target group selection worked better for wild cacao. A second lesson is that selecting the best models for future climate projections based on performance under current climate conditions alone may be misleading as alternative models with similar accuracy can result into variable future projections (Araújo & New, 2007;Thuiller et al., 2019). Therefore, testing projections for multiple models is required to estimate the uncertainty of future predictions (Araújo & New, 2007;Thuiller et al., 2019).
Our results show that cultivated and wild cacao present distinct suitability distributions (Figure 1), in accordance with the fact they occupy different environmental niches (Text S4 in Appendix S1). This is in line with other suitability modelling studies comparing the distributions of cultivated and wild populations of the same species (d'Eeckenbrugge & Lacape, 2014;Eshetae et al., 2019;A. J. Miller & Knouft, 2006). The best-performing model obtained for cultivated cacao presents a realistic representation of where cacao is currently grown in Peru (Figure 1a). For cultivated cacao, many parts of Peruvian Amazon are probably currently suitable, as evidenced by cacao cultivation by indigenous communities and smallscale plantations (Levis et al., 2017;R. P. Miller & Nair, 2006), but cacao is not grown there at commercial scale. As expected, none of the models of cultivated cacao predicted suitability in the coastal regions where cacao is unable to grow without irrigation. The bestperforming model for wild cacao represents a realistic distribution of the current suitable habitat of wild cacao (Figure 1b). However, even though the target group suitability models for wild cacao minimized overfitting, the best-performing model still omitted some areas that may be suitable in the northern Amazon region and assigned higher suitability values to the more accessible areas along rivers. This is most likely due to the strong spatial sample bias, with a near complete lack of collection missions in inaccessible areas located at longer distances from rivers. TA B L E 1 Changes in suitable areas for cultivated and wild cacao in Peru for two Representative Concentration Pathways (RCPs) and three time horizons. Total suitable areas and area changes were calculated by predicting suitable conditions in grid cells where at least 75% of the 31/33 General Circulation Models (GCMs) predict suitable conditions. Here, we present the results for the selected ensemble models (i.e. the base model at 5 arcmin filtering for cultivated cacao and the target group model at 30 arcsec filtering for wild cacao), while the results for the target group model at 30 arcsec filtering for cultivated cacao and the base model at 30 arcsec filtering for wild cacao are presented in Table S5 in Appendix S1

Net area change Contraction area Expansion area
Net area change Future projections yielded different results between cultivated and wild cacao. For cultivated cacao, projections foresee a contraction of the suitable range and shift at higher elevation which is in line with other studies in West Africa Schroth et al., 2016) and Central America (Eitzinger et al., 2015;de Sousa et al., 2019), although there is some uncertainty among projections from different ensemble models (Figure 3). Because cacao cultivation in Peru has been expanding at an annual rate of 9.1% in recent years (MINAGRI, 2018), new expansion efforts should be directed towards areas that will remain or will become suitable into the future rather than the predicted contraction areas in the northern and central Amazon (Figure 3). However, some of the new suitable areas in southern Amazon are currently covered by forests and it is paramount that potential expansion of cacao cultivation is not implemented at the expense of deforestation or encroachment into protected areas. By contrast, our projections consistently foresee a positive future for wild cacao in Peru as the current suitable area is expected to largely remain suitable and to TA B L E 2 Overview of general trends in response curves of cultivated and wild cacao suitability (Y-axis) to different climate variables (X-axis) and trends of climate variables under climate change. Red trend lines indicate whether the variable has positive, negative, flat, flattening or optimum curve (exact response curves are given in Fig S7). Black arrows indicate whether the variable is expected to increase, decrease, or increase and decrease depending on the areas in the future in the distribution range of cacao in Peru (shown in Fig S8). The climatic variables included in the outlier analysis to identify climate-tolerant genotypes are indicated with an "a". These response curves and future trends of climatic variables were used to define the ten climatic variables used in the outlier analysis, in addition to altitude and flooding risks. The selected climatic variables are: maximum temperature of warmest month, mean temperature of driest quarter, annual precipitation, precipitation of wettest quarter, precipitation of driest quarter, annual potential evapotranspiration, aridity, mean diurnal range, temperature seasonality, precipitation seasonality. Among these ten variables, annual precipitation, precipitation of wettest quarter and aridity were colinear with other variables and not retained by the models, and therefore do not have a response curve. Nevertheless, they were used in the outlier analysis because annual precipitation is one of the most important factor defining yield (Zuidema et al., 2005), precipitation of the wettest quarter is the main predictor of flooding risk which cacao is sensitive to (Almeida & Valle, 2007;Wood & Lass, 2008), and aridity is a fundamental indicator for climate change assessments (Berg et al., 2016;Fu & Feng, 2014) (Figure 1, Text S4 in Appendix S1), and by the different variable importance scores and response curves ( Table 2, Fig. S6 and J. Miller & Knouft, 2006 where multiple extremes overlap, on the premise that these may be tolerant to multiple stress factors simultaneously. For example, several areas in Cajamarca and Cuzco combine potential drought tolerance with the tolerance to growth conditions at high elevations.

| CON CLUS I ON S AND FUTURE PROS PEC TS
In this study, we modelled the distribution of suitable habitat of cultivated and wild cacao in Peru by comparing multiple ensemble models constructed using different pseudo-absence selection approaches and resolutions of spatial filtering. We found that model selection significantly impacted both present and future modelling results. This stresses the importance of tailoring the modelling approach to the objectives and characteristics of the target species and projecting multiple models to future climatic conditions, to better assess uncertainty of future projections, which should be considered in decision-making related to climate change policies (Araújo & New, 2007;Thuiller et al., 2019).
Overall, our models foresee a contraction of suitable area for cultivated cacao while predicting a more positive future for wild cacao in Peru. Because suitability is not necessarily correlated with yield quantity or quality (Ramirez- Villegas et al., 2013), future studies should asses the expected impacts of climate change on cacao yields, also including indirect effects such as altered incidence of pests and diseases (Gateau-Rey et al., 2018).
Tolerant genotypes will be required to facilitate the adaptation of cacao cultivation under climate change in Peru. To this aim, populations of native cacao in the identified areas will be target of planned collection missions. The resistance of these genotypes to extremes in growth conditions will be verified in greenhouse and climate chamber experiments. The most promising genotypes will then be established in clonal gardens for the supply of the propagation material. In the longer term, these genotypes should be included in breeding programmes aimed not only at enhancing climate change tolerance but also yield quantity and quality, pest resistance and low heavy metal accumulation.
Finally, the climate change-tolerant genotypes identified in this study could also benefit other cacao producing countries where climate change is expected to cause a contraction of the suitable area for cacao cultivation (Eitzinger et al., 2015;Läderach et al., 2013;Schroth et al., 2016;de Sousa et al., 2019). Especially genotypes tolerant to high temperatures and low precipitation/ high evapotranspiration have the potential to increase the climate change resilience of cacao production systems in these countries, as these are the factors expected to drive contraction of suitable area for cacao cultivation in future Schroth et al., 2016).

ACK N OWLED G EM ENTS
This work received financial support from the German Federal

PEER R E V I E W
The peer review history for this article is available at https://publo ns.com/publo n/10.1111/ddi.13294.

DATA AVA I L A B I L I T Y S TAT E M E N T
The dataset with the presence points used for modelling cultivated and wild cacao is available in Appendix S2.