Estimating consensus and associated uncertainty between inherently different species distribution models




  1. Forecasting shifts in biome and species distribution is crucially needed in the current context of global change. So far, most projections of vegetation distribution rely on correlative species distribution models (SDMs). Yet, process-based or hybrid models based on explicit physiological description may be more robust to extrapolation under future climatic conditions. Differences between model projections may be wide, leading to scepticism among environmental stakeholders.
  2. Here, we propose to combine outputs of several distribution models based on physiological responses, to produce both consensual maps of occurrences and maps of associated uncertainty. The consensus map relies on the conditional projections of each SDM. Because the models used are based on processes, their errors are likely to vary consistently with climate as some processes not implemented in a model might be important under a given set of climatic conditions. Uncertainty of the consensus model is thus assessed through multimodel regression of deviance maps with respect to current climatic conditions, and can be extrapolated to forecast climates.
  3. We illustrate this approach using three SDMs, on three widely distributed European trees (Fagus sylvatica L., Quercus robur L. and Pinus sylvestris L.), and project their distributions under two scenarios. The conditional consensus outperforms classical methods of model consensus (i.e. to use the mean, the median or a weighted average of individual SDM outputs) in projecting current occurrences.
  4. Consistent with the results of individual SDMs, the conditional consensus projects that the suitable areas for F. sylvatica and Q. robur will expand towards north-eastern Europe, while that of P. sylvestris will contract. Projections of future occurrence are most uncertain towards the margins of the distribution (particularly the trailing edge).
  5. Our approach can help modellers identify the limitations of each SDM and help stakeholders pinpoint the regions of model agreement and highest certainty.


The latest Intergovernmental Panel on Climate Change (IPCC) scenarios are already exceeded by recent estimations of greenhouse gas emissions (Raupach et al. 2007), leaving open questions about the development of global climate modifications and their impact on natural ecosystems. Recent climatic and atmospheric composition changes have modified the distribution, structure and function of ecosystems (Walther, Berger & Sykes 2005), thus altering biodiversity and ecosystem services, and leading to socioeconomic and financial costs. Adaptive management strategies directly based on spatialized, comprehensive and robust projections of species distribution and extinction risks could help mitigate these effects (TEEB 2010).

To date, species distributions are mostly investigated using three types of species distribution models (SDMs): correlative, process-based and hybrid models (Peterson et al. 2011). Correlative SDMs infer correlations between current species occurrences and various environmental descriptors. Process-based SDMs describe the responses of selected traits or processes (such as phenology, resistance to stress, resource acquisition) to environmental descriptors, based on empirical observations, and estimate proxies of occurrence, such as growth or fitness (Kearney & Porter 2009). Hybrid SDMs combine correlative models, which describe habitat suitability, with process-based models, which narrow down to the realized niche by describing e.g. population dynamics, dispersal and/or energy uptake. Correlative models allow the exploration of a species' limiting environmental variables across its realized niche, while process-based models infer its fundamental niche. Because correlative models rely on widely available occurrence and climatic data, they are widely used in the literature. However, their extrapolation to novel climates is uncertain. It has been suggested to use smooth response curves and to refrain from making projections to climates that differ too much from currently observed climates (Elith, Kearney & Phillips 2010). In contrast, process-based SDMs are thought to be more robust to extrapolation to novel climates (Morin & Thuiller 2009; Dormann et al. 2012), because their parameterization relies solely on empirically determined response curves driving important processes regulating the species' probability of surviving and reproducing, with respect to environmental conditions.

Forecasts of future distribution vary according to the correlative model used (e.g. Pearson et al. 2006), and between correlative and process-based SDMs (Buckley 2008; Kramer et al. 2010; Cheaib et al. 2012; but see Kearney, Wintle & Porter 2010). This may puzzle stakeholders and policy makers, and jeopardize the credibility of species distribution projections. Ensemble or consensus approaches, using information provided by different SDMs, have been advocated to tackle this problem (Araújo & New 2007): models can vote for the species' presence or absence. Votes can be weighted by models' accuracies (e.g. Marmion et al. 2009), or models can be combined using multimodel inference (Burnham & Anderson 2002; see e.g. Gibson et al. 2004; Hartley, Harris & Lester 2006).

Providing consensus maps is, however, not sufficient to guide stakeholders. All models may agree with each other for wrong reasons (Elith, Kearney & Phillips 2010), potentially leaving systematic errors. Mapping the resulting uncertainty is therefore as important as mapping the consensual projection itself. Yet, few studies have provided uncertainty maps of SDM projections. Maps of model discrepancies (e.g. Hartley, Harris & Lester 2006) only inform on the uncertainty associated with different model projections, not the uncertainty associated with the relevance of the climatic descriptors or the processes considered. Should an important environmental descriptor have been omitted in the individual SDMs, its variation would be absent from any multimodel, and even the best model among those considered would be unable to accurately project the species' range (Elith, Kearney & Phillips 2010; Dormann et al. 2012). The performance of conceptually different SDMs may vary with environmental conditions: each SDM may surpass the others in projecting a species' presence under a given set of climatic conditions, for the environmental variables or the processes it considers are more relevant in these conditions.

Here, we build a simple consensus between SDMs relying on vegetation's physiological responses to climate. Its uncertainty due to the poor parameterization or omission of important processes is equated with its statistical deviance from observed occurrence maps. To account for the environmental clustering of SDM errors, uncertainty is modelled as a function of composite, independent environmental descriptors, in a multimodel framework. Both the probabilities of occurrence and the associated uncertainty can then be projected onto forecasted climatic conditions. We illustrate this approach by modelling the potential distribution of three common European tree species (Fagus sylvatica L., Quercus robur L. and Pinus sylvestris L.), combining the outputs of three conceptually different SDMs (one correlative with physiological basis, one hybrid and one process-based).

Materials and methods

Our approach is summarized in Fig. 1.

Figure 1.

Schematic workflow of the design of the consensus model and its forecasts. The maps illustrating the chart correspond to Quercus robur.

Step 1: Species distribution models

Details on all three models and their parameterization are provided as Supplementary Information.

STASH (correlative model)

STASH is a correlative, physiologically based climate envelope model (Sykes, Prentice & Cramer 1996). It relies on bioclimatic limits restricting the species' envelope, and on variables acting as multipliers of the species' growth efficiency index. All bioclimatic limits and variables are assumed to have strong links with vegetation responses through important physiological mechanisms. Because bioclimatic limits are defined according to the observed species distribution, this model is likely to overfit. To avoid this, we ran this model 100 times, with bioclimatic limits defined on random re-samplings of 30% of the Atlas Florae Europaeae distribution map (AFE; Tutin et al. 1964). For each pixel, the final STASH output corresponded to the average of the outputs obtained for that pixel when it belonged to the remaining 70% validation set (Supplementary Information).
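The resampling scheme described above can be illustrated with a minimal Python sketch (the authors worked in R; this is not their script). The callback `fit_and_predict` is a hypothetical stand-in for calibrating STASH on the training pixels and returning one output per pixel; the function name, the seed and the pixel indexing are assumptions made for illustration only.

```python
import random

def out_of_sample_average(n_pixels, fit_and_predict, n_runs=100,
                          train_frac=0.3, seed=0):
    """Average each pixel's output over the runs where it was NOT used for calibration."""
    rng = random.Random(seed)
    sums = [0.0] * n_pixels
    counts = [0] * n_pixels
    n_train = int(round(train_frac * n_pixels))
    for _ in range(n_runs):
        # Draw the calibration pixels (30% by default) without replacement.
        train = set(rng.sample(range(n_pixels), n_train))
        outputs = fit_and_predict(sorted(train))  # hypothetical model call
        for px in range(n_pixels):
            if px not in train:  # pixel belongs to the validation set
                sums[px] += outputs[px]
                counts[px] += 1
    # A pixel never drawn into a validation set has no out-of-sample value.
    return [s / c if c else None for s, c in zip(sums, counts)]
```

With 100 runs and a 70% validation fraction, each pixel is expected to contribute about 70 out-of-sample outputs to its average.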

LPJ (hybrid model)

The Lund–Potsdam–Jena (LPJ) model is a general ecosystem model combining bioclimatic limits to the species' establishment and survival with mechanistic representations of physiology, biochemistry, vegetation dynamics and carbon and water fluxes (Sitch et al. 2003). A minimum set of bioclimatic limits defines the bioclimatic envelope of the species. From climatic, soil and CO2 data, the model simulates different growth-related variables such as leaf area index (LAI) or net primary production (NPP). Here, we used the LPJ version described in Gritti, Smith & Sykes (2006), but did not take competition into account. LPJ was run at the species level, using specific parameters when available (Supplementary Information), or the generic parameters of the corresponding plant functional type described by Smith, Prentice & Sykes (2001). Because bioclimatic limits do not directly derive from the observed distribution, no cross-validation by resampling was performed.

PHENOFIT (process-based model)

PHENOFIT (Chuine & Beaubien 2001) is a process-based SDM relying on the assumption that a temperate tree species' survival and reproductive success are related to its capacity to synchronize its annual life cycle with seasonal climatic variations, as well as to sustain temperature and water stresses. From daily temperature and precipitation records, PHENOFIT estimates survival and reproductive success for an average tree. This model intrinsically takes phenotypic plasticity into account through the reaction norms of phenology and resistance to stress in relation to climate. Here, PHENOFIT was parameterized for up to four populations per species, thus somewhat accounting for local adaptation. Observed species distribution is not used as input for the model, nor is it used to estimate parameters (parameters are derived from empirical observations of trees' physiological responses to climate). This model can therefore not be cross-validated by resampling approaches. As its output is related to the species' fundamental niche, the model can be validated a posteriori, by comparing its output to the observed distribution of the species.

Step 2: Climate data

SDMs simulations

Climatic and atmospheric CO2 concentration time series were extracted from the Advanced Terrestrial Ecosystem Analysis and Modelling (ATEAM) data set for the period 1901–2100. Forecasts of climatic data were computed for the period 2081–2100 using the HadCM3 atmosphere–ocean general circulation model (Mitchell et al. 2004) following two scenarios: A1Fi (‘business as usual’) and B2 (local development, with environmental focus). This climatic data set covers the European window from 11°W, 34°N to 32°E, 72°N with a 10′ × 10′ pixel resolution for monthly values of temperature, precipitation and percentage of sunshine. Twenty-year averages of monthly means served as input data for STASH. Monthly values were directly used as input data for LPJ. Daily interpolation was performed using the weather generator CLIGEN (Nicks, Lane & Gander 1995) to drive PHENOFIT.

Description of the climatic space used to explain the deviance of the consensus model

We eliminated multicollinearity between environmental descriptors by summarizing the variation of eight potentially correlated climatic variables (four related to temperature, four to the amount and seasonality of precipitation) in a Principal Component Analysis carried out on the concatenated climatic data sets (historical, 1981–2000 and scenarios, 2081–2100; Supplementary Material). The first three principal axes (PC) summarized 92·8% of the total variance of climatic descriptors, with PC1 mostly explained by temperature, PC2 by the amount of precipitation and PC3 by its seasonality. The coordinates of each pixel along these three axes were used as synthetic climatic descriptors.

Step 3: Conditional consensus model and associated uncertainty

All three models produce different synthetic estimates (growth efficiency index, LAI, fitness), none of them actually being a probability of occurrence, and none being directly comparable to each other. We thus decided to transform their outputs into comparable binary presence/absence data, using model- and species-specific thresholds (hereafter, SPT). The SPTs were defined so as to maximize the sum of sensitivity and specificity (see e.g. Nenzén & Araújo 2011), using the AFE occurrence data as a reference.
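As an illustration of the threshold selection described above, the following Python sketch scans every distinct model output as a candidate threshold and keeps the one maximizing sensitivity + specificity. It is a sketch under stated assumptions, not the authors' R code: the function name is hypothetical, a pixel is counted as a projected presence when its output is greater than or equal to the threshold, and the reference data are assumed to contain at least one presence and one absence.

```python
def species_presence_threshold(outputs, observed):
    """Return (threshold, score) maximizing sensitivity + specificity.

    outputs:  continuous model outputs, one per pixel
    observed: 0/1 reference occurrences (e.g. from an atlas), one per pixel
    """
    n_pres = sum(observed)
    n_abs = len(observed) - n_pres
    best_thr, best_score = None, -1.0
    for thr in sorted(set(outputs)):
        # True positives: observed presences projected present (assumed rule: >=).
        tp = sum(1 for o, y in zip(outputs, observed) if o >= thr and y == 1)
        # True negatives: observed absences projected absent.
        tn = sum(1 for o, y in zip(outputs, observed) if o < thr and y == 0)
        score = tp / n_pres + tn / n_abs  # sensitivity + specificity
        if score > best_score:
            best_thr, best_score = thr, score
    return best_thr, best_score
```

Because the threshold is computed separately for each model and species, outputs on different scales (growth efficiency index, LAI, fitness) become comparable once binarized.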

For each species and each climatic data set, each pixel in the simulation window was attributed to one of 2³ = 8 subsets, indexed by S, corresponding to the triplets of the combinations of {STASH, LPJ, PHENOFIT} projected presence (above the SPT) or absence (below the SPT). For example, {1,0,0} would be one such triplet, corresponding to STASH projecting occurrence, and LPJ and PHENOFIT projecting absence of the species in the pixel. Within each subset, all pixels shared the same probability of occurrence pS (equal to its maximum likelihood estimator nobs/ntot), leading to eight levels of probabilities of occurrence.
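The subset-wise estimate of the probability of occurrence can be sketched in a few lines of Python. The function name and the toy data are illustrative assumptions; only the estimator pS = nobs/ntot comes from the text.

```python
from collections import defaultdict

def conditional_consensus(presence_triplets, observed):
    """Estimate p_S = n_obs / n_tot for each combination of model votes.

    presence_triplets: (stash, lpj, phenofit) 0/1 tuples, one per pixel
    observed:          0/1 observed occurrences, one per pixel
    """
    n_tot = defaultdict(int)
    n_obs = defaultdict(int)
    for triplet, y in zip(presence_triplets, observed):
        n_tot[triplet] += 1
        n_obs[triplet] += y
    return {s: n_obs[s] / n_tot[s] for s in n_tot}

# Toy example with six pixels (made-up values).
triplets = [(1, 0, 0), (1, 0, 0), (1, 1, 1), (1, 1, 1), (1, 1, 1), (0, 0, 0)]
observed = [0, 1, 1, 1, 1, 0]
p_S = conditional_consensus(triplets, observed)
# Every pixel inherits the probability of its subset.
pixel_probabilities = [p_S[t] for t in triplets]
```

A model that carries no information simply fails to split the pixels into subsets with different occurrence frequencies, which is one way to read the claim that poorly predictive models do not degrade the consensus.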

Step 4: Estimating the associated uncertainty

For each pixel, the deviance to the observed occurrence was computed as (McCullagh & Nelder 1989; p 118):

$$D = -2\left[\mathrm{obs}\,\ln p_S + (1-\mathrm{obs})\,\ln(1-p_S)\right] \qquad \text{(eqn 1)}$$

where obs is the observed occurrence (0 or 1) and pS the estimated probability of occurrence for the relevant subset. Note that this is the exact deviance of projections (pS) with respect to the observed occurrences (i.e. to a perfect model). Within each subset, deviance can take at most two values, so that the ratio of the difference between observed and minimum deviance to its maximum span (hereafter the ‘standardized deviance’, noted δ) is 0 or 1. The standardized deviance was modelled as a Bernoulli process. To account for its possible dependency upon environmental variables, we modelled the logit of its mean as a polynomial function of the three synthetic climatic variables with polynomial degree ≤ 2:

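$$\mathrm{logit}\left(E[\delta]\right) = \alpha_{0,S} + \sum_{i=1}^{3}\left(\alpha_{i1,S}\,PC_i + \alpha_{i2,S}\,PC_i^{2}\right) \qquad \text{(eqn 2)}$$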
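The per-pixel deviance and its standardized form, as defined in this step, can be sketched in Python as follows. The sketch assumes that the deviance takes the usual Bernoulli form, i.e. minus twice the log-likelihood of the observation under pS; the function names are hypothetical, and the degenerate cases pS = 0, pS = 1 and pS = 0·5 (where the two possible deviances are infinite or equal) are not handled.

```python
import math

def deviance(obs, p):
    """Bernoulli deviance of projected probability p for observation obs (0 or 1)."""
    return -2.0 * math.log(p if obs == 1 else 1.0 - p)

def standardized_deviance(obs, p):
    """(D - D_min) / (D_max - D_min) within a subset sharing probability p."""
    d_present = deviance(1, p)  # deviance if the species is observed
    d_absent = deviance(0, p)   # deviance if the species is not observed
    d_min = min(d_present, d_absent)
    d_max = max(d_present, d_absent)
    return (deviance(obs, p) - d_min) / (d_max - d_min)
```

Under these assumptions the standardized deviance equals 1 exactly when the observation is the less likely outcome under pS, so modelling its mean amounts to modelling where the consensus tends to be wrong.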

Twenty-seven models for δ were considered for each species and each subset, corresponding to the combinations of models including or excluding coefficients in eqn 2, with the additional constraints that (i) all models included the intercept α0,S; and (ii) second-degree terms αi2,S were constrained to co-occur with the corresponding first-degree term αi1,S. Each of the three climatic axes can thus be absent, enter linearly, or enter with both linear and quadratic terms, giving 3³ = 27 candidate models. For each species and subset S, each model j was weighted by its Akaike weight wS,j (Burnham & Anderson 2002) so that the projection of the modelled deviance, $\hat{\delta}_S$, for current data is:

$$\hat{\delta}_S = \sum_{j=1}^{27} w_{S,j}\,\hat{\delta}_{S,j} \qquad \text{(eqn 3)}$$
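The candidate model set and the weighting step can be illustrated with a short Python sketch. It assumes the standard Akaike weight formula of Burnham & Anderson (2002) computed from AIC values (whether the authors used AIC or a small-sample correction is not stated here), and it averages predictions on the probability scale, which is also an assumption. All function names are hypothetical.

```python
import math
from itertools import product

def candidate_models(n_axes=3):
    """One code per axis: 0 = absent, 1 = linear, 2 = linear + quadratic."""
    return list(product((0, 1, 2), repeat=n_axes))

def akaike_weights(aic_values):
    """w_j = exp(-d_j / 2) / sum_k exp(-d_k / 2), with d_j = AIC_j - min(AIC)."""
    best = min(aic_values)
    rel = [math.exp(-(a - best) / 2.0) for a in aic_values]
    total = sum(rel)
    return [r / total for r in rel]

def model_average(predictions, weights):
    """Weighted sum of the per-model predicted deviances for one pixel."""
    return sum(w * p for w, p in zip(weights, predictions))
```

For example, `len(candidate_models())` gives the 27 structures allowed by the two constraints, and two models with equal AIC each receive a weight of one half.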

Species distribution projections were obtained by pooling together model-averaged projections of deviance (eqn 3) for each of the eight subsets S, so that finally each pixel was assigned a probability of presence and a modelled deviance.

Step 5: Forecasts

Each SDM was used to produce forecasts for the period 2081–2100, for the two scenarios A1Fi and B2. Yearly outputs of LPJ and PHENOFIT were averaged over this period to yield their final output; STASH directly outputs a growth efficiency index over the considered period. Model outputs were transformed into presence/absence using the SPT (step 3).

For each species and each climatic scenario, pixels were assigned to one of eight subsets (S) according to the combinations of {STASH, LPJ, PHENOFIT} projected occurrence (as in step 3), and were attributed the species' probability of occurrence pS while deviance was extrapolated based on eqns 2 and 3 and on forecasted climatic descriptors.
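The forecasting step for a single pixel can be sketched in Python as below. This is a sketch under stated assumptions: the data structures (`p_S`, `fitted_models`, `weights`, keyed by the vote triplet) and all function names are hypothetical, presence is taken as an output greater than or equal to the threshold, and the model-averaged deviance is computed on the probability scale.

```python
import math

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def predicted_deviance(pcs, intercept, linear, quadratic):
    """Degree-2 polynomial in the climatic axes, mapped back from the logit scale."""
    eta = intercept
    for pc, a1, a2 in zip(pcs, linear, quadratic):
        eta += a1 * pc + a2 * pc ** 2
    return inv_logit(eta)

def forecast_pixel(outputs, thresholds, p_S, pcs, fitted_models, weights):
    """Return (probability of occurrence, modelled deviance) for one pixel.

    outputs, thresholds: forecast SDM outputs and historical thresholds, per model
    p_S:           {triplet: probability} estimated on the historical period
    pcs:           forecast climatic coordinates (PC1, PC2, PC3) of the pixel
    fitted_models: {triplet: [(intercept, linear, quadratic), ...]}
    weights:       {triplet: [Akaike weight of each fitted model]}
    """
    triplet = tuple(int(o >= t) for o, t in zip(outputs, thresholds))
    delta_hat = sum(w * predicted_deviance(pcs, *m)
                    for w, m in zip(weights[triplet], fitted_models[triplet]))
    return p_S[triplet], delta_hat
```

Note that a combination of votes that never occurred in the historical period has no estimated pS or deviance model; in this sketch that case raises a `KeyError` rather than being silently filled in.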

Step 6: Model evaluation and comparison

In addition to the ‘conditional’ consensus model presented above, we generated three consensus models for each species, using methods described in Marmion et al. (2009). The first two methods (i.e. Mean and Median) assign to each pixel its computed value (mean or median) from the three SDMs outputs. The third one (WA consensus) computes the average of the three SDMs outputs, weighted by their Area Under the Receiver Operating Characteristic curve (AUC, a measure of discriminative power; Swets 1988).
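For comparison, the three classical consensus rules can be sketched for one pixel in Python. The function name is hypothetical, and the sketch assumes that the three SDM outputs have already been brought to a comparable scale; how (or whether) this was done before averaging is not specified here.

```python
from statistics import mean, median

def classical_consensus(outputs, aucs):
    """Mean, median and AUC-weighted average of the SDM outputs for one pixel.

    outputs: one output per SDM (assumed to be on a comparable scale)
    aucs:    one AUC per SDM, used as a constant weight over the whole map
    """
    weighted = sum(a * o for a, o in zip(aucs, outputs)) / sum(aucs)
    return {"mean": mean(outputs), "median": median(outputs), "wa": weighted}
```

In all three rules each model keeps the same weight everywhere on the map, which is the main contrast with the conditional consensus, where a model's contribution depends on what the other models project.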

We then compared the accuracy of the projections of all three SDMs and all four consensus models over the historical period for each model/species pair, using various criteria: AUC, the proportions of well-predicted pixels (accuracy), of false positives (commission error) and of false negatives (omission error), after applying the SPT. Note that all these measures rely on discretizing the continuous output of each SDM into binary data. Hence, they only estimate the discriminating power of the model's output (Lobo, Jiménez-Valverde & Real 2008).
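The evaluation criteria listed above can be sketched in Python as follows. Function names are hypothetical; commission and omission errors are expressed as percentages of all pixels in the window, as in the caption of Table 1, and the AUC is computed by its pairwise definition, which is quadratic in the number of pixels and therefore only suitable for illustration.

```python
def confusion_rates(outputs, observed, threshold):
    """Accuracy, commission and omission errors (% of all pixels) at a threshold."""
    n = len(observed)
    projected = [int(o >= threshold) for o in outputs]
    correct = sum(1 for p, y in zip(projected, observed) if p == y)
    commission = sum(1 for p, y in zip(projected, observed) if p == 1 and y == 0)
    omission = sum(1 for p, y in zip(projected, observed) if p == 0 and y == 1)
    return {"accuracy": 100.0 * correct / n,
            "commission": 100.0 * commission / n,
            "omission": 100.0 * omission / n}

def auc(outputs, observed):
    """Probability that a random presence outscores a random absence (ties = 1/2)."""
    pres = [o for o, y in zip(outputs, observed) if y == 1]
    absn = [o for o, y in zip(outputs, observed) if y == 0]
    wins = sum(1.0 if p > a else 0.5 if p == a else 0.0
               for p in pres for a in absn)
    return wins / (len(pres) * len(absn))
```

By construction the three percentages sum to 100 for any model, which provides a quick consistency check on the rows of Table 1.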

All computations from step 2 onwards were conducted using R (R Development Core Team 2011); scripts and data are provided as Supplementary Material.


Accuracy measures of the projections of SDMs and the consensus models over the historical period

SDM projections for current distributions are shown on Fig. 2. Their AUC values are relatively high (Table 1), considering that the three studied species are widely distributed and occur in a wide range of environments, spanning most of the study area (Lobo, Jiménez-Valverde & Real 2008; Grenouillet et al. 2011). Overall, STASH and LPJ show higher AUC and accuracy (i.e. proportion of well-projected pixels) than PHENOFIT.

Table 1. Accuracy measures of the projections of the three species' current distribution by STASH, LPJ, PHENOFIT and four consensus models: ours (Conditional) and the Mean, Median and Weighted Average (WA) of Marmion et al. (2009). Species Presence Threshold (SPT): threshold maximizing the sum of sensitivity and specificity, above which model outputs were considered to indicate species occurrence. AUC: Area Under the Receiver Operating Characteristic curve. Accuracy: proportion of correctly projected pixels (true absences + true presences). Commission/omission error: proportion of false positives (respectively of false negatives) over the whole simulation window. For each descriptor of accuracy, the best performing model is highlighted in bold face.

Species           Type       Model        SPT    AUC     Accuracy (%)  Commission error (%)  Omission error (%)
Fagus sylvatica   SDMs       STASH        0·27   0·834   74·9          3·8                   21·3
                  Consensus  Conditional  0·52   0·876   84·0          9·5                   6·5
Quercus robur     SDMs       STASH        0·08   0·853   82·5          5·2                   12·3
                             LPJ          0·67   0·830   78·0          11·5                  10·5
                  Consensus  Conditional  0·66   0·854   82·5          5·2                   12·3
                             Median       0·33   0·828   80·8          4·2                   15·0
Pinus sylvestris  SDMs       STASH        0·30   0·638   75·6          6·9                   17·5
                             LPJ          0·59   0·687   73·6          2·6                   23·9
                  Consensus  Conditional  0·75   0·744   76·4          7·2                   16·4

Figure 2.

Projections of the species' current distribution (1981–2000) by the individual SDMs (STASH: growth efficiency; LPJ: LAI (leaf area index); PHENOFIT: mean fitness).

The conditional consensus model projects high probabilities of occurrence over most of the observed distribution range of all three species (Fig. 3, left column). The conditional consensus shows AUC values comparable to or higher than those of the best SDM and the other consensus models for F. sylvatica and Q. robur, and outperforms them for P. sylvestris, the species worst projected by all SDMs (Table 1).

Figure 3.

Projection of the probability of presence of the three species by the conditional consensus (left column); observed deviance (middle column) and modelled deviance (right column) for the period 1981–2000. The consensus model yields 2³ = 8 levels of probability of presence, corresponding to the triplets of {STASH, LPJ, PHENOFIT} projected presence or absence.

Divergence and uncertainty in projections of current distributions

While the three SDMs capture relatively well the upper and lower boundaries of the distributions at the broad scale, regional discrepancies are noticeable (Fig. 2). When transformed into presence/absence using the SPT, SDM projections disagree for 25%–35% of the pixels. These discrepancies are partly attributable to bioclimatic limits in STASH and LPJ producing overly sharp contrasts, and to the weak representation of water stress in all models. Furthermore, the coarse spatial resolution of occurrence data, and to a lesser extent of climatic data (particularly in contrasted areas such as mountain ranges), leads to mismatches between projected and observed distributions.

In light of these mismatches, identifying which SDM(s) fail to model the occurrence of which species, and under which conditions, becomes a primary goal. The spatial variation in the deviance of the projected probability of occurrence from observed data yields insight into this question (Fig. 3, middle column). Note that deviance does not depend upon whether models agree with each other, but on whether they agree with observed data. The observed deviance is relatively low, indicating high confidence in the projections of the conditional consensus model. However, it tends to be higher towards the margins of the distributions, where SDMs disagree.

The synthetic environmental descriptors appear to be good predictors of the variation of deviance (Fig. 3, right column), even though some regions of high deviance are not captured by the models, such as the Alps for F. sylvatica. In this case, however, all models rightly predict the absence of the species, while the AFE data set, because of its coarse resolution, inaccurately describes the species as present. Deviance to actual occurrence data should therefore be low for F. sylvatica in the Alps.

Forecasts and associated uncertainty

Under both scenarios, the suitable areas for all three species are projected to shift towards the North-East and towards higher elevations, both by the conditional consensus model (Fig. 4) and the SDMs (Supplementary Material). While the size of the suitable area for F. sylvatica and Q. robur will remain approximately stable or increase, that of P. sylvestris is projected to decrease, albeit less intensely under scenario B2 (Fig. 4). Projected probabilities of occurrence are overall lower under the A1Fi scenario than under the B2 scenario. Large areas of uncertainty appear towards the trailing edges of the distribution for all three species, and over large areas in Central and Eastern Europe for F. sylvatica, especially in regions where the projections of the SDMs disagree (Fig. 4; Supplementary Material). These regions show a partial overlap with the least analogous projected climates (not shown).

Figure 4.

Projections of the species' potential distribution considering unlimited migration for the period 2081–2100 under scenarios A1Fi and B2, as modelled by the conditional consensus model, and associated modelled deviance.


We propose a simple framework to estimate consensual projections of species distributions, while jointly assessing their uncertainty as a function of synthetic environmental descriptors. This framework can be applied to any number of SDMs (provided each combination of SDM projected presence or absence gathers a large enough number of points), and is particularly well-suited for process-based SDMs, whose errors are expected to be environmentally clustered, and whose projections do not (or not only) rely on observed distributions. Consensus projections generated by this approach are not affected by the introduction of poorly predictive models. We illustrate its use by associating a spatialized quantification of the uncertainty to the forecasts of the future distribution of suitable habitats of three emblematic European forest trees, under two climatic scenarios and making use of the projections of three very different SDMs.

This study only considers uncertainty due to species distribution models. However, uncertainties also arise because the occurrence data used to parameterize the correlative and hybrid models may be inaccurate or too coarse, because processes are calibrated on too narrow an environmental range, because of the choice of the occurrence threshold, and because of uncertainties in the climatic and land use scenarios (Beale & Lennon 2012). Because SDM type has often been found to be the main source of variation between forecasts, as compared with other sources of uncertainty (Dormann et al. 2008; Buisson et al. 2010; Nenzén & Araújo 2011), we chose to deal first with reconciling model projections and assessing their uncertainty.

Forecasting the distribution of three European tree species

In agreement with earlier studies (Kramer et al. 2010; Cheaib et al. 2012; Meier et al. 2012), suitable habitats for the temperate deciduous species Q. robur and F. sylvatica are projected to shift towards the North-East, and their potential range to increase slightly, while that of P. sylvestris is projected to contract. Compared with earlier studies, our results highlight regions whose future suitability is most questionable, notably towards the trailing edge of their distributions, and in large regions of central and Eastern Europe for F. sylvatica.

High uncertainty does not necessarily reflect disagreement among SDMs. For example, as compared with the AFE occurrence data, all three SDMs wrongly project the current presence of F. sylvatica just south of the Alps and close to the French Atlantic coast (Fig. 2 and Fig. S1). Models of deviance based on three composite environmental descriptors capture this discrepancy (Fig. 3). Thus, in regions presenting future climatic conditions analogous to the conditions currently observed along the French Atlantic coast or south of the Alps (e.g. east of the Baltic Sea under the A1Fi scenario for year 2100), where all three models project a future occurrence of F. sylvatica, the conditional consensus model associates large uncertainties with this species' projected occurrence. This error pattern is commonly observed (Hanspach et al. 2011) and can be attributed to the coarse resolution of both occurrence data (Rocchini et al. 2011) and climatic data, stressing the need for accurate occurrence and climatic data (Austin & Van Niel 2011). Uncertainties uncovered for future distributions thus reflect not only discrepancies between SDMs' projections, but also weaknesses common to all SDMs included in the consensus approach.

Current species distributions might not fill their potential range (Svenning & Skov 2007; Dormann et al. 2012), and because none of the three models (at least as they were used here) accounts for dispersal, land use or interspecific competition, the maps shown here only indicate potentially suitable habitats. In this regard, our projections are arguably too optimistic: whether the species can migrate towards newly available habitats, or establish there, is highly uncertain. When dispersal limitation and land use are taken into account, effectively accessible suitable sites are much scarcer than potentially suitable sites (Meier et al. 2012). However, other factors usually not taken into account by correlative SDMs, such as local adaptation or phenotypic plasticity, may help species cope with climate change. Process-based and hybrid models provide less alarmist forecasts of species range shifts than correlative SDMs, which can thus be argued to be overly pessimistic (Morin & Thuiller 2009; Cheaib et al. 2012). Overall, actual shifts in distributions are likely to lie between our projections and those of Meier et al. (2012). This stresses the fact that providing consensual projections of species range changes should not preclude efforts to reduce the errors of individual SDMs. Incorporation of realistic dispersal models is a step towards such improvements. Other advances could be gained from including the representation of other processes, such as biotic interactions (e.g. Davis et al. 1998), increases in atmospheric CO2 (as in LPJ) or a finer representation of local adaptation than what is currently implemented in PHENOFIT.

More than a democratic vote

Improving the reliability of SDMs, ultimately aiming at developing reliable, integrated or hybrid models (Morin & Lechowicz 2008; Thuiller et al. 2008), requires much detailed information on the studied species. A more tractable approach is to establish consensus projections from already existing models, and take advantage of the strength of each model. More or less refined ways have been proposed to combine the outputs of individual SDMs to improve their reliability, from the single vote to weighted averages of each SDM output (Marmion et al. 2009). To our knowledge, such consensus methods have only been applied to correlative distribution models. Here, we have combined the outputs of three conceptually different SDMs: a correlative model with physiological grounds, a hybrid model and a process-based model. Because these SDMs do not output probabilities of occurrence, the classical consensus methods were not expected to accurately describe the patterns of species occurrence. Indeed, classical consensus methods are sensitive to the addition of non-predictive models, or of models consistently producing lower- or higher-than-average scores. Accordingly, the Mean, Median and WA consensus models performed worse than the conditional consensus model for the poorly projected species (P. sylvestris).

While projections of the three SDMs mostly differ at the regional scale for the historical period, their forecasts strongly differ at the continental scale for 2081–2100, regardless of the climatic scenario. This calls not only for a consensus modelling of the probability of occurrence but most importantly for quantifying uncertainty. So far, uncertainty has been considered as the variance between the projections of SDMs, all of the same family (e.g. linear models, Hartley, Harris & Lester 2006). However, we argue that uncertainty should be seen as the spatially explicit deviance of predictions to the observations. Indeed, models from a given family, or models using a certain set of processes or of environmental variables, are likely to produce (possibly cryptically) environmentally clustered errors, which can only be detected through comparison with observed occurrences, and neither through inter-model variance, nor through global quality estimators such as AUC.

When SDMs are based on biological processes, uncertainty is likely to vary with abiotic variables, and to be highest in regions where a weakly modelled process is crucial to explain the species' occurrence. For example, the version of PHENOFIT presented here uses a simplistic representation of water stress, and is thus expected to yield poor predictions in dry areas. If all SDMs were given a constant weight in a consensus model (i.e. a constant confidence, for example proportional to their AUC; Marmion et al. 2009), this weight for PHENOFIT would strongly rely upon the proportion of dry pixels over the simulation window. In contrast, in our conditional consensus model, the contribution of PHENOFIT to the consensual projection of the probability of occurrence depends on the output of the other two models and on the realized distribution, while confidence in its projections relies upon climatic variables.

Limitations of the conditional consensus model

A useful perspective would be to make use of the continuous outputs of the individual SDMs to generate the consensus model: for instance, instead of coercing the individual SDM outputs into binary occurrences, one could consider more subsets, through incorporating classes of low, medium and high outputs for each model. This could help account for gradual response curves of the species to abiotic variables (Meynard & Kaplan 2012). We chose to use only two classes of outputs for each model, to avoid over-fitting: obviously, such an approach is limited by the amount of available data as the more subsets are used, the more likely it is that one of the subsets corresponds to fewer pixels than necessary to statistically assess the link between model deviance and environmental variables.

The modelling of uncertainty (step 4) might be improved by incorporating non-climatic variables, such as soil quality, and by taking spatial autocorrelation into account so as to reduce uncertainty in parameter estimation. Various methods have been developed to account for spatial autocorrelation in data (Dormann et al. 2007; Beale et al. 2010); however, the most efficient ones (Moran eigenvector approaches, generalised linear models with explicit spatial covariance) are still too computationally intensive to be tractable on large numbers of points. Because we have not accounted for the effect of spatial autocorrelation in step 4, we expect (i) unbiased estimates of model coefficients, but (ii) larger standard errors of these estimates around the expected value (McGill 2012), and (iii) a slight tendency for model comparison procedures to favour over-parameterized models (as both model uncertainty and the variables used for regression are spatially autocorrelated).
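Even when no correction is applied, the degree of spatial autocorrelation in a deviance map can be diagnosed at low cost. The sketch below computes Moran's I on a small grid with binary rook (four-neighbour) weights. The function name, the grid values and the choice of neighbourhood are assumptions made for illustration, and this diagnostic is not one of the correction methods cited above.

```python
def morans_i(grid):
    """Moran's I for a rectangular grid with binary rook-neighbour weights."""
    n_rows, n_cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    mean = sum(values) / len(values)
    dev = [[v - mean for v in row] for row in grid]

    cross, n_links = 0.0, 0  # sum of neighbour cross-products, number of links
    for r in range(n_rows):
        for c in range(n_cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    cross += dev[r][c] * dev[rr][cc]
                    n_links += 1

    squared_dev = sum(d * d for row in dev for d in row)
    return (len(values) / n_links) * cross / squared_dev


# Hypothetical deviance maps: spatially clustered versus alternating values.
clustered = [[0, 0, 1, 1] for _ in range(4)]
checkerboard = [[(r + c) % 2 for c in range(4)] for r in range(4)]

print(round(morans_i(clustered), 3))     # positive: similar values are neighbours
print(round(morans_i(checkerboard), 3))  # negative: neighbours always differ
```

Values well above zero, as for the clustered grid, would indicate that neighbouring pixels do not provide independent information for the regression in step 4.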

The main strength of the conditional consensus model is that it characterizes uncertainty in a spatially explicit, environment-dependent way. However, this approach can only be used with models that are thought to be extrapolable to future conditions. We chose not to include purely correlative models in our approach, despite their high accuracy in projecting historical distributions. First, their errors are unlikely to be environmentally clustered, as all available potentially explanatory layers would already have been included in designing the model. Secondly, finding the right amount of model complexity, leading both to accurate projections of current conditions and to extrapolable relationships between climate and occurrence, is a perilous task. The consensus approach presented here is sensitive to the addition of overfitted models, which would drive the outputs of the consensus model to the detriment of locally less accurate, but possibly more robust, models. It does, however, help to get rid of consistently inaccurate models.

SDMs relying on physiological processes may be more plausibly extrapolated to non-analogue conditions because, even though the reaction norms of trees' physiological traits to climate may evolve within a few generations, they are likely to be conserved over the next few decades (Dormann et al. 2012). However, although it is carried out over a wide range of climatic conditions, the parameterization of process-based SDMs may also be inaccurate under the novel combinations of environmental conditions expected for the coming century (Williams, Jackson & Kutzbach 2007). For example, phenological models in PHENOFIT are parameterized using empirical relationships between time-series observations of phenological events in natural populations and daily temperature and photoperiod. As these two factors are broadly correlated in nature, their relative importance is difficult to capture: different parameter sets may describe current phenology with equal likelihood, yet their projections to future conditions may differ. This is why models of this kind constantly need refining through the incorporation of experimental data (Caffarra, Donnelly & Chuine 2011; Basler & Körner 2012). The increase in atmospheric CO2 also generates non-analogue conditions, and its impact on vegetation dynamics, functioning and distribution remains under debate, calling for ecophysiological experiments (Körner 2000; Prentice & Harrison 2009).


The authors thank the EMBERS group of Lund University for providing STASH and LPJ models, the ATEAM project for providing climatic datasets, and the GDR CNRS 2968 Observatoire Des Saisons, the ONF-RENECOFOR Network, the ONF-Seed Service Sécherie de la Joux and the French Public Arboreta Network for providing the phenological observations used to parameterize PHENOFIT. We thank Rémi Choquet and Olivier Gimenez for advice regarding statistical analyses, and Fabien Laroche, Christine Meynard, Xavier Morin, Pedro R. Peres-Neto, A. Townsend Peterson and four anonymous referees for critical reading of the manuscript. We acknowledge financial support from the Agence Nationale de la Recherche (projects ANR-05-BDIV-009 QDiv; ANR-2009-PEXT-001105 SCION). AD was funded by a Marie Curie IOF grant (TRECC-2009-237228) within European Commission's FP7.