Predicting potential distributions of invasive species: where to go from here?


  • All authors contributed equally to the paper.

Correspondence: Laure Gallien, Laboratoire d’Ecologie Alpine, CNRS UMR 5553, Université Joseph Fourier, BP 53, 38041 Grenoble Cedex 9, France.


Aim  There has been considerable recent interest in modelling the potential distributions of invasive species. However, research has developed in two opposite directions: the first, focusing on screening, utilizes phenomenological models; the second, focusing on predictions of invasion dynamics, utilizes mechanistic models. Here, we present hybrid modelling as an approach to bridge the gap and to integrate the advantages of both research directions.

Location  Global.

Methods  First, we briefly summarize the characteristics and limitations of both approaches (screening vs. understanding). Then, we review the recent developments of hybrid models, discuss their current problems and offer suggestions to improve them.

Results  Generally, hybrid models are able to combine the advantages of currently used phenomenological and mechanistic approaches. Main challenges in building hybrid models are the choices of the appropriate degree of detail and efficiency and the decision on how to connect the different sub-models. Given these challenges, we discuss the links between the phenomenological and the mechanistic model parameters, the underlying concepts of fundamental and realized niches and the problem of feedback loops between population dynamics and environmental factors.

Main conclusions  Once the above challenges have been addressed and the necessary framework has been developed, hybrid models will provide outstanding tools for overcoming past limitations and will provide the means to make reliable and robust predictions of the potential distribution of invasive species, their population dynamics and the potential outcomes of the overall invasion process.


Biological invasions, resulting in biotic exchange and subsequent homogenization, are a major component of global change (Vitousek et al., 1997). The anthropogenic displacement of species when followed by permanent establishment, rapid colonization and uncontrolled spread, i.e. biological invasion (Pyšek et al., 2004), modifies native diversity, ecosystem functioning and associated goods and services (Vitousek et al., 1997). Predicting and understanding invasion processes is therefore essential for management actions and policies. The search for common patterns among different invasion events has produced a large body of literature focussing on the intrinsic properties of invaders (reviewed in Rejmánek et al., 2005; Pyšek & Richardson, 2007), the propensity of natural communities to be invaded (Rejmánek et al., 2005) and the relationship between invaders’ distributions and environmental factors (Thuiller et al., 2006; Wilson et al., 2007). Although this work has improved our understanding of invasions and has fostered the development of improved approaches for screening, our ability to reliably predict invasion processes is still very limited. A number of limitations result from the fact that studies have traditionally focused either on ‘brute-force’ broad-scale screening and multi-species predictions (Peterson et al., 2008) or on incorporating local-scale processes to analyse species-specific dynamic outcomes (Higgins et al., 1996). However, increased computer power now allows the advantages of these two approaches to be combined, offering a promising avenue towards better models for predicting which species could invade and what the course and outcome of invasions could be.

Broad-scale screening approaches aim to predict which species have the ecological niche to potentially maintain viable populations in a given area (Peterson & Vieglais, 2001). They rely on phenomenological habitat suitability models (HSMs) that describe and extrapolate patterns and relationships (Daehler et al., 2004; Kolar, 2004). HSMs are based on the ecological characteristics of known occurrences in the native distribution of a species and aim to identify the suitable local areas in a potentially available new range (Peterson & Vieglais, 2001). Screening approaches do not directly account for underlying processes but assume that the influence of local processes can be captured indirectly by analysing patterns at larger spatial scales (Ficetola et al., 2007; Beaumont et al., 2009a; Roura-Pascual et al., 2009b). However, this underlying assumption might be violated when extrapolating into new regions and under global change (e.g. climate or land use changes), resulting in potentially erroneous predictions (Davis et al., 1998; Dormann, 2007). Nonetheless, screening approaches, promoted by the increasing availability of environmental and distributional data, have been successfully applied to describe and extrapolate presence/absence patterns for large numbers of potentially invading species and over large areas (Daehler & Carino, 2000; Roura-Pascual et al., 2004; Thuiller et al., 2005; Ficetola et al., 2007). Although HSMs have been criticized (see next section; Mack, 1996; Hulme, 2003), their efficiency in predicting invasions is of primary importance for preventive invasion management. The reason is that attempts to eradicate invasive species after their establishment incur colossal costs and are often unsuccessful (Perrings et al., 2005; Pimentel et al., 2005).

Alternatively, approaches that aim to predict the spread and dynamic outcomes of invasions usually incorporate demographic processes and/or landscape structure. They are mostly applied to address questions focusing on the demographic dynamics of invasive species after their establishment: How is the species likely to spread (Higgins et al., 1996)? How is the species going to influence the native community? Mechanistic simulation models are the tools of choice for such purposes as they are able to explicitly incorporate local-scale processes and dynamics (Table 1). As these models directly simulate the mechanistic link between the environment, biotic interactions and the invaders’ demographic responses, they are supposed to be less prone to produce erroneous predictions for new regions and under global change (Morin & Lechowicz, 2008). However, building mechanistic models is highly data demanding and involves more complex model structures, for which better expert knowledge and process-based understanding are required.

Table 1.   Broad classification of different modelling techniques mentioned in the article and their associated key references. This table is a toolbox for hybrid model builders
Type of model | Description | Key reference | Example of use in invasion ecology
Curve fitting model (CFM) | A CFM is a formula-based description of a process or a pattern, typically analytically solvable. It is often used as a sub-model in a more complex model. Examples are CFMs describing dispersal kernels (e.g. fat-tailed negative exponential models) or population dynamics (e.g. logistic model) | (May, 1976) | Parameterization of dispersal ability of an invasive species (Skarpaas & Shea, 2007)
Matrix population model (MPM) | An MPM describes the growth process of individuals or cohorts via life-stages and transition probabilities (e.g. using Leslie matrices) and is analytically solvable. Examples of applications are population viability analyses. There is no information on space | (Caswell, 2001) | Evaluation of the local dynamic of an invasive species (Sebert-Cuvillier et al., 2007)
Metapopulation model (MM) | An MM describes the demographic dynamics of a population living on suitable habitat patches within a hostile matrix of unsuitable habitat. The main focus is on extinction and colonization of local populations. The simpler MMs are analytically solvable (e.g. incidence function models). More complex MMs can be spatially explicit and can describe dispersal, reproduction and competition explicitly | (Hanski & Gaggiotti, 2004) | Evaluation of the risk of introduction of a non-native species (Deines et al., 2005)
Cellular automaton (CA) | CAs are stochastic spatially explicit models that may be used to describe spread and spatial interactions. Each cell on a grid evolves through discrete time steps according to a set of rules based on the states of neighbouring cells. They are typically used to explore colonization processes and patterns | (Bolliger et al., 2003) | Could be used to evaluate the influence of initial spatial structure on the spread of an invasive species (Ferrari & Lookingbill, 2009)
Landscape model (LM) | LMs are spatially explicit models aiming at projecting a landscape (structure, function, composition) over time. They can include spatial interactions, community dynamics and/or ecosystem processes. LMs are typically used to simulate different management or global change scenarios. Two broad classes of examples are gap/landscape models (e.g. LANDIS, ForCLIM) and dynamic vegetation models (e.g. IBIS, LPJ) | (Scheller & Mladenoff, 2007) | There are few examples of landscape models in invasion ecology. Could be used to evaluate the colonization dynamic of a species (Albert et al., 2008)
Individual-based model (IBM) | IBMs are models that focus on units (e.g. individuals, populations…) and their interactions. They describe processes at small scales that directly influence the units. IBMs are typically used to investigate patterns emerging at larger scales and to make predictions | (Grimm & Railsback, 2005) | Could either describe the invasion process qualitatively (Travis et al., 2007) or quantify the results of the invasion process (Nehrbass & Winkler, 2007)
Mechanistic niche model (MNM) | MNMs are based on niche theory and describe the link between a species and its environment from the relationship between species’ characteristics (behaviour, morphology, physiology…) and environmental factors. They are mainly used to predict patterns of species distribution over space and/or time | (Kearney & Porter, 2009) | Predictions of the cane toad’s distribution under future climatic scenario (Kearney et al., 2008)
Habitat suitability model (HSM) | HSMs are statistical models that are based on niche theory and fit the link between a species and its environment from occurrence or abundance data and environmental data. They are mainly used to predict patterns of species distribution over space and/or time | (Guisan & Thuiller, 2005) | Large-scale predictions of the risk of invasion by an alien species (Thuiller et al., 2005)
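As an illustration of the simplest non-spatial entry in Table 1, the following minimal Python sketch projects a stage-structured population with a Leslie matrix and approximates its asymptotic growth rate. The matrix entries, the function names `project` and `growth_rate`, and the use of power iteration are illustrative assumptions and are not taken from any of the cited studies.

```python
def project(leslie, stages, steps):
    """Project stage abundances forward in time; returns the list of abundance vectors."""
    history = [list(stages)]
    for _ in range(steps):
        current = history[-1]
        history.append([sum(a * n for a, n in zip(row, current)) for row in leslie])
    return history


def growth_rate(leslie, iterations=200):
    """Approximate the dominant eigenvalue (asymptotic growth rate) by power iteration."""
    vec = [1.0 / len(leslie)] * len(leslie)  # stage distribution, sums to 1
    rate = 0.0
    for _ in range(iterations):
        nxt = [sum(a * n for a, n in zip(row, vec)) for row in leslie]
        rate = sum(nxt)  # growth of total abundance over one time step
        vec = [x / rate for x in nxt]  # renormalize to a stage distribution
    return rate


# Hypothetical three-stage life cycle: the first row holds fecundities,
# the sub-diagonal holds survival probabilities between stages.
leslie = [
    [0.0, 1.5, 2.0],
    [0.5, 0.0, 0.0],
    [0.0, 0.8, 0.0],
]

trajectory = project(leslie, [10, 0, 0], steps=5)
rate = growth_rate(leslie)  # a value above 1 indicates a growing population
```

A growth rate above one would indicate local population increase, but, as noted in Table 1, such a model carries no information on space.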

Today, there is a growing awareness that the advantages of both phenomenological models (most notably their efficiency on broad spatial scales and for many species) and mechanistic models (most notably their ability to model new situations) are necessary to improve our ability to predict accurately the outcome of invasion (Morin & Lechowicz, 2008; Thuiller et al., 2008; Brook et al., 2009; Franklin, 2010). This is because invaders often encounter completely new settings in the adventive range. These new settings are not captured by the broad-scale relationship perceived in the native range. For example, invaders may encounter new types of landscapes, new barriers, new competitors or enemies. To account for such differences between the native and the invaded range, we need to model the processes that are sensitive to these differences. The idea of modelling species distributions on the basis of large-scale relationships while at the same time considering the most important processes has recently led to the development of so-called hybrid models (Morin & Lechowicz, 2008; Thuiller et al., 2008; Brook et al., 2009). We believe that these hybrid models can solve some of the most important problems occurring when projecting species distributions in space and time, and we aim at advancing their use in invasion ecology.

In the following, we introduce the different models used either for screening and broad-scale predictions or for predictions of dynamic outcomes and discuss their respective purposes and limitations. We then briefly review how these contrasting approaches have been combined into hybrid models to overcome the conceptual and statistical shortcomings underlying the single approaches. However, hybrid models have only recently been developed and can be improved in several ways. Therefore, we finally develop a set of rules of thumb to facilitate and improve the use of hybrid models for predicting invasion events and suggest solutions to overcome some of their current limitations.

Approaches to predict invasions

Screening and broad-scale predictions

Screening studies are based on phenomenological habitat suitability models (HSMs), which statistically relate species occurrences to environmental variables (Guisan & Thuiller, 2005; Franklin, 2010). Although species distributions are co-determined by various physical factors (e.g. temperature or soil pH), biotic interactions (e.g. predation or pollination) and disturbances, climate is often seen as the main driver at large spatial scales (Woodward & Williams, 1987; Willis & Whittaker, 2002). Thus, at first, HSMs were often solely based on climatic data (Franklin, 1995; Guisan & Zimmermann, 2000; Heikkinen et al., 2006) but they have been refined afterwards with data representing other aspects of the environment, such as land use, soil or productivity (Pearson et al., 2004; Bradley & Mustard, 2006; Ficetola et al., 2007). Generally, researchers have either calibrated models using the species’ native range to extrapolate the patterns found into the adventive range (e.g. Beerling et al., 1995; Welk et al., 2002; Peterson et al., 2003; Richardson & Thuiller, 2007; Ibanez et al., 2009) or simply calibrated the model in the adventive range to predict the potential extent of species’ distribution (Zalba et al., 2000; Roura-Pascual et al., 2004; Parker-Allie et al., 2009). By using such an environment-based approach, scientific efforts have focused on identifying potential invasive species through environmental matching (Peterson et al., 2008). The use of this approach is related to one of the main hypotheses in invasion ecology, which states that the environments of the native and adventive ranges have to be similar to allow for a successful invasion (Panetta & Mitchell, 1991; Scott & Panetta, 1993).
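The calibrate-then-project logic described above can be sketched in a few lines of Python. This is a deliberately minimal illustration under strong assumptions: a single standardized climatic predictor, made-up presence/absence data, and a plain logistic regression fitted by gradient descent standing in for the much richer statistical methods used in published HSMs. The function names `fit_hsm` and `predict_suitability` are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_hsm(temperatures, presences, rate=0.1, epochs=5000):
    """Fit presence ~ temperature by logistic regression (batch gradient descent)."""
    intercept, slope = 0.0, 0.0
    n = len(temperatures)
    for _ in range(epochs):
        grad_intercept = grad_slope = 0.0
        for x, y in zip(temperatures, presences):
            error = sigmoid(intercept + slope * x) - y
            grad_intercept += error
            grad_slope += error * x
        intercept -= rate * grad_intercept / n
        slope -= rate * grad_slope / n
    return intercept, slope


def predict_suitability(model, temperature):
    """Return the modelled probability of presence for one site."""
    intercept, slope = model
    return sigmoid(intercept + slope * temperature)


# Hypothetical native-range data: standardized temperature and presence (1) / absence (0).
native_temperature = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
native_presence = [0, 0, 0, 1, 0, 1, 1, 1, 1]

# Step 1: calibrate in the native range.
model = fit_hsm(native_temperature, native_presence)

# Step 2: project onto hypothetical sites in the adventive range.
adventive_temperature = [-1.8, 0.2, 1.8]
adventive_suitability = [predict_suitability(model, t) for t in adventive_temperature]
```

The projection step is only as reliable as the assumption that the species-environment relationship fitted in the native range also holds in the adventive range.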

Habitat suitability models have a limited accuracy in providing predictions of future invasions as they do not explicitly incorporate the demographic processes driving species distribution and invasion rates (e.g. fecundity and dispersal ability). However, they are particularly efficient for assessing the invasive potential of large numbers of species before their introduction (Peterson & Vieglais, 2001) and are often reasonable alternatives when other modelling tools are missing or are excessively time or money consuming.

Besides specific limitations for the application to invasions, HSMs also have some further well-known and well-described limitations that we will not detail here (Guisan & Thuiller, 2005; Bahn & McGill, 2007; Dormann, 2007). In the context of biological invasions, HSMs are prone to predict substantial false presences and false absences because of the non-equilibrium nature of the invader’s distribution. False presences can be predicted when environmental variables not included in the models (such as soil type, disturbance regime or interspecific interactions) limit the naturalization of a species in the invaded range. False absences occur if a species’ potential distribution has not been realized in its native range because of non-equilibrium dynamics, e.g. because of historical constraints attributable to human influences or because of physical barriers that prevent full range occupancy (Curnutt, 2000).
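The two error types discussed above can be counted directly once predictions are compared with observations. The short Python sketch below does this for made-up presence/absence vectors; the function name and data are hypothetical. For a spreading invader, an observed absence may simply be a suitable site that has not yet been reached, so a "false presence" counted in this way is not necessarily a model error.

```python
def prediction_errors(observed, predicted):
    """Count false presences and false absences in presence (1) / absence (0) data."""
    false_presences = sum(1 for o, p in zip(observed, predicted) if p and not o)
    false_absences = sum(1 for o, p in zip(observed, predicted) if o and not p)
    return {
        "false_presences": false_presences,
        "false_absences": false_absences,
        # Rates are relative to the number of observed absences and presences.
        "false_presence_rate": false_presences / max(1, observed.count(0)),
        "false_absence_rate": false_absences / max(1, observed.count(1)),
    }


# Hypothetical survey of eight sites in the invaded range.
observed = [1, 1, 0, 0, 0, 1, 0, 0]
predicted = [1, 0, 1, 1, 0, 1, 0, 0]
errors = prediction_errors(observed, predicted)
```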

In the native range, a given species occurs at the intersection of suitable (climate, resource), available (biotic interactions, habitat disturbance) and reachable (dispersal) habitats (Soberon, 2007). In the absence of source-sink dynamics, this intersection, commonly called the realized niche of the species (Hutchinson, 1957), is theoretically smaller than the species’ fundamental niche (Pulliam, 2000). Comparing the realized niche within the native vs. the invaded ranges can lead to three non-exclusive theoretical cases (Fig. 1). First, in the invaded range, the species could use a similar or smaller realized niche than in the native range. This case is expected when the environment and the outcomes of biotic interactions in an adventive area are comparable to the native area (case 1 in Fig. 1). Only in this case are the assumptions of HSMs fully met, and we can expect reliable predictions from an HSM exclusively calibrated with data from species’ native ranges (Thuiller et al., 2005). Second, the introduced species may occupy a realized niche very different from the one in the native area, for instance because of a new predator community, multiple sites of introduction, niche differentiation (e.g. in ploidy level, Treier et al., 2009) or different environmental conditions (case 2 in Fig. 1). In this case, a model exclusively calibrated with data from species’ native ranges will fail by predicting erroneous potential ranges (Broennimann et al., 2007; Fitzpatrick et al., 2007). This problem can be partly addressed for world-wide invaders by using all known occurrences (both from the native and invasive ranges) to calibrate the model (e.g. Kearney et al., 2008; Beaumont et al., 2009b). Third, the species could undergo rapid genetic adaptation. Genetic adaptation violates the underpinning assumption of slow niche evolution when predicting species distribution with HSMs (Holt, 1992) and is probably the most difficult case to account for (case 3 in Fig. 1).
In recent years, several studies have challenged the assumption of slow niche evolution by demonstrating that some invasive species have rapidly evolved during the course of invasion because of genome size reduction (Lavergne & Molofsky, 2007), genetic bottleneck, converging selection, mutations (Phillips et al., 2008b) or hybridization (Hall et al., 2006). In this case, the realized niche may extend outside of the species’ initial fundamental niche. The only way to address this issue in an HSM framework is to calibrate the model with all known occurrences and make sure that the model also includes occurrences from the particular invaded range where rapid adaptation is ongoing. Calibrating habitat suitability models on all known occurrences could also lead to some particular problems, which depend on the overall goal of the analysis. Indeed, models calibrated on all known occurrences are likely to over-predict the distribution in the invaded range of a species currently invading a particular area. The researcher will have to decide whether this is a problem or not. When the goal is to predict the potential distribution of the species for prevention, it is clearly useful to know where the species could further invade, and one would be more tolerant with regard to false presence predictions. When the goal is to understand and possibly eradicate the species, a model producing a better match with the current distribution in the invaded range is probably more acceptable.

Figure 1.

The realized niche dilemma in predicting invasion risk based on habitat suitability models. Several possibilities exist for the realized niche in the invaded range compared to the realized niche in the native range. Case 1: The realized niche in the invaded range is similar to the one in the native range. It can occur if the outcome of biotic interactions is similar in both ranges. Case 2: The realized niche in the invaded range may be very different from the one in the native range. It can occur because of different biotic interactions such as enemy release, different access to sites because of introduction, different environmental conditions, or different niches because of populations having different ploidy levels. Case 3: The realized niche in the invaded range may be partially outside the fundamental niche because of rapid genetic adaptation.

Processes and predictions of dynamic outcomes

Although phenomenological HSMs have been the tool of choice for screening purposes, they have not improved our understanding of the dynamics underlying invasions and their outcomes. A better understanding of invasions requires a better understanding of the demographic processes that drive the spatio-temporal dynamics of invasions and of the characteristics of the invaders, the recipient communities and the environmental variables that influence these processes. Two key demographic processes, dispersal (Hastings et al., 2005) and growth (Jongejans et al., 2008), are known to influence invasions differently at different invasion stages (Dawson et al., 2009). More specifically, Allee effects (Taylor & Hastings, 2005), interspecific interactions (Davis et al., 1998; Richardson et al., 2000), phenotypic plasticity (Wilson et al., 2009), genetic adaptations (e.g. hybridization, Hall et al., 2006), increasing dispersal abilities (Travis et al., 2007; Phillips et al., 2008a) and disturbances (Edward et al., 2009) are examples of processes that have been identified as being important during certain invasions. Improving our understanding of the causal role of demographic processes in invasions can be achieved either on the basis of experiments, for example field, greenhouse (e.g. Leishman & Thomson, 2005) and microcosm experiments (e.g. Davis et al., 1998), or through mechanistic models (With, 2002; Nehrbass et al., 2006).

Mechanistic models of invasions simplify the natural system and reduce it to its basic processes to improve the understanding of the underlying invasion mechanisms (Wissel, 1989). Deciding how much reality should be simplified (i.e. choosing the best level of ecological detail) is one of the hardest questions. The answer depends on the research question and may offer an array of different solutions ranging from theoretical to applied models (Bolker, 2008). On the one hand, mechanistic models can be theoretical models developed to explore a concept without reference to a particular species or place. The results of such theoretical mechanistic models show qualitative hints and trends and can be generalized within the framework of ‘a priori’ assumptions (Bolker, 2008). Theoretical models have also contributed to a growing consensus, for instance on the importance of long-distance dispersal events for range expansions, even though such events are rare and difficult to predict (Hastings et al., 2005). Moreover, dispersal kernels (i.e. the probability function of dispersal distances) might not remain static during invasion events. The process of invasion itself may induce strong selection pressure on species’ dispersal abilities, resulting in increased dispersal at the expanding front (either through mutations or because of higher fitness and resulting agglomeration of strong dispersers, Travis & Dytham, 2002; Phillips et al., 2008a). On the other hand, mechanistic models can be applied models developed with the aim of providing quantitative and detailed predictions on specific cases, and these strive to incorporate more ecological detail. For example, Nehrbass et al. (2006) parameterized and compared a deterministic matrix model and an individual-based model to analyse why a harmful invasive species, the Giant Hogweed Heracleum mantegazzianum, has shown long-term range expansion but short-term population decline.
They identified temporal variability in demographic factors as the main driver of such dynamics and concluded from their model comparison that taking into account the invader’s demography can have strong practical implications for control measures (Nehrbass & Winkler, 2007). The development of applied mechanistic models is constrained by the available expert knowledge used to formulate model rules and functions and by the data needed to parameterize the model. Spatially detailed information on key environmental factors such as pH or soil moisture is often lacking, which can preclude model building.

Both theoretical and applied mechanistic models utilize a broad range of different modelling techniques. Among them are mechanistic niche models (MNM), matrix population models (MPM), metapopulation models (MM), individual-based models (IBM) and landscape models (LM, Table 1). However, a comprehensive description and general classification of different modelling techniques goes beyond the scope of this article and has been presented elsewhere (e.g. Grimm & Railsback, 2005; Jorgensen & McLachlan, 2008; Kearney & Porter, 2009).

Hybrid models – their present and their future

What has been carried out so far?

Recent years have seen the emergence of hybrid models (e.g. Morin & Lechowicz, 2008; Thuiller et al., 2008; Franklin, 2010) that aim to overcome former statistical and conceptual limitations by integrating both (1) the predictive accuracy of phenomenological models at large spatial scales and (2) the ability to capture dynamics of mechanistic models.

A number of studies have successfully hybridized different model types to predict the spread of invasive species or the extinction threats of endangered species (Jeltsch et al., 2008; Thuiller et al., 2008). The simplest model combination, which is so far also the most commonly used one for predictive biogeography, is the association of an HSM with a spatially explicit applied mechanistic model such as spread, metapopulation or landscape models (Albert et al., 2008; Keith et al., 2008; Anderson et al., 2009; Brook et al., 2009; Dullinger et al., 2009; Jacobs & MacIsaac, 2009; Roura-Pascual et al., 2009a; Smolik et al., 2009; Table 1). Model combinations can also adopt different forms, but in most cases, the spatially explicit model parameters (e.g. mortality/survival, carrying capacity, dispersal rate) are constrained by the outputs of a habitat suitability model (e.g. probability of presence or presence/absence). The biological reasoning is that such constraints on the parameters by the HSM mimic the change of species’ characteristics and performances throughout the environment (Thuiller et al., 2010). These types of hybrid models rely on the assumption that large-scale environmental gradients (commonly climate) determine which species could persist in a given environment (i.e. habitat filtering, Diamond, 1975), while population dynamic processes take place at smaller spatial scales (Weiher & Keddy, 1995; Lortie et al., 2004).

For example, Roura-Pascual et al. (2009a) successfully reconstructed the invasion spread of the Argentine ant in Catalonia by constraining the metapopulation dynamics governing the cell-state transition by a topo-climatic-based habitat suitability (see also Smolik et al., 2009 on Ambrosia artemisiifolia). Only extinction and colonization rates were restricted (i.e. linearly weighted) by habitat suitability.
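The general idea of linearly weighting colonization and extinction by habitat suitability can be sketched as a small grid-based simulation in Python. This is not the model of Roura-Pascual et al. (2009a): the four-cell neighbourhood, the exact form of the weighting, the parameter names `c_max` and `e_max` and the suitability map are all illustrative assumptions.

```python
import random


def neighbours(cell, shape):
    """Return the four-cell (von Neumann) neighbourhood of a cell inside the grid."""
    rows, cols = shape
    r, c = cell
    candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [(i, j) for i, j in candidates if 0 <= i < rows and 0 <= j < cols]


def step(occupied, suitability, c_max, e_max, rng):
    """Advance the occupancy pattern by one time step.

    Assumed linear weighting by habitat suitability s in [0, 1]:
      colonization probability = c_max * s (only next to an occupied cell)
      extinction probability   = e_max * (1 - s)
    """
    shape = (len(suitability), len(suitability[0]))
    new_occupied = set()
    for r in range(shape[0]):
        for c in range(shape[1]):
            s = suitability[r][c]
            if (r, c) in occupied:
                # Local population persists unless it goes extinct.
                if rng.random() >= e_max * (1.0 - s):
                    new_occupied.add((r, c))
            elif any(n in occupied for n in neighbours((r, c), shape)):
                # Empty cell adjacent to a source may be colonized.
                if rng.random() < c_max * s:
                    new_occupied.add((r, c))
    return new_occupied


# Hypothetical suitability map (e.g. the output of an HSM) and a single introduction point.
suitability_map = [
    [0.9, 0.8, 0.2, 0.1],
    [0.8, 0.7, 0.3, 0.1],
    [0.6, 0.5, 0.4, 0.2],
]
rng = random.Random(42)
occupied = {(0, 0)}
occupancy_through_time = [len(occupied)]
for _ in range(10):
    occupied = step(occupied, suitability_map, c_max=0.6, e_max=0.3, rng=rng)
    occupancy_through_time.append(len(occupied))
```

In this sketch the HSM output only modulates the rates. The spread itself emerges from the neighbourhood rule, which is what allows such a hybrid to represent an invasion that has not yet reached all suitable cells.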

The form of such hybrid models can still increase in ecological details and therefore complexity. To model the population dynamics of an endangered bird species, Wintle et al. (2005) proposed a three-step hybridization where vegetation dynamics were modelled by a spatially explicit landscape model (step 1) (LANDIS, Mladenoff & He, 1999). This landscape model in turn fed the bird habitat suitability model (step 2) which constrained the metapopulation dynamics of the bird (step 3) (RAMAS GIS – Metapop, Akçakaya et al., 2003).

Hybridization of models to predict the spread and dynamics of invasive species is not restricted to habitat suitability and metapopulation models. There are several examples of models developed for a given target species. For instance, SPAnDX, a detailed climate-driven process-based population cohort model combining the approaches of forest growth models and community dynamics models, has been specifically developed to model the population dynamics of Acacia nilotica (Kriticos et al., 2003). Such complex models focused on a single species are not easily applicable to many species, but they can be highly robust and accurate.

Rules of thumb for the hybrid-building process

Typically, hybrid models combine phenomenological habitat suitability models (moderate to high data requirement and low to moderate expert knowledge; Fig. 2) with reasonably complex mechanistic models (low data requirement and moderate to strong expert knowledge), and are therefore complex and data-demanding models (shift towards the upper right corner in Fig. 2). One of the major challenges is then to select the most appropriate sub-models with regard to the research question, the required expert knowledge and data availability at the same time. But how much complexity is still reasonable? The theoretical answer is clear: the minimum overall error is obtained at moderate levels of complexity. Consequently, increasing complexity does not automatically increase model performance (Wissel, 1989). To help decision-making about sub-model selection, we propose here a guideline based on four key questions (see example in Box 1).

Figure 2.

Requirements and objectives of different kinds of models. To understand invasions, experiments or conceptual models can be used to simulate virtual worlds based on known processes. To predict invasions without much knowledge, phenomenological models like habitat suitability models (HSMs) are useful. Mechanistic models may be more accurate in predicting invasions but need a lot of knowledge to implement processes. Hybrid models may be a compromise to improve predictions without detailing all processes.

The first two questions are of equal importance: (Question 1) How much understanding (vs. screening) do we need to fulfil the study goal? When the aim of a study is a screening procedure among a large number of invasive species, then only the demographic processes that are essential for all species can be included. For screening, we may often want to add sub-models for dispersal and/or local extinction processes to HSMs. When a study aims at predicting future distributions of a single invasive species, then more detailed expert knowledge about the species’ ecology can be used to incorporate a larger number of important processes and sub-models. (Question 2) Which processes of invasion are relevant for the studied system at both spatial and temporal scales? For example, if we were to build a hybrid model for a species with increasing dispersal abilities at the leading edge (e.g. Phillips et al., 2008b), we may consider including not only dispersal but also the evolution of the species’ dispersal abilities along with the pre-defined habitat suitability. Decisions on the importance of processes can be aided by recalling that HSMs usually already incorporate all demographic processes of the species implicitly. It is only necessary to explicitly include in a hybrid model those processes that are prone to change species’ relationships with the habitat and the environment during the invasion process.

These first two questions on the choice of prediction detail level and on the selection of the potential processes of interest allow for the delineation of the ‘maximal hybrid model’. This maximal model ideally contains all processes that are important for the study purpose, regardless of the information required and available for the implementation and parameter estimation procedure. Subsequently, the last two questions deal with the feasibility of the hybrid model: (Question 3) For which of the chosen processes do we have sufficient expert knowledge to implement rules and equations? (Question 4) For which of the chosen processes do we have enough available data to parameterize the model? The ultimately selected processes should simultaneously meet the expert knowledge and data requirements. The hybrid model structure chosen through such a hierarchical design contains the most relevant process combination, avoiding the development of overly complex models that could decrease prediction reliability.
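The four questions can be read as a two-stage filter on a list of candidate processes, which the following Python sketch makes explicit. The candidate processes and the yes/no answers attached to them are hypothetical, and in practice each answer would be a matter of degree rather than a boolean.

```python
from dataclasses import dataclass


@dataclass
class Process:
    name: str
    needed_for_goal: bool    # Question 1: required level of understanding
    relevant_at_scale: bool  # Question 2: relevant at the study's spatial and temporal scales
    expert_knowledge: bool   # Question 3: rules and equations can be formulated
    data_available: bool     # Question 4: parameters can be estimated


def maximal_model(candidates):
    """Questions 1 and 2: all processes that matter for the study purpose."""
    return [p for p in candidates if p.needed_for_goal and p.relevant_at_scale]


def feasible_model(candidates):
    """Questions 3 and 4: the subset of the maximal model that can be implemented."""
    return [p for p in maximal_model(candidates) if p.expert_knowledge and p.data_available]


# Hypothetical assessment for a single-species study.
candidates = [
    Process("dispersal", True, True, True, True),
    Process("local extinction", True, True, True, False),
    Process("evolution of dispersal ability", True, True, False, False),
    Process("pollination", False, False, True, True),
]

maximal_names = [p.name for p in maximal_model(candidates)]
selected_names = [p.name for p in feasible_model(candidates)]
```

Comparing the two lists shows which processes are considered important but currently cannot be implemented, which is where additional data collection or expert input would be most useful.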

Hybrid model limitations and suggested improvements

Hybrid models do not aim to predict perfectly but to overcome specific limitations of traditional models. As discussed before, most of the existing hybridizations concern HSM and metapopulation or landscape models. However, several challenges hamper a more extensive use of hybridization approaches. These challenges concern the form, the strength and the direction of the link between the demographic parameters and the HSM, as well as a circularity problem.

Form and strength of the relationship between HSM and demographic parameters

Two essential questions need to be addressed before hybridization: (1) What parameters of the mechanistic model should be constrained by the habitat suitability measure? (2) What link should be established between these model parameters and habitat suitability? Question (1) is rarely addressed explicitly, and the most important parameters of the mechanistic model are generally constrained based on expert knowledge (e.g. carrying capacity, dispersal rate, growth rate). For question (2), the link between habitat suitability and, for example, carrying capacity (Keith et al., 2008; Anderson et al., 2009) or survival/fecundity (Wintle et al., 2005; Albert et al., 2008; Dullinger et al., 2009) is generally assumed to be linear or logistic (but see Kearney et al., 2008). This assumption is not fully supported by experiments or observational analyses. Thuiller et al. (2010) showed that the link between habitat suitability and plant performance can be rather idiosyncratic, not always consistent between and within species and not always following the expected direction (e.g. negative relationships instead of positive), corroborating the few other studies that have investigated these relationships (Wright et al., 2006; Elmendorf & Moore, 2008). At the moment, using linear links between selected parameters and a given HSM is the simplest approach and, given the limited data, the only available option. Solving these problems would require replicating experimental and observational analyses in different environments with different species. Additionally, several articles have shown that HSM outputs can vary strongly depending on the statistical models used (Albert & Thuiller, 2008). This raises the question of whether to select one given HSM or a combination of the most reliable ones (e.g. ensemble forecasting, Araújo & New, 2007; Marmion et al., 2009; Roura-Pascual et al., 2009b).
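The two link forms mentioned above can be written down as follows. This is a minimal sketch, not the formulation of any of the cited studies: the function names, the maximum carrying capacity `k_max` and the logistic `midpoint` and `steepness` parameters are assumptions chosen for illustration.

```python
import math


def linear_link(suitability, k_max):
    """Carrying capacity proportional to habitat suitability (0..1)."""
    return k_max * suitability


def logistic_link(suitability, k_max, midpoint=0.5, steepness=10.0):
    """Carrying capacity as a sigmoid function of habitat suitability.

    midpoint and steepness are hypothetical shape parameters; they would
    have to be estimated from demographic data.
    """
    return k_max / (1.0 + math.exp(-steepness * (suitability - midpoint)))


# Compare both links over a few suitability values (k_max is arbitrary).
for s in (0.2, 0.5, 0.9):
    print(s, linear_link(s, 100.0), round(logistic_link(s, 100.0), 1))
```

With these parameter values, both links agree at the midpoint, but the logistic link gives a much lower carrying capacity in poor habitat and a higher one in good habitat. The choice of form can therefore change the simulated dynamics considerably even when the HSM is identical.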

Alternatively, to address the limitations mentioned above, hybrid models could use only presence/absence predictions instead of habitat suitability on a continuous scale between 0 and 1. In this way, the HSM only identifies the areas where the species could occur, and the applied mechanistic model then simulates the demography based on competition, dispersal, extinction and disturbance (Albert et al., 2008). This would avoid dealing with potentially erroneous assumptions about the type and form of relationships between habitat suitability and model parameters. However, using presence/absence predictions requires the transformation of the continuous habitat suitability information into binary presence/absence using a particular threshold. The selection of an optimal threshold has been reviewed extensively in the past (e.g. Liu et al., 2005; Hirzel et al., 2006) and needs to be carefully considered in the invasion context.
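The transformation step can be sketched as below. The grid values and both thresholds are hypothetical; the sketch only shows how the choice of threshold changes the set of cells in which a mechanistic model would be allowed to run.

```python
def to_presence_absence(suitability, threshold):
    """Convert a grid of continuous suitability values (0..1) to booleans."""
    return [[value >= threshold for value in row] for row in suitability]


def suitable_cells(presence):
    """List the (row, column) cells in which the mechanistic model may run."""
    return [(i, j)
            for i, row in enumerate(presence)
            for j, present in enumerate(row)
            if present]


# Hypothetical HSM output for a 2 x 3 landscape.
suitability = [[0.10, 0.45, 0.80],
               [0.30, 0.55, 0.05]]

strict = suitable_cells(to_presence_absence(suitability, threshold=0.5))
liberal = suitable_cells(to_presence_absence(suitability, threshold=0.25))
```

Here the stricter threshold leaves two cells open to the population model and the more liberal one leaves four, which illustrates why the threshold deserves careful thought.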

One-way or two-way interactions

An additional shortcoming of hybrid models is that they are mostly based on a one-way interaction between a model that is supposed to give patch quality or habitat suitability and another model that is supposed to simulate population and community dynamics. However, in the case of invasive species, this one-way interaction could be of limited relevance if the invader is known to modify the environment and the availability of resources. Examples range from nitrogen-fixing plant species that modify ecosystem functioning (Vitousek et al., 1997) and resource use to animal invaders that could influence dispersal dynamics of vegetation. Future developments of hybrid models for modelling invasions should focus on implementing two-way interactions between sub-models to allow for feedbacks.
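The difference between one-way and two-way coupling can be illustrated with a toy simulation. Everything in this sketch is an assumption made for illustration: a single cell, discrete logistic growth, a linear link between suitability and carrying capacity, and a `feedback` parameter through which the invader raises the suitability of its own habitat (as a nitrogen-fixing plant might).

```python
def simulate(steps, suitability, n0, k_max=100.0, r=0.5, feedback=0.0):
    """Simulate abundance in one cell; return (abundance, suitability) per step.

    feedback = 0 gives the usual one-way coupling (the HSM constrains the
    population). feedback > 0 lets the population modify suitability in return.
    """
    n = n0
    history = []
    for _ in range(steps):
        k = k_max * suitability  # HSM -> demography (assumed linear link)
        n = max(0.0, n + r * n * (1.0 - n / k)) if k > 0 else 0.0
        # Demography -> environment: suitability shifts with abundance.
        suitability = min(1.0, max(0.0, suitability + feedback * n / k_max))
        history.append((n, suitability))
    return history


one_way = simulate(steps=100, suitability=0.5, n0=5.0)
two_way = simulate(steps=100, suitability=0.5, n0=5.0, feedback=0.05)
```

With one-way coupling the population settles at the carrying capacity implied by the initial suitability. With the positive feedback it keeps improving its habitat and ends at a much higher abundance, an outcome that a static HSM could not anticipate.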


Using habitat suitability models to constrain the invader’s population dynamics raises another problem linked to the circularity of the modelling process. There is an ongoing debate on the exact meaning of the output of HSMs: do they represent species habitat or species niche (Kearney, 2006), the realized or the fundamental niche, or the realized or the potential distribution (Soberon, 2007)? Ideally, if it is to be used to influence the population dynamics (e.g. demography) of the target species in mechanistic models, the HSM should predict the fundamental niche of the invader and not the realized niche. This is not necessarily the case if, for instance, the mechanistic model only concerns dispersal. However, HSMs implicitly and indirectly account for biotic interactions, disturbance effects, land use legacy and dispersal limitations. This might be problematic because using habitat suitability model outcomes calibrated on observed distributions will lead the hybrid model to account for biotic interactions twice. This is likely to result in under-prediction of the potential distribution of the invader. A way to deal with this problem could be the use of very liberal models (to avoid false absences) that do not overfit (i.e. that down-weight false presences) to depict only the broad range limits of the invader, and to let the applied mechanistic model simulate the population dynamics within the potential range.
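One simple way to make a binary prediction liberal, offered here as an assumption rather than as the procedure proposed above, is to set the presence/absence threshold from the suitability values predicted at known occurrence records, so that few or none of them are classified as absences. The function name, the `omission` parameter and the example values are hypothetical.

```python
def liberal_threshold(suitability_at_presences, omission=0.0):
    """Threshold that excludes at most a fraction `omission` of known presences.

    With omission=0.0 this is the lowest suitability predicted at any
    occurrence record, so no known presence becomes a false absence.
    """
    ranked = sorted(suitability_at_presences)
    allowed = int(omission * len(ranked))  # number of records we accept losing
    return ranked[allowed]


# Hypothetical HSM predictions at five occurrence records.
at_presences = [0.35, 0.80, 0.60, 0.15, 0.90]

no_omission = liberal_threshold(at_presences)
some_omission = liberal_threshold(at_presences, omission=0.2)
```

Cells at or above the returned threshold would delimit the broad potential range, leaving the mechanistic model to decide where populations actually persist.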


While tremendous progress has been made on many aspects related to the building and evaluation of phenomenological HSMs and theoretical and applied mechanistic models in the context of biological invasions, future efforts should focus on combining the advantages of these various approaches. Phenomenological habitat suitability models have been mostly applied to predict the potential distribution of many species in adventive ranges, ignoring population dynamics and resistance of the native communities, while mechanistic models have been used to understand invasion dynamics once the invader was introduced, mainly ignoring the influence of environmental conditions.

Recent years have seen the emergence of a new generation of models that capitalize on the strengths and advantages of both approaches and concepts to make more reliable and useful predictions. These hybrid models typically use phenomenological models to constrain demographic parameters of meta-population or landscape models. Important aspects of hybrid models requiring deeper examination include: (1) the form and strength of the link between habitat suitability and demographic parameters; (2) the potential circularity involved in the use of habitat suitability models, which indirectly already account for biotic interactions and limited dispersal, to constrain demographic parameters; and (3) the one-way interaction between HSMs and mechanistic sub-models, which may not always be robust, especially for invaders that may influence the environment in return. Additionally, we argue that the conception of a hybrid model should not rely on a general a priori design but should follow simple strategic steps based on the following criteria: (1) whether the ultimate goal is to predict, to understand or both; (2) the relevant processes for the studied system; (3) the selection of processes with enough expert knowledge; (4) the selection of processes with enough available data.

Once these challenges are addressed and the framework is rigorously built, hybrid models will provide outstanding tools to overcome past limitations and to make reliable and robust predictions not only of the potential distribution of an invader but also of its population dynamics and the outcomes of the overall invasion process.


We thank David Richardson for the invitation to write this article and for his insightful comments and editing of the manuscript. We also thank Marcel Rejmánek and four anonymous reviewers who helped to significantly improve the article. This work was funded by the ANR SCION (ANR-08-PEXT-03) and DIVERSITALP (ANR-07-BDIV-014) projects. We also received support from the European Commission’s FP6 ECOCHANGE (Challenges in assessing and forecasting biodiversity and ecosystem changes in Europe, No 066866 GOCE) project.


The EMABIO team (Evolution, Modelling and Analysis of BIOdiversity), led by Wilfried Thuiller, focuses on revealing the evolutionary mechanisms that have generated extant biodiversity and its spatial patterns, especially climatic niche evolution and its effect on species diversification. The EMABIO team also aims to address the mechanisms governing the assembly of biotic communities, such as the genetic effects of keystone species on plant and microbial communities, but also the effects of dispersal limitation, environmental filtering and resource competition. Finally, the EMABIO team proposes new modelling tools and generates biodiversity assessments at various spatial scales (continental, national and regional).

Author contributions: All authors conceived the ideas and participated in the writing, L.G. led the writing.

Editor: Marcel Rejmánek

Box 1

The use of a guideline based on the four rules of thumb can be illustrated with the study of Williams et al. (2008). They developed a hybrid model to predict the potential spread of the orange hawkweed (Hieracium aurantiacum) from the Bogong High Plains to alpine areas of Australia. The goal of their study was to facilitate early detection of new populations before high abundance threatens native biodiversity, which implies a moderate need for mechanistic understanding (Question 1). The whole target area was an alpine region, the grain size was 20 × 20 m, and the temporal scale was the near future (Question 2). From this information, one could expect them to combine an HSM with processes such as interspecific interactions, disturbance, demography or evolutionary adaptation.

In fact, with few data but considerable expert knowledge and information from the literature, they created an HSM describing the conditions under which hawkweed establishment is highly likely (including native vegetation community type, wetness and disturbance). However, they did not possess enough empirical records to create a mechanistic model, so they decided to model the likelihood of seed dispersal from known populations according to the wind direction, based solely on literature information and expert knowledge. Integrating their knowledge (Question 3) and the data available (Question 4) on the processes involved in the spread of the target species, they could partly account for climate (wetness), indirect species interactions (native vegetation community type), disturbance (e.g. distance to roads) and demography (dispersal). Finally, this modelling strategy has been particularly useful for the detection of populations newly established via wind dispersal, which would be impossible with simple HSMs alone.