Marcelo H. Cassini, Laboratorio de Biología del Comportamiento, IBYME, CONICET, Obligado 2490, Buenos Aires, Argentina. E-mail: email@example.com
Most species distribution models (SDMs) assume that habitats are closed, stable and without competition. In that environmental context, it is ecologically correct to assume that members of a species will be distributed in direct relation to the suitability of the habitat, that is, according to the so-called habitat matching rule. This paper examines whether it is possible to maintain the assumption of the habitat matching rule in the following circumstances: (1) when habitats are connected and organisms can move between them, (2) when there are disturbances and seasonal cycles that generate instability, and (3) when there is inter-specific and intra-specific competition. Here I argue that it is possible as long as the following aspects are taken into account. In open habitats at equilibrium, in which habitat selection and competition operate, the habitat matching rule can be applied in some conditions, while competition tends to homogenize the species distribution in other environmental contexts. In the latter case, two methods can be used to incorporate these effects into SDMs: new parameters can be incorporated into the response functions, or the occurrence of proportions of categories of individuals (adult/young, male/female, or dominant/subordinate species in guilds) can be used instead of the occurrence of organisms. The habitat matching rule is not fulfilled in non-equilibrium environments. The solution to this problem lies in the design of SDMs with two strategies that depend on scale. Locally, the disequilibrium can be encapsulated using average environmental conditions, with sufficiently large cells (in the case of metapopulations) and/or long enough sampling periods (in the case of seasonal cycles). At coarse scales, the use of presence-only models can in some cases avoid the destabilizing effect of catastrophic historical processes. 
The matching law is a strong assumption of SDMs because it is based on population ecology theory and the principle of evolution by natural selection.
Species distribution models (SDMs) can be defined as associative models relating occurrence or abundance data at known locations of individual species (distribution data) to information on the environmental characteristics of those locations (modified from Elith & Leathwick, 2009). Most models, especially those developed in the last decade, produce a habitat suitability map as their output, but this definition of SDMs encompasses models that use multivariate analysis to identify environmental predictors that do not have a geographical expression. This definition excludes models that explain variations in biodiversity or species richness (except those that apply to a set of species within the same guild) and models that use physiological characteristics (e.g. ranges of temperature tolerance), behavioural or demographic parameters (e.g. survival rates) as dependent variables.
Several publications have reviewed the available methods for generating SDMs (e.g. Austin, 2002; Anderson et al., 2003; Guisan & Thuiller, 2005; Heikkinen et al., 2006; Elith & Leathwick, 2009; Kery et al., 2010; Mokany & Ferrier, 2011). These reviews found that SDMs have been used successfully to characterize the natural distributions of species, and to apply this information to investigate a variety of scientific and applied issues. SDMs have proved valuable to wildlife and land managers because they allow them to obtain decision criteria within a relatively short time. With presence–absence and GIS-based descriptions of habitats, models that predict species responses to changes in environmental conditions can be generated. SDMs have numerous applications: to assess the potential threat of pests or invasive species, to identify hotspots of endangered species or biodiversity, to prioritize areas for conservation, and to restore ecosystems, among others (Hirzel et al., 2002; Beaumont et al., 2005; Elith et al., 2006).
The aim of this paper is to analyse the explicit and implicit assumptions of the SDM approach and to assess whether sound ecological and evolutionary theory underlies SDMs. The working hypothesis is that the most important assumption of SDMs is the ‘habitat matching rule’ (first named by Pulliam & Caraco, 1984), which states that the occurrence of a species in a habitat is directly related to habitat quality. This paper focuses on the distribution of animal species. However, several authors have applied theories of habitat selection to plants (e.g. Maina et al., 2002).
Making more predictive and explanatory SDMs
Most SDMs are based on correlation statistics, from which causation cannot strictly be inferred (e.g. Austin, 2002), but accumulating correlative results based on ecologically meaningful predictors can provide support for a hypothesis. The consequence is that SDMs should be used with care when predicting potential ranges or extrapolating from current to alternative conditions (e.g. climates) (Elith et al., 2006), for instance by ensuring that theoretically well-supported predictors are used (Austin, 2002). A strong performance of a particular method in the present conditions does not guarantee a similar performance outside the range of environments on which the original model was based (Araújo et al., 2005). Only models based on fundamental knowledge of the actual processes determining species distributions can be extrapolated reliably to new environments and future or past conditions (Soberón & Peterson, 2005; Elith et al., 2006).
Knowledge of these mechanisms can be incorporated into the SDM by following two procedures. The first solution is to build mechanistic models, that is, models whose dependent variables are measures of the physiology, behaviour or demography of the species considered (Mac Nally, 2000; Morin & Lechowicz, 2008; Kearney & Porter, 2009). Kearney & Porter (2009) recently reviewed a promising line of models based on physiological constraints, mainly thermal tolerances in animals. Individual-based models were used to predict distribution patterns based on behavioural rules (Lomnicki, 1999; Railsback & Harvey, 2002; Dullinger et al., 2004; Biew et al., 2007). Another recent development is simulation models that incorporate demographic parameters in a spatial context (Pulliam, 2000; Soberón, 2007). Morin & Lechowicz (2008) described three mechanistic modelling frameworks used to predict shifts in plant species distributions under climate change: dynamic global vegetation models (Sitch et al., 2003), gap models (Bugmann, 2001) and PHENOFIT (Chuine & Beaubien, 2001). Such models potentially have high explanatory power, but also have some limitations. The design of models that incorporate biological processes is very difficult, because they require the estimation of many parameters and thus inevitably make many assumptions (Austin, 2002). Currently, such models are more useful for the theoretical analysis of the effects of population processes on spatial distribution patterns than for providing practical tools for predicting the current distribution of individuals. In addition, models that explain the coarse-scale patterns exclusively from mechanisms that operate at the local level are reductionist and do not take into account that there are processes operating at larger scales that may overshadow the local-scale processes (Wiens, 2002).
To build SDMs with higher explanatory and predictive power, the second general approach incorporates within the models biotic processes, such as connectivity between habitats and population disequilibrium, without changing the conventional SDM structure, which includes the use of measures of occurrence as the dependent variable. To implement this second approach, a prime requirement is to demonstrate that the SDMs are based on fundamental principles of ecology and evolution.
In this work I set aside discussion of the relationship between niches and SDMs, and focus my attention on the most characteristic component of the environmental paradigm: the association between the distribution of target species and the distribution of environmental predictors. A necessary and central assumption of SDMs is that the response functions that fit these models are an effective representation of the spatial response of species to environmental values that the predictor takes in different habitats. It is assumed that somehow the model captures an aspect of the ecological interaction between species and environment, which is reflected in spatial differences in occurrence or abundance. The consequence is that the greater the number of locations in which the species occurs for a given value of an environmental variable, the greater the environmental suitability for that species. This assumption, which means that there is a proportional relationship between the probability of occurrence of a species in a habitat and the quality of that habitat, is common to all SDMs and is called the ‘habitat matching rule’.
Most criticism regarding SDMs is related to the habitat matching rule. Critics state that species can be abundant at low-quality sites and, more frequently, that they may be absent from sites that are suitable. The most frequently cited causes of these deviations are biotic factors, especially competition, and lack of equilibrium (Van Horne, 1983; Thomson et al., 1996; Railsback et al., 2003; Johnson, 2008). I will discuss these criticisms and assess whether this rule can be based on more general principles of population ecology and evolution by natural selection. I begin with the most familiar of the environmental contexts for SDMs: isolated habitats and the effects of density-independent variables. Then I consider some other important factors and contexts. First I will incorporate competition in its four forms within closed environments: scramble competition without social behaviour, contest competition with equal competitors, intra-specific competition with unequal competitors, and inter-specific competition. Then I will incorporate the possibility of the exchange of organisms between habitats through the phenomenon of habitat selection, which includes dispersal among open populations. I will consider spatially structured populations in the context of static equilibrium and dynamic equilibrium (metapopulations and source–sink dynamics). Finally, I will discuss another form of lack of equilibrium that occurs at a coarse scale, namely the limits on species dispersal after natural catastrophic events.
SDMs for isolated habitats in static equilibrium
The world of most SDMs is composed of isolated environments inhabited by closed populations in which density-independent factors operate. In this world, the probability of occurrence of a species in an environment is a direct function of the values taken by these factors. When an independent variable operates as a direct limiting factor, the function obtained by relating the occurrence or abundance to that variable is a legitimate representation of the pattern generated by the process involved, and the result is the habitat matching rule, because a change in the variable has a direct impact on demography and individual fitness.
Most SDMs use abiotic predictors alone, although reviews of the field recognize the importance of including biological interactions, with intra- and inter-specific competition the most commonly cited (Guisan & Thuiller, 2005; Araújo & Guisan, 2006; Jiménez-Valverde et al., 2008; Elith & Leathwick, 2009). In stable environments, the density of organisms is rarely regulated exclusively by abiotic factors in conditions of density-independence (Krebs, 1972). The state of equilibrium with the environment and the regulation of population size are density-dependent. The population density at equilibrium is the result of the dynamics of resource use, and biotic interactions are the shapers of that dynamic.
It is worth distinguishing three types of intra-specific competition (Sutherland & Parker, 1985; Bernstein et al., 1991). With scramble competition, the per capita rate of use of resources decreases owing to the use of the same resources by conspecifics, but social behaviour is not necessarily present. Contest competition occurs when there are direct negative interactions between the members of a population. This includes kleptoparasitism and territorialism. A third type of competition occurs when the population can be divided into categories of individuals or phenotypes with different competitive abilities. Sex, age, and dominance hierarchies within a sex are common forms of this type of competition. Inter-specific competition among species within a guild that use the same resources can be equated to the third type of intra-specific competition (Bernstein et al., 1991).
Under the effect of scramble competition in closed populations, the interplay between the concepts of carrying capacity and intrinsic growth rate, as described by the logistic growth equation, has become the standard general theory of single-species population growth (Pianka, 1974; McNaughton & Wolf, 1979). One of the most fundamental assumptions of the logistic equation is that the carrying capacity is set by the availability of resources (review by Soberón, 1986). Slobodkin (1953) showed how the predicted pattern of distribution between unconnected populations is described by a function by which population size increases with the increase in initial resource availability and with the carrying capacity of the environment (Fig. 1).
Measuring the environment in terms of the number of organisms in the population implicitly assumes that there are resources (R) in the environment such that each animal at equilibrium requires a/K of R (Slobodkin, 1953), where K is the carrying capacity and a is a proportionality constant. If the total available quantity of R increases in the environment, the equilibrium number of organisms in the population increases proportionally. In effect, the amount of R required by each organism in the population is independent of the other organisms in the population, while the amount of R available to each organism is dependent on the total amount available in the environment and on the number of organisms competing for it. Slobodkin (1953) wrote the equation of population growth as
where N is the number of organisms in the population and l is the intrinsic growth rate. The relationship between the size of the environment and the number of animals at the upper asymptote could be written as
The expectations of this relationship at equilibrium will be:
(i) $l_1 = l_2 = \dots = l_i = 0$,
(ii) the habitat matching rule.
Slobodkin (1953) incorporated the effect of contest competition into the model for closed environments:
where b is a proportionality constant, and thus the relationship between quality of the environment and population abundance becomes
It is also possible to incorporate a third component of a power series that may represent the effect of differences in competitive abilities, such that
Slobodkin (1953) defined the efficiency of an asymptotic population as N/K, that is, as the number of organisms that can be maintained by a unit of environment. For a given value of K, the efficiency is higher in populations without social behaviour. This means that the population abundance (probability of occurrence in the SDM) will be relatively lower when species exhibit aggressive behaviour (Fig. 1a, iii) compared with when only scramble competition (Fig. 1a, i) operates.
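The comparison of efficiencies can be illustrated numerically. The sketch below assumes that the equilibrium condition takes a power-series form, K = aN + bN^2 + cN^3, which is consistent with the constants a and b and the 'third component of a power series' mentioned above, but which is an assumption made here and not necessarily Slobodkin's exact notation. The function name and all parameter values are hypothetical.

```python
def equilibrium_size(K, a=1.0, b=0.0, c=0.0):
    """Solve a*N + b*N**2 + c*N**3 = K for N >= 0 by bisection.

    Assumed power-series form: a is the scramble term, b the contest
    term, c the term for unequal competitors (a > 0, b >= 0, c >= 0).
    """
    def demand(n):
        return a * n + b * n**2 + c * n**3

    lo, hi = 0.0, K / a  # demand(K / a) >= K whenever b, c >= 0
    for _ in range(200):
        mid = (lo + hi) / 2
        if demand(mid) < K:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


for K in (10.0, 100.0):
    scramble = equilibrium_size(K, a=1.0)
    contest = equilibrium_size(K, a=1.0, b=0.05)
    unequal = equilibrium_size(K, a=1.0, b=0.05, c=0.001)
    # Efficiency N/K for each form of competition at this K.
    print(K, scramble / K, contest / K, unequal / K)
```

Under these assumptions, efficiency stays constant across K with scramble competition alone (the matching rule), and falls as K increases once the contest term is added, so abundance rises less than proportionally with habitat quality.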
SDMs for connected habitats in static equilibrium
When populations are connected, the population distribution among habitats at equilibrium is determined by habitat selection, which is a type of behaviour, and is therefore a property of individuals and a biological trait that is subject to natural selection. The study of habitat selection can be framed within the field of evolutionary ecology, particularly behavioural ecology (Krebs & Davies, 1995). Behavioural ecology has an a priori theory from which explanations and predictions can be devised: evolution by natural selection, which increases the explanatory and predictive power of its models in relation to population approaches (Sutherland, 1996). Thus, habitat selection can be understood as a decision to change habitat in search of a place where fitness is maximized (Stephens & Krebs, 1986). Movements between habitats of organisms or propagules can be classified according to their frequency of occurrence: use of home range, seasonal migration and dispersal. All these must be considered types of habitat selection.
When habitats are connected, if there is no competition and no physical barrier, then the prediction in equilibrium is that individuals will use the best-quality habitat, subject to the assumption that individuals are located where their fitness is maximized. Although this condition is unlikely to be met in nature, it is important conceptually because the incorporation of connectivity between habitats represents a significant change in the assumptions of the SDM.
Fretwell & Lucas (1970) were the first to investigate individual habitat selection in a competition context, and coined the term ‘ideal free’ distribution for the simplest situation in which the individuals have no travel costs between environments and there are no differences between individuals. The model describes the distribution of population numbers between habitats under a condition of scramble competition. Habitat quality declines with increasing density. This model assumes that animals select habitat individually, preferring a site that maximizes their fitness. Under the ideal free model, animals will keep moving until all individuals have equal fitness, when they reach the equilibrium distribution. Between nearby habitats, this balance is achieved quickly through the habitat selection process undertaken by individuals within their home range. In the case of remote habitats, the process requires more time, because the balance of population numbers is achieved through dispersal, which is a slower mechanism. The result of this model of the ideal free distribution is also called the ‘habitat matching rule’ (Pulliam & Caraco, 1984) (Fig. 1):
where $P_i$ is the number of consumers or the probability of occurrence in habitat $i$, $c$ is a normalizing constant, and $R_i$ is the amount of resources in that habitat.
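The way individual decisions produce this proportionality can be sketched with a simple settlement simulation. It assumes that the rule has the proportional form P_i = c R_i implied by the definitions above, that resources within a habitat are shared equally, and that individuals settle one at a time in the habitat offering the largest per capita share. The function name and the resource values are hypothetical.

```python
def ideal_free_settlement(resources, n_individuals):
    """Settle individuals one by one where the per capita share is largest."""
    counts = [0] * len(resources)
    for _ in range(n_individuals):
        # Share a newcomer would obtain in each habitat after joining it.
        payoffs = [r / (n + 1) for r, n in zip(resources, counts)]
        best = payoffs.index(max(payoffs))
        counts[best] += 1
    return counts


resources = [10.0, 30.0, 60.0]
counts = ideal_free_settlement(resources, 1000)
shares = [n / sum(counts) for n in counts]
print(counts, shares)
```

In this sketch the proportion of individuals in each habitat approaches the proportion of resources it holds, and per capita shares become nearly equal across habitats, which is the equilibrium condition of the ideal free model.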
Sutherland (1983) developed a model for when there are different levels of interference (contest competition) in the use of resources and the resulting habitat selection:
A high value of the interference constant m indicates that searching efficiency declines markedly with consumer density, while a low value indicates that interference is less important. Under the latter condition, the probability of occurrence is higher than the expectation based on the quality of resources in good habitats (Fig. 1).
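The effect of the interference constant can be illustrated with a short calculation. The sketch assumes the commonly cited form of Sutherland's (1983) result, in which occurrence is proportional to resources raised to the power 1/m; this form is an assumption here, and the function name and example values are hypothetical.

```python
def interference_distribution(resources, m):
    """Proportion of consumers per habitat, assuming P_i proportional to R_i**(1/m)."""
    weights = [r ** (1.0 / m) for r in resources]
    total = sum(weights)
    return [w / total for w in weights]


resources = [1.0, 3.0]
for m in (0.5, 1.0, 2.0):
    # m = 1 reproduces the matching rule; m < 1 concentrates consumers
    # in the richer habitat; m > 1 spreads them more evenly.
    print(m, interference_distribution(resources, m))
```

With m = 1 the distribution matches the resource proportions, so the habitat matching rule appears as a special case of the interference model.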
Sutherland & Parker (1985) developed a model for animals that are not ‘free’ but for which territoriality and social dominance operate (ideal ‘despotic’ distributions), with individuals that differ in competitive abilities:
where r is the relative competitive ability. The predictions of this type of ideal despotic distribution model (Fig. 1) are that: (1) later settlers will be excluded from the habitats, (2) fitness will be lower in habitats with lower initial habitat suitability, and (3) density may or may not be higher in the best habitats. Independently, Rosenzweig (1981) postulated the ‘theory of isoleg’, which is a graphical model conceptually analogous to the ideal despotic model of Sutherland & Parker (1985) but applied to guilds that use the same resources and for which inter-specific competition operates.
The matching rule was first described not by ecologists but by experimental psychologists, following experiments in which an animal was inside a box that had two levers, which, when pressed, supplied food on different schedules. It was found that the relative frequency of responding on a given key closely approximated the relative frequency of reinforcement or positive stimulus (e.g. food) on that key (Herrnstein, 1961). This phenomenon was observed under many conditions and in various species, and Herrnstein named it the ‘matching rule’. The matching rule was also derived independently from a model of foraging theory. Foraging theory applies optimality modelling to the study of the behaviour of use of resources (Stephens & Krebs, 1986). Optimality modelling is a method that allows explicit, quantitative hypotheses about design or adaptation to be tested (Stephens & Krebs, 1986). Habitat selection by solitary individuals was investigated extensively within foraging theory through the so-called marginal value theorem (Charnov, 1976) and its variants. Staddon (1983) showed that the marginal value theorem predicts the ‘matching rule’. This rule in the context of individual decision theory states that the allocation of time for an individual among habitats will be proportional to the initial quality of these habitats. Senft (1989) independently derived a population matching law, in the field of animal production. He found that domestic herbivores select pastures in direct proportion to the relative biomass of the preferred plant species.
Incorporating biotic factors into SDMs
As a general rule, the simplest forms of competition, such as slight levels of scramble competition, predict the matching rule, whereas species with social behaviour will tend to have a more even distribution among habitat types at the local level, with less use of the best habitats than expected by the matching rule. To incorporate these effects into the SDMs, it is first necessary to understand the intra- and inter-specific interactions that affect the target species, which requires prior knowledge of its biology. Then, parameters that represent the form and extent of the competition should be incorporated into the response functions of the SDM based on criteria that depend on each type of model. A description of this procedure is beyond the scope of this review, but it can be based on the equations described above. In the case of different competitive abilities within species, the ratio of behavioural classes can be used (e.g. adult versus young, male versus female) (Railsback et al., 2003; Johnson, 2008). Similarly, the ratio between dominant and subordinate species can be used as a surrogate for inter-specific competition.
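As one possible illustration of adding a competition parameter to a response function, the sketch below fits a power-law response, P = c R^k, by least squares on log-transformed values. The power-law form, the function name and the example data are assumptions made for illustration and are not a procedure prescribed in this paper. An estimated exponent k close to 1 is compatible with the matching rule, whereas k below 1 indicates a more even distribution than the rule predicts.

```python
import math


def fit_power_response(resources, occurrences):
    """Least-squares fit of log(P) = log(c) + k * log(R); returns (k, c)."""
    xs = [math.log(r) for r in resources]
    ys = [math.log(p) for p in occurrences]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, math.exp(intercept)


# Hypothetical data generated with k = 0.5 (more even than matching).
resources = [1.0, 2.0, 4.0, 8.0]
occurrences = [0.05 * r ** 0.5 for r in resources]
k, c = fit_power_response(resources, occurrences)
print(k, c)
```

The fit requires strictly positive resource and occurrence values, because both are log-transformed.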
Most SDMs assume the condition of stable and isolated habitats in which density-independent factors operate. While the assumption that the statistical relationships represent ecological relationships is legitimate in this context, the context itself is unrealistic. This lack of realism is evident when the possibility that organisms select habitat is included, because in connected habitats it is predicted that without competition all individuals are concentrated in the best habitats because they provide maximal fitness (Fig. 1b, iv). SDM designers must take into account that the predictions of models that assume isolated habitats and density-independent variables may be very similar to those of models that assume connected habitats and the effect of density-dependent variables, although the underlying processes are obviously very different.
Habitats with dynamic equilibrium and habitats without equilibrium
The theory of metapopulations suggests that steady-state conditions can be generated between patches of environments. There are two basic types of dynamics: colonization–extinction and source–sink. In the first case, populations occupy patches of similar habitat type in a matrix of unsuitable habitat, and suitable habitat patches may be empty owing to local extinctions and can then be considered as waiting to be colonized (Levins, 1969). In the second variant, unsuitable habitats (mortality > natality), the sinks, retain a subpopulation owing to supplementary immigration from source habitats (Dias, 1996). Metapopulation equilibria are more likely to be reached in species that are characterized by a short generation time, small body size, high rate of population increase and high habitat specificity, such as butterflies and annual grasses (Pulliam, 2000). Each metapopulation of this type of species typically occupies a relatively small geographic range. Source–sink structures are more common in sessile organisms (Dias, 1996). The most common metapopulations among vertebrates occur in a context of structurally similar habitats but with density-independent variables that produce different mortality rates between habitats, for example as a result of hunting (Woodroffe & Ginsberg, 1998). In these cases, SDMs can solve the problem of lack of equilibrium within metapopulations, by considering the metapopulation level as the smallest scale of analysis (see below).
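The colonization–extinction case can be illustrated with the classical Levins patch-occupancy equation, dp/dt = cp(1 − p) − ep, where p is the fraction of suitable patches occupied and c and e are colonization and extinction rates (this c is unrelated to the normalizing constant of the matching rule). The sketch integrates the equation with a simple Euler scheme; the function name, parameter values and step size are hypothetical.

```python
def levins_occupancy(colonization, extinction, p0=0.1, dt=0.01, steps=100_000):
    """Euler integration of dp/dt = colonization*p*(1 - p) - extinction*p."""
    p = p0
    for _ in range(steps):
        p += dt * (colonization * p * (1.0 - p) - extinction * p)
    return p


p_final = levins_occupancy(colonization=0.5, extinction=0.1)
# Analytical equilibrium for comparison: 1 - extinction / colonization.
print(p_final, 1.0 - 0.1 / 0.5)
```

Even at equilibrium a fixed fraction of suitable patches remains empty, so an absence recorded in a single patch does not by itself indicate low habitat quality.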
Several authors have noted that dispersal limitation causes species to be absent from areas with high-quality habitat, which produces a deviation from the habitat matching rule (Pulliam, 2000; Anderson et al., 2003; Railsback et al., 2003; Svenning & Skov, 2004; Araújo & Pearson, 2005; Guisan & Thuiller, 2005; Soberón & Peterson, 2005; Peterson, 2006). Periods of disequilibrium occur when density-independent factors that previously impacted the abundance of organisms cease to regulate the population. Under these conditions, the population tends to return to the equilibrium density, and during this period of imbalance it should not be possible to apply SDMs. The return to equilibrium may result from changes in birth rates, mortality or dispersal. This effect of disequilibrium can occur at local and coarse scales.
Locally, the current distribution can reflect conditions in the recent past. One of the most common causes of deviations from the proportionality between species and resource distributions is generated by the inevitable delay that occurs between environmental change and the numerical recovery of the species. Small changes in food availability or environmental conditions may generate offsets when species show site tenacity. Van Horne (1983) gives the example of wildlife studies conducted in northern climates, where identification of habitat quality on the basis of summer densities would be misleading because the availability of the winter range may contribute disproportionately to carrying capacity.
A lack of equilibrium at landscape or sub-regional scales may be caused by catastrophic events, such as fire or volcanic eruptions, or by anthropogenic impacts, such as different hunting laws between neighbouring states or countries, or different traditions in the use of land. Because the delay in the colonization of a site depends on the distance from source populations, the effect of delayed dispersal is expected to be greater at larger scales (Austin, 2002). Processes at regional and continental scales are generally products of climatic changes such as glaciations or global warming. When processes operate at large scales, species recovery can take a long time. In some cases, these processes can set barriers to dispersal that isolate landscapes or sub-regions. A common example is the effect of glaciations in freshwater environments (Cassini et al., 2009). River basins represent isolated geographic units for aquatic species. The restriction of populations to refugia during glaciations has been followed by the re-occupation of some river basins by certain species that had been displaced by the ice. The basins that had no refugia during glaciations are empty of those species. In both types of basins, the available resources and other environmental variables may be similar, but the differences in geological histories determine heterogeneity at the scale of basins that must be taken into account.
When catastrophic effects occurred in the past, two recommendations can be made when designing SDMs: (1) to apply hierarchical analysis to take into account various ecological scales, and (2) to use only presence data in the case of local variables (absences should not be used). A number of authors have proposed specific solutions to the problem of scale dependence. Mackey & Lindenmayer (2001) were pioneers in developing a hierarchical framework for SDMs. They quantified the environmental response of a species in terms of a hierarchy defined by five scales that represent natural breaks in the distribution and availability of the primary environmental resources. Guisan et al. (2007) and Menke et al. (2009) tested the effect of grain size on 10 distinct modelling techniques for 50 species of plants and vertebrates in five regions, and on the Argentine ant occurrences in California, respectively. Using a Bayesian framework, Gelfand et al. (2005) developed a two-stage, spatially explicit, hierarchical logistic regression model in an attempt to model species diversity in the Cape region of South Africa, and Pearson et al. (2004) presented a model that integrates land cover data into a correlative bioclimatic model whereby artificial neural networks are used to characterize species’ climatic requirements at the European scale and land cover requirements at the British scale.
When the lack of equilibrium occurs at a local spatial scale and is caused by seasonal or annual cycles, the simplest solution is to measure the local dependent variables covering the period including the complete cycles of change, and to use the average values. In the case of species that form metapopulations (local dynamic equilibrium), it is desirable that the smallest cell used in the SDM includes the metapopulation, analogous to the case in which it is desirable that the smallest unit includes all micro-environments that provide different resources to a mobile species.
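The averaging over complete cycles can be sketched as follows. The example assumes a regularly sampled series of local densities and a known cycle length, and discards any trailing incomplete cycle so that no season is over-represented. The function name and the data are hypothetical.

```python
def cycle_average(series, cycle_length):
    """Mean of a series restricted to its complete cycles."""
    n_complete = len(series) // cycle_length
    if n_complete == 0:
        raise ValueError("series shorter than one complete cycle")
    used = series[: n_complete * cycle_length]
    return sum(used) / len(used)


# Alternating summer and winter densities, ending with an extra summer count.
densities = [10, 2, 10, 2, 10]
print(cycle_average(densities, cycle_length=2))  # complete cycles only
print(sum(densities) / len(densities))           # naive mean, biased towards summer
```

The same logic applies spatially: aggregating occurrence over a cell large enough to contain the whole metapopulation plays the role that the complete cycle plays in time.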
The ecological consequence of species-specific adaptations is that all members of a species respond similarly to changes in environmental conditions, irrespective of the population to which they belong and their geographical location. Each species is characterized by a set of phylogenetic (anatomical, physiological and behavioural) constraints that shape the strategies of habitat use of the species (Morin & Lechowicz, 2008). Knowledge of these ‘constraints’ allows us to properly select the environmental factors included in the analysis and change the models accordingly. Therefore, before designing an SDM, the researcher should explore the existence of special requirements for certain nutrients, specific substrates for breeding or shelter, thermoregulatory capacity limits, aggressive behaviour or social space requirements, and life history traits that predispose the species to form metapopulation structures, among other traits that can define constraints in habitat use.
The biological strength of the habitat matching rule is reflected in the fact that it has been described repeatedly in different contexts in different disciplines. This reflects the fact that it is based on the principles of population biology and natural selection.
SDMs have had enormous success. However, the ecological theory related to these models has been sorely neglected in the literature (Austin, 2002, 2007; Guisan & Thuiller, 2005). This situation is partly caused by the limitations in the ability of ecological theory to transfer a scientific basis to pressing environmental problems. In this article I have analysed the principles underlying SDMs, without attempting to adjust traditional ecological concepts such as niche theory. The habitat matching rule has been recognized as a fundamental principle of SDMs, but has also been identified as the aspect that is most criticized (e.g. Van Horne, 1983; Thomson et al., 1996; Railsback et al., 2003). The main criticism is that in nature there are suitable sites from which a species may be absent, and unsuitable sites at which it may be present. I propose that SDMs should take the habitat matching rule as a null hypothesis. This rule can be viewed as the basic conceptual pattern, such that the causes of the actual distributions (not the theoretical ones) may be explained by the deviations found with respect to this basic pattern.
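Using the habitat matching rule as a null hypothesis can be sketched in a few lines. The example assumes counts of individuals and a single resource measure per habitat, computes the counts expected if occurrence were proportional to resources, and summarizes the deviations with a chi-square statistic. The function name and the data are hypothetical, and assessing significance would require comparing the statistic with a chi-square distribution having one fewer degrees of freedom than the number of habitats.

```python
def matching_rule_deviations(resources, observed_counts):
    """Expected counts under the matching rule, residuals and chi-square."""
    total_r = sum(resources)
    total_n = sum(observed_counts)
    expected = [total_n * r / total_r for r in resources]
    residuals = [o - e for o, e in zip(observed_counts, expected)]
    chi_square = sum(
        (o - e) ** 2 / e for o, e in zip(observed_counts, expected)
    )
    return expected, residuals, chi_square


resources = [10.0, 30.0, 60.0]
observed = [20, 30, 50]
expected, residuals, chi_square = matching_rule_deviations(resources, observed)
print(expected, residuals, chi_square)
```

Positive residuals in poor habitats together with negative residuals in rich ones, as in this example, are the pattern expected when competition homogenizes the distribution; other patterns of deviation point to other processes, such as dispersal limitation.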
This work was supported by grants from CONICET (PIP No. 11420090100367) and the University of Luján (Fondos Finalidad 3.5).
Marcelo H. Cassini specializes in the ecology and behaviour of mammals. He is involved in projects on the conservation of endangered species, and in theoretical research on population ecology and evolutionary ecology. He is a researcher at the Consejo Nacional de Investigaciones Científicas y Técnicas (Argentinean Research Council) and a lecturer at the University of Luján, Argentina.