Aim Species distribution models (SDMs) have been used to address a wide range of theoretical and applied questions in the terrestrial realm, but marine-based applications remain relatively scarce. In this review, we consider how conceptual and practical issues associated with terrestrial SDMs apply to a range of marine organisms and highlight the challenges relevant to improving marine SDMs.
Location We include studies from both marine and terrestrial systems that encompass many geographic locations around the globe.
Methods We first performed a literature search and analysis of marine and terrestrial SDMs in ISI Web of Science to assess trends and applications. Using knowledge from terrestrial applications, we critically evaluate the application of SDMs in marine systems in the context of ecological factors (dispersal, species interactions, aggregation and ontogenetic shifts) and practical considerations (data quality, alternative modelling approaches and model validation) that facilitate or create difficulties for model application.
Results The relative importance of ecological factors to be considered when applying SDMs varies among terrestrial and marine organisms. Correctly incorporating dispersal is frequently considered an important issue for terrestrial models, but because there is greater potential for dispersal in the ocean, it is often less of a concern in marine SDMs. By contrast, ontogenetic shifts and feeding have received little attention in terrestrial SDM applications, but these factors are important to many marine SDMs. Opportunities also exist for applying more advanced SDM approaches in the marine realm, including mechanistic ecophysiological models, where water balance and heat transfer equations are simpler for some marine organisms relative to their terrestrial counterparts.
Main conclusions SDMs have generally been under-utilized in the marine realm relative to terrestrial applications. Correlative SDM methods should be tested on a range of marine organisms, and we suggest further development of methods that address ontogenetic shifts and feeding interactions. We anticipate developments in, and cross-fertilization between, coupled correlative and process-based SDMs, mechanistic eco-physiological SDMs, and spatial population dynamic models for climate change and species invasion applications in particular. Comparisons of the outputs of different model types will provide insight that is useful for improved spatial management of marine species.
Spatial patterns in species distributions have inspired ecological research for over a century (Grinnell, 1904; Gause, 1934; Hutchinson, 1957). Understanding the processes that create these patterns has become increasingly important in the face of continuing threats of habitat destruction, pollution, species invasion and climate change (Millennium Ecosystem Assessment, 2005). One approach providing practical information on the spatial distribution of species is the species distribution model (SDM). A wide variety of SDM methods have now been developed, including correlative (Guisan & Zimmermann, 2000; Pearson & Dawson, 2003), coupled correlative and process-based models (i.e. ‘hybrid’; Smolik et al., 2010) and mechanistic approaches (Kearney & Porter, 2009). Each approach has advantages and disadvantages (Kearney & Porter, 2009), but the vast majority of studies to date have been correlative. This approach correlates species occurrence (presence-only or presence–absence) records with environmental data in geographic space to explain and predict a species' distribution. Such models were first applied in the terrestrial domain where their use has increased rapidly over the past 20 years (Fig. 1). By comparison, application of SDMs to marine species is rare (Fig. 1), although interest in their application is increasing (Redfern et al., 2006; Valavanis et al., 2008).
In transferring correlative SDMs from the land to the sea, the validity of model assumptions and predictive performance will be affected by the unique physical properties of marine habitats and the biological characteristics of marine organisms. Terrestrial applications of correlative SDMs typically assume that the physical environment (notably climate) exerts a dominant control over the natural distribution of a species (Grinnell, 1904; Hutchinson, 1957; Pearson & Dawson, 2003). Other ecological factors including dispersal, species interactions, and shifts in environmental requirements throughout life-history stages (i.e. ontogenetic shifts), which can be important in defining the distribution of a species, are generally not included in these models (Guisan et al., 2006; Araújo & Luoto, 2007; Schurr et al., 2007). Failure to explicitly include these factors can affect the ecological realism and predictive performance of SDMs (Guisan & Zimmermann, 2000; Austin, 2002; Pearson & Dawson, 2003). However, the importance of such factors in modelling the distribution of marine species has yet to be explored and may differ from terrestrial SDM applications.
In this review we discuss the influence of key conceptual and practical issues associated with species distribution modelling in the marine realm. We begin by analysing the literature on marine SDMs. Then we explore the importance of four ecological factors – dispersal, species interactions, ontogenetic shifts and aggregations of individuals – in modelling the distribution of marine species in comparison with terrestrial species. While evaluating the relative importance of these ecological factors for different groups of marine organisms, we discuss how solutions developed in terrestrial SDM applications apply in a marine context and suggest some alternative approaches. Subsequently, we compare and contrast the practical issues of data quality, model selection and model validation in marine and terrestrial SDMs. We conclude with recommendations for applying SDMs in marine systems and highlight future research needs.
The number of published SDMs varies among different groups of marine organisms, but fish are the most common taxonomic group to be modelled (Fig. 2b). This probably reflects both their commercial value and data availability. For example, Maxwell et al. (2009) predicted the distributions of several commercially harvested fish species for use in conservation planning (e.g. Fig. 3a). The modelled distributions were complemented with mapped confidence intervals, providing managers with important information on spatial variation in the accuracy of the models (Fig. 3b).
Few SDM applications have focused on marine invertebrates and macroalgae, yet these groups have several attributes that make them well suited to species distribution modelling. Benthic marine invertebrates and macroalgae are easily surveyed, and in some cases many data are available, including historical time series that could be used in model validation exercises (Connell, 1961; Lima et al., 2007). Further, many species within these groups are invasive pests that profoundly affect marine ecosystems, and modelling their current and future distributions should be a priority (Lee et al., 2008; Walther et al., 2009). Here we evaluate the differences and similarities in the ecology of different groups of marine organisms and how this should influence the way in which existing SDMs are applied or new models are developed.
THE INFLUENCE OF ECOLOGICAL FACTORS ON SPECIES DISTRIBUTION MODELS – A CROSS-SYSTEMS COMPARISON
A number of ecological factors, in addition to climate, can influence a species' distribution. Terrestrial SDMs have been extended to include ecological factors such as dispersal (e.g. Broennimann et al., 2006; Schurr et al., 2007; Smolik et al., 2010) and competition (e.g. Leathwick & Austin, 2001). Other factors such as feeding interactions, ontogenetic shifts and aggregation have received less attention in terrestrial SDMs. All of these could be important when applying SDMs to particular groups of marine organisms, but there has been little work that explores how these factors will affect model robustness and performance in the sea. In the following sections we discuss in detail the importance of each of these factors to marine SDMs, relative to terrestrial implementations.
Correlative SDMs are usually based on species–environment relationships that do not directly consider constraints or facilitation of dispersal in the neighbouring spatial context (Guisan et al., 2006). In terrestrial SDMs, authors conventionally assume one of two dispersal scenarios: either that species cannot disperse at all or that they can disperse everywhere. The dispersal scenario assumed is also dependent on the application. For example, Thomas et al. (2004) explored both dispersal scenarios when predicting species' future distributions under climate change, but when predictions of species' current distributions are required it is often assumed that species have dispersed to all environmentally suitable sites (Araújo & Pearson, 2005). However, as a result of dispersal limitation, climatically suitable sites are not always occupied (Svenning et al., 2006; Schurr et al., 2007). The influence of dispersal on species distributions and SDM predictions varies among terrestrial organisms (Araújo & Pearson, 2005) and the same is likely to be true for marine organisms.
Dispersal is the movement of propagules away from an existing population or natal site. The potential for dispersal is limited by the structure of the seascape or landscape and the method by which an organism disperses. Physical barriers to dispersal in the terrestrial environment include mountain ranges, lakes, rivers, oceans, urban centres and agricultural land. The marine environment has fewer absolute physical barriers to dispersal than the terrestrial environment (Steele, 1991). Physical barriers in the sea include continental and island land masses, ocean frontal systems and currents, vertical stratification of the water column (Longhurst, 2007), changes in substrate and bathymetry (Gaines et al., 2007) and heavily fished and/or polluted regions. Such barriers are generally weaker than those on land, subtly retarding movement in one direction or another rather than preventing it altogether (Gaines et al., 2007). With fewer, more permeable, barriers in marine systems than on land, there is greater potential for long-distance dispersal of organisms across habitat discontinuities (Carr et al., 2003). As a result, correlative SDMs may be expected to predict current and potential future distributions of marine species better than those of terrestrial organisms.
Dispersal methods vary among marine and terrestrial systems and different groups of organisms (Carr et al., 2003; Kinlan & Gaines, 2003) and ultimately influence whether environmentally suitable and unsuitable sites are occupied (Pulliam, 2000). Almost all marine plants and animals shed their reproductive propagules directly into the ocean, providing a potential for wide dispersal by currents (Kinlan & Gaines, 2003), whereas terrestrial animals predominantly depend on their own locomotion and many plants rely upon animal vectors (such as insects, bats and birds) for dispersal. These dispersal processes limit the rate at which terrestrial organisms can disperse, although wind dispersal of small-seeded plants, analogous to dispersal by currents in the ocean, can be an important predictor of plant distributions (Nathan et al., 2002; Muñoz et al., 2004). Rates of range expansion in macroalgae far exceed those of terrestrial plants, and the distances that phytophagous insects disperse are also generally lower on average than those of herbivorous marine invertebrate and fish species (Kinlan & Gaines, 2003). Hence, it is evident that many marine organisms disperse farther and/or faster than terrestrial organisms. However, there are also considerable differences in dispersal capabilities among marine organisms (Wieters et al., 2008). At one end of the spectrum, many macroalgae and some invertebrates and benthic fish species develop within metres of their parents. At the other end of the spectrum, most marine invertebrates and almost all pelagic fish produce young that develop in the plankton for weeks to months and can therefore be dispersed hundreds of kilometres by ocean currents. Consequently, the dispersal distance of many marine fish and most invertebrates is one to five orders of magnitude greater than that of macroalgae (Kinlan & Gaines, 2003).
Where dispersal is an important determinant of the distribution of a species, accounting for it will result in more robust models. Methods used to account for dispersal in correlative SDMs have varied according to the application. Post-processing of predictions or explicitly modelling the dispersal process has been performed for climate change (Broennimann et al., 2006; Keith et al., 2008) and species invasion applications (Williams et al., 2008). Post-processing of SDM predictions can involve setting the probability of occurrence to zero in sites that a species could not reach via dispersal over a particular time period (Broennimann et al., 2006). Alternatively, correlative SDMs have been coupled with a spatial population model (Keith et al., 2008). Given there are fewer barriers in the marine environment and dispersal potential is greater for many species, especially invertebrates and fish, specialized methods for dealing with dispersal in correlative models may be unnecessary. However, such methods may be useful for those marine species with a low dispersal potential (such as macroalgae and invertebrates that produce benthic larvae) or when highly dispersive species are subject to strong directional effects of marine currents that could influence future distribution patterns (Mullon et al., 2002). For current distribution predictions, dispersal has also been accounted for through statistical models that include spatial structure, which is not described by climatic factors, and this will be discussed further in the section below on aggregation (Dormann et al., 2007). To investigate the importance of dispersal in modelling the distributions of marine species it would be informative to apply alternative approaches for incorporating dispersal to a variety of marine species that vary in their dispersal potential.
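The post-processing step described above, in which the probability of occurrence is set to zero in sites a species could not reach, can be illustrated with a minimal sketch. It assumes a regular grid, straight-line distances and a single maximum dispersal distance; the function name, the grid and all values are hypothetical, and directional currents and land barriers are ignored.

```python
import numpy as np

def clip_by_dispersal(suitability, occupied, cell_size_km, max_dispersal_km):
    """Zero out predicted suitability in cells farther than
    max_dispersal_km (straight-line) from any currently occupied cell."""
    suitability = np.asarray(suitability, dtype=float)
    occupied = np.asarray(occupied, dtype=bool)
    occ_r, occ_c = np.nonzero(occupied)
    if occ_r.size == 0:
        # No source populations: nothing is reachable.
        return np.zeros_like(suitability)
    rows, cols = np.indices(suitability.shape)
    # Distance (in cells) from every grid cell to every occupied cell.
    dr = rows[..., None] - occ_r
    dc = cols[..., None] - occ_c
    dist_km = np.sqrt(dr**2 + dc**2).min(axis=-1) * cell_size_km
    reachable = dist_km <= max_dispersal_km
    return np.where(reachable, suitability, 0.0)

# Hypothetical 1 x 5 transect of 10-km cells; the species currently
# occupies only the first cell and can disperse at most 20 km.
suitability = np.array([[0.9, 0.8, 0.7, 0.6, 0.5]])
occupied = np.array([[True, False, False, False, False]])
clipped = clip_by_dispersal(suitability, occupied,
                            cell_size_km=10.0, max_dispersal_km=20.0)
```

For a highly dispersive species the maximum distance would exceed the extent of the grid and the clipping would have no effect, which corresponds to the 'disperse everywhere' assumption.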
Most correlative SDMs do not explicitly include species interactions (Guisan et al., 2006), though the effects of interactions may be implicit in the model. Excluding the direct effects of biotic interactions on species distributions can lead to inaccuracies in SDM predictions and projections (Austin, 2002). There are many different types of inter-specific interactions that could influence the distribution of species, including competition, facilitation, mutualism, parasitism, herbivory, predation, symbiosis and disease. While many of these interactions are likely to influence the performance of SDMs, competition is commonly noted as an issue for terrestrial SDMs (Guisan & Thuiller, 2005; Araújo & Luoto, 2007; Heikkinen et al., 2007). For marine SDMs, trophic interactions such as feeding have received more attention (Redfern et al., 2006; Torres et al., 2008). Here we focus our discussion on competition and feeding.
There are very few published examples of competitive exclusion in pelagic habitats (Bilio & Niermann, 2004). Hence we have a limited understanding of how competition can affect the distributions of pelagic species and are thus limited with regard to using a correlative SDM approach. Competitive interactions are likely to be easier to include in models of benthic distributions. For barnacle species, competition is a key range-limiting process (Poloczanska et al., 2008). In investigating the effects of competition and temperature on the abundance of two competing barnacle species, path analysis revealed that the influence of temperature in June was only significant for the dominant competitor, but this indirectly affected the abundance and distribution of the inferior competitor. Due to the rich datasets available for benthic invertebrates, it may be feasible to explore the influence of competition, relative to environmental variables, on the distribution of other species, using path analysis and/or correlative SDMs. For those species currently affected by competitive interactions, predicting future distributions (e.g. under climate change) is even more challenging than predicting their current distributions, because interactions may change with changing environmental conditions (Pearman et al., 2008).
The influence of prey on the distribution of consumers relative to the effect of the environment is rarely examined, but there are a few more examples in marine (Logerwell & Hargreaves, 1996; Redfern et al., 2006; Torres et al., 2008) than in terrestrial systems (Fernandez et al., 2003; Guisan & Thuiller, 2005). According to foraging theory, prey is important in defining suitable habitat for consumers and predators (Boyce & McDonald, 1999), but its influence on SDM predictions is only critical when the environmental preference of the consumer is considerably different from that of its prey. For instance, the distribution of breeding territories of the carnivorous Iberian lynx was modelled using a suite of fine-scale environmental and geographic variables, and the best predictor variable was the density of ecotones between two vegetation classifications (Fernandez et al., 2003). This environmental variable, along with tall shrubs, was also highly correlated with the lynx's prey. It was inferred from this that the high relative importance of the environmental predictors was due to their strong correlation with the prey species. In this example, including the distribution of prey may not have improved the accuracy of territory predictions because the environmental predictor variable and the prey were highly correlated.
In marine systems, many apex predators (such as marine mammals, tuna and some sharks) can maintain body temperatures above the surrounding environmental conditions and are thus less constrained by physical conditions than most of their prey. The distribution and density of prey may therefore be an important factor in determining the distribution of such species (Redfern et al., 2006; Wirsing et al., 2007; Torres et al., 2008), although data are sometimes lacking at the appropriate scale (Torres et al., 2008). By contrast, benthic invertebrates cannot thermoregulate and many species are opportunistic generalist feeders that consume a wide variety of prey suspended in the water column. Consequently, distributions of benthic invertebrates are restricted more by ambient environmental conditions than by specific dietary requirements, so including prey would not improve predictions. For example, the distribution of a vase tunicate species, which is a benthic opportunistic filter feeder, was accurately predicted by an SDM that included only environmental covariates (Therriault & Herborg, 2008). However, given that this species filter feeds on small suspended particles and chlorophyll a was included as an ‘environmental’ covariate, this may have functioned as an index of prey. Further tests of the influence of plankton distributions on SDMs of benthic filter feeders are required.
Path analysis has been used to explore the importance of prey relative to environmental factors (e.g. ocean fronts) on the distribution of pelagic seabirds at a small spatial scale (Logerwell & Hargreaves, 1996). Both seabird species were directly affected by prey (fish density) when all survey years were combined, but prey was only significant for some survey years in isolation. Path analysis could be a useful tool that may reveal whether models of prey or competitor distribution are required to build better SDMs at the spatial and/or temporal scale of interest. However, the inclusion of prey in SDMs is feasible for relatively few applications because the information is rarely available. In the ocean, information on the distribution of mid-trophic level species, such as squid, is currently lacking, but gathering information on this group has been highlighted as a major priority for improving our understanding of the ecology of top-level predators (Lehodey & Maury, 2010). The importance of including prey distributions in marine SDMs needs to be explored further, and this will be feasible as more data become available.
SDMs are generally based on adult occurrence records. It is therefore assumed that the environmental preference and other indirect effects of the habitat association of species will be adequately described using information from the adult stage alone. This assumption may be invalid for organisms that have complex life histories and inhabit different environments during different life stages. Changes in environmental preferences between life-history stages are evident in some terrestrial organisms (e.g. trees whose seeds are wind dispersed or insects that lay eggs in host plants or animals) and are even more prevalent in marine organisms (Carr et al., 2003). A limited number of terrestrial SDM applications have examined the importance of incorporating different life-history stages (Webb & Peart, 2000; Jones et al., 2007; Ficetola et al., 2009). For instance, Jones et al. (2007) showed that environmental predictors better explained the distribution of tree fern species for younger than older life-history stages. By contrast, environmental predictors selected by Webb & Peart (2000) resulted in better model fits for distributions of adults than seedlings. Additionally, both studies demonstrated changes in the importance of key environmental variables throughout life (Webb & Peart, 2000; Jones et al., 2007).
Ontogenetic shifts in habitat use and environmental tolerance/preference are common in mobile marine species (Dahlgren & Eggleston, 2000; Wilson et al., 2008) and have also been noted in benthic sessile species (Hiddink, 2003; Manzur et al., 2010). The degree of habitat specialization can either increase or decrease with ontogeny (Beck, 1995; Halpern et al., 2005). Many pelagic fish species spawn for a limited time each year in areas that are much smaller than their typical distribution (Mullon et al., 2002). For example, in anchovy, temperature and offshore hydrodynamic movement are considered to be critical during spawning, but hydrodynamic conditions that support movement towards nursery grounds are most important during recruitment (Mullon et al., 2002). Likewise, many coral reef-associated fish and invertebrates use off-reef habitats (e.g. seagrass, mangroves and macroalgae) as nursery areas before moving onto reef habitats as adults (Dahlgren & Eggleston, 2000). Such shifts can be related to changes in tolerance to abiotic stressors, but are also related to risk of mortality and prey availability. In less mobile benthic organisms, ontogenetic shifts from the juvenile to adult stages can involve movement from mid to high inter-tidal locations towards lower tidal levels (Hiddink, 2003; Manzur et al., 2010). In the case of a predatory benthic sea star, this shift is primarily related to wave exposure and prey (Manzur et al., 2010). In bivalves, the movement to lower tidal habitat at maturity is related more to biotic than abiotic factors, as the risk of predation and avoidance of parasitic infection are the major factors affecting habitat occupancy (Hiddink, 2003).
The greater the environmental difference and geographic distance between habitats at different life-history stages, the more critical it is to model those separate stages, especially if the purpose is for conservation planning. It is likely that information on the environmental requirements of species at different life-history stages is required to improve marine SDMs. If such data are available, models may be fitted on one life-history stage and tested on another to assess whether different models are required for different stages.
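As an illustration of fitting a model on one life-history stage and testing it on another, the sketch below uses a simple rectangular environmental envelope (the range of each predictor at presence sites) as a stand-in for a full correlative SDM. The envelope approach, the function names, the predictors and all data values are assumptions chosen for brevity, not a method taken from the studies cited here.

```python
import numpy as np

def fit_envelope(env_at_presences):
    """Return the per-variable (min, max) of the environment at presence sites."""
    env = np.asarray(env_at_presences, dtype=float)
    return env.min(axis=0), env.max(axis=0)

def predict_envelope(envelope, env):
    """True where every variable lies inside the fitted envelope."""
    lower, upper = envelope
    env = np.asarray(env, dtype=float)
    return np.all((env >= lower) & (env <= upper), axis=1)

# Hypothetical presence records; columns are temperature (deg C) and depth (m).
adult_presences = np.array([[14.0, 40.0],
                            [16.0, 60.0],
                            [18.0, 80.0]])
juvenile_presences = np.array([[15.0, 50.0],
                               [17.0, 5.0],
                               [20.0, 10.0],
                               [21.0, 8.0]])

adult_envelope = fit_envelope(adult_presences)

# Proportion of presences that the adult model predicts as suitable.
same_stage_sensitivity = predict_envelope(adult_envelope, adult_presences).mean()
cross_stage_sensitivity = predict_envelope(adult_envelope, juvenile_presences).mean()
```

A large drop from same-stage to cross-stage sensitivity, as in these made-up shallow-water juvenile records, would suggest that separate models are needed for each stage. A real assessment would also need absence or background data and an independent test set for the same-stage comparison.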
Aggregation, in the context of species distribution modelling, refers to how a species is spatially arranged within environmentally suitable conditions (Montoya et al., 2009). In many correlative models it is assumed that spatial structure will be described by environmental covariates included in the model and that the species–environment relationships are linear and stationary, i.e. they do not change over space (Austin, 2002; Fortin et al., 2005). When organisms aggregate through biological processes that are not described by the environmental covariates, such as acquisition of food, predator and/or competitor avoidance, mating behaviour, limited dispersal and advection (Ritz, 1994), this can lead to issues such as spatial autocorrelation and over-dispersion (Dormann et al., 2007).
The presence of aggregation in species occurrence data is linked to the spatial and temporal scale of sampling and the processes that influence distribution (Hui, 2009; Hui et al., 2010). For example, the intensity of aggregation patterns in 23 plant species distributions was only partly explained by SDMs at a relatively fine spatial scale (Montoya et al., 2009). This was attributed to population processes, species interactions and/or the lack of fine-scale environmental data (Montoya et al., 2009). In pelagic species, larger-scale currents influence aggregation through their effect on spawning, migration and feeding. At the mesoscale (c. 10–100 km), eddies entrain and attract schools of pelagic marine organisms (Moser & Smith, 1993; Ritz, 1994; Logerwell & Smith, 2001; Redfern et al., 2006). It has been argued that SDMs that only include climatic data are potentially more robust at a coarse than at a fine spatial scale because environmental variables are more important than population processes or species interactions at the broader scale (Pearson & Dawson, 2003; Montoya et al., 2009). Coarse-scale SDMs will be more suitable for widely distributed species in environments without fine-scale variation, such as some pelagic species, than for coastal and benthic species that respond to fine-scale variation in habitat.
The intensity of aggregation, and its stability through time, is partly determined by longevity and mobility (Levin, 1994). Patterns of aggregation may be more stable through time in long-lived sessile organisms than mobile organisms or sessile organisms with a high rate of population turnover. For example, inter-tidal macroalgae may be highly aggregated when conditions only permit local dispersal, but more uniformly distributed when conditions promote wider dispersal (Bellgrove et al., 2004). Mobile species in the sea such as cetaceans, seabirds and pelagic fish may be aggregated over short time-scales, but more uniformly distributed in relation to environmental gradients over longer time-scales. For example, the intensity of aggregation was low in pelagic seabird data that had been collated and pooled over 26 years (Huettmann & Diamond, 2006). This was an unexpected result, as the distributions of many pelagic seabirds were thought to be highly aggregated, but this may have been due to the use of long-term averages. The disadvantage of using longer-term averages and/or coarser spatial resolution is that the spatial structure at the finer scale could represent conditions important for particular seasonal and/or annual events such as spawning. Hence models need to include information that describes important aggregations.
There are a number of measures that can be used to assess spatial autocorrelation and other statistical issues that are often related to aggregation (Legendre & Legendre, 1998; Hui et al., 2010). Measures of spatial autocorrelation are valuable, as they provide information on the spatial structure that is not described by the model (Dormann et al., 2007). If species are highly aggregated and standard correlative SDMs do not provide adequate predictions, then regression-based methods that account for spatial structure can be used (see Dormann et al., 2007; Hui et al., 2010).
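One widely used measure of spatial autocorrelation is Moran's I, which can be computed on model residuals to check for spatial structure that the model has not described. The sketch below is a minimal version with binary neighbour weights (1 for sites within a chosen distance of each other, 0 otherwise). The function name, the distance threshold and the residual values are hypothetical, and no significance test is included.

```python
import numpy as np

def morans_i(values, coords, threshold):
    """Moran's I with binary weights: sites are neighbours when their
    Euclidean distance is greater than 0 and at most `threshold`."""
    values = np.asarray(values, dtype=float)
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff**2).sum(axis=-1))
    weights = ((dist > 0) & (dist <= threshold)).astype(float)
    z = values - values.mean()
    n = values.size
    return (n / weights.sum()) * (z @ weights @ z) / (z @ z)

# Four hypothetical sites along a transect, one distance unit apart.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])

# Residuals that are similar at neighbouring sites (clustered).
clustered = morans_i([1.0, 1.0, -1.0, -1.0], coords, threshold=1.0)

# Residuals that alternate in sign between neighbouring sites.
alternating = morans_i([1.0, -1.0, 1.0, -1.0], coords, threshold=1.0)
```

Positive values indicate that nearby residuals are similar, which is the pattern expected when aggregation is not captured by the environmental covariates; values near zero are consistent with little remaining spatial structure.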
In our discussion of aggregation and other ecological factors, we have evaluated where correlative SDMs may be appropriate for different marine groups. Practical considerations, such as data quality and applying alternative modelling approaches, will also influence SDM application or construction. These are discussed further in the following section.
PRACTICAL MODELLING CONSIDERATIONS
The types of data, selection of modelling approaches and model evaluation in SDM applications have been widely discussed for terrestrial systems (e.g. Elith et al., 2006; Jiménez-Valverde et al., 2008; VanDerWal et al., 2009), but there has been little discussion of these practical modelling considerations in marine systems. In this section, we identify some general principles from terrestrial applications that are likely to be transferable to the marine realm and highlight challenges in the marine realm that require special emphasis and new model formulations.
Common issues associated with data quality exist for both terrestrial and marine applications of SDMs (Elith et al., 2006; Ready et al., 2009). Many terrestrial SDMs rely on occurrence records from museums or herbaria and similar data (though fewer in number) are freely available for a variety of marine taxa (Hendriks et al., 2006). For instance, FishBase (http://www.fishbase.org/search.php) provides data on the biology, ecology and estimated range of every marine fish species (c. 31,000). The online database Ocean Biogeographic Information System (OBIS: http://www.iobis.org/) has occurrence data for 108,000 marine species. However, data on terrestrial species far exceed those for marine species (Hendriks et al., 2006).
Interpretations from both terrestrial and marine data are hampered by spatial and taxonomic biases towards certain habitats and taxa. In terrestrial systems, global sampling has been limited in tropical and arid environments (Newbold, 2010). Likewise, few data are available for marine species occupying remote habitats, such as the deep ocean (Hendriks et al., 2006). For individual species, spatial bias in occurrence data commonly arises from sampling sites that are more accessible (Phillips et al., 2009). On land, species occurrence records are more prevalent closer to roads, rivers, towns and cities (Phillips et al., 2009). In marine systems, sampling effort is often biased towards sites closer to the coast and in shallower water. For example, fisheries data are especially biased by factors such as the distance from a port and the lack of absence data (Murphy & Jenkins, 2010). The issue of bias is particularly challenging for SDMs that use presence-only or presence-background data, prompting the development of various correction methods (Phillips et al., 2009). Bias correction has been widely applied when modelling the distribution of a number of terrestrial species (Johnson & Gillingham, 2008; Phillips, 2008; Yates et al., 2010) and it would be appropriate to test these correction methods on marine species data with spatial bias.
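One general idea behind bias correction for presence-background models is to draw background points with the same spatial bias as the occurrence records, so that the bias affects both sets of points alike. The sketch below samples background cells with probability proportional to a sampling-effort surface. The effort grid, the function name and the use of effort as the weighting are illustrative assumptions rather than the procedure of any study cited here.

```python
import numpy as np

def sample_biased_background(effort, n_points, seed=0):
    """Sample (row, col) background cells with probability proportional
    to sampling effort, so cells with zero effort are never drawn."""
    effort = np.asarray(effort, dtype=float)
    prob = effort.ravel() / effort.sum()
    rng = np.random.default_rng(seed)
    flat_idx = rng.choice(effort.size, size=n_points, replace=True, p=prob)
    return np.column_stack(np.unravel_index(flat_idx, effort.shape))

# Hypothetical effort grid: the top row is near the coast and heavily
# sampled, the bottom row is offshore and has never been surveyed.
effort = np.array([[5.0, 1.0],
                   [0.0, 0.0]])
background = sample_biased_background(effort, n_points=200)
```

For fisheries data, a surface derived from distance to port or from the number of recorded hauls per cell might serve as the effort layer, although how well that works for marine records remains to be tested.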
Global environmental data for some predictor variables are easily accessible for terrestrial climates via the online data repository WorldClim (http://www.worldclim.org/). Marine equivalents have yet to be established, but important initiatives are currently in progress (J. McPherson, Centre for Conservation Research, Calgary Zoological Society, Calgary, pers. comm.). Environmental data used in marine SDMs are usually a combination of observations and modelled values at the surface (e.g. temperature and salinity), in the water column (e.g. mixed layer depth) and at the sea floor (e.g. depth and distance to bathymetric features). Many of these data are available online (see Valavanis et al., 2008, for a table of sources). Observational data may be measured in situ (e.g. cruises and Argo floats) or remotely sensed by satellites. Environmental conditions can also be inferred from ocean circulation models, but these have relatively coarse resolution and are often inadequate on finer time and space scales (Redfern et al., 2006).
The temporal and spatial resolution and accuracy of environmental data often differ among variables, an issue common to both marine and terrestrial SDMs. For marine environmental data, observations of sea surface temperature are relatively consistent, accurate and spatially well resolved, and are available as a long global time series. By contrast, dissolved oxygen observations are patchy in their spatial distribution and cannot be obtained from satellite imagery. Hence dissolved oxygen data are more likely to be acquired from an ocean model than from observations. The resolution and accuracy of ocean models are continually increasing, particularly through assimilation of observations from ocean observing programmes (e.g. the Global Ocean Observing System, http://www.ioc-goos.org) and integration of regional oceanographic features. Many of the data quality issues discussed here are common to terrestrial and marine systems, but they are more prevalent in marine data. Where high quality data are available for some marine species, this allows exploration of more data-intensive modelling approaches.
A large number of correlative SDM algorithms are now available (Elith et al., 2006; Ready et al., 2009). Studies that compare correlative SDM methods suggest that different models are not equally appropriate for all applications. For example, when predicting a species' current distribution, methods that include nonlinear responses and interactions are useful (Elith et al., 2006). However, models that perform well in predicting a species' current distribution may not perform as well when applied to project a species' future distribution under climate change (Thuiller, 2004; Morin & Thuiller, 2009). Correlative SDMs are conceptually simple, descriptive and require relatively little data, but as we learn more about the mechanisms and processes that define the biology and ecology of organisms we can build more complex mechanistic or process-based SDMs. Hence, alternative modelling approaches to correlative models should be explored where possible.
Alternative approaches to correlative SDMs include fully mechanistic eco-physiological SDMs, spatial population dynamics models, ‘hybrid’ models (that couple correlative SDMs with dynamic dispersal and population process models) and different combinations and permutations of these models (Cheung et al., 2008; Keith et al., 2008; Smolik et al., 2010). Mechanistic eco-physiological SDMs use the principles of thermodynamics to derive models specifically concerned with the transfer of energy and mass (Kearney & Porter, 2009). These models are generated at the level of the individual and have been constructed in both terrestrial (Kearney & Porter, 2004) and marine systems (Megrey et al., 2007; Maury, 2010). Detailed information on species functional traits (morphology, physiology and behavioural responses) as well as proximal environmental data are required to build fully mechanistic models. This approach, however, is often not viable due to data scarcity.
There are benefits and challenges in applying mechanistic eco-physiological SDMs in marine systems. Important processes in these models include the balance of energy (i.e. heat and momentum) and mass. Most marine organisms are permanently under water and their body temperature and tissue salinity are the same as those of the surrounding water. This makes heat and water balance calculations simpler than they are for many terrestrial organisms. However, in resolving energy balance, momentum transfer is a challenging consideration for mechanistic modelling of mobile marine animals such as fish, sea turtles, seals and cetaceans (Peng & Dabiri, 2008; Kearney & Porter, 2009).
Another potential benefit of applying mechanistic models to some marine organisms is that some marine habitats, such as the open ocean, have fewer microclimates than an equivalent area on land. Hence the proximal environmental data required to construct a mechanistic SDM, such as temperature, can be obtained from broad-scale remotely sensed or modelled data sources. Consequently, the time-consuming process of collecting fine-scale environmental data in the field (e.g. Kearney & Porter, 2004) can sometimes be bypassed in marine systems. There are likely to be more marine microclimates in benthic and inter-tidal habitats, although microclimates are also common in the pelagic environment in the vertical dimension, especially in the top few hundred metres of the water column. Certainly, finer-scale observations are needed for building mechanistic SDMs for some groups such as inter-tidal invertebrates (Helmuth, 1998).
Hybrid SDMs, which couple dynamic population and dispersal processes with correlative models, provide a compromise between the data requirements and complexity of a fully mechanistic model and the sometimes oversimplified correlative models (Cheung et al., 2008; Keith et al., 2008). Interestingly, such models have arisen almost simultaneously in both marine (Cheung et al., 2008) and terrestrial (Keith et al., 2008) applications. However, the issues that these hybrid SDMs attempt to resolve (e.g. explicit inclusion of dispersal) still seem to be restricted to the theoretical evaluation of correlative SDMs in terrestrial systems. As previously mentioned, modelling the distribution of species in marine systems may require emphasis on explicitly including different and/or additional processes (such as ontogenetic shifts and feeding) from those that have been included in terrestrial hybrid models.
Spatial population dynamics models, also referred to as dynamic-landscape metapopulation models, have been developed in both terrestrial and marine systems (e.g. Wintle et al., 2005; Figueira, 2009). These models can be similar to hybrid SDMs because both include population processes (i.e. reproduction, mortality and movement). In a hybrid SDM, the movement and carrying capacity of the population dynamics model are linked to the output from a correlative SDM (Cheung et al., 2008; Keith et al., 2008). Therefore, the direct effects of each environmental variable on the population dynamics are unknown. In some spatial population dynamics models, such as the marine-based spatial ecosystem and population dynamics model SEAPODYM, the direct effects of environmental variables are modelled explicitly, as are trophic interactions (Lehodey et al., 2008). Mechanistic components might be less developed in SEAPODYM than in other eco-physiological models (Buckley & Buckley, 2010), but this model is particularly appropriate where trophic interactions are of prime importance.
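The coupling used in a hybrid SDM can be illustrated with a deliberately simplified sketch in which the carrying capacity of each cell is scaled by the habitat suitability predicted by a correlative SDM. The one-dimensional grid, the discrete logistic growth, the nearest-neighbour dispersal rule and all parameter values below are assumptions made for illustration only; they do not reproduce the models of Cheung et al. (2008) or Keith et al. (2008).

```python
def hybrid_step(pop, suitability, k_max=100.0, r=0.5, disperse=0.1):
    """Advance a 1-D population grid by one time step.

    pop:         abundance in each cell
    suitability: correlative SDM output per cell, scaled 0..1 (assumed given)
    k_max:       carrying capacity of a fully suitable cell (hypothetical)
    r:           intrinsic growth rate (hypothetical)
    disperse:    fraction of each cell leaving, split between both neighbours
    """
    n_cells = len(pop)
    grown = []
    for n_i, s_i in zip(pop, suitability):
        k_i = k_max * s_i  # carrying capacity linked to SDM suitability
        if k_i <= 0:
            grown.append(0.0)  # unsuitable cell: no local survival
        else:
            grown.append(max(0.0, n_i + r * n_i * (1 - n_i / k_i)))

    out = [0.0] * n_cells
    for i, g in enumerate(grown):
        stay = g
        for j in (i - 1, i + 1):
            if 0 <= j < n_cells:  # individuals do not leave the grid
                out[j] += disperse / 2 * g
                stay -= disperse / 2 * g
        out[i] += stay
    return out


def run(pop, suitability, n_steps):
    """Iterate hybrid_step for n_steps and return the final abundances."""
    for _ in range(n_steps):
        pop = hybrid_step(pop, suitability)
    return pop


# Toy suitability gradient (assumption) with a founder population in cell 0.
suitability = [0.2, 0.6, 1.0, 0.6, 0.0]
final = run([5.0, 0.0, 0.0, 0.0, 0.0], suitability, n_steps=200)
```

In this sketch the environment affects abundance only through the suitability score, which mirrors the point made above: the contribution of each individual environmental variable to the population dynamics cannot be separated.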
There are unique opportunities to simplify some aspects of terrestrial mechanistic SDMs for marine species, and further development of hybrid SDMs should include the processes that are most important for the distribution of the marine organism being modelled. To assess the accuracy of alternative or correlative SDM approaches, validation of correlative SDM predictions is required. Methods of evaluating correlative SDM approaches in marine systems are discussed briefly below.
Methods for evaluating SDMs should focus on the purpose of the model. Validating a prediction of a species' current distribution requires data from the same space and approximately the same time in which the model was fitted. Typical choices include internal validation (using split data sets or cross-validation) or, more rarely but more robustly, external validation on a completely independent dataset (Guisan et al., 2006). Tests of model performance in historical time periods may provide useful insight into the performance of SDMs for climate change and species invasion applications (Randin et al., 2006; Tingley & Beissinger, 2009). Spatial transferability tests are relevant to both current and future SDM predictions and projections because test results can provide an indication of the stability of modelled relationships through space (Randin et al., 2006). Ultimately, there may be greater opportunities for transferability tests in marine than terrestrial systems. The large number of palaeo-records of historical distributions of marine organisms (e.g. forams, cysts, diatoms, crustaceans and fish scales) provides an opportunity for the validation of model performance over an independent time period. Additionally, the wider geographic ranges of many marine species relative to terrestrial species would allow validation of model performance in novel space. This approach entails models being fitted in one part of a species' geographic range and tested on another part of the range (Randin et al., 2006; Nogués-Bravo et al., 2008).
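A spatial transferability test of this kind can be sketched as follows: a model is fitted to records from one part of the range and scored on records from another part using the area under the ROC curve (AUC). The Gaussian thermal-niche model, the longitude split at 0° and the toy presence-absence records are assumptions introduced purely for illustration; any correlative SDM and any discrimination or calibration statistic could be substituted.

```python
import math
from statistics import mean, stdev


def auc(scores, labels):
    """Rank-based AUC: chance that a random presence outscores a random absence."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one presence and one absence")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def fit_thermal_niche(records):
    """Toy correlative model: mean and sd of temperature at presence sites."""
    temps = [r["sst"] for r in records if r["present"] == 1]
    return mean(temps), stdev(temps)


def predict(model, sst):
    """Relative suitability (0..1) from a Gaussian response to temperature."""
    mu, sd = model
    return math.exp(-0.5 * ((sst - mu) / sd) ** 2)


# Toy records (assumption): the species occurs where SST is 16-22 degrees C.
records = []
for i in range(40):
    sst = 10 + (i * 7) % 20
    records.append({"lon": i - 20, "sst": sst, "present": int(16 <= sst <= 22)})

west = [r for r in records if r["lon"] < 0]   # region used for fitting
east = [r for r in records if r["lon"] >= 0]  # region held out for testing

model = fit_thermal_niche(west)
transfer_auc = auc([predict(model, r["sst"]) for r in east],
                   [r["present"] for r in east])
```

The toy data share an identical niche in both regions, so the model transfers almost perfectly here. With real data, a drop in AUC between the fitting and test regions would indicate that the modelled relationships are not stable through space.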
Accurate and realistic models of species distributions greatly assist in managing human threats and improving our understanding of species ecology in marine and terrestrial systems alike. Indeed, most marine SDMs are currently intended for conservation planning applications. Critical evaluations of correlative SDMs on land have highlighted challenges (and in some cases solutions) that are also relevant to marine applications (see Table 1). Dispersal and competition compromise many terrestrial SDM applications, but these factors are often less critical in many marine SDMs. However, other ecological factors rarely mentioned in the terrestrial SDM literature, such as feeding and ontogenetic shifts, are pertinent to the realism and accuracy of models for many marine species. These are broad generalizations, and variations are evident among marine organisms, particularly across life-history traits and habitats (Table 1). Aggregation can result from dispersal, species interactions and ontogenetic shifts, among other ecological factors, but its intensity may vary across spatial and temporal scales. Consequently, it is critical to both marine and terrestrial applications (Table 1).
Table 1. A general ranking of the importance of ecological factors when applying species distribution models (SDMs) to marine taxa with different life-history traits and habitats (pelagic and benthic) in relation to terrestrial species.
Systems and species traits/habitats
***, critical; **, important; *, less important. Studies that were used as examples in our discussion on ecological factors are listed.
Some methods developed in terrestrial applications, accounting for the ecological factors discussed, may benefit marine SDM applications. However, there is a mismatch between progress in model development (mostly targeting dispersal) and identified needs for marine models. Species interactions are important to all marine groups considered, but competition is more likely to be tackled in benthic rather than pelagic SDMs, while the reverse is likely to be true for the importance of feeding. Methods that can be used to explore the relative importance of species interactions, such as path analysis, can be useful when modelling terrestrial and marine species with a variety of different traits. Few SDMs have been built to account for feeding, ontogenetic shifts and aggregation, indicating fertile fields for future research that will require new modelling approaches.
Existing correlative SDM methods have generally been under-utilized in marine systems. So far, relatively few marine SDM applications have predicted future impacts of climate change and species invasions. Modelling approaches that extend beyond correlative SDMs are receiving increasing attention for both of these applications. We anticipate rapid future developments in ‘hybrid’, spatial population dynamic and mechanistic eco-physiological models, with different combinations and permutations of these approaches in both terrestrial and marine systems. Additionally, studies that compare mechanistic, hybrid and correlative SDM predictions (Morin & Thuiller, 2009; Elith et al., 2010) are important in furthering our understanding of factors that limit species ranges. The conceptual and practical issues associated with the applications of SDMs discussed here provide future direction for building more realistic and accurate SDMs in the marine realm.
Reviews by Scott Burgess, Chris Brown, Richard Fuller and two anonymous referees improved the quality of earlier drafts. This research is supported in part by a CSIRO Climate Adaptation Flagship scholarship and by the Australian Research Grants DP0879365 and FT0991640.
Lucy Robinson is a doctoral student with a broad interest in theoretical and applied ecology in marine and terrestrial systems. Her research interests concern finding optimal methods for assessing anthropogenic impacts, and she is currently focused on modelling future impacts of climate change on pelagic fish species and implications for spatial management.
Jane Elith is a research fellow at the University of Melbourne, focusing on methods for modelling the distributions of species and biodiversity. She is interested in both theory and the application of these methods, in all ecosystems.
Anthony J. Richardson holds a joint position with the University of Queensland and the CSIRO Marine and Atmospheric Research Division. He is a quantitative marine ecologist interested in the structure and function of pelagic ecosystems. He uses this information to delve into the impacts of global change on our oceans and better manage these fragile and changing systems.