On the approximation of continuous dispersal kernels in discrete-space models

Authors

  • Joseph D. Chipperfield,

    1. Field Station Fabrikschleichach, Biozentrum, Universität Würzburg, Glashüttenstraße 5, 96181 Rauhenebrach, Germany
    2. Department of Biology, University of York, Heslington, PO Box 373, York YO10 5YW, UK
  • E. Penelope Holland,

    1. Landcare Research, PO Box 40, Lincoln 7640, New Zealand
  • Calvin Dytham,

    1. Department of Biology, University of York, Heslington, PO Box 373, York YO10 5YW, UK
  • Chris D. Thomas,

    1. Department of Biology, University of York, Heslington, PO Box 373, York YO10 5YW, UK
  • Thomas Hovestadt

    1. Field Station Fabrikschleichach, Biozentrum, Universität Würzburg, Glashüttenstraße 5, 96181 Rauhenebrach, Germany
    2. Muséum National d'Histoire Naturelle, CNRS UMR 7179, 1 Avenue du Petit Château, 91800 Brunoy, France

Correspondence author. E-mail: j.chipperfield@biozentrum.uni-wuerzburg.de

Summary

1. Models that represent space as a lattice have a critical function in theoretical and applied ecology. Despite their significance, there is a dearth of appropriate theoretical developments for the description of dispersal across such lattices.

2. We present a series of methods for approximating continuous dispersal in discrete landscapes (denoted as centroid-to-centroid, centroid-to-area, area-to-centroid and area-to-area dispersal). We describe how these methods can be extended to incorporate different conditions at the boundary of the simulation arena and a framework for approximating continuous dispersal between irregularly shaped patches.

3. Each approximation method was tested against a baseline of continuous Gaussian dispersal in a periodic simulation arena. The residence probabilities for an individual dispersing in each time step according to a Gaussian kernel across grids of three differing resolutions were calculated over a number of dispersal steps. In addition, the steady-state asymptotic properties for the transition matrices for each approximation method and cell resolution were calculated and compared against the uniform expectation under continuous dispersal.

4. All four methods described in this article provide a reasonable approximation to the continuous baseline (<0·03 absolute error in probability calculations) on landscapes with grid cells of length equal to the expected dispersal distance or finer, but error increases as grid cells become progressively larger than the expected dispersal distance.

5. Each approximation method exhibits a different spatial pattern of approximation error. Centroid-to-centroid dispersal overestimates residence probabilities near the origin, resulting in decreased invasion rates relative to the baseline diffusion process. All other approximation methods underestimate residence probabilities near the origin and overestimate such probabilities in the peripheries, leading to an overestimation of invasion rates.

6. The asymptotic properties of centroid-to-centroid and area-to-centroid dispersal approximation methods deviate from that which is expected under continuous dispersal. This characteristic renders these methods unsuitable for use in long-term simulation studies where the equilibrium properties of the system are of interest.

7. Centroid-to-area and area-to-area approximation methods exhibit both low approximation error and desirable asymptotic properties. These methods provide a viable mechanism for linking individual-level dispersal to larger-scale characteristics such as metapopulation connectivity.

Introduction

The extension of ecological models into the spatially explicit realm presents one of the most rewarding but also one of the most challenging aspects of model development. Traditionally, ecological models have focused on describing interactions between individuals in terms of the mean density of individuals in a population. Models derived from this so-called mean field assumption have provided many new insights in ecological theory but, without the inclusion of local interactions between individuals, the lack of spatial structure in these models can produce very different conclusions on crucial phenomena such as invasion speed and species coexistence than their spatially explicit counterparts (Ovaskainen & Cornell 2006; Murrell 2010).

Whilst the spatial element can, in some cases, represent a substantial leap in complexity, it can often elucidate the mechanisms behind otherwise confusing observations. For example, the addition of spatial structure to models of predator–prey dynamics in Murrell (2005) and Kondoh (2003) has shown that equilibrium prey densities are negatively linked to the spatial covariance of the antagonists, which can increase when prey fecundity is increased. This extension thus provides an alternative spatial explanation for the ‘paradox of enrichment’ of Rosenzweig (1971). Moreover, some core principles of the theory of competition, such as the assertion that a high ratio of intraspecific to interspecific competition provides community stability (appearing in many textbooks such as Putman & Wratten 1984), have been shown to be incomplete when interrogated with models able to explicitly describe and simulate the spatial aggregation of conspecifics (Neuhauser & Pacala 1999; Murrell 2010). In applied ecology, spatially explicit models are also commonly used to describe the spatial arrangement of populations and dispersal of individuals, and have shown themselves invaluable in the context of reserve selection strategies and responses to climate change (for example Moilanen et al. 2005; Willis et al. 2009).

One of the crucial elements of a spatially explicit model is the specification of how space is represented. Indeed, Murrell (2005) postulates that one of the reasons why the findings of Wilson, de Roos & McCauley (1993) appear to contradict the demonstration in Murrell (2005) that increased prey movement reduces the equilibrium population size is that the study of Wilson, de Roos & McCauley (1993) represents space as a discrete lattice of environments with each patch able to support a maximum of one individual. This type of stochastic cellular automaton is commonly employed in ecological models (see Silvertown et al. 1992; Jeltsch et al. 1996; Mustin et al. 2009, for more examples), although other variants are also used in which a single cell can hold populations of more than one individual (as implemented in Travis & Dytham 2002) or communities of more than one species (as implemented in Travis, Brooker & Dytham 2005).

Whilst lattice models have the potential to provide many novel ecological insights (Nakamaru 2006), with some authors exalting these methods as a ‘paradigm’ (Hogeweg 1988), their simplification of spatial structure can lead to a number of biases in the interpretation of their output. Nowhere is this bias more prominent than in the methodologies employed to model dispersal through these habitats. The most basic simplification of dispersal, often denoted ‘stepping-stone’ dispersal or sometimes ‘nearest-neighbour’ dispersal (Kimura & Weiss 1964), defines movement as a local process where individuals can only move to adjacent lattice cells with some given probability, usually uniformly selected amongst the neighbourhood of cells (although see Topping et al. 2003; Wiegand et al. 2004, for other weighting methods). For rectangular lattices, different concepts of the neighbourhood are employed (see Milne et al. 1996): ‘Moore’ neighbourhoods define the eight neighbouring cells in the horizontal, vertical and diagonal directions as potential destinations for dispersing individuals (Topping et al. 2003; Wiegand et al. 2004, for example), whilst ‘von Neumann’ neighbourhoods consider only the four cardinally adjacent cells as potential destinations (Söndgerath & Schröder 2002, for example). However, Holland et al. (2007) show that both neighbourhood definitions can exhibit unnatural artefacts, both in the spatial densities observed over multiple realisations of such dispersal events and in the maximum traversable distance after a set number of time steps.

In continuous space, the probability density function of dispersal distances of a motile individual (or propagule in sessile organisms) from the point of origin is often referred to as the distance distribution (Nathan & Muller-Landau 2000), the circular distribution (Wilson 1993) or the distance pdf (Cousens, Dytham & Law 2008). These distributions describe the probability of the magnitude of a movement event but not its direction. In a one-dimensional world, the distance distribution is the folded equivalent of a displacement distribution, where displacement also accounts for the direction of movement and can therefore be negative. We can extend these one-dimensional descriptions of displacement into the spatial domain by describing dispersal in terms of its polar coordinates from the point of origin. For models with descriptions of dispersal in continuous space, there are a number of distributions of spatial displacement available to the investigator (see Clark et al. 1999; Cousens, Dytham & Law 2008). This is not the case for discrete lattice-based dispersal. Outside the simple stepping-stone models of dispersal there is a dearth of appropriate models for the calculation of cell-to-cell dispersal probabilities. To avoid confusion, the term ‘dispersal kernel’ will hereafter refer to the probability density function of displacement and not the probability density function of dispersal distance.

To address some of the deficiencies of stepping-stone models of dispersal, Chesson & Lee (2005) describe a number of families of integer-valued displacement distributions for use in lattice models of arbitrary dimensionality. These distributions have the flexibility to allocate non-zero probabilities of dispersal to cells beyond the nearest neighbours and hence can potentially provide a mechanism of dispersal not too dissimilar to their continuous counterparts. The distributions described in Chesson & Lee (2005) also exhibit a number of desirable qualities that make their development a significant step forward for incorporating more realistic dispersal in cell-based studies. Firstly, most of the distributions described in Chesson & Lee (2005) have functional forms that are closed under convolution. This means that when iterating the dispersal forward a number of time periods, total displacement is simply a re-parametrisation of the one-step displacement distribution. More generally, this means that we are able to parametrise the displacement distribution as a function of time. Secondly, each of the displacement kernels has a parameter controlling the kurtosis of the probability distribution, allowing flexibility in the specification of the probability weight of the tails of the distribution. This is particularly useful for including the effects of long-distance dispersal, which often requires a ‘fat-tailed’ displacement distribution (Hovestadt, Messner & Poethke 2001; Petrovskii, Morozov & Li 2008). Finally, the displacement distributions of Chesson & Lee (2005) also exhibit asymptotic radial symmetry, which ameliorates some of the artefacts of lattice-based dispersal described by Holland et al. (2007).

Field data such as telemetry or seed shadow data are often used to parametrise continuous models of dispersal (see Greene et al. 2004), but such data are rarely applied so explicitly in the parametrisation of lattice dispersal, nor are such data collected in such a way as to be applicable in these settings. Whilst Chesson & Lee (2005) provide models of lattice dispersal with desirable mathematical properties, the underlying theoretical basis of these models is the mixture of a random quantity of stepping-stone dispersal sub-stages, requiring that individuals disperse cardinally with respect to the artificial geometry placed upon them within each of these dispersal sub-stages. On a two-dimensional grid, this means that although an individual can disperse further than the nearest neighbours, the final dispersal of the entire time step is comprised of a number of stepping-stone dispersal sub-steps, with each dispersal sub-step limited to movement within a von Neumann neighbourhood. It is difficult to see the theoretical link between such models and those that are commonly fitted to dispersal data. We adopt here a different approach and instead describe a general method for the approximation of continuous displacement distributions on lattices of arbitrary resolution. We use this methodology to derive approximate cell-to-cell transition probabilities for commonly employed models of continuous dispersal and describe how this method can be extended to allow for common boundary conditions and irregularly shaped source and destination patches.

For convenience, all notation used in this paper is summarised in Table 1.

Table 1. Summary of notation used

Symbol | Description | Support
r | Dispersal distance | r > 0
θ1 | Angle of dispersal measured in an anticlockwise direction from the positive x-axis | −π < θ1 ≤ π
θ2 | Angle of dispersal measured in a clockwise direction from the positive y-axis | 0 ≤ θ2 < 2π
inline image | Step function defined in Eqn 2 | inline image
j | Potential source coordinates (jx, jy) | inline image
k | Potential destination coordinates (kx, ky) | inline image
g·(r,θ2) | Two-dimensional dispersal kernel described in terms of polar displacement | inline image
c·(jx,jy,kx,ky) | Two-dimensional dispersal kernel described in terms of the source and destination Cartesian coordinates (see Eqn 6) | inline image
ax, ay | The width and height of the simulation arena, respectively | inline image
J | Source cell bounded by jx1 and jx2 on the x-axis and jy1 and jy2 on the y-axis | inline image; inline image
K | Destination cell bounded by kx1 and kx2 on the x-axis and ky1 and ky2 on the y-axis | inline image; inline image
K(i1,i2) | Translation of the destination cell bounded by [kx1 + i1ax] and [kx2 + i1ax] on the x-axis and by [ky1 + i2ay] and [ky2 + i2ay] on the y-axis |
p^(·)_JK | Probability of moving from cell J to cell K, calculated according to the approximation method in the superscript brackets (CC denotes centroid-to-centroid, AC area-to-centroid, CA centroid-to-area, and AA area-to-area dispersal; Eqns 5, 9, 7, and 8 respectively) | inline image
inline image | p^(·)_JK corrected for the incorporation of restricting boundary conditions (see equation S2-2) | inline image
inline image | p^(·)_JK corrected for the incorporation of periodic boundary conditions (see equations S2-3, S2-4, and S2-5) | inline image
J | A source patch consisting of NJ cells |
K | A destination patch consisting of NK cells |
inline image | Probability of moving from patch J to patch K calculated using the underlying cell transition probabilities, inline image, according to Eqns 12 and 13 | inline image
P′′(·) | A transition matrix with each element, inline image, containing the probability of moving to cell K from cell J with periodic boundary correction applied |
inline image | A vector with each element, inline image, containing the probability that an individual resides within cell J at time t according to the relevant approximation method | inline image
wtJ | Probability that an individual resides within cell J at time t under a continuous Gaussian diffusion process (see Eqn 18) | inline image
w′′tJ | wtJ with periodic boundary correction applied (see Eqn 19) | inline image

Materials and methods

Calculating transition probabilities

We begin by defining the two-dimensional displacement kernel g·(r,θ), which describes the probability density of a polar displacement of length r (where r > 0) at a bearing θ in a single dispersal event. There are a number of different ways to define the direction of dispersal, θ. One common method employed in the mathematical domain is to define θ as the angle measured in an anticlockwise direction from the x-axis such that −π < θ ≤ π. However, a measurement regime that is more intuitive to field biologists, and one that may be more consistent with the format of collected data, is to define the angle of dispersal as a clockwise bearing from the y-axis, with θ instead defined between the limits 0 ≤ θ < 2π. For the sake of clarity, we will adopt the notation θ1 and θ2 to refer to the former and latter definitions, respectively. It is worth noting that θ1 and θ2 are linked by the relationship

image(eqn 1)

where inline image is the step function

image(eqn 2)

Grids are defined on a Cartesian coordinate system and so the polar displacement function must be converted to describe the probability density of displacement to a set of Cartesian destination coordinates, denoted here as k = (kx, ky), given a set of source coordinates, j = (jx, jy). We can rewrite r and θ2 in terms of these coordinates

r = \sqrt{(k_x - j_x)^2 + (k_y - j_y)^2} (eqn 3)
image(eqn 4)

The derivation for r describes the standard magnitude of dispersal distance in a Euclidean two-dimensional coordinate system. However, the formula for θ2 differs from the standard polar conversion formula as it incorporates both a correction factor to match our definition of θ2 and also an extra term to make the equation valid regardless of which quadrant the destination coordinate, k, occupies in relation to the source coordinate j.
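The conversion described above can be sketched in a few lines of Python. This is an illustration rather than a transcription of Eqns 1 to 4: the function names are hypothetical, and the quadrant handling is delegated to `math.atan2` instead of an explicit step function.

```python
import math

def polar_displacement(jx, jy, kx, ky):
    """Return (r, theta2) for a move from source (jx, jy) to destination (kx, ky).

    theta2 is a clockwise bearing from the positive y-axis, in [0, 2*pi).
    """
    dx, dy = kx - jx, ky - jy
    r = math.hypot(dx, dy)  # Euclidean dispersal distance
    # atan2(dx, dy), with the arguments swapped relative to the usual
    # atan2(dy, dx), measures the angle clockwise from the +y axis; the
    # modulo maps the result from (-pi, pi] onto [0, 2*pi).
    theta2 = math.atan2(dx, dy) % (2.0 * math.pi)
    return r, theta2

def theta1_to_theta2(theta1):
    """Convert an anticlockwise angle from the +x axis (theta1)
    into a clockwise bearing from the +y axis (theta2)."""
    return (math.pi / 2.0 - theta1) % (2.0 * math.pi)
```

For example, a destination due east of the source gives a bearing of π/2, and one due west gives 3π/2, whichever quadrant the displacement falls in.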

Centroid-to-centroid dispersal

The simplest method to approximate a continuous displacement kernel on a lattice is to set the cell-to-cell transition probabilities using the displacement kernel density for the distance from the centroid of the source patch, J, to the centroid of the destination patch, K (one version of the dispersal mechanism implemented in Moilanen 2004). For this quantity to represent a true probability, however, it is necessary to normalise these values by dividing by the sum of the probability densities of the displacement kernel evaluated at the centroids of all candidate dispersal locations. If we denote the centroid-to-centroid transition probability from cell J, bounded between jx1 and jx2 on the x-axis and jy1 and jy2 on the y-axis (where jx1 < jx2 and jy1 < jy2), to cell K, similarly bounded between kx1 and kx2 on the x-axis and ky1 and ky2 on the y-axis, as p^(CC)_JK, then

image(eqn 5)

where c·(jx,jy,kx,ky) is a reparametrisation of the displacement kernel

image(eqn 6)

and L is a candidate destination cell bounded between lx1 and lx2 on the x-axis and ly1 and ly2 on the y-axis.
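As a rough sketch of the normalisation just described, the following Python builds a centroid-to-centroid transition matrix on a regular grid. The Gaussian density used here, exp(−r²/α²)/(πα²) per unit area, is an assumed parametrisation, and the function names are hypothetical.

```python
import numpy as np

def gaussian_density(dx, dy, alpha):
    """Assumed Gaussian kernel: density per unit area at Cartesian offset (dx, dy)."""
    return np.exp(-(dx**2 + dy**2) / alpha**2) / (np.pi * alpha**2)

def centroid_to_centroid(n_cols, n_rows, cell_size, alpha):
    """Transition matrix with rows indexing source cells and columns destination cells."""
    xs = (np.arange(n_cols) + 0.5) * cell_size  # centroid x-coordinates
    ys = (np.arange(n_rows) + 0.5) * cell_size  # centroid y-coordinates
    cx, cy = [a.ravel() for a in np.meshgrid(xs, ys)]
    # Pairwise centroid offsets: element [J, K] is centroid K minus centroid J.
    dx = cx[None, :] - cx[:, None]
    dy = cy[None, :] - cy[:, None]
    dens = gaussian_density(dx, dy, alpha)
    # Divide each row by the summed density over all candidate cells L.
    return dens / dens.sum(axis=1, keepdims=True)
```

Because the sum runs only over the cells of a finite grid, this sketch implicitly confines dispersers to that grid; other boundary treatments would change the candidate set.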

Centroid-to-area dispersal

An alternative derivation of cell transition probabilities is the centroid-to-area definition, with the probability of moving from cell J to cell K denoted here by p^(CA)_JK. Under this definition, the transition probability is the probability that a dispersing individual starting at the centroid of the source cell lands somewhere within the area of the target cell, such that

image(eqn 7)

Unlike centroid-to-centroid dispersal, centroid-to-area dispersal allows the correct treatment of destination patches that are of different sizes. This is comparable to the models of dispersal implemented in studies such as Hanski, Alho & Moilanen (2000) and Chapman, Dytham & Oxford (2007) that weight the dispersal probabilities to destination patches according to area.
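For a Gaussian kernel the centroid-to-area integral separates into a product of one-dimensional integrals, each expressible with the error function. The sketch below assumes the density exp(−r²/α²)/(πα²) per unit area; this parametrisation and the function name are assumptions, and the closed form is not necessarily written the same way in Data S1.

```python
import math

def centroid_to_area(j_cell, k_cell, alpha):
    """Probability of landing in cell K when starting at the centroid of cell J.

    Cells are tuples (x1, x2, y1, y2) giving the lower and upper bounds on each axis.
    """
    jx1, jx2, jy1, jy2 = j_cell
    kx1, kx2, ky1, ky2 = k_cell
    cx, cy = 0.5 * (jx1 + jx2), 0.5 * (jy1 + jy2)  # source centroid
    # Integral of exp(-x^2/alpha^2) / (sqrt(pi)*alpha) over [a, b]
    # equals 0.5 * (erf(b/alpha) - erf(a/alpha)).
    px = 0.5 * (math.erf((kx2 - cx) / alpha) - math.erf((kx1 - cx) / alpha))
    py = 0.5 * (math.erf((ky2 - cy) / alpha) - math.erf((ky1 - cy) / alpha))
    return px * py
```

No normalisation step is needed: the probabilities over any set of cells that tiles the plane sum to one, and destination cells of unequal size are handled by their integration limits.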

Area-to-area dispersal

Both centroid-to-centroid dispersal and centroid-to-area dispersal can suffer from severe biases when the cell size is large relative to the expected dispersal distance (Collingham, Hill & Huntley 1996). Under such circumstances, the dispersal distance may need to be improbably large for an individual to move from the centroid to the edge of a source cell, resulting in close to zero probability weights for all possible non-source destination cells. Iterating such models forward a number of time steps can produce a gross underestimation of invasion rates compared to a continuous model counterpart. We can remedy some of these effects by allowing dispersal to originate from alternative points within the cell. One method, such as that employed in one specification of the Spomsim model of Moilanen (2004), describes dispersal in terms of the distance between the nearest edges of patches. Another method, and the one that we will describe here, assumes that dispersal is equally likely from all possible locations within the cell. Here, the locations of individuals are represented as a uniform probability distribution bounded by the spatial boundary coordinates of the cell. The probability density of any dispersal event occurring between the source coordinates, j, and destination coordinates, k, is then simply the product of the probability density of the origin of the dispersal event, 1/[(jx2 − jx1)(jy2 − jy1)], and the probability density of dispersing to the destination given that origin, c·(jx,jy,kx,ky). The transition probability from cell J to cell K, which we denote ‘area-to-area’ dispersal and by the notation p^(AA)_JK, requires that we integrate over all possible source coordinates within the boundaries of the source cell and all possible destination coordinates within the boundaries of the destination cell such that

image(eqn 8)
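For a Gaussian kernel and rectangular cells, the four-fold integral factorises by axis, and each one-dimensional factor has a closed form in terms of the antiderivative of the normal distribution function. The sketch below is one such derivation under the assumed density exp(−r²/α²)/(πα²); it is offered as an illustration and may not match the form given in Data S1. All function names are hypothetical.

```python
import math

def _G(u):
    """Antiderivative of the standard normal CDF: u*Phi(u) + phi(u)."""
    Phi = 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))
    phi = math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    return u * Phi + phi

def _area_to_area_1d(j1, j2, k1, k2, sigma):
    """Average over origins x ~ U(j1, j2) of P(k1 < x + N(0, sigma^2) < k2)."""
    upper = _G((k2 - j1) / sigma) - _G((k2 - j2) / sigma)
    lower = _G((k1 - j1) / sigma) - _G((k1 - j2) / sigma)
    return sigma * (upper - lower) / (j2 - j1)

def area_to_area(j_cell, k_cell, alpha):
    """Probability of moving from cell J to cell K with a uniform origin within J.

    Cells are tuples (x1, x2, y1, y2).
    """
    # The assumed kernel is bivariate normal with per-axis standard
    # deviation alpha / sqrt(2).
    sigma = alpha / math.sqrt(2.0)
    jx1, jx2, jy1, jy2 = j_cell
    kx1, kx2, ky1, ky2 = k_cell
    px = _area_to_area_1d(jx1, jx2, kx1, kx2, sigma)
    py = _area_to_area_1d(jy1, jy2, ky1, ky2, sigma)
    return px * py
```

For kernels without a convenient closed form, the same quantity could instead be estimated by drawing origins uniformly within J, adding kernel-distributed displacements, and counting the fraction that land in K.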

Area-to-centroid dispersal

One final method for the derivation of transition probabilities on a lattice is area-to-centroid dispersal, p^(AC)_JK. This method is less applicable for use in cell-based dispersal but is included here for the sake of completeness. In a similar manner to the area-to-area dispersal approximation method described earlier, this definition requires spatial integration over all possible source coordinates except that, in this case, the destination coordinates are fixed at the centre of the destination cell. However, like centroid-to-centroid dispersal, the final probability requires normalisation such that

image(eqn 9)
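A minimal sketch of area-to-centroid dispersal for a Gaussian kernel follows, again assuming the density exp(−r²/α²)/(πα²) per unit area and using hypothetical function names. It relies on the symmetry of the kernel: integrating the density of reaching a fixed destination centroid over all origins in J equals the mass of a Gaussian centred on that centroid falling inside J.

```python
import math

def _rect_mass(cx, cy, cell, alpha):
    """Mass of the assumed Gaussian centred at (cx, cy) inside cell (x1, x2, y1, y2)."""
    x1, x2, y1, y2 = cell
    mx = 0.5 * (math.erf((x2 - cx) / alpha) - math.erf((x1 - cx) / alpha))
    my = 0.5 * (math.erf((y2 - cy) / alpha) - math.erf((y1 - cy) / alpha))
    return mx * my

def area_to_centroid_row(j_cell, cells, alpha):
    """Normalised probabilities of moving from cell J to each candidate cell."""
    jx1, jx2, jy1, jy2 = j_cell
    area_j = (jx2 - jx1) * (jy2 - jy1)
    raw = []
    for lx1, lx2, ly1, ly2 in cells:
        lcx, lcy = 0.5 * (lx1 + lx2), 0.5 * (ly1 + ly2)  # destination centroid
        # Mean kernel density at the destination centroid over uniform origins in J.
        raw.append(_rect_mass(lcx, lcy, j_cell, alpha) / area_j)
    total = sum(raw)  # normalise over all candidate destinations
    return [v / total for v in raw]
```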

Figure 1 illustrates the four different definitions used in this article to approximate continuous dispersal when generating lattice-based cell-to-cell transition probabilities.

Figure 1.

Illustration of the four different lattice-based dispersal transition probability definitions described in this paper. (a) Centroid-to-centroid dispersal, p^(CC)_JK, where dispersal events are assumed to originate from the centre of the source cell and individuals can only disperse to the centroids of the possible destination cells. (b) Centroid-to-area dispersal, p^(CA)_JK, where the transition probability is defined as the probability of landing anywhere within the boundaries of the destination cell but with all dispersal originating from the centre of the source cell. (c) Area-to-centroid dispersal, p^(AC)_JK, which weights the probability of arriving at the centroid of the destination cell from a given point of origin by the probability that the individual begins its dispersal from that origin. This origin probability is assumed to be uniform over the area of the source cell. (d) Area-to-area dispersal, p^(AA)_JK, which extends area-to-centroid dispersal by integrating over all possible destination points in the destination cell, relaxing the restriction that individuals can only disperse to centroids. Note that p^(CC)_JK and p^(AC)_JK require normalisation to represent true transition probabilities.

A detailed derivation of transition probability estimates under Gaussian dispersal (see Clark et al. 1999) for each of the four approximation methods described in this article is given in Data S1. Results may not be easy to derive analytically for other dispersal kernels, however, and in these circumstances it may be necessary to resort to numerical integration techniques (see Davis & Rabinowitz 2007) to derive transition probability estimates. Functions to perform any of the approximation methods described in this article have been provided for the R statistical computing platform as part of the ecomodtools package available from RForge (https://r-forge.r-project.org/projects/ecomodtools/). The LatticeTransitionProbs function of the ecomodtools package can be employed to calculate cell-to-cell transition probabilities using the analytic results for commonly employed dispersal kernels or by using Monte Carlo integration for an arbitrary, user-defined dispersal kernel. To install the package and the relevant documentation from R, type the following at the console whilst connected to the internet: install.packages("ecomodtools", repos = "http://R-Forge.R-project.org").

Composite dispersal kernels

Some authors have argued that one dispersal kernel alone does not offer enough flexibility to describe the observed changes in species distributions and that a composite dispersal kernel combining the different modes of dispersal at short and long ranges is preferable (Shigesada, Kawasaki & Takeda 1995; Higgins & Richardson 1999; Bullock & Clarke 2000). The commonest form for a composite displacement kernel, denoted here as g+(r,θ), usually consists of two sub-kernels, g1(r,θ) and g2(r,θ), weighted by an extra parameter, φ:

g_{+}(r,\theta) = \varphi\, g_{1}(r,\theta) + (1 - \varphi)\, g_{2}(r,\theta) (eqn 10)

Under this specification, each dispersal event involves drawing a random distance and direction from the joint distribution with probability density function g1(r,θ) with probability φ (where 0 ≤ φ ≤ 1); otherwise, the distance and direction are drawn from the joint distribution with probability density function g2(r,θ). This allows the specification of a kernel that describes a common localised dispersal pattern, with a high value of φ, but with the possibility of very rare long-distance dispersal events. This formulation of composite dispersal can be incorporated into our cell-to-cell transition probabilities under a lattice-based modelling structure very simply by weighting the transition probabilities corresponding to the two sub-kernels according to the weighting parameter φ such that

image(eqn 11)

where inline image and inline image represent the transition probabilities, calculated using any of the four approximation methods described earlier, of the dispersal described by probability density functions g1(r,θ) and g2(r,θ), respectively.
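Because the composite kernel is a two-component mixture, the corresponding transition matrices combine in the same way. A minimal sketch, with a hypothetical function name and arbitrary illustrative matrices:

```python
import numpy as np

def composite_transition(P1, P2, phi):
    """Mix two cell-to-cell transition matrices with weight phi on the first.

    P1 and P2 hold the transition probabilities under sub-kernels g1 and g2,
    computed with the same approximation method on the same lattice.
    """
    if not 0.0 <= phi <= 1.0:
        raise ValueError("phi must lie in [0, 1]")
    P1 = np.asarray(P1, dtype=float)
    P2 = np.asarray(P2, dtype=float)
    if P1.shape != P2.shape:
        raise ValueError("P1 and P2 must have the same shape")
    return phi * P1 + (1.0 - phi) * P2

# Illustrative values only: a mostly local kernel mixed with a wide-ranging one.
P_local = np.array([[0.9, 0.1, 0.0], [0.05, 0.9, 0.05], [0.0, 0.1, 0.9]])
P_long = np.full((3, 3), 1.0 / 3.0)
P_mix = composite_transition(P_local, P_long, phi=0.95)
```

If both input matrices have rows summing to one, so does the mixture, so no further normalisation is required.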

Incorporating boundary conditions

The methods described in this article have assumed thus far that the arena in which the simulation takes place is infinite in size. However, computation must take place over a finite grid and so decisions must be made by the investigator as to what fate befalls individuals that interact with the border of the simulation arena. These decisions can drastically alter the outcome of the simulation (Sullivan 1988; Burton & Travis 2008), and so no discussion of lattice-based dispersal would be complete without a description of how to incorporate boundary conditions. For a thorough treatment of this topic, with accompanying illustrations, the reader is referred to Data S2.

Extension to patch-based models

The methods described here are not limited to the description of cell-to-cell dispersal. Many metapopulation and metacommunity models use simplified descriptions of the spatial extent of patches. For example, Moilanen (2004) assumes that all patches are circular, whilst Hanski (1994) assumes that patch shape is negligible in determining patch connectivity. Both studies assume that inter-patch dispersal probabilities need only be expressed in terms of the shortest distance between patches and the area of the patch. However, if the spatial extent of the patches can be approximated using a set of cellular pixels, then it is possible to bring the methods described here to bear and allow for the description of patch-to-patch dispersal in terms of an underlying continuous displacement kernel. This allows patch connectivity to be described both in terms of the dispersal capabilities of the species of interest and the spatial extent of the individual patches.

Whilst approximating the spatial extent of patches as a series of cells may seem like an abstraction, it rarely represents a loss of information. This is because the representation of habitat areas in spatial data sets, such as the Corine land cover data set (as described in Brown, Gerard & Fuller 2002, for the UK extent), are often stored in a cellular ‘raster’ format anyway. Even data stored as areal units in ‘vector’ format can be approximated using fine-resolution cellular lattices: the vector LandCover 2000 data set of Fuller et al. (2002) is also available in a 25 m resolution raster version.

If we define a patch, J, as the set of NJ cells that comprise the source patch, with each constituent cell indexed J1, J2, …, JNJ, and K as the destination patch of NK similarly indexed cells, then we can calculate the probability of moving to patch K given that the source of dispersal originated somewhere within patch J, inline image, in terms of the component cell-to-cell dispersal probabilities. The event of a dispersing individual relocating to destination cell K1 and the alternative event of that same individual relocating to any other cell, such as the cell K2, during a single dispersal event are mutually exclusive. This means that the probability of dispersing to any of the destination cells in patch K given a specific source cell as the point of origin, inline image, is the sum of the probabilities of dispersal from the source cell to each of the destination cells:

image(eqn 12)

If we assume that the probability of the location of the point of origin is uniformly spread across the area of the patch, the final patch-to-patch probability is defined as the sum of the probabilities of dispersing to any of the destination patch cells from each of the source patch cells, but with each probability weighted by the proportional area of the relevant source cell relative to the total area of the source patch. Therefore,

image(eqn 13)

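where jnx1 and jnx2 are the lower and upper boundaries on the x-axis of cell Jn, respectively, and jny1 and jny2 are similarly defined as the lower and upper boundaries of cell Jn on the y-axis. For a regular lattice of cells, where all cells have the same area, the area weights in Eqn 13 all reduce to 1/NJ, so that the patch-to-patch probability is simply the mean over the NJ source cells of the summed probabilities of dispersing to the NK destination cells.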
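The two-stage aggregation described by Eqns 12 and 13 can be sketched as follows, assuming a precomputed cell-to-cell transition matrix and a vector of cell areas. The function name, argument layout and example values are illustrative assumptions.

```python
import numpy as np

def patch_to_patch(P_cell, areas, source_cells, dest_cells):
    """Probability of moving from a source patch to a destination patch.

    P_cell[a, b] is the probability of moving from cell a to cell b,
    areas[a] is the area of cell a, and source_cells / dest_cells list
    the indices of the cells making up each patch.
    """
    P_cell = np.asarray(P_cell, dtype=float)
    areas = np.asarray(areas, dtype=float)
    src = np.asarray(source_cells)
    dst = np.asarray(dest_cells)
    # Sum over destination cells: landing in different cells is mutually exclusive.
    to_patch = P_cell[np.ix_(src, dst)].sum(axis=1)
    # Weight each source cell by its share of the source patch area.
    weights = areas[src] / areas[src].sum()
    return float(weights @ to_patch)

# Illustrative three-cell landscape: cells 0 and 1 form the source patch,
# cell 2 is the destination patch.
P_example = [[0.5, 0.3, 0.2], [0.1, 0.6, 0.3], [0.2, 0.2, 0.6]]
p_equal = patch_to_patch(P_example, [1.0, 1.0, 1.0], [0, 1], [2])
p_unequal = patch_to_patch(P_example, [1.0, 3.0, 1.0], [0, 1], [2])
```

With equal cell areas the weights reduce to 1/NJ, which corresponds to the simplification for regular lattices.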

To satisfy the condition Σ_K p^(·)_JK = 1, a requirement for a properly defined probability mass function, it is necessary that the patches collectively account for all space to which individuals can disperse. Whilst this may be reasonable when deriving movement probabilities for individuals dispersing over landscapes with no gaps, such as the coarse-grained Dirichlet landscapes of Holland et al. (2007), it may be unsuitable for most metapopulation models, where the total area of the patches combined can account for only a very small proportion of the total area of study. In these situations, it is important to describe explicitly the fate of individuals that do not disperse successfully to another habitat patch. At one extreme, we can define all areas that are not suitable habitat as absorbing states and apply the absorbing state correction (as described in Data S2) to the cell-to-cell transition probabilities (and hence to the patch-to-patch transition probabilities). However, for landscapes with patches that are relatively small compared to the total area of study or that appear infrequently, this dispersal-mediated mortality may represent a sizeable mortality risk. For seed dispersal, displacement to unsuitable soil or environmental conditions may well doom that individual, but in animal dispersal, the description of a ‘black hole’ effect between patches may present an artificial inflation of dispersal mortality risk. At the other extreme, it is possible to apply restricting boundary conditions to the set of patches so that an individual always successfully disperses to a suitable patch. This in effect truncates the dispersal kernel so that only suitable patches can be dispersed to. Under these conditions, dispersal mortality is always zero, even in very isolated patches. In common application, however, it may be most practicable to mix these two extreme scenarios using a method such as the dispersal mixture formula presented in Eqn 11.

Testing the approximation

To assess the accuracy of the four approximation methods described in this paper, we describe the movement of an individual across the landscape over multiple time periods using each approximation method and compare the probability of the individual residing in each cell in each time period with what would be expected if continuous point-to-point dispersal were employed.

We define a transition matrix, inline image, as a comprehensive description of cell-to-cell dispersal probabilities and with each element, inline image, containing the probability of moving to cell K if the dispersal event originated from cell J with periodic boundary correction applied. Here, the matrices inline image, inline image, inline image and inline image are defined as dispersal matrices filled with transition probabilities derived using the relevant approximation method. The state-vector, inline image, with elements inline image contains the probabilities that the individual resides in cell J at time period t. This specification allows the use of inline image to describe inline image in terms of a Markov chain recurrence relationship where

image(eqn 14)
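The recurrence in Eqn 14 can be illustrated with a minimal sketch. The equation image is not available here, so the orientation is an assumption: the state vector is treated as a row vector and `D[j][k]` as the probability of moving from cell j to cell k, so that each row of `D` sums to one. The three-cell matrix is a made-up toy example and is not derived from any of the four approximation methods.

```python
def step(w, D):
    """Advance the residence probabilities by one time period.

    w[j] is the probability of residing in cell j. D[j][k] is the assumed
    probability of moving from cell j to cell k (every row sums to one).
    """
    n = len(w)
    return [sum(w[j] * D[j][k] for j in range(n)) for k in range(n)]


# Toy periodic arena of three cells: stay with probability 0.5,
# otherwise move to either neighbour with probability 0.25.
D = [
    [0.50, 0.25, 0.25],
    [0.25, 0.50, 0.25],
    [0.25, 0.25, 0.50],
]

w = [1.0, 0.0, 0.0]  # the individual starts in cell 0 with certainty
history = [w]
for _ in range(40):  # 40 time periods, as in the test described in the text
    w = step(w, D)
    history.append(w)
```

Because total probability is conserved at every step, each vector in `history` sums to one, which is a convenient sanity check on any transition matrix built from the approximation methods.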

For the purposes of this exercise, we approximate Gaussian dispersal on a lattice by filling the cell-to-cell transition probabilities in matrix inline image with those calculated using equations S1-3, S1-8, S1-7 and S1-11 derived in Data S1 for centroid-to-centroid, area-to-centroid, centroid-to-area, and area-to-area approximation methods, respectively.

To compare the discrete approximations to continuous dispersal, it is necessary to derive a cell-based description of residence probability based on continuous dispersal over time. Starting with the Cartesian representation of the Gaussian dispersal kernel as defined in Clark et al. (1999), we have shown in Data S3 that the total displacement in the Cartesian coordinates, δx and δy, at time t, arising from Gaussian steps in each time period, is a bivariate-normal random variable with probability density function

image(eqn 15)

where α represents the isotropic standard deviation parameter of displacement in one time step. From Eqn 15, it is possible to derive the probability that an individual resides in cell J at time t, or wtJ, by integrating the probability density function between the limits of the cell extent:

image(eqn 16)

From the identity

image(eqn 17)

where κ is a substitution used in integration (inline image), we can express wtJ in terms of the numerically tractable error function, erf (Z), as defined in equation S1-5:

image(eqn 18)
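As a rough illustration of how the cell probability in Eqn 18 can be evaluated with the error function, the sketch below integrates a bivariate normal over a rectangular cell. It assumes, following the description in the text, that α is the per-axis standard deviation of one step, so that the displacement after t steps has per-axis standard deviation α√t. If Data S1 and Data S3 use a different scaling of α, the constant inside `erf` changes accordingly. The function name and the value of `alpha` are hypothetical.

```python
import math


def cell_probability(x1, x2, y1, y2, alpha, t):
    """Probability that the displacement after t steps falls inside the
    rectangle [x1, x2] x [y1, y2] (no boundary correction applied).

    Assumes independent normal displacements on each axis with mean zero
    and standard deviation alpha * sqrt(t).
    """
    s = alpha * math.sqrt(2.0 * t)
    px = 0.5 * (math.erf(x2 / s) - math.erf(x1 / s))
    py = 0.5 * (math.erf(y2 / s) - math.erf(y1 / s))
    return px * py


# Unit cells centred on integer coordinates, one step, illustrative alpha.
alpha = 2.0
total = sum(
    cell_probability(i - 0.5, i + 0.5, j - 0.5, j + 0.5, alpha, 1)
    for i in range(-30, 31)
    for j in range(-30, 31)
)
origin_cell = cell_probability(-0.5, 0.5, -0.5, 0.5, alpha, 1)
```

Summing over a lattice that extends many standard deviations from the origin should return a total very close to one.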

To make the continuous Gaussian baseline comparable to the approximations applied here, the final step is to correct for periodic boundary conditions. As in the periodic correction of equation S2-3, the corrected form of the cell probabilities, w′′tJ, can be defined in terms of the uncorrected probabilities such that

image(eqn 19)

Like equations S2-3 and S2-6, the convergent infinite series in Eqn 19 can be evaluated numerically using techniques such as those described in Caliceti et al. (2007). Here, cell inline image is a translation of cell J where inline image is bounded by [jx1 + axi1] and [jx2 + axi1] on the x-axis and by [jy1 + ayi2] and [jy2 + ayi2] on the y-axis. The vector, W′′t, with an element for each cell set to w′′tJ, provides a description of continuous dispersal over time that has a structure allowing comparison to the discrete approximations, inline image, described in this article.
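The periodic correction can be sketched by summing the uncorrected cell probability over translated copies of the cell. The version below simply truncates the infinite series after a fixed number of images on each side, which is a naive stand-in for the series techniques cited above and not the authors' procedure. It again assumes that α is a per-axis standard deviation. The function names, `n_images` and the parameter values are hypothetical.

```python
import math


def interval_probability(lo, hi, sigma):
    """P(lo < X < hi) for X ~ Normal(0, sigma^2)."""
    s = sigma * math.sqrt(2.0)
    return 0.5 * (math.erf(hi / s) - math.erf(lo / s))


def wrapped_cell_probability(x1, x2, y1, y2, alpha, t, width, height, n_images=20):
    """Residence probability for one cell in a periodic arena.

    The double sum over translations (i1, i2) factorises into a product of
    two single sums because the uncorrected probability is itself a product
    of an x term and a y term.
    """
    sigma = alpha * math.sqrt(t)
    shifts = range(-n_images, n_images + 1)
    px = sum(interval_probability(x1 + i * width, x2 + i * width, sigma) for i in shifts)
    py = sum(interval_probability(y1 + i * height, y2 + i * height, sigma) for i in shifts)
    return px * py


# 45 x 45 arena centred on the origin, 3 x 3 cells, illustrative alpha,
# evaluated at a late time period (t = 400).
edges = [-22.5 + 3.0 * k for k in range(16)]
probs = [
    wrapped_cell_probability(edges[c], edges[c + 1], edges[r], edges[r + 1],
                             2.4, 400, 45.0, 45.0)
    for r in range(15)
    for c in range(15)
]
```

At late time periods the wrapped probabilities flatten towards 1/225 per cell, the uniform distribution over the 15 × 15 arena.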

To test the effect of spatial scale of the lattice on the quality of the approximation, we calculate the relevant probabilities over lattices of three different grid sizes of 1, 3 and 5 units in width and height. The simulation arena is a total of 45 × 45 units, meaning that, in terms of cell count, the intermediate and coarse-grained spatial resolutions comprise arenas of 15 × 15 and 9 × 9 cells, respectively. In each calculation, we initialise the continuous Gaussian dispersal process with an individual starting at the centre of the grid, which for notational convenience we have designated as the origin of the x and y axes without loss of generality. For the discrete approximations, we initialise the starting probability vector, inline image, so that all elements are zero with the exception of the one cell containing the origin that is given a value of one.

α is set to inline image for all calculations. Converting the bivariate normal displacement kernel into a probability density function of dispersal distance results in a rescaled Rayleigh distribution (Tufto, Engen & Hindar 1997; Snäll, O'Hara & Arjas 2007; Cousens, Dytham & Law 2008) with expected value inline image (Clark, Macklin & Wood 1998). By setting α to inline image, we standardise the expected dispersal distance over one time step to the cell length of the medium resolution grid. This provides a convenient midpoint benchmark to judge the approximation methods at grid resolutions with cell lengths larger than the expected dispersal distance, such as the 5 × 5 resolution grid, and grids at a finer scale than the scale of dispersal, such as the 1 × 1 grid.
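The value of α and the expression for the expected distance are lost with the equation images, so the worked calculation below is only a sketch under two stated parameterisations and does not assert which one the article uses. If each axis displacement is normal with standard deviation σ, the distance r follows a Rayleigh distribution. The equivalent form with kernel proportional to exp(−r²/α²) has α = σ√2. Both lines describe the same kernel with an expected distance of 3 units.

```latex
% Assumption 1: per-axis displacement ~ N(0, \sigma^2)
p(r) = \frac{r}{\sigma^{2}} \exp\!\left(-\frac{r^{2}}{2\sigma^{2}}\right), \qquad
\mathbb{E}[r] = \sigma\sqrt{\frac{\pi}{2}}, \qquad
\mathbb{E}[r] = 3 \;\Rightarrow\; \sigma = 3\sqrt{\frac{2}{\pi}} \approx 2.394

% Assumption 2: kernel \propto \exp(-r^{2}/\alpha^{2}), i.e. \alpha = \sigma\sqrt{2}
\mathbb{E}[r] = \frac{\alpha\sqrt{\pi}}{2}, \qquad
\mathbb{E}[r] = 3 \;\Rightarrow\; \alpha = \frac{6}{\sqrt{\pi}} \approx 3.385
```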

Residence probabilities were calculated for each cell over the 40 time periods using the transition matrices generated using each of the four approximation methods described in this article. Each of the resultant vectors of residence probabilities at each time period was compared to those expected under continuous dispersal.

Results

The absolute range of error values given in Table 2 shows that for most grid sizes tested here, all four approximation methods provide a reasonable approximation to what would be expected under continuous dispersal. Here, approximation error is defined as the difference between the probability that the individual resides within a cell at a given time period calculated according to the approximation method being tested and the probability that the individual would reside in that cell at the same time period under truly continuous dispersal (Eqn 19). Positive values represent incidences where the residence probabilities calculated by the approximation method exceed those expected under continuous dispersal, whilst negative values denote incidences where the ‘true’ residence probabilities exceed those calculated by the approximation method.
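The error definition above amounts to a cell-by-cell subtraction followed by taking the extremes over all cells and time periods. A minimal sketch follows. The function name is hypothetical and the numbers are made up for illustration and are not values from this study.

```python
def error_range(approx_history, continuous_history):
    """Return (minimum, maximum) of approximation minus continuous residence
    probability over every cell and every time period supplied.

    Each argument is a list of per-period vectors of cell probabilities.
    """
    errors = [
        a - c
        for approx, continuous in zip(approx_history, continuous_history)
        for a, c in zip(approx, continuous)
    ]
    return min(errors), max(errors)


# Two time periods, three cells (illustrative values only).
approx = [[0.50, 0.25, 0.25], [0.40, 0.30, 0.30]]
continuous = [[0.50, 0.25, 0.25], [0.375, 0.3125, 0.3125]]
lowest, highest = error_range(approx, continuous)
```

With this sign convention a positive maximum marks an excess of probability under the approximation and a negative minimum marks a deficit.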

Table 2.   Table of the range of approximation error for each approximation method and cell resolution over the entire arena and the 40 time steps calculated. Approximation error is defined here as the difference of the probability of the individual residing in a cell under continuous dispersal (Eqn 19) and the probability calculated using a discrete approximation method. Positive values represent an ‘excess’ of probability, where the residence probabilities calculated by the approximation method exceed that expected under continuous dispersal. Conversely, negative values represent residence probabilities calculated by the approximation method below those expected under continuous dispersal

| Approximation method | Statistic | Grid cell size 1 × 1 | Grid cell size 3 × 3 | Grid cell size 5 × 5 |
| --- | --- | --- | --- | --- |
| Centroid-to-centroid | Minimum | −5·335 × 10⁻⁵ | −3·437 × 10⁻³ | −2·856 × 10⁻² |
| Centroid-to-centroid | Maximum | 3·999 × 10⁻⁴ | 2·993 × 10⁻² | 1·740 × 10⁻¹ |
| Centroid-to-area | Minimum | −9·904 × 10⁻⁵ | −6·565 × 10⁻³ | −1·755 × 10⁻² |
| Centroid-to-area | Maximum | 1·339 × 10⁻⁵ | 9·117 × 10⁻⁴ | 5·381 × 10⁻³ |
| Area-to-centroid | Minimum | −1·758 × 10⁻⁴ | −6·565 × 10⁻³ | −1·755 × 10⁻² |
| Area-to-centroid | Maximum | 4·464 × 10⁻⁵ | 9·177 × 10⁻⁴ | 5·381 × 10⁻³ |
| Area-to-area | Minimum | −3·885 × 10⁻⁴ | −2·333 × 10⁻² | −1·053 × 10⁻¹ |
| Area-to-area | Maximum | 5·224 × 10⁻⁵ | 2·702 × 10⁻³ | 1·239 × 10⁻² |

Residence probability estimates are correct to within three decimal places (<0·0004) of the true probability for calculations made under the fine-resolution grid (1 × 1 cell size) for all estimation methods calculated for all cells over the entire 40-step time period. For medium resolution grids (3 × 3 cell size), this accuracy reduces to values within 0·03 of the continuous dispersal baseline. At coarse resolutions (5 × 5 cell size), reasonable approximation to true continuous dispersal is not guaranteed: at the extremes, approximation methods show an inaccuracy in residence probability calculation of up to 0·175.

Figures 2–4 show the spatial distribution of approximation error on fine, medium and coarse resolution grids, respectively. From these figures, we can see that centroid-to-centroid methods tend to overestimate the probability weights around the origin of dispersal. Conversely, centroid-to-area, area-to-centroid and area-to-area methods all underestimate the residence probabilities in these areas whilst overestimating residence probabilities in the peripheries. A full time series of approximation for three locations sampled across the simulation arena is displayed in Data S4.

Figure 2.

 Spatial distribution of approximation method error through time on a fine resolution grid (cell size 1 × 1 units). Probability error is defined here as the difference of the probability of the individual residing in a cell under continuous dispersal (Eqn 19) and the probability calculated using a discrete approximation method. Positive values (blue shading in the panels above) represent an ‘excess’ of probability, where the residence probabilities calculated by the approximation method exceed that expected under continuous dispersal. Conversely, negative values (red shading in the panels above) represent residence probabilities calculated by the approximation method below those expected under continuous dispersal. (a–d) Four snapshots of the spatial error for each of the four approximation methods.

Figure 3.

 Spatial distribution of approximation method error through time on a medium resolution grid (cell size 3 × 3 units). Probability error is defined here as the difference of the probability of the individual residing in a cell under continuous dispersal (Eqn 19) and the probability calculated using a discrete approximation method. Positive values (blue shading in the panels above) represent an ‘excess’ of probability, where the residence probabilities calculated by the approximation method exceed that expected under continuous dispersal. Conversely, negative values (red shading in the panels above) represent residence probabilities calculated by the approximation method below those expected under continuous dispersal. (a–d) Four snapshots of the spatial error for each of the four approximation methods.

Figure 4.

 Spatial distribution of approximation method error through time on a coarse resolution grid (cell size 5 × 5 units). Probability error is defined here as the difference of the probability of the individual residing in a cell under continuous dispersal (Eqn 19) and the probability calculated using a discrete approximation method. Positive values (blue shading in the panels above) represent an ‘excess’ of probability, where the residence probabilities calculated by the approximation method exceed that expected under continuous dispersal. Conversely, negative values (red shading in the panels above) represent residence probabilities calculated by the approximation method below those expected under continuous dispersal. (a–d) Four snapshots of the spatial error for each of the four approximation methods.

The time series of approximation error in figure S4-1 of Data S4 shows that the most extreme deviation from continuous dispersal occurs, for all approximation methods and grid resolutions, close to the origin in the earlier time periods. For locations further from the origin, the peak of approximation error occurs later in the time series, and at a much reduced magnitude. As time increases, a wave of increased residence probability spreads out from the centre of the simulation arena; if this wave arrives under an approximation method at a different time from that predicted under continuous dispersal then, during this period of disparity, we observe a peak of approximation method error.

Under two-dimensional Gaussian diffusion, the variance of the probability density function of the particle location (Eqn 15) tends to infinity as time increases. The cell residence probabilities, calculated with periodic boundary conditions according to Eqn 19, thus tend towards a uniform distribution bounded by the margins of the simulation arena. Owing to the Markovian nature of the calculation mechanism for the residence probabilities for each approximation method, it is possible to calculate the asymptotic probability distribution for such methods. In Markovian models, the distribution of the asymptotic probability of residence is equivalent to the right eigenvector corresponding to the dominant eigenvalue of the transition matrix, rescaled so that all components sum to one. For properly defined transition matrices, the dominant eigenvalue is always equal to one as the total probability is conserved between time periods.

Table 3 contains the sum of the absolute difference between the asymptotic residence probabilities calculated for each of the approximation methods at each cell resolution and the uniform probabilities expected under continuous dispersal. From Table 3, it is clear that the centroid-to-area and area-to-area approximation methods exhibit very small deviations from the asymptotic expectation at all cell resolutions (<1·8 × 10⁻⁸). Whilst the total deviation exhibited by the centroid-to-centroid and area-to-centroid methods is relatively small (<0·12), it is still many orders of magnitude larger than those exhibited by the areal destination methods and represents a non-negligible departure from the asymptotic optimum.
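The asymptotic comparison can be sketched without an eigenvalue library by iterating the chain until it stops changing (power iteration) and then summing the absolute differences from the uniform distribution. This is an illustrative substitute for a direct eigendecomposition, and it assumes the chain is irreducible and aperiodic so that the iteration converges. The orientation is also an assumption: with `D[j][k]` as the probability of moving from cell j to cell k, the stationary vector is the dominant left eigenvector of `D`, which is the right eigenvector of its transpose. The two-cell matrix is a toy example.

```python
def stationary_distribution(D, tol=1e-13, max_iter=100000):
    """Iterate w <- w D from a point mass until successive vectors agree."""
    n = len(D)
    w = [1.0] + [0.0] * (n - 1)
    for _ in range(max_iter):
        nxt = [sum(w[j] * D[j][k] for j in range(n)) for k in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, w)) < tol:
            return nxt
        w = nxt
    return w


def deviation_from_uniform(w):
    """Sum of absolute differences between w and the uniform distribution."""
    n = len(w)
    return sum(abs(p - 1.0 / n) for p in w)


# Toy two-cell chain whose stationary distribution is (5/6, 1/6).
D = [[0.9, 0.1], [0.5, 0.5]]
stationary = stationary_distribution(D)
deviation = deviation_from_uniform(stationary)
```

For this toy chain the deviation is 2/3. A transition matrix whose long-run behaviour matches continuous dispersal in a periodic arena would give a deviation close to zero.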

Table 3.   Table of sums of asymptotic deviance of approximation methods from continuous dispersal. Elements are calculated from the dominant right eigenvector of the transition matrices used in Eqn 14. The element values are the sum of the absolute difference between the elements of the eigenvector and the uniform probability distribution that represents the asymptotic result of continuous dispersal

| Approximation method | Grid cell size 1 × 1 | Grid cell size 3 × 3 | Grid cell size 5 × 5 |
| --- | --- | --- | --- |
| Centroid-to-centroid | 2·835 × 10⁻¹⁴ | 1·042 × 10⁻¹ | 5·111 × 10⁻² |
| Centroid-to-area | 8·229 × 10⁻⁹ | 1·748 × 10⁻⁸ | 4·751 × 10⁻¹³ |
| Area-to-centroid | 1·158 × 10⁻¹ | 1·107 × 10⁻¹ | 8·365 × 10⁻² |
| Area-to-area | 8·902 × 10⁻¹⁰ | 4·927 × 10⁻¹¹ | 3·665 × 10⁻¹¹ |

Discussion

The methods described in this article provide a number of different mechanisms to approximate continuous dispersal in lattice-based models. We have shown that, for Gaussian dispersal at least, these approximations hold well at resolutions equivalent to the expected dispersal distance and finer. At coarse resolutions, the approximation methods described in this article begin to exhibit significant deviations from what would be expected under continuous dispersal. The spatial signal of this error is quite different under the different approximation methods however. The overestimation of residence probabilities at the core of the range observed under centroid-to-centroid dispersal can be explained by the fact that, under this dispersal regime, the distances between the origin and the destination sites are relatively large compared to the other dispersal approximation methods; centroid-to-area dispersal provides a destination area that has margins closer to the point of dispersal, area-to-centroid dispersal has a margin of the departure area closer to the destination point, and finally, area-to-area dispersal has margins of both destination and origin areas that are yet closer again. This results in centroid-to-centroid approximation methods generating residence probabilities in the nearby and source cells far in excess of what would be expected under continuous dispersal because the probability of spanning the distance between the origin and destination centroid for intermediately isolated and distant cells is very low (see Collingham, Hill & Huntley 1996). The unit sum requirement for dispersal probabilities thus requires that the proportional weight be loaded in the nearby cells.
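The loading of probability onto nearby cells under centroid-to-centroid dispersal can be seen in a one-dimensional sketch on a periodic ring of coarse cells. This is a simplified analogue and not the two-dimensional formulas of Data S1: centroid-to-centroid weights are taken as the Gaussian density at centroid separations and then normalised to sum to one, and centroid-to-area probabilities integrate the same density over each destination cell. The function names, the truncation of periodic images and the value of `sigma` are all assumptions made for illustration.

```python
import math


def centroid_to_centroid_row(src, n, cell, sigma, n_images=10):
    """Kernel density at centroid separations, normalised to sum to one."""
    length = n * cell
    weights = []
    for k in range(n):
        d = (k - src) * cell
        weights.append(sum(
            math.exp(-((d + i * length) ** 2) / (2.0 * sigma ** 2))
            for i in range(-n_images, n_images + 1)
        ))
    total = sum(weights)
    return [w / total for w in weights]


def centroid_to_area_row(src, n, cell, sigma, n_images=10):
    """Kernel integrated over each destination cell, starting at a centroid."""
    length = n * cell
    s = sigma * math.sqrt(2.0)
    row = []
    for k in range(n):
        d = (k - src) * cell
        row.append(sum(
            0.5 * (math.erf((d + cell / 2.0 + i * length) / s)
                   - math.erf((d - cell / 2.0 + i * length) / s))
            for i in range(-n_images, n_images + 1)
        ))
    return row


# Nine cells of length 5 (the coarse grid), source in the middle cell.
c2c = centroid_to_centroid_row(4, 9, 5.0, 2.4)
c2a = centroid_to_area_row(4, 9, 5.0, 2.4)
```

In this toy setting the probability of remaining in the source cell is clearly larger under the normalised centroid-to-centroid row than under the centroid-to-area row, which mirrors the direction of the bias described above.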

The effect of increased cell size on the spatial distribution of approximation error observed in centroid-to-area, area-to-centroid and area-to-area dispersal appears to act oppositely to that observed with centroid-to-centroid dispersal. Here, residence probabilities close to the origin of dispersal are underestimated, whilst more distant dispersal events are predicted with a greater frequency than that expected under continuous dispersal. For those approximation methods where dispersal originates from an areal unit, this overestimation of residence probability in the peripheral grid cells can be explained by the added dispersal advantage conferred by the assumption that the point of departure is selected uniformly over the originating cell. If the originating cells are large, then this first stage in the dispersal process can potentially garner origins of dispersal distant from locations likely to be dispersed to under continuous dispersal in the previous time period. In other words, dispersing individuals are ‘pulled’ across the interior of cells, effectively accelerating the dispersal rate. This extra process can, once compounded over several time steps, induce considerable increases in the predicted invasion speed.

The point-based origin of dispersal in centroid-to-area dispersal means that it may not be immediately obvious why centroid-to-area dispersal may suffer from the same spatial patterns of approximation error that afflict the area-to-centroid and area-to-area approximation methods. However, this phenomenon can be elucidated by envisioning the scenario where an individual moves from a cell centroid to just inside the margins of a nearby cell in one time period. When the model is iterated to the next time period, the individual is assumed to disperse from the centre of the destination cell of the last time period. Like the areal origin approximation methods, this effect essentially creates an extra intracellular dispersal event in each time period. Compounded over multiple time periods, this effect will produce the observed spatial patterning of approximation error and can potentially bias predictions of expansion rates dramatically.

Whilst the absolute approximation error is an area of key consideration when selecting an appropriate approximation method (Table 2), for simulations run over long-term timescales, particularly those studies that focus on the equilibrium properties of the system, it is also important for the investigator to consider the asymptotic properties of the approximation method applied (Table 3). Methods that do not create outcomes that tend towards the continuous process that they are supposed to approximate will produce an artefact of approximation and may bias the interpretation of such results. Except at very fine scale resolutions, we have shown here that centroid-to-centroid and area-to-centroid dispersal do not exhibit the requisite asymptotic properties for these purposes. Both of these methods share the characteristic that they require the evaluation of the dispersal kernel at a point. For a continuous dispersal kernel, the probability of dispersing to a point is infinitesimally small and, to express cell-to-cell dispersal probabilities in terms of a true probability that sums to unity across all possible destinations, both approximation methods require normalisation. As a result, both the centroid destination methods cannot characterise a ‘true’ dispersal process resulting in a long-term deviation from the continuous process, even if the approximation in the short and medium term is accurate.

The sensitivity of the approximation error and asymptotic properties of cell-based dispersal to the resolution of the lattice has resulted in a number of authors suggesting rules for appropriate cell resolution. Martin (1993) expresses such recommendations in terms of a so-called m-criterion. In the context of dispersal approximation, this criterion is only satisfied if the cell length is less than or equal to the expected dispersal distance of the underlying continuous dispersal kernel over one time step. The expected dispersal distance of the underlying dispersal kernel can be calculated by converting the two-dimensional displacement kernel, g·(r,θ), into a probability distribution of distances (see Clark et al., 1999; Cousens, Dytham & Law 2008), and calculating the expected value of this distribution. Rules for lattice-based dispersal have also been documented in Collingham, Hill & Huntley (1996), where the authors recommend that the cell lengths should be no longer than one-half of the square root of the mean dispersal distance. Both heuristics may be excessively stringent however. The fine-resolution grid (cell length 1 × 1) and dispersal kernel parametrisation evaluated in this study falls slightly outside the maximum cell length criterion of Collingham, Hill & Huntley (1996). However, even at the poorest performing locations and time periods within the 40 time periods sampled, all approximation methods described here still give accurate residence probabilities to within three decimal places at this spatial resolution. Moreover, the medium resolution grid (cell length 3 × 3) falls exactly on the limit of acceptability to satisfy the ‘m-criterion’ of Martin (1993). Even at this limit, the centroid-to-area and area-to-centroid dispersal methodologies still provide residence probabilities to within two decimal places of the continuous baseline.
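The two resolution heuristics, as paraphrased in the paragraph above, reduce to simple inequalities on the cell length. The sketch below encodes them with hypothetical function names and checks the three grid resolutions used in this study against an expected dispersal distance of 3 units, the cell length of the medium resolution grid.

```python
import math


def satisfies_m_criterion(cell_length, mean_distance):
    """Cell length no greater than the expected one-step dispersal distance."""
    return cell_length <= mean_distance


def satisfies_collingham(cell_length, mean_distance):
    """Cell length no greater than half the square root of the mean distance."""
    return cell_length <= 0.5 * math.sqrt(mean_distance)


mean_distance = 3.0
checks = {
    cell: (satisfies_m_criterion(cell, mean_distance),
           satisfies_collingham(cell, mean_distance))
    for cell in (1.0, 3.0, 5.0)
}
```

Under this reading the fine grid narrowly fails the second rule, whose limit is about 0·87 units, and the medium grid sits exactly on the limit of the first.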

All attempts to emulate continuous dispersal on a discrete lattice will suffer from some form of approximation error. Indeed, Chesson & Lee (2005) state that theory relating to the continuous distribution, such as the moments and convolution properties, may not necessarily apply once the distribution has been mapped onto a discrete lattice. Whilst this is undoubtedly true, we argue here that when scaling up from point-to-point continuous dispersal to cell-to-cell dispersal, it is useful to maintain a theoretical link to dispersal as modelled at the smaller scale. In plants, dispersal kernels are most commonly fitted to seed shadow data (Clark et al. 1999) or the outcomes from molecular parentage analysis (Robledo-Arnuncio & García 2007). In animals, analyses of mark–release–recapture data (Fujiwara et al. 2006) or telemetry data (Dahl & Willebrand 2005; Rhoads, Bowman & Eyler 2010) are most commonly employed. As such, most studies will quote dispersal strategies in terms of point distances. To incorporate the information garnered from these small-scale studies into estimates of cell-to-cell or patch-to-patch connectivity, the calculation of which is an important prerequisite for any form of spatially explicit metapopulation model (Hanski 1994; Moilanen 2004), it is important to define connectivity in terms of parameters derived from data collected at these scales.

For some study species, particularly territorial mammals, the field of dispersal ecology has pursued a much more mechanistic description of movement (for example Will & Tackenburg 2008; van Moorter et al. 2009). Such models are often described as rule based because they rely mainly on the simulation of individuals that move according to a set of rules rather than through description from a redistribution kernel. Some of the simpler simulation models can still be described in terms of a dispersal kernel and, for the approximation of such models in a discrete landscape, the methods described in this article remain directly applicable. For the more complex models, where the description of the movement in terms of a dispersal kernel is not tractable, the approximation of transition probabilities must be garnered from direct simulation. Here, multiple simulations must be performed. As the number of simulations grows large, the proportion of simulations that reside in each cell at the end of the movement will provide a reasonable approximation to the transition probabilities. If the simulations all start from the centre of the source cell, then this corresponds to centroid-to-area dispersal, whilst area-to-area dispersal corresponds to a set of simulations that pick a source location at random from within the source cell according to a uniform distribution within its borders.
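The simulation route to transition probabilities can be sketched as follows. Each replicate starts either at the centroid of the source cell (centroid-to-area) or at a point drawn uniformly within it (area-to-area), applies a movement rule, and records the destination cell on a periodic square grid. The movement rule here is a plain Gaussian step used only as a stand-in for a rule-based model. All names and parameter values are hypothetical.

```python
import random


def gaussian_step(x, y, rng, sigma=2.4):
    """Stand-in movement rule: one isotropic Gaussian displacement."""
    return x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma)


def estimate_transitions(src_col, src_row, n, cell, move, n_sims, area_origin, rng):
    """Estimate one row of the transition matrix by direct simulation.

    Returns a dict mapping (col, row) of the destination cell to the
    proportion of simulations that ended there.
    """
    counts = {}
    for _ in range(n_sims):
        if area_origin:
            # Area-to-area: uniform start point within the source cell.
            x = (src_col + rng.random()) * cell
            y = (src_row + rng.random()) * cell
        else:
            # Centroid-to-area: every replicate starts at the cell centre.
            x = (src_col + 0.5) * cell
            y = (src_row + 0.5) * cell
        x, y = move(x, y, rng)
        # Periodic boundaries: wrap the destination index around the grid.
        key = (int(x // cell) % n, int(y // cell) % n)
        counts[key] = counts.get(key, 0) + 1
    return {key: c / n_sims for key, c in counts.items()}


rng = random.Random(1)
centroid_row = estimate_transitions(7, 7, 15, 3.0, gaussian_step, 20000, False, rng)
area_row = estimate_transitions(7, 7, 15, 3.0, gaussian_step, 20000, True, rng)
```

Repeating the call for every source cell would fill a complete transition matrix. The number of replicates controls the Monte Carlo error of each estimated probability.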

In summary, Holland et al. (2007) have shown that nearest neighbour dispersal produces results that are highly dependent upon the geometry of the lattice and the dispersal neighbourhood. For more reasonable implementations of dispersal, we must apply methods that approximate dispersal defined in continuous space to models where space is represented discretely. In most applications, centroid-to-centroid dispersal is used as a default approximation method. Whilst this may represent the least demanding method in terms of computational power, we have demonstrated that such methods can provide a very poor approximation to continuous dispersal, producing biased estimates of invasion speed and asymptotic residence probabilities. Conversely, approximation methods with areal destination spatial units exhibit both desirable asymptotic qualities and high accuracy, even at relatively coarse spatial scales. The adoption of these more complex methods need not be demanding: numerical tools such as the ecomodtools package, or direct derivation (such as that described for Gaussian dispersal in Data S1), can provide the investigator with a much better approximation of continuous dispersal at very little cost in terms of time, either computationally or in implementation. Moreover, we have shown how rows and columns of the transition matrices generated using these approximation methods can be aggregated to provide estimates of patch connectivity for use in metapopulation and metacommunity models. These methods may provide a valuable part of a suite of techniques to draw inference about ecological processes from data collected at multiple spatial scales.

Acknowledgements

This research was supported by a NERC/UKPopNet studentship (NER/S/R/2005/13941) and by the DFG Priority Program 1374 ‘Infrastructure-Biodiversity-Exploratories’ (HO 2051/2-1). The authors thank an anonymous reviewer for suggestions to improve an earlier draft of this article.
