Predicting dispersal distance in mammals: a trait-based approach



  1. Dispersal is one of the principal mechanisms influencing ecological and evolutionary processes but quantitative empirical data are unfortunately scarce. As dispersal is likely to influence population responses to climate change, whether by adaptation or by migration, there is an urgent need to obtain estimates of dispersal distance.
  2. Cross-species correlative approaches identifying predictors of dispersal distance can provide much-needed insights into this data-scarce area. Here, we describe the compilation of a new data set of natal dispersal distances and use it to test life-history predictors of dispersal distance in mammals and examine the strength of the phylogenetic signal in dispersal distance.
  3. We find that both maximum and median dispersal distances have strong phylogenetic signals. No single model performs best in describing either maximum or median dispersal distances when phylogeny is taken into account but many models show high explanatory power, suggesting that dispersal distance per generation can be estimated for mammals with comparatively little data availability.
  4. Home range area, geographic range size and body mass are identified as the most important terms across models. Cross-validation of models supports the ability of these variables to predict dispersal distances, suggesting that models may be extended to species where dispersal distance is unknown.


Natal dispersal, the movement an animal, plant or propagule undertakes from its point of origin to the place where it reproduces, is a demographic parameter which is almost ubiquitous in its importance to ecology and evolution (Howard 1960; Greenwood 1980; Vellend & Orrock 2010). Its effects are seen at all spatial scales, from the distribution of genes and individuals within populations to community composition and the position and extent of species geographic ranges (Slatkin 1987; Johnson & Gaines 1990; Stenseth & Lidicker 1992; Gaston 2003; Dytham 2009). In addition, there is now a growing understanding about the importance of dispersal in enabling species to keep pace with changing climates (Anderson et al. 2012; Urban, Tewksbury & Sheldon 2012). A recent framework, constructed to identify the threats and benefits of climate change for individual species, highlighted dispersal as an exacerbating factor important in identifying species at risk (Thomas et al. 2010). There is therefore a burgeoning need to include quantitative descriptions of dispersal in spatially explicit population models and species distribution modelling (SDM) techniques (Araújo & Guisan 2006; Hawkes 2009).

The ability to accurately parameterize dispersal distance within models is severely hampered in many taxonomic groups by a lack of data (Nathan 2001). For example, SDMs predicting the impact of climate change on species distributions frequently contrast scenarios of unconstrained and no dispersal with the caveat that, in reality, most species will show a range of dispersal distances which fall between these two assumptions (Araújo & Rahbek 2006; Broennimann et al. 2006; Botkin et al. 2007). There are a number of reasons for this shortfall, including inconsistencies in measurements and definitions of dispersal, difficulties in collecting empirical data owing to the ‘once in a lifetime’ nature of dispersal events in many species and the inability of most studies to capture long-distance dispersal (LDD) events (Koenig, VanVuren & Hooge 1996; Bowman, Jaeger & Fahrig 2002). Difficulties also arise because the process itself takes many forms, from a gradual range shift to an adjacent home range, to a one-way movement over a great distance (Stenseth & Lidicker 1992).

Identifying significant correlations between dispersal distance and biological or ecological traits may also help us to estimate dispersal ability when data are scarce. Linear relationships have been established between dispersal distance and body mass (Van Vuren 1998; Sutherland et al. 2000; Jenkins et al. 2007) and dispersal distance and home range size (Bowman, Jaeger & Fahrig 2002; Ottaviani et al. 2006) for selected subsets of mammals and birds. There is good evidence linking life-history and demographic traits to dispersal distance in a variety of invertebrate species such as butterflies (Stevens, Turlure & Baguette 2010b; Sekar 2012). In vertebrates however, relationships between dispersal distance and a wider variety of life-history traits have been explored primarily for birds, where generally body mass, wingspan or wing morphology and migratory behaviour are shown to be the dominant driving forces (Paradis et al. 1998; Dawideit et al. 2009; Garrard et al. 2012).

We compiled a database of dispersal distance in mammals, spanning a broad taxonomic and geographic range to explore these questions. We used mammals as data availability across the group is relatively high, including the mammal supertree, a recent species-level phylogeny of 5020 extant mammals (Bininda-Emonds et al. 2007; Fritz, Bininda-Emonds & Purvis 2009), and a database of mammalian life-history traits (Jones et al. 2009). Mammals are also ecologically diverse and show a range of sizes and variability across axes of life history that allow robust comparisons to be made across the group. Here, we use this dispersal database to test scaling of dispersal distance across median and maximum measures and use a multipredictor phylogenetic framework to test a number of traits that are thought to predict mammalian dispersal distance. We also examine the predictive power of models and make recommendations for those wishing to predict mammalian dispersal distances when empirical evidence is lacking.

Materials and methods

Data collection

Dispersal distance data came from experimental and observational studies and from reports of individual movement distances. We excluded distances derived from genetic information. While empirical data may underestimate LDD (Kot, Lewis & van den Driessche 1996), data availability is relatively good and the measurements reflect actual dispersal distances across a variety of known locations and time-scales. We gathered data from systematic searches of relevant peer-reviewed journals including all issues of Mammalian Species (1969–2009), all issues of the Journal of Zoology from 1998 to 2009 and all issues of the Journal of Mammalogy from 2000 onwards. In addition, we queried electronic search engines from ISI Web of Knowledge, JSTOR and Google Scholar using the terms ‘natal dispersal’ and ‘dispersal distance’ and identified relevant records from the first 1000 returned results. We defined natal dispersal as ‘the movement of a propagule between its birthplace or natal group and its first breeding site or group’ and distinguished it from breeding dispersal, defined as ‘the movement between consecutive breeding sites or groups of adult breeders’ (Greenwood, Harvey & Perrins 1979; Shields 1982); we focused on natal dispersal because data availability was higher. In addition to empirical studies and observational data, we used papers reporting previous meta-analyses of dispersal data (Van Vuren 1998; Sutherland et al. 2000; Harris et al. 2009).

We combined maximum and median natal dispersal distances for juveniles, subadults and adults undertaking a permanent movement away from the natal home range. Maximum dispersal values (Dmax) were defined as the maximum recorded distance travelled by a species across all studies, and excluded any values where the maximum distance coincided with the limit of the study area as this underestimates maximum dispersal distances. Median dispersal distances (Dmed) were calculated as the median of all reported distances for a species across all studies (i.e. the median of average dispersal distances). The proportion of variance in dispersal distance attributed to the sex of the dispersing individual was small (Dmax = 0·34%, Dmed = 2·15%) compared to variation between species (Dmax = 81·16%, Dmed = 87·72%), and therefore, we combined data for both sexes (Appendix S1).
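
The two summary measures defined above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code used in the study: the record layout (`kind`, `distance_km`, `hit_study_limit`), the function name and the distances are all hypothetical.

```python
from statistics import median

def summarise_dispersal(records):
    """Return (Dmax, Dmed) in km for one species.

    Each record is a dict with hypothetical keys:
      'kind'            - 'max' or 'average' (type of reported distance)
      'distance_km'     - the reported distance
      'hit_study_limit' - True if a maximum coincided with the study-area limit
    """
    # Dmax: largest recorded maximum, skipping maxima truncated by the study area
    maxima = [r["distance_km"] for r in records
              if r["kind"] == "max" and not r["hit_study_limit"]]
    # Dmed: median of the average distances reported across studies
    averages = [r["distance_km"] for r in records if r["kind"] == "average"]
    d_max = max(maxima) if maxima else None
    d_med = median(averages) if averages else None
    return d_max, d_med

# Invented records for a single species
example_records = [
    {"kind": "max", "distance_km": 12.0, "hit_study_limit": False},
    {"kind": "max", "distance_km": 30.0, "hit_study_limit": True},   # excluded
    {"kind": "average", "distance_km": 1.5, "hit_study_limit": False},
    {"kind": "average", "distance_km": 2.5, "hit_study_limit": False},
    {"kind": "average", "distance_km": 4.0, "hit_study_limit": False},
]
d_max, d_med = summarise_dispersal(example_records)
```

In this invented case the 30 km record is dropped because it coincides with the study-area limit, so the species is summarised by the 12 km maximum and the middle of the three averages.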

From these data sets, we obtained dispersal data for 104 species (with 16 species excluded from the analyses because of the lack of matching trait data), leaving maximum natal dispersal distances (Dmax) and matching trait data for 82 species (based on 210 dispersal records) and median natal dispersal distances (Dmed) plus traits for 77 species (based on 203 dispersal records). Sixty-one species had data for both maximum and median dispersal distances. Data span 9 mammalian orders and 33 different families. Maximum dispersal distances range between 0·087 and 467 km, and median dispersal distances range between 0·0159 and 86·65 km. Although these dispersal data are predominantly from the Nearctic and Palearctic, the data set covers all terrestrial biogeographic realms except Oceania (Olson & Dinerstein 1998).

Predictor variables

We obtained data for explanatory variables from the PanTHERIA database (Jones et al. 2009) and supplemented these data with a targeted collection in the primary and grey literature for missing data values (Appendix S2). We chose body mass (g), area of home range (km2), trophic level (carnivore/herbivore/omnivore), population density (number of individuals km−2) and geographic range size (km2) based on the extent of occurrence, to use as explanatory variables in our model (Van Vuren 1998; Sutherland et al. 2000; Bowman, Jaeger & Fahrig 2002; Gaston 2003; Lester et al. 2007). Following Bielby et al. (2007), we also chose weaning age and gestation length to capture two orthogonal axes of life-history covariation across mammalian species, independently of body size. The first axis represents the timing of reproductive bouts, ranging from species that, for their body size, wean early and mature quickly to species that wean late and mature slowly. The second axis captures the trade-off between offspring size and offspring number, from species that, for their body size, produce large litters of small young after a short gestation period to species producing small litters of large young after a long gestation period (Smith & Fretwell 1974; Bielby et al. 2007). Of the traits explored by Bielby et al. (2007), weaning age and gestation length also had the advantage of being available for all the species in the study. We specifically chose a broad interpretation of the definition of species traits that included ecological ‘traits’ (Freckleton, Harvey & Pagel 2002) in the belief that traits which already reflect how species perceive and interact with their environment may correlate with dispersal distance and thus help predict dispersal for species where data are unavailable.

Explanatory variables were chosen specifically based on a priori assumptions about their relationship with dispersal ability. Body mass is thought to be a key correlate of dispersal distance in terrestrial vertebrates (Van Vuren 1998; Sutherland et al. 2000; Bowman, Jaeger & Fahrig 2002; Bowman 2003; Jenkins et al. 2007): locomotion becomes energetically cheaper as body size increases, allowing larger daily movements, thereby giving rise to larger overall dispersal distances (Schmidt-Nielsen 1984; Biewener 1989; Jetz et al. 2004). Home range area (hereafter described simply as ‘home range size’) is defined as ‘the size of the area within which everyday activities of individuals or groups (of any type) are typically restricted’ (Jones et al. 2009). Home range size captures aspects of a species’ habitat use and energy requirements, as well as broadly reflecting other traits such as dietary requirement (Bowman, Jaeger & Fahrig 2002; Bowman 2003). Trophic level can give an indication of the distribution of resources and distances required to maintain populations, as carnivores tend to range over wider areas than herbivores and omnivores (Carbone et al. 2005). Intraspecific competition is also highly likely to play a role in observed dispersal behaviour (Gadgil 1971). If, on average, populations within a species tend to live at higher densities, they may produce higher frequencies of dispersing individuals and thus show higher average dispersal distances for a given body size (Johnson & Gaines 1990). Trade-offs between competitive ability and dispersal have been demonstrated (Hughes, Hill & Dytham 2003); therefore, we expect to see negative relationships between dispersal distances and investment in parental activities such as length of weaning period (Stevens et al. 2012). Finally, dispersal ability has been shown to correlate with geographic range size in a number of taxa (e.g. Böhning-Gaese et al. 2006; but see Lester et al. 2007); therefore, this trait is included as a predictor even though the direction of causality may run counter to that for other predictor variables (Gaston 2003). LDD can provide opportunities for rapid access to newly available and climatically suitable habitats; for example, after periods of environmental change, the recolonization of habitats is typically first seen by species with good dispersal ability (Böhning-Gaese et al. 2006; Araújo et al. 2008). Additionally, species with low dispersal ability often exhibit reduced levels of gene flow between populations, which may lead to a higher probability of speciation through either peripheral isolation or vicariance; thus, species that have only recently diverged may show smaller observed range sizes (Belliure et al. 2000; Jablonski & Roy 2003; Phillimore et al. 2007).

We checked variance inflation factors (VIFs) to discount the influence of multicollinearity between predictor variables on the results: all VIFs were <10, and therefore, we included all predictors in the analyses (Table 1 in Appendix S3). We combined trophic levels for herbivores and omnivores to enable comparison with a previous study by Sutherland et al. (2000). We used a generalized additive model to identify model terms that had a significantly nonlinear relationship with dispersal (Wood 2006). Gestation length and weaning age showed nonlinear relationships with both median and maximum dispersal distances. The maximal model therefore included log body mass (g), log home range size (km2), trophic level, log population density (individuals km−2), log geographic range size (km2), log gestation length (days), log weaning age (days) and quadratic terms for weaning age and gestation length. All continuous predictor variables and dispersal distances were log-transformed for the analyses.

Relationship between median and maximum dispersal distances

We explored the relationship between maximum and median dispersal distances within the subset of 61 species where both measures were available using standardized major axis (SMA) regression, which is desirable when the aim is to summarize the relationship between two variables rather than to identify a predictive relationship (Warton et al. 2006). In order to assess the effects of clade structure and phylogeny in shaping this relationship, we calculated the SMA slope using a generalized linear mixed model fitted within a Bayesian framework by Markov chain Monte Carlo (MCMC) methods in the MCMCglmm package (Hadfield & Nakagawa 2010). The null model was fitted just with a bivariate response (Dmax and Dmed), and the SMA slope was estimated from the posterior distribution of residual covariances. We then added random effects describing the effects of taxonomic order, phylogeny and the combination of both phylogeny and order. The inclusion of taxonomic order over and above the inclusion of phylogeny was to rule out possible clade effects arising from taxonomic bias in data availability. We used the deviance information criterion (DIC) to compare model complexity and fit (Spiegelhalter et al. 2002). We specified the error structure for all models as Gaussian and used flat priors. We ran all models for 250 000 iterations with a burn-in of 3000 iterations and a thinning interval of 10 iterations. The method of dispersal data collection and the sample size may depend on the size and habits of the study species (e.g. use of mark–recapture techniques and larger sample sizes in smaller species that live at higher densities and use of radiotelemetry for large-bodied species that tend to roam widely). This methodological bias may introduce directionality in data quality between groups of species that may subsequently affect the slope of the relationship. We therefore used sample size as a model covariate to act as a proxy for study effort/method and to test for bias in the relationship.
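
For readers unfamiliar with SMA regression, the slope is the ratio of the standard deviations of the two variables, signed by their covariance. The sketch below computes this point estimate directly from paired data; it is an illustration only and is not the Bayesian mixed-model procedure described above, although the same formula can be applied to the variance and covariance terms of an estimated residual covariance matrix. The function name and the example values are hypothetical.

```python
import math
from statistics import mean

def sma_slope(x, y):
    """Standardized major axis slope and intercept of y on x.

    slope = sign(cov(x, y)) * sqrt(var(y) / var(x))
    """
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    # The common 1/(n-1) factor cancels in the variance ratio
    slope = math.copysign(math.sqrt(syy / sxx), sxy)
    intercept = my - slope * mx  # the SMA line passes through the centroid
    return slope, intercept

# Invented log-transformed values: x = log median, y = log maximum distance
log_dmed = [0.0, 1.0, 2.0, 3.0]
log_dmax = [1.0, 3.0, 5.0, 7.0]
slope, intercept = sma_slope(log_dmed, log_dmax)
```

On log–log axes, a slope above 1 means that maximum distance rises more than proportionally with median distance, whereas a slope of 1 corresponds to a constant ratio between the two measures.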

Phylogenetic signal of dispersal distance

The shared ancestry of species means that there may be similarity in trait values between close relatives, leading to statistical non-independence between data points (Felsenstein 1985). Traits, such as body mass, that are thought to influence dispersal show a strong phylogenetic signal, under a model of Brownian trait evolution (i.e. evolution occurring under a random walk process with constant trait variance over time, such that more closely related species should show more similar trait values; Freckleton, Harvey & Pagel 2002). Therefore, dispersal distance itself may show a signal. If there is limited or no evidence of dispersal distance evolving under Brownian motion, then it may be, when looking across species, that population-level drivers such as landscape structure or more labile behavioural traits are more important than species' traits in determining dispersal (Blomberg, Garland & Ives 2003). We first measured the strength of the phylogenetic signal in the dispersal distance data set and continuous explanatory variables using Pagel's λ statistic (Freckleton, Harvey & Pagel 2002; Pagel 1999), finding the maximum-likelihood (ML) value of λ for the given phylogeny and trait distribution. A λ value of 0 corresponds to no phylogenetic signal in the trait in question, and λ = 1 shows that the trait distribution corresponds to the distribution expected under a model of Brownian evolution. Phylogenetic signal in trophic level was measured using D, a measure of phylogenetic signal strength in a binary trait, based on the sum of sister-clade differences in a given phylogeny – in this case presence or absence of carnivory (Fritz & Purvis 2010). The estimated D value is tested for significant departures from 1 (random trait distribution across the selected tree) and 0 (trait distribution is clumped as expected under a model of Brownian evolution). All phylogenetic analyses were conducted using the mammalian supertree (Bininda-Emonds et al. 2007; updated by Fritz, Bininda-Emonds & Purvis 2009; see Supporting information for pruned phylogenetic trees used in analyses) and the caper package (Orme 2012) in R 2·12·0 (R Development Core Team 2010). Throughout, we used the taxonomy of Wilson & Reeder (2005) to define species.
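
To make the meaning of λ concrete, the sketch below shows the transformation that underlies it: the off-diagonal elements of the phylogenetic variance–covariance matrix (shared branch length between species pairs) are multiplied by λ, while the diagonal is left unchanged. This is an illustration of the transformation only, not the ML estimation performed in caper; the function name and the three-species matrix are hypothetical.

```python
def lambda_transform(vcv, lam):
    """Scale shared-history (off-diagonal) covariances by Pagel's lambda.

    vcv: square list-of-lists variance-covariance matrix implied by a tree
    lam: lambda, usually between 0 (no signal) and 1 (Brownian expectation)
    """
    n = len(vcv)
    return [[vcv[i][j] if i == j else lam * vcv[i][j] for j in range(n)]
            for i in range(n)]

# Invented tree of depth 1: species A and B share 0.6 of their history,
# species C shares none with either
example_vcv = [
    [1.0, 0.6, 0.0],
    [0.6, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
star_like = lambda_transform(example_vcv, 0.0)   # species treated as independent
brownian = lambda_transform(example_vcv, 1.0)    # tree structure kept as is
partial = lambda_transform(example_vcv, 0.74)    # intermediate signal
```

With λ = 0 the matrix becomes diagonal, so species are treated as independent data points; with λ = 1 the covariances are those expected from the tree under Brownian motion.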

Multipredictor models, model selection and averaging

Preliminary analyses indicated that many combinations of predictor variables are capable of producing highly explanatory models. Here, we are interested in exploring which combinations of variables are most useful in predicting dispersal, but given the potential for complex interactions between variables and many models of similar explanatory power, use of a stepwise deletion process may incorrectly remove significant terms from the model. We therefore used multimodel inference to naively assess support for a set of candidate models, generate a confidence set of models based on the relative weight of evidence for each model and finally calculate a weighted average of parameter estimates (Burnham & Anderson 2004; Johnson & Omland 2004). Highly weighted individual models may subsequently be recommended for use in predicting dispersal distance, and model-averaged coefficients can be used to assess the directionality of relationships.

We assessed the 287 possible simplifications of the maximal model described previously. This is a disproportionately large number of models to fit to relatively small data sets (Dmax: n = 82; Dmed: n = 77; Burnham & Anderson 2002). However, the prior selection of traits with existing biological relevance, lack of a priori knowledge of which specific combinations should be chosen and existing studies with competing findings meant that fitting all combinations was the only way to test all possible correlates of dispersal distance. We fitted models using phylogenetic generalized linear models (PGLM; Freckleton, Harvey & Pagel 2002), which incorporate the expected covariances between species, calculated using the ML value of λ, into the error term of the model. Quadratic terms for gestation length and weaning age were only included in conjunction with their linear terms. We obtained Akaike Information Criterion values, using the AICc correction for small sample size, and used these to calculate Akaike weights for each model. The highest weighted models that summed to an Akaike weight >0·95 (the 95% confidence interval set; Burnham & Anderson 2002) were selected and used to calculate model-averaged parameter estimates and standard errors across all models using the MuMIn package in r (Barton 2010). Predictors were averaged only over the models in which they appeared. Finally, we wanted to test for the presence of particular combinations of variables consistently occurring together in highly weighted models and subsequently identify axes of explanatory power in the predictor traits. Therefore, we calculated variable weights across models for each explanatory term, and from this, we derived the relative variable importance for all pairwise combinations of explanatory traits (Burnham & Anderson 2002).
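
The size of the candidate set and the weighting scheme can be illustrated as follows. This is a sketch under stated assumptions, not the MuMIn workflow itself: the term names are hypothetical labels, and the count of 287 is reproduced here on the assumption that it refers to every model other than the maximal one, with the intercept-only model included. The AICc and Akaike weight formulas follow their standard definitions.

```python
import math
from itertools import product

# Hypothetical labels for the five terms that are simply present or absent
SIMPLE_TERMS = ["body_mass", "home_range", "trophic_level",
                "pop_density", "geo_range"]
# Terms that may be absent, linear, or linear plus quadratic
CURVED_TERMS = {"gestation": ["gestation", "gestation^2"],
                "weaning": ["weaning", "weaning^2"]}

def candidate_models():
    """Enumerate term sets; quadratic terms never appear without linear ones."""
    models = []
    for keep in product([False, True], repeat=len(SIMPLE_TERMS)):
        base = [t for t, k in zip(SIMPLE_TERMS, keep) if k]
        for levels in product(range(3), repeat=len(CURVED_TERMS)):
            terms = list(base)
            for (lin, quad), level in zip(CURVED_TERMS.values(), levels):
                terms += [lin, quad][:level]  # level 0: none, 1: linear, 2: both
            models.append(tuple(terms))
    return models

def aicc(log_lik, k, n):
    """AIC with the small-sample correction (k parameters, n observations)."""
    return -2.0 * log_lik + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)

def akaike_weights(aicc_values):
    """Relative likelihood of each model, normalised to sum to 1."""
    best = min(aicc_values)
    rel = [math.exp(-0.5 * (a - best)) for a in aicc_values]
    total = sum(rel)
    return [r / total for r in rel]

def confidence_set(weights, level=0.95):
    """Indices of the highest-weighted models whose summed weight exceeds level."""
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    chosen, cumulative = [], 0.0
    for i in order:
        chosen.append(i)
        cumulative += weights[i]
        if cumulative > level:
            break
    return chosen

all_models = candidate_models()
n_terms_maximal = len(SIMPLE_TERMS) + sum(len(v) for v in CURVED_TERMS.values())
n_simplifications = sum(1 for m in all_models if len(m) < n_terms_maximal)

# Invented AICc values for three models
example_weights = akaike_weights([100.0, 102.0, 110.0])
example_set = confidence_set(example_weights)
```

Five present-or-absent terms give 2^5 = 32 combinations, and each of the two curved terms has three states, so the full set contains 32 × 3 × 3 = 288 models, of which 287 are simplifications of the maximal model.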


To test the ability of our models to correctly predict dispersal distances beyond the set of species for which we have information, we re-ran the models using k-fold cross-validation to obtain comparisons between estimated and observed dispersal distances (Arlot & Celisse 2010). We repeated model averaging using PGLM across the 287 possible simplifications of the maximal model within 100 separate three-fold partitions of the data: two-thirds of the data were used to fit models and one-third retained to test model predictions. Lambda was constrained to the ML value derived from the same model with the full data set fitted, as computing λ for each subset of the variance–covariance matrix was computationally intensive. To examine the predictive power of each model, and the ability of models to correctly predict dispersal distances for individual species, we calculated the residual variance of predictions from empirically observed dispersal distances for the retained test data. This residual variance was then averaged separately across models and species. Weighted means were calculated for species using the overall model Akaike weight. We used Kruskal–Wallis tests on the deviation between observed values and weighted predictions across model to look for taxonomic differences at the order level in both the direction and variation (using absolute deviation) of species predictions.
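
The repeated three-fold partitioning can be sketched as below. This is a minimal illustration, assuming that every fold of each partition is held out in turn; the text does not state this explicitly, and the function name, seed and use of species indices are hypothetical. Model fitting and prediction are omitted.

```python
import random

def repeated_kfold(n_items, k=3, repeats=100, seed=0):
    """Yield (train, test) index lists for repeated random k-fold partitions."""
    rng = random.Random(seed)  # fixed seed so the partitions are reproducible
    for _ in range(repeats):
        idx = list(range(n_items))
        rng.shuffle(idx)
        # Deal the shuffled indices into k folds of near-equal size
        folds = [idx[i::k] for i in range(k)]
        for test in folds:
            held_out = set(test)
            train = [i for i in idx if i not in held_out]
            yield train, test

# 82 species with maximum dispersal distances, as in the Dmax data set
splits = list(repeated_kfold(82))
```

Each split fits on roughly two-thirds of the species and predicts the remaining third, so every species is predicted once per partition from a model that never saw it.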


Results

Phylogenetic signal of dispersal distance

Maximum and median dispersal distances show strong phylogenetic signal: ML values of λ for Dmax = 0·74 (95% CI: 0·512–0·883) and for Dmed = 0·84 (95% CI: 0·663–0·937). The phylogenetic signal in continuous explanatory variables was >0·81 for all traits except geographic range size which exhibits a low phylogenetic signal with wide confidence limits (Table 2 in Appendix S3). Trophic level also showed high phylogenetic signal with a distribution not significantly different from that expected under a model of Brownian trait evolution. Multipredictor models showed extremely low phylogenetic signal across the 95% CI set with a mean lambda across models for Dmax of 0·002 (95% CI: 0–0·006) and for Dmed of 0·203 (95% CI: 0·171–0·235), possibly due to the inclusion of geographic range size as a highly explanatory term within the models.

Relationship between median and maximum dispersal distances

Without accounting for the relatedness of species, the SMA slope of the relationship between maximum and median dispersal distances was found to be not significantly different from 1. However, DIC values for more complex models describing the relationship indicate that the best model includes phylogeny fitted as a random effect (Fig. 1). Once the effects of phylogenetic relatedness are accounted for, the slope of the SMA regression on log–log axes is 1·432 (95% CI: 1·118–1·818): maximum dispersal therefore increases disproportionately with increasing median dispersal distance. No significant effect of sample size was found.

Figure 1.

Main plot: relationship between loge median and maximum dispersal for 61 species, showing the 1 : 1 slope (dashed). Sub plots show the posterior distribution of slope estimates and deviance information criterion (DIC) estimates for Markov chain Monte Carlo bivariate linear mixed models of maximum and median dispersal distance for a null model (a) and models with random effects of order (b), phylogeny (c) and both order and phylogeny (d). The posterior mode (solid line) and 95% confidence estimates (dashed lines) are shown.

Multipredictor models, model selection and averaging

The best supported multipredictor models for both maximum and median dispersal distances had low Akaike weight (Dmax = 0·092, Dmed = 0·127, Table 3 in Appendix S3), indicating no overwhelming support for any particular model (Johnson & Omland 2004). Furthermore, a large number of models are included in both 95% CI sets of models: 78 for maximum dispersal distance and 74 for median dispersal distance. The explanatory power of all models in both 95% CI sets was high. Models predicting maximum dispersal distance had a mean-adjusted r2 of 0·717 (95% CI: 0·712–0·721), while models predicting median dispersal distance had a mean-adjusted r2 of 0·818 (95% CI: 0·816–0·820). A list of all model coefficients for Akaike weights >0·02 can be found in Table 3 in Appendix S3.

For maximum dispersal, model-averaged variable weights (Aw, Table 1) support geographic range size, body mass and home range size as the three most important terms (Aw > 0·79), and all three terms show significant positive slopes when averaged across the model set (Table 1). Weaning age, its quadratic term and population density also show relatively high variable weights (Aw > 0·45, Table 1); however, the slopes of their relationships are not estimated precisely, with wide confidence limits indicating variability around model predictions. For median dispersal, variable weights support home range size as the most important variable (Aw > 0·90), with body mass, population density, weaning age and its quadratic term also showing high weightings (Aw ≥ 0·58). Variable weights are low (Aw < 0·26) for all other continuous terms, and model-averaged confidence limits of estimated slopes include zero for all variables apart from home range size (Table 1). Trophic level shows low variable weight in both models (Aw < 0·26), and there is no clear pattern between variables: carnivores appear to have higher maximum dispersal but lower median dispersal in comparison with herbivores and omnivores (Table 1). Differences in model-averaged intercept values between trophic levels were not significant for models of maximum or median dispersal distance. There is no evidence of consistent aggregation between pairs of variables to form alternative model subsets (Fig. 2, Table 4 in Appendix S3): the most strongly supported variables form a single cluster with highest weights in all models in which they co-occur.

Figure 2.

Summed Akaike weights for all pairs of variables across the whole model set for (a) maximum dispersal distance and (b) median dispersal distance. Darker, heavier lines indicate higher weighting in models in which the terms occur together (Table 4 in Appendix S3). BM, body mass; HR, home range size; GR, geographic range size; TL, trophic level; PD, population density; GL, gestation length; WA, weaning age.

Table 1. Summed Akaike weight (∑Aw) and weighted average coefficient for each model term across the 95% confidence interval (CI) set for (a) maximum dispersal distance models and (b) median dispersal distance models

| Term | (a) ∑Aw | (a) Model-averaged coefficient | (b) ∑Aw | (b) Model-averaged coefficient |
| --- | --- | --- | --- | --- |
| (Intercept) | | −1·153 (−13·620/11·314) | | 1·100 (−9·282/11·481) |
| Trophic level | 0·257 | | 0·176 | |
| Herbivore and omnivore | | −0·895 (−11·990/10·199) | | 0·806 (−8·799/10·410) |
| Carnivore | | −0·433 (−8·259/7·393) | | 0·734 (−4·777/6·244) |
| Adult body mass (g) | 0·873 | 0·315 (0·018/0·613)* | 0·585 | 0·178 (−0·069/0·425) |
| Home range size (km2) | 0·794 | 0·220 (0·009/0·431)* | 0·949 | 0·328 (0·117/0·538)* |
| Population density (individuals km−2) | 0·460 | −0·137 (−0·337/0·064) | 0·703 | −0·153 (−0·340/0·034) |
| Geographic range (km2) | 0·925 | 0·252 (0·029/0·474)* | 0·232 | 0·062 (−0·118/0·242) |
| Gestation length (days) | 0·319 | | 0·261 | |
| Without quadratic term | | 0·028 (−0·705/0·761) | | −0·172 (−0·849/0·504) |
| With quadratic term | | 4·235 (−1·351/9·821) | | 2·655 (−3·474/8·784) |
| Weaning age (days) | 0·576 | | 0·723 | |
| Without quadratic term | | −0·207 (−0·773/0·358) | | −0·025 (−0·507/0·458) |
| With quadratic term | | 3·202 (−0·505/6·909) | | 3·716 (0·236/7·195)* |
| Gestation length2 (days) | 0·183 | −0·505 (−1·164/0·154) | 0·076 | −0·334 (−1·047/0·380) |
| Weaning age2 (days) | 0·453 | −0·372 (−0·774/0·030) | 0·686 | −0·406 (−0·783/−0·029)* |

Confidence limits for coefficient estimates at 95% CI set are given in brackets. Average coefficients with consistently positive or negative slopes indicated by confidence limits are marked with an asterisk (*).


Variation in predictive ability was observed depending on both the model and the species for which these data were being estimated (Figs 3 and 4). Across species, both the accuracy and precision of estimated dispersal distance were related to the Akaike weight of the 287 models from the full data set. Highly weighted models had no systematic bias in dispersal estimates, with mean differences between observed and predicted values across species close to zero (Dmax = 0·00013 log metres; Dmed = 0·00014 log metres, Table 3 in Appendix S3). The standard error in mean differences between observed and predicted values across species is also low (Dmax = 0·164; Dmed = 0·153; Fig. 3, Table 3 in Appendix S3). Therefore, models and terms identified in the main analysis as highly weighted are the most likely to provide robust estimates of dispersal distance in cross-species analyses where data are lacking. Species varied widely in the average accuracy of predictions (Fig. 4, Table 5 in Appendix S3). For models of maximum dispersal distance, neither direction (χ2 = 6·97, d.f. = 8, P = 0·54) nor variation (χ2 = 10·87, d.f. = 8, P = 0·21) in departures from observed values was predicted by taxonomic order. For models of median dispersal distance, average variation in departure from empirical data was not predicted by taxonomic order (χ2 = 8·18, d.f. = 7, P = 0·32); however, mean direction of departures (χ2 = 14·71, d.f. = 7, P = 0·04) showed weakly significant differences between orders, possibly caused by an underprediction of dispersal distances for species of Diprotodontia (the marsupial taxonomic order containing kangaroos and possums).

Figure 3.

Average departure from empirically derived dispersal distance for (a) maximum dispersal distance and (b) median dispersal distance against the standard error in predictions across 100 runs – each point represents a single model.

Figure 4.

Within-species variation in predictive accuracy for (a) maximum dispersal distance and (b) median dispersal distance. Each grey box represents the interquartile range of residuals from model predictions and the average residual value for one species. Bold horizontal lines are weighted residual means across all model predictions. Thin horizontal lines are unweighted residual means. Species names and mean values are given in Table 5 in Appendix S3.


Discussion

We found that no single model performed best in describing either maximum or median dispersal distances. Support for a wide range of models across all predictor variables shows that a number of different aspects of a species' life history and ecology can be used to predict dispersal distances. This support for many model combinations may reflect the complex nature of the evolution and expression of dispersal as one part of a co-evolving suite of interlinked behavioural, physiological and ecological traits (Ferriere et al. 2000). For example, if selection is to favour dispersal, then individuals must possess the physical ability to disperse and to survive the dispersal process, but they must also be able to successfully reproduce upon arrival in a suitable environment (Johnson & Gaines 1990).

Supporting evidence for multiple correlates of dispersal has recently been demonstrated by Lyons, Wagner & Dzikiewicz (2010). Using data from the fossil record, they found significant relationships between several life-history traits, such as gestation length and maximum life span, and the distance moved by the range centroid of North American mammals during the glacial–interglacial cycles of the Late Pleistocene epoch. In addition, Stevens et al. (2012) recently found correlations between dispersal ability in butterflies and a suite of demographic traits. Therefore, the correlations between multiple life-history parameters and dispersal distance found in this study are not unexpected, but the mechanisms behind the co-evolution of ‘dispersive traits’ are unclear and beyond the remit of this study. The traits used in our models vary in the way that they are hypothesized to interact with dispersal: traits such as large body size may aid the dispersal process by lowering the energetic cost of dispersal and decreasing predation risk, some traits such as population density are thought to be drivers of dispersal, and some traits such as home range size and age at sexual maturity may be a reflection of dispersal and colonization ability. Thus, our study correlates traits to allow for prediction rather than identifying mechanisms for dispersal.

Although no single model outperformed all others, body mass and home range size consistently emerged as important predictors of dispersal ability in accordance with findings of previous studies (Van Vuren 1998; Sutherland et al. 2000; Bowman, Jaeger & Fahrig 2002; Dawideit et al. 2009; Sekar 2012). We do not find a strong effect of trophic level (cf. Sutherland et al. 2000). This is likely due to the inclusion of phylogeny in the models. As trophic level shows strong phylogenetic signal, studies that fail to account for the non-independence of species may confound results by introducing pseudoreplication into the analyses; for example, species with smaller home range size tend to be herbivores and also show lower dispersal distance (Purvis et al. 2005). In single predictor models, home range size was a better predictor of dispersal distance than body mass (home range size: Dmed r2 = 0·798, Dmax r2 = 0·672; body mass: Dmed r2 = 0·307, Dmax r2 = 0·373). However, the amount of variance in maximum dispersal distance explained by home range size was still lower than that found in a previous study by Bowman, Jaeger & Fahrig (2002). The high weighting of these terms may reflect the ability of species to cross landscapes, the spatial scale at which landscape is perceived and/or energetic constraints to dispersal distance imposed by size, as these are factors that are mediated by body mass and reflected in home range size and as such make these variables useful in predicting dispersal distance (Holt 2003; Jetz et al. 2004; Lester et al. 2007). In models of median dispersal distance, weaning age and its quadratic term were also found to show a significant and consistent relationship across the model set. The prominence of the quadratic term hints at a possible trade-off between life-history specialization and dispersal ability, with those at the extreme ends of life-history variation showing reduced expression of dispersal ability.
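The idea of term weighting across a model set can be illustrated with a short sketch. One common approach is to convert information-criterion scores to Akaike weights and then sum the weights of every model containing a given term. Whether exactly this procedure underlies the weightings discussed here is an assumption, and the model set and scores below are hypothetical.

```python
import math


def akaike_weights(scores):
    """Convert information-criterion scores (lower is better) to weights."""
    best = min(scores)
    # Relative likelihood of each model given its difference from the best
    rel = [math.exp(-0.5 * (s - best)) for s in scores]
    total = sum(rel)
    return [r / total for r in rel]


def term_importance(models, weights):
    """Sum the weights of all models in which each term appears."""
    importance = {}
    for terms, w in zip(models, weights):
        for term in terms:
            importance[term] = importance.get(term, 0.0) + w
    return importance


# Hypothetical three-model set and hypothetical scores
models = [("home_range",), ("body_mass",), ("home_range", "body_mass")]
scores = [100.0, 104.0, 101.0]

weights = akaike_weights(scores)
importance = term_importance(models, weights)
for term, value in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {value:.3f}")
```

Under this scheme a term can rank highly even when no single model dominates, which is consistent with broad support across many model combinations.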

Interestingly, geographic range size was found to be the term with highest model weighting across models of maximum dispersal distance but among the lowest weighted in models of median dispersal distance. This may be related to the capacity of species for LDD, a trait that can facilitate access to new habitats which subsequently leads to range expansion and hence observed range size (Kot, Lewis & van den Driessche 1996; Gaston 2003; Petrovskii & Morozov 2009). The contrast in predictive power of geographic range between models of maximum and median dispersal distances indicates that different parts of the dispersal distance distribution may have evolved independently and under different selection pressures (Stevens et al. 2012). For example, short-distance dispersal undertaken by large proportions of the population may be selected for inbreeding avoidance, whereas longer movement distances undertaken by relatively few individuals may be selected for based on the benefit of finding empty territories (Lambin, Aars & Piertney 2001; Higgins, Nathan & Cain 2003; Ronce 2007). The hypothesis that median and maximum dispersal distances may be under differing selection processes is given weight by the high variable weightings of population density and weaning age in models of maximum dispersal distance, a finding not echoed for models of median dispersal distance. Although the confidence intervals of the slopes for these relationships are broad, they hint at some interesting underlying drivers of dispersal. In the case of weaning age, this may be linked to the importance of competitive ability upon arrival in new habitats (Burton, Phillips & Travis 2010).

Both maximum and median dispersal distances show evidence of strong phylogenetic conservatism. In addition, the relationship between these measures shows phylogenetic structure: without accounting for phylogeny, the maximum and median dispersal distances are proportional, but the inclusion of phylogeny shows that maximum dispersal distance increases disproportionately with median dispersal distance. Phylogenetic information may therefore be used to guide predictions of dispersal distances for species where data are lacking but where information for closely related species is available (Fisher & Owens 2004; Thomas 2008) and the inclusion of phylogeny in any future comparative models of dispersal ability is recommended.

The strength of this phylogenetic signal is perhaps more than would be expected based on studies examining the evolution of dispersal kernels across populations of the same species (Murrell, Travis & Dytham 2002; Burton, Phillips & Travis 2010; Travis, Smith & Ranwala 2010) and therefore likely masks within-species flexibility in response to the often diverse landscapes and locations within the range which impose differing dispersal costs. Comparative analyses of the shape of the dispersal kernel might therefore reveal greater phylogenetic lability (Simmons & Thomas 2004). For example, evidence of changes in both dispersal kernel shape and frequency of ‘disperser’ morphs has been found in cane toads (Phillips et al. 2008), wing-dimorphic bush crickets (Simmons & Thomas 2004) and European butterflies (Stevens, Pavoine & Baguette 2010a).

Cross-validation of the predictive accuracy of models showed that highly weighted models could also predict dispersal distances to varying extents for the species excluded from the data set for testing purposes. Therefore, with knowledge of relatively few life-history and ecological traits, predictions of dispersal distance can be made where empirical data are lacking, given the set of preferred models presented in this study. The ability of models to accurately predict observed dispersal distances varies across species (Fig. 4, Table 5 in Appendix S3). The cause of this variation is not immediately obvious. Overpredictions may (among a number of reasons) simply be due to a lack of data (i.e. we have not yet captured the ‘true’ maximum or median dispersal distance of the species) or the location of the species in patchy or highly modified landscapes. Underpredictions also provide an interesting avenue for exploration as species are dispersing farther than would be expected given their traits. These include wide-ranging species with a tendency to become classified as invasive outside the native range (e.g. Sciurus carolinensis, the eastern grey squirrel in the UK, and Vulpes vulpes, the red fox in Australia).

The identification of a relationship between median and maximum dispersal distances also provides the opportunity to predict longer distance movements from the more readily observed and available shorter dispersal distances. There is, however, a degree of scatter around the relationship such that relationships between median and maximum dispersal distances may vary over several orders of magnitude from the line of best fit. While comparative predictions are useful at broad taxonomic levels across many groups, further work must be undertaken to assess the quality and quantity of data necessary to make accurate predictions for specific species in a landscape context. The relationship of average dispersal distance with ‘true’ LDD is unknown as capturing the tail of the dispersal kernel is extremely difficult, especially from observational studies (Koenig, VanVuren & Hooge 1996). Tests that combine indirect genetic data and data from direct or observational studies may help improve estimates of LDD and have been employed at the local scale (Vandewoestijne & Baguette 2004) but are currently unavailable at the macroscale required for this study (Ferriere et al. 2000; Stevens et al. 2010b).

One criticism levelled at direct measures of dispersal distance is that they frequently fail to capture rare LDD events (Koenig, VanVuren & Hooge 1996; Kot, Lewis & van den Driessche 1996) and that using gene flow measures provides a more robust way of capturing the tail of the dispersal kernel. Indirect measures of gene flow, such as FST, that quantify variance in allele frequency between populations can be combined with measures of distance between sampled populations in pairwise comparisons to create models of isolation by distance (IBD; Wright 1965; Slatkin 1987; Rousset 1997) from which dispersal rates can be inferred. However, such models are only effective at the spatial scale at which the samples are taken. Ascertaining estimates for LDD can be confounded when two populations separated by great distances both receive migrants from the same intervening populations (Lester et al. 2007). Although these distantly separated populations may exchange no individuals, they show low genetic variance owing to the homogenizing effect of migration from the same source, which can lead to an incorrect inference of high migration distances (Whitlock & McCauley 1999). Incomplete knowledge of the spatial structure of habitats separating populations may also confound IBD estimates as many of the common statistical tests are based on a non-spatial null model (Raybould et al. 2002; Meirmans 2012). These caveats are not insurmountable, and gene flow measures have much to offer the study of dispersal (Waser & Hadfield 2011); however, these problems mean that the amount of data available to create robust IBD models is limited, and therefore, the number of species which can be studied is relatively small.
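As an illustration of how a dispersal parameter can be inferred from IBD, the sketch below follows the two-dimensional regression approach associated with Rousset (1997): pairwise FST/(1 − FST) is regressed on the natural logarithm of geographic distance, and the slope b is interpreted as approximately 1/(4πDσ²), where D is effective population density and σ² is the mean squared axial parent-offspring distance. No such analysis was carried out in this study; the function names, the FST values and the density are hypothetical, and the caveats above about spatial scale and shared migrant sources apply in full.

```python
import math


def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def sigma_from_ibd(distances_m, fst, density_per_m2):
    """Estimate sigma (metres) from pairwise FST and distances.

    Assumes a two-dimensional habitat and a known effective density,
    both of which are strong assumptions in practice.
    """
    x = [math.log(d) for d in distances_m]
    y = [f / (1.0 - f) for f in fst]
    slope = ols_slope(x, y)
    if slope <= 0:
        raise ValueError("no positive isolation-by-distance slope")
    return math.sqrt(1.0 / (4.0 * math.pi * density_per_m2 * slope))


# Hypothetical pairwise comparisons (distance in metres, FST)
distances_m = [100.0, 1000.0, 10000.0]
fst = [0.044, 0.065, 0.084]
density_per_m2 = 1e-4  # hypothetical: 100 individuals per square kilometre

sigma = sigma_from_ibd(distances_m, fst, density_per_m2)
print(f"estimated sigma = {sigma:.0f} m")
```

The estimate depends directly on the assumed density, so uncertainty in D propagates into σ; this is one practical reason why robust IBD-based dispersal estimates exist for relatively few species.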

Combining predictions of dispersal distance with generation time can give realistic estimates of how far and how quickly species can track changing climates and move between protected areas. Some small-bodied species, however, may offset low dispersal distances with fast generation times which may enable them to keep pace with climate change or adapt to changing conditions. Cross-validation of models identified species that were consistently over- or underpredicted in their dispersal ability (Fig. 4, Table 5 in Appendix S3). This may be a true reflection of a mismatch between vagility and life history, or may result from the failure of empirical studies to capture the true maximum dispersal distance of the species or from landscape effects modifying the dispersal distances seen in empirical studies. Species that fail to disperse as far as would be expected given their life history warrant further field research to identify those that are truly limited in dispersal capability. Although there is no clear taxonomic pattern to which species are overpredicted, it can be inferred that these species may be at higher risk from the impacts of climate change, as they show lower vagility than organisms of similar size and life-history speed. Finally, species range maps can be combined with information on the speed of climate change in that area to identify species that are poor dispersers given their size, range and life history and also inhabit areas likely to experience high velocity of climate change (Loarie et al. 2009). Several plant-focussed SDMs have started to incorporate migration limits and dispersal estimates into models (Engler & Guisan 2009; Thuiller et al. 2009). Our study allows these techniques to be extended to mammals where a better understanding of how far species can disperse could help us to understand drivers of extinction risk and improve conservation planning, especially when predicting the impacts of climate change.
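The arithmetic behind combining dispersal distance with generation time can be sketched as follows. The calculation assumes one natal dispersal event per generation and treats the resulting rate as a rough upper bound on range shift; the function names, species values and climate velocity are hypothetical and are not drawn from the data set or from Loarie et al. (2009).

```python
def dispersal_rate_km_per_year(distance_km_per_generation, generation_time_years):
    """Potential rate of spread, assuming one dispersal event per generation."""
    if generation_time_years <= 0:
        raise ValueError("generation time must be positive")
    return distance_km_per_generation / generation_time_years


def can_keep_pace(rate_km_per_year, climate_velocity_km_per_year):
    """True if the potential dispersal rate meets or exceeds climate velocity."""
    return rate_km_per_year >= climate_velocity_km_per_year


# Hypothetical species: (predicted dispersal distance in km, generation time in years)
species = {
    "small-bodied, fast generation": (0.5, 0.5),
    "large-bodied, slow generation": (20.0, 10.0),
    "small-bodied, slow generation": (0.5, 2.0),
}
climate_velocity = 0.5  # hypothetical local velocity in km per year

for name, (distance_km, generation_years) in species.items():
    rate = dispersal_rate_km_per_year(distance_km, generation_years)
    print(f"{name}: {rate:.2f} km/yr, keeps pace: {can_keep_pace(rate, climate_velocity)}")
```

In this illustration the first species offsets a short dispersal distance with a short generation time, whereas the third does not, which mirrors the point that generation time can compensate for low dispersal distance.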


We are grateful to Lynsey McInnes, Alex Pigot, Yael Kisel, Albert Phillimore, Georgina Mace, Rob Ewers, Uta Berger and three unknown reviewers for helpful comments on the manuscript. We thank Kate Jones and all PanTHERIA members for access to the PanTHERIA life-history trait database. SW was funded by a NERC PhD studentship. CDLO was funded by an RCUK lectureship.