Ronald Bialozyt, Department of Conservation Biology, University of Marburg, Karl-von-Frisch-Strasse, D-35037 Marburg, Germany. Tel.: +49 6421 28 24878; fax: +49 6421 28 26588; e-mail: firstname.lastname@example.org
Currently many attempts are made to reconstruct the colonization history of plant species after the last ice age. A surprising finding is that during the colonization phase genetic diversity did not decrease as much as expected. In this paper we examine whether long distance seed dispersal events could play a role in the unexpected maintenance of genetic diversity during range expansion. This study is based on simulations carried out with a maternally inherited haploid locus using a cellular automaton. The simulations reveal a close relationship between the frequency of long distance seed dispersal events and the amount of genetic diversity preserved during colonization. In particular, when the colonized region is narrow, a complete loss of genetic diversity results from the occurrence of very rare long distance dispersal (LDD) events. We call this phenomenon the ‘embolism effect’. However, slightly higher rates of LDD events reverse this effect, up to the point that diversity is better preserved than in a pure diffusion model. This phenomenon is linked to the reorganization of the genetic structure during colonization and is called the ‘reshuffling effect’.
Climatic shifts have occurred repeatedly in the temperate zone, resulting in major expansions and reductions of the range of species living in these areas (Hewitt, 2000). Plants and animals that managed to cope with such challenges generally possess particular attributes, such as high dispersal ability (Dynesius & Jansson, 2000) and high levels of genetic diversity within populations (Hamrick & Godt, 1989), which can be a source for further adaptation to new environments. Since each range expansion should in principle result in the loss of diversity at the leading edge because of founder events (Hewitt, 1993), maintenance of fair amounts of genetic diversity in newly colonized regions represents a major paradox (Petit et al., 2001; Allendorf & Lundquist, 2003; Petit et al., 2004).
Long distance dispersal (LDD) is one mechanism involved in range shifts. Its importance was anticipated quite early, but its consequences are only now being clarified (e.g. Reid, 1899; Skellam, 1951; Clark et al., 1998). What prompted researchers to incorporate LDD in models of range expansion was the need to account for the high speed with which colonization of open habitats took place after the last ice ages. For example, Huntley & Birks (1983) derived from the pollen record a colonization speed of up to 500 m year⁻¹ for oaks in Europe. Given the high age of sexual maturity of these species (∼30 years), this implies that much longer jumps are made within each generation.
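The size of the implied jumps follows from simple arithmetic on the two figures quoted above. This is a rough sketch that takes the age of sexual maturity (about 30 years) as the generation time:

```latex
\[
500\ \mathrm{m\,year^{-1}} \times 30\ \mathrm{years\ per\ generation}
= 15\,000\ \mathrm{m\ per\ generation}
= 15\ \mathrm{km\ per\ generation}
\]
```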
The literature that deals with LDD events is mostly based on numerical simulations, because these rare events are both very difficult to observe empirically and to analyse analytically (e.g. Cain et al., 2000; Nathan et al., 2003). For instance, Ibrahim et al. (1996) compared the spatial genetic patterns generated by three different forms of dispersal during range expansion. One mechanism they modelled and analysed was leptokurtic dispersal. They obtained this dispersal function by summing up two normal distributions with different standard deviations (SD). The second normal distribution has a large SD and represents the LDD events, whereas the first normal distribution accounts for local dispersal measured by empirical studies, hence the term ‘stratified’ used to describe dispersal in such models.
Le Corre et al. (1997) also used this approach and showed that the amount of LDD events largely controls the colonization speed. Very low frequencies of LDD events were sufficient to lead to colonization speeds comparable to those deduced from the pollen record. The speed of colonization was found to depend on both the frequency and the distance (SD) of LDD events.
Apart from their role in boosting colonization speed, LDD events also determine the spatial genetic structure that establishes during colonization and persists further on (Ibrahim et al., 1996; Le Corre et al., 1997; Davies et al., 2004). For instance, spatial genetic patterns attributed to colonization events have been described at a regional scale for maternally inherited markers in oaks in Western France (Petit et al., 1997). Patches fixed for a single chloroplast variant form a kind of mosaic over the landscape. Le Corre et al. (1997) were able to simulate such patterns by incorporating LDD events, thereby providing a likely mechanism for the emergence and temporal persistence of this genetic pattern within the landscape.
Interestingly, in these simulations, under certain conditions, genetic diversity was maintained at the regional scale over long distances during colonization (Le Corre et al., 1997; Davies et al., 2004). This is intriguing in view of the well-known depressing effect of founder events on genetic diversity. So far, a detailed investigation of the effect of LDD events on genetic diversity is lacking. In this paper, we examine under which specific conditions a stratified dispersal mode allows genetic diversity to persist during colonization, compared with situations where only local diffusion takes place.
In order to investigate this phenomenon, we opted for a mixed simulation approach of the dispersal process. We use a frequency-based model for the diffusion part and an individual-based model for simulating the LDD events. This makes our simulation approach more flexible and less computer-intensive than pure individual-based models (e.g. Le Corre et al., 1997; Davies et al., 2004). Our model relies on a cellular automaton (CA) framework. CA theory is a powerful tool to understand natural population dynamics, because simulations are spatially explicit: every population lies on a specific grid point and the notion of space is embodied in the interaction rules (e.g. Boersma et al., 1991; Wissel, 1992). The dynamics, defined by the interaction rules, result in spatial patterns that can be directly compared with empirically observed patterns. The type of pattern generated during the simulation is the outcome of the dynamics of the CA and not an intrinsic preprogrammed property.
The aim of this study is to analyse the influence of varying frequencies of leptokurtic dispersal events and of variable population sizes (i) on the speed of colonization, (ii) on the maintenance of genetic diversity (including both gene diversity, HT, and allelic richness, AR) and (iii) on differentiation (GST), using a spatially explicit simulation framework.
Description of the simulation model
For our simulations we used a two-dimensional CA model. CAs are based on a regularly spaced grid, which defines the space for the simulations. All grid cells are equal with respect to their state space, neighbourhood template and transition function (interaction rules). The state space of a cell in our model is defined by the demographic parameters of the corresponding population. Each cell has a well-defined population size and has specific haplotype frequencies. The cells change their states according to the transition function, which depends on their own state and the state of the cells in the defined neighbourhood.
The transition function is generated through the application of population genetic and demographic processes. Both local dispersal and LDD are considered. To model local dispersal we used the classical Moore neighbourhood (Adamatzky, 1994; Wolfram, 1994), in which only the eight adjacent cells influence the dynamics of the central cell. The values of the dispersal matrix for local dispersal are obtained by applying a finer grid onto the Moore neighbourhood and estimating probabilities of dispersal within this refined grid using a two-dimensional Gaussian probability function (with a standard deviation of SD1), and then summing up all probabilities for each of the nine cells (including the central cell itself). Within this part of the model, only haplotype frequencies of the populations (cells) are stored.
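The construction of the local dispersal matrix can be sketched as follows. This is a minimal illustration, not the original programme: the function name, the value of the standard deviation, the number of subdivisions, the assumption that seeds start at the centre of the source cell and the final renormalisation over the nine cells are all assumptions made for the example.

```python
import math

def gaussian_cell_kernel(sd_cells, radius=1, subdivisions=20):
    """Discretise an isotropic 2-D Gaussian onto a (2*radius+1)^2 block of cells.

    sd_cells is the standard deviation in units of cell width (hypothetical).
    Each cell is split into subdivisions x subdivisions sub-cells, the Gaussian
    weight at every sub-cell centre is summed per cell, and the result is
    normalised so that the retained cells sum to 1 (an assumption).
    """
    size = 2 * radius + 1
    kernel = [[0.0] * size for _ in range(size)]
    step = 1.0 / subdivisions
    for row in range(size):
        for col in range(size):
            total = 0.0
            for i in range(subdivisions):
                for j in range(subdivisions):
                    # Sub-cell centre, measured from the centre of the source cell
                    y = (row - radius) - 0.5 + (i + 0.5) * step
                    x = (col - radius) - 0.5 + (j + 0.5) * step
                    total += math.exp(-(x * x + y * y) / (2.0 * sd_cells ** 2))
            kernel[row][col] = total
    norm = sum(sum(r) for r in kernel)
    return [[v / norm for v in r] for r in kernel]

# Hypothetical SD1 of 0.3 cell widths, Moore neighbourhood (radius 1)
local_kernel = gaussian_cell_kernel(sd_cells=0.3)
for kernel_row in local_kernel:
    print(["%.4f" % v for v in kernel_row])
```

The resulting 3 × 3 table plays the role of the local look-up table: the central entry is the share of seeds that stay in the source cell, and the eight outer entries are the shares reaching each adjacent cell.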
In order to model LDD it was necessary to use an individual-based approach to be able to follow the individual haplotype of the seed. In this case, the neighbourhood template is enlarged to include distances up to 25 cells apart. The calculation of the dispersal matrix has been carried out according to the same principles as for local dispersal, but using a much higher standard deviation (SD2). Probabilities are obtained for all cells whose cumulative probability represents at least 99% of the distribution.
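A possible way to build such a truncated look-up table is sketched below. It is an illustration under stated assumptions: the function names, the SD2 value of five cell widths and the choice to keep cells in order of decreasing probability until 99% of the distribution is covered are not taken from the original programme.

```python
import math

def axis_mass(offset, sd_cells):
    """Probability that a 1-D Gaussian step from a cell centre lands in the cell at `offset`."""
    def cdf(z):
        return 0.5 * (1.0 + math.erf(z / (sd_cells * math.sqrt(2.0))))
    return cdf(offset + 0.5) - cdf(offset - 0.5)

def ldd_lookup(sd_cells, max_radius=25, coverage=0.99):
    """Return ([((dy, dx), probability), ...], covered_mass) for the LDD kernel.

    Cells up to max_radius away are ranked by probability and kept until
    their cumulative probability reaches `coverage`. If the whole template
    holds less than `coverage`, all cells are returned.
    """
    offsets = range(-max_radius, max_radius + 1)
    cells = [((dy, dx), axis_mass(dy, sd_cells) * axis_mass(dx, sd_cells))
             for dy in offsets for dx in offsets]
    cells.sort(key=lambda item: item[1], reverse=True)
    table, cumulative = [], 0.0
    for offset, mass in cells:
        table.append((offset, mass))
        cumulative += mass
        if cumulative >= coverage:
            break
    return table, cumulative

# Hypothetical SD2 of 5 cell widths (25 km with 5 x 5 km cells)
table, covered = ldd_lookup(sd_cells=5.0)
print(len(table), "cells retained, covering", round(covered, 4))
```

Because an isotropic Gaussian factorises along the two axes, the per-cell probability is the product of two one-dimensional cell masses, which avoids a fine sub-grid for the large template.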
Therefore, we obtained two look-up tables, one for each type of dispersal. The relative weight of the two modelled dispersal functions is given by the parameter a, where (1 −a) is the proportion of local seed dispersal events and a is the proportion of LDD events. This allowed us to model a leptokurtic dispersal function corresponding to that in the models of Nichols & Hewitt (1994) and Le Corre et al. (1997). When the parameter a is set to 0, colonization proceeds by means of pure diffusion.
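Written out, the stratified dispersal function described above is a two-component Gaussian mixture. The symbols f, g and d are notation introduced here for illustration; only a, SD1 and SD2 come from the text:

```latex
\[
f(\mathbf{d}) = (1-a)\, g_{SD_1}(\mathbf{d}) + a\, g_{SD_2}(\mathbf{d}),
\qquad
g_{\sigma}(\mathbf{d}) = \frac{1}{2\pi\sigma^{2}}
\exp\!\left(-\frac{\lVert \mathbf{d} \rVert^{2}}{2\sigma^{2}}\right),
\qquad SD_2 \gg SD_1
\]
```

Here d is the displacement of a seed from its source. Setting a = 0 leaves only the narrow component, which is the pure diffusion case.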
The constant parameters of all simulations are listed in Table 1. The size of the cells is set to 5 × 5 km and one time step corresponds to the time needed to reach sexual maturity. Most parameters are taken from the work of Le Corre et al. (1997) with some exceptions in the simulation setup. The carrying capacity was set either to a low population size (1000 trees per cell as in Le Corre et al., 1997) or to a high population size (100 000 trees per cell), which is a more realistic value (see Loewenstein et al., 2000). Parameter a, the frequency of LDD events, ranged from 10⁻¹⁰ to 10⁻¹. This allowed an investigation of the complete spectrum of LDD effects across nine orders of magnitude. In this investigation the number of haplotypes was fixed to 50, but for illustration purposes we also ran simulations with only four haplotypes. In order to study the effect of varying frequencies of LDD events systematically we restricted the simulation space to 100 × 320 km as in the model of Le Corre et al. (1997). For the analysis of the colonization corridor width effect we enlarged the width of the simulation space to 80 cells (400 km). Finally, to illustrate the effect of LDD vs. pure diffusion during long-range colonization, we enlarged the simulation space to 100 × 1320 km, with 1000 simulation time steps, a carrying capacity of 1000 trees per cell and only four haplotypes (Fig. 1).
Table 1. General programme parameters. The following parameters are used in the simulations for the comparison with the model of Le Corre et al. (1997). The seeds produced by each tree represent only the successful seeds that result in an adult tree.
Parameter
Carrying capacity per km²
Cell size: 5 × 5 km
Local seed dispersal (SD1)
Long distance seed dispersal (SD2)
Size of the plot east-west
Size of the plot north-south
Number of haplotypes
*Only two for the simulation using a = 10⁻¹ and N = 100 000 individuals.
In order to generate the simplest possible model some initial assumptions were made:
1. The model simulates only one species at a time, i.e. there is neither succession nor interspecific competition.
2. The simulated genetic marker is haploid and maternally inherited.
3. Mutations are absent.
4. The model simulates overlapping generations.
5. Time is discrete and one time step corresponds to the time needed to reach sexual maturity.
6. Population growth within each cell is density regulated.
7. The death rate is constant and independent of the genotype; hence there is neither selection nor senescence.
8. Every seed produced in this simulation will potentially become a mature tree, except those that leave the simulation area across the border. Hence neither a specific germination rate nor seedling mortality is considered.
The transition function of each cell depends on population dynamics (population growth and size, immigration of seeds and death rate) and the associated genetic processes (seed production, seed dispersal and immigration of new genotypes). Therefore the dynamics are regulated in each cell through five processes:
1. Within a cell trees die according to a uniform random Markovian process, in which only the current state of the population is taken into account.
2. Population growth per cell is deterministic and modelled by a logistic function (characterized by the carrying capacity K and the growth rate r). Applying this function yields the number of seeds produced. Seeds that move to a new cell, through either local dispersal or LDD, are removed from the total of those produced locally.
3. For local dispersal, seeds are collected from the Moore neighbourhood according to the local look-up table and the population sizes within these nine cells. Haplotype frequencies for the next time step are derived.
4. The number of seeds dispersed through LDD is calculated according to parameter a and local population size. The individual seeds are added to a new cell within the CA according to the LDD look-up table. Seeds may leave the simulation area but are then definitively lost. Haplotypes are attributed at random to each seed with probabilities corresponding to the haplotype frequencies of the source cell.
5. Within one cell, the numbers of seeds originating from the two dispersal functions are summed up and new haplotype frequencies are obtained for this cell. These seeds become the new trees within the next time step.
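Two of the processes listed above, seed production and the individual-based LDD draw, can be sketched in a few lines. This is an illustration only: the function names, the use of the logistic increment as the seed count, the stochastic rounding of the expected number of LDD seeds and the tiny two-entry look-up table are assumptions, since the text does not specify these details.

```python
import random

def seeds_produced(n_trees, carrying_capacity, growth_rate):
    """Deterministic logistic increment, read here as the number of successful seeds."""
    return max(0.0, growth_rate * n_trees * (1.0 - n_trees / carrying_capacity))

def draw_ldd_seeds(n_seeds, a, haplotype_freqs, ldd_table, rng):
    """Individual-based part: return a list of (target_offset, haplotype) LDD events.

    The expected number of LDD seeds is a * n_seeds. The fractional part is
    resolved by one random draw (an assumption, so that very small values of
    a still produce occasional events).
    """
    expected = a * n_seeds
    count = int(expected)
    if rng.random() < expected - count:
        count += 1
    offsets = [offset for offset, _ in ldd_table]
    weights = [weight for _, weight in ldd_table]
    haplotypes = list(range(len(haplotype_freqs)))
    events = []
    for _ in range(count):
        target = rng.choices(offsets, weights=weights)[0]
        # Haplotype drawn with the frequencies of the source cell
        haplotype = rng.choices(haplotypes, weights=haplotype_freqs)[0]
        events.append((target, haplotype))
    return events

rng = random.Random(42)
toy_table = [((0, 3), 0.7), ((2, 0), 0.3)]  # hypothetical two-entry LDD table
n_seeds = seeds_produced(n_trees=500, carrying_capacity=1000, growth_rate=1.0)
print(n_seeds, draw_ldd_seeds(n_seeds, 0.01, [0.25, 0.25, 0.25, 0.25], toy_table, rng))
```

Keeping the local part as frequencies and only the LDD part as individual seeds is what makes the mixed approach cheap: the number of individual draws per time step scales with a times the number of seeds, not with the number of trees.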
Hence, a new cell can be colonized in two ways: either a seed produced in a directly adjacent cell enters the cell through local dispersal, which corresponds to a pure diffusion process, or a seed colonizes the cell through an LDD event.
Edge conditions and starting pattern
The border of the CA is defined empty. This means that no seeds come from outside the simulation area. Seeds might be dispersed over this border but will then be lost. This scenario is comparable to an isolated forest stand surrounded by fields or other habitats not suitable for trees.
At the start of the simulation, the first four cell lines on one side of the simulation space are completely filled with trees at carrying capacity. All haplotypes are equally represented within these cells. The remaining cells of the simulation area are empty.
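The starting pattern can be set up as in the sketch below. The grid dimensions follow the 100 × 320 km space with 5 × 5 km cells; the function name, the data layout (nested lists) and the choice of columns as the colonization axis are assumptions made for illustration.

```python
def initial_grid(n_rows=20, n_cols=64, filled_cols=4,
                 carrying_capacity=1000, n_haplotypes=50):
    """Return (sizes, freqs): population size and haplotype frequencies per cell.

    The first `filled_cols` columns start at carrying capacity with all
    haplotypes equally frequent; every other cell starts empty.
    """
    equal = [1.0 / n_haplotypes] * n_haplotypes
    empty = [0.0] * n_haplotypes
    sizes = [[carrying_capacity if col < filled_cols else 0
              for col in range(n_cols)] for _ in range(n_rows)]
    freqs = [[list(equal) if col < filled_cols else list(empty)
              for col in range(n_cols)] for _ in range(n_rows)]
    return sizes, freqs

sizes, freqs = initial_grid()
occupied = sum(1 for row in sizes for n in row if n > 0)
print(occupied, "cells occupied at the start")
```

With 20 rows and four filled columns, 80 cells are occupied at the start of a run.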
Output of the model
In order to estimate the colonization speed, the number of time steps necessary to complete colonization is recorded. The colonization phase is considered complete when all cells have at least one individual. The population genetic parameters that are calculated include gene diversity (HT) and genetic differentiation (GST), as defined by Nei (1973). The frequency of each haplotype in the cells was used directly for the calculation (hence no correction for sampling effects is needed). The number of different haplotypes present was also recorded (AR). These three parameters were computed for a region of 100 × 100 km at the end of the simulation grid. The individual cells (5 × 5 km) within these 100 × 100 km patches are considered as subpopulations to derive GST. The results were averaged across at least 50 independent simulations for each combination of parameters, except in the case of the large population size (100 000 individuals) and an LDD frequency of a = 10⁻¹, where only two simulations were made because of limited computer capacity. Statistical analyses were performed with the computer programme ‘R’ (R Development Core Team, 2004).
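The three output statistics can be computed directly from the per-cell haplotype frequencies, as in the following sketch. It assumes that every cell in the region is occupied and weights all cells equally when averaging, which is an assumption for illustration; the function name is hypothetical.

```python
def diversity_stats(cell_freqs):
    """Return (H_T, H_S, G_ST, AR) for a list of per-cell haplotype frequency lists.

    H_T  = 1 - sum of squared mean haplotype frequencies (total gene diversity)
    H_S  = mean over cells of (1 - sum of squared within-cell frequencies)
    G_ST = (H_T - H_S) / H_T, undefined (NaN) when H_T is zero
    AR   = number of haplotypes present anywhere in the region
    """
    n_cells = len(cell_freqs)
    n_haps = len(cell_freqs[0])
    mean_freqs = [sum(cell[h] for cell in cell_freqs) / n_cells
                  for h in range(n_haps)]
    h_t = 1.0 - sum(p * p for p in mean_freqs)
    h_s = sum(1.0 - sum(p * p for p in cell) for cell in cell_freqs) / n_cells
    g_st = (h_t - h_s) / h_t if h_t > 0 else float("nan")
    allelic_richness = sum(1 for p in mean_freqs if p > 0)
    return h_t, h_s, g_st, allelic_richness

# Two cells fixed for different haplotypes: all diversity lies between cells
print(diversity_stats([[1.0, 0.0], [0.0, 1.0]]))
# Two identical, internally diverse cells: no differentiation
print(diversity_stats([[0.5, 0.5], [0.5, 0.5]]))
```

The division by H_T shows why G_ST becomes unstable when nearly all diversity has been lost: a very small denominator makes the ratio sensitive to tiny frequency differences.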
We first compared the output of the model with that of Le Corre et al. (1997) using exactly the same parameters (in particular the carrying capacity was set to 1000 individuals). The model of Le Corre et al. (1997) is entirely individual-based and can therefore be used as a reference to check the validity of the simplifying assumptions made by tracking only haplotype frequencies at the population level. Similar colonization speeds were obtained with both models when only diffusion was allowed (0.38 cells per time step, 156 time steps for the complete colonization of the 320 km long landscape). As observed in the model of Le Corre et al. (1997), the inclusion of LDD events leads to an increase in the colonization speed. Moreover, the temporal development of the differentiation coefficient as a function of LDD frequency (including the curve shape and the position of the maximum GST value) was comparable in both models.
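The quoted speed is consistent with the grid dimensions if the four cell lines occupied at the start are excluded from the distance covered. This reading is an assumption, but it reproduces the reported figure:

```latex
\[
\frac{320\ \mathrm{km}}{5\ \mathrm{km\ per\ cell}} = 64\ \text{cells},
\qquad
\frac{64 - 4}{156} = \frac{60}{156} \approx 0.38\ \text{cells per time step}
\]
```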
LDD and long range pattern formation
We made a first simulation to graphically visualize the type of patterns obtained. The frequency of LDD ranged from 10⁻¹ to 10⁻¹⁰. At the beginning of the simulation, all cells are empty except the 80 leftmost cells, which are filled at carrying capacity. Each of the four haplotypes has the same frequency. The simulated range is 100 × 1320 km and population size is set to 1000 individuals. Simulations are stopped after colonization is complete since the resulting patterns remain very stable afterwards.
Under pure diffusion (a = 0), the haplotypes are distributed in longitudinal stripes with a steady loss of haplotypes along the colonization corridor until only one haplotype remains (Fig. 1; bottom). This type of pattern is still observed at frequencies of LDD events lower than 10⁻⁶. At somewhat higher frequencies, some LDD events take place at the very beginning of the simulation. Since these land far enough from the main colonization front and are rare enough, all subsequent individuals in the resulting population derive from the same founder and hence share the same haplotype. Subsequently, another haplotype will have little chance to achieve a significant frequency in this population. As a consequence, one haplotype becomes fixed well before fixation happens in the pure diffusion model (<300 km; Fig. 1; middle). When the proportion of LDD events increases further, the patches become smaller because many more LDD events take place and none of them has time to grow into a large population. As a consequence, the patchy structure extends to the extremity of the simulation space. For high LDD values (a = 10⁻¹), several haplotypes persist at the leftmost end of the colonization space (Fig. 1; top).
The effect of LDD on diversity and differentiation
The effect of varying amounts of LDD events on genetic diversity and differentiation was studied systematically using two different population sizes (low = 1000 and high = 100 000 individuals, see Fig. 2). When the frequency of LDD is very low (a = 10⁻¹⁰), genetic diversity is similar to the pure diffusion scenario (a = 0, leftmost box in Fig. 2). With an increasing proportion of LDD events diversity declines to a minimum at a = 10⁻⁵ in simulations with low population size and at a = 10⁻⁷ in simulations with high population size (Fig. 2a,b). With a further increase in a, genetic diversity rises again and reaches higher values than in the pure diffusion scenario for values of a > 10⁻³ at low population size and of a > 10⁻⁵ at high population size. When the frequency of LDD events is higher than these values (up to a = 10⁻¹) diversity is increasingly better preserved during colonization.
Trends in AR are qualitatively similar to those found for gene diversity (HT), with a minimum at a = 10⁻⁵ and 10⁻⁷ for the low and high population sizes, respectively (Fig. 2c,d). As the frequency of LDD events increases, the increase of AR is even more pronounced than that of gene diversity, eventually reaching its maximum attainable value of 50 haplotypes at a = 10⁻¹ for the low population size and already at a = 10⁻⁴ for the high population size.
Genetic differentiation (GST) shows a different behaviour (Fig. 2e,f). With low amounts of LDD events GST remains first at high values similar to those in the pure diffusion scenario. For values of a that correspond to the strongest loss of diversity, GST has a large variance. For higher values of the frequency of LDD events GST declines quickly, especially when population size is low.
Effects of population sizes
An increase in population size from 1000 to 100 000 individuals did not qualitatively affect the relationships between diversity (or differentiation) and the frequency of LDD events (compare Fig. 2a,c,e with Fig. 2b,d,f). However, diversity was better preserved during colonization when population sizes were high. With 100-fold larger population sizes, 100-fold smaller frequencies of LDD events are required to obtain the same effect on diversity and differentiation (e.g. Fig. 2e vs. f). With a higher population size, fewer generations are needed to complete colonization (Fig. 3). Interestingly, for a given population size, the time needed to colonize the simulation space does not decrease immediately with the first inclusion of LDD events. With low population sizes an increase in the colonization speed starts at values of a > 10⁻⁶ and with high population size at a > 10⁻⁸, i.e. again a 100-fold difference. The decrease in colonization time as a function of a is sigmoid on this semi-logarithmic plot.
The simulations described above were performed again using a broader colonization space (400 km instead of 100 km). The decrease in genetic diversity was qualitatively similar to that of the previous simulations but less notable (results not shown).
Contrasting effects of LDD
To understand the origin of the initial decrease in gene diversity and AR during colonization in relation to the case of pure diffusion, we followed the development of the spatial genetic structure for a value of a that corresponds to the largest decrease in genetic diversity (Fig. 4). LDD events are so rare that very few seeds (one or two) get established ahead of the main colonization front. This creates foci of population growth with minimal or no interference from further seed flow. The resulting populations have no diversity at all but grow into very large populations. Consequently, migration of other haplotypes into the colonization corridor is largely blocked. Subsequent LDD events into empty cells originate mostly from the populations fixed for the first haplotype, which becomes most frequent at the colonization front. We call this phenomenon the ‘embolism effect’ as it is clearly caused by the blocking effect of the haplotype which has established itself well ahead of the colonization front (Fig. 4). Given the almost complete loss of diversity following such embolism effects, genetic differentiation can take spurious values, hence the large heterogeneity in GST values observed across simulations (Fig. 2).
The contrasting effect of the LDD events is illustrated in the upper part of Fig. 1. With increasing values of the LDD frequency (from middle to top), patches develop along the simulation space. These patches become smaller and smaller with an increasing frequency of LDD events. At the same time, diversity is increasingly well preserved. We suggest calling this the ‘reshuffling effect’, to indicate that the transfer of several founder individuals far away from the colonization front results in an efficient maintenance of diversity across the newly colonized landscape.
Rare LDD events have already been reported to boost colonization speed and affect the spatial organization of genetic variants (e.g. Le Corre et al., 1997; Davies et al., 2004). It was also suggested that LDD could alter the amount of genetic diversity that is maintained during colonization (Davies et al., 2004). The present study systematically explored the effect of varying LDD frequencies on diversity, differentiation and colonization speed. An effect not reported so far became evident: the non-monotonic response of diversity as a function of LDD frequency.
First, when LDD events are very rare, a given haplotype can become fixed well ahead of the colonization front. Subsequent migration of other haplotypes is limited, especially when the colonized region is narrow. As a consequence, genetic diversity decreases compared to the pure diffusion case. We called this the ‘embolism effect’. The blocking effect of patches that establish far from the colonization front and that become fixed for a given genetic variant is not caused by the failure of seeds originating from LDD events to establish in populations that are already at carrying capacity but rather by their negligible impact given the high number of trees that are already present.
Second, slightly higher yet still very low rates of long distance seed dispersal allow a better maintenance of diversity during colonization, compared to cases where only local diffusion takes place. This effect on genetic diversity appears to be caused by the propelling of genetic variants far away from their place of origin, thereby injecting diversity into the colonization corridor at the regional level. We proposed calling this the ‘reshuffling effect’, to indicate that the repeated movement of individuals far away from the main colonization front results in an efficient maintenance of diversity across the newly colonized landscape by reducing drift compared to the pure diffusion case.
In our simulations of a maternally inherited genome, diversity was preserved at the landscape level rather than at the population level, as indicated by the high differentiation coefficient (GST). Maintenance of diversity beyond values observed for the pure diffusion scenario occurred at comparatively much lower rates of LDD events with higher population sizes, suggesting that it is the absolute number of LDD events that is important, not their frequencies.
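The argument about absolute numbers can be checked with a short calculation. The sketch assumes that the expected number of LDD seeds leaving a full cell per time step is proportional to a times the number of trees; the function name and the `seeds_per_tree` factor are hypothetical.

```python
import math

def expected_ldd_seeds(a, trees_per_cell, seeds_per_tree=1.0):
    """Expected LDD seeds per full cell and time step (illustrative proportionality)."""
    return a * trees_per_cell * seeds_per_tree

# Diversity minima reported for the low and high population sizes
low = expected_ldd_seeds(1e-5, 1000)
high = expected_ldd_seeds(1e-7, 100000)
print(low, high, math.isclose(low, high))
```

Under this assumption the two diversity minima correspond to the same expected number of LDD events per cell (about one event per hundred cell time steps), which is what a dependence on absolute numbers rather than on frequencies would predict.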
Can such results help interpret empirical data obtained in population genetic surveys of animals and plants that have experienced phases of rapid expansion? A case in point is that of oaks in Europe, since the parameters used in this study were inspired by data obtained with these species. A large cpDNA variation study on oaks has been carried out in Europe, with 2613 populations investigated (Petit et al., 2002). A total of 17 of the 32 haplotypes were distributed north of the 45°N latitude. It appeared that the crossing of the Pyrenees and of the Alps resulted in a significant loss of diversity, whereas there was little loss during the extensive spread further north. Actually, diversity has been retained all the way from Southwestern France to Scandinavia along the Atlantic coast and a mosaic of different haplotypes established locally (see Fig. 1f–h in Petit et al., 2002). The latter observation can be interpreted as a result of the ‘reshuffling effect’, whereas the loss of diversity observed when crossing the mountain ranges would have been caused by the ‘embolism effect’.
Although the simulation model used here is highly simplified and the parameters have been chosen to mimic the particular situation of the post-glacial colonization of oak in Europe, it does illustrate general processes that have not been well investigated so far: the embolism effect and the reshuffling effect. These two effects result in a non-monotonic response of genetic diversity and differentiation to the frequency of LDD events, which would not have been detected if space had not been taken into account explicitly. Hence, along with other ecological and evolutionary studies (e.g. Molofsky, 1994; Tilman et al., 1997; Ziv, 1998; Harada, 1999), our simulations illustrate the importance of the spatial context for the understanding of population dynamics, even if the external environment is homogeneous and selection is lacking. In recent years the incorporation of a spatial context into theory has resulted in a massive revision of some of the seemingly best-established paradigms in ecology and population genetics (Kareiva, 1994; Durrett & Levin, 1994; Levin & Pacala, 1997; Silvertown & Antonovics, 2001). In particular, the usefulness of ‘mean-field approximations’ (the assumption that space does not matter) has been seriously challenged. In all these simulations, the size of the neighbourhood where interactions are taking place turned out to be a particularly crucial parameter for population dynamics (e.g. Molofsky, 1994). In the present study, interactions between cells were represented by local dispersal and LDD, which correspond to very different neighbourhoods. Furthermore, not only the size but also the strength of the interaction plays a role, since the proportion of local dispersal vs. LDD considerably affects the results.
Although the model studied here is strictly neutral, selection could also play an important role in the maintenance of diversity during colonization. In particular, balancing selection (e.g. Arnaud-Haond et al., 2003) or diversifying selection (e.g. Le Corre & Kremer, 2003) may be a source for a higher-than-expected diversity in colonizing populations. However, neutral explanations should be considered before invoking selective effects to account for the maintenance of diversity. In particular, the establishment of differentiated populations across the landscape due to separate founder effects, possibly followed by the admixture of these populations, could result in high levels of genetic diversity. In the European brown hare an admixture of haplotypes from different source populations combined with a reduced effect of genetic drift and a relaxed selection pressure due to rapid population growth after introduction are mechanisms that have been invoked to account for the observed high mtDNA haplotype diversity (Thulin & Tegelstrom, 2001). In plants, studies that have contrasted gene flow through seeds and pollen indicate that a combination of founder events (resulting in strictly separated maternal lineages) and wide genetic exchanges through pollen dispersal after secondary contact allow intrapopulation diversity to be quickly recovered (e.g. Liepelt et al., 2002), thus increasing genetic diversity even further. This secondary increase of diversity can be observed not only during natural (e.g. post-glacial) colonization but also in human-induced biological invasions. For instance, Kolbe et al. (2004) have investigated how diversity had been preserved through multiple introduction events during the invasion of the Cuban lizard into Florida. Multiple introduction events (originated from different places) followed by admixtures of these differentiated gene pools explain why genetic diversity is higher in Florida than in Cuba. 
Similar explanations were proposed to account for the high diversity of Arabidopsis thaliana in Northern Europe (Stenoien et al., 2005); other examples are provided in Petit et al. (2004).
In conclusion, there are certainly several ways to resolve one of the most prominent genetic paradoxes in invasion biology (Allendorf & Lundquist, 2003): why are invasive species so successful despite the occurrence of potentially harmful bottlenecks? Our simulations illustrate one of the mechanisms that could be involved. While genetic diversity can be reduced locally by founder effects, it can be preserved at the landscape scale by the ‘reshuffling effect’. Subsequently, rapid redistribution of genetic diversity could take place for markers experiencing high gene flow (e.g. for nuclear markers moved not only by seeds but also by pollen), thereby mitigating the effects of the founder events.
This work was partly funded by the EU-Project ‘Biodiversity in Alpine Forest Ecosystems (Analysis, Protection and Management)’ (CT96-1949). The stay of RB in Cestas in 2002 was funded by the ‘Département Forêt-Milieux Naturels’ of INRA.