#### Life cycle

The habitat consists of a finite number, *n*, of demes on a lattice organized in a 1D or 2D torus of sizes *n*_{x} and *n*_{y} in each dimension (*n*=*n*_{x}*n*_{y}). On the lattice, individuals are separated by distances **i**=(*x*,*y*) for discrete values of *x* and *y*. All demes are occupied by an equal number, *N*, of haploid individuals. We assume the following life cycle: (i) reproduction occurs and a large number of juveniles are produced; (ii) mutation occurs at a rate *u*; (iii) juvenile dispersal occurs under a given distribution of dispersal distances; (iv) dispersing offspring incur the cost of dispersal, *c*_{i}, which is a function of dispersal distance **i**; (v) adults die; (vi) offspring compete for the *N* available sites in each deme.
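As an illustration of steps (i) to (vi), here is a minimal Python sketch of one generation on a 1D torus with two alleles. It is not the authors' simulation code: the function name `life_cycle_step`, the array layout, the symmetric two-allele mutation scheme, and the use of sampling with replacement to stand in for "a large number of juveniles" are all assumptions made for the example.

```python
import numpy as np

def life_cycle_step(pop, kernels, cost, u, rng):
    """One generation on a 1D torus (hypothetical helper, illustrative only).

    pop     : (n, N) int array, allele (0 or 1) of each adult in each deme
    kernels : (2, n) array, kernels[h, k] = probability that a juvenile
              carrying allele h moves k steps (modulo n)
    cost    : (n,) array, cost[k] = cost of dispersing k steps (modulo n)
    u       : mutation rate (assumed symmetric between the two alleles)
    """
    n, N = pop.shape
    # (i)-(ii) reproduction and mutation: allele frequencies among juveniles
    freq1 = pop.mean(axis=1)
    juv1 = freq1 * (1 - u) + (1 - freq1) * u
    juv = np.stack([1 - juv1, juv1])              # shape (2, n), indexed by origin deme
    new_pop = np.empty_like(pop)
    for j in range(n):
        # (iii)-(iv) dispersal and its cost: weight of juveniles with allele h
        # born in deme i that survive to compete in deme j
        k = (j - np.arange(n)) % n                # displacement from each origin i to j
        w = juv * kernels[:, k] * (1 - cost[k])   # shape (2, n)
        # (v)-(vi) adults die; the N sites are filled from the competing juveniles
        p = w.sum(axis=1) / w.sum()
        new_pop[j] = rng.choice(2, size=N, p=p)
    return new_pop
```

Iterating this function from a polymorphic `pop` gives a crude picture of how an allele with a different dispersal kernel spreads or disappears; it assumes every deme receives some surviving juveniles.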

#### Conditions for convergence stability

Here we define conditions for convergence towards an evolutionarily stable dispersal distribution. We consider selection among ‘symmetrical’ strategies, i.e. given a dispersal probability at some distance (*x*,*y*), we assume the same dispersal probabilities at distances (*x*,−*y*), (−*x*,*y*), and (−*x*,−*y*) (as shown, e.g., in Fig. 1). We write *d*_{±x,±y} for the sum of dispersal probabilities at these different symmetrical distances. Each phenotype may thus be described by (*d*_{0,0},*d*_{±1,0},*d*_{0,±1},*d*_{±1,±1},…), where *d*_{0,0} is the fraction of philopatric offspring. No constraint is imposed on these probabilities, except that they sum to 1.
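The symmetry assumption can be made concrete with a short Python sketch that expands the class totals *d*_{±x,±y} into a full kernel over lattice offsets. The function name `expand_symmetric` and the dictionary representation are hypothetical; the only rule taken from the text is that the mirror images of a distance share the class total equally. On a torus some mirror images coincide (for example when *x*=0), which the set below handles.

```python
import numpy as np

def expand_symmetric(classes, nx, ny):
    """Spread each class total d_{+-x,+-y} equally over its distinct mirror images.

    classes : dict mapping (x, y) with x, y >= 0 to the class total
    nx, ny  : torus dimensions (use ny = 1 for a 1D habitat)
    Returns an (nx, ny) array indexed by offset modulo the torus size.
    """
    kernel = np.zeros((nx, ny))
    for (x, y), p in classes.items():
        # Distinct images of (x, y) under sign flips, wrapped onto the torus
        images = {((sx * x) % nx, (sy * y) % ny)
                  for sx in (1, -1) for sy in (1, -1)}
        for a, b in images:
            kernel[a, b] += p / len(images)
    return kernel
```

For instance, `expand_symmetric({(0, 0): 0.7, (1, 0): 0.1, (0, 1): 0.1, (1, 1): 0.1}, 5, 5)` puts half of *d*_{±1,0} on each of the offsets (1,0) and (−1,0), and a quarter of *d*_{±1,±1} on each diagonal neighbour.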

Consider first two alleles *a* and *A* whose strategies **d**^{a} and **d**^{A} differ as follows. For allele *A* the dispersal probability *d*_{D}^{A} at some distance *D*≡(*D*_{x},*D*_{y}) is higher than the dispersal probability *d*_{D}^{a} at this distance for allele *a*. For allele *A*, dispersal probabilities to all other distances are reduced in proportion to (1−*d*_{±Dx,±Dy}^{A})/(1−*d*_{±Dx,±Dy}^{a}). Here *d*_{D} may be denoted simply *z*, with values *z*^{a} or *z*^{A} for individuals bearing allele *a* or *A*. The strategy of a focal individual will be denoted *z*_{•}, and the average strategy of individuals at distance **i** relative to the focal individual will be written *z*_{i}.
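The way a mutant strategy is built from the resident one can be sketched in a few lines of Python. The function name `mutant_strategy` and the increment `delta` are hypothetical; the rescaling factor (1−*z*^{A})/(1−*z*^{a}) is the one given in the text, and the sketch assumes *z*^{a}<1.

```python
def mutant_strategy(resident, D, delta):
    """Raise dispersal in class D by delta and rescale all other classes.

    resident : dict mapping a distance class to its dispersal probability (sums to 1)
    D        : the class whose probability z is increased
    delta    : illustrative increment, so that z_A = z_a + delta
    """
    z_a = resident[D]
    z_A = z_a + delta
    scale = (1 - z_A) / (1 - z_a)   # proportional reduction of all other classes
    mutant = {k: v * scale for k, v in resident.items()}
    mutant[D] = z_A
    return mutant
```

The mutant distribution still sums to 1, and the ratios between dispersal probabilities at distances other than *D* are unchanged, so a single variable *z* is enough to describe the difference between the two alleles.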

The expected number of adult offspring of a focal individual can then be described by a fitness function *w*(*z*_{•},**z**,*D*), a function of the focal individual's strategy and of the strategies of all its competitors **z**≡(…,*z*_{i},…) over all distances from the focal individual on the lattice. Following Gandon & Rousset (1999), *w* can be expressed in terms of the relative number *g*_{ij} of juveniles from deme **i** (relative to the focal deme) in competition for deme **j** (relative to the focal deme) after dispersal:

- (1)

Each term of the sum over **j** represents the expected number of offspring of the focal individual in the deme at distance **j** from the focal parent. Each of these terms is the ratio of the focal individual's juveniles (the numerator of the ratio) to all juveniles that come into competition for this deme (the denominator of the ratio). The functions *g*_{ij} are detailed in eq. 5 in the Appendix. Selection on the *A* allele can then be measured by

- (2)

where

- (3)

(Rousset & Billiard, 2000). In this expression, effects of neighbours with average phenotype *z*_{i} on the fitness of an *A*-bearing individual are measured by the derivative with respect to *z*_{i}, evaluated at *z*^{a} for all *z* variables. Each such effect is weighted by the probability *Q*_{i}^{′}, which measures the covariation between the phenotype of the focal individual and the phenotypes represented by each variable *z*_{i}. Hence *Q*_{i}^{′} is the identity between the focal adult and a random adult at distance **i** relative to the focal one.

In particular, *z*_{0} represents the average phenotype in the focal deme, including the focal individual. Hence *Q*_{0}^{′} is the identity between the focal adult and a random adult in the focal deme. With probability 1/*N*, this random adult is the focal individual itself. The identity between *different* individuals when adults (i.e. after competition in the life cycle) is identical to their identity when sampled younger (right before competition), which was noted *Q*_{i}. Hence *Q*_{0}^{′} = 1/*N* + (1−1/*N*)*Q*_{0} and *Q*_{i}^{′} = *Q*_{i} for **i**≠**0**.
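As a purely illustrative numerical check of this relation (the values *N*=5 and *Q*_{0}=0.2 are assumed for the example, not taken from the paper):

$$
Q'_0 = \frac{1}{N} + \left(1-\frac{1}{N}\right) Q_0 = \frac{1}{5} + \frac{4}{5}\times 0.2 = 0.36 .
$$

Allowing the random adult to be the focal individual itself thus raises the within-deme identity from 0.2 to 0.36 in this example, and the correction vanishes as *N* becomes large.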

A dispersal probability *z*^{a} is stable against a mutant *A*, with the effects described above on the dispersal distribution, if it obeys either of the following conditions: (i) *z*^{a}=0 and φ_{D}(0)≤0 (dispersal is counterselected at distance *D*), or (ii) *z*^{a}>0 and φ_{D}(*z*^{a})=0 (some intermediate dispersal probability is selected). Now consider the stability of a strategy against mutants that may alter the dispersal distribution in a more complicated way, e.g. by increasing dispersal at several distances. For all dispersal distances *D*,*D*^{′} at which it has nonzero levels of dispersal, the strategy is stable only if φ_{D}=φ_{D′}=0 (see Appendix). Using expressions for φ_{D} derived in the Supplementary Appendix (see section *Supplementary material*), this implies

- (4)

where the *m*'s are the backward dispersal probabilities *m*_{j}≡*g*_{0j}/∑_{i}*g*_{ij}, i.e. the probabilities that an adult was born **j** demes away, by contrast with the ‘forward’ rates *d*_{i}, which describe where juveniles go.
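The relation between forward and backward rates can be illustrated with a small Python sketch. Eq. 5 of the Appendix is not reproduced here, so the sketch rests on an assumption: in a monomorphic population with equal deme sizes, the number of juveniles arriving at a given displacement is taken to be proportional to *d*_{j}(1−*c*_{j}). The function name `backward_rates` is hypothetical.

```python
import numpy as np

def backward_rates(d, c):
    """Backward dispersal probabilities m_j from forward rates d_j and costs c_j.

    Assumes (not taken from eq. 5) that the juveniles competing in a deme and
    originating at displacement j are proportional to d_j * (1 - c_j).
    """
    w = np.asarray(d, dtype=float) * (1 - np.asarray(c, dtype=float))
    return w / w.sum()
```

With `d = [0.8, 0.1, 0.1]` and `c = [0, 0.5, 0.5]`, the philopatric class makes up 0.8/0.9 of the adults even though only 80% of juveniles stay, because half of the dispersers die in transit.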

#### A cost–benefit argument

Equation 4 leads to an intuitive cost-benefit argument. Dispersal at a distance *D* is associated with two types of costs: (i) *c*_{D} is the direct cost (the cost paid by the disperser), whereas (ii) ∑_{i}*m*_{D+i}*Q*_{i}^{′} measures the indirect cost due to competition (in deme *D*) with related individuals (from demes **i** steps apart). The above relation shows that, at the ESS, the product of direct and indirect benefits (1−costs) should be the same at all dispersal distances with nonzero dispersal. If the overall benefits associated with a particular dispersal distance were higher, a mutant strategy with higher dispersal at that distance could invade. The convergence stable distribution of dispersal cannot be replaced by any mutant and is characterized by an equilibration of the fitness gains among all the different dispersal distances.
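The argument can be turned into a quick numerical check. The sketch below assumes that the product of direct and indirect benefits at distance *D* takes the form (1−*c*_{D})(1−∑_{i}*m*_{D+i}*Q*_{i}^{′}), which is how the verbal statement above reads; it also assumes a 1D torus with all arrays indexed by displacement modulo *n*. The function name `benefit_products` is hypothetical.

```python
import numpy as np

def benefit_products(c, m, Qp):
    """(1 - c_D) * (1 - sum_i m_{D+i} Q'_i) for every displacement D on a 1D torus.

    c  : direct cost at each displacement
    m  : backward dispersal probabilities
    Qp : identities Q'_i between the focal adult and a random adult at displacement i
    """
    c, m, Qp = (np.asarray(a, dtype=float) for a in (c, m, Qp))
    n = len(c)
    D = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    indirect = (m[(D + i) % n] * Qp[i]).sum(axis=1)   # indirect cost at each D
    return (1 - c) * (1 - indirect)
```

Under this reading, a candidate distribution is consistent with the ESS condition when the returned values are equal across all displacements that receive nonzero dispersal; a larger value at some displacement indicates that a mutant dispersing more to that distance would be favoured.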

The above cost-benefit argument yields some predictions regarding the shape of the convergence stable dispersal distribution. Let us focus on an organism sending all its dispersers to the same distance. All the dispersed offspring produced in a given deme (i.e. relatives) will compete against each other after the dispersal phase, leading to a large indirect cost of dispersal. The indirect cost of dispersal would be lower if these offspring dispersed to different distances. In other words, there is an inclusive fitness benefit to spreading the dispersers among different demes. Hence, if the cost of dispersal does not increase with distance, dispersal should follow an island mode of dispersal, where individuals that leave their natal deme are distributed randomly among all other demes.

However, the direct cost of dispersal is likely to be an increasing function of distance and, consequently, will select for less dispersal at longer distances. Conversely, the indirect cost will decrease with distance if there is lower dispersal at longer distances, because probabilities of identity will then decrease with distance (genetic isolation by distance). This will select for more dispersal at longer distances. The magnitude of these opposing effects is investigated below.

#### Finding the ES dispersal distribution

The analysis of the evolution of the distribution of dispersal distances can be viewed as the analysis of the coadaptation of dispersal probabilities at different distances. The above cost-benefit analysis has heuristic value but it does not directly yield a quantitative prediction because the probabilities of identity are themselves a function of the dispersal distribution. In the Appendix we show how to construct an iterative algorithm to find the ES distribution of dispersal distances.

We note that this algorithm could be used for other purposes than in this paper. In particular we may constrain the evolution of dispersal to a given range of dispersal distances. Iterating the algorithm from an initial distribution with nonzero dispersal within this range and zero dispersal outside leads to the optimal distribution under such constraints.

We will use this algorithm to analyse the effects of (i) the size of the demes, (ii) the shape of the cost function and (iii) the shape of the habitat (one or two dimensions). We explored the effect of the cost of dispersal through five different shapes of cost function: (1) an ‘island’ cost of dispersal where the cost is independent of dispersal distance as in the classical island model; (2) a ‘saturating’ cost as a function of distance; (3) a ‘linear’ increase of the cost with distance; (4) an ‘accelerating’ cost; (5) a ‘stepped’ function where the cost increases step by step. Figure 2 presents some results: the convergence stable dispersal distribution, the average dispersal distance, the mean squared distance and the kurtosis of the distribution. Dispersal can be described as the probability distribution of dispersal at some vectorial distance (*x*,*y*), but it is also often described as the distribution of Euclidean distance. Further, dispersal data are often binned in distance classes. Hence we will show binned distributions, computed by summing such probabilities into bins corresponding to different ranges of Euclidean distance (see Fig. 1).
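The binning step can be sketched as follows in Python. This is an illustration under assumptions, not the procedure used for Fig. 2: the function name `euclidean_summary`, the bin edges, and the convention of measuring each offset by its shortest wrap-around displacement on the torus are all choices made for the example.

```python
import numpy as np

def euclidean_summary(kernel, edges):
    """Bin a 2D dispersal kernel by Euclidean distance and summarize it.

    kernel : (nx, ny) array of dispersal probabilities indexed by offset modulo
             the torus size
    edges  : increasing bin edges for Euclidean distance (assumed, user-chosen)
    Returns (binned probabilities, mean distance, mean squared distance).
    """
    nx, ny = kernel.shape
    ax, ay = np.arange(nx), np.arange(ny)
    dx = np.minimum(ax, nx - ax)          # shortest displacement along x on the torus
    dy = np.minimum(ay, ny - ay)          # shortest displacement along y on the torus
    r = np.sqrt(dx[:, None] ** 2 + dy[None, :] ** 2)
    binned, _ = np.histogram(r.ravel(), bins=edges, weights=kernel.ravel())
    mean_r = (kernel * r).sum()
    mean_r2 = (kernel * r ** 2).sum()
    return binned, mean_r, mean_r2
```

Because the number of lattice sites grows with distance in two dimensions, a binned distribution can peak away from zero even when the probability per site decreases monotonically, which is worth keeping in mind when comparing the two descriptions.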

**Result 1 (Deme size):** Not surprisingly, larger deme size decreases the strength of kin competition at a natal site, which reduces both the dispersal probability (see also Taylor, 1988; Gandon & Rousset, 1999) and (with the exception of the island cost of dispersal) the average dispersal distance (Fig. 2).

**Result 2 (‘Island’ cost of dispersal):** As expected from the qualitative argument presented in the previous subsection, we find that an ‘island’ cost of dispersal (Fig. 2a) selects for an island mode of dispersal. A formal proof of this result can be obtained (see Supplementary Appendix).

**Result 3 (intermediate minimum cost):** Another easily understood result is that when the cost of dispersal is minimized at some intermediate dispersal distance, the distribution of dispersal is also maximized at an intermediate distance (not shown). The maximum dispersal probability may occur slightly further than the distance of minimal cost, as a result of kin competition effects.

**Result 4 (slow increase of cost):** Less trivially, we find that if the direct cost increases more slowly than the indirect cost decreases with distance, it is possible to have a local maximum of dispersal at an intermediate distance. For example, when the cost of dispersal increases step by step (Fig. 2e) the distribution of dispersal evolves towards a saw-like shape where peaks of dispersal occur at the end of each step. On each step the direct cost does not vary whereas the indirect cost decreases with distance (because of the decrease of genetic identity with distance). This explains the evolution of higher dispersal at the end of the steps.

The example of the stepped cost function shows that if the cost sharply increases at some threshold distance, the optimal strategy allocates more dispersal right before this threshold. It should be noted that there is a simple mechanism which can generate a distribution with a maximum followed by a sharp decline at larger distances: ballistic dispersal (Stamp & Lucas, 1983; Neubert *et al*., 1995). Ballistic dispersal may be understood as a mechanism with a given cost independent of realized dispersal distance up to a maximum distance imposed by this mechanism, and with a maximal dispersal probability at this maximum distance. Thus, it may be understood as a way of minimizing kin competition under constraints close to those described by a stepped cost function. Airborne dispersal can also generate distributions with a maximum at an intermediate distance, e.g. the lognormal distribution (Stoyan & Wagner, 2001, and references therein).

Another example is illustrated in the one-dimensional case when the cost saturates (Fig. 2b). In this case, the direct effect of the cost increases more slowly than the indirect cost of competing with relatives decreases. This yields a local minimal probability of dispersal at an intermediate distance.

When the cost of dispersal increases monotonically with distance (Fig. 2b–d) the evolutionary outcome depends on how fast the cost function increases. As expected, a ‘saturating’ function of the cost of dispersal (Fig. 2b) yields much more long-distance dispersal than an ‘accelerating’ function (Fig. 2d). It is shown in the Supplementary Appendix (eq. A.29) that in one dimension, if 1/(1−*c*_{i}) increases faster than distance **i**, dispersal must be zero beyond some distance, so that there is no long-distance dispersal. This is because the relative quantitative effects of the cost of dispersal and of the benefits of avoiding competition between relatives are given by a comparison of 1/(1−*c*_{i}) to a measure of relatedness that varies linearly with distance (Rousset, 1997). In two dimensions we have a similar result except that the same measure of relatedness increases as the logarithm of distance. In other words, as relatedness decreases more slowly with distance in two dimensions, it pays less to disperse further away. Thus, the condition for the existence of long-distance dispersal is more restrictive in two dimensions than in one.

**Result 5 (Evolution of long-distance dispersal):** If the survival probability does not vanish at long distances, then long-distance dispersal is selected for. This is a consequence of Result 4, as in this case 1/(1−*c*_{i}) increases too slowly at long distances to select against dispersal.

Individual-based simulations were also run in some cases as an independent way of obtaining the ES dispersal distribution. These simulations confirmed the results of numerical computation (Fig. 2e) and showed that evolution yields only one optimal distribution of dispersal (e.g. no stable polymorphism). Some differences between numerical and simulation results can arise from the recurrent introduction of new genotypes through mutation (in the simulations we assumed a mutation rate equal to 2.5×10^{−4}). Indeed, higher mutation rates tend to bias the distribution of dispersal distance towards a distribution with identical dispersal probabilities at each distance. Nevertheless, these differences were small.