THE EFFECT OF COLLECTIVE DISPERSAL ON THE GENETIC STRUCTURE OF A SUBDIVIDED POPULATION

Authors

  • Jonathan M. Yearsley,

    1. School of Biology & Environmental Science, University College Dublin, Dublin, Ireland
  • Frédérique Viard,

    1. CNRS, UMR 7144, Team Div&Co, Lab. Adaptation et Diversité en Milieu Marin, Station Biologique de Roscoff, Roscoff, France
    2. UPMC Univ Paris 06, UMR 7144 AD2M, Station Biologique de Roscoff, Roscoff, France
  • Thomas Broquet

    1. CNRS, UMR 7144, Team Div&Co, Lab. Adaptation et Diversité en Milieu Marin, Station Biologique de Roscoff, Roscoff, France
    2. UPMC Univ Paris 06, UMR 7144 AD2M, Station Biologique de Roscoff, Roscoff, France

Abstract

Correlated dispersal paths between two or more individuals are widespread across many taxa. The population genetic implications of this collective dispersal have received relatively little attention. Here we develop two-sample coalescent theory that incorporates collective dispersal in a finite island model to predict expected coalescence times, genetic diversities, and F-statistics. We show that collective dispersal reduces mixing in the system, which decreases expected coalescence times and increases FST. The effects are strongest in systems with high migration rates. Collective dispersal breaks the invariance of within-deme coalescence times to migration rate, whatever the deme size. It can also cause FST to increase with migration rate because the ratio of within- to between-deme coalescence times can decrease as migration rate approaches unity. This effect is most biologically relevant when deme size is small. We find qualitatively similar results for diploid and gametic dispersal. We also demonstrate with simulations and analytical theory the strong similarity between the effects of collective dispersal and anisotropic dispersal. These findings have implications for our understanding of the balance between drift–migration–mutation in models of neutral evolution. This has applied consequences for the interpretation of genetic structure (e.g., chaotic genetic patchiness) and estimation of migration rates from genetic data.

Movement of individual organisms need not be an individual affair. Aggregations are widely observed in many organisms, leading to the orchestrated mass movements of individuals (Parrish and Edelstein-Keshet 1999). This collective dispersal (i.e., correlated dispersal paths between two or more individuals) may be an emergent property of complex interactions between individuals (Sumpter 2006; Guttal and Couzin 2010), may be driven by the environment (e.g., dispersal by ocean or air currents; Hofmann et al. 1998; Wenny 2001), or may arise from a combination of both. Irrespective of the underlying driver, collective dispersal implies that the movements of individuals are not entirely independent of one another. For instance, recent work modeling larval dispersal in the nearshore marine environment shows collective dispersal due to the dispersal and settlement of “packets” of larvae (Siegel et al. 2008).

One difficulty in obtaining a general picture of the significance of collective dispersal is the diverse terminology for dispersal modes, all of which imply, to some extent, that dispersal events are not entirely independent. Some examples are: collective motion (Bazazi et al. 2011), cohesive motion (Gregoire and Chate 2004), collective migration (Guttal and Couzin 2010), collective behavior (Sumpter 2006), coordinated group movement (Holyoak et al. 2008), clumped dispersal (Fromhage and Kokko 2010), patchy dispersal (Potthoff et al. 2006), propagule pool dispersal (Slatkin 1977), group dispersal (Soubeyrand et al. 2011), directional dispersal (Wenny 2001; Levine 2003), kin-structured migration (Rogers 1987; Fix 2004), and rafting (Fraser et al. 2011). At the risk of creating an additional term, we use “collective dispersal” throughout as an umbrella term to mean any mode of dispersal where two or more dispersing individuals are more likely to reproduce near one another if they were born near one another.

Collective dispersal has knock-on consequences for the population genetics of a structured population, because collective dispersal can reduce the mixing effect of dispersal (even in systems with high rates of migration), which slows down the rate at which related lineages (e.g., siblings) move away from one another. The population genetic theory of collective dispersal was partly addressed by the propagule pool model of dispersal (Slatkin 1977), which is a generalization of Wright's infinite island model to include local extinction and deme recolonization by a propagule of k individuals from the same deme. This model therefore focuses upon collective dispersal only at the time of deme recolonization. Whitlock and McCauley (1990) extended this model by allowing propagule pool dispersal to occur with probability ϕ and migrant pool dispersal (i.e., a random sample from the entire metapopulation) with probability 1 − ϕ. The general conclusion from these models is that increasing relatedness between colonists (e.g., by increasing ϕ) increases the equilibrium population differentiation, FST. Both of these approaches only allow collective dispersal to have an effect in the presence of local population extinction.

Rogers (1987) developed a model of kin-structured migration, where individuals have a tendency to disperse in the company of relatives. This model was built upon an earlier model by Rogers and Harpending (1986) and describes the variances (and covariances) in allele frequencies. Rogers quantified collective dispersal (in the form of kin-structured migration) by combining the genetic correlation between individuals who migrate together (κ) and the size of a group of migrants (γ) into a collective dispersal parameter math formula (e.g., if all migration involves pairs of full-sibs, κ = 0.5, γ = 2, and math formula). This model also shows that collective dispersal increases FST, especially when migration rates and local genetic drift are high. Surprisingly this is the only population genetic theory that we are aware of that incorporates collective dispersal as a regular mode of dispersal and not just as an exceptional mode of dispersal (i.e., during founder events).

In this article we present a simple extension of Slatkin's propagule pool model (Slatkin 1977; Whitlock and McCauley 1990) that incorporates collective dispersal as a regular mode of dispersal. Compared to the kin-structured migration model of Rogers (1987), our model is formulated in terms of coancestries rather than allele covariances, encompasses a broad range of life histories (e.g., selfing rate, gametic/diploid dispersal), is easily extendable to other hierarchical spatial structures, and predicts coalescence times as well as F-statistics. A specific, simplified version of our model is used in Broquet et al. (in press) to study genetic differentiation in marine invertebrates (see Discussion). Here, using a two-sample coalescent approach, we derive results for coalescence times, genetic diversities, and genetic differentiation for a range of metapopulation scenarios. Finally, we link our theory on collective dispersal with processes of anisotropic dispersal (i.e., different dispersal rates according to the direction of movement of the organisms) and demonstrate, using simulations of a finite island model, the similarities in predicted genetic structure from models with anisotropic and collective dispersal.

The Model

To look at the effects of collective dispersal upon coalescence times, genetic diversity, and differentiation we study an extension of the classic finite island model. This model considers a structured population with an array of D demes. For clarity, we will focus on results in the many-deme limit (Wakeley 2004). This approximation yields simple results that are applicable to a broad range of situations, unless the number of demes is small (e.g., on the order of D < 20). For completeness, the results from a finite island model are presented in the online Supporting Information (Tables S2–S5) and in the Mathematica notebook available on the Dryad Repository (Yearsley et al. 2013). We will highlight some links with classical results from an infinite island model, but we do not present a full derivation for this model (an infinite island model would give infinite coalescence times and F-statistics identical to the many-deme limit).

Each deme contains N diploid individuals whose gametes disperse between demes (a model for diploid dispersal is presented in the online Supporting Information and in the Mathematica notebook available on the Dryad repository, Yearsley et al. 2013). Generations are nonoverlapping and the demographic life cycle (going forward in time) is: reproduction, mutation, adult mortality, dispersal of gametes, union of the gametes (selfing occurs at rate s), population regulation (N diploid individuals). We calculate expected coalescence times, genetic diversities, and F-statistics for a sequence of DNA under Kimura's infinite-sites approximation (Kimura 1969) with a mutation rate μ per generation per sequence. Our model could equally well be formulated as a single locus with an infinite number of alleles, with identical results.

Our results are calculated just prior to population regulation. From these results F-statistics can be obtained at any stage during the life cycle by calculating the effect of mutation and drift within a life cycle upon the probabilities of identity by descent. An example of this calculation for a simplified collective dispersal model is given by Broquet et al. (in press). Expected coalescence times are not sensitive to the point within the life cycle at which they are calculated.

COLLECTIVE DISPERSAL IN A FINITE ISLAND MODEL

Two randomly sampled DNA sequences from the entire population must be in one of the following three states: two sequences in a diploid individual, two sequences in different individuals in the same deme, and two sequences in different individuals in different demes (states 1, 2, and 3, respectively). The ancestral history of a pair of sequences (mutations will be considered later) can be defined by a transition matrix, G, where an element, Gij, gives the probability that a pair of sequences in state i had ancestors from the previous generation in state j (the rate of coalescence per generation for a pair of sequences in state i is then given by 1 − Σj Gij). This matrix can be written as

display math(1)

where s is the rate of selfing (s = 1/N gives random mating), 1 − α is the probability that the ancestors of a sequence pair in the same deme (either state 1 or 2) were in state 3, and 1 − β is the probability that the ancestors of a sequence pair in state 3 were also in state 3. For example, the entry G13 = 1 − α arises under gametic dispersal because the two gametes that form a diploid individual may have come from different demes (under diploid dispersal this transition is impossible because segregation occurs before migration). For the infinite island model α = (1 − m)² and β = 0, where m is the probability that a sampled sequence in a deme came from another deme.

To introduce collective dispersal into the above model, we need only focus on the parameter α, because collective dispersal reduces the rate at which within-deme pairs are broken apart. Collective dispersal has no effect upon the rate at which sequences in different demes are brought into the same deme (i.e., β). To start with, we consider the infinite island model, which assumes that individuals migrate independently of one another. Collective dispersal changes this by introducing a probability that two immigrants came from the same source deme, ϕ. We define the probability ϕ as our measure of collective dispersal, which can take any value between zero and one. In a system with a finite number of demes, D, we can distinguish two contributions to the probability ϕ: (i) the probability of sampling two independently dispersing immigrants from the same source deme, which is 1/(D − 1); (ii) the additional probability of sampling two immigrants that have “actively” dispersed collectively. Only the second contribution is collective dispersal due to an organism's biology. When D > 20 the effect of the first contribution will be weak. For a finite number of demes, no active collective dispersal corresponds to ϕ = 1/(D − 1), biologically relevant collective dispersal corresponds to ϕ > 1/(D − 1), and ϕ < 1/(D − 1) corresponds to immigrants coming from different demes more often than would be expected by chance. In the many-deme limit, we shall refer to no collective dispersal as ϕ = 0. Including ϕ in the model gives

$$\alpha = (1 - m)^2 + \phi\, m^2 \qquad (2)$$

where the first term is for nonmigrants and the second term for “collective migrants” (the second term is purely a second-order effect of migration because it requires at least two individuals to migrate). All parameter definitions are given in Table S1.
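As a minimal sketch of how the two terms described above combine, the Python snippet below assumes that equation (2) has the form α = (1 − m)² + ϕm² (a pair of nonmigrants plus a pair of collective migrants). This form is inferred from the verbal description and from the theory columns of Table 1, not copied from the typeset equation, and the function names are hypothetical.

```python
def alpha(m, phi):
    """Probability that a same-deme pair had ancestors in the same deme.

    Assumed form: (1 - m)^2 for two nonmigrants plus phi * m^2 for two
    migrants that arrived from the same source deme.
    """
    if not (0.0 <= m <= 1.0 and 0.0 <= phi <= 1.0):
        raise ValueError("m and phi must lie in [0, 1]")
    return (1.0 - m) ** 2 + phi * m ** 2


def chance_phi(D):
    """phi produced by independent dispersal among D demes: 1 / (D - 1)."""
    if D < 2:
        raise ValueError("need at least two demes")
    return 1.0 / (D - 1)


def separation_rate(m, phi):
    """Backward-time rate at which a same-deme pair separates: 1 - alpha."""
    return 1.0 - alpha(m, phi)


if __name__ == "__main__":
    for phi in (chance_phi(50), 0.5, 1.0):
        rate = separation_rate(0.9, phi)
        print(f"phi = {phi:.3f}: separation rate at m = 0.9 is {rate:.4f}")
```

Under this assumed form, the separation rate falls as ϕ rises, and it vanishes entirely when every individual migrates (m = 1) and all migrants travel together (ϕ = 1).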

Coalescence times and genetic diversity

The effect of collective dispersal on expected coalescence times can be calculated by first-step analysis (Wakeley 2009). We can find the expected coalescence time for two sequences in state i, Ti (units of generations), using the property that Ti will not change from one generation to the next,

$$T_i = 1 + \sum_{j=1}^{3} G_{ij}\, T_j \qquad (3)$$

Solving equation (3) gives

display math(4a)
display math(4b)
display math(4c)

In the many-deme limit β = 0 and α is given by equation (2). Substituting in these expressions for α and β, and assuming random mating (s = 1/N) gives

display math(5a)
display math(5b)

where math formula and math formula are expected coalescence times when there is no collective dispersal (Wakeley 2009).

Equations (5a) and (5b) show that increasing collective dispersal (increasing ϕ) causes a linear decline in coalescence times between all pairs of sequences. Within-deme coalescence times (T1 and T2) are now functions of migration rate when ϕ > 0 (eq. (5a)), whereas in the absence of collective dispersal (ϕ = 0) within-deme coalescence times are well known to be independent of migration rate (Nagylaki 1982; Wakeley 2009).
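The first-step analysis can be illustrated with a deliberately simplified sketch. The Python code below is not the three-state model of the text: it assumes random mating, merges states 1 and 2 into a single "within-deme" state, assumes α = (1 − m)² + ϕm², and assumes β takes its classic finite island value. All function and variable names are hypothetical.

```python
def coalescence_times(N, D, m, phi):
    """Expected coalescence times (in generations) for a within-deme pair
    and a between-deme pair, from the first-step equation T = 1 + G T.

    Simplifying assumptions (not the full three-state model):
    - random mating, so the two within-deme states are merged;
    - lineages with same-deme ancestors coalesce with probability 1/(2N);
    - beta is unaffected by collective dispersal;
    - 0 < m <= 1, otherwise demes are isolated and the times are infinite.
    """
    chance = 1.0 / (D - 1)
    alpha0 = (1.0 - m) ** 2 + chance * m ** 2  # no active collective dispersal
    alpha = (1.0 - m) ** 2 + phi * m ** 2      # with collective dispersal
    beta = (1.0 - alpha0) / (D - 1)            # between-deme pair joins a deme
    c = 1.0 - 1.0 / (2 * N)                    # same deme, but no coalescence

    # Non-coalescing transition probabilities (rows: within, between).
    g_ww, g_wb = alpha * c, 1.0 - alpha
    g_bw, g_bb = beta * c, 1.0 - beta

    # Solve (I - G) T = 1 with Cramer's rule for the 2 x 2 system.
    a11, a12 = 1.0 - g_ww, -g_wb
    a21, a22 = -g_bw, 1.0 - g_bb
    det = a11 * a22 - a12 * a21
    t_within = (a22 - a12) / det
    t_between = (a11 - a21) / det
    return t_within, t_between


if __name__ == "__main__":
    for phi in (1.0 / 49, 0.2, 0.8):
        tw, tb = coalescence_times(N=100, D=50, m=0.9, phi=phi)
        print(f"phi = {phi:.3f}: T_within = {tw:.1f}, T_between = {tb:.1f}")
```

In this sketch the within-deme time equals 2ND for any migration rate when ϕ = 1/(D − 1), which recovers the classical invariance, and it drops below 2ND once ϕ exceeds that chance level.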

The largest effects of collective dispersal occur in systems with high migration rates. One way of viewing this is to consider the rate math formula, which is the rate (looking backward in time) at which two lineages sampled in the same deme separate from one another in the many-deme limit (in a finite island model the relevant rate is math formula). The effect of collective dispersal is proportional to m² and therefore largest as m approaches 1. The central role of the rate math formula can be seen by using equation (4b) to calculate the decrease in all coalescence times relative to no collective dispersal under random mating (assuming the many-deme limit and math formula),

display math(6)

Although the largest absolute effect of collective dispersal occurs when m approaches 1, the effect of collective dispersal is still apparent when math formula. For small migration rates math formula.

Selfing rates above random (s > 1/N) accentuate the effect of collective dispersal because selfing is itself a form of collective dispersal of genes within individuals. For example, when m = 1 and N ≫ 1 we find that math formula and math formula. This shows that T1, T2, and T3 are all reduced as the selfing rate increases and that this reduction is greatest for T1. Detailed results for other metapopulation scenarios (e.g., finite number of demes D, diploid dispersal, and extinction) are given in the Supporting Information (Figures S1–S4, Tables S2–S5).

From equation (6) it is straightforward to predict the effect of collective dispersal upon genetic diversity when diversity is defined as the probability that a pair of randomly sampled sequences differ at a random nucleotide site (Charlesworth 1998). This genetic diversity is proportional to the expected coalescence time under the infinite-sites model, because no site can receive more than one mutation (Pannell and Charlesworth 1999). In this case, equation (6) predicts that collective dispersal will reduce all genetic diversities (e.g., within-deme diversity and between-deme diversity; Pannell and Charlesworth 1999) by a factor of approximately math formula. Including local deme extinction into our model would allow results to be compared to the propagule pool model (Slatkin 1977; Whitlock and McCauley 1990; Pannell and Charlesworth 1999), which only allows for collective dispersal during recolonization of extinct demes (see online Supporting Information, Figure S4, Table S6).

IDENTITY BY DESCENT AND F-STATISTICS

Let Q = {Q1, Q2, Q3} be the vector of expected probabilities that pairs of sequences in states 1–3 are identical by descent (i.e., coalesce before a mutation occurs in either sequence). These probabilities, Q, can also be calculated using first-step analysis, giving

display math(7)

where γ = (1 − μ)² is the probability per generation that neither sequence in a pair mutates. The identity by descent probabilities, Q, can be written as Wright's F-statistics, FIS and FST (Rousset 2004),

math image(8)

Solving equation (7) and taking the limit of small mutation rate (γ → 1) gives

display math(9a)
display math(9b)

which for an infinite number of demes reduces to the classical result for an infinite island model, math formula, math formula when β = 0, math formula, 1 ≫ m, s = 1/N, and m is the migration rate (Rousset 2004). Equations (9a) and (9b) can also be obtained from Equations (4a) to (4c) using Slatkin's approximation for inbreeding coefficients in terms of coalescence times, math formula (Slatkin 1991). Supporting Information presents results for diploid dispersal (Figure S1).

The parameter a = 1 − α + β, which appears in equation (9b) (see also Table S4 in the online Supporting Information), describes the amount of between-deme mixing. It is the sum of two processes: breaking apart within-deme pairs into between-deme pairs (β measures the contribution into state 3 from gene pairs in states 1 and 2) and creating novel within-deme pairs (1 − α measures the contribution into state 2 from gene pairs in state 3). Mixing is highest when a = 1 (α = β), giving FST = 0. Mixing is lowest (a = 0) when α = 1 and β = 0, giving FST = 1 regardless of the values for N and s. Equations (9a) and (9b) can be used to describe a metapopulation, because expressions for α and β can be found when the number of demes is finite with local extinction (see online Supporting Information), but we will concentrate upon the many-deme limit with no local extinction (math formula).

Collective dispersal is predicted to amplify the effect of selfing upon FIS. For gametic dispersal, the effect drives FIS further from zero (either further negative when s < 1/N or further positive when s > 1/N), but this is a weak, indirect effect because it is effectively an example where two processes of collective dispersal are nested (nonrandom mating can be thought of as a form of “collective dispersal of lineage pairs from the same individual”). To visualize this effect, we start by considering the many-deme limit with no collective dispersal. In this case, selfing (s > 1/N) will decrease the coalescence time of lineage pairs within the same individual relative to lineage pairs in different individuals within a deme, because like collective dispersal, selfing slows down the rate at which pairs of lineages within the same individual are broken apart. However, selfing can only occur if both lineages are nonmigrants. Migrant lineages do not contribute to selfing (under gametic dispersal) because the probability of sampling two gametes that came from the same deme, math formula, is vanishingly small. Collective dispersal changes this by giving a finite probability that a pair of migrant gametes came from the same individual. Collective dispersal therefore enhances the effect of selfing in addition to its own direct effects. This shows that processes of collective dispersal at different spatial scales can interact with one another. For diploid dispersal there is no effect of collective dispersal upon FIS (see online Supporting Information, Figure S1). In the many deme limit math formula and is therefore independent of ϕ.

Collective dispersal is predicted to have a stronger effect on FST than on FIS, with FST increasing as ϕ increases. FST can be increased by several orders of magnitude for systems with high migration rates (Fig. 1a). The relative increase in FST shown in Figure 1a is almost independent of local deme size (N). However, the absolute effect of collective dispersal is strongest when N is small (Fig. 2; see Broquet et al. in press for an example of a relevant situation). The effects under diploid dispersal are still stronger than under gametic dispersal (Supporting Information, Figures S1, S2). Standard population genetic theory predicts that increasing migration rate typically reduces FST. However, equation (9b) can be used to show that FST has a minimum value of math formula in the presence of collective dispersal when math formula. When migration rate is greater than math formula, further increases in migration cause FST to increase (Fig. 1b). Therefore, increasing dispersal can increase measures of genetic differentiation in the presence of collective dispersal.
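The non-monotonic response of FST to migration can be explored numerically. The sketch below is a reconstruction, not equation (9b) verbatim: it assumes random mating, the small mutation limit, α = (1 − m)² + ϕm², the classic finite island value of β, and FST = (α − β)/(2N − (2N − 1)(α − β)). At the parameter values we checked, this expression agrees with the theory columns of Table 1 to the displayed precision. Function names are hypothetical.

```python
def fst(m, phi, N, D=None):
    """Approximate equilibrium FST under random mating and small mutation.

    D=None selects the many-deme limit (beta = 0). The expression is a
    reconstruction checked against Table 1, not the paper's typeset formula.
    """
    alpha = (1.0 - m) ** 2 + phi * m ** 2
    if D is None:
        beta = 0.0
    else:
        alpha0 = (1.0 - m) ** 2 + m ** 2 / (D - 1)
        beta = (1.0 - alpha0) / (D - 1)
    x = alpha - beta
    return x / (2 * N - (2 * N - 1) * x)


def fst_minimum_many_demes(phi, N):
    """Location and value of the FST minimum in the many-deme limit.

    alpha is smallest at m = 1 / (1 + phi), where alpha = phi / (1 + phi);
    substituting gives FST = phi / (2N + phi). Requires phi > 0.
    """
    m_star = 1.0 / (1.0 + phi)
    return m_star, phi / (2 * N + phi)


if __name__ == "__main__":
    for m in (0.1, 0.2, 0.5, 0.8, 0.9):
        print(f"m = {m}: FST x 100 = {100 * fst(m, 0.8, 100):.2f}")
    print(fst_minimum_many_demes(0.8, 100))
```

With ϕ = 0.8 and N = 100, the sketch places the minimum near m ≈ 0.56, beyond which further migration raises FST again.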

Figure 1.

Expected values of FST as a function of migration rate, m, in the small mutation rate limit (γ → 1) of a finite island model (eq. (9b)) when ϕ = 0.1, 0.5, and 0.7 (dotted, solid, and dashed lines, respectively). The other parameters are D = 50, N = 100, s = 1/N. (a) FST values relative to FST with no active collective dispersal (ϕ = 1/[D − 1]). (b) FST reaches a minimum at math formula in the presence of collective dispersal.

Figure 2.

The absolute increase in FST between ϕ = 0.2 and ϕ = 1/49 for N = 10 (solid line), N = 20 (dotted line), and N = 100 (dashed line). All results are for a finite island model (D = 50) with random mating (s = 1/N) and gametic dispersal.

An approximate link can be made between our results and the kin-structured dispersal model of Rogers (1987) by relating his equation (4b) to our (9b). Using our notation this gives

display math(10)

where me is the effective migration rate defined by Rogers and Harpending (1986) and θ is Rogers' kin-structured dispersal parameter (Rogers 1987). We note that the kin-structured dispersal model is expected to differ from our model (e.g., it assumes regulation before dispersal). Substituting equation (2) for α into equation (10) and setting ϕ = 0 (equivalent to θ = 0) gives an effective migration rate math formula and the remainder gives math formula. The two models both show that collective dispersal has an additive effect upon math formula for small values of ϕ (i.e., math formula). However, in Rogers’ model the effect of collective dispersal on math formula is independent of m, whereas under constant ϕ our formalism predicts that θ (and therefore math formula) increases nonlinearly as m approaches 1 (when math formula, math formula and when m = 1, math formula).

ESTIMATION OF MIGRATION RATES IN THE PRESENCE OF COLLECTIVE DISPERSAL

Genetic approaches for estimating migration will generally underestimate migration rates if collective dispersal is present but not accounted for. This underestimation occurs because collective dispersal reduces the rate of mixing of lineages between demes. Relatively high FST values may therefore not be indicative of low migration rates, but instead high migration rates (m ≈ 1) with collective dispersal (Fig. 1b). Because collective dispersal affects coalescence times and F-statistics, we expect it to affect a broad range of existing migration inference methods. Methods that exploit spatial genetic differentiation (e.g., Vitalis 2002; Broquet et al. 2009) will be affected by the increase in genetic differentiation due to collective dispersal, including regression methods that exploit isolation by distance (e.g., Rousset 1997, 2001). When collective dispersal is present, these methods will estimate an effective migration rate, which will be a function of collective dispersal and migration (this is discussed in more detail for Vitalis’ method in the next paragraph and in the online Supporting Information). Coalescence-based approaches (e.g., Beerli and Felsenstein 2001; Hey 2010) will also be affected by decreased expected coalescence times under collective dispersal, although a detailed analysis of the robustness of these methods to collective dispersal is beyond the scope of this article.
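A small numerical illustration of this underestimation, under loudly stated assumptions: the many-deme limit, random mating, the small mutation limit, α = (1 − m)² + ϕm², and FST = α/(2N − (2N − 1)α). These forms are a reconstruction that matches the D = ∞ column of Table 1, and the "naive" estimator below is a hypothetical stand-in for a method that ignores collective dispersal, not any published inference procedure.

```python
import math


def fst_many_demes(m, phi, N):
    """Assumed equilibrium FST in the many-deme limit (random mating)."""
    alpha = (1.0 - m) ** 2 + phi * m ** 2
    return alpha / (2 * N - (2 * N - 1) * alpha)


def naive_migration_estimate(fst_obs, N):
    """Invert the assumed FST formula while wrongly taking phi = 0.

    Step 1: recover alpha from FST. Step 2: solve alpha = (1 - m)^2 for m.
    """
    alpha = 2 * N * fst_obs / (1.0 + (2 * N - 1) * fst_obs)
    return 1.0 - math.sqrt(alpha)


if __name__ == "__main__":
    true_m, phi, N = 0.9, 0.8, 100
    observed = fst_many_demes(true_m, phi, N)
    estimate = naive_migration_estimate(observed, N)
    print(f"true m = {true_m}, naive estimate = {estimate:.3f}")
```

In this example a true migration rate of 0.9 with ϕ = 0.8 yields a naive estimate well below the true value, because the estimator attributes all of the retained within-deme ancestry to nonmigrants.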

We note that estimation biases will also apply to methods that do not assume migration-drift equilibrium. This can be demonstrated using the estimation approach of Vitalis (2002), which estimates migration rate by comparing FST values from pre- and postdispersal samples. This approach can be modified to allow for collective dispersal (see online Supporting Information). In the many-deme limit this gives

display math(11)

where FST(pre) and FST(post) are the nonequilibrium FST values before and after dispersal, respectively, and disregard whether two genes are from the same individual or different individuals in the same deme. Equation (11) is equivalent to equation (A1) in Vitalis (2002) with math formula replaced by math formula. So in this nonequilibrium example, failing to account for collective dispersal causes underestimation of migration.

COLLECTIVE DISPERSAL AND ANISOTROPIC DISPERSAL

Anisotropic dispersal (where different directions have different dispersal rates) is another form of dispersal that influences within- and between-deme genetic variance and is arguably widespread in natural metapopulations. We make a formal link between anisotropic dispersal in a finite island model and our ϕ parameter of collective dispersal, and highlight some scenarios of anisotropic dispersal that can be described by our results on collective dispersal.

Anisotropic patterns of dispersal generally arise when the contribution of each source deme into the pool of immigrants is not uniform. For a finite island model with D demes, anisotropic dispersal into deme i can be described by a parameter math formula, defined by math formula , where mij is the probability that an individual in deme i is an immigrant from deme j. Anisotropy corresponds to math formula for at least one source deme given the constraint math formula (i.e., math formula). Anisotropy could be due to deterministic forces, where each math formula would be time invariant (e.g., dispersal down a river is anisotropic because there is a fairly constant bias toward movement downstream; Pollux et al. 2009). Anisotropy could also be due to randomness, where each math formula has a time-average of zero, but fluctuates from one time to another (e.g., Wright 1948; Nagylaki 1979). Other forms of anisotropy allow the total number of immigrants to vary between demes (math formula), but for the rest of this article we let the number of immigrants be the same across all focal demes (m is constant) with some source demes contributing more immigrants to a focal deme than others.

Our descriptions of anisotropic dispersal (the parameter math formula) and collective dispersal (the parameter ϕ) are in many ways equivalent. For example, the extreme case of ϕ = 1 is equivalent to all immigrants in a single deme coming from a single source (i.e., for the focal deme all but one pairwise immigration rate equals zero). Our collective dispersal parameter, ϕ, can be related to anisotropic dispersal by math formula. A finite island model with D demes has mij = (1/(D − 1) + δj) for all j ≠ i, giving math formula (note that math formula).
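The link between source-deme bias and ϕ can be sketched numerically. The code below assumes that, within a generation, two immigrants choose their source deme independently with probabilities pj = 1/(D − 1) + δj over the D − 1 possible sources; that is our reading of the definitions above, not the paper's typeset formula. Under that assumption the probability that both came from the same source is Σ pj², which equals 1/(D − 1) + Σ δj² because the δj sum to zero. Function names are hypothetical.

```python
def phi_from_sources(proportions):
    """Probability that two independently drawn immigrants share a source.

    `proportions` holds the share of a focal deme's immigrants contributed
    by each of the D - 1 possible source demes, and must sum to one.
    """
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("source proportions must sum to one")
    if any(p < 0.0 for p in proportions):
        raise ValueError("source proportions must be non-negative")
    return sum(p * p for p in proportions)


def phi_from_deviations(deltas):
    """Same quantity written as the chance level plus the anisotropy term.

    `deltas` holds the deviations from the isotropic share 1 / (D - 1),
    one per source deme, and must sum to zero.
    """
    if abs(sum(deltas)) > 1e-9:
        raise ValueError("deviations must sum to zero")
    n_sources = len(deltas)  # this is D - 1
    return 1.0 / n_sources + sum(d * d for d in deltas)


if __name__ == "__main__":
    isotropic = [1.0 / 49] * 49
    single_source = [1.0] + [0.0] * 48
    print(phi_from_sources(isotropic))      # chance level for D = 50
    print(phi_from_sources(single_source))  # all immigrants share one source
```

The two extremes behave as the text describes: isotropic immigration returns the chance level 1/(D − 1), and a single source deme returns ϕ = 1.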

The effect of constant anisotropic dispersal on FST has been calculated from the existing coancestry theory for metapopulations with arbitrary, time-invariant migration matrices (e.g., Rousset 2004; Lehmann and Perrin 2006). In contrast, our model for collective dispersal (e.g., the FST predictions of eq. (9b)) is expected to describe scenarios where anisotropic dispersal randomly varies from one generation to another but whose long-term average is isotropic. This is because our model has no history about the source demes of collective migrants (i.e., the Markov property of eq. (7)), and so spatial anisotropies in dispersal from one generation to the next are independent.

We confirmed this expectation by simulating anisotropic dispersal in a finite island model (Table 1). We ran simulations for time-invariant and randomly time-varying migration matrices (labeled “constant” and “random” in Table 1) with D = 50, N = 100, s = 1/100, μ = 10⁻⁴ (γ = 0.9998) at 20 independent loci and recorded the final average F-statistics across loci after 10,000 generations (see Appendix for details). Under constant anisotropic dispersal patterns our simulations agreed with existing theory (Lehmann and Perrin 2006). However, patterns of anisotropic dispersal that randomly varied from one generation to another were correctly predicted by our equation (9b). Results on collective dispersal therefore provide new analytical results for scenarios of anisotropic dispersal.

Table 1. FST predictions for a classic finite island model (D = 50, ϕ = 1/49), an infinite island model (D = ∞) with collective dispersal ϕ = 0.8 (eq. 9b), a finite island model (D = 50, ϕ = 0.8), two simulations of a finite island model with anisotropic dispersal equivalent to ϕ = 0.8 (mean ± SD from 20 unlinked loci), and theoretical predictions for a finite island model with constant anisotropic dispersal (Lehmann and Perrin 2006). All models have D = 50, N = 100, γ = 0.9998, and random mating within a deme (s = 1/N) unless otherwise stated. The finite island anisotropic migration simulations have either a “random” migration matrix every generation or a “constant” migration matrix (for details see Appendix).
                  Classic finite   Collective dispersal    Anisotropic finite island
                  island model     theory
Migration rate    D = 50           D = ∞       D = 50      Random         Constant       Theory
m                 (×10⁻²)          (×10⁻²)     (×10⁻²)     (×10⁻²)        (×10⁻²)        (×10⁻²)
0.1               2.04             2.20        2.14        2.10 ± 0.17    4.37 ± 0.67    4.57
0.2               0.86             1.01        0.98        0.94 ± 0.07    2.36 ± 0.36    2.28
0.5               0.16             0.41        0.38        0.36 ± 0.06    0.96 ± 0.15    0.94
0.8               0.02             0.61        0.57        0.57 ± 0.07    0.92 ± 0.17    0.86
0.9               0.003            0.95        0.87        0.91 ± 0.10    1.08 ± 0.13    1.07

Discussion

We find that collective dispersal decreases expected coalescence times for all pairs of lineages. Collective dispersal therefore weakens the strength of migration relative to genetic drift and mutation, increasing the probability per generation of coalescence for all pairs of lineages and speeding up the rate of turnover at neutral loci. Collective dispersal will also decrease both within-deme and total genetic diversity and increase FST. The collective dispersal of gametes will also amplify the effect of selfing upon FIS. For example, FIS becomes increasingly negative with increasing collective dispersal when selfing rates are low (s < 1/N).

The effect of collective dispersal will be particularly important in systems with high migration rates (e.g., marine invertebrate dispersal; Cowen and Sponaugle 2009). Broquet et al. (in press) studied the significance of collective dispersal in explaining the weak, yet significant, genetic structure commonly observed in marine species (so-called chaotic genetic patchiness; Johnson and Black 1982; David et al. 1997). This genetic structure is not expected based on classical population genetic theory because the dispersal potential of these species is far greater than the spatial scale of the observed genetic differentiation. Using a simplified version of the model presented here, Broquet et al. showed that under certain circumstances, in particular small breeding groups (i.e., small deme size), chaotic genetic patchiness can build up due to collective dispersal. Introducing collective dispersal can therefore change the interpretation of genetic data, and the intuitive role of migration. Systems with high migration rates are not limited to marine systems, and patterns similar to chaotic genetic patchiness are also observed in terrestrial systems (e.g., Torimaru et al. 2007). Our model has assumed that demes are fixed in space, but in cases where the majority of individuals move together it may be more relevant to define a deme by group integrity (e.g., a natal group; Clutton-Brock and Lukas 2012) rather than spatial location.

Increases in genetic differentiation between demes (increasing FST) with increasing collective dispersal are consistent with previous models of two specific types of collective dispersal: propagule-pool dispersal (Whitlock and McCauley 1990; Pannell and Charlesworth 1999) and kin-structured migration (Rogers 1987). This effect of collective dispersal may be enhanced when within-deme groups of related individuals disperse together (e.g., kin-structured dispersal and diploid dispersal). For example, diploid dispersal produces higher equilibrium FST compared to gametic dispersal because the pair of lineages within a diploid always move together (Supporting Information, Figures S1, S2). Kin-structured dispersal is commonly thought of as the movement of family groups with a fairly well defined size (Rogers 1987). Rogers found that increasing family group size strengthened kin-structured dispersal. Our model does not explicitly define the size of a collective group, and instead group size is a random variable where the expected group size is an increasing function of ϕ. Within-deme groups of a fixed size could be incorporated into our framework by including additional hierarchical groupings, in much the same way as diploids introduce within-deme groups containing two lineages.

We have identified two qualitative differences between models with and models without collective dispersal: the possibility that increasing migration rate can increase FST and the breakdown in the invariance of within-deme coalescence times with respect to migration rate. This first effect on FST is counter to the “intuitive” classical role of migration, and is most likely to be seen in systems with small local breeding groups (i.e., systems where drift is strong and equilibrium FST high) and high migration rates (for examples of marine systems where this applies see Broquet et al. in press). One way of understanding this effect is by considering the rate at which pairs of genes separate into different demes (which is the rate math formula). This rate reaches a maximum when math formula. This means that the power of migration to break apart pairs of lineages weakens for sufficiently large values of m. This effect can also be understood in terms of coalescence times. From equations (5a) and (5b) the ratio of within-deme (T2) to between-deme (T3) coalescence times reaches a maximum when math formula and therefore math formula (Slatkin 1991) decreases as migration rate approaches unity. However, for systems with larger local population size (N > 100), the effect of collective dispersal on FST will be hard to detect. The second qualitative effect of collective dispersal, the breakdown of within-deme invariance to migration rate, was also identified by Pannell and Charlesworth (1999) when they included local extinction in a metapopulation model. The effect of collective dispersal on coalescence times, and genetic diversities, has a broader biological relevance than the effects on FST because it does not depend strongly upon local population size. The long-term effect of collective dispersal upon genetic diversity and neutral evolutionary dynamics will be felt even in large populations. 
For example, in the many-deme limit with no selfing (s = 0) and m = 0.8, a collective dispersal of ϕ = 0.5 will reduce the expected coalescence time T2 (and the within-deme diversity) by 34% when N = 10 and by 36% when N = 10⁶ relative to the expectation when ϕ = 0 (when ϕ = 0, T2 = 20.01 and 2.0 × 10⁶ when N = 10 and 10⁶, respectively).
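The weakening of migration's power to separate lineages at high m, discussed above, can be illustrated with a deliberately simplified toy calculation. The sketch below is not the rate expression from our model. It assumes a many-deme limit in which each of two lineages sampled from the same deme is a migrant independently with probability m, and in which two migrant lineages share a source deme with probability phi (a hypothetical stand-in for ϕ). The function names are illustrative only.

```python
def separation_probability(m, phi):
    """Toy per-generation probability that two lineages sampled in the
    same deme trace back to different demes (assumed model, see lead-in)."""
    # Exactly one of the two lineages is a migrant: they are separated.
    one_migrant = 2.0 * m * (1.0 - m)
    # Both are migrants but did NOT travel together (probability 1 - phi).
    both_migrants_apart = m * m * (1.0 - phi)
    return one_migrant + both_migrants_apart


def argmax_migration(phi, steps=1000):
    """Migration rate on a regular grid in [0, 1] that maximises the toy
    separation probability for a given phi."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda m: separation_probability(m, phi))


if __name__ == "__main__":
    for phi in (0.0, 0.25, 0.5, 1.0):
        best_m = argmax_migration(phi)
        print(phi, best_m, separation_probability(1.0, phi))
```

Under these toy assumptions the separation probability is 2m − m²(1 + phi), which increases with m all the way to m = 1 when phi = 0 but peaks at an intermediate migration rate, m = 1/(1 + phi), when phi > 0. This reproduces the qualitative behaviour described in the text without claiming to match our equations.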

Although the migration rate, m, and our measure of collective dispersal, ϕ, describe distinct processes, they can have similar quantitative consequences for equilibrium genetic structure, making them easy to confound. Although still to be developed, a full-coalescent theory for collective dispersal will have more power to distinguish m and ϕ because it will describe samples larger than two (collective dispersal is a process involving movements of groups). Currently, mark-recapture methods (Bennetts et al. 2001), or genetic parentage or assignment methods (Broquet and Petit 2009; Saenz-Agudelo et al. 2009), appear to be the most effective approaches to estimating collective dispersal, although they suffer from important limitations for their application in nature (Broquet and Petit 2009). In marine systems, biological–physical particle tracking models may also be useful in estimating the probability of collective movements (e.g., Ayata et al. 2009; Yearsley and Sigwart 2011).

Our results exploit the simplicity of only considering sample sizes of two (pairwise coancestries and coalescence between pairs of lineages), whereas a full coalescent approach would describe all possible sample sizes. However, there are several technical difficulties in developing a full-coalescent approach for collective dispersal. The group movement implicit in collective dispersal makes a full coalescent model of the process difficult (Hudson 1998; Wakeley 2009) because it is uncertain whether a structured population with collective dispersal converges to the structured coalescent (although certain asymmetric migrations can be described by a structured-coalescent approach; Wilkinson-Herbots and Ettridge 2004). The weak migration approximation (when math formula) is also of limited use because collective dispersal matters most in systems where migration rate is high. A separation of time-scales approximation to the full coalescent may be one possible approach, but to our knowledge the details have not been worked out (Ethier and Nagylaki 1980; Nordborg 1997; Wakeley 1999; Wakeley and Aliacar 2001; Wakeley 2009). Such an approach would describe a recent, relatively quick scattering phase (short time-scale), during which migration separates lineages into different demes. The scattering phase will end when no deme contains more than one lineage from the initial sample. There would then follow an extended collecting phase (long time-scale) that can be described by the Kingman coalescent, during which coalescence can occur when migration brings pairs of lineages into the same deme. Collective dispersal will have a very weak effect in the collecting phase because lineages will predominantly be in different demes (lineages in different demes cannot move collectively). The main effect of collective dispersal is therefore expected to be during the scattering phase.
We would expect collective dispersal to lengthen the scattering phase because it slows down the rate at which lineages in the same deme separate into different demes.

Although the theory presented here was developed to describe collective dispersal it also has direct applicability to the population genetic effects of anisotropic dispersal. We predict that anisotropic dispersal has largely the same consequences for neutral evolutionary dynamics as collective dispersal. The theory is especially relevant when anisotropies are due to random fluctuations in migration rates from one generation to the next. Anisotropic dispersal has recently received more research interest than collective dispersal in both population genetics (e.g., Wilkinson-Herbots and Ettridge 2004; Labonne et al. 2008; Chaput-Bardy et al. 2009; Morrissey and de Kerckhove 2009; Horreo et al. 2011) and demography (e.g., Fagan 2002; Levine 2003; Vuilleumier et al. 2010; Kleinhans and Jonsson 2011). Examples of systems with anisotropic migration are dendritic networks, such as river systems (Campbell Grant et al. 2007) and more complicated topologies, such as hedgerow networks and ocean current systems, which may be described using graph and network theory (Lieberman et al. 2005; Dale and Fortin 2010) or Lagrangian models in marine systems (Siegel et al. 2008). Asymmetric migration has been shown to increase FST for migration patterns that converge to a structured-coalescent (Wilkinson-Herbots and Ettridge 2004) and for random fluctuations in migration rates (Nagylaki 1979), in agreement with results for collective dispersal. The increase in FST due to asymmetric migration also plays a role in Phase III of Wright's Shifting Balance theory (Wade in press). Individual-based simulations of dendritic networks give the surprising results of a quadratic relationship of FST to the proportion of within-stream to overland migration (Chaput-Bardy et al. 2009) and an increase in isolation by distance as network connectivity is increased (Labonne et al. 2008).

Conclusions

Using a two-sample coalescent model we have studied the generic consequences of collective dispersal for neutral evolutionary dynamics in a finite island metapopulation. Truly assessing the importance of collective dispersal as an evolutionary force in real systems will require estimating parameters such as our ϕ parameter (Rogers highlighted the same issue for his parameter of kin-structured migration; Rogers 1987). Our results may allow some estimation of collective dispersal from genetic data, although a full (n-sample) coalescent approach that includes collective dispersal would give greater power to estimate collective dispersal parameters and should be an avenue for future research.

Our method has focused upon neutral dynamics, but collective dispersal is likely to have implications at loci under selection because it will change the balance between the forces of migration and selection. Future research into the balance between migration, drift, and selection under collective dispersal would give us a new description of adaptive dynamics in systems where individual dispersal paths are correlated (e.g., dispersal by ocean currents and wind), perhaps using a similar approach to the selection-migration models of Nordborg (1997). Such research would also be relevant to the evolution of collective dispersal as a dispersal strategy. With a clear link between collective dispersal and patterns of anisotropic dispersal, future work on either will have broader relevance to both types of dispersal.

ACKNOWLEDGMENTS

The authors thank J. Sigwart and S. Baird for their stimulating discussions, and three anonymous reviewers and Ophélie Ronce for their constructive comments and expert associate editorial work. JY was supported by a Ulysses grant from the Irish Research Council. TB was supported by a Ulysses grant from the French ministries “Affaires étrangères et européennes” and “Enseignement supérieur de la recherche.” TB and FV were supported by the “Marine Aliens and Climate Change” programme funded by AXA Research Funds.

Appendix

We simulated the gene frequencies for a finite island model with anisotropic dispersal. The model had D = 50 demes, each with N = 100 diploid individuals per deme. Mating was assumed to be at random within a deme with a selfing rate of s = 1/100. We calculated the gene frequencies for 20 unlinked loci with 50 gene variants per locus. We tested that the finite number of gene variants has a negligible effect when compared against theory that assumes an infinite number of variants. Mutation gave an equal probability of transforming a gene into any of the other 49 gene variants at a locus and occurred at a rate μ = 10⁻⁴ per generation (i.e., γ = 0.9998). Dispersal was assumed to be of gametes. For the anisotropic island model each focal deme had one “dominant” source deme that contributed a proportion (p) of the immigrants (each focal deme had a different dominant source deme), and the remaining 48 source demes contributed equally to the remaining immigrants (each remaining deme accounted for a proportion [1 − p]/48 of immigrants). For the random anisotropic migration model dominant source demes were randomly assigned each generation, whereas for the constant anisotropic migration model the dominant source deme was randomly assigned at the start of a simulation. This anisotropic migration, p, can be related to the collective dispersal parameter, ϕ, when math formula because math formula, which gives

display math(A1)

Results from the constant anisotropic migration model were validated against theoretical expectations (Lehmann and Perrin 2006).
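The immigrant-source structure of the anisotropic island model can be sketched as follows. This is a minimal illustration in Python, not the MATLAB code deposited on Dryad. The function names, the random seed, and the value p = 0.5 are assumptions made for the example. The matrix describes only where a deme's immigrants originate. How it combines with the overall migration rate is left out.

```python
import random


def dominant_sources(n_demes, rng):
    """Randomly assign one dominant source deme to each focal deme so that
    no deme is its own source and no source is used twice."""
    while True:
        perm = list(range(n_demes))
        rng.shuffle(perm)
        # Reject permutations with a fixed point (a deme feeding itself).
        if all(src != focal for focal, src in enumerate(perm)):
            return perm


def immigrant_source_matrix(n_demes, p, dominant):
    """Row i gives the proportion of deme i's immigrants drawn from each
    other deme: p from the dominant source, the rest shared equally."""
    share = (1.0 - p) / (n_demes - 2)  # e.g. (1 - p)/48 when n_demes = 50
    matrix = []
    for focal in range(n_demes):
        row = [share] * n_demes
        row[focal] = 0.0              # immigrants never come from the focal deme
        row[dominant[focal]] = p      # dominant source contributes proportion p
        matrix.append(row)
    return matrix


rng = random.Random(1)                # hypothetical seed, for reproducibility
dominant = dominant_sources(50, rng)
sources = immigrant_source_matrix(50, 0.5, dominant)
```

In this sketch the constant anisotropic model would call `dominant_sources` once at the start of a run, whereas the random anisotropic model would call it again every generation.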

Each simulation was run for 10,000 generations, which was always sufficient to reach a migration–drift–mutation equilibrium. For the final generation, the average within-deme and total heterozygosity at each locus (HS and HT, respectively) were calculated and FST at each locus was estimated using Nei's GST = (HT − HS)/HT. This estimate will be unbiased for our simulated data because mating is random, demes have equal population sizes, and sampling is complete. In general GST can give a biased estimate of FST. The results in Table 1 give the mean and standard deviation for our FST estimates across the 20 loci. All simulations were performed using MATLAB (2012b). The code and the simulation data used in this article are available on the Dryad Repository (Yearsley et al. 2013).
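The per-locus calculation described above can be sketched as follows. This is an illustrative Python version rather than the deposited MATLAB code. It assumes the standard definition of Nei's GST as (HT − HS)/HT, equal deme sizes, and complete sampling, and the function names and input layout are hypothetical.

```python
import statistics


def nei_gst(freqs):
    """Return (HS, HT, GST) for one locus.

    freqs[d][k] is the frequency of gene variant k in deme d; demes are
    assumed to be of equal size, so they are weighted equally.
    """
    n_demes = len(freqs)
    n_variants = len(freqs[0])
    # HS: expected heterozygosity within demes, averaged over demes.
    h_s = sum(1.0 - sum(f * f for f in deme) for deme in freqs) / n_demes
    # HT: expected heterozygosity from the pooled (mean) variant frequencies.
    mean_freqs = [sum(deme[k] for deme in freqs) / n_demes
                  for k in range(n_variants)]
    h_t = 1.0 - sum(f * f for f in mean_freqs)
    if h_t == 0.0:
        return h_s, h_t, float("nan")  # locus is monomorphic: GST undefined
    return h_s, h_t, (h_t - h_s) / h_t


def summarise_across_loci(gst_values):
    """Mean and sample standard deviation of per-locus estimates."""
    return statistics.mean(gst_values), statistics.stdev(gst_values)
```

For two demes fixed for different variants, `nei_gst([[1.0, 0.0], [0.0, 1.0]])` returns HS = 0, HT = 0.5, and GST = 1, whereas two demes with identical frequencies give GST = 0.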
