How species evolve collectively: implications of gene flow and selection for the spread of advantageous alleles


Carrie L. Morjan. Fax: 812 855 6705; E-mail:


The traditional view that species are held together through gene flow has been challenged by observations that migration is too restricted among populations of many species to prevent local divergence. However, only very low levels of gene flow are necessary to permit the spread of highly advantageous alleles, providing an alternative means by which low-migration species might be held together. We re-evaluate these arguments given the recent and wide availability of indirect estimates of gene flow. Our literature review of FST values for a broad range of taxa suggests that gene flow in many taxa is considerably greater than suspected from earlier studies and often is sufficiently high to homogenize even neutral alleles. However, there are numerous species from essentially all organismal groups that lack sufficient gene flow to prevent divergence. Crude estimates on the strength of selection on phenotypic traits and effect sizes of quantitative trait loci (QTL) suggest that selection coefficients for leading QTL underlying phenotypic traits may be high enough to permit their rapid spread across populations. Thus, species may evolve collectively at major loci through the spread of favourable alleles, while simultaneously differentiating at other loci due to drift and local selection.


The nature of species is one of the most widely debated topics in evolutionary biology. The traditional view is that species evolve as cohesive units held together by gene flow, which acts to prevent populations from differentiating through local adaptation or genetic drift (Futuyma 1987). This view originated with Darwin (1859, ch. 4), who commented that ‘Intercrossing plays a very important part in nature in keeping individuals of the same species, or of the same variety, true and uniform in character.’ Similarly, Mayr (1963, pp. 521–522) argued that ‘The steady and high genetic input caused by gene flow is the main factor responsible for genetic cohesion among the populations of a species.’ Population genetic theory provides a rough estimate of the amount of gene flow required to hold species together. When gene flow among populations exceeds about four migrants per generation neutral alleles are homogenized among populations, effectively producing a panmictic species (Hartl & Clark 1997). Species cohesion is thought to break down when gene flow is reduced among populations to fewer than one migrant per generation, allowing differentiation to occur through the fixation of alternative alleles through genetic drift (Wright 1931).

The view that gene flow is important to species integration has been challenged by reports that contemporary gene flow among natural populations is generally too low to prevent divergence (Ehrlich & Raven 1969; Levin 1979). Indirect estimates of gene flow from molecular marker surveys indicate that historical levels of gene flow are probably somewhat higher than those estimated from direct observation, but confirm that for many taxa levels of gene flow are too low to overcome genetic drift and local selection within populations (Slatkin 1985a; Ellstrand & Elam 1993; Levin 2000). These data have led some researchers to conclude that populations, rather than species, are the units of evolution (Ehrlich & Raven 1969), and that species are arbitrary assemblages of these independently evolving units and no different from higher levels of taxonomic organization (Mishler & Donoghue 1982; Ereshefsky 1999; Mishler 1999). The implications of this conclusion are profound: if species are merely a collection of individuals that have shared an ancestor in the recent past, then conspecific populations may typically evolve along independent trajectories, the study of reproductive isolation becomes a largely meaningless venture and the use of the species taxon in estimates of biodiversity can no longer be justified (Bachmann 1998).

The widespread recognition that levels of gene flow seem too low in some species to hold them together has led to a variety of responses. As mentioned above, some scientists have advocated the abolishment of the species category (e.g. Mishler 1999) or at least recognition that it is an artificial construct (Bachmann 1998; Ereshefsky 1999). Others have suggested replacements to the biological species concept that either rely on multiple evolutionary forces to account for species cohesion (e.g. Templeton 1989; Levin 2000) or that do not require knowledge of the evolutionary processes that unify species (e.g. de Queiroz 1999).

Rejection of the biological species concept on the grounds of too little gene flow may be premature, however, because these arguments do not fully consider the interactions of gene flow with selection. Gene flow interacts with selection in two ways (Slatkin 1987). It may prevent adaptive genetic differentiation if the migration rate (m) exceeds fitness differences (s) (Slatkin 1987); it is this interaction that has received the most attention from students of speciation. For example, narrow clines within a species range suggest that relatively weak selection may overcome the conservative effect of gene flow (Barton & Hewitt 1985). However, selection also controls the rate of spread of advantageous alleles (Rieseberg & Burke 2001). Given sufficiently strong levels of selection for globally advantageous mutations, their rapid spread across populations could allow species to evolve as a cohesive unit, even for those populations connected with very low levels of gene flow. In this paradigm, species with low rates of gene flow (Nem < 4) may evolve collectively at these ‘major’ genes, while experiencing uniformity or divergence at neutral alleles (depending on the level of gene flow), and divergence at loci responsible for local adaptation. This hypothesis for the creative role of gene flow, if valid, could expand the purview of the biological species concept to essentially all sexual species.

The theoretical basis for this hypothesis derives from studies by Fisher (1937) and Slatkin (1976). Fisher (1937) demonstrated that the rate of spread of advantageous alleles depends largely on the diffusion coefficient of dispersal and the selection coefficient. Population subdivision has little effect on reducing the rate of spread as long as Nem is fairly large; however, its effects are unclear when migration rates are low (Nem < 1). This work was extended by Slatkin (1976), who modelled the number of generations necessary for an advantageous mutation to spread unidirectionally across a subdivided population in a stepwise manner. Slatkin's model demonstrates that the strength of selection has a greater effect on the rate of spread of advantageous mutations across subdivided populations than does the organismal migration rate (Fig. 1), and that rare, long-distance dispersal events can greatly enhance the rate of spread. Even in the most extreme case, with gene flow as low as 0.1 migrants per generation, which may often be the case for selfing annuals (Hamrick & Godt 1996), mutations with selective coefficients (s) between 0.05 and 0.01 can spread across 20 populations in 4000–18 000 generations, respectively. This is relatively rapid considering that the duration of herb species has been estimated to average 10 myr (Levin & Wilson 1976). The clear message is that alleles with high selective advantage are able to spread quite rapidly across population systems, even when levels of gene flow are much lower than traditionally thought necessary for species cohesion.

Figure 1.

Number of generations for an advantageous allele of selective advantage s to spread across a species range. The data are based on Table 1b provided by Slatkin (1976), which are results from a stepping-stone simulation model assuming that 20 steps are required for an allele to spread across 20 populations (n = 500 per population) in a linear arrangement.
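The qualitative pattern in Figure 1 can be illustrated with a minimal sketch. This is not Slatkin's (1976) stochastic simulation. It is a deterministic recursion for a linear chain of 20 demes under several labelled assumptions: genic selection (carriers have relative fitness 1 + s), nearest-neighbour migration at rate m per generation, the allele already fixed in the first deme, and "spread" defined as exceeding a frequency of 0.5 in the last deme. The migration rate is taken as m = Nem/N with N = 500, following the figure caption. Because drift is ignored, the generation counts will not match the published values; the sketch only shows how the time to spread depends on s and m.

```python
def generations_to_spread(s, m, n_demes=20, threshold=0.5, max_generations=200_000):
    """Generations until the allele exceeds `threshold` in the last deme.

    Deterministic stepping-stone recursion (illustrative assumption, no drift).
    Returns None if the threshold is not reached within `max_generations`.
    """
    p = [0.0] * n_demes
    p[0] = 1.0  # assumption: allele already fixed in the deme of origin
    for generation in range(1, max_generations + 1):
        # Genic selection within each deme: carriers have relative fitness 1 + s.
        p = [x * (1 + s) / (1 + s * x) for x in p]
        # Nearest-neighbour migration; edge demes exchange with one neighbour only.
        q = p[:]
        for i in range(n_demes):
            left = p[i - 1] if i > 0 else p[i]
            right = p[i + 1] if i < n_demes - 1 else p[i]
            q[i] = (1 - m) * p[i] + 0.5 * m * (left + right)
        p = q
        if p[-1] > threshold:
            return generation
    return None


# Hypothetical parameter values: Nem = 0.1 and N = 500 give m = 0.0002.
m_low = 0.1 / 500
for s in (0.05, 0.01):
    print(s, generations_to_spread(s, m_low))
```

In this sketch, raising s shortens the time to spread more than a comparable proportional increase in m, because the waiting time in each deme scales roughly with 1/s but only logarithmically with 1/m.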

The lack of attention given to the role of gene flow in the dissemination of advantageous alleles is due partly to theoretical emphasis on Fisher's (1930) infinitesimal model, which assumes that most advantageous mutations have individually slight phenotypic effects, implying that selection coefficients associated with these mutations will be small as well. Indeed, Ehrlich & Raven (1969) noted that the fate of a new allele arising ‘will presumably be governed by the kind of gloomy odds facing mutant genes’, and concluded that its likelihood of spreading among populations is remote. Recent theoretical work confirms that weakly advantageous alleles will take nearly as long as neutral alleles to spread across a subdivided population (Fig. 2, Cherry & Wakely 2003). Indeed, the timescales are sufficiently large that a mutation with a similar phenotypic effect may arise independently in a local population rather than be imported through migration. The general focus of population genetic theory on weakly selected mutations contrasts with the wealth of empirical evidence for an oligogenic basis for adaptation (Orr 1999), and perhaps for species differences (Coyne & Orr 1998; Orr 2001). Assumptions of weak selection common to all diffusion equations may also cause difficulty in predicting the dynamics of strongly advantageous mutations. For example, Whitlock (2003) showed through simulation that although fixation times for strongly advantageous alleles are predicted accurately by diffusion, the probability of fixation is underestimated as the strength of selection increases. More theoretical work is needed concerning the role of alleles with high selective advantage in evolutionary processes.

Figure 2.

Number of generations for a single advantageous allele of selective advantage s to spread across a species range. The data are based on Table 2 provided by Cherry & Wakely (2003), which are results of simulations based on Wright's island model of equal exchange of migrants among populations.

In addition to the relative effects of selection and migration, extinction and recolonization may have important effects on the genetic consequences and evolution of subdivided populations (McCauley 1991; Harrison & Hastings 1996; Whitlock 2003). Although Slatkin (1977, 1985a, 1987) suggested that recurrent extinction and recolonization constituted another form of gene flow and hence would reduce genetic differentiation among populations, Wade & McCauley (1988) argued that the mode in which vacant sites are founded could either increase or decrease genetic differentiation. Most subsequent work on the genetic and evolutionary consequences of metapopulations involves the dispersion of genetic complexes related to Wright's shifting balance theory (Slatkin 1989; Barton 1993; Rouhani & Barton 1993; for a review see Whitlock & Phillips 2000). Although we do not address specifically the implications of extinction and recolonization on the spread of globally advantageous alleles, we acknowledge that metapopulation structure may play an additional role in the collective evolution of species.

Although Slatkin's (1976) theoretical model provides a plausible basis for the collective evolution of low gene flow species, more information is needed to evaluate its applicability. For example, as far as we are aware we do not yet have empirical estimates of the strength of selection on any mutation contributing to the fixed differences between species. However, there are numerous estimates of the strength of selection on phenotypic traits, which are controlled typically by multiple genes (Falconer & Mackay 1996). To provide a rough guess of what the strength of selection on individual mutations contributing to species differences might be, Rieseberg & Burke (2001) compiled information on selection coefficients for phenotypic traits and the genetic architecture of quantitative traits in plants. Their crude estimations suggested that selection coefficients for genes of major effect may be surprisingly high. First, the average leading quantitative trait loci (QTL) for 50 traits from interspecific or intersubspecific crosses explained 31% of the phenotypic variance in the mapping population. Rieseberg & Burke (2001) then used this information to estimate that the strength of selection (s) for a leading hypothetical QTL averaged 0.11 in plants. In light of Slatkin's (1976) model, such alleles are likely to spread quite rapidly across a species range despite very low levels of gene flow (Fig. 1).

It is unclear whether a similar case can be made for animals; some authors argue that animal traits are more likely than plant traits to be influenced by numerous genes because of the more integrative nature of animal development (Gottlieb 1984). Also, while gene flow is considered to be too low to explain the cohesiveness of plant species (Levin 1979), it is not thought typically to be limiting in animal species, except possibly for amphibian and freshwater fish species (Ward et al. 1992).

In this study, we provide a comprehensive review for sexual animals, bacteria, fungi and plants of historical estimates of gene flow, the strength of selection on phenotypic traits and the genetic architecture of inter- and intraspecies differences. We ask specifically: (1) are levels of gene flow sufficiently high in these different organismal groups to hold conspecific populations together across all loci or just at those loci with unconditionally advantageous alleles; and (2) are estimated selection coefficients large enough for models of collective evolution through the spread of advantageous alleles? Based on the findings from our literature review, we argue that the view of species as passive end products of evolution is premature, as it is based on early claims that systematically underestimated levels of gene flow, and failed to recognize the creative role of gene flow in collective evolution. We acknowledge, however, that our arguments are based on very crude estimates of selection coefficients and simplistic theoretical models. None the less, we hope the information reviewed here can be used as an impetus for empirical studies that estimate selection coefficients for mutations that contribute to species’ differences, as well as for the proper parameterization of theoretical models that build on those of Slatkin (1976), Cherry & Wakely (2003), and Whitlock (2003).


Gene flow

We compiled estimates of gene flow from the literature by reviewing all studies published in Molecular Ecology, from its first issue in May 1992 to December 2002, for FST, GST, Nem and ΦST values for plants, fungi and animals. In some studies, ΦST could be calculated directly from percentage variation within and among subpopulations provided from AMOVA tables (Excoffier et al. 1992). In cases where overall population subdivision was not reported, values were calculated as the average of all pairwise comparisons. We also included in the database information about geographical ranges, mating systems and types of molecular markers used in the analyses. For studies not reporting Nem values, Nem was calculated from FST, GST or analogous statistics for nuclear markers as Nem = (1 − FST)/(4FST) (Wright 1931) or as Neom = 0.5[(1/FST) − 1] for chloroplast and mitochondrial markers (Birky et al. 1989; McCauley 1994). We also calculated GST and Nem estimates from allozyme frequencies reported from a database including 160 and 538 studies from plants and animals, respectively (Ward et al. 1992). Because this study synthesizes the results of many different studies, we performed meta-analysis to account for differences in sampling variance among studies (Cooper & Hedges 1994; Gurevitch & Hedges 1999). Therefore, all summary statistics and analyses using GST values (and their analogues) and Nem values were weighted by the number of loci used for each estimate. An ANOVA evaluating patterns of life history and geographical distributions on normally distributed, log-transformed Nem included only data from the Molecular Ecology database, as most (if not all) of the independent variables were not available for the Ward et al. (1992) database. However, all available data are included in tables and summary statistics when possible.
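The two conversions and the loci-weighted averaging described above can be sketched as follows. The relations are the island-model expressions Nem = (1 − FST)/(4FST) for nuclear markers and Neom = 0.5[(1/FST) − 1] for organellar markers. The function names and the example studies are hypothetical and serve only as illustration; they are not values from the database.

```python
def nem_nuclear(fst):
    """Island-model estimate for nuclear markers: Nem = (1 - FST) / (4 * FST)."""
    if not 0 < fst <= 1:
        raise ValueError("FST must lie in (0, 1]")
    return (1 - fst) / (4 * fst)


def nem_organellar(fst):
    """Estimate for chloroplast and mitochondrial markers: Neom = 0.5 * (1/FST - 1)."""
    if not 0 < fst <= 1:
        raise ValueError("FST must lie in (0, 1]")
    return 0.5 * (1 / fst - 1)


def weighted_mean(values, weights):
    """Mean of `values` weighted by `weights` (here, the number of loci per study)."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total


# Hypothetical studies: (FST, number of loci). Not data from the review.
studies = [(0.20, 6), (0.10, 12), (0.35, 4)]
nem_values = [nem_nuclear(fst) for fst, _ in studies]
loci = [n_loci for _, n_loci in studies]
print(weighted_mean(nem_values, loci))
```

Weighting by the number of loci gives more influence to estimates based on more markers, which is one simple way to account for differences in sampling variance among studies.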

Interpretation of Nem values with respect to the broader question of species cohesion requires caution due to many unrealistic assumptions underlying the island model used to estimate FST (Whitlock & McCauley 1999). For example, biases in estimation may occur by assuming no selection or mutation, equal numbers of migrants, equilibrium between genetic drift and gene flow and lack of spatial structure among populations. We presented both FST and Nem estimates to prevent introducing further bias in interpretation.


Our second goal was to provide a rough estimate of what the strength of selection on individual mutations that define species differences might be. Although Rieseberg & Burke (2001) calculated the average strength of selection for a leading QTL that contributes to species differences in plants, no such estimates were made for animals. We similarly estimated the strength of selection for individual mutations by a two-step process. We first compiled estimates of the strength of selection on phenotypic traits from two review studies (Kingsolver et al. 2001; Lexer et al. 2003a), and added studies from the literature that measured selection in experimentally manipulated or disturbed populations. Second, we evaluated studies reporting the genetic architecture of inter- and intraspecies differences in animals, to estimate the average effect size of a leading QTL for a phenotypic trait. From this compiled information, we estimated the average strength of selection on major QTLs that contribute to species differences (following Rieseberg & Burke 2001; see Results).

To evaluate the strength of selection in experimentally manipulated and disturbed populations, peer-reviewed literature included in the Science Citation Index was searched using keywords (e.g. phenotypic selection, selection gradient, selection differential, experiment, manipulated) and references provided within the literature found through these search results were also surveyed. Criteria for studies included in our database were as follows: (1) only nondomesticated populations were included; (2) phenotypic variation in traits must be experimentally varied, either through direct manipulation or by comparing locally adapted strains (populations) in a common garden; and (3) standardized directional selection gradient or selection differential must be included. The linear selection gradient, β, measures the intensity of phenotypic selection on a trait, independent of the confounding effects of indirect selection on correlated characters (Lande & Arnold 1983). Linear selection differentials (i) estimate the strength of selection on a trait through direct selection and also through selection on correlated traits. (4) Selection analyses must include the groups expressing experimentally enhanced phenotypic variation. Unfortunately, the majority of studies using experimentally manipulated populations found in our search were necessarily excluded because they failed to meet the above criteria. Most frequently, studies evaluated whether statistically significant differences existed among groups or experimental treatments, but did not estimate the strength of selection. Other studies did not combine experimental treatments or locally adapted strains within selection estimates, thereby failing to evaluate the strength of selection for populations with experimentally enhanced phenotypic variation. One study was excluded because selection gradients were not standardized (Sinervo et al. 1992). 
One review study (Ennos & McConnell 1995) provided a table of selection coefficients calculated by the authors from several studies of competitive fungal growth. Those studies were also included in the database. All summary statistics and analyses using selection estimates were weighted by sample size to account for among-study sampling variation.
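The distinction between selection differentials and selection gradients can be made concrete with a small sketch for two standardized traits, following the general approach of Lande & Arnold (1983). The differential for each trait is its covariance with relative fitness; the gradients are obtained by multiplying the vector of differentials by the inverse of the trait correlation matrix. The function names and the data are hypothetical: fitness is constructed to depend on the first trait only, so the second trait shows a nonzero differential purely through its correlation with the first.

```python
from statistics import mean, pstdev


def standardize(xs):
    """Rescale a trait to mean 0 and (population) standard deviation 1."""
    mu, sd = mean(xs), pstdev(xs)
    return [(x - mu) / sd for x in xs]


def covariance(xs, ys):
    """Population covariance (divides by n, consistent with pstdev)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def selection_estimates(trait1, trait2, fitness):
    """Return ((i1, i2), (beta1, beta2)) for two traits and absolute fitness."""
    z1, z2 = standardize(trait1), standardize(trait2)
    w_bar = mean(fitness)
    w = [f / w_bar for f in fitness]  # relative fitness, mean 1
    i1, i2 = covariance(w, z1), covariance(w, z2)  # selection differentials
    r = covariance(z1, z2)  # correlation between the standardized traits
    # Gradients: inverse of [[1, r], [r, 1]] applied to the differentials.
    beta1 = (i1 - r * i2) / (1 - r * r)
    beta2 = (i2 - r * i1) / (1 - r * r)
    return (i1, i2), (beta1, beta2)


# Hypothetical data: fitness depends on trait 1 only.
trait1 = [1, 2, 3, 4, 5, 6]
trait2 = [2, 1, 4, 3, 6, 5]  # correlated with trait 1
fitness = [1 + 0.1 * x for x in trait1]
differentials, gradients = selection_estimates(trait1, trait2, fitness)
print(differentials, gradients)
```

In this constructed case the gradient on the second trait is zero up to rounding error while its differential is positive, which is the signature of indirect selection on a correlated character.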

We also reviewed the literature using the Science Citation Index, and literature cited from references listed therein, for studies reporting the effect sizes of QTL detected for phenotypic traits in undomesticated animals. In the database we included information on the species, inter- or intraspecies cross, minimum number of QTL affecting the trait and effect sizes. We also recorded the method used to determine the effect size, whether in standard deviations (Keightley & Bulfield 1993), percentage of phenotypic variance explained (PVE: Lander & Botstein 1989), or percentage of difference in parental means (%D: Zeng 1992). Summary statistics for these studies were not weighted, as very few of the studies provided sample sizes or variances for estimates of QTL effect sizes. All data for gene flow, selection and QTL estimates are provided in tables S1–S5 as a supplementary electronic appendix.

Of course, any biological interpretation of these data must recognize sources of ascertainment and publication bias. For example, spurious correlations between traits and fitness may occur through environmental variation among individuals, causing bias in magnitude and direction of selection estimates (Scheiner et al. 2002; Stinchcombe et al. 2002). Kingsolver et al. (2001) note that phenotypic traits are not chosen at random for selection studies. Studies are probably biased towards traits where strong selection is suspected to occur, thereby overestimating the frequency of large selection gradients in nature. Furthermore, publication trends preferentially exclude studies in which selection is weak or undetected, further biasing estimates of selection gradients upward. Finally, most selection studies tend to have low statistical power (Kingsolver et al. 2001), reducing the ability to detect weak selection.

There are also several sources of bias in estimating the number of loci and their effects on a quantitative trait. For example, a QTL may include the combined effects of more than one closely linked locus (Steinmetz et al. 2002). Linked QTL with opposing effects are unlikely to be detected, whereas linked QTL with effects in the same direction may appear as a single QTL with large effect (Barton & Keightley 2002). A QTL may also reflect more than one mutation even within a gene (Stam & Laurie 1996). Effect sizes are likely to be overestimated with small sample sizes and low marker density (Beavis 1998; Orr 2001), and if they ignore the effects of environmental variance. However, a simulation study showed that QTLs with large effect (> 15% PVE) may be systematically underestimated, whereas effect sizes for minor QTLs may be overestimated, when effect sizes are estimated from the same analysis that detects the QTL (Flint 2003). Many estimation methods for QTL effects also ignore interactions with other segregating QTL, which may also introduce bias (Mackay 2001).


Gene flow

Estimates of historical levels of gene flow (Nem) from molecular markers in plants are consistent with predictions from their life histories and with previous literature reviews (Hamrick & Godt 1989; Heywood 1991; Hamrick & Godt 1996). For example, median Nem estimates were higher for molecular markers transmitted only through pollen than for those through seeds or for those through both seeds and pollen (Table 1), confirming similar patterns of population subdivision reported previously by Ennos (1994) and Latta & Mitton (1997). ANOVAs using log-transformed FST estimates or Nem from nuclear data revealed significant differences in gene flow by mating system (Table 2). Median FST and Nem estimates from nuclear markers indicated higher gene flow among outcrossers and those with mixed mating systems than selfers (Table 3), as shown previously by Hamrick & Godt (1989) and Heywood (1991). Estimates of FST and Nem in plants were also affected by the proportion of the species range sampled (Table 2). Further analysis of sampling ranges showed that studies at local levels for plants provided much higher estimates of gene flow than studies that sampled a greater portion of the species’ range (i.e. biregional, regional and species-wide), with no significant differences occurring among the remaining sampling ranges (Table 3; FST: d.f. = 2, 87; P = 0.53; Nem: d.f. = 2, 87; P = 0.77). In contrast to patterns of population subdivision reported by Hamrick (1983) and Loveless & Hamrick (1984), no differences in gene flow were found based on species’ geographical distributions (endemic, regional or widespread; Table 2). Although population subdivision may be affected by interactions between mating system and geographical range (e.g. Table 5 in Hamrick & Godt 1996), no interaction between these two variables was found (Table 2); neither were gene flow estimates affected by whether the data were derived from allozyme or DNA-based markers (Table 2).

Table 1.  Estimates of gene flow by mode of transmission in four taxonomic kingdoms

| Mode of transmission | F̃ST* n | F̃ST* mean ± SE | F̃ST* median | Nem n | Nem mean ± SE | Nem median |
| --- | --- | --- | --- | --- | --- | --- |
| Plants | | | | | | |
| Pollen | 8 | 0.39 ± 0.08 | 0.13 | 8 | 2.51 ± 1.38 | 3.45 |
| Seed | 29 | 0.46 ± 0.05 | 0.44 | 29 | 6.92 ± 3.42 | 0.64 |
| Both | 294 | 0.32 ± 0.01 | 0.18 | 289 | 1.82 ± 0.28 | 1.10 |
| Animals | | | | | | |
| Mitochondrial | 150 | 0.45 ± 0.02 | 0.45 | 151 | 6.83 ± 3.14 | 0.60 |
| Nuclear | 781 | 0.20 ± 0.01 | 0.10 | 781 | 6.30 ± 0.63 | 2.12 |
| Fungi | | | | | | |
| Haploid | 5 | 0.24 ± 0.05 | 0.20 | 5 | 2.36 ± 1.46 | 1.96 |
| Diploid | 14 | 0.19 ± 0.04 | 0.18 | 13 | 2.15 ± 1.02 | 1.13 |
| Bacteria | | | | | | |
| Haploid | 3 | 0.10 ± 0.02 | 0.10 | 3 | 4.14 ± 0.42 | 4.50 |

* Consensus based on three methods for evaluating population subdivision: FST, GST and ΦST. These values are combined in the table because ΦST serves as an analogue to FST (Excoffier et al. 1992) and GST bears a similar relationship to Nem as FST (Avise 1994); combining them is standard practice in tables summarizing population subdivision (e.g. Heywood 1991; Bohonak 1999).

Table 2.  Results from ANOVA evaluating effects of mating system, experimental design, and geographical distribution on (1) log-transformed combined gene flow estimates (top values: F̃ST; see footnote to Table 1) and (2) log-transformed estimates of number of migrants (bottom values: Nem) in plants and animals. The interaction effect was excluded from the animal study due to lost degrees of freedom.

| Source of variation | Response | Plants d.f. | Plants F | Plants P | Animals d.f. | Animals F | Animals P |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Mating system* | F̃ST | 3, 117 | 4.69 | 0.0039 | 3, 233 | 2.37 | 0.072 |
| | Nem | 3, 117 | 4.57 | 0.0046 | 3, 235 | 3.71 | 0.012 |
| Portion of range sampled† | F̃ST | 3, 117 | 4.22 | 0.0071 | 3, 233 | 18.72 | 0.0001 |
| | Nem | 3, 117 | 3.62 | 0.0152 | 3, 235 | 19.87 | 0.0001 |
| Geographical distribution‡ | F̃ST | 2, 117 | 1.06 | 0.35 | 2, 233 | 2.66 | 0.072 |
| | Nem | 2, 117 | 2.09 | 0.13 | 2, 235 | 2.68 | 0.071 |
| Mating system × geographical distribution | F̃ST | 6, 117 | 0.27 | 0.95 | NA | NA | NA |
| | Nem | 6, 117 | 0.62 | 0.71 | NA | NA | NA |
| Nucleotides or allozymes | F̃ST | 1, 117 | 1.34 | 0.25 | 1, 233 | 0.0002 | 0.99 |
| | Nem | 1, 117 | 0.80 | 0.37 | 1, 235 | 0.37 | 0.54 |

* Outcrossing, selfing and clonal, mixed mating and unknown.
† Local, regional, biregional and species-wide.
‡ Endemic, regional or widespread.

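Table 3.  Estimates of gene flow by mating system in plants and by mating system and geographical distribution in animals using nuclear markers

| Taxon variable | F̃ST* n | F̃ST* mean ± SE | F̃ST* median | Nem n | Nem mean ± SE | Nem median |
| --- | --- | --- | --- | --- | --- | --- |
| Plants: mating system | | | | | | |
| Outcrossing | 74 | 0.29 ± 0.02 | 0.14 | 73 | 1.38 ± 0.31 | 1.47 |
| Mixed | 174 | 0.30 ± 0.02 | 0.18 | 170 | 2.99 ± 0.57 | 1.17 |
| Selfing or clonal | 22 | 0.43 ± 0.03 | 0.36 | 22 | 0.43 ± 0.06 | 0.45 |
| Unknown | 24 | 0.48 ± 0.06 | 0.18 | 24 | 0.91 ± 0.27 | 1.15 |
| Plants: range sampled | | | | | | |
| Local | 43 | 0.25 ± 0.02 | 0.16 | 43 | 2.04 ± 0.67 | 1.19 |
| Regional | 52 | 0.39 ± 0.04 | 0.26 | 52 | 1.12 ± 0.24 | 0.71 |
| Biregional | 2 | 0.41 ± 0.08 | 0.37 | 2 | 0.51 ± 0.10 | 0.56 |
| Species-wide | 36 | 0.33 ± 0.03 | 0.25 | 36 | 0.82 ± 0.17 | 0.80 |
| Animals: mating system | | | | | | |
| Outcrossing | 216 | 0.22 ± 0.01 | 0.11 | 219 | 4.67 ± 0.87 | 2.09 |
| Mixed | 14 | 0.40 ± 0.10 | 0.16 | 13 | 7.55 ± 6.19 | 1.50 |
| Selfing or clonal | 11 | 0.24 ± 0.06 | 0.22 | 11 | 2.99 ± 1.50 | 0.90 |
| Unknown | 2 | 0.27 ± 0.11 | 0.33 | 2 | 0.80 ± 0.29 | 0.63 |
| Animals: range sampled | | | | | | |
| Local | 65 | 0.10 ± 0.01 | 0.08 | 64 | 7.54 ± 1.41 | 3.13 |
| Regional | 116 | 0.28 ± 0.02 | 0.12 | 116 | 3.25 ± 1.09 | 1.88 |
| Biregional | 14 | 0.17 ± 0.04 | 0.02 | 17 | 17.21 ± 6.32 | 1.52 |
| Species-wide | 48 | 0.23 ± 0.02 | 0.14 | 48 | 3.44 ± 1.73 | 1.56 |
| Animals: geographical distribution | | | | | | |
| Widespread | 56 | 0.22 ± 0.02 | 0.07 | 58 | 7.16 ± 2.44 | 3.35 |
| Regional | 143 | 0.25 ± 0.01 | 0.12 | 143 | 3.60 ± 0.70 | 1.89 |
| Endemic/relict | 44 | 0.19 ± 0.02 | 0.16 | 44 | 4.09 ± 2.22 | 1.32 |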

In animals, mating system, portion of range sampled and geographical distribution of species were significant (or nearly statistically significant) factors affecting gene flow, as estimated by log (FST) and log (Nem) for nuclear data (Table 2). As in plants, median FST and Nem estimates indicated higher levels of gene flow for outcrossing species and species with mixed mating systems than for species with selfing or unknown mating systems (Table 3). Sampling ranges from animal studies also exhibited patterns of gene flow similar to those in plant studies, with local sampling revealing higher median Nem and lower mean FST than studies at the regional, biregional and species-wide levels. The highest levels of gene flow were exhibited by widespread species, followed by moderate levels in regionally distributed species and endemic/relict species (Table 3). Nevertheless, these differences across geographical distributions may be due to historical relationships among populations rather than reflecting current patterns in gene flow. Gene flow levels also varied according to the taxonomic class or life history of the animal [Fig. 3; combined Molecular Ecology and Ward et al. (1992) data sets]. For example, as reported in an earlier review (Slatkin 1985b), birds and marine fishes exhibited high levels of gene flow (median FST < 0.05; median Nem > 5), whereas low levels (median FST > 0.23; median Nem < 1) were displayed by snails, amphibians and freshwater fish. Note that median Nem values are more useful than means in comparisons among groups or to theory because mean values tend to be inflated upwards by a small number of very high Nem values. In general, variation in migration rate among taxonomic groups appears to be related to differing dispersal capacities, which are known to affect population structure in animals (Bohonak 1999).
As with plants, organellar genes showed greater levels of differentiation than did nuclear genes (Table 1); however, this difference could be due to either the lower effective population size of organelle genomes (Birky et al. 1983) or to variation in dispersal rates.
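The remark above about medians and means follows from the shape of the island-model relation Nem = (1 − FST)/(4FST), which grows without bound as FST approaches zero. A minimal sketch with hypothetical FST values (not data from the review) shows how a single weakly differentiated study can pull the mean well above the median.

```python
from statistics import mean, median


def nem_from_fst(fst):
    """Island-model conversion for nuclear markers: Nem = (1 - FST) / (4 * FST)."""
    return (1 - fst) / (4 * fst)


def summarize_nem(fst_values):
    """Return (mean, median) of Nem for a list of FST values."""
    nem = [nem_from_fst(f) for f in fst_values]
    return mean(nem), median(nem)


# Hypothetical FST values; the last one is close to zero.
fst_values = [0.30, 0.25, 0.20, 0.15, 0.01]
mean_nem, median_nem = summarize_nem(fst_values)
print(mean_nem, median_nem)
```

Here four of the five hypothetical studies imply fewer than two migrants per generation, yet the mean exceeds four because of the single study with FST = 0.01; the median is unaffected.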

Figure 3.

Estimates of mean and median gene flow (± range and SE, n in parentheses) as evaluated by (A) F̃ST (see footnote for Table 1) and (B) Nem for common taxonomic groups of animals using allozymes and nuclear data.

Moderate to low levels of gene flow were reported from 19 studies of genetic diversity in fungi (mean FST = 0.20; median Nem = 1.17, Table 1). Also, the three studies reporting FST or GST estimates for bacteria suggested moderate gene flow (mean FST = 0.10; median Nem = 4.5, Table 1).

Despite the unrealistic assumptions underlying the island model (Whitlock & McCauley 1999), our Nem estimates do not appear to be so biased as to mask expected patterns of diversity (Table 2). Our review indicates that levels of gene flow between 0.5 and 1 migrants per generation are very common for plant, animal and fungal species, with most estimates less than four (Fig. 4). Indeed, every taxonomic group, even those with average gene flow levels, had representatives in the low gene flow group (Nem < 1) and some taxa, particularly selfing plants, snails, amphibians and freshwater fish, had median Nem of less than one (Fig. 3). Such levels of gene flow may be too low to prevent differentiation at some loci through drift or local adaptation, but high enough to allow rapid spread of advantageous mutations (Fig. 1).

Figure 4.

Frequency histogram of (A) F̃ST (see footnote for Table 1) and (B) migration rate (Nem) for total nuclear data from plants and animals. Data for Nem are binned for Nem = 0.01, 0.1, 0.25, 0.5, 1, and binned into 1-unit intervals thereafter.

The strength of selection in experimental populations

Our database included information from the following sources: (1) Kingsolver et al. (2001), who report 993 linear selection gradients and 753 selection differentials for phenotypic traits in 62 studies of natural populations; (2) Lexer et al. (2003a), who provide 149 estimates of directional selection gradients and 27 selection differentials from eight studies of experimental hybrids in natural populations; and (3) the present review, in which 133 selection gradients and 96 selection differentials from experimentally manipulated or disturbed populations were compiled from 26 studies. Of the 172 traits in this new compilation, 74 were from animals, 69 from plants and 29 from fungi. Because the magnitude, not the direction, of the estimates is of primary interest, the following results report absolute values of directional selection gradients and differentials unless otherwise indicated.

Five studies from manipulated/disturbed populations reported both selection gradients and differentials for 57 traits, allowing us to evaluate differences between the two estimates. Although correlated (r = 0.59 for the nontransformed estimates), selection differentials were significantly greater in magnitude than were selection gradients (paired t-test: mean linear differential i = 0.33; mean linear gradient β = 0.26; t = 4.08, d.f. = 56, P = 0.0001). These results suggest that indirect selection on correlated traits affected some estimates of selection differentials. On the other hand, significant effects of indirect selection on correlated traits were not widely evident in natural populations (Kingsolver et al. 2001). A paired t-test using these values from the database provided by Kingsolver et al. (2001) confirms a lack of significant differences between β and i (t = 1.50, d.f. = 374, P = 0.13). Similarly, for the four hybrid studies reporting both selection gradients and differentials for 27 traits (Lexer et al. 2003a), no systematic biases were observed between the two estimates (paired t-test: t = 0.992, d.f. = 26, P = 0.33). Given that indirect selection is not typically widespread in selection studies, its presence in studies of manipulated/disturbed populations is difficult to explain. It could be that experimental designs may not have allowed a sufficient number of generations for recombination to break down linkage disequilibrium in manipulated or disturbed populations.
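The paired comparisons above test whether the per-trait difference between a selection differential and its gradient departs from zero. A minimal sketch of that calculation follows, using only the Python standard library. The function name and the example values are hypothetical and are not taken from the database.

```python
import math
from statistics import mean, stdev


def paired_t(differentials, gradients):
    """Paired t statistic and degrees of freedom for matched estimates.

    Each trait contributes one difference (differential - gradient);
    t = mean(diff) / (sd(diff) / sqrt(n)), with n - 1 degrees of freedom.
    """
    if len(differentials) != len(gradients):
        raise ValueError("estimates must be paired trait by trait")
    diffs = [i - b for i, b in zip(differentials, gradients)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two paired traits are required")
    t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t_stat, n - 1


if __name__ == "__main__":
    # Hypothetical absolute values for five traits (illustration only).
    i_vals = [0.40, 0.25, 0.31, 0.52, 0.18]
    beta_vals = [0.30, 0.22, 0.20, 0.41, 0.17]
    t_stat, df = paired_t(i_vals, beta_vals)
    print(f"t = {t_stat:.2f}, d.f. = {df}")
```

With 57 paired traits the degrees of freedom are 56, matching the test reported for the manipulated/disturbed populations. Obtaining the P-value additionally requires the t distribution, which is not sketched here.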

The distribution of linear selection gradients for experimentally manipulated/disturbed populations was roughly exponential, but the distribution for selection differentials had a more flattened profile with a greater proportion of higher estimates (Fig. 5). Despite the prevalence of estimates of low magnitude, strong selection was not uncommon. Mean and median selection gradients for manipulated/disturbed populations were similar to those from natural populations (Kingsolver et al. 2001), but selection differentials yielded much higher estimates (Table 4). In addition to possible effects of indirect selection, selection differentials for the manipulated/disturbed populations may have been inflated by the inclusion of fungal studies, which generally reported very strong selection in the form of differentials rather than gradients (Table 5). Note that selection differentials from fungal studies were estimated directly from changes in genotype frequencies (Leonard 1969; Ennos & McConnell 1995) rather than indirectly from fitness proxies, accounting in part, perhaps, for the apparent difference in the strength of selection between the fungal studies and those of other organisms.

Figure 5.

Distribution of 133 linear selection gradients and 96 linear selection differentials for 172 traits from experimentally manipulated or disturbed populations. Data with values < 1 are binned into 0.1 unit intervals, and binned into 1-unit intervals thereafter.

Table 4.  Comparison of linear selection gradients (β) and differentials (i) for phenotypic traits derived from natural, hybrid, and experimentally manipulated populations. Experimental (2) includes only traits that were directly manipulated or varied significantly among groups or treatments.

                     β                                     i
Population type      Trait (n)   Mean ± SE     Median      Trait (n)   Mean ± SE     Median
Natural              814         0.15 ± 0.01   0.14        718         0.17 ± 0.01   0.13
Hybrid               148         0.09 ± 0.01   0.05        27          0.17 ± 0.03   0.09
Experimental         133         0.15 ± 0.02   0.13        63          0.36 ± 0.03   0.33
Experimental (2)     44          0.17 ± 0.03   0.18        34          0.26 ± 0.04   0.24

Table 5.  Summary statistics of selection gradients (β) and selection differentials (i) for phenotypic traits reported from 26 experimental studies.

                     β                                     i
Taxonomic group      Trait (n)   Mean ± SE     Median      Trait (n)   Mean ± SE     Median
Plants               62          0.23 ± 0.02   0.19        58          0.37 ± 0.03   0.33
Animals              71          0.11 ± 0.02   0.09        5           0.33 ± 0.10   0.30
Fungi                0           –             –           29          0.57 ± 0.19   0.28

We also compared selection gradients for manipulated traits from the experimentally manipulated/disturbed populations with unmanipulated or nonexperimental covariates from the same studies. Mean and median β for the 45 manipulated traits were slightly but significantly higher than for those excluded (Table 4; weighted Mann–Whitney U-test: n1 = 18 210, n2 = 15 891, U = 1.21 × 10⁸, P < 0.001). Among the 45 manipulated traits, selection gradients differed significantly among behavioural, life history, morphological and physiological traits (weighted Kruskal–Wallis test: n = 15 891, χ² = 7351.3, d.f. = 3, P < 0.001), with life history traits yielding larger selection gradients than morphological traits (mean β ± SE; life history 0.42 ± 0.05, n = 7; morphological 0.09 ± 0.02, n = 34; behavioural 0.76, n = 1; physiological 0.12 ± 0.15, n = 3). This result is consistent with (1) an earlier review of selection in experimental hybrids, which also reported stronger selection on life history traits (Lexer et al. 2003a); (2) reports that life history traits have lower heritabilities than morphological traits (Gustafson 1986), suggesting that stronger selection has reduced additive genetic variance for these traits (Fisher 1930); and (3) observations that QTLs with opposing effects are less frequent for life history traits than morphological traits (Rieseberg et al. 2003), suggesting that a history of stronger directional selection on life history traits has favoured the accumulation of allelic effects in the same direction (Orr 1998b). In contrast, Kingsolver et al. (2001) reported higher selection gradients for morphological traits than for life history traits from studies of natural populations. Why results differ among these studies is unclear, but it may be that strongly selected life history traits are invariant within populations and have been necessarily excluded from within-population studies of selection.

Experimentally manipulated traits showed a greater percentage of significant linear selection gradients (48.9%) than did traits under selection in natural populations (25%, Kingsolver et al. 2001). Similarly, Lexer et al. (2003a) found a greater percentage (55.7%) of significant estimates of selection in experimental hybrid populations than natural populations, suggesting that hybrid and experimentally manipulated populations may provide greater opportunities to detect selection because of increased phenotypic variance.

The genetic architecture of adaptation

From our literature review we found 41 studies reporting the number of QTL detected for phenotypic traits in undomesticated animals. Thirty-six of these studies reported effect sizes for 125 traits, either through percentage of difference between parental means (%D), percentage of phenotypic variation explained (PVE) or in standard deviations. The average effect size for the leading QTL for the 26 traits measured in %D was 42.6 ± 4.5 (SE), but the 79 traits measured by PVE provided a lower average at 14.4 ± 1.0. Because studies using %D were predominantly interspecies crosses, whereas studies using PVE typically used intraspecies crosses (Fig. 6), it is unclear whether these differences are due to the statistic used or whether greater effect sizes tend to be found among interspecies crosses. Also, %D may overestimate effect sizes because it fails to include the contribution of environmental variance to phenotypic variation. The effect sizes for leading QTLs for 20 traits measured in standard deviations averaged 0.85 ± 0.2. Minor QTL had much smaller effect sizes than the leading QTLs. Effect sizes for 99 minor QTL for the 26 traits measured in %D averaged 18.21 ± 1.3, whereas effect sizes for the 460 minor QTL for the 79 traits measured in PVE averaged 5.0 ± 0.2. Thirty-nine minor QTL for the 20 traits measured in standard deviations averaged 0.49 ± 0.1. Rieseberg & Burke (2001) compiled effect sizes of the leading QTL for 50 traits from seven studies involving interspecific or intersubspecific crosses for plant taxa and found the average PVE for plants to be much higher at 31.1%. It is unclear whether QTL tend to have larger effects in plants than animals, or if this difference is due to nearly half of the plant estimates involving interspecies crosses, whereas few animal studies reported PVE from interspecies crosses.

Figure 6.

Distributions for the estimated strength of selection (s) for leading QTLs underlying phenotypic traits in animals as measured by (A) percentage of difference in parental means (%D) for 26 traits, and (B) percentage of variance explained (PVE) for 79 traits. s was calculated by multiplying either %D or PVE for a QTL by the average selection gradient for phenotypic traits (0.13) and halving for diploidy. Interspecies differences are indicated by stippled bars, and intraspecies differences are indicated by white bars.

Given the sources of error inherent in selection studies and QTL analyses, our estimates of the average strength of selection on individual mutations likely to define species differences are necessarily rough. We may estimate the average strength of selection for individual alleles underlying phenotypic traits by multiplying QTL effect sizes by the average strength of selection on a trait and halving to account for diploidy (following Rieseberg & Burke 2001). From the 1095 linear selection gradients weighted by sample size for phenotypic traits, averaging 0.13 ± 0.01 derived from the three reviews (Kingsolver et al. 2001; Lexer et al. 2003a; this study) and from our compiled information on QTL effect sizes in animals, we provide estimated distributions of the strength of selection on leading QTLs (Fig. 6) and for minor QTLs (Fig. 7). These distributions show that most selection coefficients are higher than 0.01 for leading QTLs (Fig. 6), and even for many minor QTLs defining species differences (Fig. 7A). For intraspecies differences, s for minor QTL is expected to be smaller (s = 0.001–0.003, Fig. 7B) and therefore the spread of such QTLs may be more limited. Rieseberg & Burke (2001) used similar methods to estimate an average strength of selection (s) of 0.11 for a hypothetical leading QTL in plant taxa, which is higher than most estimates reported here (Fig. 6). This difference is probably due to a higher PVE reported for the plant studies (31.1%) and a higher average linear selection differential derived from 22 phenotypic traits in 13 plant taxa than the linear selection gradients reported here. The finding that selection differentials are consistently higher than selection gradients in our data set suggests that correlated selection may be a significant factor in these studies, and may have caused in part the difference between these estimates and those reported by Rieseberg & Burke (2001).
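The arithmetic behind these estimates is simple enough to sketch. The example below applies the stated rule (effect size, expressed as a proportion, multiplied by the average selection gradient of 0.13 and halved for diploidy) to the mean effect sizes reported for animal QTL. The function name is illustrative. Applying the rule to mean effect sizes, rather than to each QTL individually as in the figures, is an assumption made here for brevity.

```python
MEAN_SELECTION_GRADIENT = 0.13  # average weighted linear gradient from the text


def selection_coefficient(effect_percent, gradient=MEAN_SELECTION_GRADIENT):
    """Rough s for a QTL: (effect as a proportion) * gradient / 2.

    effect_percent is either %D or PVE, given in percent.
    The division by two accounts for diploidy.
    """
    if effect_percent < 0.0:
        raise ValueError("effect size cannot be negative")
    return (effect_percent / 100.0) * gradient / 2.0


if __name__ == "__main__":
    # Mean effect sizes reported in the text for animal QTL.
    mean_effects = {
        "leading QTL, %D": 42.6,
        "leading QTL, PVE": 14.4,
        "minor QTL, %D": 18.21,
        "minor QTL, PVE": 5.0,
    }
    for label, effect in mean_effects.items():
        print(f"{label:<18} s = {selection_coefficient(effect):.4f}")
```

For the mean leading QTL this gives s of about 0.028 (%D) and 0.009 (PVE), and for the mean minor QTL about 0.012 (%D) and 0.003 (PVE). These are point values for the means only and do not reproduce the distributions shown in the figures.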

Figure 7.

Distributions for the estimated strength of selection (s) for minor QTL underlying phenotypic traits in animals as measured by (A) percentage of difference in parental means (%D) for 26 traits, and (B) percentage of variance explained (PVE) for 79 traits. s was calculated by multiplying either %D or PVE for a QTL by the average selection gradient for phenotypic traits (0.13) and halving for diploidy. Interspecies differences are indicated by stippled bars, and intraspecies differences are indicated by white bars.


Ehrlich & Raven's (1969) argument that levels of gene flow were too low in many species to hold them together represented the first major challenge to the biological species concept. Their arguments seem to have resonated most strongly in the botanical community, with some authors now questioning both the biological reality of species and the usefulness of the species rank as a unit of diversity (Levin 1979; Bachmann 1998; Mishler 1999). The impact on zoologists seems to have been less profound, due perhaps to the focus of many of the leading students of animal speciation on organisms with high levels of gene flow such as insects or birds (Fig. 3). None the less, the absence of a convincing response to Ehrlich and Raven has contributed to a proliferation of species concepts. Indeed, Mayden (1997) lists 22 different concepts and several new concepts have been proposed since this list was compiled.

Our review of the literature suggests two possible solutions to the problem of inadequate gene flow, both of which have been alluded to in earlier reviews (Slatkin 1987; Rieseberg & Burke 2001). First, as pointed out by earlier authors, early studies of migration and pollen dispersal appear to have systematically underestimated actual levels of gene flow (Schaal 1980; Slatkin 1987; Ellstrand 1992). Thus, the fraction of species in which there is enough gene flow to prevent neutral divergence and/or local adaptation is considerably larger than suspected by Ehrlich & Raven (1969). However, even given these higher estimates there are many species from essentially all organismal groups that lack sufficient gene flow to prevent divergence; so the problem cannot be solved entirely by these newer estimates of gene flow.

A second possibility, suggested previously by Rieseberg & Burke (2001), is that low gene flow species are held together primarily by the spread of advantageous alleles, effectively expanding the purview of the biological species concept. It has long been known that only very low levels of gene flow are required for the rapid spread of an advantageous allele (Slatkin 1976), as long as selection coefficients are fairly large (Fig. 1). Our crude calculations suggest that selection coefficients for leading QTLs may indeed be large enough to facilitate their rapid spread across the range of a species, even in low gene flow species, whereas minor QTLs (and neutral alleles) will spread far more slowly and, as a result, are less likely to contribute to species cohesion. If this theory is correct, then the whole genome need not evolve in concert within a species: some loci may diverge through drift or local selection, while others are held together by the spread of advantageous alleles.

Although we believe the scenario we have presented is plausible, we also recognize that our calculations of selection coefficients for QTLs are necessarily crude and most probably overestimate the strength of selection. As alluded to earlier, problems include biases in the estimation of QTL magnitudes (Beavis 1998), the possibility that detected QTLs result from multiple genes and/or multiple mutations, publication biases and spatial and temporal variation in the strength of selection. None the less, evidence from studies that calculate the strength of selection on the QTLs themselves, on transgenes or indirectly from the size of hitchhiked regions in selective sweeps all suggest that selection coefficients may often be large enough to facilitate rapid spread.

QTLs.   Selection coefficients for three survivorship QTLs associated with mineral ion uptake were –0.084, –0.094 and 0.126 for segregating hybrids between two sunflower species (Helianthus) that had been transplanted into a salt marsh habitat occupied by a natural hybrid derivative of these two species (Lexer et al. 2003b). Similarly, a QTL affecting flower colour explains 80% of variation in pollinator visitation in Mimulus monkeyflowers (Schemske & Bradshaw 1999). Presumably, pollinator visitation would have a very large effect on fitness, although fitness per se was not measured in this study. As far as we are aware, no studies to date have measured the strength of selection on individual QTLs in animals under natural conditions.

Transgenes.   The fitness effects of transgenes are less relevant to this problem as the genes were introduced from unrelated organisms and most fitness studies have been conducted in unrealistic environments. None the less, it is noteworthy that selection coefficients as high as 0.35 have been reported for a Bt transgene in wild sunflower populations (Snow et al. 2003). On the other hand, a transgene for white mould resistance had no detectable effect on fitness in wild sunflower under natural conditions (Burke & Rieseberg 2003). Transgenic lines may also show reduced fitness compared to their wild counterparts; for example, Arabidopsis thaliana transgenic for resistance to the herbicide chlorsulphuron experienced a 34% reduction in lifetime fitness compared to susceptible lines when grown in natural field conditions (Bergelson et al. 1996). As far as we are aware, no other study has reported selection coefficients for transgenes in natural populations. A large increase in fecundity has been reported for a fish, the Japanese medaka, but this fecundity advantage is counterbalanced by high juvenile mortality (Muir & Howard 2001).

Selective sweeps.   Some of the most direct estimates of selection coefficients come from evaluating the length of DNA sequence associated with a selective sweep. Both the age and strength of selection associated with a selective sweep can be estimated from the size of the region affected and the recombination rate (Kaplan et al. 1989; Sáez et al. 2003). Three of the four studies that have evaluated selection coefficients from the size of an affected region have shown s of even higher magnitude than our rough calculations using QTL studies, suggesting that our calculations may not be gross overestimates. The selection coefficient for a locus associated with a selective sweep close to Sod in Drosophila melanogaster was estimated at s = 0.078–0.103 with the most realistic recombination rate (Sáez et al. 2003). The selection coefficient for a highly recombining region encompassing rp49, associated with a selective sweep in D. simulans, was estimated at s = 0.011 or higher (Quesada et al. 2003). A high level of linkage disequilibrium associated with a locus conferring chloroquine resistance in Plasmodium falciparum indicated s = 0.1–0.7 (Wootton et al. 2002). On the other hand, two genomic regions near an adaptive mutation allowing range expansion in D. melanogaster exhibited much smaller selection coefficients, s = 0.002–0.01 and 0.0001–0.002 (Harr et al. 2002).

Although the early speciation literature emphasized the conservative role of gene flow in homogenizing populations (e.g. Darwin 1859; Grant 1963; Mayr 1963), a major shift in emphasis is under way, driven by findings from molecular evolutionary studies (Gillespie 2001). Not only are the footprints of selective sweeps common (Hudson et al. 1994; Majewski & Cohan 1999; Ayala et al. 2002; Betran et al. 2002; de Groot et al. 2002), but even the assumptions underlying tests for directional selection, which contrast patterns of within-species polymorphism and between-species divergence, imply that the fixed differences between species mostly result from past selective sweeps (McDonald & Kreitman 1991; Fay et al. 2002; Smith & Eyre-Walker 2002). Selective sweeps are sometimes viewed as diversifying, population-specific events indicative of local adaptation (Nurminsky 2001) and populations may also diverge at hitchhiking neutral alleles during a global selective sweep (Slatkin & Wiehe 1998). However, the view of selective sweeps as a diversifying force may be overemphasized because globally adaptive selective sweeps are unlikely to leave a molecular signature long after the sweep ends (Przeworski 2002).

For selective sweeps to contribute substantially to species cohesion, a substantial fraction of strongly selected alleles must be unconditionally advantageous across a species range. Another potential problem could be the prevalence of temporal and spatial variation in selection coefficients documented within and among populations (e.g. Arnqvist 1992; Forsman 1993). Such variation would impede the spread of a locally advantageous allele across an entire species range. Although we know little about the geographical distribution of selective sweeps, the large number of fixed differences among species with substitution patterns indicating positive selection (McDonald & Kreitman 1991) suggests that globally advantageous mutations may not be uncommon. On the other hand, it may be that low gene flow species are held together very loosely by collective evolution at just a handful of significant loci.

Although recent theoretical models provide a good foundation (Slatkin 1976; Cherry & Wakeley 2003; Whitlock 2003), future models should address some of their limitations to provide a better understanding of the spread of advantageous mutations. For example, differences in fixation time between the models by Slatkin (1976) and Cherry & Wakeley (2003) may be due to their use of two extremes of population structure, the stepping-stone and island models. Fixation times may be exaggerated in the former model compared to the latter. Time to fixation across one step in the stepping-stone model is related to the inverse of s, and this effect is then compounded across each step of the model. On the other hand, the relative importance of s may be diminished in the island model as it requires only two fixation steps: the initial population in which the mutation arises and the remaining subpopulations. Whitlock (2003) showed that fixation times for strongly advantageous alleles in an extinction–recolonization model may be predicted accurately through diffusion equations, although migration rates were fairly high (m = 0.05; Nem = 5). We are currently developing models that combine the stepping-stone model's wave-like fixation events with the island model's exchange of migrants among nonadjacent populations, by using more biologically realistic dispersal functions with closer populations exchanging more migrants than distant populations. Furthermore, a wider range of parameter space, particularly for Nem and s, needs to be investigated. Finally, the relative likelihood of selective sweeps vs. parallel evolution should also be evaluated. Given typical selection coefficients and migration rates in natural populations, are selective sweeps expected to occur at rates sufficiently fast to reduce the likelihood of advantageous mutations with similar phenotypic effects arising elsewhere in the species’ range? We will use existing empirical data in our models to evaluate their generality, and through these models we hope to identify new directions for empirical study on adaptive evolution.
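The contrast between the two extremes of population structure can be illustrated with a deliberately simplified, deterministic toy recursion. This is not the model of Slatkin (1976), Cherry & Wakeley (2003) or Whitlock (2003): it ignores drift and the stochastic loss of new mutations, assumes genic selection of the form p' = p(1 + s)/(1 + sp), and uses arbitrary default parameters (20 demes, m = 0.01, s = 0.1, starting frequency 0.01 in one deme). All function names are hypothetical.

```python
def select(p, s):
    """One generation of genic selection favouring the allele at frequency p."""
    return p * (1.0 + s) / (1.0 + s * p)


def migrate_island(freqs, m):
    """Island model: each deme draws a fraction m from the global pool."""
    pool = sum(freqs) / len(freqs)
    return [(1.0 - m) * p + m * pool for p in freqs]


def migrate_stepping_stone(freqs, m):
    """Linear stepping-stone: m/2 from each neighbour, reflecting edges."""
    last = len(freqs) - 1
    out = []
    for k, p in enumerate(freqs):
        left = freqs[k - 1] if k > 0 else p
        right = freqs[k + 1] if k < last else p
        out.append((1.0 - m) * p + 0.5 * m * (left + right))
    return out


def generations_to_spread(migrate, demes=20, m=0.01, s=0.1, p0=0.01,
                          threshold=0.99, max_gens=100_000):
    """Generations until the allele exceeds `threshold` in every deme.

    The allele starts at frequency p0 in the first deme only.
    Returns None if the threshold is not reached within max_gens.
    """
    freqs = [p0] + [0.0] * (demes - 1)
    for gen in range(1, max_gens + 1):
        freqs = [select(p, s) for p in migrate(freqs, m)]
        if min(freqs) >= threshold:
            return gen
    return None


if __name__ == "__main__":
    print("island:", generations_to_spread(migrate_island))
    print("stepping-stone:", generations_to_spread(migrate_stepping_stone))
```

In this toy setting the allele reaches all demes sooner under island migration than under stepping-stone migration, because every deme receives migrants carrying the allele from the first generation instead of waiting for a wave to arrive. Because drift is omitted, the absolute generation counts should not be read as estimates of fixation time.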

In addition to providing a possible solution to the old problem of how species evolve collectively when gene flow levels are low, the creative role of gene flow through the spread of advantageous alleles has implications for the genetic architecture of species’ differences and possibly the apparent divide between micro- and macroevolution. With respect to the genetic basis of species’ differences, the collective evolution of populations within species may serve to restrict the spread of minor mutations, biasing the fixed differences between species toward genes with larger effects. This effect should be most pronounced in low gene flow species, but a similar albeit less striking trend toward major genes should be observed for differences between high gene flow species as well. Note that the effect we are describing may be viewed as an extension of Orr's (1998a) model of phenotypic evolution to account for differences in the rate of spread of major vs. minor QTLs across a subdivided population.

Although we do not yet have sufficient knowledge of the genetic architecture of species’ differences to evaluate these predictions, the creative role of gene flow may help account for the perceived decoupling of micro- and macroevolution. Highly advantageous mutations will spread to fixation quickly within populations and within species and thus are unlikely to contribute significantly to the standing variation within populations at any given time point. Indeed, much of the variation found within populations is likely to be slightly deleterious, maintained by mutation/selection balance (Barton & Turelli 1989; Kondrashov & Turelli 1992; Falconer & Mackay 1996). Other variation may be maintained by balancing selection, antagonistic pleiotropy, and/or environmental heterogeneity (Barton 1990; Falconer & Mackay 1996; Barton & Keightley 2002; Bamshad & Wooding 2003). Of course, a small fraction of this variation may contribute to future selective sweeps in response to species-wide changes in the environment or genetic background. None the less, the bulk of the genetic variation segregating within populations is likely to be different from that fixed between species and explicit recognition of these differences might help bridge the divide between micro- and macroevolution.

The primary goal of this study was to demonstrate that gene flow may be an important cohesive force in essentially all sexual species, even when gene flow levels are very low. Thus, the study can be viewed as a defence of those species concepts that emphasize the importance of gene flow in holding species together and reproductive barriers in keeping them apart. However, an emphasis on gene flow does not exclude an important role for other evolutionary forces in species cohesion (Templeton 1989). Indeed, we recognize that common descent, parallel and stabilizing selection and developmental constraints all help form the clusters of morphologically connected populations that we recognize as species. However, gene flow/reproductive isolation is the only force that acts almost exclusively at the species level. Parallel and stabilizing selection act mainly on individuals, whereas common descent and developmental constraints act both above and below the species level. Thus, if we wish to define species and study the diagnostic property of this most important taxonomic rank, an emphasis on gene flow/reproductive isolation is fully justified.


We would like to thank D. Wolf for preliminary analyses of the Ward et al. database, D. C. Adams for statistical advice and N. Barton for helpful comments on an earlier version of this study. This research was funded by an NSF postdoctoral fellowship in biological informatics to C. Morjan.

Supplementary material

The following material is available from

Table S1. Measures of population subdivision (FST, GST) and Nem values for bacteria and fungi.

Table S2. Measures of population subdivision (FST, GST, Φst) and Nem values for plants.

Table S3. Measures of population subdivision (FST, GST, Φst) and Nem values for animals.

Table S4. Selection gradients and differentials for studies with experimentally enhanced phenotypic variation. Studies are classified according to experimental design, whether traits were varied by phenotypic manipulation or by combining varying strains, and fitness measure. Linear selection gradient or differential, with significance levels, and sample size are provided.

Table S5. Effects of quantitative trait loci for animal crosses. (Includes references).