Widespread evidence for incipient ecological speciation: a meta-analysis of isolation-by-ecology


Correspondence: E-mail: aaron.shafer@ebc.uu.se


Ecologically mediated selection has increasingly become recognised as an important driver of speciation. The correlation between neutral genetic differentiation and environmental or phenotypic divergence among populations, which we collectively refer to as isolation-by-ecology (IBE), is an indicator of ecological speciation. In a meta-analysis framework, we determined the strength and commonality of IBE in nature. On the basis of 106 studies, we calculated a mean effect size of IBE with and without controlling for spatial autocorrelation among populations. Effect sizes were 0.34 (95% CI 0.24–0.42) and 0.26 (95% CI 0.13–0.37), respectively, indicating that an average of 5% of the neutral genetic differentiation among populations was explained purely by ecological contrast. Importantly, spatial autocorrelation reduced IBE correlations for environmental variables, but not for phenotypes. Through simulation, we showed how the influence of isolation-by-distance and spatial autocorrelation of ecological variables can result in false positives or underestimated correlations if not accounted for in the IBE model. Collectively, this meta-analysis showed that ecologically induced genetic divergence is pervasive across time-scales and taxa, and largely independent of the choice of molecular marker. We discuss the importance of these results in the context of adaptation and ecological speciation and suggest future research avenues.


The importance of ecologically mediated selection as the main initiator of speciation, initially proposed by Charles Darwin, has re-gained considerable attention in the last decade (Schluter 2000, 2001; Dieckmann et al. 2004; Rundle & Nosil 2005; Funk et al. 2006). Ecological speciation arises when divergent selection, leading to local adaptation, results in the reduction of gene flow between populations (Schluter 2000; Rundle & Nosil 2005). This process initiates population differentiation and under the right circumstances can lead to complete reproductive isolation (Nosil 2012). The primary difference between ecological speciation and Mayr's (1942, 1963) classic allopatric speciation model is that the latter does not focus on divergent selection as a driver of population divergence, but instead emphasises geographic isolation as a natural means to reduce migration among populations (Coyne & Orr 2004).

Complete isolation among populations, however, is usually temporary (e.g. during glaciation periods) and populations generally experience repeated periods of contact (Abbott et al. 2013); during such episodes, a species’ dispersal capacity and the geographic distance between populations play an important role in modulating the degree of migration (m). Low levels of migration will in turn increase the influence of genetic drift, and reinforce population divergence (Fig. 1a). This relationship is commonly observed as a correlation between neutral genetic differentiation and geographic distance, or isolation-by-distance (IBD; Wright 1943), and is among the most common eco-evolutionary patterns observed in nature (Jenkins et al. 2010). In a parapatric model, IBD falls into the speciation continuum where stronger IBD (a steeper slope) indicates a greater reduction in gene flow and increased genome-wide differentiation (Fig. 1a). If populations are isolated for long enough, novel mutations (μ) can arise, and in the absence of gene flow become fixed. These mutations can eventually cause genetic incompatibilities and the evolution of reproductive isolation, otherwise known as the Bateson–Dobzhansky–Muller model (Gavrilets 2003; Nei & Nozawa 2011).

Figure 1.

A comparison of the processes and expected patterns involved in neutral (parapatric) and ecological speciation models. Dashed arrows denote a negative effect; solid arrows denote a positive relationship. Contrasts in box colour are meant to highlight the additional processes emphasised in ecological speciation models. The large green arrows indicate those interactions typically associated with ecological speciation. The included processes are migration rate (m), selection coefficient (s), recombination rate (r), mutation rate (μ) and genetic drift. Effective population size (Ne) and ecological and geographic distances are included. The central column of the figure shows which processes will produce the patterns of isolation-by-ecology (IBE) and isolation-by-distance (IBD). The bar intersecting the arrows leading to reproductive isolation denotes that the mechanisms underlying this transition are not fully understood.

The ecological speciation model

In contrast to the above scenario, ecological speciation models place a premium on the role of selection initiating divergence (Fig. 1b). Conceptually, ecological speciation starts when divergent or disruptive selection acts on phenotypes that convey a fitness advantage in one environment but not another. This fitness differential induces a shift in allele frequencies of the selected loci in the respective population that leads to local adaptation. Local adaptation will, in turn, reduce overall gene flow between populations (large arrow in Fig. 1b), through, for example, selection against migrants (Hendry 2004; Thibert-Plante & Hendry 2009), assortative mating (Weissing et al. 2011) or matching habitat choice (Edelaar et al. 2008). It is thus expected that ecological distance, analogous to geographical distance, will reduce homogenising gene flow and similarly correlate with neutral genetic population differentiation: we refer to this as isolation-by-ecology (IBE). Similar to the allopatric and parapatric models of speciation, ecologically induced population differentiation need not necessarily result in complete reproductive isolation and the formation of new species (Elias et al. 2012; Nosil 2012). However, many credible cases for ecologically mediated population differentiation have been made in natural populations (reviewed in Schluter 2000, 2009) and a direct link between reproductive isolation among species and ecological divergence has been established (Funk et al. 2006).

While the overarching process of ecological speciation may be relatively straightforward, many processes can be added to the model resulting in complicated feedback loops and interactions (Fig. 1b, Räsänen & Hendry 2008; Nosil 2012). Moreover, the underlying genetic mechanisms are still being formulated. Wu (2001) postulated a genic view of speciation where speciation begins in locally confined genomic regions containing gene(s) directly involved in reproductive isolation; this was later expanded to explicitly include genes related to ecological function (Wu & Ting 2004). Ecological speciation scenarios fall into this genic view (Wolf & Lindell 2010), and the evidence for such genes is slowly mounting (Nosil & Schluter 2011). Importantly, this viewpoint requires that selection first act on a few (ecologically) relevant loci with the remainder of the genome unaffected. Over time and under favourable combinations of recombination rates (r) and selection strength (s), such localised divergence can extend to areas surrounding the loci under selection by ‘divergence hitchhiking’, and eventually to the entire genome via ‘genome hitchhiking’ (large arrows in Fig. 1b; Feder & Nosil 2010; Feder et al. 2012). At the genome hitchhiking stage, gene flow is effectively reduced across the entire genome and a ‘generalised barrier to gene flow’ forms that can be detected with neutral molecular markers originally unlinked to ecological loci under selection (Thibert-Plante & Hendry 2010). This process is expected to create an IBE pattern where the degree of neutral genetic differentiation co-varies with ecological variables exerting the divergent selection pressure (Figs 1b and 2). The transition from a few genes under selection to complete reproductive isolation has been formally articulated as the speciation-with-gene-flow model (Feder & Nosil 2010; Feder et al. 2012) and is the working genetic model of ecological speciation (Nosil 2012).

Figure 2.

Conceptual overview of isolation-by-ecology (IBE) correlations. (a) IBE studies typically consist of three distance matrices: geographical distance (D) – spatial distance between populations (e.g. Euclidean distance in kilometres); genetic distance (G) – a measure of genetic distance or differentiation (here FST is used as an example); and ecological distance (E) – either phenotypic (e.g. mass) or environmental (e.g. precipitation) measurements. All distances are measured between all possible pairs of the n populations, which results in a symmetrical n × n matrix of non-independent values. (b) All correlations between distance matrices are shown and named.

Testing for ecological speciation

Experimental work in amenable systems has provided ample evidence for the central assumption of ecological speciation: ecologically divergent selection results in local adaptation. Reciprocal crosses and transplant experiments show unequivocal evidence for local adaptation (Leimu & Fischer 2008) and quantitative trait locus (QTL) studies support a heritable basis for traits of adaptive value (Orr 2001). Experimental studies have further shown the evolution of reproductive isolation under conditions of differing environments (Rice & Hostert 1993; Dettman et al. 2007), and field experiments of the stickleback model system provide strong evidence for repeated ecological speciation occurring in parallel (reviewed in Schluter & Conte 2009).

Most of the evidence in free-ranging populations, however, is necessarily indirect as it is difficult to formally establish that selection is the main driver of divergence (Nielsen 2005; Excoffier et al. 2009). Traditionally, comparisons between divergence of ecologically relevant quantitative traits (QST) and neutral genetic differentiation (FST) have been used to detect selection and local adaptation (McKay & Latta 2002; Leinonen et al. 2013). The major limitation to this approach is that QST–FST comparisons should be conducted under controlled environments (common-garden experiments), although they can be done in the wild (Leinonen et al. 2013). With the genome age fully upon us, the approach has extended from quantitative traits to screening for the underlying quantitative trait loci using genome-wide association studies (e.g. Nosil et al. 2012; Via et al. 2012). These association studies rely on screening thousands of genetic markers for outlier loci statistically associated with the ecological trait of interest. If extensive trait information is lacking, such genome-wide scans among young evolutionary lineages have been used to identify outlier loci (Beaumont 2005; Rice et al. 2010) or regions (speciation islands) among ecologically separated populations. Here, the test relies on detecting areas that are more divergent than expected by genetic drift alone (Nosil et al. 2009a) and has been successfully used to identify candidate loci associated with ecological divergence (e.g. walking-stick insects – Nosil et al. 2008; stickleback – Roesti et al. 2012). Importantly, these genomic outlier approaches can only provide evidence for locus-specific selection at the very onset of ecologically driven divergence (Feder et al. 2012), but may be confounded by other processes such as demography (Nielsen 2005; Excoffier et al. 2009) and background selection (Nachman & Payseur 2011).

Under favourable conditions of selection, migration and recombination, localised selection can transform to a genome-wide signature through divergence hitchhiking (Flaxman et al. 2013), creating the ‘generalised barrier to gene flow’ mentioned above (Thibert-Plante & Hendry 2010). Such ecologically mediated genome-wide divergence will, in principle, be picked up with smaller sets of essentially neutral genetic markers (Feder & Nosil 2010; Feder et al. 2012). A common approach in free-ranging populations has thus been to examine the relationship between genetic differentiation at neutral markers and ecological population parameters (e.g. Galápagos sea lions and marine habitat – Wolf et al. 2008) or phenotypic properties considered of ecological relevance (e.g. Darwin's finches and beak shape – de León et al. 2010). This approach works under the assumption that the observed genetic differentiation is the result of divergent selection and cannot be explained by an allopatric past (Nosil et al. 2009a), an assumption that appears reasonable as few species (or populations) appear to have diverged in complete isolation (Pinho & Hey 2010). A correlation between neutral genetic differentiation and ecological divergence, or IBE, is thus generally recognised as evidence for the ecological speciation model (Nosil 2012).

A brief review of isolation-by-ecology

Two IBE models have garnered attention in the literature: isolation-by-adaptation and isolation-by-environment. Isolation-by-adaptation tests for the association between neutral genetic differentiation and phenotypic divergence (Funk et al. 2002; Nosil et al. 2008), whereas isolation-by-environment is focused on environmental differences (Wang & Summers 2010). The term isolation-by-ecology has been used previously (Claremont et al. 2011) and is based on differences in resource use (Edelaar et al. 2012), which we consider a subset of isolation-by-adaptation. Collectively, we have referred to these correlations simply as isolation-by-ecology, in reference to ecological speciation. Although different approaches are used to test for these associations, the methodology has largely mirrored that of IBD where a matrix of population differentiation, such as FST, is correlated with a matrix of ecological distance (Fig. 2). One important difference does exist in that for the IBE pattern to be validated, spatial genetic autocorrelation – in the form of IBD – must be included in the model (Fig. 2; Box 1; Meirmans 2012).

Box 1. Isolation-by-ecology

Isolation-by-distance (IBD) is the positive correlation between geographic distance (D) and genetic distance (G). An extension to this is the isolation-by-ecology (IBE) model that examines genetic distance (G) relative to ecological distance (E) based on phenotypes or environmental variables. For the correlation between ecological distance (E) and spatial distance (D), we introduce the term eco-spatial autocorrelation (see Fig. 2). The most common approach to detect IBE in nature is to calculate a correlation coefficient: rE×G. However, to ensure the IBE pattern is not being driven by spatial autocorrelation, geographic distance should be controlled for in the model, denoted by | D, using the partial correlation equation:

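$$ r_{E \times G \mid D} = \frac{r_{E \times G} - r_{E \times D}\, r_{D \times G}}{\sqrt{\left(1 - r_{E \times D}^{2}\right)\left(1 - r_{D \times G}^{2}\right)}} $$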

Significance of the relationship is then assessed using permutation. Importantly, high levels of IBD and eco-spatial autocorrelation (rE×D) will lead to false positives if not partialled out of the model. We stress that researchers should calculate both the simple and partial correlation coefficients, as was often not the case in our literature review.
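The calculation described in Box 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used by any of the reviewed studies: the function names (`pairs`, `partial_r`, `partial_mantel`), the toy matrices and the one-tailed permutation test are all hypothetical choices, and published partial Mantel implementations differ in detail.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def pairs(matrix, order=None):
    """Flatten the lower triangle; `order` relabels populations (a permutation)."""
    n = len(matrix)
    idx = order if order is not None else list(range(n))
    return [matrix[idx[i]][idx[j]] for i in range(1, n) for j in range(i)]

def partial_r(r_eg, r_ed, r_dg):
    """Partial correlation of E and G with D held constant (Box 1)."""
    return (r_eg - r_ed * r_dg) / math.sqrt((1 - r_ed ** 2) * (1 - r_dg ** 2))

def partial_mantel(E, G, D, n_perm=199, seed=0):
    """Return (partial IBE correlation, one-tailed permutation P-value)."""
    rng = random.Random(seed)
    e, g, d = pairs(E), pairs(G), pairs(D)
    r_ed = pearson(e, d)
    observed = partial_r(pearson(e, g), r_ed, pearson(d, g))
    hits = 1                      # the observed value counts as one permutation
    order = list(range(len(G)))
    for _ in range(n_perm):
        rng.shuffle(order)        # permute rows and columns of G together
        gp = pairs(G, order)
        if partial_r(pearson(e, gp), r_ed, pearson(d, gp)) >= observed:
            hits += 1
    return observed, hits / (n_perm + 1)

# Toy data (hypothetical): 10 populations with coordinates and one environmental value.
rng = random.Random(1)
n = 10
xy = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(n)]
env = [rng.gauss(0, 1) for _ in range(n)]
D = [[math.dist(xy[i], xy[j]) for j in range(n)] for i in range(n)]
E = [[abs(env[i] - env[j]) for j in range(n)] for i in range(n)]
G = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i):
        # Genetic distance built from distance, ecology and noise (illustrative only).
        G[i][j] = G[j][i] = 0.001 * D[i][j] + 0.05 * E[i][j] + rng.gauss(0, 0.01)

r_simple = pearson(pairs(E), pairs(G))
r_partial, p_value = partial_mantel(E, G, D)
print(f"simple r = {r_simple:.3f}, partial r = {r_partial:.3f}, P = {p_value:.3f}")
```

Reporting `r_simple` alongside `r_partial` follows the recommendation in Box 1 that both coefficients be calculated.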

To account for IBD in models of IBE, the most common approach is the partial Mantel test (Mantel 1967). In these models, a partial IBE correlation can be calculated which directly incorporates the IBD correlation (Box 1). Of equal importance is the ecological divergence by geographic distance correlation – which we refer to as eco-spatial autocorrelation – that is also factored into the partial equation (Fig. 2; Box 1). Collinearity among the variables is still cause for concern, but testing a multitude of hypotheses and comparing the correlation and significance of values can help disentangle correlation from causation (Legendre & Troussellier 1988; Cushman et al. 2006), and this is paramount to ruling out the influence of spatial autocorrelation on IBE patterns.

While IBD can be considered the norm for many wild systems (Jenkins et al. 2010), it is unclear how common IBE patterns are in nature. Recent simulations showed that while divergent ecological selection produced barriers to gene flow that were detectable with neutral markers, there were multiple factors (i.e. m, s) that affected the strength of ecological speciation making predictions and generalisations difficult (Thibert-Plante & Hendry 2010). Alternatively, Nosil et al. (2009a) showed that 15 of 22 empirical studies had patterns consistent with ecological selection, and Wang et al. (2013) recently showed clear environmental selection in a suite of lizard species. Yet, a formal examination of these associations is lacking (Feder et al. 2012).

Here, we present a meta-analysis that examines the association between neutral genetic differentiation and ecological divergence within species, with the primary goal of elucidating how strong this pattern is in nature. Specifically, we (1) determined the mean effect size of IBE studies before and after the contribution of geographic distance to population differentiation was partialled out of the model; (2) explored how effect sizes varied by molecular marker and coding of ecological variables; (3) examined if phenotypic divergence and environmental divergence differed in effect sizes; and (4) looked for a publication bias. Taking a meta-analysis approach permitted us to synthesise the empirical data on ecologically driven genetic differentiation and provide a barometer for the strength of this relationship in nature. Further, it allowed us to identify common ecological patterns associated with IBE in the context of ecological speciation, and offer recommendations for future research.


Literature search and effect size

We conducted a literature search for studies that analysed the relationship between neutral genetic differentiation and environmental or phenotypic variables. Specifically, we searched the Wiley and ScienceDirect online databases using the following keywords: ‘isolation-by-ecology’, ‘isolation-by-environment’ and ‘isolation-by-adaptation’. To increase search results, we also jointly searched the terms ‘isolation-by-distance’ and ‘partial Mantel test’. Our main criterion was that the studies must have a reported summary statistic (i.e. R2, r, t, F) that could be converted into an effect size. We were particularly interested in studies that reported summary statistics before and after the effect of geographic distance was modelled on the same response variable. We gathered data on the molecular marker, the taxonomic rank, and coding of the ecological variables (phenotype vs. environment and discrete vs. continuous) for each study. For the reported P-values, we disregarded the less than sign and considered the absolute value, and for studies that reported > 0.05 or simply stated ‘non-significant’, we assigned a P-value of 0.50. It is important to note that we were not interested in studies that tested for physical barriers to gene flow (i.e. intervening rivers) or pure landscape genetics studies. While there may be overlap between some ecological variables and landscape genetics or barriers, we retained such variables only if there was a recognisable adaptive or differing environmental attribute associated with them. Finally, if studies partitioned their analysis into multiple spatial groups they were all included, but if studies broke down their data set by the molecular marker behaviour (i.e. outlier vs. neutral), we only used the putatively neutral markers or the entire data set.
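The P-value coding rule described above could be expressed as a small helper. This is a sketch only: the function name `code_p_value`, the list of accepted "non-significant" spellings, and the decision to treat any value reported with a greater-than sign as 0.50 are assumptions made for illustration, not details given in the text.

```python
def code_p_value(reported):
    """Convert a reported P-value string into the number used for analysis."""
    text = reported.strip().lower()
    # Studies that simply stated non-significance were assigned 0.50.
    if text in {"ns", "n.s.", "non-significant", "not significant"}:
        return 0.50
    # Values reported as "> 0.05" were also assigned 0.50
    # (assumption: any greater-than report is treated the same way).
    if text.startswith(">"):
        return 0.50
    # Otherwise drop a leading less-than sign and keep the absolute value.
    return abs(float(text.lstrip("<≤= ")))

for raw in ["< 0.01", "0.032", "> 0.05", "non-significant"]:
    print(raw, "->", code_p_value(raw))
```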

Mixed-effect meta-analysis

All statistical analyses were conducted in the R v.2.15.1 environment (R Core Development Team 2012). For most studies, we obtained r (Pearson's or Spearman's) and calculated the effect size (Zr) and variance (ZVAR) using Fisher's (1921) transformation:

$$ Z_r = \frac{1}{2}\ln\!\left(\frac{1 + r}{1 - r}\right) \tag{1} $$

$$ Z_{VAR} = \frac{1}{n - 3} \tag{2} $$

where n is the sample size. For other summary statistics (i.e. t, F) we used

display math(3,4)

from Wilson & Lipsey (2000) and Rosenthal (1994). If R2 was reported it was converted to r using

display math(5)

where p is the number of predictors in the model (see Nakagawa et al. 2007 and references therein). Summary statistics requiring eqns 3–5 were again transformed using eqn 1.
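As an illustration of the first step, Fisher's transformation and its sampling variance can be written as follows. This is a minimal sketch: the function names and the example values (r = 0.30, n = 20) are hypothetical, and the conversions from t, F and R2 (eqns 3–5) are not reproduced here.

```python
import math

def fisher_z(r):
    """Fisher's z-transformation of a correlation coefficient (-1 < r < 1)."""
    return 0.5 * math.log((1 + r) / (1 - r))   # equivalent to math.atanh(r)

def z_variance(n):
    """Sampling variance of Zr for a sample size n (requires n > 3)."""
    if n <= 3:
        raise ValueError("n must be greater than 3")
    return 1.0 / (n - 3)

def z_to_r(z):
    """Back-transform an effect size on the Zr scale to a correlation."""
    return math.tanh(z)

# Hypothetical study: r = 0.30 estimated from a sample size of 20.
zr = fisher_z(0.30)
var = z_variance(20)
print(f"Zr = {zr:.4f}, variance = {var:.4f}, back-transformed r = {z_to_r(zr):.2f}")
```

The variance depends only on the sample size, which is why studies with few samples contribute less weight to the pooled estimate.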

We conducted a mixed-effect meta-analysis using the MCMCglmm library (Hadfield 2010). To estimate the mean effect size, we first ran the analysis without fixed effects but with the study identity and taxonomic affiliation treated as random effects. We then included three fixed effects in the model: whether the scored ecological variable was a phenotype or environmental metric, if the scoring of ecological variable was continuous or discrete, and which molecular marker was used [mitochondrial, microsatellite, Amplified Fragment Length Polymorphisms (AFLPs), other]. For studies that provided summary statistics both with and without the effect of geographic distance on the same response variable, we ran a separate univariate model with a binary factor (0 = without geographic distance, 1 = with geographic distance). All models were run for 1 000 000 iterations, with 150 000 iterations removed as a burn-in and a sampling interval of 1000. We used an inverse-gamma prior for the random effects (V = 1, nu = 0.002) and the vector of variances from eqn 2 was passed to MCMCglmm using the mev argument (see supplemental material in Hadfield & Nakagawa 2010). Three separate runs were conducted and their convergence assessed using the Gelman–Rubin diagnostic (Gelman & Rubin 1992). We calculated the percent heterogeneity, based on the sum of variance components, arising from each random effect following Prokop et al. (2012). Because this value is positively bound (Wilson et al. 2010), we regarded low values with wide confidence intervals as having a negligible effect (Verdú et al. 2012). We ran five models in each data set (with and without geographic distance): the null model including only the intercept, three models with each individual fixed effect and one containing all covariates. Models were compared using the Deviance Information Criterion (DIC: Spiegelhalter et al. 2002).
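To make the convergence check concrete, the sketch below computes a basic Gelman–Rubin potential scale reduction factor for one parameter from several chains. It is an assumption-laden illustration rather than the analysis itself: the chains are simulated stand-ins for MCMCglmm output, the function name `gelman_rubin` is hypothetical, and the formula is the basic version without the refinements some software applies. The number of retained samples follows from the settings above: (1 000 000 − 150 000) / 1000 = 850 per run.

```python
import random
from statistics import mean, variance

def gelman_rubin(chains):
    """Basic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])   # mean within-chain variance
    between = n * variance(chain_means)            # between-chain variance
    pooled = (n - 1) / n * within + between / n    # pooled variance estimate
    return (pooled / within) ** 0.5

# Retained posterior samples per run under the settings in the text.
iterations, burn_in, thin = 1_000_000, 150_000, 1000
n_samples = (iterations - burn_in) // thin

# Three simulated chains standing in for three independent runs (hypothetical values).
rng = random.Random(42)
chains = [[rng.gauss(0.34, 0.05) for _ in range(n_samples)] for _ in range(3)]
rhat = gelman_rubin(chains)
print(f"{n_samples} samples per chain, R-hat = {rhat:.3f}")
```

Values close to 1 indicate that the runs are sampling the same distribution; chains centred on different values would push the statistic well above 1.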

To determine if a publication bias exists, we examined the correlation between effect size and sample size (Palmer 2000; Cassey et al. 2004). If there were a publication bias we would expect a negative relationship between sample and effect sizes. We also conducted Egger's regression (Egger et al. 1997), where an intercept differing from zero is indicative of asymmetry and a publication bias. Rosenthal's (1979) failsafe number, that is the number of studies required to reduce the effect size to a (user-defined) non-significant level, was calculated to determine the robustness of the results. The relationship between reported P-values and both sample and effect size was also examined. Further, we were interested in a ‘perception bias’ that, by our definition, can arise when large effect sizes are preferentially reported in highly visible journals biasing the common perception of the phenomenon's penetrance. We therefore examined the relationship between the journal's impact factor the year before the article's publication date, and both effect size and P-value. These relationships were modelled using the same approach as above but with the iterations, burn-in and sampling reduced by 10% and the mev argument omitted.
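The two bias diagnostics can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: Egger's test is written in its classic form (standardised effect regressed on precision, with the intercept as the asymmetry measure), Rosenthal's failsafe number uses a one-tailed critical value, and the function names and toy effect sizes are hypothetical.

```python
import math
from statistics import NormalDist, mean

def egger_intercept(effects, variances):
    """Intercept of the regression of standardised effect on precision."""
    se = [math.sqrt(v) for v in variances]
    x = [1.0 / s for s in se]                    # precision
    y = [e / s for e, s in zip(effects, se)]     # standardised effect
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx

def failsafe_n(effects, variances, alpha=0.05):
    """Rosenthal's failsafe number: null studies needed to lose significance."""
    z = [e / math.sqrt(v) for e, v in zip(effects, variances)]
    z_crit = NormalDist().inv_cdf(1 - alpha)     # one-tailed critical value
    return max(0.0, (sum(z) / z_crit) ** 2 - len(z))

# Toy data (hypothetical): Zr effect sizes with variances 1 / (n - 3).
sample_sizes = [10, 15, 20, 30, 50]
effects = [0.45, 0.40, 0.30, 0.28, 0.25]
variances = [1.0 / (n - 3) for n in sample_sizes]

print(f"Egger intercept = {egger_intercept(effects, variances):.3f}")
print(f"Failsafe N = {failsafe_n(effects, variances):.1f}")
```

In the toy data the smaller studies report larger effects, the pattern a publication bias would be expected to produce, so the intercept departs from zero.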

A positive association between ecological parameters and genetic differentiation can only be taken for evidence of divergent ecological selection once spatial autocorrelation has been partialled out of the model. It is assumed that there is no IBD or eco-spatial autocorrelation in simple IBE correlation models, an assumption that is violated in most natural systems. While IBD has been discussed in this context (Meirmans 2012), the eco-spatial autocorrelation component of the partial IBE covariance matrix is often not explicitly addressed. Accordingly, we explored their effects on the inference of IBE by simulation. We used the mean IBE effect size without geographic distance (from this study) as a static element of a covariance matrix (see Fig. 2). We adjusted the eco-spatial correlation between 0 and 0.75 in approximately 0.05 increments. We used the mean IBD effect size from Jenkins et al. (2010) and increased it in 5% intervals. A Choleski decomposition was used to simulate 1000 correlated data points per increment. For each increment the partial correlation coefficients for IBE were calculated. Simulations were plotted using the ggplot2 library.
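A simulation of this kind can be sketched as below. The correlation values are placeholders, not the estimates used in the study: `R_EG` and `R_DG` are hypothetical stand-ins for the mean IBE and IBD correlations, the seed and function names are arbitrary, and only a single IBD level is shown.

```python
import math
import random

def cholesky(a):
    """Lower-triangular L with L * L^T = a (a must be positive definite)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_r(r_eg, r_ed, r_dg):
    """Partial correlation of E and G with D held constant."""
    return (r_eg - r_ed * r_dg) / math.sqrt((1 - r_ed ** 2) * (1 - r_dg ** 2))

def simulate_partial_ibe(r_eg, r_ed, r_dg, n_points=1000, seed=0):
    """Draw correlated (E, G, D) points and return the sample partial IBE."""
    rng = random.Random(seed)
    corr = [[1.0, r_eg, r_ed],
            [r_eg, 1.0, r_dg],
            [r_ed, r_dg, 1.0]]
    L = cholesky(corr)
    E, G, D = [], [], []
    for _ in range(n_points):
        z = [rng.gauss(0, 1) for _ in range(3)]          # independent draws
        e, g, d = (sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(3))
        E.append(e)
        G.append(g)
        D.append(d)
    return partial_r(pearson(E, G), pearson(E, D), pearson(D, G))

R_EG, R_DG = 0.33, 0.50   # hypothetical IBE and IBD correlations
results = []
for step in range(16):    # eco-spatial autocorrelation from 0 to 0.75
    r_ed = 0.05 * step
    expected = partial_r(R_EG, r_ed, R_DG)
    simulated = simulate_partial_ibe(R_EG, r_ed, R_DG, seed=step)
    results.append((r_ed, expected, simulated))
    print(f"r_ED = {r_ed:.2f}  expected = {expected:.3f}  simulated = {simulated:.3f}")
```

With these placeholder values the partial correlation exceeds the simple one when eco-spatial autocorrelation is near zero and falls below it as the autocorrelation grows.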


We collected data from a total of 106 studies where summary statistics on the relationship between genetic differentiation and ecological divergence were provided; approximately 350 studies were omitted because they did not meet our search criteria. Most studies explored more than one ecological variable or used multiple genetic differentiation metrics, resulting in a total of 396 and 277 summary statistics reported before and after controlling for geographic distance respectively. The majority of summary statistics were based on within-species analyses, but three studies examined species complexes. The specific breakdown of the data set is provided in Table 1 and the raw data are included in Appendix S1. All models showed adequate convergence (Gelman–Rubin diagnostic < 1.1), meaning the independent model runs had achieved the same posterior distribution.

Table 1. Overview of data collected for the meta-analysis. Included are the number of studies with summary statistics and the number of data points retrieved

| | IBE without distance | IBE with distance |
|---|---|---|
| No. studies | 86 | 71 |
| No. data points | | |
| Molecular marker: A/U/M/O | 78/235/59/24 | 40/181/30/26 |
| Taxonomy: M/I/F/B/H/P | 64/56/110/36/40/90 | 61/60/82/9/19/46 |

Correlations with and without geographic distance included in the model are grouped by phenotypic or environmental ecological variables and whether they were scored in a continuous or discrete fashion. The molecular markers used are as follows: AFLPs (A), microsatellites (U), mitochondrial sequence (M) and other (O). The taxonomic rankings are as follows: mammals (M), invertebrates (I), fish (F), birds (B), herpetofauna (H) and plants (P).

For IBE analyses without geographic distance in the model, the mean effect size was 0.34 (95% CI 0.24–0.42; Fig. 3a) and 174 summary statistics (44%) reported P-values ≤ 0.05. For IBE analyses with geographic distance partialled out of the model, the mean effect size was 0.26 (95% CI 0.13–0.37; Fig. 3b) and 113 summary statistics (41%) reported P-values ≤ 0.05. This absolute drop of 0.08 corresponds to a relative reduction in the mean effect size of 24%. When we only looked at studies that provided summary statistics before and after geographic distance was modelled on the same response variable (n = 200), we saw a stronger relative decrease in mean effect size of 49% (0.35 to 0.18, P < 0.01).

Figure 3.

Funnel plot of effect size against sample size for (a) isolation-by-ecology (IBE) without geographic distance included in the model, and (b) with geographic distance included in the model. The mean effect size is shown as a solid line and stippled lines represent the 95% confidence intervals.

The subgroup analysis showing how geographic distance influenced the effect size of each covariate is presented in Fig. 4. The mean effect was reduced in 7 of the 8 groupings when geographic distance was included in the model. Notably, the mean effect size of IBE studies on phenotypes did not decrease at all, and those studies using AFLPs had the largest decline, after geographic distance was partialled out of the model. In general, the confidence intervals around the effect size increased after accounting for geographic distance. Variance among studies accounted for most of the variation in effect size, while the effect of taxonomy was minimal (Table 2).

Table 2. Mixed-effect models and their deviance information criterion (DIC) values and model heterogeneity (% variance of random effect relative to total variance) averaged over three runs

| Fixed effects | DIC | VarStudy (% heterogeneity) | VarTaxonomy (% heterogeneity) | VarResidual (% heterogeneity) |
|---|---|---|---|---|
| Without distance | | | | |
| Continuous vs. Discrete | 133.79 | 51.44 | 5.19 | 43.38 |
| Phenotype vs. Environment | 144.28 | 47.98 | 6.71 | 45.31 |
| Molecular marker | 145.29 | 49.02 | 5.18 | 45.79 |
| All three | 135.50 | 52.36 | 5.08 | 42.55 |
| With distance | | | | |
| Continuous vs. Discrete | 14.32 | 51.49 | 11.38 | 37.12 |
| Phenotype vs. Environment | 29.40 | 44.55 | 14.08 | 41.37 |
| Molecular marker | 29.50 | 47.03 | 8.62 | 44.36 |
| All three | 15.90 | 49.69 | 9.11 | 41.20 |

The response variable was effect size (Zr) and the random effects were taxonomic ranking and study. The fixed effects were the scoring of ecological variables (continuous vs. discrete), whether the study examined phenotypic or environmental divergence, and the molecular marker used.

Figure 4.

Forest plot of isolation-by-ecology (IBE) effect sizes by covariate. Black squares represent the mean effect size before geographic distance was included in the model, and grey squares show the mean effect size after. Black horizontal lines are the 95% confidence intervals. Environment and phenotype are the type of ecological variable used in the correlation, and continuous or discrete is in reference to how the variable was scored. Individual molecular marker IBE effect sizes are shown.

Regarding publication bias, there was a slight negative trend between effect size and sample size (without geographic distance: β = −0.01, P = 0.77; with geographic distance: β = −0.04, P = 0.04; Fig. 3). Egger's regression produced an intercept of 1.52 (95% CI 1.06–4.84) and 1.98 (95% CI 1.01–2.98) without and with geographic distance in the model respectively. Rosenthal's failsafe number, at an α-level of 0.05, exceeded 100 000 for both data sets, suggesting a robust signal. The relationship between P-values and both sample size and effect size was negative (Fig. S1) and there was no relationship between journal impact factor and P-value or effect size (all P's > 0.10: Fig. S2).

By simulation, we explored how collinearity among geographical distance, ecological divergence and genetic differentiation affected inferences of IBE. With increasing eco-spatial autocorrelation, the partial IBE correlation significantly dropped while the uncorrected measure remained unaffected. This effect was magnified with increasing IBD (Fig. 5) and lower IBE values (Fig. S3). Notably, at low eco-spatial autocorrelation but high IBD, the partial IBE correlations improved – a phenomenon where geographic distance acts as a suppressor variable (see Maassen & Bakker 2001). The partial correlation mirrored the absolute drop of 0.08 seen in the empirical data when eco-spatial autocorrelation was approximately 0.25.

Figure 5.

Simulated data showing the influence of eco-spatial autocorrelation and isolation-by-distance (IBD) on partial isolation-by-ecology (IBE) correlations. We simulated a range of eco-spatial autocorrelation (X axis) and IBD values (coloured stippled lines) with the IBE value held constant at the average of this study (black line). The colour of the partial IBE line (IBE | D) corresponds to its IBD value.


Ecological speciation is easily detectable with neutral markers once gene flow has been effectively reduced across the genome. A robust pattern of IBE in situations where gene flow is still permitted is therefore indicative of ecological speciation (Nosil 2012). Our meta-analysis showed that this is indeed the case, with a mean IBE effect size of 0.34. When geographic distance was included in the model, which accounts for genetic differentiation due to IBD, the effect size was reduced by 24% to 0.26. We suspect this decrease is an underestimate because when we only examined studies where the effect size was calculated with and without geographic distance on the same response variable, the decrease was closer to 50%. Nonetheless, after accounting for neutral processes the IBE correlation was still well above zero and explained ~5% of the total genetic differentiation among populations. Collectively, our meta-analysis showed that adaptive divergence is common, and incipient ecological speciation pervasive, throughout nature.

Factors influencing the strength of IBE

Before discussing the implications of IBE for ecological speciation in more depth, we briefly consider some interesting patterns revealed by our meta-analysis.

Publication bias

The data point towards a slight publication bias, that is, a disproportionate number of significant results being reported. Feder et al. (2012) noted that negative studies might not be getting published; however, the calculated failsafe number indicates that such studies are unlikely to have a major influence on the effect size estimation of this study. Further, we found no evidence of a perception bias, suggesting effect sizes were equally reported among journals of differing ‘impact’. Overall, these data support a robust IBE signature in nature.
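To make the failsafe logic concrete, the sketch below computes a fail-safe number under the assumption that Rosenthal's formulation is used; the text above does not state which variant was calculated, so this may differ from the authors' approach. The z-scores are hypothetical placeholders, not data from this meta-analysis, and the function name is ours.

```python
def rosenthal_failsafe_n(z_scores, z_alpha=1.645):
    """Number of unpublished null-result studies needed to raise the combined
    one-tailed p-value above alpha (Rosenthal's fail-safe N).

    z_scores: standard normal deviates, one per study
    z_alpha:  critical z for the chosen one-tailed alpha (1.645 for 0.05)
    """
    k = len(z_scores)
    total_z = sum(z_scores)
    return (total_z ** 2) / (z_alpha ** 2) - k


# Hypothetical z-scores for illustration only.
example_z = [2.1, 1.4, 3.0, 0.6, 2.5]
print(rosenthal_failsafe_n(example_z))
```

A fail-safe number that is large relative to the number of studies analysed suggests that unpublished null results would have to be very numerous to overturn the overall result.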

Genetic marker and taxonomic effects

With the exception of AFLPs, the choice of molecular marker did not appreciably influence the IBE effect size. The weak IBE pattern with AFLPs is slightly perplexing, but may be influenced by the dominant nature of AFLP scoring. The disproportionate number of AFLP studies (> 50%) focusing on plants might also have an effect. Phenology in plants is under less selection pressure compared to morphological traits in animals (Kingsolver et al. 2001), both of which are categorised as phenotypes. This would also explain why plants had the lowest effect size among taxonomic groups (data not shown). We might also anticipate stronger correlations for populations or taxonomic groups with generally lower recombination rates and large effective population sizes (as divergence hitchhiking and selection should have stronger influences). However, in a population genetics context the population recombination rate increases with effective population size (Stumpf & McVean 2003), making such predictions less straightforward. Determining how aspects of the molecular marker (e.g. differentiation metric, marker number), effective population size, and dispersal characteristics influence IBE effect size is worth exploring, as they likely contributed to the high level of heterogeneity seen among studies.

Ecological variables

The scoring of characters (continuous vs. discrete) had a significant influence on the mixed model and visibly influenced effect sizes (Fig. 4). In general, it is advisable to use continuous over discrete scoring of predictor variables in regression-based models when possible (Gelman & Hill 2006). Discrete scoring constituted only 25% of our data set, but did appear to inflate effect sizes; thus authors of IBE studies should be aware of this effect when comparing effect sizes of differently encoded variables. In addition, environmental covariates were more affected by geographic distance (autocorrelation) than phenotypes, and the mean effect size for phenotypic correlations did not change after accounting for geographic distance. The most likely explanation for this pattern is that spatial autocorrelation is more prominent in environmental variables than in phenotypes in nature (e.g. Legendre 1993).

Spatial autocorrelation

High levels of spatial autocorrelation in IBE studies are a concern because they will increase the risk of Type I error. The most common way to test for ecological associations with genetic differentiation is through Mantel tests, and there has been an ongoing debate centred largely on their susceptibility to Type I error (e.g. Raufaste & Rousset 2001; Castellano & Balletto 2002; Harmon & Glor 2010). We feel this discussion is analogous to the debate on the utility of FST, and suggest the outcome will be (or is) similar, in that Mantel tests will continue to be used and their coefficients will remain the standard reference. We take the viewpoint of Cushman & Landguth (2010) that the problem lies not with the test, but with collinearity among variables. Our results demonstrated that partialling out geographic distance significantly reduced the effect size, thereby showing how inferences of ecological influences on genetic differentiation will be misleading, or suffer from Type I error, if geographic distance is not factored into the equation.
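As an illustration of what partialling out geographic distance involves, the following is a minimal sketch of a partial Mantel test in plain Python. It is not the implementation used by any study surveyed here: the function names, the permutation scheme (jointly permuting rows and columns of the genetic matrix), the one-tailed test and the toy data for six populations are all assumptions made for the example.

```python
import math
import random


def upper_triangle(matrix):
    """Flatten the upper triangle (excluding the diagonal) of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_r(g, e, d):
    """Correlation of g and e after controlling for d."""
    r_ge, r_gd, r_ed = pearson(g, e), pearson(g, d), pearson(e, d)
    return (r_ge - r_gd * r_ed) / math.sqrt((1 - r_gd ** 2) * (1 - r_ed ** 2))


def permute(matrix, order):
    """Reorder rows and columns of a distance matrix with the same permutation."""
    return [[matrix[i][j] for j in order] for i in order]


def partial_mantel(gen, eco, geo, n_perm=999, seed=1):
    """Return (partial r, one-tailed permutation p-value) for gen ~ eco | geo."""
    rng = random.Random(seed)
    e, d = upper_triangle(eco), upper_triangle(geo)
    observed = partial_r(upper_triangle(gen), e, d)
    hits = 1  # the observed arrangement counts as one permutation
    order = list(range(len(gen)))
    for _ in range(n_perm):
        rng.shuffle(order)
        if partial_r(upper_triangle(permute(gen, order)), e, d) >= observed:
            hits += 1
    return observed, hits / (n_perm + 1)


# Toy data (hypothetical): six populations along a transect.
positions = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ecology = [0.0, 2.0, 0.5, 3.0, 1.0, 2.5]
n = len(positions)
geo = [[abs(positions[i] - positions[j]) for j in range(n)] for i in range(n)]
eco = [[abs(ecology[i] - ecology[j]) for j in range(n)] for i in range(n)]

# Genetic differentiation built from both distance and ecology, plus noise.
noise = random.Random(0)
gen = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        value = 0.02 * geo[i][j] + 0.05 * eco[i][j] + noise.gauss(0, 0.01)
        gen[i][j] = gen[j][i] = value

r_partial, p_value = partial_mantel(gen, eco, geo)
print(f"partial Mantel r = {r_partial:.3f}, p = {p_value:.3f}")
```

Because the test statistic conditions on geographic distance, a strong signal here cannot be produced by IBD alone; it can still be weakened or inflated by collinearity between the ecological and geographic matrices.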

Meirmans (2012) discussed the issue of IBD with respect to IBE, but the issue of eco-spatial autocorrelation (and its interaction with IBD and IBE) has largely been overlooked in the literature. An influence is not unexpected as the partial equation factors in both IBD and eco-spatial autocorrelation equally (Box 1), with their inclusion primarily resulting in a reduction of the partial IBE correlation (Fig. 5). However, our simulations showed that at low levels of eco-spatial autocorrelation and high IBD (and vice versa) the IBE correlation can actually increase (Fig. 5): a phenomenon due to the effect of a so-called suppressor variable (Maassen & Bakker 2001). Approximately 20% of the summary statistics with before and after geographic distance comparisons had higher partial correlations relative to the simple correlation, suggesting this is a relevant consideration for many IBE studies. Thus, eco-spatial autocorrelation should factor into study design and hypothesis formulation, as ignoring it (like ignoring IBD) will increase false positive inferences or even lead to underestimated correlations. At the very least its magnitude should be reported.
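Box 1 is not reproduced in this section. Assuming it gives the standard first-order partial correlation, the relationship can be written as follows, where the symbols are our own labels: G denotes genetic differentiation, E ecological divergence and D geographic distance, so that r_GE is the uncorrected IBE correlation, r_GD the IBD correlation and r_ED the eco-spatial autocorrelation.

```latex
\[
r_{GE \cdot D} \;=\; \frac{r_{GE} - r_{GD}\, r_{ED}}
                         {\sqrt{\left(1 - r_{GD}^{2}\right)\left(1 - r_{ED}^{2}\right)}}
\]

% Illustrative values only (r_GE fixed at the study mean of 0.34):
% Suppressor case, no eco-spatial autocorrelation but high IBD:
\[
r_{GE} = 0.34,\; r_{GD} = 0.60,\; r_{ED} = 0
\;\Rightarrow\;
r_{GE \cdot D} = \frac{0.34}{\sqrt{0.64}} = 0.425
\]
% Strong eco-spatial autocorrelation with the same IBD:
\[
r_{GE} = 0.34,\; r_{GD} = 0.60,\; r_{ED} = 0.50
\;\Rightarrow\;
r_{GE \cdot D} = \frac{0.34 - 0.30}{\sqrt{0.64 \times 0.75}} \approx 0.058
\]
```

Under this formulation, r_GD and r_ED enter symmetrically. When one of them is near zero, the numerator is essentially unchanged while the denominator shrinks, so the partial correlation rises above the simple correlation; when both are large, their product erodes the numerator and the partial correlation falls.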

The relative role of ecologically driven population divergence

Whether neutral processes or divergent selection under conditions of gene flow is the major driver of population divergence is heavily debated. When explicitly tested, comparisons between IBE and IBD have produced conflicting results (e.g. Lee & Mitchell-Olds 2011; Wang et al. 2013). Based on Jenkins et al.'s (2010) meta-analysis, approximately 22% of the variance in neutral genetic differentiation could be explained by genetic drift. Our study suggests that an additional ~5% can be explained by ecologically based selection. These are gross generalisations, but they speak to the relative influence of neutral processes and selection on shaping neutral genetic differentiation in nature. However, it is important to remember that the two processes can act in concert. Similar to geographic distance, ecological selection will result in a reduction of gene flow, thus promoting genetic drift and population differentiation (Räsänen & Hendry 2008; Nosil et al. 2009a; Rice et al. 2010). The fact that the IBE patterns persisted after controlling for geographic distance implies that there are sufficient levels of local adaptation to reduce gene flow (to some degree) in most natural populations. Here, genomic data will be particularly useful for disentangling the relative influence of selection and neutral processes on ecologically driven divergence by screening the genomes of contiguous populations across ecological transitions (e.g. Freedman et al. 2010).

Our results might also give some indirect insight into the rate of transition to the genome hitchhiking phase of ecologically driven divergence. Following the genome hitchhiking model (Feder et al. 2012), detecting IBE with a neutral data set typically requires a genome-wide reduction in gene flow that should only occur in the later phases of divergence (but see Thibert-Plante & Hendry 2010 and Flaxman et al. 2013). The majority of surveyed studies would be categorised as young systems with ongoing gene flow. The fact that most of these studies have IBE effect sizes above zero suggests that ecologically based selection translates into a genome-wide signature very early and rapidly, as these small, neutral data sets will generally not be linked to causal loci under selection. Hendry et al.'s (2007) review showed that (partial) reproductive barriers could accrue somewhat rapidly in response to adaptive divergence, and recent genomic work on stickleback showed a rapid transition to the genome-wide divergence phase in response to divergent selection (Roesti et al. 2012). Further supporting this pattern are simulations, which similarly displayed a relatively rapid reduction in gene flow in response to divergent selection pressures (Thibert-Plante & Hendry 2009; Flaxman et al. 2013). Ultimately, a comparative analysis across taxa and a speciation gradient, as suggested by Feder et al. (2012), is required to substantiate this proposed rapid transition, and to quantify the strength and rate at which IBE promotes divergence to the genome-wide phase.

IBE and ecological speciation

Divergent ecological selection can lead to reproductive isolation, but it is difficult to imagine the majority of populations surveyed here evolving into distinct species. In fact, we know that most of these populations will not evolve into distinct species (Elias et al. 2012). It is thus important to consider that ecological speciation is a continuum and is often stalled at intermediate points (Nosil 2012). The genetic architecture of wild populations is under complex drift-flow-selection dynamics that do not act independently (Andrews 2010) and that fluctuate during the course of a species’ evolutionary trajectory (Abbott et al. 2013). While gene flow may be an important determinant of a species’ fate, it is not an absolute hindrance (Nosil et al. 2009b; Feder et al. 2012; Abbott et al. 2013). The widespread evidence found here for a correspondence of neutral genetic differentiation and ecological contrast in situations that permit gene flow demonstrates that ecological processes are likely to play a significant role in a species’ trajectory.

The strong evidence for IBE presented here is particularly interesting when compared to Hendry's (2009) literature review that found only moderate evidence for ecological speciation. While the contrast between studies is due to Hendry's (2009) search criteria (not explicitly stated, but likely reproductive isolation), this dichotomy has some interesting eco-evolutionary implications. First, it implies that most wild species are stalled at an intermediate state of ecological speciation, and thus at some migration-selection balance (Räsänen & Hendry 2008; Gourbiere & Mallet 2009). Here, our results form an interesting basis to compare theoretical expectations on equilibrium levels of divergence generated under different ecological speciation scenarios in the face of gene flow (e.g. Thibert-Plante & Hendry 2010; Flaxman et al. 2013). Second, most systems we examined likely have ongoing gene flow suggesting that the genes under selection and causing the IBE pattern are different from those required for complete reproductive isolation. This has important evolutionary ramifications as it means reproductive isolation, in the sense of intrinsic post-zygotic reduction of hybrid fitness, may need to evolve secondarily by a less efficient means of indirect selection (Nosil 2012). Because outbreeding reduces linkage disequilibrium, the physical linking of ecological and reproductive isolation genes will take longer to evolve in systems permitting gene flow, which is likely facilitating the observed intermediate stage of divergence. What promotes ecological speciation from an IBE state into full reproductive isolation is perhaps the most important unanswered question stemming from this article.

Conclusions and future directions

On the basis of this study, we make the following five inferences and suggestions regarding the current status and future of IBE studies.

  1. IBE and ecological speciation. A positive correlation between neutral genetic differentiation and ecological divergence (IBE) is an indicator of ecological speciation (Nosil 2012). This study clearly showed that ecologically based divergent selection is both readily present in nature, and detectable with small neutral molecular data sets. More broadly, we view this as a companion study to Funk et al.'s (2006) review of ecological divergence among reproductively isolated species, and support for Doebeli et al.'s (2005) assertion that many different ecological selection scenarios can give rise to adaptive divergence.
  2. Using IBE to study ecological speciation. As the IBE correlation can be viewed as an equilibrium between selection and migration, future studies should focus on how modulating this relationship in IBE systems influences the speciation trajectory. Comparisons of populations experiencing rapid environmental change with those in more stable environments should prove informative in this regard. Specific IBE genes also need to be identified in nature, and whether they are decoupled from those facilitating reproductive isolation needs to be resolved. Genomic scans of related species (i.e. species complexes) that vary in their degree of interbreeding and occupy different ecological niches could help elucidate this relationship. Finally, comparing the reduction in fitness from reciprocal transplants among populations with varying degrees of IBE is important to quantify the cause and effect relationship of this pattern, and would in essence be a proof of principle. While these highlighted research questions focus on the relationship between IBE and ecological speciation, much broader ecological speciation questions remain unanswered; for this, we point readers to the last chapter in Nosil (2012).
  3. Spatial autocorrelation and IBE. We are hesitant to recognise IBE patterns when environmental spatial autocorrelation has not been reported or included in the model. If the selective agents causing IBE are a priori known (or hypothesised) to be spatially structured, the appropriate sampling and statistical strategies should be considered (e.g. Fortin et al. 1989; Legendre et al. 1989; Legendre 1993). Furthermore, our simulations focused on the effect of increasing eco-spatial autocorrelation and IBD on a static IBE value, but it should be noted that the magnitude of this effect also varies with the strength of IBE (Fig. S3). Collectively, this implies that there are scenarios, that is, moderate to high IBD and eco-spatial autocorrelation, where detecting IBE might be a fruitless pursuit.
  4. Traditional molecular markers or genome-scale approaches? With the exception of identifying specific genomic regions and outlier loci (or reducing variance in genetic differentiation measures), it will be informative to see how genomic data sets influence IBE correlations. For example, with only 4 microsatellites, Galindo et al. (2009) observed a similar IBE pattern as seen with > 600 AFLP non-candidate loci. Genomic data sets will be particularly useful for quantifying the degree of heterogeneity in ecologically driven genomic divergence and helping resolve the relative length of, and transitions between, the phases of the speciation-with-gene-flow model (Feder et al. 2012).
  5. Clarifying the IBE language. The isolation-by-ecology terminology can be confusing. Three different terms have been put forward (Nosil et al. 2008; Wang & Summers 2010; Claremont et al. 2011) with slightly different meanings and covariates. In addition, comparisons between QST and FST matrices (e.g. Streisfeld & Kohn 2005; Olivieri et al. 2008) are essentially another variant of the model. All, however, share the common goal of identifying evidence for ecologically mediated genetic divergence and local adaptation. For this reason, we recommend studies testing for these correlations adopt the IBE moniker and the IBE acronym and simply specify what their covariate is measuring.

In summary, IBE correlations based on traditional molecular markers are detectable in nature and cannot be attributed to spatial autocorrelation. The observed IBE relationship across taxa in situations permitting gene flow supports the widespread occurrence of ecological speciation. With genomic (Rice et al. 2010) and experimental approaches (Hendry et al. 2007; Räsänen & Hendry 2008) paving the way for the future of ecological selection studies, these IBE studies can be a tool for identifying a suite of non-model species suitable for such methods, and provide an additional framework by which to study ecological speciation. Ultimately, incorporating IBE data into these emerging approaches should provide new insights into the mechanistic underpinnings of ecological speciation.


This study was partially inspired by discussions in the EBC fika room. We acknowledge the R-bloggers (www.r-bloggers.com) for their tutorials on simulating correlated data sets. Thanks to Xavier Thibert-Plante (XT-P), Joshua Miller, Kimberly Ong and Joey Northrup (JN) for comments on the manuscript, JN for discussions on modelling and XT-P for a discussion on ecological speciation. We thank Jeffrey Feder and two anonymous referees for their comments that greatly improved this manuscript. We acknowledge funding from the Wenner-Gren Foundation and the Royal Swedish Academy's Physiographic Society.


ABAS and JBWW conceived of the study; ABAS collected the data and analysed it with input from JBWW; both JBWW and ABAS wrote the manuscript.