The coupling hypothesis: why genome scans may fail to map local adaptation genes



    1. Université Montpellier 2, Place Eugène Bataillon, 34095 Montpellier Cedex 5, France
    2. ISEM – CNRS, UMR5554, SMEL, 2 rue des Chantiers, 34200 Sète, France

    1. Université Montpellier 2, Place Eugène Bataillon, 34095 Montpellier Cedex 5, France
    2. ISEM – CNRS, UMR5554, SMEL, 2 rue des Chantiers, 34200 Sète, France
    3. AD2M – CNRS, UMR7144, Station Biologique, 29682 Roscoff Cedex, France
    4. Department of Genetics, University of Cambridge, Cambridge CB23EH, UK

    1. ISEM – CNRS, UMR5554, SMEL, 2 rue des Chantiers, 34200 Sète, France
    2. AD2M – CNRS, UMR7144, Station Biologique, 29682 Roscoff Cedex, France

    1. Université Montpellier 2, Place Eugène Bataillon, 34095 Montpellier Cedex 5, France
    2. ISEM – CNRS, UMR5554, SMEL, 2 rue des Chantiers, 34200 Sète, France

    1. Université Montpellier 2, Place Eugène Bataillon, 34095 Montpellier Cedex 5, France
    2. CEFE – CNRS, 34293 Montpellier Cedex 5, France

Nicolas Bierne, Fax: 33 4 67 46 33 99; E-mail:


Genomic scans of multiple populations often reveal marker loci with greatly increased differentiation between populations. Often this differentiation coincides in space with contrasts in ecological factors, forming a genetic–environment association (GEA). GEAs imply a role for local adaptation, and so it is tempting to conclude that the strongly differentiated markers are themselves under ecologically based divergent selection, or are closely linked to loci under such selection. Here, we highlight an alternative and neglected explanation: intrinsic (i.e. environment-independent) pre- or post-zygotic genetic incompatibilities rather than local adaptation can be responsible for increased differentiation. Intrinsic genetic incompatibilities create endogenous barriers to gene flow, also known as tension zones, whose location can shift over time. However, tension zones have a tendency to become trapped by, and therefore to coincide with, exogenous barriers due to ecological selection. This coupling of endogenous and exogenous barriers can occur easily in spatially subdivided populations, even if the loci involved are unlinked. The result is that local adaptation explains where genetic breaks are positioned, but not necessarily their existence, which can be best explained by endogenous incompatibilities. More precisely, we show that (i) the coupling of endogenous and exogenous barriers can easily occur even when ecological selection is weak; (ii) when environmental heterogeneity is fine-grained, GEAs can emerge at incompatibility loci, but only locally, in places where habitats and gene pools are sufficiently intermingled to maintain linkage disequilibria between genetic incompatibilities, local-adaptation genes and neutral loci. Furthermore, the association between the locally adapted and intrinsically incompatible alleles (i.e. the sign of linkage disequilibrium between endogenous and exogenous loci) is arbitrary and can form in either direction. 
Reviewing results from the literature, we find that many predictions of our model are supported, including endogenous genetic barriers that coincide with environmental boundaries, local GEA in mosaic hybrid zones, and inverted or modified GEAs at distant locations. We argue that endogenous genetic barriers are often more likely than local adaptation to explain the majority of Fst-outlying loci observed in genome scan approaches – even when these are correlated to environmental variables.


Evolutionary processes are most often inferred indirectly from contemporary patterns of genetic variation (Harrison 1993, 1998). Here, we consider a pattern that has long attracted the attention of population geneticists: the association between the level of genetic differentiation at specific loci (e.g. those identified as outliers by Fst scans) and one or more ecological variables (genetic–environment association, GEA, Hedrick et al. 1976). GEA is a pattern that is well explained by the existence of ecologically driven divergent selection at some genes in the genome, which in itself warrants interest. However, the processes by which the signal of local selection can be captured by Fst-outlying markers may not be as straightforward as is commonly believed.

The simplest hypothesis is that the loci detected are directly involved in local adaptation. Examples now exist where the whole chain of links between genotype, phenotype, fitness and environmental variation has been characterized (Lenormand et al. 1999; Hoekstra et al. 2004; Wheat et al. 2006; Schmidt et al. 2008b; Storz & Wheat 2010), but they remain rare. Often genetic markers are neutral and reveal local adaptation through their association with selected loci. The reliance on this indirect path (i.e. the use of neutral markers to infer the behaviour of selected loci) remains widespread, despite our ability in the age of genomics to scan hundreds of loci with ease. Unfortunately the indirect nature of the inference is often neglected (Faure et al. 2008; Bierne 2010), and the indirect effect of selection on linked neutral variation is implicitly assumed to be similar in kind (if not strength) to that of direct selection. For example it is not uncommon to read that the very strong genetic structure observed at a given locus is consistent with the operation of selection on that locus or a locus closely linked to it, as if the direct or indirect nature of selection was an ancillary detail. However, the indirect effect of selection on neutral markers requires a different theory than direct selection. It is better understood as a stochastic process, which can be approximated as within-genome variation in the effective population size (Felsenstein 1974; Gillespie 2000; Charlesworth 2009) or in the effective migration rate between populations (Barton 1979b; Bengtsson 1985; Ingvarsson & Whitlock 2000). Moreover, the ability of selection to affect population differentiation at neutral markers is not yet fully understood. 
For example, the hitchhiking effect, in which the fixation of a favourable gene produces a marked and durable footprint on linked neutral variation (Maynard Smith & Haigh 1974; Nielsen 2005), can also have a strong effect on genetic differentiation between demes in a structured population, but this effect is less well studied (Slatkin & Wiehe 1998; Barton 2000; Santiago & Caballero 2005; Bierne 2010). The effect of local selection on the linked neutral variation is expected to extend over a very small chromosomal region (Charlesworth et al. 1997; Bierne 2010; Feder & Nosil 2010) and to last for a short period of time (Miller & Hawthorne 2005) – predictions supported by the few empirical examples in which the chromosomal footprint of local adaptation has been investigated (Berry et al. 1991; Schmidt et al. 2008b; Linnen et al. 2009). Recent extensive genome scans have shown that genetic differentiation does not extend far beyond 5 kb (<0.003% of the genome) around adaptive polymorphisms in Arabidopsis lyrata (Turner et al. 2010) and Drosophila melanogaster (Kolaczkowski et al. 2010). With this expectation in mind, and given the number of loci usually studied (almost always a small fraction of the genome), invoking some form of association between the marker and a linked selected locus is often no more parsimonious than assuming direct selection on the marker itself (Lemaire et al. 2000). Indeed, direct selection tended to be the favoured hypothesis in the golden age of allozyme loci (Koehn & Hilbish 1987; Johannesson et al. 1995a; Mitton 1997; Lemaire et al. 2000; Riginos et al. 2002), and remains very popular as an interpretation of results obtained with EST-derived markers (markers in protein coding sequences, Vasemägi et al. 2005; Oetjen & Reusch 2007; Galindo et al. 2010; Shikano et al. 2010). 
However, loci with an abnormally strong genetic structure, the so-called Fst outlier loci, are common among both coding and non-coding polymorphisms, and often represent a substantial fraction of the loci investigated (2–10%, Nosil et al. 2009). If each locus only maps 0.003% of the genome as in the rock-cress and fly genomes, thousands of polymorphic local-adaptation loci would have to be present to account for the observed proportions of outliers. We therefore suggest that the standard explanation, as outlined above, cannot plausibly explain why GEAs are found so easily and commonly with molecular markers.
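The arithmetic behind this argument can be sketched in a few lines. The sketch below is only a back-of-envelope check, not a calculation taken from the studies cited: it assumes that markers are placed at random and that each local-adaptation locus inflates differentiation over a non-overlapping window of the stated size. The function name `loci_needed` is hypothetical.

```python
# Back-of-envelope check (a sketch under the assumptions stated above).
# If each selected locus affects a fraction `footprint_fraction` of the genome,
# and a fraction `outlier_fraction` of random markers are outliers, the number
# of selected loci required is simply the ratio of the two.

def loci_needed(outlier_fraction, footprint_fraction):
    """Selected loci needed for the given share of random markers to fall inside a footprint."""
    return outlier_fraction / footprint_fraction

footprint = 0.003 / 100           # 0.003% of the genome per selected locus (from the text)
for outliers in (0.02, 0.10):     # 2-10% of markers are outliers (from the text)
    n = loci_needed(outliers, footprint)
    print(f"{outliers:.0%} outliers -> about {n:,.0f} local-adaptation loci")
```

Under these assumptions the required number of loci runs from several hundred (for 2% outliers) to a few thousand (for 10%), which is the order of magnitude the argument relies on.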

Various alternative hypotheses have already been proposed to explain outlying loci: (i) correlation in co-ancestry among subpopulations under some forms of spatial subdivision (e.g. historical branching, hierarchical structure) can inflate the neutral variance of genetic differentiation, which then exceeds that of the null model used in outlier tests (Robertson 1975; Excoffier et al. 2009; Bonhomme et al. 2010); (ii) a neutral mutation that arises in the wave front of an expanding population can increase in frequency in colonized populations, mimicking local selection at that locus (gene surfing, Klopfstein et al. 2006; Hofer et al. 2009); (iii) for a broad range of parameter values, background selection against deleterious mutations is expected to increase population differentiation because of decreased effective size within subpopulations (Charlesworth et al. 1997), sometimes to decrease it because of heterotic effects (Bierne et al. 2002c), and in any case to inflate the differentiation variance when loci from regions with different recombination rates are analysed; (iv) transient differentiation can arise from species-wide selective sweeps (Bierne 2010), or multiple sweeps (Ralph & Coop 2010), by globally favourable mutations (not locally adapted alleles) in subdivided species. In this review we advocate another alternative and neglected explanation, that a high proportion of Fst outliers likely imply the operation of an endogenous genetic barrier. We argue this because (i) endogenous barriers are more likely to impede neutral gene flow at a substantial proportion of the genome (Barton & Bengtsson 1986); (ii) they are expected to become coupled with exogenous selection, and thus to coincide with environmental variation (Barton & Hewitt 1985), forming GEAs.

Throughout, incompatibilities between groups of alleles due to habitat-independent selection (underdominance or epistasis) or pre-zygotic isolation will be referred to as endogenous barriers to gene flow, and where populations with such incompatible genetic backgrounds come into contact they form a tension zone. Groups of alleles adapted to different habitats form exogenous barriers to gene flow, and their geographical limits are habitat boundaries or ecotones. Endogenous and exogenous genetic barriers effectively restrict gene flow because of the indirect effect of selection on the linked neutral variation. Genetic barriers are often semi-permeable (Harrison 1986; Rieseberg et al. 1999), and their strength varies among genome regions depending on local recombination rates and densities of selected loci (Barton & Hewitt 1985). Natural obstacles (i.e. zones of low population density such as mountains, rivers and oceanic fronts) form a third kind of barrier to gene flow, which may or may not coincide with ecotones. Unlike a genetic barrier, a natural barrier decreases gene flow equally throughout the whole genome. The terminology used in this paper is defined in the Glossary.

Tension zones, being due to endogenous selection, are not stabilized geographically and, in the absence of exogenous selection, they are expected to move more or less haphazardly according to population density, stabilizing only when they reach natural barriers, with which they will ultimately coincide (Barton 1979a; Barton & Hewitt 1985; Hewitt 1988). However, speciation and hybrid zone theories have highlighted that different components of reproductive isolation can easily become coupled together in the absence of natural barriers (Udovic 1980; Kirkpatrick & Ravigné 2001; Barton & de Cara 2009). The most discussed type of coupling is between post-zygotic (e.g. hybrid fitness depression or disruptive selection) and pre-zygotic (e.g. assortative mating) factors, because this underlies the reinforcement (Butlin 1989) and ecological (Rundle & Nosil 2005) models of speciation. The coupling between multiple endogenous genetic incompatibilities has also long been recognized in the hybrid zone literature (Barton 1983; Barton & Hewitt 1985), and Barton & de Cara (2009) recently emphasized its role in speciation. The coupling between endogenous and exogenous components of reproductive isolation is a further prediction of the theory, but this has not tended to be strongly emphasized to date. For example, the concomitant action of endogenous and exogenous factors has often been discussed in the hybrid zone literature (Barton & Hewitt 1985; Hewitt 1988; Moore & Price 1993), but the discussions have focused on the relative importance of the two factors (Arnold 1997; Barton 2001) rather than on their possible interaction. In addition, the concept of genetic barriers has not sufficiently permeated the literature on local adaptation at the molecular level, being primarily restricted to the hybrid zone literature. However, hybrid zones are little more than extreme examples of the broader phenomenon of genetic differentiation. 
Genome scans simply reveal more subtle examples: weak barriers that affect a smaller but still substantial proportion of genomes (>2%). In the latter case, we suggest that the exclusive focus on local adaptation as the cause of the barrier might be mistaken.

In this article, we present a simple model in which endogenous and exogenous selection interact. We emphasize two underappreciated outcomes of this interaction: (i) endogenous clines (i.e. clines of allele frequencies at genes under endogenous selection) can come to coincide with exogenous clines (i.e. clines of allele frequencies at genes under exogenous selection) at an environmental boundary; (ii) fine-grained GEAs can emerge locally within an endogenous cline. We then review results from the literature that support our hypothesis: (i) theory suggests that endogenous barriers are more efficient than exogenous barriers at impeding neutral gene flow in a substantial proportion of the genome, (ii) genetic barriers to gene flow are usually both exogenous and endogenous, (iii) endogenous barriers tend to have environmental/physical correlates, (iv) GEAs are often evident only at local scales in mosaic hybrid zones, and (v) the association between genetic differentiation and habitat variation is sometimes reversed or modified at distant locations.

The barrier trap: coupling exogenous and endogenous selection

In this section we present a simple model to illustrate the coupling of endogenous and exogenous backgrounds, and how this coupling shapes the spatial genetic structure.

A simple model

We present a model of evolution in a metapopulation of n demes arranged in a linear stepping-stone structure. Migration connects not only adjacent demes but also demes separated by up to six demes. The auto-recruitment rate is 1 − m; the migration rate to adjacent demes is m/4, to demes at a distance of two demes m/8, to demes at a distance of three demes m/16, at a distance of four demes m/32, and at a distance of five and six demes m/64 (m was set to 0.5 in the simulations presented here). Reflecting barriers are present at both ends of the chain of demes.
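The dispersal scheme just described can be written as a migration matrix. The following is a minimal sketch, not the authors' Delphi implementation: the function names (`kernel`, `reflect`, `migration_matrix`) are hypothetical, and the way migrants are folded back at the ends of the chain is an assumption, since the text only states that the barriers are reflecting. Note that the rates on each side of a deme sum to m/2, so that each row of the matrix sums to one.

```python
def kernel(m):
    """Dispersal weight by distance in demes (applies on each side of the focal deme)."""
    return {0: 1 - m, 1: m / 4, 2: m / 8, 3: m / 16, 4: m / 32, 5: m / 64, 6: m / 64}

def reflect(i, n):
    """Fold an out-of-range deme index back into 0..n-1 (assumed reflection rule)."""
    if i < 0:
        return -i - 1
    if i >= n:
        return 2 * n - 1 - i
    return i

def migration_matrix(n, m=0.5):
    """M[i][j] = share of deme i after migration that comes from deme j (requires n >= 6)."""
    weights = kernel(m)
    M = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for d, w in weights.items():
            sources = (i,) if d == 0 else (i - d, i + d)
            for j in sources:
                M[i][reflect(j, n)] += w
    return M

M = migration_matrix(20, m=0.5)
# Every row should sum to one, including rows next to the reflecting ends.
print(all(abs(sum(row) - 1.0) < 1e-12 for row in M))
```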

We consider three unlinked bi-allelic haploid loci. The first two loci have alleles labelled A and a (locus 1) and B and b (locus 2). These two loci are involved in a symmetric genetic incompatibility (endogenous selection) in which a is incompatible with B, and b is incompatible with A. The two-locus fitnesses are W(AB) = W(ab) = 1 and W(Ab) = W(aB) = 1 − s, in all demes in the chain (here we modelled strong endogenous selection, s ≥ 0.5). Below, we model a secondary contact between these two endogenous backgrounds (AB and ab) by initially fixing AB in the left part of the chain of demes and ab in the right part of the chain.

The third locus has alleles labelled C and c, which are under exogenous selection. Each deme in the chain was assigned to one of two habitat types (habitat 1 and habitat 2), and one of the alleles was adapted to each habitat. Allelic fitnesses were W(C) = 1 and W(c) = 1 − t in habitat 1 and W(C) = 1 − t and W(c) = 1 in habitat 2 (t was set to 0.1 for scenarios with moderate exogenous selection and to 0.9 for scenarios with strong selection). The assignment of habitat types and the initial state of locus 3 vary in the simulations presented below.

The effects of endogenous and exogenous selection combined multiplicatively so that, for example, W(Abc) = (1 − s)(1 − t) in habitat 1. Genotypic frequencies in each deme were derived from those of the previous generation after accounting for recombination, migration and selection in that order. Simulations were often deterministic (i.e. deme sizes were assumed to be large enough for drift to be negligible) but random drift was also simulated by multinomial sampling of genotypes within each deme at each generation (N = 200 per deme). Windows executables are provided as supplementary material and Borland Delphi 4.0 source code is available from the authors upon request.
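One deterministic generation of this model can be sketched as follows. This is an illustrative reading of the description above and not the authors' source code. In particular, the life cycle used for recombination (random union of haploids within a deme followed by free recombination among the three unlinked loci) is an assumption, the migration matrix `M` is supplied by the caller, and all function names are hypothetical.

```python
from itertools import product

# The eight haploid genotypes: 'ABC', 'ABc', 'AbC', ..., 'abc'
GENOTYPES = ["".join(g) for g in product("Aa", "Bb", "Cc")]

def fitness(g, habitat, s, t):
    """Multiplicative fitness of haploid genotype g (e.g. 'Abc') in habitat 1 or 2."""
    w_endo = 1.0 if g[:2] in ("AB", "ab") else 1.0 - s   # Ab and aB are incompatible
    favoured = "C" if habitat == 1 else "c"
    w_exo = 1.0 if g[2] == favoured else 1.0 - t
    return w_endo * w_exo

def recombine(freq):
    """Random mating, then free recombination: each locus is inherited from either parent."""
    new = dict.fromkeys(GENOTYPES, 0.0)
    for x, fx in freq.items():
        for y, fy in freq.items():
            if fx == 0.0 or fy == 0.0:
                continue
            for g in GENOTYPES:
                p = 1.0
                for gi, xi, yi in zip(g, x, y):
                    p *= 0.5 * (gi == xi) + 0.5 * (gi == yi)
                new[g] += fx * fy * p
    return new

def migrate(demes, M):
    """Mix genotype frequencies among demes; M[i][j] is the share of deme i drawn from deme j."""
    n = len(demes)
    return [{g: sum(M[i][j] * demes[j][g] for j in range(n)) for g in GENOTYPES}
            for i in range(n)]

def select(freq, habitat, s, t):
    """Weight genotype frequencies by fitness and renormalize."""
    w = {g: freq[g] * fitness(g, habitat, s, t) for g in GENOTYPES}
    wbar = sum(w.values())
    return {g: v / wbar for g, v in w.items()}

def generation(demes, habitats, M, s, t):
    """Recombination, migration and selection, in that order (as stated in the text)."""
    demes = [recombine(f) for f in demes]
    demes = migrate(demes, M)
    return [select(f, h, s, t) for f, h in zip(demes, habitats)]

# Tiny usage example: two demes, one per habitat, with a hypothetical 2 x 2 migration matrix.
left = {g: 1.0 if g == "ABC" else 0.0 for g in GENOTYPES}
right = {g: 1.0 if g == "abc" else 0.0 for g in GENOTYPES}
demes = generation([left, right], habitats=[1, 2], M=[[0.9, 0.1], [0.1, 0.9]], s=0.5, t=0.1)
print({g: round(f, 4) for g, f in demes[0].items()})
```

Keeping recombination, migration and selection as separate functions makes it easy to check each step in isolation, for instance that `fitness("Abc", 1, s, t)` equals (1 − s)(1 − t) as in the example given in the text.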

Moderate selection: tension zones are trapped by exogenous clines

Selective coupling arises in the form of linkage disequilibria among barrier loci when increased variance in compatibility increases mean fitness (multiplicative fitness or positive epistasis, Barton & de Cara 2009). In a spatial model, barrier loci organize in clines maintained by the dispersal/selection balance. Linkage disequilibria are caused by dispersal even with no linkage, which favours the coupling process. Coupling therefore happens when clines overlap (Slatkin 1975). For our purposes, the problem is to explain why and how endogenous and exogenous clines can overlap. We here propose three different scenarios in which endogenous clines are produced by a secondary contact of incompatible backgrounds. However, multilocus endogenous clines can also accumulate in parapatry and this will be described elsewhere.
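The statement that dispersal generates linkage disequilibrium even between unlinked loci can be illustrated with a small calculation. The sketch below, with a hypothetical function name, mixes individuals from two demes that are each at linkage equilibrium and computes the resulting disequilibrium D between an endogenous allele (A) and an exogenous allele (C). It is the standard admixture result, D = mix(1 − mix)(pA1 − pA2)(pC1 − pC2), and is not specific to the simulations presented here.

```python
def ld_after_mixing(pA1, pC1, pA2, pC2, mix):
    """LD between alleles A and C in a pool made of a fraction `mix` from deme 1
    and (1 - mix) from deme 2, each source deme being at linkage equilibrium."""
    f_AC = mix * pA1 * pC1 + (1 - mix) * pA2 * pC2   # frequency of the A-C combination
    f_A = mix * pA1 + (1 - mix) * pA2
    f_C = mix * pC1 + (1 - mix) * pC2
    return f_AC - f_A * f_C

# Both clines run in the same direction: positive association between A and C.
print(ld_after_mixing(0.9, 0.8, 0.1, 0.2, mix=0.5))
# The exogenous cline runs in the opposite direction: the sign of D is reversed.
print(ld_after_mixing(0.9, 0.2, 0.1, 0.8, mix=0.5))
# No difference in exogenous allele frequency between demes: no LD is generated.
print(ld_after_mixing(0.9, 0.5, 0.1, 0.5, mix=0.5))
```

The third case shows why overlap matters: where only one of the two loci differs in frequency between neighbouring demes, migration produces no association between them.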

In the first scenario, environmental variation is fine-grained, and each deme is assigned to habitat 1 or habitat 2 at random. We then model a secondary contact between two backgrounds (ABC and abc) differentially adapted to this environmental variation. Figure 1 shows the results of two simulations under this scenario. While the endogenous loci form a cline at the position of the contact (thin line in Fig. 1), the exogenous alleles (C and c) quickly spread through the metapopulation because they are favoured in one of the local habitat types, found at either side of the cline. In the initial stages of the spread, because selection is moderate (s = 0.5, t = 0.1), the initial association between endogenous and exogenous alleles is lost, and spatial structures establish themselves quite independently. However, once the spatial structure has been established (and to a lesser extent during its establishment) linkage disequilibrium between endogenous and exogenous loci builds up and spatial coupling occurs. As the position of the endogenous cline is not constrained by selection, it can move, and eventually comes to rest where the gradient in exogenous allele frequency is locally maximal (bold lines in Fig. 1). As expected, the tension zone moves so as to minimize its length (Barton & Hewitt 1985). Interestingly, the eventual phase of the coupling (whether AB becomes associated with C or with c) does not depend strongly on the initial phase. Here, all simulations began with only ABC and abc genotypes, and yet many ended with ABc and abC being most common. Simulations which led to both outcomes are shown in Fig. 1a and b.

Figure 1.

 Secondary contact between two intrinsically incompatible backgrounds in the presence of fine-grained environmental variation. Each background is initially adapted to one of two alternative habitats, and each deme is assigned to habitat 1 (blue dots) or habitat 2 (green dots) at random. Selection is moderate (s = 0.5, t = 0.1). Migration connects demes separated by up to six demes (see text). Exogenous alleles (blue lines) quickly spread through the metapopulation and organize according to the landscape (thin blue line). The endogenous loci form a cline at the position of the contact (thin red line), then coupling occurs and the endogenous cline can move and comes to rest where the gradient in exogenous allele frequency is locally maximal (bold red line). Once coupling is complete, the spatial variation of the exogenous alleles is slightly modified by the barrier to gene flow generated by the endogenous cline (bold blue line). The black arrow illustrates the movement of the endogenous cline. (a) and (b) show individual simulations in which the endogenous background from the demes on the left coupled with the exogenous allele favoured in habitat 2 (a) or habitat 1 (b).

Figure 2 depicts a second scenario: secondary contact between two backgrounds adapted to habitat 2 (genotypes ABc and abc) in a two-patch coarse-grained environment. All demes on the left-hand side of the chain are habitat 1, and those on the right, habitat 2. We assume that allele C (which is favoured in habitat 1) now arises in the first deme on the left side (in the habitat 1 patch) at an initial frequency p0 = 0.001. The spread of this allele throughout the habitat 1 demes is virtually unaffected by the endogenous barrier, as is expected for a favourable allele (Barton 1979b; Pialek & Barton 1997), but the endogenous cline moves slightly as allele C crosses because the AB background becomes transiently associated with C. The outcome then depends on the distance of the environmental boundary from the endogenous cline. If this distance is too great, then the coupling does not persist and the wave of advance continues toward the environmental boundary leaving the endogenous cline behind (not shown). Figure 2 shows a situation where the coupling does persist, that is, when the exogenous cline, once established at the environmental boundary, overlaps with the endogenous cline so that some demes are polymorphic for both endogenous and exogenous loci (Slatkin 1975; Barton & de Cara 2009). In this case, the endogenous cline moves until it coincides with the environmental boundary where the gradient in exogenous allele frequency is maximal and the width of the endogenous cline is minimal (Barton & Hewitt 1985). In this scenario the phase of the association is always the same (AB with C) owing to the initial position of AB populations (initially predominant in habitat 1, where C is favoured). An important characteristic of this process is that coupling occurs because the exogenous and endogenous clines overlap in space. Although stronger exogenous selection favours the coupling process, it also reduces the width of the exogenous cline, and thus opportunities for overlap. 
Paradoxically, coupling is therefore more likely with moderate exogenous selection, as it results in very wide clines (e.g. latitudinal clines) able to trap the endogenous tension zone from a remote distance.
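The trade-off between selection strength and cline width can be made concrete with the classical scaling from cline theory, in which the width of a cline maintained by a balance between dispersal and selection is proportional to σ/√t, where σ is the dispersal distance. The sketch below uses only this proportionality; the constant of proportionality depends on the selection model and is omitted, the approximation holds for weak selection, and the value of `sigma` and the function name are hypothetical.

```python
import math

def relative_cline_width(sigma, t):
    """Cline width up to a constant factor, using the sigma / sqrt(t) scaling."""
    return sigma / math.sqrt(t)

sigma = 1.0  # dispersal scale in deme units (hypothetical value)
for t in (0.1, 0.05, 0.01):
    print(f"t = {t}: relative width = {relative_cline_width(sigma, t):.2f}")
```

Halving the selection coefficient widens the cline by a factor of √2 only, so a cline wide enough to capture a distant tension zone requires quite weak exogenous selection.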

Figure 2.

 Effect of the spread of a new locally favoured mutation on a secondary contact endogenous cline in a two-patch coarse-grained environment. A chain of 60 demes was divided into 30-deme patches of habitat 1 (blue dots) or habitat 2 (green dots). Selection is moderate (s = 0.5, t = 0.05). Migration connects demes separated by up to six demes (see text). The endogenous cline (red line) is initially positioned at deme 20 within the habitat 1 patch (1). A new exogenous allele arises in the first deme on the left (blue arrow) and spreads throughout habitat 1, crossing the endogenous cline (2), which transiently moves (3), and forms a cline at the environmental boundary (4). As the endogenous and exogenous clines overlap, coupling continues and the endogenous cline moves until it coincides with the environmental boundary (5). The width of the exogenous cline is reduced due to coupling with the endogenous cline. Hence, the exogenous and endogenous clines coincide with each other and with the environmental boundary (bold lines). Black arrows illustrate the movement of the endogenous cline.

Genetic drift does not modify the outcomes of the two previous examples qualitatively. With drift the efficacy of selection is reduced and therefore the coupling process is weakened. However, drift can also help the endogenous cline to move haphazardly in space, which increases its probability of reaching the attraction basin of exogenous clines near ecotones. Our third scenario, depicted in Fig. 3, illustrates this point. We consider an endogenous cline that moves because of stochastic processes. We modelled an environmental pocket of habitat 1 demes surrounded by two patches of habitat 2 demes, implying that the endogenous cline could be trapped by one of the two environmental boundaries without reaching the ends of the range. The initial positions and widths of the exogenous and endogenous clines do not allow coupling to occur without some movement of the endogenous cline (they do not overlap). The endogenous cline moves randomly because of drift while the positions of exogenous clines are stabilized by selection. The endogenous cline is certain eventually to overlap with one of the two exogenous clines, allowing coupling to occur, and the two clines to collapse at one of the environmental boundaries. Figure 3 shows the position of the endogenous cline at secondary contact (thin line) and of the exogenous and endogenous clines after coupling (bold lines). The process is best observed by using the Windows executable provided as a supplementary file.
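The drift step used in this scenario, multinomial sampling of N = 200 genotypes per deme per generation, can be sketched as below. This is an illustration using Python's standard library rather than the authors' code; the function name `sample_deme` and the example frequencies are hypothetical.

```python
import random
from collections import Counter

def sample_deme(freq, N=200, rng=random):
    """Draw N haploid individuals from the expected genotype frequencies of one deme
    and return the realized frequencies (multinomial sampling)."""
    genotypes = list(freq)
    draws = rng.choices(genotypes, weights=[freq[g] for g in genotypes], k=N)
    counts = Counter(draws)
    return {g: counts[g] / N for g in genotypes}

# Example: a deme at the centre of an endogenous cline (hypothetical frequencies).
expected = {"AB": 0.48, "Ab": 0.02, "aB": 0.02, "ab": 0.48}
rng = random.Random(1)          # seeded generator so that runs are repeatable
realized = sample_deme(expected, N=200, rng=rng)
print(realized)
```

Applied to every deme after selection, this sampling makes allele frequencies at the centre of the cline fluctuate from one generation to the next, which is what allows the endogenous cline to wander until it overlaps an exogenous cline.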

Figure 3.

 Barrier trap. A secondary contact results in an endogenous cline at the centre of a chain of 100 demes (thin red line), in between two exogenous clines (blue line). Selection is moderate (s = 0.5, t = 0.1). Genetic drift occurs in demes of N = 200 individuals. Migration connects demes separated by up to six demes (see text). The endogenous cline moves randomly because of drift (illustrated by the two black arrows) while the positions of exogenous clines are stabilized by selection. The endogenous cline eventually comes to overlap with one of the two exogenous clines, allowing coupling to occur, and the two clines collapse at the environmental boundary (bold lines).

Strong selection in a fine-grained environment: GEAs at endogenous loci within a cline

We now model strong endogenous selection. We assume large selection coefficients on a small number of loci for convenience, but the same selection intensity could have been modelled with many loci, each with small effects. In this latter case each locus is expected to accumulate indirect selective effects from other loci in addition to its own selection coefficient (Barton 1983; Kruuk et al. 1999a), resulting in so-called ‘congealed genomes’ (Turner 1967) that behave roughly as a single superlocus under strong selection (Kruuk et al. 1999a).

First, as in Fig. 1, we consider secondary contact between two backgrounds (ABC and abc) differentially adapted to a fine-grained environment. In this case, even strong endogenous selection cannot prevent favourable exogenous alleles from rapidly crossing the tension zone and spreading throughout the metapopulation (Barton 1979b). However, within the tension zone a small-scale GEA arises, producing a mosaic structure in which allele frequencies at endogenous loci locally co-vary with habitat type (as in Fig. 4a). Contrary to the outcome with moderate selection (Fig. 1), the phase of the initial association (AB associated with C) is nearly always retained. Another interesting result is that GEA at endogenous loci can only be observed locally within the tension zone because this is the only place where these loci are polymorphic, although both habitat heterogeneity and polymorphism at exogenous loci (i.e. local adaptation) are present elsewhere in the range (Fig. 4). If neutral gene flow is prevented by the endogenous barrier, GEA will also be observed locally (in the tension zone) with neutral markers. Here we have concentrated on the case where exogenous and endogenous loci do not interact directly, but one can imagine a situation in which exogenous alleles are adapted to their native endogenous background while being deleterious in the foreign background (i.e. epistasis). In this case, and if epistasis is sufficiently strong, exogenous alleles can remain confined to their native background, being unable to cross the tension zone (not shown, Bierne 2001).

Figure 4.

 Secondary contact between two intrinsically incompatible backgrounds when polymorphism at the exogenous locus segregated within the two endogenous backgrounds. Each deme is assigned to habitat 1 (blue dots) or habitat 2 (green dots) at random. Selection is strong (s = 0.9, t = 0.9). Migration connects demes separated by up to six demes (see text). Exogenous alleles rapidly spread throughout the metapopulation (blue lines), crossing the tension zone like favourable alleles. Coupling occurs locally within the tension zone and a mosaic structure emerges at the endogenous loci (red lines). (a) and (b) show simulations in which the endogenous background from the left becomes coupled with the exogenous allele favoured in habitat 1 (a) or habitat 2 (b).

In the above scenario, each endogenous background is initially adapted to a different habitat. As the environmental heterogeneity predates the secondary contact, and populations of each background live in both habitats prior to the contact, this assumption is questionable. Locally adaptive polymorphism may well be present in both endogenous backgrounds before they come into contact. In a second category of simulations, we consider a model in which habitat polymorphism at the exogenous locus occurs within the two endogenous backgrounds. This simulates the situation in which adaptation to micro-environmental heterogeneity predates the secondary contact. In this model, small-scale GEA readily emerged in the tension zone as previously, but the phase of the coupling (whether AB associates with C or c) could form in either direction, depending on the environmental landscape and initial configurations. Simulations leading to opposite phases of association are shown in Fig. 4a and b.

Relevant theory and data from the literature

In this section we review theory and data that support the hypothesis of coupling between endogenous genetic barriers and environmental variation. Although many of these arguments have previously appeared in the hybrid zone literature (Barton & Hewitt 1985), they are also valid for more weakly isolated backgrounds that would not usually be called hybrid zones.

Endogenous genetic barriers are probably more efficient than local adaptation at preventing neutral gene flow at a substantial portion of the genome

Anybody who has modelled a barrier to gene flow imposed by a selected locus on a linked neutral locus, whether via underdominance or disruptive selection, must have been impressed, maybe disappointed, by the ease with which recombination breaks the association between the two loci, and by how small the chromosomal portion affected by selection is. Indeed, the strength of a genetic barrier is expected to be roughly proportional to 1/r (Barton 1979b), where r is the recombination rate between the selected and the neutral locus, and therefore to decrease quickly as recombination increases.

Barton & Bengtsson (1986) identified the conditions required for the flow of neutral genes to be significantly reduced. First, the barrier needs to be produced by many loci. Thus, one can define two multilocus genotypes or sets of alleles (genetic backgrounds) that, upon interbreeding, produce a variety of intermediate genotypes called hybrids. Second, the combination of alleles from the two backgrounds in hybrid genotypes must substantially reduce the fitness (hybrid fitness depression). This does not imply that selection must be endogenous, only that hybrids should perform poorly whatever the environment (while each parental type outperforms the other in a given habitat). Note that hybrid unfitness is a strong prerequisite of models of ecological speciation (Gavrilets & Vose 2005); in this sense, ecological speciation is nothing other than a form of reinforcement in which hybrid unfitness is caused by exogenous rather than endogenous selection. Hybrids represent ‘bridges’ between genetic backgrounds and they need to be rare for a barrier to be effective. Selection against hybrids also generates a reduction in population density that mechanistically impedes gene flow, the ‘hybrid sink’ effect (Barton 1980, 1986).

Barton & Bengtsson (1986) emphasized that for the barrier to be genome-wide ‘the number of genes involved in building the barrier must be so large that the majority of other genes become closely linked to some locus which is under selection’. If only a fraction of loci screened exhibit high levels of differentiation, the requirement is that the number of genes involved in the barrier must be large enough for these markers to be closely linked to a selected locus. Recall, however, that Fst outliers often represent an appreciable proportion of the panel of loci screened (2–10%).

Theory thus predicts that hybrid fitness depression distributed across many loci is the most efficient selection regime to prevent the flow of neutral genes. Which of endogenous and exogenous selection is the most likely candidate for such effects? We argue that endogenous selection on intrinsic genetic incompatibilities is more plausible than ecologically driven divergent selection. It is now well established that hybrid fitness depression between well defined taxa results from the combination in hybrid genotypes of alleles at two or more loci involved in negative epistatic interactions, so-called Dobzhansky–Muller (DM) incompatibilities (Orr & Presgraves 2000; Coyne & Orr 2004; Gavrilets 2004). The accumulation of alleles involved in DM incompatibilities does not require environmental changes and can occur whenever populations happen to be geographically isolated (Orr 1995) including in parapatry (Gavrilets et al. 2000; Kondrashov 2003; Navarro & Barton 2003). DM incompatibilities can affect any interacting loci (Noor & Feder 2006; Barton & de Cara 2009) which implies that virtually any mutation in a functional region might become involved in an endogenous barrier. The study of the genetics of post-zygotic reproductive barriers has indeed revealed unforeseen molecular mechanisms (e.g. gene transposition, heterochromatin formation, splicing regulation) underlying the few genetic incompatibilities identified thus far (Masly et al. 2006; Ferree & Barbash 2009; Chou et al. 2010), which suggests we still have many other fascinating mechanisms to uncover. The DM model therefore provides a simple answer to the existence of efficient genetic barriers to gene flow not only between well delineated species or subspecies (Coyne & Orr 2004), but also between more weakly isolated genetic backgrounds. 
Conversely, the proportion of sites in a genome that are functionally involved in local adaptation, and can directly lead to an Fst outlier, is probably low (Remold & Lenski 2001). To make this claim is not to deny the importance of ecology-driven selection at the individual level, nor to underestimate the absolute number of potential targets of exogenous selection in a genome. Undoubtedly, local adaptation can involve changes in many complex phenotypic traits (morphological, life history, behavioural and physiological), each influenced by variation in numerous genes (e.g. Bernatchez et al. 2010). But if quantitative trait variation involves many genes, local adaptation is likely to proceed through small allele frequency differences at many loci, and so will not result in a large Fst at any locus (Le Corre & Kremer 2003). In addition, we would argue that the genome-wide mutational opportunities for intrinsic incompatibilities are probably even more numerous than mutational opportunities for local adaptation of the sort considered in most Fst scan studies. Genomes are immensely large, and most of the genes or ‘elements’ (transposons, chromosomal arrangements, duplications, splicing sites, satellite repeats, recombination hotspots, retroviruses, vertically transmitted symbionts, etc.) are modestly affected by the extrinsic environment while they are involved in complex epistatic and pleiotropic, co-adaptive or antagonistic (arms-race) interactions with other genes or elements; in other words, they adapt to their evolving intrinsic environment. The likelihood of a molecular marker being closely linked to a gene or element driven by genomic conflicts (Rice & Holland 1997) and involved in a genetic incompatibility (Phadnis & Orr 2009) is probably at least as high as the probability of its being closely linked to a gene involved in adaptation to temperature or salinity, notwithstanding that environmental variation is often faced through phenotypic plasticity.
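The point that small allele frequency differences at many loci do not produce a large Fst at any one locus can be checked with a short calculation. The sketch below is an illustration only: it uses the standard heterozygosity-based definition of Fst for one biallelic locus in two equally weighted populations, and the function name and the example allele frequencies are hypothetical.

```python
def fst_two_populations(p1, p2):
    """Fst for one biallelic locus sampled in two equally weighted populations.

    Computed as (H_T - H_S) / H_T, where H_S is the mean expected
    heterozygosity within populations and H_T the expected heterozygosity
    of the pooled population.
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return 0.0  # locus is monomorphic overall, so no differentiation
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


if __name__ == "__main__":
    # A small shift, as expected for one of many loci underlying a polygenic trait.
    print(round(fst_two_populations(0.45, 0.55), 4))  # about 0.01
    # A large shift, of the kind that would stand out in a genome scan.
    print(round(fst_two_populations(0.10, 0.90), 4))  # about 0.64
```

With these illustrative frequencies, a 10% difference in allele frequency yields an Fst of about 0.01, well within the range expected for neutral loci in many species, whereas an 80% difference is needed to reach about 0.64.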

We also acknowledge that the boundary between endogenous and exogenous selection is not clear-cut and that we have idealized the distinction in this review for clarity. Alleles adapted to one environment can also be favoured only within their initial genetic background, so that they contribute to endogenous as well as exogenous selection. Furthermore, ecological and endogenous selection are not necessarily independent. Barton (2001) has argued that endogenous isolation is likely to evolve when selection favours different optima for a given trait in different environments, while other traits have the same optimum everywhere. This occurs when mutations have pleiotropic effects on several traits, a common situation, as suggested by measures of genetic correlations in quantitative genetic studies. While populations adapt to different environments by fixing new alleles at loci controlling the trait under divergent selection, compensatory fixations occur at other loci so as to maintain the other traits near their optimum. These compensatory fixations become DM incompatibilities, as they are favoured in the genetic background of the population in which they evolved but potentially detrimental in other backgrounds, irrespective of the environment. Under Barton’s model, one might expect differential adaptation to generate a genetic barrier composed of many more endogenous than exogenous loci.

Intensively studied genetic barriers are usually multifactorial, and both exogenous and endogenous

The hybrid zone literature has long debated the relative roles of endogenous and exogenous selection (Arnold 1997; Barton 2001). In their seminal review, Barton & Hewitt (1985) concluded: ‘It is harder to distinguish whether parapatrically distributed forms remain distinct because they are adapted to different environments, or because hybrids between them are less fit. However, both direct evidence of hybrid unfitness and the indirect evidence of the close concordance of different characters lead us to believe that the latter is more likely, and that most hybrid zones are in fact tension zones’. However, pronounced GEAs have been identified in many if not most hybrid zones (Hewitt 1988), at both large (Hunt & Selander 1973; Moore & Price 1993; Riginos & Cunningham 2005; Leaché & Cole 2007) and smaller spatial scales (Rand & Harrison 1989; Howard & Waring 1991; Bierne et al. 2002b; Vines et al. 2003). Hybrid zones in which the genetic structure is intricately associated with a patchy fine-grained environment have been called mosaic hybrid zones (Harrison & Rand 1989) and undoubtedly imply the action of exogenous factors.

In many cases both exogenous and endogenous factors act concomitantly. This observation is far from novel (Barton & Hewitt 1985; Hewitt 1988) but often receives insufficient attention. A list of examples is provided in Table 1. Examples that will be described in more detail below could also be added to this list. The number of multifactorial genetic barriers is remarkable when one considers that the study of hybrid fitness depression is difficult and that neither an absence of fitness depression of intermediate genotypes at marker loci inferred in nature, nor an absence of hybrid fitness depression in the F1 generation, allows us to definitively rule out endogenous selection (Bierne et al. 2006). The expectation that tension zones become trapped by exogenous clines reconciles Barton and Hewitt’s famous assertion, that most genetic barriers are in fact tension zones, with the widespread existence of GEAs: although exogenous factors sometimes explain the location of genetic clines or shifts, they are not necessarily responsible for the barrier to gene flow, and thus for abrupt shifts in allele frequencies at neutral loci. The barrier is more likely due to endogenous selection.

Table 1.   Examples of multifactorial genetic barriers to gene flow

| Hybrid zone | Endogenous factor(s) | Exogenous factor(s) |
|---|---|---|
| Bombina bombina/B. variegata (fire-bellied toads) | Selection against hybrids [1] | Habitat preference [2] |
| Mus m. musculus/M. m. domesticus (house mice) | Assortative mating [3]; Hybrid infertility [4] | Association with rainfall [5] |
| Littorina saxatilis (rough periwinkle) | Assortative mating [6]; Selection against hybrids [7] | Local adaptation [8]; Habitat preference [9] |
| Gryllus firmus/G. pennsylvanicus (field crickets) | Reproductive incompatibility [10]; Temporal isolation [11] | Association with soil type [12] |
| Mytilus edulis/M. galloprovincialis (smooth-shelled blue mussels) | Asynchronous spawning [13]; Assortative fertilization [14]; Selection against hybrids [15] | Habitat preference [16]; Local adaptation [17] |
| Mytilus edulis/M. trossulus (smooth-shelled blue mussels) | Gamete incompatibility [18]; Selection against hybrids [19] | Association with salinity [20] |
| Ostrinia nubilalis (maize/mugwort races of borers) | Assortative mating [21, 22]; Temporal isolation [21] | Local adaptation [23] |
| Anopheles gambiae (M and S forms) | Assortative mating [24] | Local adaptation [25, 26] |
| Colaptes auratus (red/yellow shafted flickers) | Selection against hybrids [27]; Territorial defence [28] | Association with a multifactorial ecotone [28] |
| Coregonus clupeaformis (normal/dwarf whitefishes) | Selection against hybrids [29] | Association with bathymetry [30]; Extrinsic hybrid unfitness [31] |
| Salvelinus fontinalis (anadromous/resident brook charrs) | Selection against hybrids [32] | Anadromy/residency tactic [33] |
| Caledia captiva (Moreton/Torresian chromosomal races of grasshopper) | Selection against hybrids [34] | Association with a climatic gradient [35] |
| Melanoplus sanguinipes/M. devastator (acridid grasshopper) | Selection against hybrids [36]; Assortative mating [36] | Association with an altitudinal gradient [36] |
| Dichroplus pratensis (melanopline grasshopper) | Selection against hybrids [37] | Association with altitude [38] |
| Heliconius erato/H. melpomene (butterflies) | Selection against hybrids [39, 40] | Adaptation to local mimicked species [39, 40] |
| Acyrthosiphon pisum (alfalfa/red clover races of pea aphids) | Selection against hybrids [41] | Habitat preference [42]; Local adaptation [41] |
| Timema cristinae (walking-sticks) | Assortative mating [43] | Habitat preference [44]; Local adaptation [43] |
| Drosophila yakuba/D. santomea | Assortative mating [45]; Selection against hybrids [45] | Adaptation to temperature [46] |
| Mercenaria mercenaria/M. campechiensis (clams) | Selection against hybrids [47] | Local adaptation [47] |

References for Table 1: [1] Kruuk et al. (1999b); [2] MacCallum et al. (1998); [3] Smadja et al. (2004); [4] Britton-Davidian et al. (2005); [5] Hunt & Selander (1973); [6] Pickles & Grahame (1999); [7] Hull et al. (1996); [8] Janson & Sundberg (1983); [9] Cruz et al. (2004a); [10] Harrison (1983); [11] Harrison (1985); [12] Rand & Harrison (1989); [13] Gardner & Skibinski (1990); [14] Bierne et al. (2002a); [15] Bierne et al. (2006); [16] Bierne et al. (2003b); [17] Gilg & Hilbish (2003); [18] Rawson et al. (2003); [19] Miranda et al. (2010); [20] Riginos & Cunningham (2005); [21] Dopman et al. (2010); [22] Malausa et al. (2005); [23] Calcagno et al. (2001); [24] White et al. (2010); [25] Diabaté et al. (2005); [26] Simard et al. (2009); [27] Moore & Koenig (1986); [28] Moore & Price (1993); [29] Lu & Bernatchez (1998); [30] Pigeon et al. (1997); [31] Rogers & Bernatchez (2006); [32] Mavarez et al. (2009); [33] Thériault et al. (2007); [34] Shaw & Wilkinson (1980); [35] Shaw et al. (1993); [36] Orr (1996); [37] Bidau (1990); [38] Tosto & Bidau (1991); [39] Mallet (1989); [40] Mallet & Barton (1989); [41] Via et al. (2000); [42] Via (1999); [43] Nosil et al. (2002); [44] Nosil et al. (2006); [45] Matute & Coyne (2010); [46] Matute et al. (2009); [47] Bert & Arnold (1995).

Genetic barriers tend to have physical/environmental correlates

The next question is whether endogenous barriers tend to coincide with environmental limits. There exists an abundant literature on the existence of clusters of hybrid zones or hotspots of genetic structure, sometimes called suture zones (Remington 1968; Hewitt 1996, 2000, 2004; Avise 2000; Swenson & Howard 2005). Questions about suture zones concern their existence (Swenson & Howard 2004), their origins (primary or secondary contacts, Endler 1977) and their maintenance (physical, environmental or genetic barriers). We will focus here on explanations of their position. Three hypotheses have been proposed: (i) suture zones occur at zones of secondary contact between glacial refugia; (ii) they represent zones of reduced dispersal (i.e. natural barriers to gene flow) that have trapped multiple tension zones; (iii) they represent environmental boundaries that favour differential adaptation/speciation. We here add a fourth hypothesis: (iv) they represent environmental boundaries that have trapped tension zones through the coupling between endogenous and exogenous backgrounds.

It is clear that suture zones are often well explained as common zones of secondary contact between glacial refugia (Hewitt 2000). For instance, the geography of Europe with its three Mediterranean peninsulas (Iberian, Italian and Greek) that each could have served as refugia during glacial maxima, has most likely resulted in secondary contacts in Southern France and in Central Europe. Similarly in the sea, one can easily imagine that numerous species have been split between the Mediterranean Sea and the Atlantic coasts of Africa during glacial maxima and that populations have secondarily met somewhere near Gibraltar (Patarnello et al. 2007). However, this hypothesis cannot explain why genetic structure has not vanished since the secondary contact, why the zones coincide so neatly at a small spatial scale among numerous species and why their position correlates so well with natural barriers to dispersal (e.g. the Alps or the Almeria-Oran front), that are often environmental boundaries as well.

Studying the dynamics of tension zones, Barton (1979a) provided a crucial theoretical prediction: tension zones can move in response to variation in population density and dispersal rate but they are expected to be trapped by natural barriers to dispersal. This prediction not only explains why clusters of hybrid zones often coincide with natural barriers, but it also explains why species with a similar biology (e.g. similar dispersal abilities) can be affected differently by the same barrier. Often, the natural barrier itself would not reduce gene flow enough to generate strong genetic differentiation, and genetic differentiation can be maintained only because a genetic barrier is superimposed on the natural barrier. This process has been well recognized as an explanation for terrestrial hotspots of hybrid zones (Barton & Hewitt 1985; Swenson & Howard 2005). The Alps in Europe and the Appalachian and Rocky Mountains in the USA are such hotspots of hybrid zones, well explained by Barton’s prediction. Surprisingly, Barton’s trapping hypothesis has been relatively neglected in the marine literature (Avise 1992; Palumbi 2003; Patarnello et al. 2007; Schmidt et al. 2008a) although good examples also exist in the sea. The Almeria-Oran front, an oceanographic front that separates Atlantic from Mediterranean water masses, is a hotspot of genetic structure in the sea (Patarnello et al. 2007). The Almeria-Oran front has been recognized as a natural barrier to dispersal for marine species that disperse via a planktonic larval stage. It is tempting to infer that the front itself is responsible for the genetic structure. However, examples exist of species that do not exhibit any genetic break at the Almeria-Oran front (Launey et al. 2002; Patarnello et al. 2007) while their dispersal capabilities and population sizes are similar to those of species that do exhibit a break. It is therefore very likely that we presently observe only the tension zones that have been trapped by the front.
The same reasoning holds for other hydrographic barriers such as Cape Canaveral in Florida (Avise 1992; Cunningham & Collins 1994), the Siculo-Tunisian Strait in the Mediterranean Sea (Bahri-Sfar et al. 2000), or Point Conception in California (Burton 1998). Barton’s trapping hypothesis can also explain why the locations of shifts in allele frequency do not always coincide between groups of species (e.g. Pelc et al. 2009). Species with different dispersal capabilities will not respond identically to natural barriers, and so tension zones of restricted dispersers are expected to be trapped by the first minor barrier encountered near a zone of secondary contact, while tension zones of high dispersers are expected to be trapped only by a strong natural barrier, even if this is located far from the secondary contact.

While Barton’s prediction accounts for the position of many hybrid zones and hotspots of hybrid zones, in other cases the positions of hybrid zones or genetic breaks correspond to ecotones rather than natural obstacles (Moore & Price 1993; Johannesson & Andre 2006), and exogenous selection is assumed to cause genetic differentiation. For instance, Moore & Price (1993) studied in great detail the ecological correlates, both abiotic and biotic, of a flicker (Aves; Piciformes) hybrid zone. This zone is located in the Great Plains of the USA, where there is no obvious natural barrier to dispersal but an obvious ecotone and biogeographic boundary. This area coincides roughly with one of Remington’s (1968) suture zones, and has been identified as a hotspot of avian hybrid zones by Moore & Price (1993), as confirmed by the meta-analysis of Swenson & Howard (2005). In this case, it seems clear that ecological selection plays a role in determining the position of the zone. It is also tempting to infer that exogenous selection explains the genetic structure. However, endogenous factors have also been identified, such as low hybrid male fecundity (Moore & Koenig 1986) and assortative mating (Wiebe 2000). An alternative explanation is therefore that the flicker hybrid zone and other hybrid zones in this area are tension zones that have been trapped by exogenous loci at an environmental boundary.

Another example of a hotspot of genetic differentiation with a strong environmental correlate is the Öresund and Danish Belts between the Kattegat (North Sea) and the Baltic Sea (Johannesson & Andre 2006). The most spectacular ecological difference between the Baltic Sea and the North Sea is in salinity. The Baltic Sea is brackish with salinity below 10‰ (as low as 2–4‰ in the northern part) while the North Sea has a standard salinity (∼30‰) and the Kattegat is intermediate (∼20‰). The Kattegat and the Baltic Sea are separated by a steep salinity gradient. There are also other gradients such as in water temperature, oxygenation and biotic factors, but salinity is usually strongly emphasized. As a consequence, local adaptation to a marginal environment has been proposed as the causative agent of the genetic differentiation, often restricted to a few outlying markers, observed between populations of fish and invertebrates between the Baltic and North Seas (Johannesson et al. 1990; Riginos & Cunningham 2005; Johannesson & Andre 2006; Hemmer-Hansen et al. 2007b; Gaggiotti et al. 2009; Limborg et al. 2009). In addition, the Baltic Sea used to be a freshwater lake during the last glacial period and the connection to the North Sea was established only about 8000 years ago, allowing colonization by marine taxa. This has been taken as an argument against scenarios involving secondary contacts. However this argument neglects the possibility that secondary contact zones can move if they are tension zones and therefore could have formed elsewhere and come to coincide with the environmental boundary secondarily. In fact, many of the genetic deviants found in the Baltic Sea are also observed elsewhere, often in the Barents Sea (northern Scandinavia). Baltic Mytilus mussels are M. trossulus and form a hybrid zone with M. edulis in the Öresund (Väinölä & Hvilsom 1991). M. 
trossulus is also found in some Norwegian fjords (Ridgway & Nævdal 2004; Väinölä & Strelkov 2011), in the Barents Sea (Bufalova et al. 2005; Väinölä & Strelkov 2011) and sometimes in the White Sea (Daguin 2000; Väinölä & Strelkov 2011) in Europe, as well as on the Pacific and Atlantic sides of North America (Riginos & Cunningham 2005). The tellinid bivalve Macoma balthica also forms a hybrid zone in the Öresund (Nikula et al. 2008), which is ‘replicated’ in the Barents Sea such that populations of the Baltic Sea are very similar to populations of the White Sea (Nikula et al. 2007; Strelkov et al. 2007). Cod (Gadus morhua) of the Baltic Sea and of the Barents Sea have similarly low frequencies of the HbI-1 allele at the Haemoglobin-I locus, which strongly differentiates cod of the Baltic and North Seas (Petersen & Steffensen 2003). European flounders (Platichthys flesus) of the Baltic Sea and of the Barents Sea have similar Hsc70 allele frequencies, while both populations are strongly differentiated from populations of the North Sea at this locus (Hemmer-Hansen et al. 2007a). In the amphipod Gammarus zaddachi, a cline in allele frequency is observed at the arginine phosphokinase (APK) allozyme locus in the Öresund, yet a population from Tromsø in Norway was found to have the same allele frequencies at the APK locus as populations of the Baltic Sea (Bulnheim & Scholl 1981). Although some might see the footprint of parallel adaptation in such a pattern, the most parsimonious explanation is that a shared history of secondary contact has resulted in multiple tension zones that have all been trapped at the entrance of the Baltic Sea, in the Öresund and Danish Belts. The geography of this region might have promoted the split of a tension zone while it was moving northward, producing two ‘daughter’ zones, one that became stuck at the entrance of the Baltic Sea and the other moving further north along the Norwegian coast.
The trapping process could have been produced either by exogenous clines at the ecotone or by a natural barrier to dispersal. Indeed, although the environmental differences between the Baltic and North Sea have been strongly emphasized, the entrance of the Baltic Sea also acts as a physical barrier to dispersal. The steep salinity gradient exists because the water masses do not greatly mix. Either scenario implies that clines now observed in allele frequencies at a few marker loci in this region do not necessarily reflect salinity-dependent selection at these or linked loci, although adaptation to salinity probably exists somewhere in the genome.

To summarize, barriers to gene flow often have either or both physical and environmental correlates. When they coincide with a natural barrier to dispersal, it is tempting to attribute the genetic structure to the natural barrier. However, it is often more likely that the genetic barrier is a pre-existing tension zone that has become trapped by the natural barrier (Barton’s trapping prediction). The natural barrier explains the position of the genetic break but not its maintenance. Similarly, when a barrier to gene flow coincides with an environmental boundary, it is tempting to attribute the genetic structure to local adaptation. However, environmental boundaries often coincide with barriers to dispersal, and even when this is not the case, tension zones can still be trapped by the environmental boundary through the coupling of endogenous and exogenous clines. Local adaptation explains the position of the genetic break but not necessarily its maintenance.

GEAs are evident at a local scale in mosaic hybrid zones

GEAs sometimes involve microhabitat differentiation at a small spatial scale (Table 2). A puzzling feature of such fine-scaled mosaic hybrid zones is that the microhabitat heterogeneity that explains the genetic patchiness within the zone often exists outside the zone as well, but there it does not coincide with genetic heterogeneity at neutral marker loci.

Table 2.   Examples of microhabitat differentiation

| Hybrid zone | Microhabitat differentiation | References |
|---|---|---|
| Bombina bombina/B. variegata (fire-bellied toads) | Ponds/puddles | Vines et al. (2003) |
| Gryllus firmus/G. pennsylvanicus (field crickets) | Sands/loams | Rand & Harrison (1989) |
| Mytilus edulis/M. galloprovincialis (smooth-shelled blue mussels) | High tide/low tide; Low salinity/high salinity | Gardner (1994) |
| Mytilus edulis/M. trossulus (smooth-shelled blue mussels) | Low salinity/high salinity | Riginos & Cunningham (2005) |
| Ostrinia nubilalis (European corn borer) | Maize/mugwort | Malausa et al. (2005) |
| Acyrthosiphon pisum (pea aphids) | Alfalfa/red clover | Via (2009) |
| Rhagoletis pomonella (apple maggot) | Hawthorn/apple | Michel et al. (2010) |
| Allonemobius fasciatus/A. socius (ground crickets) | Microclimatic, topographic diversity | Howard & Waring (1991) |
| Festuca ovina (sheep’s fescue) | Grassland habitat variation | Prentice et al. (1995) |
| Semibalanus balanoides (northern acorn barnacles) | Exposed to the sun, high-tide (thermally stressed)/algal cover, shadow, low-tide | Schmidt & Rand (2001) |
| Littorina saxatilis (rough periwinkle) | High shore/low shore | Butlin et al. (2008b) |
| Fucus spiralis (spiral wrack) | High shore/mid shore | Billard et al. (2010) |
| Gadus morhua (Atlantic cod) | Coastal/pelagic | Sarvas & Fevolden (2005) |
| Dicentrarchus labrax (sea bass) | Sea/lagoon | Lemaire et al. (2000) |

In the large mosaic hybrid zone (spanning from the Southwest of France to the North of Great Britain) between the marine mussels Mytilus edulis and M. galloprovincialis, habitat specialization is evident between sheltered habitats under freshwater influence, which are occupied by M. edulis-like genotypes, and oceanic habitats exposed to wave action, which are occupied by M. galloprovincialis-like genotypes (Gardner 1994; Bierne et al. 2002b). In the Gryllus mosaic hybrid zone, G. pennsylvanicus alleles are more frequent in loam soils, and G. firmus alleles more frequent in sand soils (Rand & Harrison 1989). In the Bombina hybrid zone, B. bombina-like individuals are more frequent in ponds, and B. variegata-like more frequent in puddles (MacCallum et al. 1998; Vines et al. 2003). The intuitive interpretation is that M. edulis is adapted to sheltered habitats and M. galloprovincialis adapted to exposed habitats; that G. pennsylvanicus is adapted to loam soils and G. firmus to sand soils; and that B. bombina is adapted to ponds and B. variegata to puddles. However, the sheltered/exposed rocky shores seascape is widespread along the European coastlines, and sand/loam and pond/puddles are also widespread landscapes. The same is true of many of the microhabitat differentiation examples listed in Table 2. The model presented in the previous section provides an alternative interpretation to simple differential adaptation and can also explain why GEAs are observed only locally, in the mosaic hybrid zone, even if the microhabitat heterogeneity is present elsewhere. Subspecies may be equally adapted to both habitats because habitat adaptation polymorphisms segregate in both backgrounds. However linkage disequilibria between endogenous, exogenous and neutral loci are maintained locally in the hybrid zone, allowing neutral molecular markers to capture the GEA only there.

A posteriori this explanation makes sense for two reasons. First, in their allopatric ranges, each species/subspecies tends to occupy all habitats, with specialization becoming apparent only in contact zones. Polymorphisms for adaptation to different habitats can remain invisible at the level of neutral markers within allopatric populations of either background because no linkage disequilibrium exists between neutral genes and habitat-adaptation genes. Second, even with a very low hybridization rate, the adaptive introgression of exogenous alleles from one background to the other poses no problem as alleles are beneficial in one of the two habitats and it is well known that favourable alleles easily cross tension zones (Pialek & Barton 1997).
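The role played by linkage disequilibrium in this argument can be made concrete with the standard two-locus coefficient D, the difference between the frequency of a haplotype and the product of the frequencies of its two alleles. The sketch below is purely illustrative: the allele coding (1/0 at one endogenous and one exogenous locus), the function name and the haplotype frequencies are hypothetical and are not taken from the model or from data.

```python
def linkage_disequilibrium(hap_freqs):
    """Two-locus linkage disequilibrium D = f(1,1) - p_endo * p_exo.

    hap_freqs maps (endogenous_allele, exogenous_allele) tuples, each coded
    0 or 1, to haplotype frequencies that sum to 1.
    """
    total = sum(hap_freqs.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    f11 = hap_freqs.get((1, 1), 0.0)
    p_endo = f11 + hap_freqs.get((1, 0), 0.0)  # frequency of endogenous allele 1
    p_exo = f11 + hap_freqs.get((0, 1), 0.0)   # frequency of exogenous allele 1
    return f11 - p_endo * p_exo


if __name__ == "__main__":
    # Hypothetical deme inside a mosaic zone: backgrounds and habitat alleles coupled.
    coupled = {(1, 1): 0.45, (0, 0): 0.45, (1, 0): 0.05, (0, 1): 0.05}
    # Same allele frequencies, but the association formed in the opposite phase.
    reversed_phase = {(1, 0): 0.45, (0, 1): 0.45, (1, 1): 0.05, (0, 0): 0.05}
    # Hypothetical deme where the habitat polymorphism segregates independently.
    uncoupled = {(1, 1): 0.25, (1, 0): 0.25, (0, 1): 0.25, (0, 0): 0.25}
    for name, freqs in [("coupled", coupled), ("reversed", reversed_phase),
                        ("uncoupled", uncoupled)]:
        print(name, round(linkage_disequilibrium(freqs), 3))
```

In the first two cases the allele frequencies at both loci are identical (0.5 each) and only the sign of D differs, which is the sense in which the phase of the association is arbitrary; in the third case D is zero, so a marker tracking the endogenous background would carry no information about habitat.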

The remarkable geographic structure of the hybrid zone between Mytilus edulis and M. galloprovincialis has largely inspired the ideas developed in this paper. Geographically, this zone is a two-scale mosaic (Bierne et al. 2002b). At a large-scale along the Atlantic coast of France, we observed roughly three independent hybrid zones that define two enclosed patches of parental populations in Brittany (M. galloprovincialis) and in the Bay of Biscay (M. edulis) separated from their external conspecific populations by an allospecific patch (Bierne et al. 2003c). Environmental factors do not easily explain this pattern. At a fine-scale, within each hybrid zone, hybrid populations and pure populations of both species are found in close proximity, forming a micro-mosaic structure that correlates with habitat heterogeneity as explained above. The Bombina hybrid zone also has the structure of a two-scale mosaic although the large scale mosaic has been attributed to altitudinal differences (Hofman et al. 2007). We have argued that the interaction between endogenous and exogenous factors can well explain the fine-grained mosaic structure within a single mosaic hybrid zone, but why are small-scale mosaics repeated independently at several locations and not restricted to a single area? We propose that stochasticity in migration/colonization occurring early in the period of secondary contact between M. galloprovincialis and M. edulis could have favoured the mixing of the different backgrounds over a large spatial scale. To illustrate this, we modified the initial conditions of the secondary contact model described above. The demes in the periphery of the range were fixed for one of the two backgrounds as previously; however each deme in the central portion of the chain of demes was initially assumed to be either fixed for AB or for ab, randomly and independently drawn with equal probabilities. 
This situation simulates a random colonization of habitat patches by long-distance migrants (instead of a continuous, deterministic diffusion process) when the two entities came into contact. Polymorphism at the exogenous locus (adaptation to local variations in habitat) occurred within the two endogenous backgrounds. Figure 5 shows the result of one representative simulation that resulted in a two-scale mosaic hybrid zone. In the central portion of the chain of demes, the system formed multiple stable mosaic hybrid zones, which varied according to initial conditions, migration, selection, linkage disequilibria and habitat heterogeneities. In some places, linkage disequilibria between endogenous and exogenous loci broke down and local-adaptation genes segregated in a fixed endogenous background, while in other places linkage disequilibria were maintained and the coupling operated between endogenous loci and environmental variation (Fig. 5). We often obtained a two-scale mosaic structure with patches of parental endogenous backgrounds enclosed within the zone, in which habitat polymorphism segregates, and independent fine-scale mosaic hybrid zones, as in the simulation presented in Fig. 5 and in the Mytilus hybrid zone in France. In the simulation shown in Fig. 5, endogenous and exogenous loci phased in opposite directions in the central and peripheral zones. This was not the case in every simulation, but it can occur when habitat polymorphism is present in both endogenous backgrounds before the contact. To date, the GEA observed in the hybrid zone between Mytilus edulis and M. galloprovincialis has always been in the same direction (M. edulis-sheltered habitats/M. galloprovincialis-exposed habitats), but this does not mean that reversed associations could not be found with further examination. Interestingly, we have recently sampled M. 
galloprovincialis mussels in a sheltered brackish habitat in the port of Cherbourg (Normandie, France), in a poorly known portion of the hybrid zone. More convincing examples of reversed GEAs will be given in the following section.

Figure 5.

 Secondary contact between two intrinsically incompatible backgrounds with random colonization of habitat patches by long-distance migrants and polymorphism at the exogenous locus within each of the two endogenous backgrounds. Each deme in a central portion of 50 demes in a chain of 100 demes was randomly colonized by one alternative endogenous background. Each deme is assigned to habitat 1 (blue dots) or habitat 2 (green dots) at random. Selection is strong (s = 0.9, t = 0.9). Migration connects demes separated by up to six demes (see text). The system forms multiple stable mosaic hybrid zones. In some places, linkage disequilibria between endogenous and exogenous loci break down and local-adaptation genes segregate in a single endogenous background, while in other places linkage disequilibria are maintained and the coupling operates between endogenous loci and environmental variation.
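
The initial conditions described for this simulation can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used to produce Fig. 5. The function names (init_chain, neighbours) are hypothetical, and three points are assumptions made here: the flanking demes each start fixed for one parental background, backgrounds and habitats are drawn with equal probability, and "up to six demes" is read as a maximum distance of six positions along the chain. Selection and recombination are not modelled.

```python
import random


def init_chain(n_demes=100, n_central=50, seed=0):
    """Return (background, habitat) lists for a linear chain of demes.

    background[i] is 0 or 1 (the two endogenous backgrounds);
    habitat[i] is 1 or 2 (the two habitat types).
    """
    rng = random.Random(seed)
    start = (n_demes - n_central) // 2   # first deme of the central portion
    end = start + n_central              # one past the last central deme
    background = []
    for i in range(n_demes):
        if i < start:
            background.append(0)         # assumed: left flank fixed for background 0
        elif i >= end:
            background.append(1)         # assumed: right flank fixed for background 1
        else:
            # random colonization of a central deme by one of the two backgrounds
            background.append(rng.randint(0, 1))
    # assumed: every deme in the chain is assigned a habitat at random
    habitat = [rng.randint(1, 2) for _ in range(n_demes)]
    return background, habitat


def neighbours(i, n_demes=100, reach=6):
    """Demes exchanging migrants with deme i (assumed: distance <= reach)."""
    lo = max(0, i - reach)
    hi = min(n_demes, i + reach + 1)
    return [j for j in range(lo, hi) if j != i]
```

Calling init_chain with different seeds gives different starting mosaics, which is one simple way to explore how the outcome depends on initial conditions.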

The coupling between an endogenous and an exogenous background therefore explains why GEAs are often observed at a local scale. GEAs are captured by neutral markers only in places where the coupling process can occur because habitats and gene pools are sufficiently intermingled to maintain linkage disequilibria between genetic incompatibilities, local-adaptation genes and neutral loci. The coupling model also predicts that if secondary contacts could occur several times, as in computer simulations, the phase of the association between the exogenous and the endogenous backgrounds could form in opposite directions (Fig. 4). Gould (1989) has argued that if we could replay the history of life it would certainly turn out to be different; in these fictive worlds, Mytilus edulis would perhaps seem to be adapted to exposed habitats and M. galloprovincialis to sheltered habitats, Bombina variegata would appear adapted to ponds and B. bombina to puddles and Gryllus pennsylvanicus would seem adapted to sand soils and G. firmus to loam soils. Unfortunately, secondary contacts often occur only once in nature.
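
The notion of the phase of the association can be made concrete with the standard two-locus linkage disequilibrium coefficient, D = f(AB) - p(A)p(B). The sketch below is a textbook calculation, not part of the model presented here. The allele labels are hypothetical: A/a stand for the two alleles at an endogenous locus and B/b for the two alleles at an exogenous locus, and the haplotype frequencies in the usage note are invented for illustration.

```python
def linkage_disequilibrium(f_AB, f_Ab, f_aB, f_ab):
    """Return D = f(AB) - p(A) * p(B) from the four haplotype frequencies."""
    total = f_AB + f_Ab + f_aB + f_ab
    if abs(total - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")
    p_A = f_AB + f_Ab   # frequency of allele A at the endogenous locus
    p_B = f_AB + f_aB   # frequency of allele B at the exogenous locus
    return f_AB - p_A * p_B
```

With haplotype frequencies (0.4, 0.1, 0.1, 0.4) the coefficient is positive (A is associated with B), whereas with (0.1, 0.4, 0.4, 0.1) it has the same magnitude but the opposite sign (A is associated with b). The two cases correspond to the two possible phases of the coupling.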

Case studies of reversed or alternative GEAs

An attractive prediction of the coupling model is that the phase of the association between endogenous and exogenous backgrounds is not always constrained. If contacts between populations of the same species or subspecies occur in different locations, and if the initial conditions vary appropriately, then the same set of endogenous loci can couple with different sets of exogenous loci in the different locations. We have found some valuable examples of such varied GEAs at remote locations that might be explained by this process.

Mytilus edulis and M. trossulus in Europe and in North America. M. edulis and M. galloprovincialis have met only once in Europe, whereas M. edulis and M. trossulus, the third species of the M. edulis complex, have probably met on two independent occasions: in Europe and in North America (Riginos & Cunningham 2005). As mentioned above, in Europe M. trossulus is found in the Baltic Sea, in some fjords of Norway and bays of the Barents Sea. Consequently, M. trossulus has been inferred to be adapted to lower salinities (Johannesson et al. 1990; Riginos & Cunningham 2005). M. trossulus is also found in the western Atlantic where it forms another hybrid zone with M. edulis, which extends from Maine (Rawson et al. 2001) to Hudson Bay (Koehn et al. 1984). The spatial structure of the western Atlantic hybrid zone between M. edulis and M. trossulus is a mosaic that resembles the European hybrid zone between M. edulis and M. galloprovincialis. It is probably a two-scale mosaic with enclosed patches of differentially introgressed parental backgrounds (Koehn et al. 1984) and several small-scale mosaic zones with environmental correlates (Bates & Innes 1995; Comesaña et al. 1999; Rawson et al. 2001; Toro et al. 2004). The spatial segregation strongly suggests that M. edulis is preferentially found within bays and estuaries while M. trossulus is preferentially found on oceanic coasts (Gartner-Kepkay et al. 1980; Gardner & Thompson 2001; Rawson et al. 2001). With respect to salinity gradients, the distribution in North America therefore appears to be the opposite of that observed in Europe. This led Riginos & Cunningham (2005) to suggest that local adaptation could have occurred after the secondary contacts. However, marine mussels have always been confronted by salinity gradients, and so a more parsimonious explanation is that this reversed GEA illustrates the prediction of the coupling model. 
The genetic integrity of the two species is mainly maintained by an efficient endogenous barrier. This has recently been verified through experimental demonstrations of gamete incompatibility (Rawson et al. 2003; Miranda et al. 2010) and hybrid inviability (Miranda et al. 2010). The coupling of the two endogenous backgrounds with habitat-adaptation genes could have occurred in opposite directions during two independent secondary contacts. It is also possible that opposite phasing could have occurred locally in some unexplored portions of these two large contact zones. Preliminary results from the Kola Bay in Russia indeed suggest such a complex relation with the environment, with inverted zonation at a small spatial scale (M.V. Katolikova, personal communication).

If our hypothesis proves to be true, it would mean that local adaptation has a negligible role in the maintenance of the genetic structure at neutral loci and of barriers to gene flow in Mytilus but is simply revealed within mosaic hybrid zones because in these zones habitat-adaptation genes are in linkage disequilibrium with endogenous backgrounds and neutral markers.

Semibalanus balanoides in Maine, the Gulf of St. Lawrence and Rhode Island.  In the barnacle S. balanoides, allele frequencies at the mannose-6-phosphate isomerase (Mpi) locus are correlated with the degree of physiological stress experienced between high-tide (‘hot’) and low-tide (‘cold’) microhabitats (Schmidt & Rand 1999). In Maine (USA), genetic homogeneity was observed both among stages and habitats for two control loci (Gpi and mtDNA), while genetic differentiation appeared at the Mpi locus at the juvenile stage, reflecting habitat-specific differential mortalities (Schmidt & Rand 2001). This species therefore provides a beautiful example of local selection in a fine-grained environment. However, things might be more complex than selection on a single locus in a heterogeneous environment (Bierne et al. 2003a). Further north, in the Gulf of St. Lawrence, a genetic discontinuity is observed over a short distance at the position of the Miramichi River, not only at the Mpi locus but also at the Gpi locus (Holm & Bourget 1994; Véliz et al. 2004) and at two out of six microsatellites (Dufresne et al. 2002). The differential in Mpi allele frequency between the two sides of the Miramichi River is exactly the same as the difference between the two microhabitats in Maine: the frequency of the F allele changes from ∼50% (low-tide, south of Miramichi) to ∼75% (high-tide, north of Miramichi). In such a large-scale spatial context it is difficult to understand why the allele frequency differential does not reach stronger values if selection directly affects allozyme loci. Furthermore, the ecological differences between the two sides of the Miramichi River are unclear. The north is slightly cooler (Drouin et al. 2002) while the frequency of the F allele, inferred to be adapted to higher temperature in Maine, increases in the north, suggesting the relationship with temperature is possibly inverted. 
The Miramichi estuary, on the other hand, is likely to act as a natural barrier to dispersal (Drouin et al. 2002) capable of trapping tension zones. Finally, Rand et al. (2002) replicated the study of GEA at the Mpi and Gpi loci further south, in Narragansett Bay, Rhode Island. Not only did Rand et al. (2002) find a significant microhabitat zonation at the Gpi locus, but they also observed an opposite zonation pattern at the Mpi locus.

We propose that Gpi, Mpi and structured microsatellite loci might be simple markers of differentiated, partly incompatible, endogenous backgrounds. Interestingly, populations south of the Miramichi River have a similar genetic composition to European populations at the Mpi and Gpi loci (Holm & Bourget 1994), an observation that would fit a secondary contact scenario. Mpi and Gpi may possibly be directly involved in the barrier (Flight et al. 2010) but irrespective of the environment (i.e. endogenous selection) and with a complex determinism, as differential fixation is not observed on either side of the Miramichi River. The cohesiveness of the two endogenous backgrounds as well as their associations with habitat-adaptation genes would vary between localities. Under this hypothesis, the reversed GEA observed between Maine and Rhode Island, or between Maine and the Miramichi River area, is not surprising as it is one prediction of the coupling model.

Anopheles gambiae s.s. in Western and Central Africa.  The malaria mosquito A. gambiae sensu stricto, which is the nominal species of the complex A. gambiae sensu lato, provides a convincing example of how different backgrounds, probably involved in different components of reproductive isolation (i.e. endogenous vs. exogenous), can form different associations at different places. In this case, both backgrounds have been characterized at the molecular level.

Chromosome surveys of inversion polymorphisms had led to the subdivision of A. gambiae s.s. into five chromosomal forms: Savanna, Mopti, Forest, Bamako and Bissau (Coluzzi et al. 1985; Touré et al. 1998). Cytogenetic studies indicated strong deviations from Hardy-Weinberg and linkage equilibrium and revealed association between inversion polymorphisms and environmental variation, such as the degree of aridity (Bryan et al. 1982; Coluzzi et al. 1985; Touré et al. 1998; Powell et al. 1999).

Extensive molecular analyses attempted to further distinguish the number of isolated or semi-isolated gene pools that exist in A. gambiae s.s., but demonstrated the existence of only two different entities, now referred to as molecular forms M and S (della Torre et al. 2001; Lehmann & Diabate 2008). Strong differentiation between the two forms is mainly restricted to small chromosome regions near the centromeres (Turner et al. 2005; White et al. 2010), including the X pericentric region to which the rDNA locus belongs; alleles at this locus are used to define the M and S forms (della Torre et al. 2005). Furthermore, the centromeric regions of all three chromosomes are in near-maximal linkage disequilibrium (White et al. 2010). Elsewhere in the genome, the level of genetic differentiation between the M and S forms is low though still significantly higher than between geographically distant populations of the same form (Wondji et al. 2002). It remains unclear whether the M/S barrier is semi-permeable (in which case chromosomal regions with high M/S differentiation correspond to the location of genes involved in reproductive isolation) or genome-wide (in which case differentiated regions correspond to regions of low recombination that differentiate faster owing to a stronger impact of hitchhiking and background selection) (Noor & Bennett 2009; White et al. 2010).

However, the relationship between the molecular (M/S) and chromosomal forms is not simple. In West Africa (Mali and Burkina Faso), exact correspondence between molecular and chromosomal forms has been found: all Mopti individuals belong to the M molecular form while Savanna and Bamako chromosomal forms correspond to the S molecular form (this explains the origin of the M/S nomenclature). But this association breaks down in other areas of Africa (della Torre et al. 2001; Simard et al. 2009). In Cameroon (Central Africa), chromosomal arrangements assort independently of the M/S molecular forms (Simard et al. 2009). Each molecular form is polymorphic for chromosomal arrangements that approximately correspond to the Forest and Savanna chromosomal forms.

An abundant literature has emphasized the importance of chromosomal inversions in ecological adaptation (Powell et al. 1999). For instance, the frequency of the 2La arrangement correlates well with the degree of aridity in West African populations (Coluzzi 1992). Simard et al. (2009) recently studied GEA with both cytological and molecular markers and found that alternative homokaryotypes segregated in contrasting environments within both the M and S genetic backgrounds in Cameroon. Therefore inversion polymorphisms are probably directly involved in environmental adaptation. In contrast, the M and S molecular forms, the divergence of which has been studied in a chromosomally homosequential context (Turner et al. 2005), could reflect an endogenous barrier to gene flow, most probably pre-zygotic (White et al. 2010).

In this example the endogenous backgrounds (M/S) have become tightly coupled to an exogenous polymorphism (chromosomal arrangement) in one location, Western Africa, while exogenous loci have remained polymorphic within each endogenous background in another location, Central Africa. As a consequence, a GEA involving variations in aridity is detected for the M/S polymorphisms in Western Africa while it is not detected, or is detected with different environmental factors, in Central Africa (Simard et al. 2009). This suggests that local adaptation has a negligible effect on gene flow compared to the endogenous barrier. Interestingly, the chromosome 2 inversion that contributes to the adaptation of the M form to drier conditions than the S form in Western Africa is likely to have introgressed adaptively from A. arabiensis, another sister species of the complex (Besansky et al. 2003). Adaptive introgression of locally adapted genes through endogenous backgrounds is another prediction of the model.

This might be a case where the distinction between endogenous and exogenous selection is not clear-cut. For example, inversions might also have some slight intrinsic disadvantage as heterozygotes. Nevertheless, discordance between karyotypic and molecular markers in Anopheles gambiae (as with the Australian Morabine grasshoppers of the genus Vandiemenella, Kawakami et al. 2009; Kearney & Hewitt 2009) illustrates how co-adapted alleles can couple at some locations, and scatter at others, depending on the landscape, the genetic determinism and historical contingency.

Littorina saxatilis in Spain, Great Britain and Sweden.  The intertidal snail L. saxatilis has become a popular model in the study of parallel ecological speciation (Rolán-Alvarez et al. 2004; Sadedin et al. 2009; Johannesson et al. 2010). L. saxatilis is direct-developing, lacking a dispersive larval stage, which could facilitate local adaptation. It inhabits the intertidal rocky shores throughout Europe. Two morphologically different ecotypes are usually found on the shore: a small, thin-shelled, morph with a wide aperture and a larger, thicker-shelled morph with a narrower aperture (Butlin et al. 2008). Pairs of divergent morphs have been well studied in three different areas: Galicia (northwest Spain), Yorkshire (northeast England), and around the Tjärnö marine biological laboratory in southwest Sweden. Within each site one morph occupies the high shore and the other morph the lower shore. The two morphs are separated by very narrow transition zones, sometimes only 2 m wide (Grahame et al. 2006). The two morphs are isolated by multiple mechanisms of reproductive isolation: local adaptation (Janson & Sundberg 1983; Rolán-Alvarez et al. 1997) and habitat preference (Cruz et al. 2004b) as well as assortative mating (Johannesson et al. 1995b; Hull 1998; Pickles & Grahame 1999; Cruz et al. 2004a; Hollander et al. 2005) and hybrid fitness depression (Hull et al. 1996; Cruz & Garcia 2001; Rolán-Alvarez 2007). The level of genetic differentiation between the two morphs is globally low but usually higher than between populations of the same morph separated by a similar distance (Grahame et al. 2006; Panova et al. 2006). However, genome scans have revealed that around 5% of the AFLP markers studied are high-Fst outliers (Wilding et al. 2001; Galindo et al. 2009). There are nonetheless very few convincing arguments to support the claim that these outlier loci are more affected by disruptive local adaptation than by hybrid fitness depression or assortative mating genes.

An interesting result is the opposite vertical zonation observed in Spain and England: in Spain, the large and thick morph occupies the high shore and the small and thin morph occupies the lower shore, while it is the reverse in England (Butlin et al. 2008). One explanation is that the environment to which morphs are differentially adapted also shows an opposite vertical relationship (Butlin et al. 2008). A thick shell with a narrow aperture is thought to improve resistance to crab predation, while a small size and a large aperture, and thus a larger foot, would be an adaptation to withstand wave action. However, the two morphs have also been assumed to respond to other environmental gradients, such as emersion times (desiccation), salinity or temperature (Johannesson et al. 1993; Rolán-Alvarez et al. 1997). In the English shore as in many other shores, the upper part is exposed to stronger wave action and crabs are more abundant in the mid and low shore. In the Spanish shore, the risk of crab predation would possibly be higher in the upper shore and wave action stronger in the lower shore (Butlin et al. 2008), while the possible reversal of other intertidal gradients is unclear. The coupling model shows that one does not necessarily need to find a reverse relationship between the vertical zonation and the environment to explain inverted GEA. It is also possible that exogenous and endogenous barriers have phased differently in the two areas.

We here leave aside the question of whether the various barriers evolved independently at different places (parallel speciation, Johannesson et al. 2010) or whether they have a common origin and subsequently self-organized according to the history of population displacements and to the landscape while leaving the neutral diversity freed from equilibrating with geographical distance (Grahame et al. 2006). This issue may be tackled in the near future through detailed studies of sequence variation at outlier loci found in genome scans. If the divergence between alleles at these loci turns out to be very ancient, as suggested by preliminary results (Wood et al. 2008) or in another case study of parallel evolution in sticklebacks (Colosimo et al. 2005), the hypothesis of a common origin will become plausible, although alternative explanations implying evolution from standing genetic variation exist (Schluter & Conte 2009; Johannesson et al. 2010).

Ostrinia nubilalis in France, the United States and Japan. O. nubilalis (corn borer) is a phytophagous insect that exhibits interesting patterns of association between genetic and environmental differentiation. Two pheromonal races, named E and Z according to the composition of the volatile compounds emitted during the mating process, have been described as reproductively isolated in sympatry (Malausa et al. 2005). In France, the E and Z races appear specialized to two different plant hosts: the E race is found on mugwort or hop, while the Z race is found on maize (Pelozuelo et al. 2004). In North America, where the European corn borer was introduced at the beginning of the 20th century, the two pheromonal races coexist on a single host, maize. Furthermore, a voltinism polymorphism has evolved. O. nubilalis moths are either univoltine (one reproduction cycle per year, hereafter named U strain) or bivoltine (two or more reproduction cycles per year, hereafter named B strain). This character can be considered either as an endogenous isolation system (moths do not emerge at the same time) or as an adaptation to a temperature gradient (moths living in the north of their distribution in the US are generally univoltine, while those living in the south tend to be bivoltine). Interestingly, in the US, an association can be found between the E/Z and U/B polymorphisms. Moths can be BE, BZ or UZ, but UE phenotypes have never been observed (Dopman et al. 2010). This situation illustrates how the E/Z polymorphism, responsible for an endogenous barrier that pre-existed in European populations, has coupled with an exogenous barrier associated with host choice in France, while it seems to have entered into a coupling process with an alternative environmental factor (temperature variation) in the US. Interestingly, in Japan the E/Z polymorphism exists in two different Ostrinia species, not only O. nubilalis but also O. scapulalis (Huang et al. 2002). However, in Japan, O. nubilalis lives exclusively on maize while O. scapulalis lives on hop and other dicotyledonous plants. This observation led Frolov et al. (2007) to revise the systematics of Ostrinia moths and to propose that moths living on maize should be considered as belonging to O. nubilalis, while those living on dicots should be named O. scapulalis, including the E races initially named O. nubilalis in France. This illustrates that the interaction of multiple genetic barriers (pheromones, local adaptation to host plants, voltinism) can be so complex as to create a taxonomical conundrum even in well-studied taxa. Whatever the species names, the three geographic locations (France, US and Japan) clearly display alternative associations between an endogenous barrier (mate preference) and exogenous selection (adaptation to host plants or temperature). A broader analysis of European populations on a larger spatial scale should reveal whether the coupling between the E/Z backgrounds and the mugwort/maize environmental variation is only local (to France) or is more widespread.


An alternative interpretation of GEAs

The approach of scanning genomes for loci with anomalously high levels of differentiation has become a standard of population genetics (Luikart et al. 2003). Sophisticated tests are continuously developed to identify so-called Fst outlier loci (Beaumont & Balding 2004; Foll & Gaggiotti 2008; Excoffier et al. 2009; Bazin et al. 2010). However, the exact form of the selection responsible for extreme differentiation is hardly ever addressed and loci of greatly increased differentiation are simply assumed to be under divergent selection (‘local adaptation’). This interpretation is sometimes given even in the absence of an observed relationship with an ecological variable. GEA is nonetheless often observed at outlier loci, which is then taken as additional support for the action of ecological selection either on the locus itself or on a linked locus. We argue here that GEA provides evidence that local adaptation exists somewhere in the genome but provides little indication that the identified loci (or small chromosomal regions surrounding them) directly respond to that selection. This is particularly true when outlier loci are found to represent a substantial portion of the panel of loci screened.
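
To make the notion of an outlier locus concrete, the sketch below computes Fst for a biallelic locus sampled in two populations, using the simple heterozygosity-based definition Fst = (Ht - Hs)/Ht with equal population weights, and then flags the most differentiated loci by rank. This is a deliberately crude stand-in and does not reproduce any of the tests cited above, which rely on explicit null models. The function names and the top_fraction parameter are hypothetical. A rank-based cut-off always flags some loci, and flagging a locus says nothing about the form of selection acting on it.

```python
def fst_two_pops(p1, p2):
    """Fst at a biallelic locus from allele frequencies in two populations."""
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)        # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0                           # locus is monomorphic overall
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


def flag_top_loci(fsts, top_fraction=0.05):
    """Indices of the most differentiated loci (rank-based, no null model)."""
    n = len(fsts)
    if n == 0:
        return []
    k = max(1, round(top_fraction * n))      # number of loci to flag
    ranked = sorted(range(n), key=lambda i: fsts[i], reverse=True)
    return sorted(ranked[:k])
```

For example, allele frequencies of 0.50 and 0.75 in the two populations give Fst = 1/15, about 0.067, under this definition.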

Decades of research on hybrid zones should have taught us that efficient genetic barriers to gene flow are very often endogenous – i.e. tension zones (Barton & Hewitt 1985). The existence of an endogenous genetic barrier is strongly suggested when many loci exhibit a concordant genetic structure. However, when GEA is observed it is very tempting to attribute the structure to environmental selection, one Holy Grail of evolutionary genetics. We believe genetic structure of this kind might often be due to many endogenous loci trapped at an environmental boundary by a smaller number of exogenous loci. Not only is this possibility theoretically expected but we have found examples that can plausibly be explained by this process. In the mosquito A. gambiae s.s., the barrier to gene flow between the M and S forms is as strong in Cameroon, where endogenous backgrounds assort independently of the exogenous backgrounds, as in Mali, where the endogenous and exogenous backgrounds have become coupled together (Wondji et al. 2002; White et al. 2010). In Mytilus mussels, habitat adaptation polymorphisms are likely to segregate within monospecific patches of populations but have remained invisible to molecular studies to date. Their existence is attested by local GEAs observed in fine-grained hybrid zones, in which coupling occurred with the endogenous barrier, and by the opposite relationship to environmental heterogeneity in two replicated secondary contact zones in Europe and North America.

When the concomitant action of endogenous and exogenous factors is recognized as contributing to reproductive isolation, either a secondary contact is suspected and one refers to the hybrid zone framework, or the secondary contact scenario is thought unlikely (albeit difficult to rule out definitively) and one refers to ecological speciation. In the latter case, exogenous selection is seen as the driving force that led to the evolution of reproductive isolation, the first spark to initiation of the subsequent accumulation of barrier genes (Nosil et al. 2009; Schluter & Conte 2009; Via 2009), possibly in the chromosomal neighbourhood of locally adapted genes (Smadja et al. 2008; Via & West 2008). This review is not the place for a discussion of the importance of ecology-driven divergent selection in speciation – a complex issue that despite abundant attention, and much speculation, is not yet settled. There is nonetheless a point that is relevant here: the ecological speciation hypothesis faces a ‘chicken-egg dilemma’, but typically assumes that the exogenous barrier has arisen first and then drives the evolution of the endogenous barrier. While this scenario is attractive, the coupling model presented here provides an alternative possibility. New environmental adaptation can trap a pre-existing endogenous barrier from a distant location (Fig. 2), and most species are likely to have endogenous barriers somewhere in their distribution range, either because of a history of vicariance (Hewitt 2000) or because genetic incompatibilities accumulate in parapatry (Kondrashov 2003; Navarro & Barton 2003; Gavrilets 2004). New adaptations are not required to drive the evolution of a new barrier; they can couple with a pre-existing one. One possible example of this process is found in the corn borer, Ostrinia nubilalis, which colonized maize after its introduction into Europe, around 500 years ago. 
Adaptation to maize has probably resulted in the capture of a pre-existing endogenous barrier between O. nubilalis and O. scapulalis rather than the evolution of a new one through the process of ecological speciation. Similarly, the complexity and the genomic extent of the barrier to gene flow between the hawthorn and apple host races of the maggot fly Rhagoletis pomonella (Michel et al. 2010) do not suggest an origin as recent as 150 years – the estimated age of the host shift to apple. When exogenous and endogenous barriers arise together, it is not necessarily meaningful to ask which of the two drove the evolution of the other. In a groundbreaking mathematical analysis, Barton & de Cara (2009) recently revised our view of the reinforcement process: strong isolation can evolve through the coupling of any kind of incompatibility, whether pre- or post-zygotic. The coupling occurs when selection for increased variance in heterozygosity drives the increase of linkage disequilibrium, which couples different components of reproductive isolation. That is why Barton & de Cara (2009) argued that the action of selection on linkage disequilibria is adaptive. Although these authors considered intrinsic incompatibilities only, disruptive exogenous selection often favours maximal variance and exogenous loci can also enter the coupling process, as shown in our model. By presenting speciation as a coupling process, Barton & de Cara (also see Udovic 1980; Kirkpatrick & Ravigné 2001) reconciled traditionally opposed views of speciation: the evolution of reproductive isolation is seen as an accumulation of incompatibilities of any kind, pre- or post-zygotic, endogenous or exogenous, regardless of their nature or order of appearance. Acknowledging that speciation is a gradual multifactorial process renders less important the questions of which type of factor acts first or most strongly, questions to which a general answer may not exist.

Why are endogenous barriers underappreciated to explain GEAs?

It is difficult to find evidence for endogenous selection.  Our own experience in the study of hybrid fitness led us to realize how difficult it can sometimes be to demonstrate the existence and causes of hybrid fitness depression. Two decades of research on the hybrid zone between the mussel species M. edulis and M. galloprovincialis failed to demonstrate lower fitness of hybrids (Wilhelm & Hilbish 1998). On the other hand, associations between genetic structure and environmental factors (salinity, wave exposure), led researchers to emphasize habitat specialization as the principal mechanism of reproductive isolation between the two species (Gardner 1994). However, the pattern of genetic structure within hybrid populations suggested exogenous selection could not be acting alone and that endogenous isolation mechanisms must also have existed (Bierne et al. 2002b). To uncover hybrid fitness depression, we had to perform controlled crosses in the lab, and to go through two generations of hybridization (Bierne et al. 2006). Unfit hybrid genotypes, those key genotypes that prevent introgression (Barton 2001), are expected to be rare in nature. In order to estimate their fitness accurately, they must be produced in large numbers through experimental crossing. In addition, hybrid dysgenesis often appears only in the F2s (after one generation of recombination) while heterosis is often observed in F1 progeny (Dobzhansky 1952; Alibert et al. 1997; Edmands 1999; Bierne et al. 2002a). We believe the Mytilus case is not isolated because controlled crosses are not always feasible and the F2 generation rarely investigated. Furthermore, hybrid fitness is often investigated in nature through the study of associations between genotypes at marker loci and phenotypes. However, ‘hybrid individuals’ sampled in natural populations and used to estimate hybrid fitness are often complex genetic mosaics in which neutral loci have a loose linkage disequilibrium with barrier loci. 
Although the study of genotype/phenotype associations can often be powerful at inferring phenotypic differences between parental backgrounds in hybrid zones, its efficiency in accurately inferring the fitness of ‘hybrids’, an ambiguously defined genotypic category, is less clear (Boecklen & Howard 1997).

Identifying genetic incompatibilities is even more difficult. For instance, the identification of barrier genes is a recent development in the study of speciation that awaited the development of molecular techniques in model species (Noor & Feder 2006; Wolf et al. 2010). The genetics of endogenous post-zygotic isolation has long been mostly investigated in Drosophila, focussing on X-linked genes with large effects on fitness and between well-delimited species (Orr et al. 2004). Decades of research pioneered by Dobzhansky provided overwhelming evidence that hybrid fitness depression results from the accumulation of DM incompatibilities (Coyne & Orr 2004). However, despite valuable recent works in other species (Presgraves 2010) the study of more loosely isolated backgrounds and of autosomal genes with moderate effects on fitness remains limited (Rieseberg & Buerkle 2002; Bierne et al. 2006; Edmands et al. 2009; Wolf et al. 2010). In Mytilus mussels, we were lucky enough that the number of DM incompatibilities was sufficiently high for our markers to easily map them, but it proved more difficult to infer the genetic determination of hybrid dysgenesis in Tigriopus copepods (Edmands et al. 2009). For weaker barriers involving fewer genetic incompatibilities, the effort needed to map interacting incompatibilities is much greater.

It is usually thought that there exists an optimal genetic distance above which the beneficial effects of hybridization (i.e. heterosis) are overwhelmed by its negative effects (i.e. outbreeding depression, Waser 1993). Surveys that have tried to identify this optimum have often found the distance to be very small (Escobar et al. 2008). At larger scales, the correlation between parental divergence and post-zygotic isolation is positive (Edmands 1999, 2002), although the scaling remains somewhat blurred because the metrics used are either spatial, ranging from metres to thousands of kilometres, or temporal, ranging from thousands to millions of years (Edmands 2002). In any case, hybrid fitness depression is often observed at a spatial scale for which neutral markers do not exhibit a strong genetic differentiation (Escobar et al. 2008).

In the era of genomics, research on the genetic basis of reproductive isolation is likely to reveal the importance of DM incompatibilities even between populations within species, provided it is looked for and inherent analytical difficulties are solved. Recent studies are progressing in this direction. Studying segregation distortion in an F2 cross between two divergent populations of Mimulus guttatus, Hall & Willis (2005) found that half of all markers significantly departed from Mendelian expectations, allowing them to map 12 genetic incompatibilities. A similar level of departure was detected in an interspecific map between M. guttatus and M. nasutus. Similarly, McDaniel et al. (2007) found a high rate of segregation distortion in an interpopulation cross of the moss Ceratodon purpureus, with evidence that this arises from epistatic interactions. Montooth et al. (2010) recently studied mitochondrial-nuclear incompatibilities in flies by combining mitochondria with either interspecific nuclear backgrounds or with a nuclear background of a different population of the same species. Significant epistasis for male fitness was observed in an intraspecific cross, and surprisingly this was the strongest effect, stronger than in an interspecific combination. Lachance & True (2010) studied epistatic fitness interactions between the X chromosome and autosomal genetic backgrounds derived from different geographic locations of Drosophila melanogaster and found considerable amounts of recessive incompatibilities. Recently, genetic incompatibilities have been identified between Arabidopsis thaliana accessions (Bikard et al. 2009; Alcazar et al. 2010). All of these studies suggest that DM incompatibilities are widespread, even within a species, although their characterization is difficult and time-consuming. There is thus no reason to dismiss the possibility of such incompatibilities in the interpretation of Fst scans. 
Interestingly, the segregation of genetic incompatibilities within Drosophila and Arabidopsis has been observed at spatio-temporal scales which are much smaller than usually anticipated (Kolaczkowski et al. 2010; Turner et al. 2010).
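As an illustration of what 'departure from Mendelian expectations' means for a single co-dominant marker in an F2 cross, the following minimal Python sketch applies a chi-square goodness-of-fit test against the 1:2:1 ratio. It is not the procedure used in any of the studies cited above; the function name and the genotype counts are hypothetical. Detecting distortion at a marker does not by itself identify an epistatic incompatibility, which requires testing pairs of loci.

```python
import math


def f2_segregation_test(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit of F2 genotype counts to the Mendelian 1:2:1 ratio.

    Returns (chi2, p_value). Hypothetical helper for illustration only.
    """
    total = n_aa + n_ab + n_bb
    observed = (n_aa, n_ab, n_bb)
    expected = (0.25 * total, 0.50 * total, 0.25 * total)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Three genotype classes give 2 degrees of freedom; for 2 d.f. the
    # chi-square survival function has the closed form exp(-x / 2).
    p_value = math.exp(-chi2 / 2.0)
    return chi2, p_value


# Hypothetical counts at two markers scored in 100 F2 individuals.
balanced = f2_segregation_test(25, 50, 25)   # matches 1:2:1 exactly
distorted = f2_segregation_test(40, 50, 10)  # deficit of one homozygote class
print(balanced, distorted)
```

In a genome-wide survey the per-marker p-values would need correction for multiple testing, and linked markers are not independent tests.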

It is not appreciated that trapping of tension zones can occur.  It is well recognized that tension zones are expected to become trapped by natural barriers to dispersal (Barton 1979a; Barton & Hewitt 1985; Hewitt 1988). Despite this prediction, natural barriers are still often interpreted as explaining not only the position of genetic breaks, but also the genetic breaks themselves. Even worse, when a natural barrier coincides with an environmental boundary, the genetic structure is often said to result from local adaptation. Although well established, Barton’s trapping hypothesis does seem to us to be insufficiently appreciated outside of the hybrid zone literature. The expectation that tension zones can coincide with environmental boundaries, even without a barrier to dispersal, remains implicit even in much of the hybrid zone literature so it is not surprising that it is not appreciated elsewhere. However, it has long been known that tension zones are expected to stabilize where parental fitnesses become equal and in such a way as to minimize their length (Barton & Hewitt 1985), and it should be an intuitive outcome of speciation and cline theories that endogenous and exogenous backgrounds can couple together. It is therefore timely to emphasize that tension zones can be trapped by exogenous clines at ecotones, and more generally to encourage molecular ecologists to disentangle the question of the cause of the position of a genetic break/cline and the question of its nature. An interesting issue is how natural barriers compete with environmental variation in attracting tension zones. Tension zones can be trapped by minor local barriers and will not easily get to an environmental transition. 
It is however plausible that a continual process of extinction and recolonization could tend to bring exogenous and endogenous clines together, and that once this happens, the strong net effect of environment-related selection may push hybrid zones to the ecotone more effectively. More generally, random variations in population density can help to regularly relocate endogenous clines until they overlap with an exogenous cline. Further modelling is required to investigate this issue.

Ideas from the hybrid zone literature have not been considered in the Fst outlier literature.  The hybrid zone literature is consistently ignored in the literature on local adaptation at the molecular level and in Fst scan surveys. The reason is probably a mistaken belief that the cases are quite different. We believe they are in fact the same and that the population genetics of local adaptation should consider arguments from the hybrid zone literature more seriously.

In the hybrid zone literature of the 1980s, while genetic markers were mostly allozymes and the number of polymorphic loci available could not exceed a few dozen, it was common practice to identify a handful of loci that best discriminated partially isolated forms. After preliminary analyses of genetic differentiation, panels of so-called ‘diagnostic’ or ‘semi-diagnostic’ markers were consistently used in further field studies and lab experiments ignoring less informative loci. For example, this was true of Mytilus hybrid zones that were consistently analysed with four or five allozymes chosen from a possible 24 (Bierne et al. 2003d) and regrettably are now often analysed with a single DNA marker only. This was also true for other famous hybrid zones such as the Bombina toads hybrid zone studied with five allozymes from a possible 29 (Szymura 1993), the Gryllus crickets hybrid zone studied with three allozymes from a possible 23 (Harrison 1979), or the Mus mice hybrid zone studied with 10 allozymes from a possible 36 (Bonhomme et al. 1984; Raufaste et al. 2005). Although these markers were recognized as being affected by reproductive isolation, potentially because they were directly involved, they were used as markers of the genetic backgrounds responsible for the ‘superphenotype’ that combined the effect of every reproductive isolation factor. The initial scan of differentiation, although modest, resembles the Fst scan strategy that is increasingly used nowadays (Luikart et al. 2003; Beaumont 2005; Storz 2005; Nosil et al. 2009). Strangely enough, the recent genome scan literature contains a far narrower range of hypotheses than the older hybrid zone literature for interpreting high Fst outliers, almost always invoking local selection. 
Such an interpretation avoids the questions of whether selection is acting now or acted in the past, whether selection affects, or has affected, the focal loci directly or indirectly, whether the association with a trait or an environment is direct or indirect, whether it is selection or other factors that similarly modify genotypic frequencies (e.g. pre-zygotic isolation mechanisms, habitat choice) and finally consistently ignores the possible existence of hybrid fitness depression. One might argue that we are comparing different time scales and differentiation levels. However, Fst scans have often been conducted between recognized isolated forms that would fit well in the hybrid zone framework (Wilding et al. 2001; Murray & Hare 2006; Via & West 2008; Nielsen et al. 2009; Michel et al. 2010) and conversely, some hybrid zones display very low levels of genetic differentiation at most loci (Halliday et al. 1983; Nielsen et al. 2003). Barton & Hewitt (1985) estimated that on average, 14% of loci showed clear genetic differences in the 34 hybrid zones with molecular data they reviewed. This is not very different from the proportion of Fst outlier loci usually detected in modern genome scans. Finally, few species are likely to have remained unperturbed by a history of population displacement, fragmentation and secondary contact (Avise 2000; Hewitt 2000; Lewontin 2002), providing ample opportunities for the evolution of partially isolated genetic backgrounds. Genetic barriers and hybrid zones are expected to be ubiquitous. The question is less their existence than their intensities, and the number of loci needed to reveal partially isolated backgrounds of various natures. The detailed study of the spatial population genetics of outlier loci identified from genome scans is likely to reveal that multiple outliers often display concordant geographic structure, as is observed in the first few examples (Coop et al. 2009; Bradbury et al. 2010; Hohenlohe et al. 2010). 
Geographic coincidence calls for nontrivial explanations – it requires either that all the relevant environmental variables are concentrated at the same location or more likely that incompatible endogenous backgrounds largely contribute to the genetic structure observed in natural populations (Barton & Hewitt 1985).
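For readers less familiar with the quantity being scanned, the sketch below computes Wright's Fst for a biallelic locus from allele frequencies in two equally weighted populations, as Fst = (Ht - Hs)/Ht, and flags loci above a cut-off. It is a deliberately simplified illustration: the allele frequencies, the locus names and the fixed threshold are all hypothetical, and published outlier methods (e.g. Beaumont 2005) instead compare each locus with a neutral expectation. As argued above, a locus flagged this way may owe its differentiation to local adaptation, to an endogenous barrier, or to linkage with either.

```python
def fst_two_populations(p1, p2):
    """Wright's Fst for one biallelic locus in two equally weighted populations.

    p1, p2 are frequencies of the same allele in each population.
    """
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)                        # Ht
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0  # Hs
    if h_total == 0.0:
        return 0.0  # locus fixed for the same allele in both populations
    return (h_total - h_within) / h_total


def flag_outliers(freqs, threshold):
    """Return names of loci whose Fst exceeds an arbitrary, user-chosen threshold."""
    return [name for name, (p1, p2) in freqs.items()
            if fst_two_populations(p1, p2) > threshold]


# Hypothetical allele frequencies at three marker loci.
freqs = {"locus_A": (0.50, 0.50), "locus_B": (0.45, 0.55), "locus_C": (0.10, 0.90)}
fst_values = {name: fst_two_populations(*p) for name, p in freqs.items()}
outliers = flag_outliers(freqs, threshold=0.3)
print(fst_values, outliers)
```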


The early study of hybrid zones was arguably characterized by excessive optimism about our ability to infer evolutionary process from genetic patterns. A few decades of research later these hopes have been tempered, and it is widely accepted that long term experiments in the lab and field must accompany inferences from genetic data (Moore & Price 1993; Harrison 1998; Ross & Harrison 2002). However, the same excessive optimism now seems to underlie the genome scan approach, which mostly focuses on a single concept – local adaptation – to explain unusual differentiation, and seems to ignore the complexities of (i) population history, (ii) the genetic architecture of barriers to gene flow, (iii) the interaction between selected loci of different types, and (iv) the indirect path through which selection affects neutral variation. Our understanding of the genetics of local adaptation must be informed by arguments from the hybrid zone literature because the two biological situations differ in respect of the intensity of the barrier to gene flow but not necessarily in respect of its nature, which is often multifactorial, and both exogenous and endogenous. Barriers of both types have a tendency to become coupled, such that tension zones can come to coincide with habitat boundaries, while not themselves involved in local adaptation.


The first author thanks Louis Bernatchez and members of his lab for a stimulating week in Québec which initiated this review. The authors thank Matthieu Faure, Pierre-Alexandre Gagnaire and Sergine Ponsard for insightful discussion as well as David Rand and two anonymous reviewers for helpful comments on the manuscript. This work was funded by the Agence Nationale de la Recherche (Hi-Flo project ANR-08-BLAN-0334-01). This is article 2011-033 of Institut des Sciences de l’Evolution de Montpellier.

The authors are interested in diverse areas of population genetics, evolutionary biology and molecular evolution. N.B. and F.B. are members of the Institut des Sciences de l’Evolution at the University of Montpellier. They often focus their research on marine species and conduct their experiments at the marine lab in Sète. P.D. is a member of the Centre d'Ecologie Fonctionnelle et Evolutive in Montpellier and has long collaborated with N.B. and F.B., especially on speciation and population structure in marine bivalves. J.W. and E.L. were postdoctoral fellows of the Hi-Flo project coordinated by N.B. which aimed to broaden and improve the interpretations of genomic regions of enhanced differentiation. J.W. is now a lecturer in the Department of Genetics at the University of Cambridge and E.L. is a post-doc at the Station biologique de Roscoff.

Data accessibility

Windows executables for simulations have been deposited at Dryad: doi: 10.5061/dryad.8743


Genetic barrier to gene flow – A reduction in effective gene flow at neutral loci as a consequence of selection on linked loci. This reduction occurs because a neutral allele introduced from one gene pool into another, will be initially associated with alleles that are locally eliminated by either endogenous or exogenous selection. Recombination is thus necessary for the neutral immigrant allele to diffuse between the two gene pools. The diffusion of neutral alleles across such barriers is slower than over similar distances within a single gene pool.

Endogenous loci – Loci that produce hybrid fitness depression irrespective of the environment.

Endogenous barrier – A genetic barrier produced by endogenous loci.

Tension zone – Geographical zone where populations with incompatible genetic backgrounds are in contact (i.e. in which a cline of allele frequencies at endogenous loci can be observed).

Endogenous cline – A monotonic change in frequency of alleles at endogenous loci along some direction in space.

Exogenous loci – Loci at which different alleles are adapted to different environmental conditions.

Exogenous barrier – A genetic barrier produced by exogenous loci.

Exogenous cline – A monotonic change in frequency of alleles at exogenous loci along some direction in space.

Ecotone – The frontier, abrupt or gradual, between two different habitats.

Natural barrier to dispersal – A natural obstacle (mountain, river, unsuitable habitat, etc.) that locally reduces the probability of successful migration of individuals.

Fine-grained environment – Spatial heterogeneity in habitat that occurs at a fine spatial scale relative to species dispersal, so that migration often occurs between contrasting habitats.

Coarse-grained environment – A type of environmental heterogeneity in which dispersal is sufficiently low relative to the scale of the environmental variation for local adaptation to be easily maintained.

Strength of the barrier to gene flow – The decrease in the effective rate of migration.
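
To make the two cline entries above concrete, the following Python sketch evaluates a sigmoid (tanh) curve of the kind often used to describe allele-frequency clines in the hybrid zone literature. This particular parameterization, with width defined as the inverse of the maximum slope, is an assumption introduced here for illustration and is not taken from this article; the centre and width values are hypothetical. The example places an endogenous and an exogenous cline at the same centre, which is what coupling at an ecotone would look like in a transect.

```python
import math


def cline_frequency(x, centre, width):
    """Allele frequency at position x along a transect for a sigmoid cline.

    Assumed shape: p(x) = (1 + tanh(2 (x - centre) / width)) / 2,
    where width is the inverse of the maximum slope (reached at the centre).
    """
    return 0.5 * (1.0 + math.tanh(2.0 * (x - centre) / width))


# Hypothetical transect (arbitrary distance units) with coincident cline centres.
positions = [-20, -10, -5, 0, 5, 10, 20]
endogenous = [cline_frequency(x, centre=0.0, width=5.0) for x in positions]
exogenous = [cline_frequency(x, centre=0.0, width=20.0) for x in positions]

for x, p_endo, p_exo in zip(positions, endogenous, exogenous):
    print(f"x={x:>4}  endogenous={p_endo:.3f}  exogenous={p_exo:.3f}")
```

Both curves are monotonic and cross a frequency of 0.5 at the shared centre; only the steepness differs, so coincidence of cline position does not by itself indicate which type of locus underlies a marker.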