WHEN GAPS REALLY ARE GAPS: STATISTICAL PHYLOGEOGRAPHY OF HYDROTHERMAL VENT INVERTEBRATES

Abstract

The invertebrate animals endemic to deep-sea hydrothermal vents are distributed intermittently along relatively linear oceanic ridge axes. A one-dimensional stepping-stone model, therefore, provides a reasonable starting hypothesis of population structure for these species. Nevertheless, population genetic studies of many species from eastern Pacific vents did not detect the expected signatures of isolation-by-distance (IBD). Instead, distinct patterns of geographical subdivision have been attributed to the unique dispersal modes of individual species, topographical discontinuities of the ridge axes, nonequilibrium metapopulation scenarios and cryptic species. Here, we reexamined these inferences in light of expectations generated by computer simulations of a one-dimensional stepping-stone model. We evaluated whether the previously inferred subdivisions are statistically robust to an alternative explanation that continuous stepping-stone migration has occurred along the ridge axes but discontinuities in the sampling design (gaps) have generated the apparent disjunctions. We found that previous inferences about barriers to gene flow (vicariance) were supported in many cases, but that failures to detect evidence for IBD could be explained by low statistical power associated with the sampling effort. The simulation approaches presented here might be useful for testing the significance of inferred phylogeographic gaps in other species.

Comparative phylogeography is a powerful tool for identifying historical factors that have shaped the histories of codistributed species (Avise et al. 1987; Arbogast and Kenagy 2001; Soltis et al. 2006). When multiple species exhibit concordant geographical subdivision in their genealogical patterns, they likely experienced similar demographic responses to shared historical events (Avise 2000). Yet, evaluating statistically which historical factors (vicariance, dispersal, genetic drift, local extinctions and recolonizations) might have generated the observed genetic subdivisions remains difficult, although a number of approaches have been developed during the last decade (Knowles 2004; Hickerson and Meyer 2008; Templeton 2008; review in Nielsen and Beaumont 2009). Achieving sufficient statistical power to distinguish the effects of demographic history versus random genetic drift is particularly difficult in nonmodel organisms from hard-to-access environments where sample sizes are often limited. Nevertheless, population genetic and phylogeographic studies often provide the only means to infer the recent demographic history of species living in such poorly explored domains.

The endemic animals native to deep-sea hydrothermal vents share several features that make them good candidates for comparative phylogeography (Vrijenhoek 1997; Hurtado et al. 2004; Plouviez et al. 2009). Vent communities are distributed as habitat islands, constrained by geochemical conditions that exist along Earth's midocean ridge system, in back-arc basins and on volcanically active seamounts. Without sunlight and photosynthesis, primary production in these communities relies on chemosynthetic microbes that oxidize reduced volcanic gases such as H2S and CH4 (reviewed in Van Dover 2000). Individual vent fields (local clusters of active vents) in the eastern Pacific Ocean tend to be ephemeral and may persist for only a few decades before they are extinguished by fluctuations in the magma supply (Lalou and Brichet 1982; Haymon et al. 1991). New vent fields arise with sporadic tectonic and volcanic events, and they are rapidly colonized by propagules from neighboring vents (Shank et al. 1998). Vent-endemic organisms are expected to have high dispersal capabilities that facilitate the colonization of distant sites, and indeed, some bivalve mollusks and annelids appear to exhibit little genetic structuring among vent fields separated across thousands of kilometers (Karl et al. 1996; Hurtado et al. 2003; Won et al. 2003). Bathymodiolin mussels, for example, produce planktotrophic larvae that are capable of dispersing great distances, but a surprisingly large proportion of vent-endemic gastropods and annelids produce nonplanktotrophic larvae that are believed to disperse relatively short distances (Lutz et al. 1986; Gage and Tyler 1991).

A linear stepping-stone model might be expected to describe the population structure of many vent species; consequently, genetic distances between discrete populations are expected to increase with the geographical distances between them (Slatkin and Barton 1989). Nonetheless, nearly half of the genetic studies conducted with eastern Pacific vent species have failed to detect geographical patterns indicative of an isolation-by-distance (IBD) process (Vrijenhoek 1997; Won et al. 2003; Hurtado et al. 2004). Perhaps a linear stepping-stone model fails to approximate the demographic complexity affecting most vent taxa. Nonequilibrium conditions that result from the inherent instability of hydrothermal vent fields and the temporal variability of suitable habitats may be responsible for establishing the similarity seen across large distances for some vent species (Jollivet et al. 1999). Although we cannot refute the operation of complex metapopulation scenarios involving sources, sinks and moving vent fields, we suggest that many failures to detect the genetic signature of an IBD process are simply a result of low statistical power. The number of population samples may have been too small, the sampling could have been geographically uneven, or the number of independent genetic markers employed in a particular study may not have been sufficient.

Our aim was to test the hypothesis that sampling effects have greatly limited previous inferences about population structure in vent organisms. We have employed computer simulations that manipulate various aspects of gene and population sampling to generate expectations for linear stepping-stone models of population structure. The results of published genetic studies of vent-endemic animals from the Galápagos Rift (GAR) and East Pacific Rise (EPR) were then compared against expectations generated by the simulations. We first ask whether a linear stepping-stone model can best describe the genetic structure of vent invertebrates and whether failures to observe its expected signature in some empirical studies can be explained by low statistical power. Second, we test whether previous inferences about geographical barriers correspond with significant genetic subdivisions in vent species, or alternatively if the apparent subdivisions can be explained by stochastic genetic drift and gaps in the sampling design. At equilibrium, a linear stepping-stone model of dispersal is expected to produce a positive correlation between genetic differentiation and geographical distance, a pattern that resembles the signature of IBD in a continuously distributed population (Slatkin and Barton 1989; Rousset 1997). Researchers commonly refer to these signatures as an IBD pattern, but it is important to remember that other nonequilibrium processes (e.g., vicariance, secondary contact, range expansions, etc.) can produce similar patterns (Slatkin 1993). Although nonequilibrium processes should also be considered, their simulations were beyond the scope of the present study. The statistical approach to test the significance of phylogeographic subdivisions and assess the power to detect the genetic isolation by distance pattern is used here for the specific case of hydrothermal vent invertebrates, but could be applied to a broad range of taxa or systems.

Methods

TOPOGRAPHIC CONSIDERATIONS

Published results must be considered in view of the topographical complexity of eastern Pacific ridge axes and the unique life histories of vent-endemic species. The GAR joins the EPR at a triple junction, punctuated by a ∼6000 m deep feature known as the Hess Deep (Fig. 1). Known GAR vents are located about 2000 km east of the triple junction, but the dominant vent taxa are mostly shared between the two ridge systems (Desbruyères et al. 2006). The relatively linear EPR consists of a series of tectonic spreading centers (ridge segments) that together span nearly 6000 km between 27°N and 32°S latitude. Active EPR vent fields are typically distributed tens to hundreds of kilometers apart. The larvae of many vent-endemic invertebrates are hypothesized to disperse in axial currents that are constrained to some degree by lateral walls that flank the rift valleys (Marsh et al. 2001; Pradillon et al. 2001), although lateral walls are low to nonexistent along rapidly spreading and bathymetrically inflated portions of the southern EPR (Hey et al. 2006). These inflated segments may be subject to cross-axis currents that limit the larval supply to axial vents (Won et al. 2003; Hurtado et al. 2004). Transform faults (sliding plate boundaries) between discrete ridge segments also interrupt the linearity of the EPR axis. Large offsets that extend hundreds of kilometers appear to impede gene flow to differing degrees in various vent species (Johnson et al. 2006; Young et al. 2008).

Figure 1.

Hydrothermal vent fields sampled between 1990 and 2005 along the East Pacific Rise (EPR) and Galápagos Rift (GAR) by Vrijenhoek and coworkers. Open circles represent vent fields that were explored and closed circles represent vent fields that were sampled for the taxa considered in this study. The diamonds represent areas that were explored but no active vents were found. Lines perpendicular to the main EPR axis indicate major transform faults.

EMPIRICAL STUDIES OF EPR VENT SPECIES

We assessed published genetic studies of EPR and GAR hydrothermal vent populations to determine if they could be re-examined in the present context. A number of studies were excluded for the following reasons: (1) at least four discrete EPR populations must be analyzed, which excluded Bathymodiolus thermophilus (Grassle 1985), Riftia pachyptila (Bucklin 1988), Lepetodrilus pustulosus and Lepetodrilus galriftensis (Craddock et al. 1997); (2) the presence of sympatric cryptic species violated assumptions of random mating, which excluded Oasisia alvinae (Hurtado et al. 2004) and Lepetodrilus elevatus (Matabos et al. 2008); (3) pairwise genetic distances were not reported, which excluded alvinellids (Jollivet et al. 1995); and (4) genetic markers must have expected mutation rates that are consistent with our simulations, which excluded R. pachyptila AFLPs (Shank and Halanych 2007). In the end, we reexamined genetic data from nine species that met the above criteria: four polychaete annelids (R. pachyptila, Tevnia jerichonana, Alvinella pompejana and Branchypolynoe symmitilida); two bivalve mollusks (Calyptogena magnifica and B. thermophilus); two gastropod mollusks (L. elevatus and Eulepetopsis vitrea); and one amphipod crustacean (Ventiella sulfuris) (Table 1). Three species were studied only for allozymes, two were studied for allozymes and mtDNA and four were studied for mtDNA alone (Fig. 2).

Table 1. Biological characteristics and dispersal properties of nine East Pacific Rise and Galápagos vent species. Numbers in parentheses refer to the notes below the table.

| Species | Adult (1) | Food (2) | Larv. (3) | Mob. (4) | Capability (5) | Col. (6) | N (8) all: mtDNA | No. L (9) | No. A (10) all: mtDNA | Ref (11) |
|---|---|---|---|---|---|---|---|---|---|---|
| Polychaeta | | | | | | | | | | |
| Riftia pachyptila | S | S | L | P | 100–200 km (7) | early | 38: 17 | 10 | 1.6: 3 | 1,2 |
| Tevnia jerichonana | S | S | L | P | ? | early | –: 19 | | –: 9 | 2 |
| Alvinella pompejana | V | F | L | P | demersal | early | –: 40 | | –: 13 | 2 |
| Branchypolynoe symmitilida | S | C | L | P | ? | late | –: 38 | | –: 12 | 2 |
| Bivalvia | | | | | | | | | | |
| Calyptogena magnifica | S | S | L | S | limited | late | –: 20 | | –: 6 | 3 |
| Bathymodiolus thermophilus | S | S | P | S | great | late | 28: 12 | 7 | 2.6: 27 | 4 |
| Gastropoda | | | | | | | | | | |
| Lepetodrilus elevatus | V | G | L | S | limited | | 63: – | 8 | 1.5: – | 5 |
| Eulepetopsis vitrea | V | G | L | S | limited | ? | 67: – | 6 | 1.6: – | 5 |
| Amphipoda | | | | | | | | | | |
| Ventiella sulfuris | M | F | B | | limited | early | 40: – | 8 | 2.0: – | 6 |

  1. Adult mobility: (S) sessile or sedentary; (V) limited vagility; (M) highly mobile.

  2. Adult nutrition: (S) contains sulfur-oxidizing endosymbionts; (F) filter/suspension feeding; (C) commensal of mussels; (G) bacterial grazing.

  3. Larval nutrition: (L) lecithotrophic; (P) planktotrophic; (B) brooding, from Tyler and Young 1999.

  4. Larval mobility: (S) free-swimming; (P) passive.

  5. References for dispersal capability: Lutz et al. 1984, 1986; Lutz 1988; Marsh et al. 2001.

  6. Reference for colonization order: Shank et al. 1998; Vrijenhoek et al. 1998.

  7. Estimated distance achieved during 30–40 day larval lifespan, from Marsh and Minckley 1989.

  8. Average number of individuals examined per population studied in reviewed studies (allozymes: mtDNA data).

  9. Number of polymorphic allozyme loci.

  10. Average number of alleles per allozyme locus: total number of mtDNA haplotypes.

  11. References for population genetic data: (1) Black et al. 1994; (2) Hurtado et al. 2004; (3) Karl et al. 1996; (4) Won et al. 2003; (5) Craddock et al. 1997; (6) France et al. 1992.

Figure 2.

Previously defined evidence for genetic differentiation in mtDNA and allozyme markers for nine vent species from the East Pacific Rise and Galápagos Rift (shaded in gray). Symbols of different shading (black, gray and white) denote samples that belong to genetically distinct groups (see the Results section and Table 3). Triangles indicate the presence of putative cryptic species. Numbers on the top of each column indicate the correlation (r) between genetic and geographical distances (GAR samples excluded). Those with nonsignificant correlations (P≥ 0.10) are shown as ns. The mean reported FST values are listed below each of the species analyzed with allozymes. For species that were subdivided into distinct geographical groups, the maximum FST or ϕST values within groups (Fmax(WG)) and the ranges between groups (F(BG)) are also listed. Two F(BG) ranges are listed for the three groups identified in T. jerichonana. *– mean of population pair-wise FST values.

Life histories differed considerably among these nine species. The four polychaete worms all have nonplanktotrophic larvae and rely on yolk reserves (lecithotrophy) to sustain larval development. Only the alvinellid palm worm, A. pompejana, is known to be capable of arrested development and dispersal in the demersal zone (at the benthic boundary) (Pradillon et al. 2001). Adults of the polynoid scale worm, B. symmitilida, live in the mantle cavity of vent mussels and produce very large eggs and lecithotrophic larvae (Tyler and Young 1999), but little more is known about larval lifespans or their position in the water column. Adult siboglinid tubeworms (formerly Vestimentifera), R. pachyptila and T. jerichonana, are strictly sessile and entirely dependent on sulfur-oxidizing bacterial endosymbionts for nutrition. R. pachyptila produces intermediate-sized lecithotrophic larvae that persist for about 38 days at the in situ deep-sea temperature of 2°C, allowing the worms to disperse between 100 and 200 km with predominant along-axis currents (Marsh et al. 2001). The two bivalve mollusks are sedentary as adults and depend on sulfur-oxidizing bacterial endosymbionts for nutrition. The mytilid, B. thermophilus, produces tiny eggs that develop into planktotrophic free-swimming larvae. A four-fold increase in size between the first and second larval shells (prodissoconchs) indicates a feeding and swimming planktotrophic stage that is expected to have high dispersal capabilities (Lutz et al. 1980). The vesicomyid clam, C. magnifica, however, produces larger eggs and lecithotrophic larvae that may not feed and are believed to have a more limited dispersal capability (Lutz et al. 1986). Larval shell remnants (protoconchs) suggest that the two gastropod species, L. elevatus sensu stricto and E. vitrea, have nonplanktotrophic larvae with limited dispersal capabilities (Lutz et al. 1986). The vent amphipod V. sulfuris broods its larvae and is believed to disperse as free-swimming juveniles or adults (France et al. 1992).

Subsequent to submission of the present publication, Plouviez et al. (2009) published a comparative mtDNA phylogeography of seven EPR invertebrates. Four of their species were previously examined by Hurtado et al. (2004) and consequently are also considered in our study. Plouviez et al. (2009) increased the sample sizes and employed Approximate Bayesian Computation methods to test for vicariance across the Equator. Although we were unable to consider their novel data in this study, we discuss their conclusions in the light of our present findings.

FORWARD-BASED SIMULATIONS

Multilocus genotypic distributions were generated with forward-looking simulations as implemented in Easypop (Balloux 2001), a program that models stochastic processes in a population. The number of populations per species (K), number of loci (L), number of alleles or haplotypes per locus (a), initial allele frequencies, mutation rates (μ), migration rate (m), variance effective population size (N), and the number of sampled individuals (n) can be specified for a particular model. The program generates genotypic output files readable in the Genepop format (Raymond and Rousset 1995). For the nuclear genes, simulations involved K= 6 populations and L= 5 freely recombining diploid genes with a= 2 or 3 alleles per locus, to assess the sensitivity of results to allelic richness. Values of a= 2 or 3 are consistent with the numbers of allozyme alleles typically found in empirical studies of these vent taxa. Each population engaged in random mating involving Nf= 1000 diploid females plus Nm= 1000 diploid males. Without any prior information (see the Discussion section for further comments on this topic) as to the likely ratio of census to effective population size, we assumed the ratio to be about 10–100; smaller effective population sizes were modeled in the mtDNA simulations (see below). The mutation rate was set at μ= 1 × 10⁻⁶ per locus. Preliminary analyses revealed that varying mutation rates five-fold in both directions did not substantively alter the results. Simulations involving mitochondrial genotypes were conducted separately with K= 6 populations, Nf= 1000 females, a= 40 haplotypes, and a faster mutation rate of μ= 1 × 10⁻⁵, which is expected to be about 10 times faster than μ for nuclear coding genes (Denver et al. 2004).

Simulations were started with the maximum allelic and genotypic diversity and a constant proportion of migrants (m) in a one-dimensional stepping-stone model. Each generation m/2 individuals are expected to immigrate from an adjacent population; the actual number of migrants is a random variable thus allowing the software to incorporate stochasticity. Emigration outside of the margins of the array was ignored (see Balloux 2001). We used values of m= 0.0005, 0.0025 and 0.0125 to correspond with dispersal rates of Nm= 1, 5 and 25 diploid individuals per generation in a population of N= 2000 breeding individuals. Nm values were half these sizes for mtDNA. Although arbitrary, these Nm values encompass the range of values commonly detected with the sampling strategies employed in empirical studies; consequently, they have been suggested as possible evolutionary criteria for defining connectivity among populations (Waples and Gaggiotti 2006). Each simulation was conducted for 10,000 generations to achieve equilibrium conditions, when the FST values reached a steady state. Then, 40 individuals were sampled at random from each simulated population for their nuclear genotypes, and 30 individuals were sampled for mtDNA haplotypes. These sample sizes were typically used in the empirical studies reviewed herein (Table 1). Allelic frequencies were estimated for use in the subsequent analyses. Easypop simulations generate genotypic data, so mtDNA analyses estimated FST values (rather than ϕST) as implemented in Fstat (Goudet 1995). For each set of parameters we ranked FST values from 100 simulation replicates, excluded the three upper and three lower values, and thus obtained 94% confidence ranges for FST values.
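The forward simulation logic described above can be illustrated with a minimal sketch. This is not Easypop: it is a simplified, frequency-based Wright-Fisher model for a single biallelic locus, in which each population receives a fixed fraction m/2 from each adjacent population (rather than a random number of migrants) and drift is modeled by binomial sampling of 2N gene copies. Differentiation is summarized with Nei's G_ST rather than the estimator used in Fstat. All function names and the small demonstration parameters are hypothetical choices made for illustration.

```python
import random


def step(freqs, n_diploid, m, mu, rng):
    """Advance allele frequencies by one generation (mutation, migration, drift)."""
    k = len(freqs)
    # Symmetric mutation between the two alleles.
    mutated = [p * (1 - mu) + (1 - p) * mu for p in freqs]
    migrated = []
    for i, p in enumerate(mutated):
        # Stepping-stone: only adjacent populations exchange migrants;
        # populations at the margins have a single neighbour.
        neighbours = [mutated[j] for j in (i - 1, i + 1) if 0 <= j < k]
        stay = 1 - (m / 2) * len(neighbours)
        migrated.append(stay * p + (m / 2) * sum(neighbours))
    # Drift: binomial sampling of 2N gene copies in each population.
    copies = 2 * n_diploid
    return [sum(rng.random() < p for _ in range(copies)) / copies for p in migrated]


def gst(freqs):
    """Nei's G_ST for one biallelic locus: (H_T - H_S) / H_T."""
    p_bar = sum(freqs) / len(freqs)
    h_t = 2 * p_bar * (1 - p_bar)
    h_s = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t


def simulate_gst(k, n_diploid, m, mu, generations, rng):
    """Run one replicate starting from maximum two-allele diversity."""
    freqs = [0.5] * k
    for _ in range(generations):
        freqs = step(freqs, n_diploid, m, mu, rng)
    return gst(freqs)


def trimmed_range(values, drop=3):
    """Discard the `drop` lowest and highest values and return (low, high)."""
    ordered = sorted(values)
    kept = ordered[drop:len(ordered) - drop]
    return kept[0], kept[-1]


if __name__ == "__main__":
    rng = random.Random(42)
    # Deliberately small N and few generations so the demo runs quickly;
    # the study itself used N = 2000 and 10,000 generations.
    replicates = [simulate_gst(6, 25, 0.04, 1e-4, 100, rng) for _ in range(100)]
    print(trimmed_range(replicates))  # 94% range from 100 replicates
```

With 100 replicates, dropping three values from each tail leaves 94 values, which corresponds to the 94% range described in the text. The migration rates map onto migrant numbers as Nm = N × m, so that N = 2000 and m = 0.0005 give Nm = 1.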

COALESCENT SIMULATIONS

Mitochondrial DNA sequences were generated with backward-looking simulations using the coalescent methods implemented in SimCoal2 (Laval and Excoffier 2004). As before, populations contained N= 2000 individuals of which Nf= 1000 were females. The populations were connected through a stepping-stone dispersal model with m/2 immigrants per generation. Simulations were performed with m= 0.0005, 0.0025 and 0.0125. The nucleotide mutation rate was set at μ= 5 × 10⁻⁸ per site; however, estimated mutation rates occurring at recent time scales may be faster (e.g., Audzijonyte and Väinölä 2006; Waters et al. 2007). The use of μ= 5 × 10⁻⁸ in our simulations produced mtDNA diversities that approximated empirical observations of average sequence divergence among the sampled haplotypes of vent invertebrates (Won et al. 2003; Hurtado et al. 2004).

We conducted simulations involving different numbers of populations. First, we simulated K= 6 to be consistent with the approach used in the forward-based simulations. Second, we examined a scenario involving many more populations from which a smaller subset was sampled (or known) for inclusion in a phylogeographic study (presence of “ghost” populations, e.g., Beerli 2004; Slatkin 2005). The occurrence of large active vent fields along the EPR has been relatively well explored and large areas indeed appear to have no large fields (Fig. 1). Yet, it is impossible to exclude the possibility that smaller vents, or other conditions suitable for sustaining chemosynthetic vent-type communities, occur considerably more frequently. In this article we assume that such suitable habitats occur at rough intervals of two degrees of latitude (∼200 km); the 6000 km EPR could therefore contain K= 30 populations. For each vent species considered in Table 1, we sampled from this hypothetical array only the localities that corresponded with known sample locations. For example, Tevnia jerichonana was sampled from six localities (in Fig. 1: 13N, 11N, 9N, 7S, 17S and 32S), which corresponded with hypothetical populations 4, 8, 10, 20, 23 and 29. To test whether our results were sensitive to the sample sizes, the simulation populations were sampled for n= 15 and 30 individuals. To test the sensitivity of our conclusions to effective population sizes, we repeated these simulations with Nf= 100 and 5000. The mitochondrial variants generated by these simulations were examined with Arlequin (Excoffier et al. 1992) to estimate pairwise ϕST, which also accounts for sequence divergence among the haplotypes.

DETECTING ISOLATION-BY-DISTANCE (IBD)

Pairwise genetic distances (FST or ϕST) estimated from the simulated data were compared with pairwise geographical distances (D) among the simulated populations, each spaced one distance-unit apart. GAR samples were excluded from these calculations, because they fall outside the linear EPR axis. Correlations between the genetic and geographical (logD+ 0.1) distance matrices were estimated with the Mantel test procedure implemented in Poptools (Hood 2008), an add-in package for Microsoft Excel. To determine the power of IBD tests, we estimated the proportion of simulations that failed to produce a significant rank-order correlation between the genetic and geographical matrices, assuming a critical value of α= 0.05 and a less stringent value of α= 0.10.
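A permutation-based Mantel test of the kind described above can be sketched as follows. This is a generic illustration, not the Poptools implementation: it uses a Pearson correlation on the matrix entries as supplied (any log transformation or ranking of the distances is assumed to have been applied beforehand), and it reports a one-sided P value. The function names and default number of permutations are hypothetical.

```python
import itertools
import math
import random


def upper(matrix, order):
    """Upper-triangle entries of a symmetric matrix after relabelling by `order`."""
    pairs = itertools.combinations(range(len(order)), 2)
    return [matrix[order[i]][order[j]] for i, j in pairs]


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(genetic, geographic, permutations=999, rng=None):
    """Return (r, one-sided P) for the correlation between two distance matrices."""
    rng = rng or random.Random()
    identity = list(range(len(genetic)))
    geo = upper(geographic, identity)
    observed = pearson(upper(genetic, identity), geo)
    hits = 0
    for _ in range(permutations):
        # Permute population labels of the genetic matrix (rows and columns together).
        order = identity[:]
        rng.shuffle(order)
        if pearson(upper(genetic, order), geo) >= observed:
            hits += 1
    # The observed arrangement is counted as one of the permutations.
    return observed, (hits + 1) / (permutations + 1)
```

Under this sketch, the type II error rate β(IBD) for a set of simulation replicates would be the fraction of replicates for which the returned P value is not below the chosen α.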

PROBABILITY OF INCORRECTLY INFERRING A PHYLOGEOGRAPHIC BOUNDARY

Isolation-by-distance in a linear stepping-stone model will generate pairwise FST values that are inversely proportional to Nm (Slatkin and Barton 1989). A significant barrier to dispersal might be inferred from an observed FST value that is greater than some expected value under a particular migration model. We used the simulations to determine how frequently elevated FST values would occur by chance. For example, imagine that a prospective barrier (e.g., a large transform fault) between populations 3 and 4 divides the K= 6 populations into two groups (1–3 vs. 4–6). We will let Fmax(WG) estimate the maximum pairwise divergence within each group, and Fmin(BG) estimate the minimum pairwise divergence between the groups. We inferred a significant dispersal gap under two criteria: (1) Fmin(BG) > 2 ×Fmax(WG); and (2) Fmin(BG) > 3 ×Fmax(WG). Such two- or three-fold increases in genetic differentiation are often reported as evidence for phylogeographic subdivisions and/or reduced gene flow, including in the deep sea (e.g., Won et al. 2003; Hurtado et al. 2004; Young et al. 2008). Negative values of FST and ϕST were converted to zero for these calculations, and only one Fmax(WG) value was used when one of the groups consisted of only one population. The proportion of replications meeting these criteria was used to assess the probability of incorrectly inferring a spurious barrier to gene flow (pB) when none existed in the simulated data. First, we included the K= 6 populations and estimated pB for a potential barrier between each pair of adjacent populations in the array (between 1 vs. 2–6, between 1–2 vs. 3–6, and so on). The five pB values obtained in this way were then used to estimate a probability of inferring one or more barriers, according to the formula 1 − (1 − pB1)(1 − pB2)…(1 − pB5) (which gives one minus the probability of not inferring any barriers). Second, we jackknifed single populations, one at a time, out of the K= 6 array to determine the effects of a one-population sampling gap on the bordering FST estimates. For example, after removing population 2, we examined pB for inferring a barrier between population 1 and populations 3–6, and so on. Because these four jackknifed scenarios are mutually exclusive, we calculated the arithmetic mean of the pB values to obtain an approximate probability of inferring a phylogeographic gap when one population is missed from the sampling design. Third, we estimated the pB values generated around gaps involving two missing samples; this was done in the same way as with one missing sample. All estimates were derived from simulations involving Nm= 1 and 5, because Nm= 25 produced FST values that were very small (<0.05) and typically would not be judged significant for the simulated sample sizes or used to infer barriers to gene flow.
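The barrier criterion and the combined probability can be sketched in a few lines. This is an illustrative reading of the procedure, with hypothetical function names: within-group values from both groups are pooled before taking the maximum (an assumption about how the two Fmax(WG) values were combined), and the combined probability treats the five boundary tests as independent, as the product formula implies.

```python
def barrier_inferred(fst, boundary, fold=2.0):
    """Test Fmin(BG) > fold * Fmax(WG) for a split before index `boundary`.

    `fst` is a symmetric K x K matrix of pairwise values; populations
    0..boundary-1 form one group and boundary..K-1 the other.
    """
    k = len(fst)
    groups = (range(boundary), range(boundary, k))

    def clipped(i, j):
        # Negative estimates are converted to zero.
        return max(fst[i][j], 0.0)

    between = [clipped(i, j) for i in groups[0] for j in groups[1]]
    within = [clipped(i, j) for g in groups for i in g for j in g if i < j]
    if not within:
        return False  # no within-group comparison exists (K = 2)
    return min(between) > fold * max(within)


def prob_any_barrier(p_barriers):
    """One minus the probability of inferring no barrier at any boundary."""
    prob_none = 1.0
    for p in p_barriers:
        prob_none *= 1.0 - p
    return 1.0 - prob_none
```

In this sketch, pB for a given boundary would be the fraction of simulation replicates for which `barrier_inferred` returns True. A single-population group contributes no within-group pairs, so only the other group's maximum is used, matching the rule stated in the text.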

Coalescent simulations involving K= 30 populations were used to specifically test the probability of incorrectly inferring a phylogeographic boundary from a limited number of subsamples. For example, Hurtado et al. (2004) suggested a barrier to gene flow in A. pompejana that appeared to partition the EPR populations into northern (21N–9N) and southern (11S–32S) groups that were separated by the Equator. The sampled populations roughly corresponded with simulated populations 3, 7, 9, 19, 21, 22 and 29. Pairwise ϕST values between the A. pompejana groups were ≥0.40, whereas values within the groups were ≤0.10. To test whether these elevated ϕST values might be a product of sampling variance in a stepping-stone model without phylogeographic boundaries, we estimated how frequently simulated ϕST values across a putative boundary were at least four times greater than the maximum values obtained for other pairwise comparisons. We also tested a more specific hypothesis of Fmin(BG) > 0.40 and Fmax(WG) < 0.10 (more explanation is given in the Results section when each test is introduced).

Results

LEVELS OF DIFFERENTIATION AND POWER TO DETECT ISOLATION-BY-DISTANCE IN SIX POPULATIONS

Simulations involving K= 6 populations (no sampling gaps) and Nm= 1, 5 and 25 revealed ranges (94%) of differentiation (global FST and ϕST) that were smaller with five nuclear loci than with mtDNA alone (Table 2). FST values obtained from stepping-stone simulations were nearly twice as great as those expected under an equilibrium island-model population structure [FST= 1/(1 + 4Nm)]. The values expected under an island model were at the lower boundary of the 94% range expected with stepping-stone dispersal. The ranges of differentiation obtained with five nuclear genes were essentially nonoverlapping for Nm= 1, 5 and 25, whereas FST values estimated from mtDNA overlapped partially across these migration rates, suggesting that mtDNA alone provides low power for distinguishing among five-fold differences in effective migration rates. The range of FST values (0.02–0.12, Fig. 2) observed in allozyme-based studies that excluded distinct geographical races was compatible with that expected under the stepping-stone model when the number of migrants between populations was 5 or 25 individuals per generation (Table 2).
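As a quick worked check of the island-model expectation quoted above, the formula FST= 1/(1 + 4Nm) can be evaluated for the three simulated migration rates. The function name is a hypothetical label used only for this illustration.

```python
def island_fst(nm):
    """Equilibrium island-model expectation: FST = 1 / (1 + 4Nm)."""
    return 1.0 / (1.0 + 4.0 * nm)


for nm in (1, 5, 25):
    print(nm, round(island_fst(nm), 3))
# 1 0.2
# 5 0.048
# 25 0.01
```

These values sit close to the lower limits of the simulated 94% ranges for five nuclear loci with three alleles in Table 2 (0.20, 0.05 and 0.00).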

Table 2. Results from simulations of six populations. The PB values indicate probabilities of detecting a barrier to gene flow that causes a 2- or 3-fold increase in FST or ϕST between adjacent populations. The PB values were estimated for all six populations and for situations in which one or two populations were missing. β(IBD) indicates the probability that no significant isolation by distance will be detected when the critical value of α was 0.05. Forward-based simulations were used to generate the FST data for nuclear and mitochondrial loci. Coalescent simulations were used to generate ϕST data for mtDNA. Nm: migration rate; na: not assessed. Numbers in parentheses refer to the notes below the table.

| Model | FST range (1), six populations | β(IBD) (2), six populations | PB(2x), six populations | PB(3x), six populations | PB(2x), missing one | PB(3x), missing one | PB(2x), missing two | PB(3x), missing two |
|---|---|---|---|---|---|---|---|---|
| Nm=1 | | | | | | | | |
| 5 loci, 3 alleles | 0.20–0.51 | 0.04 | 0.01 | 0.00 | 0.02 | 0.01 | 0.15 | 0.04 |
| 5 loci, 2 alleles | 0.17–0.47 | 0.22 | 0.01 | 0.00 | 0.04 | 0.01 | 0.22 | 0.09 |
| mtDNA, FST | 0.30–0.76 | 0.50 | 0.04 | 0.01 | 0.04 | 0.02 | 0.14 | 0.09 |
| mtDNA, ϕST | 0.44–0.87 | 0.31 | 0.01 | 0.00 | 0.03 | 0.01 | 0.14 | 0.05 |
| Nm=5 | | | | | | | | |
| 5 loci, 3 alleles | 0.05–0.21 | 0.14 | 0.01 | 0.00 | 0.05 | 0.01 | 0.34 | 0.17 |
| 5 loci, 2 alleles | 0.05–0.18 | 0.21 | 0.01 | 0.00 | 0.06 | 0.01 | 0.26 | 0.14 |
| mtDNA, FST | 0.05–0.48 | 0.48 | 0.06 | 0.03 | 0.07 | 0.05 | 0.22 | 0.17 |
| mtDNA, ϕST | 0.12–0.55 | 0.32 | 0.02 | 0.01 | 0.09 | 0.03 | 0.27 | 0.15 |
| Nm=25 | | | | | | | | |
| 5 loci, 3 alleles | 0.00–0.05 | 0.42 | na | na | na | na | na | na |
| mtDNA, FST | 0.02–0.19 | 0.60 | na | na | na | na | na | na |
| mtDNA, ϕST | 0.02–0.18 | 0.52 | na | na | na | na | na | na |

  1. The 94% confidence limits for FST or ϕST values obtained from 100 simulation replicates and listed parameters.

  2. The probability of failing to detect IBD when it is the true hypothesis.

Despite an expectation that stepping-stone dispersal should produce a correlation between geographical and genetic distances, tests for this signature of IBD were not significant in a large proportion of our trials. Only the case with Nm= 1 and the greatest number of molecular markers (5 loci with 3 alleles each) produced significant IBD correlations in >95% of the replicates (Table 2). Failures to detect IBD were more frequent with higher migration rates and fewer alleles. Analyses based on mtDNA alone failed to detect IBD 48–60% of the time when FST was used to estimate genetic distance. In contrast, the use of ϕST, which also accounts for divergence among haplotypes, reduced failure rates to 31–52% in coalescent-based simulations. With Nm= 1 or 5, the ϕST-based tests of IBD from a single mtDNA locus were only slightly less powerful than FST-based tests from five nuclear loci with two allelic states.

Empirical data were consistent with our simulations in that about half of the taxa analyzed did not reveal a significant IBD pattern. Three out of five vent species examined for allozymes (R. pachyptila, B. thermophilus and V. sulfuris) exhibited significant IBD correlations (Fig. 2) (France et al. 1992; Black et al. 1994; Won et al. 2003). Yet, for one limpet species, L. elevatus (K= 4), the IBD correlation was positive but not significant because at least five samples are needed to obtain significance in Mantel tests (Craddock et al. 1997). Another limpet, E. vitrea, did not show evidence for IBD but it might be composed of two geographically distinct groups (Craddock et al. 1997). Three out of six species examined for mtDNA sequences exhibited significant IBD correlations when ϕST was used (Hurtado et al. 2003; Won et al. 2003). Failure to detect an IBD pattern in the mtDNA of the other three species (R. pachyptila, B. symmitilida and C. magnifica) could be explained by very low levels of mtDNA diversity and hence low statistical power.

Sampling gaps and spurious dispersal barriers

Forward simulations involving K= 6 populations (no sampling gaps) and Nm= 1 or 5 revealed a relatively low probability (≤6%) of incorrectly inferring a two- or three-fold barrier to gene flow with nuclear or mtDNA data (Table 2). These results suggest that with dense sampling, the probability of incorrectly inferring a barrier to gene flow is low even if only one mtDNA locus is used. Note that these results do not imply that two- or three-fold differences in pairwise FST or ϕST values never occur between random pairs of populations. Instead, the results show that under the simulated parameters, there is only a low probability that random gene sorting will produce a pattern in which populations are much more similar to other populations on the same side of an imaginary barrier than they are to populations on the other side.

In contrast, gaps in sampling from K= 6 populations had the potential to produce elevated FST values that might be misinterpreted as evidence for a phylogeographic barrier. When one population was removed to simulate a sampling gap, the probability of obtaining a two-fold increase in FST was 2 to 4% for Nm= 1 and 5 to 9% for Nm= 5. The probability of obtaining a three-fold increase was <5% for Nm= 1 or 5. Finally, when two demes were removed the probability of obtaining a two-fold increase in FST or ϕST could be as high as 34% when expected migration values were higher (Nm= 5); the probability of obtaining a three-fold increase was 4–17%.
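The logic of a sampling gap can be sketched with a minimal forward simulation. The Python code below is not the Easypop setup used for the results above; it is a simplified Wright-Fisher model with biallelic loci, no mutation, and hypothetical parameter values (100 gene copies per deme, m= 0.02, five loci, 200 generations). All function names are assumptions made for illustration.

```python
import random


def simulate_stepping_stone(k=6, n_copies=100, m=0.02, loci=5, gens=200, seed=1):
    """Return allele frequencies [locus][deme] after `gens` generations."""
    rng = random.Random(seed)
    freqs = [[0.5] * k for _ in range(loci)]
    for _ in range(gens):
        for p in freqs:
            # Migration: each deme receives m/2 of its gene pool from each
            # neighbour; a terminal deme receives m/2 from its single neighbour.
            mixed = []
            for i in range(k):
                left = p[i - 1] if i > 0 else p[i]
                right = p[i + 1] if i < k - 1 else p[i]
                mixed.append((1 - m) * p[i] + 0.5 * m * (left + right))
            # Drift: binomial resampling of n_copies gene copies per deme.
            for i in range(k):
                p[i] = sum(rng.random() < mixed[i] for _ in range(n_copies)) / n_copies
    return freqs


def pairwise_fst(freqs, a, b):
    """Multilocus FST between demes a and b as sum(Ht - Hs) / sum(Ht)."""
    num = den = 0.0
    for p in freqs:
        pbar = 0.5 * (p[a] + p[b])
        num += 0.5 * (p[a] - p[b]) ** 2   # Ht - Hs for two equal-sized demes
        den += 2.0 * pbar * (1.0 - pbar)  # Ht
    return num / den if den > 0 else 0.0


if __name__ == "__main__":
    adjacent, across_gap = [], []
    for rep in range(10):
        freqs = simulate_stepping_stone(seed=rep)
        adjacent.append(pairwise_fst(freqs, 2, 3))    # demes 2 and 3 both sampled
        across_gap.append(pairwise_fst(freqs, 2, 4))  # deme 3 left unsampled
    print("mean FST, adjacent demes:", sum(adjacent) / len(adjacent))
    print("mean FST, across a one-deme gap:", sum(across_gap) / len(across_gap))
```

Comparing the two printed means over many replicates indicates how often an unsampled deme alone inflates differentiation between the populations that border it. A realistic assessment would also need the marker types, sample sizes and mutation models described in the Methods section.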

ASSESSING THE EMPIRICAL EVIDENCE FOR BARRIERS BASED ON ALLOZYMES

Forward simulations involving K= 6 populations were used to evaluate empirical evidence for geographic subdivision in five species. Four of these species were studied prior to 1999 and the southern (s) EPR sampling expeditions, so only northern (n) EPR and GAR vents were included (Fig. 2). Among the gastropod limpets, L. elevatus sensu stricto (K= 4; NT= 251; FST= 0.12) exhibited no evidence for subdivision along the nEPR, whereas E. vitrea samples could be partitioned into two groups (13–21N vs. 9–11N) (NT= 334; FST= 0.20). Minimum divergence between the E. vitrea groups (Fmin(BG) > 0.15) was twice the maximum divergence within groups (Fmax(WG) < 0.07). Based on the simulations, the probability of a two-fold increase in FST by chance alone was <5% when zero or one population was missing from the stepping-stone array. The largest sampling gap (∼800 km) straddles the Rivera Fracture Zone, but the bordering 21N and 13N samples were not different. In contrast, the 11N and 13N sites are only ∼200 km apart, yet they fall into different partitions. Consequently, the two-fold elevation in FST between these groups is not likely to have arisen by chance. Although the GAR is not part of the EPR axis, E. vitrea from this distant site (∼2000 km) fell in with the 9–11N partition.
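The contrast between Fmin(BG) and Fmax(WG) used for E. vitrea can be computed as in the following Python sketch. The function names and the example matrix are hypothetical; the values are chosen only to resemble the reported bounds (Fmin(BG) > 0.15, Fmax(WG) < 0.07) and are not the published pairwise estimates.

```python
def barrier_contrast(dist, groups):
    """Return (Fmin_BG, Fmax_WG) for a symmetric pairwise matrix.

    dist[i][j] is the pairwise FST (or phiST) between populations i and j;
    groups[i] is the label of the partition that population i belongs to.
    """
    k = len(dist)
    between, within = [], []
    for i in range(k):
        for j in range(i + 1, k):
            (within if groups[i] == groups[j] else between).append(dist[i][j])
    if not between or not within:
        raise ValueError("need at least one between-group and one within-group pair")
    return min(between), max(within)


def fold_increase(dist, groups):
    """Ratio Fmin_BG / Fmax_WG; infinite if within-group values are all <= 0."""
    fmin_bg, fmax_wg = barrier_contrast(dist, groups)
    return float("inf") if fmax_wg <= 0 else fmin_bg / fmax_wg


# Hypothetical pairwise FST values for two northern and two southern samples.
example = [
    [0.00, 0.05, 0.16, 0.20],
    [0.05, 0.00, 0.15, 0.18],
    [0.16, 0.15, 0.00, 0.07],
    [0.20, 0.18, 0.07, 0.00],
]
labels = ["north", "north", "south", "south"]

if __name__ == "__main__":
    print(barrier_contrast(example, labels))  # (0.15, 0.07)
    print(fold_increase(example, labels) >= 2.0)
```

Taking the minimum between groups and the maximum within groups makes the criterion conservative: a single similar pair across the putative barrier, or a single divergent pair within a group, is enough to lower the ratio.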

The vent amphipod V. sulfuris also exhibited no evidence for significant subdivision across nEPR sites, but samples from the GAR were very different (FST≈ 0.45) and may represent a cryptic species. The polychaete tubeworm R. pachyptila exhibited no evidence for significant subdivision across the nEPR, and a small GAR sample also was not different. The mussel B. thermophilus was sampled across GAR, nEPR and sEPR localities and examined for allozymes and mtDNA. Based on divergence in both sets of genetic markers, an unnamed cryptic species is recognized from the southernmost portion of the EPR (Won et al. 2003); consequently, the 32S sample was excluded from our reanalysis. The mtDNA data will be considered separately. Seven polymorphic allozyme loci were identified in the samples (FST= 0.023). The only significant pairwise FST values among the EPR samples (excluding GAR) were those between the southernmost 17S sample and the 13N–7S samples (0.044–0.067). The FST value between 17S and the adjacent 11S was 0.027, whereas FST values among the 13N–7S sites never exceeded 0.016. Won et al. (2003) suggested that 17S was partially isolated from all populations in the north. Comparing these results to our simulations, we find that the three-fold increase in FST values (maximum of 0.016 within the 13N–7S cluster vs. minimum of 0.044 between 13N–7S and 17S) is unlikely to have arisen by chance in an equal-migration stepping-stone model; the same conclusion holds even when two intermediate populations are missing from the sampling scheme. We conclude that there is a significant reduction in gene-flow intensity in B. thermophilus between the 13N–7S sites and 17S.

MITOCHONDRIAL SEQUENCES AND GEOGRAPHIC SUBDIVISION

Six species were examined for mtDNA variation across the entire EPR and GAR (Fig. 2). To evaluate the empirical data in a hypothesis-testing framework, we applied coalescent simulations involving K= 30 populations, each assumed to be ∼200 km from its neighbors (see the Methods section). We replicated the simulations 100 times for each set of parameters. Samples were taken from simulated populations that corresponded to the rank-order location of actual samples collected for each species. We then determined the probability of obtaining an observed subdivision (ϕST) by chance. For example, the R. pachyptila population from 32S differed from all other EPR samples, with Fmin(BG) > 0.45 and Fmax(WG) < 0.05, a more than eight-fold increase in differentiation that was observed in no more than 2% of the simulations across the full range of parameters. In general, the probability of finding an eight-fold or greater increase in ϕST values by chance alone is very low (Table 3). We also assessed the probability that, as reported by Hurtado et al. (2004), no differentiation would be found across samples as widely distributed as 27N to 17S. To do so, we estimated the proportion of simulated replicates in which all ϕST values among these samples were <0.05. Only the combination of small effective population size (Nf= 100) and high migration rates (Nmf= 12.5 between each 200 km section) had a reasonable chance (27%) of such homogeneity (Table 3).
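The probabilities described in this paragraph are proportions of simulated replicates that satisfy a stated pattern. The Python sketch below shows that bookkeeping under simple assumptions: each replicate is a symmetric matrix of pairwise ϕST values among the sampled populations, and the four tiny matrices are invented for illustration only. The function names are hypothetical and do not come from the software used in this study.

```python
def pattern_frequency(replicates, pattern):
    """Proportion of simulated replicates for which pattern(matrix) is True."""
    return sum(1 for m in replicates if pattern(m)) / len(replicates)


def no_structure(sample_idx, ceiling):
    """Pattern: every pairwise value among the listed samples is < ceiling."""
    def check(m):
        return all(m[i][j] < ceiling
                   for a, i in enumerate(sample_idx)
                   for j in sample_idx[a + 1:])
    return check


def barrier(group_a, group_b, min_between, max_within):
    """Pattern: Fmin(BG) > min_between and Fmax(WG) < max_within."""
    def check(m):
        between = [m[i][j] for i in group_a for j in group_b]
        within = [m[i][j] for g in (group_a, group_b)
                  for a, i in enumerate(g) for j in g[a + 1:]]
        # With no within-group pairs the second condition is vacuously true.
        return (min(between) > min_between
                and max(within, default=float("-inf")) < max_within)
    return check


def sym3(d01, d02, d12):
    """Build a symmetric 3 x 3 matrix from its three pairwise values."""
    return [[0.0, d01, d02], [d01, 0.0, d12], [d02, d12, 0.0]]


# Four invented replicates for three sampled populations (0, 1, 2).
replicates = [
    sym3(0.01, 0.02, 0.03),
    sym3(0.04, 0.50, 0.47),
    sym3(0.10, 0.20, 0.15),
    sym3(0.02, 0.04, 0.01),
]

if __name__ == "__main__":
    print(pattern_frequency(replicates, no_structure([0, 1, 2], 0.05)))      # 0.5
    print(pattern_frequency(replicates, barrier([0, 1], [2], 0.45, 0.05)))  # 0.25
```

With 100 replicates per parameter set, as used here, each such proportion has a resolution of 0.01, which is why the entries of Table 3 are reported to two decimal places.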

Table 3.  Results of testing specific hypotheses from empirical data with coalescent simulations of mtDNA in 30 populations. Numbers in the table report probabilities of observing a specific pattern; probabilities higher than 0.05 are shown in italics. The three simulation parameters are Nf (effective number of females), Nm (number of female migrants) and n (sample size). References: (1) Hurtado et al., 2004; (2) Hurtado et al., 2003; (3) Won et al., 2003. //, a tested barrier to gene flow; ∩, the logical symbol “and” (see the Methods section for further explanations).
Species (ref)       Hypothesis        Pattern tested                  Nf=   1000  1000  1000  1000  1000  1000   100   100  5000  5000
                                                                      Nm=    0.5   2.5  12.5   0.5   2.5  12.5   0.5  12.5   0.5  12.5
                                                                      n=      30    30    30    15    15    15    30    30    30    30
Riftia (1)          no IBD            r<0.5 (no IBD)                        0.40  0.17  0.32  0.36  0.21  0.39  0.44  0.65  0.37  0.14
Riftia (1)          No structure      Fmax(27N–17S)<0.05                    0.00  0.00  0.01  0.00  0.01  0.01  0.06  0.27  0.00  0.00
Riftia (1)          27N–17S//32S      Fmin(BG)=8x Fmax(WG)                  0.00  0.00  0.00  0.00  0.01  0.00  0.02  0.02  0.00  0.00
Riftia (1)          27N–17S//32S      Fmin(BG)>0.45 ∩ Fmax(WG)<0.05         0.00  0.00  0.00  0.00  0.01  0.00  0.02  0.00  0.00  0.00
Tevnia (1)          IBD, r=0.68*      r<0.5 (no IBD)                        0.18  0.21  0.23  0.24  0.16  0.32  0.29  0.53  0.20  0.10
Tevnia (1)          13N–17S//32S      Fmin(BG)=2x Fmax(WG)                  0.00  0.02  0.03  0.00  0.01  0.03  0.07  0.06  0.00  0.01
Tevnia (1)          13N–17S//32S      Fmin(BG)>0.4 ∩ Fmax(WG)<0.2           0.00  0.01  0.01  0.00  0.01  0.00  0.05  0.02  0.00  0.00
Tevnia (1)          13N–9N//7S–17S    Fmin(BG)=2x Fmax(WG)                  0.11  0.13  0.16  0.03  0.21  0.14  0.18  0.15  0.03  0.14
Tevnia (1)          13N–9N//7S–17S    Fmin(BG)>0.1 ∩ Fmax(WG)<0.05          0.00  0.01  0.07  0.00  0.01  0.06  0.08  0.03  0.00  0.00
Alvinella (1)       IBD, r=0.82*      r<0.5 (no IBD)                        0.22  0.22  0.29  0.31  0.23  0.40  0.43  0.59  0.34  0.15
Alvinella (1)       21N–9N//11S–32S   Fmin(BG)=4x Fmax(WG)                  0.00  0.01  0.01  0.00  0.01  0.00  0.06  0.08  0.00  0.00
Alvinella (1)       21N–9N//11S–32S   Fmin(BG)>0.4 ∩ Fmax(WG)<0.1           0.00  0.00  0.00  0.00  0.00  0.00  0.06  0.06  0.00  0.00
Branchypolynoe (1)  no IBD            r<0.6 (no IBD)                        0.83  0.57  0.45  0.87  0.52  0.56  0.72  0.76  0.72  0.51
Branchypolynoe (1)  No structure      Fmax(9N–32S)<0.05                     0.00  0.00  0.02  0.00  0.00  0.01  0.06  0.27  0.00  0.00
Branchypolynoe (1)  9N–32S//GAL       Fmin(BG)=4x Fmax(WG)                  0.00  0.00  0.00  0.00  0.00  0.01  0.02  0.02  0.00  0.00
Branchypolynoe (1)  9N–32S//GAL       Fmin(BG)>0.2 ∩ Fmax(WG)<0.05          0.00  0.00  0.00  0.00  0.00  0.00  0.02  0.00  0.00  0.00
Calyptogena (2)     no IBD            r<0.6 (no IBD)                        0.81  0.61  0.52  0.85  0.59  0.62  0.78  0.67  0.74  0.53
Calyptogena (2)     No structure      Fmax(21N–17S)<0.1                     0.00  0.00  0.05  0.00  0.01  0.03  0.08  0.28  0.00  0.00
Calyptogena (2)     21N–17S//GAL      Fmin(BG)=8x Fmax(WG)                  0.00  0.00  0.00  0.00  0.00  0.01  0.03  0.02  0.00  0.00
Calyptogena (2)     21N–17S//GAL      Fmin(BG)>0.8 ∩ Fmax(WG)<0.1           0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
Bathymodiolus (3)   IBD, r=0.53*      r<0.5 (no IBD)                        0.28  0.33  0.33  0.33  0.30  0.53  0.49  0.58  0.36  0.25
Bathymodiolus (3)   No structure      Fmax(13N–11S)<0.15                    0.01  0.00  0.10  0.00  0.01  0.07  0.14  0.33  0.00  0.00
Bathymodiolus (3)   13N–11S//17S      Fmin(BG)=2x Fmax(WG)                  0.01  0.01  0.02  0.00  0.00  0.01  0.08  0.07  0.00  0.00
Bathymodiolus (3)   13N–11S//17S      Fmin(BG)>0.25 ∩ Fmax(WG)<0.15         0.01  0.00  0.00  0.00  0.00  0.01  0.06  0.03  0.00  0.00

We similarly reexamined the phylogeographic patterns reported for the other species (Table 3). Hurtado et al. (2004) inferred two barriers to gene flow for Tevnia jerichonana. The 32S sample exhibited pairwise ϕST values that were at least twice as great as those among the remaining samples. The frequency (P) of such an increase in our simulations was ≤7%, whether we tested the specific hypothesis (Fmin(BG) > 0.40 and Fmax(WG) < 0.20) or the general hypothesis (Fmin(BG) > 2 ×Fmax(WG)) (Table 3). In contrast, the reported two-fold increase in pairwise ϕST values between sEPR samples from 7S–17S and nEPR samples (9–13N) might have arisen by chance (P= 10–20%); however, testing the specific hypothesis (Fmin(BG) > 0.10 and Fmax(WG) < 0.05) led to lower probabilities (P= 0–8%) of observing this degree of subdivision. Hurtado et al. (2004) also inferred subdivision across the Equator for the palmworm, A. pompejana. We found that the frequency of observing the reported four-fold increase in ϕST values between nEPR (21N–9N) and sEPR (11S–32S) samples was 0–8% (Table 3).

The polynoid polychaete Branchypolynoe symmitilida and the vesicomyid clam Calyptogena magnifica exhibited no subdivision among nEPR and sEPR samples, but ϕST values increased four- to eight-fold in pairwise comparisons between EPR and GAR samples (Hurtado et al. 2003, 2004). The Galápagos vents are located about 2000 km to the east of the EPR axis. Nevertheless, we could assess how frequently such differentiation would arise by chance in a one-dimensional stepping-stone model by imagining that the GAR vents were “relocated” to the EPR, corresponding to the simulated population #13. The simulations revealed that the observed differentiation between EPR and GAR in the two species was not likely to be a product of stepping-stone processes alone (P= 0–3%). We also assessed, as in the case of R. pachyptila, the probability of finding no differentiation across all EPR samples. The frequency of simulations in which ϕST values between all pairs of EPR populations were <0.05 or <0.10 (as reported for the two species, respectively) was relatively high (P= 27–28%) only in simulations combining small effective population sizes (Nf= 100) and high migration rates (Nmf= 12.5) (Table 3).

The mitochondrial evidence for subdivision in the mussel B. thermophilus was broadly congruent with the allozyme results (Fig. 2). The minimum ϕST value between the 17S sample and samples to the north (Fmin(BG) > 0.25) was nearly twice the maximum found among populations 13N–11S farther to the north (Fmax(WG)= 0.15). The frequency of this degree of differentiation in our simulations was very low (Table 3). Similarly, as in the other species analyzed above, the frequency of simulations in which no differentiation was detected among the remaining populations (as observed between the 13N and 11S sites) was relatively high (P= 7–33%) only when effective population sizes were small (Nf= 100) and migration rates were high (Nmf= 12.5).

Evidence for significant IBD was previously reported for three out of the six species. Computing correlations (r) with Mantel tests for each set of simulated parameters involving 100 replicates was too time consuming, so we estimated correlations for only one of the ten parameter sets. From 100 replicates, we obtained the following critical values for r between genetic and geographic distance: r≈ 0.50 for A. pompejana, R. pachyptila, T. jerichonana and B. thermophilus (six to seven populations); and r≈ 0.60 for C. magnifica and B. symmitilida (four populations). For the remaining nine sets of parameters we only estimated the number of replicates that had r values higher than the determined thresholds. We found that none of the simulated scenarios had a high (>95%) probability of finding significant IBD. The failure rate was particularly high when effective population sizes were small (Table 3).

Discussion

Computer simulations offer a powerful and intuitive means for assessing sample adequacy for phylogeographic studies. Beerli (2004) used simulations to address the effects of “ghost” populations (sampling gaps) on the estimation of migration rates and effective population sizes. He concluded that low to moderate levels of gene flow from “ghost” populations are unlikely to bias estimates of migration rates (m), but they can inflate estimates of effective population size (Ne), and hence the effective number of migrants (Nm). Slatkin (2005) argued that there is no easy way to predict the upper bound on the effects of “ghost” populations on the migration matrix, especially in a stepping-stone model. Nevertheless, both studies agree that intense immigration will tend to homogenize populations that border on “ghost” populations, leading to overestimates of gene flow. Our purposes, however, are not directly comparable to those of Beerli (2004) for two reasons. First, we simulated a stepping-stone rather than an island model of population structure. Second, we did not attempt to infer m or Ne from our simulations; instead, we were interested in the probability that sampling gaps might lead to incorrect inferences of geographical barriers to gene flow.

We used forward- and coalescence-based simulations to rigorously assess previously published inferences about geographical structure in deep-sea hydrothermal vent organisms, but the basic methods we have demonstrated could be used for any organism, and they are likely to be of value in designing sampling strategies for de novo studies of population structure. The simulations were particularly useful for assessing the rigor of inferences made from data that may have suffered from the following potential problems: (1) including a small number of population samples; (2) having likely gaps in sampling coverage; and (3) using numbers or kinds of genes that might exhibit sampling error or large variance in coalescence. These problems have plagued previous population genetic studies of hydrothermal vent organisms because the habitats are remote and very difficult and expensive to sample in a comprehensive manner. Sampling strategies rely on prior knowledge from oceanographic explorations, which may be incomplete, thereby creating ample opportunities for sampling gaps. Finally, the invertebrates found at vents are not model organisms, which greatly limits the suite of genetic markers available for population surveys. Thus, we used simulations to test whether a stepping-stone model of dispersal could be rejected and whether inferred subdivisions among populations of nine codistributed hydrothermal vent species could be explained by sampling variance and gaps in the sampling design.

ASSESSING ISOLATION-BY-DISTANCE

Our first goal was to assess the power to detect correlations between genetic and geographical distances under a one-dimensional stepping-stone model. Failures to detect these signatures of IBD for five out of nine vent species could be explained by low statistical power associated with sampling alone. Generally, allozyme-based studies were more likely to find IBD than studies involving mtDNA alone: four out of five taxa studied with allozymes showed a strongly positive correlation between genetic and geographic distances (the nonsignificant correlation in L. elevatus was due to the small number of samples). On the one hand, this observation is consistent with our simulations, which show that five nuclear loci will have somewhat higher power to detect IBD than one mtDNA locus alone. On the other hand, when mtDNA has sufficient levels of diversity and is analyzed using ϕST statistics, which take into account divergence between haplotypes, five allozyme loci and mtDNA had similar power to detect IBD patterns (Table 2). Indeed, in two of the species (C. magnifica and R. pachyptila) in which mtDNA failed to reveal an IBD pattern, the overall mtDNA diversity was very low and most of the vent sites contained just one common mtDNA haplotype. Such low levels of mtDNA diversity could be caused by selective mtDNA sweeps (Bazin et al. 2006). This again emphasizes the importance of analyzing multiple independent loci when drawing conclusions about demographic processes in populations.

Our simulations only tested the power to detect patterns consistent with IBD assuming a one-dimensional stepping-stone model. Yet, the second part of our study revealed that an IBD model alone could not completely explain the genetic structuring of some EPR invertebrates; consequently, barriers to gene flow were justifiably inferred. When significant barriers subdivide an array of populations, vicariance can result in differentiation that in turn can produce a pattern that mimics the effects of IBD. Yet, vicariance alone cannot explain the observed patterns of increasing differentiation with geographical distance in several EPR species. For example, allozyme-based FST values increased with geographical distance between R. pachyptila populations in the absence of distinct allopatric groups. Similarly, allozyme-based FST values increased with distance in nEPR populations of L. elevatus sensu stricto, nEPR populations of V. sulfuris and EPR populations of B. thermophilus sensu stricto even after the divergent 17S population was removed from the analysis (Won et al. 2003). Geographic subgroups of A. pompejana and B. thermophilus exhibited positive correlations (r∼ 0.42) between genetic and geographic distances, although the correlations were not significant due to the reduction in the number of samples. Even though apparent barriers to gene flow subdivide EPR populations of several species into distinct geographic groups (Hurtado et al. 2004; Plouviez et al. 2009), patterns consistent with the effects of IBD were still found within many of the geographical groups.

ASSESSING PHYLOGEOGRAPHIC GAPS

Our second goal was to test whether geographic subdivisions reported in EPR invertebrates could be explained by sampling gaps. We found that most previously reported evidence for geographical subdivisions along the East Pacific Rise (EPR) and Galápagos Rift (GAR) is not likely to be a product of sampling variance or design. Our findings were not sensitive to the effective size of simulated populations, migration rates, or the number of specimens sampled per locality. We list three previously noted boundaries that are associated with significantly inflated pairwise divergence values for multiple species:

First, the Easter Microplate region (22–27°S)—Won et al. (2003) noted that Bathymodiolus mussels on either side of the Easter Microplate were highly divergent and reciprocally monophyletic for mtDNA and allozyme markers, possibly warranting their consideration as distinct species. Morphologically distinct sister-species of bythograeid crabs also appear to segregate across this boundary (Guinot et al. 2002; Guinot and Hurtado 2003). Hurtado et al. (2004) noted that the polychaete tubeworms R. pachyptila and T. jerichonana exhibit significant shifts in mitochondrial haplotype frequencies across this region, but two other polychaetes, B. symmitilida and A. pompejana, do not. The Easter Microplate region coincides with a biogeographic boundary that separates fauna from the Pacific-Antarctic Ridge from cognate forms on the East Pacific Rise. Deep-ocean circulation models for this region suggest that an eddy driven by the Antarctic Circumpolar Current generates strong cross-axis currents (Fujio and Imasato 1991). A large westward deflection of ³He plumes generated by vents along the axis in this region (Lupton 1998) verifies the existence of strong cross-axis currents that are hypothesized to sweep buoyant larvae from the ridge axis and thereby limit the larval supply to new vents (Won et al. 2003; Hurtado et al. 2004). The rapid tempo of habitat turnover in this region may also contribute to the subdivision because viable populations may persist for a decade or less. Turnover is a function of the magma supply, which drives tectonic spreading rates of ∼150 mm per year, the highest rates for the midocean ridge system globally (Hey et al. 2006). Most of the active vents explored in the region between 18 and 38°S latitude during expeditions in 1999 and 2005 did not support robust communities containing R. pachyptila, T. jerichonana and C. magnifica (R. C. Vrijenhoek, pers. observations). With limited larval supplies, nascent vents in this region may be extinguished before they have the opportunity to develop the rich communities seen at many mature vents along the northern EPR (e.g., Desbruyères et al. 1982; Shank et al. 1998).

Second, the Equator—Hurtado et al. (2004) inferred subdivision between northern and southern populations of the palmworm, A. pompejana (Fig. 2). Based on our simulations, the frequency of observing the reported four-fold increase in ϕST between nEPR (21N–9N) and sEPR (11S–32S) samples was small, 0–8% (Table 3). The tubeworm T. jerichonana also exhibited subdivision across this region, but based on the present simulations the two-fold increase in ϕST between the three nEPR samples and the samples from 7–17S on the sEPR may be due to chance. On the other hand, four species did not exhibit subdivisions across the Equator (Fig. 2). Strong equatorial currents were hypothesized to contribute to this barrier, but the sampling of populations from the equatorial region itself is too limited to pinpoint the location of a dispersal boundary.

Plouviez et al. (2009) recently assessed geographic subdivision across the equatorial region of the EPR for seven invertebrate species. Based on significant shifts in the frequencies of mitochondrial haplotypes, the authors reported that six of these species exhibited evidence for simultaneous vicariance between nEPR and sEPR groups of populations. Although locations of these shifts varied somewhat among the six species, putative historical barriers were inferred between the Equator and 17°S latitude. Four out of seven species analyzed by Plouviez et al. (2009) were also examined for mtDNA in our study. Neither study identified significant differentiation along the EPR for B. symmitilida; however, Plouviez et al. did not examine GAR samples, which differed slightly from EPR samples. Both studies identified significant genetic subdivision between the northern and southern samples of A. pompejana and B. thermophilus. The only potentially conflicting result was between allozyme-based evidence for a barrier to gene flow in E. vitrea between 13N and 11N and the mtDNA-based evidence of Plouviez et al., which revealed that 13N and 9N samples were genetically similar. Given the different effective population sizes and dynamics of nuclear versus mitochondrial genes, such discrepancies are not surprising.

It should be noted that the putative equatorial barrier to gene flow identified by Hurtado et al. (2004) and verified by Plouviez et al. (2009) also coincides with a large sampling gap of nearly 1800 km between 9°N and 7°S latitude (Fig. 1). The previous studies did not address the possibility that the observed differentiation between nEPR and sEPR populations might have been a product of sampling gaps and a range of populations connected by stepping-stone dispersal. Notwithstanding, our present simulations suggest that sampling gaps alone do not provide a sufficient explanation for the observed degrees of differentiation, thereby supporting the inferences of the earlier studies.

Third, the Galápagos Rift—The GAR vents are displaced nearly 2000 km east of the EPR axis. Although active vents were recently discovered in the Triple Junction region where the GAR, nEPR and sEPR join (T. Shank, Woods Hole Oceanographic Institution, pers. comm.), no other intervening vents are known so far. Consequently, it is not surprising that samples of several species differed significantly from corresponding EPR populations. Of the six species sampled from the GAR vents, two (C. magnifica and B. symmitilida) exhibited strong differentiation, and one, V. sulfuris, exhibited differentiation that might be indicative of interspecific divergence (France et al. 1992). GAR samples of R. pachyptila did not differ significantly from EPR counterparts for allozymes or mitochondrial COI (Black et al. 1994; Hurtado et al. 2003), but new data based on five single-copy nuclear genes support differentiation between the GAR and EPR (D. K. Coykendall et al., unpubl. ms.).

Factors producing low levels of differentiation across broad geographical ranges

Four species, B. symmitilida, B. thermophilus, C. magnifica and R. pachyptila, exhibited very low levels of differentiation for mtDNA across most of the EPR and GAR axes (Hurtado et al. 2003, 2004). Reduced observed differentiation might result from several factors: i) low statistical power, ii) small long-term effective population sizes, iii) selective sweeps on mtDNA (Bazin et al. 2006), and iv) metapopulation processes related to the instability of habitats and nonequilibrium conditions that prevent the accumulation of genetic diversity and differentiation (Wade and McCauley 1988). Based on our simulations of a one-dimensional stepping-stone model, a combination of relatively low female population size (Nef= 100) and high migration rates (Nm= 25) might lead to such limited mtDNA differentiation with 25–30% probability (Table 3). If Nef= 1000 or 5000, the probability of such low differentiation across the studied range was ≤10%. Consequently, small Nef rather than low statistical power best explains the absence of differentiation along the EPR. But is it possible that Nef could be as low as 100 females? When found, the census sizes for C. magnifica and R. pachyptila appear to be enormous, although no one has attempted to estimate their numbers. Nonetheless, for broadcast spawners, the ratio of census to effective population sizes can be very high; for example, the ratio was estimated to be as high as 10,000 in New Zealand snapper (Hauser et al. 2002). Local extirpations or strong bottlenecks associated with tectonic and volcanic events and frequent recolonization events would increase this ratio. Theoretically we should be able to estimate the long-term effective female population sizes from the observed levels of mitochondrial diversity. Yet, there are many caveats to this approach, especially due to uncertainty about the mutation rates applicable to short-term intraspecific questions (Audzijonyte and Väinölä 2006; Ho and Larson 2006; Waters et al. 2007). Occasional mitochondrial selective sweeps might further bias attempts to correlate mitochondrial diversity with effective population sizes (Bazin et al. 2006). Given these confounding factors we believe that attempts at estimating long-term evolutionary Nef from mtDNA alone are unlikely to be informative.

Chevaldonné et al. (1997) and Jollivet et al. (1999) attempted to apply metapopulation models to hydrothermal vent invertebrates. Their efforts were stimulated by an earlier empirical study that reported no increase in genetic differentiation with increasing geographical distance in the alvinellid polychaetes A. pompejana and A. caudata (Jollivet et al. 1995). According to their metapopulation models, vent populations may generally remain in nonequilibrium conditions due to frequent habitat shifts caused by episodic volcanic eruptions and limited temporal availability of hydrogen sulfide. Such nonequilibrium conditions would prevent the build-up of genetic differentiation on larger geographical scales, even though some neighboring populations could be differentiated. Although it is likely that frequent disruptions in habitat quality will cause local extinctions, the spatial scales over which homogenization of genetic diversity occurs remain unclear. One could argue that if colonizers of newly available habitats arrive mostly from adjacent sites, genetic differentiation will eventually build up over larger spatial scales. The pattern of genetic differentiation will also depend on the rates of habitat turnover in relation to the effective size of local populations (Wade and McCauley 1988).

In accordance with the approach outlined in this article, we asked whether Jollivet et al. (1995) had the power to detect IBD even if it existed. In each of the two species, they examined allozymes in five to six spatial subsamples composed of 20 to 50 individuals from the 13N area and one sample composed of 20 to 30 individuals from the 21N site. Jollivet et al. (1995) reported that FST between the 13N and 21N vent fields was not significantly larger than FST among subsamples within the 13N region. We used the Easypop program to simulate their sampling design. Migration among the 13N subsamples followed an island model and was ten times greater (m= 0.005) than migration between the 13N and 21N vent fields (m= 0.0005). Using other parameters consistent with our previously described methods, we conducted 20 replicates and estimated FST values. Only eight of the twenty replicates produced between-group FST values (mean = 0.014) that were greater than the within-group FST values (mean = 0.020). This quick simulation suggests that the Jollivet et al. (1995) study did not have sufficient power to detect a ten-fold difference in m between geographically adjacent and distant sites; consequently, it did not have enough power to find a positive correlation between genetic distance and geographic distance. Although their nonequilibrium metapopulation model may be warranted for hydrothermal vent organisms, more intensive temporal and spatial sampling is needed to generate sufficient statistical power to reject simpler hypotheses like the one-dimensional stepping-stone model.

Conclusions

We encourage the use of computer simulations to test hypotheses about patterns of genetic differentiation among populations. The main focus of our study was to address the power of available data sets to reject a simple stepping-stone model of population structure for vent organisms and to assess whether reported barriers to dispersal might be due to gaps in the sampling design. Yet, a similar approach could be used to address a range of other questions. For example, one could ask how frequently isolation-by-distance would be detected by chance if populations conformed to an island model of population structure. Alternatively, one could ask how frequently the statistical power would be insufficient to detect existing barriers to gene flow.

Of course, simulated data are only as applicable to real populations as the underlying model is realistic. Our first assumption is that a one-dimensional stepping-stone model represents a simple and realistic model applicable to the East Pacific Rise vent invertebrates with both short and long larval lifespans. Even if some of the invertebrate larvae are capable of long-distance dispersal and are not confined to the vent ridges, they will still be more likely to colonize nearby vents than those situated thousands of kilometers away. We also assumed equal population sizes over the whole range of the EPR, an assumption that will not often apply to natural populations. When population sizes fluctuate and vary among localities, the overall effective number of individuals in the metapopulation is expected to be smaller than under a constant population size. With smaller long-term effective sizes, genetic drift should be stronger, the signatures of IBD weaker, and the probability of incorrectly inferring barriers to gene flow greater (see Table 3 where Ne= 100). More simulations should be conducted to assess the effects of systematically different effective population sizes in different areas of the EPR.

More genetic studies on hydrothermal vent invertebrates are likely to emerge in the near future, and it may become obvious that a stepping-stone model does not provide an adequate description of a hydrothermal vent metapopulation. For example, western Pacific vents occur in back-arc basins and on seamounts that are scattered in a more complex two-dimensional pattern. Also, the eastern Pacific vents are found at similar depths (2500 ± 200 m), whereas western Pacific and Mid-Atlantic vents vary greatly in this third dimension. As sampling becomes more fine-grained at vents and employs additional genetic markers, we may learn that small-scale spatial differences are as great as or greater than the large-scale differences in some rapidly evolving markers, or that temporal variance in gene frequencies at a vent locality exceeds differentiation at larger geographical scales (e.g., Shank and Halanych 2007). Alternatively, it may turn out that regardless of small-scale variations, the vent metapopulation as a whole can still be described by relatively simple models. In any event, researchers are encouraged to assess the power of their sampling designs to reject simpler models before embracing more complex scenarios.


Associate Editor: D. Posada

ACKNOWLEDGMENTS

We thank P. R. England, R. Waples, O. Gaggiotti, P. Chevaldonné, J. Carlsson and two anonymous reviewers for helpful ideas on the computer simulations and comments on the manuscript. Funding for this project was provided by the Monterey Bay Aquarium Research Institute (The David and Lucile Packard Foundation) and NSF grants OCE9910799 and OCE 0241613.
