Self-recruitment and sweepstakes reproduction amid extensive gene flow in a coral-reef fish


Mark R. Christie, Fax: +1 541 737 0501; E-mail:


Identifying patterns of larval dispersal within marine metapopulations is vital for effective fisheries management, appropriate marine reserve design, and conservation efforts. We employed genetic markers (microsatellites) to determine dispersal patterns in bicolour damselfish (Pomacentridae: Stegastes partitus). Tissue samples of 751 fish were collected in 2004 and 2005 from 11 sites encompassing the Exuma Sound, Bahamas. Bayesian parentage analysis identified two parent–offspring pairs, which is remarkable given the large population sizes and 28 day pelagic larval duration of bicolour damselfish. The two parent–offspring pairs directly documented self-recruitment at the two northern-most sites, one of which is a long-established marine reserve. Principal coordinates analyses of pair-wise relatedness values further indicated that self-recruitment was common in all sampled populations. Nevertheless, measures of genetic differentiation (FST) and results from assignment methods suggested high levels of gene flow among populations. Comparisons of heterozygosity and relatedness among samples of adults and recruits indicated spatially and temporally independent sweepstakes events, whereby only a subset of adults successfully contribute to subsequent generations. These results indicate that self-recruitment and sweepstakes reproduction are the predominant, ecologically-relevant processes that shape patterns of larval dispersal in this system.


The vast majority of marine invertebrates and fishes have a planktonic larval stage. How far and to what extent larvae disperse from their natal sites remains a pressing question in marine ecology, conservation biology, and fisheries biology. Answers to these questions have vast ramifications for understanding metapopulation dynamics (Hixon et al. 2002; Kritzer & Sale 2004), enhancing marine reserve design (Botsford et al. 2003; Palumbi 2003), and facilitating fisheries management (Gell & Roberts 2003; Francis et al. 2007). Because the larvae of most marine species are minuscule, it is extremely difficult to observe and track them in situ. Consequently, early approaches for determining dispersal patterns focused on predictive models of passive larval transport (e.g. Roberts 1997). Results from such studies fostered the common assumption that the vast majority of marine populations were demographically open and characterized by high levels of larval connectivity (Cowen et al. 2000). More recent work, using both genetic and microchemical analyses, has demonstrated that self-recruitment (the return of larvae to their natal population) may be more common than previously thought (Jones et al. 2005; Almany et al. 2007).

Reconciling these conflicting patterns of larval dispersal remains challenging because most marine populations cannot be simply categorized as closed or open (Cowen et al. 2000), but rather occur along a dynamic continuum of self-recruitment and population connectivity. Understanding the full complexity of dispersal patterns requires sampling of multiple cohorts (i.e. multiple dispersal events) both spatially and temporally (Selkoe et al. 2006). Furthermore, the majority of marine species have high rates of gene flow over evolutionary time scales (Hedgecock et al. 2007a). Determining the extent to which populations are connected, despite high gene flow, remains the single greatest challenge for revealing ecologically meaningful patterns of larval dispersal (Botsford et al. 2009).

Spatial patterns of larval dispersal can be detected with either direct or indirect methods (Hedgecock et al. 2007a). Indirect methods focus on population-level analyses and often require theoretical assumptions (e.g., drift-mutation equilibrium). Such methods are often plagued by a lack of statistical power for detecting ecologically relevant patterns of connectivity when faced with moderate to high levels of gene flow (Wang 2004). Nevertheless, when the appropriate conditions are met, certain indirect methods can effectively reveal broad patterns of larval dispersal (Manel et al. 2005; Saenz-Agudelo et al. 2009). Direct methods, on the other hand, focus on tracking individual larvae from birth to settlement usually via mark/recapture methods. For example, fairly elaborate methods have been developed to tag the otoliths (ear stones) of fishes with various elemental markers (Thorrold et al. 2006). However, such tagging methods are often quite expensive and can be limited by logistical constraints, such as limited mark duration and the need for multiple field collections.

One underexplored direct method of tracking marine larvae is parentage analysis (Hauser et al. 2007; Planes et al. 2009; Christie 2010). To date, parentage analyses have been used to determine dispersal patterns only in fishes with short pelagic larval durations in populations where all of the adults can be sampled (Jones et al. 2005; Planes et al. 2009; but see Hauser et al. 2007). Here, we overcame difficulties of applying parentage methods to large natural populations by employing a novel Bayesian parentage method that fully accounts for large numbers of pair-wise comparisons and small or unknown proportions of sampled parents (Christie 2010). Given the large population sizes and potentially vast dispersal distances of many marine species, it remains likely that even large data sets may record few direct observations of larval dispersal. Thus, the coupling of both direct and indirect methods will likely reveal greater insights than either approach alone.

Besides parentage, other tests of relatedness within marine species hold much promise (Veliz et al. 2006). Analyses that focus on cohorts of settling recruits can yield important spatial and mechanistic insight into patterns of larval dispersal (Selkoe et al. 2006). One important process is the ‘sweepstakes effect,’ in which a small proportion of the available gene pool successfully contributes to the replenishment of the population (Hedgecock 1994a; b). Because the majority of adults do not successfully reproduce, the characteristic signatures of a sweepstakes effect include reduced genetic diversity and increased levels of relatedness in cohorts of recruits when compared to adults. The sweepstakes hypothesis further predicts that recruits should have less within-cohort but greater among-cohort genetic diversity than adults (Hedgecock et al. 2007b). While most studies indicate that sweepstakes effects are likely caused by stochastic larval mortality, a similar pattern could be created before the pelagic larval stage if a subset of adults (e.g. the largest individuals) produces either more offspring (Berkeley et al. 2004) or offspring that are disproportionately more likely to survive. Regardless of the mechanisms underlying sweepstakes effects, documenting such patterns over both spatial and temporal scales can reveal detailed insights into the patterns of larval dispersal (Hedgecock 1994b).

Bicolour damselfish (Stegastes partitus) are ubiquitously distributed on coral reefs throughout the Bahamas and Caribbean and are an ideal species for studying patterns of larval dispersal. Furthermore, bicolour damselfish possess large population sizes and high rates of gene flow typical of most marine fishes targeted by commercial fisheries (Ward et al. 1994). Tagging and observational studies have revealed that bicolour damselfish rarely move more than a few meters after settlement (McGehee 1995; Hixon et al. unpublished data). Because there is little post-settlement movement, any geographic distances between parents and offspring can be attributed solely to larval dispersal. Male bicolour damselfish vigorously defend territories on coral heads that often include multiple females. In the Bahamas, spawning activity peaks during the summer months and occurs in 2 week cycles that are influenced by lunar phase (Robertson et al. 1988). Males guard nests of demersal eggs, which often consist of clutches laid by several females (Knapp et al. 1995). We estimate an average of 4945 eggs per clutch with males guarding up to nine clutches in their nest per lunar month (Johnson et al. unpublished data). The eggs hatch 3.5 days after spawning, and the larvae are planktonic for approximately 28 days (Wilson & McCormick 1999).

Despite low overall levels of genetic differentiation, a large-scale population-genetics study of bicolour damselfish revealed significant isolation-by-distance at spatial scales around 1000 km (Purcell et al. 2009), suggesting little gene flow among distant sites. Region-wide comparisons of FST indicated that reefs lining Exuma Sound, Bahamas (our study system) were isolated from most other sites in the Caribbean (Purcell et al. 2009). Additionally, the Great Bahama Bank, a wide but shallow (< 5m deep) limestone shelf that encompasses the Exuma Sound (> 1000m deep), likely acts as a barrier to larval dispersal both into and out of the sound because it contains no suitable coral-reef habitat (Stoner & Davis 1997; Gutierrez-Rodriguez & Lasker 2004). Within the Exuma Sound, complex oceanographic patterns likely influence patterns of larval dispersal. As illustrated in Fig. 1, seasonal mesoscale gyres could entrain larvae and provide a mechanism for larval transport between reefs located on different sides of the sound (BM Hickey, University of Washington, unpublished data). Furthermore, general northwesterly surface currents derived from the Antilles Current could result in along-shore transport of larvae from southern to northern reefs (Colin 1995).

Figure 1.

 Sample sites and prevailing surface currents (dashed arrows) for bicolour damselfish collected in Exuma Sound, Bahamas. Parentage analysis identified two parent–offspring pairs (solid arrows), which directly documents self-recruitment at the two northern-most sites. Light seafloor indicates the shallow (mostly < 3 m deep) Great Bahama Bank, whereas dark seafloor indicates the Exuma Sound and nearby open ocean (mostly > 1500 m deep). Triangles and straight arrows indicate 2004 sample sites, and filled circles indicate 2005 sample sites. Site abbreviations are as follows: Compass Cay (CC), Bock Rock (BR), String Bean Cay (SB), Big Point (BP), Lee Stocking Island (LSI), Three Sisters Reef (TS), and South Reef (SR). The Exuma Cays Land and Sea Park (Park) is a marine reserve that has been protected since 1964.

Here, we address two questions regarding ecologically relevant patterns of dispersal in bicolour damselfish: (i) to what extent do larvae return to their natal populations (self-recruitment) versus disperse among local populations (connectivity), and (ii) to what spatial and temporal extent do sweepstakes effects occur? We conclude that self-recruitment and local sweepstakes events are the central processes that influence patterns of larval dispersal in this system.

Materials and methods

Sample collection

Tissue samples were gathered from 751 Stegastes partitus collected from 11 sites within the Exuma Sound, Bahamas, during 2004 and 2005 (Fig. 1, Table 1). Adults (> 5 cm total length, n = 437) and recently settled recruits (< 2.5 cm total length, n = 314) were collected via hand nets by pairs of SCUBA divers. Based on size-at-age relationships, most recruits were less than 72 days post hatching, while the average reproductively mature adult was greater than 300 days post hatching. A solution of 10% quinaldine to 90% methanol was used to anaesthetize the damselfish before live capture. Tissue was clipped from the pelvic fins of adults and placed in a urea-based storage solution consisting of 10 mM Tris, 125 mM NaCl, 10 mM EDTA, 1% SDS, 8 M urea, pH adjusted to 7.5 with HCl (JFH Purcell, personal communication). After sampling, adults were returned unharmed to their original collection location on the reef. Caudal fin tissue was collected from recruits, which were preserved for future analyses.

Table 1.   Summary statistics for each sample site averaged over all seven loci. Observations include: sample locality, site abbreviation, sample size (N), mean number of alleles per locus, mean allelic richness, mean inbreeding coefficient (FIS), observed heterozygosity (Ho), expected heterozygosity (HE), among-loci HWE P value (with number of loci out of HWE), percentage of loci pairs in linkage disequilibrium, and the number of private alleles

Sample locality       Site code   N    Mean # alleles/locus   Mean allelic richness   FIS     Ho      HE      HWE P value (# loci out)   % loci in LD†   Private alleles
2005 adults
Lee Stocking Island   LSI         42   27.7                   18.7                    0.053   0.919   0.934   0.5016 (0)                 0               0
Land and Sea Park     Park        44   23.1                   19.7                    0.026   0.926   0.930   0.4950 (0)                 0               1
Eleuthera             Eleuthera   49   23.3                   19.5                    0.051   0.927   0.936   0.0826 (0)                 0               2
Cat Island            Cat         46   25.0                   20.1                    0.098   0.919   0.929   0.0000* (2)                4.76            7
Long Island           Long        41   22.9                   19.8                    0.041   0.925   0.938   0.1858 (0)                 4.76            3
2004 adults
Compass Cay           CC          39   22.0                   18.7                    0.061   0.903   0.935   0.0004 (2)                 0               2
Bock Rock             BR          28   20.9                   19.4                    0.031   0.901   0.930   0.2719 (0)                 0               1
String Bean Cay       SB          56   30.0                   19.7                    0.121   0.916   0.924   0.3192 (0)                 0               4
Big Point             BP          28   25.0                   18.5                    0.045   0.911   0.932   0.3984 (0)                 0               3
Three Sisters Reef    TS          25   21.0                   20.1                    0.058   0.894   0.929   0.0609 (0)                 0               0
South Reef            SR          39   22.3                   18.7                    0.077   0.885   0.927   0.0145 (0)                 0               2
2005 recruits
Lee Stocking Island   LSI         61   27.1                   19.3                    0.121   0.916   0.926   0.0000* (3)                28.57*          4
Land and Sea Park     Park        45   23.0                   19.1                    0.011   0.919   0.928   0.1116 (1)                 4.76            5
Eleuthera             Eleuthera   37   22.9                   20.5                    0.046   0.915   0.928   0.0141 (2)                 0               3
Cat Island            Cat         44   22.7                   18.8                    0.099   0.917   0.928   0.0000* (2)                4.76            2
Long Island           Long        47   22.6                   18.7                    0.061   0.914   0.929   0.6336 (0)                 0               3
2004 recruits
String Bean Cay       SB          31   22.6                   19.1                    0.022   0.856   0.908   0.3222 (0)                 9.52            0
Big Point             BP          49   27.3                   19.4                    0.081   0.856   0.931   0.0004* (2)                0               3

* Significant after a Bonferroni correction.
† Calculated as number of loci pairs in linkage disequilibrium (P < 0.05) divided by the total number of comparisons.

In 2004, sampling was concentrated along the western edge of the Exuma Sound (Fig. 1). Tissue was collected from 295 fish from six sites. In 2005, sampling was expanded to include sites located around the entire Exuma Sound. Approximately fifty adults and fifty recruits were collected at each of five sites for a total of 456 fish. Recruit samples likely come from a single settlement pulse given their similarity in size, while adults undoubtedly consist of individuals from many settlement events. For both years, sampling was conducted from June to August, which encompasses the peak spawning and recruitment period for bicolour damselfish.

DNA extraction and microsatellite typing

DNA was extracted using a protocol optimized for samples stored in urea-based buffer (JFH Purcell, personal communication). Tissue was incubated in extraction buffer (75 mM NaCl, 25 mM EDTA, 1% SDS) along with proteinase K (2 μL of 20 mg/mL) at 55 °C for 2 h. After incubation, one half volume of ammonium acetate (7.5 M) was added. Samples were centrifuged and genomic DNA was precipitated from the resulting supernatant with standard isopropanol and ethanol washes (Sambrook & Russell 2001).

Samples were genotyped at seven microsatellite loci originally described by Williams et al. (2003). The seven loci employed in this study were SpGATA16, SpGATA40, SpAAT40, SpAAC33, SpAAC41, SpAAC42, and SpAAC47. PCR reactions contained 1.5 mM MgCl2, 0.2 mM dNTPs, 0.2 U Taq DNA polymerase (Promega), 10 μM of each primer, and 2.0 μL of approximately 100 ng/μL template in a total reaction volume of 15 μL. Thermocycling profiles consisted of an initial denature at 94 °C for 4 min followed by 35 cycles of 30 s at 94 °C, 45 s at 52 °C, and 45 s at 72 °C. All loci had an optimal annealing temperature of 52 °C, except for SpGATA40 (60 °C), SpAAC41 (55 °C), and SpAAC42 (55 °C). PCR products were screened on an ABI 3100 automated sequencer (Applied Biosystems). Allele sizes were determined with the fragment analysis software genotyper 3.7. Approximately 5% of individuals were re-processed through the entire procedure to remedy difficulties with scoring alleles and to regenotype individuals that were homozygous at the most polymorphic loci (see methods in Morin et al. 2009). A further 96 individuals were re-processed to calculate a study-specific error rate.

All data were tested for departure from Hardy–Weinberg equilibrium (HWE) within each population by locus and over all loci using genepop v. 3.4 (Raymond & Rousset 1995). A total of 10 000 batches and 5000 iterations per batch were employed to reduce the standard errors below 0.01. genepop was additionally used to calculate observed and expected heterozygosities. The mean number of alleles per locus, mean allelic richness, and observed number of alleles were calculated with fstat v. 2.9.3 (Goudet 2001). Additionally, randomization tests (21 000 randomizations) were conducted using fstat to detect significant FIS. micro-checker v. 2.2.3 was employed to determine whether any deviations from HWE were due to null alleles or large allele drop-out, as well as to check for stuttering (Van Oosterhout et al. 2004). Both genepop (10 000 batches, 5000 iterations) and genetix 4.05 (5000 permutations) (Belkhir et al. 2002) were employed to test for linkage disequilibrium at all locus pairs and over all populations.
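As a rough illustration of the heterozygosity summaries described above, the following sketch computes observed and expected heterozygosity at a single locus. The function names and toy genotypes are hypothetical, and the estimator shown (Nei's unbiased gene diversity) is an assumption; the exact computation performed by genepop may differ.

```python
from collections import Counter


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles at a locus."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


def expected_heterozygosity(genotypes):
    """Nei's unbiased gene diversity: (2n / (2n - 1)) * (1 - sum of p_i^2)."""
    alleles = [allele for pair in genotypes for allele in pair]
    two_n = len(alleles)
    counts = Counter(alleles)
    sum_p2 = sum((c / two_n) ** 2 for c in counts.values())
    return two_n / (two_n - 1) * (1.0 - sum_p2)


# Hypothetical allele sizes (bp) for four individuals at one microsatellite locus.
toy_genotypes = [(180, 184), (184, 184), (176, 188), (180, 192)]
ho = observed_heterozygosity(toy_genotypes)
he = expected_heterozygosity(toy_genotypes)
```

A deficit of observed relative to expected heterozygosity at a locus is the pattern that prompts checks for null alleles or large-allele drop-out.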

Parentage methods

The multi-locus genotypes of all adult damselfish were compared to the multi-locus genotypes of all recruits. The study-specific genotyping error rate of approximately 0.014 allowed for up to 1 locus to mismatch (see methods in Christie 2010). All pairs that shared at least one allele at six out of seven loci were considered putative parent–offspring pairs. The putative pairs were completely reanalysed, from extraction through scoring, to minimize the possibility of laboratory error. Because allowing 1 locus to mismatch increases type I error, all putative parent–offspring pairs that continued to mismatch at one locus were discarded (n = 1; the mismatch was likely a true Mendelian incompatibility, as the heterozygous alleles were separated by more than 80 base pairs). None of the putative parent–offspring pairs had missing data. When calculating allele frequencies, missing data were coded as the most common allele, which is a conservative approach because it makes the underlying allele frequency distribution less uniform (Christie 2010). Allele frequencies could have been estimated after the missing data were either ignored or coded as a single null allele, but these approaches were less conservative. For each putative parent–offspring pair, we calculated the probability of the pair being false given the frequencies of shared alleles, Pr (φ|λ) (Christie 2010). This method employs Bayes’ theorem to account fully for the exclusion probability of each locus, while also accounting for the frequencies of shared alleles. Within this framework, shared rare alleles decrease the probability of a putative pair being false because sharing them by chance is unlikely. This method, unlike many commonly implemented approaches, is not affected by differences in allele frequencies between adult and recruit samples. Furthermore, this approach fully accounts for the large number of pair-wise comparisons. Thus Pr (φ|λ) equals the probability of a putative pair being false after accounting for the frequencies of shared alleles and for the total number of pair-wise comparisons. Simulations required for the calculation of Pr (φ|λ) were conducted with 10 000 false pairs generated from over 100 null data sets. Programs to implement these methods are available at:
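The initial screening step (pairs sharing at least one allele at six of seven loci) can be sketched as follows. This is only an illustration of the screening rule, not of the Bayesian calculation of Pr (φ|λ), and it is not the authors' program; the function names, data layout, and toy genotypes are all hypothetical.

```python
def loci_sharing_allele(genotype_a, genotype_b):
    """Count loci at which two multi-locus genotypes share at least one allele."""
    return sum(1 for a, b in zip(genotype_a, genotype_b) if set(a) & set(b))


def putative_pairs(adults, recruits, n_loci=7, max_mismatch=1):
    """Return (adult_id, recruit_id) pairs sharing an allele at >= n_loci - max_mismatch loci."""
    threshold = n_loci - max_mismatch
    return [
        (adult_id, recruit_id)
        for adult_id, adult in adults.items()
        for recruit_id, recruit in recruits.items()
        if loci_sharing_allele(adult, recruit) >= threshold
    ]


# Hypothetical toy genotypes: one (allele, allele) tuple per locus, seven loci each.
adults = {"A1": [(1, 2), (3, 4), (5, 6), (7, 8), (9, 10), (11, 12), (13, 14)]}
recruits = {
    "R1": [(1, 20), (4, 21), (5, 22), (8, 23), (9, 24), (12, 25), (13, 26)],    # shares at 7 loci
    "R2": [(30, 20), (31, 21), (5, 22), (8, 23), (9, 24), (12, 25), (13, 26)],  # shares at 5 loci
    "R3": [(30, 20), (4, 21), (5, 22), (8, 23), (9, 24), (12, 25), (13, 26)],   # shares at 6 loci
}
candidates = putative_pairs(adults, recruits)

# Every adult is compared with every recruit, using the sample sizes given in the text.
n_comparisons = 437 * 314
```

With 437 adults and 314 recruits the screen involves 137 218 comparisons, which is why a method that accounts for the number of pair-wise comparisons matters here.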

To assess the possibility of parent–offspring pairs being a different first-order relative (i.e. full sibs), we calculated the probability of full-sib pairs being indistinguishable from parent–offspring pairs (P < 0.025) (Goodnight & Queller 1999). To generate this P-value we created 10 000 simulated full sibs in kingroup v. 2.0 (Konovalov et al. 2004) with the observed damselfish allele frequencies and calculated the proportion of pair-wise comparisons that shared at least one allele at all loci. Furthermore, it is unlikely that two siblings of such vastly different sizes (adult vs. recruit) would be alive at the same time. Both parents of such full-sibs would have to be at least 2 years old, given known bicolour damselfish growth rates and average size at maturity. Because, on average, each recruit damselfish has less than an 8% chance of surviving to more than 2 years, this event was quite unlikely (Hixon et al. unpublished data).

Population structure

We performed multiple tests for the presence of population genetic structure. Adults and recruits were treated as separate samples, though pooling did not alter findings. Global and pair-wise FST values among all populations were calculated with fstat. Given the high marker polymorphism, we also calculated a standardized measure of genetic differentiation, G′ST (Hedrick 2005), using recodedata v. 0.1 (Meirmans 2006). Exact tests for allelic differentiation among populations were performed in genepop with 10 000 batches and 5000 iterations per batch. We also employed assignment methods, using a broad array of input parameters, to search for fine-scale population structure. We ran structure 2.3.1 (Pritchard et al. 2000; Hubisz et al. 2009) both with and without prior population information and with multiple parameter sets (i.e. with and without admixture, with and without correlated allele frequencies). We also used Bayesian assignment methods (Rannala & Mountain 1997) as implemented in geneclass2 (Piry et al. 2004).
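To show why a standardized measure is useful with highly polymorphic markers, the sketch below computes Nei's GST and the standardized form of Hedrick (2005) from per-population allele frequencies at one locus. It is a simplified illustration under stated assumptions (equal population weights, no sample-size correction); it is not the recodedata procedure, and the function names and toy frequencies are hypothetical.

```python
def gene_diversity(freqs):
    """Expected heterozygosity, 1 - sum of p_i^2, for one population at one locus."""
    return 1.0 - sum(p * p for p in freqs.values())


def gst_and_standardized(pops):
    """Return (GST, standardized GST) for a list of allele-frequency dicts, one per population."""
    k = len(pops)
    hs = sum(gene_diversity(p) for p in pops) / k            # mean within-population diversity
    alleles = set().union(*pops)
    mean_freqs = {a: sum(p.get(a, 0.0) for p in pops) / k for a in alleles}
    ht = gene_diversity(mean_freqs)                           # total diversity
    gst = (ht - hs) / ht
    # Hedrick (2005): divide GST by its maximum possible value given HS and k.
    gst_std = gst * (k - 1 + hs) / ((k - 1) * (1.0 - hs))
    return gst, gst_std


# Hypothetical extreme case: two populations that share no alleles at all.
disjoint = [{"a": 0.5, "b": 0.5}, {"c": 0.5, "d": 0.5}]
gst, gst_std = gst_and_standardized(disjoint)
```

In this toy case the populations are completely differentiated, yet the unstandardized value is only one third because within-population diversity is high; the standardized value rescales it to one.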

To examine patterns of dispersal by comparing shared alleles, we calculated pair-wise relatedness values among all 751 individuals using Queller & Goodnight’s (1989) relatedness metric as implemented in genalex 6.2 (Peakall & Smouse 2006). This relatedness metric describes the number of shared alleles between pairs of individuals and standardizes this value based upon the individual’s allelic state (e.g. heterozygous) and on the frequency of the alleles in the reference population. To visualize the results of this analysis, we conducted a principal coordinates analysis (PCoA) on the pair-wise relatedness matrix. Individuals that share identical alleles occupy the same location in multivariate space, while individuals with different and rare alleles occupy distant locations in multivariate space. PCoA performs well with a wide variety of distance measures (McCune & Grace 2002; Jombart et al. 2009) and is well suited for a pair-wise relatedness matrix. We repeated this analysis with pair-wise relatedness matrices calculated with other relatedness metrics (e.g. Lynch & Ritland 1999), which produced similar patterns but with more outliers.

To evaluate statistically whether our sample groups (adults or recruits collected from different sites) occupied different regions of multivariate space we performed multi-response permutation procedures (MRPP). This method calculates the average multivariate distance within each group and determines whether the average within-group distance is significantly smaller than the average within-group distances generated by random assignment of individuals to groups (Mielke & Berry 2001; McCune & Grace 2002). We used Euclidean distances and 10 000 permutations for each comparison. Analyses were performed within the R statistical software environment (scripts available from corresponding author upon request) (R Development Core Team 2009). Test statistics were compared to a Pearson Type III distribution with mean, variance and skewness calculated from permuted datasets (McCune & Grace 2002). We performed MRPP for all PCoA groups as well as for each between-site comparison to determine whether the observed pattern was different from that expected by chance. We also calculated effect size, which is the chance-corrected within-group agreement, by dividing the observed weighted mean within-group distance by the expected weighted mean within-group distance and subtracting the resulting quotient from one (McCune & Grace 2002). An effect size of 0 indicates chance assignment of samples to groups, while an effect size of 1 indicates maximal differences between groups.
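The logic of MRPP and its effect size can be sketched in a few lines. This is a minimal illustration rather than the authors' R scripts: it weights groups by n/N, takes the expected distance as the mean over permutations, and reports an empirical permutation P-value instead of the Pearson Type III approximation described above. All names and the toy coordinates are hypothetical.

```python
import itertools
import math
import random


def weighted_within_group_distance(points, labels):
    """MRPP delta: group-size-weighted mean of within-group pairwise Euclidean distances."""
    n_total = len(points)
    delta = 0.0
    for group in set(labels):
        members = [p for p, lab in zip(points, labels) if lab == group]
        pairs = list(itertools.combinations(members, 2))
        if not pairs:
            continue  # a single-member group contributes no pairwise distance
        mean_dist = sum(math.dist(a, b) for a, b in pairs) / len(pairs)
        delta += (len(members) / n_total) * mean_dist
    return delta


def mrpp(points, labels, n_perm=10_000, seed=0):
    """Return (observed delta, effect size A, permutation P-value)."""
    rng = random.Random(seed)
    observed = weighted_within_group_distance(points, labels)
    shuffled = list(labels)
    permuted = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        permuted.append(weighted_within_group_distance(points, shuffled))
    expected = sum(permuted) / n_perm
    p_value = (1 + sum(d <= observed for d in permuted)) / (n_perm + 1)
    effect_size = 1.0 - observed / expected  # chance-corrected within-group agreement
    return observed, effect_size, p_value


# Hypothetical ordination scores for two well-separated groups of five individuals.
coords = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
groups = ["adults"] * 5 + ["recruits"] * 5
delta_obs, effect, p = mrpp(coords, groups, n_perm=2000)
```

Groups that overlap in multivariate space, as adults and recruits from the same site are expected to under self-recruitment, would instead give an effect size near zero.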

We next examined our data for temporal differences between 2004 and 2005 at Lee Stocking Island (LSI), the only site sampled both years, using exact tests. We further examined our 2005 data by conducting a principal coordinates analysis (PCoA) on pair-wise FST values using genalex. Note that the FST analysis identifies differences in allele frequencies between all sampled populations, while the PCoA on pair-wise relatedness values (see above paragraphs) examines shared alleles among all individuals. We also calculated within-population levels of heterozygosity and relatedness. For both measures, we calculated the mean across populations, but within groups (adults or recruits), and calculated 95% confidence intervals using a t multiplier with four degrees of freedom. Additionally, we employed randomization procedures as implemented in fstat to detect differences in heterozygosity, relatedness and FST between samples of adults and recruits. Lastly, we estimated the effective number of breeders by calculating the effective population size of recruits with a linkage disequilibrium method (Waples 2006). We employed LDNE to calculate our estimates and calculated confidence intervals with both parametric and jackknife methods (Waples & Do 2008). Estimates were obtained for alleles that occurred with frequencies greater than 0.01, 0.02, and 0.05, though excluding rare alleles did not affect the results.
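The confidence intervals based on a t multiplier can be illustrated as follows. This sketch assumes five population-level values per group (hence four degrees of freedom) and hard-codes the corresponding two-sided 95% critical value; the function name and the example values are hypothetical and are not the study's data.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t with 4 degrees of freedom.
T_CRIT_95_DF4 = 2.776


def t_confidence_interval(values, t_crit=T_CRIT_95_DF4):
    """Mean +/- t * (sample SD / sqrt(n)); t_crit must match n - 1 degrees of freedom."""
    n = len(values)
    mean = statistics.mean(values)
    half_width = t_crit * statistics.stdev(values) / math.sqrt(n)
    return mean - half_width, mean + half_width


# Hypothetical per-population heterozygosity values for five adult samples.
adult_het = [0.93, 0.92, 0.94, 0.93, 0.92]
lower, upper = t_confidence_interval(adult_het)
```

Non-overlapping intervals for adults and recruits would then be read as a significant difference between the two groups.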


Results

General genetic patterns

The mean number of alleles per locus ranged from 20.9 to 30.0 across populations. Allelic richness over all loci, calculated from a minimum sample size of 26 individuals, ranged from 18.5 to 20.5 (Table 1). The observed heterozygosity over all populations and loci was 0.89 and ranged from 0.84 to 0.93. Loci SpGATA16 and SpGATA40 were approximately twice as polymorphic as the other five loci, with 70 and 78 alleles, respectively, sampled throughout the entire Exuma Sound. The number of private alleles per population ranged from zero to seven, with the most being found in Cat Island adults.

There was no evidence for large-allele drop-out or stuttering at any locus or population, as determined by micro-checker. Null alleles were suggested as a possible cause for departure from HWE for two loci at four of the eighteen sample locations. This problem was largely resolved after homozygous individuals were re-genotyped (see Methods). None of the seven loci had more than two significant departures from HWE across all 18 populations, suggesting few null alleles. Most of the occurrences of loci being out of HWE occurred in populations of recruits (Table 1), which is a characteristic of sweepstakes effects (see Discussion). Seventeen of the eighteen populations showed no evidence of linkage disequilibrium (Table 1). One population, 2005 LSI recruits, had a small but significant percentage of loci pairs in linkage disequilibrium.


Remarkably, two parent–offspring pairs were identified, directly documenting self-recruitment at the two northern-most sites in Exuma Sound (Fig. 1). One pair was sampled at Eleuthera (Pr (φ|λ) = 0.036), and the other pair was sampled at the Land and Sea Park (Pr (φ|λ) = 0.011). Despite conducting pair-wise comparisons between all adults and recruits, no parent–offspring pairs between any two sites were detected. Given the relatively small sample sizes, it is remarkable that any parent–offspring pairs were identified, and this result is suggestive of high rates of self-recruitment at the two northern sites.

Evidence for self-recruitment within bicolour damselfish populations located in the Exuma Sound was further bolstered by results from PCoA of pair-wise relatedness values (Fig. 2), where the first principal coordinate explained 23.19% of the total variation and the second principal coordinate explained 18.62% of the total variation. MRPP analysis revealed that it is very unlikely to have observed this overall pattern by chance (T = −12.36, P < 0.001). Although the analysis was performed for all individuals jointly, Fig. 2 is displayed by population for graphical clarity. Thus, the relative positions in two-dimensional ordination space of all individuals within and among populations are accurately depicted. The adults and recruits within each population demonstrate extensive overlap (Fig. 2), which is highly suggestive of self-recruitment within each population and further supports the results from parentage analysis. Additionally, the pair-wise MRPP analyses reveal that no adults and recruits from the same site were significantly different from one another and all within-site comparisons had low effect sizes (Table 2, Appendix 1), where effect size measures the strength of the difference between the two groups (McCune & Grace 2002). Each sample of recruits had a lower effect size when compared to adults from the same sample site than when compared to the average effect size of adults from all other populations (Table 2), which suggested that recruits shared more alleles with adults from their own sampling location than any other location. Additionally, most pair-wise comparisons of effect sizes among sample sites were significant (Appendix 1), indicating differences among sites.

Figure 2.

 Principal coordinates analysis (PCoA) on all pair-wise relatedness values of sampled bicolour damselfish, with results separated by sampling location for clarity. Adults are represented by filled circles and recruits are represented by open circles. Both axes combined explain 42% of the total variation. Note that (i) all recruits cluster in the same multivariate space as adults from the same sampling location, suggesting self-recruitment; and (ii) sites such as Park and Lee Stocking Island, and Eleuthera and Long Island occupy different quadrants suggesting little larval connectivity. LSI 2004 comprised samples collected at String Bean Cay and Big Point (see Fig. 1). Results from multi-response permutation procedure for all sites indicate that the observed distribution of individuals in multivariate space is unlikely to occur by chance (P < 0.0001).

Table 2.   Comparisons of effect sizes between recruits and adults collected from the same site (marked with an asterisk) and between recruits and adults collected from all other sites. Lower effect sizes between adults and recruits collected from the same site suggest self-recruitment. Effect sizes averaged over all between-site comparisons (final column) reveal greater differences between recruits and adults from different sites. Effect sizes were calculated following multivariate analyses of pair-wise relatedness values

Recruit sample   Adult sample                                                  Mean, other sites
                 Cat     Long    Eleuthera   Park    LSI 2004   LSI 2005
LSI 2004         0.169   0.337   0.111       0.366   −0.057*    0.029      0.202
LSI 2005         0.251   0.316   0.274       0.353   0.111      0.142*     0.261

Global (study-wide) FST was low and not significantly different from zero (95% confidence interval: −0.001 to 0.003). Pair-wise FST values among all samples were low (range: 0 to 0.0097) and only one out of 153 pair-wise comparisons was significantly greater than 0 (Appendix 2). Standardized FST (G′ST) values were higher (range: 0 to 0.072), but still suggestive of extensive gene flow as 42 out of 153 pair-wise comparisons were 0 and the average value was 0.022 (Appendix 3). Furthermore, results from structure did not identify more than one population regardless of parameter selections. Assignment tests (i.e. geneclass) lacked power to assign recruits to adult populations, and recruits were consistently assigned to reference populations with larger sample sizes regardless of recruit origin. All of these results are indicative of substantial gene flow among populations at evolutionary timescales.

Sweepstakes reproduction

Recruit allele frequencies at Lee Stocking Island were significantly different between 2004 and 2005 (exact tests, Table 3). Further evidence for this effect is illustrated in Fig. 2, where the recruits from 2004 cluster in the lower left quadrant whereas recruits from 2005 tend to cluster in the upper left quadrant. No within-year comparisons of recruit allele frequencies from Lee Stocking Island were significant. The principal coordinates analysis of pair-wise FST values between all sites, for which both axes explained 44.34% of the total variation, provided further evidence suggestive of sweepstakes effects (Fig. 3). All adult populations were much more similar to one another than the recruit populations were to each other. No recruit populations clustered near the adult populations, suggesting that recruit populations had different allele frequencies from one another and from the adult populations. Additionally, results from randomization tests as implemented in fstat indicated significant differences in FST values between adult and recruit samples (P < 0.033). These differences in allele frequencies are best explained as a striking consequence of spatially independent variance in reproductive success as opposed to recruits coming from distinct natal sources (see below and Discussion).

Table 3.   Patterns of genetic differentiation for sites located near Lee Stocking Island in 2004 (BP, BR, SB) and 2005 (LSI). Pair-wise FST values are below the diagonal. P values for exact tests of allelic differentiation are above the diagonal. Significant tests, after a Bonferroni correction, are indicated in bold. Notice that no within-year comparisons are significant, while all between-year recruit comparisons are significant
              BP adults     BP recruits   BR adults     SB recruits   SB adults     LSI recruits  LSI adults
BP adults     —             0.342         0.219         0.132         0.046         0.099         0.991
BP recruits   0.000         —             0.016         0.286         0.097         0.005         0.637
BR adults     0.003         0.006         —             0.117         0.351         0.000         0.117
SB recruits   0.001         0.001         0.001         —             0.048         0.004         0.487
SB adults     0.000         0.000         0.000         0.002         —             0.000         0.126
LSI recruits  0.001         0.002         0.009         0.005         0.010         —             0.187
LSI adults    0.000         0.005         0.000         0.000         0.001         0.000         —
Figure 3.

 Principal coordinates analysis (PCoA) on all 2005 pair-wise FST values. Adults are represented by filled circles and recruits are represented by open circles. Both axes combined explain 44% of the total variation. Note that (i) all adults cluster together, indicating greater genetic similarity to other adult samples than to the recruit samples; and (ii) all recruit samples differ both from other recruit samples and from adult samples, which is indicative of separate sweepstakes events. Randomization tests as implemented in FSTAT indicated significant differences in FST values between pooled adult and pooled recruit samples (P < 0.033).

Further signatures of a sweepstakes effect were indicated by examining differences in average heterozygosity and average relatedness among adults and recruits. Based on confidence intervals calculated with a t multiplier, the average level of heterozygosity among adult populations was significantly higher than the average heterozygosity among recruits (Fig. 4). Additionally, average relatedness was significantly higher among recruits than adults (Fig. 4), which is a strong indicator of sweepstakes patterns, as recruits coming from a subset of adults would be expected to share more alleles (Hedgecock 1994a,b). Furthermore, randomization tests from FSTAT indicated significant differences in relatedness (P < 0.031) between adult and recruit samples. Unlike analyses with the t multiplier, FSTAT did not detect significant differences in heterozygosity between adults and recruits (P < 0.43), which suggests that this pattern may be less robust. When all recruit samples were pooled, relatedness was negative (mean: −0.0010, 95% CI: −0.0012 to −0.0008). This pattern was not surprising given the dissimilarity among recruit samples displayed in Fig. 3 and suggests that each sample of recruits came from spatially (and possibly temporally) independent reproductive events. Two trends provide additional evidence of sweepstakes-type reproduction: (i) recruit samples tended to have higher levels of linkage disequilibrium among pairs of loci, and (ii) less within-cohort but greater among-cohort genetic diversity was observed in recruits compared to adults.
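The logic of a randomization test comparing a per-sample statistic (such as mean relatedness) between adult and recruit samples can be sketched as follows. This is a generic one-sided permutation test written for illustration, not the FSTAT implementation, and the relatedness values are hypothetical placeholders rather than data from this study.

```python
import random
from statistics import mean


def randomization_test(group_a, group_b, n_perm=5000, seed=1):
    """One-sided permutation test of whether mean(group_b) exceeds mean(group_a).

    Each element is a statistic for one sample (e.g. mean relatedness at a site).
    Sample labels are shuffled between the two groups to build the null
    distribution of the difference in group means.
    """
    rng = random.Random(seed)
    observed = mean(group_b) - mean(group_a)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = mean(pooled[n_a:]) - mean(pooled[:n_a])
        if diff >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never exactly 0.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical per-sample mean relatedness values (illustrative only).
adult_relatedness = [-0.004, -0.002, 0.000, -0.003, -0.001, -0.002]
recruit_relatedness = [0.006, 0.009, 0.004, 0.011, 0.007, 0.008]

observed_diff, p_value = randomization_test(adult_relatedness, recruit_relatedness)
```

Because whole samples rather than individuals are permuted, the test asks whether the adult versus recruit grouping explains more of the difference than an arbitrary grouping of the same samples would.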

Figure 4.

 Mean levels of observed heterozygosity and relatedness for 2005 sample sites. Bars represent 95% confidence intervals and do not overlap. The pattern of reduced genetic diversity and increased relatedness within recruit samples is a distinctive signature of sweepstakes effects.


Despite relatively small sample sizes, two parent–offspring pairs were identified at two widely separated reefs on opposite sides of the Exuma Sound, Bahamas. These findings are remarkable given the large population sizes at the two sites where these fish were identified. Using fore-reef habitat data from the Bahamas Millennium Coral Reef Mapping Project and conservative estimates of adult bicolour damselfish densities (Figueira et al. 2008), we roughly estimate the minimum number of adult bicolour damselfish to be greater than 40 000 at Eleuthera and more than 100 000 at the Land and Sea Park. Given the large population sizes of bicolour damselfish and the low proportion of sampled candidate parents, these results strongly suggest that there are high rates of self-recruitment at these two sites.
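A back-of-the-envelope calculation shows why detecting even one pair is notable. The sketch below assumes that every adult is equally likely to be a parent and that each locally produced recruit has two local parents. Only the adult population size of 40 000 comes from the text; the sample sizes are hypothetical placeholders, and the function names are illustrative.

```python
import math


def expected_pairs(n_recruits, n_adults_sampled, n_adults_total, self_recruit_frac):
    """Expected number of sampled parent-offspring pairs at one site.

    Assumes equal reproductive success among adults (no sweepstakes effect)
    and two local parents for every self-recruited offspring.
    """
    sampling_fraction = n_adults_sampled / n_adults_total
    return n_recruits * self_recruit_frac * 2 * sampling_fraction


def prob_at_least_one(expected):
    """Poisson approximation to the chance of detecting one or more pairs."""
    return 1.0 - math.exp(-expected)


# Hypothetical sample sizes; 40 000 is the minimum adult estimate for Eleuthera.
pairs = expected_pairs(n_recruits=40, n_adults_sampled=40,
                       n_adults_total=40_000, self_recruit_frac=1.0)
chance = prob_at_least_one(pairs)
```

Under these assumptions the chance of sampling a pair stays small even if all recruits are locally produced, and it shrinks in proportion to the self-recruitment fraction. Sweepstakes reproduction would change the calculation, because the parents of a cohort would then be a small subset of adults.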

Self-recruitment to the Exuma Cays Land and Sea Park indicates that larvae produced within this marine reserve settle within its boundaries, which is an important consideration for evaluating the effectiveness of marine reserves and, to date, has only been directly demonstrated in one other marine protected area (Planes et al. 2009). Note that both of the sites with documented parent–offspring pairs occurred in the northern Exuma Sound, suggesting that there may be oceanographic features that facilitate self-recruitment in this region. Indeed, large gyres are known to form in this region during the summer recruitment season (BM Hickey, unpublished data).

Pair-wise relatedness analyses further indicated that self-recruitment may be prevalent within all sites sampled in the Exuma Sound. All of the recruits clustered in the same multivariate space as adults from the same location, which is the pattern that would be expected if there were high levels of self-recruitment. Furthermore, for 34 of 36 possible comparisons of effect sizes, samples of recruits were more similar to adults from the same reef than to adults from other sampled reefs, again suggesting high levels of self-recruitment. These results complement studies indicating behavioral mechanisms that would facilitate larval retention in this species (e.g., Cowen et al. 2000). Because the PCoA of pair-wise relatedness values clusters individuals with shared alleles, this method can also be employed to examine genetic differences between samples. As such, large effect sizes revealed clear demarcations between both adults and recruits from Lee Stocking Island vs. the Land and Sea Park, and Lee Stocking Island vs. Long Island. Also, comparisons of both adults and recruits from Long Island vs. Eleuthera, as well as Cat Island vs. Long Island, reveal large effect sizes, and thus few shared alleles.

Nonetheless, the direct parentage and relatedness analyses must be considered within the context of our indirect analyses. Genetic differentiation as measured by FST, which employs population allele frequencies as opposed to shared alleles between pairs of individuals, was low, and only one pair-wise comparison was significantly different from 0. Fine-scale clustering tools such as STRUCTURE were unable to identify more than one population, which is not surprising given the low levels of genetic differentiation (Manel et al. 2005). Given these patterns, we conclude that there is a background of gene flow to neighboring populations that, over evolutionary time scales, has led to homogeneity in allele frequencies among sites. While it is difficult to determine how this homogeneity translates to connectivity at ecological time scales, the parentage and relatedness results suggest that connectivity among populations may occur less frequently than self-recruitment. Simulation-based exercises, perhaps coupled with estimates of dispersal from larger-scale isolation-by-distance analyses (e.g. Puebla et al. 2009), could uncover quantitative estimates of self-recruitment and connectivity among sites. Adding to the complexity of this system is the fact that highly polymorphic markers, while ideal for parentage analyses, make it difficult to isolate a signal for genetic differentiation from noise (Waples 1998). Additionally, homoplasy and large population sizes may further weaken any signal (Purcell et al. 2009). Nevertheless, the novel and coupled analyses presented here clearly demonstrate that self-recruitment is occurring amidst a background of high gene flow.

We also found that recruitment to bicolour damselfish populations is influenced by processes similar to sweepstakes effects. Our analysis of FST among sample sites in 2005 indicates that each population experienced spatially independent reproductive events (Fig. 3), which further supports the observation of self-recruitment within each population because it suggests that each population is governed by separate processes. It is unlikely that genetic differences among recruit samples resulted from distant or unsampled source populations. This interpretation would require that each sample of recruits came from a genetically distinct source population, which is unlikely given the low overall levels of genetic differentiation observed in this study and across the entire Caribbean (Purcell et al. 2009). We therefore conclude that the striking differences in FST between adults and recruits are the result of differential reproductive success among adults within each population.

Additional evidence of sweepstakes events comes from the observation that the average relatedness value of recruits from each sample site was greater than zero, while the overall relatedness of all pooled recruits was negative. Comparisons at the same site (Lee Stocking Island) from different years yielded significant differences in allele frequencies among recruit samples. Thus, it appears that sweepstakes effects occurred within sites among years as well as among sites within years. Future work should focus on determining the relative importance of temporal versus spatial variation in patterns of larval dispersal.

While we did detect a clear signature of sweepstakes effects in this study system, we do not believe that the magnitude of this effect equals that of other published studies. For example, Hedgecock et al. (2007b) estimated that only 10 to 20 adult oysters produced over 185 sampled offspring. Using our genetic markers, estimates of the effective number of breeders included infinity as both the lower and upper 95% confidence limits. This outcome is likely due to the characteristics of this data set (i.e. sample size, number of loci, etc.) and because such methods are not effective at accurately estimating large effective population sizes (Waples 2006). Because we cannot quantitatively determine the variance in reproductive success, it is difficult to know whether these patterns qualify as bona fide sweepstakes events. Nevertheless, we estimate the total census population size of damselfish in the Exuma Sound to be in the tens of millions. Such large population sizes likely make accurate estimation of the effective number of breeders difficult, even if the effective number of breeders were several orders of magnitude smaller than the census size. Because we have documented characteristic signatures of sweepstakes events (i.e. differences in FST, relatedness, and heterozygosity between adults and recruits), we believe that sweepstakes processes are likely influencing damselfish populations. At any rate, we conclude that only a small portion of the potential parents contribute to subsequent generations of bicolour damselfish in Exuma Sound.

Our demographic data from these same sites reveal that larger male bicolour damselfish have more mates and guard greater numbers of eggs per clutch than smaller males (Johnson et al. unpublished data). High variance in reproductive success among males could contribute to the observed patterns of few adults contributing successfully to subsequent generations. Nevertheless, we cannot eliminate stochastic larval or post-settlement mortality as an underlying mechanism. It is unlikely that selection on our genetic markers caused the observed patterns because selection would not occur across all unlinked markers and is unlikely to act differentially at the small spatial scale of this study.

The combination of novel direct and indirect methods used in this study provides much greater insight into patterns of marine larval dispersal than previous methods. Specifically, we have shown that self-recruitment, sweepstakes effects, and gene flow all play a role in characterizing the patterns of larval dispersal in bicolour damselfish. Detailed knowledge of both within and among population dispersal patterns is vital for improving marine conservation efforts, informing fisheries management, and advancing marine metapopulation theory (Botsford et al. 2009). Incorporating this knowledge into a broader theoretical and socio-political framework will also provide measurable advances towards conservation and management goals.


This paper is a chapter from the doctoral dissertation of M.R. Christie, who thanks his graduate committee: M.A. Hixon (chair), S.J. Arnold, M.A. Banks, M.S. Blouin, and S.S. Heppell. We also thank S.J. Arnold, M.A. Banks, and M.S. Blouin for laboratory space and valuable advice. J.F.H. Purcell warrants special thanks for detailed advice regarding storage, extraction and amplification of damselfish samples. We also acknowledge M. Albins, K. Buch, C. Dahlgren, K. Kingon, A. Krupkin, J. Noell, T. Pusack, C. Searle, I. Phillipsen, and R. Watts for field and laboratory assistance as well as comments from M. Hansen and three anonymous reviewers that greatly improved this manuscript. This work was supported by grants from the National Science Foundation (00–93976, 05–50709, 08–51162) and generous support from Jim and Kala Paul to M.A. Hixon.


Appendix 1

Multi-response permutation procedure (MRPP) results for each pair-wise sample locality comparison. MRPP was calculated on results from PCoA analysis of pair-wise relatedness values (Figure 3) with 10 000 permutations. Observed and expected δ equal the observed and expected weighted mean within-group distance. The probability of observed differences between the groups occurring by chance, P, and corresponding effect size, A, are also reported

Sample locality                         Observed δ  Expected δ  P          A
Long adults         Long recruits       0.013       0.014       0.19629    0.046
Long adults         Park adults         0.013       0.014       0.08208    0.077
Long adults         Park recruits       0.013       0.015       0.06283*   0.088
Long adults         Cat adults          0.017       0.019       0.03681*   0.149
Long adults         Cat recruits        0.015       0.020       0.00000**  0.244
Long adults         Eleuthera adults    0.014       0.019       0.00000**  0.272
Long adults         Eleuthera recruits  0.015       0.020       0.00001**  0.258
Long adults         LSI 2005 adults     0.014       0.019       0.00000**  0.269
Long adults         LSI 2005 recruits   0.013       0.019       0.00000**  0.316
Long adults         LSI 2004 adults     0.016       0.020       0.00007**  0.196
Long adults         LSI 2004 recruits   0.012       0.019       0.00000**  0.337
Long recruits       Park adults         0.013       0.014       0.32366    0.027
Long recruits       Park recruits       0.014       0.014       0.22518    0.042
Long recruits       Cat adults          0.017       0.019       0.09565    0.117
Long recruits       Cat recruits        0.015       0.019       0.00000**  0.219
Long recruits       Eleuthera adults    0.014       0.019       0.00000**  0.242
Long recruits       Eleuthera recruits  0.015       0.020       0.00001**  0.244
Long recruits       LSI 2005 adults     0.014       0.020       0.00000**  0.276
Long recruits       LSI 2005 recruits   0.013       0.019       0.00000**  0.317
Long recruits       LSI 2004 adults     0.016       0.021       0.00000**  0.208
Long recruits       LSI 2004 recruits   0.013       0.019       0.00000**  0.333
Park adults         Park recruits       0.013       0.014       0.19063    0.049
Park adults         Cat adults          0.016       0.019       0.05516    0.138
Park adults         Cat recruits        0.015       0.020       0.00001**  0.244
Park adults         Eleuthera adults    0.014       0.019       0.00000**  0.269
Park adults         Eleuthera recruits  0.015       0.020       0.00000**  0.275
Park adults         LSI 2005 adults     0.014       0.020       0.00000**  0.311
Park adults         LSI 2005 recruits   0.013       0.020       0.00000**  0.353
Park adults         LSI 2004 adults     0.016       0.021       0.00000**  0.241
Park adults         LSI 2004 recruits   0.013       0.020       0.00000**  0.366
Park recruits       Cat adults          0.017       0.019       0.07772    0.117
Park recruits       Cat recruits        0.015       0.020       0.00001**  0.213
Park recruits       Eleuthera adults    0.014       0.019       0.00001**  0.234
Park recruits       Eleuthera recruits  0.015       0.020       0.00002**  0.241
Park recruits       LSI 2005 adults     0.014       0.020       0.00000**  0.279
Park recruits       LSI 2005 recruits   0.013       0.020       0.00000**  0.332
Park recruits       LSI 2004 adults     0.016       0.021       0.00000**  0.210
Park recruits       LSI 2004 recruits   0.013       0.019       0.00000**  0.327
Cat adults          Cat recruits        0.019       0.020       0.09833    0.021
Cat adults          Eleuthera adults    0.017       0.020       0.04793*   0.137
Cat adults          Eleuthera recruits  0.019       0.021       0.09782    0.126
Cat adults          LSI 2005 adults     0.016       0.020       0.00916*   0.157
Cat adults          LSI 2005 recruits   0.016       0.021       0.00002**  0.251
Cat adults          LSI 2004 adults     0.019       0.021       0.15478    0.093
Cat adults          LSI 2004 recruits   0.016       0.019       0.02038*   0.169
Cat recruits        Eleuthera adults    0.016       0.018       0.01330*   0.120
Cat recruits        Eleuthera recruits  0.017       0.018       0.19879    0.055
Cat recruits        LSI 2005 adults     0.015       0.017       0.06603    0.089
Cat recruits        LSI 2005 recruits   0.015       0.019       0.00000**  0.237
Cat recruits        LSI 2004 adults     0.018       0.018       0.39213    0.025
Cat recruits        LSI 2004 recruits   0.014       0.015       0.20457    0.059
Eleuthera adults    Eleuthera recruits  0.016       0.017       0.06504    0.029
Eleuthera adults    LSI 2005 adults     0.015       0.017       0.00686*   0.131
Eleuthera adults    LSI 2005 recruits   0.014       0.019       0.00000**  0.274
Eleuthera adults    LSI 2004 adults     0.017       0.018       0.14267    0.070
Eleuthera adults    LSI 2004 recruits   0.014       0.015       0.03503*   0.111
Eleuthera recruits  LSI 2005 adults     0.015       0.016       0.29469    0.032
Eleuthera recruits  LSI 2005 recruits   0.014       0.018       0.00005**  0.209
Eleuthera recruits  LSI 2004 adults     0.018       0.017       0.74778    −0.047
Eleuthera recruits  LSI 2004 recruits   0.014       0.014       0.66135    −0.028
LSI 2005 adults     LSI 2005 recruits   0.014       0.016       0.09098    0.142
LSI 2005 adults     LSI 2004 adults     0.016       0.016       0.62418    −0.017
LSI 2005 adults     LSI 2004 recruits   0.014       0.014       0.30045    0.029
LSI 2005 recruits   LSI 2004 adults     0.016       0.018       0.03598*   0.110
LSI 2005 recruits   LSI 2004 recruits   0.013       0.016       0.00000**  0.222
LSI 2004 adults     LSI 2004 recruits   0.015       0.015       0.78843    −0.057
All samples                             0.014       0.021       <0.0001**  0.288

* Significant at the 0.05 level.
** Significant after a Bonferroni correction.
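The quantities reported in this appendix can be illustrated with a minimal two-group MRPP sketch. It assumes Euclidean distances between ordination scores, group weights proportional to group size, and a Monte Carlo estimate of the expected δ. These choices, the function names, and the coordinates are illustrative assumptions, not a description of the software used for the analysis.

```python
import math
import random
from itertools import combinations


def mean_within_distance(points):
    """Mean Euclidean distance over all pairs of points in one group."""
    pairs = list(combinations(points, 2))
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)


def mrpp(group_a, group_b, n_perm=2000, seed=1):
    """Return (observed delta, expected delta, P, A) for two groups.

    delta is the group-size-weighted mean within-group distance;
    A = 1 - observed / expected is the chance-corrected effect size.
    Each group needs at least two points.
    """
    rng = random.Random(seed)
    pooled = list(group_a) + list(group_b)
    n_a, n = len(group_a), len(pooled)

    def delta(items):
        a, b = items[:n_a], items[n_a:]
        return (len(a) / n) * mean_within_distance(a) + \
               (len(b) / n) * mean_within_distance(b)

    observed = delta(pooled)  # computed before any shuffling
    permuted = []
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign points to groups at random
        permuted.append(delta(pooled))
    expected = sum(permuted) / n_perm
    # Small tolerance guards against floating-point ties.
    hits = sum(d <= observed + 1e-12 for d in permuted)
    p_value = (hits + 1) / (n_perm + 1)
    return observed, expected, p_value, 1.0 - observed / expected


# Hypothetical PCoA scores for two well-separated samples.
site_1 = [(0.00, 0.00), (0.01, 0.00), (0.00, 0.01), (0.01, 0.01), (0.005, 0.005)]
site_2 = [(0.05, 0.05), (0.06, 0.05), (0.05, 0.06), (0.06, 0.06), (0.055, 0.055)]
obs_delta, exp_delta, p_val, effect_a = mrpp(site_1, site_2)
```

Values of A near 0 mean that within-group distances are no smaller than expected by chance, whereas larger positive values indicate tighter, better-separated groups, which is how the effect sizes in the table are read.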

Appendix 2

Pair-wise FST values for all sample sites (below diagonal) and corresponding P-value after 10 000 randomizations (above diagonal). Significant tests after a Bonferroni correction are indicated in bold. Recruit samples are indicated with an asterisk (*). Negative values are reported as 0

Long        —       0.352   0.453   0.040   0.533   0.646   0.578   0.365   0.900   0.052
Long        0.0061  —       0.847   0.038   0.758   0.309   0.145   0.501   0.994   0.035
Park        0.0054  0.006   —       0.769   0.918   0.717   0.847   0.716   0.985   0.175
Park*       0.0078  0.0087  0.0064  —       0.759   0.532   0.189   0.041   0.657   0.030
Cat         0.0059  0.0058  0.0056  0.0067  —       0.739   0.184   0.769   0.947   0.221
Cat*        0.0062  0.0068  0.0065  0.0076  0.0064  —       0.565   0.637   0.778   0.582
Eleuthera   0.0052  0.0065  0.0048  0.0064  0.0062  0.0061  —       0.899   0.838   0.083
Eleuthera*  0.0074  0.0064  0.0069  0.0097  0.0062  0.0070  0.0059  —       0.929   0.015
LSI         0.0047  0.0046  0.0040  0.0051  0.0041  0.0056  0.0034  0.0058  —       0.788

LSI         —       0.788   0.980   0.995   0.889   0.526   0.712   0.358   0.955   0.323
LSI*        0.0000  —       0.119   0.456   0.087   0.002   0.125   0.012   0.009   0.000
TS          0.0000  0.0043  —       0.279   0.685   0.672   0.522   0.249   0.285   0.279
BP          0.0000  0.0009  0.0040  —       0.604   0.208   0.254   0.210   0.767   0.023
BP*         0.0052  0.0023  0.0032  0.0000  —       0.115   0.397   0.130   0.316   0.006
BR          0.0000  0.0088  0.0001  0.0029  0.0064  —       0.245   0.186   0.339   0.467
SB*         0.0000  0.0047  0.0000  0.0010  0.0007  0.0037  —       0.136   0.710   0.094
CC          0.0000  0.0061  0.0024  0.0000  0.0000  0.0015  0.0007  —       0.287   0.002
SR          0.0000  0.0052  0.0019  0.0000  0.0000  0.0023  0.0000  0.0006  —       0.062

Appendix 3

Standardized pair-wise GST values for all sample sites (below diagonal) and corresponding P-values (which are the same as Appendix 2), after 10 000 randomizations (above diagonal). Recruit samples are indicated with an asterisk (*). Negative values were reported as 0

Long        —      0.352  0.453  0.040  0.533  0.646  0.578  0.365  0.900  0.052
Long        0.045  —      0.847  0.038  0.758  0.309  0.145  0.501  0.994  0.035
Park        0.041  0.040  —      0.769  0.918  0.717  0.847  0.716  0.985  0.175
Park*       0.058  0.060  0.046  —      0.759  0.532  0.189  0.041  0.657  0.030
Cat         0.045  0.040  0.040  0.048  —      0.739  0.184  0.769  0.947  0.221
Cat*        0.047  0.047  0.046  0.053  0.045  —      0.565  0.637  0.778  0.582
Eleuthera   0.042  0.048  0.037  0.048  0.047  0.045  —      0.899  0.838  0.083
Eleuthera*  0.056  0.044  0.050  0.068  0.045  0.049  0.045  —      0.929  0.015
LSI         0.036  0.032  0.029  0.036  0.030  0.040  0.026  0.042  —      0.788

LSI         —      0.788  0.980  0.995  0.889  0.526  0.712  0.358  0.955  0.323
LSI*        0.000  —      0.119  0.456  0.087  0.002  0.125  0.012  0.009  0.000
TS          0.000  0.033  —      0.279  0.685  0.672  0.522  0.249  0.285  0.279
BP          0.000  0.007  0.030  —      0.604  0.208  0.254  0.210  0.767  0.023
BP*         0.036  0.016  0.023  0.000  —      0.115  0.397  0.130  0.316  0.006
BR          0.000  0.066  0.001  0.022  0.046  —      0.245  0.186  0.339  0.467
SB*         0.000  0.033  0.000  0.007  0.005  0.027  —      0.136  0.710  0.094
CC          0.000  0.045  0.018  0.000  0.000  0.011  0.005  —      0.287  0.002
SR          0.000  0.036  0.013  0.000  0.000  0.016  0.000  0.004  —      0.062