Many tropical forest tree species have broad geographic ranges, and fossil records indicate that population disjunctions in some species were established millions of years ago. Here we relate biogeographic history to patterns of population differentiation, mutational and demographic processes in the widespread rainforest tree Symphonia globulifera using ribosomal (ITS) and chloroplast DNA sequences and nuclear microsatellite (nSSR) loci. Fossil records document sweepstakes dispersal origins of Neotropical S. globulifera populations from Africa during the Miocene. Despite historical long-distance gene flow, nSSR differentiation across 13 populations from Costa Rica, Panama, Ecuador (east and west of Andes) and French Guiana was pronounced (FST= 0.14, RST= 0.39, P < 0.001) and allele-size mutations contributed significantly (RST > FST) to the divergences between cis- and trans-Andean populations. Both DNA sequence and nSSR data reflect contrasting demographic histories in lower Mesoamerica and Amazonia. Amazon populations show weak phylogeographic structure and deviation from drift–mutation equilibrium indicating recent population expansion. In Mesoamerica, genetic drift was strong and contributed to marked differentiation among populations. The genetic structure of S. globulifera contains fingerprints of drift-dispersal processes and phylogeographic footprints of geological uplifts and sweepstakes dispersal.
There are both historical and ecological reasons, however, to expect high levels of population differentiation in tropical forest tree species. Tropical forests date to Cretaceous times (Morley 2000; Davis et al. 2005) and have remained relatively stable through climatic fluctuations of the Pleistocene and earlier (Colinvaux et al. 1996; Fine and Ree 2006), thus providing ample time for divergence through isolation-by-distance processes (Wright 1943). Because most tropical moist forest tree species occur at low population densities (<1 ha−1), they may be more susceptible to genetic drift than the relatively common tree species studied in temperate and boreal forests (Fedorov 1966). Moreover, lowland tropical plants are thought to be physiologically sensitive to mild stress of drought and cold, especially to cool temperatures associated with increasing elevations (Janzen 1967). This combination of factors could promote a latitudinal gradient with stronger phylogeographic structure in tropical than temperate plants, as has been found in a recent survey of extra-tropical terrestrial and marine taxa (Martin and McKay 2004). In support of this idea, several recent studies have documented phylogeographic structure for Neotropical trees sampled around geographic barriers (Aide and Rivera 1998; Cavers et al. 2003; Dick et al. 2003; Novick et al. 2003) and higher FST in tropical trees than in their temperate zone or boreal forest counterparts (Hamrick et al. 1992; Dick et al. 2008).
In this study, we tested patterns of population differentiation, mutational and demographic processes to gain information on the biogeographic history of the widespread tropical forest tree Symphonia globulifera L. (Clusiaceae) at a regional scale within Panama, and at the continental scale in Central and South America. Symphonia globulifera is unusual among tropical forest tree species in that it has an extensive fossil record, which reveals a demographic history of sweepstakes dispersal from Africa and the invasion of several Neotropical forest domains. We expand upon a single-locus phylogeographic analysis based on the ribosomal Internal Transcribed Spacer (ITS) (Dick et al. 2003) with data from the chloroplast trnH-psbA intergenic spacer and expanded ITS sampling. Furthermore, we analyzed five nuclear SSRs in 13 populations of this species from Costa Rica, Panama, Ecuador (east and west of Andes) and French Guiana to infer genetic clusters, to test vicariance patterns based on the depth of SSR divergence and to look for signals of bottlenecks or population expansions testing SSR mutation-drift equilibrium. The results are placed in a context of fossil and regional landscape history, and contrasted with studies of other tropical and temperate zone tree species. The rate of Symphonia's geographic expansion through species-rich Neotropical forests is compared to expectations based on Hubbell's (2001) neutral theory of community ecology and biogeography.
Materials and Methods
Symphonia globulifera L. f. (Clusiaceae) is a shade-tolerant rainforest tree species that is broadly distributed across the Neotropics and equatorial Africa. It is the only recognized species in its genus found outside of Madagascar, which harbors 16 Symphonia species (Abdul-Salim 2002). Nectar-feeding birds, including hummingbirds (Bittrich and Amaral 1996; Gill et al. 1998) pollinate the small scarlet flowers (Fig. 1). The large (4–5 cm) drupes are consumed and dispersed by bats and monkeys (Aldrich and Hamrick 1998). The species is usually > 90% outcrossed (Degen et al. 2004; da Silva Carneiro et al. 2007), although higher levels of self-fertilization (>10%) have been documented in disturbed habitats (Aldrich and Hamrick 1998). Symphonia globulifera occurs in low population densities (<1 adult tree ha−1) across much of its Neotropical range (Center for Tropical Forest Sciences, forest inventory plot data). It can be identified in the field by the combination of opposite simple leaves, bright yellow latex, and aerial roots (Fig. 1). There is some morphological variation across the Neotropical range; although S. globulifera are typically large canopy trees, some populations in Costa Rica occur only as understory treelets with an adult reproductive size of less than 10 cm diameter at breast height (dbh). In French Guiana, trees with large leaves and large flowers are sympatric with the common form, and are treated as separate species by local workers (Baraloto et al. 2007). None of this morphological variation has yet been considered sufficient to merit splitting of S. globulifera into more than one Neotropical species (Abdul-Salim 2002).
The genus Symphonia has morphologically distinctive fossil pollen (fossil taxon Pachydermites diederexi) used by the oil industry for stratigraphic dating. Hence, there is an unusually well-sampled fossil record for S. globulifera in Africa and the Neotropics. The combined fossil and DNA evidence indicates that S. globulifera migrated from Africa to the Neotropics via sweepstakes dispersal (Dick et al. 2003). The earliest P. diederexi appears in mid-Eocene deposits (ca. 45 Ma) in Nigeria (Jan-du-Chene et al. 1978). After a 30 million year fossil record in equatorial Africa, P. diederexi appeared abruptly in Miocene sediments off the coast of Venezuela (Germeraad et al. 1968) and Brazil (Regali et al. 1974). It is reported from mid-Pliocene sediments in Mexico (Graham 1976, pers. comm.) and Southeast Costa Rica (Graham and Dilcher 1998).
For the nSSR dataset, approximately 30 individuals were sampled in each of eight populations from Panama, two from Ecuador from either side of the Andes, two from Costa Rica, and one from French Guiana (Table 1). The populations in Costa Rica, Panama, and West Ecuador were sampled along transects with sufficient spacing to avoid sampling closely related individuals. In Paracou (French Guiana), Barro Colorado Island (BCI, Panama), and Yasuní (Amazonian Ecuador), individuals were randomly sampled (using a random sorting of tree ID tags) from forest inventory plots. The Paracou site consists of 15 permanent inventory plots of 6.25 ha, and a 25-ha plot (plot 16) of mapped trees ≥10 cm dbh. The BCI and Yasuní forest inventory plots consist of all trees ≥1.0 cm dbh in 50 ha (Losos and Leigh 2004). The Costa Rican populations represent the distinct understory form of S. globulifera. The Paracou plot contains some trees with the large-leaf and large-flower morphotype. The samples for the nSSR analysis were taken from the small-leafed form, although individuals of both forms were included in the phylogeographic analysis. The DNA sequence-based phylogeographic analyses included samples from the above-mentioned populations (N≤ 10 per site) and supplemental field collections or herbarium accessions from Honduras, Nicaragua, Bolivia, West Indies, and Peru for a total of 20 Neotropical locations (Table 1).
Table 1. Symphonia globulifera study populations. Site abbreviation is indicated in parentheses in Column 2. Column 3 (SSR) is population size for SSR-based analyses. Column 4 (phylo) indicates sample size for DNA sequence-based phylogeographic analyses.
Pipeline Road (PR), Panama
Santa Rita ridge (SR), Panama
Cerro Campana (CC), Panama
Fort Sherman (FS), Panama
Cerro Jefe (CJ), Panama
Veraguas (VER), Panama
El Cope, Panama
Barro Colorado Island (BCI), Panama
Chilamate, Sarapiqui (FCH), Costa Rica
La Selva (LS), Costa Rica
Paracou (FG), French Guiana
Yasuní, Amazonas (YAS), Ecuador
Muisne, Esmeraldas (Esm), Ecuador
Puerto Viejo, Sarapiqui, Costa Rica
Blue Fields, Nicaragua
Los Amigos, Peru
Valle del Sajta, Bolivia
Leaf samples were dried in silica gel in the field or stored in liquid nitrogen prior to DNA extraction, which was performed using DNeasy kits (Qiagen Corporation, Valencia, CA). The DNA aliquot from each tree was assigned an identification number (Lab ID), which is linked to information on geographic location, herbarium label information, or tag numbers for trees sampled in permanent inventory plots (Appendix).
Of the 15 published SSR primer pairs developed for S. globulifera populations in Costa Rica (Aldrich et al. 1998), French Guiana (Degen et al. 2004), and Brazil (Vinson et al. 2005), only five primer pairs (Table 2) consistently amplified across the geographic range represented by the study populations. The Sg92 primers used in this study (designed by Henri Caron, INRA, Bordeaux, France) amplify the same locus as the Sg18 primers published by Degen et al. (2004): CCCACCAAACACAAATCAT (Sg92f) and GTTGAGGATTGTTTGCCCAG (Sg92r). The polymerase chain reaction (PCR) cocktail contained 1× QIAGEN buffer, 3.75 mM MgCl2, 100 μM of each dNTP, 2.0 μM of each primer, 0.25 units of QIAGEN Taq polymerase, and 1.0 μl of undiluted template DNA. The thermal cycle was 5 min at 94°C; 25 cycles of 45 sec at 94°C, 1.0 min at 55°C, and 30 sec at 72°C; ending with 15 min at 72°C. The SSR primers were labeled with fluorescent dyes (TAMRA, NED, ROX, 6-FAM) and electrophoresed on an ABI 3700 automated DNA sequencing machine with Liz 500 size standard. Genotypes were scored using STRand version 2.2 (M. Locke, E. Baack and R. Toonen, University of California).
Table 2. Gene diversity of five nuclear SSR loci developed for S. globulifera. Column 2 refers to geographic origin of SSR isolates (Costa Rica, French Guiana, Brazil) and reference: (1) Aldrich et al. (1998), (2) Degen et al. (2004), (3) Vinson et al. (2005), (4) this article. N, number of genes sampled; K, total number of alleles; Size (var), mean allele size and variance (var); HE, Nei's (1978) gene diversity; FIS, Wright's inbreeding coefficient; FIS≠0 refers to the number of populations (out of the number of populations genotyped at each locus) showing significant departure from Hardy–Weinberg genotypic proportions.
SSR DIVERSITY AND INBREEDING
For each locus, we recorded the number of alleles along with their sizes and computed Nei's (1978) gene diversity and Wright's inbreeding coefficient using SPAGeDI version 1.2g (Hardy and Vekemans 2002). Within populations we evaluated the same statistics as well as allelic richness, a measure of the number of alleles standardized for sample size, using Fstat version 2.9.3 (Goudet 1995). Departure from Hardy–Weinberg expectations was tested with exact tests in GENEPOP v. 3.3 (Raymond and Rousset 1995).
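As an illustration of the per-locus statistics named above, the following minimal Python sketch computes Nei's (1978) unbiased gene diversity and a simple inbreeding coefficient from diploid genotypes. The function names and the toy genotypes are hypothetical, and the estimator FIS = 1 − HO/HE is a simplifying assumption; SPAGeDI and Fstat may use different estimators.

```python
from collections import Counter

def gene_diversity(genotypes):
    """Nei's (1978) unbiased gene diversity for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    """
    alleles = [a for pair in genotypes for a in pair]
    n = len(alleles)  # number of gene copies sampled (2 per individual)
    counts = Counter(alleles)
    sum_p2 = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_p2)

def observed_heterozygosity(genotypes):
    """Proportion of individuals carrying two different alleles."""
    return sum(a != b for a, b in genotypes) / len(genotypes)

def f_is(genotypes):
    """Simple inbreeding coefficient, assumed here to be 1 - HO/HE."""
    return 1 - observed_heterozygosity(genotypes) / gene_diversity(genotypes)

# Toy allele sizes in bp (illustrative only, not data from this study)
toy = [(100, 102), (100, 100), (102, 104), (104, 104)]
print(gene_diversity(toy), f_is(toy))
```

A positive value indicates a deficit of heterozygotes relative to Hardy–Weinberg expectations, which in SSR data can reflect null alleles as well as inbreeding.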
BAYESIAN CLUSTER ANALYSIS OF SSR GENOTYPES
The program STRUCTURE version 2.2 (Pritchard et al. 2000) was run 10 times on individual multilocus SSR genotypes for a number of clusters K ranging from 1 to 20, using a burn-in length of 50,000 and a run length of 500,000 iterations. We used the admixture model without prior information on sample population membership and allowed allele frequencies to be correlated among clusters (Falush et al. 2003). The likelihood for the data given each of K clusters was recorded. Because it increased steadily with increasing K without displaying a clearly highest value for any K, the most likely number of clusters was inferred with the ΔK statistic of Evanno et al. (2005). For the best inferred K, the proportion of ancestry Q in each cluster of each sampling location (sample population) was computed as the average proportion of ancestry over individuals and plotted on a map. To further explore the genetic structure, we ran a mixture analysis (Manel et al. 2005) on population samples in BAPS, a software for Bayesian Analysis of Population Structure (Corander et al. 2003), using either a uniform prior over populations or considering their spatial coordinates (“spatial clustering of groups” option) and testing 10 replicates of 1 ≤K≤ 13 groups.
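The ΔK approach can be sketched as follows. This is a minimal illustration, assuming that the second-order difference is taken on the mean LnP(D) across replicate runs and divided by the sample standard deviation of LnP(D) at that K; the function name and the input values are hypothetical and are not results from this study.

```python
from statistics import mean, stdev

def delta_k(lnp):
    """Compute a ΔK-style statistic for each interior value of K.

    lnp: dict mapping K -> list of LnP(D) values from replicate runs.
    Returns a dict mapping K -> ΔK (only for K with neighbours K-1 and K+1).
    """
    avg = {k: mean(v) for k, v in lnp.items()}
    result = {}
    for k in sorted(lnp):
        if k - 1 in avg and k + 1 in avg:
            # absolute second-order rate of change of the mean likelihood
            second_diff = abs(avg[k + 1] - 2 * avg[k] + avg[k - 1])
            sd = stdev(lnp[k])
            if sd > 0:  # ΔK is undefined when replicates are identical
                result[k] = second_diff / sd
    return result

# Illustrative LnP(D) values for two replicates per K (not study data)
runs = {
    1: [-1000.0, -1002.0],
    2: [-900.0, -902.0],
    3: [-700.0, -702.0],
    4: [-690.0, -692.0],
    5: [-685.0, -687.0],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(best_k)
```

The statistic highlights the K at which the likelihood stops increasing steeply, which is useful when LnP(D) itself rises steadily without a clear maximum.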
TESTS OF SSR DIFFERENTIATION, MUTATIONAL, AND DEMOGRAPHIC PROCESSES
After defining the global pattern of population structure with the above methods, differentiation statistics (FST, RST) were computed between pairs of sampling locations or inferred clusters, and overall, using SPAGeDI version 1.2g (Hardy and Vekemans 2002). The null hypothesis of no population differentiation was tested for overall FST and RST using 10,000 permutations of individuals among locations or clusters in SPAGeDI, and for pairwise FST using exact tests in GENEPOP. The presence of phylogeographic structure, i.e., whether alleles within populations were more related than alleles in the overall sample, was tested by comparing RST with its value after permuting allele sizes within loci (“permuted RST”) using SPAGeDI (10,000 permutations). A significant one-sided test establishes the alternative hypothesis of RST > “permuted RST.” This means that allele size mutations contributed to population differentiation and can be interpreted as phylogeographic structure (Hardy and Vekemans 2002). Sequential Bonferroni corrections (Rice 1989) were applied to significance levels of multiple tests.
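The sequential Bonferroni correction mentioned above can be sketched in a few lines of Python. This is a minimal illustration of the step-down procedure, in which the smallest P-value is compared with α/m, the next with α/(m − 1), and so on until the first nonsignificant test; the function name and the example P-values are hypothetical.

```python
def sequential_bonferroni(p_values, alpha=0.05):
    """Step-down sequential Bonferroni correction.

    Returns a list of booleans in the same order as p_values;
    True marks tests that remain significant after correction.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    significant = [False] * m
    for rank, idx in enumerate(order):
        # the threshold relaxes as smaller P-values are accepted
        if p_values[idx] <= alpha / (m - rank):
            significant[idx] = True
        else:
            break  # all larger P-values are declared nonsignificant
    return significant

# Illustrative P-values (not results from this study)
print(sequential_bonferroni([0.001, 0.04, 0.03, 0.012]))
```

Compared with the standard Bonferroni correction, the sequential version retains more power because only the smallest P-value faces the full α/m threshold.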
The demographic history of a population leaves a signature in its allele frequency distributions: a reduction in effective population size (population bottleneck) leads to the preferential loss of rare alleles, whereas new alleles accumulate under population growth. When rare alleles are lost, the frequencies of the remaining alleles are more evenly distributed than under drift–mutation equilibrium, which results in a gene diversity value higher than expected under drift–mutation equilibrium with the same number of alleles. The program BOTTLENECK (Piry et al. 1999) permits testing for such “heterozygosity excess,” and for “heterozygosity deficit” in the converse case of population expansion, by comparing gene diversity values HE to their expected values based on the number of alleles, HA, obtained from coalescent simulations using an infinite allele model (IAM) or a stepwise mutation model (SMM) adapted to microsatellites. To test for deviations from equilibrium demographic processes in sampling locations and genetic clusters of S. globulifera, we computed the T2 statistic of Cornuet and Luikart (1996), which represents an average over loci of standardized deviates for heterozygosity, and tested its significance with the Wilcoxon signed ranks test (Piry et al. 1999). We used T2 under the SMM model because alleles at our loci were separated by multiples of 2 bp, with the exception of a single allele at locus SG19.
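The logic of the standardized deviates can be sketched as follows. This is a minimal illustration, not the BOTTLENECK implementation: it assumes that equilibrium gene diversities for each locus have already been simulated elsewhere (the coalescent simulations are not reproduced here), and it assumes that the per-locus deviates are combined as their sum divided by the square root of the number of loci. The function names and all numerical values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def standardized_deviates(he_observed, he_simulated):
    """Per-locus standardized deviates of observed gene diversity.

    he_observed: list of observed HE values, one per locus.
    he_simulated: list of lists; for each locus, HE values simulated
        under mutation-drift equilibrium given the observed number of alleles.
    """
    deviates = []
    for obs, sims in zip(he_observed, he_simulated):
        deviates.append((obs - mean(sims)) / stdev(sims))
    return deviates

def t2_statistic(he_observed, he_simulated):
    """Combine deviates across loci (assumed scaling: sum / sqrt(n loci))."""
    d = standardized_deviates(he_observed, he_simulated)
    return sum(d) / sqrt(len(d))

# Illustrative values only (not data from this study)
observed = [0.9, 0.9]
simulated = [[0.6, 0.7, 0.8], [0.7, 0.8, 0.9]]
print(t2_statistic(observed, simulated))
```

Under these assumptions, a positive value corresponds to heterozygosity excess (consistent with a bottleneck) and a negative value to heterozygosity deficit (consistent with expansion).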
DNA SEQUENCE ANALYSIS
The ITS region (ITS1, ITS2, and 5.8S ribosomal gene) was amplified using the ITS4 (White et al. 1990) and ITSi primers (Urbatsch et al. 2000). TrnH-psbA was amplified using primers developed for plant DNA barcoding (Kress et al. 2005). The PCR contained 1× QIAGEN buffer, 3.75 mM MgCl2, 100 μM of each dNTP, 2 μM of each primer, 0.25 units of Taq polymerase (Qiagen), and 1.0 μl of undiluted template DNA. The thermal cycle for ITS and trnH-psbA was 94°C for 4.0 min, followed by 30 cycles of 94°C for 45 sec, 55°C for 45 sec and 72°C for 3 min. PCR products were extracted from low melting point agarose using GELASE (Epicentre Biotechnologies, Madison, WI) prior to sequencing. Forward and reverse DNA strands were sequenced using BIG DYE chemistry (Applied Biosystems Incorporated [ABI], Foster City, CA) on an ABI 3700 capillary sequencer. DNA chromatograms were aligned and edited using SEQUENCHER 4.1 (Gene Codes Corporation, Ann Arbor, MI), MacClade 4.0 (Maddison and Maddison 2000), and Se-Al version 2.0 (Rambaut 1996). Indels and multiple-nucleotide substitutions were coded as single changes. Haplotype networks were created in PAUP* 4.0b10 (Swofford 1998) using neighbor joining and in TCS (Clement et al. 2000) using statistical parsimony. Indices of haplotype diversity (h) and nucleotide diversity (π) were obtained using DnaSP 4.0 (Rozas et al. 2003). Analysis of molecular variance (AMOVA) (Excoffier et al. 1992) was used to estimate the within- and among-population components of nucleotide diversity (ΦPT) using GenAlEx (Peakall and Smouse 2006).
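The two sequence diversity indices can be illustrated with a short Python sketch. It assumes aligned sequences of equal length, counts every mismatching column as one difference (so it does not reproduce the single-change coding of indels and multiple-nucleotide substitutions described above), and uses the usual sample-size correction n/(n − 1) for haplotype diversity. Function names and sequences are hypothetical.

```python
from collections import Counter
from itertools import combinations

def haplotype_diversity(seqs):
    """Haplotype diversity h = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    counts = Counter(seqs)
    sum_p2 = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_p2)

def nucleotide_diversity(seqs):
    """Nucleotide diversity: mean pairwise differences per site."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total_diffs / len(pairs) / length

# Toy alignment (illustrative only, not sequences from this study)
toy = ["ACGT", "ACGT", "ACGA", "TCGA"]
print(haplotype_diversity(toy), nucleotide_diversity(toy))
```

Dividing the mean number of pairwise differences by the alignment length expresses π per site, which makes values comparable between loci of different lengths such as ITS and trnH-psbA.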
SSR DIVERSITY AND INBREEDING
Multilocus SSR genotypes were obtained for 380 individuals from 13 S. globulifera populations, although some loci did not amplify consistently in all populations. Amplification problems occurred for SgC4 in Paracou, Sg03 in Esmeraldas, and Sg06 in both Chilamate and Esmeraldas and those locus/population combinations were coded as missing data. The mean number of alleles per locus was 26.4 (range: 19–41) for a total of 132 alleles (Table 2). Expected heterozygosity (HE) per locus ranged from 0.822 (Sg03) to 0.921 (Sg06) (mean HE= 0.887). Multilocus HE ranged from 0.677 (Esmeraldas) to 0.881 (Paracou). The global inbreeding coefficient was significantly positive (FIS= 0.087; RIS= 0.214) and single population estimates ranged from −0.084 (Chilamate) to 0.253 (Paracou) (Table 3). All of the loci displayed deviation from Hardy–Weinberg expectations in at least one population (Table 2), but deviations were not consistent across loci within populations, which suggests that the homozygote excess was caused by null alleles rather than inbreeding. Exceptions were BCI and Paracou where mating system might have an effect on consanguinity because both had three loci with significantly positive FIS.
Table 3. SSR diversity values per population. N, average number of genes sampled/locus; A, mean number of alleles per locus; AR, allelic richness or number of alleles expected in a sample of 14 individuals; HE, expected heterozygosity; FIS, Wright's inbreeding coefficient and probability of departure from Hardy–Weinberg genotypic proportions; T2(SMM), T2 statistic of Cornuet and Luikart (1996) under the stepwise mutation model and probability of heterozygosity excess/deficit; Nc, not computed; ns, not significant; ***, P≤0.001; **, P≤0.01; *, P≤0.05.
1data from loci 1–4;
2data from loci 1, 3–5;
3data from loci 1–3.
Alto de la Piedra
BAYESIAN CLUSTER ANALYSIS OF SSR GENOTYPES
Run lengths of 500,000 iterations with the STRUCTURE algorithm were deemed sufficient because run parameters such as the log likelihood of data LnP(D) stabilized within 100,000 iterations for all tested values of the number of clusters K (1 ≤K≤ 20). The true number of clusters K in the data was difficult to determine empirically because LnP(D) steadily increased as K increased until a plateau was reached at approximately K= 11 (Fig. 2.A). The ΔK statistic of Evanno et al. (2005), however, permitted detection of a rate change in LnP(D) corresponding to K= 3 (Fig. 2.B). This type of clustering made geographical sense as it essentially separated the Esmeraldas population on the Pacific side of the Andes (mostly Cluster 01, Fig. 3) from the cis-Andean populations (mostly Cluster 03, Fig. 3). Both Costa Rican populations had the highest proportion of ancestry, over 76%, in Cluster 03 as well. In Panama, an additional gene pool, Cluster 02, was detected. It was predominant with over 84% ancestry in two populations (Cerro Campana and Santa Rita ridge), but most Panamanian populations were admixed between Cluster 01 and Cluster 02 with smaller shares of the cis-Andes-Costa Rica Cluster 03. Evanno et al.'s (2005) ΔK detected a further rate change in LnP(D) at K= 11 (Fig. 2.B). With K= 11, individuals essentially clustered into their population of origin, except that both Costa Rican populations shared one gene pool, and the Panamanian populations Fort Sherman and BCI another (results not shown). The strong pattern of among-population structure was confirmed with BAPS, where both spatial and nonspatial mixture analysis resulted in a best partition of sampling locations into 13 clusters, one per population (results not shown).
TESTS OF SSR DIFFERENTIATION, MUTATIONAL, AND DEMOGRAPHIC PROCESSES
On the basis of these results, we computed differentiation statistics directly between sampling locations, which revealed strong genetic differentiation, with FST= 0.138 (P≤ 0.001) and RST= 0.391 (P≤ 0.001). When individual multilocus genotypes were attributed to the STRUCTURE cluster (K= 3) in which they had > 70% ancestry, differentiation was slightly weaker with FST= 0.084 (P≤ 0.001) and RST= 0.352 (P≤ 0.001) among clusters. Phylogeographic structure was detected in both cases (RST > permuted RST, P≤ 0.001). For STRUCTURE results with K= 3, this was due to a significant influence of allele size mutations between the cis-Andes-Costa Rica Cluster 03 and both other clusters (Cluster 03 vs. Cluster 01: RST= 0.469, Cluster 03 vs. Cluster 02: RST= 0.442, RST > permuted RST with P≤ 0.001 in both cases), but not between the trans-Andes Cluster 01 and the Panamanian Cluster 02 (RST= 0.026). The same phylogeographic pattern is reflected between pairs of populations, where comparisons involving populations from cis-Andes or Costa Rica on the one hand and Panama on the other resulted in significant RST > permuted RST (Table 4). All pairs of populations were strongly and significantly differentiated based on FST (P < 0.01 for all pairwise comparisons, Table 4). Strong differentiation was also found at short spatial scales within Panama, with overall FST= 0.111 (P≤ 0.001) and RST= 0.149 (P≤ 0.001), but no contribution of mutations to differentiation was detected (Table 4).
Table 4. Pairwise measures of differentiation between populations. Above diagonal: FST, all values are significant at P<0.01 after multiple test correction; below diagonal: RST, for shaded values RST>permuted RST at P<0.05 after multiple test correction, indicating that mutations contributed to population differentiation.
French Guiana FG
East Ecuador YAS
West Ecuador Esm
A total of 34 population-specific alleles were detected, of which 32 were private to populations from the cis-Andes-Costa Rica Cluster 03: La Selva had 19 private alleles, Chilamate 5, Paracou 5, and Yasuní 3. The Panamanian population El Cope had the two remaining private alleles. An accumulation of new mutations can be a sign of a recent population expansion. Results from the BOTTLENECK program (Table 3) support this with a negative average for the T2 statistic in cis-Andean and Costa Rican populations (T2=−0.623) and a significant heterozygosity deficit (for individuals with >70% ancestry) in the cis-Andes-Costa Rica Cluster 03 (T2=−1.767, P= 0.016). On the other hand, T2 was on average positive in trans-Andean and Panamanian populations (T2= 0.613), and the heterozygosity excess was significant in Fort Sherman, Cerro Jefe, and El Cope, but also in the Costa Rican La Selva population, which may reflect local bottlenecks.
DNA SEQUENCE VARIATION
The aligned ITS region sampled from 102 individuals of S. globulifera was 645 base pairs in length (range: 645–753 bp). The ITS dataset contained eight haplotypes and 42 polymorphic sites, 15 of which were parsimony informative. Indels were found in the cis-Andean haplotype (5-bp deletion), coastal Ecuador (1 bp insertion), and Belize-Nicaragua (1 bp insertion). The average number of nucleotide differences (k) among haplotypes was 4.05 bp. Nucleotide diversity (π) was 0.00815. Consistent with previous analyses (Dick et al. 2003) Mesoamerican ITS haplotypes clustered together in the parsimony network and were fairly differentiated from a haplotype found in Dominica and from the single cis-Andean haplotype, which occurred over large distances (> 2500 km) from French Guiana to Peru and Bolivia. There was no ITS haplotype diversity within populations in any Neotropical location, despite examination of relatively large samples of trees from locations such as BCI, Panama (N= 18) and Yasuní, Ecuador (N= 15). The same haplotype was found across moderate distances within some of the sampling areas shown as circles in Figure 4: Belize [20 km between samples]; Western Ecuador [20 km]; Manaus, Brazil [60 km]. The 36 additional ITS sequences of this study increased the geographic breadth of the previous analysis to Peru, Honduras, and Nicaragua, but did not reveal new haplotypes in these countries. One novel ITS haplotype was encountered in northeastern Panama (Nusagandi; Fig. 4 haplotype 5; N= 2) and is nested within the Mesoamerican ITS clade.
The 574 bp of aligned trnH-psbA sequences from 153 individuals (Appendix) contained 20 haplotypes (excluding cpSSR variation; Fig. 5). TrnH-psbA variation consisted of single nucleotide substitutions (N= 23); a tri-nucleotide substitution; single nucleotide indels (N= 5); and indels of six bp (N= 1), seven bp (N= 1), 16 bp (N= 1), and 22 bp (N= 1). The trnH-psbA locus also contained a poly-T site (cpSSR) with eight length variants at position 355 to 375. Excluding indels, the average nucleotide difference among haplotypes was 6.02 bp and nucleotide diversity (π) was 0.01109. Haplotypes from the Amazonian basin and French Guiana clustered together in the parsimony network, as did haplotypes from Mesoamerica. One Paracou haplotype did, however, cluster with the Mesoamerican group. There were 11 trnH-psbA haplotypes in the 13 populations sampled for SSR analyses (133 individuals), with a total of 41 polymorphisms. In these 13 populations, 95% of the haplotype diversity was distributed among populations (ΦPT= 0.953, P < 0.001). BCI and Paracou each harbored two haplotypes, and in Paracou the two haplotypes were distributed among both large- and small-leaved trees. Only three trnH-psbA haplotypes were shared among proximal populations ([BCI, Sherman], [La Selva, Chilamate], [BCI, Santa Rita, Pipeline Road]).
MUSEUMS AND CRADLES
The geographic differentiation of S. globulifera populations contrasts with patterns in temperate zone trees such as the European ash Fraxinus excelsior (Heuertz et al. 2004). Using a similar sampling design and five SSR loci (275 total alleles), the authors found no evidence of allele size variance and only very weak differentiation (FST= 0.023) among populations spanning 2000 km from the British Isles over central Europe to the Baltic States. In addition to life-history differences such as wind-mediated pollen and seed dispersal of Fraxinus, there is a major historical difference between the Symphonia and Fraxinus comparison: the common ash has occupied most of its present distribution only in the past 10,000 years, whereas S. globulifera has occupied parts of its Neotropical distribution for over 10 million years.
The high levels of genetic differentiation across the range of S. globulifera are due in some part to the age of the species, which permitted it to be subject to both successful sweepstakes dispersal and to subdivision by emerging geographic barriers. Symphonia globulifera falls into one end of a spectrum of tropical tree ages and geographic distributions. Based on a fossil-calibrated molecular phylogeny of the species-rich Neotropical tree genus Inga (Richardson et al. 2001), which comprises nearly 300 lowland forest species, Lavin (2006) estimated the age of the Inga crown clade at less than 2 million years. Many Inga species are widespread in the Neotropics, and their ancestral populations should be much younger than those of codistributed Symphonia. Thus, compared to nearly 300 species of Inga, S. globulifera is a living fossil. Inga, however, may be the youthful outlier in the spectrum of ages of tropical tree lineages. Like Symphonia, which is monotypic in the Neotropics, most Neotropical tree genera contain relatively few species (Bermingham and Dick 2001). The cross-Andean disjunction of many widespread rainforest plant species also suggests antiquity. In Ecuador, approximately 1432 lowland plant species (∼30% of Ecuador's flora) have population disjunctions east and west of the Andes (Jørgensen and León-Yánez 1999) and experience no contemporary gene flow. Raven (1999) has suggested that these disjunctions derive from vicariance by the Andean uplift several million years ago. Although S. globulifera populations are old enough to fit this model, recent groups such as Inga and some well-dispersed species, such as the kapok tree Ceiba pentandra (Dick et al. 2007), appear to have recently dispersed around the Andes. Further analyses focused on genetic divergences of lowland populations around the Andes may help determine if most widespread species are good dispersers with weak phylogeographic population structure, like Inga and C. pentandra, or old and dispersal limited within terrestrial rainforest biomes, like S. globulifera.
Although its seeds are not adapted for water dispersal, long-distance marine dispersal has played a primary role in the migration and establishment of S. globulifera in the Neotropics. Symphonia globulifera fossils in Central and South America pre-date the Pliocene emergence of the Panama land bridge, and the West Indian island of Dominica was never connected to the mainland. The colonization of these three regions may derive from a single dispersal event from Africa followed by oceanic dispersals within the Neotropics, or from multiple oceanic dispersal events originating from Africa. The nSSR, ITS, and trnH-psbA data show an association between trans-Andean Ecuador and Mesoamerica indicative of marine or terrestrial connections between the continents.
An unanticipated result of the nSSR Bayesian cluster analysis was the finding that Costa Rican populations share more recent ancestry with cis-Andean (Fig. 2) than with trans-Andean populations of west Ecuador and Panama. The relationship is corroborated by low pairwise FST-based estimates between Costa Rican and cis-Andean populations suggesting a marine connection between Costa Rica and South America that post-dates the establishment of S. globulifera elsewhere in Mesoamerica. Alternatively, a cis-Andes-Mesoamerican gene pool homogenized through marine or aerial gene flow could have become disjoint when a differentiated trans-Andes gene pool immigrated into Mesoamerica after closure of the Panama isthmus. The coancestry of Costa Rica and cis-Andean populations is not supported by ITS or trnH-psbA data, however, which indicate genetic divergence of the Costa Rican populations but no link to South American populations. Events of chloroplast capture and losses of ITS haplotypes through concerted evolution (Hughes et al. 2005) could account for discrepancies between the nSSR and sequence datasets. Another scenario is that alleles of equal length in Costa Rica and Amazonia attained similar frequencies by chance. If populations in either region established as small founder populations following oceanic dispersal, the latter hypothesis may in fact be most realistic.
POPULATION EXPANSION AND OTHER DEMOGRAPHIC PROCESSES
Following its abrupt fossil appearance in the Neotropics during the early to mid Miocene, S. globulifera invaded species-rich rainforests in all the major forest domains of the Neotropics, including Mesoamerica, Chocó, Amazon basin, Guiana shield, and the Atlantic forests of Brazil; S. globulifera thus coexists with several thousand tree species across its Neotropical range. Symphonia globulifera expanded through Neotropical forests with great speed in light of expectations based on Hubbell's (2001) neutral theory, which was formulated in part to explain the generation and maintenance of tropical forest tree diversity. Under the neutral theory, propagules of coexisting tree species arrive at open germination sites with probabilities equal to their relative abundances in the local community. The appearance of a new species, following transoceanic dispersal, is equivalent to Hubbell's “point mutation mode” of speciation, in which a new species enters the metacommunity as a single individual. As pointed out by Leigh et al. (2004), the rate of spread of the new species under Hubbell's model is approximately equal to the rate of increase of a neutral allele in an ideal population, taking approximately N generations to reach a size of N individuals (Fisher 1930; Ewens 1972). Using conservative estimates of the generation time and population density of S. globulifera in Neotropical forests, Leigh (2007) calculated a probability of approximately 2 × 10−9 for the expansion of S. globulifera across Neotropical forests within 20 million years by stochastic processes. Hence, although S. globulifera is a relatively old species, it required some competitive advantage to become established across vast expanses of species-rich forests.
Our genetic data support a population expansion event in modern S. globulifera, at least in the Amazon basin and probably beyond. A heterozygosity deficit and an abundance of population-specific alleles in the cis-Andes-Costa Rica Cluster 03 (and in the cis-Andean populations analyzed jointly, results not shown) suggest that population expansion affected this gene pool. The absence of ITS variation across the Amazon basin, documented by Dick et al. (2003) and extended here to further sampling locations in northern and southern Peru, is also in agreement with a hypothesis of recent population expansion, as are the lesser divergence among cis-Andean trnH-psbA haplotypes (Fig. 5) and the star-shaped phylogenetic relationships between them (e.g., Bailey et al. 1996).
Although the spread of S. globulifera left a trace of population expansion in its genomes, our data indicate that genetic drift can be strong at the local scale in this species. A significant heterozygosity excess consistent with bottlenecks was detected in several populations, mostly from Panama (Table 3), which presumably led to losses of alleles. Allelic richness is on average lower in populations belonging to the trans-Andes and Central American Clusters 01 and 02 than in the cis-Andes-Costa Rica Cluster 03 affected by expansion (Table 3), although comparative tests were nonsignificant because of low power. Also, pronounced nSSR and cpDNA differentiation with limited haplotype sharing between proximal populations in Panama provides evidence of very restricted gene flow. Historical climatic variations in a particular topographic setting might be at the origin of these genetic patterns (see below).
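The link between bottlenecks, allele loss, and heterozygosity excess can be made concrete with a short calculation. The sketch below is not the test applied to the data in Table 3; it assumes a single generation of random sampling of gene copies from a diploid population, and the allele frequencies and bottleneck size are hypothetical values chosen for illustration. It compares the expected fraction of alleles retained with the expected fraction of gene diversity retained.

```python
def expected_after_bottleneck(freqs, n_diploid):
    """Expected outcome of one generation at a reduced population size.

    freqs: allele frequencies before the bottleneck (should sum to 1).
    n_diploid: number of diploid individuals in the bottleneck.
    Returns (fraction of alleles retained, fraction of gene diversity
    retained), both as expectations under random sampling of gene copies.
    """
    copies = 2 * n_diploid
    alleles_before = len(freqs)
    he_before = 1.0 - sum(p * p for p in freqs)
    # An allele at frequency p is missed by all copies with prob (1 - p)^copies.
    alleles_after = sum(1.0 - (1.0 - p) ** copies for p in freqs)
    # Expected gene diversity shrinks by a factor of 1 - 1/(2N) per generation.
    he_after = he_before * (1.0 - 1.0 / copies)
    return alleles_after / alleles_before, he_after / he_before


# Hypothetical microsatellite locus: three common and ten rare alleles.
freqs = [0.5, 0.3, 0.1] + [0.01] * 10
allele_fraction, he_fraction = expected_after_bottleneck(freqs, n_diploid=10)

if __name__ == "__main__":
    print(f"alleles retained: {allele_fraction:.2f}")
    print(f"gene diversity retained: {he_fraction:.2f}")
```

Because rare alleles contribute little to gene diversity but are the first to be lost, the number of alleles drops much faster than heterozygosity. A recently bottlenecked population therefore shows more heterozygosity than expected from its remaining allele number, which is the sense in which a heterozygosity excess is consistent with a bottleneck.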
CIS- AND TRANS-ANDEAN CONTRASTS
Both ITS and trnH-psbA support a Mesoamerican (including Western Ecuador) and a cis-Andean clade and the deepest nSSR divergence was found between populations separated by the Andes. This pattern of ancient divergence is consistent with Symphonia fossil records of a broad South American and Mesoamerican distribution prior to the uplift of the northern Andes, which reached altitudes above ca. 1800 m in the late Miocene (about 10 Ma, Gregory-Wodzicki 2000), and the Pliocene formation of the Panama land bridge (Coates and Obando 1996). There are presently no continuous habitat connections between lowland rainforests over or around the northern tips of the Andes. Although DNA sequence and nSSR divergence is consistent with Andean vicariance, the uplift of the cordilleras may alternatively be a barrier to secondary contact of populations that had differentiated in Central and South America before the two continents were connected (Dick et al. 2003). The Talamanca cordilleras of Central America are smaller in stature than the northern Andes and punctuated by lowland passes, but these mountains also delimit high FST and/or reciprocal monophyly between Atlantic and Pacific slope populations for several tropical tree species studied (Chase et al. 1995; Cavers et al. 2003, 2005; Novick et al. 2003).
An unusual result is the regional asymmetry in phylogeographic structure between cis- and trans-Andean populations of S. globulifera, with high haplotype diversity, strong phylogeographic structure, and high FST in Mesoamerica, and in Amazonia an absence of ITS variation and relatively low FST between populations separated by over 2500 km. The signature of strong genetic drift in populations from Mesoamerica, including divergence among populations within Panama, may be partly explained by topographic heterogeneity that is not found in lowland Amazonia. The spine of mountains that runs through peninsular Mesoamerica creates rainfall gradients that define the habitats available to rainforest trees (Pyke et al. 2001; Engelbrecht et al. 2007) and thus places constraints on patterns of gene flow (McRae and Beier 2007). In the watershed of the Panama Canal, for example, annual rainfall ranges from 1600 mm on the Pacific coast to 4000 mm along the Caribbean coast less than 50 km to the north. The topographic variation may have further constricted gene flow during the last glacial maximum (LGM), when average rainfall was reduced by ∼30% in Panama (Piperno and Jones 2003). Forest cover persisted in sites that presently receive ≥3000 mm rainfall per year (Bush and Colinvaux 1990), whereas Pacific slope sites that presently receive ∼1800 mm rainfall per year were dominated by grass pollen (Bush et al. 1992). Symphonia globulifera is uncommon on the drier Pacific coast of Mesoamerica, but it is common in the upper elevation forests that were probably isolated during the LGM (Whitmore and Prance 1987; Gentry 1992). This historical contraction of Mesoamerican rainforests, which has not been demonstrated in the core Amazon basin, could have produced population bottlenecks resulting in the observed levels of genetic differentiation.
Some tropical tree species have had cross-continental or intercontinental geographic ranges dating long before Pleistocene climate changes. The relative stability of the tropical forest biome suggests that sweepstakes dispersal, vicariance histories, and strong phylogeographic structure may be found more frequently in tropical species than in their temperate zone counterparts. Tertiary geological history is reflected in the genetic structure of S. globulifera, and our analysis indicates that regional populations have been genetically isolated through the Pleistocene and earlier. We expect that studies of other widespread species, especially those with cross-Andean disjunctions, will reveal strong phylogeographic structure and distinct regional patterns of population differentiation. In particular, the study of S. globulifera suggests that tropical forest tree populations will be more highly differentiated in Mesoamerica than across similar spatial scales in the Amazon basin.
Associate Editor: W. O. McMillan
This project was initiated while CD was a Tupper postdoctoral fellow at the Smithsonian Tropical Research Institute (STRI). CD would like to acknowledge the formative collaboration of Eldredge Bermingham at STRI. We thank the Center for Tropical Forest Science (CTFS) for providing access to the BCI plot in Panama; CTFS and the Pontificia Universidad Católica del Ecuador (PUCE) for access to the Yasuní plot; and CIRAD/France for access to the Paracou forest plot. For help in obtaining leaf samples, we are grateful to Saint-Omer Cazal, B. Degen and P. Fine (French Guiana), Iñigo de la Cerda (Nicaragua), D. Neill (Ecuador), K. Dexter (Peru), D. Boshier and P. Ryme (Honduras), C. Woodward (Costa Rica), and D. Hardesty, Salomon Aguilar and Hugo Mogollón (Panama). We thank Catalina Perdomo, Denise Hardesty, C. Vergara and S. Pereira for assistance in the laboratory. CD acknowledges a collaborative European Union grant (SeedSource) and National Science Foundation award DEB 0640379. MH is currently a postdoctoral researcher of the National Fund for Scientific Research of Belgium (FRS-FNRS) and acknowledges an FNRS-funded scientific visit to CIFOR-INIA.
Table Appendix. Voucher information for sequenced individuals of Symphonia globulifera. The laboratory ID refers to the voucher number for the leaf tissue and DNA maintained at the University of Michigan. GenBank accession numbers are provided for all trnH-psbA sequences, and for previously unpublished ITS sequences. GenBank accessions for published ITS sequences are available in Dick et al. (2003).