On gene dispersal studies in complex landscapes: a reply to the comment on


  • C.G. is a post-doc researcher working on the spatial and genetic consequences of pollen and seed dispersal by animals in heterogeneous landscapes. J.A.G. is interested in the application of molecular markers to address ecological and evolutionary questions, with a focus on endangered vertebrate species and conservation biology. P.J. studies the ecological factors that shape plant recruitment patterns, with a special interest in the theoretical, ecological, and evolutionary implications of plant–animal interactions.


Pedro Jordano, Fax: + 34 954 62 11 25; E-mail: jordano@ebd.csic.es

The direct assessment of dispersal events in plants has been a long-standing conceptual and technical challenge that severely limits our understanding of the dispersal process (Nathan 2006). Our field studies with Prunus mahaleb in the SE Spanish mountains combined molecular genetic and field approaches to obtain a thorough empirical case study of seed and pollen dispersal. First, we documented seed dispersal patterns mediated by animal frugivores based on a direct comparison of the genotypes obtained from the maternally derived seed endocarp tissue with the genotypes of adult trees within a stand (Godoy & Jordano 2001). Second, we used maximum-likelihood (ML) paternity analyses to infer patterns of pollen flow, based on the assignment of the most likely sires for seed progenies sampled from known maternal trees (Jones & Ardren 2003).

The comment by Herrera (2009) claims that the finding of additional P. mahaleb trees close to our study area reveals that the sampled trees do not form a discrete, spatially isolated population. He argues that our assumptions of spatial isolation and exhaustive genotyping are false, and that the presence of ungenotyped trees adjacent to our focal trees might severely affect our main conclusions (García et al. 2005, 2007). In this reply, we consider these issues and discuss the potential implications that the finding of additional trees would have for our conclusions. Specifically, we argue that there is no such error in our description of the study site and that our premises do not include complete isolation of the study population. We also demonstrate the robustness of our conclusions given the amount of sampling performed, the set of microsatellite markers used, and the specific landscape setting of our studies.

Considerations on population description

Natural populations inhabit heterogeneous three-dimensional landscapes, rather than homogeneous two-dimensional spaces (Waples & Gaggiotti 2006). Our study area is located in a karstic landscape combining abrupt ridges, sinkholes, and small to medium-size poldjes (flat areas with fertile and deep soils), where most adult P. mahaleb trees grow in aggregated patches limited by rocky outcrops (Fig. 1). A large portion of the P. mahaleb trees reported by Herrera (2009; Fig. 1) grow on the other side of a rocky hill, belonging to a different watershed (Fig. 1C), in what we considered in our previous work the population closest to our study site. This conspicuous topography imposes marked differences in the leafing, flowering and fruiting phenologies of individual P. mahaleb trees and populations (Alonso 1999), significantly influencing their genetic structure (Jordano & Godoy 2000; García et al. 2005). Our study area can be considered a distinct, completely sampled stand of trees, relatively separated from other patches by altitude, orientation, and landscape features that effectively contribute to isolation beyond purely distance effects.

Figure 1.

 A, Topographic 3D model of the area, indicating the approximate limits of our study population, where all adult Prunus mahaleb trees reproducing in 1996–1997 were sampled and genotyped. Each contour line represents 10 m elevation. Our study population is a stand of trees growing on poldje-like soils and rocky slopes. A ridge with NE-SE orientation to the E of our study site isolates our population from other tree stands growing in a different watershed. B, location of our study trees for the analysis of 1996–1997 seed progenies (blue dots) and the additional trees of the comment by Herrera (2009; red dots) relative to the topographic 3D model. The thin lines labelled 1a, 1b, and 1c encompass the approximate distribution of the trees included in García et al. (2005) (1a), core (1b), and extended (1c) datasets used in Table 1. C, 3D model for the elevational transect between points a–a′ (shown in A, spanning approximately 2.2 km). A small red square is located in each panel to facilitate the visual comparison.

Besides the natural geographic distinctness of our study area, active regeneration has occurred during the last 15 years, with an increased tree population census size in the region. We mapped and genotyped additional P. mahaleb trees over the years as they became reproductive. In our study population, 25.8% of reproductive trees in 2006 were censused as saplings prior to 2006 and were thus not reproductive in 1996–1997, when we sampled the adults and seed progenies used in our analyses. Herrera indicates that the large trunk diameters of the trees he located in 2008 provide evidence for their reproductive status in 1996–1997. This need not be the case. In our study population we identified trees 16–20 years of age and 20–30 cm dbh that were not reproductive during our study years. As a matter of fact, Herrera’s map includes at least 13 trees that we positively identified as saplings (non-reproductive) before 2006: eight of the isolated trees to the west and the five isolated trees to the south of our study area (Herrera 2009; Fig. 1). Therefore, together with the phenological differences discussed above, it is unlikely that all or even most of the trees censused by Herrera in 2008 were candidate sources of pollen or seeds in our analyses with progenies sampled in 1996–1997. Some scattered trees within the 1.5 km distance separating the centres of our study population and the closest high-density stand of trees (trees located to the East of the highest ridge in Fig. 1B) might have been sources of seeds, an aspect that was not made sufficiently explicit in our early papers (e.g. Godoy & Jordano 2001) but was fully accounted for in our subsequent work (Jordano 2007; Jordano et al. 2007). In the following sections we show that a small fraction of unsampled trees outside our focal stand would not affect the robustness of our conclusions in relation to pollen dispersal inferences.

Would unsampled trees in the immediate neighbourhood of genotyped focal trees affect the main conclusions of García et al. (2005, 2007)?

First, Herrera seems to imply that both spatial isolation of the study stand and complete sampling are necessary requirements in parentage assignment studies. However, studies based on direct assignment of pollen or seed to parental trees rely, instead, on a thorough sampling of trees potentially contributing pollen or seeds within an arbitrarily delimited study area (Smouse & Sork 2004). This main premise was clearly fulfilled in our studies of pollen and seed movement, even in the unlikely case that new trees reported by Herrera were potential parents at the time of the study, as all these trees occur outside our focal area.

Second, Herrera suggests that our conclusions should be interpreted with caution because a high proportion of unrecognized, ungenotyped candidate parents might have led us to a high rate of erroneous assignments and to overestimate the statistical confidence of our analysis. However, it has been documented that statistically significant assignments are reasonably robust even under scenarios of limited sampling of candidate fathers (Oddou-Muratorio et al. 2003). Moreover, Duchesne et al. (2005) showed that the correctness rate (i.e. the proportion of correct allocations among all allocations of progeny to collected parents) remains high even when the sampling effort drops to a limited fraction of potential fathers. In fact, the correctness rate critically depends on the power of parental exclusion, and hence on an adequate number of highly polymorphic markers (Duchesne et al. 2005).

Third, Herrera backed up his argument with simulations in Cervus 3.0 (Kalinowski et al. 2007) showing a drastic decrease of assignment rates as the proportion of unsampled parents increases. For his simulations, he used an example data set (provided with Cervus 3.0, Kalinowski et al. 2007) from a study of red deer (Cervus elaphus), whose relevance might be questioned given the obvious differences in life history and mating strategies between red deer and our study tree, and in the experimental settings of these studies. Instead, we use our own data to explore directly the effect of increasing the proportion of unsampled candidate parents in our analyses. We ran simulations and performed new paternity analyses using Cervus 3.0 based on three different P. mahaleb data sets (Table 1): (i) the same set of sampled candidate trees as in our original work (Table 1a); (ii) only a subsample of candidate trees located in the centre of the population (core population) (Table 1b); and (iii) an extended sample of candidate fathers including 68 new adult trees located in the surroundings of our focal population (Table 1c). Different fractions of sampled candidate fathers were considered when performing paternity analysis for each data set, providing a percentage of embryos assigned at the 80% confidence level based on the algorithm of Kalinowski et al. (2007). Additional simulations based on Oddou-Muratorio et al. (2003) provided an estimate of the type I error rate (false-positive assignments) and the type II error rate (false-negative assignments) associated with these different sampling scenarios. Recall that the 80% confidence level allowed by Cervus implies a tolerance of 20% for the type I error rate.

Table 1.   Paternity analysis results obtained for three different data sets including: (a) the same set of candidate fathers as in our original studies (García et al. 2005); (b) a reduced set of 70 trees centrally located in the study site; and (c) an extended sample of candidate fathers including 68 additional trees located outside the study site. We considered different sampling fractions for each data set, providing different census population sizes. Note that P. mahaleb is a gynodioecious species and only hermaphrodite trees act as candidate fathers. In our population the ratio males:hermaphrodites is close to 1:1. Paternity analyses were performed by applying Cervus 3.0 (Kalinowski et al. 2007) and error rates were estimated based on Oddou-Muratorio et al. (2003). Note that by applying the algorithm of Kalinowski et al. (2007) the percentage of assigned paternity increased from 81% (based on Marshall et al. 1998) to 89% (García et al. 2005).
Data set               Demographic variables                                                 Paternity analysis
                       Sampled # CF  Sampling fraction  Actual # CF  Census population size  % AP  Error type
                                                                                                   Type Ia  Type Ib  Type II
(a) Original data set  104           0.95               109          c. 218                  89    7        4        7
                                     0.85               122          c. 244                  83    7        10       5
(b) Core group         70            0.95               74           c. 148                  80    16       4        4
                                     0.70               94           c. 188                  79    5        16       4
                                     0.60               117          c. 234                  79    2        19       4
(c) Extended group     172           0.95               181          c. 362                  90    7        3        10 (14)
                                     0.80               215          c. 430                  81    6        13       10 (14)
                                     0.75               229          c. 458                  78    6        16       7 (11)
                                     0.70               246          c. 492                  72    5        23       7 (11)
                                     0.60               286          c. 572                  69    4        26       7 (11)

  1. Sampled # CF: sampled number of candidate fathers.

  2. Sampling fraction: sampling fraction considered for Cervus 3.0 analysis.

  3. Actual # CF: actual number of candidate fathers that results from the number of sampled candidate fathers corrected by the sampling fraction.

  4. Census population size: census population size, which approximately doubles the actual number of candidate fathers since only hermaphrodite trees (≈ 50% of adult trees) can act as candidate fathers.

  5. % AP: percentage of assigned paternity based on Kalinowski et al. (2007) at 80% confidence level.

  6. Type Ia: (false-positive assignments). Percentage of wrongful assignment of paternity of an offspring to a sampled father matching by chance when the actual father is a sampled father.

  7. Type Ib: (false-positive assignments). Percentage of wrongful assignment of paternity of an offspring to a sampled father matching by chance when the actual father is not sampled. Corresponds to cryptic gene flow.

  8. Type II: (false-negative assignments). Percentage of cases in which paternity could not be attributed to any father at the chosen level of confidence in spite of having sampled the actual father. Values in parentheses are obtained when the possibility of selfing for hermaphrodite trees is not allowed.
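
The demographic columns of Table 1 follow from the arithmetic described in the table notes. A minimal Python sketch, assuming that the actual number of candidate fathers is the sampled number divided by the sampling fraction and that the census size is twice that figure (hermaphrodites ≈ 50% of adult trees). The function names are illustrative, and rounding to the nearest tree is an assumption of this sketch, so individual rows of the table may differ from its output:

```python
def actual_candidate_fathers(sampled_cf: int, sampling_fraction: float) -> int:
    """Sampled candidate fathers corrected by the sampling fraction."""
    if not 0 < sampling_fraction <= 1:
        raise ValueError("sampling fraction must be in (0, 1]")
    # Rounding to the nearest tree is an assumption of this sketch.
    return round(sampled_cf / sampling_fraction)


def census_population_size(actual_cf: int, hermaphrodite_fraction: float = 0.5) -> int:
    """Approximate census size when only hermaphrodites can act as fathers."""
    return round(actual_cf / hermaphrodite_fraction)


# Original data set (Table 1a): 104 sampled candidate fathers.
for fraction in (0.95, 0.85):
    actual = actual_candidate_fathers(104, fraction)
    print(fraction, actual, census_population_size(actual))
```

For the original data set this gives 109 and 122 candidate fathers, and census sizes of 218 and 244 trees, for sampling fractions of 0.95 and 0.85.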

Table 1 documents two particularly interesting results. On the one hand, the percentage of assigned embryos remained high with a reduced fraction of sampled candidate pollen parents, in agreement with previous studies (Duchesne et al. 2005). Furthermore, this result revealed a remarkable distance-dependent mating pattern, since the addition of candidate fathers located in the surroundings of our focal stand (Table 1c) still provided a proportion of assigned progeny around 80%. Ignoring actual candidate fathers (Table 1b) increased cryptic gene flow, even though the type I error rate remained at ∼20% (Table 1b). On the other hand, only three embryos were assigned to any of the new 68 reproductive trees located outside the study area (Table 1c). Yet these embryos did not include any of the offspring assigned to trees within our focal stand in our previous analysis (García et al. 2005). Table 1 also shows that the assignment success would remain around 70% even in the most limited sampling scenario, corresponding to a census of c. 572 trees (i.e. a proportion far above the 37% obtained by Herrera). Furthermore, adding trees as candidate fathers from outside the study stand (very unlikely to sire our focal trees given the geographic, elevational and topographic isolation of the stand) had the undesirable effect of increasing the type II error (false negatives), i.e. the proportion of embryos with no father assigned at a certain confidence level in spite of having sampled the actual father (Table 1c). Note that the proportion of false positives especially increased when the substantial rates of selfing shown by P. mahaleb are not accounted for, as in Herrera’s simulations.
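
The ∼20% type I error rate quoted for the core group can be checked by adding the two kinds of false-positive assignment. A small Python sketch, assuming the Type Ia and Type Ib percentages as read from Table 1b; the variable and function names are illustrative only:

```python
# Type Ia and Type Ib percentages as read from Table 1b (core group),
# keyed by the sampling fraction used in the Cervus 3.0 analysis.
core_group_errors = {
    0.95: {"type_Ia": 16, "type_Ib": 4},
    0.70: {"type_Ia": 5, "type_Ib": 16},
    0.60: {"type_Ia": 2, "type_Ib": 19},
}

TOLERATED_TYPE_I = 20  # percent, implied by the 80% confidence level


def total_type_i(errors: dict) -> int:
    """Sum both kinds of false-positive assignment (Type Ia + Type Ib)."""
    return errors["type_Ia"] + errors["type_Ib"]


totals = {fraction: total_type_i(e) for fraction, e in core_group_errors.items()}
for fraction, total in totals.items():
    print(f"sampling fraction {fraction:.2f}: total type I = {total}% "
          f"(tolerance {TOLERATED_TYPE_I}%)")
```

The total stays at 20–21% across sampling fractions, while its composition shifts from Type Ia towards Type Ib (cryptic gene flow) as fewer candidate fathers are assumed to be sampled.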

Finally, based on the genotypes of sampled parents and offspring used in our original data set, we estimate that the proportion of non-sampled fathers was 15% (PASOS software, Duchesne et al. 2005). This is most likely an overestimate, since the software assumes random mating and excludes the possibility of selfing. Even so, the proportion of assigned paternity decreased only slightly, from 89% to 83%, when the proportion of sampled fathers decreased from 0.95 (as previously assumed) to 0.85 (the value estimated here) (Table 1a).

Herrera also argued that rates of non-assigned progeny should be even larger for dispersed seeds (compared to pollen), as we initially ignored both parents. However, our seed dispersal studies were not based on parentage assignment methods, but on direct genotype comparisons: seeds sampled in seed traps, directly defecated or regurgitated by animal dispersers, were assigned to the tree whose genotype matched the genotype obtained from the endocarp of the seed, a tissue of maternal origin (Godoy & Jordano 2001). Thus, our seed studies should be affected much less than pollen assignments by the presence of unsampled nearby trees, because their genotypes were very unlikely to match by chance the genotypes of sampled trees given the set of markers used. All the trees within the reference stand were genotyped and a design with more than 600 seed trap sampling stations was used (Robledo-Arnuncio & García 2007). As a matter of fact, none of the included trees showed identical genotypes by chance and we estimated a low expected probability (10⁻⁷) for the local maternal genotypes to be produced by trees in the immediate neighbourhood of our site, based on genetic data from a set of trees located near our focal population (Harju & Nikkanen 1996; García et al. 2007).
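
The direct genotype comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the locus names, allele sizes and tree identifiers are hypothetical, and exact identity at every locus is assumed as the matching criterion, with no allowance for genotyping error:

```python
from typing import Dict, List, Tuple

# A multilocus genotype: locus name -> unordered pair of allele sizes.
Genotype = Dict[str, Tuple[int, int]]


def normalise(genotype: Genotype) -> Dict[str, Tuple[int, ...]]:
    """Sort the two alleles at each locus so that (a, b) equals (b, a)."""
    return {locus: tuple(sorted(alleles)) for locus, alleles in genotype.items()}


def match_source_trees(endocarp: Genotype, trees: Dict[str, Genotype]) -> List[str]:
    """Return every genotyped tree whose genotype is identical to the endocarp's."""
    target = normalise(endocarp)
    return [tree_id for tree_id, g in trees.items() if normalise(g) == target]


# Hypothetical genotypes at two made-up loci, for illustration only.
trees = {
    "tree_A": {"locus1": (150, 154), "locus2": (201, 201)},
    "tree_B": {"locus1": (150, 152), "locus2": (201, 205)},
}
seed_endocarp = {"locus1": (154, 150), "locus2": (201, 201)}

matches = match_source_trees(seed_endocarp, trees)
print(matches)
```

In this sketch an empty result would correspond to a seed whose mother is not among the genotyped trees, and more than one match would indicate trees sharing an identical multilocus genotype.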

Robustness of our mating and gene flow studies (and related work)

The additional analyses and simulations we report here are highly consistent and support our previous analyses. They further support our subsequent studies on P. mahaleb, aimed at exploring to what extent changes in the spatial distribution of parental trees affect topological features of spatial mating networks (Fortuna et al. 2008) and how seed rain sampling designs based on seed traps affect the estimation of seed dispersal kernels (Robledo-Arnuncio & García 2007). Finally, Jordano et al. (2007) documented how different functional groups of animal frugivores unevenly disperse propagules within the focal population and differentially contribute to the total dispersal kernel. Given that all these studies strictly deal with dispersal within the focal population, i.e. only use the pool of dispersal events allocated within the focal population where adult trees have been genotyped and located on an individual basis, they are qualitatively and quantitatively robust to the presence of unsampled trees outside the reference population. It must be noted that unassigned progeny were not included in our distance or direction analyses, although their assignment would add important information on the frequency and distance of long-distance dispersal events. We are addressing this issue in ongoing projects by expanding our sampling to surrounding populations and by assigning the population sources of immigrant pollen and seeds (Jordano 2007).

In conclusion, the additional trees found by Herrera (2009) do not alter the conclusions of our previous work, for at least three reasons: (i) the reproductive dynamics of trees in the area; (ii) the conspicuous topography, which imposes a high distinctness and geographic isolation among population patches belonging to different watersheds; and (iii) the high robustness of our parentage analysis given our experimental design and the set of polymorphic markers used, coupled with a high accuracy of the source tree identification of dispersed seeds based on direct genotype match rather than ML parentage assignments.

The study of gene dispersal in plants has progressed by accommodating the logistic problems inherent to sampling extensive areas and by incorporating proper statistical tools that help validate and assess the robustness of available field data (Robledo-Arnuncio & García 2007). The challenge remains for future studies to overcome the logistic constraints on sampling candidate fathers in pollen dispersal studies, and those associated with setting seed trap locations that encompass the extensive areas where potential source trees grow. These limitations are faced by all parentage studies in plants.


The comments from two anonymous reviewers helped to improve an earlier version of the manuscript.