A mathematical method for estimating patterns of flower-to-flower gene dispersal from a simple field experiment


†Author to whom correspondence should be addressed. E-mail: j.e.cresswell@ex.ac.uk


  • 1. Pollination is a principal means of gene dispersal in animal-pollinated plants. Theoretically, gene dispersal can be predicted from pollinator movements and their associated patterns of flower-to-flower gene dispersal, or paternity shadow, but quantifying the paternity shadow under field conditions is problematic.
  • 2. We developed a mathematical method to quantify the paternity shadow from a simple field experiment by initially stating the problem in matrix algebra, then using a least-squares regression to find the paternity shadow that best explained the observed spatial distribution of gene dispersal, given the observed pollinator movements.
  • 3. We applied the method to data on the dispersion of a marker gene along rows of oilseed rape (Brassica napus) pollinated by bumble bees (Bombus spp.) and thereby produced the first field-based paternity shadow. When coupled with observed movements of bumble bees, this paternity shadow explained virtually all (r² = 90%) of the dispersion of the marker gene.
  • 4. Close similarity between our result and a paternity shadow obtained for B. napus in a separate laboratory experiment suggests that the paternity shadow is a fundamental attribute of the Bombus–B. napus interaction.


Localized gene dispersal is the major reason why species’ gene pools are not evenly mixed. The evolutionary consequences of this are profound, and can include divergence among populations and local adaptation, depending on the actions of natural selection and the vagaries of genetic drift (Slatkin 1985). Understanding of gene dispersal, and the factors restricting it, is therefore fundamental to evolutionary biology, and can also be applied to the genetic conservation of small or fragmented populations (Ellstrand 1992) and to the confinement of genetically modified (GM) organisms (Pilson & Prendeville 2004).

For many plant species, animal-mediated pollination is a principal mode of gene dispersal (Fenster 1991). Typically, it is highly localized (Bateman 1947; Schaal 1980; Fenster 1991) because of: the restricted movements of pollinators (Schmitt 1980); the rapid attenuation of flower-to-flower pollen carry-over (Morris et al. 1995); and the absence of secondary stigma-to-stigma transfers (Bateman 1947) due to pollen's apparent tendency to adhere to the first stigma on which it is deposited. Theoretically, gene dispersal can be predicted from pollinator movements and their associated patterns of flower-to-flower gene dispersal, or paternity shadow. Several studies have aimed to model patterns of pollinator-mediated gene flow (Bateman 1947; Schmitt 1980; Morris 1993) and, building on these, a recent theory known as the portion-dilution model (PDM) has made it possible to predict pollinator-mediated gene dispersal (Cresswell et al. 2002; Cresswell 2003, 2005). Use of the PDM requires knowledge of pollinator movements, which are in principle directly observable, and their associated paternity shadow (Fig. 1). However, determination of the paternity shadow under field conditions is problematic. Here we develop a feasible method to determine the paternity shadow relevant to field conditions where plants are exposed to ambient weather conditions and multiple visits by wild pollinators.

Figure 1.

Examples of paternity shadows. (a) Paternity shadow of a genetically marked flower (shaded) extending across m = 3 unmarked flowers. The relative size of the circle for the vth component of the paternity shadow, $f_v$, indicates the proportion of that flower's seed that is fertilized by marked pollen. (b) Paternity shadows of W = 2 marked flowers extending across a sink population of unmarked flowers. The proportion of marked seed at the ith-visited flower, $\Phi_i$, is given by the sum of the $f_v$ associated with the shaded circles directly below that particular flower (equation 2). The total amount of marked paternity realized in the sink, $\Psi_W$, is given by the sum of the $f_v$ associated with all the shaded circles (equation 10).

The PDM applies when a portion of a source population's paternity is realized in a sink population where this paternity is diluted among seeds of differing paternity (Fig. 1). The proportion of the sink's seed with source paternity, ξ, is given by:

$$\xi = \frac{E\,\Psi}{\bar{L}} \tag{eqn 1}$$

where each pollinator arriving in the sink population from the source population fertilizes Ψ fruits with source paternity for every $\bar{L}$ flowers it pollinates, and where a proportion, E, of all the sink's pollinators arrive directly from the source (Cresswell et al. 2002). The parameters E and $\bar{L}$ relate to pollinator behaviour that can be observed directly. The parameter Ψ derives from the paternity shadow (Fig. 1). The experimental characterization of paternity shadows is therefore crucial to the quantitative prediction of gene dispersal by the PDM.

A paternity shadow can be quantified experimentally based on the proportions of genetically marked seeds in the unmarked flowers that a pollinator has visited after a visit to a single marked flower (Cresswell et al. 2002). However, these experiments are laborious and realistically feasible only under laboratory conditions, because they require a single pollinator first to visit a marked flower and then to visit consecutively a series of virgin, unmarked flowers, which is difficult to arrange in the field. Further complications arise if patterns of pollen transfer are influenced by variation in floral characteristics, such as levels of nectar and available pollen (Galen & Plowright 1985; Cresswell 1999). In this case, the relevant floral variables must be adjusted in the laboratory experiments to match field conditions if the emerging paternity shadow is to apply. We aimed to quantify the paternity shadow relevant to flowers under field conditions while avoiding complicated experimental procedures. The objectives of our study were to develop a theoretical framework for estimating paternity shadows from simple field experiments and to use this theory, together with data collected from the field, to estimate the paternity shadow of bumble bee pollinated oilseed rape (Brassica napus). Brassica napus is an economically important GM crop that is pollinated by both insects (Cresswell 1999) and wind (Eisikowitch 1981).

Overview of theoretical approach

Table 1 summarizes the notation used in the following exposition. Consider a pollinator that visits W flowers in a genetically marked population before moving to an unmarked population of the same plant species. The pollinator thereby transfers marked pollen that fertilizes some seeds at unmarked plants. What is the representation of marked seeds among the progeny of the unmarked plants? If fv denotes the proportion of progeny fertilized by a particular flower at another flower that is visited v flowers later in a pollinator's foraging sequence, the collection of fv for all v is the paternity shadow (Cresswell et al. 2002). Let m denote the maximum extent of the paternity shadow, i.e. fv = 0 when v > m.

Table 1. Definitions of variables and parameters used

Symbol        Definition
$v$           Sequential position of a component of the paternity shadow
$f_v$         vth component of the paternity shadow
$m$           Maximum extent of the paternity shadow, or number of its non-zero components
$W$           Number of marked flowers visited initially by a pollinator
$i$           Sequential position of an unmarked flower visited by a pollinator after leaving marked flowers
$\Phi_i$      Proportion of marked seed in fruit of the ith unmarked flower visited by a pollinator after leaving marked flowers
$X$           Spatial position of an unmarked plant
$P_{X,i}$     Probability that a pollinator visiting a plant at X arrives on its ith flower visit after leaving the marked plants
$M_X$         Proportion of marked seed set by a plant located at X
$n_{X,i}$     Number of times pollinators were observed to visit a plant located at X on their ith flower visit after leaving marked flowers
$n_X$         Total number of times pollinators were observed to visit a plant located at X
$\alpha, \beta$   Parameters governing the shape of the least-squares fitted paternity shadow
$(a)_e$       Expected value of any variable, a, calculated from the fitted paternity shadow

Suppose that pollen from the pollinator's visits to the marked plants fertilizes a proportion Φi of the fruit's seed at the ith unmarked flower that the pollinator visits. This proportion (Fig. 1) is compounded from the paternity shadows of the marked flowers, and is given (Cresswell 2005) by:

$$\Phi_i = \sum_{v=i}^{i+W-1} f_v \tag{eqn 2}$$
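The compounding of overlapping paternity shadows can be sketched in a few lines of Python. This is only an illustration: it assumes that the flower visited i steps after the last marked flower receives the components $f_i$ to $f_{i+W-1}$ (one from each of the W marked flowers), with $f_v = 0$ beyond m. The function name `compound` and the numerical values of the shadow are hypothetical.

```python
def compound(f, W):
    """Return [Phi_1, ..., Phi_m], where Phi_i = f_i + ... + f_(i+W-1).

    f is the list [f_1, ..., f_m]; components beyond f_m are taken as zero.
    """
    m = len(f)
    # f_i is stored at index i-1, so Phi_i sums the slice [i-1, min(i+W-1, m)).
    return [sum(f[i - 1:min(i + W - 1, m)]) for i in range(1, m + 1)]


# Hypothetical shadow of extent m = 3, compounded over W = 2 marked flowers.
f = [0.4, 0.2, 0.1]
phi = compound(f, W=2)
print(phi)
```

With W = 1 the function returns the paternity shadow itself, which is a quick sanity check on the indexing.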

Let PX,i denote the probability that a pollinator visiting the unmarked plant at location X arrives at its ith flower visit after leaving the marked plants. The PX,i can be estimated from a collection of observations of pollinator movements by:

$$P_{X,i} = \frac{n_{X,i}}{\sum n_{X,\cdot}} \tag{eqn 3}$$

where $n_{X,i}$ is the number of times pollinators visited a plant at location X on their ith flower visit after leaving the marked plants, and $\sum n_{X,\cdot}$ is the total number of visits observed at location X.
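As a minimal sketch of this estimation step, the tallies of observed visits can be normalized row by row. The function name `estimate_P` and the counts are hypothetical, and the sketch assumes that every location received at least one observed visit.

```python
def estimate_P(counts):
    """Convert tallies counts[X][i] = n_{X,i} into probabilities P[X][i].

    Each row (one plant location) is divided by its own total number of visits.
    """
    P = []
    for row in counts:
        total = sum(row)
        if total == 0:
            raise ValueError("no visits were observed at this location")
        P.append([n / total for n in row])
    return P


# Hypothetical tallies for 3 locations (rows) and 3 successive flower visits (columns).
counts = [[6, 3, 1],
          [2, 5, 3],
          [1, 3, 6]]
P = estimate_P(counts)
```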

If MX denotes the proportion of all seeds on a plant at location X that are marked, then, following Morris (1993):

$$M_X = \sum_{i} P_{X,i}\,\Phi_i \tag{eqn 4}$$

or in matrix notation:

$$\mathbf{M} = \mathbf{P}\boldsymbol{\varphi}, \tag{eqn 5}$$

where M is a vector containing the MX, i.e. M = {MX}. Eqn 4 applies to plants on which the flowers each receive one or more visits from pollinators (Cresswell 2003), provided that all the pollinators generate the same marked paternity shadows, fv, and have patterns of movement governed by PX,i.
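The forward calculation, from pollinator movements and compounded paternity to the expected proportion of marked seed at each location, is a matrix-vector product. The sketch below uses a hypothetical 3 × 3 matrix and a hypothetical φ; `predict_M` is an illustrative name.

```python
def predict_M(P, phi):
    """Return M, where M[X] is the sum over i of P[X][i] * phi[i]."""
    return [sum(p * ph for p, ph in zip(row, phi)) for row in P]


# Hypothetical movement probabilities (rows: locations, columns: visit number).
P = [[0.6, 0.3, 0.1],
     [0.2, 0.5, 0.3],
     [0.1, 0.3, 0.6]]
# Hypothetical proportions of marked seed at the 1st, 2nd and 3rd unmarked flower.
phi = [0.6, 0.3, 0.1]

M = predict_M(P, phi)
```

Because each row of P sums to one, every element of M is a weighted average of the elements of φ and so cannot exceed the largest of them.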

If the number of plant locations is equal to the number of successive visits to unmarked flowers under consideration, denoted r, then P is a square matrix (r × r elements) with elements $P_{X,i}$, φ = {$\Phi_i$} is a vector of r elements, and M has r elements. If P is non-singular, we can deduce φ by:

$$\boldsymbol{\varphi} = \mathbf{P}^{-1}\mathbf{M}. \tag{eqn 6}$$

The paternity shadow can then be recovered from φ by equation 2 because:

$$\Phi_i - \Phi_{i+1} = f_i - f_{W+i}. \tag{eqn 7}$$

Thus unique solutions for fv are feasible provided that the number of plant locations exceeds m.
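One way to carry out this recovery is by back-substitution, starting at the tail of the shadow where the components are zero. The sketch below assumes that φ is known for r > m visits, so that $f_v = 0$ for v ≥ r, and rearranges equation 7 as $f_i = \Phi_i - \Phi_{i+1} + f_{W+i}$. The function name and the example values are hypothetical.

```python
def recover_shadow(phi, W):
    """Recover [f_1, ..., f_(r-1)] from phi = [Phi_1, ..., Phi_r].

    Assumes the shadow has died out by visit r, i.e. f_v = 0 for v >= r.
    """
    r = len(phi)
    f = [0.0] * (r + W + 1)          # 1-based storage; f[v] stays 0 for v >= r
    for i in range(r - 1, 0, -1):    # work backwards from the tail
        f[i] = phi[i - 1] - phi[i] + f[i + W]
    return f[1:r]


# Hypothetical phi produced by a shadow (0.4, 0.2, 0.1) with W = 2 marked flowers.
phi = [0.6, 0.3, 0.1, 0.0]
f = recover_shadow(phi, W=2)
```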

The preceding approach omits the possible contribution of sampling error to the observed values that must be used to populate M prior to solving equation 6. Therefore the approach is at risk of distorting the paternity shadow to account (mistakenly) for part of an observed pattern that is properly attributed to statistical noise. Least-squares regression analysis is a convenient tool for finding the best fit of a parametric model to data that contain sampling errors. We therefore developed a regression method to fit a biologically realistic, parametrically defined paternity shadow by using the matrix framework defined in equation 5. We required the paternity shadow to decrease monotonically in the form of one of the highly leptokurtic, parametric curves that fits well to patterns of flower-to-flower pollen transfer (Morris et al. 1995). For simplicity, we chose the exponential power function (Cresswell 2005), although our calculations showed that fitting a Weibull function, as suggested by Morris et al. (1995), gave almost identical results (data not shown). Thus the expected elements of the paternity shadow are given by:

$$(f_v)_e = \exp(\alpha v^{\beta}) \tag{eqn 8}$$

We obtained the expected elements of φ, denoted (φi)e, as functions of constants α and β using equations 2 and 8. Hence the expected value of M, which comprised the elements (MX)e, was calculated from (φi)e and P using equation 5. We then implemented the least-squares regression technique by varying parameters α and β of eqn 8 to minimize the residual sum of squares (SSR) between the observed and expected values of MX:

$$\mathrm{SSR} = \sum_{X} \left[M_X - (M_X)_e\right]^2 \tag{eqn 9}$$
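The fitting procedure can be illustrated with a small, self-contained sketch. It is not the spreadsheet optimization used in the study: a plain grid search over α and β stands in for the optimizer, the movement matrix is hypothetical, the "observed" values are synthetic (generated from known parameters so that the fit can be checked), and the shadow is truncated at the number of visits considered. All function names are illustrative.

```python
import math


def shadow(alpha, beta, m):
    """Expected shadow (f_v)_e = exp(alpha * v**beta) for v = 1..m (equation 8)."""
    return [math.exp(alpha * v ** beta) for v in range(1, m + 1)]


def compound(f, W):
    """Phi_i = f_i + ... + f_(i+W-1), taking f_v = 0 beyond len(f) (equation 2)."""
    m = len(f)
    return [sum(f[i - 1:min(i + W - 1, m)]) for i in range(1, m + 1)]


def expected_M(P, alpha, beta, W):
    """(M_X)_e = sum over i of P[X][i] * (Phi_i)_e (equation 5)."""
    phi = compound(shadow(alpha, beta, len(P[0])), W)
    return [sum(p * ph for p, ph in zip(row, phi)) for row in P]


def ssr(P, M_obs, alpha, beta, W):
    """Residual sum of squares between observed and expected M_X (equation 9)."""
    M_exp = expected_M(P, alpha, beta, W)
    return sum((mo - me) ** 2 for mo, me in zip(M_obs, M_exp))


def grid_fit(P, M_obs, W, alphas, betas):
    """Return the (alpha, beta) grid point with the smallest SSR."""
    return min(((a, b) for a in alphas for b in betas),
               key=lambda ab: ssr(P, M_obs, ab[0], ab[1], W))


# Hypothetical movement matrix: plant X is most often reached on visit X.
r = 6
raw = [[math.exp(-abs(X - i)) for i in range(r)] for X in range(r)]
P = [[w / sum(row) for w in row] for row in raw]

# Synthetic observations generated from known parameters (no sampling error).
M_obs = expected_M(P, -2.2, 0.43, W=2)

alphas = [k / 10 for k in range(-30, -9)]   # -3.0 to -1.0 in steps of 0.1
betas = [k / 100 for k in range(30, 61)]    # 0.30 to 0.60 in steps of 0.01
alpha_hat, beta_hat = grid_fit(P, M_obs, 2, alphas, betas)
```

With noise-free synthetic data the search returns the generating parameters. With real data, a finer grid or a numerical optimizer would be needed, and the minimum SSR would be greater than zero.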


Study system and associated data

We applied the above theory to a published empirical study of gene dispersal in rows of bumble bee pollinated oilseed rape, Brassica napus L. (Cresswell 2005). In this logistically simple outdoor study, a patch of 10 genetically marked plants was located in the middle of each of three replicate rows of unmarked plants. Each row comprised 24 single plants at 0·4-m intervals to either side of the marked patch. We determined MX by the observed mean proportion (over the three replicates) of marked seed set by the unmarked plant at each location X, where X is the distance of each unmarked plant from the marked plants in units of interplant intervals. The maximum value of X was 24, so we fixed the size of the matrices at r = 24. We determined PX,i from the observed movements of bumble bees (Bombus lapidarius L., Bombus pascuorum Scop., Bombus terrestris L. and Bombus lucorum L.), which were the only animal pollinators observed in this setting, with consideration given only to the first 24 flower visits after the bees left the marked plants. The mean number of flowers probed during a visit to the marked patch was W = 9·0 (SE = 0·56, n = 154).

Values for P and M are given in Appendix 1. We inverted the matrix P using the ‘R’ software package (Ihaka & Gentleman 1996). We used the ‘Solver’ routine from the Excel spreadsheet package (Microsoft) to fit a paternity shadow to match the observed pattern of gene dispersal as closely as possible. To quantify the overall fit to the observed pattern of gene dispersion, we calculated r² = (SST − SSR)/SST, where $\mathrm{SST} = \sum_{X} (M_X - \bar{M})^2$, and $\bar{M}$ is the mean of the collection of $M_X$.
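The goodness-of-fit statistic is straightforward to compute once observed and expected values are in hand. The sketch below takes SST as the sum of squared deviations of the observed values from their mean; the function name and the data are hypothetical.

```python
def r_squared(M_obs, M_exp):
    """Return (SST - SSR) / SST for observed and expected proportions."""
    mean = sum(M_obs) / len(M_obs)
    sst = sum((m - mean) ** 2 for m in M_obs)
    ssr = sum((mo - me) ** 2 for mo, me in zip(M_obs, M_exp))
    return (sst - ssr) / sst


# Hypothetical observed and fitted proportions of marked seed at four locations.
M_obs = [0.4, 0.2, 0.1, 0.1]
M_exp = [0.38, 0.22, 0.12, 0.08]
r2 = r_squared(M_obs, M_exp)
```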


The matrix method yielded an exact value of φ (eqn 6) of [3·24, 3·76, 5·06, −0·07, −8·15, 2·20, −1·99, 5·50, −2·70, −6·09, 6·41, −3·19, 1·42, 2·21, 2·69, −8·03, 6·81, −3·51, 5·21, −2·71, −6·69, 5·80, 1·81, −1·83]. The resulting paternity shadow contains negative values and hence is not biologically realistic.

The least-squares method yielded a solution for the paternity shadow of $(f_v)_e = \exp(-2{\cdot}20\,v^{0{\cdot}43})$ (Fig. 2), which explained r² = 90·1% of the spatial variation in the observed levels of marked paternity (Fig. 3). The fit of the least-squares model to the observed levels of marked paternity (Fig. 3) did not vary systematically with distance from the marked source plants (correlation analysis, residuals vs relativized expected values, r = 0·35, df = 22, P > 0·05).

Figure 2.

Comparison of the paternity shadow for a single marked flower of bumble bee pollinated Brassica napus estimated from this study (•), paternity = exp(−2·20 × (flower visit)^0·43), with empirically determined results (○) of Cresswell et al. (2002), paternity = exp(−1·77 × (flower visit)^0·49). X-axis (flower visit) indicates number of interflower flights made by the bee after leaving the marked flowers. Y-axis (paternity) indicates proportion of marked progeny among seed in each successively visited unmarked flower. Points are interpolated for ease of inspection only.

Figure 3.

Comparison of observed (○) marker-gene dispersal in a row of bumble bee pollinated Brassica napus with that predicted based on bumble bee movements and a least-squares fitted paternity shadow (•). X-axis indicates distance from marked plants in units of interplant spaces. Y-axis (paternity) indicates proportion of total number of marked progeny among seed collected at distance X from centrally located, marked plants. Points are interpolated for ease of inspection only.


We identified a simple pattern of flower-to-flower gene dispersal (paternity shadow) that, when coupled with observed bumble bee movements, explained virtually all the spatial variation in dispersion of a marker gene along a row of bumble bee pollinated B. napus. Our result suggests that bumble bee mediated pollination was the overriding determinant of gene dispersal in our small experimental array, although it does not rule out a minor contribution by other pollen vectors such as the wind. We believe that the fairly rapid rate of flower visits, and hence pollen delivery, by bumble bees in the experimental array obscured the possible contribution of wind pollination, which delivers pollen to stigmas only slowly (Cresswell et al. 2005).

This is the first time that a paternity shadow has been estimated in flowers with attributes, such as levels of available nectar and pollen, relevant to field conditions. Our best-fit paternity shadow is nevertheless similar to the paternity shadow for bumble bee pollinated B. napus determined by a separate laboratory experiment (Cresswell 2005) (Fig. 2). This similarity emerged even though the laboratory-based paternity shadow was the product of single visits of one bumble bee species to pollen-rich virgin flowers, whereas the paternity shadow inferred by the present study was the product of many visits by four bumble bee species to pollen-depleted flowers. Apparently, the paternity shadow is a fundamental attribute of the Bombus–B. napus interaction that is independent of the levels of floral nectar and pollen. The paternity shadow is probably conserved by the exact architectural fit between bee and flower (Cresswell 2000), and by the consistent within-flower foraging behaviour of bumble bees (Cresswell 1999). This conservatism in the paternity shadow suggests that bumble bee mediated gene dispersal in B. napus can now be reasonably predicted in a wide range of situations from any suitable description of bee movements, such as a P matrix. It will be convenient for future investigations of pollinator-mediated gene dispersal if paternity shadows generally are conservative, as they will need to be measured only once.

The similarity in extent between paternity shadows from single bee visits in the laboratory and from multiple bee visits in the field is also evidence against the important involvement of secondary transfer in the pollination of B. napus. Secondary transfer is a remobilization process whereby pollen deposited onto a stigma by a bee may be transferred to another stigma during a subsequent bee visit (Thomson & Eisenhart 2003), which extends the paternity shadow. Secondary transfer has been investigated in only one other pollination system (Thomson & Eisenhart 2003), where it was detected, so currently there is insufficient comparative information to identify the features predisposing a pollination system to secondary transfer.

There is a noteworthy difference between our best-fit paternity shadow and the laboratory-based paternity shadow, however. Each component of the paternity shadow, $f_v$, was slightly smaller in the field experiment than in the laboratory, which means that flowers in the field realized less paternity through flower-to-flower pollination. There are two possible explanations for this. First, the transgenically marked pollen used in the laboratory experiment may have been competitively more successful at fertilizing ovules than the conventionally marked pollen used in the field experiment (Cresswell et al. 2001). Second, the mode of pollination may have differed between laboratory and field situations. The proportion of within-flower self-fertilization (autogamy) can be calculated from the paternity shadow as $1 - \sum_{v=1}^{m} f_v$ (Cresswell et al. 2002). Substituting the estimated values of $f_1, f_2, \ldots, f_{20}$ from this study and from the laboratory-based study, we estimate autogamy to be higher in the field (74%) than in the laboratory (59%). We suggest that this difference arose because flowers in the field were wind-blown, whereas laboratory plants experienced only still air. Wind-blown flowers collide with each other and autogamous pollination may result, pre-empting ovules even before the first pollinator arrives.
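The autogamy estimates can be checked with a short calculation. The sketch assumes that autogamy is the share of paternity not exported through the paternity shadow, i.e. one minus the sum of $f_1$ to $f_{20}$, and uses the two fitted curves given in the caption of Fig. 2. The function name is illustrative.

```python
import math


def autogamy(alpha, beta, m=20):
    """One minus the summed shadow exp(alpha * v**beta) for v = 1..m."""
    return 1.0 - sum(math.exp(alpha * v ** beta) for v in range(1, m + 1))


field = autogamy(-2.20, 0.43)   # least-squares shadow from this study
lab = autogamy(-1.77, 0.49)     # laboratory shadow (Cresswell et al. 2002)
print(f"field: {field:.2f}, laboratory: {lab:.2f}")
```

Under that assumption the two values come out close to the 74% and 59% reported in the text.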

The similarity in bumble bee mediated outcrossing in B. napus under laboratory and field conditions increases our confidence in earlier predictions of the maximum level of field-to-field gene flow that bumble bees may produce in oilseed rape (Cresswell et al. 2002; Cresswell & Osborne 2004). These predictions were made from the PDM using the laboratory-based paternity shadow to evaluate Ψ, but are intended to inform debate over the confinement of genetically modified B. napus under arable conditions. Therefore it is more realistic to apply a field-based estimate of the paternity shadow. Using the formulae in Cresswell et al. (2002), our field-based paternity shadow estimates Ψ = 0·8 for field-to-field gene dispersal, whereas the laboratory-based estimate (Cresswell et al. 2002) was Ψ = 1·2. Based on the PDM (equation 1) and estimates of $\bar{L}$, the number of flowers that a bee pollinates in the sink field (≈490–720 flowers; Cresswell et al. 2002), both paternity shadows yield a similar range of predictions for maximum levels of field-to-field gene flow in oilseed rape that can be mediated by bumble bees (assuming E = 1), i.e. ≈0·2%.
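The arithmetic behind this range can be laid out explicitly. The sketch assumes that the PDM takes the form used in the worked calculation elsewhere in the text, namely E multiplied by Ψ and divided by the number of flowers a bee pollinates in the sink; the function and variable names are illustrative.

```python
def pdm_gene_flow(E, psi, flowers_pollinated):
    """Proportion of sink seed with source paternity under the assumed PDM form."""
    return E * psi / flowers_pollinated


predictions = {}
for label, psi in (("field", 0.8), ("laboratory", 1.2)):
    for flowers in (490, 720):
        predictions[(label, flowers)] = pdm_gene_flow(1.0, psi, flowers)

for (label, flowers), xi in sorted(predictions.items()):
    print(f"{label:10s} shadow, {flowers} flowers: {100 * xi:.2f}%")
```

All four combinations fall between about 0·1% and 0·25%, consistent with the rounded figure of ≈0·2%.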

Directions for future studies

The matrix method failed to produce a biologically realistic paternity shadow in the case we studied, and we believe it may give unrealistic solutions in most other cases because the matrix solution is likely always to be distorted by sampling error inherent in the empirical description of gene dispersal, and also in the description of pollinator movements (matrix P). In contrast, the least-squares method derived from the matrix formulation was successful because it made explicit provision for residual variation that is unexplained by the fitted paternity shadow. Furthermore, the least-squares method can be applied regardless of the number of plants in the experimental array, although better results should be obtained as the number of unmarked flowers visited by a pollinator between visits to marked plants approaches the length of the paternity shadow, m, because of improved resolution on the tail of the paternity shadow. We therefore recommend that future studies utilize the least-squares method to estimate a paternity shadow.

To further evaluate our least-squares method, we used the fitted paternity shadow to predict the proportion of marked seed observed in the experimental array by solving the PDM (equation 1) as follows. When a bee visits W marked flowers followed by ≥m unmarked flowers (Fig. 1), the expected value of the portion parameter (ΨW)e is given by (Cresswell 2005):

$$(\Psi_W)_e = \sum_{i=1}^{m} (\Phi_i)_e = \sum_{i=1}^{m}\,\sum_{v=i}^{i+W-1} (f_v)_e \tag{eqn 10}$$

Given that W = 9, we find that $(\Psi_W)_e$ = 0·75. The remaining parameter values in equation 1 are estimated from the pollinator behaviour observed in the experimental rows. We assumed that bumble bees always arrived at the unmarked plants directly after visiting the marked plants, hence we set E = 1, and that they visited an average of 50 unmarked flowers between visits to the marked plants (Cresswell 2005), hence $\bar{L}$ = 50. Therefore $(\xi)_e = [(\Psi_W)_e/50] \times 100\%$ = 1·5%, which is fairly close to the observed proportion of marked seed, 2·1% (Cresswell 2005). Given that the PDM requires estimation of several parameters with sampling error, further testing is necessary to determine fully whether the least-squares method provides acceptable accuracy for predictions of gene dispersal.
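This calculation can be approximated in a few lines. The sketch sums the compounded paternity over the unmarked flowers, using the fitted shadow exp(−2·20 v^0·43) and W = 9. The truncation of the shadow at m = 24 (the size of the matrices) is an assumption, since the text does not state the value used, and the function name is illustrative.

```python
import math


def portion(alpha, beta, W, m):
    """Sum of Phi_i over i = 1..m, with Phi_i = f_i + ... + f_(i+W-1) and f_v = 0 for v > m."""
    f = [math.exp(alpha * v ** beta) for v in range(1, m + 1)]
    phi = [sum(f[i - 1:min(i + W - 1, m)]) for i in range(1, m + 1)]
    return sum(phi)


psi_W = portion(-2.20, 0.43, W=9, m=24)   # m = 24 is an assumed truncation
xi = 1.0 * psi_W / 50                     # E = 1 and 50 unmarked flowers per bout
print(f"Psi_W = {psi_W:.2f}, xi = {100 * xi:.1f}%")
```

With these settings the portion parameter comes out at roughly 0·7 and the predicted proportion of marked seed at roughly 1·4%, in the neighbourhood of the reported 0·75 and 1·5%. The small difference probably reflects the assumed truncation and the rounding of α and β.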

In summary, our study begins to demonstrate the feasibility of recovering paternity shadows relevant to field conditions, which could make it practical to exploit recent theory (Cresswell et al. 2002; Cresswell 2003) for predicting the level and extent of pollinator-mediated gene dispersal. We successfully predicted gene dispersal in an experimental array of plants that had only one kind of pollinator, but natural pollination systems are seldom this simple because flowers attract a diversity of visitors (Fenster et al. 2004). We nevertheless anticipate that our method can be applied where a single, most effective pollinator prevails (Stebbins 1970). Moreover, our method can, in principle, be extended to pollination systems with two or more kinds of pollinator, each with its own paternity shadow, although the quantity of data required to solve the model is likely to increase with the number of paternity shadows fitted simultaneously.


This research was funded by the NERC. We thank Dave Hodgson and three anonymous reviewers for constructive criticism.


Appendix 1: Matrices used in the analysis

The matrix P, where columns 1–24 designate flower visits and rows 1–24 designate plant locations in the row. For example, the probability that a bee at plant 2 arrived after its first interflower flight was 0·14.

[Matrix P, 24 × 24: supplied as an image in the source and not reproduced here.]

[0·1156, 0·1391, 0·0422, 0·0294, 0·0307, 0·0056, 0·0275, 0·0128, 0·0138, 0·0135, 0·0062, 0·0091, 0·0147, 0·0060, 0·0067, 0·0054, 0·0031, 0·0062, 0·0113, 0·0021, 0·0027, 0·0000, 0·0061, 0·0050]

The vector M, where the 24 elements quantify the observed proportion of marked seed set at the locations in the row. For example, the observed proportion of marked seed set at plant location 2 was 0·1391.