Inferring seed bank from hidden Markov models: new insights into metapopulation dynamics in plants


Summary

  1. Capturing metapopulation dynamics of plants that have seed banks is challenging, because of the difficulty in characterizing the seed bank in the field.
  2. To account for the presence of a seed bank, we developed a hidden Markov model, where the focus species can be present in two forms, both above-ground and below-ground, the latter being unobservable. We generated patch histories of presence–absence for a species with a one-year seed bank under different colonization–extinction dynamics and metapopulation sizes, using a mechanistic model that accounts for three different sources of seedlings (seed bank, newly locally produced seeds and migrant seeds) as well as a disturbance process reflecting extinction. Using the program e-surge, we analysed these simulated data to evaluate the statistical performance of the hidden Markov model in detecting the presence of a seed bank and providing accurate estimates of the model parameters for different sets of parameter values.
  3. Our simulation tests showed that the absence of a seed bank was very well detected when data sets were simulated with no seed bank, regardless of the size of the metapopulation. Similarly, the presence of a seed bank was well detected when data sets were simulated with a seed bank. In this latter case, detection of the seed bank improved with increasing size of the metapopulation.
  4. The quality of the estimates of the model parameters increased with the size of the metapopulation but still remained high for small metapopulation sizes. The two parameters reflecting the colonization process and seed dormancy were those best estimated. In addition, we showed that ignoring the presence of a seed bank consistently led to overestimation of colonization and extinction rates.
  5. Synthesis. Hidden Markov models offer a reliable way to estimate colonization and extinction rates for plant metapopulations with a seed bank using time series of presence–absence data. Therefore, these models have the potential to provide valuable insights into the metapopulation dynamics of many plant and animal species with an unobservable life form that have remained poorly studied because of methodological constraints.

Introduction

Since first proposed by Levins (1969), the metapopulation concept has triggered many developments in evolutionary biology (Ronce, Perret & Olivieri 2000) and ecology (Hanski & Gilpin 1997; Hanski 1999). The metapopulation framework considers that species distribution over space and time results from a balance between extinction of local populations inhabiting a discrete network of suitable patches and colonization of empty patches. While such a framework has been extremely influential in the study of animal species, its relevance for studying plant species has been much more controversial (Bullock et al. 2002; Freckleton & Watkinson 2002, 2003; Ehrlén & Eriksson 2003). On the one hand, Husband & Barrett (1996) argued that the patchy structure of plant populations, as well as their supposedly high turnover, made them ideal candidates for metapopulation studies. On the other hand, Freckleton & Watkinson (2002) argued that specific plant characteristics could make the metapopulation model inadequate for capturing the regional dynamics of many plant species. Among these characteristics is the seed bank, which is a prevalent trait of plant species (Harper 1977; Thompson, Bakker & Bekker 1997) and is known to have a major impact on regional dynamics. Indeed, by spreading seed germination and reproduction through time, the prolonged seed dormancy that builds up the seed bank (Harper 1977) can represent a bet-hedging strategy that allows species to reduce temporal variation in fitness in unpredictably varying environments and thus mitigate the effects of unfavourable years (Cohen 1966; Evans et al. 2007; Venable 2007). Seed bank effects on regional dynamics may, however, not be captured when considering only colonization and extinction processes (Freckleton & Watkinson 2002; Ouborg & Eriksson 2004). As a result, and because the seed bank is difficult to characterize in the field, the metapopulation dynamics of species possessing a seed bank have been poorly documented.
The few empirical metapopulation studies conducted in plants either have ignored the potential effect of a documented seed bank on metapopulation persistence (Lesica 1992; see a review in Freckleton & Watkinson 2002) or have been developed for species with no seed bank (Dornier, Pons & Cheptou 2011). The specific problem associated with a seed bank is that species cannot be detected when present only below-ground. Consequently, such missing information makes it unclear whether a newly observed population is derived from colonization or from the germination of the seed bank. Similarly, when plants disappear from a previously occupied patch, it is unclear whether the local population has gone extinct or whether some individuals remain in the seed bank. Therefore, estimates of colonization and extinction probabilities are necessarily biased whenever prolonged seed dormancy is ignored for species with a seed bank.

In a monitoring context, field ecologists are able to obtain long-term presence–absence data from patch surveys above-ground. Such data sets have recently stimulated a large amount of work on the issue of imperfect detection (e.g. MacKenzie et al. 2003; Royle 2006; Royle & Kery 2007; and references therein). Repeated surveys within a season can be used to compensate for imperfect detection (MacKenzie et al. 2009). However, they do not help detect the presence of a seed bank, making standard patch occupancy models inappropriate. Patch occupancy surveys only allow occupancy to be assigned to the above-ground state, and thus leave patch occupancy below-ground uncertain. The question remains whether such occupancy data above-ground can provide information about the presence of the species below-ground through the existence of a seed bank. Hidden Markov models, such as multievent models (Pradel 2005) and patch occupancy models (MacKenzie et al. 2009), decouple the observation process from the state process and thus make it possible to account for uncertainty in the assignment of state. In the context of plant metapopulations with a seed bank, the observation corresponds to the presence or absence of the species above-ground, and the states correspond to the combination of the presence or absence of the species above-ground and below-ground.

In this study, we highlight how hidden Markov models can be used to address the longstanding problem of unobservable stages in the life cycle, a problem that has hampered empirical studies of plant metapopulations. To do so, we performed stochastic simulations using a mechanistic model to generate patch histories of presence–absence for a species with a one-year seed bank, utilizing different colonization–extinction dynamics and metapopulation sizes. Using the program e-surge (Choquet, Rouan & Pradel 2009), we analysed these simulated data to evaluate the statistical performance of our model in (i) detecting the presence of a seed bank and (ii) providing accurate estimates of the model parameters for different sets of parameter values.

Materials and methods

Plant life cycle

To explore the potential of our approach, we limited ourselves to a simple situation, although we suggest further extensions of our model in the Discussion. We considered a metapopulation of an annual plant species with a one-year seed bank, consisting of a finite number N of discrete suitable patches (Fig. 1). At the time of the census during the flowering period, if any plant was observed in a patch at time t, we assumed that those plants automatically produced seeds. We assumed that those seeds gave rise to at least one seedling between t and t + 1 with a probability g0, while some automatically entered the seed bank and produced at least one seedling between t + 1 and t + 2 with a probability g1. Thus, whenever a plant was observed above-ground in a patch at time t, this patch contained a seed bank at time t + 1. Given that seeds had a one-year longevity in the seed bank, seeds produced at time t could thus germinate at the latest at time t + 2. Each patch had a probability c of carrying at least one seedling following colonization between t and t + 1. We further assumed that migrant seeds did not enter the seed bank. Thus, whenever a patch was not occupied above-ground at time t, the species was not present in the seed bank at time t + 1. The three sources of seedling occurrence between t and t + 1 in each patch, corresponding respectively to parameters g0, g1 and c, were independent in our mechanistic model. Finally, we defined d as the probability that all seedlings, whatever their origin, died because of a disturbance occurring between t and t + 1. It is worth noting that the probability of successful colonization of a patch that is empty at time t, that is, totally devoid of the focus species above- and below-ground, is c(1 − d), because the seedlings arising from migrant seeds must escape extinction from disturbance to effectively reinstate the species in the patch at time t + 1.
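
The life cycle described above can be illustrated with a short stochastic simulation. The following Python sketch is not the simulation code used in this study; the function names, the `p_init` argument and the choice to start every patch without a seed bank are assumptions made purely for illustration. Only the parameters g0, g1, c and d and the rules linking them come from the text.

```python
import random


def simulate_patch_history(n_years, g0, g1, c, d,
                           above0=False, bank0=False, rng=None):
    """Simulate the above-ground presence (1) / absence (0) history of one patch.

    above0 and bank0 give the (assumed) initial above-ground and seed bank
    status; the text does not specify how histories were initialized.
    """
    rng = rng or random.Random()
    above, bank = above0, bank0
    history = [int(above)]
    for _ in range(n_years - 1):
        # Three independent sources of seedlings between t and t + 1.
        from_new_seeds = above and rng.random() < g0  # seeds produced at t
        from_seed_bank = bank and rng.random() < g1   # seeds held in the bank at t
        from_migrants = rng.random() < c              # colonization
        seedlings = from_new_seeds or from_seed_bank or from_migrants
        # A disturbance kills all seedlings, whatever their origin.
        disturbed = rng.random() < d
        # One-year seed bank: present at t + 1 only if plants were
        # above-ground at t (migrant seeds never enter the bank).
        bank = above
        above = seedlings and not disturbed
        history.append(int(above))
    return history


def simulate_metapopulation(n_patches, n_years, g0, g1, c, d,
                            p_init=0.5, seed=None):
    """Simulate independent histories for n_patches patches.

    Assumption: each patch starts occupied above-ground with probability
    p_init and with an empty seed bank.
    """
    rng = random.Random(seed)
    return [
        simulate_patch_history(n_years, g0, g1, c, d,
                               above0=rng.random() < p_init, rng=rng)
        for _ in range(n_patches)
    ]


if __name__ == "__main__":
    # With g0 = 0, g1 = 1, c = 0 and d = 0, a patch can only be
    # re-established from its seed bank, so presence alternates.
    print(simulate_patch_history(5, g0=0.0, g1=1.0, c=0.0, d=0.0, above0=True))
    # prints [1, 0, 1, 0, 1]
```

The deterministic example makes the role of the seed bank visible: the '0' years correspond to patches that are empty above-ground but still occupied below-ground.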

Figure 1.

Life cycle graph and demographic survey for a metapopulation of an annual species with a one-year seed bank. Using an example of patch history over three occasions, we illustrate the link between the parameters g0, g1, c and d reflecting the mechanistic processes of the model, the observations {‘0’: no plant above-ground, ‘1’: plant above-ground} and the underlying patch states {AA, AP, PA, PP}.

Demographic model

Each patch was characterized by whether the species was observed above-ground (‘1’) or not (‘0’) at the time of the census. Thus, the data set consisted of the histories of encounters of the species in a set of N patches surveyed each year during the flowering period over a finite number of years. To account for the uncertainty of patch state inherent to the presence of a seed bank, we developed a hidden Markov model (Pradel 2005) based on the species’ life cycle described above. The model was characterized by four states, AA (absent above-ground and absent in the seed bank), AP (absent above-ground and present in the seed bank), PA (present above-ground and absent in the seed bank), PP (present above-ground and present in the seed bank) and two observations (no plant above-ground, plant above-ground). This implementation can thus be seen as defining a multistate occupancy model (MacKenzie et al. 2009), where the focus species can be present in two forms, one of which is unobservable. In the presence of a one-year seed bank, the metapopulation dynamics were fully described by the 4 × 4 Markovian transition matrix, Φt, which describes the fate of patches in states AA, AP, PA and PP from t to t + 1, and whose elements were expressed as a function of the parameters g0, g1, c and d. Given our assumptions of the species’ life cycle, patch occupancy was dependent upon the balance between the occurrence of seedlings arising from the three possible sources of seeds (seed bank g1, newly produced seeds g0, immigrant seeds c) and the disturbance process d. For the sake of clarity, we thus expressed each transition matrix as a function of those two steps. 
We thus defined a parameter fijk, reflecting the first step, as the probability for a patch in a given state to contain seedlings arising from the seed bank (i = 1 when the patch contained seeds in the bank at time t, 0 otherwise), from the seeds produced by plants above-ground at t (j = 1 when the patch was occupied above-ground at time t, 0 otherwise) and from immigrant seeds (k = 1 for every patch, because any patch, regardless of its state at time t, had the same probability c of containing at least one seedling arising from the colonization process between t and t + 1). For instance, f101 corresponded to the probability for a patch in state AP at time t to contain seedlings that emerged between t and t + 1. Under the four-state model with a one-year seed bank, hereafter referred to as Model 1, the transition matrix Φt from the state at t (in rows) to the state at t + 1 (in columns) can be written as:

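$$
\Phi_t =
\begin{array}{c|cccc}
 & \mathrm{AA} & \mathrm{AP} & \mathrm{PA} & \mathrm{PP} \\ \hline
\mathrm{AA} & (1 - f_{001}) + f_{001}d & 0 & f_{001}(1 - d) & 0 \\
\mathrm{AP} & (1 - f_{101}) + f_{101}d & 0 & f_{101}(1 - d) & 0 \\
\mathrm{PA} & 0 & (1 - f_{011}) + f_{011}d & 0 & f_{011}(1 - d) \\
\mathrm{PP} & 0 & (1 - f_{111}) + f_{111}d & 0 & f_{111}(1 - d)
\end{array}
$$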

with f001 = c, f101 = (g1 + c − g1c), f011 = (g0 + c − g0c) and f111 = (g1 + g0 + c − g0g1 − g1c − g0c + g1g0c) (see Appendix S1 in Supporting Information for the detailed transition matrix as a function of g0, g1, c and d). Besides the transitions between states, probabilities of the two observations {no plant above-ground, plant above-ground} had to be specified as conditional on the underlying state. If no plant was observed above-ground at t (conventional code ‘0’), the patch was not occupied above-ground at t (states AA and AP). Conversely, if plants were observed above-ground at t (conventional code ‘1’), the patch was occupied above-ground at t (states PA and PP). We thus defined the following matrix of observation probabilities Et with states in rows and observations in columns:

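$$
E_t =
\begin{array}{c|cc}
 & \text{‘0’} & \text{‘1’} \\ \hline
\mathrm{AA} & 1 & 0 \\
\mathrm{AP} & 1 & 0 \\
\mathrm{PA} & 0 & 1 \\
\mathrm{PP} & 0 & 1
\end{array}
$$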
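
The transition and observation matrices of Model 1 can be assembled numerically, and the likelihood of a single patch history then follows from the standard forward recursion for hidden Markov models. The sketch below is only an illustration under stated assumptions and is not the e-surge implementation: the function names are hypothetical, NumPy is an arbitrary choice, and the initial state distribution `pi` has to be supplied by the user because the text above does not specify it. The helper `seedling_prob` uses the fact that, for independent sources, the probability of at least one source yielding seedlings is one minus the product of the individual failure probabilities, which expands to the expressions given for f001, f101, f011 and f111.

```python
import numpy as np

STATES = ("AA", "AP", "PA", "PP")  # index = 2 * above-ground + seed bank

# Observation matrix E: states in rows, observations ('0', '1') in columns.
E = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0],
              [0.0, 1.0]])


def seedling_prob(i, j, g0, g1, c):
    """f_ij1: probability that at least one of the independent sources
    (seed bank if i = 1, new seeds if j = 1, migrants) yields seedlings."""
    return 1.0 - (1.0 - i * g1) * (1.0 - j * g0) * (1.0 - c)


def transition_matrix(g0, g1, c, d):
    """4 x 4 matrix Phi, state at t in rows, state at t + 1 in columns."""
    phi = np.zeros((4, 4))
    # (i, j) = (seed bank at t, above-ground at t) for AA, AP, PA, PP.
    for row, (i, j) in enumerate([(0, 0), (1, 0), (0, 1), (1, 1)]):
        p_above = seedling_prob(i, j, g0, g1, c) * (1.0 - d)
        bank_next = j  # seed bank at t + 1 only if plants above-ground at t
        phi[row, 2 + bank_next] = p_above        # to PA or PP
        phi[row, bank_next] = 1.0 - p_above      # to AA or AP
    return phi


def history_likelihood(history, phi, pi):
    """Forward algorithm: probability of a 0/1 history given Phi and an
    assumed initial state distribution pi over (AA, AP, PA, PP)."""
    alpha = np.asarray(pi, dtype=float) * E[:, history[0]]
    for obs in history[1:]:
        alpha = (alpha @ phi) * E[:, obs]
    return float(alpha.sum())


if __name__ == "__main__":
    phi = transition_matrix(g0=0.4, g1=0.3, c=0.2, d=0.1)
    print(phi.sum(axis=1))  # each row sums to 1
    print(history_likelihood([1, 0, 1], phi, pi=[0.25, 0.25, 0.25, 0.25]))
```

Maximizing the product of such likelihoods over all patches with respect to g0, g1, c and d would give parameter estimates; the optimizer and the treatment of the initial states are left open here.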

When g1 equalled 0, the model reduced to the classical metapopulation model for a species with no seed bank, with only two possible states A (absent above-ground) and P (present above-ground). By analogy with the four-state model, we defined a parameter fjk, as the joint probability for a patch to contain seedlings arising from the seeds produced by plants above-ground at t (j = 1 when the patch was occupied above-ground at time t, 0 otherwise) and from immigrant seeds (k = 1, see above). Under the two-state model with no seed bank, hereafter referred to as Model 2, the transition matrix Φ’t from the state at t (in rows) to the state at t + 1 (in columns) was as follows:

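$$
\Phi'_t =
\begin{array}{c|cc}
 & \mathrm{A} & \mathrm{P} \\ \hline
\mathrm{A} & (1 - f_{01}) + f_{01}d & f_{01}(1 - d) \\
\mathrm{P} & (1 - f_{11}) + f_{11}d & f_{11}(1 - d)
\end{array}
$$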

with f01 = c and f11 = (g0 + c − g0c). Note that the matrix elements f01(1 − d) and (1 − f11) + f11d correspond, respectively, to the apparent colonization rate