### Abstract

- Top of page
- Abstract
- Introduction
- Materials and methods
- Results
- Discussion
- Acknowledgements
- References

**1.** The construction of a predictive metapopulation model includes three steps: the choice of factors affecting metapopulation dynamics, the choice of model structure, and finally parameter estimation and model testing.

**2.** Unless the metapopulation is assumed to be at a stochastic quasi-equilibrium, and unless the parameter estimation method uses that assumption, estimates from a limited amount of data will usually predict a trend in metapopulation size.

**3.** This implicit estimation of a trend occurs because extinction-colonization stochasticity, possibly amplified by regional stochasticity, leads to unequal numbers of observed extinction and colonization events during a short study period.

**4.** Metapopulation models, such as those based on the logistic regression model, that rely on observed population turnover events in parameter estimation are sensitive to the implicit estimation of a trend.

**5.** A new parameter estimation method, based on Monte Carlo inference for statistically implicit models, allows an explicit decision about whether metapopulation quasi-stability is assumed or not.

**6.** The need to know before parameter estimation whether the metapopulation is in a quasi-stable state decreases our confidence in metapopulation model parameter estimates produced from only a few years of data.

**7.** The choice of whether to assume metapopulation stability in parameter estimation should be made consciously. Typical data sets cover only a few years and rarely allow a statistical test of a possible trend. In making the decision about stability, one should consider any available information about landscape history and about species and metapopulation characteristics.

### Introduction

The apparent paradigm shift from the theory of island biogeography to the theory of metapopulation dynamics has created a demand for quantitative metapopulation models for application to conservation (McCullough 1996; Hanski & Simberloff 1997). The construction of a practical metapopulation model begins with the choice of the factors presumed to affect metapopulation dynamics. Here the basic variables of classic metapopulation dynamics, habitat patch area and isolation (following MacArthur & Wilson 1967), have a prominent place, but additional factors may be included as appropriate. After choosing the variables, a modelling approach and a particular model structure are chosen. Modelling of local dynamics, dispersal, and the effects of habitat patch area and isolation on local extinction and colonization are among the components that are typically included in metapopulation models (e.g. Hanski 1994a).

The next step is model parameterization using empirical data. Here, it is important to consider the quality and quantity of data needed for reliable parameter estimation, as gathering data from metapopulations generally requires substantial resources. Stochastic patch occupancy models (SPOM; see Moilanen 1999 and references therein) ignore local dynamics and only model the presence and absence of the species within discrete habitat patches. One practical reason for using SPOMs is the relative simplicity of these models; because data on local dynamics are not required, the amount of work needed for parameter estimation is significantly reduced. SPOMs that have been used for prediction include the incidence function model (IFM; Hanski 1994a) and the logistic regression model (Sjögren Gulve & Ray 1996; ter Braak, Hanski & Verboom 1998).

This study is concerned with the parameterization of SPOMs and especially with a particular assumption relevant to parameter estimation, namely whether the metapopulation is assumed to be at a stochastic quasi-equilibrium or not. (By quasi-equilibrium I mean the ‘typical’ dynamic state of the metapopulation before the inevitable eventual extinction, characterized by a stationary distribution of the number of turnover events per time unit and a characteristic pattern of patch occupancy reflecting the long-term probabilities of different spatial patterns of occupancy.) This assumption has generally been coupled with the choice of the metapopulation model: for example, the IFM in its original form assumes that the metapopulation is at a stochastic quasi-equilibrium (Hanski 1994a), whereas the logistic regression model (Sjögren Gulve & Ray 1996) does not assume stability.

Here, I demonstrate how, when only turnover data are used in model parameterization and metapopulation stability is thus implicitly not assumed, estimation from empirical data is liable to produce parameter values that predict a spurious trend in metapopulation size. The implicit estimation of a trend occurs because empirical data are liable to show a spurious trend during a study period of only a few years. Such an apparent trend will almost certainly be present in empirical data even if the metapopulation truly is at a stochastic quasi-equilibrium. This is because extinction-colonization stochasticity, possibly amplified by regional stochasticity (spatially correlated environmental stochasticity; Hanski 1991), is likely to cause the numbers of extinction and colonization events to be unequal during a short study period. As an extreme case, one may envision a situation where, by chance, only extinctions or only colonizations are observed during a 2-year study period.
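
The imbalance between extinction and colonization counts can be illustrated with a minimal sketch. It assumes a toy occupancy model in which every patch has the same yearly extinction probability `e` and colonization probability `c`; this is a simplification for illustration only, not the logistic regression model or the IFM, and the function name and parameter values are hypothetical. The metapopulation starts at the stationary occupancy of the toy model, so any imbalance in the counts is due to chance alone.

```python
import random

def simulate_turnover(n_patches=53, e=0.3, c=0.3, surveys=2, rng=None):
    """Count extinction and colonization events in a toy occupancy model.

    Assumption (not the paper's model): every patch has the same yearly
    extinction probability e and colonization probability c, independent
    of patch area, isolation and the state of other patches.
    """
    rng = rng or random.Random()
    p_star = c / (c + e)  # stationary occupancy of this toy chain
    occupied = [rng.random() < p_star for _ in range(n_patches)]
    extinctions = colonizations = 0
    for _ in range(surveys - 1):  # n surveys give n - 1 observed transitions
        next_state = []
        for occ in occupied:
            if occ:
                dies = rng.random() < e
                extinctions += int(dies)
                next_state.append(not dies)
            else:
                colonized = rng.random() < c
                colonizations += int(colonized)
                next_state.append(colonized)
        occupied = next_state
    return extinctions, colonizations

# Repeat a 2-survey study many times on a metapopulation that is stationary
# by construction, and record how often the two event counts differ.
rng = random.Random(42)
results = [simulate_turnover(rng=rng) for _ in range(500)]
unequal_fraction = sum(ext != col for ext, col in results) / len(results)
print(f"Replicates with unequal event counts: {unequal_fraction:.2f}")
```

Under these assumptions most replicates show unequal counts, so parameters fitted to turnover events alone would imply either a decline or an increase even though no trend exists.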

In this study, a number of simulated patch occupancy data sets were generated, and subsequently parameters relevant for the logistic regression model and the IFM were estimated from these simulated data sets. For the IFM, parameters were estimated using a new method for SPOMs, which is based on Monte Carlo inference for statistically implicit models (Moilanen 1999). This method computes maximum likelihood estimates from an observed sequence of patch occupancy patterns, and it allows one to make an explicit choice of whether metapopulation stability is assumed or not. Finally, metapopulation dynamics were predicted using the different parameterized models and the models/estimation methods were compared in their susceptibility to the implicit estimation of a trend.

### Results

Figure 4 shows the predicted metapopulation dynamics when simulated with parameters estimated from simulated data. Each line in the figure corresponds to one simulation run done with parameters estimated from one of the 10 simulated data sets. Using logistic regression of population turnover events in parameter estimation (Fig. 4a,b) resulted in great variation in the predicted dynamics, even when 5 years of data were available. With logistic regression and 2 years of data, not a single replicate produced dynamics that resemble the original time series (Fig. 2). This demonstrates how the implicit estimation of a trend can cause the predicted dynamics to diverge from the true dynamics of the metapopulation at a stochastic quasi-equilibrium. In contrast, dynamics (Fig. 4c,d) predicted by the IFM/MC more closely resemble the dynamics of the original time series. Figure 4e,f shows the results for the IFM/TMC, which calculates an exact likelihood for the observed data, but ignores the first term in equation 4 and thus does not impose the condition of stationarity. The quality of the results deteriorates in comparison to those produced with IFM/MC. In summary, Fig. 4 demonstrates how implicit trend estimation can cause an increase in the variance of predictions produced by different model parameterizations. Predictions can err in both directions: either metapopulation extinction or too strong metapopulation persistence can be erroneously predicted. For example, with 2 years of data, simulated data sets 7, 9 and 10 had a tendency to produce parameter values that predict extinction. With 5 years of data, estimations from data sets 1 and 4 tended to lead to extinctions. These data sets are ones that show a declining trend (for 2 or 5 years) in patch occupancy in Fig. 2.

Table 1 summarizes some quantitative aspects of the time series shown in Fig. 4. IFM/MC produces the results that most resemble the original time series. For IFM/MC the mean distance between the mean *p* of the original time series and the mean *p* of the predicted time series was only 0·07 when 5 years of data were available. This can be compared to mean distances of 0·15 and 0·21 for the IFM/TMC and logistic regression, respectively. In fact, using this measure of similarity between original dynamics and predictions, the IFM/MC produces better results with 2 years of data than the other methods do with 5 years of data. As the original metapopulation was at a stochastic quasi-equilibrium, this result is not unexpected, because IFM/MC is the only one of the three methods that assumes metapopulation quasi-stationarity. However, the result underscores the significance of the equilibrium assumption, since the only difference between IFM/MC and IFM/TMC is the inclusion or exclusion of the stability term P[*O*(*t*₀)] of equation 4 in the objective function in parameter estimation.

Table 1. Quantitative characteristics of time series shown in Fig. 4, computed for the 10 predictions generated with each model/estimation method. The mean and standard deviation of the average *p* in the 10 predictions is reported as well as the number of replicates declining to *P* = 0 or climbing to *P* = 1. Column ‘distance’ gives the average distance between the mean *p* of the original time series (0·437) and the mean *p* of the predicted time series, which characterizes the sensitivity of the method to implicit trend estimation, because different mean *p*'s in the original and predicted time series indicate an initial period, during which the predicted dynamics diverge from the original dynamics.

| Method | Number of patterns | Average predicted *P* (SD) | No. of replicates with *P* = 0 or *P* = 1 | Distance (SD) |
|---|---|---|---|---|
| Logistic regression | 2 | 0·33 (0·32) | 4 | 0·31 (0·30) |
| Logistic regression | 5 | 0·46 (0·28) | 3 | 0·21 (0·16) |
| IFM/MC | 2 | 0·35 (0·19) | 2 | 0·15 (0·14) |
| IFM/MC | 5 | 0·41 (0·10) | 0 | 0·07 (0·08) |
| IFM/TMC | 2 | 0·42 (0·34) | 5 | 0·30 (0·13) |
| IFM/TMC | 5 | 0·38 (0·21) | 2 | 0·15 (0·15) |
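
The ‘distance’ column can be read as follows. This is a minimal sketch, assuming that the distance is the absolute difference between the mean *p* of each predicted time series and the mean *p* of the original series (0·437), averaged over replicates; the exact definition is not spelled out here, and the two predicted series below are hypothetical numbers chosen only to show the arithmetic.

```python
from statistics import mean, stdev

def distance_to_original(original_mean_p, predicted_series):
    """Mean and SD of |mean p of a prediction - mean p of the original series|.

    Assumption: 'distance' is an absolute difference of mean occupancies,
    computed per replicate and then summarized over replicates.
    """
    per_replicate = [abs(mean(series) - original_mean_p)
                     for series in predicted_series]
    return mean(per_replicate), stdev(per_replicate)

# Hypothetical predicted occupancy series (two replicates, two years each).
predictions = [[0.40, 0.50], [0.20, 0.30]]
dist_mean, dist_sd = distance_to_original(0.437, predictions)
print(f"distance = {dist_mean:.3f} (SD {dist_sd:.3f})")
```

A prediction that drifts away from the original occupancy level, in either direction, increases this distance.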

Figure 5 shows the predicted dynamics of the metapopulation when simulated with parameters estimated from the 10 replicate sets of simulated non-equilibrium data (Fig. 3). IFM/MC predicts fluctuations with a low *p*. These simulations do not indicate very stable persistence, since eight out of the 10 simulations went extinct before year 500 (not shown). The logistic regression method and IFM/TMC correctly predict rapid metapopulation extinction, with the exception of one replicate with the logistic regression method predicting *p* increasing to 1·0. This one deviant prediction demonstrates how spurious trend estimation may cause a true non-equilibrium situation to appear quasi-stable if data for only a few years are available.

Finally, consider a metapopulation model that has been parameterized assuming metapopulation quasi-stability. When such a model predicts stability with a low *p*, there may be cause for concern about the persistence of the metapopulation. Figure 6 repeats the simulations in Fig. 5b (IFM/MC, non-equilibrium data), but now the value of the extinction parameter *e* is increased by 33%. Six out of 10 replicates now predict the metapopulation to go extinct in 100 years, which suggests that one cannot really be confident about the persistence of this metapopulation.
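
The logic of this kind of perturbation test can be sketched with a toy model. The sketch assumes a Levins-type patch model in which the colonization probability of an empty patch is proportional to the current fraction of occupied patches; this is not the IFM, and all names and parameter values are hypothetical. It compares the fraction of replicates going extinct within 100 years before and after increasing the extinction parameter by 33%.

```python
import random

def goes_extinct(n_patches, e, c, n_occupied, years, rng):
    """Return True if a toy Levins-type metapopulation dies out within `years`.

    Assumption (not the IFM): each occupied patch goes extinct with yearly
    probability e, and each empty patch is colonized with probability
    min(1, c * p), where p is the current fraction of occupied patches.
    """
    k = n_occupied
    for _ in range(years):
        if k == 0:
            return True
        col_prob = min(1.0, c * k / n_patches)
        survivors = sum(rng.random() >= e for _ in range(k))
        colonized = sum(rng.random() < col_prob for _ in range(n_patches - k))
        k = survivors + colonized
    return k == 0

def extinction_fraction(e, c, n_patches=53, n_occupied=10, years=100,
                        n_replicates=200, seed=1):
    """Fraction of replicate simulations that go extinct within `years`."""
    rng = random.Random(seed)
    extinct = sum(goes_extinct(n_patches, e, c, n_occupied, years, rng)
                  for _ in range(n_replicates))
    return extinct / n_replicates

base = extinction_fraction(e=0.40, c=0.60)
perturbed = extinction_fraction(e=0.40 * 1.33, c=0.60)  # e increased by 33%
print(f"extinct within 100 years: base {base:.2f}, perturbed {perturbed:.2f}")
```

If a modest increase in the extinction parameter changes the outcome from persistence to frequent extinction, the predicted persistence should not be trusted.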

### Discussion

The results for the logistic regression model and IFM/TMC (Fig. 4) demonstrate how the implicit estimation of a trend is likely to occur when only turnover data are used in model parameterization and metapopulation quasi-stability is not assumed. Particularly, the logistic regression model, which necessarily uses only data on population turnover events in parameterization, failed to produce dynamics that resembled the true dynamics of simulated data. This is discouraging, because simulated data were produced using exactly the same model structure (logistic regression) and, thus, the logistic regression should have been able to recover the original dynamics. The problem is, of course, that only a few years of simulated data were available for parameter estimation and, thus, the number of observed turnover events remained low. Also, a few years of data is so short a period that a spurious trend is likely to occur, which induces the implicit estimation of a trend. However, the amount of data maximally available here (five surveys of 53 patches) is quite large in comparison to what is available in empirical studies and, therefore, one hopes that such data could be used for metapopulation model parameterization. The implicit estimation of a trend can also be seen in Moilanen (1999), where simulated data were generated using the IFM. There the scatter in parameter estimates produced with the IFM/TMC is much larger than with the IFM/MC, which implies a larger variance in predictions with the IFM/TMC.

Better parameter estimates can naturally be obtained when more data are available. Unfortunately, it is likely that merely increasing the number of surveyed habitat patches will not much alleviate the problem of the implicit estimation of a trend. This is because in natural metapopulations regional stochasticity is liable to produce extra fluctuation in the observed numbers of extinction and colonization events (Hanski 1991). For example, in a survey of 1530 habitat patches used by the butterfly *M. cinxia*, 524 patches were observed to be populated in 1993. From 1993 to 1994, 256 extinction and 119 colonization events were observed (Hanski *et al*. 1995). In this case, the great inequality in extinction and colonization events was probably due to a dry summer, which caused many local populations to go extinct due to larval food plant shortage (Hanski *et al*. 1995). However, there was no reason to expect that the metapopulation of *M. cinxia* would be declining to extinction, because the metapopulation is large and it contains many areas where habitat is dense (I. Hanski, personal communication). During 1995–98 the number of local populations stayed around 250–300, but there were large fluctuations in the number of larval groups: counts of 1989, 1018, 1009 and 1467 larval groups were recorded in 1995, 1996, 1997 and 1998, respectively, in the patches surveyed in 1993 (M. Nieminen, personal communication). This demonstrates how regional stochasticity may cause large and rapid fluctuations in the number of occupied patches even in a large metapopulation. However, if only turnover data for the years 1993–94 were available for parameter estimation, one might have concluded that the metapopulation was strongly declining, possibly to extinction. This demonstrates that surveying a large number of patches will not eliminate the risk of an unintended estimation of a trend; rather, the remedy is to acquire data for many years.
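
The size of the trend implied by the 1993–94 turnover counts can be worked out with simple arithmetic. The sketch below rests on assumptions that are not stated in the text: that the same 1530 patches were resurveyed in 1994, that all colonizations occurred in patches empty in 1993, and that extinction and colonization rates are constant across patches. The last assumption is a crude simplification used only to obtain a long-run occupancy from the two rates.

```python
n_patches = 1530
occupied_1993 = 524
extinctions = 256
colonizations = 119

empty_1993 = n_patches - occupied_1993                        # 1006
occupied_1994 = occupied_1993 - extinctions + colonizations   # 387

# Naive per-patch rates estimated from the single observed transition.
e_hat = extinctions / occupied_1993
c_hat = colonizations / empty_1993

p_1993 = occupied_1993 / n_patches
p_1994 = occupied_1994 / n_patches

# Long-run occupancy implied by constant rates (toy two-state chain).
p_implied = c_hat / (c_hat + e_hat)

print(f"p(1993) = {p_1993:.3f}, p(1994) = {p_1994:.3f}")
print(f"e_hat = {e_hat:.3f}, c_hat = {c_hat:.3f}, implied p = {p_implied:.3f}")
```

Under these assumptions the turnover counts alone imply a long-run occupancy well below both observed values, which illustrates how a single unfavourable year can be read as a strong decline.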

An alternative approach is to assume that the metapopulation is at a stochastic quasi-equilibrium, and to perform the parameter estimation accordingly. In this study, IFM/MC, which assumes metapopulation quasi-stability, produced dynamics (Fig. 4c,d; Table 1) that were close to those of the original time series (Fig. 2). The amount of data typically available for model parameterization, which is rarely more than 5 years/snapshots of data even for large data sets (see van der Meijden & van der Veen-van Wijk 1997 for an exceptional data set of 20 years), will not be enough for a proper statistical test for the presence of a long-term trend. Therefore, the decision of whether to assume metapopulation quasi-stability or the presence of a real trend should be based on all available information on landscape history, species characteristics and metapopulation structure. In particular, if there is no evidence for recent habitat loss or other environmental change affecting the habitat patches or the matrix, one might assume that the metapopulation is at a stochastic quasi-equilibrium. If there is evidence for recent habitat loss, a non-equilibrium situation (Tilman *et al*. 1994; Hanski, Moilanen & Gyllenberg 1996b) is possible, especially if the turnover rate in the metapopulation is low. However, if the turnover rate in the metapopulation is high and habitat is still relatively dense, the metapopulation may be expected to achieve a new quasi-equilibrium relatively quickly (Hanski 1998b), and a few years after habitat loss the equilibrium assumption may again be justified.

It is possible to examine the sensitivity of a particular combination of metapopulation and metapopulation model to the implicit estimation of a trend. First, estimate the best possible parameter values for the model, without much concern for the reliability of the estimates. Then simulate the parameterized model and sample a number of simulated data sets, as was done in this study. Next, parameterize the model with the simulated data sets and perform quantitative predictions for the metapopulation. If there is large variation in the predictions based on parameters estimated from the simulated data, it is very likely that the system is sensitive to the implicit estimation of a trend. Note that the situation with the empirical data is liable to be worse than that suggested by this procedure. In reality, the metapopulation model is at best an approximation of what is going on in nature, and extra unaccounted factors, such as regional stochasticity, will decrease the reliability of model predictions. Unaccounted sources of stochasticity may invalidate turnover data-based parameter estimation in one important way: it is perhaps possible to argue that if the metapopulation is at a quasi-stationary state, then parameter values predicting stationarity should be within the joint confidence limits of the parameters even when only turnover data are used in estimation. However, if unaccounted forms of stochasticity are amplifying fluctuations in the metapopulation, parameter values predicting stationarity do not have to be within the confidence limits produced by turnover-based parameter estimation.
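
The simulate, re-estimate and predict procedure can be sketched as follows. This is a minimal illustration, assuming a toy model with constant per-patch extinction and colonization probabilities and a simple turnover-count estimator; it stands in for, and is much simpler than, the logistic regression model or the IFM, and all function names and parameter values are hypothetical.

```python
import random
import statistics

def simulate_surveys(n_patches, e, c, n_surveys, rng):
    """Simulate yearly occupancy snapshots of a toy model with constant e, c."""
    p_star = c / (c + e)  # start from the stationary occupancy
    state = [rng.random() < p_star for _ in range(n_patches)]
    surveys = [state]
    for _ in range(n_surveys - 1):
        state = [(rng.random() >= e) if occ else (rng.random() < c)
                 for occ in state]
        surveys.append(state)
    return surveys

def estimate_from_turnover(surveys):
    """Estimate e and c from turnover events only (no stationarity assumed)."""
    ext = col = occupied_at_risk = empty_at_risk = 0
    for before, after in zip(surveys, surveys[1:]):
        for was_occupied, is_occupied in zip(before, after):
            if was_occupied:
                occupied_at_risk += 1
                ext += int(not is_occupied)
            else:
                empty_at_risk += 1
                col += int(is_occupied)
    e_hat = ext / occupied_at_risk if occupied_at_risk else 0.0
    c_hat = col / empty_at_risk if empty_at_risk else 0.0
    return e_hat, c_hat

def predicted_occupancy(e_hat, c_hat):
    """Long-run occupancy implied by the estimates in the toy model."""
    total = e_hat + c_hat
    return c_hat / total if total > 0 else None

def sensitivity_check(e, c, n_patches, n_surveys, n_replicates, seed=0):
    """Mean and SD of predictions across re-estimated simulated data sets."""
    rng = random.Random(seed)
    predictions = []
    for _ in range(n_replicates):
        surveys = simulate_surveys(n_patches, e, c, n_surveys, rng)
        p = predicted_occupancy(*estimate_from_turnover(surveys))
        if p is not None:  # skip replicates with no turnover at all
            predictions.append(p)
    return statistics.mean(predictions), statistics.stdev(predictions)

mean_2, sd_2 = sensitivity_check(0.3, 0.3, n_patches=53, n_surveys=2,
                                 n_replicates=200)
mean_20, sd_20 = sensitivity_check(0.3, 0.3, n_patches=53, n_surveys=20,
                                   n_replicates=200)
print(f"2 surveys:  predicted p {mean_2:.2f} (SD {sd_2:.2f})")
print(f"20 surveys: predicted p {mean_20:.2f} (SD {sd_20:.2f})")
```

A large spread of predictions among replicates, as with the short series here, is the signal that the system is sensitive to the implicit estimation of a trend.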

The results showing how turnover-based parameter estimation is sensitive to the implicit estimation of a trend cast doubt on our ability to reliably estimate parameters of metapopulation models even with relatively large data sets. This suggests that quantitative predictions with predictive metapopulation models such as the logistic regression model and the IFM should be viewed with caution. Basing conservation decisions on direct quantitative predictions will be hazardous, as problems in model parameterization cause predictions to be unreliable to varying degrees. Fortunately, this does not mean that nothing can be done. An alternative approach is to use predictive models to rank different conservation scenarios (Possingham & Davies 1995; Akçakaya & Atwood 1997; Hanski & Simberloff 1997; Hanski 1998a). A ranking of scenarios should be combined with a sensitivity analysis (McCarthy, Burgman & Ferguson 1995; Possingham & Davies 1995). It is especially important to ensure that results predicting metapopulation persistence are not sensitive to small variations in model parameters.

Sensitivity analysis can be used to reveal a particular potential error when model parameters are estimated assuming metapopulation quasi-stability, for example using the MC method, which in this study was clearly superior in its ability to recover the dynamics of the quasi-stable original time series. The concern is the situation in which metapopulation quasi-stability is assumed when, in reality, there is a non-equilibrium situation (Fig. 5b). In this case, enforcing stability during parameter estimation will result in parameters that predict metapopulation persistence, but with a low average *p*. This is because assuming stability necessitates that the estimated parameters enable the metapopulation to persist, but the decreasing trend in *p* will favour parameter values that produce persistence with a low *p*. It thus follows that such persistence is likely to be sensitive to small variations in parameter values, and a sensitivity analysis should indicate that the persistence is suspect. Such an analysis was done in Fig. 6, and in this case the metapopulation turned out to be unstable, which is the correct result, as the original data came from a metapopulation that was declining to extinction (Fig. 3).

The possible existence of a trend in the empirical data can also be examined in relation to a particular metapopulation model. First, obtain parameter estimates assuming quasi-stability, then perform a number of simulations starting from the first empirically observed patch occupancy pattern. Next, examine whether the observed dynamics fall within the 95% confidence limits of the predicted dynamics. This is depicted in Fig. 7. If the observed dynamics fall outside the 95% confidence limits of the predicted dynamics (as computed from a large number of simulation runs), then the model is unlikely to produce dynamics that correspond to the observed dynamics. Unfortunately, there are several possible causes for this situation. There may be a trend in the observed data, but it is also possible that regional stochasticity is amplifying the observed dynamics or that the model structure is incomplete or wrong. Note also that the fit of a metapopulation model to data cannot be measured by a mere correspondence between observed and predicted values of *p*, as *p* is an aggregate measure of metapopulation dynamics. The correct thing to do is to examine the fit between the observed and predicted patterns of occupancy (see Moilanen *et al*. 1998). However, if the model is unable to produce fluctuations in *p* similar to those that have been observed empirically, then something is surely wrong with the model or its parameterization.
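
The comparison of observed dynamics with simulated confidence limits can be sketched as follows. The sketch assumes a toy model with constant per-patch extinction and colonization probabilities in place of a fitted metapopulation model, and uses simple empirical quantiles of simulated *p* as the 95% limits; the function names, parameter values and observed series are hypothetical.

```python
import random

def simulate_p(initial, e, c, n_years, rng):
    """Simulate the fraction of occupied patches, starting from `initial`.

    Assumption (toy model): constant yearly extinction probability e for
    occupied patches and colonization probability c for empty patches.
    """
    state = list(initial)
    series = [sum(state) / len(state)]
    for _ in range(n_years):
        state = [(rng.random() >= e) if occ else (rng.random() < c)
                 for occ in state]
        series.append(sum(state) / len(state))
    return series

def envelope(initial, e, c, n_years, n_runs=1000, seed=0):
    """Empirical 2.5% and 97.5% limits of simulated p for each year."""
    rng = random.Random(seed)
    runs = [simulate_p(initial, e, c, n_years, rng) for _ in range(n_runs)]
    limits = []
    for t in range(n_years + 1):
        values = sorted(run[t] for run in runs)
        lower = values[int(0.025 * (n_runs - 1))]
        upper = values[int(0.975 * (n_runs - 1))]
        limits.append((lower, upper))
    return limits

def within_limits(observed_p, limits):
    """Check every observed p after the first survey against the limits."""
    return all(lower <= p <= upper
               for p, (lower, upper) in zip(observed_p[1:], limits[1:]))

# First observed pattern: 26 of 53 patches occupied (hypothetical).
first_pattern = [True] * 26 + [False] * 27
limits = envelope(first_pattern, e=0.3, c=0.3, n_years=3)

steady = [26 / 53, 0.50, 0.50, 0.50]      # hypothetical observed series
declining = [26 / 53, 0.30, 0.10, 0.00]   # hypothetical observed series
print("steady series within limits:", within_limits(steady, limits))
print("declining series within limits:", within_limits(declining, limits))
```

As the text notes, a series falling outside the limits does not identify the cause: a real trend, regional stochasticity and a misspecified model would all produce the same outcome in this check.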

The truly important question is not whether there is a trend in the data, but rather whether the metapopulation is persistent or not. For example, the metapopulation may be endangered merely because of its small size, even if the empirical data do not show a distinct declining trend. Also, a strong short-term declining trend may be caused by regular extinction-colonization stochasticity and regional stochasticity, and it will be difficult to prove statistically the existence of a trend from only a few years of data. Even so, strong short-term fluctuations indicate a large amplitude in the fluctuations of the proportion of occupied patches, which itself may be an indication of susceptibility to metapopulation extinction. In summary, sensitivity analysis, combined with the most informed choice of metapopulation model structure and parameterization, is the correct tool for investigating metapopulation persistence.