Experimental or precautionary? Adaptive management over a range of time horizons

Authors

  • Cindy E Hauser,

    Corresponding author
    1. Australian Centre of Excellence for Risk Analysis, School of Botany, University of Melbourne VIC 3010 Australia;
    2. The Ecology Centre, School of Integrative Biology, University of Queensland, St Lucia QLD 4072 Australia; and
    3. Department of Mathematics, University of Queensland, St Lucia QLD 4072 Australia
      Correspondence author. E-mail: chauser@unimelb.edu.au
  • Hugh P Possingham

    1. The Ecology Centre, School of Integrative Biology, University of Queensland, St Lucia QLD 4072 Australia; and
    2. Department of Mathematics, University of Queensland, St Lucia QLD 4072 Australia

Summary

  1. Many studies of adaptive harvest management already exist in the literature, but most (if not all) use long, sometimes infinite, time horizons. Such long-term objectives provide an opportunity to manage experimentally, so that poorly understood dynamics are learned and any returns sacrificed for experimentation are repaid by improved management over the remaining time horizon.
  2. However, a manager is unlikely to weight outcomes in the distant future equally against outcomes in the present. Furthermore, the most appropriate model of system dynamics may not remain constant over the time-frame required to experiment, learn and improve management. In these cases the use of discounting and/or a finite time horizon fits the manager's assumptions and goals more effectively, and the value of experimentation is likely to be diminished.
  3. In this paper we construct a simple model of a hypothetical population and compare optimal passive and active adaptive harvest strategies over a range of time horizons. This allows us to determine the optimal level of experimentation for short-, medium- and long-term goals.
  4. We discover that the optimal active adaptive harvest strategy may be precautionary over short to medium time horizons, rather than experimental. That is, an action with known moderate benefits is preferred over an action with uncertain but marginally larger expected benefits. This runs counter to the widespread assumption in the adaptive management literature that incorporating learning into an optimization of management will encourage experimentation.
  5. Synthesis and applications. The general results of this paper have potential application to any environmental management problem where adaptive management might be applied; for example, conservation, pest control, harvesting and management of water flows. We examine adaptive management over a range of finite time horizons to reflect a variety of possible management goals and assumptions. Our simple example demonstrates that in the face of model uncertainty, the management strategy that maximizes benefits does not necessarily include deliberate experimentation and learning. Optimal active adaptive management weighs experimentation against all its potential consequences, and this can yield a precautionary approach.

Introduction

There is widespread interest in applying adaptive management principles to ecological systems. The theory of adaptive management has been developed in the context of harvesting, particularly for fish (Walters & Hilborn 1976; Smith & Walters 1981; Walters 1981; Walters, Goruk & Radford 1993) and waterfowl (Nichols, Johnson & Williams 1995; Williams & Johnson 1995; Johnson & Case 2000) but more recently it has been explored in other areas of environmental management, such as pest control (Shea et al. 2002), marine reserve design (Gerber, Beger, McCarthy & Possingham 2005), water flow management (Pearsall, McCrodden & Townsend 2005), forest management for the maintenance of old-growth habitat (Moore & Conroy 2006) and revegetation planning (McCarthy & Possingham 2007).

The purpose of adaptive management is to acknowledge, deal with and sometimes resolve model uncertainty (Holling 1978; Walters 1986). Model uncertainty arises when several plausible hypotheses exist to explain system dynamics, and these hypotheses imply different optimal management strategies (e.g. Walters 1981; Pascual, Kareiva & Hilborn 1997; Runge & Johnson 2002; Hauser et al. 2007). When adaptive management is framed as a formal optimization, hypotheses take the form of mathematical models, which are weighted to determine the optimal management strategy. Over time, new data are used to reassess the plausibility of each hypothesis and update model weights. Thus, the manager learns about system dynamics while managing.

Adaptive management approaches can be classified as passive or active (Parma et al. 1998). Passive adaptive management has been defined in several different ways (e.g. Walters 1981; Parma et al. 1998; Williams 2001; Pearsall et al. 2005) and we use Williams’ (2001) definition, where multiple models and model weights are used but learning is not accounted for explicitly in determining the optimal management strategy. Nevertheless, model updating and learning occur ‘passively’ as the system is managed. Active adaptive management is more widely agreed to be the management strategy that explicitly incorporates learning, and manages the system optimally to maximize returns.

It is usually anticipated that an active adaptive strategy will be more experimental than a passive adaptive or non-adaptive strategy, although the optimal active adaptive strategy has been similar to passive adaptive management in a small number of studies (Smith & Walters 1981; Ludwig & Walters 1982). An experimental strategy typically involves forgoing returns in the short term for learning, resulting in improved understanding and maximized returns in the long term (Silvert 1978; Smith & Walters 1981; Walters et al. 1993). Some studies use economic or information discounting (Silvert 1978; Smith & Walters 1981). This diminishes the value of future returns and, thus, experimentation.

This study uses a simple hypothetical population model with one uncertain parameter to explore adaptive management approaches to harvesting. In addition, the main results can be interpreted in the context of other adaptive management applications, such as conservation and pest control. We determine optimal passive and active adaptive harvest strategies that are time- and state-dependent, using stochastic dynamic programming (Walters & Hilborn 1976; Williams 1996). The forms of these optimal strategies over long time horizons are consistent with those in other studies. However, we discover that the early experimentation that is present in many active adaptive strategies may be a result of the long (sometimes infinite) time horizon. That is, only the very long-term strategy is usually published in the literature. We present scenarios of active adaptive management with short- and medium-length time horizons that exhibit precautionary, rather than experimental, characteristics.

Population model

Consider a harvested population where the population can be in one of three classes: collapsed (1), vulnerable (2) or robust (3). Each year the population may move from one class to another and the probability of the population having any particular size next year depends on the current size of the population, but not its size in any earlier years. Furthermore, the probabilities of transition from one class to another depend on whether or not the population is being harvested during that year. If the population is collapsed then it cannot be harvested. When the population is vulnerable or robust the manager must decide whether or not to harvest. Harvesting the population increases the probability that it will collapse next year. We assume that all transition probabilities are known for both strategies, harvest or not, except for the probability of recovering from a collapsed state to a vulnerable state, p, which is initially unknown. Uncertainty surrounding this particular parameter, the recovery rate p, is relevant for a renewable resource that, historically, has not been harvested or has been harvested lightly. In this situation, data describing the population's ability to recover from high rates of harvest are likely to be limited.

The transition probabilities investigated in this study are listed in Table 1. We consider two scenarios: in the first, the population will never collapse unless it is harvested; in the second, there is a chance of the population collapsing even in years when no harvest takes place. We call these the ‘no random collapse’ and ‘some random collapse’ scenarios, respectively.

Table 1.  Transition probabilities in the population model

                                        Size next year
                       Current size     1            2          3
‘No random collapse’ scenario
  Without harvest      1                1 – p [a]    p [a]      0
                       2                0            0·5        0·5
                       3                0            0·4        0·6
  With harvest         1                – [b]        – [b]      – [b]
                       2                0·6          0·4        0
                       3                0            0·7        0·3
‘Some random collapse’ scenario
  Without harvest      1                1 – p [a]    p [a]      0
                       2                0·05         0·45       0·5
                       3                0            0·4        0·6
  With harvest         1                – [b]        – [b]      – [b]
                       2                0·6          0·4        0
                       3                0·2          0·6        0·2

  [a] The recovery rate p is a fixed constant but unknown.
  [b] When the population is currently collapsed (size 1), harvest is not possible.
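
For readers who wish to experiment with the model, the transition probabilities of Table 1 can be encoded directly. The following is a minimal sketch in Python; the function name `transition_rows`, the scenario labels and the 0-based class indexing are illustrative assumptions and are not part of the original study. Because a collapsed population cannot be harvested, the collapsed row of the 'with harvest' matrix is filled with the no-harvest row purely as a placeholder.

```python
def transition_rows(p, scenario="no_random_collapse"):
    """Return {decision: matrix}, where matrix[s][j] = P(next class j | current class s).

    Classes collapsed, vulnerable and robust (sizes 1, 2, 3 in Table 1) are
    indexed 0, 1, 2. Decision 0 means no harvest, decision 1 means harvest.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("recovery rate p must lie in [0, 1]")
    collapsed = [1.0 - p, p, 0.0]  # only row that depends on the unknown p
    if scenario == "no_random_collapse":
        no_harvest = [list(collapsed), [0.0, 0.5, 0.5], [0.0, 0.4, 0.6]]
        harvest = [list(collapsed), [0.6, 0.4, 0.0], [0.0, 0.7, 0.3]]
    elif scenario == "some_random_collapse":
        no_harvest = [list(collapsed), [0.05, 0.45, 0.5], [0.0, 0.4, 0.6]]
        harvest = [list(collapsed), [0.6, 0.4, 0.0], [0.2, 0.6, 0.2]]
    else:
        raise ValueError("unknown scenario: " + scenario)
    # harvest[0] is a placeholder (assumption): harvest is impossible when collapsed.
    return {0: no_harvest, 1: harvest}


if __name__ == "__main__":
    for name in ("no_random_collapse", "some_random_collapse"):
        matrices = transition_rows(0.3, name)
        print(name, "vulnerable, with harvest:", matrices[1][1])
```

Each row is a probability distribution over next year's class, so a quick check that every row sums to one guards against transcription errors.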

We assume that the manager's goal is to maximize the expected number of years in which harvesting takes place over a finite time horizon. Furthermore, future harvests can be devalued with respect to the value of harvest in the present with the use of a discount rate. The discount rate is the annually compounding interest rate at which returns can be invested for the remainder of the management time frame. We use it to obtain the present value of each harvest that occurs in the future.
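
As a small illustration of this objective, the sketch below computes the present value of a sequence of harvests, counting each harvested year as one unit of value. The function names are hypothetical and the harvest schedule in the usage lines is invented for illustration only.

```python
def present_value(years_ahead, r):
    """Present value of one unit of harvest taken `years_ahead` years from now."""
    return (1.0 + r) ** (-years_ahead)


def discounted_total(harvest_years, r):
    """Sum of present values for harvests taken in the listed years (0 = now)."""
    return sum(present_value(t, r) for t in harvest_years)


if __name__ == "__main__":
    # Hypothetical schedule: harvest now and in years 1, 2 and 10.
    schedule = [0, 1, 2, 10]
    print("undiscounted:", discounted_total(schedule, 0.0))
    print("r = 0.05:    ", round(discounted_total(schedule, 0.05), 4))
```

With r = 0 every harvest counts equally and the objective reduces to the number of harvested years; with r = 0·05 a harvest ten years ahead is worth roughly 0·61 of a harvest today.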

Optimal harvest strategy with complete knowledge of parameters

Our task is to determine whether or not to harvest the population given the current population size, and we aim to harvest the population in as many years as possible, in order to maximize profit. The method of stochastic dynamic programming (Walters & Hilborn 1976; Williams 1996) provides us with the optimal solution, to harvest or not, for every possible combination of states (population size and time-to-horizon) that the manager could encounter.

As a first step, we assume that the recovery rate is a known constant and determine the optimal harvest strategy. This gives us some understanding of the trade-off between harvesting in the present and avoiding future collapse before we consider uncertainty surrounding the recovery rate. The harvest value during the management time-frame is calculated recursively using the equation:

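$$V(s, t) = \max_{d \in D(s)} \left\{ d + (1 + r)^{-1} \sum_{j=1}^{3} p_d(j, s) \, V(j, t + 1) \right\} \quad \text{for } t < T$$ (eqn 1)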

where V(s, t) is the expected cumulative harvest value (in time t currency) from year t to the time horizon T, given that the optimal harvest strategy is taken over those years and the population has size s in year t; d is the manager's decision to harvest (d = 1) or not (d = 0); D(s) is the set of harvest options ({0} for a population size of s = 1; {0, 1} for s = 2 or 3); r is the discount rate; and p_d(j, s) is the probability of the population shifting from size s to size j in one year, given that harvest decision d is taken (see Table 1).

A collapsed population cannot be harvested, so in this case equation 1 becomes:

$$V(1, t) = (1 + r)^{-1} \left[ (1 - p) \, V(1, t + 1) + p \, V(2, t + 1) \right] \quad \text{for } t < T$$ (eqn 2)

using the transition probabilities in Table 1.

We assume no returns at or beyond the time horizon, so the terminal condition is:

$$V(s, T) = 0 \quad \text{for } s = 1, 2, 3$$ (eqn 3)

We will consider other terminal conditions in the Discussion.
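
The backward recursion with a known recovery rate can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the code used in the study: the one-year reward is taken to be 1 for a harvested year and 0 otherwise, ties are broken in favour of harvesting, and the names `transitions` and `solve_known_recovery` are hypothetical.

```python
def transitions(p, scenario="no_random_collapse"):
    """P[d][s][j]: probability of moving from class s to class j under decision d.

    Classes collapsed, vulnerable, robust are indexed 0, 1, 2 (Table 1).
    """
    collapsed = [1.0 - p, p, 0.0]
    if scenario == "no_random_collapse":
        return {0: [collapsed, [0.0, 0.5, 0.5], [0.0, 0.4, 0.6]],
                1: [collapsed, [0.6, 0.4, 0.0], [0.0, 0.7, 0.3]]}
    if scenario == "some_random_collapse":
        return {0: [collapsed, [0.05, 0.45, 0.5], [0.0, 0.4, 0.6]],
                1: [collapsed, [0.6, 0.4, 0.0], [0.2, 0.6, 0.2]]}
    raise ValueError("unknown scenario: " + scenario)


def solve_known_recovery(p, T, r=0.0, scenario="no_random_collapse"):
    """Backward induction for a known recovery rate p over T years.

    Returns (V0, policy): V0[s] is the value in year 0 and policy[t][s] is the
    optimal decision (1 = harvest) in year t for class s.
    """
    P = transitions(p, scenario)
    V = [0.0, 0.0, 0.0]                      # terminal condition: no value at T
    policy = [None] * T
    for t in range(T - 1, -1, -1):           # step backwards from the horizon
        new_V, decisions = [], []
        for s in range(3):
            options = (0,) if s == 0 else (0, 1)   # collapsed: cannot harvest
            value, d = max(
                (d + sum(P[d][s][j] * V[j] for j in range(3)) / (1.0 + r), d)
                for d in options
            )
            new_V.append(value)
            decisions.append(d)
        V, policy[t] = new_V, decisions
    return V, policy


if __name__ == "__main__":
    for p in (0.05, 0.9):
        V0, policy = solve_known_recovery(p, T=50)
        print("p =", p, "harvest vulnerable in year 0?", bool(policy[0][1]))
```

The policy is indexed by calendar year t, so the time-to-horizon for entry `policy[t]` is T − t.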

Optimal harvest strategies are given in Fig. 1. Note that:

Figure 1.

The optimal harvest strategy, given a known recovery rate p, as a function of p, population size (robust, vulnerable or collapsed), scenario and time to horizon. Shaded regions indicate that the optimal decision is to harvest the population; unshaded regions denote that it is optimal not to harvest the population. The solid borderline between regions shows the change in optimal decision without discounting (r = 0), while the dashed line shows the shift in borderline with a 5% discount rate (r = 0·05).

  • The population cannot be harvested when it is collapsed (size 1).
  • It is always optimal to harvest a population when it is robust (size 3). This supplies an immediate profit to the manager, with only a small (‘some random collapse’, Fig. 1b) or zero (‘no random collapse’, Fig. 1a) probability of sacrificing future harvests by collapsing the population.
  • When the population is vulnerable (size 2), the optimal harvest decision depends on the time horizon, recovery rate, discount rate and scenario.
  • When the time-to-horizon is short, a vulnerable population should be harvested. The possibility of collapse puts only a few years of future harvest at risk and the reward of an immediate profit outweighs that risk.
  • When the time-to-horizon is medium or long, a vulnerable population should only be harvested for a sufficiently high recovery rate. If instead the recovery rate is low, a population collapse is likely to last many years and, thus, an immediate harvest would put many future harvests at risk.
  • As the discount rate increases, harvesting a vulnerable population becomes acceptable for a broader range of recovery rates. For large discount rates, future harvests have diminished value and the risk of losing future profits through population collapse weighs less heavily against profits from an immediate harvest.

Figure S1 in the Supplementary material shows the long-term time-invariant optimal harvest strategy for a vulnerable population. This indicates the combinations of recovery rate and discount rate that are sufficiently high and low, respectively, so that harvesting a vulnerable population is optimal.

Optimal harvest strategies with parameter uncertainty

Next we consider that the recovery rate p might not be known with confidence, because this creates uncertainty around whether or not it is optimal to harvest a vulnerable population. The formal adaptive management approach is to pose alternative hypotheses for the recovery rate and allocate prior belief to each model. If the population collapses to size 1 we can observe how long it takes to recover to size 2, and use this observation to update our belief in each of the models. As we collect more observations we are better able to estimate the underlying recovery rate, and furthermore make harvest decisions that achieve our objective most effectively. In theory, we could speed up the learning process by being actively adaptive. If we were always to harvest the population when it is of size 2 or 3, then we would increase the probability that the population would collapse. This provides more observations for learning, at the expense of harvest loss until the population recovers to size 2.

We will consider that all possible recovery rates between 0 and 1 are plausible, and use a beta distribution (with parameters α and β, both positive) for the probability density function of the recovery rate p:

$$f(p) = \frac{p^{\alpha - 1} (1 - p)^{\beta - 1}}{B(\alpha, \beta)}, \quad 0 \le p \le 1$$

where B(α, β) is the beta function, which ensures the probability density integrates to 1. By varying the parameters α and β, this distribution can take a variety of shapes, representing different states of belief regarding the value of p.

If the population begins the year in a collapsed state, then at the end of the year we have data X, where X = 0 if the population remains collapsed and X = 1 if the population recovers to the vulnerable state. With a beta(α, β) prior distribution for the likely recovery rate, the posterior distribution is:

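$$f(p \mid X) \propto p^{X} (1 - p)^{1 - X} \, p^{\alpha - 1} (1 - p)^{\beta - 1} = p^{\alpha + X - 1} (1 - p)^{\beta - X}$$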

which is a beta distribution with new parameters α + X and β + 1 − X. Therefore, when the population fails to recover (X = 0) then the first parameter (α) remains the same but the second parameter (β) increases by 1. When the population recovers to the vulnerable state (X = 1), then α increases by 1 and β is unchanged. Thus, the parameters α and β provide sufficient information to completely describe model belief at any time during the management process. We present the optimal harvest strategy in terms of the expected recovery rate E(p) = α/(α + β) and the parameter sum α + β, instead of considering α and β directly. These can be interpreted as the present best estimate of the recovery rate, and the amount of information on which that estimate is based [as α + β is a linear function of n, the number of collapse observations gathered: α + β = α0 + β0 + n, where the prior distribution is beta(α0, β0)].
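
The updating rule can be illustrated with a short Python sketch. The function names and the sequence of observations below are invented for illustration; only the rule itself (α increases by 1 after a recovery, β increases by 1 after a failure to recover) comes from the text.

```python
def update_belief(alpha, beta, recovered):
    """Posterior beta parameters after one year spent in the collapsed state.

    recovered = True  (X = 1): population moved to the vulnerable state.
    recovered = False (X = 0): population remained collapsed.
    """
    if recovered:
        return alpha + 1, beta
    return alpha, beta + 1


def expected_recovery(alpha, beta):
    """Mean of the beta(alpha, beta) distribution, E(p)."""
    return alpha / (alpha + beta)


if __name__ == "__main__":
    alpha, beta = 1, 1                       # uniform(0, 1) prior
    # Hypothetical record: three failed years before and after one recovery.
    for recovered in (False, False, True, False):
        alpha, beta = update_belief(alpha, beta, recovered)
        print((alpha, beta), round(expected_recovery(alpha, beta), 3))
```

After these four observations the belief is beta(2, 4), so E(p) = 1/3 and α + β = 6, which equals the prior sum of 2 plus n = 4 observations.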

The optimal passive adaptive strategy is determined by calculating the expected harvest value under each possible recovery rate, weighted by model belief (the beta probability density function), and integrating over all possible values for the recovery rate. Therefore, the optimal decision at each time depends not only on population size but on current belief regarding the recovery rate, which is represented by parameters α and β. The recursive dynamic programming equation becomes:

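$$V(s, \alpha, \beta, t) = \max_{d \in D(s)} \left\{ d + (1 + r)^{-1} \int_0^1 \frac{p^{\alpha - 1} (1 - p)^{\beta - 1}}{B(\alpha, \beta)} \sum_{j=1}^{3} p_d(j, s \mid p) \, V(j, \alpha, \beta, t + 1) \, dp \right\}$$ (eqn 4)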

where p_d(j, s | p) is now the probability that the population shifts from size s to size j under harvest decision d, given that the recovery rate is p (cf. equation 1). When the population is vulnerable or robust, the transition probabilities are not affected by the recovery rate, and equation 4 simplifies to:

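$$V(s, \alpha, \beta, t) = \max_{d \in \{0, 1\}} \left\{ d + (1 + r)^{-1} \sum_{j=1}^{3} p_d(j, s) \, V(j, \alpha, \beta, t + 1) \right\} \quad \text{for } s = 2, 3$$ (eqn 5)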

This equation is effectively unchanged from the recursive equation when the recovery rate is known (equation 1).

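When the population is collapsed, equation 4 becomes:

$$V(1, \alpha, \beta, t) = (1 + r)^{-1} \left[ (1 - E(p)) \, V(1, \alpha, \beta, t + 1) + E(p) \, V(2, \alpha, \beta, t + 1) \right]$$ (eqn 6)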

where E(p) is the mean of the beta distribution:

$$E(p) = \frac{\alpha}{\alpha + \beta}$$

This recursive equation is comparable to equation 2, where the recovery rate is known. Instead of weighting future returns by the known transition probabilities 1 – p and p, the mean of the uncertainty distribution, E(p), is substituted for the recovery rate in the transition probabilities.

The terminal condition is similar to equation 3:

$$V(s, \alpha, \beta, T) = 0 \quad \text{for } s = 1, 2, 3; \ \alpha, \beta > 0$$ (eqn 7)

When active adaptive management is adopted, probable changes to α and β in the future are incorporated into the optimization. The terminal condition (equation 7) and the recursive equation for vulnerable and robust populations (equation 5) are used in the same way, as they are independent of the recovery rate p. However, the recursive equation for a collapsed population becomes:

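$$V(1, \alpha, \beta, t) = (1 + r)^{-1} \left[ (1 - E(p)) \, V(1, \alpha, \beta + 1, t + 1) + E(p) \, V(2, \alpha + 1, \beta, t + 1) \right]$$ (eqn 8)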

Again, the mean of the distribution for recovery rate is substituted in for the transition probabilities. The important difference from equation 6 is that the beta shape parameters are updated at time t + 1 as a response to the new population size at that time. This allows us to determine whether it is worthwhile to take actions that will speed up our learning of the recovery rate, such as harvesting the population to collapse, even though there is a risk of many lost harvests.
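
The contrast between the passive and active recursions can be sketched in Python as follows. This is an illustrative implementation under stated assumptions rather than the authors' code: the one-year reward is taken to be 1 for a harvested year, ties are broken in favour of harvesting, and the names `ROWS` and `make_adaptive_solver` are hypothetical. The only difference between the two modes is whether the beta parameters are updated inside the recursion for a collapsed population.

```python
from functools import lru_cache

# ROWS[scenario][d][s] = (P(s -> 1), P(s -> 2), P(s -> 3)) for s = 2, 3 (Table 1).
ROWS = {
    "no_random_collapse": {
        0: {2: (0.0, 0.5, 0.5), 3: (0.0, 0.4, 0.6)},
        1: {2: (0.6, 0.4, 0.0), 3: (0.0, 0.7, 0.3)},
    },
    "some_random_collapse": {
        0: {2: (0.05, 0.45, 0.5), 3: (0.0, 0.4, 0.6)},
        1: {2: (0.6, 0.4, 0.0), 3: (0.2, 0.6, 0.2)},
    },
}


def make_adaptive_solver(T, r=0.0, scenario="no_random_collapse", active=True):
    """Return solve(s, alpha, beta, t) -> (value, decision) for horizon T."""
    rows = ROWS[scenario]
    disc = 1.0 / (1.0 + r)

    @lru_cache(maxsize=None)
    def solve(s, alpha, beta, t):
        if t >= T:                           # terminal condition: no value at T
            return 0.0, 0
        if s == 1:                           # collapsed: no harvest possible
            mean = alpha / (alpha + beta)    # E(p)
            if active:                       # belief is updated by the outcome
                stay = solve(1, alpha, beta + 1, t + 1)[0]
                recover = solve(2, alpha + 1, beta, t + 1)[0]
            else:                            # passive: belief carried unchanged
                stay = solve(1, alpha, beta, t + 1)[0]
                recover = solve(2, alpha, beta, t + 1)[0]
            return disc * ((1.0 - mean) * stay + mean * recover), 0
        # Vulnerable or robust: transitions do not depend on p.
        return max(
            (d + disc * sum(q * solve(j, alpha, beta, t + 1)[0]
                            for j, q in zip((1, 2, 3), rows[d][s])), d)
            for d in (0, 1)
        )

    return solve


if __name__ == "__main__":
    for mode in (False, True):
        solve = make_adaptive_solver(T=20, active=mode)
        value, decision = solve(2, 1.0, 1.0, 0)   # vulnerable, uniform prior
        print("active" if mode else "passive", round(value, 3), decision)
```

The recursion is memoised over (s, α, β, t), so it is practical only for moderate horizons; a tabulated backward pass would be needed for horizons as long as those shown in the figures.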

The forms of the optimal strategies for a vulnerable population are displayed in Figs 2 and 3, and Figs S2 and S3 in the Supplementary material. The general patterns emerging from these strategies follow.

Figure 2.

Optimal adaptive harvest strategies for the ‘no random collapse’ scenario when the population is in the vulnerable state, without discounting (r = 0). Combinations of α and β are plotted according to the corresponding expected recovery rate E(p) and the amount of recovery information (α + β). Shading shows the optimal passive adaptive strategy: harvest (shaded) or not (unshaded). Symbols show the optimal active adaptive strategy for various combinations of (α, β): harvest (crosses) or not (dots).

Figure 3.

Optimal adaptive harvest strategies for the ‘some random collapse’ scenario when the population is in the vulnerable state, without discounting (r = 0). Combinations of α and β are plotted according to the corresponding expected recovery rate E(p) and the amount of recovery information (α + β). Shading shows the optimal passive adaptive strategy: harvest (shaded) or not (unshaded). Symbols show the optimal active adaptive strategy for various combinations of (α, β): harvest (crosses) or not (dots).

Population size

Note again that the population can never be harvested when it is collapsed (size 1). Under both passive and active adaptive management, it is optimal to harvest a robust (size 3) population. We saw that when the recovery rate was known, its value did not influence the optimal action in the robust state: it was always optimal to harvest (Fig. 1). Now uncertainty in the recovery rate also does not affect the optimal action and it remains optimal to harvest the robust population under all adaptive strategies.

When the recovery rate is known, the optimal action for a vulnerable population is influenced by the time horizon, recovery rate, discount rate and scenario. When the recovery rate is uncertain, the optimal action is again influenced by the time horizon, discount rate and scenario, and also by the current data regarding the recovery rate (represented by the beta distribution with expected value E(p) and amount of information α + β). Thus, the optimal strategy figures (Figs 2 and 3, S2 and S3) focus on the optimal action for a vulnerable population.

Recovery rate

As in the parameter certainty case, the vulnerable population is harvested when the recovery rate is estimated to be sufficiently high. The definition of ‘sufficient’ is a function of the time horizon, adaptive strategy taken, scenario and discount rate.

Time horizon

When the time horizon is short it is optimal to harvest a vulnerable population for all possible values of a known recovery rate (Fig. 1) and, thus, it is optimal to harvest the vulnerable population under passive and active adaptive strategies (Figs 2a and 3a, S2a and S3a). This follows the same line of reasoning as for a robust population (‘Population size’, above): when the known recovery rate does not influence the optimal action, nor does an uncertain recovery rate influence the action. As the time horizon increases, the optimal adaptive strategy is determined by whether or not the estimated recovery rate is sufficiently high (the definition of ‘sufficient’ is a function of the adaptive strategy taken, scenario, discount rate and data on the recovery rate).

Passive adaptive management

The optimal passive adaptive harvest strategy does not depend upon the level of uncertainty surrounding the recovery rate (α + β). Rather, the strategy depends entirely on the expected recovery rate E(p) and the relationships described for optimal management with a known recovery rate. Comparing the recursive equations for certain-parameter management (equations 1–3) with those for passive adaptive management (equations 5–7), the only important change is that the known recovery rate p is replaced by E(p). Thus, p can be replaced by E(p) in Figs 1 and S1 to determine whether the estimated recovery rate is sufficiently high and the discount rate sufficiently low for harvest to be optimal. The threshold of p or E(p) where the optimal decision changes from harvest to no harvest is the same across Figs 1–3 and S1–S3 for corresponding scenarios, discount rates and time horizons. Note that in the practice of passive adaptive management, the estimate of p is updated over time as new observations come to hand, and so it is not a fixed constant over time as it is in the case of parameter certainty.

Active adaptive management

The optimal active adaptive strategy follows the same general patterns as the passive adaptive strategy (i.e. harvest the vulnerable population for short time horizons or if the expected recovery rate is sufficiently high). However, it exhibits subtle differences in which estimated recovery rates are deemed to be sufficiently high. Over long time horizons, there are a number of data states (E(p), α + β) at which it is optimal to not harvest under the passive adaptive strategy but to harvest under the active adaptive strategy (Figs 2c–d and 3c–d, S2c–d; unshaded regions with crosses). These states occur close to the E(p)-threshold at which the optimal passive adaptive action changes, particularly when the recovery rate is highly uncertain (α + β is small). The optimal active adaptive action in these cases can be interpreted as being experimental. That is, even though the best current estimate of the recovery rate is slightly lower than the decision-threshold if we were perfectly confident of our estimate, an active adaptive manager should harvest the population, thus increasing the probability of collapse and further learning of the recovery rate. It is plausible that the new data collected could yield a higher estimate of recovery rate, which in turn justifies increased harvest in the future. This experimentation is justified, particularly when the current estimate of recovery rate is based on few data (α + β is low, Fig. 2d).

Over medium-length time horizons, there is a smaller number of data states (E(p), α + β) at which it is optimal to harvest under the passive adaptive strategy but not to harvest under the active adaptive strategy (Figs 2b and 3b, S2b and S3b; shaded regions with dots). These states again occur close to the E(p)-threshold at which the optimal passive adaptive action changes. In the ‘no random collapse’ scenario without discounting, the range of data states is broader when the current estimate of recovery rate is based on few data (α + β is low, Fig. 2b). Here we interpret the active adaptive strategy as being more precautionary than the passive adaptive strategy. Even though the best current estimate of the recovery rate, if we were perfectly confident of it, would direct us to harvest the population, an active adaptive manager should not harvest the population. The active adaptive recursive equations (equations 5, 7 and 8) project plausible future learning as part of the management process. Over a medium time horizon, harvesting the population to collapse, while promoting learning, is likely to sacrifice the ability to harvest the population for many of the remaining years. Were the population to recover, there would be few (if any) years remaining to manage the population according to the new and improved estimate of the recovery rate. Even if the best current estimate is slightly higher than the decision-threshold in the parameter certainty case, it is still plausible that harvesting the population will cause a prolonged collapse and yield new data that suggest a lower recovery rate. Over a medium-length time horizon, there is not sufficient time to recover these losses with improved management based on more recovery data.

Scenario (transition probabilities)

The differences between the passive and active adaptive strategies are influenced by the transition probabilities in the population model. In the ‘some random collapse’ scenario, the population can collapse even when it is not harvested, and so some learning will occur by chance. Thus, there is less need for experimentation in an active adaptive strategy and there are fewer data states that exhibit it (Figs 2c–d and 3c–d, S2c–d and S3c–d; unshaded regions with crosses). Precautionary behaviour in the active adaptive strategy (Figs 2b and 3b, S2b and S3b; shaded regions with dots) also occurs in fewer data states.

Discount rate

As the discount rate increases, the value of future harvests is diminished in comparison to harvest in the present. Thus, it becomes more tolerable to harvest a population when the recovery rate is low (cf. Figs 1, S1). This corresponds to a lower E(p)-threshold where the optimal decision changes when the recovery rate is uncertain. Furthermore, the returns to be made from an improved estimate of recovery rate in the future are diminished with a higher discount rate, and so there is less value attached to experimental harvesting and learning (Figs 2c–d, S2c–d; unshaded regions with crosses). In the ‘no random collapse’ scenario, the time-invariant (long time horizon) active adaptive strategy is actually more precautionary than the passive adaptive strategy (Fig. S3b–d; shaded regions with dots).

An example simulation of the optimal active adaptive management of a population is shown in Fig. 4. Each year, the population size and data state for the recovery rate (E(p), α + β) are used to determine whether or not the population is to be harvested. The population size and harvest decision affect the new population size one year later, and this affects the data state if the population was collapsed previously. Over time the population fluctuates in size and is harvested. Ideally, the data state converges on the underlying recovery rate, although learning may slow or even cease (e.g. Fig. 4; year 19) if the manager is sufficiently confident of the optimal harvest decision for a vulnerable population.

Figure 4.

An example simulation of the active adaptive harvest strategy: (a) population size, with asterisks showing the years in which harvest takes place, and (b) the beta distribution of belief about the recovery rate, represented by the mean (solid line) and the 2·5th and 97·5th percentiles (dotted lines). The prior distribution for the uncertain recovery rate is uniform (0, 1) and the true underlying recovery rate is p = 0·3. Other transition rates follow the ‘no random collapse’ scenario and the harvest strategy does not discount harvest over time (r = 0).

Expected value of perfect information

Further understanding of the value of learning can be gained by calculating the expected value of perfect information (EVPI: Walters 1986; Williams 2001). This is the difference between the accumulated harvest value with perfect knowledge, averaged over all plausible models, and the accumulated harvest value if the population were managed using only the current data, without model updating, over the management time-frame. It can be interpreted as the maximum amount (in the units of our objective, harvest value) we would be willing to pay to learn the parameter value with certainty. We set prior belief in the recovery rate to be uniform (0, 1) (i.e. α = β = 1). In the absence of discounting, the EVPI increases steadily with the time horizon, but under discounting it asymptotes to a maximum level as the value of harvest diminishes (Fig. 5). The EVPI is highest under the ‘no random collapse’ scenario without discounting, and this is the model under which the most experimental behaviour is exhibited in the active adaptive strategy (Fig. 2c–d). Under the other models the EVPI is less than 2, even over long time horizons such as 200 years, and this is reflected in the more restricted use of experimentation in the corresponding active adaptive strategies (Figs 3c–d, S2c–d and S3c–d). There is little motivation for the manager to experiment and learn the recovery rate if the improvement in returns is expected to be so little.
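
One possible reading of this definition can be sketched in Python. The sketch rests on several assumptions that are not stated in the text and should be treated as such: the population is assumed to start in the robust state, the value 'using only the current data' is computed by fixing the policy that is optimal for p = E(p) = 0·5 and evaluating it under each true recovery rate, and the uniform prior is approximated by a midpoint grid. The function names are hypothetical, and the output is not expected to reproduce Fig. 5 exactly.

```python
def transitions(p, scenario="no_random_collapse"):
    """P[d][s][j] for classes collapsed, vulnerable, robust = 0, 1, 2 (Table 1)."""
    collapsed = [1.0 - p, p, 0.0]
    if scenario == "no_random_collapse":
        return {0: [collapsed, [0.0, 0.5, 0.5], [0.0, 0.4, 0.6]],
                1: [collapsed, [0.6, 0.4, 0.0], [0.0, 0.7, 0.3]]}
    if scenario == "some_random_collapse":
        return {0: [collapsed, [0.05, 0.45, 0.5], [0.0, 0.4, 0.6]],
                1: [collapsed, [0.6, 0.4, 0.0], [0.2, 0.6, 0.2]]}
    raise ValueError("unknown scenario: " + scenario)


def backward_pass(p, T, r, scenario, policy=None):
    """Optimise over decisions (policy=None) or evaluate a fixed policy under p."""
    P = transitions(p, scenario)
    V = [0.0, 0.0, 0.0]                      # no value at the time horizon
    chosen = [None] * T
    for t in range(T - 1, -1, -1):
        new_V, decisions = [], []
        for s in range(3):
            if policy is not None:
                options = (policy[t][s],)    # follow the supplied policy
            elif s == 0:
                options = (0,)               # collapsed: cannot harvest
            else:
                options = (0, 1)
            value, d = max(
                (d + sum(P[d][s][j] * V[j] for j in range(3)) / (1.0 + r), d)
                for d in options
            )
            new_V.append(value)
            decisions.append(d)
        V, chosen[t] = new_V, decisions
    return V, chosen


def evpi_uniform_prior(T, r=0.0, scenario="no_random_collapse", start=2, n_grid=200):
    """EVPI for a uniform(0, 1) prior, starting in class `start` (assumed robust)."""
    grid = [(k + 0.5) / n_grid for k in range(n_grid)]   # midpoint rule on (0, 1)
    _, fixed_policy = backward_pass(0.5, T, r, scenario)  # optimal for p = E(p)
    informed = sum(backward_pass(p, T, r, scenario)[0][start] for p in grid)
    uninformed = sum(backward_pass(p, T, r, scenario, fixed_policy)[0][start]
                     for p in grid)
    return (informed - uninformed) / n_grid


if __name__ == "__main__":
    for horizon in (5, 20, 50):
        print(horizon, round(evpi_uniform_prior(horizon, n_grid=100), 3))
```

Under this reading the EVPI cannot be negative, because for every true recovery rate the fully informed policy does at least as well as the fixed policy.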

Figure 5.

The expected value of perfect information (EVPI) as a function of time horizon when prior belief for recovery rate is uniform (0, 1). The four lines denote the ‘no random collapse’ scenario without discounting (thin solid), the ‘no random collapse’ scenario with discount rate r = 0·05 (dashed), the ‘some random collapse’ scenario without discounting (thick solid), and the ‘some random collapse’ scenario with discount rate r = 0·05 (dotted).

Discussion

While this paper uses a model of harvest management to explore optimal adaptive management strategies, the general results found here can potentially be applied more broadly to other environmental scenarios that invite an adaptive management approach. Previously, optimal active adaptive management has been demonstrated as a strategy that incites experimentation and the delay of rewards. In other cases it has taken a more neutral form, comparable to the model-averaged passive adaptive approach. However, we present an example where the active adaptive strategy is more precautionary than the optimal passive adaptive strategy, with greater avoidance of actions that put immediate benefits at risk.

For our harvest example, this is represented by an aversion to collapsing the population and thereby surrendering the ability to harvest for multiple years. More generally, this could mean taking an action with known moderate benefits rather than trialling a less well-understood alternative that could yield either great benefits or great losses. Our results suggest that this would be the case only when the optimal decision is marginal, i.e. the passive adaptive or certainty-equivalent approach considers the experiment to be only slightly more beneficial than the conservative action. Sites with some known benefits would be selected for protection, restoration or species reintroduction rather than investing resources in new sites of uncertain but slightly higher expected value. A common pest control method with known results would be implemented over a new candidate method which is expected to be slightly more effective, but could actually be substantially inferior or superior in this novel environment.

This precautionary effect was demonstrated specifically over medium-length time horizons. Here the manager has time to take an experimental action and learn from it, but not to reap the benefits of improved management after the experiment is completed. Instead the optimal active adaptive decision is to avoid experimentation. This occurs not just when the new approach is predicted to be inferior on average, but when it is predicted to be marginally better. In this case, the projection of learning in the optimization still finds that large losses are plausible and are best avoided by a conservative management strategy.

A time-dependent result such as this is inevitably influenced by the terminal condition (equation 7). We have attached no value to the population size at the time horizon, but if the time horizon is ever to be reached it is unlikely that the manager would be indifferent to the final status of the population. A terminal condition that values a larger population [e.g. V(s, α, β, T) = s – 1 for all α, β > 0] leads to an optimal harvest strategy (under parameter certainty, passive and active adaptive management) that does not harvest the vulnerable population under any (estimated) recovery rate in the final few years. The purpose of forgoing harvest is an increased likelihood of a large population in the final year. However, the optimal management behaviour over medium-length and long time horizons is the same as presented here for the trivial terminal condition.

The three-class population model presented in this paper is simplistic for the purpose of detecting the main effects of several variables. However, the same model structure can be extended easily to incorporate a larger number of population states and harvest actions. This would yield a transition matrix of higher dimension (compared to those in Table 1) and increase the computation time required to calculate the optimal strategies, but the recursive equations would take the same form (as equation 4). The use of a beta distribution to describe uncertainty in a rate parameter conveniently condenses model belief across a continuous interval into two state variables (α, β), thus limiting computational requirements. This approach could be appropriate for estimating a variety of uncertain rate parameters in environmental management that are defined on the interval (0, 1). For example, the unknown survival rate of a species could be modelled by a beta distribution, with the fates of individual animals (death or survival in each time period) providing the binary data to update the distribution. However, the incorporation of multiple uncertain parameters increases computational requirements exponentially.
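The survival-rate example can be sketched as a conjugate beta update. This is a minimal illustration rather than the code used for this paper: the class name `BetaBelief` and the observation sequence are hypothetical, and we assume the usual convention that α is incremented by each survival (success) and β by each death (failure).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BetaBelief:
    """Beta(alpha, beta) belief about a rate on (0, 1); hypothetical helper class."""
    alpha: float = 1.0  # alpha = beta = 1 is the uniform (0, 1) prior
    beta: float = 1.0

    @property
    def mean(self) -> float:
        """Posterior mean estimate of the rate."""
        return self.alpha / (self.alpha + self.beta)

    def update(self, survived: bool) -> "BetaBelief":
        """Return the belief after one binary observation.

        ASSUMPTION: a survival increments alpha and a death increments beta.
        """
        if survived:
            return BetaBelief(self.alpha + 1, self.beta)
        return BetaBelief(self.alpha, self.beta + 1)

# Hypothetical fates of four animals in one time period: three survive, one dies.
belief = BetaBelief()
for survived in [True, True, False, True]:
    belief = belief.update(survived)

print(belief, round(belief.mean, 3))
```

Because each observation changes only one of the two counts, the entire learning history is carried by the pair (α, β), which is what allows belief to enter the optimization as two extra state variables.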

The choice of prior weights for alternative hypotheses can have an important effect on future learning (Smith & Walters 1981; Walters 1981; Polacheck 2002). Assigning equal weight to each model initially can also be problematic, because what appears to be a uniform and uninformative prior in one dimension can actually produce implicit assumptions about population dynamics in another dimension (Punt & Hilborn 1997). For example, when we use a uniform prior distribution for recovery rate under a ‘no random collapse’ scenario without discounting, we implicitly allocate a 0·43 prior probability that it is optimal to not harvest the population when it is vulnerable, and a 0·57 prior probability that it is optimal to harvest (Fig. 2b–d). These priors on the optimal harvest decision become even more unbalanced as the critical decision threshold moves away from 0·5 (e.g. Figs 3, S2 and S3).
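The implied prior on the optimal decision can be checked numerically. The sketch below assumes that not harvesting is optimal when the recovery rate falls below a critical threshold of 0·43 (inferred from the probabilities quoted above, not taken from the model code), and integrates the beta prior density up to that threshold. The function names are hypothetical.

```python
import math

def beta_pdf(x: float, alpha: float, beta: float) -> float:
    """Density of a Beta(alpha, beta) distribution at x, for 0 < x < 1."""
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    return math.exp(log_norm + (alpha - 1) * math.log(x) + (beta - 1) * math.log(1 - x))

def prob_rate_below(threshold: float, alpha: float, beta: float, n: int = 10000) -> float:
    """Prior probability that the rate is below the threshold (midpoint rule)."""
    h = threshold / n
    return sum(beta_pdf((i + 0.5) * h, alpha, beta) for i in range(n)) * h

# ASSUMPTION: no harvest is optimal below the threshold, harvest above it.
threshold = 0.43
p_no_harvest = prob_rate_below(threshold, 1.0, 1.0)  # uniform prior, alpha = beta = 1
p_harvest = 1.0 - p_no_harvest

print(f"P(no harvest optimal) = {p_no_harvest:.2f}, P(harvest optimal) = {p_harvest:.2f}")
```

Under the uniform prior the probability simply equals the threshold itself, so any threshold far from 0·5 automatically produces an unbalanced prior on the decision. Passing other (α, β) values to `prob_rate_below` shows how a different prior on the rate shifts that balance.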

We have not found any studies that show time-dependent active adaptive management as in this paper. Instead, other authors have focused on the stationary strategy or a strategy with a long time horizon. Our time-dependent solutions should be interpreted carefully. When the time horizon is short (less than, for instance, 5 years) then it is optimal to harvest a vulnerable population under all circumstances as there is no penalty for a population near extinction. It is unlikely that this situation would be acceptable for a real population. On the other hand, managers are also unlikely to weight outcomes in the distant future equally with those earned in the present and so the level of experimentation recommended over long time horizons (up to 200 years in this paper) may not be appropriate. Furthermore, it is unlikely that population dynamics will follow the same patterns over such a long time period (Hilborn, Walters & Ludwig 1995; Punt & Hilborn 1997), particularly in the face of climate change (Hulme 2005). Appropriate parameter values and even model structure may not remain constant over the time-frames required to make up the returns sacrificed for learning.

This caution against using long time horizons without discounting adds weight to the main finding in this paper, that the optimal active adaptive harvest strategy may be more precautionary than the optimal passive adaptive harvest strategy. This result has not been noted in the adaptive management literature, although it has been noted in the engineering literature (Bar-Shalom 1981). The precautionary strategy contrasts starkly with the widespread assumption that the active adaptive management approach encourages experimentation and sacrifice of benefits early in the time-frame of management. The selection of an appropriate discount factor and time horizon is vital in determining the optimal management strategy. These choices must reflect the value of economic investment, social and political horizons, and the extent to which the available models and parameters are relevant to ecosystem dynamics. It is only then, in the context of formal optimization, that the most appropriate level of experimentation or precaution can be determined.

Acknowledgements

This project was funded by CEH's PhD scholarship from the Australian Research Council. The authors thank Mick McCarthy, Fred Johnson, Ken Williams, Tracy Rout, Hiroyuki Yokomizo, an anonymous thesis examiner and an anonymous referee for comments and/or discussions of this work.
