Beyond stochastic dynamic programming: a heuristic sampling method for optimizing conservation decisions in very large state spaces


Correspondence author. E-mail:


1. When managing endangered species the consequences of making a poor decision can be extinction. To make a good decision, we must account for the stochastic dynamic of the population over time. To this end stochastic dynamic programming (SDP) has become the most widely used tool to calculate the optimal policy to manage a population over time and under uncertainty.

2. However, as a result of its prohibitive computational complexity, SDP has been limited to solving small dimension problems, which results in SDP models that are either oversimplified or approximated using greedy heuristics that only consider the immediate rewards of an action.

3. We present a heuristic sampling (HS) method that approximates the optimal policy for any starting state. The method is attractive for problems with large state spaces as the running time is independent of the size of the problem state space and improves with time.

4. We demonstrate that the HS method out-performs a commonly used greedy heuristic and can quickly solve a problem with 33 million states. This is roughly 3 orders of magnitude larger than the largest problems that can currently be solved with SDP methods.

5. We found that HS out-performs greedy heuristics and can give near-optimal policies in shorter timeframes than SDP. HS can solve problems with state spaces that are too large to optimize with SDP. Where the state space size precludes SDP, we argue that HS is the best technique.