Keywords:

  • Q-learning;
  • Bellman;
  • dynamic programming;
  • sequential Monte Carlo;
  • stochastic simulation;
  • simulated annealing

Abstract

In this paper we develop a simulation-based approach to stochastic dynamic programming. To solve the Bellman equation, we construct Monte Carlo estimates of Q-values. Our method scales to high dimensions and works in both continuous and discrete state and decision spaces, avoiding the discretization errors that plague traditional methods. We establish a geometric convergence rate for our algorithm and illustrate the methodology with a dynamic stochastic investment problem. Copyright © 2011 John Wiley & Sons, Ltd.
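
To make the abstract's central idea concrete, the sketch below shows one Monte Carlo Bellman backup: the Q-value Q(s, a) = E[r(s, a, s') + γ V(s')] is estimated by averaging simulated one-step transitions, with no discretization of the state space. This is a minimal illustration only, not the paper's actual algorithm or model: the lognormal investment dynamics, log-utility reward, discount factor, and the names `simulate_step`, `q_estimate`, and `value_next` are all hypothetical choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

GAMMA = 0.95                        # discount factor (illustrative)
ACTIONS = np.linspace(0.0, 1.0, 5)  # fraction of wealth in a risky asset (hypothetical)
N_SAMPLES = 10_000                  # Monte Carlo draws per Q-value estimate

def simulate_step(wealth, frac, n, rng):
    """Hypothetical investment dynamics: lognormal risky return, fixed safe
    rate; the reward is log utility of next-period wealth."""
    risky = rng.lognormal(mean=0.05, sigma=0.20, size=n)
    next_wealth = wealth * (frac * risky + (1.0 - frac) * 1.02)
    reward = np.log(next_wealth)
    return next_wealth, reward

def q_estimate(wealth, frac, value_fn, rng, n=N_SAMPLES):
    """Monte Carlo estimate of Q(s, a) = E[ r(s, a, s') + gamma * V(s') ]."""
    next_wealth, reward = simulate_step(wealth, frac, n, rng)
    return float(np.mean(reward + GAMMA * value_fn(next_wealth)))

# One Bellman backup at a single state: V(s) is approximated by max_a Q_hat(s, a).
value_next = lambda w: 0.0  # end-of-horizon continuation value (assumption)
w0 = 1.0
q_hat = [q_estimate(w0, a, value_next, rng) for a in ACTIONS]
best = ACTIONS[int(np.argmax(q_hat))]
print("Q-hat:", np.round(q_hat, 4), "greedy fraction:", best)
```

In a full method along these lines, such estimates would be embedded in a backward or iterative recursion over states rather than computed at a single point, which is where the paper's convergence analysis would apply.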