Standard Article

Approximate Dynamic Programming—II: Algorithms

  1. Warren B. Powell

Published Online: 15 JUN 2010

DOI: 10.1002/9780470400531.eorms0043

Wiley Encyclopedia of Operations Research and Management Science

How to Cite

Powell, W. B. 2010. Approximate Dynamic Programming—II: Algorithms. Wiley Encyclopedia of Operations Research and Management Science. DOI: 10.1002/9780470400531.eorms0043.

Author Information

  1. Princeton University, Department of Operations Research and Financial Engineering, Princeton, New Jersey

Abstract

Approximate dynamic programming (ADP) is a powerful class of algorithmic strategies for solving stochastic optimization problems whose optimal decisions can be characterized using Bellman's optimality equation, but whose characteristics make solving Bellman's equation computationally intractable. This brief article introduces the basic concepts of ADP while building bridges between the different communities that have contributed to the field. We cover basic approximate value iteration (temporal difference learning) and policy approximation, and briefly introduce strategies for approximating value functions. We also cover Q-learning and the use of the postdecision state for solving problems with vector-valued decisions. The approximate linear programming method is introduced, along with a discussion of step size selection issues. The presentation closes with a discussion of practical issues that arise in implementing ADP techniques.
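
The two central objects the abstract refers to can be sketched in standard textbook form; the notation below follows common usage (contribution or reward C(s, a), discount factor \(\gamma\), step size \(\alpha_{n-1}\), sampled contribution \(\hat{c}\), sampled next state \(s'\)) and may differ in detail from the article's own. Bellman's optimality equation characterizes the value of being in state \(s\), and the Q-learning update approximates its action-value form from sampled transitions:

\[
V(s) \;=\; \max_{a} \Big( C(s,a) \;+\; \gamma \sum_{s'} \mathbb{P}(s' \mid s, a)\, V(s') \Big),
\]

\[
\bar{Q}^{n}(s,a) \;=\; (1 - \alpha_{n-1})\,\bar{Q}^{n-1}(s,a) \;+\; \alpha_{n-1} \Big( \hat{c} \;+\; \gamma \max_{a'} \bar{Q}^{n-1}(s',a') \Big).
\]

The smoothing parameter \(\alpha_{n-1}\) in the second recursion is the step size whose selection the abstract identifies as a practical issue.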

Keywords: approximate dynamic programming; Q-learning; basis functions; linear programming; stochastic approximation