Approximate Dynamic Programming—II: Algorithms
Published Online: 15 JUN 2010
Copyright © 2010 John Wiley & Sons, Inc. All rights reserved.
Wiley Encyclopedia of Operations Research and Management Science
How to Cite
Powell, W. B. 2010. Approximate Dynamic Programming—II: Algorithms. Wiley Encyclopedia of Operations Research and Management Science.
Approximate dynamic programming (ADP) is a powerful class of algorithmic strategies for solving stochastic optimization problems whose optimal decisions can be characterized using Bellman's optimality equation, but whose characteristics make solving Bellman's equation computationally intractable. This brief article introduces the basic concepts of ADP while building bridges between the different communities that have contributed to the field. We cover basic approximate value iteration (temporal-difference learning), policy approximation, and strategies for approximating value functions. We also cover Q-learning and the use of the post-decision state for solving problems with vector-valued decisions. The approximate linear programming method is introduced, along with a discussion of step-size selection. The presentation closes with practical issues that arise when implementing ADP techniques.
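To make the Q-learning update mentioned above concrete, the following is a minimal tabular sketch on a toy problem. The 5-state chain MDP, step size, exploration rate, and episode count here are all illustrative assumptions, not from the article; the update rule itself is the standard Q-learning recursion Q(s,a) ← Q(s,a) + α [r + γ max Q(s′,·) − Q(s,a)].

```python
import random

random.seed(0)

N_STATES = 5          # states 0..4; reaching state 4 ends an episode with reward 1
ACTIONS = [-1, +1]    # move left or right along the chain
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1  # illustrative step size, discount, exploration rate

# Q-table: Q[state][action-index], initialized to zero
Q = [[0.0, 0.0] for _ in range(N_STATES)]

def step(s, a):
    """Toy deterministic transition: walk the chain, reward 1 on reaching the end."""
    s2 = max(0, min(N_STATES - 1, s + ACTIONS[a]))
    r = 1.0 if s2 == N_STATES - 1 else 0.0
    return s2, r

for _ in range(500):
    s = 0
    while s != N_STATES - 1:
        # epsilon-greedy action selection
        if random.random() < EPS:
            a = random.randrange(2)
        else:
            a = max(range(2), key=lambda i: Q[s][i])
        s2, r = step(s, a)
        # Q-learning update: bootstrap off the best value at the next state
        target = r + GAMMA * max(Q[s2])  # Q at the terminal state stays 0
        Q[s][a] += ALPHA * (target - Q[s][a])
        s = s2

# the greedy policy should now move right (action index 1) from every nonterminal state
print([max(range(2), key=lambda i: Q[s][i]) for s in range(N_STATES - 1)])  # → [1, 1, 1, 1]
```

Because the update bootstraps off max Q(s′,·) rather than the value of the action actually taken, this is an off-policy method; the article's discussion of the post-decision state addresses how such updates extend to vector-valued decisions, which a lookup table like the one above cannot handle.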
- approximate dynamic programming;
- basis functions;
- linear programming;
- stochastic approximation