Standard Article

Reinforcement Learning Algorithms for MDPs

  1. Csaba Szepesvári

Published Online: 15 FEB 2011

DOI: 10.1002/9780470400531.eorms0714

Wiley Encyclopedia of Operations Research and Management Science

How to Cite

Szepesvári, C. 2011. Reinforcement Learning Algorithms for MDPs. Wiley Encyclopedia of Operations Research and Management Science.

Author Information

  1. University of Alberta, Department of Computing Science, Edmonton, Alberta, Canada

Abstract

Reinforcement learning is a learning paradigm concerned with learning to control a system so as to maximize a numerical performance measure that expresses a long-term objective. The goal in reinforcement learning is to develop efficient learning algorithms and to understand their merits and limitations. In this article, we focus on a few selected reinforcement learning algorithms that build on the powerful theory of dynamic programming.

Keywords:

  • reinforcement learning;
  • Markov decision processes;
  • temporal difference learning;
  • stochastic approximation;
  • function approximation;
  • least-squares methods;
  • Q-learning;
  • actor-critic methods;
  • policy gradient;
  • natural gradient