Standard Article

The Knowledge Gradient for Optimal Learning

  1. Warren B. Powell

Published Online: 15 FEB 2011

DOI: 10.1002/9780470400531.eorms0444

Wiley Encyclopedia of Operations Research and Management Science

How to Cite

Powell, W. B. 2011. The Knowledge Gradient for Optimal Learning. Wiley Encyclopedia of Operations Research and Management Science.

Author Information

  1. Princeton University, Department of Operations Research and Financial Engineering, Princeton, New Jersey

Abstract

Optimal learning addresses the problem of how to collect information so that it benefits future decisions. For off-line problems, we have to make a series of measurements or observations before choosing a final design or set of parameters; for on-line problems, we learn from rewards we are receiving, and we want to strike a balance between rewards earned now and better decisions in the future. This article reviews these problems, describes optimal and heuristic policies, and shows how to compare competing policies. Then, the presentation focuses on the concept of the knowledge gradient, which guides information collection by maximizing the marginal value of information. We show how this idea can be applied to both on-line and off-line problems, as well as a broad range of other applications which have not previously yielded to formal techniques.

Keywords:

  • knowledge gradient;
  • bandit problems;
  • ranking and selection;
  • S-curve problem;
  • simulation optimization