SEARCH

SEARCH BY CITATION

Keywords:

  • Sequential action;
  • Action learning;
  • Perceptual matching;
  • Ideal actor;
  • Reinforcement learning;
  • Causal learning

Abstract

We report the results of an experiment in which human subjects were trained to perform a perceptual matching task. Subjects were asked to manipulate comparison objects until they matched target objects using the fewest manipulations possible. An unusual feature of the experimental task is that efficient performance requires an understanding of the hidden or latent causal structure governing the relationships between actions and perceptual outcomes. We use two benchmarks to evaluate the quality of subjects’ learning. One benchmark is based on optimal performance as calculated by a dynamic programming procedure. The other is based on an adaptive computational agent that uses a reinforcement-learning method known as Q-learning to learn to perform the task. Our analyses suggest that subjects were successful learners. In particular, they learned to perform the perceptual matching task in a near-optimal manner (i.e., using a small number of manipulations) at the end of training. Subjects were able to achieve near-optimal performance because they learned, at least partially, the causal structure underlying the task. In addition, subjects’ performances were broadly consistent with those of model-based reinforcement-learning agents that built and used internal models of how their actions influenced the external environment. We hypothesize that people will achieve near-optimal performances on tasks requiring sequences of action—especially sensorimotor tasks with underlying latent causal structures—when they can detect the effects of their actions on the environment, and when they can represent and reason about these effects using an internal mental model.