Analyzing Risky Choices: Q-learning for Deal-No-Deal
Article first published online: 25 APR 2013
Copyright © 2013 John Wiley & Sons, Ltd.
Applied Stochastic Models in Business and Industry
Volume 30, Issue 3, pages 258–270, May/June 2014
How to Cite
(2014), Analyzing Risky Choices: Q-learning for Deal-No-Deal. Applied Stochastic Models in Business and Industry, 30: 258–270. DOI: 10.1002/asmb.1971
- Issue published online: 13 JUN 2014
- Manuscript Accepted: 31 JAN 2013
- Manuscript Revised: 19 DEC 2012
- Manuscript Received: 8 AUG 2011
Keywords:
- risky choices
- Deal-No-Deal
In this paper, we derive an optimal strategy for the popular Deal or No Deal game show. To do this, we use Q-learning methods, which quantify the continuation value inherent in the game's sequential decision making. We then analyze the risky choices of two contestants, Frank and Susanne, from the European version of the game. Given their choices and our optimal strategy, we derive implied bounds on their levels of risk aversion. Previous empirical evidence on risky decision making has suggested that past outcomes affect future choices and that contestants have time-varying risk aversion. We demonstrate that the strategies of Frank and Susanne are consistent with constant risk aversion levels, except for their final risk-seeking choice. We conclude with directions for future research.
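The core idea in the abstract can be illustrated with a toy computation. The sketch below is not the paper's model: the banker rule (a fixed fraction of the expected remaining prize), the CRRA utility form, the prize ladder, and all parameter values are illustrative assumptions. It shows how a continuation value ("Q-value of saying No Deal") can be computed by backward induction and compared with the utility of the current offer.

```python
import itertools
import math

def crra(x, gamma=0.5):
    # CRRA utility; gamma is an assumed (hypothetical) risk-aversion level
    return math.log(x) if gamma == 1 else x ** (1 - gamma) / (1 - gamma)

def bank_offer(amounts, frac=0.7):
    # Stylized banker (assumption): a fixed fraction of the expected value
    return frac * sum(amounts) / len(amounts)

def continuation_value(amounts, gamma=0.5, frac=0.7):
    """Q-value of 'No Deal': expected utility of playing on optimally,
    averaging over which remaining box is opened next (small-state sketch)."""
    if len(amounts) == 1:
        return crra(amounts[0], gamma)  # last box: take what is inside
    vals = []
    for removed in range(len(amounts)):
        rest = [a for i, a in enumerate(amounts) if i != removed]
        deal = crra(bank_offer(rest, frac), gamma)
        vals.append(max(deal, continuation_value(rest, gamma, frac)))
    return sum(vals) / len(vals)

# Toy prize ladder (illustrative, not the show's actual amounts)
amounts = [0.01, 1.0, 100.0, 10_000.0, 250_000.0]
offer = bank_offer(amounts)
q_no_deal = continuation_value(amounts)
decision = "Deal" if crra(offer) >= q_no_deal else "No Deal"
print(offer, q_no_deal, decision)
```

Raising `gamma` makes the contestant more risk averse, so the utility of the sure offer overtakes the continuation value earlier; sweeping `gamma` until the observed choice flips is, in spirit, how implied bounds on risk aversion can be backed out from a contestant's decisions.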