Behind the Curve: Clarifying the Best Approach to Calculating Predicted Probabilities and Marginal Effects from Limited Dependent Variable Models

Authors


  • Kalkan acknowledges the postdoctoral fellowship in quantitative methods he received from the Department of Politics and International Relations, University of Oxford, between 2009 and 2010. For thoughtful and constructive comments on earlier versions of this work, we would like to thank Chris Achen, Mike Alvarez, Dave Armstrong, Brandon Bartels, Dan Biggers, Tom Carsey, Anne Cizmar, Dan Corstange, Sarah Croco, Jill Curry, Jim Curry, Ray Duch, Justin Esarey, Rob Franzese, Bill Greene, Kosuke Imai, Bill Jacoby, Karen Long Jusko, Gary King, Jon Ladd, Eric Lawrence, Geoff Layman, Scott Long, Irwin Morris, Irfan Nooruddin, Bill Reed, John Sides, Jeff Smith, Piotr Swistak, Ric Uslaner, Nick Valentino, and Chris Zorn. For research assistance, we thank Christina Heshmatpour and Ilya Kopysitsky. We also wish to thank the anonymous reviewers and the Editor for their close reading of the article, thoughtful suggestions, and guidance. All errors are our own.

Michael J. Hanmer is Associate Professor of Government and Politics, and Research Director at the Center for American Politics and Citizenship, University of Maryland, 3140 Tydings Hall, College Park, MD 20742 (mhanmer@umd.edu). Kerem Ozan Kalkan is Assistant Professor of Political Science and Public Administration, Middle East Technical University, Dumlupinar Bulvari No: 1, Cankaya 06800 Ankara Turkey (okalkan@metu.edu.tr).

Abstract

Models designed for limited dependent variables are increasingly common in political science. Researchers estimating such models often give little attention to the coefficient estimates and instead focus on quantities such as marginal effects, predicted probabilities, and predicted counts. Because the models are nonlinear, the estimated effects are sensitive to how one generates the predictions. The most common approach involves estimating the effect for the “average case,” in which the other independent variables are set to their means or modes. But this approach creates a weaker connection between the results and the larger goals of the research enterprise and is thus less desirable than the observed-value approach, which holds the other independent variables at their observed values for each case and averages the resulting estimates over the sample. That is, the goal is usually not to understand the effect for the average case but to estimate the average effect in the population. In addition to making the theoretical argument in favor of the observed-value approach, we illustrate via an empirical example and Monte Carlo simulations that the two approaches can produce substantively different results.
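The contrast between the two approaches can be made concrete with a small numerical sketch. The code below is illustrative only and is not the authors' implementation: it assumes a logit model with made-up coefficients (`beta`) and simulated covariates, and the function names `average_case_effect` and `observed_value_effect` are hypothetical. It computes the change in the predicted probability when one binary covariate moves from 0 to 1, first with the other covariates fixed at their sample means and then with them left at their observed values and averaged over all cases.

```python
import numpy as np


def logistic(z):
    """Inverse logit link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + np.exp(-z))


def average_case_effect(X, beta, col, lo, hi):
    """Effect of moving column `col` from lo to hi for the 'average case'.

    All other covariates are fixed at their sample means.
    """
    x_bar = X.mean(axis=0)
    x_lo, x_hi = x_bar.copy(), x_bar.copy()
    x_lo[col], x_hi[col] = lo, hi
    return logistic(x_hi @ beta) - logistic(x_lo @ beta)


def observed_value_effect(X, beta, col, lo, hi):
    """Average effect of moving column `col` from lo to hi.

    All other covariates stay at their observed values for each case;
    the case-level effects are then averaged over the sample.
    """
    X_lo, X_hi = X.copy(), X.copy()
    X_lo[:, col], X_hi[:, col] = lo, hi
    return np.mean(logistic(X_hi @ beta) - logistic(X_lo @ beta))


# Simulated data (an assumption for illustration, not from the article):
# an intercept, a binary covariate of interest, and a continuous control.
rng = np.random.default_rng(0)
n = 5000
X = np.column_stack([
    np.ones(n),               # intercept
    rng.integers(0, 2, n),    # binary covariate of interest
    rng.normal(0.0, 2.0, n),  # continuous control
])
beta = np.array([-1.0, 1.0, 1.5])  # hypothetical logit coefficients

avg_case = average_case_effect(X, beta, col=1, lo=0.0, hi=1.0)
obs_val = observed_value_effect(X, beta, col=1, lo=0.0, hi=1.0)
print(f"average-case effect:   {avg_case:.3f}")
print(f"observed-value effect: {obs_val:.3f}")
```

Because the logistic curve is nonlinear, the effect evaluated at the mean of the covariates generally differs from the mean of the case-level effects. In practice `beta` would be replaced by estimated coefficients, and the uncertainty in those estimates would need to be carried through to the reported effect.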
