An earlier version of this article was presented to the Midwest Political Science Association and was awarded the 2006 Harold Gosnell Prize for Excellence in Political Methodology. We would like to thank Steven Abney, Scott Adler, Scott Ainsworth, Frank Baumgartner, Ken Bickers, David Blei, Jake Bowers, Janet Box-Steffensmeier, Patrick Brandt, Barry Burden, Suzie Linn, John Freeman, Ed Hovy, Will Howell, Simon Jackman, Brad Jones, Bryan Jones, Kris Kanthak, Gary King, Glen Krutz, Frances Lee, Bob Luskin, Chris Manning, Andrew Martin, Andrew McCallum, Iain McLean, Nate Monroe, Becky Morton, Stephen Purpura, Phil Schrodt, Gisela Sin, Betsy Sinclair, Michael Ward, John Wilkerson, Dan Wood, Chris Zorn, and seminar participants at UC Davis, Harvard University, the University of Michigan, the University of Pittsburgh, the University of Rochester, Stanford University, the University of Washington, and Washington University in St. Louis for their comments on earlier versions of the article. We would like to give special thanks to Cheryl Monroe for her contributions toward development of the Congressional corpus in particular and our data collection procedures in general. We would also like to thank Jacob Balazer (Michigan) and Tony Fader (Michigan) for research assistance. In addition, Quinn thanks the Center for Advanced Study in the Behavioral Sciences for its hospitality and support. This article is based upon work supported by the National Science Foundation under grants BCS 05-27513 and BCS 07-14688. Any opinions, findings, and conclusions or recommendations expressed in this article are those of the authors and do not necessarily reflect the views of the National Science Foundation. Supplementary materials, including web appendices and a replication archive with data and R package, can be found at http://www.legislativespeech.org.
How to Analyze Political Attention with Minimal Assumptions and Costs
Version of Record online: 28 DEC 2009
©2010, Midwest Political Science Association
American Journal of Political Science
Volume 54, Issue 1, pages 209–228, January 2010
How to Cite
Quinn, K. M., Monroe, B. L., Colaresi, M., Crespin, M. H. and Radev, D. R. (2010), How to Analyze Political Attention with Minimal Assumptions and Costs. American Journal of Political Science, 54: 209–228. doi: 10.1111/j.1540-5907.2009.00427.x
- Issue online: 28 DEC 2009
Previous methods of analyzing the substance of political attention have either required several restrictive assumptions or been prohibitively costly when applied to large-scale political texts. Here, we describe a topic model for legislative speech, a statistical learning model that uses word choices to infer the topical categories covered in a set of speeches and to identify the topic of specific speeches. Our method estimates, rather than assumes, the substance of topics, the keywords that identify topics, and the hierarchical nesting of topics. We use the topic model to examine the agenda in the U.S. Senate from 1997 to 2004. Using a new database of over 118,000 speeches (70,000,000 words) from the Congressional Record, our model reveals speech topic categories that are both distinctive and meaningfully interrelated, and it provides a richer view of democratic agenda dynamics than had previously been possible.
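To make the idea of inferring topics from word choices concrete, the following is a minimal sketch and not the model estimated in the article. It assumes a plain mixture of multinomials in which every speech belongs to exactly one latent topic, fitted by expectation-maximization, and it leaves out the hierarchical nesting of topics entirely. The function names, the toy speeches, the smoothing constant, and the deterministic initialization (speech k seeds topic k) are all illustrative assumptions.

```python
import math
from collections import Counter


def fit_mixture_of_unigrams(docs, n_topics, n_iter=50, smoothing=0.1):
    """EM for a mixture of multinomials: one latent topic per document.

    Illustrative stand-in only; not the authors' model or R package.
    """
    vocab = sorted({w for doc in docs for w in doc})
    counts = [Counter(doc) for doc in docs]
    n_docs = len(docs)

    # Assumed initialization: document k seeds topic k, the rest start uniform.
    resp = []
    for i in range(n_docs):
        if i < n_topics:
            resp.append([1.0 if k == i else 0.0 for k in range(n_topics)])
        else:
            resp.append([1.0 / n_topics] * n_topics)

    log_word = []
    for _ in range(n_iter):
        # M-step: re-estimate topic shares and per-topic word probabilities.
        log_prior, log_word = [], []
        for k in range(n_topics):
            weight = sum(resp[i][k] for i in range(n_docs))
            log_prior.append(
                math.log((weight + smoothing) / (n_docs + n_topics * smoothing))
            )
            expected = {w: smoothing for w in vocab}
            for i, doc_counts in enumerate(counts):
                for w, n in doc_counts.items():
                    expected[w] += resp[i][k] * n
            total = sum(expected.values())
            log_word.append({w: math.log(v / total) for w, v in expected.items()})

        # E-step: posterior probability of each topic for each document.
        for i, doc_counts in enumerate(counts):
            scores = [
                log_prior[k] + sum(n * log_word[k][w] for w, n in doc_counts.items())
                for k in range(n_topics)
            ]
            top = max(scores)  # subtract the max for numerical stability
            unnorm = [math.exp(s - top) for s in scores]
            z = sum(unnorm)
            resp[i] = [u / z for u in unnorm]

    labels = [max(range(n_topics), key=lambda k: r[k]) for r in resp]
    return labels, log_word


def top_words(log_word_k, n=3):
    """Return the n most probable words for one topic (its 'keywords')."""
    return sorted(log_word_k, key=log_word_k.get, reverse=True)[:n]


# Toy "speeches" (hypothetical data, already tokenized).
speeches = [
    "tax budget deficit spending tax".split(),
    "troops war defense military troops".split(),
    "budget spending tax cuts deficit".split(),
    "military defense war iraq troops".split(),
    "tax cuts budget".split(),
    "war iraq military".split(),
]

labels, log_word = fit_mixture_of_unigrams(speeches, n_topics=2)
print(labels)
print([top_words(t) for t in log_word])
```

The number of topics is fixed in advance here, but their substance and keywords are not: both come out of the estimated word probabilities, which is the sense in which topics are estimated rather than assumed. On a corpus of realistic size, a random or multi-start initialization would be a more defensible choice than seeding topics from the first few speeches.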