Sentiment strength detection in short informal text

Errata

This article is corrected by:

  1. Erratum: Correction to Thelwall, M., Buckley, K., Paltoglou, G., Cai, D., and Kappas, A., ‘Sentiment in short strength detection informal text’. Journal of the American Society for Information Science and Technology 61(12) 2010, 2544–2558. Volume 62, Issue 2, 419. Article first published online: 9 December 2010
  2. Erratum. Volume 63, Issue 2, 429. Article first published online: 1 November 2011

Abstract

A huge number of informal messages are posted every day on social network sites, blogs, and discussion forums. Emotions seem to be frequently important in these texts, whether for expressing friendship, showing social support, or as part of online arguments. Algorithms to identify sentiment and sentiment strength are needed to help understand the role of emotion in this informal communication and also to identify inappropriate or anomalous affective utterances, potentially associated with threatening behavior to the self or others. Nevertheless, existing sentiment detection algorithms tend to be commercially oriented, designed to identify opinions about products rather than user behaviors. This article partly fills this gap with a new algorithm, SentiStrength, which extracts sentiment strength from informal English text, using new methods to exploit the de facto grammars and spelling styles of cyberspace. Applied to MySpace comments and with a lookup table of term sentiment strengths optimized by machine learning, SentiStrength is able to predict positive emotion with 60.6% accuracy and negative emotion with 72.8% accuracy, both based upon strength scales of 1–5. The positive-emotion result, but not the negative-emotion result, is better than the baseline and a wide range of general machine learning approaches.
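
To make the idea of a term-strength lookup table concrete, the following is a minimal Python sketch of lexicon-based scoring on two separate 1–5 scales. It is not the SentiStrength implementation: the toy lexicon, the function names, and the rule that repeated letters add one point of emphasis are all assumptions chosen for illustration, and the abstract does not specify any of them.

```python
import re

# Hypothetical toy lexicon (an assumption, not the paper's word list).
# Positive values are positive strengths, negative values are negative strengths.
TOY_LEXICON = {"love": 3, "great": 3, "nice": 2, "hate": -4, "sad": -3, "awful": -4}


def collapse_repeats(word):
    """Reduce runs of three or more identical letters to a single letter.

    Returns the collapsed word and a flag saying whether anything changed.
    This is a crude stand-in for handling informal spellings; it would
    mis-handle words whose correct spelling has a double letter.
    """
    collapsed = re.sub(r"(.)\1{2,}", r"\1", word)
    return collapsed, collapsed != word


def score(text, lexicon=TOY_LEXICON):
    """Return (positive, negative) strengths, each on a 1-5 scale.

    A score of 1 means no sentiment of that polarity was found.
    """
    positive, negative = 1, 1
    for token in re.findall(r"[a-z]+", text.lower()):
        word, emphasized = collapse_repeats(token)
        strength = lexicon.get(word, 0)
        if strength == 0:
            continue  # term carries no sentiment in the toy lexicon
        boost = 1 if emphasized else 0  # assumed rule: repeated letters add emphasis
        if strength > 0:
            positive = max(positive, min(5, strength + boost))
        else:
            negative = max(negative, min(5, -strength + boost))
    return positive, negative


print(score("I loooove it but the ending was sad"))
```

Under these assumptions, each text receives the strongest positive and the strongest negative term score it contains, so a single comment can be rated as both positive and negative at once.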
