EMOTAG: AN APPROACH TO AUTOMATED MARKUP OF EMOTIONS IN TEXTS
Article first published online: 4 JUL 2012
© 2012 Wiley Periodicals, Inc.
Volume 29, Issue 4, pages 680–721, November 2013
How to Cite
Francisco, V. and Gervás, P. (2013), EMOTAG: AN APPROACH TO AUTOMATED MARKUP OF EMOTIONS IN TEXTS. Computational Intelligence, 29: 680–721. doi: 10.1111/j.1467-8640.2012.00438.x
- Issue published online: 6 NOV 2013
- Manuscript Accepted: 2 APR 2012
- Manuscript Revised: 28 MAR 2012
- Manuscript Received: 1 DEC 2010
Keywords
- emotional markup;
- emotional dimensions;
- emotional categories
This paper presents an approach to the automated markup of texts with emotional labels. The approach considers two possible representations of emotions in parallel: emotional categories (emotional tags used to refer to emotions) and emotional dimensions (measures that try to model the essential aspects of emotions numerically). For each representation, a corpus of example texts previously annotated by human evaluators is mined for an initial assignment of emotional features to words. This results in a list of emotional words (LEW), which becomes a useful resource for later automated markup. The algorithm proposed for the automated markup of text closely mirrors the steps taken during feature extraction. It employs the LEW resource and the ANEW word list for the actual assignment of emotional features, WordNet for knowledge-based expansion of words that occur in neither, and an ontology of emotional categories. The algorithm for automated markup is tested, and the results are discussed with respect to three main issues: the relative adequacy of each of the representations used, the correctness and coverage of the proposed algorithm, and additional techniques and solutions that may be employed to improve the results. The approach achieves an average success rate of around 80% when marking up with emotional dimensions and around 50% when marking up with emotional categories. The main contribution of the approach presented in this paper is that it allows dimensions and categories at different levels of abstraction to operate simultaneously during markup.
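The lookup cascade described in the abstract can be illustrated with a minimal sketch for the dimensional representation only. It is not the authors' implementation. The toy dictionaries, their numeric values, the order in which LEW and ANEW are consulted, the `RELATED` table standing in for WordNet expansion, and the averaging of word values into a sentence value are all assumptions made for illustration; the function names `lookup` and `mark_up` are hypothetical.

```python
from statistics import mean

# Toy stand-ins for the real resources. Every entry and number below is
# invented for illustration; none is taken from LEW, ANEW, or WordNet.
# Each value is an assumed (valence, arousal, dominance) triple.
LEW = {"happy": (8.0, 6.0, 6.0)}
ANEW = {"afraid": (2.0, 7.0, 3.0)}
# Hypothetical replacement for WordNet-based expansion of uncovered words.
RELATED = {"joyful": ["happy"], "scared": ["afraid"]}


def lookup(word):
    """Return the dimension triple for a word, or None if it is not covered."""
    # Assumed order: the corpus-derived list first, then the ANEW-style list.
    for resource in (LEW, ANEW):
        if word in resource:
            return resource[word]
    # Knowledge-based expansion is tried only when both lists miss the word.
    for related in RELATED.get(word, []):
        for resource in (LEW, ANEW):
            if related in resource:
                return resource[related]
    return None


def mark_up(sentence):
    """Average the triples of covered words (an assumed aggregation rule).

    Returns None when no word in the sentence is covered.
    """
    found = [lookup(word) for word in sentence.lower().split()]
    values = [triple for triple in found if triple is not None]
    if not values:
        return None
    return tuple(mean(dimension) for dimension in zip(*values))
```

With these toy entries, `mark_up("joyful")` resolves through the expansion table to the entry for "happy", and a sentence with no covered words yields `None`, which is one simple way to make gaps in coverage visible separately from incorrect assignments.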