The Role of Language in the Automatic Coding of Political Texts

Authors

  • Didier Ruedin

    Corresponding author
    1. University of Neuchâtel
    • Address for Correspondence: Swiss Forum for Migration and Population Studies, University of Neuchâtel, Fbg de l'Hôpital 106, CH - 2000 Neuchâtel. Personal website: http://druedin.com Email: didier.ruedin@wolfson.oxon.org

    • The research leading to these results was partly carried out as part of the project SOM (Support and Opposition to Migration). The project has received funding from the European Commission's Seventh Framework Programme (FP7/2007-2013) under grant agreement number 225522. Replication material is available from: http://hdl.handle.net/1902.1/20302.

Abstract

Automatic approaches to coding party manifestos and other political texts have become more widespread. This research note asks to what extent the source language of a text affects the results. To find out, Swiss manifestos in German and French are coded automatically, comparing a keyword-based dictionary approach with Wordscores. Because of language differences, stemming and especially the removal of stop words are important for obtaining comparable Wordscores results. When both are applied, the predicted scores are almost identical in the two languages. With the right preparation, the challenge of language differences can thus be overcome.
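
To illustrate why stop words matter for Wordscores, here is a minimal sketch of the general scoring logic in Python. It is not the code used in this research note (the replication material is linked above). The function names, the toy texts and the one-word stop list are assumptions made for illustration only, and stemming is left out. Each word is scored as the average of the reference positions, weighted by how strongly the word is associated with each reference text. A new text is then scored as the mean score of its words.

```python
import re
from collections import Counter


def tokenize(text, stop_words=frozenset()):
    """Lowercase, keep alphabetic tokens only, drop stop words."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return [t for t in tokens if t not in stop_words]


def relative_freqs(tokens):
    """Relative frequency of each word within one text."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def word_scores(reference_texts, reference_positions, stop_words=frozenset()):
    """Score each word from reference texts with known positions."""
    freqs = [relative_freqs(tokenize(t, stop_words)) for t in reference_texts]
    vocab = set().union(*freqs)
    scores = {}
    for w in vocab:
        total = sum(f.get(w, 0.0) for f in freqs)
        # Weight each reference position by the share of the word's
        # relative frequency that falls in that reference text.
        scores[w] = sum(
            f.get(w, 0.0) / total * a
            for f, a in zip(freqs, reference_positions)
        )
    return scores


def score_text(text, scores, stop_words=frozenset()):
    """Mean word score over the scored words of a new text."""
    tokens = [t for t in tokenize(text, stop_words) if t in scores]
    if not tokens:
        raise ValueError("no scored words in text")
    return sum(scores[t] for t in tokens) / len(tokens)


# Toy data (hypothetical): "die" occurs in both reference texts.
refs = ["die steuern", "die umwelt"]
positions = [0.0, 10.0]
stops = frozenset({"die"})

raw = score_text("die umwelt", word_scores(refs, positions))
cleaned = score_text("die umwelt", word_scores(refs, positions, stops), stops)
print(raw, cleaned)
```

In this toy case the shared function word pulls the unfiltered score towards the midpoint of the reference positions, while the filtered score reflects only the substantive word. If function words are more or less frequent in one language than in another, this pull differs between languages, which is one way to read the emphasis on stop words above.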
