The State of the Art in Sentiment Visualization

Visualization of sentiments and opinions extracted from or annotated in texts has become a prominent topic of research over the last decade. From basic pie and bar charts used to illustrate customer reviews to extensive visual analytics systems involving novel representations, sentiment visualization techniques have evolved to deal with complex multidimensional data sets, including temporal, relational and geospatial aspects. This contribution presents a survey of sentiment visualization techniques based on a detailed categorization. We describe the background of sentiment analysis, introduce a categorization for sentiment visualization techniques that includes 7 groups with 35 categories in total, and discuss 132 techniques from peer‐reviewed publications together with an interactive web‐based survey browser. Finally, we discuss insights and opportunities for further research in sentiment visualization. We expect this survey to be useful for visualization researchers whose interests include sentiment or other aspects of text data as well as researchers and practitioners from other disciplines in search of efficient visualization techniques applicable to their tasks and data.


Introduction
The development of digital technologies and the Internet has led to an unprecedented increase of text data, creating new opportunities and challenges. Researchers in linguistics and natural language processing (NLP) have access to data, such as customer reviews or social media messages, that can be drastically different from traditional corpora or literature with regard to content, scale, and corresponding analyses. One of the research problems investigated by these disciplines is sentiment analysis (a term often used interchangeably with opinion mining and affect analysis), which is generally concerned with detecting attitudinal content in text at various levels of granularity [PL08]. Usually, the text is classified as positive, negative or neutral at the level of words, utterances or complete documents.
Sentiment visualization, understood as a research challenge in information visualization (InfoVis) and visual analytics (VA) to analyze sentiment discovered in text data, is part of the more general research area of text visualization [KK15]. The applications and tasks of sentiment visualization include, for instance, monitoring of public opinion in social media, literature analysis for digital humanities, or support for research of sentiment and stance in linguistics and NLP. Some of the earliest papers mentioning visualization of sentiment actually originate in data mining (DM) or NLP and use basic visual representations in most cases. In contrast, state-of-the-art techniques often reflect the advances in InfoVis and VA and incorporate sentiment in complex settings involving heterogeneous data. However, existing work in sentiment visualization has not been covered by any comprehensive survey yet, as discussed in Section 3. Therefore, our survey can be beneficial for visualization researchers working on this problem: there is already a significant body of work involving sentiment visualization, and it can be difficult to explore because the publications are scattered across a large number of outlets and disciplines. Our survey can also be useful for researchers from other fields as well as practitioners interested in visualization/visual analysis methods for sentiment data. For instance, a recent report indicates that the emotion detection and recognition market (which involves, among other methods, emotion analysis of text data with NLP) is predicted to reach 22.65 billion USD by 2020 [Emo]. In this case, the availability of a survey covering academic research could also facilitate the cooperation between academia and industry.
In this paper, we propose a survey based on the collection and analysis of a substantial number of sentiment visualization techniques described in peer-reviewed papers in InfoVis, VA and other disciplines (NLP, DM, etc.). The main scientific contributions of this paper are the following: [...]
To refine our categorization, discover interesting patterns and facilitate data exploration by the readers of this paper, we have developed an interactive survey browser available at http://sentimentvis.lnu.se.
In this survey, we have limited ourselves only to visualization techniques based on the analysis of text data, and have not included visualization techniques related to emotion measurement with the help of brain-computer interfaces (as opposed to emotions discovered in text) or similar approaches. We refer the interested readers to the recently published survey by Cernea and Kerren [CK15] that covers the corresponding research area. Another related discipline that concerns itself with analysis and, in some cases, visualization of opinions is social network analysis. For example, Du et al. [DYLL15] discuss OpinionRings, a visualization technique for networks with explicit user opinion values. Since such approaches do not involve text data, we have considered them to be beyond the scope of this survey.
The rest of this paper is organized as follows. In the next section, we describe the background of sentiment analysis and related research problems from linguistics and computational linguistics/NLP. Section 3 provides a discussion of existing visualization surveys relevant to our work. Afterwards, we discuss our methodology and initial statistical results for the collected data in Section 4. Our categorization and the corresponding sentiment visualization techniques are discussed in Section 5. We discuss our interactive survey browser, findings and perspectives for the research field in Section 6. Finally, we conclude this paper in Section 7.

Background
With the advent of machine-readable corpora, research in linguistics has shown that language is not primarily a means of providing information about facts, but rather a means of evaluating what we are talking about, taking a stance, and expressing opinions and emotions. Language use in different contexts is highly view-pointed, interactive and interpersonal. Human communication has a purpose. It is in constant flux, and so is the use of language itself [Eng07,Par15].
Evaluative meanings are not easy to specify in advance because they are not confined to traditional areas of grammar or specific words, but may be expressed by parts of words, words or longer chunks. Such meanings have been studied under a range of different names in various research traditions in linguistics such as evaluation, appraisal and stance taking [MW05,Hun11], epistemic modality, subjectivity and intersubjectivity [Par03,Ver05,MACHvdA13], and emotions and affect [FSS13].
The term sentiment analysis as it is used in NLP is usually defined as the task of classifying (short) pieces of text (ranging from single words over phrases and sentences to complete documents) into a small number of classes representing different kinds of sentiments (the term sentiment is used here and below synonymously with terms like emotion, affect, attitude, and so on, unless a more specific term is required). In the simplest formulation, sentiment analysis is considered as a binary problem, where we are either interested in detecting the presence of emotionally loaded content, or in distinguishing positively from negatively loaded content. The former of these tasks is closely related to what has been referred to as subjectivity detection [PL08], while the latter is sometimes referred to as polarity detection [Tur02]. Classification itself is usually based on lexical matching of keywords from a previously constructed dictionary/lexicon (such as WordNet-Affect [SV04], MPQA Subjectivity Lexicon [WWH05], SentiWordNet [BES10], LIWC [TP10], or SenticNet [COR14]), knowledge about word/concept similarity (e.g. using a distributional semantics model or an ontology such as WordNet [Mil95]), or a variety of machine learning (ML) classification models.
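The lexicon-based formulation described above can be illustrated with a minimal sketch. The tiny lexicon and the function names below are hypothetical and chosen only for illustration; they do not reproduce WordNet-Affect, SentiWordNet, or any other resource cited here, and real systems typically add handling for negation, intensifiers, and multi-word expressions.

```python
import re

# Toy lexicon for illustration only (an assumption, not a real resource):
# +1 marks a positively loaded word, -1 a negatively loaded one.
TOY_LEXICON = {"good": 1, "great": 1, "love": 1, "bad": -1, "poor": -1, "hate": -1}


def tokenize(text):
    """Lowercase the text and extract simple alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def is_subjective(text, lexicon=TOY_LEXICON):
    """Subjectivity detection: True if any token appears in the lexicon."""
    return any(token in lexicon for token in tokenize(text))


def polarity_score(text, lexicon=TOY_LEXICON):
    """Sum the lexicon values of all matched tokens (unmatched tokens count as 0)."""
    return sum(lexicon.get(token, 0) for token in tokenize(text))


def classify_polarity(text, lexicon=TOY_LEXICON):
    """Polarity detection: map the summed score to a class label."""
    score = polarity_score(text, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

In this sketch, a text whose positive and negative matches cancel out is labelled neutral even though it is subjective, which is one reason why subjectivity detection and polarity detection are treated as separate tasks.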
Arguably, the most common approach to sentiment analysis is to formulate the problem as a three-way categorization task over the categories negative, neutral and positive. More complex formulations of the sentiment analysis task involve a broader range of possible sentiment classes, either in terms of a graded scale (e.g. weakly to strongly negative and positive) [SPW*13], or in terms of a broader palette of sentiment types [SPH*11, He12, ADAC*13]. Sentiment analysis has also been combined and integrated with other NLP and ML techniques such as topic detection and tracking (TDT), in which case we are interested not only in which sentiments are expressed in the data, but also in what the target of the sentiment is. As an example, a sentiment analysis system might detect that customers are predominantly negative about the release of a novel product. However, it would be valuable for the product manufacturer to know if there are any specific aspects of the product that are more negatively perceived: it might be the case that the negativity only concerns one specific aspect of the product, in which case it might be reasonably easy for the company to make the necessary adjustments. Such analysis is commonly referred to as aspect-based sentiment analysis [BE10, JO11].
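The product example above can be sketched in a few lines. This is a deliberately naive illustration under stated assumptions: the lexicon, the aspect keyword table, and the rule that a sentence's polarity is attributed to every aspect mentioned in it are all hypothetical simplifications, not the methods of [BE10] or [JO11].

```python
import re
from collections import defaultdict

# Hypothetical toy resources, for illustration only.
TOY_LEXICON = {"great": 1, "sharp": 1, "love": 1, "poor": -1, "weak": -1, "terrible": -1}
ASPECT_TERMS = {"battery": "battery", "screen": "screen", "display": "screen"}


def aspect_polarity(reviews):
    """Return a dict mapping each detected aspect to its summed polarity.

    Each review is split into sentences; the lexicon score of a sentence is
    added to every aspect whose keyword occurs in that sentence.
    """
    totals = defaultdict(int)
    for review in reviews:
        for sentence in re.split(r"[.!?]+", review):
            tokens = re.findall(r"[a-z]+", sentence.lower())
            score = sum(TOY_LEXICON.get(token, 0) for token in tokens)
            aspects = {ASPECT_TERMS[token] for token in tokens if token in ASPECT_TERMS}
            for aspect in aspects:
                totals[aspect] += score
    return dict(totals)
```

Applied to a handful of reviews, such a breakdown would show, for example, that negative sentiment concentrates on the battery while the screen is perceived positively, which is exactly the kind of per-aspect data that many of the techniques in this survey visualize.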
Another closely related task is emotion detection, which typically employs a categorical, dimensional, or hybrid model of emotions. One of the most well-known categorical models is Ekman's 'Big Six' basic emotions: anger, fear, happiness, surprise, disgust and sadness [Ekm92]. Dimensional models describe emotions in terms of continuous spaces along axes such as valence/pleasure, arousal and dominance [RM77]. Finally, hybrid models such as Plutchik's wheel of emotions [Plu80] define a set of basic emotions, their relative similarity, and possible combinations resulting in numerous derivative emotions. The relation between emotions and sentiment is analyzed by Munezero et al. [MMSP14], who discuss the definitions of sentiment, opinion, emotion, and affect in detail from the standpoint of linguistics and psychology. They point to subtle differences between these concepts which are often overlooked in work originating from more technical fields such as InfoVis, VA, or DM.
In general, the variation in terminology and computational models of sentiment analysis is covered by several survey papers. Arguably the most comprehensive one is that of Pang and Lee [PL08], who discuss work done in linguistics, NLP, DM and ML. More recent survey papers include the works by Tsytsarau and Palpanas [TP12], Cambria et al. [CSXH13], Ravi and Ravi [RR15], and a comprehensive survey by Mohammad [Moh16].
Based on this discussion, we can summarize the general properties and challenges of sentiment visualization as follows:
- sentiment visualization covers a variety of sentiment analysis tasks ranging from subjectivity detection to emotion analysis and stance analysis;
- sentiment visualization techniques may have to use data specific to the sentiment analysis model (e.g. lexicon-based or ML-based) and scope (word-level, utterance-level, etc.); and
- sentiment visualization reflects the variety of data domains and user tasks existing in research and applications of sentiment analysis, which range from theoretical research in linguistics and NLP to social media and news monitoring, thus implying the usage of various visual channels and representations.

Related Work
Sentiment visualization has not enjoyed the same level of interest in systematic/comprehensive reviews compared to other visualization areas that are also related to data extracted from text, such as the visualization of topic models or events. Only a few text visualization surveys include analysis and visualization of sentiment/opinion as one of their categorization aspects. An example is the survey on the visual analysis of events in text data streams written by Wanner et al. [WSJ*14]. They select polarity extraction as one of the text processing methods used in visual analytics systems: it was utilized by 14 out of 51 papers included in their report. Several surveys mention sentiment and affect analysis as a potential feature extraction method for text visualization, for instance, the paper by Risch et al. [RKPW08] in the context of visual analytics (without any examples) or the paper by Šilić and Bašić [ŠB10] in connection with text stream visualization (with a single example). Given the total amount of work on sentiment visualization, there is clearly a gap in this area in the visualization survey literature. [...] [SBT15] provide an overview of 11 techniques and classify them according to visual metaphor. However, the focus of that work is not on categorization, but rather on evaluation: the authors conduct a study to compare the techniques with regard to metrics such as user-friendliness or usefulness. In contrast, our survey focuses on the categorization of a much larger number of techniques with regard to multiple aspects related to computational model, data, user tasks, and visual representation.

Methodology
The steps that we took while working on this survey are summarized in Figure 2. The overall methodology can be compared to the model described by Pirolli and Card [PC05], which was adapted to scientific literature analysis by Beck et al. [BKW16].
Based on our previous work on TextVis Browser [KK15], we started with an initial set of text visualization techniques related to sentiment as well as an initial categorization applicable to such techniques. We should state that we use a technique, as opposed to a publication, as the unit for this survey; therefore, we describe several cases below where multiple techniques originate from the same publication. Since our survey includes work not only from InfoVis, but also from VA and even non-visualization disciplines, a single technique does not necessarily mean a novel metaphor/representation; it can also be an approach or a system relevant to sentiment visualization.
In addition to the initial set of techniques, we have conducted a search in several visualization outlets: IEEE TVCG, Information Visualization, Computer Graphics Forum, IEEE CG&A, and Journal of Visualization as well as proceedings of IEEE InfoVis, IEEE VAST, EuroVis, IEEE PacificVis, TextVis workshop, ACM CHI, ACM IUI, IV, IVAPP, and VINCI. We have also conducted a search in IEEE Xplore, ACM DL, and Google Scholar using such key phrases as 'sentiment visualization', 'emotion visualization', and 'opinion visualization' (considering only literature in English). Finally, we have investigated references from related surveys as well as already detected research publications.

Selection criteria
We have used the following criteria for including/excluding techniques in our survey:
- a technique must be related to visualization of sentiment associated with text data (either extracted automatically or annotated manually);
- a technique must be illustrated by at least a single figure in the corresponding publication;
- a technique must be described in a peer-reviewed publication (including poster papers and extended abstracts);
- since we are focusing on techniques as opposed to publications, incremental work in several papers by the same authors is not considered as separate techniques; and
- a technique must actually involve an implementation used for visual representation or analysis (possibly even without interactive features) as opposed to figures generated with third-party tools solely for illustrative purposes in the respective publication.
Some of the candidate techniques had to be excluded with regard to the criteria above. Brath and Banissi [BB15] discuss the usage of text layout and font attributes for the purposes of text-related data visualization and mention sentiment analysis as one of the possible applications; however, the sentiment values are not directly visualized by their technique. OpinionRings by Du et al. [DYLL15] uses network data explicitly labelled with user opinions as opposed to extracting the data from text; we have also not included Opinion Space by Faridani et al. [FBRG10] for the same reason. Li et al. [LDS10] mention a visualization module used in their opinion mining system, but provide no further details or figures, so it is

Chosen publications/techniques
The resulting set of sentiment visualization techniques comprises 132 entries from a wide range of journals and conferences. The statistics for publication outlets in Table 1 indicate, based on the variety of outlets, that researchers from multiple non-visualization disciplines have demonstrated interest in sentiment visualization, thus reinforcing our claim for the importance of this research problem. The analysis of temporal distribution (see Figure 1) shows that a stable interest in the problem emerged in the mid-2000s and strongly increased in the beginning of the current decade.

Sentiment Visualization Techniques
The initial version of the categorization was based on our previous work related to text visualization [KK15]. Inspired by the VA pipeline model by Keim et al. [KAF*08], the model presented in Figure 3 treats the resulting sentiment visualization as a combination of the general InfoVis approach and computational methods, both applied to text data. The aspects used in our categorization (highlighted in yellow) vary from general to specific (left to right). These 7 aspects, or groups, include the total of 35 categories listed in Table 2. The categorization facilitates the search for visualization techniques and corresponding publications for interested readers based on their data, required analytic and visualization tasks, and even specific encodings used for sentiment. In this section, we discuss the individual categories and corresponding prominent examples; for full details on each technique's categorization, see the summary tables or our interactive browser (both discussed in Section 6).

Data aspects
Every sentiment visualization technique relies on certain data, and both higher-level (for instance, the original domain) and lower-level (e.g. source representation) data aspects affect the later stages of the pipeline.

Data domain
Most sentiment visualization techniques are designed with a specific data domain in mind, some of which have been historically or currently associated with sentiment analysis and opinion mining tasks in visualization, NLP, and DM communities.
A prominent example of such a domain that provides a lot of text data suitable for sentiment analysis is Online Social Media including forums, blogs, microblogs, and social networks. It is no surprise that the majority of techniques in our data (around 62%) support this category. Some of the early examples include MoodViews by Mishne and de Rijke [MDR06], an NLP system for affect analysis ('mood') of blogs which uses explicit mood tags provided by users as well as predicted mood levels for at least 132 mood types proposed by the blog platform. The system applies basic line plots for temporal visualization and allows the users to interactively investigate salient terms and phrases for selected time intervals. Ink Blots by Abbasi and Chen [AC07] is a technique for exploration of documents and corpora that uses a simple bubble metaphor to mark regions of interest in communication and social media (i.e. forums) texts. With the development of microblogs, more and more techniques started to focus on this data. Diakopoulos et al. [DNKS10] describe Vox Civitas (see Figure 4(c)), a visual analysis tool for investigation of sentiment, relevance, and salient keywords in social media posts related to public events, which uses a pixel-based stacked bar timeline to represent sentiment values. TwitInfo by Marcus et al. [MBB*11] is a visual event analysis system for Twitter streams which supports sentiment polarity detection. The aggregated polarity value is visualized with a pie chart, and the polarity for individual documents is encoded as colour of the corresponding markers in a geographical map view, as depicted [...]
Communication such as emails and chats can also be subject to sentiment analysis and visualization (used by 11% of techniques). The earliest example in our data is CrystalChat by Tat and Carpendale [TC06], a visualization technique for personal chat history that represents individual messages as circles and organizes them in a 3D layout with regard to temporal order and chat contact. The technique encodes the emotional content of conversations based on detected emoticons as the background plane colour. Another technique for communication texts is described by Gobron et al. [GAP*10], and it is a rather unusual emotion visualization technique bordering on computer graphics rather than InfoVis. The authors use the Facial Action Coding System (FACS) to create animated 3D avatars whose faces convey the emotions related to the corresponding text data. Chen et al. [CFKA14] conduct visual analysis of chat logs using emoticons classified by valence/polarity. Mail data is supported by Ink Blots by Abbasi and Chen [AC07], the work of Mohammad [Moh12], 5W Summarization by Das et al. [DBG12], the technique by Guzman [Guz13], and TargetVue by Cao et al. [CSL*16].
Before the rise of social media, sentiment analysis was almost solely used to analyze product reviews and customer feedback. The category Reviews/(Medical) Reports is supported by 25% of sentiment visualization techniques, including Affect Inspector by Subasic and Huettner [SH01], which uses star plots to visualize affect profiles of text documents, including movie reviews. Pulse by Gamon et al. [GACOR05] uses a tree map to represent a clustering of sentence-level sentiment classification of car reviews (see Figure 4(a)). Opinion Observer by Liu et al. [LHC05] provides a visualization of customer opinions using modified bar charts. AMAZING by Miao et al. [MLD09], depicted in Figure 4(b), is an opinion mining system for product reviews that visualizes NLP processing results with a line chart (using the review timestamps) and a pie chart (a simple summary for the proportion of positive/negative reviews). In general, a lot of techniques in this category originate from NLP and DM rather than the visualization community, focus mostly on the analytical part, and use rather simple visual representations and interactions. In contrast, Chen et al. [CISSW06] use multiple analytical and visualization techniques for investigation of conflicting opinions in customer reviews.
[...] on lexical matching of terms associated with eight emotions as well as positive/negative categories. The author uses multiple basic representations such as line plots, bar charts, and word clouds to analyze the distribution and temporal trends of emotion-bearing word usage in individual documents and corpora. Finally, Weiler et al. [WGS15] apply their text stream analysis and visualization system called Stor-e-Motion to a combined text of the whole 'Harry Potter' series. The resulting visualization represents sentiment as a river-like stream graph with an overlay of a list of salient topic terms for each time interval (in this particular case, position in text is treated as timestamp).
We did not expect to find a lot of work focusing on sentiment visualization in Scientific Articles/Papers due to the style standard in this genre. The few existing techniques (2%) detect polarity or stance in text that surrounds citations and use this information for analysis and visualization of citation networks. Schäfer and Spurk [SS10] classify polarity and reuse of citations in scientific articles and use the classification results for colour coding of the citation graph edges. Small [Sma11] discusses sentiment analysis and visualization of citations in scientific literature with Maps of Science. His approach results in classification of multiple categories beyond the standard polarity-related ones. For instance, uncertainty and differentiation/contrast can be considered as categories of stance. The results of analysis are used to calculate the layout of a node-link diagram which resembles a map. The recent work of Wang et al. [WLQ*16] involves polarity classification and visualization of citations to represent a citation graph for paper review purposes.

The last category of data domains is Editorial Media such as news or pre-moderated websites (e.g. Wikipedia), and this category is supported by 14% of techniques. Some of the early examples here include SATISFI, 'Sentiment and Time Series: Financial Analysis System' by Taskaya and Ahmad [TA03], which uses lexical matching with specific markers to analyze the polarity of financial news documents. The polarity values are aggregated and treated as time series which are visualized with simple line plots. Fukuhara et al. [FNN07] use line charts to visualize topic data associated with eight affect categories in news and blogs, and vice versa, affect data associated with a specific topic. Gamon et al. [GBB*08] describe BLEWS, a system dedicated to the analysis of relations between news articles and political blog posts that refer to such articles. The number of detected subjective posts is encoded visually as the amount of glow around the corresponding bars representing the number of liberal and conservative blog posts. Zhang et al. [ZKKT09] introduce Sentiment Map, a lightweight visualization based on a geographical map. The tool detects eight emotion categories in news articles over time and visualizes the resulting time series with line plots for corresponding geographical regions based on the zoom level. In contrast to some of these techniques with simple and standard representations, TextWheel by Cui et al. [CQZ*12] introduces a combination of several complex and novel representations for monitoring of sentiment associated with specific entities in the news streams. Figure 5(b) displays some of the representations used in the system: a radial node-link diagram representing relations between entities (a keyword wheel), a U-shaped transportation belt that acts as a substrate for moving document glyphs, and a significance trend chart that is represented by a line plot.
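Several of the techniques above aggregate document-level polarity values into a time series before drawing a line plot. A minimal sketch of such an aggregation step is given below; the function name, the daily granularity, and the use of the arithmetic mean are assumptions made for illustration and are not taken from SATISFI or any other cited system.

```python
from collections import defaultdict
from datetime import date


def daily_mean_polarity(documents):
    """Aggregate (date, polarity) pairs into a chronologically sorted series.

    `documents` is an iterable of (datetime.date, numeric polarity) tuples.
    The result is a list of (date, mean polarity) tuples, one per day, which
    could be passed directly to a line plot.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for day, polarity in documents:
        sums[day] += polarity
        counts[day] += 1
    return [(day, sums[day] / counts[day]) for day in sorted(sums)]
```

Other aggregation choices (for instance, counts of positive and negative documents per interval rather than a mean) lead to different visual encodings such as stacked bars or stream graphs.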
We have also found several techniques (around 4%) that are not designed for any particular data domain. Duan et al. [DQP*12] describe VISA, a system for temporal aspect-based visual sentiment analysis which extends the more general system TIARA with sentiment analysis capabilities. VISA uses a river-like stream graph with embedded tag clouds as well as several auxiliary views (pie charts, bar charts, text views) with multiple interactions to support a number of user tasks. The authors demonstrate VISA with a use case involving hotel reviews data, but their approach is not specific to this data domain. Semantize by Wecker et al. [WLM*14] is a lightweight web-based visualization technique that uses font style and background colour to encode the word-level polarity, sentence-level subjectivity, and paragraph-level polarity directly in the HTML document. Neviarouskaya et al. [NAPI14] visualize the results of document analysis with their @AM model, including the judgement and appreciation aspects which go beyond typical polarity and emotion categories. Gold et al. [GREA15] list sentiment polarity as one of the possible annotation types for their Lexical Episode Plots technique. In this case, particular words as well as bar segments representing text document regions can be highlighted to facilitate the understanding of affective structure of the text. Typographic Set Graph by Brath and Banissi [BB16] provides a map representation of a word-emotion association lexicon. This technique manages to simultaneously represent membership of individual words in ten sets (eight emotion and two polarity categories) by using font style and colour.
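The idea of encoding word-level polarity directly in an HTML document, as described for Semantize above, can be sketched as follows. This is not the implementation of Wecker et al.; the toy lexicon, the colour values, and the function name are hypothetical, and only the background colour channel at the word level is shown.

```python
import html
import re

# Hypothetical toy lexicon and colours, for illustration only.
TOY_LEXICON = {"good": 1, "excellent": 1, "bad": -1, "awful": -1}
COLOURS = {1: "#c8e6c9", -1: "#ffcdd2"}  # light green / light red


def annotate_html(text):
    """Wrap sentiment-bearing words in <span> elements with a background colour.

    The text is split into alternating runs of word and non-word characters,
    so whitespace and punctuation are preserved; every piece is HTML-escaped.
    """
    parts = []
    for piece in re.findall(r"\w+|\W+", text):
        escaped = html.escape(piece)
        polarity = TOY_LEXICON.get(piece.lower(), 0)
        if polarity:
            colour = COLOURS[polarity]
            parts.append(f'<span style="background-color: {colour}">{escaped}</span>')
        else:
            parts.append(escaped)
    return "".join(parts)
```

Sentence-level and paragraph-level values could be encoded analogously with additional channels such as font style, at the cost of a more crowded in-text representation.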

Data source
The type of text data source has implications for the design of sentiment visualization techniques in most cases. However, the categorization can sometimes become fuzzy, since a single large document (e.g. a novel or a transcript) can be treated as a collection or sequence of its sections; and vice versa, multiple separate texts can be concatenated for analysis and visualization.
First of all, we have identified techniques that support data from an individual Document (15% of the total set). In most such cases, visualization of a single document is used either to support details on demand, or to represent a subset of data from another type of data source. For instance, Affect Inspector by Subasic and Huettner [SH01] can visualize affect profiles for a single document or several documents at a time. uVSAT by Kucher et al. [KSBK*16] provides a separate representation of individual documents based on selected subsets of aggregated emotion/stance value series initially presented to the user (see Figure 6 [...] as well as the position of text fragments in the document. The technique by Bembenik and Andruszkiewicz [BA16] takes an unstructured text document as input, extracts proposals and arguments alongside their polarity, and visualizes this data using a node-link diagram.
The vast majority of techniques in our data set (86%) support text data from a collection of documents, a corpus, or corpora. In contrast to the visualization of individual documents, techniques that support such data sources typically have to address challenges related to larger data set sizes, varying text lengths, relationships between documents, and additional data properties (see Section 5.1.3). A typical example here would be a technique oriented at a collection of customer reviews [XLLS11] [...]
During the last five years, research in streaming data visualization has also produced a number of sentiment visualization techniques which support Streams as data sources (19% of our collected set). Most of these techniques consume data from microblogs such as Twitter or Weibo, for instance, TwitInfo by Marcus et al. [MBB*11] or MoodLens by Zhao et al. [ZDWX12]. In some cases, the major focus of a technique is set on event detection, and sentiment analysis/visualization plays an auxiliary role. For example, Krstajić et al. [KRHW12] describe a visual analysis system for event detection in Twitter data which uses the aspect-based sentiment analysis method introduced by another technique [RHD*12] to calculate polarity scores for individual documents among other features. [...] [SBBI*15] is a VA system for text stream data that focuses specifically on temporal sentiment analysis of Twitter data. Polarity of individual tweets is visualized with stream graphs and geospatial heat maps, and a fine-grained analysis of emotions in the text data is available with a scatter plot representation. PaloPro by Tsirakis et al. [TPTV16] is a brand monitoring platform which conducts opinion mining of data streams from multiple social media and news sources. Its dashboard visualization includes line plots and bar charts representing polarity for specific topics or named entities.

Data properties
Besides the data source type, we have also analyzed special properties of the data used by sentiment visualization techniques. The results of our analyses discussed in Section 6 confirm that these properties are correlated with the later stages of the sentiment visualization pipeline.
Some of the recent techniques (22%) make use of Geospatial information, starting with Sentiment Map by Zhang et al. [ZKKT09]. This tool identifies affective content (namely, eight emotion categories) in news articles. [...] present webLyzard, a platform for monitoring and visual analysis of social media, news, and other text documents from the Web. Among other analyses, it supports polarity detection for specific topics and uses these polarity values with line charts, a map and a tag cloud. One of the techniques described by Zhang et al. [ZLW13] provides a geospatial visualization of sentiment in social media. The technique combines a regular geographical map with the representations of kernel density estimation (KDE) analysis for sentiment polarity in specific cities and edges between cities representing the retweeting network. Caragea [...]
[...] [HC14] discuss ConVis, a visualization system for discourse analysis of social media discussions that uses a network of users, topics/concepts, and opinions. The resulting visualization displayed in Figure 5(c) combines a radial node-link diagram with an indented sequence of stacked bars that represents the conversation tree. The subsequent work of the same authors on MultiConVis [HC16] extends the analysis to multiple discussions and temporal data. A similar network of users, entities/concepts, and opinions is also visualized with ORCAESTRA by Prasojo et al. [PDK15]. The authors propose a planetary metaphor for their main node-link representation which represents nodes as stars, planets, and asteroids, and uses a heliocentric radial layout. Asteroid nodes represent individual comments and use colour to encode the corresponding sentiment polarity.

Tasks
The process of designing a sentiment visualization technique includes the analysis of intended data, intended audience, and intended tasks. We focus on the latter aspect in this subsection and analyze the tasks related to both the computational and visual/exploratory methods.

Analytic tasks
High-level analytic tasks supported by sentiment visualization techniques are based on the respective sentiment analysis models introduced in Section 2.
Polarity Analysis/Subjectivity Detection is the most common analytic task in our survey associated with 81% of techniques. The techniques which support only this analytic task provide a summary about the overall polarity of text data, for example, SATISFI by Taskaya and Ahmad [TA03] or Vox Civitas by Diakopoulos et al. [DNKS10]. Annett and Kondrak [AK08] extend a blog visualization tool eNulog with sentiment analysis by classifying movie blog posts into positive, negative, and neutral/uncertain, and using these labels for colour coding of the blog map nodes. Pupi et al. [PDPA14] highlight the polarity of individual sentiment-bearing terms in their system Ent-it-UP. Agave by Brooks et al. [BRT*14] is one of the few sentiment visualization systems that focus on collaborative visual analysis of social media data. The system visualizes the aggregated polarity of temporal text data using representations such as line plots and stream graphs. The final example in this category is BLEWS by Gamon et al. [GBB*08], a system dedicated to the analysis of relations between news articles and political blog posts that refer to such articles. The characteristic detail of BLEWS is that it focuses only on detecting subjectivity in the blog posts without more detailed polarity analysis. The number of detected subjective posts is visually encoded as the amount of glow around the corresponding bars representing the number of liberal and conservative blog posts.
Another category with high support in our data set (64%) is Opinion Mining/Aspect-based Sentiment Analysis. We have used this item for techniques supporting sentiment analysis at the level of particular aspects/features, topics, named entities, or clusters detected in text. This task has often been supported historically with customer reviews data. For instance, Review Spotlight by Yatani et al. [YNTT11] is a simple visualization tool for summarizing customer reviews with a tag cloud of salient adjective/noun word pairs. The sentiment polarity for each word is calculated with lexical matching over its counterparts and used for colour coding. OpinionBlocks by Alper et al. [AYHK11] provides an aspect-based sentiment overview for customer reviews. The visualization combines multiple coordinated bar charts and text tags that can be explored interactively to investigate the polarity and salient keywords associated with specific product features. SentiVis by Di Caro and Grella [DCG13] visualizes results of aspect-based sentiment analysis of customer reviews. After selecting a specific aspect, the users are provided with a visual representation of polarity scores for review objects (in this case, restaurants) that combines a scatter plot and a line plot. Görg et al. [GLK*13] discuss the text analysis and visualization features added to Jigsaw, a general-purpose VA system. The support for sentiment analysis includes document-level polarity detection, which can also be used for analysis of specific aspects when combined with topic analysis, as demonstrated with car reviews data. The visual representations involving the polarity data include word clouds and pixel-based Document Grid Views that can encode document-level polarity.
Another prominent application of techniques in this category is the analysis of social media texts. Wensel and Sood [WS08] describe several basic visualizations of sentiment with regard to the specific topics discovered in personal blog posts with their system VIBES. The authors estimate valence of the texts (hence their claims about analyzing the emotional content of the text data; nevertheless, valence detection on its own is similar, if not [...])
Emotion/Affect Analysis is related to analysis of affective content in text beyond the positive/negative categories, usually involving a categorical or dimensional emotion model. Approximately 24% of techniques in our survey support this task. Affect Inspector by Subasic and Huettner [SH01] uses star plots to visualize affect profiles of text documents. The authors combine lexical matching with fuzzy tagging to label documents with 83 affect categories (including basic emotions), and support visual exploration and annotation with their tool. Gregory et al. [GCW*06] conduct analyses of affect-bearing words in customer reviews using lexical methods, an annotation tool and a general-purpose visualization system IN-SPIRE. They propose a novel metaphor based on a rose plot to visualize statistics for eight affect categories. The users can investigate the rose plots for the corpus in general as well as particular clusters of reviews. Kang and Ren [KR11] conduct a joint emotion/topic analysis in blog posts. They analyse the output of their method by examining node-link diagrams generated for topic networks that use colour coding based on one of the eight emotions dominating for the respective topic. Zhao et al. [ZGWZ14] support visual analysis of emotions detected in tweets over time for an individual user with their system PEARL. They use two emotion models, Plutchik's categorical model with eight basic emotions and the dimensional VAD model. PEARL uses river-like stream graphs as the main visual representation for temporal emotion data alongside multiple auxiliary representations (line and area charts, glyphs, scatter plots, and tag clouds) to support overview and detailed exploratory analysis. Wang et al. [WSK*15] discuss a technique for emotion analysis of Twitter data called SentiCompass. As displayed in Figure 6(a), the visual representation uses star plots that naturally correspond to the polar valence-arousal space and organizes them in a nested fashion resembling a spiral to support the simultaneous visual analysis of multiple temporal intervals. Additionally, the same data is represented by linearly ordered line plots as an auxiliary representation.
Finally, we have introduced the category Stance Analysis to represent analyses that encompass not only sentiment/affect, but also other categories of subjectivity/evaluation expressed in text, for instance, certainty or judgement. Currently, very few techniques support this task (6%), including the works of Small [Sma11], Almutairi [Alm13], Neviarouskaya et al. [NAPI14], and Bembenik and Andruszkiewicz [BA16] discussed above. Torkildson et al. [TSA14] complement Ekman's six emotions with two additional categories representing support and accusation for their visualization of social media posts on the Gulf Oil Spill crisis. uVSAT by Kucher et al. [KSBK*16] is a VA system dedicated to sentiment and stance analysis which detects the expressions of Ekman's six emotions as well as certainty and uncertainty in social media data over time. These time series are visualized with line plots, and the contents of corresponding documents are represented with highlighted text, scatter plot thumbnails, and bubble charts, as displayed in Figure 6. [...] certain topic besides the expressed sentiment. Finally, El-Assady et al. [EAGA*16] provide an animation-based visualization of conversation transcripts, e.g. political debate transcripts, with their system ConToVi. The system supports several stance categories related to argumentation as well as sentiment, certainty, and politeness. It allows the users to monitor the stance of individual speakers with regard to specific topics.

Visualization tasks
Besides the higher-level analytic tasks related to the sentiment analysis model, we have included a number of more concrete representation and interaction tasks that are directly supported by sentiment visualization techniques. Many techniques in our survey (58%) are related to Clustering/Classification/Categorization. Here, we have identified the techniques that involve additional (semi-)automatic tagging or grouping of data elements besides the actual sentiment classification, represented or facilitated by the visualization. The work by Oelke et al. [OHR*09] on opinion analysis for customer reviews uses several techniques for visual analysis of aspects/features, including a matrix-based visual summary report and a circular correlation map (a sort of combination of a node-link diagram and a parallel coordinates plot). The authors also introduce a novel technique for the visual analysis of opinion clusters that combines a Voronoi diagram with thumbnails containing cluster details as tables. Brew et al. [BGAC11] support the temporal sentiment polarity analysis for groups of Twitter users with their SentireCrowds system. They cluster users into groups using tweet contents and calculate the aggregated polarity values for each cluster/time step. SentireCrowds provides an overview of the overall sentiment over time with an area chart as well as multiple treemaps for individual time steps which represent the polarity and salient keywords for user clusters. Kim and Lee [KL14] discuss a dimensionality reduction technique called Semi-Supervised Laplacian Eigenmaps which they apply to the customer reviews data. The method involves extraction of features (terms) related to positive/negative categories and dimensionality reduction of the feature space based on graph and matrix computations. The resulting 2D embedding of reviews, which highlights the clustering in the data, is visualized with a colour-coded scatterplot. ToPIN by Sung et al. [SHS*16] identifies topics in student comments by using a clustering algorithm and then represents them as nodes. The average polarity of comments belonging to the same cluster is encoded by node brightness.
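The aggregation step described above for SentireCrowds (one polarity value per user cluster and time step) can be illustrated with a minimal sketch. It is not the authors' implementation: the record layout, the function name `aggregate_polarity`, and the choice of the arithmetic mean as the aggregate are assumptions made for illustration, since the description only states that aggregated polarity values are calculated.

```python
from collections import defaultdict


def aggregate_polarity(records):
    """Average polarity per (cluster, time_step) pair.

    `records` is an iterable of (cluster_id, time_step, polarity) tuples.
    The tuple layout and the use of the mean are assumptions for this sketch.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cluster_id, time_step, polarity in records:
        key = (cluster_id, time_step)
        sums[key] += polarity
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Hypothetical records: polarity in [-1, 1] for posts already assigned to clusters.
EXAMPLE_RECORDS = [
    ("cluster_1", 0, 1.0),
    ("cluster_1", 0, -1.0),
    ("cluster_1", 1, 0.5),
    ("cluster_2", 0, -0.5),
]

if __name__ == "__main__":
    for (cluster_id, time_step), value in sorted(aggregate_polarity(EXAMPLE_RECORDS).items()):
        print(cluster_id, time_step, value)
```

Each resulting value could then drive the colour of one treemap cell for the corresponding time step, while the per-step average over all clusters could feed an overview chart.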
Comparison of several entities is facilitated by most visualization techniques in our survey (91%). For instance, Xu et al. [XLLS11] propose a method for opinion analysis of product reviews that directly takes the comparisons of particular features into account. The resulting visualization uses a node-link diagram to represent a probabilistic graphical model as a bipartite graph, where the polarity and direction of comparison are displayed for each feature. Kuksenok et al. [KBR*12] use a timeline visualization to represent occurrences of several affect categories in their annotated data set and to identify relationships between such categories. One of the techniques used in the SocialBrands system by Liu et al. [LXG*16] is BrandWheel, which visualizes the scores of various brand aspects estimated by the analysis of social media posts and employees' reviews. The system is capable of simultaneously displaying two BrandWheels for comparison purposes and even visualizing their differences as a derived representation. A similar comparison mode was previously used in EmotionWatch by Kempter et al. [KSMP14].
[...] The users can then validate and edit the predicted sentiment values by using several visual representations: a tree cloud and a scatterplot with embedded word clouds. Uncertainty in this case is related to the ambiguous polarity predictions, which are highlighted in yellow in the tree cloud representation. The VA framework for event cueing by Lu et al. [LSB*16] involves the polarity analysis of RSS news messages based on ML methods. The authors address the task of uncertainty visualization with stream graphs: stacked layers encode the volumes for several sentiment classification certainty levels over time. The previous work from the same authors [LHW*15] also uses blurred map glyphs to represent uncertainty of Twitter data polarity analysis for disaster scenarios.

Visualization aspects
The final two groups of our categorization are related to specific aspects of representing sentiment data visually.

Visual variable
One of the interesting research questions that we aimed to answer with this survey was how sentiment values are usually represented by various techniques. Based on the work by Bertin [Ber83], we have introduced several categories for the visual variables used to encode sentiment. [...] and emotions (moods) in RSS news feeds. The main visual representation of Eventscapes uses a timeline metaphor with document thumbnails laid out linearly according to their timestamps and valence values (vertical and horizontal axis, respectively). Also, several techniques which use a gauge metaphor rely on orientation rather than position [WS08,ZQLT16].
Kim and Lee [KL14] use several marker shapes as well as colour coding to differentiate between labeled/unlabelled positive and negative reviews in their scatterplot representation. Other techniques use the contour of a heat map or similar representation to convey emotion values [LR09,Alm13]. Kempter et al. [KSMP14] conduct an emotion analysis of social media texts with 20 emotion categories. Their visualization system EmotionWatch uses a star plot with filled area to represent the emotions detected for the selected time interval. Munezero et al. [MMMS15] focus on the temporal emotion analysis and visualization for an individual Twitter user with their tool EmoTwitter. One of the visual representations used for the resulting eight basic emotions is a star plot with filled area.

Size/Area
Finally, we have discovered that only four techniques in our complete data set (3%) use Texture/Pattern to represent sentiment. Gali et al. [GOCD12] introduce three techniques involving polarity detection in timestamped social media posts related to five Canadian banks. One of these techniques, Emotional Tapestry, generates monthly summaries as woven patterns different for various sentiments, which can be combined for several banks at a time. Zhang et al. [ZLW13] introduce a visualization technique for temporal visualization of social media sentiment that is based on the Electron Cloud Model. The authors calculate the polarity of posts over time for individual users and then visualize this data using a special layout algorithm, which results in a texture-like rendering of trajectory lines. Kuang et al. [KTLS14] describe ImgWordle, a visualization tool that is designed for social media monitoring. One of its visual representations is a choropleth map that provides an overview of frequency and sentiment polarity of posts for each region. The authors use textures to represent sentiment, since the colour channel is used to convey topic data. Finally, Krcadinac et al. [KJDP16] propose an artistic visualization technique for chat conversations called Synemania. They analyze the emotions present in chat messages using Ekman's six emotions and then use this data in an animated particle simulation. The visualization can therefore be used for monitoring of emotions during a conversation by observing the overall texture and colours.

Visual representation
The last group in our categorization includes visual representations (or metaphors) that make use of sentiment. The statistics in this group are less heterogeneous than those for the visual variable group: the categories described below are supported by 17% to 43% of techniques. The majority of techniques in our survey (70%) have more than one category assigned: in many cases, novel or complex representations combine several traits, or multiple coordinated views are used to represent data that includes sentiment information.
The first category in this group is Line Plot/River, supported by 43% of techniques. Basic line plots/charts have been used primarily for temporal data in sentiment visualizations since earlier works such as SATISFI [TA03] and MoodViews [MDR06] up to the recent techniques such as uVSAT [KSBK*16] and Westeros Sentinel [SHHJ*16]. We also include the techniques that use area charts in this group since they usually convey the same data as line plots, for example, the work by Fukuhara et al. [FNN07] or Lingoscope by Diakopoulos et al. [DZES14]. Then there are river-like representations [HHN00,BW08] [...]
The second category in the visual representations group is Pixel/Area/Matrix (used by 39% of techniques). Here, we have tried to collect the techniques which use space-filling approaches and other representations which rely on the size/area variable. One of the basic representations here is a pie chart: for instance, AMAZING by Miao et al. [MLD09], TwitInfo by Marcus et al. [MBB*11], and 5W Summarization by Das et al. [DBG12] use pie charts to provide an overall summary about the polarity distribution. Kumamoto et al. [KWS14] describe an emotion detection tool for personal Twitter data that uses six emotion categories organized in polar pairs. The tool presents a simple visualization consisting of line charts and pie charts for the individual user's data over time. [...] The sizes of nodes represented by squares are calculated similarly to a regular tree map, but the layout takes the geographical positions of cities into account.

Various forms of Node-Link representations are used by sentiment visualization techniques (17%) on their own, e.g. in works of Kang and Ren [KR11], Small [Sma11], or Makki et al. [MBM14], or in combination with other representations such as a map (e.g. in Whisper by Cao et al. [CLS*12] or the work by Zhang et al. [ZLW13]) or a (stacked) bar chart (for instance, in the work by Xu et al. [XLLS11] or the works by Hoque and Carenini [HC14,HC16]). Several techniques use arc diagram variations, e.g. the work by Chen et al. [CISSW06] or News Flow by Braşoveanu et al. [BHHS12]. Arc diagrams are also used in the recent work by Fu et al. [FZCQ17], whose system iForum uses multiple representations for visual analysis of MOOC forum data. Individual threads are represented by a combination of a river and an arc diagram called Thread River, and this representation can be user-configured to display sentiment analysis results.
We have used the next category for representations that use multiple visual items (such as dots, bubbles, glyphs, or words/tags) to give rise to associations with Clouds/Galaxies (supported by 27% of the techniques set). Some of the techniques in this category use various forms of word/tag clouds, for instance, Review Spotlight by Yatani et al. [YNTT11] and webLyzard by Scharl et al. [SHHW*12]. Fisheye Word Cloud by Wang et al. [WDN13] is a technique for temporal sentiment visualization that detects the polarity of individual key terms in a set of Twitter posts. The visual representation is based on a word cloud whose layout takes the temporal order into account. The technique makes heavy use of focus+context for interactive exploration. Other techniques rely on clouds of dots or similar markers, e.g. RadViz-based Attribute Astrolabe used in SentiView by Wang et al. [WXL*13] and the work by Kim and Lee [KL14]. Opinion Zoom by Marrese-Taylor et al. [MTVBM13] provides a lightweight visualization of customer reviews using an aspect-based sentiment analysis model. The system uses basic visual representations such as bar charts and bubble charts. Lu et al. [LKT*14] describe a VA framework for classification and prediction that uses sentiment analysis to predict movie gross based on social media texts and movie reviews. Polarity values appear in several visualizations of the framework, namely, word clouds and temporal bubble charts. Finally, the galaxy metaphor is also used in a more literal sense in CosMovis by Ha et al. [HKH*14], a system for emotion analysis in movie reviews which displays a constellation map of movies positioned according to the specific affect-bearing words detected in reviews. The visualization also contains the centroids of corresponding word clusters and artistic representations of emotions as constellations.

The category Maps includes techniques (27%) which use either (1) an actual geographical map or (2) an abstract map which somehow allows the user to identify interesting regions or peaks in the overall landscape. In the former case (1), the maps are often augmented with markers (e.g. in TwitInfo by Marcus et al. [MBB*11]) or overlays (e.g. in the work by Zhang et al. [ZLW13]). The visualization of public opinions on educational institutions in eduMRS-II by Qiu et al. [QRQ15] provides a map with overlaid circles/bubbles which represent the aggregated positive or negative polarity. A similar approach is used by Dai and Prout [DP16] to represent the aggregated positive sentiment on the Super Bowl teams extracted from Twitter. Tweetviz by Sijtsma et al. [SQC16] uses a map view to represent business locations reviewed in tweets and encodes the review polarity with the map marker colour. Some techniques use the choropleth approach, for instance, MoodLens by Zhao et al. [ZDWX12] and ImgWordle by Kuang et al. [KTLS14].
[...] topical (whether the RSS news is related to the Democratic or Republican party), and polarity information. A later work by Wanner et al. [WWS12], Topic Tracker, is a system for temporal visual analysis of Twitter streams that combines topic monitoring and gradual sentiment polarity detection. The authors use basic colour-coded triangle glyphs representing the timestamp and polarity of individual tweets. The glyphs are positioned in a dense fashion, and the final result resembles pixel-based representations. FluxFlow by Zhao et al. [ZCW*14] is a VA system for investigating anomalous patterns of information spreading on social media, namely, retweeting threads on Twitter. The system uses lexical matching of emotion-bearing words as one of the features for estimating the anomaly score and visualizes these values as part of thread glyphs. Several other techniques use gauge-like glyphs, for instance, VIBES by Wensel and Sood [WS08] or Social Sentiment Sensor by Zhao et al. [ZQLT16].

Discussion
After several iterations of adding new techniques and refining the categorization, we have been able to summarize the state of the art in sentiment visualization based on the statistics for our data. In addition, we have investigated the relations between categories in general.

Correlation between categories
We have conducted a correlation analysis for categories assigned to sentiment visualization techniques. Technique entries were treated as observations, and categories were treated as dimensions/variables (see the supplementary material for the complete categorization results). Linear correlation analysis was then used to measure the association between pairs of categories. The resulting matrix in Figure 7 contains Pearson's r coefficient values which reveal certain patterns and interesting cases of positive (green) and negative (red) correlation between categories. The interpretation of the coefficient values seems to differ in the literature: Cohen [Coh88] defines the range 0.30-0.50 (absolute values) as moderate correlation and 0.51-1.00 as strong correlation; Taylor [Tay90] mentions the corresponding ranges 0.36-0.67 and 0.68-1.00 used in earlier works; and Evans [Eva96] defines the ranges 0.40-0.59 for moderate correlation, 0.60-0.79 for strong correlation, and 0.80-1.00 for very strong correlation. Based on this, we have focused on cases with an absolute value of 0.40 or higher.
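The analysis described above (techniques as observations, categories as binary variables, Pearson's r per pair, then a cut-off at an absolute value of 0.40) can be sketched as follows. This is only an illustration under stated assumptions: the toy assignments in `TECHNIQUES` are hypothetical and are not taken from the survey data set, and the helper names are ours. For 0/1 variables, Pearson's r coincides with the phi coefficient.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy data: technique -> set of assigned categories.
# Illustrative only; these are not entries from the survey data set.
TECHNIQUES = {
    "technique_a": {"network", "node_link"},
    "technique_b": {"network", "node_link", "temporal"},
    "technique_c": {"temporal", "line_plot"},
    "technique_d": {"temporal", "line_plot"},
    "technique_e": {"geospatial", "map"},
    "technique_f": {"temporal"},
}

CATEGORIES = sorted(set().union(*TECHNIQUES.values()))


def indicator(category):
    """Return a 0/1 vector with one observation per technique."""
    return [1 if category in assigned else 0 for assigned in TECHNIQUES.values()]


def pearson_r(x, y):
    """Pearson's r for two equally long numeric sequences (NaN if one is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


def notable_pairs(threshold=0.40):
    """Return (category_a, category_b, r) for all pairs with |r| >= threshold."""
    pairs = []
    for a, b in combinations(CATEGORIES, 2):
        r = pearson_r(indicator(a), indicator(b))
        if abs(r) >= threshold:  # NaN never passes this comparison
            pairs.append((a, b, round(r, 2)))
    return pairs


if __name__ == "__main__":
    for a, b, r in notable_pairs():
        print(f"{a:>10} vs {b:<10} r = {r:+.2f}")
```

The default threshold of 0.40 mirrors the cut-off chosen in the text; a different convention (for example, Cohen's 0.30) would only require changing the `threshold` argument.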
The interesting cases with negative correlation mostly include categories from the same groups, implying 'competition' or a kind of paradigmatic relation between them. For example, the correlation of −0.49 between reviews and social media could be explained by a general shift from data sets of well-defined product reviews to the data extracted from social media (including posts associated with some brands); see the discussion of such temporal trends below. The correlation of −0.51 between the document and corpora categories is interesting, given the discussion in Section 5.1.2. While there are many techniques using both data source types, this value could be explained by techniques focusing exclusively on individual documents or corpora. The document category is also negatively correlated to the temporal data property with a value of −0.45: while there are some techniques which treat position in text as the temporal dimension, the majority of the techniques which support time series apparently use other data sources. Finally, the strongest negative correlation in our survey is −0.63 between the analytic tasks of polarity and emotion/affect analysis, which is explained by a more specialized nature of the latter task.
The cases of positive correlation tend to include categories from different groups in a kind of syntagmatic relation. For instance, the correlation of 0.42 between the literature domain and the document data source type can be explained easily: the corresponding techniques focus on individual novels/poems. The correlation of 0.40 between the document data source and emotion/affect analysis is more surprising in this regard; perhaps it is affected by a rather low support for both categories. The other cases involving visual tasks and representations are generally easier to interpret. The positive correlation of the streams data source with such visualization tasks as region of interest and monitoring with the respective values of 0.48 and 0.64 was expected. The analytic task of opinion mining/aspect-based sentiment analysis involves the support for various kinds of classification/clustering/categorization, hence the positive correlation of 0.42 (it is a little surprising that this value is not higher, though).
Temporal data is often visualized using representations such as line plots or rivers, which explains the correlation of 0.55. The strongest positive correlation of 0.74 in our survey exists between network data and node-link representations, which is not surprising at all. We could also expect the correlation between geospatial data and map representations to be higher than 0.58, but we have noted in Section 5.3.2 that the latter category also includes abstract maps. Finally, the correlation of 0.41 between pixel/area/matrix representations and the visual variable of size is explained by such representations as bar charts and pie/donut charts.

Popular approaches
Table 2 presents the statistics for the collected data based on the final categorization. It supports our expectations of the most common aspects of existing sentiment visualization techniques. An average technique is used for visualization of temporal data (stored as corpora) from social media based on polarity analysis/subjectivity detection. The popularity of the more specific aspect-based sentiment analysis/opinion mining is explained by the existing interest in topic analysis and visualization. Based on the statistics given in Table 2, we can also identify a standard set of four visualization tasks relevant to sentiment, which reflect the visual information seeking mantra [Shn96] and visual analytics mantra [KAF*08] to some extent: if possible, cluster the data into groups first, then provide an overview of these results, compare interesting items, and explore them in detail. More specific visualization aspects related to variable and representation are discussed below. Also, we should note that the absolute majority of techniques rely only on 2D representations (even though we have not discussed it explicitly).

Temporal trends
In addition to the overall statistics, it is also interesting to analyze temporal trends with regard to the occurrence of individual categories in our collected data set. Figure 8 comprises sparkline-style plots based on the category counts normalized by the total technique counts for each year (for example, 6 out of 28 techniques from 2016 support streaming data). It allows us to detect global trends and compare trends within each group of categories. For instance, it confirms our previous statements about the popularity of social media and the decreasing role of reviews as data sources. The popular approaches discussed above demonstrate stable support throughout the years, but it is also interesting to trace the usage of underrepresented categories over time, as discussed below.
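The normalization used for the sparkline-style plots (category count divided by the total technique count of the same year) can be sketched as follows. The input layout and the function name `yearly_shares` are assumptions made for illustration; the example data are synthetic and only reproduce the 6-out-of-28 ratio quoted above for streaming data in 2016.

```python
from collections import Counter, defaultdict


def yearly_shares(entries):
    """Return {category: {year: share}} for an iterable of (year, categories) pairs.

    The share is the number of techniques of that year supporting the category,
    divided by the total number of techniques published in that year.
    """
    totals = Counter()
    counts = defaultdict(Counter)
    for year, categories in entries:
        totals[year] += 1
        for category in categories:
            counts[category][year] += 1
    return {
        category: {year: counts[category][year] / totals[year] for year in sorted(totals)}
        for category in counts
    }


# Synthetic example: 28 techniques in 2016, 6 of which support streaming data,
# plus a few hypothetical entries for 2015. The category names are placeholders.
EXAMPLE_ENTRIES = (
    [(2016, {"streams"})] * 6
    + [(2016, {"corpora"})] * 22
    + [(2015, {"corpora"})] * 3
    + [(2015, {"streams"})] * 1
)

if __name__ == "__main__":
    shares = yearly_shares(EXAMPLE_ENTRIES)
    for year, share in shares["streams"].items():
        print(year, f"{share:.2f}")
```

Because each year is normalized by its own total, years with few published techniques can be compared with very productive years on the same scale.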

Underrepresented categories
Emotion/affect analysis is supported by a relative minority of techniques, and only a few techniques address the issues of stance analysis. According to the temporal trends (see Figure 8), the interest has varied over the years for the former task and started to emerge only recently for the latter task. These analytical tasks present multiple opportunities for future research with applications in several domains (social media, literature, etc.). As for visualization tasks, uncertainty tackling is currently underrepresented. This also presents future research opportunities, since uncertainty is an inherent aspect of many popular ML models (e.g. SVM or CRF) as well as data sets characterized as 'Big Data'. The increasing interest in such data will surely also affect the number of techniques supporting streaming data sources as well as the related visualization task of monitoring. While such data sets are mostly associated with online sources, we have been surprised by the statistics for several other domains. For instance, very few techniques (mostly from the past years) address the literature domain as opposed to the overall growth of interest for digital humanities. Also note the low number of techniques that focus on a single document: the interest in such data sources has diminished over the years. We also had to exclude the domain of patents from our categorization since no detected techniques support it. This can be explained by the generally formal and objective style of language used in such texts; on the other hand, several techniques already support the related domain of scientific articles.
[Figure caption fragment: ... Table 2) calculated for our techniques data set.]
[Figure caption fragment: ... Table 2 for the legend). The values are relative to the total technique counts for the corresponding years.]

Visual representations
The statistics on the visual variable show that colour is the most common visual channel to convey sentiment/emotion, which we expected even before collecting the data. The rather large number of techniques using position/orientation and size/area to encode sentiment can be explained by the usage of line plots and stream graphs for temporal sentiment data, as well as pie charts and bar charts for simple visual summaries which are often used by techniques originating in non-visualization disciplines. This leads us to the issue of categorizing the visual representations used for sentiment into simple and complex, which was initially one of our intentions. The existing work investigating complex representations (for instance, [TKT*00, WBWK00, YN05]) refrains from providing exact definitions of them. Shamim et al. [SBT15] have already raised the issue of evaluating sentiment visualization techniques including the perception aspects; this problem presents interesting opportunities for future research.
Interactive exploration with a survey browser
We have developed an interactive survey browser similar to our TextVis Browser to accompany this paper and used it extensively ourselves while working on the paper. Figure 9 demonstrates its user interface that is focused on individual technique thumbnails and the interaction panel comprising category filters and a search field. In general, it follows the design decisions used by several existing browsers [Sch11, TA, KK14, KK15, BKW16]. SentimentVis Browser is implemented as a client-side web application using HTML, JavaScript, and D3. After loading the page, the user is presented with a list of visualization technique thumbnails organized in a grid. The entries are ordered by publication year first and then by the first author's surname. Clicking on a thumbnail opens a dialog box with details such as a complete bibliographical reference, a URL link to the source publication webpage (if available), a BibTeX file link, and a complete list of categories assigned to the corresponding technique. The categories are also presented in the form of filters in the main interaction panel on the left. Additionally, the panel includes a text search field, a time range slider, and a histogram showing the temporal distribution of techniques before and after filtering. The users can also access category statistics via the 'About' dialog (similar to the data in Table 2) and a summary table with an overview of the complete categorization (see below). We encourage the users to submit additional entries to SentimentVis Browser via a form available from the top panel. The process is not entirely automated, though, since we intend to continue careful curation of the survey.
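The filtering and ordering behaviour described above can be sketched in a few lines. Note that the actual browser is a client-side JavaScript/D3 application; the Python sketch below only illustrates the logic and is not its implementation. The `Entry` fields, the function names, the ascending sort direction, and the rule that an entry must carry all selected categories are assumptions, and the sample entries are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entry:
    """One technique in the browser (hypothetical field names)."""
    title: str
    first_author_surname: str
    year: int
    categories: frozenset = field(default_factory=frozenset)


def visible_entries(entries, selected=frozenset(), query="", year_range=None):
    """Apply category filters, text search, and the year range, then sort.

    Assumptions: all selected categories must be present, the search matches
    title or surname case-insensitively, and the order is ascending by
    (year, first author's surname).
    """
    query = query.strip().lower()
    result = []
    for entry in entries:
        if not set(selected) <= entry.categories:
            continue
        haystack = f"{entry.title} {entry.first_author_surname}".lower()
        if query and query not in haystack:
            continue
        if year_range and not (year_range[0] <= entry.year <= year_range[1]):
            continue
        result.append(entry)
    return sorted(result, key=lambda e: (e.year, e.first_author_surname.lower()))


def year_histogram(entries):
    """Count entries per year, e.g. before and after filtering."""
    return Counter(entry.year for entry in entries)


# Hypothetical sample entries, not taken from the survey data set.
SAMPLE = [
    Entry("Technique C", "Young", 2014, frozenset({"polarity", "map"})),
    Entry("Technique A", "Baker", 2014, frozenset({"polarity", "line_plot"})),
    Entry("Technique B", "Adams", 2016, frozenset({"emotion", "line_plot"})),
]

if __name__ == "__main__":
    for entry in visible_entries(SAMPLE, selected={"line_plot"}):
        print(entry.year, entry.first_author_surname, entry.title)
    print(year_histogram(SAMPLE), year_histogram(visible_entries(SAMPLE, query="technique a")))
```

Comparing `year_histogram` for the full list and for the filtered list corresponds to the before/after histogram mentioned in the text.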

Survey summary and citation counts
Besides the interactive survey browser, our entire resulting categorization is presented in the supplementary material (see Tables S1-S3). These results take the aforementioned scope of particular categories into account. For instance, most visualization techniques use several visual representations/views to represent a single complex data set. In such cases, we considered only the representations involving sentiment, and categorized the techniques accordingly. The supplementary material also includes citation counts for the publications corresponding to visualization techniques (see Table S4). While this information is external to the actual techniques, it allows us to make assumptions about the impact and adoption of sentiment visualization approaches. When using these citation counts as guidelines, though, it is important to take the publication year and discipline into account. For example, the most cited publications from our survey belong to disciplines such as data mining and NLP: these are the works on Affect Inspector [SH01], Pulse [GACOR05] (see Figure 4(a)), Opinion Observer [LHC05], comparative relation maps [XLLS11], and MoodLens [ZDWX12]. The most cited publications from visualization outlets are related to techniques such as TwitInfo [MBB*11] (see Figure 5(a)), Vox Civitas [DNKS10] (see Figure 4(c)), and OpinionSeer [WWL*10]. In general, these techniques fit well with the discussion of popular categories and temporal trends above, thus reinforcing its validity.

Conclusions
In this paper, we have analyzed the state of the art in visualization of sentiment detected in text data. We have discussed 132 visualization techniques originating from peer-reviewed publications using a fine-grained categorization comprising 7 groups with 35 categories in total. We have also introduced an interactive survey browser which supports the categorization, and discussed insights as well as opportunities for future research in sentiment visualization. The collected survey data indicates a growing multidisciplinary interest in visualization of sentiment with regard to multiple data domains and tasks. Our future work on this survey includes updates to the survey data set, eventual refinements of the categorization, additional features for the interactive browser, and more analytical work based on the resulting changes.

Supporting Information
Additional Supporting Information may be found in the online version of this article at the publisher's web site:
Table S1: The summary of our survey.