
Keywords:

  • data mining;
  • personality recognition;
  • social network analysis;
  • Twitter

Abstract

  1. Abstract
  2. 1 Introduction and Background
  3. 2 Collection of the Data Set
  4. 3 Definition of Personality and Emotional Stability
  5. 4 Automatic Personality Recognition: Related Work
  6. 5 Personality Recognition Tool
  7. 6 Experiments and Discussion
  8. 7 Conclusions and Future Work
  9. References

In this article, we address the issue of how emotional stability affects social relationships in Twitter. In particular, we focus our study on users’ communicative interactions, identified by the symbol “@.” We collected a corpus of about 200,000 Twitter posts and annotated it with our personality recognition system. This system exploits linguistic features, such as punctuation and emoticons, and statistical features, such as follower count and retweeted posts. We tested the system on a data set annotated with personality models produced by human subjects and against a software tool for the analysis of Twitter data. Social network analysis shows that, whereas secure users have more mutual connections, neurotic users post more than secure ones and tend to build longer chains of interacting users. Clustering coefficient analysis reveals that, whereas secure users tend to build stronger networks, neurotic users have difficulty belonging to a stable community and hence seek new contacts in online social networks.

1 Introduction and Background


Twitter is one of the most popular micro-blogging Web services. Founded in 2006, it allows users to post short messages of up to 140 characters, called “tweets.” The service rapidly gained worldwide popularity, with over 500 million active users as of 2012, generating over 340 million tweets daily and handling over 1.6 billion search queries per day. Since its launch, Twitter has become one of the 10 most visited websites on the Internet, and its use continues to spread (Twitter Search Team, May 31, 2011).

Following the definition of Boyd and Ellison (2007), Twitter is a social network site, but it shares some features with blogs. Zhao and Rosson (2009) highlighted that people use Twitter for a variety of social purposes, such as keeping in touch with friends and colleagues, raising the visibility of their interests, gathering useful information, seeking help, and relaxing. They also reported that the ways people use Twitter can be grouped into three broad classes: people updating personal life activities, people producing real-time information, and people following other people's Rich Site Summary (RSS) feeds, which is a way to keep informed about personal interests.

According to Boyd, Golder, and Lotan (2010), many features affect practices and conversations in Twitter. First of all, connections in Twitter are directed rather than mutual: users follow other users’ feeds and are followed by other users. Public messages can be addressed to specific users with the symbol @; according to Honeycutt and Herring (2009), this is used to reply to, cite, or include someone in a conversation. Messages can be marked and categorized using the “hashtag” symbol #, which works as an aggregator of posts that have something in common. Another important feature is that posts can be shared and propagated using the “retweet” practice. Boyd et al. (2010) emphasized that retweeting a post is a means of participating in a diffuse conversation. Moreover, posts can be marked as favorites, and users can be included in favorite lists. These practices enhance the visibility of the posts or the users.

In recent years, scientific interest has turned to the Twitter community, especially in information retrieval. For example, Pak and Paroubek (2010) developed a sentiment analysis classifier from Twitter data; Finin et al. (2010) performed named entity recognition on Twitter using crowdsourcing services such as Mechanical Turk1 and CrowdFlower2; and Zhao et al. (2011) proposed a ranking algorithm for extracting topic keyphrases from tweets. The computational personality recognition field has also shown great interest in the analysis of Twitter. For example, Quercia et al. (2011) analyzed the correlations between the five personality traits described by the Big Five factor model (Section 3) and the behavior of four types of users: listeners, popular, hi-read, and influential.

In this article, we describe a system we developed for the automatic extraction of the Big Five personality traits from text. Among the five traits, we analyze how emotional stability affects communicative interactions between users in Twitter. In the next section, we present the data set we collected from Twitter. Then, we give an overview of the Big Five personality traits, emotional stability, and automatic personality recognition from text. We provide a detailed description of our system, and in the last two sections, we report the results of the experiments and draw some conclusions.

2 Collection of the Data Set


A social network is a set of social entities connected by social relationships, such as friendship, coworking, or information exchange. We define our social network as a set of Twitter users connected by communication exchanges.

We collected a corpus called “Personalitwit2,” starting from Twitter's public time line.3 The sampling procedure is depicted in Figure 1.

Figure 1. Data sampling pipeline.

We sampled data from December 25 to 28, 2011, but most of the posts have an earlier posting date because we also collected data from user pages, where the 20 most recent tweets are displayed in reverse chronological order. We collected all the tweets displayed on the page of each public user sampled from the public time line, plus the nicknames of the related users (those who had conversations with them), detected using the @ symbol. Then, we used these nicknames to collect tweets from the pages of the related users. Users with empty pages were discarded.

In this way, we treat conversations as network connections, rather than relying on the following–follower relationships. We filtered out all the retweeted posts because they are not written by the users themselves and could affect linguistic-based personality recognition. The data set contains the following information for each post:
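The mention extraction and retweet filtering described above can be sketched in Python; the function names are ours, and the convention that retweets begin with “RT @” is an assumption about the raw text:

```python
import re

MENTION_RE = re.compile(r"@(\w+)")

def is_retweet(text):
    # Retweets are filtered out: they are not written by the
    # user and would distort linguistic personality features.
    return text.lstrip().lower().startswith("rt @")

def related_users(tweets):
    """Nicknames of users involved in conversations, detected
    through the @ symbol in a user's own (non-retweeted) posts."""
    related = set()
    for text in tweets:
        if not is_retweet(text):
            related.update(MENTION_RE.findall(text))
    return related
```

These nicknames would then seed the next crawling round, with users whose pages are empty discarded.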

  • username;

  • text;

  • post date;

  • user type (public user or related user);

  • user retweet count;

  • user following count;

  • user follower count;

  • user listed count;

  • user favorites count;

  • total tweet count;

  • user page creation year;

  • time zone;

  • related users (users who replied to the sampled user);

  • reply score (rp), defined as

    rp = (count of the user's posts that are replies) / (count of the user's posts),

    which provides a measure of users’ tendency to communicate with others; and

  • retweet score (rt), defined as

    rt = (count of the user's posts that are retweets) / (count of the user's posts),

    which provides a measure of the tendency of the users to propagate information through the network.
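As an illustration, assuming each score is simply the corresponding count divided by the user's total posts (a simplification; the function names are ours):

```python
def reply_score(reply_count, total_posts):
    # rp: tendency of a user to communicate with others,
    # here taken as the share of posts that are replies.
    return reply_count / total_posts if total_posts else 0.0

def retweet_score(retweet_count, total_posts):
    # rt: tendency of a user to propagate information,
    # here taken as the share of posts that are retweets.
    return retweet_count / total_posts if total_posts else 0.0
```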

The corpus contains 200,000 posts, more than 13,000 different users, and about 7,800 ego networks, where public users are the central nodes and related users are connected to them through the edges. The statistical summary of Personalitwit2, reported in Table 1, shows that a large number of users do not use favorites and lists, whereas few users have no following or followers. The distribution of users per language is reported in Figure 2. We ran experiments only on the English subset of the corpus (5,392 ego networks), leaving the analysis of other languages to future work. The summary of the English subset extracted from Personalitwit2 shows that English Twitter users have more followers than the general average and fewer listed users. We extracted a small development data set from the English subset for feature selection.

Table 1. Summary of Personalitwit2.

             Min   Median   Mean     Max
Tweets       3     5,284    12,246   582,057
Following    0     197      838      320,849
Followers    0     240      34,502   17,286,123
Listed       0     13       855      39,019
Favorites    0     7        157      62,689
Figure 2. Frequency distribution of users per language. From left to right: Arabic, Bahasa, Chinese, Czech, Dutch, English, French, German, Greek, Hebrew, Hindi, Italian, Japanese, Korean, Malay, Norwegian, Portuguese, Russian, Slovene, Spanish, Swedish, Thai, Turkish, and unidentified.

3 Definition of Personality and Emotional Stability


Personality is seen as a complex of attributes that uniquely characterize an individual. In psychology, it is defined as an affect-processing system (DeYoung 2010), and, according to Adelstein et al. (2011), it is connected to the behavioral responses of subjects to the environment.

A standard way to describe personality in psychology is the Big Five factor model, introduced by Norman (1963). The Big Five consists of five bipolar personality traits, namely, extraversion, emotional stability, agreeableness, conscientiousness, and openness, proposed in this form by Costa and McCrae (1992). The five broad factors emerged from testing several different personality traits by means of factor analysis over self-reports, questionnaire data, peer ratings, and objective measures. It is important to note that different research teams came to very similar conclusions independently (Digman 1990). Extraversion describes a person along the two opposite poles of sociability and shyness. Emotional stability, sometimes referred to by its negative pole (neuroticism), describes the modality of impulse control along a scale that goes from control (a calm and stable person) to instability (an anxious and neurotic person). Agreeableness refers to the tendency to be sympathetic and cooperative toward others rather than suspicious and antagonistic. Conscientiousness describes a person in terms of self-discipline versus disorganization. Openness to experience refers to the tendency to be creative and curious rather than unimaginative.

According to Digman (1990), many studies in psychology have independently come to the conclusion that five is the right number of dimensions to describe personality. Despite a general agreement on the number of traits, there is no full agreement on their meaning, because some traits are vague. For example, there is some disagreement about how to interpret the openness factor, which is sometimes called “intellect” rather than openness to experience. Emotional stability is one of the most robust personality traits. This emerges clearly when comparing the proposals of personality traits reported in Table 2, adapted from Digman (1990). Among the five traits, emotional stability plays a crucial role in social networks. Studying off-line social networks, Kanfer and Tanaka (1993) reported that secure (high emotional stability) subjects had more people interacting with them. Moreover, Van Zalk et al. (2011) reported that youths who are socially anxious (low emotional stability) have fewer friends in their network and tend to choose friends who are socially anxious too. We will test whether this also holds in online social networks.

Table 2. Proposals of Five Personality Traits since Norman (1963) (Adapted from Digman 1990).

Author            I                 II             III                IV                   V
Norman            Surgency          Agreeableness  Conscientiousness  Emotion              Culture
Borgatta          Assertiveness     Likability     Interest           Emotion              Intelligence
Eysenck           Extraversion                                        Neuroticism
Guilford          Activity          Disposition    Introversion       Emotional stability
Buss and Plomin   Activity          Sociability    Impulsivity        Emotionality
Tellegen          Positive emotion                 Constraint         Negative emotion
Costa and McCrae  Extraversion      Agreeableness  Conscientiousness  Neuroticism          Openness
Lorr              Involvement       Socialization  Self-control       Emotional stability  Independence
Hogan             Sociability       Likability     Prudence           Adjustment           Intellect
Digman            Extraversion      Compliance     Will               Neuroticism          Intellect

4 Automatic Personality Recognition: Related Work


The Big Five is a formalized model suitable for computational analysis. Two research fields have shown interest in automatic personality recognition in recent years: computational linguistics and social network analysis.

The computational linguistics community started paying attention to personality recognition only recently. In 2005, a pioneering work by Argamon et al. (2005) classified neuroticism and extraversion using linguistic features such as function words, deictics, appraisal expressions, and modal verbs. One year later, Oberlander and Nowson (2006) classified the extraversion, stability, agreeableness, and conscientiousness of blog authors using n-grams as features and naïve Bayes as the learning algorithm. In a very comprehensive work, Mairesse et al. (2007) reported a long list of correlations between the Big Five personality traits and two feature sets: Linguistic Inquiry and Word Count (see Pennebaker, Francis, and Booth 2001 for details) and the MRC Psycholinguistic Database (Coltheart 1981), which include word classifications, such as “positive emotions” or “anger,” and scores such as age of acquisition and word imageability. They obtained these correlations from a psychological factor analysis on a corpus of essays (see Pennebaker and King 1999 for details) and developed a supervised system for personality recognition.4 Luyckx and Daelemans (2008) built a corpus for stylometry and personality prediction from text in Dutch using n-grams of parts of speech and chunks as features. They used the Myers–Briggs Type Indicator schema, which includes four binary personality traits (Briggs and Myers 1980), in place of the Big Five. Unfortunately, their results are not comparable with any others because of the different language and schema used. In a recent work, Iacobelli et al. (2011) tested different features, such as stop words or inverse document frequency, and found that bigrams and stop words treated as Boolean features yield very good results in predicting personality in a large corpus of blogs using support vector machines as the learning algorithm.
As stated by the authors themselves, their model may overfit the data, because the n-grams extracted are very few in a very large corpus. In social network analysis, personality recognition has an even shorter history. Golbeck, Robles, and Turner (2011) predicted the personality of 279 Facebook users using either linguistic features (e.g., word counts) or social network features, such as friend count. Quercia et al. (2011) used network features to predict users’ personality on Twitter using M5 rules as the learning algorithm. In computational linguistics, there is a tendency to predict classes of personality traits, and the evaluation measure is accuracy, whereas in social network analysis, the tendency is to predict personality trait scores rather than classes; therefore, the related measures are mean absolute error and root mean squared error. The state of the art in personality recognition for the emotional stability trait is reported in Table 3.

Table 3. State of the Art in Textual Personality Recognition.

Author        Algorithm  Measure  Result for emotional stability
Argamon05     NB         acc      0.581 (a)
Oberlander06  NB         acc      0.558 (a)
Mairesse07    SVM        acc      0.573
Iacobelli11   SVM        acc      0.705
Golbeck11     M5         MAE      0.127
Quercia11     M5         RMSE     0.850

Note: acc, accuracy; MAE, mean absolute error; NB, naïve Bayes; RMSE, root mean squared error; SVM, support vector machine.
(a) Results reported in Luyckx and Daelemans (2008).

5 Personality Recognition Tool


5.1 Description of the System

We developed a generative system that, given a set of correlations between personality traits and some linguistic or extralinguistic features, generates hypotheses about the personality of each user of a social network site for whom we have textual data. The system does not use any learning algorithm; rather, it generates a personality hypothesis for each tweet in the data set by applying correlations between the text and the emotional stability trait, with a threshold filter for scores below the average. In the evaluation phase, the system generates one generalized hypothesis per user by comparing all the hypotheses generated for that user's tweets, also computing a confidence score that can be used for feature weighting, as we did.

In our system, personality can take three possible values: secure (s), neurotic (n), and omitted/balanced (o). Users classified as neurotic are emotionally reactive and vulnerable to stress. Secure users are less easily upset and less emotionally reactive; they tend to be calm, stable, and less exposed to negative feelings. The third class, omitted/balanced, indicates users who show no relevant features or who show the features of a neurotic and a secure user in equal measure.
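A minimal sketch of the three-way labeling, assuming an aggregated per-user score in which positive evidence points to security and negative evidence to neuroticism (the function name and threshold are ours):

```python
def personality_class(score, eps=1e-9):
    """Map an aggregated emotional-stability score to the three
    values used by the system: secure (s), neurotic (n), or
    omitted/balanced (o) when evidence is absent or cancels out."""
    if score > eps:
        return "s"
    if score < -eps:
        return "n"
    return "o"
```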

5.2 Feature Selection

As underlined by previous works, such as that of Mairesse et al. (2007), feature selection is extremely important for the performance of the system. Details are reported in Table 4.

Table 4. Features from Mairesse et al. (2007) and Quercia et al. (2011) for which We Have Correlations with Emotional Stability.

Features                   Frequency  Confidence  Ratio
Anxiety words              0.016      0.009        1.778
Anger words                0.076      0.012        6.333
Affect words               0.185      0.080        2.313
Articles                   0.194      0.171        1.135
Exclamation marks          0.166      0.166        1.000 *
Feeling words              0.005      0.001        5.000
Family words               0.028      0.019        1.474
Friend words               0.000      0.000
Pronoun "I"                0.156      0.122        1.279
Leisure words              0.076      0.051        1.490
Long words                 0.313      0.308        1.016 *
Negative particles         0.033      0.010        3.300
Negative emotion           0.009      0.009        1.000 *
Numbers                    0.123      0.118        1.042 *
Parentheses                0.019      0.018        1.008
Positive emotion           0.052      0.052        1.000 *
Prepositions               0.213      0.185        1.151
Pronouns                   0.322      0.265        1.215
Present                    0.194      0.172        1.128
Question marks             0.175      0.171        1.023 *
Repeat ratio               0.019      0.019        1.000 *
Sad words                  0.000      0.000
Sight words                0.014      0.006        2.333
Space words                0.123      0.109        1.128
Pronoun "we"               0.038      0.009        4.222
Word count                 0.336      0.012       28.000
No. of characters          0.223      0.199        1.121
No. of syllables           0.336      0.301        1.116
Kucera-Francis frequency   0.185      0.168        1.101
Kucera-Francis category    0.341      0.000
Brown corpus frequency     0.175      0.166        1.094
Thorndike-Lorge frequency  0.175      0.156        1.122
Concreteness               0.303      0.274        1.106
Familiarity                0.213      0.194        1.109
Imageability               0.313      0.281        1.179
Meaningfulness             0.261      0.237        1.101
Age of acquisition         0.190      0.171        1.111
Following                  1.000      0.959        1.043 *
Followers                  1.000      0.964        1.037 *
Retweeted                  0.251      0.251        1.000 *

Note: Asterisks (*) mark the ratios of the features we selected.

We exploit correlations between linguistic or extralinguistic cues and emotional stability taken from the literature. In particular, we used features taken partly from Mairesse et al. (2007) and partly from Quercia et al. (2011). The former provides a long list of linguistic cues that correlate with personality traits in English. The latter provides correlations between personality traits and the counts of following, followers, listed, and retweeted. They are reported in Table 4.

Feature selection algorithms typically fall into two categories: feature ranking and subset selection. Feature ranking ranks the features by a metric and eliminates all features that do not achieve an adequate score. Subset selection iteratively searches the set of possible features for the optimal subset. Good feature selection algorithms are often designed ad hoc for the target data set, but there is always a risk of overfitting, which some methodical approaches help avoid. From a theoretical perspective, optimal feature selection, at least for supervised learning problems, requires an exhaustive search of all possible subsets of features of the chosen cardinality (subset selection). In our case, this is impractical because of the large number of features available; hence, we used feature ranking for feature selection.

Exploiting the confidence score, we run feature ranking by computing the ratio between the frequency of each feature in the development data set and the confidence obtained with that feature. Best features are the ones with a ratio close to 1. We arbitrarily decided to keep only features with a ratio below 1.05. The selected features are the following:

  • Exclamation marks: the count of ! in a post

  • Negative emoticons: the count of emoticons expressing negative feelings in a post

  • Numbers: the count of numbers in the post

  • Positive emoticons: the count of emoticons expressing positive feelings in a post

  • Question marks: the count of ? in a post

  • Long words: the count of words longer than six characters in the post

  • Repeat ratio: the ratio between words and repeated words in a post, defined as

    repeat ratio = (count of words) / (count of repeated words)

  • Following count: the count of users followed

  • Follower count: the count of followers

  • Retweeted count: the number of the user's posts that were retweeted
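The ratio-based ranking that produced this list can be sketched as follows; the function name and the input layout are ours, with per-feature frequency and confidence values as in Table 4:

```python
def select_features(stats, cutoff=1.05):
    """stats maps a feature name to (frequency, confidence).
    Keep features whose frequency/confidence ratio falls below
    the cutoff; ratios close to 1 mean the feature is roughly
    as reliable as it is frequent."""
    selected = []
    for name, (freq, conf) in stats.items():
        if conf == 0:  # ratio undefined, feature cannot be ranked
            continue
        if freq / conf < cutoff:
            selected.append(name)
    return selected
```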

5.3 System Pipeline

The processing pipeline, as shown in Figure 3, is divided into three steps: preprocess, process, and evaluation.

Figure 3. Personality recognition system pipeline.

In the preprocessing phase, the system randomly samples a predefined number of posts (we set this parameter to 2,000 for the experiments) to capture the average occurrence of each feature. In the processing phase, the system generates one personality hypothesis per post, matching features and applying correlations. If the system finds feature values above the average computed in the preprocessing phase, it increments or decrements the score associated with emotional stability, depending on whether the correlation is positive or negative. The list of all features used and their correlations with personality traits provided by Mairesse et al. (2007) (Mai07) and Quercia et al. (2011) (Qu11) is reported in Table 5. In the evaluation phase, the system compares all the hypotheses generated for each post of a single user and retrieves one generalized hypothesis per user. This is based on the assumption that one user has one and only one complex personality and that this personality emerges at various levels from written text, as well as from other extralinguistic cues. The system provides confidence and variability as evaluation measures. Confidence gives a measure of the consistency of the personality hypothesis. It is defined as

  c = tp / M

where tp is the number of matching personality hypotheses (for example, “s” with “s,” or “n” with “n”) found when comparing all posts of a user and M is the number of hypotheses generated for that user. Variability gives information about how consistently a user expresses the same personality traits across posts. It is defined as

  v = c / P

where c is the confidence score and P is the count of all the user's posts. The system can evaluate personality only for users who have more than one post; the other users are discarded.
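A minimal sketch of the two measures, reading tp as the number of per-post hypotheses that agree with the user's dominant label (one possible reading of the definition; the function names are ours):

```python
from collections import Counter

def confidence(hypotheses):
    """c = tp / M: the share of a user's per-post hypotheses (tp)
    agreeing with the dominant label, out of all hypotheses
    generated for that user (M)."""
    tp = Counter(hypotheses).most_common(1)[0][1]
    return tp / len(hypotheses)

def variability(c, n_posts):
    # v = c / P, where P is the user's post count: the same
    # confidence over more posts yields lower variability.
    return c / n_posts
```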

Table 5. Features Used in the System and Their Pearson's Correlation Coefficients with Personality Traits as Reported by Mairesse et al. (2007) and Quercia et al. (2011).

Features            Correlation with emotional stability  From
Exclamation marks   -0.05 *                               Mai07
Negative emoticons  -0.18 **                              Mai07
Numbers              0.05 *                               Mai07
Positive emoticons   0.07 **                              Mai07
Question marks      -0.05 *                               Mai07
Long words           0.06 **                              Mai07
Repeat ratio         0.10 **                              Mai07
Following           -0.17 **                              Qu11
Followers           -0.19 **                              Qu11
Retweeted           -0.03 *                               Qu11

Note: *p < 0.05 (weak correlation); **p < 0.01 (strong correlation).

We annotated the corpus with this system; the average confidence is 0.601, and the average variability is 0.049. Our personality recognition system exploits correlations as a model to generate hypotheses that fit the data. This means that the system does not require previously annotated data, which are very difficult to produce from social network sites because of privacy issues. The drawback is that it provides only confidence and variability as evaluation measures; we should therefore test the system on gold-standard data to evaluate its real performance. In the following section, we describe how we tested the system.

5.4 Testing the Personality Recognition Tool

We ran two tests: the first to evaluate the accuracy in predicting human judgments of personality and the second to evaluate the performance of the system on Twitter data.

In the first test, we compared the results of our system on a data set called Personage (Mairesse and Walker 2007), annotated with personality ratings from human judges. Raters expressed their judgments on English sentences on a scale from 1 (low) to 7 (high) for each of the Big Five personality traits. To obtain a gold standard, we converted this scale into our three-value scheme by applying the following rules: if the value is greater than or equal to 5, we have “s”; if the value is 4, we have “o”; and if the value is less than or equal to 3, we have “n.” We used a balanced set of eight users (20 sentences per user) with an average length of 16 words per sentence. We generated personality hypotheses automatically with our system using only the linguistic features (because there are no social network data in this set), namely, exclamation marks, negative emoticons, numbers, positive emoticons, question marks, long words, and repeat ratio. We compared them with the gold standard and obtained an accuracy of 0.625 over a majority baseline of 0.5. These results outperform those of Mairesse for the emotional stability trait, and this is due only to feature selection.
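The conversion of the 1–7 ratings into the three-value scheme, together with the accuracy computation, can be sketched as (function names are ours):

```python
def gold_label(rating):
    # Convert a 1-7 human rating of emotional stability into
    # the three-value scheme: >= 5 -> "s", 4 -> "o", <= 3 -> "n".
    if rating >= 5:
        return "s"
    if rating == 4:
        return "o"
    return "n"

def accuracy(gold, predicted):
    # Share of items whose predicted class matches the gold label.
    hits = sum(1 for g, p in zip(gold, predicted) if g == p)
    return hits / len(gold)
```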

In the second test, we compared the output of our system (this time using all the selected features) with the scores of Analyzewords,5 an online tool for Twitter analysis based on the Linguistic Inquiry and Word Count features (Tausczik and Pennebaker 2010). This tool does not provide Big Five traits, but among others, it returns scores for “worried” and “upbeat,” and we used those classes to evaluate “n” and “s,” respectively. We randomly extracted 20 users from our data set: 10 neurotic users, 8 secure users, and 2 users for whom Analyzewords could not provide an analysis and who were thus discarded. We manually checked whether the classes assigned by our system matched the scores of Analyzewords. The results, reported in Table 6, reveal that our system has good precision in detecting worried/neurotic users. We suggest that the poor results for upbeat/secure users are due to the fact that the “upbeat” class does not fully correspond to the “secure” class. Overall, the performance of our system is in line with the state of the art.

Table 6. Results of Test 2.

          p      r      f1
n         0.800  0.615  0.695
s         0.375  0.600  0.462
Average   0.587  0.607  0.578
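The per-class figures in Table 6 follow the standard precision/recall/F1 definitions, which can be sketched as (function name is ours):

```python
def prf1(gold, pred, label):
    """Precision, recall, and F1 for one class label, given
    parallel lists of gold and predicted labels."""
    tp = sum(1 for g, p in zip(gold, pred) if p == label and g == label)
    fp = sum(1 for g, p in zip(gold, pred) if p == label and g != label)
    fn = sum(1 for g, p in zip(gold, pred) if p != label and g == label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```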

6 Experiments and Discussion


The frequency distribution of the emotional stability trait in the corpus is as follows: 56.1% calm users, 39.2% neurotic users, and 4.7% balanced users.

We ran an experiment to check whether neurotic or calm users tend to have conversations with other users with the same personality trait. To this end, we extracted all the ego networks annotated with personality.

We automatically extracted the personality trait of the “public user” (the center of the ego network), and we counted how many edges of the ego network connect users with the same personality trait. The users in the ego network are weighted: if a “public user” has x conversations with the same “related user,” that related user is counted x times. The frequency is defined as

  f = (count of edges sharing the same trait) / (count of all edges in the ego network)

where the shared trait is between the public user and the related users. The experiment, whose results are reported in Figure 4, shows a general tendency for users to have conversations with users who share the same trait. In particular, 66.7% of the neurotic users and 74.8% of the secure ones have conversations with users of the same personality type.
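The weighted same-trait frequency can be sketched as follows; the function name is ours, and a related user with x conversations appears x times in the input list:

```python
def same_trait_frequency(center_trait, related_traits):
    """Weighted share of a public user's conversations held with
    related users of the same personality trait."""
    if not related_traits:
        return 0.0
    same = sum(1 for t in related_traits if t == center_trait)
    return same / len(related_traits)
```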

Figure 4. Relationships between users with the same personality traits.

We ran a second experiment to find which personality type is most inclined to tweet, retweet, and reply. The results, reported in Figure 5, show that neurotic users tend to post and retweet more than stable users. Stable users are slightly more inclined to reply than neurotic ones. A Wilcoxon rank test with continuity correction confirmed that the differences between neurotic and stable users are significant for replies (p-value = 0.0001238), retweets (p-value = 7.727e − 14), and posting activity (p-value = 2.2e − 16).

Figure 5. Relationships between emotional stability and Twitter activity.

The analysis of the “o” class revealed that these users tend to interact only with the other user types (48.1% with neurotic and 51.9% with secure users) and that they have an average reply rate and posting activity. Only the retweet score is higher for omitted/balanced users than for secure and neurotic users. A manual look at the data set revealed many retweets that are only links without other text and are thus classified as “o.” We suggest that omitted/balanced cannot be considered a real class; rather, it is useful for cleaning the noise from the “s” and “n” classes.

To study whether conversational practices among users with similar personality traits might generate different social structures, we applied social network analysis to the collected data using Gephi,6 an open-source network analysis and visualization package. The gathered data allowed us to construct a weighted directed network of 10,192 nodes (Twitter users) and 10,479 weighted arcs. The arcs are weighted according to the number of messages actually exchanged between two users. We analyzed separately the networks of interactions between neurotic users (n) and between calm users (s) to point out any personality-related aspects of the emerging social structure, and then we analyzed inter-group interactions. Visualizations are shown in Figure 6.

Figure 6. Conversational structures of stable (s) and neurotic (n) users.

Because of the data acquisition strategy, which started from the users randomly displayed on the Twitter public time line, there are a large number of scattered networks made of few interactions. Nevertheless, the extraction of the ego networks allowed us to detect a rather interesting phenomenon: neurotic users seem to build longer chains of interacting users, whereas calm users tend to build mutual connections.

The average path length is 1.535 for neurotic users, versus 1.349 for calm users. This difference yields a network diameter of 7 for the network made only of neurotic users and of 5 for the network made of secure users. Even a few points of difference in network diameter make the neurotic network considerably more complex than the calm one. Although this difference might be overlooked in large visualizations because of the many minor clusters of nodes, it becomes evident when we focus on the giant components of the two networks in Figure 7.
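The two statistics above (average path length and diameter) are standard outputs of Gephi; a minimal pure-Python equivalent, computed by breadth-first search on an undirected adjacency dict, looks like this (the function name is ours):

```python
from collections import deque

def path_stats(adj):
    """Average shortest-path length over all connected ordered pairs,
    and diameter (the longest such shortest path), for an undirected
    adjacency dict such as {"a": ["b"], "b": ["a"]}."""
    total, pairs, diam = 0, 0, 0
    for src in adj:
        dist, queue = {src: 0}, deque([src])
        while queue:                              # plain breadth-first search
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for v, d in dist.items():
            if v != src:
                total += d
                pairs += 1
                diam = max(diam, d)
    return (total / pairs if pairs else 0.0), diam
```

Unreachable pairs are simply skipped, which matches the per-component statistics reported for networks made of many scattered subgraphs.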

Figure 7. Giant components of stable (s) and neurotic (n) users.

The giant components are those containing the majority of the nodes, and they can be taken as examples of the most complex structure existing within a network. The neurotic network clearly contains more complex interconnected structures than the calm network, even though, as claimed above, neurotic users have smaller social networks on average.
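Extracting the giant component is a simple connected-components computation; a sketch on the same undirected adjacency-dict representation (function name ours):

```python
def giant_component(adj):
    """Return the node set of the largest connected component of an
    undirected adjacency dict: the network's 'giant component'."""
    seen, best = set(), set()
    for start in adj:
        if start in seen:
            continue
        comp, stack = set(), [start]          # iterative depth-first search
        while stack:
            u = stack.pop()
            if u in comp:
                continue
            comp.add(u)
            stack.extend(adj.get(u, ()))
        seen |= comp
        if len(comp) > len(best):
            best = comp
    return best
```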

To explain the larger network diameter observed among neurotic users, it is useful to consider the weight of the connections between nodes. If, as stated above, the weight of an edge represents the intensity of the message exchange between two users, we can ask whether a larger network is necessarily a more active one.

Table 7 displays the average weight of the edges connecting neurotic users to each other (intra-neurotic network), stable users to each other (intra-stable network), or bridging neurotic and stable users (inter-personality network). The results clearly show that the intra-stable network has a higher average edge weight than the intra-neurotic network. This means that, on average, stable users have more constant and frequent communication with similar users, whereas neurotic users make a more erratic use of their network, with less repetition and less defined preferential connections.

Table 7. Average Weight for Neurotic and Stable Users.

Network                      Average weight
Intra-stable network         0.074
Intra-neurotic network       0.030
Inter-personality network    0.063
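The grouping behind Table 7 amounts to partitioning the arcs by the trait labels of their endpoints and averaging each bucket. A sketch, assuming the `(u, v) -> weight` edge mapping and a per-user trait label ('s' or 'n'); the function and key names are ours:

```python
def average_edge_weights(weights, trait):
    """Average arc weight within and across personality groups.
    `weights` maps (u, v) -> message count; `trait` maps each user
    to 's' (stable/secure) or 'n' (neurotic)."""
    groups = {"intra-stable": [], "intra-neurotic": [], "inter-personality": []}
    for (u, v), w in weights.items():
        a, b = trait[u], trait[v]
        if a != b:
            key = "inter-personality"
        else:
            key = "intra-stable" if a == "s" else "intra-neurotic"
        groups[key].append(w)
    return {k: (sum(ws) / len(ws) if ws else 0.0) for k, ws in groups.items()}
```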

This result is consistent with the differences in network diameter that we detected and may partially explain them. Whereas stable users tend to communicate intensively with a limited number of users, neurotic users seem to communicate less steadily with a larger number of users, thereby producing more, and larger, communication structures.

Nevertheless, the communication structures built by neurotic users seem more fragile and less solid. The analysis of the clustering coefficient, as defined by Watts and Strogatz (1998) and implemented in Gephi (Bastian et al. 2009) following Latapy (2008), supports this idea: the average clustering coefficient is 0.016 for neurotic users and 0.377 for stable users. The data suggest that stable users contribute to smaller but better-connected communicative structures, whereas neurotic users contribute to wider but less tight ones.
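The Watts and Strogatz (1998) coefficient averages, over all nodes, the fraction of pairs of a node's neighbours that are themselves linked. A minimal pure-Python version on an undirected adjacency dict (a naive triangle count, not the efficient algorithm of Latapy (2008) used by Gephi; the function name is ours):

```python
def average_clustering(adj):
    """Average clustering coefficient (Watts & Strogatz 1998): for each
    node, the fraction of pairs of its neighbours that are themselves
    linked; nodes with fewer than two neighbours contribute 0.
    `adj` is a symmetric (undirected) adjacency dict."""
    total = 0.0
    for u, nbrs in adj.items():
        nbrs = set(nbrs) - {u}
        k = len(nbrs)
        if k < 2:
            continue
        # each neighbour-neighbour edge is seen twice in a symmetric dict
        links = sum(1 for v in nbrs for w in adj.get(v, ()) if w in nbrs) / 2
        total += 2 * links / (k * (k - 1))
    return total / len(adj) if adj else 0.0
```

A fully connected triad scores 1, while a pure star (a hub whose neighbours never talk to each other) scores 0, which is the intuition behind reading low clustering as a fragile, chain-like structure.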

7 Conclusions and Future Work


In this article, we presented a generative system for personality recognition that is able to annotate large amounts of data from social network sites, a domain where it is very difficult to produce gold-standard personality annotations. We produced a large, richly annotated Twitter data set that we make available to the scientific community. The system achieved high performance in detecting neurotic users and outperformed the system from which its features were taken.

The results of the analysis of neurotic and secure users support socio-psychological theories developed outside the Internet, for example the finding that secure people tend to choose friends who are also secure. Nevertheless, we also found that neurotic users behave very differently online and off-line. For instance, whereas off-line neurotic users tend to choose friends who are also neurotic, online they search for relationships apparently without caring whether the other person is neurotic or stable.

Our results also confirm that neurotic users have weaker social networks at the level of the single user, but tend to build longer chains of interactions in search of new relationships. This means that a tweet propagated through “neurotic networks” has potentially higher visibility, but it moves through less significant links, in the sense that these links carry less affect than the links between secure users. The average clustering coefficient of neurotic users is, in fact, significantly lower than that of stable users, suggesting a greater difficulty in belonging to a stable community. We also found that neurotic users have the highest posting rate and retweet score, which can again be explained by their drive to seek new relationships, in the hope of finding some that can become stable.

In the future, we would like to repeat the experiments on other social networks, such as Facebook, to see whether different social practices, such as mutual friendship relations, influence network building for secure and neurotic users. It would also be very interesting to explore how other personality traits affect users' behavior within a social network, and whether traits such as agreeableness affect secure and neurotic users in communication exchanges.

Another interesting avenue for future research is to improve the generative personality recognition system by testing new feature sets, finding new ways to evaluate its performance, and examining whether its output can be used for unsupervised or semisupervised learning.

References

  • Adelstein J. S., Z. Shehzad, M. Mennes, C. G. DeYoung, X.-N. Zuo, C. Kelly, D. S. Margulies, A. Bloomfield, J. R. Gray, X. F. Castellanos, and M. P. Milham. 2011. Personality is reflected in the brain's intrinsic functional architecture. PLoS ONE 6(11): e27633.
  • Argamon S., S. Dhawle, M. Koppel, and J. W. Pennebaker. 2005. Lexical predictors of personality type. In Proceedings of the Joint Annual Meeting of the Interface and the Classification Society of North America, St. Louis, MO, pp. 1–16.
  • Bastian M., S. Heymann, and M. Jacomy. 2009. Gephi: an open source software for exploring and manipulating networks. In Proceedings of the International AAAI Conference on Weblogs and Social Media, San Jose, CA, pp. 1–2.
  • Boyd D., and N. Ellison. 2007. Social network sites: Definition, history, and scholarship. Journal of Computer-mediated Communication 13(1): 210–230.
  • Boyd D., S. Golder, and G. Lotan. 2010. Tweet, tweet, retweet: Conversational aspects of retweeting on Twitter. In Proceedings of the 43rd Hawaii International Conference on System Sciences (HICSS '10). IEEE Computer Society: Washington, DC, pp. 1–10.
  • Briggs I., and P. B. Myers. 1980. Gifts Differing: Understanding Personality Type. Davies-Black Publishing: Mountain View, CA.
  • Coltheart M. 1981. The MRC psycholinguistic database. Quarterly Journal of Experimental Psychology 33A: 497–505.
  • Costa P. T., Jr., and R. R. McCrae. 1992. Normal personality assessment in clinical practice: The NEO Personality Inventory. Psychological Assessment 4(1): 5.
  • DeYoung C. G. 2010. Toward a theory of the Big Five. Psychological Inquiry 21: 26–33.
  • Digman J. M. 1990. Personality structure: emergence of the five-factor model. Annual Review of Psychology 41: 417–440.
  • Finin T., W. Murnane, A. Karandikar, N. Keller, J. Martineau, and M. Dredze. 2010. Annotating named entities in Twitter data with crowdsourcing. In Proceedings of the NAACL HLT Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk (CSLDAMT '10). Association for Computational Linguistics: Stroudsburg, PA, pp. 80–88.
  • Golbeck J., C. Robles, and K. Turner. 2011. Predicting personality with social media. In Proceedings of the Annual Conference Extended Abstracts on Human Factors in Computing Systems. ACM: New York, pp. 253–262.
  • Honeycutt C., and S. C. Herring. 2009. Beyond microblogging: Conversation and collaboration via Twitter. In Proceedings of the Forty-second Hawaii International Conference on System Sciences. IEEE Press: Los Alamitos, CA, pp. 1–10.
  • Iacobelli F., A. J. Gill, S. Nowson, and J. Oberlander. 2011. Large scale personality classification of bloggers. Lecture Notes in Computer Science 6975: 568–577.
  • Kanfer A., and J. S. Tanaka. 1993. Unraveling the web of personality judgments: The influence of social networks on personality assessment. Journal of Personality 61(4): 711–738.
  • Latapy M. 2008. Main-memory triangle computations for very large (sparse (power-law)) graphs. Theoretical Computer Science (TCS) 407(1–3): 458–473.
  • Luyckx K., and W. Daelemans. 2008. Personae: A corpus for author and personality prediction from text. In Proceedings of LREC2008, the Sixth International Language Resources and Evaluation Conference, ELRA, Marrakesh, Morocco, pp. 1–7.
  • Mairesse F., and M. Walker. 2007. PERSONAGE: Personality generation for dialogue. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, Prague, Czech Republic, pp. 496–503.
  • Mairesse F., M. A. Walker, M. R. Mehl, and R. K. Moore. 2007. Using linguistic cues for the automatic recognition of personality in conversation and text. Journal of Artificial Intelligence Research 30: 457–500.
  • Norman W. T. 1963. Toward an adequate taxonomy of personality attributes: Replicated factor structure in peer nomination personality rating. Journal of Abnormal and Social Psychology 66: 574–583.
  • Oberlander J., and S. Nowson. 2006. Whose thumb is it anyway? Classifying author personality from weblog text. In Proceedings of the 44th Annual Meeting of the Association for Computational Linguistics: Stroudsburg, PA, pp. 627–634.
  • Pak A., and P. Paroubek. 2010. Twitter as a corpus for sentiment analysis and opinion mining. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), Valletta, Malta, pp. 1320–1326.
  • Pennebaker J. W., M. E. Francis, and R. J. Booth. 2001. Linguistic Inquiry and Word Count: LIWC 2001. Lawrence Erlbaum: Mahwah, NJ.
  • Pennebaker J. W., and L. A. King. 1999. Linguistic styles: Language use as an individual difference. Journal of Personality and Social Psychology 77: 1296–1312.
  • Quercia D., M. Kosinski, D. Stillwell, and J. Crowcroft. 2011. Our Twitter profiles, our selves: Predicting personality with Twitter. In Proceedings of SocialCom2011, IEEE: Boston, MA, pp. 180–185.
  • Tausczik Y. R., and J. W. Pennebaker. 2010. The psychological meaning of words: LIWC and computerized text analysis methods. Journal of Language and Social Psychology 29(1): 24–54.
  • Twitter Search Team. May 31, 2011. The Engineering behind Twitter's New Search Experience. Twitter Engineering Blog. Twitter. Retrieved June 10, 2011.
  • Van Zalk N., M. Van Zalk, M. Kerr, and H. Stattin. 2011. Social anxiety as a basis for friendship selection and socialization in adolescents' social networks. Journal of Personality 79: 499–526.
  • Watts D. J., and S. H. Strogatz. 1998. Collective dynamics of “small-world” networks. Nature 393: 440–442.
  • Zhao D., and M. B. Rosson. 2009. How and why people Twitter: The role that micro-blogging plays in informal communication at work. In Proceedings of GROUP, Sanibel Island, FL, pp. 243–252.
  • Zhao W. X., J. Jiang, J. He, Y. Song, P. Achananuparp, E. P. Lim, and X. Li. 2011. Topical keyphrase extraction from Twitter. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (HLT11). Association for Computational Linguistics: Stroudsburg, PA, pp. 379–388.