Review communities typically display contributions in list format, using participant feedback in determining presentation order. Given the volume of contributions, which are likely to be seen? While previous work has focused on content, we examine the relationship between communication tactics and prominence. We study three communities, comparing front-page reviews with those on later pages. We consider three types of devices: structural features, textual features, and persuasive writing. Structural features, such as profiles, convey information about authors. Textual properties, such as punctuation use, can make an impression on others. Rhetorical writing strategies are used by reviewers to convince readers of their opinions. When controlling for content, the most salient tactics distinguishing prominent reviews are textual properties and persuasive language.
Review communities are important resources for sharing information among consumers, who write about their experiences with an increasingly large variety of products, services, and activities. Reviews enable consumers to learn about item attributes that would otherwise be difficult to ascertain before consumption (Schindler & Bickart, 2005). In addition, consumers often view communities as being relatively unbiased sources of information, as compared to an item's manufacturer or a service provider (Dellarocas, 2003). The importance of such sites to businesses is also well documented, as this form of word-of-mouth communication has been shown to influence sales (Chevalier & Mayzlin, 2006; Senecal & Nantel, 2004) and to create trust in a product or seller (Ba & Pavlou, 2002; McWilliams, 2000).
One need only briefly examine popular sites to observe the volume of reviews contributed, and the challenges in effectively organizing them. Online spaces in which many users interact are difficult to sustain, as users become overwhelmed by the information available (Butler, 2001). Review communities typically display contributions in a list across multiple pages, so that users must scroll through them. Since tasks involving the interpretation of unstructured text are particularly susceptible to information overload (Hiltz and Turoff, 1985), online communities must provide a means for users to filter information. Therefore, many sites employ a simple form of social navigation, in which participants are asked to provide feedback on others' contributions (Goldberg et al., 1992). This feedback is then used in determining the display order and thus, how prominent a review will be.
In a large set of reviews, which ones are likely to be read? Although participants develop strategies for mitigating information overload, which include techniques for filtering out messages not likely to be of interest (Jones et al., 2004), little is known about how they select reviews to read. For example, is it possible that review forums are “echo chambers” where participants seek out contributions that express views similar to their own (Garrett, 2009)? Do they study a reviewer's profile, in an attempt to learn who she is, judging her credibility or similarity in terms of product taste? Or, do they consider the “wisdom of the crowds” offered via social navigation a reliable means to guide them to the most informative reviews (Kostakos, 2009)?
Despite the many unknowns surrounding the information-seeking behavior of participants, it is reasonable to assume that the most prominently displayed reviews are the ones most often read. In a community that displays reviews as a list of ranked items, reviews appearing close to the top of the list (i.e., on the first page) are easily seen, while those that are further from the top (i.e., on later pages) are much less likely to receive attention. In fact, when presented with a ranked list of documents (e.g., in response to a query submitted to a search engine), users seldom look beyond the first page of results (Spink et al., 2006; Joachims et al., 2007; Pan et al., 2007).
Researchers have considered what makes a good review, in terms of providing information viewed as helpful (e.g., “utility prediction” (Zhang & Varadarajan, 2006) and “evaluating helpfulness” (Kim et al., 2006)) or in predicting sales (e.g., Ghose & Ipeirotis, 2010). Most work has considered how aspects of review content (e.g., the valence of the reviewer's opinion, the amount of information expressed) are correlated with feedback scores. In contrast, we study how communication tactics used in reviews correlate with their prominence in the community. Another important deviation from previous work, which most often has focused on Amazon.com, is that we study three communities: Yelp, Amazon, and the Internet Movie Database (IMDb). These communities share several features: they allow participants to post textual reviews, employ the list display format, and use peer feedback to determine display order. However, different commodities are reviewed at these sites, and they have various structural features that participants might use to distinguish themselves.
Review Organization and Structural Features at Amazon, Yelp, and IMDb
Given that information seekers rarely look beyond the first page of a list of items, how contributions are organized in a review community ultimately determines what is seen. Here, we provide an overview of how reviews are organized at the three communities studied. Figure 1 shows a review of a camera at Amazon.com. As shown, users are asked to help others by indicating whether or not the review is helpful. Above the review's title, its rating is expressed in the form “X of Y people” found it helpful. Users may sort by the helpfulness of reviews or by date, with helpfulness being the default. From the reviewer's profile, one gleans information about her level of activity and how useful her contributions are. Participants also use profiles to share information such as a self-description, interests, and photos.
For this review, we can note that 95% (i.e., 52 of 55) of the people who voted found it helpful. Its sentiment toward the product is positive, with the reviewer assigning a perfect rating of five out of five stars. In terms of structural properties, we observe that the reviewer, R. Overall, uses his real name and provides a self-description. In addition, his helpfulness over all contributions is 91%. Also notable is the manner in which the reviewer tries to convince others that the Lumix is a good camera. Overall provides his credentials as a camera reviewer up front; he is a “former pro photog.” Then, he reports several positive aspects of the camera (e.g., it is “lightweight” and has a “commonly available battery”).
Yelp's review organization also involves user feedback. Figure 2 shows a restaurant review and its reviewer's profile. Members can indicate that a review is useful, funny, or cool. The default “Yelp sort” considers the number of feedback votes, as well as review recency¹. In addition, users may sort by date, item rating, or “elite” status of reviewers². From a reviewer's profile, one finds information about her level of participation and recognition from others, and also information concerning who she is. For example, reviewers may post photos and self-descriptions. Links to their friends' pages are shown, as well as the total number of friends.
In Figure 2, we can see from her profile that Melissa M. is an elite reviewer, with 191 friends. She's also written quite a few reviews. Over her 92 reviews, she has received 307 useful votes, 242 funny votes, and 277 cool votes, for a total of 826 votes. It should be noted that since Yelp users can only express positive and not negative votes, we cannot compute Melissa M.'s positive feedback as a percentage, as we could with Amazon's helpfulness metric. In Melissa M.'s review, we can also note some interesting textual properties. At one point, she uses all capital letters (“who I'm SURE hated me”) as well as an emoticon representing a smiley face. Finally, in contrast to R. Overall's camera review, Melissa M.'s review is formatted with paragraph breaks.
Finally, Figure 3 shows a review of the movie Shawshank Redemption from IMDb as well as its reviewer's profile. IMDb participants are asked whether or not reviews are “useful.” Several filters are available including “best reviews” (default), chronological, and most prolific authors. The default uses community feedback and other undisclosed factors. As can be seen, IMDb profiles are basic, as compared to those at Amazon or Yelp. While a user can determine how prolific a reviewer is by viewing all of her reviews, no summary statistics are provided as to how useful they are. Finally, participants cannot post photos, but can share a biography and contact information, and can exchange messages on the IMDb message boards.
The example in Figure 3 is of a moderately useful review; 76% of people who voted found it useful. The reviewer, Si Cole, is positive about the movie, giving it 8 out of 10 stars. In contrast to R. Overall, Si Cole does not offer any information about his credentials or himself, but rather, provides a brief summary of the film, and explains why he liked it. However, in his profile, we learn that Si Cole has an educational background in film and art history, and can learn more from his web site or by contacting him via e-mail.
Review Helpfulness and Prominence
Communities use various constructs to collect feedback on reviews that then aids in determining display order. While these constructs relate to how reviews are valued, we cannot assume that they are equivalent. In fact, it is an open question as to what these constructs mean and to what extent they are universal. For example, Liu and colleagues collected “helpfulness” judgments on a set of Amazon reviews from four subjects (Liu et al., 2007). While they found a high rate of agreement between their subjects, there was low agreement between their subjects' ratings and those of Amazon users. One key difference was that the researchers provided their subjects with concrete criteria for evaluating reviews. In contrast, the intended meaning of “helpfulness” is not defined anywhere on the Amazon site.
A number of factors, such as one's background knowledge and intent, are likely to influence a user's interpretation of review helpfulness. A comparison can be made to the notion of relevance in information science, which expresses the extent to which a document addresses a user's information need. Relevance has many facets and is not completely understood (Mizzaro, 1997). Not only do users often disagree as to whether a document is relevant to a given information need or topic, but users' own needs often change during their interactions with an information system, such that their ideas of what is relevant are rather dynamic (e.g., Bates, 1989; Belkin et al., 1982). While community votes on the helpfulness of reviews are not relevance judgments, both notions measure the utility of an information artifact in a particular context.
A substantial body of work discusses Amazon's social navigation mechanism. While some researchers have developed methods to automatically predict review helpfulness³ (e.g., Zhang and Varadarajan, 2006; Kim et al., 2006), others have explored the nature of the helpfulness metric. Otterbacher (2009) proposed a quantitative model to evaluate information quality, and found a strong correlation between community helpfulness scores and automatically computed quality scores of reviews. Danescu-Niculescu-Mizil and colleagues (2009) found that how opinions are received depends on the relation of the content of a review to what other authors have written. In particular, helpfulness decreases with the absolute deviation from the mean product rating over all reviewers, with negative reviews being more harshly penalized than positive reviews.
Currently, we are concerned with review prominence. We classify reviews into three categories, when they are ranked by the default sorting mechanism at the respective community:
Most prominent: the first 10 reviews displayed at the forum (i.e., “front-page” reviews)
Less prominent: the first 10 reviews displayed on the middle-most page
Least prominent: the last 10 reviews displayed on the last page (i.e., “last-page” reviews)
This definition allows us to focus on what is likely to be seen in each of the three communities, even though the constructs used to collect feedback on reviews and the methods for determining their display order are different.
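The three categories above can be illustrated with a short sketch. It is only one possible reading of the definition: the page size of 10, the choice of the "middle-most" page, the function name `prominence_groups`, and the review identifiers are all assumptions made for illustration, and the least prominent group is taken to be the final 10 reviews of the ranked list even when the last page is only partly filled.

```python
from typing import Dict, List, Sequence

PAGE_SIZE = 10  # assumption: ten reviews per page


def prominence_groups(ranked: Sequence[str],
                      page_size: int = PAGE_SIZE) -> Dict[str, List[str]]:
    """Split a default-sorted review list into three prominence groups."""
    n = len(ranked)
    num_pages = (n + page_size - 1) // page_size  # ceiling division
    # Start of the page treated as "middle-most" (one possible reading).
    middle_start = (num_pages // 2) * page_size
    last_start = n - page_size
    # Refuse forums too short to yield three non-overlapping groups.
    if middle_start < page_size or middle_start + page_size > last_start:
        raise ValueError("not enough reviews for three disjoint groups")
    return {
        "most_prominent": list(ranked[:page_size]),
        "less_prominent": list(ranked[middle_start:middle_start + page_size]),
        "least_prominent": list(ranked[last_start:]),
    }


# Hypothetical forum with 95 reviews, already in the site's default order.
reviews = [f"review_{i}" for i in range(1, 96)]
groups = prominence_groups(reviews)
```

Because the grouping depends only on display position, the same procedure applies to each community regardless of how its default ordering is computed.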
Communication in Review Forums
Whether or not participants aim to write prominent reviews, there are certainly things that cause others to take note of a contribution amidst a large volume of postings. In review communities, communication is text-based, asynchronous, and is characterized by a lack of richness (Daft and Lengel, 1986). Reviewers attempt to convey a message to readers through their writing, which may contain factual information about the item, opinion, or both. At the same time, they present a virtual image of themselves (Baralou and McInnes, 2005). However, there is no immediate opportunity for message receivers to ask for clarification. In addition, there are no accompanying facial or body language clues to aid in interpretation or in establishing trust in the reviewer (Bos et al., 2002).
A key concern of communications research is how messages are received by an audience. Many textbooks teach that there are three essential elements involved in convincing an audience of one's position: message content, credibility, and emotional appeal (e.g., Smith and Taylor, 2004). While other researchers have considered the content aspect of product reviews, our focus is on the latter elements. We examine three types of communication devices: structural features, textual features, and persuasive writing (i.e., the use of Logos, Ethos and Pathos). Next, we will discuss how these features might be used to establish (or diminish) the credibility of a message, or to create an emotional appeal.
Structural Features. Online communities typically include structural features that express who each participant is and what she has done in the community. As noted by Tong and colleagues (2008), this information can be provided by multiple sources. They differentiate between self-reported information, that obtained via the system, and information provided by participants' friends. In our study, the first two types of information sources are particularly relevant.
As seen in Figures 1-3, participants typically create profiles, where they can include information about themselves to share with others (i.e., self-reported information). Participants are more likely to be satisfied and to share their expertise when they feel that others can gauge who they are (Ma and Agarwal, 2007). Information about a participant's “friends” or previous interactions with others is often also a part of one's profile. Some of this information is controlled by the user (e.g., linking to a new friend) while other information reflects transactions recorded by the site (e.g., number of messages exchanged with others). Such structural features can reflect useful information such as similarities between participants or who the participant is in her offline life (Adamic et al., 2003).
Another typical component of a user's profile is a measure of the quality of her contributions. Amazon includes reviewers' total helpfulness, while Yelp includes the total number of cool, funny, and useful votes. In contrast, IMDb does not compute usefulness across a participant's reviews, but does display all reviews with their respective scores on the participant's page. These measures can be used by others in judging the quality or credibility of a reviewer (Dellarocas, 2006). In addition, they help to sustain the community; displaying feedback has been shown to promote higher-quality contributions and more participation by community members (Moon and Sproull, 2008).
Obviously, the degree to which individuals share personal information varies (Acquisti and Gross, 2006; Stutzman, 2006). They may have privacy concerns, or may simply not be invested enough in the community to create profiles and update them regularly. Since one's online reputation is a key factor in gaining others' trust (Resnick et al., 2000), we propose to examine the relationship between review prominence and information available via structural features about the respective reviewer.
Q1: Are prominently displayed reviews more likely to be written by reviewers about whom there is information via structural features, as compared to less prominent reviews?
The textual component of a review also creates an impression. In written communication, careful editing is essential so as not to lose credibility with readers (Petelin, 2002). In fact, spelling and grammatical errors have been shown to negatively affect perceptions of the credibility of Web-based information sources (Fogg et al., 2001). Therefore, participants may devalue a poorly written review, even if its actual content is valid.
However, online reviews are a form of asynchronous computer-mediated communication (CMC). CMC has been characterized as a mixed modality, having characteristics of both speech and writing (Baron, 1998; Crystal, 2001). Therefore, many participants expect that people will “write the way they talk” (Hale and Scanlon, 1999). Baron (2010) suggests that CMC is a “sloppy” form of discourse, because our standards of language use have been lowered over the years. If this is also true of online reviews, participants may have low expectations of the quality of writing, and therefore, it may not be the case that more prominent reviews necessarily contain more standard language.
Others have suggested that nonstandard use of spelling and punctuation, such as repeated punctuation marks or the use of all capitalized letters, can serve as expressions of emotion in CMC (Rourke et al., 1999). Since communication is in plain text, reviewers might use such unconventional symbolic representations to substitute for richer clues such as body language (Gudergan et al., 2005; Kuehn, 1993). Therefore, it may be the case that such “errors” can enhance communication in reviews rather than cause others to discredit them.
Emoticons are another means through which interlocutors create a sense of social presence (Rourke et al., 1999). In addition to their obvious use of expressing a particular feeling, emoticons may serve other communicative functions. In a chat context, Derks and colleagues (2008) studied subjects' motives when using emoticons. They found that the intent was not always to express emotion, as might be assumed. Instead, they discovered that chat participants also used emoticons to strengthen a message in order to influence others more effectively. We will examine textual features including punctuation, spacing, grammar and spelling errors, and emoticons to see if there are salient differences between prominent and less prominent reviews.
Q2: Do prominently displayed reviews exhibit different textual features, as compared to less prominent reviews?
We consider the use of persuasive devices, as proposed by Aristotle's Appeals (Ramage and Bean, 1998). According to Aristotle, a writer may use three main strategies to persuade an audience of her message: Logos, Ethos, and Pathos. Logos is an appeal based on reason. For example, a reviewer might claim that one camera is superior to others by comparing their specifications. In contrast, an appeal based on Ethos leverages the author's reputation. As seen in the example in Figure 1, one might claim to be a professional photographer, in order to win readers' trust. Finally, an author evoking Pathos appeals to readers' emotions, as in telling a story in which a defective camera resulted in losing irreplaceable family pictures.
Q3: Do prominently displayed reviews make use of Logos, Ethos, and Pathos more often, as compared to less prominent reviews?
Finally, we wish to make comparisons across the three communities, Amazon, Yelp and IMDb. While sharing several characteristics, these sites differ with respect to their domains, membership, and mechanisms for collecting feedback on reviews and organizing them. In addition, as can be seen by comparing the user profiles in Figures 1-3, while relationships between members are fairly loose at Amazon and IMDb, Yelp has a friendlier atmosphere, with members often interacting more than just once. Therefore, it is important to examine whether the most prominent (and least prominent) reviews across these communities have common characteristics.
Q4: Do prominently displayed reviews at Amazon, IMDb, and Yelp exhibit similar communication tactics, as compared to less prominent reviews?
We created a set of 300 prominent (first-page), 300 less prominent (middle), and 300 least prominent (last-page) reviews in August of 2009. For diversity, we examined two categories at each community: books and digital cameras at Amazon; restaurants and active life⁴ at Yelp; and all-time top movies and current box office hits at IMDb. From each category, we chose five popular forums. In total, the data set comprised 900 reviews from 30 forums across three communities.
We collected information via structural features of the online communities that reflect the quality of the reviewers, their level of activity, their social networks and community feedback on the reviews. Figure 4 details the information gathered and compares available features across communities.
We examined 10 characteristics of the text of reviews. Table 1 shows the characteristics studied and provides examples. This information was gathered using Microsoft Word. In the case of spelling errors, unrecognized proper names were not counted as errors. However, Word often labels slang or plays on words as being misspelled (e.g., the term “foodgasms” used in a restaurant review) and we did not eliminate these counts. Counts were normalized by the length of the text in words. One exception is the number of blank lines, which was normalized by the total lines in the review.
Table 1. Textual characteristics and examples
Words in all caps
We went with the smaller menu, which was PLENTY of food.
The chocolate soufflé…so good!
The most boring movie I’ve sat through in a long time!!!!!
David Yates (aka the director)
Smooth buying experience—received what I wanted
Love the PB & J.
Love Alinea. Hope to go back soon. :)
Simply the best!
Next occasion or event you know where I’ll be!
maby the 9th film of all time
this movie was more boring than church! they just talked and talked and they hardly ever shot anybody!
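The normalization described above (counts divided by the review length in words, and blank lines divided by the total number of lines) can be sketched as follows. This is an illustration only: the counts in the study were gathered with Microsoft Word, whereas the function `normalized_features`, its simple tokenization, and the three features shown here are assumptions chosen for brevity.

```python
import re
from typing import Dict

_PUNCT = ".,!?;:()\"'"


def normalized_features(text: str) -> Dict[str, float]:
    """Return a few textual feature counts, normalized as in the text."""
    words = text.split()
    lines = text.splitlines()
    n_words = max(len(words), 1)  # guard against empty reviews
    n_lines = max(len(lines), 1)

    def is_all_caps(token: str) -> bool:
        # Ignore surrounding punctuation and one-letter words such as "I".
        core = token.strip(_PUNCT)
        return len(core) >= 2 and core.isalpha() and core.isupper()

    all_caps = sum(1 for w in words if is_all_caps(w))
    repeated_punct = len(re.findall(r"[!?]{2,}", text))
    blank_lines = sum(1 for line in lines if not line.strip())
    return {
        "all_caps_per_word": all_caps / n_words,
        "repeated_punct_per_word": repeated_punct / n_words,
        "blank_line_ratio": blank_lines / n_lines,
    }


sample = ("We went with the smaller menu, which was PLENTY of food.\n"
          "\n"
          "Simply the best!!!")
features = normalized_features(sample)
```

Normalizing by length keeps long reviews from appearing to use a device more often simply because they contain more text.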
Persuasive Writing Devices
To study the use of persuasive devices, we needed to label each review as to the rhetorical strategies used by its author. For this task, two annotators were recruited. One was a teacher of language arts while the other was a linguist. The annotators were asked to indicate which, if any, of the three strategies were used. More particularly, they were asked the following questions, and they provided a binary answer to each:
Logos: In presenting her view of the item, does the author try to appeal to readers' sense of reason?
Ethos: Does the reviewer attempt to convince readers of her character, experience or qualifications?
Pathos: Does the reviewer try to get readers emotionally involved, by appealing to their values or sensibilities?
Table 2 shows examples of these strategies from the data. The agreement between the two annotators was 0.65. In other words, working independently, they answered the three questions in the same manner for 585 of the 900 reviews. Since the likelihood of this happening by chance is almost nil, the corresponding Cohen's Kappa is approximately 0.65, which represents substantial agreement (Carletta, 1996). For 35% of reviews there were discrepancies. For these reviews, the annotators were asked to discuss their answers in order to resolve their differences.
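The agreement figures above can be related through the usual definition of Cohen's Kappa, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The sketch below is a generic implementation, not the authors' procedure; treating each review's triple of binary answers as a single category is an assumption, and the four example reviews are invented.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Cohen's Kappa for two annotators labeling the same items."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("annotations must be non-empty and of equal length")
    p_o = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from each annotator's marginal label frequencies.
    p_e = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical (Logos, Ethos, Pathos) answers for four reviews.
annotator_1 = [(1, 0, 0), (1, 1, 0), (0, 0, 1), (0, 0, 0)]
annotator_2 = [(1, 0, 0), (1, 0, 0), (0, 0, 1), (0, 0, 0)]
kappa = cohens_kappa(annotator_1, annotator_2)

# Observed agreement reported in the text: 585 of 900 reviews.
observed = 585 / 900
```

When p_e is close to zero, as the text argues, Kappa is approximately equal to the observed agreement, which is why both are reported as roughly 0.65.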
Table 2. Examples of persuasive devices in reviews
Canon digital camera (Amazon)
Logos: Author cites specs of camera that work well. Ethos: Establishes identity (amateur photographer).
I'm an amateur photographer and find that this type of point-and-shoot works well for me -especially since it has a 4x optical zoom which gives clearer, more vivid photos. I also like the 2.5′′ LCD and the fact that this camera is small enough to easily fit in a pocket or purse.
Breaking Dawn, by Stephenie Meyer (Amazon)
Pathos: Author expresses anger and appeals to readers' sensibilities.
I am absolutely livid about this book…This book is an insult. If you are a fan of the series, over the age of 12 and/or have an IQ above 50, then DO NOT READ THIS BOOK. PLEASE listen to me.
Alinea Restaurant (Yelp)
Ethos: Reviewer convinces readers of his qualifications (a designer who has dined in top restaurants) and then states his opinion.
I design themed immersive environments for a living and eating at Alinea is THE immersive culinary experience for your pallette. Having lived in New York City & now Los Angeles I have been lucky enough to eat at what are considered to be some of the top restaurants in the world. Alinea is by far the most amazing meal I have ever had.
Schindler's List (IMDb - Top 250 Movies)
Logos: Author mentions several reasons why Schindler's List is a top movie. Pathos: Appeals to readers' emotions by making reference to the Holocaust.
Many movies come out each year and we applaud them for their screen play, originality and whatever else. But only once in a long while does one come out and you say all those nice things, but one you will also never forget. This movie is more than just something for us to watch for 3 hours and 17 minutes, it is something for us to never forget, to teach us a lesson and to remember those who died needlessly along with those who tried to help those same people survive.
Central Park (Yelp)
None: States an opinion without providing evidence.
This place was awesome!
Table 3 displays means of four characteristics⁵: age (days from earliest posting), length (words), deviation from average rating (the reviewer's rating of the item minus the average rating), and total feedback votes. We performed ANOVAs, with each characteristic as the response, community as a control and review category (front, middle, or last page) as the explanatory variable⁶.
Table 3. Review Characteristics: Mean age, length, deviation from average rating, and total votes
The relationship between age and review category differs between communities. At Amazon, the most prominent reviews tend to be posted earlier on. The opposite is true at Yelp, where recency is used in the ordering mechanism. At IMDb, mean review age does not differ between groups. The second column presents the average length. Superficially, one expects better reviews to be longer, expressing more information than shorter ones (Zhang and Varadarajan, 2006). The same trend is found across communities: prominent reviews are indeed longest, while the last-page (i.e., least prominent) reviews are shortest. At Yelp and IMDb, the less prominent reviews are longer than last-page reviews.
A study of Amazon's feedback system found that more negative reviews were seen as being less helpful (Danescu-Niculescu-Mizil et al., 2009). We observe the same trend. Across all communities, the most prominent reviews are less negative as compared to last-page reviews. At Amazon and IMDb, less prominent reviews are also less negative than the least prominent reviews.
Finally, the fourth column confirms the intuition that prominent reviews are most likely to be read, as they receive the most feedback. At Yelp, while participants leave useful, funny, and cool votes, they cannot leave negative feedback; thus, there is no equivalent metric. At Amazon and IMDb, less prominent reviews receive the least feedback, indicating that they are largely ignored. The last-page reviews, which are more likely to be negative, might be interesting to some users who want to learn the minority opinion. In summary, while the three communities use different ordering mechanisms and criteria for user feedback, it is interesting to note the common ways in which prominent reviews differ from last-page reviews. In particular, prominent reviews are longer and are more positive (or less negative) about the respective item.
Knowing that these factors are correlated to review prominence, we include them as control variables in our analyses of structural features, textual features, and the use of persuasion. In particular, we want to control aspects of review content and age. Review length and the valence of the opinion expressed (DevAveRate) have been reported to correlate to social feedback measures. In addition, we include review age to account for the possible “early bird bias” (Liu et al., 2007). By including these controls in our analyses, we will examine if, in addition to when a review is posted and its content, how it is written affects its prominence in the community.
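The group comparisons in this section rest on ANOVA. As a rough illustration of the underlying computation, the sketch below calculates the F statistic for a one-way ANOVA across the three review categories. It is a deliberate simplification: it omits the community factor and the control variables used in our analyses, the function name `one_way_anova_f` is hypothetical, and the review lengths are invented.

```python
from typing import Sequence


def one_way_anova_f(groups: Sequence[Sequence[float]]) -> float:
    """F statistic for a one-way ANOVA (between-group vs. within-group variance)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more observations than groups")
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variance; F is undefined")
    # Mean squares: sums of squares divided by their degrees of freedom.
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical review lengths (words) for front-, middle-, and last-page reviews.
front = [300.0, 320.0, 340.0]
middle = [200.0, 220.0, 240.0]
last = [100.0, 120.0, 140.0]
f_stat = one_way_anova_f([front, middle, last])
```

A large F value indicates that the category means differ by more than the within-category spread would suggest; a statistics package would additionally supply the p-value and accommodate the covariates.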
Structural Features: Reviewer Activity
A common feature at communities is profile information about the reviewer's activity. The first column of Table 4 displays the median number of reviews posted by reviewers. This distribution is skewed, with few prolific users and many writing few reviews. Differences between groups within communities were tested using a rank-sum test (Mann and Whitney, 1947). At all communities, the writers of prominent reviews are more prolific, as compared to those who author the least prominent reviews. In contrast, at Yelp, the authors of the less prominent (i.e., middle-page) reviews are the most prolific. An ANOVA was also conducted, to compare the log number of reviews written between the three groups, while controlling for three important aspects that might affect the prominence of reviews (age, length, and the deviation from average item rating). While the control variables are correlated with the total number of reviews written, the group effect remains highly significant (p-value of 0.0125).
Table 4. Reviewer Activity: Median reviews written and interactions with others
Using site features, users may also see how much interaction a reviewer elicits. At Amazon, one may leave comments on others' reviews. At Yelp, users receive compliments. At IMDb, one can see messages a reviewer has posted in discussions. While not equivalent, all reveal something about the reviewer's activity level. The distributions are again skewed. Overall, authors of front-page reviews have more interaction with others, as compared to authors of last-page reviews. At Yelp, the authors of reviews displayed on the middle pages have the most interaction. As for the total number of reviews written, an ANOVA was again conducted in order to confirm that there is a significant relationship between review prominence and interaction. In this case, review length was a significant factor. However, the group effect is also highly significant (p-value < 0.0001).
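For skewed distributions such as these, the rank-sum test compares groups through ranks rather than raw values. The following sketch computes the Mann-Whitney U statistic with average ranks for ties. It is a generic illustration with invented review counts; it stops at the U statistic and leaves the significance lookup (exact tables or a normal approximation) to a statistics package.

```python
from typing import Dict, Sequence


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> float:
    """U statistic for sample x, using midranks for tied values."""
    pooled = sorted(list(x) + list(y))
    ranks: Dict[float, float] = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Tied values occupy ranks i+1 .. j; assign their average.
        ranks[pooled[i]] = (i + 1 + j) / 2
        i = j
    rank_sum_x = sum(ranks[v] for v in x)
    return rank_sum_x - len(x) * (len(x) + 1) / 2


# Hypothetical numbers of reviews written by authors in two groups.
front_page_authors = [120, 45, 300, 18]
last_page_authors = [3, 1, 7, 2]
u_front = mann_whitney_u(front_page_authors, last_page_authors)
```

U ranges from 0 to the product of the two sample sizes; values near either extreme indicate that one group tends to rank consistently above the other.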
Structural Features: Disclosure of Personal Information
We examined if reviewers provide a self-description, post photos and list their location (beyond country). The proportions of reviewers of prominent, less prominent and least prominent reviews disclosing such personal information were compared via Z-test. At Amazon and IMDb, authors of prominent reviews are more likely to provide self-descriptions as compared to authors of less and least prominent reviews (0.23 versus 0.05 and 0.12 at Amazon; 0.19 versus 0.08 and 0.09 at IMDb). At Yelp, all reviewers provided a self-description. Likewise, a larger proportion of authors of prominent reviews share photos, as compared to authors of less and least prominent reviews at Amazon (0.29 versus 0.06 and 0.17) and Yelp (0.96 versus 0.88 and 0.48). There were no significant differences for the disclosure of location.
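A comparison of two proportions via Z-test can be sketched as below, using the Amazon self-description proportions reported above (0.23 versus 0.05). The group size of 100 per category is an assumption inferred from the sampling design (300 reviews per category across three communities, with one profile per review); the pooled-variance form of the test and the function name `two_proportion_z` are likewise illustrative choices.

```python
import math
from typing import Tuple


def two_proportion_z(p1: float, n1: int, p2: float, n2: int) -> Tuple[float, float]:
    """Pooled two-proportion z statistic and two-sided p-value."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        raise ValueError("standard error is zero; z is undefined")
    z = (p1 - p2) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Amazon self-descriptions: prominent vs. less prominent reviews.
# n = 100 per group is an assumption, not a figure stated in the text.
z, p_value = two_proportion_z(0.23, 100, 0.05, 100)
```

Under these assumptions the difference is well beyond the conventional 1.96 threshold, which is consistent with the significant differences reported for self-descriptions.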
Structural Features: Social Networking
Table 5 shows the mean number of friends that Amazon and Yelp reviewers have. Authors of prominent reviews have more friends than authors of the least prominent reviews. At Yelp, authors of less prominent reviews have the most friends. An ANOVA was conducted with the log number of friends as the dependent variable. In addition to the review group and the community, review age, length, and the deviation from the average item rating were added as controls. The group and community effects were statistically significant (p-value < 0.0001 in both cases); however, the effects of the control variables were not. Finally, participants at IMDb cannot indicate friends on their profiles. However, one mechanism that enables networking is to provide contact information (e-mail or URL). As shown, a significantly larger proportion of authors of prominent reviews disclose this information, as compared to the authors of the least prominent reviews.
Table 5. Networking: Mean # friends and proportion of reviewers providing an e-mail or URL
Structural Features: Prestige Badges and Reviewer Quality Statistics
Reviewers can earn “top reviewer” or “elite” badges at Amazon and Yelp, which are displayed with reviews. Among the most prominent Amazon reviews, 17% were written by “top reviewers” as compared to only 1% of the less prominent and none of the least prominent reviews. At Yelp, 38% of the prominent reviews were written by “elites” as compared to only 12% of the least prominent reviews. Less prominent Yelp reviews were just as likely to have elite authors.
Reviews might also stand out if they are written by authors with good ratings over all contributed reviews, which are displayed on their profiles. At Amazon, the mean helpfulness of authors of prominent reviews was 0.89, compared to 0.75 for the authors of the less prominent reviews and 0.24 for authors of the least prominent contributions. This is expected since feedback is the primary criterion used to rank reviews.
Yelp displays raw counts of useful, funny, and cool votes on profiles. We compared the sum of these votes. For writers of the front-page reviews, the median was 83.5, while for authors of last-page reviews, the median was only 8. The writers of middle-page reviews receive the most votes (median of 186). Thus, the picture of the less prominent reviews becomes clearer. Reviews displayed in the middle must be of a high quality in order to remain there as they age. Because of time sensitivity, lower-quality reviews, having an advantage while new, can temporarily be displayed towards the top of the list.
We found differences between the most prominent and least prominent reviews with respect to four characteristics: use of multiple punctuation marks, proportion of blank space, and rates of spelling and grammatical errors. Table 6 shows the mean values of these characteristics. (For space considerations, we omit the less prominent reviews.) In order to examine group and community effects while controlling for age and content features, we performed ANOVAs with each characteristic as the response.
Table 6. Textual characteristics of [most | least] prominent reviews
Even when controlling for community and for review age, length, and item rating, writers of the least prominent reviews use multiple punctuation marks (e.g., !?) more often than writers of the most prominent reviews. Another difference is the proportion of blank lines. IMDb reviewers tend to leave more space (e.g., paragraph breaks). However, when community is controlled, there is still a group effect, with front-page reviews containing significantly more space as compared to last-page reviews. Since reviews are plain text, leaving space might make them easier to read or more visually appealing.
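To make the two characteristics discussed above concrete, the following sketch shows one plausible way to measure them. The operationalization is an assumption: multiple punctuation is counted as runs of two or more "!" or "?" characters, and blank space as the share of lines that are empty or contain only whitespace. The function names are hypothetical, and the measures used in the analysis may have been defined differently.

```python
import re

# ASSUMPTION: "multiple punctuation" means two or more consecutive
# '!' or '?' characters, e.g. "!!", "?!", or "!?!".
MULTI_PUNCT = re.compile(r"[!?]{2,}")


def multiple_punctuation_count(text):
    """Number of runs of repeated or mixed '!' and '?' marks."""
    return len(MULTI_PUNCT.findall(text))


def blank_line_proportion(text):
    """Share of lines that are empty or contain only whitespace."""
    lines = text.split("\n")
    blank = sum(1 for line in lines if not line.strip())
    return blank / len(lines)


# A made-up review used only for illustration.
review = "Loved it!!\n\nWould you believe the ending?!\nSee it."
print(multiple_punctuation_count(review))
print(blank_line_proportion(review))
```

In an analysis that controls for review length, the punctuation count would typically be normalized, for example per sentence or per 100 words.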
At IMDb and Yelp, the least prominent reviews contain more spelling errors than the most prominent ones, while the less prominent reviews do not differ from the most prominent. The differences between groups are not significant at Amazon. A pronounced difference in the rate of grammatical errors can also be noted between the most and least prominent reviews at Yelp and IMDb. However, there is no significant difference at Amazon.
Persuasive Writing Devices: The Use of Logos, Ethos, and Pathos
We considered the proportion of reviews using each strategy, Logos (L), Ethos (E), and Pathos (P), individually and combined. The most commonly used strategy was Ethos only, used in 23.6% of the reviews. The distributions of other combinations were as follows: L (19.6%), L+E (15.1%), P (11.1%), E+P (10.8%), L+P (5.6%), L+E+P (3.2%). Finally, 11% of the reviews did not contain any devices.
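The reported combination percentages can be checked and summarized with a few lines of arithmetic. The sketch below uses only the figures given above (treating the 11% with no devices as 11.0) to confirm that the categories sum to 100%, and to derive the overall share of reviews using each strategy and the implied mean number of devices per review. The derived values are subject to rounding in the reported percentages.

```python
# Percentage of reviews using each combination of strategies, as reported
# in the text (L = Logos, E = Ethos, P = Pathos; empty set = no devices).
shares = {
    frozenset(): 11.0,
    frozenset("E"): 23.6,
    frozenset("L"): 19.6,
    frozenset("LE"): 15.1,
    frozenset("P"): 11.1,
    frozenset("EP"): 10.8,
    frozenset("LP"): 5.6,
    frozenset("LEP"): 3.2,
}

# The eight mutually exclusive categories should account for all reviews.
total = sum(shares.values())

# Overall share of reviews using each strategy, alone or in combination.
marginal = {
    strategy: sum(pct for combo, pct in shares.items() if strategy in combo)
    for strategy in "LEP"
}

# Implied mean number of persuasive devices per review.
mean_devices = sum(len(combo) * pct for combo, pct in shares.items()) / 100

print(f"total = {total:.1f}%")
for strategy, pct in marginal.items():
    print(f"{strategy}: {pct:.1f}%")
print(f"mean devices per review = {mean_devices:.3f}")
```

On these figures, Ethos appears in roughly half of all reviews, Logos in somewhat fewer, and Pathos in under a third, with an average of about 1.3 devices per review.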
Table 7 shows the proportion of writers using each strategy. To test whether strategy was correlated with review category, we performed logistic regressions with Logos, Ethos, and Pathos as binary responses, community and product as controls, and category as the explanatory variable. Additional controls were added to account for the age of reviews as well as content. For Logos, there is no community or product effect; its use is directly correlated with review group. The most prominent reviews tend to invoke Logos more often than the less and least prominent reviews.
Table 7. Use of Logos, Ethos, and Pathos strategies by review category
p-value (group effect)
In contrast, while Ethos is correlated with group, community and product effects are also significant. Among the most prominent book reviews, only 34% used Ethos, while 90% of prominent restaurant reviews did so. Top restaurant reviews tended to be experience-oriented, detailing the author's meal and impressions, while top book reviews were likely to be of an analytical nature (e.g., critiquing the plot).
The group effect for Pathos is only borderline significant. However, the community and product effects are highly significant. Pathos is used in IMDb reviews (38%) more often than at Amazon (23%) or Yelp (30%). For example, there were many emotionally charged reviews of “Schindler's List,” across all categories. In addition, authors often appealed to readers' values when reviewing controversial films such as “Bruno” (for its portrayal of gay men) or “The Hangover” (for scenes depicting child abuse).
Finally, the last column shows the mean number of persuasive devices used. ANOVA was used to compare between groups. Even when controlling for community and product, there are differences between groups. In particular, the most prominent reviews use the greatest number of devices. The writers of the least prominent reviews use significantly fewer persuasive appeals as compared to others.
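The between-group comparison rests on the ANOVA F statistic, which is the ratio of between-group to within-group variance. The sketch below computes it for a plain one-way design. It deliberately ignores the community and product controls described above, and the device counts are hypothetical, so it illustrates the statistic rather than reproducing the analysis.

```python
def one_way_anova_f(groups):
    """Return (F statistic, df between, df within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of individual values around their own group mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical numbers of persuasive devices per review in each group.
most_prominent = [3, 2, 2, 1]
less_prominent = [2, 1, 1, 2]
least_prominent = [1, 0, 1, 0]
f_stat, df_b, df_w = one_way_anova_f(
    [most_prominent, less_prominent, least_prominent]
)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

Obtaining a p-value requires the F distribution, which is available in standard statistical packages; adding controls turns the model into an analysis of covariance.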
Comparison of All Devices
Overall, which devices distinguish the most prominent reviewers from others? We used multinomial logistic regression (Menard, 2002), with review group as the categorical response. We reexamined devices that were common across communities and were of interest in the analyses. We included several controls: community, product, deviation from average rating, review age, and length. We used the following predictors: reviews written, interactions with others, provision of self-description, multiple punctuation, proportion of blank lines, spelling and grammatical error rates, and number of rhetorical strategies.
Only six variables had significant effects. Interestingly, neither community nor product was significant, suggesting that devices used by authors of front-page reviews are similar across communities and items reviewed. In addition, none of the variables describing activity or disclosure of information had a salient effect. Two textual characteristics, parentheses and spelling error rate, were also eliminated. The resulting model, described in Table 8, is highly significant (i.e., p-value for chi-square test is less than 0.001) and has a pseudo-R² of 0.19.
Table 8. Model relating prominent reviews to least (left) and less (right) prominent reviews.
Multinomial logistic regression models the log ratio of the probability of a review belonging to one category versus another. The left side of Table 8 compares prominent (front-page) reviews to the least prominent (last-page) reviews. Front-page reviews are significantly more positive; longer; have fewer multiple punctuation marks, more blank space, and fewer grammar mistakes; and use more persuasive appeals. The right side of Table 8 compares front-page reviews to those on the middle page of the forum. They differ significantly with respect to only three characteristics: front-page reviews are longer, have fewer multiple punctuation marks, and use more persuasive appeals.
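The log-ratio formulation can be written out explicitly. The notation below is introduced here for illustration and is not taken from the original analysis: Y_i is the group of review i, x_i is its vector of predictors and controls, and the front-page group is assumed to be the reference category, which matches the two comparisons in Table 8 but fixes a direction for the ratio that the text does not specify.

```latex
% Log odds of review i falling in group g rather than the reference
% (front-page) group, as a linear function of its characteristics.
\log \frac{P(Y_i = g \mid \mathbf{x}_i)}{P(Y_i = \text{front} \mid \mathbf{x}_i)}
  = \beta_{g0} + \boldsymbol{\beta}_g^{\top} \mathbf{x}_i,
\qquad g \in \{\text{middle}, \text{last}\}

% Implied group probabilities, with the reference group's
% coefficients fixed at zero.
P(Y_i = g \mid \mathbf{x}_i)
  = \frac{\exp\!\left(\beta_{g0} + \boldsymbol{\beta}_g^{\top} \mathbf{x}_i\right)}
         {1 + \sum_{h \in \{\text{middle}, \text{last}\}}
              \exp\!\left(\beta_{h0} + \boldsymbol{\beta}_h^{\top} \mathbf{x}_i\right)}
```

Under this parameterization, a negative coefficient for a characteristic such as review length in the "last" equation means that longer reviews are less likely to appear on the last page than on the front page, holding the other variables fixed.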
We have focused on the notion of review prominence in popular online forums, in order to examine which viewpoints get heard. We argued that, in order for a participant's voice to be heard, her review must be easily seen. While some may be unconcerned by how prominently displayed their contributions are, a growing body of literature suggests that many participants strive for recognition (Kollock, 1999) or even status (Lampel and Bhalla, 2007) in online communities.
In large communities, there is inevitably a lot of noise (Gu et al., 2007). Some reviews are of low quality or may even contain false information, perhaps posted by an organization's competitors (David and Pinch, 2006). Even more common is the presence of what Gilbert and Karahalios (2010) have termed “déjà reviews,” contributions that do not add any new information, given what others have already posted. Community feedback mechanisms address these problems by displaying most prominently the contributions considered to be the best. While constructs such as “helpfulness” and “coolness,” which are used to collect feedback in different communities, are not necessarily equivalent, they all provide users with an idea as to what is valued at that particular community.
The current findings emphasize that, in addition to what reviewers write, how they write is correlated with how well their messages are received and thus with how prominently their contributions are displayed. Our analysis showed that, across all three communities, front-page reviews tend to be written by authors who share more information about themselves and who are more active in the community, as compared to authors of the least prominent reviews. However, in our final model, none of the variables based on reviewers' profiles was significantly correlated with review group. In other words, features related to review content, textual presentation, and the use of Logos, Ethos, and Pathos were more important in terms of explaining the variance between prominent, less prominent, and least prominent reviews.
This should not be taken to mean that the information available about reviewers via structural features is not an essential component of a review community. Sharing personal or identifying information is important for the cohesiveness and sustainability of an online community, and encourages the contribution of high-quality postings. It should also be noted that only a study examining users' information-seeking behavior could tell us to what extent users rely on the information about reviewers expressed by structural features, and how exactly this information is used.
The findings also have implications for the use and usefulness of user-contributed content online. In the new networked information economy, as described by Benkler (2006), a cooperative, many-to-many information sharing model has emerged. Consumers searching for information about a product or service are no longer restricted to content produced under the formerly predominant, one-way professional model. Anderson (2006) claims that younger consumers do not make distinctions between professionally produced content and that contributed by amateurs. In fact, one way that user-contributed content, such as reviews, makes a big difference is by helping people find items of interest in the “Long Tail,” which are relevant to just a few consumers. Therefore, the variety of available information sources may greatly benefit consumers.
However, as Benkler points out, when people are not aware of, or do not understand, all of the information sources that are available, it becomes meaningless to have an all-inclusive intake of varying opinion. In the context of online communities, in which a large number of participants share their opinions and experiences, if these contributions are not organized in a manner that helps users find information that is useful to them, then it does not matter to the consumer how many sources of information are presented. Previous work has found that social navigation systems can reflect the quality of information contained in user-contributed texts (e.g., Otterbacher, 2009). Similarly, the current findings show that the most prominent reviews at online communities are of relatively high quality, in that they tend to be well edited and present an opinion on the item of interest with supporting arguments.
There has been concern that the popularity of sites inviting user-contributed content has lowered the quality of information available on the Web. In particular, Keen (2007) raises several issues with what he terms the “cult of the amateur.” In addition to exacerbating the problem of information overload, he claims that the prevalence of user-contributed content has resulted in an “endless digital mediocrity.” Our empirical results do not support this position. While it is true that there is no shortage of low-quality reviews, in communities with social navigation mechanisms in place, these reviews are not likely to be prominently displayed. In fact, the prominently displayed reviews at communities such as Amazon, IMDb, and Yelp are written in a manner that one would expect of professionally produced content, while the least prominent reviews are those of a more amateur nature.
It is important to point out that how prominently a review is displayed within its given forum is somewhat dynamic. Rankings of reviews may change as more feedback is collected and as new reviews are posted. However, given that the forum has achieved momentum, we do not have reason to believe that the key characteristics of the most valued reviews would differ significantly from those observed in our data set.
Finally, it should be noted that the current data set consists of reviews from popular forums at Amazon, Yelp, and IMDb. It may be the case that community standards for review presentation and content are different for less popular items. For such items, the supply of available reviews might be significantly smaller, and readers might therefore accept lower-quality reviews.
Social navigation mechanisms are widely used at online review communities in order to address problems of information overload and the presence of low-quality or fraudulent reviews. Although they rely on constructs, such as helpfulness or usefulness, that do not have universal interpretations, and although there are known biases with respect to how such feedback is collected, it is encouraging that these mechanisms pick up on salient review characteristics. Beyond message content, we found that reviews that are displayed on the front, middle, and last pages of their respective forums also differ with respect to their textual properties and the extent to which they use persuasive arguments.
There is still much to learn about how participants actually use the cues available to them, to make judgments as to the utility and credibility of others' contributions. While much information is available in reviewers' profiles, it is not known to what extent they use these details or how they use them. When a user approaches a review community, in order to see what others think about an item that interests her, does she first read through the text of the reviews and then check profiles to judge reviewers' credibility? Does she begin her search for information by identifying people of interest and focus on their reviews? Or, does she simply rely on the community's default social navigation mechanism to determine what is the best available information, and attend to that? Examining such questions in future work will help improve many aspects of these communities, including the development of new site features and mechanisms for organizing user-contributed content.
The author is grateful for the detailed feedback from the two anonymous reviewers, as well as the editor. She would also like to thank Erica Dekker for her help in editing the final version of the paper.
Elite reviewers are described as being the most “active and influential.”
Helpfulness is defined as the ratio of helpful to total votes.
Active life concerns topics such as parks, spas and gyms.
In each cell for all tables, values are displayed as follows: [most | less | least] prominent reviews.
We note differences with p-value less than 0.05 for the given statistical test, and report means when distributions are approximately normal or medians otherwise.
About the Author
Jahna Otterbacher (Ph.D., University of Michigan) is an Assistant Professor in the Lewis Department of Humanities at the Illinois Institute of Technology. Her research focuses on information behaviors in online environments, where the primary mode of communication is unstructured text. She is especially interested in how the organization of textual information affects how people find and make sense of it.
1. Address: Lewis Department of Humanities, Illinois Institute of Technology, 3301 Dearborn, Suite 218, Chicago, IL 60616.