Deliberation in Wikipedia: Rationales in article deletion discussions

Abstract

In this paper, we describe a study-in-progress aimed at evaluating and improving the quality of online deliberation. We analyze the use of rationales in article deletion discussions on Wikipedia. Our preliminary findings suggest that the majority of participants in these discussions were concerned with the notability and credibility of the topics, and for the most part presented cogent rationales based on established site policies.

INTRODUCTION

As explained in Wikipedia, deliberation is “a process of thoughtfully weighing options”. The emphasis in deliberation is on rational thinking and logic rather than on a power struggle. According to Dryzek (1990), the deliberation process ideally fosters communication and makes the resulting decision more legitimate.

The rapid development of social media tools and increasing interest in participation in online multi-participant activities make online deliberation an increasingly attractive research topic (Cabinet Office, 2002; Coleman & Gøtze, 2002; Forte & Bruckman, 2008; Hague & Loader, 1999). On the other hand, the proliferation of resources for information sharing and online deliberation has led to an increasing concern with the quality of the shared information in e-participation activities.

This study is aimed at evaluating and improving the quality of online deliberation. Our current focus is to examine the use of rationales in Wikipedia's article deletion discussions.

Wikipedia has been a particular source of concerns regarding information quality, as it is the world's largest online encyclopedia and a crowd-sourced collaborative endeavor. Despite a study by Giles (2005) that demonstrated comparable levels of accuracy between Wikipedia and more traditional references like Encyclopedia Britannica, there is still a perception of problems regarding information quality in Wikipedia. A number of methods of evaluating information quality on Wikipedia have been suggested (Yaari, 2011). We maintain that by examining the rationales used for debating keeping/deleting articles in Wikipedia, we can understand both the quality of these online deliberations as a means of mediating conflict via consensus-building (Kittur et al., 2007) and the quality of Wikipedia articles.

The Articles for Deletion (AFD) process on Wikipedia relies on a system of votes combined with rationales for deciding whether articles are to be kept or deleted. Rationales are based on such factors as evaluation of sources and reference to Wikipedia's inclusion criteria. The article under review is kept or deleted based not simply on the number of votes for keep or delete, but on an analysis of the validity and strength of the provided rationales. This analysis is carried out by the editor who closes the discussion, who is typically an “administrator” (a user empowered by the community to interpret consensus and delete articles when appropriate). The rationality of the discussions is considered to be a reflection of the information quality processes of Wikipedia.

To understand the quality of this particular type of online deliberation, we analyzed three days' worth of deletion debates to develop a coding schema based on categories of rationales used, then applied that schema to an additional three days to analyze the pattern of rationales.

DECISION RATIONALE

The preliminary step in undertaking this study was to develop a means of analyzing deletion debates. Adopting an open coding process, two independent coders examined the deletion discussions that were begun on 5 January 2012. This date was selected because it was relatively recent, allowing for an up-to-date understanding of how decisions are made, but far enough removed that the discussions had already concluded when we began our analysis. The coders discussed the coding lists they generated, exchanged their coding experiences, and proposed an agreed coding list. The principal investigator reviewed the list together with the two coders and constructed the final list with definitions for each code. Table 1 shows the coding list. A third independent coder then used this list to examine deletion debates begun on three selected dates, i.e., the corpus of the study.

Table 1. Coding List
Code Family / Sub-code
References to Wikipedia's internal policies
  • References to the Wikipedia policy “What Wikipedia is Not” (NOT), which excludes certain topics, contents and practices from Wikipedia

  • References to the Wikipedia policy “Five Pillars” (5P), particularly the first three pillars: encyclopedic, neutral, and free

  • References to the Wikipedia policy on biographies of living persons (BLP), which limits how such articles can be written

  • Speedy deletion (CSD), quick deletion of articles in a narrowly defined range of topics or with serious problems (e.g., copyright violation, used to attack subject, etc.)

  • Reference to other Wikipedia policies

Procedural points
  • Authorship (author is banned or in a conflict of interest)

  • Previous deletion or attempt at deletion was contested

  • Problems with nomination (ineligible for deletion, rationale does not apply, etc)

  • Opportunity for improvement to resolve concerns

Notability
  • General notability guidelines (WP:N)

  • Availability of external evidence for notability

    • General external evidence (WP:GNG – independent third-party reliable sources)

    • Government recognition of subject

    • Number of Google hits

    • Coverage in media (newspapers, news magazines, etc)

    • Other specific external sources (IMDB, books, etc)

  • Content of article (e.g., awards for subject)

  • Topic (e.g., geographic features are assumed to be notable)

Credibility
  • Internal citations (sources cited within the article)

  • External sources (sources not cited but available elsewhere)

  • Anticipation (e.g., voter volunteered to seek sources)

Precedent
  • Article available in another language Wikipedia

  • Articles on similar subjects available

  • Based on previous deletion discussions

Richness (validity and usefulness of content) 
Utility or function within Wikipedia (e.g., gazetteer, set list)
Agree with rationales provided by other voters 
Disagree with rationales provided by other voters 
No rationale provided 

CORPUS

Three days were selected for analysis using the consensus coding schema: 1 June 2010, 1 June 2011, and 15 January 2012. The first two dates were selected to be a year apart and not close to any major changes made to deletion processes or policies. Because the third coder belongs to the Wikipedia community, we also ensured that we did not include dates on which she participated in deletion debates. The third date was selected because it was the day of most intense debate surrounding Wikipedia's proposed blackout in opposition to the Stop Online Piracy Act (SOPA) and Protect IP Act (PIPA), and because debates begun on that date would be suspended for the 24-hour duration of the blackout. The SOPA debate received over 1800 individual viewpoints, the largest level of community participation in a single debate in Wikipedia's history (http://en.wikipedia.org/wiki/Wikipedia:SOPA_initiative/Action). We wished to investigate what effect, if any, that debate and the subsequent blackout would have on the quality of deletion discussions and rationales.

Each day covered a wide variety of articles and subjects. On 1 June 2010, debates were begun for 89 articles, of which 32 were biographies (an article type with stricter standards for content, due to the potential implications for living subjects); other represented topics included works of music, bands and organizations. On 1 June 2011, debates were begun for 73 articles, including 23 biographies; other represented topics included a file extension, a list of unused highways, and a fictional university. On 15 January 2012, debates were begun for 67 articles, including 23 biographies; other topics included several military engagements and a police department. Voting patterns for each of the days are summarized in Table 2 (where votes were changed, only the final vote is counted). As shown in the table, roughly two-thirds of the articles nominated on each of the first two dates were deleted after the deletion discussion process, compared with just under half of those nominated on the third date. The articles that were not deleted were kept, merged into other articles, or “userfied” (given to a user as a draft for further improvement prior to republication).

Table 2. Voting patterns of the three selected dates
Day | # of articles | Total votes for “keep the article” | Total votes for “delete the article” | Total “other” votes (merge, userfy, etc.) | Results
1 June 2010 | 89 | 127 | 280 | 37 | 19 kept, 60 deleted, 10 other
1 June 2011 | 73 | 119 | 212 | 23 | 20 kept, 50 deleted, 3 other
15 January 2012 | 67 | 109 | 200 | 63 | 14 kept, 31 deleted, 22 other
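
The proportions implied by Table 2 can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: the figures are transcribed from Table 2, while the dictionary layout and the function names (`total_votes`, `deletion_share`, `outcomes_balance`) are hypothetical choices made for this example and are not part of the study's tooling.

```python
# Figures transcribed from Table 2. The structure and all names below are
# illustrative assumptions, not tooling used in the study.
TABLE_2 = {
    "1 June 2010": {"articles": 89, "keep": 127, "delete": 280, "other": 37,
                    "kept": 19, "deleted": 60, "other_result": 10},
    "1 June 2011": {"articles": 73, "keep": 119, "delete": 212, "other": 23,
                    "kept": 20, "deleted": 50, "other_result": 3},
    "15 January 2012": {"articles": 67, "keep": 109, "delete": 200, "other": 63,
                        "kept": 14, "deleted": 31, "other_result": 22},
}


def total_votes(row):
    """Sum of keep, delete, and other votes cast in one day's debates."""
    return row["keep"] + row["delete"] + row["other"]


def deletion_share(row):
    """Fraction of that day's nominated articles that were deleted."""
    return row["deleted"] / row["articles"]


def outcomes_balance(row):
    """True if kept + deleted + other outcomes equals the article count."""
    return row["kept"] + row["deleted"] + row["other_result"] == row["articles"]


if __name__ == "__main__":
    for day, row in TABLE_2.items():
        print(f"{day}: {total_votes(row)} votes, "
              f"{deletion_share(row):.1%} of articles deleted, "
              f"outcomes consistent: {outcomes_balance(row)}")
```

Keeping the vote counts and the closing outcomes in the same record makes it easy to confirm that each row of the table is internally consistent before comparing dates.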

PRELIMINARY FINDINGS

Our preliminary findings are based on the analysis of the three selected dates. The third coder coded the content two times to ensure intra-coder reliability. Table 3 below presents the results of this coding analysis:

Table 3. Coding results of the three selected dates
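Code | June 1, 2010 | June 1, 2011 | Jan. 15, 2012
Agree | 50 | 34 | 29
Credibility | 119 | 75 | 66
Disagree | 5 | 5 | 4
None | 7 | 6 | 12
Notability | 436 | 314 | 305
Policy | 79 | 71 | 56
Precedent | 19 | 23 | 40
Procedural | 36 | 50 | 46
Richness | 33 | 43 | 93
Utility | 29 | 15 | 11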

The results show that, on all three dates, only a very small percentage of votes provided no rationale (the “None” category). For example, for deletion discussions begun on June 1, 2010, only 0.9 percent of coded segments fell into this category. This finding is consistent with Wikipedia's policy on voting: voting without a rationale is strongly discouraged (see “Arguments to avoid in deletion discussions,” http://en.wikipedia.org/wiki/Wikipedia:ATA).
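
The share of each rationale category can be recomputed from the counts in Table 3. The Python sketch below is a minimal illustration under stated assumptions: the counts are transcribed from Table 3, each share is taken relative to all coded segments for a date (the column total), and the names `TABLE_3`, `column_total`, and `category_share` are hypothetical.

```python
# Counts transcribed from Table 3, ordered as
# (June 1, 2010; June 1, 2011; Jan. 15, 2012). Names are illustrative.
TABLE_3 = {
    "Agree": (50, 34, 29),
    "Credibility": (119, 75, 66),
    "Disagree": (5, 5, 4),
    "None": (7, 6, 12),
    "Notability": (436, 314, 305),
    "Policy": (79, 71, 56),
    "Precedent": (19, 23, 40),
    "Procedural": (36, 50, 46),
    "Richness": (33, 43, 93),
    "Utility": (29, 15, 11),
}
DATES = ("June 1, 2010", "June 1, 2011", "Jan. 15, 2012")


def column_total(date_index):
    """Total number of coded segments recorded for one date."""
    return sum(counts[date_index] for counts in TABLE_3.values())


def category_share(category, date_index):
    """Share of one category among all coded segments for one date."""
    return TABLE_3[category][date_index] / column_total(date_index)


if __name__ == "__main__":
    for i, date in enumerate(DATES):
        print(f"{date}: None = {category_share('None', i):.1%}, "
              f"Notability = {category_share('Notability', i):.1%}")
```

Note that the denominator here is the number of coded segments rather than the number of votes; because a vote can carry several codes, a per-vote share would be somewhat higher.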

This Wikipedia document also recommends against merely citing policy or other voters' rationales, preferring that participants supplement these with their own arguments (although this recommendation is not always followed). Our findings are consistent with this recommendation: votes that merely agree or disagree with others' rationales, without providing additional rationales of their own, account for a small percentage of the coded segments (see the “Agree” and “Disagree” categories for the three dates).

Another interesting observation is that notability far outweighs the second most frequent rationale, credibility. “Notability” comprises an extensive network of criteria by which various topics can be evaluated. Notability concerns itself with the topic of the article, whereas credibility deals with article content. While concerns with credibility (or some policies like neutrality) may be resolved by editing processes, a topic not worthy of inclusion can only be removed by deletion (with merging of viable content to other articles). Therefore, the most likely reason that notability is cited so much more often than credibility is that discussion participants emphasize the encyclopedic aspect of the Wikipedia project by removing content not within scope but retaining content within its inclusion criteria, even if that content needs significant improvement. Though policies related to article content, such as “neutral point-of-view,” were occasionally cited, these were primarily incidental to topic-oriented concerns like notability.

Perhaps surprisingly, despite the extensive availability of archived deletion debates, precedent was a relatively little-cited rationale. However, one of the most often cited sections of the “Arguments to avoid” page is “Other stuff exists” – essentially, this page argues that a comparison to another article or previous debate is not a sufficient rationale. For these reasons, participants who used such rationales tended to include other rationales also. Indeed, many votes were coded with multiple rationales: the 443 unique votes recorded for 1 June 2010 had an average of 1.8 codes each; the 354 unique votes for 1 June 2011 also averaged 1.8 codes each; and the 372 unique votes for 15 January 2012 also had an average of 1.8 codes per vote.
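
The reported averages can be reproduced by dividing the number of coded segments by the number of unique votes for each date. The Python sketch below assumes that the coded-segment totals are the column sums of Table 3 (813, 636, and 662); the unique-vote counts are those given in the text, and the names `CODES_AND_VOTES` and `codes_per_vote` are hypothetical.

```python
# (coded segments, unique votes) per date. Coded-segment totals are assumed
# to be the column sums of Table 3; vote counts are taken from the text.
CODES_AND_VOTES = {
    "1 June 2010": (813, 443),
    "1 June 2011": (636, 354),
    "15 January 2012": (662, 372),
}


def codes_per_vote(coded_segments, unique_votes):
    """Average number of rationale codes assigned to each unique vote."""
    return coded_segments / unique_votes


if __name__ == "__main__":
    for date, (segments, votes) in CODES_AND_VOTES.items():
        print(f"{date}: {codes_per_vote(segments, votes):.1f} codes per vote")
```

An average well above one code per vote is what would be expected if participants routinely combine rationales, for example pairing a precedent-based argument with a notability argument.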

IMPLICATIONS

Understanding the decision-making process on Wikipedia lays the groundwork for further evaluation of the value of Wikipedia as an information resource. By examining the reasons for which material is removed from the site, we may see the efficacy of Wikipedia policies and evaluate the rationality of the community. Given that mass collaboration works like Wikipedia are driven by consensus within the framework of the site's guidelines, such an assessment of online decision-making may guide further research into cooperative production and compromise. This study might also provide the basis for a metric to quantify the utility and adoption of policies on Wikipedia, and could find relevance in examining the basic mechanisms that regulate deliberation in other online contexts and distributed communities. Further study is required to consider these possibilities.

Acknowledgements

This project was funded by SSHRC.
