Analyzing the wikisphere: Methodology and data to support quantitative wiki research



Because experimental data are inherently difficult to obtain from wikis, past quantitative wiki research has largely focused on Wikipedia, which limits how far its findings can be generalized. To facilitate the analysis of wikis other than Wikipedia, we developed WikiCrawler, a tool that automatically gathers research data from public wikis without supervision. We then built a corpus of 151 wikis, which we have made publicly available. Our analysis indicated that these wikis display signs of collaborative authorship, validating them as objects of study. An initial analysis of the corpus revealed some similarities with Wikipedia, such as users contributing at unequal rates. We also analyzed the distributions of edits across pages and users, yielding data that can motivate or verify mathematical models of behavior on wikis. By providing data collection tools and a corpus of already-collected data, we have completed an important first step toward investigations that analyze user behavior, establish measurement baselines for wiki evaluation, and generalize Wikipedia research by testing hypotheses across many wikis.