Ulrike Pfeil is a Ph.D. student at the Centre for Human-Computer Interaction Design, School of Informatics at City University, London. Her research interests include social aspects of computing, especially in computer-mediated communication.
Address: The Centre for Human-Computer Interaction Design, City University London, EC1V 0HB, UK
Panayiotis Zaphiris
Centre for Human-Computer Interaction Design, School of Informatics, City University, London
Panayiotis Zaphiris is a Senior Lecturer at the Centre for Human-Computer Interaction Design, School of Informatics of City University, London. His research interests lie in HCI with an emphasis on inclusive design and social aspects of computing. He is especially interested in HCI issues related to the elderly and people with disabilities.
Address: The Centre for Human-Computer Interaction Design, City University London, EC1V 0HB, UK
Chee Siang Ang
Centre for Human-Computer Interaction Design, School of Informatics, City University, London
Chee Siang Ang is a Ph.D. student at the Centre for Human-Computer Interaction Design, School of Informatics of City University, London. His research interests include the psychology and sociology of computer games, including new forms of CMC such as MMORPGs.
Address: The Centre for Human-Computer Interaction Design, City University London, EC1V 0HB, UK
This article explores the relationship between national culture and computer-mediated communication (CMC) in Wikipedia. The articles on the topic game from the French, German, Japanese, and Dutch Wikipedia websites were studied using content analysis methods. Correlations were investigated between patterns of contributions and the four dimensions of cultural influences proposed by Hofstede (Power Distance, Collectivism versus Individualism, Femininity versus Masculinity, and Uncertainty Avoidance). The analysis revealed cultural differences in the style of contributions across the cultures investigated, some of which are correlated with the dimensions identified by Hofstede. These findings suggest that cultural differences that are observed in the physical world also exist in the virtual world.
The Internet and the World Wide Web enable people from different geographical locations to communicate with one another, share information, and build commercial or interest-based relationships and personal friendships (Cummings, Butler, & Kraut, 2002; Preece & Maloney-Krichmar, 2005). Different kinds of communities emerge due to this possibility, some of which carry over from the virtual to the physical world (Preece & Maloney-Krichmar, 2005). Preece and Maloney-Krichmar (2005) define online community as communication among a group of people “who come together for a particular purpose, and who are guided by policies (including norms and rules) and supported by software” (n.p., emphasis original). That is, online communication takes place with the help of various tools—such as email, chat, newsgroups, message boards, and more recently blogs and wikis—that make it possible for people to communicate beyond their national borders.
The focus of this study, wiki technology, was designed by Ward Cunningham in 1995 as a tool for collaborative work (Cunningham & Leuf, 2001; Guzdial, Rick, & Kehoe, 2001). Collaboration is defined as working together to achieve collective results that the participants would be incapable of accomplishing working alone (Wikipedia, 2006a). Wikis offer a way to work collaboratively to create web content that is usually created by individuals; as such, they facilitate new ways of social collaboration online (Zaphiris, Ang, & Laghos, in press). Collaborating participants are those who contribute actively to the wiki site. In a wiki site, each contributor can revisit the pages he or she has edited and check the progress of the site. Thus, wiki collaborations are characterized by the sharing of knowledge whenever it is needed and wherever it is located (Lipnack & Stamps, 1997); this is a primary characteristic of computer-supported cooperative work (CSCW).
Although the Internet is a global medium, its users and creators have different backgrounds, live in different environments, and belong to different cultures. Different styles of computer-mediated communication (CMC) among members of different cultures can lead to misunderstandings and problems in communication. Differences in the standards for writing time, dates, addresses, and numbers can also cause confusion; the same goes for differences in symbols, colors, and metaphors. Even a particular style of writing may be considered friendly in one culture and offensive in another (Stengers, De Troyer, Mushtaha, Baetens & Boers, 2004). It is therefore important to study how people from different cultures behave in CMC, such as how they choose representations and how they work together and govern themselves. This can lead to a better understanding of how cultural diversity is spread in online communities and help an Internet user approach users from different cultures in ways that are appropriate to their cultural backgrounds.
This study aims to explore the relatively new research area of cultural differences in wikis through the use of content analysis methods to investigate the behavior of wiki participants. The primary focus of the study is the relations between the patterns of changes on wiki sites and the cultural backgrounds of the contributors. Content analysis methods have been used previously to study wikis (Emigh & Herring, 2005), and the influence of cultural background on web design has also been explored (Callahan, 2005). However, as far as we know, there has been no previous study that combined these two areas to apply content analysis methods to the study of cultural influences on wiki collaborations. Specifically, we investigate the relation between users’ behavior in Wikipedia and their cultural backgrounds as defined by the cultural dimensions proposed by Hofstede (1991).
Wikipedia is a purely online encyclopedia that is implemented through wiki technology and that aims to build information and consensus among community members about different topics. However, because Wikipedia exists in different languages, articles may be created differently across the language versions. An article is developed by modifying the current state of a page through changing the content or adding or deleting information. Since these changes can be tracked and examined, Wikipedia provides a good source of information for investigating cultural differences in CMC.
The following section presents an overview of wiki technology and Wikipedia, cultural issues, and previous research in these areas. The methodology of our investigation is then described. The main question that we address is: How, if at all, do differences in the cultural backgrounds of Wikipedia contributors influence their behavior? An analysis of all edit operations performed on the French, German, Japanese, and Dutch Wikipedia pages about the topic game shows that there are correlations between Hofstede’s cultural dimensions and the nature and frequency of specific edit operations made by contributors.
In 1995, Ward Cunningham invented the wiki as a communication mechanism for groups to create web-based content collaboratively. Wiki technology allows users to create, edit, and distribute content through a web browser. Every user can be an author and can also change text written by others. Each version is saved, so that it is easy to revert to a former version if necessary (Emigh & Herring, 2005).
In a wiki site there is no editorial function that examines the contributions or guarantees the quality and accuracy of its content. It is the responsibility of the users to ensure correctness and their collective responsibility to take care of aspects of policy, such as rules and appropriate behavior in the community (Halvorsen, 2005). As anyone can contribute literally anything, vandalism sometimes occurs in wikis, but it tends to be stopped or reversed quickly (Viégas, Wattenberg, & Dave, 2004).
Wikipedia, the most widely known wiki site, is an open-source, multi-language encyclopedia run by the Wikimedia Foundation (Wikimedia Foundation, 2005), which collects and displays information by using wiki technology (Halvorsen, 2005). According to Jimmy Wales, the founder of Wikipedia, Wikipedia aims to be a high-quality encyclopedia with content made easily available to everyone, ideally in his or her own language (Wales, 2005). Wikipedia started on January 15, 2001 and has experienced exponential growth since. As of September 2006, it included information in 229 languages (Wikipedia, 2006d).
At the time of this writing, the English Wikipedia contained roughly 1,370,000 articles, the French 354,000, the German 459,000, the Dutch 224,000, and the Japanese 252,000 (Wikipedia, 2006e). Articles from one language can be translations of articles originally written in another language, but most often articles are created independently in each language version. Although pictures and articles may be shared among the different language versions, there is not necessarily a connection among them, even for articles on the same topic. Nonetheless, Wikipedia articles can easily be linked to other Wikipedia articles or to external pages on the World Wide Web by users (Bellomi & Bonato, 2005).
Figure 1 shows the English Wikipedia article about the topic game as of June 20, 2005.
Wikipedia contributions are governed by policies describing the purpose and aim of Wikipedia and the kind of information users should include. Important among these is the Neutral Point of View (NPOV) policy, which states that information should be provided in a neutral way without the writer getting engaged personally. This does not mean that a Wikipedia article should represent a single, objective point of view, but rather that it should provide various opinions and perspectives in a balanced, neutral way (Wikipedia, 2006d). All opinions about an issue should be visible, so that the full spectrum of knowledge is present in each article (Reagle, 2005).
The History page of each Wikipedia article tracks all former versions of the article from its creation, thus providing a record of the communication process among the people who contributed to its expansion. Not only are all previous versions of a page recorded, but it is possible to compare different versions and examine all the changes users have made from one version to another (Viégas, et al., 2004). This allows researchers to analyze every edit that was made from the start of the page until the date when data are collected, a process that may provide insight into the nature of contributions to Wikipedia and the dynamics of the groups that build, or contribute to, a certain article (Hassine, 2005).
Previous research on Wikipedia
Several studies of Wikipedia have been conducted to investigate how communities of users create and review content in collaborative online work. Emigh and Herring (2005) analyzed formality and informality in 15 entries in two collaboratively authored online encyclopedias, including Wikipedia, and compared the results with a traditional print encyclopedia whose content is freely available online. They found that the more control is exercised over the contributions by editors (rather than authors), the more standardized and formal the content becomes. They also found that Wikipedia maintains a level of contributions close to print standard, even though it is not edited by any central entity.
Braendle (2005) conducted a content analysis of a sample of 450 articles in the German Wikipedia to investigate article quality. The results indicated that the factors interest (number of edits and unique authors, traffic, age, and number of backlinks) and relevance (especially the numbers of results in Google) have a considerable impact on the quality of an article. The higher the relevance and interest of an article, the better its quality (Braendle, 2005).
Lih (2004) developed a tool for the evaluation of the quality of Wikipedia articles based on metadata, such as the number of edits made and the number of contributors. His study suggests that Wikipedia articles that are cited by the press increase in quality quickly after citation, due to their greater exposure.
Other properties of Wikipedia have been assessed quantitatively by Voss (2005), who measured quantities such as number of edits per author, the total number of authors per article, and the distribution of dead links. Voss found that not only has the number of Wikipedia articles increased exponentially, but the median size of each article has grown linearly over time.
Viégas, et al. (2004) designed a software tool called history flow visualization to investigate the edit patterns and dynamics of articles in Wikipedia. This tool helps to visualize the development of an article and sheds light on issues such as vandalism-repair, peer review, resolution of conflicts, and authorship. For example, the authors found that the average time to reverse an article after a mass deletion (deletion of the whole article) is 2.8 minutes (Viégas, et al., 2004).
Most relevant to the present study, Bellomi and Bonato (2005) carried out a network analysis of the English Wikipedia to gain insights into the overall structure of Wikipedia and any cultural biases in its content. The results showed that there are no islands in Wikipedia, as every article can be reached through links from other articles. They also found that the investigated content was biased towards Western cultures. However, the authors point out that similar investigations in other local Wikipedias are necessary before their results can be generalized (Bellomi & Bonato, 2005).
Theories of cultural variation
The term culture is difficult to define because it has multiple and often conflicting definitions across different scientific disciplines. After an extensive analysis of available definitions and their classification into different categories, Kroeber and Kluckhohn (1952) concluded that “culture consists of patterns, explicit and implicit, of and for behavior acquired and transmitted by symbols, constituting the distinctive achievements of human groups, including their embodiment in artifacts; the essential core of culture consists of traditional (i.e., historically derived and selected) ideas and especially their attached values; culture systems may, on the one hand, be considered as products of action, on the other, as conditional elements of future action” (p. 357).
The roots of cultural differences across nations or societies reach far into the past and can be considered stable in the long term. It can therefore be assumed that they will not change in the near future (Hofstede, 1991). Hofstede applies the term culture primarily to national groups, admitting that while nations are not the best way to study culture, “they are usually the only kind of units available for comparison and [thus are] better than nothing” (Hofstede, 2002, p. 1356).
Between 1967 and 1973, Hofstede collected data from 116,000 IBM employees working in over 70 countries. The 40 largest countries were first analyzed, but the number was later extended to 50 countries and three regions (Hofstede, 2003). From the findings of this research, Hofstede identified four central dimensions of cultural diversity, which he proposed were largely independent. These dimensions, which can be measured across nations and expressed in scales, are named as follows:
• Power Distance
• Collectivism versus Individualism
• Femininity versus Masculinity
• Uncertainty Avoidance
Later, a fifth dimension called short-term versus long-term orientation was added (Hofstede, 1991).
According to Hofstede, cultural measures are not absolute. Rather, a country’s score along one dimension must be seen in relation to the scores of other countries. Furthermore, not every person from a country fits precisely in the cultural typology of that country, although all the members of one cultural society together exhibit trends and tendencies that can be statistically measured and linked to the four dimensions of cultural diversity (Hofstede, 1991). Brief descriptions of the dimensions are given in the following section.
Power distance
Power Distance is “the extent to which the less powerful members of institutions and organizations within a country expect and accept that power is distributed unequally” (Hofstede, 1991, p. 28). It describes the relationship between the higher-ups and lower-downs of a society and how human disparity and differences in power and wealth are dealt with (Hofstede, 1991). For example, the hierarchy in a company, the rate of political centralization, and the expected respect children exhibit towards their teachers can all be influenced by Power Distance (Dahl, 2004).
Hofstede measures Power Distance with the Power Distance Index (PDI). Countries with a higher PDI have a larger inequality in the distribution of power and wealth among their members. People consider a large disparity of power among people as a given and tend to emphasize hierarchical power (Hofstede, 1991). In countries with a lower PDI, emphasis lies on equality, and legitimate or expert power is preferred. The distance between subordinates and leaders is smaller and consultation among them is common (Hofstede, 1991).
Collectivism versus individualism
The Collectivism versus Individualism dimension describes the extent to which members of a culture rely on and have allegiance to either their self or the group (Hofstede, 1991). Hofstede states that “individualism pertains to societies in which the ties between individuals are loose: Everyone is expected to look after himself or herself and his or her immediate family.” In contrast, “collectivism … pertains to societies in which people from birth onwards are integrated into strong, cohesive in-groups, which throughout people’s lifetime continue to protect them in exchange for unquestioning loyalty” (Hofstede, 1991, p. 51).
Hofstede introduced the Individualism Index (IDV), which describes the extent to which a country tends to be individualistic. Countries with a high IDV emphasize the “I,” and the individual identity prevails over the “we” of the group identity. Self-actualization and freedom are important. In relationships, honesty and a strong private opinion are valued (Hofstede, 1991). In countries with a low IDV, social order is community based and the group protects its members in exchange for loyalty. Harmony and consensus within a group are important, and confrontations are avoided (Hofstede, 1991).
Femininity versus masculinity
The Femininity versus Masculinity dimension deals with gender roles and their importance on individual and cultural levels. Cultures are differentiated according to the way in which gender roles are distributed (Hofstede, 1991). Masculinity, according to Hofstede, “pertains to societies in which social gender roles are clearly distinct (i.e., men are supposed to be assertive, tough, and focused on material success whereas women are supposed to be more modest, tender, and concerned with the quality of life).” Femininity, in contrast, “pertains to societies in which social gender roles overlap (i.e., both men and women are supposed to be modest, tender, and concerned with the quality of life)” (Hofstede, 1991, pp. 82-83).
The Masculinity Index (MAS) describes the extent to which a country tends to be masculine. In countries with a high MAS, it is valued to be ambitious, successful, and assertive (Hofstede, 1991). In countries with a low MAS, relationships with other people and the preservation of the environment are important (Hofstede, 1991).
Uncertainty avoidance
Uncertainty Avoidance describes the extent to which people feel anxious or uneasy in unfamiliar or unpredictable situations. When Uncertainty Avoidance is strong, there is a need for structure, strict rules of behavior, and a belief in an absolute truth to avoid ambiguous situations (Hofstede, 1991). “Uncertainty Avoidance can therefore be defined as the extent to which the members of a culture feel threatened by uncertain or unknown situations. This feeling is, among other things, expressed through nervous stress and in a need for predictability: A need for written and unwritten rules” (Hofstede, 1991, p. 113).
Hofstede measures Uncertainty Avoidance with the Uncertainty Avoidance Index (UAI). In countries with a high UAI, structure, hard work, precision, and punctuality are desired. Aggressions are allowed to occur on certain occasions. Different behaviors and opinions are approached with intolerance (Hofstede, 1991). A lower UAI leads to tolerance towards different and unfamiliar situations. Members accept unfamiliar risks and have a more tolerant attitude towards differing behaviors and opinions (Hofstede, 1991).
Cultural differences in web usage
A number of studies have been carried out to explore the cultural diversity of the World Wide Web and to determine what impact cultural differences have on website design, usability, e-commerce, and computer-mediated communication.
Barber and Badre (1998) found that members of different cultural groups prefer different icons, colors, and site structures. Design elements that occur often in one group but are less prevalent or absent in other groups are called cultural markers. Sheppard and Scholtz (1999) found that the use of appropriate cultural markers increases the usability of the web sites that they tested. Investigating American and Chinese websites, Singh, Zhao and Hu (2003) came to the conclusion that “the web is not a culturally neutral medium, but it is full of cultural markers that give country-specific websites a look and feel unique to the local culture” (p. 63).
Schmid-Isler (2000) investigated Western and Chinese news sites and argued that their layout is different because of culturally-influenced perceptions of information storage and display. Marcus and Gould (2000) applied Hofstede’s dimensions to website design and identified different aspects of design that are influenced by the score of a country along a specific dimension. They note, for example, that “hierarchies in mental models,” “prominence given to leaders,” and “importance of security and restrictions” are related to the Power Distance of a country (Marcus & Gould, 2000, p. 36). Sheridan (2001) examined websites in order to derive guidelines for designing sites for different cultures based on the relationship of visual design elements to Hofstede’s dimensions.
Cultural differences can also be observed in e-commerce websites. According to Chua, Cole, Massey, Montoya-Weiss, and O’Keefe (2002), the success of an e-business depends on the level to which cultural differences are understood. E-commerce websites should be adapted to the local market and should approach members or companies from different countries in accordance with their cultural backgrounds. Singh and Baack (2004) compared Mexican and United States e-commerce sites to examine whether they depicted cultural values differently. They grouped website content into four categories in accordance with the four dimensions of Hofstede (e.g., Company Hierarchy Information or Information about quality and awards in the dimension Power Distance) and investigated whether the content matched the score of the country in the specific dimensions. They found that Mexican websites, in comparison to U.S. websites, showed more content related to Collectivism, Masculinity, and Power Distance. These findings reflect the relative scores of the two countries along these dimensions as assigned by Hofstede.
Similarly, Robbins and Stylianou (2002) analyzed the frequency of occurrence of specific elements in commercial websites and found many elements that can be related to Hofstede’s dimensions. For example, they found a relation between Power Distance and the frequency of organizational charts on websites. Callahan (2005) correlated the frequency of graphical elements on university websites with the countries’ scores along Hofstede’s dimensions, and found correlations supporting the hypothesized relations. Tsikriktsis (2002) investigated the relationship between quality expectations toward a website and culture, again using Hofstede’s dimensions. He found that the scores of a country in masculinity and long-term orientation are associated with higher expectations of website quality.
Other studies have examined the effects of cultural differences in people’s behavior when they communicate through the Internet. Cakir and Cagiltay (2002) found that cultural differences lead to some differences in style of communicating via email, but they also stated that cultural norms in email are not as apparent as in face-to-face environments. Additionally, collaborative or group work via the Internet may be influenced by the culture of the group members. Wilson, et al. (2002) found that the perception of an online group process is influenced by cultural factors that are consistent with Hofstede’s dimensions but can also be explained by other factors, such as the time frame of the CMC and the commonality of the language in use.
By using Hofstede’s dimensions, we attempt to explain the relations between the patterns of changes in Wikipedia and the cultural background of the contributors. We hypothesize that Hofstede’s dimensions, which provide insight into how people act in real life, are also applicable to CMC and can therefore also provide insight into people’s contributions to Wikipedia. Our assumptions about the differences of contributions in Wikipedia for different cultures are articulated as hypotheses below.
Power distance index
It can be inferred that in countries with a higher PDI, most people are used to following orders. Important or powerful decisions are made by a few powerful people and not by the citizens themselves (Hofstede, 1991). Deleting actions are powerful actions because the user enforces his or her opinion and declares the opinion of the former user wrong. People from a country with a relatively high PDI are not used to taking the initiative and do not feel that they have the power or right to make the decision of deleting somebody else’s work. We therefore expect that deletions are less likely in countries with a relatively high PDI.
H1: There will be a negative correlation between the PDI and the number of deleting actions.
We further expect that there will be a negative correlation between the number of corrective actions and the PDI, as people who are afraid of making mistakes and are obedient towards their superiors feel reluctant to correct another member’s work.
H2: There will be a negative correlation between the PDI and the number of corrective actions.
Individualism index
According to Hofstede (1991), members of a country with a relatively high IDV stress individual goals and opinions more than group interests. In contrast, the lower the IDV of a country, the more important the interests and work of the group are. Therefore, it can be argued that a low score in IDV goes along with an increasing interest in being part of a group (Wikipedia, in this case) and also with an increasing number of contributions (as the interest of the group is to expand the Wikipedia page). We therefore expect a higher number of actions adding content to Wikipedia in collectivistic countries.
H3: There will be a negative correlation between the IDV and the number of adding actions.
In countries with a high IDV, individual opinions are stressed. Corrective actions stress the opinion of the individual who corrects or changes a version written by another member. By correcting another member, the individual opinion prevails over the group consensus. We therefore expect a positive correlation between the IDV and the number of corrective actions.
H4: There will be a positive correlation between the IDV and the number of corrective actions.
Masculinity index
In countries with a relatively high MAS, success and progress are important, and the society tends to be more corrective (Hofstede, 1991). Therefore, we expect a positive correlation between the MAS and corrective actions.
H5: There will be a positive correlation between the MAS and the number of corrective actions.
Uncertainty avoidance index
Members of countries with a high UAI tend to have a constant subjective feeling of anxiety. They feel uncomfortable towards unfamiliar risks and try to avoid them (Hofstede, 1991). It is therefore expected that the correlations between all investigated categories and the UAI will tend to be negative. We expect that, since every edit is in a way an unpredictable action that carries the possibility of an error, people will feel a constant insecurity and make fewer total contributions.
H6: There will be a negative correlation between the UAI and the number of contributions to Wikipedia.
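The six hypotheses can be summarized as expected correlation signs. The following is a minimal sketch, not part of the original study: the dictionary layout and the helper `direction_matches` are hypothetical names introduced for illustration, and the labels for the action groups simply repeat the wording of H1 to H6. Agreement in sign says nothing about statistical significance.

```python
# Expected sign of each hypothesized correlation, taken from H1 to H6 above.
# Tuple layout (an assumption for this sketch): (index, group of actions, sign).
HYPOTHESES = {
    "H1": ("PDI", "deleting actions", -1),
    "H2": ("PDI", "corrective actions", -1),
    "H3": ("IDV", "adding actions", -1),
    "H4": ("IDV", "corrective actions", +1),
    "H5": ("MAS", "corrective actions", +1),
    "H6": ("UAI", "all contributions", -1),
}


def direction_matches(hypothesis, observed_r):
    """Return True if an observed correlation has the hypothesized sign.

    A coefficient of exactly zero matches neither direction.
    """
    _, _, expected_sign = HYPOTHESES[hypothesis]
    return observed_r * expected_sign > 0


# Illustrative coefficient only, not a result from the study.
print(direction_matches("H1", -0.4))
```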
Sample and data source
To test our six hypotheses, we traced the evolution of a Wikipedia page in several language versions and examined the contributions of the participants involved in developing each version. The contributions were sorted into categories, and the number of contributions in each category was correlated with Hofstede’s four dimensions in order to investigate differences in contributions to Wikipedia by people from different cultures.
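The correlation step described above can be illustrated with a short sketch. It assumes a Pearson product-moment coefficient, which is an assumption of this example (the text states only that counts were correlated with the dimensions). All numbers are placeholders chosen for illustration; they are neither Hofstede’s published scores nor counts from the study, and the function name `pearson_r` is hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Placeholder values only: NOT published index scores, NOT the study's data.
index_scores = {"French": 60, "German": 30, "Japanese": 50, "Dutch": 40}
category_counts = {"French": 12, "German": 30, "Japanese": 20, "Dutch": 24}

# Pair the two variables by language version before correlating.
languages = sorted(index_scores)
r = pearson_r(
    [index_scores[lang] for lang in languages],
    [category_counts[lang] for lang in languages],
)
print(f"r = {r:.2f}")
```

With four language versions, each coefficient rests on only four paired observations, so a sketch like this shows the arithmetic rather than the strength of any conclusion.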
One precondition of our study was the ability to understand the language of each version of Wikipedia under investigation. Another precondition was that the pages we were to study should have a large enough number of contributions to enable us to do meaningful statistical comparisons. For these reasons, we chose the French, German, Japanese, and Dutch Wikipedia pages on the topic game as the source of data for this study.
The topic game exists as a single, distinct page in every Wikipedia version of the chosen languages. For data collection, we analyzed the History section of each page (up to June 30, 2005), which saves and lists all edits made to the page.
Figure 2 shows the history entry for the German language game page that was investigated. It displays a list of all the versions of the page from its creation to its current version. To identify the differences between two versions of one page, each page must be selected and compared through use of the history page (Figure 3).
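Comparing two versions of a page amounts to computing a line-level difference between two revisions. The sketch below uses Python’s standard `difflib` module; it is not the comparison tool of the Wikipedia history page, and the function name `changed_lines` and the two sample revisions are made up for illustration. Assigning each change to a category remains a manual coding step.

```python
import difflib


def changed_lines(old_text, new_text):
    """Return (removed, added) lines between two revisions of a page."""
    removed, added = [], []
    for line in difflib.ndiff(old_text.splitlines(), new_text.splitlines()):
        if line.startswith("- "):
            removed.append(line[2:])   # line present only in the old revision
        elif line.startswith("+ "):
            added.append(line[2:])     # line present only in the new revision
        # lines starting with "  " (unchanged) or "? " (hints) are skipped
    return removed, added


# Invented sample revisions: one spelling fix and one added sentence.
old_revision = "A game is a recreational activty.\nSee also: Play"
new_revision = (
    "A game is a recreational activity.\n"
    "See also: Play\n"
    "Games usually have rules."
)

removed, added = changed_lines(old_revision, new_revision)
print("removed:", removed)
print("added:", added)
```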
We examined all history entries for the four selected language versions up to June 30, 2005. Content analysis was used to categorize all changes, such as whether the change was in content, a deletion, or a grammar correction. To arrive at a set of categories, we followed the process of grounded theory (Glaser & Strauss, 1967). We processed the pages and extracted possible categories as they emerged. By doing this several times in an iterative cycle, the categories were refined according to the data until saturation was reached. All four language versions were included in the process to make sure that the categories cover the full range of possible data variations. To minimize researcher bias, two researchers extracted the categories together. Possible ambiguities and misinterpretations were resolved through discussion. The set of categories was then used for classification. The fact that the categories emerged from the data made it easier to sort the changes into the appropriate category. Thirteen categories were identified that fully describe the data; these are listed in Table 1.
Table 1. Summary of categorization
Name of Category: Description
Add Information: Addition of topic-related information (the information must not consist only of links).
Add Link: Addition of links to an existing set of listed links or linking of a word within an existing sentence to a page (links to other Wikipedia pages or to external Internet pages).
Clarify Information: Rewording of existing information without adding new information. Rewording done in order to clarify the content (e.g., substitution of certain words for a better understanding, change of the word order or deletion/addition of words in order to clarify).
Delete Information: Deletion of topic-related information (the information must not consist only of links).
Delete Link: Deletion of links from the set of listed links or removal of the linking function from a word within an existing sentence (links to other Wikipedia pages or to external Internet pages).
Fix Link: Modification of an existing link (can be an alteration of the linked URL or the name of the link).
Format: Contributions that affect the appearance or structure of the whole page (e.g., addition of space lines, sorting/moving of paragraphs or links and addition of subtitles in order to structure the content).
Grammar: Alterations of the grammar (e.g., change of punctuation).
Mark-up Language: Changes in the mark-up language that have no impact on the appearance of the page or the text (e.g., both '''example''' and <b>example</b> print the word example in bold face).
Reversion: Reversion of the page to a former version (in order to reverse vandalism, or to reverse certain activities of users or to go back several versions in one step).
Spelling: Correction of spelling mistakes (e.g., reversed letters or capitalization).
Style/Typography: Contributions that affect the presentation or appearance of the text (e.g., bold/italic/underlined text).
Vandalism: Entries/actions that are made in order to demolish the page (e.g., deletion of the whole content of a page, addition of sentences that are clearly not related to the topic of the site or addition of links to external sites that cannot be associated with the topic).
To determine the different styles of contributions in the different language versions, the number of changes that were made in particular categories was recorded separately for every language version. Each edit was examined and the actions were recorded and sorted in the categories. Note that a single edit may fit in several categories, as people usually made several changes in one editing session. For example, a user could add new information to the article and correct a spelling mistake in existing information in one edit. In this case, we would record that the user made contributions in the categories Add Information and Spelling.
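The tallying rule described above, where one edit can count toward several categories, can be sketched in a few lines of Python. This is only an illustration: the `coded_edits` list and the `tally_changes` helper are hypothetical names and invented data, not the coding sheets used in the study.

```python
from collections import Counter

# Hypothetical coded edits (invented for illustration). Each edit is the
# set of categories that the coders assigned to it.
coded_edits = [
    {"Add Information", "Spelling"},   # new text plus a spelling fix
    {"Add Link"},
    {"Spelling"},
    {"Add Information", "Format", "Grammar"},
]

def tally_changes(edits):
    """Count changes per category; one edit may add to several categories."""
    counts = Counter()
    for categories in edits:
        counts.update(categories)  # adds 1 for each category in the edit
    return counts

counts = tally_changes(coded_edits)
total_changes = sum(counts.values())
print(counts["Add Information"], counts["Spelling"], total_changes)
```

Because an edit can fall into more than one category, the total number of changes can exceed the number of edits (here, four edits yield seven changes).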
Hofstede investigated cultural differences across different nations (Hofstede, 1991). For the present investigation, it is assumed that the French Wikipedia represents the culture that Hofstede characterized as the culture in France, the German version represents the culture found in Germany, the Japanese version represents the culture in Japan, and the Dutch version represents the culture in the Netherlands. We are aware that using language as a form of cultural representation may be problematic, especially for French and German, as French is spoken in 29 different countries and German in six (Wikipedia, 2006b, 2006c), and some of the countries that speak the same language have different scores along Hofstede’s dimensions. However, as there is no other way to pinpoint the cultural background of Wikipedia contributors precisely, this exploratory study makes the assumption that language is highly correlated to culture (Jiang, 2000).
The scores of the relevant countries along each of Hofstede’s dimensions were correlated with the relative number of contributions in each category (the number of contributions in that category divided by the total number of contributions in the game page of that language version). Pearson’s correlation coefficient was used to quantify the correlation.
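The calculation described above can be sketched as follows. The sketch assumes four language versions and uses invented placeholder numbers throughout; the scores and counts below are neither Hofstede's published scores nor the study's data, and the function names are hypothetical.

```python
from math import sqrt

def relative_share(count, total):
    """Relative number of contributions: category count over total changes."""
    return count / total

def pearson_r(xs, ys):
    """Pearson's product-moment correlation coefficient for paired samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Placeholder inputs (assumptions for illustration only).
languages = ["fr", "de", "ja", "nl"]
dimension_score = {"fr": 70, "de": 35, "ja": 55, "nl": 40}
category_count = {"fr": 2, "de": 6, "ja": 3, "nl": 5}
total_changes = {"fr": 100, "de": 100, "ja": 50, "nl": 50}

scores = [dimension_score[lang] for lang in languages]
shares = [relative_share(category_count[lang], total_changes[lang])
          for lang in languages]

r = pearson_r(scores, shares)
print(f"r = {r:.2f}")
```

Using relative shares rather than raw counts keeps a language version with many edits from dominating the comparison.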
The French article about game included 228 versions, the German article 155 versions, the Japanese article 70 versions, and the Dutch article 47 versions. In total, 952 changes were analyzed. Additional recorded data were the name/IP address of the user and the date of the edit. From these the number of contributors was extracted (see Table 2). The large number of contributors supports the assumption that the measured behavior does not reflect the work of just a few individuals.
Table 2. Percentage of changes in each category and each language
Total no. of changes
No. of contributors
Table 2 shows the percentage of changes that fell under each of the 13 categories for each language version.
A Pearson’s product moment correlation was calculated between the relative percentage of changes under each category and the score of the countries along the particular dimension of Hofstede. For this exploratory study, it is assumed that correlations with absolute values greater than 0.5 are sufficient to indicate the tendency of the relation between the category and the score in Hofstede’s dimensions. Table 3 shows the values of Hofstede’s dimensions for the four countries under investigation.
Table 3. Score of the countries in Hofstede’s dimensions (Hofstede, 1991)
Power Distance Index (PDI)
Individualism Index (IDV)
Masculinity Index (MAS)
Uncertainty Avoidance Index (UAI)
Table 4 shows the relevant correlations, along with their associated p values, found between the cultural dimension scores of each country and the number of contributions in some of the investigated categories. The categories with an absolute correlation coefficient less than 0.5 (Add Link, Format, Style/Typography, Reversion, and Vandalism) were not considered relevant and are excluded. Although many correlations are not significant (mostly due to the small number of countries investigated), the analysis identifies trends that are worth considering. In addition to the statistically significant correlations (p < 0.05), these trends will be used to support or reject our hypotheses.
Table 4. Summary of strong correlations
Dimensions: Power Distance Index (PDI), Individualism Index (IDV), Masculinity Index (MAS), Uncertainty Avoidance Index (UAI)
Add Information: IDV −0.98 (p = 0.008); MAS 0.99 (p = 0.004); UAI 0.72 (p = 0.141)
Clarify Information: IDV −0.69 (p = 0.155); MAS 0.85 (p = 0.076)
Delete Information: PDI −0.72 (p = 0.142)
Delete Link: PDI −0.91 (p = 0.047); UAI −0.64 (p = 0.178)
Fix Link: IDV 0.84 (p = 0.082); MAS −0.72 (p = 0.139)
Grammar: IDV 0.67 (p = 0.163); MAS −0.62 (p = 0.192)
Mark-up Language: UAI −0.52 (p = 0.242)
Spelling: PDI 0.64 (p = 0.180); IDV 0.57 (p = 0.215); MAS −0.63 (p = 0.183)
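The article does not say how the p values in Table 4 were computed. The sketch below checks one plausible reading: a one-tailed t test on four data points (two degrees of freedom). This is an inference from the reported numbers, not something the article states, and the function names are hypothetical. For two degrees of freedom the t distribution has a closed-form CDF, so no statistics library is needed.

```python
from math import sqrt

def t_statistic(r, n):
    """t statistic for a Pearson correlation r computed from n pairs."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)

def one_tailed_p_df2(t):
    """Upper-tail probability of |t| under Student's t with 2 d.o.f.

    Uses the closed-form CDF F(t) = 1/2 + t / (2 * sqrt(2 + t^2)).
    """
    a = abs(t)
    return 0.5 - a / (2 * sqrt(2 + a * a))

# (r, reported p) pairs copied from Table 4.
reported = [(-0.98, 0.008), (0.99, 0.004), (-0.91, 0.047),
            (0.72, 0.141), (0.57, 0.215), (-0.52, 0.242)]

N_COUNTRIES = 4  # France, Germany, Japan, the Netherlands

for r, p_reported in reported:
    p = one_tailed_p_df2(t_statistic(r, N_COUNTRIES))
    print(f"r = {r:+.2f}  computed p = {p:.3f}  reported p = {p_reported:.3f}")
```

With four data points this one-tailed p value reduces to (1 − |r|)/2, so the computed values agree with the reported ones to within the rounding of r. Under the same assumption, p < 0.05 requires |r| above 0.9, which only the coefficients −0.98, 0.99, and −0.91 reach.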
Power distance index
As shown in Table 4, the Power Distance Index (PDI) correlates negatively with the category Delete Information (−0.72) and has a significant negative correlation with Delete Link (−0.91) (see Figure 4). This means that the higher the PDI of a country, the fewer deletions are made in that particular Wikipedia page. In countries with a high PDI, such as France, people are more likely to have decisions taken by superiors and are reluctant to invalidate someone else’s work (Hofstede, 1991). Hypothesis H1 (There will be a negative correlation between the PDI and the number of deleting actions) is thus supported by the data.
Although the correlation between PDI and the category Spelling (+0.64) (see Figure 5) is not significant, a clear trend can be seen. The higher the PDI of the country, the more contributions there are in the category Spelling.
This trend did not match our expectations. As contributions in the category Spelling are in most cases corrections, we expected a negative correlation between this category and PDI, as people who are afraid of making mistakes and are obedient towards their superiors would be reluctant to correct another member’s work. Thus, hypothesis H2 (There will be a negative correlation between the PDI and the number of corrective actions) was not supported. One possible explanation for the trend towards more corrections in countries with higher PDI could be that spelling mistakes are objective and can be checked against a dictionary and therefore do not depend on the superior’s privileged judgement. People from a country with a large PDI could therefore feel confident in correcting spelling mistakes because it is obvious what is spelled right or wrong.
Also note that some language versions could have had fewer spelling errors in them in the first place, which might have influenced this trend. Furthermore, contributions in the category Spelling occurred very often in the French version, and France scores relatively high in the PDI. The French language requires special characters (e.g., accents), which can only be typed with a special keyboard driver. Consequently, many contributions in the category Spelling of the French Wikipedia consist in the addition of accents, which might have been omitted due to the keyboard configuration of the user’s computer. A more detailed analysis of this particular issue seems promising as a topic for future research.
Individualism index
The Individualism Index (IDV) was found to have a significant negative correlation with the number of contributions in the category Add Information (−0.98), and a similar trend can be seen regarding Clarify Information (−0.69) (see Figure 6). The higher the IDV of a country, the less likely its people are to add or clarify information; conversely, the lower the IDV, the more contributions can be found in the categories Add Information and Clarify Information. Thus, hypothesis H3 (There will be a negative correlation between the IDV and the number of adding actions) was supported by our analysis. Additionally, the number of contributions in Clarify Information was found to be higher for countries with lower IDV. Both of these categories, Add Information and Clarify Information, contribute to the growth and development of the Wikipedia page. In contrast to corrections and deletions, these contributions stress the common work and consensus of the group and are therefore expected in more collectivistic countries.
Although the correlation of the IDV with the category Fix Link (+0.84) was not significant, a clear trend can be observed. The same trend is present for the categories Grammar (+0.67) and Spelling (+0.57) (see Figure 7). The higher the IDV of a country, the more contributions are made by members of the particular Wikipedia page in the corrective categories Fix Link, Grammar, and Spelling. Hypothesis H4 (There will be a positive correlation between the IDV and the number of corrective actions) tended to be supported by the data.
Masculinity index
The category Add Information (+0.99) was found to have a significant positive correlation with the Masculinity Index (MAS) (see Figure 8). A similar trend can be observed for Clarify Information (+0.85). The higher the MAS of a country, the more contributions in the categories Add Information and Clarify Information are found. Although we did not predict this outcome, a possible explanation could be that in countries with a relatively higher MAS (e.g., Japan), success and progress are more important than in countries with a lower MAS (Hofstede, 1991). Both the Add Information and Clarify Information categories contribute to the growth and development of the Wikipedia page. An increasing number of contributions in these categories shows increasing interest in growth, knowledge, and progress, which is characteristic of cultures with a relatively high MAS.
The negative correlations of the MAS with the categories Fix Link (−0.72), Grammar (−0.62), and Spelling (−0.63) (see Figure 9) were not found to be significant, but a trend is clearly observable. The higher the MAS of a country, the less likely corrective contributions are within Fix Link, Grammar, and Spelling. This trend contradicts our fifth hypothesis H5: There will be a positive correlation between the MAS and the number of corrective actions. We expected a positive correlation between the MAS and corrective categories such as Fix Link, Grammar and Spelling. Surprisingly, the opposite tendency was found. A higher MAS seems to accompany a decrease in the number of contributions in corrective categories. We have no explanation at present for these observed trends.
Uncertainty avoidance index
Although the correlation between the Uncertainty Avoidance Index (UAI) and the category Add Information (+0.72) (see Figure 10) was not found to be significant, a clear trend is observable. The higher the UAI of a country, the larger the number of contributions is in the category Add Information.
Reversed trends were found between the UAI and the categories Delete Link (−0.64) and Mark-up Language (−0.52) (see Figure 11). The higher the UAI of a country, the less likely contributions are in the categories Delete Link and Mark-up Language.
The observed trends between the UAI and the categories Delete Link and Mark-up Language are in line with our sixth hypothesis, H6: There will be a negative correlation between the UAI and the number of contributions to Wikipedia.
This is not true for the positive trend of the UAI with the number of contributions in the category Add Information. A possible explanation for this could be the strong dependence of people on rules in countries with a higher UAI (e.g., Japan), as rules attempt to eliminate uncertainties in other people’s behavior (Hofstede, 1991). In Wikipedia, the rule is that the encyclopedia should grow and expand. To follow the rule, members have to contribute by adding information. It can be argued that the higher the UAI, the more dependent the members are on rules and thus the more they feel obliged to help the page grow. Therefore they contribute a lot in the category Add Information. This would explain the observed trend between the UAI and the number of contributions in the category Add Information.
The findings of this exploratory study show that content analysis methods can be useful for investigating cultural differences in wiki communities. The methodology further demonstrated that valuable information can be extracted from the history page of a wiki, by categorizing and then relating it to cultural dimensions.
The study shows that the Internet, and Wikipedia in particular, is not a culturally neutral space, but that differences in behavior across cultures can be observed. The number and strength of the correlations between changes made in Wikipedia and Hofstede’s cultural dimensions show that cultural differences that are observed in the real world can be related, carefully, to the virtual world. Our study thus enhances the validity of previous studies that have observed cultural differences on the World Wide Web (e.g., Singh & Baack, 2004; Singh, Zhao, & Hu, 2003; Tsikriktsis, 2002).
These findings give rise to implications regarding how aspects of collaborative online work are influenced by pre-existing cultural differences. For example, as indicated by the significant negative correlation between the Power Distance Index and the category Delete Link, as well as the similar trend in Delete Information, people of a country with a high Power Distance Index, such as the French, are likely to feel uncomfortable about deleting others’ work. It is therefore advisable not to expect or require it of them in collaborative online work.
These findings provide useful indications for understanding the behavior of people from another culture in cross-cultural online communication (e.g., online communities with international members). People from a given culture are likely to have attributes and behaviors concerning online communication according to their cultural background. If we understand the way people behave in online communication, the effectiveness of this communication or work can be increased and misunderstandings and problems may be minimized.
The knowledge gained from this project also has implications for how to improve the design of online communities, as it is advisable to consider cultural differences and approach the community according to the cultural backgrounds of the members. One should offer communication and collaborative work tools suitable to the cultural preferences of the users. For example, if the users of a community come from a masculine country such as Japan, they—according to our data—are likely to be more active in adding and clarifying information. This can be interpreted as their desire to make the content grow, develop, and succeed. The design of a community should provide functions in the community that support this motivation by, for example, identifying the total number of edits made (“you have made a total of 25,434 edits”) or giving rewards in the form of rank to members who contribute frequently.
People from a country with a high Power Distance are, according to the data, more reluctant to delete information. We believe the reason is that deletions are powerful actions that people from such countries do not think they have the right or privilege to perform. For the design of a community for these people it might therefore be advisable to provide a function similar to deletion but to use terminology that makes it appear a less powerful action, e.g., hide information instead of delete it.
This study is subject to several limitations. Concerning the data, it must be mentioned that when Hofstede originally evaluated the scores for his dimensions, he investigated the former Federal Republic of Germany, as his survey took place before the German Unification in 1990. Our study included the German Wikipedia where we assumed that Germans would comprise the majority of contributions. As the former two parts of Germany had completely different governments, there might now be a different score in Hofstede’s dimensions for the united Germany.
The fact that French is also the official language of 28 countries other than France, and that German is spoken in six countries, might also have influenced the results. As mentioned before, the representation of culture through language is somewhat problematic, but still the best way to study the cultural issue in Wikipedia contributions. Obviously, further research is needed in this very interesting area.
Furthermore, we investigated only the language versions of articles on the topic game. As Viégas, et al. (2004) noted, the topic of an article can influence the nature of contributions. Therefore, the results of this study may not generalize to Wikipedia articles on other topics. Using multiple articles would increase the number of contributors and make the data more representative of Wikipedia in that language. Emigh and Herring (2005), for example, analyzed data from 15 Wikipedia entries and compared them to the analogous 15 entries of three other encyclopedias to validate their findings. This gives an indication of the possible scope for a follow-up study with articles on multiple diverse topics.
Further work is required with respect to the categories where no clear correlations were found. The lack of correlation could be due to the fact that only Hofstede’s dimensions were analyzed. One might therefore investigate possible correlations of these categories with additional cultural dimensions that can be found in the literature (e.g., Trompenaars & Hampden-Turner, 1997).
Recommendations for future studies
This study and framework need to be validated and tested under different conditions. Only one topic (game) was analyzed. It would therefore be essential to know if similar cultural differences are also evident in other articles. Another possibility could be to extend the number of investigated countries or to investigate completely different countries and see if the results are similar. As this was an exploratory study, we used only a small sample of four language editions. As such, the statistical correlations are only indicative of a tendency, and the findings should be further verified by conducting a power analysis with a larger sample.
A verification experiment could also be performed in the form of a controlled experiment, where a number of participants build up a wiki-based collaboration in a controlled environment. This would allow elimination of most of the problems stated above, and help to verify and enhance the findings of this study.
It would also be interesting not just to look at the changes people make in Wikipedia, but to investigate the outcome, in other words, the article itself. Does the structure of the article vary across different cultures? Are there differences in the content of articles on the same topic? Do particular cultures tend to make more contributions on particular topics?
Finally, it would be useful to replicate this analysis by studying other tools for online communication or communities such as chats, newsgroups, message boards, or blogs, to discover if, and if so, to what extent, communication tools affect the expression of cultural differences.
About the Authors
Ulrike Pfeil is a Ph.D. student at the Centre for Human-Computer Interaction Design, School of Informatics at City University, London. Her research interests include social aspects of computing, especially in computer-mediated communication.
Address: The Centre for Human-Computer Interaction Design, City University London, EC1V 0HB, UK
Panayiotis Zaphiris is a Senior Lecturer at the Centre for Human-Computer Interaction Design, School of Informatics of City University, London. His research interests lie in HCI with an emphasis on inclusive design and social aspects of computing. He is especially interested in HCI issues related to the elderly and people with disabilities.
Address: The Centre for Human-Computer Interaction Design, City University London, EC1V 0HB, UK
Chee Siang Ang is a Ph.D. student at the Centre for Human-Computer Interaction Design, School of Informatics of City University, London. His research interests include the psychology and sociology of computer games, including new forms of CMC communication such as MMORPGs.
Address: The Centre for Human-Computer Interaction Design, City University London, EC1V 0HB, UK