The Web has become a major source of information in the developed world, answering many of people’s information needs in their everyday, personal, and professional lives. According to recent estimates, the size of the Web stands at 11.5 billion indexable documents (Gulli & Signorini, 2005). Because of the vast amount of information available, special tools are needed to locate information on the Web. These tools are primarily the major commercial search engines: Google, Yahoo!, and Windows Live. According to a report by comScore (2007), in July 2006, Google’s market share of searches was 47.3%, and Google, Yahoo!, and Windows Live (MSN) together accounted for 86.3% of the searches on the Web in the United States.
Search engines are supposed to be unobtrusive tools for information retrieval. However, in reality they have considerable influence on the Web. Businesses, organizations, and individuals invest time and money so that their Web pages are placed high for certain queries; an entire new industry focusing on search engine optimization is based on this idea (SEMPO, 2004). One of the major activities in this new field concerns Organic Search Engine Optimization, which, according to the Search Engine Marketing Professional Organization’s report, is “[t]he practice of using a range of techniques, including augmenting HTML code, web page copy editing, site navigation, linking campaigns and more, in order to improve how well a site or page gets listed in search engines for particular search topics” (SEMPO, 2004, p. 4).
Search engines are the primary tools for finding information on the Web; they are extremely powerful, since they decide what to index and how to rank the indexed results for specific queries. “Without much exaggeration one could say that to exist is to be indexed by a search engine” (Introna & Nissenbaum, 2000, p. 171). In order to be included in Google’s index, users can submit their sites (http://www.google.com/addurl/), or they can wait for the Google crawler to discover their new page/site. Crawlers are programs that cover the Web, starting from a “seed” (a set of initial URLs) and then following links found on those pages (see, e.g., Levene, 2006). Submission does not guarantee inclusion; thus it is important to have links to the site in order to be discovered.
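The discovery process described above, in which a crawler starts from a seed and follows links, can be illustrated with a minimal sketch. It is not a description of Google’s crawler: the function name `crawl`, the in-memory `toy_web` dictionary, and the `.example` page names are all hypothetical, and a real crawler would fetch pages over the network rather than read them from a dictionary.

```python
from collections import deque


def crawl(link_graph, seeds):
    """Return the pages reachable from the seeds, in breadth-first order.

    link_graph maps each page to the list of pages it links to.
    This is a toy, in-memory stand-in for fetching and parsing real pages.
    """
    queue = deque(dict.fromkeys(seeds))  # drop duplicate seeds, keep order
    seen = set(queue)
    discovered = []
    while queue:
        page = queue.popleft()
        discovered.append(page)
        for target in link_graph.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return discovered


# Hypothetical toy Web: "orphan.example" links out, but nothing links to it.
toy_web = {
    "seed.example": ["a.example", "b.example"],
    "a.example": ["c.example"],
    "b.example": ["a.example"],
    "c.example": [],
    "orphan.example": ["a.example"],
}

found = crawl(toy_web, ["seed.example"])
print(found)
```

In this toy example the page without inbound links is never reached, which is the reason inbound links matter for being discovered at all.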
Links are important for the ranking process as well. One of the ingredients in Google’s ranking algorithm is PageRank (Brin & Page, 1998; Google, 2007). The PageRank of a page is determined by the quantity and quality of links pointing to it. The quality of a link is based on the PageRank of the source of the link (for a detailed explanation, see Levene, 2006). For most queries there are thousands of results, which only underscores the importance of ranking. Previous studies have shown that most users view only the first results page. For example, in a study of the search engine AlltheWeb in 2002, 76.3% of the users viewed the first results page only (Spink & Jansen, 2005). A recent eye-tracking study by Enquiro (2005) showed that users concentrate only on the top three results. Thus links are a key factor in locating information on the Web; Walker (2005), in discussing the economic power of links, considers them the “currency of the Web” (p. 524). Hargittai (2004) provides advice for non-profits on how to improve visibility in the online landscape. Her advised strategies include cross-linking among similar sites and linking to the welcome page from every page of the site (this is called self-linking).
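The idea that a page’s PageRank depends on both the number of pages linking to it and the PageRank of those pages can be illustrated with a simplified iterative computation. The sketch below follows a common textbook formulation and is not Google’s actual ranking algorithm; the damping value of 0.85, the number of iterations, the handling of pages without outgoing links, and the toy graph of blogs are assumptions made for illustration only.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of rank along links.

    links maps each page to the list of pages it links to.
    damping and iterations are illustrative defaults, not Google's settings.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for source in pages:
            targets = links.get(source, [])
            if targets:
                # A page passes its rank on, split evenly among its outlinks.
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outlinks spreads its rank evenly.
                share = damping * rank[source] / n
                for page in pages:
                    new_rank[page] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: three blogs link to "target", one also links to "rival".
toy_links = {
    "blog1": ["target"],
    "blog2": ["target"],
    "blog3": ["target", "rival"],
    "target": [],
    "rival": [],
}

ranks = pagerank(toy_links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy graph the page with the most inbound links ends up with the highest score, which is the property that coordinated linking campaigns rely on.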
Thus Web page owners, commercial and non-profit, try to please the search engines in order to enhance their visibility on the Web. As Introna and Nissenbaum (2000) concluded, search engines are far from being unobtrusive and objective; they influence what we see on the Web, and they themselves are influenced by the “collective preferences of seekers … [and] tend to cater to majority interests” (p. 177).
Search engines such as Google not only influence the business landscape, they also have social implications. The verb “Google,” according to the Oxford English Dictionary (2006), has two meanings: 1) “[t]o use the Google search engines to find information on the Internet,” and 2) “[t]o search for information about (a person or thing) using the Google search engine.”
The remaining sections of the article first discuss “Google bombs,” a method to manipulate search results, specifically as they relate to the results of Google. Next, details about specific Google bombs, data collection, and analysis are provided. The results and discussion sections present the findings and discuss the fate of the analyzed Google bombs from a time perspective: Are Google bombs short lived, or are they effective for long periods of time? On the Internet “a long time” is relative; in this study, Google bombs that were 10 to 40 months old were considered.
An active manipulation of the results in Google is called “Google bombing.” The term Google bombing is even included in the second edition of The New Oxford American Dictionary (Price, 2005, n.p.). There Google bombing is defined as “the activity of designing Internet links that will bias search engine results so as to create an inaccurate impression of the search target.”
Google bombs work because, as mentioned before, Google’s ranking algorithm takes into account the quality and quantity of links pointing to the given page. This is not the only factor in Google’s secret algorithm, but concrete examples show that a large number of links pointing to a certain page may increase the placement of that page, especially when the query term appears in the anchor text of the link pointing to that page.
Successful Google bombs raise public interest and are often discussed in the media (e.g., the LA Weekly [Lewis, 2003], BBC News, The New York Times [Flynn, 2004]), in search engine resources (e.g., Sullivan, 2002a & b, 2004a & b), and in the blogger community (e.g., Callishain, 2004; Leiter, 2005; Levine, 2002; Mathes, 2001; Rockley, 2005).
This exploratory study aims to examine whether successful Google bombs result in the continued prominence of the targeted pages on results pages, and if they do, whether there is a change in the linking patterns to the target pages. The “bombing” not only affects the targeted page, but also affects the search engine: It becomes manipulated by the public. The question raised here is whether this manipulation has a short- or a long-term effect on search query outcomes.
As can be seen, links are central to Google bombing. Links are often considered analogues of citations in the scientific environment (Brin & Page, 1998). “Google interprets a link from page A to page B as a vote, by page A, for page B” (Google, 2007). But are links similar to citations when linking patterns are considered at different points in time? A major difference between the two is that citations in printed literature are here to stay, whereas links that exist at one point in time may disappear in the next moment. The number of citations a publication receives is monotonically non-decreasing over time, but what can be said about link counts?
The growth patterns of printed, scientific literature have been studied extensively. The growth of scientific literature on a topic over time can usually be characterized by a logistic function (Egghe & Rousseau, 1990; Price, 1963). The logistic growth curve is characterized by initial exponential growth, explained by the success-breeds-success principle (Egghe & Rousseau, 1990, pp. 297-301; Price, 1976), followed by a period of linear growth. Later the rate of growth slows considerably, until all interest in the topic disappears. Bar-Ilan (1997) showed that the logistic growth function is also applicable to newsgroup discussions on “hot topics”—she studied discussions on mad-cow disease at the time the crisis erupted in the UK—but the life span of the curve was not measured in years; it took only 100 days for interest in the topic to level off.
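The shape of the logistic curve described above can be made concrete with a short sketch. The parameterization used here (a ceiling, a midpoint, and a rate) is one common way of writing the logistic function and is an assumption, as are the numerical values, which are purely illustrative and are not taken from any of the cited studies.

```python
import math


def logistic(t, ceiling, midpoint, rate):
    """Cumulative size at time t under logistic growth.

    ceiling  -- the level at which growth saturates
    midpoint -- the time at which half of the ceiling is reached
    rate     -- steepness of the curve
    """
    return ceiling / (1.0 + math.exp(-rate * (t - midpoint)))


# Illustrative values only: 1,000 items in total, half reached on day 50.
days = [0, 25, 50, 75, 100]
sizes = [logistic(day, ceiling=1000, midpoint=50, rate=0.1) for day in days]
for day, size in zip(days, sizes):
    print(day, round(size, 1))
```

Growth is close to exponential well before the midpoint, roughly linear around it, and flattens as the cumulative count approaches the ceiling.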
Standard informetric techniques are applicable to newsgroup postings (Bar-Ilan, 1997), because once a message is posted it does not change. This, however, is not true of Web documents in general. Some of them change over time; others move to a different URL or disappear from the Web altogether (see, e.g., Bar-Ilan & Peritz, 2004; Fetterly, Manasse, Najork, & Wiener, 2003; Koehler, 2004). Printed literature and even newsgroup postings (since they are usually archived) cumulate; thus the function characterizing the cumulative growth is monotonic, non-decreasing. This is not the case for Web links: When a document is removed from the Web, all the links on the page disappear with it. In addition, Web documents often undergo changes, causing additions/deletions of the links outgoing from these pages. Since the essence of Google bombing is linking to the target page, it is not possible to apply existing informetric techniques to study the development of Google bombs over time. In this article, several Google bombs are considered in order to gain insight into linking to the targeted sites some time (between 10 and 40 months) after the Google bomb was created.
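The contrast between cumulating counts and link counts can be sketched as follows. The snapshots, page names, and function names are hypothetical and serve only to illustrate the argument; they are not data from this study.

```python
def link_counts(snapshots):
    """Number of source pages linking to the target at each point in time."""
    return [len(sources) for sources in snapshots]


def cumulative_counts(snapshots):
    """Number of distinct source pages seen up to each point in time.

    This mimics how citations cumulate: once counted, never removed.
    """
    seen = set()
    totals = []
    for sources in snapshots:
        seen |= sources
        totals.append(len(seen))
    return totals


# Hypothetical snapshots of the pages linking to one target page.
snapshots = [
    {"blog-a", "blog-b", "forum-c"},
    {"blog-a", "forum-c", "blog-d"},  # blog-b removed its link, blog-d added one
    {"blog-a"},                       # forum-c and blog-d disappeared
]

print(link_counts(snapshots))        # [3, 3, 1]
print(cumulative_counts(snapshots))  # [3, 4, 4]
```

The cumulative series never decreases, whereas the number of links existing at a given moment can fall, which is why growth models built for cumulating literature do not carry over directly to links.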
Google bomb links often emanate from blogs (usually from the sidebar of the blog) and from forums (where the “bombing” link appears in the signature files of the participants); thus the bombing link often does not form an integral part of the content of the posting (Bar-Ilan, 2006; Kahn & Kellner, 2004). Blogs have been defined as “pages consisting of several posts or distinct chunks of information per page, usually arranged in reverse chronology from the most recent post on the top to the oldest post at the bottom” (Bausch, Haughey & Hourihan, 2002, p. 7). Blogging has become a popular online activity, although according to the Pew Internet & American Life Project, only 8% of Internet users keep a blog, and only 39% read blogs (Lenhart & Fox, 2006).
It is very easy to edit the signature that appears at the bottom of forum messages or to redesign the sidebar of a blog without altering the actual content of individual posts, and such a design change immediately affects all previously published posts. This process was called “retroactive change of history” by Bar-Ilan (2006), who demonstrated that even during a short period of time (about five months), the linking patterns to the targeted pages changed considerably, although the textual content of the blog posts did not change. Blog entries have so-called permalinks or permanent links (Bausch et al., 2002), and the sidebar is displayed on each of these permalink pages; thus changing the sidebar alters the links on all existing blog postings at once.
Google bombs are one of the means through which Google’s search results are manipulated. This study set out to explore whether Google bombing has a long-term effect on the ranking of search results. In order to do this, a sample of Google bombs was selected, and the content of a sample of source pages (pages from which the links to the targeted page emanate) and the links were analyzed.