Associate Professor Jamie Murphy’s [email@example.com] background includes complementary industry and academic experience. In addition to owning/managing hospitality businesses, he served as the European Marketing Manager for U.S. sports companies. Dr. Murphy’s research focus is effective use of the Internet for citizens, businesses, and governments.
Address: The University of Western Australia Business School, Crawley, WA 6009, Australia
Noor Hazarina Hashim,
The University of Western Australia Business School
Noor Hazarina Hashim [firstname.lastname@example.org] is a Ph.D. candidate in Internet Marketing at the School of Economics and Commerce, University of Western Australia. In Malaysia, she lectures in marketing at the University of Technology Malaysia. Her research interests include the evolution of website and email, and effective Internet marketing.
Address: The University of Western Australia Business School, Crawley, WA 6009, Australia
Peter O’Connor [email@example.com] is Professor of Information Systems at Essec Business School France and serves as Academic Director of Institute de Management Hotelier International (IMHI), its specialized MBA program in hospitality management. His primary research, teaching, and consulting interests focus on the use of information technology in the hospitality sector. Previously he held a visiting position at the Cornell School of Hotel Administration and worked in a variety of international positions in hospitality management in sectors ranging from luxury hotels to contract food services.
Address: IMHI, Essec Business School, 95021 Cergy-Pontoise Cedex, France
Although fields such as e-commerce, information systems, and computer-mediated communication (CMC) acknowledge the importance of validity, validating research tools or measures in these domains seems the exception rather than the rule. This article extends the concept of validation to one of an emerging genre of web-based tools that provide new measures, the Wayback Machine (WM). Drawing in part on social science tests of validity, the study progresses from testing for and demonstrating the weakest form of validity, face validity, to the more demanding tests for content, predictive, and convergent validity. Finally, the study tests and shows nomological validity, using the diffusion of innovations theory. In line with prior diffusion research, the results of tests for predictive and nomological validity showed significant relationships with organizational characteristics and two WM measures: website age and number of updates. The results help validate these measures and demonstrate the utility of the WM for studying evolving website use.
Despite a 1989 call for rigorous instrument validation in management information system research (Straub, 1989), the field has yet “to reach the point where validation is the rule rather than the exception” (Boudreau, Gefen, & Straub, 2001, p. 11). Validation is inadequate, in part due to the difficulty in tracking rapid technological changes (Straub, 1989), yet establishing validity is particularly important for new instruments (Bagozzi, 1981; Hinkin, 1995). In addition to validating research instruments such as survey questions, the computer science field acknowledges validating software or expert systems as an important step in the development of new tools (Kitchenham, Pfleeger, & Fenton, 1995; O’Leary, Goul, Moffitt, & Radwan, 1990). Similarly, social science often validates research instruments, such as the psychometric properties of questionnaire items (Babbie, 1997; Straub et al., 2004), rather than output from online tools. Validating the output from archival databases is an important new challenge.
In the expert systems domain, a review of validation literature found no standard definition of validity and different terms used interchangeably to describe validity (O’Leary et al., 1990). A business research methods text defines validity as the degree to which a research instrument provides adequate coverage of the topic under study (Sekaran, 2003). In computer science and expert systems, validation is the ability of software or a system to comply with defined standards or adequately represent an expert’s knowledge (Kitchenham et al., 1995; Mosqueira-Rey & Moret-Bonillo, 2000; O’Leary et al., 1990).
Common to most definitions across disciplines is determining suitability and accuracy. With regard to types of validity, Straub et al. (2004) argued that predictive validity was optional, highly recommended content and nomological validity, and mandated convergent validity. New measures, however, require substantiation of predictive, content, and nomological validity (Bagozzi, 1981; Straub et al., 2004).
Apart from a single study that included convergent and nomological validity for three website measures from Alexa—content, download time, and navigation (Palmer, 2002)—to the authors’ knowledge, no CMC studies have validated the output from third-party online tools. Comparing Alexa results with jury ratings and a web-based agent, this sole study found significant correlations and suggested that the three measures had convergent validity. The study further suggested nomological validity for website content and website navigation (Palmer, 2002).
This study:
(a) tests the content validity of three measures provided by the Wayback Machine: archived web pages, website age, and website updates;
(b) tests the predictive, nomological, and convergent validity of two measures provided by the Wayback Machine: website age and website updates; and
(c) adds to the small number of studies validating measures from third party online tools.
The following sections introduce tests of validity, followed by discussion of the Wayback Machine and diffusion of innovations theory. The article then describes the study population. After testing for face and content validity of three WM measures—website content, website age, and website updates—the article uses the study population to test for predictive, nomological, and convergent validity of the latter two measures. The article closes with suggestions for future research directions for academics studying website evolution or using third-party online tools for research.
As its name implies, face validity relates to face value and relies upon experts’ personal opinions and judgment. Because of the vagueness and subjectivity that can result, face validity is a weak test of validity, and some researchers question its use (Sekaran, 2003). Given the lack of validation of third-party online tools, checking face validity seems a reasonable first step prior to moving on to more demanding tests. Closely related to face validity is content validity, which ensures that a measure includes an adequate and representative set of items to cover a concept. Content validity also relates to sample-population representativeness, for example, the ability of a questionnaire to represent the larger population. When experts agree that a measure provides adequate coverage of a concept, the measure has content validity (Sekaran, 2003).
Predictive validity, also known as practical or concurrent validity, measures how well an independent variable or set of independent variables relates to the characteristics of research interest (Sekaran, 2003). Scholars debate whether predictive validity falls in the general category of construct validity (see below) or the extent that the operationalization of a concept actually measures that concept (Straub, 1989). Predictive validity can also show the applied value of research (Straub et al., 2004). For example, a business could predict its online sales based on the number of website visits and email enquiries. To demonstrate validity, the firm could periodically correlate website visits in a particular month with sales in that or subsequent months. Repeatedly high correlations would suggest predictive validity, thus allowing the firm to use website visits to forecast future sales. Depending on the objective, researchers typically use correlation or regression analyses to test such hypothesized relationships (Hinkin, 1995).
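The website-visits example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the monthly figures are invented, and the helper names `pearson_r` and `lagged_correlation` are hypothetical rather than taken from any study cited here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(visits, sales, lag=0):
    """Correlate visits in month t with sales in month t + lag."""
    if lag < 0 or lag >= len(visits) - 1:
        raise ValueError("lag must leave at least two paired months")
    if lag == 0:
        return pearson_r(visits, sales)
    return pearson_r(visits[:-lag], sales[lag:])


# Invented monthly figures, for illustration only.
visits = [1200, 1500, 1100, 1800, 2100, 1700, 2400, 2000]
sales = [60, 70, 85, 65, 100, 115, 95, 130]

same_month = lagged_correlation(visits, sales, lag=0)
next_month = lagged_correlation(visits, sales, lag=1)
```

In this invented series the one-month lag correlates more strongly than the same-month comparison. Seeing such a pattern repeatedly is what would let the firm treat visits as a leading indicator of sales.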
Combined with predictive validity, nomological and convergent validity help achieve construct validity, the empirical and theoretical support for a particular interpretation (Straub, 1989). Nomological, or lawful, validity links a theoretical concept with observable results (Cronbach & Meehl, 1955). “If theoretically-derived constructs have been measured with validated instruments and tested against a variety of persons, settings, times, and, in the case of IS research, technologies, then the argument that the constructs themselves are valid becomes more compelling” (Straub et al., 2004, p. 395). Convergent validity results when two variables measuring the same construct correlate highly (Straub et al., 2004). Triangulation of multiple research results, rather than relying on a single line of evidence, helps achieve convergent validity.
The Wayback Machine
The Wayback Machine is part of the Internet Archive (www.archive.org), which amasses websites, moving images, texts, audio, and recently, educational resources (FAQs, 2007). Drawing upon results from the Alexa webcrawler, this U.S.-based non-profit organization permanently stores publicly accessible websites in an enormous digital archive. By preserving human knowledge and artifacts and making its collection available to all, the Internet Archive envisions resembling ancient Egypt’s legendary Library of Alexandria (FAQs, 2007). The archive contains snapshots of over 55 billion web pages, more information than in any library including the U.S. Library of Congress, even though archiving began only in 1996. The archive adds about 20 terabytes (1 terabyte = 10^12 bytes) of digital content monthly (FAQs, 2007), with each sweep of the estimated 16 million archived websites taking over two months (Howell, 2006).
Via the WM, users can view the original version of each site, as well as the dates and content of subsequent updates. To call up archived websites, users type the URL of the desired site into the address box on the WM homepage. The WM then returns the date of original site creation, number and date of site updates, and links to archived sites. Figure 1 shows the WM homepage, and Figure 2 shows the results for a Malaysian hotel, the Timotel in Mersing. The WM also provides information on site updates. An asterisk beside the dates in Figure 2 indicates more than 50% changes to the website since the last visit.
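To make the two WM measures concrete, the sketch below derives a website age and an update count from a list of capture timestamps. It assumes the timestamps have already been collected by some means and are in the 14-digit YYYYMMDDhhmmss form that the Internet Archive uses in archived-page addresses. The sample timestamps and the function names are hypothetical. Counting every distinct capture after the first as an update is a simplifying assumption, not necessarily how the study counted updates.

```python
from datetime import datetime


def parse_snapshot(timestamp):
    """Parse a 14-digit capture timestamp such as '20000823000000'."""
    return datetime.strptime(timestamp, "%Y%m%d%H%M%S")


def summarize_snapshots(timestamps, as_of):
    """Return (first capture date, age in days at as_of, number of later captures)."""
    captures = sorted({parse_snapshot(ts) for ts in timestamps})
    if not captures:
        raise ValueError("no captures supplied")
    first = captures[0]
    age_days = (as_of - first).days
    # Assumption: each distinct capture after the first counts as one update.
    updates = len(captures) - 1
    return first.date(), age_days, updates


# Hypothetical captures for one site; the duplicate entry is ignored.
sample = ["20020603120000", "20000823000000", "20010115083000", "20020603120000"]
first_seen, age_days, updates = summarize_snapshots(sample, as_of=datetime(2006, 5, 1))
```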
Tracking the evolution of a site can be useful. For example, a researcher could investigate the evolution of Hyatt.com’s online customer relationship programs by analyzing consecutive archived versions of the company’s site. As noted earlier, researchers have used the WM to track and measure web content development (Chu et al., 2007; Hackett & Parmanto, 2005). The WM is also gaining legal acceptance with trademark and intellectual property issues (Howell, 2006). In a landmark 2004 U.S. case, the court ruled that pages culled from the WM were admissible as evidence (Gelman, 2004).
Although massive, the WM has limitations. It archives publicly accessible sites written in simple HTML, but has problems archiving password-protected or dynamic sites (Veronin, 2002). Furthermore, sites can decline inclusion by emailing the Internet Archive or using the Standard for Robot Exclusion (see www.robotstxt.org) to specify files or directories not to crawl (FAQs, 2007). Intellectual property owners concerned about infringements on third party sites can also request removal of such content (FAQs, 2007). Any of these actions stops future indexing, removes site content from the archive, and limits the archive’s comprehensiveness. Finally, a condition of use of the Alexa webcrawler is that the Internet Archive must wait at least six months after surveying before including site updates in the archive. Coupled with the requisite time to survey the 55 billion archived pages, this results in a time lag of six to 12 months for an archived snapshot to appear (FAQs, 2007; Howell, 2006).
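The robot exclusion mechanism mentioned above can be illustrated with Python’s standard `urllib.robotparser` module. The robots.txt content and the example.com addresses are hypothetical, and `ia_archiver` is used here on the assumption that it is the user-agent name under which the Alexa crawler identified itself.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keeps one directory out of the archive crawler's
# reach while leaving the whole site open to every other crawler.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /private/

User-agent: *
Disallow:
"""


def crawler_may_fetch(robots_txt, user_agent, url):
    """Check whether the given crawler is allowed to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


blocked = crawler_may_fetch(ROBOTS_TXT, "ia_archiver", "http://www.example.com/private/rates.html")
allowed = crawler_may_fetch(ROBOTS_TXT, "ia_archiver", "http://www.example.com/index.html")
```

A site that replaces `/private/` with `/` would exclude itself from the crawl entirely, which is one way a site could be missing from a WM-based sample.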
Examining websites’ evolutionary aspects helps researchers investigate what factors lead to successful website implementation, including which features organizations add, and leave, on their websites. Evolution itself, however, is a research limitation; a single evaluation at a single point in time cannot capture such evolution. While longitudinal studies would let researchers track changing relationships, performing multiple evaluations is difficult and cumbersome (Chatterjee, Grewal, & Sambamurthy, 2002). Furthermore, some websites may no longer exist and some changes are ephemeral. For instance, a study of over 1,000 websites across six categories found only two-thirds of the sites still functioning at the same URL five years later (McMillan, 2002).
Another research limitation of diffusion studies is relying upon stated behavior rather than measuring actual behavior (Damanpour, 1991; Rogers, 2003). For example, to measure website age, researchers could email webmasters to ask when their websites first went online. However, a webmaster might not reply, might not know, or might give incorrect information. Domain name age, based on when an organization originally registered its domain name—such as Hyatt hotels registering Hyatt.com—provides an actual measure of Internet adoption (Adamic & Huberman, 2000; Murphy, Olaru, & Schegg, 2006). Yet domain name age as a measure of website evolution has limitations. With names registered in the most common domain, .com, changes in domain name registrars render the recorded age invalid (Murphy et al., 2006). Similarly, organizations may buy a domain name but wait months or years before hosting a website at that name, thus making the registration date an unreliable measure of when a website went live. Using data from the WM, which archives actual website pages, helps overcome such limitations and establish the real date of site creation.
Data Preparation and Preliminary Nomological Results
Testing the content, predictive, convergent, and nomological validity of the WM measures necessitated a database. With no comprehensive list of Malaysian hotel websites available, the study began with 540 hotels registered with Malaysia’s Ministry of Tourism, and the Malaysian Accommodation Directory’s (MAD) 2003/2004 list of hotel website addresses. In May 2006, keying the 540 hotels’ names into Google and Yahoo! helped find more hotel websites and verify the MAD website addresses, yielding 310 websites. The WM failed to give results for 19 sites (about 6%), due to trouble locating the site or the site declining indexing by the Internet Archive. Of the remaining 291 websites, some chain hotels shared the same domain name, such as hyatt.com and hilton.com for all Hyatt and Hilton hotels in Malaysia. To avoid duplication, excluding 116 hotels with the same domain name left 175 websites. Of these 175 hotels, 96 hotels hosted their website in the global .com domain, and 79 hosted their website in Malaysia’s country domain, .my.
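The sample reduction described above can be sketched as a small de-duplication routine. The URL paths are invented, the function names are hypothetical, and keeping the first address seen for each host is an assumption about how shared domains were handled. The final lines restate the sample sizes reported in the text.

```python
from urllib.parse import urlparse


def host_without_www(url):
    """Return the host part of a URL, minus any leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def one_site_per_domain(urls):
    """Keep the first URL seen for each distinct host (an assumed rule)."""
    first_seen = {}
    for url in urls:
        first_seen.setdefault(host_without_www(url), url)
    return list(first_seen.values())


def count_by_tld(urls):
    """Count sites by the last label of the host, e.g. 'com' or 'my'."""
    counts = {}
    for url in urls:
        tld = host_without_www(url).rsplit(".", 1)[-1]
        counts[tld] = counts.get(tld, 0) + 1
    return counts


# Hypothetical addresses; only the domain names echo the text.
sample_urls = [
    "http://www.hyatt.com/kuantan",
    "http://www.hyatt.com/kinabalu",
    "http://www.hilton.com/kuching",
    "http://www.flamingo.com.my/",
    "http://flamingo.com.my/rooms",
]
unique_sites = one_site_per_domain(sample_urls)
tld_counts = count_by_tld(unique_sites)

# Sample sizes as reported in the text.
found, no_wm_result, shared_domain = 310, 19, 116
final_sample = found - no_wm_result - shared_domain
```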
Diffusion of Innovations Findings
Table 1 shows the final sample and suggests that in line with diffusion of innovations research, high rated, chain-affiliated, and large hotels tended to lead in website adoption (Murphy et al., 2003; Siguaw, Enz, & Namasivayam, 2000; Wei, Ruys, van Hoof, & Combrink, 2001). The first five-star hotel went online almost three years earlier than the first one-star hotel, early 1997 versus late 2000. The first chain hotel went online nearly a year earlier than the first non-affiliated hotel, late 1996 versus mid 1997. Finally, the first online hotel with over 300 rooms was about two years older than the first online hotel with under 200 rooms.
Table 1. Sample characteristics (table body not recoverable; surviving labels: Websites accessible via the WM; Sample without same domain name; Sample with .my domain; Most updates from 1996–2005; No. of Rooms)
Similarly, high rated, chain-affiliated, and large hotels led in updating their websites. The five-star hotel with the most updates from 1996-2006 changed its site 60 times, compared to 35 times for the leading one-star hotel. Likewise, the leading chain-affiliated hotel, which was also a large hotel, made 72 updates to its website versus 63 updates for the leading non-affiliated hotel, which was also a small hotel. This discussion of website age and number of updates suggests nomological validity in line with the diffusion of innovations, but the results are just for one hotel, the leading hotel in each category, and not the entire sample of hotels.
Thus, the next section tests the validity of the Wayback Machine’s website age and website updates using the entire sample. Three transformations were necessary prior to testing. A new variable, update frequency, was the website age divided by the total number of website updates. Using this new variable, the most frequently updated website was a three-star independent hotel in Terengganu, which averaged an update every 35 days. At the other extreme, a two-star independent hotel in Melaka updated its website on average once every five years. As update frequency and the number of rooms had a non-normal distribution based on a one-sample Kolmogorov-Smirnov test, a logarithmic function transformed these two variables toward a normal distribution.
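The derived variable and its transformation can be sketched as follows. The input figures are invented to reproduce the 35-day example, the function names are hypothetical, and base 10 is an assumption because the text does not state which logarithm was used.

```python
from math import log10


def update_interval_days(age_days, n_updates):
    """Website age divided by number of updates: average days per update."""
    if n_updates <= 0:
        raise ValueError("at least one update is needed to compute an interval")
    return age_days / n_updates


def log_transform(values):
    """Base-10 log of each value (assumed base); all values must be positive."""
    if any(v <= 0 for v in values):
        raise ValueError("log transform needs strictly positive values")
    return [log10(v) for v in values]


# Invented inputs: a site 3,500 days old with 100 updates.
interval = update_interval_days(3500, 100)

# Hypothetical skewed intervals (days per update) before and after transformation.
raw_intervals = [35.0, 90.0, 400.0, 1825.0]
log_intervals = log_transform(raw_intervals)
```

Under this definition a smaller value means more frequent updating, which matters when reading the sign of any correlation that uses the variable.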
Validating the Wayback Machine
The following discussion draws on instrument validation and research of individual measures to validate three measures provided by the WM: website content, website age, and website updates. Starting with the weaker and more subjective tests, this study assessed face validity based on published research, feedback from three website managers, and a comparison with Malaysia’s domain name database. The courtroom acceptance of the WM (Gelman, 2004; Howell, 2006), mentioned earlier in this article, demonstrates face validity by legal experts. Next, an email invited two Malaysian hoteliers to test their website in the WM. The WM provided archived versions of their sites, and they agreed that the WM provided accurate ages and archived versions. Similarly, an author of this study verified that the WM provided accurate dates and versions of one U.S. and four Australian websites that he managed. The study also examined the face validity of the WM by investigating four hotel homepages shown in a 1996 study (Murphy, Forrest, Wotring, & Brymer, 1996). The WM results showed the same homepages as those in the article.
A final test of face validity compared the website age provided by the Wayback Machine with the domain name age provided by Mynic, Malaysia’s domain name registrar (whois.mynic.net.my). In principle, a hotel would register a domain name to house the website prior to launching the website. Comparing the WM website age with the domain name age for the 79 hotels using a .my domain name showed that 68 hotels had a domain name age older than the WM website age. Three hotels changed domain names, evidenced by the links and content on archived web pages. For example, the Hotel Flamingo began at www.twosteps.com/flamingo on August 23, 2000 and then changed to www.flamingo.com.my on June 3, 2002. The other eight hotels changed their Mynic information, resetting the registration date on file with Mynic. These two issues highlight shortcomings of using domain name age as a measure of Internet adoption and provide face validity for the Malaysian hotels’ website age.
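The comparison above amounts to checking, hotel by hotel, whether the domain registration date precedes the first archived capture. A minimal sketch follows, with invented hotels and dates and a hypothetical function name. Treating equal dates as consistent is an assumption.

```python
from datetime import date


def split_by_registration_order(records):
    """Split hotels into (consistent, anomalous) lists.

    A record is consistent when the domain was registered on or before the
    first archived capture; otherwise it is flagged for manual review, for
    example for a changed domain name or a reset registration date.
    """
    consistent, anomalous = [], []
    for hotel, registered_on, first_capture_on in records:
        if registered_on <= first_capture_on:
            consistent.append(hotel)
        else:
            anomalous.append(hotel)
    return consistent, anomalous


# Invented records: (hotel, domain registration date, first WM capture date).
records = [
    ("Hotel A", date(1999, 3, 1), date(1999, 9, 15)),
    ("Hotel B", date(2003, 5, 20), date(2001, 11, 2)),
    ("Hotel C", date(2001, 1, 10), date(2001, 1, 10)),
]
consistent, anomalous = split_by_registration_order(records)
```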
Content validity was assessed based on the representativeness of websites and adequacy of the website age information provided by WM. As noted above, the WM provided universal coverage for the four sites in the published study and the seven sites managed by three webmasters. Furthermore, as noted in the data preparation section, the WM returned archived versions for 291 of 310 hotel websites, which suggests representativeness. In summary, confirmation by website managers, comparison with a published study, and representation of 291 Malaysian hotels in the WM suggest face and content validity of the WM’s website age, website updates, and archived web pages.
Predictive validity stemmed from the number of website updates recorded. Literature on the evolutionary nature of websites (Chatterjee et al., 2002; Chu et al., 2007; Piccoli et al., 2004; Teo & Pian, 2004) led to the prediction that older websites would have a higher average frequency of updates. The result of a one-tailed Pearson correlation test—a significant positive relationship between website age and the logarithmic value of update frequency (r = .274, n = 175, p < .001)—shows older websites were updated more frequently and suggests predictive validity.
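As a rough check on the reported significance, the correlation can be converted to a t statistic. This assumes the standard t transformation of a Pearson coefficient; the article does not state how the p value was computed.

```latex
t = r\sqrt{\frac{n-2}{1-r^{2}}}
  = 0.274\sqrt{\frac{173}{1-0.0751}}
  \approx 0.274 \times 13.68
  \approx 3.75,
\qquad df = n - 2 = 173
```

With 173 degrees of freedom, the one-tailed critical value at the .001 level is roughly 3.1 to 3.2, so a t of about 3.75 is consistent with the reported p < .001.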
The diffusion of innovations served as the theoretical base for testing nomological validity. This theory argues that certain organizational characteristics relate positively to organizational technology use (Matzler et al., 2005; Wang & Fesenmaier, 2005). U.S. and Swiss studies showed that high rated, large, and affiliated hotels led in technology adoption (Murphy et al., 2003; Siguaw et al., 2000). Compared to lower rated, smaller, or non-affiliated hotels, such hotels had more resources and expertise to facilitate IT implementation. Similarly, emerging Malaysian research and early global research found that large, high rated, and affiliated hotels led in the use of advanced website features (Hashim & Murphy, 2007; Wei et al., 2001). Based on the similarity in these studies, star rating, hotel size, and brand affiliation were the independent variables for testing nomological validity.
Table 2 shows the results of one-tailed Pearson correlation tests for the logarithmic number of rooms, Spearman correlation tests for star rating, and independent-samples t-tests for chain-affiliation against the dependent variables of website age and number of updates. As mentioned earlier, the analysis used logarithmic values for the update frequency and number of rooms. Given the possible correlation among the three independent variables (size, number of stars, and affiliation), two multiple regression tests examined the predictive importance of the independent variables on website age and number of updates. No independent variables were significant predictors for number of updates, and star rating was a significant predictor of website age (β = .203, p = .031).
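Because star rating is ordinal and many hotels share the same rating, a rank-based coefficient with tie handling is the natural choice. The sketch below computes Spearman’s rho as the Pearson correlation of average ranks. The data are invented and the function names are hypothetical; it is not the statistical package the study used.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman correlation: Pearson correlation of the average ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Invented data: star rating and website age in days for six hotels.
stars = [1, 2, 3, 3, 4, 5]
age_days = [900, 1500, 1400, 2100, 2600, 3300]
rho = spearman_rho(stars, age_days)
```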
Table 2. Correlation and T-test results for website age and number of updates (N = 175)
Although the correlation coefficients in Table 2 were low and the multiple regressions showed low predictive importance, the relationships were significant and in line with diffusion of innovations research. Larger, higher-rated, and affiliated hotels launched their websites earlier and updated their websites more often than smaller, lower-rated, and non-affiliated hotels did, helping support nomological validity.
Convergent validity was evaluated by measuring the relationship between domain name age and the creation date of a website at that address. Despite the limitation of a temporal gap between owning a domain name and having a live website, studies use an organization’s domain name age as a proxy for Internet adoption (Adamic & Huberman, 2000; Murphy et al., 2006). Although as explained earlier, a domain name age is an imperfect proxy, a high positive correlation between a website’s domain name age and that same website’s age as provided by the WM would suggest convergent validity.
Establishing the age of names in global domains such as .com or .org, however, is problematic. On November 30, 1999, the Internet Corporation for Assigned Names and Numbers shifted from a sole domain registrar to a Shared Registration System (SRS) with multiple registrars in the .com, .net, and .org domains (see www.icann.org for a history of domain names). The SRS makes global domain name ages unreliable, as companies may change domain registrars, resetting their domain name’s creation date and rendering the data invalid (Murphy et al., 2006).
At the country level, however, such as .at and .my for Austria and Malaysia respectively, gathering the domain name age is less problematic. There is usually just one domain name database for each country, such as in Malaysia. Due to the difficulty validating ages in the .com domain, the study used the 79 websites with a .my domain to test convergent validity. After eliminating the 11 hotels that changed domains or Mynic information, a one-tailed Pearson correlation for the remaining 68 hotels hosted in .my showed a significant positive correlation between website age and domain name age (r = .933, p < .001). This strong correlation supports convergent validity for the website age provided by the Wayback Machine.
Conclusions and Future Research
Researchers frequently adopt instruments from other studies, which can contribute to flawed measures for at least two reasons. Researchers fail to validate the adopted instrument or make major alterations to a validated instrument without re-testing it (Straub, 1989). This study reinforces the importance of the first reason, failure to validate, for metrics from the growing field of third-party tools such as those provided by Google and Alexa. As researchers continue to use these tools, it is important to address the validity of both the tools and their measures.
Although the Wayback Machine has limitations such as not indexing some websites, the results of this study showed content validity for three WM measures—website content, website age, and number of updates—as well as predictive, nomological, and convergent validity for website age and number of website updates. This article thus adds to the minimal research on validating online third party tools. Validation studies often deal with a survey instrument or a software process, but results from third party tools such as Alexa seem a new and fruitful area for validation studies.
As this study investigated just four types of validation, future research should address other validation tests, as well as the reliability of third party tools (Boudreau et al., 2001; Straub, 1989; Straub et al., 2004). While this article suggests that the WM provides valid website ages and website updates, future research should revisit these two WM metrics in other industries and extend the concept of validation to measures from other web-based third party tools. For example, Alexa provides measures of website popularity and incoming links to a website. Google provides a toolbar that ranks websites on Google’s proprietary PageRank, and a beta tool, Google Scholar (scholar.google.com), provides popularity measures for scholarly articles (Bakkalbasi et al., 2006; Hall, 2006; Jacsó, 2005, 2006; Kousha & Thelwall, 2007; Pauly & Stergiou, 2005). While widely used, to the authors’ knowledge these tools remain unvalidated.
Finally, now that the Wayback Machine seems validated as a viable research tool, an interesting range of research possibilities arises. Researchers can now have greater confidence in the data generated by the tool and can incorporate such data into their research on website development and e-commerce. As suggested elsewhere in this article, the WM facilitates studies of website development over time. Taking a historical perspective and exploiting this opportunity should lead to a better understanding of website evolution in domains such as e-commerce and Web 2.0.
The authors presented an earlier and abridged version of this manuscript at the January 2007 ENTER Conference in Ljubljana, Slovenia.