Abstract

  1. Top of page
  2. Abstract
  3. INTRODUCTION
  4. RELATED WORK
  5. RESEARCH STUDY
  6. STUDY RESULTS
  7. DISCUSSION AND FUTURE WORK
  8. CONCLUSION
  9. REFERENCES

In this paper, we describe the preliminary results of a study investigating subtasks of the information gathering task on the Web. Those subtasks are: managing information, handling multiple sessions, and re-finding information. The study compared features in a prototype, WIGI (Web Information Gathering Interface), to the current state of Web information gathering that involves mainly the Web browser. Substantial differences were revealed between the two states compared in the study.


INTRODUCTION

Information gathering (or informational) tasks are very common on the Web. According to Rose and Levinson (2004), informational tasks accounted for 61.5% of the overall tasks users conduct on the Web. Broder (2002) found that informational tasks represented 39%-48% of three main types of user tasks: informational, transactional, and navigational. Kellar et al. (2007) found that information gathering represented 13% of the overall tasks performed in their study.

Information gathering can be defined as combinations of activities (i.e. subtasks) that involve at least: finding information sources, finding actual information, and managing and organizing information. For example, collecting information about the different expenses of an overseas trip is an information gathering task. Writing a research survey that requires searching multiple sources of information, gathering information from those sources, and reasoning about the information chosen for the survey is considered to be an information gathering task.

Information gathering on the Web is usually complex and necessitates multiple sessions (Amin, 2009). Amin (2009) has shown that information gathering is highly search-reliant. Alhenshiri et al. (2012) have shown that information gathering on the Web usually consists of more than one subtask each of which is a combination of activities with the same goal. They have also shown that information gathering may require the use of tools and applications to complement the Web browser.

The experiment described in this paper was based on three recommendations from the work of Alhenshiri et al. (2012). Those recommendations were: 1) users should be able to re-visit and reuse information from sources located in past sessions; 2) to handle multiple sessions, users should be able to keep the task information as one unit, so that they re-locate the task as a whole rather than dispersed parts of it; 3) users should be able to search, browse, manage, and organize the information they gather on the same display. The following are the three hypotheses of the research discussed in this paper:

  1. Permitting users to track references opened during information gathering sessions will help with re-finding information and information sources for further gathering and comparisons in subsequent sessions.
  2. Allowing users to re-find integrated task information will improve the effectiveness of how users handle multiple sessions in the case of information gathering tasks on the Web.
  3. The use of a single application for searching, browsing, editing, and formatting information will improve the effectiveness of organizing and managing information during the task of information gathering on the Web.

The remainder of the paper is organized as follows. Section 2 illustrates some of the related work. Section 3 describes the research study. Section 4 highlights the preliminary results of the study. Section 5 discusses the results. Section 6 concludes the article.

RELATED WORK

Research started looking at user activities on the Web as parts of a larger task around the beginning of the 2000s. Broder (2002) identified three main types of tasks from user interactions during Web search: navigational, transactional, and informational. Sellen et al. (2002) categorized user activities as fact finding, information gathering, browsing, transacting, communicating, and housekeeping. In 2004, Rose and Levinson studied user search goals and identified navigational, transactional, and informational tasks in user interactions. Kellar et al. (2007) categorized user tasks on the Web as fact finding, information gathering, transacting, and browsing.

To further explore the concept of information gathering, Amin (2009) attempted to identify the characteristics of the task. Information gathering was shown to be highly search-reliant and to involve different types of search, including exploratory, comparison, and topic search. Mankowski and Watters (2011) showed that information gathering benefits from supporting the user's ability to remember parts of Web pages from which the information was gathered. Information gathering may require multiple sessions to complete (Spink et al., 1996; Mackay and Watters, 2008), indicating a growing need for strategies and tools for keeping and re-finding the task information for reuse.

Alhenshiri et al. (2012) showed that information gathering is complex, highly search-reliant, and has a high-level goal. It also consists of more than one subtask, each of which consists of activities with the same goal. Information gathering usually takes more than one session because of its exploratory nature, which requires comparing and collecting information from multiple sources. Information gathering may necessitate the use of more applications than the conventional Web browser.

RESEARCH STUDY

A complete factorial and counterbalanced experiment was conducted to evaluate features in WIGI against the current browsing model for information gathering on the Web. Thirty participants took part in the study. All participants were computer science students (graduate and undergraduate) because of availability issues. Participants were divided equally between females and males. Each participant performed two tasks each of which was divided into two parts (i.e. part A and part B). One task was performed on WIGI while the other task was performed on the Web browser. The study used four different tasks that were created and further evaluated by a focus group based on the following criteria:

  • The task should indicate uncertainty or ambiguity in information need, or need for discovery.

  • The task should require knowledge acquisition, comparison, or discovery.

  • The task should provide a low level of specificity about the information required in the task and how to find such information.

  • The task should provide enough imaginative contexts for the study participants to be able to relate and apply the situation.

Figure 1. A task example.

The study used Microsoft Internet Explorer as the Web browser. Each user had two sessions: part A of each task was completed on both WIGI and the browser during the first session, and part B of each task was completed during the second session in the same application sequence. Each part of a task took approximately half an hour (up to two hours in total across both sessions). Figure 1 provides an example of a task used in the study.
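The paper does not spell out how tools and tasks were rotated across participants, so the following is only a minimal sketch of one way a counterbalanced schedule of this shape could be generated. The task labels `T1` to `T4`, the function name `build_schedule`, and the rotation scheme itself are assumptions made for illustration, not the authors' procedure.

```python
from itertools import cycle, permutations

TOOLS = ("WIGI", "browser")
TASKS = ("T1", "T2", "T3", "T4")  # hypothetical labels for the four study tasks


def build_schedule(n_participants):
    """Rotate participants through every (task pair, tool order) combination.

    Each participant gets two different tasks, one per tool. The same
    tool sequence is assumed to apply in both sessions (part A, then part B).
    """
    tool_orders = list(permutations(TOOLS))    # 2 possible tool orders
    task_pairs = list(permutations(TASKS, 2))  # 12 ordered task pairs
    combos = [(tools, tasks) for tasks in task_pairs for tools in tool_orders]
    schedule = []
    for pid, (tools, tasks) in zip(range(1, n_participants + 1), cycle(combos)):
        schedule.append({
            "participant": pid,
            "assignments": list(zip(tools, tasks)),  # [(tool, task), (tool, task)]
        })
    return schedule


schedule = build_schedule(30)
```

Because the tool orders alternate within the rotation, half of the participants in this sketch start with WIGI and half with the browser. With 30 participants and 24 combinations, the task pairs are only approximately balanced.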

The study used a pre-study questionnaire and two post-task questionnaires. The questionnaires collected data about the completion of the task from the user's perspective, the user's confidence, and any difficulties encountered during the task. Figure 2 (a, b) shows the cases compared in the study, with 2.a showing WIGI and 2.b showing the browser case while performing a task. Four main components of WIGI are shown in Figure 2.a: the search view, the browsing view, the editing area, and the reference (thumbnail) view used to track sources of information clicked by the user. Figure 2.b shows how a task was performed using the browser and Notepad.

Figure 2a. A task on WIGI.

Figure 2b. A task on the browser.

We conducted a first scan of the data logged during the study and the data accumulated in the questionnaires. The generic pre-study questionnaire collected data about issues and difficulties users have with information gathering on the Web. The difficulties users have with tools and strategies they use to gather Web information are summarized in Table 1.

Table 1. Pre-study questionnaire responses.

The data in Table 1 align with the recommendations from the study in Alhenshiri et al. (2012). Most of the responses regarded re-finding information during gathering tasks followed by issues dealing with handling multiple sessions for this type of task. The last portion of the results regarded issues with managing, editing, and organizing information.

STUDY RESULTS

Based on the research hypotheses, the results analyzed in this paper concern three subtasks of the information gathering task on the Web: re-finding information, handling multiple sessions, and managing information. The preliminary results revealed substantial differences between the features embedded in WIGI and the current browser model for gathering Web information.

The study logged all interactions with WIGI as well as interactions with the Web browser and any other tools used during the study. A total of 5436 activities were logged. Of the data collected in the study, Table 2 shows the criteria concerning the three main subtasks discussed in this paper and their associated types of activities.

Table 2. Data logged during the study.
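As a rough illustration of how logged activities could be grouped under the three subtasks, the sketch below tallies log entries per tool and subtask. The record format, the activity names, and the mapping in `SUBTASK_OF` are all hypothetical. The actual logging schema used in the study is not described in the paper.

```python
from collections import Counter

# Hypothetical mapping from logged activity types to the three subtasks.
SUBTASK_OF = {
    "revisit_link": "re-finding information",
    "open_bookmark": "re-finding information",
    "save_gathered": "handling multiple sessions",
    "save_text_file": "handling multiple sessions",
    "paste": "managing information",
    "format": "managing information",
    "open_tab": "managing information",
}


def tally(log):
    """Count logged activities per (tool, subtask); unmapped types are skipped."""
    counts = Counter()
    for entry in log:
        subtask = SUBTASK_OF.get(entry["activity"])
        if subtask is not None:
            counts[(entry["tool"], subtask)] += 1
    return counts


# Made-up log entries, for illustration only.
sample_log = [
    {"participant": 1, "tool": "WIGI", "activity": "revisit_link"},
    {"participant": 1, "tool": "WIGI", "activity": "revisit_link"},
    {"participant": 1, "tool": "WIGI", "activity": "paste"},
    {"participant": 2, "tool": "browser", "activity": "open_tab"},
    {"participant": 2, "tool": "browser", "activity": "scroll"},  # not mapped
]
counts = tally(sample_log)
```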

Re-finding Information

Re-finding is an activity that allows the user to re-locate the information and/or sources of information (Web pages) during a task. On WIGI, users re-visited links to information sources 186 times. On the browser, however, users re-opened two thirds (10/15) of the bookmarks they created during the previous session. They revisited links they kept in text files along with other information nine times. The ANOVA difference between the number of re-finding activities on WIGI and those on the browser was significant, F(1, 58) = 14.13, p < 0.0005.
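For readers unfamiliar with the statistic, the following sketch shows how a single-factor ANOVA F value with degrees of freedom (1, 58) arises from two groups of 30 per-participant counts. The function name `one_way_anova` and the two count lists are assumptions for illustration. The per-participant data from the study are not given in the paper, so the sketch does not reproduce the reported F value.

```python
def one_way_anova(*groups):
    """Return (F, df_between, df_within) for a single-factor ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical per-participant re-finding counts (30 values per condition).
wigi_counts = [i % 7 + 3 for i in range(30)]
browser_counts = [i % 3 for i in range(30)]
f_stat, df_b, df_w = one_way_anova(wigi_counts, browser_counts)
```

Two conditions give 2 − 1 = 1 degree of freedom between groups, and 60 observations give 60 − 2 = 58 within groups, which matches the F(1, 58) form reported in this section. If SciPy is available, `scipy.stats.f.sf(f_stat, df_b, df_w)` gives the corresponding p-value.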

Handling Multiple Sessions

To handle multiple sessions, users had to keep (save) information from the first session for use in the second session. Users kept information needed for the task at hand in addition to references to the information gathered. On WIGI, all participants kept information using the "save gathered" functionality. Information kept using this feature included information copied from Web pages, information entered by the user, and references to information.

On the browser, users kept the information in several formats, including text files, bookmarks, saved pages, and emails. Twenty-six users (26/30) used text files to keep the task information. Four users (4/30) used bookmarking, creating 15 bookmarks in total. Two pages were saved by two users (2/30). Four users (4/30) kept information by sending it to their own email accounts. Some users (4/30) used a combination of more than one strategy to keep the information. The difference between the proportion of participants who used text files and those who used other methods for keeping information while using the browser was significant (z-test, z = −4.18, p < 0.0001).
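The paper does not state which z-test variant was used or exactly which proportions were compared. As a hedged illustration only, the sketch below implements a pooled two-proportion z-test and applies it to the text-file count (26/30) and the bookmarking count (4/30) reported above. Treating these as two independent samples is an assumption, and the result is not expected to match the reported z = −4.18. The function names are hypothetical.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic for x1/n1 versus x2/n2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


def two_sided_p(z):
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Illustration: text files (26/30) versus bookmarks (4/30), assumed independent.
z = two_proportion_z(26, 30, 4, 30)
p = two_sided_p(z)
```

The sign of z depends only on which proportion is listed first, so a negative value such as the one reported simply reflects the order of comparison.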

Managing Information

During the task of information gathering on the Web, users manage, keep, and organize the information they collect for the task. The first part of each task in the study requested that information be kept for a subsequent session. On WIGI, none of the users used browser tabs. They used only the main window (shown in Figure 2.a). Users indicated that they did not need tabs because the thumbnail view kept them informed of which pages they had already opened. On the browser, users opened a total of 235 tabs.

Users pasted information from Web pages into the WIGI editor 208 times, more than twice as often as they did while using the browser (100 times). The difference between the number of paste activities on WIGI and the browser was significant, ANOVA, F(1, 58) = 5.65, p < 0.03. In addition, users copied 10 times from search result hits on WIGI, while they did not copy from result hits at all during the use of the browser. Users copied and pasted information within documents they created (15 times overall) while using the browser.

Users were asked to format the information they gathered for the tasks. On WIGI, users performed 520 formatting activities, compared with only 50 while using the browser and other supporting tools such as text editors and emails. The difference between the two cases using a single-factor ANOVA was significant, F(1, 58) = 23, p < 0.00002. Formatting involved activities such as changing text color and text size.

DISCUSSION AND FUTURE WORK

The preliminary results of the study indicate substantial differences between the features used in WIGI and those used in the case of the browser. Users of WIGI copied information from Web pages into the WIGI editor more frequently. The convenience of having the editor embedded alongside the search and browsing features may have motivated users to copy and paste information from pages being viewed in the browsing area. The need to switch between the browser and the editor may have discouraged users from copying and pasting, leading them to type more when using the browser. Moreover, users of WIGI performed substantially more formatting activities than users of the browser, who relied on text editors such as MS Word. Users of WIGI were able to see the original format of the information on Web pages along with the information they gathered in the editor without the need for switching.

To handle multiple sessions, all users used one feature (the save gathered in WIGI) that kept the context of the task. They lost no information between the two sessions. With the browser, 20% of the users (5/30) lost all the information they kept from the previous session. When attempting to complete the study in the second session, they had to re-do much of the first part of the task. Re-finding by re-visiting links to previously viewed pages was performed significantly more on WIGI than it was performed on the browser.

CONCLUSION

Since the study in Alhenshiri et al. (2012), difficulties users have with the task of information gathering on the Web have not changed much. The study discussed in this paper showed that there were substantial differences between features of WIGI and those of the ordinary browser that may have affected the effectiveness of the task of information gathering on the Web with respect to each subtask considered. Future work will involve further analysis of the study data to reveal further differences and to investigate the effectiveness of more features regarding each subtask.

REFERENCES

  • Alhenshiri, A., Shepherd, M., Watters, C., & Duffy, J. (2010). Web Information Gathering Tasks: A Framework and Research Agenda. Proceedings of the International Conference on Knowledge Discovery and Information Retrieval (KDIR 2010) (pp. 131–140). Valencia, Spain.
  • Amin, A. (2009). Establishing Requirements for Information Gathering Tasks. TCDL Bulletin of IEEE Technical Committee on Digital Libraries 5(2). 1937-7266.
  • Broder, A. (2002). A Taxonomy of Web Search. ACM SIGIR Forum 36(2), 2–10.
  • Kellar, M., Shepherd, M., & Watters, C. (2007). A Field Study Characterizing Web-based Information-Seeking Tasks. Journal of the American Society for Information Science and Technology 58(7), 999–1018.
  • Mackay, B., & Watters, C. (2008). Exploring Multi-session Web Tasks. Proceedings of the 2008 ACM Conference on Human Factors in Computing Systems (pp. 4273–4278). Florence, Italy.
  • Mankowski, T., & Watters, C. (2011). Webscraps: A Tool to Manage Web Information Gathering Tasks. MS thesis, Dalhousie University. Halifax, NS, Canada.
  • Rose, D., & Levinson, D. (2004). Understanding User Goals in Web Search. Proceedings of the 13th International Conference on World Wide Web (pp. 13–19). New York, NY, USA.
  • Sellen, A., Murphy, R., & Shaw, K. (2002). How Knowledge Workers Use the Web. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 227–234). Minneapolis, Minnesota, USA.
  • Spink, A. (1996). Multiple Search Sessions Model of End-User Behaviour: An Exploratory Study. Journal of the American Society for Information Science 47(8), 603–609.