Searcher's perceptions for query reformulation behavior on the web

Abstract

To better understand query reformulation behavior in Information Retrieval (IR), this study investigates searchers' perceptions of query reformulation during the information search process on the Web, using the Google search engine as a testbed to explore what factors might affect that behavior. The study employed two survey questionnaires and a search task. Twenty-seven subjects were recruited to participate. Based on the findings, precision and relevance seem to be two key factors that lead subjects to reformulate previous queries, along with the wish to compare search results. Interestingly, search satisfaction and confidence might be further factors when subjects consider query reformulations. Query reformulations might not only have broadened the search, depending upon the quality of the reformulations, but may also have provided opportunities to learn and come up with new ideas. The findings are expected to promote understanding of searchers' query reformulation behavior and thereby be of better assistance to both searchers and system developers.

INTRODUCTION

With the growing use of the Web for information retrieval, interaction between system and user has grown in importance. A query expresses an information need, representing the concept or topic that the searcher wants to examine or know more about. A query can be transformed into another query, which the searcher reformulates from the previous one and submits to the system in order to look for more relevant information (Hearst, 2009). The ideal search would be one input with one best-matching output for the user's information need. However, searchers do not get information from just one source; they pick up bits of information from many different sources while they explore. Bates (1989) refers to this process as “berrypicking.” According to Bates (1989), “the query is satisfied not by a single final retrieved set, but by a series of selections of individual references” (p. 410).

The search process begins with the information need. This information need is expressed by the searcher who will structure a concept and formulate it into a search strategy (i.e., a search query that can be understood by the information retrieval system). If a search is completed with an initial search query, there is no need to reformulate the previous query. Thus, the ideal search experience would include retrieval of the desired information with minimal effort at the time the search is performed, but this is not often achieved in real information seeking situations.

One of the searcher's burdens in information retrieval on the Web is query formulation and reformulation. Searchers are typically not experts at formulating queries, particularly when an information system's interface is not very intuitive or when retrieval yields poor results. Manning, Raghavan, and Schutze (2008) described the characteristics of Web search users as follows: “… web search users tend to not know (or care) about the heterogeneity of web content, the syntax of query languages and the art of phrasing queries; indeed, a mainstream tool (as web search has come to become) should not place such onerous demands on billions of people. A range of studies has concluded that the average number of keywords in a web search is somewhere between 2 and 3” (p. 432).

Query formulation (QF) is the first interaction that takes place when beginning a search using a search engine. Query reformulation (QR) is an iterative process that follows QF and may continue until the end of the search process. Past research has studied and analyzed user queries and QR behavior on the Web (Jansen, Spink, & Saracevic, 2000). Past studies have also explored the effectiveness and patterns of user queries using search log analysis (Jansen, 2006a; Jansen & Spink, 2005; Jansen, Spink, & Koshman, 2007; Jansen, Spink, & Pedersen, 2005) and transitions between queries (Belkin et al., 2001; Jansen, Zhang, & Spink, 2007; Klink, 2001; Rieh & Xie, 2006; Spink, Wolfram, Jansen, & Saracevic, 2001). Jansen, Zhang, and Spink (2007) found that about half of initial queries were modified during the search process and that most of these were reformulated into more specific queries.
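To make the idea of a query becoming "more specific" concrete, the following is a minimal sketch of a term-overlap heuristic for labeling the transition between two consecutive queries. It is an illustrative assumption, not the coding scheme used in the studies cited above: the function name `classify_reformulation`, the label names, and the example queries are all hypothetical.

```python
def classify_reformulation(previous: str, current: str) -> str:
    """Label the move from one query to the next by comparing term sets.

    Hypothetical heuristic: whitespace tokenization, case-insensitive,
    no stemming or stop-word handling.
    """
    prev_terms = set(previous.lower().split())
    curr_terms = set(current.lower().split())

    if prev_terms == curr_terms:
        return "same"
    if prev_terms < curr_terms:
        # Every old term kept, at least one term added: a narrower query.
        return "specialization"
    if curr_terms < prev_terms:
        # Terms only removed: a broader query.
        return "generalization"
    if prev_terms & curr_terms:
        # Some terms kept, some swapped for others.
        return "substitution"
    return "new"


print(classify_reformulation("query reformulation", "query reformulation web search"))
print(classify_reformulation("query reformulation web search", "query reformulation"))
print(classify_reformulation("query reformulation", "query expansion"))
```

Under this heuristic, adding terms to a query counts as making it more specific; real log analyses typically need richer rules (spelling corrections, synonyms, operators) than this sketch covers.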

This study investigates searchers' perceptions of query reformulation behavior, exploring when, why, and in what ways query reformulation was needed.

RESEARCH QUESTIONS

The following research question and three sub-questions guided this study.

What are the factors that might affect searchers' query reformulation behavior during the information search process?

  • When are query reformulations needed?

  • Why are query reformulations needed?

  • In what ways do query reformulations affect the results during a search?

METHODOLOGY

The study employed two survey questionnaires and a search task, administered in the following order: a pre-task survey, the search task, and a post-task survey.

The pre-task survey collected demographic data and information about previous experience including computer, web-based searching, and Google search engine experience. It was administered to each subject at the beginning of the search task session.

The search task was given to subjects to search for relevant documents until they were satisfied with the search results using FSU bibliographic databases.

At the completion of the task, the post-task survey asked questions to examine what factors might have affected subjects' query reformulation behavior during the search task. It collected data about subjects' experiences with query reformulations during the task. Questions about query reformulations solicited responses about when, why, and in what ways query reformulation was needed.

SUBJECTS

The researcher recruited a total of 27 subjects from students enrolled at the School of Library and Information Studies (LIS) at Florida State University (FSU). The target population was university students. The subjects included undergraduate and graduate students with varying levels of computer and Internet-searching skills. The researcher purposively recruited subjects from LIS, ranging from undergraduates to graduates, because they are considered experienced and trained searchers and may be familiar with terms used in this study such as information retrieval, precision and recall, or relevance.

PROCEDURES

The study consisted of the following activities, which each subject completed in order in a lab setting:

  • Introduction and a practice session;

  • Break;

  • Complete a pre-task questionnaire;

  • Break;

  • Complete a search task;

  • Break;

  • Complete a post-task questionnaire.

First, when the subjects arrived in the lab, the researcher briefly introduced the purpose of the study and the testbed (Google search engine), and provided a practice session. After the practice session, a pre-task survey was administered, after which the search task began. Subjects were encouraged to complete the search task in order to find information that met the goals of the task. However, if they were satisfied with the search results, they could stop the task at any time. If for some reason they were not comfortable with the task during the search process, they could also quit at any time. Following the completion of the search task, subjects were asked to complete the post-task questionnaire by selecting the best answer among the options for each question. Between sessions, subjects were given a break to avoid any carryover effects.

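Table 1. Gender, Age, and Degree Sought.
| Variable | Range | Frequency (%) |
|---|---|---|
| Gender | Male | 10 (37%) |
| | Female | 17 (63%) |
| Age | Less than 20 | 1 (3.7%) |
| | 21–30 | 16 (59.3%) |
| | 31–40 | 6 (22.2%) |
| | Over 41 | 4 (14.8%) |
| Degree | BA/BS | 14 (51.9%) |
| | MS/MA | 5 (18.5%) |
| | Ph.D. | 8 (29.6%) |
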
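Table 2. Characteristics of subjects.
| Variable | Range | Frequency | Percentage |
|---|---|---|---|
| Computer Knowledge | Average | 10 | 37.0 |
| | High | 12 | 44.4 |
| | Very high | 5 | 18.5 |
| Websearch Knowledge | Average | 9 | 33.3 |
| | High | 11 | 40.7 |
| | Very high | 7 | 25.9 |
| Websearch Skills | Low | 1 | 3.7 |
| | Average | 7 | 25.9 |
| | High | 15 | 55.6 |
| | Very high | 4 | 14.8 |
| Google Use Frequency | At least once a week | 6 | 22.2 |
| | Daily | 6 | 22.2 |
| | More than once a day | 14 | 51.9 |
| | Weekly | 1 | 3.7 |
| Google Query Reformulation | Rarely | 3 | 11.1 |
| | About 1 time | 7 | 25.9 |
| | About 2 times | 8 | 29.6 |
| | More than 3 times | 9 | 33.3 |
| Google Search Success | Rarely | 2 | 7.4 |
| | Sometimes | 1 | 3.7 |
| | Usually | 16 | 59.3 |
| | Always | 8 | 29.6 |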

As shown in Table 2, subjects' self-reported computer knowledge and web search skills were generally average or above, as was to be expected since they were enrolled in LIS programs that require these skills. About half, 51.9% (14/27), reported that they use the Google search engine for their daily information seeking on the Web. It is interesting to see that 88.9% (24/27) reported that they reformulate queries at least once: about one time (25.9%, 7/27), about two times (29.6%, 8/27), or more than three times (33.3%, 9/27). However, 11.1% (3/27) reported that they rarely reformulate queries when using the Google search engine for their information seeking.

In addition, 88.9% (24/27) reported that they usually or always search successfully when using the Google search engine for their daily information seeking.
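The two aggregate figures of 88.9% can be reproduced from the counts in Table 2. The short sketch below shows that arithmetic; the counts are copied from the table, while the helper names `pct` and `combined` are hypothetical and introduced only for illustration.

```python
# Counts copied from Table 2 (n = 27 subjects).
N = 27
reformulation = {"Rarely": 3, "About 1 time": 7, "About 2 times": 8, "More than 3 times": 9}
success = {"Rarely": 2, "Sometimes": 1, "Usually": 16, "Always": 8}


def pct(count, total=N):
    """Percentage of the sample, rounded to one decimal place."""
    return round(100 * count / total, 1)


def combined(counts, categories):
    """Sum the counts for several categories and return (count, percentage)."""
    n = sum(counts[c] for c in categories)
    return n, pct(n)


# Subjects who reformulate at least about once.
reformulators = combined(reformulation, ["About 1 time", "About 2 times", "More than 3 times"])
# Subjects who usually or always search successfully.
successful = combined(success, ["Usually", "Always"])

print(reformulators)  # (24, 88.9)
print(successful)     # (24, 88.9)
```

Both aggregates come out to 24 of 27 subjects, which is why the same percentage appears twice in the text.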

In order to examine factors for query reformulation behavior, the following questions were asked in the post-survey after completion of the search task.

  • When do you think you needed to reformulate your queries?

    • 1. When there were too many search results;
    • 2. When there were too few search results;
    • 3. When the search results are not satisfactory;
    • 4. To find more relevant information;
    • 5. I don't know.

  • Why do you think you reformulated your previous queries?

    • 1. To enhance relevance;
    • 2. To enhance recall;
    • 3. To enhance precision;
    • 4. To make a comparison between the search results of queries;
    • 5. To feel more satisfied and confident.

  • In what ways did the query reformulations affect the search results throughout a search?

    • 1. It improved the search results;
    • 2. It narrowed the search results;
    • 3. It broadened the search results;
    • 4. It inspired new thinking/ideas;
    • 5. It wasn't helpful; it wasted my time.

As shown in Figure 1, for the first post-survey question, 37% (10/27) answered that they reformulated queries when they wanted to find more relevant information, while 25.9% (7/27) answered that they did so when their search results were not satisfactory and 25.9% (7/27) when there were too many search results. Based on these responses, the need for more relevant information, an excess of search results, and dissatisfaction with the results were the three most often stated occasions for reformulating queries.

As shown in Figure 2, for the second post-survey question, 40.7% (11/27) reported that they reformulated queries because they wanted to increase precision, while 37% (10/27) answered that they did so because they wanted to increase relevance. Interestingly, 18.5% (5/27) reported that they reformulated queries because they wanted to make a comparison between search results. Based on these responses, relevance and precision were the two most often stated reasons for reformulating queries.

As shown in Figure 3, for the last question, 48.1% responded that query reformulations improved the search results. Another 18.5% answered that query reformulation helped to narrow the search results, while 11.1% reported that it broadened them. Interestingly, 7.4% reported that query reformulation was not helpful and was a waste of their time. It is also interesting to see that 14.8% answered that query reformulation helped inspire new thinking and ideas. This suggests that some subjects expected to learn new information during the search process. It could mean that query reformulations might not only have broadened the search, depending upon the quality of the reformulations, but may also have provided opportunities to learn and come up with new ideas. Overall, improving and narrowing the search results were the two most often stated ways in which query reformulation affected a search.
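The percentages for the last question are reported without raw counts. As a rough consistency check, the sketch below backs out the counts they imply for a sample of 27. The implied counts are derived here and are not figures reported by the study; the dictionary labels are shortened paraphrases of the answer options.

```python
N = 27  # number of subjects

# Percentages as reported in the text for the third post-survey question.
reported = {
    "Improved the search results": 48.1,
    "Narrowed the search results": 18.5,
    "Inspired new thinking/ideas": 14.8,
    "Broadened the search results": 11.1,
    "Was not helpful": 7.4,
}

# Nearest whole number of subjects implied by each percentage (derived, not reported).
implied = {label: round(p * N / 100) for label, p in reported.items()}

# Each implied count should round back to the reported percentage.
consistent = all(round(100 * count / N, 1) == reported[label] for label, count in implied.items())

for label, count in implied.items():
    print(f"{label}: {count}/{N}")
print("total:", sum(implied.values()), "consistent:", consistent)
```

The implied counts sum to the full sample, so under this reading the five answer options account for every subject.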

Figure 1. Summary of descriptive statistics after the search task for the question “When do you think you needed to reformulate your queries?”

Figure 2. Summary of descriptive statistics after the search task for the question “Why do you think you reformulated your previous queries?”

Figure 3. Summary of descriptive statistics after the search task for the question “In what ways did the query reformulations affect the search results throughout a search?”

CONCLUSION

The findings of this study suggest that precision and relevance are two major factors that affect searchers' perceptions of query reformulation behavior on the Web when using the Google search engine. Searchers' satisfaction and confidence with search results seem to be another reason for making query reformulations. Interestingly, searchers might expect to learn new ideas and broaden their thinking as they reformulate queries throughout a search.
