Investigating variation in querying behavior for image searches on the web

Authors


  • Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission from the author.

Abstract

This study analyzed query iterations and query length during image searching on the Web using a naturalistic approach. It examined whether query iterations varied across contextual characteristics in an interactive Web searching process and whether task goals and the content sources in which queries were issued influenced query length and retrieval actions. The findings showed that query iterations differed significantly by task goal, working stage, and search expertise. Query length differed significantly across content sources and was correlated with retrieval actions on image search engines. The study discusses implications and areas for future investigation of contextual factors and patterns of query modification in understanding interactive information searching behavior for image search and retrieval.

INTRODUCTION

Search engine services are a popular means for information searching. They provide a simple and direct way of searching information for various resource types, not only textual resources, but also multimedia. Most search engines present similar interfaces allowing people to: submit a query; receive a set of results; follow a link; explore the information space; and modify a query. This process is generally repeated during interactive searching. The popular use of search engine services has led to many investigations of general search habits on the Web. Querying behavior – query formulation and reformulation – has especially been an active area of research in information retrieval (Jansen, Booth, & Spink, 2009). In order to design system features to assist in query generation and modification during an interactive searching process, researchers have been interested in how users construct and reformulate a query to get a better set of results by using transaction log data. Such studies demonstrate queries as evidence of the interactive process and provide a broad understanding of querying behaviors; however, they cannot identify the influence of a user's context in so far as it affects a user's querying behavior (Huang & Efthimiadis, 2009; Liu, Gwizdka, & Belkin, 2010; Toms et al., 2008). They also fail to reveal any relationships between query iterations and retrieval effectiveness, i.e., whether more query iterations increase users' satisfaction with a set of results.

Aula (2003) suggested that several factors including familiarity with the search environment, search engine expertise, domain expertise, and type of search made a difference in query formulation behaviors. She further believed that they should be taken into account when studying and designing information searching systems. Ingwersen and Järvelin (2005) characterized the notion of a work task as the driving force underlying information seeking and retrieval as an information seeker performs a sequence of information seeking and retrieval activities to obtain information for a perceived work task. These include a non-job-related task as well as a job-related task. Ingwersen and Järvelin promoted the idea that information retrieval research must collect contextual elements related to the work task and then examine how information retrieval interactions are performed as well. Thus, examining differences and the influence of the factors related to searchers and tasks on querying and performance during an interaction is necessary to get an insight on how querying behavior and performance vary on such factors. By studying the impact of variation in a user's contextual characteristics when searching, it may be possible to develop search systems that are tailored to users' needs and different types of situations affecting users' searches.

The Web environment and search engine services increase availability and accessibility of digital images. Search engines are major sources for users in searching for images (Green, 2006; Harley, 2006; Pisciotta, 2005; Shonfeld, 2006). In carrying out image searches through search engines, users employ keywords. On a search engine interface, users generally conduct their text-based image searches and refine their efforts utilizing a display of thumbnails for browsing.

While several studies reveal general characteristics of image searching based on transaction log data, little has been investigated concerning whether image searching behavior, especially querying behavior – query iterations and query length – differs based on a user's contextual aspects and the different sources of collections on Web search engines. Despite the growth of available digital images and easy search capabilities on the Web (Kherfi, Ziou, & Bernardi, 2004), there are concerns with finding, accessing, and sharing digital images (Harley et al., 2006; Green, 2006). As the number of digital images continues to grow, users will become increasingly independent and will be challenged to access, view, and retrieve images with textual keywords. Analyzing the interaction process, especially query formulation and modification, for image searches is essential for understanding how querying behavior varies with a user's contextual factors. The resulting analysis may inform the design of an intuitive and contextualized search environment that supports users' different search characteristics.

The purpose of this study is to examine how querying behavior varies with contextual factors that potentially affect query iterations. Further, it analyzes the effect of query length on a user's retrieval actions. The following hypotheses are tested accordingly:

  • Task goals, working stage, search expertise, and topic familiarity affect the number of query iterations.

  • The number of query iterations affects a user's perception of the search results.

  • There is a relationship between the number of query iterations and a user's perception of the search results.

  • Task goals and content sources of search engines affect query length.

  • There is a relationship between query length and retrieval actions in a general web search interface and in an image search interface.

The remainder of this paper presents related studies, the research method employed in the study, the findings, and a discussion of their implications.

RELATED STUDIES

Studies of Search Queries

Many studies have reported querying behavior on search engines using transaction log data (Jansen & Pooch, 2001; Jansen & Spink, 2005; Park, Bae, & Lee, 2005). The results of these studies show common user habits; for example, users type in short queries with a few query terms, seldom use advanced features, and view few results pages. Other studies of query logs explored and classified query reformulation types (Jansen, Booth, & Spink, 2009; Jansen, Zhang, & Spink, 2007; Lau & Horvitz, 1999; Rieh & Xie, 2006). They found that searchers changed many of their queries to express their information need more precisely. Further, they demonstrated common patterns of query modification.

Studies using transaction log data help explain how people form search queries on the Web. However, transaction log data does not reveal how recorded queries and retrieval actions were elicited. Aula (2003) used a questionnaire with specified search tasks to examine the factors affecting query formulation in Web information searches. She found that there was a positive correlation between Web experience and the average length of the formulated queries. A user's computer experience, including use of Web search engines, affected the query formulation process; however, domain expertise did not have an effect on the query formulation. Experienced users created longer and more specific queries whereas the queries of users with less experience consisted of fewer and more generic terms. Based on the observations of seven computer science researchers doing work-related searches, Aula and Käki (2003) also reported that participants with more work experience used longer and more refined queries. They further noted that the researchers tended to employ more advanced search strategies than the less experienced searchers. Toms et al. (2008) observed differences of task attributes on query aspects with an experimental system and assigned search tasks. They found that queries varied significantly in length according to task type. Also, the participants varied in the number of pages viewed and the number of unique pages viewed by task type.

Recently, researchers examined correlations between query reformulation strategies and click action as well as the effect of task types on query reformulation. Huang and Efthimiadis (2009) investigated how users' click-through behaviors varied by different types of reformulation based on click data of the AOL search engine's query logs. Their study discovered that different reformulation strategies were effective depending on the action from the initial query. Liu, Gwizdka, and Belkin (2010) investigated the frequency of query reformulation types and how it was related to task type and individual differences. Their results indicated that specialization and word substitution were the two most frequent query reformulation types. Also, the type of search task had a significant effect on the type of query reformulation. In contrast to traditional term-based approaches, Hollink, Tsikrika, and Vries (2010) analyzed semantic relations between queries and identified common patterns. Their analysis shows that many people searched for two entities with some common property, and there were variant names for the same entity or two entities of the same type.

It is known that long queries are a better way of expressing complex and specific information needs than short queries (Lau & Horvitz, 1999; Phan, Bailey, & Wilkinson, 2007). However, little is known about whether query length influences retrieval performance, such as browsing behavior, in an interactive process. Wolfram (2000), using Excite search data, investigated whether query construction or session lengths influenced browsing behavior. He found that browsing persistence did not seem to be affected by the amount of effort (query size) invested by the user; however, there appeared to be a small decrease in the average number of pages viewed with an increase in the number of terms. He concluded that users were not willing to invest significant extra effort in their page viewing, regardless of the number of queries or information needs they wished to fill. With a prototype system, Belkin et al. (2003) found that the relationship between query length and search performance was not statistically significant, although query length was associated with user satisfaction. Phan, Bailey, and Wilkinson (2007) concluded that the degree of specificity of a retrieval request might correspond to the length of a search query in a national government search engine. They showed that query length was associated with the specificity of a user's information goal, with longer queries generally associated with more specific information goals. Recently, Bendersky and Croft (2009) analyzed long queries in a large scale search log of TREC data to gain insight into how people formulate long queries and how they behave in response to their results. They found that users tended to click lower in the result list for long queries than for short ones. They concluded that searches with short queries were generally more effective than searches with long queries.

Studies of Image Queries

Researchers have focused on image searches, queries, and search strategies on the Web by analyzing search logs from general or commercial search engines (Goodrum & Spink, 2001; Jansen, 2008; Jansen, Spink, & Pedersen, 2005; Jorgensen & Jorgensen, 2005; Ozmutlu, Spink, & Ozmutlu, 2003; Tjondronegoro, Spink, & Jansen, 2009), search engines in China and Taiwan (Pu, 2005), and the Google Answers ‘ask-an-expert’ online reference system (Cunningham, Bainbridge, & Masoodian, 2004). Their findings suggest that Web users type in short queries not only when searching for textual information, but also when searching for visual information. Query modification is important in image searching (Goodrum & Spink, 2001; Jorgensen & Jorgensen, 2005). Pu (2005) and Jansen (2008) found that users' image queries tended to focus on people and people-related topics.

A few studies also analyzed multimedia searching characteristics on the Web among different content collections on search engines (Jansen et al., 2005; Spink & Jansen, 2006; Ozmutlu, Spink, & Ozmutlu, 2003; Tjondronegoro, Spink & Jansen, 2009). The findings show that image search is the most popular. Both the mean terms per query and the session lengths for image searching are larger than the other categories of multimedia searching.

A few studies have provided additional image searching perspectives using different methods of data collection. Cunningham and Masoodian (2006) discussed casual image information seeking behavior based on their surveys. They found that 2.24 was the average number of terms in a query and that search engines were useful in finding appropriate terms to use in searching Google Images or in browsing image websites. Goodrum, Bejune, and Siochi (2003) explored image search patterns (state transitions) of graduate students on the Web. They found that the subjects input two queries, spent 20 minutes searching per image, and changed their initial queries frequently. The transition patterns revealed that longer strings and lengthier search times occurred when users searched for images using text-only search tools, which retrieved lists of website surrogates rather than image surrogates. In an experimental study investigating image search strategies on the Web, Fukumoto (2004) showed that Web page viewing operations, actions, time, keyword input, and keyword uniqueness differed by task. In an experimental study of image search interfaces with a combination of textual and visual search methods, Westman, Lustila, and Oittinen (2008) found that 84.5% of all first queries were modified; purely content-based searches accounted for only 5% of queries; text queries included an average of 1.30 terms. They found that task types resulted in different behaviors in task time, querying, and viewing, and that user background significantly affected the types of queries constructed. Choi and Hsieh-Yee (2010) confirmed similar characteristics of image query formulation and modification strategies. Their experimental study shows that the average number of terms per query was 3.12. The most frequent strategy in modifying a query was replacing one term with another. Their study also suggests that tasks and the type of image users are asked to search for are likely to affect search query formulation.

Summary

Ample studies demonstrate that query formulation is an essential activity and users change their queries frequently during a Web information search. Additionally, previous studies suggest that a range of factors, such as the characteristics of the searchers and the nature of task goals, have an impact on a user's querying behavior. It seems clear that a user's contextual variables should be incorporated into studies of user querying behavior in an interactive Web search process. This could complement assessing query log data in order to fully understand a user's interactive searching process and better enable the design of effective retrieval systems.

While several studies have reported how users formulate image queries and patterns of reformulation on the Web, little is known about how user characteristics at the time of searching affect query formulation and retrieval actions. Investigating a user's querying and searching behavior in a naturalistic setting would provide insight into the real interaction of users when performing image searches. This would, in turn, identify important system interface features for users.

METHOD

Participants and Procedure

Twenty-nine college students were recruited for data collection at a private university: 22 students in Media Studies and 7 students from other fields taking courses in the Department; 22 females and 7 males, with an average age of 21 years. In the recruitment process, participants agreed to take part in three search sessions. At each search session, participants performed searches for one search task on their own. To analyze the effect of the three search sessions on performance variables and search actions, an ANOVA test was conducted. The results showed no difference among the three search sessions. Because, in general, the performance and other searching variables were not associated with the search session, each search session was treated as an independent case. The total number of search sessions was 87.

The study was conducted with a naturalistic approach based on the participants' own aptitudes and at their own pace with their own search goals. Each participant filled out a background questionnaire to collect demographic data and a pre-search questionnaire to collect search task, topic, self-rated topic familiarity, and working stage, using a 7-point Likert scale. Search sessions were performed at the information commons on the researcher's laptop, with an Internet connection. The university's home page was set as the default home page for the Web browser. The participants chose either Internet Explorer or Firefox as their browser. The search session was recorded with Camtasia v5.0 software. There was no time limit for a search session. The participants were free to end a search session at any time. After the session, a post-search questionnaire was given to the participants to assess the level of usefulness, satisfaction, and confidence of their searching using a 7-point Likert scale (e.g., 1= “not useful,” 7= “very useful”). Once the first search session was completed, the second search session was scheduled within two weeks. To reduce the learning effect, there was a break of at least one week between search sessions. The total search length of the participants was 94,770 seconds, equal to 26 hours 19 minutes 30 seconds (an average search length of 1,081.90 seconds, equal to about 18 minutes).

The analysis and results presented in this paper are a subset of a larger study reporting the influences of several contextual factors on interactive search processes for images on the Web (Choi, in press).

Measures and definitions for data collection and analysis

The study collected the following attributes as variables.

Contextual elements

  • Task goals - the goal is the reason or activity that prompts the need to search for an image in a real daily-life situation. In the study, three task goals were identified: images were needed for course-related projects or research (academic task goal, 60.92%, N=53), for a personal interest (personal interest goal, 20.69%, N= 18), or for work-related activities but not course related (work-related task goal, 18.39%, N= 16).

  • Searching expertise - the level of expertise of subjects' online searching, ranging from Novice (1) to Expert (7).

  • Topic familiarity - subjects' current state of knowledge about a topic, ranging from Not very familiar (1) to Very familiar (7).

  • Work task stage - subjects' assessment of their progress in completing a task, ranging from Starting (1) to Finishing (7).

Retrieval performance

  • Query iterations – each instance of issuing a query during a search on a single topic (number of iterations = number of queries in a search session) (Belkin et al., 2003)

  • Query length – number of words in a query. A phrase enclosed in quotation marks was counted as one word.

  • Retrieval actions - browsing, viewing, and saving actions taken from the results page returned for an issued query. For each query, the counts of the retrieval actions defined in Table 1 were collected and summed into a total number of retrieval actions.
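The query length measure defined above can be sketched in a few lines. This is only an illustration, not the study's actual counting procedure: the helper name `query_length` is hypothetical, and it assumes that "a query with a quotation mark" means a phrase enclosed in double quotes, which is then counted as a single word.

```python
import re

# Hypothetical helper: the study does not publish its counting script.
# A double-quoted phrase is matched first, otherwise a run of non-space characters.
_TOKEN = re.compile(r'"[^"]*"|\S+')


def query_length(query: str) -> int:
    """Count the words in a query, treating a double-quoted phrase as one word."""
    return len(_TOKEN.findall(query))


if __name__ == "__main__":
    for q in ['"golden gate bridge" sunset', "eiffel tower night"]:
        print(q, "->", query_length(q))
```

Under this assumption, a quoted phrase plus one keyword counts as two words, while three unquoted keywords count as three.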

Participants' perception of search results

  • Satisfaction - the extent of how satisfied the subjects were with the search results

  • Confidence - the extent the subjects were confident with the completion of the image search results

  • Usefulness - the extent to which the subjects believed the retrieved image(s) helped to complete the task

The results were analyzed using descriptive statistics, analysis of variance (ANOVA), Spearman Rank Order Correlation, and Pearson's Correlation.

Table 1. Definitions of retrieval actions
| Action | Definition |
|---|---|
| SEG_Results | The total number of Web search result pages a participant checked in a list of search results given by a search engine in response to a search query |
| SEI_Results | The total number of image search result pages a participant checked in a list of search results given by a search engine |
| Image_clicking | Clicking on an image thumbnail to view an image search results page via a general search engine image search |
| Following a link | Clicking on a link in a search results page via a general search engine |
| Local_pages checking | The number of pages viewed/checked in a local Web page/site |
| Saving | Saving an image/Web page on computer, copying and pasting an image/URL in a Word document/PowerPoint slides, e-mailing/bookmarking URL, posting it on a social networking site |

RESULTS

Query iteration

During a search session, participants actively issued queries on image search engines as well as general Web search engines and local sites in searching for images. Across all three content sources, a total of 978 queries were obtained, each recorded as the entire string of terms submitted by the participant. The average number of query iterations across the 87 sessions was 11.24 (maximum 39, SD=8.73).

While image search engines served as the main tool for image searching, participants also used general Web search engines in searching for images. Of the total queries, 365 (37.32%) were issued in general search engines, whereas about fifty percent (483 queries) were issued in image search engines. Table 2 shows the average number of query iterations per session on general search engines, search engines' image search services, and local sites. It indicates that query iterations occurred more frequently on image search engines than in other content spaces.

Web users frequently modify their queries (Jansen, Booth, & Spink, 2009), as do users searching for images (Goodrum, Bejune, & Siochi, 2003; Jorgensen & Jorgensen, 2005; Westman, Lustila, & Oittinen, 2008). This study found a similar trend. However, the average number of query iterations during a session in this study (11.24 queries) is much higher than those in previous studies of textual search sessions, e.g., 7.7 queries in Aula and Käki's (2003) study and 7.8 queries in Toms et al.'s (2008) study. Also, image searches in Westman, Lustila, and Oittinen's (2008) study yielded an average number of queries issued in a task of 5.6 for professionals and 4.9 for non-professionals.

Table 2. Average of query iterations on different content sources
| Source | N of queries (N=978) | Mean of query iterations | Standard Deviation (SD) |
|---|---|---|---|
| General search engines | 365 | 4.20 | 4.59 |
| Image search engines | 483 | 5.55 | 6.00 |
| Local sites | 130 | 1.49 | 2.72 |
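The means in Table 2 and the percentages quoted in the text follow directly from the query counts. The sketch below reproduces that arithmetic; it assumes each mean is the source's query count divided by all 87 sessions (including sessions in which a source was not used), an assumption the reported figures are consistent with. The function names are illustrative only.

```python
# Query counts per content source, taken from Table 2.
QUERY_COUNTS = {
    "General search engines": 365,
    "Image search engines": 483,
    "Local sites": 130,
}
N_SESSIONS = 87  # 29 participants x 3 search sessions


def per_session_means(counts, n_sessions):
    """Mean number of query iterations per session for each source."""
    return {src: round(n / n_sessions, 2) for src, n in counts.items()}


def shares(counts):
    """Percentage of all queries issued on each source."""
    total = sum(counts.values())
    return {src: round(100 * n / total, 2) for src, n in counts.items()}


if __name__ == "__main__":
    print(per_session_means(QUERY_COUNTS, N_SESSIONS))
    print(shares(QUERY_COUNTS))
    print(round(sum(QUERY_COUNTS.values()) / N_SESSIONS, 2))  # overall mean per session
```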

Factors affecting query iterations

Query iterations were analyzed against task goals, topic familiarity, search expertise, and work task stage using ANOVA tests. The results indicated that query iterations differed significantly by task goal (F=7.81, p=0.00), work task stage (F=5.99, p=0.00), and search expertise (F=6.72, p=0.00), while there was no significant difference by topic familiarity (F=0.82, p=0.56). Image searches for academic task goals produced more query iterations (mean=13.96) than those for work task goals (mean=7.88) and personal interests (mean=6.22). Searches at work task stages closer to completion yielded more query iterations. Users with lower self-rated search expertise changed queries more often than those with higher search expertise.
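For readers unfamiliar with the test, the F statistic of a one-way ANOVA compares the variance between group means with the variance within groups. The sketch below is a minimal pure-Python version; the per-session data are not published, so the three groups in the usage example are made-up numbers that only illustrate the calculation, and `one_way_anova_f` is a hypothetical helper rather than the software used in the study.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of numeric groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    group_means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n

    # Between-group sum of squares: group size times squared deviation of its mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-group sum of squares: squared deviations from each group's own mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


if __name__ == "__main__":
    # Made-up query-iteration counts for three task goals (illustration only).
    hypothetical = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
    print(one_way_anova_f(hypothetical))
```

A larger F means the group means differ more than the within-group spread would suggest; the p-value is then read from the F distribution with (k−1, n−k) degrees of freedom.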

Effects of query iterations on perceptions with results

To test for significant differences in query iterations across the levels of satisfaction, confidence, and usefulness that participants perceived in the search results, the responses were regrouped into three categories: Low (1–3), Medium (4), and High (5–7). Regrouping was necessary because at least one of the original response groups had fewer than two cases, which prevented post hoc tests from being performed.
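The regrouping rule can be written down directly. The sketch below is illustrative; `regroup_likert` is a hypothetical name, and the cut points are the ones stated in the text (1–3, 4, 5–7).

```python
def regroup_likert(score: int) -> str:
    """Map a 7-point Likert response to the Low / Medium / High categories."""
    if not 1 <= score <= 7:
        raise ValueError(f"expected a response between 1 and 7, got {score}")
    if score <= 3:
        return "Low"
    if score == 4:
        return "Medium"
    return "High"


if __name__ == "__main__":
    print([regroup_likert(s) for s in range(1, 8)])
```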

The ANOVA test results with the three response categories identified significant differences in query iterations across levels of satisfaction (F=9.02, p=0.00), confidence (F=11.20, p=0.00), and usefulness (F=8.06, p=0.00). When participants produced fewer query iterations during a search session, they tended to report a high level of satisfaction, confidence, and usefulness for the search results. The mean number of query iterations at the high level of satisfaction (M=9.64, SD=7.37), confidence (M=9.06, SD=7.40), and usefulness (M=10.01, SD=7.88) was below the overall average (11.24), while the mean number of query iterations at the medium or low level of satisfaction, confidence, and usefulness was above the average.

To see whether query iterations correlate with the level of retrieval success perceived by participants, Spearman Rank Order Correlation was used to test the relationship between the number of query iterations and the level of satisfaction, confidence, and usefulness. As shown in Table 3, the results suggest a negative relationship: as the number of query iterations increases, the level of satisfaction, confidence, and usefulness decreases.

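Table 3. Spearman's Rho test results on query iterations and perception
| | Satisfaction | Confidence | Usefulness |
|---|---|---|---|
| Query iterations | 0.00** (rs=−0.35) | 0.00** (rs=−0.42) | 0.00** (rs=−0.34) |

Each cell shows the p-value, with the correlation coefficient in parentheses.
**. Correlation is significant at the 0.01 level (2-tailed).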
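Spearman's rho is Pearson's correlation computed on ranks, with tied values sharing their average rank. The sketch below shows that computation in pure Python. The session-level data are not available, so the example values are invented for illustration, and the function names are hypothetical rather than taken from the study's analysis software.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Invented example: more query iterations paired with lower satisfaction.
    iterations = [3, 5, 8, 12, 20]
    satisfaction = [7, 6, 6, 4, 2]
    print(round(spearman_rho(iterations, satisfaction), 3))
```

A negative rho, as in Table 3, means that sessions ranked higher on query iterations tend to be ranked lower on the perception measure.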

Search query length

The average number of terms per query was 3.25 (SD=1.69). Tables 4 and 5 present the mean number of terms per query by source and by task goal. Queries on local sites were shorter than queries on general search engines and image search engines. The average length of queries on general search engines was slightly longer than that on image search engines. Queries for the academic task goals and personal interest goals were longer than those for the work task goals.

Table 4. Average of Query Length on Different Sources
| Source | N of queries | Mean of query length | SD |
|---|---|---|---|
| General search engines | 365 (37.32%) | 3.46 | 1.66 |
| Image search engines | 483 (49.39%) | 3.37 | 1.73 |
| Local sites | 130 (13.29%) | 2.23 | 1.22 |

Table 5. Average of Query Length on Task goals
| Type of task goal | N of queries | Mean of query length | SD |
|---|---|---|---|
| Academic task | 740 (75.66%) | 3.29 | 1.74 |
| Work task | 126 (12.88%) | 2.98 | 1.39 |
| Personal interest | 112 (11.45%) | 3.29 | 1.64 |
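As a consistency check, the overall mean of 3.25 terms per query can be recovered from either table as a mean weighted by the number of queries in each group. The sketch below does this with the published counts and means; the helper name `weighted_mean` is illustrative.

```python
def weighted_mean(groups):
    """groups: iterable of (n_queries, mean_query_length) pairs."""
    groups = list(groups)
    total_n = sum(n for n, _ in groups)
    return sum(n * m for n, m in groups) / total_n


# (N of queries, mean query length) from Tables 4 and 5.
BY_SOURCE = [(365, 3.46), (483, 3.37), (130, 2.23)]
BY_TASK_GOAL = [(740, 3.29), (126, 2.98), (112, 3.29)]

if __name__ == "__main__":
    print(round(weighted_mean(BY_SOURCE), 2))
    print(round(weighted_mean(BY_TASK_GOAL), 2))
```

Both groupings give the same overall mean after rounding, which is what one expects when the two tables partition the same 978 queries.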

Effects of task goals and content sources on query length

To see the influence of sources and task goals on query length, an ANOVA test was used. The test results show that query length differed significantly across sources (F=29.31, p=0.00), while it did not differ across task goals (F=1.93, p=0.15). Results of a Tukey test indicate that the mean query length on local sites (M=2.23, SD=1.22) is significantly different from that on either general search engines or image search engines, while the mean query length on general search engines (M=3.46, SD=1.66) and the mean on image search engines (M=3.37, SD=1.73) did not differ significantly from one another.

Query length and Retrieval Actions

Table 6 shows the mean number of retrieval actions per query issued on the different search engine content sources: the number of results pages viewed, the number of pages and the number of thumbnail images selected and viewed from the results pages, the unique page views from the local sites, and the number of information objects saved. In general search engine services, participants examined 1.04 results pages overall, and less than one link was selected per query issued. In image search engine services, participants examined about 2.5 results pages and, on average, selected about 2 thumbnail images to view per query.

Table 6. Mean of Retrieval Actions on Content Sources
| Retrieval Action | General search engines (N=365) | Image search engines (N=483) |
|---|---|---|
| SEG_Results | 1.04 | 0.19 |
| SEI_Results | 1.06 | 2.49 |
| Image clicking | 1.76 | 1.96 |
| Following a link | 0.89 | 0.17 |
| Local_pages_checking | 0.89 | 0.18 |
| Saving | 0.33 | 0.55 |

The total number of retrieval actions per query issued was calculated. To determine whether task goals had an effect on overall retrieval actions per query, an ANOVA test was conducted. Findings showed that retrieval actions differed significantly across task goals (F=23.35, p=0.00). A Tukey test was run to analyze which specific groups differed from one another. Results of that analysis indicate that academic task goals (M=4.38, SD=5.43) and personal interest goals (M=5.68, SD=7.21) did not differ significantly from one another. However, the work task goals produced significantly more actions (M=9.61, SD=16.62) than either the academic task goals or personal interest goals.

Pearson's Correlation was used to determine whether a relationship exists between query length and retrieval action, which are two continuous variables measured on an interval scale. As Table 7 indicates, query length was positively correlated with the action of following a link on general search engines. On the other hand, query length on image search engines was negatively correlated with the number of search result pages viewed, the number of thumbnails viewed from a set of image search results, and the number of information objects saved. This means that an increase in the number of terms in a query decreased the likelihood of viewing and saving actions from search results on image search engines, although none of the relationships was strong. These results suggest that image searches with long queries on image search engines were less effective.

Table 7. Results of Pearson's Correlation Test on retrieval action and query length
| Retrieval action | Query length on general search engines (N=365) | Query length on image search engines (N=483) |
|---|---|---|
| SEG_Results | 0.09 (r=0.09) | 0.19 (r=0.06) |
| SEI_Results | 0.82 (r=−0.01) | 0.03 (r=−0.10)* |
| Image clicking | 0.50 (r=0.04) | 0.02 (r=−0.11)* |
| Following a link | 0.02 (r=0.12)* | 0.40 (r=0.04) |
| Local_pages_checking | 0.87 (r=−0.01) | 0.91 (r=−0.01) |
| Saving | 0.48 (r=−0.04) | 0.00 (r=−0.15)** |

Each cell shows the p-value, with the correlation coefficient in parentheses.
*. Correlation is significant at the 0.05 level (2-tailed).
**. Correlation is significant at the 0.01 level (2-tailed).
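The small coefficients in Table 7 reach significance because the samples are large. A rough way to see this is to convert r and N into a t statistic, t = r·sqrt((N−2)/(1−r²)), and approximate the two-tailed p-value with the normal distribution, which is close to the t distribution at these sample sizes. The sketch below does this with the standard library only. It is an approximation under that assumption, not the procedure used in the study, and because the published r values are rounded to two decimals the recomputed p-values can differ slightly from the table.

```python
import math


def t_statistic(r: float, n: int) -> float:
    """t statistic for testing whether a Pearson r from n pairs differs from 0."""
    return r * math.sqrt((n - 2) / (1 - r * r))


def approx_two_tailed_p(r: float, n: int) -> float:
    """Two-tailed p-value using a normal approximation (reasonable for large n)."""
    return math.erfc(abs(t_statistic(r, n)) / math.sqrt(2))


if __name__ == "__main__":
    # (label, r, N) taken from Table 7.
    rows = [
        ("Following a link, general search engines", 0.12, 365),
        ("Saving, image search engines", -0.15, 483),
        ("Following a link, image search engines", 0.04, 483),
    ]
    for label, r, n in rows:
        print(f"{label}: t={t_statistic(r, n):.2f}, p~{approx_two_tailed_p(r, n):.3f}")
```

This illustrates why a correlation as small as r=0.12 can be significant at the 0.05 level with 365 queries while still explaining very little of the variance (r² is about 0.014).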

DISCUSSION AND CONCLUSIONS

This study examined whether querying behavior (query iterations and query length) in image searching for participants' own needs on the Web varied across contextual characteristics.

Frequent query modification is common in an interaction process; this study confirmed the trend. In this study, the mean number of queries participants produced during a search session was 11.24, much higher than the averages reported in previous studies. Given that the previous studies were conducted as experiments with assigned search tasks or with participants doing science-related work tasks, it can be inferred that image searches in a naturalistic setting elicit a higher number of query iterations. Users with fewer query iterations tended to be more satisfied with search results than those with more query iterations. This finding supports the suggestion of Belkin et al. (2003) that the number of query iterations per search can be used as an indicator of search effectiveness.

The average number of search terms per query found in this study was 3.25, which is higher than in previous studies: 2 terms in Jorgensen and Jorgensen's (2005) study as well as in studies of multimedia search log analysis by Jansen and his colleagues. Three terms in textual searches were reported by Aula (2003) and by Aula and Käki (2003). It seems that in performing their own image searches with keywords, participants entered more search terms than in textual searches. This study analyzed differences in query length by task goals and content spaces as well as relationships between query length and retrieval actions on search engine content spaces (general search engines and image search engines). Types of task goals did not show an effect on query length, whereas query length differed according to content sources. Queries issued on local sites were significantly shorter. It can be assumed that local sites include collections closely relevant to the participants' information need, so queries there may not have needed to be long. In contrast, participants employed more keywords to get a relevant set of results from the wider range of collections on search engines. Further studies should investigate query modification patterns at the term level as well as the semantic level to find relationships between queries during an interaction.

This study demonstrated the influence of task goals on the searching process. One of the study's findings indicated that participants tended to iterate significantly more queries on academic tasks. There were also significant differences in overall retrieval actions (the total number of pages viewed and saved) across types of task goals. Participants who performed image searches for work-related tasks produced more retrieval actions. These results suggest that task goals are an important factor in users' interactive Web searching process and should be taken into account in research on information retrieval interaction. However, topic familiarity did not have a significant effect on query iterations. This finding matches Aula's (2003) study, which found no effect of domain knowledge on query formulation. In contrast, other previous studies (such as Vakkari, 2000a; Wildemuth, 2004) suggested that topic or domain knowledge affected users' information seeking behavior. Further studies should investigate the impact of topic or domain knowledge on Web querying behavior.

There is evidence that work task stage is an important contextual aspect in information seeking. Studies suggest that the stage of a user's particular work task affects search strategies, performance and relevance judgment (Kelly, 2006; Vakkari, 2000b). However, work task stage has not received a great deal of attention in image seeking research. Recently, McCay-Peet and Toms (2009) examined how images are used throughout the process of completing a typical written work task. They found that the stage of the work task process has a significant impact on how the image is used and located with descriptive and conceptual image attributes. The present study likewise found that work task stage affects the number of query iterations. Thus, work task stage should be taken into consideration as an important factor when researching the querying and searching process for images.

The findings of this study have implications for the search interfaces of image search engines. Participants tended to modify queries more often on image search interfaces than on general search interfaces. Also, longer queries on image search engines produced fewer retrieval actions in checking results than shorter queries did. These findings suggest that image search engines may need to develop techniques or features to support frequent query modification so that users can modify their queries easily based on search results.

This study shows that participants tended to examine more result pages on image search engines than on general Web search engines. It seems that the visual stimuli (thumbnail displays of images) of image search engine interfaces effectively support browsing of search results. However, participants only checked a few images (Tjondronegoro, Spink, & Jansen, 2009). In other words, participants simply viewed illustrations of images and made relevance judgments without checking the source of the images. Those sources may have conveyed additional textual descriptions or the context of an image. Images have multiple meanings, and the context of an image is important for its illustration and conceptual representation (Panofsky, 1962; Shatford-Layne, 1994). By relying only on the illustration of an image, searchers would miss the additional descriptions and representations that Web pages embed for images, which would be important for interpretation and relevance judgment. In order to retain the context of an image, Web image search engines could present textual clues in addition to the common description of the image itself, e.g., file name, size, URL, or title of the image or of its Web page. Search engines could show otherwise hidden textual context from an image's Web page along with the image itself. Alternatively, they could use this context as classifications that let users further sort search results. With such contextual information, users could make sense of an image's illustration and be motivated to check out more image results.

In conclusion, this study showed that querying behavior in image searching processes on the Web varied across a user's contextual factors. Further studies should investigate the details of users' query formulation and reformulation patterns, as well as image attributes in queries, to understand query description and the representation of a user's information need. The study's findings also suggest that more research is needed on the relationship between querying behavior and contextual factors in order to provide insight into users' interaction on the Web and into system design. The Web serves a vast and heterogeneous population. Such diverse users bring different search needs and tasks, which may lead to distinct querying and searching behavior. Thus, the scope of participants in this study is a limitation on the generalizability of the findings on image querying behavior. This limitation suggests that future studies are necessary to investigate the querying behavior of different groups of participants in a naturalistic setting.

Acknowledgements

This study was supported in part by the OCLC/ALISE grant. The author thanks the anonymous reviewers for their helpful comments on an earlier draft of this paper.
