Selection of information sources: Accessibility of and familiarity with sources, and types of tasks
While many studies have investigated information source selection in different user groups, few have explored the types of factors and how they influence resource selection. This study examines the types of information sources users select as well as the reasons behind their choices of information sources based on their actual searches. Thirty-one participants representing the general public with different demographic characteristics were recruited for the study. Data collected from diaries and questionnaires were analyzed by applying both qualitative and quantitative methods. The findings of this study show that electronic resources are the dominant information sources selected by participants. Seventeen types of factors including some new factors in relation to dimensions of tasks, characteristics of users, and attributes of sources were derived from the data. The results also indicated that multiple factors co-determined participants' selection of resources. Descriptive and statistical analysis of the data demonstrated that accessibility and familiarity of sources were correlated to information source usages, and the results suggested that participants utilized more print or human resources in accomplishing scholarly tasks while they were more dependent on electronic resources in achieving popular tasks.
Introduction and Relevant Literature
Selection of information sources is the first and most essential step in the information search process. The emergence of the Internet makes electronic information sources more accessible for end users. In the digital age, electronic resources, human resources and printed resources are the common information sources selected by users; electronic resources, in particular, have become more and more prevalent as major information sources for users to fulfill different types of search tasks.
Researchers have investigated information resource selection for different types of user groups and tasks. Although academic users rely on both electronic and printed resources, they depend more on electronic resources. Based on a Web survey of undergraduate students, Dilevko and Gottlieb (2002) found students normally began their information search process by using electronic resources; however, printed resources were essential resources for their tasks. About one third of the undergraduates preferred to use print journals to e-journals. Kim and Sin (2007) presented similar results: Web search engines, Web sites, books, online databases and journals, OPACs, friends/family members, printed journals, reference materials, and librarians were the information sources that undergraduates selected based on their frequency of use. Thompson (2007) discovered that library Web pages and Google were the most important information resources for distance education students. Hemminger, Lu, Vaughan and Adams (2007) reported the most frequently used information resources for academic scientists consisted of journals, Web pages, databases, and personal communications. Vibert, Rouet, Ros, Ramond and Deshoullieres (2007) identified PubMed and Google as the key resources for neuroscientists for their research. Not all academic users depend mainly on electronic sources. According to Baruchson-Arbib and Bronstein (2007), humanists used books and journals as their main information channels. Moreover, they tracked citations of these sources to other documents even though they started to adopt electronic information sources.
While academic users access more Internet sources, engineers and people in a corporate engineering environment make extensive use of humans and documents as information resources, including handbooks and internal reports (Cool & Xie, 2000; Fidel & Green, 2004; Hertzum & Pejtersen, 2000). They search documents to identify the right people, and search for people to find documents. Seeking health-related information also presents unique characteristics in source selection. In a telephone survey of the general public, respondents selected the Internet as their main information resource, and public libraries and doctors as their second and third choices, for genetics information seeking (Case et al., 2004). In these cases, Internet sources replace human sources as the first source for consideration. Users with specific medical problems, such as individuals living with spinal cord injury, used more interpersonal information resources, in particular healthcare professionals, than mass media sources (journals, books, TV, etc.) and Internet sources (Burkell, Wolfe, Potter, & Jutai, 2006).
Task plays an important role in determining source selection. In previous literature, for example, people mainly turn to the Internet for health information. However, people with specific health problems, such as a spinal cord injury, consider using human sources — healthcare professionals — more than the Internet even though the health information professionals are less accessible than the Internet. Hertzum and Pejtersen (2000) revealed that the nature of the design task could offer an explanation of engineers' use of documents and people. Time constraint of the task is another determining factor in deciding the usefulness of an online electronic information resource (Vibert, Rouet, Ros, Ramond, & Deshoullieres, 2007). Rowlands (2007) concluded that print and electronic resources were used for different types of tasks and at different times in the information-seeking process.
According to Fidel and Green (2003), accessibility was the most influential factor in the selection of information sources. Fidel and Green further examined different aspects of accessibility of engineers' information source selection including familiarity, right format, different types of information in one place, and others. After analyzing a survey of faculty and researchers, Quigley, Peck, Rutter, and Williams (2002) found that “most convenient” was the most cited factor in selecting information sources in the contexts of current information seeking, routine research of information needs, and seeking information in an unfamiliar area. For decision makers, accessibility is the most important factor in relation to frequency of use of information sources (O'Reilly, 1982).
Familiarity of sources is another key factor in choosing information resources. Under the context of conducting a thorough search for a research topic, “most familiar” was the most cited factor in the selection of resources (Quigley, Peck, Rutter, & Williams, 2002). Complete coverage, accuracy, and in-depth content are the main reasons undergraduate students choose print sources (Dilevko & Gottlieb, 2002). Liu and Yang (2004) presented the top factors influencing distance education students' selection of information resources: timely information retrieval, easy access, comprehensive electronic resources, ease of use and high system performance. Echoing previous research, Kim and Sin (2007) identified similar reasons, such as accessible, easy to use, comprehensive, efficient and free. They also discovered that the most important criterion was accuracy or trustworthiness.
Previous research has identified types of information resource usage in different user groups. However, the majority of these studies only focus on one group of users, such as academic users or engineers. Most of the studies have built results based on surveying users' general practices, in other words, what users said in surveys instead of what resources they selected in fulfilling their specific tasks. While there is more research on information source selection, little research examines the types of factors and how they influence users' decision in choosing different information sources. For example, task, accessibility of resources, and familiarity with resources were identified as important factors in affecting resource selection, but previous research has not further examined the ways that each of these factors influences resource selection. Moreover, in previous research, types of factors were not derived directly from users themselves or from their actual search process; instead, subjects rated the factors identified by researchers. Finally, very few researchers have tested the relationships between information source usage and types of task, accessibility of sources, and familiarity with sources. Research is needed to investigate the major factors that lead to the selection of information sources and more importantly, how these factors determine the choices of information sources.
This study attempts to address three research questions related to users' selection of information sources:
- 1. What types of information sources are selected by users?
- 2. What are the reasons/factors that affect users' selection of information sources?
- 3. How are types of tasks, accessibility of sources, and familiarity with sources associated with users' selection of information sources?
Thirty-one participants who responded to fliers or newspaper advertisements were recruited from the Greater Milwaukee area. They represented general users of information and differed in sex, race, ethnic background, education and literacy levels, computer skills, occupations, and other demographic characteristics. Table 1 presents participant characteristics.
Table 1. Characteristics of Participants (N=31)
The data were collected by several methods:
- 1. Pre-questionnaire: Participants were instructed to fill out a pre-questionnaire requesting their demographic information and their experience in searching for information.
- 2. Information interaction diary: Participants were asked to keep an "information interaction diary" for two weeks to record how they achieved two search tasks, one work-related and one personal. The diary consisted of information in relation to their tasks, source selections, information strategies, and problems encountered, among other pieces of information.
- 3. Think-aloud protocol: Participants were invited to come to the lab to seek information for two additional work-related and personal search tasks. They were instructed to “think aloud” during their information-seeking process. Their information-seeking processes were captured by Morae, usability testing software that not only records users' movements but also captures their “think aloud” comments during the information retrieval process.
- 4. Post-questionnaire: After the searches for the diary were done, participants were asked to fill in the post-questionnaire, which consisted of questions regarding their experience in selecting information sources and their search processes. Participants were also requested to fill in the post-questionnaire for their searches at the lab.
Diaries and Morae software are the most effective ways to capture users' behavior in their information-seeking process. This paper mainly reports the data analyzed from the pre-questionnaires, diaries, and post-questionnaires, since participants were able to select a variety of electronic, human, and printed sources when they were in their own settings instead of our lab.
The investigators analyzed both qualitative and quantitative data collected from the diaries as well as the pre- and post-questionnaires. Quantitative data were tallied and analyzed in SAS. Qualitative data were analyzed by using open coding (Strauss & Corbin, 1990), which is the process of breaking down, examining, comparing, conceptualizing, and categorizing. Table 2 presents the data collection and the data analysis plan. Because of space limitations, examples of data analysis are discussed in Results and Discussion.
Table 2. Data Collection and Analysis Plan
Results and Discussion
The findings of this study are organized to answer the three proposed research questions: 1) Types of information resources selected by participants; 2) Types of reasons/factors for the selection of information sources; and 3) Relationships between information resource usage and types of tasks, accessibility of sources, and familiarity with sources.
Types of resource usage
To investigate types of resource usage, participants were asked to identify each information source they used in their diaries. The results of this study confirmed recent resource usage trends indicating that electronic sources are the leading sources for users in searching for information. Table 3 shows the resource usage data of participants in the process of achieving their personal and work-related tasks.
About 83% of the participants used electronic sources, which is similar to Kim and Sin's (2007) findings that the “Web page” was the top-ranked source and the “search engine” was the second. The most notable finding from this analysis was that these two sources were dominant (more than 70%). More specifically, “Web page” included commercial, individual or organizational websites. Approximately 50% of the Web pages were organizational sites including governmental, association, or institutional Web pages. For example, a participant who needed information about the Congo searched the information provided by a Congo government website. It seems that the credibility and reliability of the information sources were major concerns for participants while they used Web page sources. The commercial Web pages, which accounted for about 29%, were used with obvious purpose, mainly in relation to purchasing a commercial item such as a camera or an airline ticket. At the same time, we found that about 11% of the Web pages were “Wikis”; the individual websites turned out to be the least frequently used ones (less than 10%). The results also showed that “search engine” was the second dominant source used by participants. Amongst search engines, about 81% used Google and 7% used Yahoo. We can infer that Google plays an important role as an information source, in particular at the beginning stage of information seeking.
Human resources accounted for 11% of resource usage. Among the human resources, experts such as lawyers or other types of professionals were the most frequently used sources. The second was librarians, and the third was colleagues. The results indicate that participants intended to acquire information from people who have credentials and expertise.
Print materials turned out to be the least frequently used resources. Only 7 out of 118 resource uses in all diaries were print resources. Since electronic resources have become dominant in recent years, use of print sources has decreased accordingly. Books were the only print materials used in the print-source category. Since many periodicals, articles, or news stories are available via online channels, participants preferred not to use print resources.
Table 3. Types of Resource Usage
Types of reasons/factors for the selection of information sources
Based on the analysis of the diaries and post-questionnaires, 17 types of reasons were identified. These reasons are associated with tasks, sources, users, and search processes. Table 4 presents types of reasons, definitions, and corresponding examples. Among them, accessibility and ease-of-use of sources, types of task, and familiarity with sources are the main factors in each category that influence participants' choice of information sources. The results also show that participants gave more reasons related to sources (9 types) than to any other category in determining their selections of information sources. Compared with previous studies, the findings validate past research (Kim & Sin, 2007; Lee, Han, & Joo, 2007; Liu & Yang, 2004; O'Reilly, 1982): accessibility, ease of use, comprehensive coverage, accurate/reliable results, and cost of sources are the factors that influence users' selection of information sources. More importantly, the results reveal some new factors in relation to dimensions of tasks, characteristics of users, and attributes of sources that were not identified in previous research. Dimensions of task, such as types, nature, domain and timeframe of task, are not only key factors influencing source selection but also key factors leading to a specific information source that is highly related to the task (e.g., “I used this human resource because he is the requestor.”). Users' familiarity with topics and their preferences, as well as characteristics of sources (e.g., interaction and unique coverage of sources), unavoidably determine their first information sources.
Another unique and interesting finding is that, in many cases, multiple factors co-determine participants' choices of information sources. Some of the factors are related to one type of element, such as sources. As an example, one participant discussed his reasons for the selection of Google including familiarities with Google, Google's ease of use, and its results: "I know from past experiences that Google is fairly easy to use; it generates a lot of images to choose from, and I have worked with it more than I have Yahoo or MSN." Another participant offered her reasons for selecting library books: “Availability, convenience, comprehensive treatment of subject matter, portability, and economy.” Some of the factors are associated with different elements. For example, relevant results of a source and the domain of search task were the reasons for a participant to have chosen a specific Web search engine. "I like Ask.com better than Google because I tend to get more relevant hits, and also used search engine because Social Media is a new online area, and thought I could get more up to date information from a Web search engine than a book." Future research needs to examine whether these co-factors play the same role in determining source selection.
Table 4. Types of reasons and examples
Relationships between information resource usage and types of tasks, accessibility of sources, and familiarity with sources
Another contribution of this study is that it quantitatively tested the relationships between information source usage and types of task, accessibility of sources, and familiarity with sources. In order to investigate the relationship between types of task and resource usages, we conducted cross-table analysis. Tasks were categorized into three groups:
- 1. Popular tasks: Popular tasks involved common everyday activities that were not work-related, such as entertainment, travel, and shopping.
- 2. Occupational tasks: Occupational tasks arose from work-related information activities, such as identifying a list of US law firms that specialize in or have experience working with credit union mergers and acquisitions, or finding SQL injection techniques and remediation.
- 3. Scholarly tasks: Scholarly tasks included research activities, such as writing a paper or a book.
Table 5 shows that electronic resources were the most frequently used resources. More than 80% of resource uses mentioned in diaries fell into the electronic category, such as search engines, Web pages, online databases, or digital libraries. Human resources, such as experts or colleagues, were used less frequently (about 11%), and use of print sources only accounted for 5.9% of the total usage. Since 56% of the cells have expected counts less than 5, a Chi-Square test cannot be conducted to statistically analyze the relationship between types of tasks and resource use.
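The expected-count check mentioned above can be sketched as follows. This is a minimal illustration, not the SAS procedure used in the study: the function names and the toy table are hypothetical, and the 20% threshold reflects a commonly cited rule of thumb (no more than about one fifth of cells with expected counts below 5), which we assume is the criterion applied here.

```python
def expected_counts(observed):
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def share_of_small_cells(observed, threshold=5.0):
    """Fraction of cells whose expected count falls below the threshold."""
    cells = [value for row in expected_counts(observed) for value in row]
    return sum(value < threshold for value in cells) / len(cells)


# Hypothetical toy table, NOT the study's Table 5.
# Rows: two task types; columns: electronic, human, print.
toy = [
    [18, 2, 0],
    [12, 4, 4],
]

share = share_of_small_cells(toy)
# Assumed rule of thumb: the chi-square approximation is doubtful
# when more than 20% of cells have expected counts below 5.
chi_square_advisable = share <= 0.20
```

With sparse human and print columns, as in the toy table, most expected counts fall below 5, so the check advises against a chi-square test and comparing proportions descriptively becomes the fallback.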
Table 5. Cross-table of Resource Categories by Types of Tasks
We compared proportions of resource uses by types of tasks. From this comparison, we discovered that participants in achieving scholarly tasks utilized more print (14.3%) or human (17.9%) resources, whereas participants in achieving popular tasks were dependent mainly on electronic resources (91.1%).
Then, we analyzed the resource uses in detail (Table 6). Search engines and Web pages accounted for the majority (87.5%) of total source usage for popular tasks. This suggests that popular tasks usually involve trivial topics and that participants performing them tend to use Internet resources, which are considered easy to access. Participants performing scholarly tasks, however, also relied on reliable resources such as experts or librarians (14.3%), online databases (14.3%), and books (14.3%). Participants performing occupational tasks also turned to experts or librarians (14.7%) for information.
Table 6. Usage of Resources by Types of Tasks
In order to understand the relationships between accessibility of sources and resource uses, we examined the actual resource uses and participants' ratings of perceived accessibility. Participants were instructed to rate the accessibility of each type of information resource on a five-point scale; their uses of each type of resource were analyzed from their diaries. Table 7 shows the frequency of resource uses and participants' perceptions of each resource's accessibility. The results showed a clear relationship between resource use and accessibility. In addition, we found a large gap between Web page/search engine and the other resources in their frequency of use and accessibility levels. Frequencies of Web page and search engine uses were 45% and 41% respectively, and each of the other types of sources accounted for less than 8%. The accessibility levels show a similar pattern. Even though experts were perceived as a less accessible source (ranked 10th), they ranked highly (3rd) in frequency of use. This finding reveals that participants acquired useful and credible information from people who have expertise despite low accessibility. Conversely, although participants perceived family members or friends as easily accessible sources (ranked 5th), there was no use of family members or friends in the diaries.
Since frequencies of use for Web page and search engine are exceptionally higher than the others, we decided to analyze the data at the ordinal level and apply a Kendall tau rank correlation coefficient. A Kendall tau-b correlation analysis addressed the relationship between resource use frequency (M = 9.83, SD = 15.74) and accessibility (M = 2.81, SD = 0.85). At an alpha level of .05, the correlation between resource use and accessibility was statistically significant, tau-b = .45, p < .05. This indicates that resource use and accessibility are correlated.
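For readers unfamiliar with the statistic, the following is a minimal sketch of how Kendall's tau-b is computed from paired observations. It is not the SAS routine used in the study, the function name and the toy values are hypothetical, and it returns only the coefficient, not the p-value.

```python
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b: (C - D) / sqrt((C + D + Tx) * (C + D + Ty)),
    where C and D count concordant and discordant pairs, Tx counts pairs
    tied only on x, and Ty counts pairs tied only on y."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = tied_x_only = tied_y_only = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied on both variables: excluded from every term
            if dx == 0:
                tied_x_only += 1
            elif dy == 0:
                tied_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    denominator = sqrt(
        (concordant + discordant + tied_x_only)
        * (concordant + discordant + tied_y_only)
    )
    if denominator == 0:
        raise ValueError("tau-b is undefined when one variable is constant")
    return (concordant - discordant) / denominator


# Hypothetical toy values, NOT the study's Table 7:
# use counts per resource type and mean accessibility ratings.
toy_use = [10, 8, 3, 3, 0]
toy_accessibility = [4.5, 4.0, 3.0, 3.5, 2.0]
tau = kendall_tau_b(toy_use, toy_accessibility)
```

Because the coefficient depends only on the ordering of pairs, the two extreme usage counts for Web pages and search engines do not dominate the result the way they would in a Pearson correlation, which is the reason for analyzing the data at the ordinal level.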
Table 7. Resource Use and Accessibility
We also investigated the relationships between familiarity with resources and resource uses. Participants were asked to rate their familiarity with each source on a five-point scale. Table 8 shows the frequency of resource use by type and participants' perceptions of their familiarity with resources. Similar to the results for accessibility, the familiarity values for Web pages and search engines were relatively higher than the others. Participants perceived Web pages and search engines as familiar resources, and correspondingly, diary data also presented relatively higher usage of Web pages and search engines. For human sources, similar patterns occurred. Despite low familiarity, experts were ranked high (3rd) in frequency of use, whereas family members and friends were ranked 5th in familiarity and ranked lowest in frequency of use.
A Kendall tau-b correlation analysis addressed the relationship between resource use frequency (M = 9.83, SD = 15.74) and familiarity with resources (M = 2.27, SD = 0.98). At an alpha level of .05, the correlation between resource use and familiarity with resources was statistically significant, tau-b = .51, p < .05. This indicates that resource use and resource familiarity are correlated.
In addition, participants were asked to rate their familiarity with the selected resources on a five-point scale in their diaries. The overall average of familiarity with selected resources turned out to be 4.05. This result implied that participants in general selected resources that they were quite familiar with.
Table 8. Resource use and familiarity with resources
This study investigated the general public's information resource usage in achieving their work-related and personal tasks as well as the factors that influenced their choices of resources. This is one of only a few studies that analyzed the data derived from people's actual searches. The findings showed that electronic sources, in particular Web search engines and websites, were the dominant sources chosen by participants. At the same time, more human sources were used than print sources. Seventeen types of reasons were identified that affected participants in their selection of information sources. These reasons are associated with sources, tasks, users, and search process. One unique finding of this study is that participants' decisions of which information source to use were not affected by a single factor; instead, they were co-determined by multiple factors. While previous research has identified most of the factors in relation to sources, this study revealed new factors in relation to dimensions of task, characteristics of user and search process, and attributes of sources.
Quantitative analysis further explored relationships between information resource usage and types of tasks, accessibility of sources, and familiarity with sources. Descriptive and statistical analyses correspond to the results of the qualitative analysis and concluded that participants' perceived accessibility and their familiarity with sources significantly correlated to usages of information resources. Even though statistical analysis could not be conducted between types of tasks and resource uses because of the limitation of data, the results suggest that for scholarly tasks, individuals utilized more reliable resources including human and print resources, whereas for popular tasks, individuals were dependent mainly on electronic ones.
The findings of this study yield some insights into the design of different types of information retrieval (IR) systems, especially the incorporation of different attributes of sources that users deem decisive factors in their choices of systems, such as accessibility, unique coverage, and useful, reliable, and accurate results. More importantly, it is helpful to incorporate some of the attributes associated with human and print sources, such as friendliness, potential for interaction, knowledge of the task or the domain of the task, and portability, into the design of IR systems. IR, a collaborative activity as research has demonstrated (Fidel, Pejtersen, Cleal & Bruce, 2004; Hansen & Jarvelin, 2005; Karamuftuoglu, 1998), involves not only user-system interaction but also human-human interaction with other users, task assigners or information requestors, experts on the topic, information professionals, etc. In addition, designers need to develop different systems or different levels of a system for different types of tasks, and the design of IR systems needs to target users with specific types, timeframes, and domains of tasks. Finally, IR systems need to assist users in applying different types of strategies at different stages of their searches or suggest to them other types of systems to use for their stage/strategy shifts.
This study also has its limitations. First, even though the sample represents the general public of information users, a larger sample would make the quantitative analysis more generalizable. Second, two tasks over two weeks might not be enough to present a complete picture of each participant's selection of information sources. Further research can ask participants to record all their tasks and related resource usage over a longer period of time. Further research can also focus on one type of task that participants engage in and analyze resource selection for that type of task in depth. Such investigations would give a better understanding of the breadth and depth of information source selection for different types of tasks. Last but not least, based on the results of this study and other related work, further research can investigate the most important factors affecting resource usage and, more importantly, how multiple factors co-determine users' selection of information sources.
The authors would like to thank the University of Wisconsin-Milwaukee for the generous funding for the project, and Tim Blomquist and Marilyn Antkowiak for their assistance with data collection.