Information needs and search characteristics of first-year medical students



This study investigated the relationships between the information needs of first-year medical students and the search strategies they used to address those needs during online information searching. Students' information needs centered on two categories of questions, general medical and diagnostic, and were pursued through a variety of online searches. Searches for general medical knowledge aimed to find information that was known to exist and resulted in known-item search tasks. Searches for diagnostic knowledge required finding several pieces of information that had to be put together to answer a question and resulted in subject search tasks. Remote observations demonstrated that students' searches for general medical information were short and quick and did not require a significant investment of time or cognitive effort. Diagnostic searches, in contrast, consisted of multiple search moves, were time-consuming, and required considerably more cognitive effort. Students performed both types of searches in indexed medical databases and Google but used them differently. Students' search patterns were visualized, which allowed for a quick overview of the observed search process characteristics.


The contemporary problem-based learning (PBL) approach to medical education makes extensive use of patient problems that form the focus and stimulus for student learning (Barrows, 1996). These problems, presented as clinical scenarios, become challenges that students need to tackle to acquire clinical problem-solving skills. While working on a scenario, students eventually reach a point where they need more information in order to proceed. This generates information needs that must be met to enable students to solve medical problems.

The most natural way of expressing information needs is through asking questions (Cogdill & Moore, 1997; Wildemuth, de Bliek, Friedman, & Miya, 1994). By the end of each PBL session, a group of 8-10 students comes up with a list of questions that are distributed among the team members. Each student is expected to engage in information searching to find answers to his or her questions and be prepared to share the results with other students during the next meeting.

According to the Association of American Medical Colleges Medical School Objectives Project, a student is expected, before graduation, to demonstrate the ability to retrieve, manage, and utilize biomedical information for solving problems and making decisions relevant to the care of individuals and populations (Anderson, 1999). Information searching is therefore a critical skill in the preparation of contemporary medical specialists and is important to study. It is especially important to study students' search characteristics during the early years of professional education, when information searching skills for medical problem-solving are only beginning to develop. Analyzing these characteristics and identifying problems early will allow for the timely introduction of corrective measures.


In order to study the quality of information searches by medical students, first, it is important to recognize a variety of information needs they may experience. Northup, Moore-West, Skipper, & Teaf (1983) found that information needs of medical students focused predominantly around disease-related questions, then procedures, and, finally, drugs. They also found that medical students more frequently sought answers to background or basic types of questions. Wildemuth, de Bliek, Friedman, & Miya (1994) classified information needs of first-year medical students in response to clinical scenarios in toxicology into five categories. These categories were composed of mainly identification and explanation questions. This suggested students' lack of knowledge of naming conventions of objects as well as their interest in learning more about toxic agents. Cogdill & Moore (1997) found that information needs of first-year medical students significantly focused around general disease information for managing a clinical problem. The majority of these questions were associated with diagnosis, then treatment, and finally other general information about a disease, procedures, and issues associated with the case. Mehdi, Roghayeh, Fard Azar Farbod, & Sajedi (2010) studied information needs of medical students in emergency departments and found that they were represented by diagnostic, therapeutic, and organizational questions. Patient-specific questions constituted the majority of their information needs. Organizational questions about the hospital policies and procedures received much less attention as the answers were not quick and easy to find.

The questions that students accumulate represent their information needs and form the basis for information searches intended to meet those needs. Depending on the characteristics of these questions, information searches result in different types of search tasks.

A number of studies have investigated and supported the idea that a user's search performance depends on the nature of the search task. Navarro-Prieto, Scaife, & Rogers (1999) demonstrated that in specific fact-finding tasks experienced web participants used a bottom-up or a mixed strategy, while novice web participants used a top-down strategy. In exploratory tasks experienced web users employed a top-down strategy, while the novices adhered to a bottom-up approach. Kim & Allen (2002) found that known-item search tasks resulted in relatively high recall and precision ratios, whereas subject tasks required more time from the searcher and resulted in more search activities. Aula (2003) found that a broad style of searching was appropriate for exploratory search tasks, while strategies that narrowed the search were most appropriate for fact-finding search tasks. The use of several successive queries, or the formulation of one comprehensive query with Boolean operators, was the most relevant approach to a comprehensive search task. Thatcher (2008) demonstrated that although there were differences among participants' choices of search strategies for the researcher- and participant-defined tasks, there were no significant differences in the choice of search strategies for directed and general purpose tasks.


The study aimed to demonstrate the relationships between the information needs of first-year medical students and the search strategies they used to address those needs during online information searching. Specifically, the research questions were:

  1. What information needs do first-year medical students experience after reading a clinical scenario?
  2. What are the characteristics of online information searches performed by first-year medical students in response to their information needs?



Twelve first-year medical students (2 males, 10 females) from the School of Medicine at a large Midwestern university were recruited through ads posted on campus. All participants were volunteers and received monetary compensation for their participation. By the time of the experiment, all participants had completed the first eight weeks of the first PBL block, which they began in the Fall semester of 2011.


Six clinical scenarios were prepared specifically for this study by two subject matter experts. The complexity of these scenarios corresponded to the level of preparation expected of participants by the time of data collection. After receiving a clinical scenario, participants were asked to immediately document their information needs. They were then asked to perform online information searches that would allow them to meet those needs and proceed with finding solutions to the problems described in the scenario. To avoid bias in the distribution of scenarios, participants drew numbers associated with each of them. Participants performed searches individually, in a lab setting, for approximately 30 minutes. Their actions were observed from a remote location and recorded with the Morae 3.2.1 software.


The data were reviewed, and participants' information needs were categorized according to their content into the following categories: (a) general medical knowledge, e.g., What is a direct Coombs test? (b) medications and mechanisms of action, e.g., What is the mechanism of action of Vancomycin? (c) laboratory, diagnostic tests, and standards, e.g., What do blood culture results mean? (d) symptoms, differential diagnosis, history of present illness, e.g., How might the rash be related to Maurice's constellation of symptoms? (e) treatment, e.g., What are the treatment options for a reducible umbilical hernia? and (f) other, such as psychosocial and epidemiology, e.g., Are Hispanic men especially prone to any medical conditions that could cause RLQ pain?

Participants' information needs varied in content and thus formed the basis for different types of search tasks. For this study we used the user task classification proposed by Kim & Allen (2002) and divided all searches into “known-item” and “subject” search tasks. Known-item search tasks required the searcher to obtain a piece of information that was known to exist and that provided a specific answer to the question. These tasks were performed in response to information needs in categories (a), (b), (c), (e), and (f) (see above). Subject search tasks required the searcher to find different pieces of information that were related to the subject and considered useful in answering the question. These tasks were intended to meet the information needs in category (d).
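The category-to-task mapping described above can be written down as a small lookup table. The following is a minimal sketch for illustration only, not the coding procedure used in the study (which was done by hand): the names `TASK_TYPE_BY_CATEGORY`, `task_type`, and `count_task_types` are assumptions, and the sample list of coded needs is hypothetical.

```python
from collections import Counter

# Categories (a)-(f) follow the scheme in the text; the task types follow
# the Kim & Allen (2002) division into known-item and subject tasks.
# All identifiers here are illustrative, not taken from the study.
TASK_TYPE_BY_CATEGORY = {
    "a": "known-item",  # general medical knowledge
    "b": "known-item",  # medications and mechanisms of action
    "c": "known-item",  # laboratory, diagnostic tests, and standards
    "d": "subject",     # symptoms, differential diagnosis, history of present illness
    "e": "known-item",  # treatment
    "f": "known-item",  # other (psychosocial, epidemiology)
}


def task_type(category: str) -> str:
    """Return the search task type for a coded information-need category."""
    return TASK_TYPE_BY_CATEGORY[category.lower()]


def count_task_types(categories):
    """Tally how many coded needs fall under each task type."""
    return Counter(task_type(c) for c in categories)


if __name__ == "__main__":
    # Hypothetical coded needs for one participant, not observed data.
    coded_needs = ["a", "d", "b", "d", "e"]
    print(count_task_types(coded_needs))
```

Keeping the mapping in one table makes the single exception, category (d), easy to see and to change if a different task classification were adopted.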

To visualize participants' search patterns, we used the model of move sequences visualization suggested by Gwizdka (2011). Here certain moves were marked by letters and two-dimensional vectors: Q – Query formulation; L – Examination of search result list; C – Examination of an individual result (content); B – Bookmarking and tagging a relevant result (Figure 1a). Because participants in this study were not required to bookmark any information sources they considered relevant, we substituted vector B with vector R to indicate a return to the previously obtained search result list (Figure 1b).

Figure 1.

Elements of visualization.

After data collection was complete, all recordings of participants' online information searches were viewed and analyzed for the quantity, sequence, and characteristics of search moves and the types of information sources used. Visualization of search patterns was performed manually.
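Although the visualization in this study was produced manually, the move coding described above (Q, L, C, and R) lends itself to a simple string encoding from which move counts can be tallied. The sketch below is one possible representation under that assumption. The function name `summarize_moves` and the example sequences are hypothetical and do not reproduce any observed search.

```python
from collections import Counter

# Move codes as defined in the text, with R substituted for B.
MOVES = {
    "Q": "query formulation",
    "L": "examination of search result list",
    "C": "examination of an individual result (content)",
    "R": "return to the previously obtained search result list",
}


def summarize_moves(sequence: str) -> dict:
    """Count the moves in a coded search, e.g. 'QLC'."""
    unknown = set(sequence) - set(MOVES)
    if unknown:
        raise ValueError(f"unknown move codes: {sorted(unknown)}")
    counts = Counter(sequence)
    return {
        "total_moves": len(sequence),
        "queries": counts["Q"],
        "content_pages": counts["C"],
        "returns": counts["R"],
    }


if __name__ == "__main__":
    # Hypothetical sequences for illustration, not data from the study.
    print(summarize_moves("QLC"))       # a short, single-query search
    print(summarize_moves("QLCRCQLC"))  # a longer search with a return and a re-query
```

A flat string keeps the order of moves, so the same encoding could feed either summary counts like these or a drawn sequence of vectors.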


Research Question 1: Students' information needs

Participants' information needs were expressed in different ways. Some were stated explicitly as questions, e.g., What is involved in a work-up for sepsis? Others were stated implicitly as statements, e.g., Mechanism of action and drug information on Linezolid. Participants generated 94 immediate information needs, an average of approximately eight per person. Sixty-four of those were associated with known-item search tasks and thirty with subject search tasks.

Remote observations revealed that immediate information needs that arose right after participants read a clinical scenario sometimes differed from information needs that were actually pursued. The latter were often modified during information searches as additional unrecognized information needs arose.

Research Question 2: Students' search characteristics

Information searches for both types of search tasks shared some characteristics. For example, participants performed searches in online medical databases such as Up-To-Date, Access Medicine, DynaMed, and StatRef! as well as in Google. Regardless of the information source, most searches began with entering a query, followed by scrolling down the result list in search of a source with relevant content. Participants often borrowed keywords for subsequent searches from the content pages they visited or from the descriptions provided on Google result lists (Figure 2). While on a content page, participants often used the Ctrl+F keyboard shortcut to quickly locate pieces of information containing the search word(s). No Boolean searches were performed.

Figure 2.

Borrowing queries for new searches.

Searches for general medical knowledge associated with known-item search tasks were short and quick. They were often performed in only one or two databases before the final answer was found (Figure 3).

Figure 3.

Example of a known-item search.

Alternatively, students performed these types of searches in Google. Frequently, they were able to retrieve the information they sought from the summaries provided under each returned result in the result list. As a result, they did not need to visit the source pages (Figure 4).

Figure 4.

Obtaining ideas from Google search.

Diagnostic searches associated with subject search tasks were characterized by parallel searching in several databases and Google, often with the same query (Figure 5). These searches included a series of queries, both original and reformulated. Original queries sometimes consisted of complete sentences, e.g., Etiology of high pitched sound while coughing. Reformulated queries were sometimes borrowed from Google's auto-complete feature or from the content of previously visited pages.

Figure 5.

Example of a subject search.


First-year medical students' immediate information needs centered on general and diagnostic medical knowledge. The former formed the basis for known-item searches, and the latter for subject searches.

Known-item searches aimed to identify pieces of information that were known to exist. They were performed with minimal time and cognitive effort and were usually completed on the first attempt. When participants searched in indexed databases, the hierarchical organization of content made it easy to browse between pages. As a result, participants were able to access several content pages after entering a single query. When participants searched in Google, the first few lines that appeared under each returned result in the result list were usually enough to give them the information they were looking for. Hence, participants did not need to proceed to the content page.

Diagnostic searches were frequently characterized by a combination of searches for facts that participants needed to know in order to start putting a diagnosis together. Despite knowing the steps required to solve a problem, in most cases participants lacked the necessary knowledge of professional terminology or the necessary diagnostic skills. They compensated for this either by relying on Google's auto-complete feature for query generation or by borrowing queries for new searches from the content pages they had viewed. Such an approach to diagnostic hypothesis testing resulted in broad and indirect searches. Even so, the presentation of information within medical databases and the flexibility of the Web significantly facilitated such searches. That is why, even without knowledge of professional terminology and diagnostic skills, participants were able to perform successful searches that did not require any preliminary planning.

We have reported here on only part of the data collected for a larger study on the information seeking behavior of medical students. The results are not generalizable and are specific to this study. The data collection methods, the duration of the searches, the nature of the clinical scenarios, and the chosen approach to categorizing information needs may all have introduced limitations. For a fuller picture of medical students' online information search characteristics, further investigation, analysis, and reporting are needed.


I would like to thank Professors Michael C. Hosokawa and Ronald H. Freeman for providing clinical scenarios for the study and helping me with categorization of medical students' information needs.