Characterizing a digital library's users: Steps towards a nuanced view of the user

Authors


Abstract

The importance of understanding the wants and needs of an information source's users is a key belief of user-centered design. In digital libraries, we find two approaches to including users as factors in system design. User studies tend to provide a more abstract view and can be associated with information science research. Usability assessment tends to be associated with the practical assessment of the ease, efficiency, learnability, and flexibility of the digital library. However, usability assessment seldom deals with differences in user communities. This study of the information needs and information seeking values of three groups of digital library users illustrates how important it is to understand differences between user communities, to have a “nuanced view of the user”, to support informed decisions assessing the need for digital library content and organization.

Introduction

The phrase “digital libraries” has been applied to a wide range of projects and collections over its history. Early projects focused on ways to bring electronic versions of print journals to scientists and engineers. However, over time, we have seen the emergence of digital libraries that bring digitized versions of primary source materials and born-digital content to a broader population base. Because these digital libraries are often quite properly composed of identifiable collections, with a well-defined scope, a selection policy, and a defined community base, it becomes important that we come to understand the relationship between these collections and the communities they serve.

In pursuing that understanding, however, it is necessary that user communities be identified. Two different factors complicate this task. First, the presence of these collections on the World Wide Web opens the collections to a variety of potential user communities. This is quite unlike the case of digital libraries of full-text journals that are most frequently available through controlled access mechanisms. Second, these collections are often made up of images and multimedia content that are of interest for their beauty and value to our cultural heritage. Thus, the nature of the materials themselves may result in them being sought after by communities that vary in background, focus, and interest level. As Bawden and Vilar (2006) point out, we need to have some idea of user expectations in order to meet or manage those expectations. A first step is to begin to know who those users may be.

Volunteer Voices

This study reports on interviews with three communities of interest for a collection of primary source materials related to Tennessee history. Volunteer Voices (www.volunteervoices.org) was an IMLS-funded statewide digitization project of Tenn-Share and the Tennessee Electronic Library (TEL). It aimed to provide online access to primary source materials. Part of the University of Tennessee's Digital Library Initiative (DLI), the project drew materials from archives, historical societies, libraries, museums and colleges and universities across the entire state. The database includes digitized letters, photographs, realia, maps, and music scores from the 15th century to the present. The intent was to support Tennessee's K-16 history curriculum as well as to provide access to educators, researchers, students and members of the general public. Thus, the project was interested in learning about ways to facilitate access to the kinds of information they were creating.

Digital libraries in relation to the user

The very nature of the kinds of projects and collections we think of when we use the phrase “digital libraries” has evolved with the history of this type of information resource. As Fox and Urs (2002) and others have pointed out, much of the early history of digital libraries was focused on the technologies needed to support the distribution of content across networked infrastructures. Fittingly, perhaps, the content that generated the greatest attention at this stage was itself technical, notably articles in scientific and technical journals (cf. projects like TULIP, Red Sage, and the Chemistry Online Retrieval Experiment (CORE)). The user base for these materials was constrained by the nature of the materials included in these digital collections. Associated user studies focused on the acceptability of digital (“electronic”) documents for academics and other researchers, looking in detail at differences among faculty, graduate students and undergraduates. As the focus of the study of digital collections began to change through the Digital Library Initiative-funded projects, new types of materials came to the forefront. In a listing of these projects (cf. Chowdhury & Chowdhury, 2003, pp. 46-54), we find a growing diversity from geospatial, environmental, image and sensor data through rich multimedia data that run the gamut from ancient texts through medical information systems. This leads us to question how the study of digital library users has evolved in reaction to this change in content focus.

A survey of recent work on user concerns in digital library research within the field of library and information science reveals two approaches. There is a set of studies that look at the usability of existing digital libraries. Alternatively, there is an approach that looks at digital library users by viewing digital libraries as sociotechnical systems (Bishop, Van House and Buttenfield, 2003). These groups echo Saracevic's (2001) distinction between evaluation by members of the research community, which tends to study user behavior, and evaluation by the practice community, which focuses on usability testing. How do these approaches differ?

Designing systems according to the principles of user-centered design is a central tenet of the field of human-computer interaction. User-centered design, a concept described first by Norman and Draper (1986), includes both early user analysis and later-stage usability testing. However, user-centered design in digital library research is most often reduced to usability testing alone. While usability testing was used as a way of including user issues early in the history of digital libraries in such studies as the CORE project (e.g., Entlich et al., 1996), it continues to be a predominant approach to including user issues today. Many of these studies have been on collections of electronic technical journal materials. Hartson, Shivakumar, and Pérez-Quiñones (2004) report on a usability analysis of the NCSTRL digital library, a collection of computer science technical reports. Blandford, Keith and Fields (2006) tested a usability evaluation method, claims analysis, as a way to provide input to the design of a corporate library. Ferreira and Nunes Pithan (2005) integrated Kuhlthau's user model with Nielsen's heuristic analysis method to evaluate a digital collection for construction information. There have been some efforts to apply usability testing to other kinds of digital libraries. For example, Long, Lage and Cronin (2005) followed a user-centered design model, including usability evaluation, to improve the user interface to the Aerial Photographs of Colorado collection at the University of Colorado Boulder.

The second approach looks more closely at the context of use. Bishop and her collaborators (2000) at the University of Illinois at Urbana-Champaign conducted a variety of studies of the way actual and potential users of their DLI testbed “meet” a digital library's infrastructure. Conceiving of a digital library as an interconnected set of “artifacts, knowledge, practice, and community” (p. 394), they looked at the factors that affected the use and acceptability of their digital library. They emphasize the importance of situating evaluation in the context of existing and actual work practice. The ramifications of this analysis were extended in a later book edited by Bishop, Van House and Buttenfield (2003). Focusing on the social aspects of digital libraries, the book's contributors add to the understanding of digital libraries as “part of a web of social relations and practices” (p. 2) while exploring a “technically informed social analysis” of these issues.

In the chapter of the book (Bishop et al., 2003) written by Marchionini, Plaisant and Komlodi (2003), a multifaceted analysis of digital library users of various types and backgrounds is presented. They report on studies of three different digital library projects. The first set of studies concerned the Perseus project, a digital library that began as a collection of multimedia materials and tools concerning ancient Greece and that has become widely used as a teaching tool for humanities scholars. The second, the Baltimore Learning Community project, aimed at providing a collection of multimedia materials and tools to help classroom teachers develop instructional modules collaboratively. The third set of collections studied was the Library of Congress's National Digital Library program. In their investigation of user characteristics and needs, the researchers looked at a number of different user communities, ranging from Library of Congress staff, through K-12 teachers and school library media specialist supervisors. From an extensive and diverse set of data-gathering activities, the researchers created a user taxonomy consisting of nine categories: LC staff, hobbyists, scholars, professional researchers, rummagers/browsers, object seekers, surfers, K-16 teachers and K-16 students. They identified sets of motivational factors and expertise that characterized these users and distinguished individuals who fit into the nine categories. This paper is of special import to the present analysis for two reasons. First, the types of DLs studied, like Volunteer Voices, are collections of primary source materials rather than databases and electronic technical journals. Second, the user groups studied reach beyond academia to working professionals and community members and hobbyists, a feature also shared with Volunteer Voices. Moreover, it is these extended user groups who are important target audiences if digital collections are to achieve the goal of bringing primary source materials and born-digital content to a broader population base, as the Introduction to this paper states.

A nuanced view of the user

We are proposing that it is necessary to look at the users of digital libraries in a more analytic way than is typical in studies of digital library usability. Digital library users, like those categorized by Marchionini, Plaisant and Komlodi, are not a homogeneous group with a limited set of interests and opinions. If we are to design digital libraries that meet the needs of diverse user communities, we need to understand what those needs and values are and to incorporate the consequences of those user differences in our collections, interfaces, and access tools.

Methodology

Populations studied

Prior to the beginning of Spring term 2007, the project manager of Volunteer Voices was contacted to determine if the project would be interested in having a study done of the characteristics and information needs of its proposed user population. Working with the project manager, we identified three candidate populations. The first, and most obvious, population was composed of people who do historical research. Individuals were recruited from academic departments and historical societies in the local area. The participants had advanced degrees, specializing in the history of the American South. The second population was a group of people in a graduate program for school media specialists. This group was chosen because of the key role that school librarians play in training young people and in supporting teachers who are exposing their students to information about local history. The third group was composed of people no longer enrolled in educational programs who had a personal interest in information about local history, including genealogical information, but who had no formal or professional need for that type of information. This group will be referred to as lifelong learners. Eleven people were interviewed: four in each of the first two groups and three in the final group.

Procedure

Interviews were conducted by the members of a graduate class in human-computer interaction. Class members were trained in the techniques of conducting semi-structured interviews as part of the instructional program. They were then formed into groups corresponding to the populations of interest. Each interviewer developed and tested a sample interview. Teams critiqued and combined their interview questions, which were then reviewed by the instructor. Through this process, consistent interview scripts were prepared for each group. Each class member was required to conduct one interview and to participate as a note taker in a second interview within their chosen group.

The interviews took place, whenever possible, in the workplaces of the interviewees and lasted 1-2 hours. As described above, one interviewer took the lead and a second took notes, consistent with the requirements of Institutional Review Board Form A approval. Following the interviews, teams were required to generate two types of reports. First, each team member had to submit a written report on the interview they had conducted. Second, the team had to create a report on the generic features of the interviews conducted by the team. They were also asked to prepare both a composite user profile of the interviewees, covering demographic information, information needs and use preferences, and a persona (Goodwin, 2007) representing this user class.

Findings

The three user groups differed in a number of ways. These include: demographics, target source materials, and preferred information sources.

Demographics

The researcher group was both the oldest and most educated of the three. The four interviewees (3 males and 1 female) ranged in age from mid-40's to mid-60's. They all had received a graduate education (masters and doctorates) in the history of the southeastern region of the United States. They were employed as university faculty or as researchers for periods ranging from 7 to 28 years, with three of the four having more than 17 years doing scholarly research in this field.

The students in the school media specialist group were more variable in age, ranging from their mid-20's to mid-50's. All four were female. All had received the bachelor's degree that is a prerequisite for the degree program they are involved in, and one also held an M.Sc. Three of the four had had experience as classroom teachers prior to enrollment in the school media specialist program. None of them had done extensive research for local or regional history sources. Three of the four had children who had at some point worked on assignments like those these professionals would be likely to encounter when in the field.

The final group, who represented the “lifelong learner” population, were, as might be expected, the most heterogeneous in their backgrounds and interests. These three individuals were in the same age range as the media specialist group. Their educational backgrounds and professions, however, varied greatly. All three had college degrees, albeit in quite different subject areas. All three used computers extensively in their work, one being a computer scientist. Their interest in local history and genealogy, however, was almost exclusively personal rather than professional.

Target source material

The researcher group used secondary sources to point them to their most important information sources: letters, journals, diaries, historic newspapers and government and legal documents (e.g., deeds, wills, court records), if available. Microfilm or digitized versions of documents are used to help identify documents of research interest and may be used directly for teaching, rather than for research. For research itself, hardcopy is still desirable. Original source materials are the ultimate goal.

As might be expected, the school media specialists look to both print and online resources. Because they will be serving K-12 populations, they are interested in authoritative sources that are high quality but not costly because their potential libraries are public institutions. While they might look online for images, they are not, by and large, seeking access to primary materials.

This finding casts some doubt on the claim that digital libraries are likely to be of interest to K-12 populations because they provide experience with original source materials.

The lifelong learners included in this study were well educated. They brought information seeking skills and access to sources that might not be present in ordinary citizens. Like the researchers, they were interested in primary documents as their ultimate source. However, because they are interested in content, they expressed their interest in transcripts rather than images of source documents since the transcripts make it easier to extract the content. Two of the three were primarily interested in genealogical information.

Preferred information sources

Researchers varied in their tendency to use the internet as a way to begin their search for content. One of the four mentioned the importance of using personal contacts to identify approaches to material. An important commonality across the group, however, is the importance of the credibility of the information-containing website. They expressed confidence in sites created and maintained by major archives at the national and state level, in those of special collections in universities and historical societies, and in government sites.

The school media specialist sample was also concerned about site credibility. While they were less aware of the highly regarded research sites than the researcher sample, they tended to look to the domain name and to trust sites that are .edu, .gov or .org rather than .com sites. This group has a special interest in the issue of site credibility because it will be part of their responsibility to support the development of information literacy in their schools.

The lifelong learners were in this case the only group that included .com sites among their most preferred sources. In this case, the site chosen was one that provided support for genealogical searching and thus saved this group time and effort. Like the other groups, members of this group expressed the belief that .gov sites could be trusted as sources of reliable information when searching on the web.

Discussion

Comparing user groups

These data demonstrate both similarities and differences in the features that diverse user groups look for when approaching digital libraries. All three groups are concerned about the credibility and reliability of their sources. A first level of discrimination is introduced by looking at the domain name, with non-profits, educational and government sources being seen as more reliable. Researchers reach beyond this first level to sites which are associated with more highly rated institutions like national or state archives or with university-supported special collections. The school media specialist group also looks to published hardcopy resources, thought to be more reliable sources of information.

The groups differ in a number of respects, including demographics and ease of internet use. However, a major difference is the interest shown in direct use of the primary source materials. The researchers use digitized collections as a filter, allowing them to identify the location of materials so that they can plan research visits more productively. The school media specialists are interested in age-related presentations of the content that could be used to support classroom teaching and student projects but less interested in the direct use of primary source content. The lifelong learners, like researchers, are interested in primary source documents. However, they focus less on the characteristics of the objects and more on the content itself. This is shown by their desire for transcripts of the content rather than the digitized objects.

Contrast with Marchionini, Plaisant and Komlodi

As was pointed out in an earlier section of this paper, this study compares most closely with Marchionini, Plaisant, and Komlodi's (2003) analysis of the users of the Library of Congress's National Digital Library (NDL). The earlier study was much more extensive but covered use of a similar collection by populations that overlap with those studied here. Because the data discussed in the comparable part of the 2003 study were collected in 1995, it is interesting to note similarities and differences across the intervening period of more than ten years. Our researcher and lifelong learner groups correspond with Marchionini et al.'s Scholar and Hobbyist groups respectively. Although their research included an extensive study of school media specialists, this user group does not appear explicitly in the User Taxonomy of Table 6.4 of the 2003 study and differs in enough detail that the school media specialists do not truly match the LC staff or Professional researcher groups. As in the earlier study, our school media specialists stressed the need for easy to learn and use systems. The lack of interest in primary sources among school media specialists is similar in both studies. However, the time difference shows itself in many discrepancies around issues of technology accessibility. The lack of compatible computer systems and of widespread access to the internet at school and at home were cited as major problems in 1995. However, these issues were not mentioned as major problems by our interviewees.

Design implications

Ultimately, user data is collected not only out of theoretical interest but also to help direct the design of the related system, in this case, the interface and functionality of Volunteer Voices. These data support suggestions about content, information architecture and user interface. To meet the needs of researchers and of members of the community who are interested in genealogy, the database needs to include birth and death records, legal documents (marriages, deeds, etc.), and local newspapers. It is desirable to provide transcripts along with digitized images of these records. Transcripts would support the use of the materials by lifelong learners as well as provide text that could be used for full-text information retrieval. It is unlikely that the same architecture could satisfy all three user classes. While the school media specialist group have search system knowledge, their target users do not, nor are they able to interpret and use search system results easily. Chen and collaborators (2004) found that users with different cognitive styles benefited from forms of information representation that matched their preferred style. Providing alternative user interfaces for Volunteer Voices, perhaps through a tabbed user interaction like that used in many library catalogs for Basic and Advanced search, could permit more than one user community to use the same underlying database effectively while allowing the system to display records and to initiate search and browse in ways that are substantially different from each other and are better suited to the information needs of their communities.

Consequences for theory

These data, along with much previous research, suggest strongly that there are very broad categories of difference that separate user communities for a given resource. Significant sources of differences include knowledge of system and content and motivation, as Marchionini, Plaisant and Komlodi (2003) demonstrate. They also include substantial differences around the ultimate need for interpretation, transcripts or well-resolved digital images of primary source materials. Efforts to evaluate user interaction with digital libraries must not simply ignore these differences, looking only at interface features, as usability testing so often does. Evaluation needs to be conducted in the context of the desired output, which is intimately tied to the user community. We need to move toward a more nuanced view of a digital library's users.

Acknowledgements

The author would like to acknowledge the contribution of the members of the Spring 2007 class IS 588, Human Computer Interaction, at the University of Tennessee's School of Information Sciences, co-taught with Dr. Dania Bilal. Class members (Krishna Adams, Ben Birch, Alison Connor, Anya Furman, Elizabeth Koerber, April Lewis, Jerrell (Bo) Link, Katherine Marsh, Dorothy Ogdon, Scott Rader, and Angela Woofter) carried out the interviews and created the documentation that provided the data used in this report. The author would also like to thank Tiffani Conner, the Volunteer Voices project manager, for her interest and assistance with various phases of our project.
