In traditional library indexing systems, the indexer was an individual trained in the rules of information organisation who assigned keywords describing the physical media and the subject matter of the content. While other groups have been involved in creating index terms (for example, journal article authors who are asked to provide keywords with their submitted articles), these keywords generally have a small circulation and are not widely used. Collaborative tagging systems such as CiteULike (http://www.citeulike.org) allow users to participate in the classification of journal articles by encouraging them to assign useful labels to the articles they bookmark.
Studies comparing the terminology used in tagging journal articles to indexer-assigned controlled vocabulary terms suggest that many tags are subject-related and could work well as index terms or entry vocabulary (Hammond et al. 2005; Kipp 2006; Kipp and Campbell 2006; Kipp 2007a). Some authors suggest that user classification systems demonstrate what vocabulary users actually use to describe concepts and that this could be incorporated into the system as entry vocabulary to the standard thesaurus terms (Mathes 2004; Morville 2005).
However, the world of folksonomies includes relationships that would never appear in a library classification or thesaurus, including time- and task-related tags, affective tags and the user name of the tagger (Kipp 2007b; Kipp and Campbell 2006; Kipp 2006). These short-term and highly specific tags and relationships suggest important differences between user indexing systems and professional indexing systems, which must be considered in examining the usability of tagging systems for resource discovery.
Users searching online catalogues and databases often express admiration for the idea of controlled vocabularies and knowledge organisation systems, but find it difficult to adapt their vocabulary to the thesaurus and find the search process frustrating (Fast and Campbell 2004). Additionally, controlled vocabulary indexing has proven costly and has not proven to be truly scalable when dealing with digital information, especially on the web. Morville (2005) suggests that tagging systems could scale along with digital information on the web, allowing for some indexing of currently unindexed web materials.
This study explores how users make use of indexing systems to support retrieval. An information retrieval study was performed on a social bookmarking system and a more traditional online database in order to examine user search behaviour on the two systems.
This study asks the following questions:
Do tags appear to enhance resource discovery? Do users feel that they have found what they are looking for?
How do users find searching social bookmarking sites compared to searching more classically organised sites? Do users think that tags assigned by other users are more intuitive?
Do tagging structures facilitate information retrieval? How does this compare to traditional structures for supporting information retrieval?
The searchers were asked to search PubMed and CiteULike for information on a specific assigned topic. Screen capture software, a think-aloud protocol and an exit interview were used to capture the impressions of the users when faced with traditional classification or user tags. This data was analysed to explore the use of indexing terms by the participants as well as their use of other features in each system that support information finding and refinding.
Participants selected their own keywords for searches on both tools. At the end of the search process, participants were asked to list the terms they would now use if asked to search for this information again. Three sets of data were thus available for analysis: the initial and final keywords selected by each user, the recording of the search session and think-aloud, and the recorded exit interview after the search session. All three can be analysed to examine user impressions of the search process and the utility of the keywords in that process.
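As a rough illustration of how initial and final keyword lists could be compared, the following Python sketch splits one participant's terms into those kept, dropped and added. It is not the analysis procedure used in this study; the function names, the normalisation rule and the sample terms are all assumptions made for the example.

```python
def normalise(term: str) -> str:
    """Lower-case a keyword and collapse runs of whitespace (assumed rule)."""
    return " ".join(term.lower().split())


def compare_keywords(initial, final):
    """Split keywords into kept, dropped and added sets (hypothetical helper)."""
    before = {normalise(t) for t in initial}
    after = {normalise(t) for t in final}
    return {
        "kept": sorted(before & after),     # used at both the start and the end
        "dropped": sorted(before - after),  # abandoned during the session
        "added": sorted(after - before),    # picked up during the session
    }


# Invented sample terms, not data from the study.
result = compare_keywords(["Avian Flu", "bird  flu"], ["avian flu", "H5N1"])
print(result)
```

Exact matching after normalisation is the simplest possible choice; it would treat spelling variants and synonyms as different terms.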
Participants tended to prefer the search experience on the system used first, regardless of previous experience with either system. All users used multi-word keywords initially, which is unsurprising as they are in training to be librarians. At the end of the search process, when users were asked to generate a new list of keywords they would now use for the search, many separated their list of final keywords by tool, showing an awareness of the need to adapt a search to different systems.
Items such as the presence of full metadata, abstracts and even full-text links to articles were lauded, while a lack of vocabulary terms and especially missing abstracts were deemed to be impediments to search. Participants found related article links and other newer features of systems to be a significant enhancement to the search process, and some participants reported or were seen using tags or user names in CiteULike for similar purposes.
Many of the participants in this study made use of the related articles links provided by PubMed and discussed the possibilities presented by MeSH in PubMed and the tags on CiteULike, but did not find that the structures were in place to fully support browsing of related items by keyword or combination of keywords. As shown by Ockerbloom (2006), these webs of related items can be built automatically using existing thesaurus structures and displayed to the user. This suggests that the use of indexing structures to link related items would be worthwhile to users if they are able to see the connections between items as they browse.
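The idea of linking related items through their index terms can be sketched in a few lines of Python. This is a deliberately simplified illustration, not Ockerbloom's method: it links items only when they share a term and ignores the broader, narrower and related-term relationships that a real thesaurus provides. The function name, the input layout and the sample records are assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def related_items(index):
    """Link items that share at least one index term or tag.

    `index` maps an item id to its terms (hypothetical layout).
    Returns a dict: item id -> {related item id: set of shared terms}.
    """
    # Invert the index: term -> items carrying that term.
    by_term = defaultdict(set)
    for item, terms in index.items():
        for term in terms:
            by_term[term.lower()].add(item)

    # Every pair of items under the same term becomes a two-way link.
    links = defaultdict(lambda: defaultdict(set))
    for term, items in by_term.items():
        for a, b in combinations(sorted(items), 2):
            links[a][b].add(term)
            links[b][a].add(term)

    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {item: dict(others) for item, others in links.items()}


# Invented sample records, not data from the study.
sample = {
    "article-1": ["Influenza", "birds"],
    "article-2": ["influenza"],
    "article-3": ["vaccines"],
}
print(related_items(sample))
```

Keeping the shared terms on each link, rather than only a count, is what would let an interface show users why two items are connected as they browse.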