Get access

Combining bibliometrics, information retrieval, and relevance theory, Part 1: First examples of a synthesis



In Sperber and Wilson's relevance theory (RT), the ratio Cognitive Effects/Processing Effort defines the relevance of a communication. The tf*idf formula from information retrieval is used to operationalize this ratio for any item co-occurring with a user-supplied seed term in bibliometric distributions. The tf weight of the item predicts its effect on the user in the context of the seed term, and its idf weight predicts the user's processing effort in relating the item to the seed term. The idf measure, also known as statistical specificity, is shown to have unsuspected applications in quantifying interrelated concepts such as topical and nontopical relevance, levels of user expertise, and levels of authority. A new kind of visualization, the pennant diagram, illustrates these claims. The bibliometric distributions visualized are the works cocited with a seed work (Moby Dick), the authors cocited with a seed author (White HD, for maximum interpretability), and the books and articles cocited with a seed article (S.A. Harter's “Psychological Relevance and Information Science,” which introduced RT to information scientists in 1992). Pennant diagrams use bibliometric data and information retrieval techniques on the system side to mimic a relevance-theoretic model of cognition on the user side. Relevance theory may thus influence the design of new visual information retrieval interfaces. Generally, when information retrieval and bibliometrics are interpreted in light of RT, the implications are rich: A single sociocognitive theory may serve to integrate research on literature-based systems with research on their users, areas now largely separate.