In this issue


In this issue, Saracevic presents a two-part critical review that traces and synthesizes the scholarship on relevance over the past 30 years and provides an updated framework within which the still widely dissonant ideas and works about relevance might be interpreted and related. It is a continuation and update of a similar review that appeared in 1975. This first part of the review addresses the questions related to the nature and manifestations of relevance. The nature of relevance is discussed in terms of meaning ascribed to relevance, theories used or proposed, and models that have been developed. The manifestations of relevance are classified as to several kinds of relevance that form an interdependent system of relevances. Each section concludes with a summary that in effect provides an interpretation and synthesis of contemporary thinking on the topic treated or suggests hypotheses for future research. Analyses of some of the major trends that shape relevance work are offered in the conclusions.

Zhang and Benjamin present a conceptual framework (the Information-Model or I-model) to describe various aspects of information-related fields from an interdisciplinary perspective. The authors discuss the technology-driven evolution of several information-related fields, including the newly formed Information Field from a community of information schools. The authors posit that information-related fields (including the Information Field) are built on a number of other fields but have their own unique foci and concerns. The conceptual framework, in addition to providing a unified view of these related fields, can also be used to examine old case studies, recent research projects, and educational programs and curricula concerns to illustrate the commonalities and differences of information-related fields.

Cronin and Meho explore the relationship between creativity and both chronological and professional age in information science using a bibliometric approach that captures the shape of a scholar's career. The approach draws on D.W. Galenson's analysis of artistic creativity, notably his distinction between conceptual and experimental innovation, and also H.C. Lehman's seminal study of the relationship between stage of career and outstanding performance. The sample of 12 academics was drawn from lists of those who have either won the ASIST Award of Merit or the Research in Information Science award or both (Marcia Bates, Nicholas Belkin, Blaise Cronin, Raya Fidel, Paul Kantor, Carol Kuhlthau, Gary Marchionini, Tefko Saracevic, Dagobert Soergel, Don Swanson, Carol Tenopir and Howard White). For each academic, data were compiled regarding the frequency with which publications have been cited over time, the most highly cited publications within an author's output, and whether the most highly cited works were produced early or late in an author's career. The data were used to explore the relationship between stage of life and quality of work. The results suggest that creativity is expressed in different ways, at different times, and with different intensities in academic information science.

Galvez and Moya-Anegón explore the feasibility of using finite-state methods in the conflation of personal name variants into standard forms. In bibliographic databases and citation index systems, variant forms create problems of inaccuracy that affect information retrieval, the quality of information from databases, and the citation statistics used for the evaluation of scientists' work. A number of approximate string matching techniques have been developed to validate variant forms, based on similarity and equivalency relations. The authors classify the personal name variants as nonvalid and valid forms. In establishing an equivalence relation between valid variants and the standard form of their equivalence class, the authors defend the application of finite-state transducers. The process of variant identification requires the elaboration of binary matrices and finite-state graphs. This procedure was tested on samples of author names from bibliographic records, selected from the Library and Information Science Abstracts and Science Citation Index Expanded databases. The evaluation involved calculating the measures of precision and recall, based on completeness and accuracy. The results demonstrate the usefulness of this approach, although it should be complemented with methods based on similarity relations for the recognition of spelling variants and misspellings.
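
As a rough illustration of the evaluation step described above, the following Python sketch computes precision and recall for a name-conflation run. It assumes the standard information retrieval definitions (precision as the share of conflations made that are correct, recall as the share of known variants that were correctly conflated); the function name, the dictionary layout, and the sample names are hypothetical and are not taken from the authors' study.

```python
def precision_recall(predicted, gold):
    """Score a name-conflation run.

    predicted: dict mapping each variant the system conflated to the
               standard form it chose (hypothetical layout).
    gold:      dict mapping every known valid variant to its true
               standard form.
    """
    # A conflation is correct when the chosen standard form matches the gold one.
    correct = sum(1 for variant, form in predicted.items()
                  if gold.get(variant) == form)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Invented sample data: four known variants, three of them conflated,
# one of those three mapped to the wrong standard form.
gold = {
    "Smith, J.": "Smith, John",
    "Smith J": "Smith, John",
    "Garcia, M": "Garcia, Maria",
    "Garcia M.": "Garcia, Maria",
}
predicted = {
    "Smith, J.": "Smith, John",
    "Smith J": "Smith, John",
    "Garcia, M": "Garcia, Marta",
}
precision, recall = precision_recall(predicted, gold)
```

In this invented sample, two of the three conflations are correct, and two of the four known variants are recovered.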

Karamuftuoglu clarifies some of the issues surrounding the discussion regarding the usefulness of a substantive classification theory in information retrieval (IR) and attempts to broaden the debate to a larger context by including in the discussion tasks other than IR and domains other than information science (IS). By means of a concrete example from the high accuracy retrieval from documents (HARD) track of a Text REtrieval Conference (TREC), the limitations of automatic document classification are documented. The author suggests that the “bag of words” approach to information retrieval and techniques such as relevance feedback have significant limitations in expressing and resolving complex user information needs. He argues that a comprehensive analysis of information needs involves explicating implicit assumptions made by the authors of scholarly documents, as well as everyday texts such as news articles. He further argues that progress in IS can be furthered by developing general theories that are applicable to multiple domains. The concrete example of application of the domain-analytic approach to subject analysis in IS to the aesthetic evaluation of works of information arts is used to support the argument.

Huang et alia propose a three-dimensional, cubic typology for the characterization of Web users' online information behavior. A set of hypotheses concerning the relationships among the dimensions, as well as between the dimensions and related behavioral aspects, is proposed. Online panel data consisting of month-long clickstreams of 2,022 Web users from InsightXplore, Taiwan, were obtained for the validation of the hypotheses. The researchers found that a Web user's width (number of categories of Web sites explored), length (number of sites visited per category) and depth (number of pages downloaded per site) are highly correlated. These three dimensions are positively associated with speed of navigation, but negatively associated with the Web user's explicit online information search propensity and the degree of relatedness among the sites they visited.
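
The three dimensions can be made concrete with a small Python sketch. This is only one plausible reading of the definitions above, not the authors' operationalization: it assumes that length and depth are averages (sites per category, pages per site), that each site belongs to a single category, and that repeat views of the same page count once. The function name and the tuple layout are hypothetical.

```python
from collections import defaultdict


def browsing_dimensions(page_views):
    """Return (width, length, depth) for one user's clickstream.

    page_views: iterable of (category, site, page) tuples (assumed layout).
    """
    sites_by_category = defaultdict(set)
    pages_by_site = defaultdict(set)
    for category, site, page in page_views:
        sites_by_category[category].add(site)
        pages_by_site[site].add(page)

    # Width: number of distinct site categories explored.
    width = len(sites_by_category)
    if width == 0:
        return 0, 0.0, 0.0
    # Length: average number of distinct sites visited per category.
    length = sum(len(sites) for sites in sites_by_category.values()) / width
    # Depth: average number of distinct pages viewed per site.
    depth = sum(len(pages) for pages in pages_by_site.values()) / len(pages_by_site)
    return width, length, depth


# Invented clickstream for a single user.
views = [
    ("news", "a.example", "/1"),
    ("news", "a.example", "/2"),
    ("news", "b.example", "/1"),
    ("shop", "c.example", "/1"),
]
width, length, depth = browsing_dimensions(views)
```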

Tang describes a study that adopted a naturalistic approach to investigate users' interactions with a browsable MeSH (medical subject headings) display designed to facilitate query construction for the PubMed bibliographic database. Participants were recruited through mailing lists and bulletin boards in the health sciences department at a large research university. Nineteen participants completed the study (nine researchers, six students, three health care professionals, and one faculty member). Participants were asked to conduct at least nine search sessions of their own choosing during a two-and-a-half-month period. After signing up for the study, the participants were directed to an online tutorial of the functions of the faceted display and other search options. For any search, participants had the option of using the faceted display or the traditional search box. During the study, each session began and concluded with a questionnaire. The results indicate that participants preferred the faceted display when their information needs were vague and the search topics unfamiliar.

Ju reports on a study that investigated the relationship between different types of domain knowledge and information interactions. Subjects were observed and interaction data collected while they performed prescribed tasks using software called augmented seriation. Augmented seriation is a small-scale exploratory data analysis tool with a menu-based graphic user interface. As a geographic information system (GIS) exploratory tool, augmented seriation allows for permuting or regrouping icons in rows and columns to reconstruct the data matrix, which shows a correlation of the whole data set. Thirty-four participants were recruited for two groups of 17 students with different domain knowledge. One group consisted of 17 geography majors with high declarative knowledge. The other group consisted of 17 computer science majors with high procedural knowledge in the use of computer application software. Task completion time, task completedness, and mouse movements were collected while users performed six tasks during the experimental sessions. A repeated-measures ANOVA was used to analyze task completion time and task completedness. GOMS (Goals, Operators, Methods and Selection rules) was used for mouse movements to identify some of the similarities and differences between the two groups' information problem-solving processes. The GOMS analysis found the two groups to be remarkably similar in terms of processing activities. The ANOVA results indicate that expertise type was not a major factor influencing user performance, but task and task performance combined with the type of expertise played a significant role in the users' interactions with the interface.

Nahl proposes a model prompted by the growing theoretical importance of the affective domain in information science (IS) and human-computer interaction (HCI), and the increased need for integrated models that provide an explicit account of ways in which human mental and physiological systems dynamically interact in task performance with information systems. The ecological constructionism framework defines a social and biological information technology that is created through the dynamic intersection of technological affordances in symbiotic interaction with affective, cognitive, and sensorimotor information procedures that users creatively construct to satisfice the social practices inherent in information settings.

Moed analyzes how the citation impact of articles deposited in the Condensed Matter section of the preprint server ArXiv (hosted by Cornell University), and subsequently published in a scientific journal, compares to that of articles in the same journal that were not deposited in the archive. The study's principal aim is to further illustrate and roughly estimate the effect of two factors, “early view” and “quality bias,” on differences in citation impact between these two sets of papers, using citation data from Thomson Scientific's Web of Science. The author presents estimates for a number of journals in the field of condensed matter physics. To discriminate between an “open access” effect and an early view effect, longitudinal citation data were analyzed covering a time period as long as seven years. Quality bias was measured by calculating ArXiv citation impact differentials at the level of individual authors publishing in a journal, taking into account coauthorship. The analysis provides evidence of a strong quality bias and early view effect. After correcting for these effects, a sample of six condensed matter physics journals studied in detail shows no sign of a general “open access advantage” for papers deposited in ArXiv. The study does provide evidence that ArXiv accelerates citation because it makes papers available earlier, rather than because it makes them freely available.

Brown describes a study that explored the use of Web-based information by authors and readers of the chemistry literature through citation and content analysis of eight American Chemical Society (ACS) journals. Citation analysis was conducted at three points: 1996, 2000 and 2004. Content analysis of the journals was conducted for the occurrence of Web sites both within the full text of the articles and within reference lists. The number of articles augmented with ACS copyrighted electronic supporting information was also determined. The analyses indicate that, even though the number of Web-based information resources has grown steadily over the past decade, chemists are not taking full advantage of freely available Web-based resources. They are, however, making use of the ACS Electronic Supporting Information archive. The content of the Web-based resources that are used is primarily text based. The presence of a reference to a Web-based resource in a chemistry article does not influence its rate of citation. Comparison of citation and online access data reveals that at the highest levels of citation, articles also garner high levels of online access. This was especially true for articles describing a technique or methodology.

Burrell demonstrates that the construction of Lorenz/Leimkuhler curves advocated by Egghe is not equivalent to the classical construction nor does it include the classical case, as claimed by Rousseau (JASIST, 2007). The author shows that Egghe's construction yields the classical Leimkuhler curve of the dual function, up to scale transformation. The conclusion drawn is that, although Egghe produced an interesting approach and Rousseau an interesting observation, both failed to properly appreciate the standard construction.

Metzger summarizes much of what is known from the communication and information literacy fields about the skills that Internet users need to assess the credibility of online information. The article reviews current recommendations for credibility assessment and empirical research on how users determine the credibility of Internet information, and it describes several cognitive models of online information evaluation. Based on the literature review and critique of existing models of credibility assessment, recommendations for future online credibility education and practice are provided to assist users in locating reliable information online. The article concludes by offering ideas for research and theory development on this topic in an effort to advance knowledge in the area of credibility assessment of Internet-based information.

Cole et alia report on a field study that collected and analyzed the mental models of 80 self-selected undergraduate students registered in five social science courses at McGill University, Montreal, in the fall and winter terms of 2003. The undergraduates were interviewed one month into term about their course essay, an essay that would constitute part of their course grade. The data collection instrument was a structured interview schedule. The interview schedule required undergraduates to visualize and diagram their essay topics in three different ways using magic markers and large sheets of blank paper. This provided a total of 240 mental model representations from which the researchers created a 12-category mental model classification scheme. The data collection instrument was also designed to communicate learning about essay structure to the undergraduates to see if this would cause the undergraduates' mental model representations to come into alignment with a syndetic map of the same topic area.

Meho and Yang examine the impact of data sources on citation counts and rankings of LIS faculty, specifically comparing Web of Science (WoS) to Scopus and Google Scholar (GS). The analysis was based on citation counts and rankings of all 15 faculty members at the School of Library and Information Science at Indiana-Bloomington, which resulted in the examination of more than 10,000 citing and purportedly citing documents. The results indicate that Scopus alters the relative ranking of those scholars who appear in the middle of the rankings and that GS stands out in its coverage of conference proceedings, as well as international, non-English language journals. The authors conclude that the use of Scopus and GS, in addition to WoS, is more accurate and comprehensive in revealing the scholarly impact of authors.

Saracevic, in a two-part critical review, traces and synthesizes the scholarship on relevance over the past 30 years and provides an updated framework within which the still widely dissonant ideas and works about relevance might be interpreted and related. It is a continuation and update of a similar review that appeared in 1975. In this second part of the review, relevance behavior and effects are synthesized using experimental and observational works that incorporated data. As in the first part of the review, each section concludes with a summary that in effect provides an interpretation and synthesis of contemporary thinking on the topic treated or suggests hypotheses for future research. Analyses of some of the major trends that shape relevance work are offered in the conclusions.

Sotudeh and Horri examine the citation performance of open access journals (OAJs), reporting the results of a study carried out at the article level. To reduce the confounding effects of OA dynamics, the study was limited to prestigious, pure, stable and long-lasting OAJs. One hundred thirty-nine gold OAJs, indexed by SCI, were identified and validated in terms of their stability in providing a full, immediate OA policy. As a prestige criterion, only the 114 journals recognized enough by the scientific community to enter the Journal Citation Reports were included in the study. Non-English journals and those launched after 2001 were also omitted, resulting in a set of 99 journals. The SCI Expanded 3.0 available at Web of Science was used to extract characteristics of the articles published in the identified journals. A total of 27,948 unique items, originally published as OA, were collected. Citations were counted using a variable time window (from three years for articles published in 2003 to five years for those published in 2001). The journals were classified into four broad scientific disciplines: life sciences, natural sciences, engineering and material sciences, and multidisciplinary sciences. Regression analysis was used to determine the mathematical model that best describes the relationship between articles and citations across fields in OA journals. The results show that the power law model provides the best fit to the data. Overall, the authors found a similarity of the science system across OAJ and non-OAJ boundaries, indicating evidence of OA's widespread recognition by scientific communities. According to the models used in the study, the citation distributions between fields are strongly disproportionate in life sciences and engineering and material sciences, favoring larger fields in the former, but smaller fields in the latter. Distributions tend to be rather linear in the natural sciences.
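
To show what fitting a power law model to article and citation counts can look like, here is a minimal Python sketch. It assumes the model has the form C = a * P^b (citations C, articles P) and estimates a and b by ordinary least squares on the log-transformed values; this is a common technique, not necessarily the regression procedure the authors used. The function name and the sample figures are hypothetical.

```python
import math


def fit_power_law(articles, citations):
    """Fit citations = a * articles**b by least squares in log-log space."""
    log_x = [math.log(x) for x in articles]
    log_y = [math.log(y) for y in citations]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    # Slope of the log-log regression line is the exponent b.
    s_xx = sum((x - mean_x) ** 2 for x in log_x)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_x, log_y))
    b = s_xy / s_xx
    # The intercept gives log(a).
    a = math.exp(mean_y - b * mean_x)
    return a, b


# Invented data generated from an exact power law with a = 3 and b = 1.2.
articles = [10, 50, 200, 1000]
citations = [3 * p ** 1.2 for p in articles]
a, b = fit_power_law(articles, citations)
```

Under this reading, an exponent b above 1 would mean citations grow faster than article counts (favoring larger fields), b below 1 would favor smaller fields, and b close to 1 would correspond to a roughly linear relationship.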

Egghe demonstrates that Egghe's construction, rather than “including the standard case,” as asserted by Rousseau, actually leads to the Leimkuhler curve of the dual function, in the sense of Egghe. The author distinguishes between the Lorenz curve, a convex form arising from ranking from smallest to largest, and the Leimkuhler curve, a concave form arising from ranking from largest to smallest.
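
The distinction between the two curves can be sketched in Python as follows. This illustrates the classical construction only (cumulative share of items against cumulative fraction of sources, under the two rankings named above); it does not reproduce Egghe's construction or the dual function. The function names and sample values are hypothetical.

```python
def cumulative_share_curve(values, descending):
    """Return (fraction of sources, cumulative share of items) points."""
    ordered = sorted(values, reverse=descending)
    total = sum(ordered)
    n = len(ordered)
    points = [(0.0, 0.0)]
    running = 0
    for i, value in enumerate(ordered, start=1):
        running += value
        points.append((i / n, running / total))
    return points


def lorenz_curve(values):
    # Ranking from smallest to largest gives a convex curve below the diagonal.
    return cumulative_share_curve(values, descending=False)


def leimkuhler_curve(values):
    # Ranking from largest to smallest gives a concave curve above the diagonal.
    return cumulative_share_curve(values, descending=True)


# Invented productivity counts for four sources.
counts = [1, 2, 3, 4]
lorenz = lorenz_curve(counts)
leimkuhler = leimkuhler_curve(counts)
```

Both curves start at (0, 0) and end at (1, 1); they differ only in the order in which the sources are accumulated.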