The differences between latent topics in abstracts and citation contexts of citing papers



Although the citation context of a reference is commonly expected to provide more detailed and direct information about the nature of a citation, few studies have specifically addressed the extent to which the information in different parts of a scientific publication differs. Do abstracts tend to use conceptually broader terms than the sentences of a citation context in the body of a publication? In this article, we propose a method to analyze and compare latent topics in scientific publications, specifically topics drawn from the abstracts of papers that cite a target reference and topics drawn from the sentences that cite it. In an experiment, we applied topic modeling techniques to full-text papers in eight biomedicine journals. Topics derived from the two sources are compared in terms of their similarities and their broad-narrow relationships, which are defined on the basis of information entropy. The results show that abstracts and citation contexts are characterized by distinct sets of topics with moderate overlap. Furthermore, the results confirm that topics from the abstracts of citing papers contain broader terms than topics from citation contexts formed by citing sentences. The method and the findings could be used to enhance and extend current methodologies for research evaluation and citation evaluation.
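
The abstract does not spell out how the entropy-based broad-narrow relationship is defined, so the following is only an illustrative sketch under stated assumptions, not the method of the paper. It assumes that each topic is a probability distribution over terms and that a topic with higher Shannon entropy (probability spread more evenly across its terms) is treated as the broader one. The function names `shannon_entropy` and `is_broader`, and the two toy topics, are hypothetical and chosen for illustration only.

```python
import math


def shannon_entropy(weights):
    """Shannon entropy (in bits) of a topic's term weights.

    The weights are normalized to sum to 1 first, so raw scores work too.
    """
    weights = [w for w in weights if w > 0]
    total = sum(weights)
    return -sum((w / total) * math.log2(w / total) for w in weights)


def is_broader(topic_a, topic_b):
    """Assumed criterion: topic_a is broader if its entropy is higher."""
    return shannon_entropy(topic_a.values()) > shannon_entropy(topic_b.values())


# Hypothetical toy topics (term -> probability); not data from the study.
abstract_topic = {"cancer": 0.25, "therapy": 0.25, "patient": 0.25, "study": 0.25}
context_topic = {"egfr": 0.7, "mutation": 0.1, "inhibitor": 0.1, "kinase": 0.1}

print(shannon_entropy(abstract_topic.values()))
print(shannon_entropy(context_topic.values()))
print(is_broader(abstract_topic, context_topic))
```

In this toy case the evenly weighted topic has the higher entropy and is therefore labeled broader under the assumed criterion. A real analysis would take the term distributions from a fitted topic model rather than from hand-written dictionaries.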

