A novel approach for estimating the omitted-citation rate of bibliometric databases with an application to the field of bibliometrics



One of the most significant inaccuracies of bibliometric databases is that of omitted citations, that is, missing electronic links between a paper of interest and citing papers that are (or should be) covered by the database. This paper proposes a novel approach for estimating a database's omitted-citation rate, based on the combined use of two or more bibliometric databases. A statistical model is also presented for (a) estimating the “true” number of citations received by individual papers or sets of papers, and (b) defining an appropriate confidence interval. The proposed approach could represent a first step towards the definition of a standard for evaluating the accuracy level of databases.