Formulae for the h-index: A lack of robustness in Lotkaian informetrics?



In one of the first attempts to provide a mathematical framework for the Hirsch index, Egghe and Rousseau (2006) assumed the standard Lotka model for an author's citation distribution and derived a delightfully simple closed formula for the author's h-index. More recently, the same authors (Egghe & Rousseau, 2012b) presented a new (implicit) formula based on the so-called shifted Lotka function, in response to the objection that the original model makes no allowance for papers receiving zero citations. Here it is shown, through a small empirical study, that the two formulae give very similar results whether or not the uncited papers are included. More importantly, however, both are found to seriously underestimate the true h-index. We suggest that the reason is that the citation distribution of an author is a context in which straightforward Lotkaian informetrics is inappropriate. Indeed, the analysis suggests that even if attention is restricted to the upper tail of the citation distribution, a simple Lotka/Pareto-like model can give misleading results.
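To make the comparison concrete, the following is a minimal Python sketch of the two quantities being contrasted: the empirical h-index computed directly from a list of citation counts, and the closed-form Lotkaian estimate, taken here to be h = T^(1/α), where T is the number of papers and α > 1 is the Lotka exponent. The abstract does not state the formula, so this form, the function names `h_index` and `lotka_h`, the citation counts, and the value of α are all illustrative assumptions and are not data from the study. The implicit shifted-Lotka formula of Egghe and Rousseau (2012b) is not reproduced.

```python
def h_index(citations):
    """Empirical h-index: the largest h such that h papers have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def lotka_h(n_papers, alpha):
    """Closed-form estimate h = T**(1/alpha), assumed here for the standard Lotka model.

    n_papers (T) and alpha are supplied by the caller; alpha must exceed 1.
    """
    if n_papers <= 0:
        raise ValueError("n_papers must be positive")
    if alpha <= 1:
        raise ValueError("alpha must be greater than 1")
    return n_papers ** (1.0 / alpha)


if __name__ == "__main__":
    # Hypothetical citation counts for one author (illustrative only).
    citations = [25, 18, 12, 9, 6, 4, 3, 1, 0, 0]
    cited_only = [c for c in citations if c > 0]

    alpha = 2.0  # assumed exponent, not fitted to any data

    print("empirical h:", h_index(citations))
    print("Lotka estimate, all papers:", lotka_h(len(citations), alpha))
    print("Lotka estimate, cited papers only:", lotka_h(len(cited_only), alpha))
```

In practice α would be estimated from the author's citation data rather than fixed in advance, and the choice of T (all papers, or only cited papers) corresponds to the question of whether uncited papers are included.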