Abstract

The article studies the influence of the query formulation of a topic on its h-index. To generate purely random sets of documents, we used N-grams (N variable) to measure this influence: strings of zeros, truncated at the end. The databases used are Web of Science (WoS) and Scopus. The formula $h = T^{1/\alpha}$, proved in Egghe and Rousseau (2006), where $T$ is the number of retrieved documents and $\alpha$ is Lotka's exponent, is confirmed to be a concavely increasing function of $T$. We also give a formula for the relation between $h$ and $N$, the length of the N-gram: $h = D \cdot 10^{-N/\alpha}$, where $D$ is a constant. This is a convexly decreasing function of $N$, as found in our experiments. Nonlinear regression on $h = T^{1/\alpha}$ gives an estimate of $\alpha$, which can then be used to estimate the h-index of the entire database (WoS and Scopus): $h = S^{1/\alpha}$, where $S$ is the total number of documents in the database.
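The estimation step can be illustrated with a minimal sketch. It assumes the power-law form $h = T^{1/\alpha}$ and fits $\alpha$ by least squares with a golden-section search. This is a simple stand-in for nonlinear regression and not necessarily the procedure used in the article. The sample pairs, the database size, and all function names below are hypothetical and chosen only for illustration.

```python
import math


def sse(alpha, samples):
    """Sum of squared errors between observed h and the model T**(1/alpha)."""
    return sum((h - t ** (1.0 / alpha)) ** 2 for t, h in samples)


def fit_alpha(samples, lo=1.01, hi=10.0, tol=1e-9):
    """Golden-section search for the alpha in [lo, hi] that minimizes sse.

    Assumes the error is unimodal in alpha on the interval, which holds
    when every observation lies close to a single power-law curve.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if sse(c, samples) < sse(d, samples):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2.0


def h_from_size(size, alpha):
    """Model h-index of a set of `size` documents: size**(1/alpha)."""
    return size ** (1.0 / alpha)


# Synthetic (T, h) pairs generated with alpha = 2; NOT data from the article.
samples = [(100, 10.0), (400, 20.0), (2500, 50.0), (10000, 100.0)]

alpha_hat = fit_alpha(samples)

# Hypothetical database size S, used only to show the extrapolation step.
S = 1_000_000
h_database = h_from_size(S, alpha_hat)

print(f"estimated alpha: {alpha_hat:.4f}")
print(f"estimated database h-index: {h_database:.1f}")
```

Because the fitted $\alpha$ enters the extrapolation as an exponent, small errors in $\alpha$ are amplified when $S$ is large, so real query data would call for a confidence interval on $\alpha$ as well as the point estimate.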