Since the 1990s, heightened competition and strong growth in the international higher education market have produced an increasing number of rankings that use bibliometric and other data to measure the scientific performance of institutions. The Leiden Ranking 2011/2012 (LR) was published early in 2012. Starting from Goldstein and Spiegelhalter's (1996) recommendations for conducting quantitative comparisons among institutions, in this study we reformulated the LR by means of multilevel regression models. First, our models replicated the ranking results. Second, the reanalysis of the LR data showed that only 5% of the total variation in PPtop10% is attributable to differences between universities. Beyond that, about 80% of the variation between universities can be explained by differences among countries. If covariates are included in the model, the differences among most of the universities become meaningless. Our findings have implications for conducting university rankings in general and for the LR in particular. For example, Goldstein-adjusted confidence intervals make it possible to assess the significance of differences among universities: rank differences should be interpreted as meaningful only if the universities' confidence intervals do not overlap.
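
The overlap rule stated above can be illustrated with a minimal sketch. It assumes independent estimates with equal standard errors, in which case intervals of about ±1.39 standard errors (1.96/√2) make non-overlap correspond to a pairwise test at the 5% level. The university names, PPtop10% values, and standard errors are hypothetical and are not taken from the LR data; the intervals in the study itself come from the multilevel models and may be computed differently.

```python
from math import sqrt

# Multiplier for pairwise comparisons at the 5% level, assuming
# independent estimates with equal standard errors (an assumption
# of this sketch, not a value reported in the abstract).
Z_PAIRWISE = 1.96 / sqrt(2)  # about 1.39


def adjusted_interval(estimate, se, z=Z_PAIRWISE):
    """Return the (lower, upper) bounds of an overlap-adjusted interval."""
    return (estimate - z * se, estimate + z * se)


def intervals_overlap(a, b):
    """True if the two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical PPtop10% estimates and standard errors (illustration only).
universities = {
    "A": (0.150, 0.010),
    "B": (0.120, 0.010),
    "C": (0.140, 0.010),
}

intervals = {
    name: adjusted_interval(est, se) for name, (est, se) in universities.items()
}

names = sorted(intervals)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        overlap = intervals_overlap(intervals[first], intervals[second])
        verdict = "not meaningful" if overlap else "meaningful"
        print(f"{first} vs {second}: rank difference {verdict}")
```

With these made-up values, only the difference between A and B would count as meaningful; A and C, and B and C, have overlapping intervals, so their rank order should not be interpreted.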