Hypothesis tests for population heterogeneity in meta-analysis


Correspondence should be addressed to Wolfgang Viechtbauer, Department of Methodology and Statistics, University of Maastricht, P.O. Box 616, 6200 MD Maastricht, The Netherlands (e-mail: wolfgang.viechtbauer@stat.unimaas.nl).


The choice of an appropriate model in meta-analysis is often treated as an empirical question that is answered by examining the amount of variability in the effect sizes. When all of the observed variability in the effect sizes can be accounted for by sampling error alone, the set of effect sizes is said to be homogeneous and a fixed-effects model is typically adopted. Whether a set of effect sizes is homogeneous is usually tested with the so-called Q test. In this paper, a variety of alternative homogeneity tests (the likelihood ratio, Wald and score tests) are compared with the Q test in terms of their Type I error rate and power for four different effect size measures. Monte Carlo simulations show that the Q test kept the tightest control of the Type I error rate, although the results emphasize the importance of large sample sizes within the set of studies. The results also suggest under what conditions the power of the tests can be considered adequate.
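As a rough illustration of the Q test mentioned above, the following minimal sketch computes the standard Q statistic as an inverse-variance weighted sum of squared deviations from the pooled estimate. The function name `q_statistic` and the example data are hypothetical and are not taken from the paper; the sampling variances are assumed to be known, which in practice holds only approximately.

```python
def q_statistic(effects, variances):
    """Return (Q, df) for the Q test of homogeneity.

    effects   -- observed effect size estimates, one per study
    variances -- their sampling variances (assumed known here)
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two effect sizes with matching variances")
    # Inverse-variance weights: more precise studies count more.
    weights = [1.0 / v for v in variances]
    # Weighted average of the effect sizes (fixed-effects pooled estimate).
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Q: weighted sum of squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    return q, len(effects) - 1


# Hypothetical data: three studies with equal sampling variances.
effects = [0.10, 0.30, 0.50]
variances = [0.04, 0.04, 0.04]
q, df = q_statistic(effects, variances)
```

Under the null hypothesis of homogeneity, Q is referred to a chi-square distribution with k - 1 degrees of freedom, where k is the number of studies. If SciPy is available, a p-value could be obtained with `scipy.stats.chi2.sf(q, df)`.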