Sample size determination for clustered count data


Correspondence to: Dulal Bhaumik, Department of Psychiatry, Division of Biostatistics, University of Illinois at Chicago, Chicago, IL, U.S.A.



We consider the problem of sample size determination for count data. Such data arise naturally in multicenter (or cluster) randomized clinical trials, where patients are nested within research centers. For subject-level randomized designs we use cluster-specific estimators (maximum likelihood based on generalized mixed-effects regression); for cluster-level randomized designs we use population-averaged estimators (generalized estimating equations). We provide simple closed-form expressions for calculating the number of clusters needed to compare the event rates of two groups in cross-sectional studies. These expressions are based on either the between-cluster variation or the intercluster correlation. We provide both theoretical and numerical comparisons of our methods with existing methods. In particular, we show that the proposed method performs better for subject-level randomized designs, whereas for cluster-level randomized designs the comparative performance depends on the rate ratio. We also provide a versatile method for longitudinal studies. Three real data examples illustrate the results. Copyright © 2013 John Wiley & Sons, Ltd.
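To make the idea of a closed-form cluster count concrete, the following is a minimal sketch of a generic calculation for a cluster-level randomized, cross-sectional design. It is not the expression derived in the paper, which is not reproduced in this abstract. It assumes a Wald test on the log rate ratio, Poisson counts with unit exposure per subject, equal cluster sizes, and the textbook design effect 1 + (m - 1)ρ. The function name `clusters_per_arm` and all parameter names are hypothetical.

```python
import math
from statistics import NormalDist


def clusters_per_arm(rate_control, rate_treatment, cluster_size, icc,
                     alpha=0.05, power=0.80):
    """Approximate number of clusters per arm for comparing two event rates.

    Illustrative assumption (not the paper's formula): the variance of the
    estimated log rate ratio under independence, (1/rate_control +
    1/rate_treatment) / (K * m), is inflated by the design effect
    1 + (m - 1) * icc, where m is the common cluster size.
    """
    if rate_control <= 0 or rate_treatment <= 0:
        raise ValueError("event rates must be positive")
    if rate_control == rate_treatment:
        raise ValueError("rates must differ for a finite sample size")
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")

    # Standard normal quantiles for a two-sided test and the target power.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)

    # Variance inflation due to within-cluster correlation.
    design_effect = 1 + (cluster_size - 1) * icc

    # Effect size on the log scale and per-subject variance term.
    log_rate_ratio = math.log(rate_treatment / rate_control)
    variance_term = 1 / rate_control + 1 / rate_treatment

    k = ((z_alpha + z_beta) ** 2 * design_effect * variance_term
         / (cluster_size * log_rate_ratio ** 2))
    return math.ceil(k)


if __name__ == "__main__":
    # Hypothetical inputs: 2.0 vs 1.5 events per subject, 20 subjects per cluster.
    print(clusters_per_arm(2.0, 1.5, cluster_size=20, icc=0.05))
```

Under these assumptions the required number of clusters grows linearly with the design effect, so even a small correlation can roughly double the requirement when clusters are large. Setting `icc=0` recovers the calculation for independent subjects.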