Article first published online: 26 JUL 2012
Copyright © 2012 John Wiley & Sons, Ltd.
Statistics in Medicine
Volume 32, Issue 1, pages 67–80, 15 January 2013
How to Cite
Paul, P., Pennell, M. L. and Lemeshow, S. (2013), Standardizing the power of the Hosmer–Lemeshow goodness of fit test in large data sets. Statist. Med., 32: 67–80. doi: 10.1002/sim.5525
- Issue published online: 11 DEC 2012
- Manuscript Accepted: 21 JUN 2012
- Manuscript Received: 22 JUL 2010
Keywords:
- goodness of fit
- Hosmer–Lemeshow test
- logistic regression
- perinatal epidemiology
The Hosmer–Lemeshow test is a commonly used procedure for assessing goodness of fit in logistic regression; it has, for example, been widely used to evaluate risk-scoring models. As with any statistical test, its power increases with sample size. This can be undesirable for goodness of fit tests because, in very large data sets, even small departures from the proposed model will be declared statistically significant. By considering how power depends on the number of groups used in the Hosmer–Lemeshow test, we show how the power may be standardized across different sample sizes in a wide range of models. We provide mathematical derivations and confirm them through simulation and through analysis of data on 31,713 children from the Collaborative Perinatal Project. We make recommendations on how to choose the number of groups in the Hosmer–Lemeshow test based on sample size and provide example applications of the recommendations.
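For readers unfamiliar with the statistic, the following is a minimal sketch of the conventional Hosmer–Lemeshow computation with the number of groups exposed as a parameter g. It is an illustration only, not the authors' code, and it does not implement their recommendation for choosing g, which this abstract does not reproduce. The function names (`hosmer_lemeshow`, `chi2_sf`) and the toy data are hypothetical. The sketch assumes equal-size groups formed by sorting on the fitted probabilities and the usual chi-square reference distribution with g - 2 degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1.

    Uses the recurrence Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    with h = x / 2, starting from Q(1, h) = exp(-h) for even df and
    Q(1/2, h) = erfc(sqrt(h)) for odd df.
    """
    half = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-half)
    else:
        a, q = 0.5, math.erfc(math.sqrt(half))
    while a < df / 2.0:
        q += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return q


def hosmer_lemeshow(y, p, g=10):
    """Return (statistic, df, p_value) for g equal-size groups.

    y: observed 0/1 outcomes; p: fitted probabilities from a logistic model.
    """
    if len(y) != len(p):
        raise ValueError("y and p must have the same length")
    if g < 3 or len(y) < g:
        raise ValueError("need g >= 3 and at least g observations")

    # Sort observations by fitted probability, then cut into g groups.
    pairs = sorted(zip(p, y))
    n = len(pairs)
    stat = 0.0
    for k in range(g):
        group = pairs[k * n // g:(k + 1) * n // g]
        n_k = len(group)
        expected = sum(p_i for p_i, _ in group)
        observed = sum(y_i for _, y_i in group)
        p_bar = expected / n_k
        if not 0.0 < p_bar < 1.0:
            raise ValueError("group mean probability must lie strictly in (0, 1)")
        # Pearson-type contribution of this group.
        stat += (observed - expected) ** 2 / (n_k * p_bar * (1.0 - p_bar))

    df = g - 2
    return stat, df, chi2_sf(stat, df)


if __name__ == "__main__":
    # Made-up toy data: four risk levels, ten subjects each, five events per level.
    p_toy = [0.2] * 10 + [0.4] * 10 + [0.6] * 10 + [0.8] * 10
    y_toy = ([1] * 5 + [0] * 5) * 4
    print(hosmer_lemeshow(y_toy, p_toy, g=4))
```

Because g is an explicit argument, the same fitted model can be tested with different numbers of groups, which is the quantity the abstract proposes tying to sample size. Ties in the fitted probabilities are split arbitrarily by the sort, a simplification that statistical packages handle in differing ways.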