Goodness-of-Fit Testing for the Logistic Regression Model when the Estimated Probabilities are Small

Abstract

The distributions of the Hosmer-Lemeshow chi-square type goodness-of-fit statistics (Čg, Ȟg) for the logistic regression model are examined via simulations designed to assess their behavior when most of the estimated probabilities are small or are expected to fall in a few deciles. The simulation results show that statistic Čg should be used when the two outcome groups (y = 0, 1) are not well separated, Δ ≤ 2, where Δ² is the Mahalanobis distance. Statistic Ȟg should be used when Δ ≥ 8. Either statistic may be used when 2 ≤ Δ ≤ 8. All tests should be used with caution when the proportion of the sample with y = 1 is less than 0.1.
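
The selection rule stated above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names `mahalanobis_delta` and `recommend_statistic` are hypothetical, Δ is estimated here from the pooled within-group sample covariance (the abstract does not say how it is estimated), and the boundary values Δ = 2 and Δ = 8, which the abstract assigns to two ranges at once, are arbitrarily mapped to "either".

```python
import numpy as np


def mahalanobis_delta(x0, x1):
    """Estimate Delta, the square root of the Mahalanobis distance Delta^2
    between the covariate means of the y = 0 and y = 1 groups.

    x0, x1: arrays of shape (n_obs, n_covariates), one per outcome group.
    Assumption: a pooled within-group covariance matrix is used.
    """
    x0 = np.asarray(x0, dtype=float)
    x1 = np.asarray(x1, dtype=float)
    n0, n1 = len(x0), len(x1)
    diff = x1.mean(axis=0) - x0.mean(axis=0)
    # np.cov returns a 0-d array for a single covariate, so force 2-D.
    s0 = np.atleast_2d(np.cov(x0, rowvar=False))
    s1 = np.atleast_2d(np.cov(x1, rowvar=False))
    pooled = ((n0 - 1) * s0 + (n1 - 1) * s1) / (n0 + n1 - 2)
    # Quadratic form diff' * pooled^{-1} * diff, without an explicit inverse.
    return float(np.sqrt(diff @ np.linalg.solve(pooled, diff)))


def recommend_statistic(delta, prop_y1):
    """Apply the abstract's rule of thumb.

    Returns (choice, caution): choice is "C_g", "H_g", or "either";
    caution is True when the proportion with y = 1 is below 0.1.
    Assumption: the shared boundaries Delta = 2 and Delta = 8 map to "either".
    """
    if delta < 2:
        choice = "C_g"
    elif delta > 8:
        choice = "H_g"
    else:
        choice = "either"
    return choice, prop_y1 < 0.1
```

For example, `recommend_statistic(mahalanobis_delta(x0, x1), y.mean())` would return the suggested statistic together with a caution flag, given covariate arrays `x0` and `x1` for the two outcome groups and a 0/1 outcome vector `y`.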
