Nonparametric inference on the E/O ratio in model validation



In the prevention of diseases such as cancer and osteoporosis, statistical models are often used to identify subjects at high risk. The ratio of the expected (or predicted) number of cases in the target population to the observed number of cases (the E/O ratio) is a useful measure of the model's goodness of fit. The model is usually evaluated on a sample taken from the target population, and statistical inference on the E/O ratio in the literature often assumes that the expected number is a constant and that the observed number follows a Poisson distribution. In this paper, we introduce a nonparametric method that takes into account the variability of the predicted number due to sampling and its correlation with the observed number. By estimating the variance of the estimated E/O ratio more accurately, this nonparametric approach offers better inference. In addition, we propose an F-statistic to test the goodness of fit of a model across subgroups defined by certain risk factors. Copyright © 2007 John Wiley & Sons, Ltd.
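
To make the contrast between the two variance assumptions concrete, the following is a minimal sketch, not the estimator derived in the paper. It assumes each sampled subject contributes a predicted risk and a 0/1 outcome, and it uses a standard delta-method approximation for the variance of log(E/O) that keeps both the sampling variability of the predicted number and its covariance with the observed number. The function name `eo_ratio_intervals` and the toy data are hypothetical.

```python
import math


def eo_ratio_intervals(predicted, observed, z=1.96):
    """Compare two approximate confidence intervals for the E/O ratio.

    predicted: per-subject predicted risks (assumed to lie in [0, 1])
    observed:  per-subject outcomes coded 0/1
    Both intervals are built on the log scale. This is an illustrative
    delta-method sketch, not the paper's own estimator.
    """
    n = len(predicted)
    if n != len(observed) or n < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    p_bar = sum(predicted) / n
    y_bar = sum(observed) / n
    if y_bar == 0 or p_bar == 0:
        raise ValueError("E and O must both be positive")
    ratio = p_bar / y_bar  # equals sum(predicted) / sum(observed)

    # Sample variances and covariance of the per-subject pairs.
    s_pp = sum((p - p_bar) ** 2 for p in predicted) / (n - 1)
    s_yy = sum((y - y_bar) ** 2 for y in observed) / (n - 1)
    s_py = sum((p - p_bar) * (y - y_bar)
               for p, y in zip(predicted, observed)) / (n - 1)

    # Delta method for log(p_bar / y_bar): E is treated as random and
    # its covariance with O is subtracted.
    var_np = (s_pp / p_bar ** 2 + s_yy / y_bar ** 2
              - 2 * s_py / (p_bar * y_bar)) / n
    var_np = max(var_np, 0.0)  # guard against tiny negative rounding error

    # Conventional assumption: E fixed, O ~ Poisson, so Var(log O) ~ 1/O.
    var_pois = 1.0 / sum(observed)

    def interval(var):
        half = z * math.sqrt(var)
        return (ratio * math.exp(-half), ratio * math.exp(half))

    return {
        "ratio": ratio,
        "var_log_nonparametric": var_np,
        "var_log_poisson": var_pois,
        "ci_nonparametric": interval(var_np),
        "ci_poisson": interval(var_pois),
    }


# Toy data (made up for illustration): two risk strata of 50 subjects each.
toy_predicted = [0.2] * 50 + [0.6] * 50
toy_observed = [1] * 10 + [0] * 40 + [1] * 30 + [0] * 20
toy_result = eo_ratio_intervals(toy_predicted, toy_observed)
```

In this toy sample the predicted risks are positively correlated with the outcomes, so the covariance term reduces the variance of log(E/O) relative to the fixed-E Poisson approximation. With other data the ordering can differ.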