How to handle missing data in regression models using information criteria


  •  This study has been funded by the Dutch Organisation for Scientific Research NWO-VICI-453-05-002.


An important application of multiple regression is predictor selection. When the data contain no missing values, information criteria can be used to select predictors; for example, one could apply the AICC, the small-sample-corrected version of the Akaike information criterion (AIC). In this article, we discuss how information criteria should be calculated when the dependent variable and/or the predictors contain missing values. To this end, we discuss and evaluate in detail three models that can be employed to deal with the missing data, that is, to predict the missing values. The most complex model, that is, the model with all available predictors, outperforms the other models. These results also apply to hypotheses more general than predictor selection and to structural equation modeling (SEM).
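As a point of reference for the complete-data case mentioned above, the following is a minimal sketch of how the AICC might be computed for a Gaussian linear regression. It assumes the common definition AICC = AIC + 2k(k + 1)/(n - k - 1), with AIC = -2 log L + 2k, and it counts the intercept and the residual variance among the k parameters. The function name `aicc_linear` is hypothetical and not taken from the article, and the sketch does not cover the missing-data calculation that the article itself addresses.

```python
import numpy as np


def aicc_linear(X, y):
    """AICC of an ordinary least squares fit of y on X (complete data only).

    X is an (n, p) array of predictors without an intercept column;
    y is a length-n response vector. Assumes Gaussian errors.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape

    # Parameters: p slopes, one intercept, one residual variance.
    k = p + 2
    if n - k - 1 <= 0:
        raise ValueError("AICC requires n > k + 1")

    # Least squares fit with an explicit intercept column.
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))

    # Maximized Gaussian log-likelihood, using the ML variance rss / n.
    sigma2 = rss / n
    loglik = -0.5 * n * (np.log(2.0 * np.pi * sigma2) + 1.0)

    aic = -2.0 * loglik + 2.0 * k
    return aic + 2.0 * k * (k + 1) / (n - k - 1)
```

Candidate predictor sets can then be compared by calling `aicc_linear` on each subset of columns and preferring the subset with the smallest value.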