A General Method for Dealing with Misclassification in Regression: The Misclassification SIMEX




Summary. We have developed a new general approach for handling misclassification in discrete covariates or responses in regression models. The simulation and extrapolation (SIMEX) method, which was originally designed for handling additive covariate measurement error, is applied to the case of misclassification. The statistical model for characterizing misclassification is given by the transition matrix Π from the true to the observed variable. We exploit the relationship between the size of misclassification and the bias in estimating the parameters of interest. Assuming that Π is known or can be estimated from validation data, we simulate data with higher misclassification and extrapolate back to the case of no misclassification. We show that our method is quite general and applicable to models with a misclassified response and/or misclassified discrete regressors. For a misclassified binary response, we compare our method with the approach of Neuhaus (1999, Biometrika 86, 843–855); for a misclassified binary regressor, we compare it with the matrix method of Morrissey and Spiegelman (1999, Biometrics 55, 338–344). We apply our method to a study on caries with a misclassified longitudinal response.
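
The simulate-and-extrapolate idea described in the Summary can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions that the Summary does not state: extra misclassification at level λ is generated by applying Π^λ to the observed variable, column j of Π holds the distribution of the observed category given true category j, the model is a linear regression on one misclassified binary regressor, and the extrapolant is a quadratic in λ evaluated at λ = -1. All function names and the simulated data are hypothetical.

```python
import numpy as np

def matrix_power_frac(Pi, lam):
    """Pi**lam via eigendecomposition.
    Assumes Pi is diagonalizable with positive eigenvalues."""
    vals, vecs = np.linalg.eig(Pi)
    powered = vecs @ np.diag(vals.astype(complex) ** lam) @ np.linalg.inv(vecs)
    return np.real(powered)

def misclassify(x, P, rng):
    """Redraw each category in x; column j of P is the distribution
    of the new category given current category j (assumed convention)."""
    cum = np.cumsum(P, axis=0)[:, x]           # shape (K, n)
    u = rng.random(x.shape[0])
    new = (u > cum).sum(axis=0)                # inverse-CDF draw per observation
    return np.minimum(new, P.shape[0] - 1)     # guard against rounding in cum

def naive_slope(x, y):
    """Least-squares slope of y on x, ignoring misclassification."""
    return np.polyfit(x, y, 1)[0]

def extrapolate(lams, ests, target=-1.0):
    """Fit a quadratic in lambda and evaluate it at the target (assumed extrapolant)."""
    coef = np.polyfit(lams, ests, 2)
    return np.polyval(coef, target)

def mcsimex_slope(x_obs, y, Pi, lams=(0.5, 1.0, 1.5, 2.0), B=20, seed=0):
    """Simulation step: add misclassification Pi**lam, B replicates per lambda.
    Extrapolation step: go back to lambda = -1 (no misclassification)."""
    rng = np.random.default_rng(seed)
    grid = [0.0]
    ests = [naive_slope(x_obs, y)]             # lambda = 0 is the naive estimate
    for lam in lams:
        P_lam = matrix_power_frac(Pi, lam)
        sims = [naive_slope(misclassify(x_obs, P_lam, rng), y) for _ in range(B)]
        grid.append(lam)
        ests.append(np.mean(sims))
    return extrapolate(grid, ests)

# Hypothetical simulated data: true slope 2, binary regressor observed with error.
rng = np.random.default_rng(1)
n = 20000
x_true = rng.integers(0, 2, size=n)
y = 1.0 + 2.0 * x_true + rng.normal(size=n)
Pi = np.array([[0.9, 0.2],
               [0.1, 0.8]])                    # columns sum to 1
x_obs = misclassify(x_true, Pi, rng)

naive = naive_slope(x_obs, y)
corrected = mcsimex_slope(x_obs, y, Pi)
print("naive slope:", naive, "extrapolated slope:", corrected)
```

In this sketch the naive slope is attenuated toward zero, and the extrapolated value should lie closer to the true slope; how close depends on how well the chosen extrapolant matches the actual bias curve.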