A Pseudo-Bayesian Shrinkage Approach to Regression with Missing Covariates
Article first published online: 7 DEC 2011
© 2011, The International Biometric Society
Volume 68, Issue 3, pages 933–942, September 2012
How to Cite
Zhang, N. and Little, R. J. (2012), A Pseudo-Bayesian Shrinkage Approach to Regression with Missing Covariates. Biometrics, 68: 933–942. doi: 10.1111/j.1541-0420.2011.01718.x
- Issue published online: 26 SEP 2012
- Received January 2011. Revised October 2011. Accepted October 2011.
Keywords
- Complete-case analysis;
- Drop variables analysis;
- Gibbs sampling;
- Nonignorable modeling;
- Variable selection
Summary
We consider the linear regression of outcome Y on regressors W and Z with some values of W missing, when our main interest is the effect of Z on Y, controlling for W. Three common approaches to regression with missing covariates are (i) complete-case analysis (CC), which discards the incomplete cases; (ii) ignorable likelihood methods, which base inference on the observed-data likelihood, assuming the missing data are missing at random (Rubin, 1976b); and (iii) nonignorable modeling, which posits a joint distribution of the variables and missing data indicators. Another simple practical approach that has not received much theoretical attention is to drop the regressor variables containing missing values from the regression model (DV, for drop variables). DV does not lead to bias when either (i) the regression coefficient of W is zero or (ii) W and Z are uncorrelated. We propose a pseudo-Bayesian approach for regression with missing covariates that compromises between the CC and DV estimates, exploiting information in the incomplete cases when the data support DV assumptions. We illustrate favorable properties of the method by simulation, and apply the proposed method to a liver cancer study. Extension of the method to more than one missing covariate is also discussed.
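The two conditions under which DV is unbiased can be checked with a small simulation. The sketch below is illustrative only and is not the authors' pseudo-Bayesian method: it computes just the CC and DV least-squares estimates of the Z coefficient. The function name `simulate`, the true Z coefficient of 2, the standard normal design, and the assumption that W is missing completely at random are all hypothetical choices made for the example, not details taken from the article.

```python
import numpy as np


def simulate(beta_w, rho, n=20000, miss_rate=0.3, seed=0):
    """Return (CC estimate, DV estimate) of the Z coefficient.

    Assumed setup (hypothetical, for illustration only):
      Y = 1 + beta_w * W + 2 * Z + noise, with corr(W, Z) = rho,
      and W missing completely at random with probability miss_rate.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    # W has unit variance and correlation rho with Z.
    w = rho * z + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
    y = 1.0 + beta_w * w + 2.0 * z + rng.standard_normal(n)

    # True where W is observed; the remaining cases are incomplete.
    observed = rng.random(n) > miss_rate

    # CC: regress Y on (1, W, Z) using only the complete cases.
    x_cc = np.column_stack(
        [np.ones(observed.sum()), w[observed], z[observed]]
    )
    coef_cc, *_ = np.linalg.lstsq(x_cc, y[observed], rcond=None)

    # DV: drop W and regress Y on (1, Z) using all cases.
    x_dv = np.column_stack([np.ones(n), z])
    coef_dv, *_ = np.linalg.lstsq(x_dv, y, rcond=None)

    return coef_cc[2], coef_dv[1]


if __name__ == "__main__":
    for beta_w, rho in [(0.0, 0.5), (1.0, 0.0), (1.0, 0.5)]:
        cc, dv = simulate(beta_w, rho)
        print(f"beta_w={beta_w}, rho={rho}: CC={cc:.3f}, DV={dv:.3f}")
```

Under this setup the DV slope converges to 2 + beta_w * rho, so it matches the true Z coefficient when either beta_w or rho is zero and is biased otherwise. The CC estimate targets the true coefficient in all three settings but uses only the complete cases, which is the efficiency loss that a compromise between CC and DV aims to recover.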