Robust Estimation of Area Under ROC Curve Using Auxiliary Variables in the Presence of Missing Biomarker Values
Version of Record online: 3 SEP 2010
© 2010, The International Biometric Society
Volume 67, Issue 2, pages 559–567, June 2011
How to Cite
Long, Q., Zhang, X. and Johnson, B. A. (2011), Robust Estimation of Area Under ROC Curve Using Auxiliary Variables in the Presence of Missing Biomarker Values. Biometrics, 67: 559–567. doi: 10.1111/j.1541-0420.2010.01487.x
- Issue online: 20 JUN 2011
- Received January 2010. Revised July 2010. Accepted July 2010.
Keywords
- Area under the curve;
- Doubly robust estimators;
- Missing at random;
- Missing not at random;
- Receiver operating characteristic curve;
- Sensitivity analysis
Summary
In medical research, receiver operating characteristic (ROC) curves are used to evaluate how well biomarkers diagnose disease or predict the risk of developing a disease in the future. The area under the ROC curve (AUC) is a widely used summary measure, especially when comparing multiple ROC curves. In observational studies, estimation of the AUC is often complicated by missing biomarker values, which can bias existing estimators of the AUC. In this article, we develop robust statistical methods for estimating the AUC. The proposed methods use information from auxiliary variables that are potentially predictive of the missingness of the biomarkers or of the missing biomarker values; we are particularly interested in auxiliary variables that are predictive of the missing biomarker values. Under missing at random (MAR), that is, when missingness of biomarker values depends only on the observed data, our estimators are consistent if one correctly specifies, conditional on auxiliary variables and disease status, either the model for the probability of being missing or the model for the biomarker values. Under missing not at random (MNAR), that is, when missingness may depend on the unobserved biomarker values, we propose a sensitivity analysis to assess the impact of MNAR on the estimation of the AUC. The asymptotic properties of the proposed estimators are studied and their finite-sample behavior is evaluated in simulation studies. The methods are further illustrated using data from a study of maternal depression during pregnancy.
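
To make the bias under MAR concrete, the following is a minimal sketch and not the estimator proposed in the article. It assumes a hypothetical data-generating model (one auxiliary variable, normal errors, a logistic missingness model with known coefficients) and compares three quantities: the Mann-Whitney AUC on the full data, the complete-case AUC, and a simple inverse-probability-weighted (IPW) AUC that uses the true observation probabilities. The function names `weighted_auc` and `simulate` are illustrative only. A doubly robust estimator of the kind described in the summary would additionally use a model for the biomarker values, which is omitted here.

```python
import numpy as np


def weighted_auc(x_diseased, x_healthy, w_diseased=None, w_healthy=None):
    """Weighted Mann-Whitney estimate of P(X_d > X_h) + 0.5 * P(X_d = X_h)."""
    x_d = np.asarray(x_diseased, dtype=float)
    x_h = np.asarray(x_healthy, dtype=float)
    w_d = np.ones_like(x_d) if w_diseased is None else np.asarray(w_diseased, dtype=float)
    w_h = np.ones_like(x_h) if w_healthy is None else np.asarray(w_healthy, dtype=float)
    # Compare every diseased subject with every healthy subject.
    diff = x_d[:, None] - x_h[None, :]
    kernel = (diff > 0) + 0.5 * (diff == 0)
    pair_weights = w_d[:, None] * w_h[None, :]
    return float((pair_weights * kernel).sum() / pair_weights.sum())


def expit(z):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def simulate(n=2000, seed=0):
    """Simulate MAR missingness driven by an auxiliary variable (assumed model)."""
    rng = np.random.default_rng(seed)
    # Auxiliary variable A is predictive of the biomarker X in both groups.
    a_d = rng.normal(size=n)
    a_h = rng.normal(size=n)
    x_d = 1.0 + a_d + rng.normal(size=n)  # diseased subjects
    x_h = a_h + rng.normal(size=n)        # healthy subjects
    # Observation probability depends only on A and disease status (MAR).
    p_d = expit(1.0 - a_d)
    p_h = expit(1.0 + a_h)
    r_d = rng.random(n) < p_d
    r_h = rng.random(n) < p_h
    return {
        "full": weighted_auc(x_d, x_h),
        "complete_case": weighted_auc(x_d[r_d], x_h[r_h]),
        # IPW with the true probabilities; in practice these must be estimated.
        "ipw": weighted_auc(x_d[r_d], x_h[r_h], 1.0 / p_d[r_d], 1.0 / p_h[r_h]),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

Under this assumed setup, diseased subjects with high auxiliary values are less likely to be observed and healthy subjects with high auxiliary values are more likely to be observed, so the complete-case AUC is pulled toward 0.5, while weighting by the inverse observation probability should bring the estimate back close to the full-data value.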