Covariate Adjustment in Estimating the Area Under ROC Curve with Partially Missing Gold Standard

Authors

  • Danping Liu

    Corresponding author
    • National Alzheimer's Coordinating Center, University of Washington, Seattle, Washington 98195, U.S.A.
  • Xiao-Hua Zhou

    Corresponding author
    • Department of Biostatistics, University of Washington, Seattle, Washington 98195, U.S.A.
    • Biostatistics and Bioinformatics Branch, Division of Epidemiology, Statistics, and Prevention Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, Maryland 20892, U.S.A.

email: azhou@uw.edu

email: danping.liu@nih.gov

Summary

In ROC analysis, covariate adjustment is advocated when covariates affect the magnitude or accuracy of the test under study. Meanwhile, for many large-scale screening tests, the true condition status may be missing for some subjects because ascertaining the disease status is expensive and/or invasive. A complete-case analysis may then yield biased inference, a problem known as “verification bias.” To address covariate adjustment in the presence of verification bias in ROC analysis, we propose several estimators for the area under the covariate-specific and covariate-adjusted ROC curves (AUCx and AAUC, respectively). The AUCx is modeled directly in the form of a binary regression, and the estimating equations are based on U-statistics. The AAUC is estimated as a weighted average of AUCx over the covariate distribution of the diseased subjects. We employ reweighting and imputation techniques to overcome the verification bias problem. Our proposed estimators are first derived under the assumption that the true disease status is missing at random (MAR); with some modification, they can be extended to the not missing at random (NMAR) situation. The asymptotic distributions of the proposed estimators are derived, and their finite-sample performance is evaluated in a series of simulation studies. Our method is applied to a data set from Alzheimer's disease research.
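To illustrate the reweighting idea mentioned in the summary, the following is a minimal sketch of an inverse-probability-weighted (IPW) U-statistic estimate of the AUC without covariates. It is not the estimator proposed in the paper, which additionally models AUCx through a binary regression. The function name `ipw_auc` and its arguments are hypothetical, and the verification probabilities are assumed to be known or already estimated under MAR.

```python
import numpy as np


def ipw_auc(test, disease, verified, prob_verified):
    """IPW U-statistic estimate of the AUC under partial verification.

    Hypothetical helper for illustration only.
    test          : test results for all subjects
    disease       : true disease status (1/0); may be NaN when unverified
    verified      : True where the gold standard was ascertained
    prob_verified : assumed P(verified | observed data) for each subject
    """
    test = np.asarray(test, dtype=float)
    disease = np.asarray(disease, dtype=float)
    verified = np.asarray(verified, dtype=bool)
    prob_verified = np.asarray(prob_verified, dtype=float)

    # Verified subjects get weight 1/pi; unverified subjects get weight 0.
    w = np.where(verified, 1.0 / prob_verified, 0.0)
    # Unknown disease status is masked out; its weight is already zero.
    d = np.where(verified, disease, 0.0)

    w_case = w * d          # weights of verified diseased subjects
    w_ctrl = w * (1.0 - d)  # weights of verified non-diseased subjects

    # Pairwise kernel: 1 if case result > control result, 0.5 for ties.
    diff = test[:, None] - test[None, :]
    kernel = (diff > 0) + 0.5 * (diff == 0)

    # Weighted proportion of correctly ordered case-control pairs.
    return (w_case @ kernel @ w_ctrl) / (w_case.sum() * w_ctrl.sum())


# With every subject verified, the estimate reduces to the usual empirical AUC.
auc_full = ipw_auc(
    test=[0.9, 0.8, 0.3, 0.1],
    disease=[1, 0, 1, 0],
    verified=[True, True, True, True],
    prob_verified=[1.0, 1.0, 1.0, 1.0],
)
print(auc_full)  # 0.75
```

When some subjects are unverified, those with a low verification probability who were nonetheless verified receive larger weights, so that the verified sample stands in for the full cohort.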
