Semiparametric analysis of incomplete current status outcome data under transformation models



Motivated by an osteoporosis survey study, this work considers regression analysis with incompletely observed current status data. The current status data, consisting of an examination time and an indicator of whether the event of interest has occurred by that time, are not observed for all subjects. Instead, a surrogate outcome, which is subject to misclassification of the current status, is available for all subjects. We focus on semiparametric regression under transformation models, which include the proportional hazards and proportional odds models as special cases. Under a missing at random mechanism, in which the missingness of the current status outcome may depend only on the observed surrogate outcome and covariates, we propose a validation likelihood approach. It is based on the likelihood from the validation subsample, where the data are fully observed, and adjusts for the probability of observing the current status outcome as well as for the distribution of the surrogate outcome in the validation subsample. We propose a computationally efficient algorithm for implementation and establish consistency and asymptotic normality of the proposed estimator for inference. An application to the osteoporosis survey data and simulations show that the validation likelihood performs well: it removes the bias of the “complete case” analysis, which discards subjects with missing data, and achieves higher efficiency than the inverse probability weighting analysis.
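
For orientation, the following sketch writes out one commonly used form of the semiparametric transformation model and the likelihood contribution of a fully observed current status record. The notation is an assumption made for illustration and may differ from the notation adopted in the paper: T is the event time, C the examination time, Z the covariates, \delta = I(T \le C) the current status indicator, \Lambda_0 an unspecified baseline function, and G_r the logarithmic transformation family.

```latex
% Assumed notation (not taken from the source): T, C, Z, \delta, \beta, \Lambda_0, G_r.
% Transformation model for the conditional cumulative hazard of T given Z:
\[
  \Lambda(t \mid Z) = G_r\!\left\{ \Lambda_0(t)\, e^{\beta^{\top} Z} \right\},
  \qquad
  G_r(x) =
  \begin{cases}
    x, & r = 0 \quad \text{(proportional hazards)},\\[2pt]
    \dfrac{\log(1 + r x)}{r}, & r > 0 \quad (r = 1 \text{: proportional odds}).
  \end{cases}
\]
% Likelihood contribution of one fully observed current status record (C, \delta, Z),
% with F(t \mid Z) = 1 - \exp\{-\Lambda(t \mid Z)\}:
\[
  L(\beta, \Lambda_0) = F(C \mid Z)^{\delta}\, \bigl\{ 1 - F(C \mid Z) \bigr\}^{1 - \delta}.
\]
```

Under this assumed form, r = 0 gives \Lambda(t \mid Z) = \Lambda_0(t) e^{\beta^{\top} Z}, the proportional hazards model, while r = 1 gives survival odds \{1 - F(t \mid Z)\}/F(t \mid Z) = \{\Lambda_0(t) e^{\beta^{\top} Z}\}^{-1}, the proportional odds model.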