Estimating Propensity Scores and Causal Survival Functions Using Prevalent Survival Data

Authors

  • Yu-Jen Cheng

    Corresponding author
    1. Institute of Statistics, National Tsing Hua University, Hsin-Chu 300, Taiwan
      email: ycheng@stat.nthu.edu.tw
  • Mei-Cheng Wang

    Corresponding author
    1. Department of Biostatistics, Johns Hopkins University, 615 N. Wolfe Street E3614, Baltimore, Maryland 21205, U.S.A.
      email: mcwang@jhsph.edu

Abstract

Summary: This article develops semiparametric approaches for estimating propensity scores and causal survival functions from prevalent survival data. The analytical problem arises when prevalent sampling is adopted for collecting failure times and, as a result, the covariates are incompletely observed because of their association with failure time. The proposed procedure for estimating propensity scores shares features with the likelihood formulation in case-control studies, but in our case it requires additional consideration of the intercept term. The results show that, in the logistic regression setting, corrected propensity scores can be obtained through a standard estimation procedure with specific adjustments to the intercept term. For causal estimation, two different sources of missingness are encountered in our model: one is explained by the potential outcomes framework; the other is caused by the prevalent sampling scheme. Statistical analysis that fails to adjust for both sources of missingness will lead to biased causal inference. The proposed methods were partly motivated by, and applied to, the Surveillance, Epidemiology, and End Results (SEER)-Medicare linked data for women diagnosed with breast cancer.
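
The case-control analogy mentioned in the summary can be illustrated with a small simulation. The sketch below is not the estimator proposed in the article: it only shows the classical case-control fact that, under outcome-dependent sampling, a standard logistic fit recovers the slope while the intercept is shifted by the log ratio of the sampling probabilities. The function name `fit_logistic`, the parameter values, and the sampling probabilities `pi1` and `pi0` are all assumptions chosen for illustration, and the sampling probabilities are treated as known.

```python
import numpy as np


def fit_logistic(X, y, n_iter=25):
    """Fit logistic regression by Newton-Raphson; X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (y - p)                      # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])  # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta


rng = np.random.default_rng(0)

# Hypothetical population model: logit P(Y=1 | x) = alpha + slope * x
n = 200_000
alpha, slope = -2.0, 1.0
x = rng.normal(size=n)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(alpha + slope * x))))

# Outcome-dependent sampling (assumed known): keep all Y=1, 20% of Y=0
pi1, pi0 = 1.0, 0.2
keep = rng.random(n) < np.where(y == 1, pi1, pi0)

# Standard logistic fit on the biased sample
X_sampled = np.column_stack([np.ones(keep.sum()), x[keep]])
naive = fit_logistic(X_sampled, y[keep])

# Only the intercept needs correcting: subtract log(pi1 / pi0)
corrected_intercept = naive[0] - np.log(pi1 / pi0)

print("naive intercept:    ", naive[0])
print("corrected intercept:", corrected_intercept)
print("slope estimate:     ", naive[1])
```

In this toy setting the slope from the biased sample should be close to the population value, and the intercept should be close to the population value only after the offset is removed. The adjustment required under prevalent sampling, as described in the summary, involves further considerations that this sketch does not attempt to reproduce.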
