• missing data
• multiple imputation
• proportional hazards model
• relative survival
• colon cancer


Relative survival assesses the effects of prognostic factors on disease-specific mortality when the cause of death is uncertain or unavailable. It provides an estimate of patients' survival, allowing for the effects of other independent causes of death. Regression-based relative survival models are commonly used in population-based studies to model the effects of prognostic factors and to estimate net survival. Most often, studies focus on routinely collected prognostic factors for which the proportion of missing values is usually low (around 5 per cent). However, in some cases, additional factors are collected with a greater proportion of missing values. In the present article, we systematically assess the performance of multiple imputation in regression analysis of relative survival through a series of simulation experiments. Under different assumptions concerning the missingness mechanism (missing completely at random, missing at random, and missing not at random) and the missingness pattern (monotone, non-monotone), several strategies were considered and compared: all cases analysis, complete cases analysis, missing data indicator analysis, and multiple imputation by chained equations (MICE) analysis. We show that MICE performs well in estimating the hazard ratios and the baseline hazard function when the missingness mechanism is missing at random (MAR) conditionally on vital status. In situations where the missingness mechanism is not MAR conditionally on vital status, complete cases analysis behaves consistently. As an illustration, we use data from the French Cancer Registries on the relative survival of patients with colorectal cancer. Copyright © 2008 John Wiley & Sons, Ltd.