Regression with incomplete covariates and left-truncated time-to-event data


Correspondence to: Richard J. Cook, Department of Statistics and Actuarial Science, University of Waterloo, 200 University Avenue West, Waterloo, ON, Canada N2L 3G1.



Studies of chronic diseases routinely sample individuals subject to conditions on an event time of interest. In epidemiology, for example, prevalent cohort studies aiming to evaluate risk factors for survival following onset of dementia require subjects to have survived to the point of screening. In clinical trials designed to assess the effect of experimental cancer treatments on survival, patients are required to survive from the time of cancer diagnosis to recruitment. Such conditions yield samples featuring left-truncated event time distributions. Incomplete covariate data often arise in such settings, but standard methods do not deal with the fact that individuals’ covariate distributions are also affected by left truncation. We describe an expectation–maximization algorithm for dealing with incomplete covariate data in such settings, which uses the covariate distribution conditional on the selection criterion. We describe an extension to deal with subgroup analyses in clinical trials for the case in which the stratification variable is incompletely observed.

Copyright © 2012 John Wiley & Sons, Ltd.