Summary: To develop better treatment and screening programs for cancer prevention, it is important to understand the natural history of the disease and the factors that affect its progression. We focus on a framework first outlined by Kimmel and Flehinger (1991, Biometrics, 47, 987–1004), and in particular on one of their limiting scenarios for analysis. Using an equivalence with a binary regression model, we characterize the nonparametric maximum likelihood estimation procedure for the tumor size distribution function and give associated asymptotic results. Extensions to semiparametric models and missing data are also described. Data from two cancer studies are used to illustrate the finite-sample behavior of the procedure.