## 1 Introduction

The objective of cancer screening is to detect tumors at an early stage, when curative treatment has a better chance of success. Since the randomized trials of breast cancer screening in the seventies and eighties and the introduction of breast cancer and cervical cancer screening, authors have developed statistical models for the evaluation of screening programs [1, 2]. In these models, clinical diagnosis of cancer (e.g., due to symptoms) is preceded by a preclinical phase in which the cancer is detectable with a suitable screening test, the so-called preclinical detectable phase (PCDP). The duration of the PCDP is known as the sojourn time, and the time by which screening advances diagnosis, that is, the time between detection by screening and diagnosis in the absence of screening, is known as the lead time. In a sense, the mean sojourn time and mean lead time measure the quality of a screening test, by indicating how much earlier tumors can be detected by the test.

Sojourn time and lead time are unobserved quantities, but mean values can be estimated by comparing the detection rate at (first) screening with cancer incidence in the absence of screening. A common estimate is the so-called prevalence/incidence ratio, based on a long-known relation between the prevalence of preclinical disease (i.e., the probability *P* of the presence of preclinical cancer), the incidence *I*, and the mean sojourn time *μ*:

*P* = *I* × *μ*, so that *μ* = *P* ∕ *I*.

Prevalence is estimated from the detection rate *R* at first screening and the sensitivity of the screening test *s*: *P* = *R* ∕ *s*. This model may be adequate when sojourn times are short and incidence can be treated as a constant, for instance, in the case of breast cancer with an estimated mean sojourn time of 1–2 years [1, 2]. However, in the case of prostate cancer, *P* ∕ *I* ratios of 8–12 years have been reported, and with incidence increasing with age and/or time, it cannot be treated as a constant. For the case of exponentially distributed sojourn times, Zelen and Feinleib [1] showed that the relation still holds for prevalence and incidence at the time of screening.
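The prevalence/incidence estimate described above can be sketched in a few lines. The numeric values below are hypothetical round numbers chosen for illustration, not estimates from the paper:

```python
# Sketch of the prevalence/incidence (P/I) estimate of mean sojourn time.
# All input values are hypothetical, purely for illustration.

def mean_sojourn_time(detection_rate, sensitivity, incidence):
    """Estimate mean sojourn time as mu = P / I, with P = R / s."""
    prevalence = detection_rate / sensitivity  # P = R / s
    return prevalence / incidence              # mu = P / I

# e.g. detection rate R = 4% at first screen, test sensitivity s = 0.9,
# and a constant incidence I of 0.5% per year (hypothetical values):
mu = mean_sojourn_time(0.04, 0.9, 0.005)
print(round(mu, 2))  # 8.89 years
```

Note that the estimate scales inversely with the assumed sensitivity, which is itself usually unknown and has to be assumed or estimated jointly.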

Recently, several authors [3-6] have used the catch-up time method for estimating mean sojourn time, where mean sojourn time is estimated as the time needed for the cumulative incidence in an unscreened population to catch up with the detection rate in the first round of screening. Figure 1, reproduced from [4], illustrates the estimation for the Rotterdam section of the European Randomized Study of Screening for Prostate Cancer (ERSPC). Cumulative incidence in the control arm takes 8.16 years to reach the detection rate of 4.0% in the first round of screening in the screening arm.
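Numerically, the catch-up time method amounts to finding the first time at which the cumulative incidence in the unscreened population reaches the first-round detection rate. A minimal sketch, using a hypothetical age-increasing incidence function rather than ERSPC data:

```python
# Minimal sketch of the catch-up time method: solve for the time t at which
# cumulative incidence in an unscreened population reaches the detection
# rate of the first screening round. The incidence function is a
# hypothetical stand-in, not based on ERSPC data.

import numpy as np

def catch_up_time(incidence, detection_rate, t_max=20.0, dt=0.001):
    """Return the first t with cumulative incidence >= detection_rate."""
    t = np.arange(0.0, t_max, dt)
    cum_inc = np.cumsum(incidence(t)) * dt  # simple Riemann sum
    idx = np.searchsorted(cum_inc, detection_rate)
    return t[idx]

# Hypothetical incidence increasing with time (per year), 4.0% detection rate:
inc = lambda t: 0.003 + 0.0002 * t
print(round(catch_up_time(inc, 0.04), 2))  # ~ 10.0 years for these inputs
```

Because incidence increases with age, the catch-up time depends on where along the incidence curve the screening round takes place, which is central to the bias discussed below.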

The relation between the sojourn time and lead time distributions is not straightforward, except in the case of exponentially distributed sojourn times. In that case, the hazard of clinical diagnosis, which marks the end of the sojourn time, is constant, and by the memoryless property the remaining time from detection by screening to clinical diagnosis is again exponentially distributed with the same hazard rate. In practical applications, exponentially distributed sojourn times are commonly assumed.
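The memoryless property invoked above is easy to check by simulation: conditional on the sojourn time exceeding some fixed elapsed time, the residual time is again exponential with the same rate. A quick Monte Carlo check, with a hypothetical rate chosen for illustration:

```python
# Monte Carlo check of the memoryless property of exponential sojourn times:
# conditional on the preclinical phase lasting beyond a fixed time a, the
# residual time is again exponential with the same rate, so its mean is
# still 1/lam. The rate lam and screen time a are hypothetical.

import random

random.seed(1)
lam = 0.25   # hypothetical rate: mean sojourn time 1/lam = 4 years
a = 2.0      # elapsed preclinical time at the moment of screen detection

residuals = []
for _ in range(200_000):
    sojourn = random.expovariate(lam)
    if sojourn > a:                  # tumor still preclinical at time a
        residuals.append(sojourn - a)

print(round(sum(residuals) / len(residuals), 2))  # close to 1/lam = 4.0
```

For any non-exponential sojourn time distribution this equality fails, which is why the lead time distribution is then no longer available in closed form.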

In the following, we first present a simple statistical model behind the catch-up time method, then show that in this model the catch-up time method yields a biased estimate of the mean sojourn time, and finally show how this model differs from the classic models of Zelen and Feinleib [1] and Walter and Day [2]. The experience of prostate cancer screening in the Rotterdam center is worked out numerically to illustrate the effect of the estimation method on the estimated mean sojourn time. We also compare our results with the maximum-likelihood methods and estimates of these authors for breast cancer screening.