Overdiagnosis in lung cancer screening: Estimates from the German Lung Cancer Screening Intervention Trial

Overdiagnosis is a major potential harm of lung cancer screening; knowing its potential magnitude helps to optimize screening eligibility criteria. The German Lung Cancer Screening Intervention Trial (“LUSI”) is a randomized trial among 4052 long‐term smokers (2622 men), 50.3 to 71.9 years of age, from the general population around Heidelberg, Germany, comparing five annual rounds of low‐dose computed tomography (n = 2029) with a control arm without intervention (n = 2023). After a median follow‐up of 9.77 years postrandomization and 5.73 years since last screening, 74 participants were diagnosed with lung cancer in the control arm and 90 in the screening arm, 69 of them during the active screening period (63 screen‐detected and 6 interval cancers). The excess cumulative incidence in the screening arm (N = 16) represented 25.4% (95% confidence interval: −11.3%, 64.3%) of screen‐detected cancer cases (N = 63). Analyzed by histologic subtype, excess incidence in the screening arm appeared largely driven by adenocarcinomas. Statistical modeling yielded an estimated mean preclinical sojourn time (MPST) of 5.38 (4.76, 5.88) years and a screen‐test sensitivity of 81.6% (74.4%, 88.8%) for lung cancer overall, all histologic subtypes combined. Based on modeling, we further estimated that about 48% (47.5% [43.2%, 50.7%]) of screen‐detected tumors have a lead time ≥4 years, whereas about 33% (32.8% [28.4%, 36.1%]) have a lead time ≥6 years, 23% (22.6% [18.6%, 25.7%]) ≥8 years, 16% (15.6% [12.2%, 18.3%]) ≥10 years and 11% (10.7% [8.0%, 13.0%]) ≥12 years. The high proportions of tumors with relatively long lead times suggest a major risk of overdiagnosis for individuals with comparatively short remaining life expectancies.


| INTRODUCTION
Randomized trials in the United States 1 and Europe 2-5 have convincingly shown that screening by low-dose computed tomography (LDCT) is a viable strategy for reducing lung cancer mortality. However, this mortality reduction has to be balanced against the risks of negative side effects, including long-term radiation exposure, surgical interventions for benign lesions after false-positive diagnoses, and overdiagnosis.
Overdiagnosis refers to tumors that would not have become manifest in the absence of screening. 6 It results from detecting tumors ahead of the time at which they would cause symptoms or death. Within this lead-time window, a proportion of screening participants would otherwise have died of other causes before ever knowing of their lung cancer. As a diagnosis of lung cancer generally leads to referral for aggressive treatment, overdiagnosis may incur serious and unnecessary losses in quality of life as well as unnecessary health-care costs. 6 Estimates of its expected magnitude, depending on an individual's age at the time of screening, may help optimize lung cancer screening eligibility criteria.
In randomized trials, the extent of overdiagnosis has been assessed by determining the excess cumulative incidence in the LDCT screening arm as compared to a control arm without screening, 2,3,5,7 or compared to a control arm using the less sensitive standard chest X-ray as an alternative screening method, 1,8 after screening had stopped. 6 Initial estimates varied from zero excess in the Italian Lung Cancer Screening Trial (ITALUNG) 3 to 67.2% (95% confidence interval [CI]: 37.1% to 95.4%) in the Danish Lung Cancer Screening Trial (DLCST). 7 In the US National Lung Screening Trial (NLST), the estimated excess incidence of LDCT relative to chest X-ray was 18.5% (5.4%, 30.6%). 8 In the Dutch-Belgian Randomized Lung Cancer Screening Trial (Nederlands Leuvens Longkanker Screening Onderzoek, NELSON), 10 years postrandomization and about 4.5 years since last screening participation, the excess incidence estimate among men was 19.7% of screen-detected cases (95% CI: −5.2% to 41.6%). 2 In these studies, however, follow-up times after screening cessation varied and were likely too short to cover the longest possible tumor lead times, which is required for unbiased estimation of overdiagnosis in the given population of screening participants.
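The excess-incidence measure described above reduces to simple arithmetic. The following is a minimal sketch, using the LUSI case counts given in the abstract (90 cases in the screening arm, 74 in the control arm, 63 screen-detected). The function name is hypothetical, and the sketch assumes arms of near-equal size, so that raw case counts can be compared directly; it does not reproduce the reported confidence interval, whose derivation is not described in this section.

```python
def excess_incidence_fraction(cases_screening_arm, cases_control_arm, screen_detected):
    """Excess cumulative incidence, expressed as a fraction of screen-detected cases.

    Assumes both trial arms are of (near-)equal size, so raw counts are comparable.
    """
    if screen_detected <= 0:
        raise ValueError("screen_detected must be positive")
    excess_cases = cases_screening_arm - cases_control_arm
    return excess_cases / screen_detected


# LUSI counts as reported in the abstract
lusi_fraction = excess_incidence_fraction(
    cases_screening_arm=90,
    cases_control_arm=74,
    screen_detected=63,
)
print(f"Excess incidence: {lusi_fraction:.1%} of screen-detected cases")
```

With these inputs the excess is 16 cases, or 16/63 of the screen-detected cancers, which matches the 25.4% point estimate reported for LUSI.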
Further to the excess-incidence method, mathematical modeling has proven useful for generating estimates of overdiagnosis under alternative screening scenarios and follow-up times, beyond those of original trials. Patz et al 8 [...]
Here, we present estimates for excess lung cancer incidence, as well as of preclinical sojourn time combined with screen-test sensitivity, from data of the German Lung Cancer Screening Intervention Trial (LUSI). 11,12 We compare our findings with those from other randomized screening trials in the United States and Europe, and discuss possible sources for heterogeneity in estimates published so far.
Besides screen detection, the prospective incidence of lung cancer in both study arms (Supplemental Methods section, Supplemental Table 1) was ascertained comprehensively by a combination of annual follow-up questionnaires (self-reports), and record linkages to cancer registries and mortality registers. For all lung cancer cases, detailed information from medical records (pathology reports, medical letters from responsible physicians on diagnosis and treatment and radiology reports, with their exact dates) was obtained by contacting the treating clinics, and coded to ICD-O-3 for tumor histology and stage (Supplemental Methods section, Supplemental Table 2). Extensive descriptions of study design and results for mortality reduction have been published previously. 5,13

| Statistical methods
For the present analyses, performed between February and May 2020, the end-date for follow-up of lung cancer incidence was set at [...]
What's new?
The reduced lung cancer mortality achieved through low-dose computed tomography screening must be balanced against the risk of overdiagnosis. In this randomized screen- [...]

| RESULTS
The 4052 participants in our study (long-term smokers, 2622 males [...]
Heterogeneity in the excess incidence observed in these independent trials may in part be related to differences in the duration of postscreening follow-up over which excess incidence was estimated, and to population differences in death rates from competing causes during follow-up (eg, depending on age, sex, smoking history and other determinants of general health and overall mortality rates). In addition, excess incidence may vary due to differences in the sensitivity of early lung cancer detection, for example, because of heterogeneous protocols used for malignant nodule detection or due to different time intervals between successive screenings (1, 2 and 2.5 years, successively, in the NELSON trial, vs annual screening in the other studies) (Table 4).
Excess incidence is largely caused by lead time due to earlier tumor detection, and does not necessarily reflect overdiagnosis, which results exclusively when lead time exceeds the remaining life expectancy of participants with screen-detected cancers. In randomized stop-screen trials, the cumulative incidence gap between screening and control arm generally narrows progressively with increasing duration of follow-up, as even the more slowly growing tumors gradually become manifest in the control arm. If follow-up since last screening does not fully cover even the longest detection lead times for all participants in the trial, excess incidence will generally yield an overestimate of overdiagnosis. In our data, the median follow-up time since last screening participation was 5.73 years, and 25% of participants still had follow-up times below 4.8 years. Compared to the mean preclinical sojourn time (MPST), follow-up times in the LUSI trial may have been too short for excess incidence to be a valid estimate of overdiagnosis, and this was likely also the case for the excess incidence reported so far for the DLCST, NELSON and ITALUNG studies. The point is well illustrated by the NLST, which in initial analyses showed an excess incidence of 18.5% after a median follow-up time of 4.5 years after last screening participation, 8 whereas in more recent analyses, after follow-up had been extended to an average of about 9.3 years since last screening, the excess lung cancer incidence in the LDCT arm had reduced to 3% (compared to the control arm with chest radiography). 18 It is important to note, however, that excess incidence in randomized screening trials may provide an overall estimate for the screened population as a whole, but not necessarily for those individuals who would naturally be at highest risk of being overdiagnosed, for example, those of more advanced age, in view of their more limited residual life expectancies.
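The relation between MPST and the length of postscreening follow-up can be illustrated numerically. The sketch below assumes that preclinical sojourn time is exponentially distributed with mean 5.38 years (the MPST point estimate from the abstract) and that, by the memoryless property, the lead time of a screen-detected tumor follows the same exponential distribution. This distributional assumption is not stated in the present section and is used here for illustration only; the function name is hypothetical.

```python
import math

MPST_YEARS = 5.38  # mean preclinical sojourn time, point estimate from the abstract


def fraction_with_lead_time_at_least(years, mean_sojourn=MPST_YEARS):
    """Survival function of an exponential lead-time distribution (assumed model)."""
    if years < 0 or mean_sojourn <= 0:
        raise ValueError("years must be >= 0 and mean_sojourn must be > 0")
    return math.exp(-years / mean_sojourn)


# Point estimates reported in the abstract, for comparison
reported = {4: 0.475, 6: 0.328, 8: 0.226, 10: 0.156, 12: 0.107}

for threshold, reported_value in reported.items():
    modeled = fraction_with_lead_time_at_least(threshold)
    print(f">= {threshold:2d} years: modeled {modeled:.3f}, reported {reported_value:.3f}")

# Share of screen-detected tumors whose lead time would exceed the
# median postscreening follow-up of 5.73 years under this assumption
beyond_follow_up = fraction_with_lead_time_at_least(5.73)
print(f"Lead time > 5.73 years: {beyond_follow_up:.3f}")
```

Under this assumption the modeled fractions agree with the reported point estimates to within rounding. The same assumption implies that a substantial share of screen-detected tumors would have lead times longer than the median follow-up since last screening, which is consistent with the argument that excess incidence in LUSI may still overestimate overdiagnosis.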
Considering excess incidence as a measure for estimating overdiagnosis, several further factors may cause bias. First, bias may occur when screening is also applied in the control arm. Especially in the NLST, where all control arm participants were systematically screened by standard chest radiography (X-ray), excess incidence in the LDCT arm may underrepresent the true magnitude of overdiagnosis by LDCT as compared to no screening at all. In European trials, by contrast, this type of bias may have been minimal, as there was no systematic screening in the control arm and reported rates of screening contamination in the control arms were low (Table 4). Another potential source of bias is confounding, for example, due to imperfect randomization and resulting imbalances between the trial arms in baseline lung cancer risk, or due to postrandomization differences in factors such as participation in parallel smoking cessation programs that may alter lung cancer risk independently of LDCT screening. For the DLCST, the study investigators reported imperfect randomization, possibly by chance, resulting in significantly more participants with more than 35 pack-years of smoking and a higher proportion of participants with more obstructed lung function in the screening arm as compared to the control arm, which may have contributed to excess lung cancer incidence in the screening arm not caused by LDCT detection. 19 In the ITALUNG trial, by contrast, participants in the screening arm were reported to exhibit significantly higher rates of smoking cessation, and lower rates of relapse into smoking among baseline ex-smokers, as compared to usual-care controls, 20 [...]
In our data, although the excess incidence for cancer overall is modest for women, the excess incidence for adenocarcinoma is much larger for women than for men (Supplemental Table 2).
We previously reported 5 that in LUSI the distribution of histologic subtypes differed significantly between men and women, with women showing a higher proportion of adenocarcinomas, and a much smaller percentage of small cell tumors, than men. Furthermore, LDCT detection (first 5 years after randomization) led to a predominance of diagnosed adenocarcinomas in the screening arm as compared to the control arm, and this was more strongly the case among women than among men. [...] suggest a greater potential for overdiagnosis in women, a finding further supported by quantitative modeling analyses performed in the context of the Cancer Intervention and Surveillance Modeling Network (CISNET). 22 At the same time, our own data (Becker et al 5) and other data 2,23 also suggest that LDCT screening may be associated with a greater reduction in lung cancer mortality among women than among men.