Lung cancer is the most common cause of cancer-related death in men and the second most common cause of cancer-related death in women in Europe.1 Because lung cancer is often at an advanced stage at the time of diagnosis, 5-year survival is only 15% or less.2 Japanese studies and the US Early Lung Cancer Action Project (ELCAP) showed that in a high-risk population more lung cancers can be detected by spiral computed tomography (CT) screening than by chest X-ray screening.3, 4 These and other observational studies with spiral CT screening showed that 55–85% of CT-detected lung cancers at baseline screening in a high-risk population of current and former smokers are at a surgically removable stage (stage I).5 Although these results seem promising, observational studies are prone to lead-time, length-time and overdiagnosis bias. Only a randomised design allows disease-specific mortality, rather than survival, to be compared between the screened and the unscreened population. Lead-time, length-time and overdiagnosis do not bias the analysis in such comparisons.6 Therefore, the National Lung Screening Trial (NLST)7 was launched in the United States in 2002, and the NELSON trial (a Dutch acronym for ‘Dutch-Belgian lung cancer screening trial’) was launched in the Netherlands and Belgium in September 2003. The NLST investigates whether spiral CT screening and treatment of early lesions will decrease lung cancer mortality compared to chest X-ray screening. NELSON investigates whether 16-detector multi-slice CT screening will decrease lung cancer mortality compared to a control group without screening.
Large randomised controlled trials such as these, and studies in general, are often constrained by limited resources and capacity. Therefore, a careful selection of study participants is crucial to reach sufficient power and to minimize sample size and costs. When only persons with an extremely high risk of lung cancer are included in these trials, the required sample size will be low, but the number of eligible subjects will be low as well, and consequently the effort needed to recruit them will be very high. It will also be difficult to generalize the study results to current and former smokers in general. When very low risk groups (e.g. non-smokers) are also included, the power of such a trial will decrease and the required sample size to reach sufficient power will increase and may exceed the available screening capacity. Selection criteria for single-arm lung cancer screening studies and for ongoing and planned trials show some variability in smoking exposure and age.4, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17 This variability can cause differences in the characteristics of the trial population and in cancer detection rates. The purpose of this study was to describe a method for arriving at an optimal selection and recruitment of the eligible population for a lung cancer screening trial, taking into account the available resources and screening capacity and the influence that selection criteria have on the estimated lung cancer mortality and on the power of such a trial.
CARET, Carotene and Retinol Efficacy Trial; CPS I/II, Cancer Prevention Study I/II; CT, computed tomography; ELCAP, Early Lung Cancer Action Project; EU, European Union; NELSON, ‘Nederlands Leuvens Longkanker Screeningsonderzoek’ = Dutch-Belgian lung cancer screening trial; NLST, National Lung Screening Trial; PLCO Screening Trial, Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial; PY, Person-years; RCT, randomised controlled trial; US, United States.
Material and methods
The method described here was used to recruit participants, define selection criteria and calculate sample size and power for the NELSON trial.
The design of the first NELSON recruitment round and of the trial is shown in Figure 1. During the first recruitment phase (second half of 2003) addresses of all men born between January 1, 1928 and January 1, 1953 were obtained from the population registries in 7 districts in the Netherlands (Groningen, Drenthe, Utrecht, Eemland, Midden-Nederland, Kennemerland and Amstelland-de Meerlanden). In addition, addresses of all men and women of the same age were obtained from the population registries of 14 municipalities around Leuven in Belgium. All of these persons received a first questionnaire about general health, alcohol consumption, physical exercise, cancer history, family history of lung cancer, body weight and height, education and their opinion on screening programs in general. The questionnaire contained 11 questions on smoking from the Minimum Common Dataset (May 2002) of the EU-US Collaborative Spiral CT working group, adapted from the National Cancer Institute's Cancer Data Standards Registry, the recommended smoking measures of the Behavior Change Consortium of the US National Institutes of Health and from Pistelli et al.18, 19, 20 The most important questions were (i) ‘When you last smoked every day, on average how many cigarettes (shag) do/did you smoke a day?’ (<5, 5–10, 11–15, 16–20, 21–25, 26–30, 31–40, 41–50, 51–60, >60), (ii) ‘What is the total number of years you have smoked/smoke cigarettes or shag every day? Do not include any time you stayed off cigarettes or shag for 6 months or longer.’ (0–5, 6–10, 11–15, 16–20, 21–25, 26–30, 31–35, 36–40, 41–45, 46–50, >50 years) and (iii) ‘If you have quit smoking, how long has it been since you quit?’ (< 1 month, 1–6 months, 7 months to 1 year, 1–3 years, 3–5 years, 6–10 years, 11–15 years, 16–20 years, >20 years, not applicable). The questionnaire was accompanied by brief information about the trial.
Selection of potential participants
Since the smoking exposure history of all respondents to the first NELSON questionnaire was available, a careful decision could be made about whom to invite for the trial. First, the estimated lung cancer mortality risk of the respondents was determined. Next, the required sample size to show a mortality benefit of screening of 20%, 25% and 30%, and the corresponding number of eligible subjects, were determined for various selection scenarios. Finally, the ‘required participation rate’ was determined, defined as the fraction of eligible subjects that must participate to reach the required sample size. In the optimal selection scenario the required participation rate was as low as possible and the required sample size was within our capacity (±16,000 participants). We aimed to avoid a narrow selection of only very high risk subjects.
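The selection logic described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the trial: the scenario names and all numbers are hypothetical, and the `tolerance` parameter (how close to the lowest participation rate a scenario must be to count as equally good) is an assumption introduced here to break near-ties in favour of the smallest sample size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    name: str
    required_sample_size: int  # participants needed to reach the target power
    eligible_subjects: int     # respondents meeting the selection criteria

    @property
    def required_participation_rate(self) -> float:
        # Fraction of eligible subjects who must take part.
        return self.required_sample_size / self.eligible_subjects


def choose_scenario(scenarios, capacity, tolerance=0.02):
    """Pick a feasible scenario with a (near-)lowest required participation rate.

    Scenarios exceeding the screening capacity, or needing more than 100%
    participation, are discarded. Among the rest, those within `tolerance`
    of the lowest rate are treated as ties and the smallest trial wins.
    """
    feasible = [
        s for s in scenarios
        if s.required_sample_size <= capacity
        and s.required_participation_rate <= 1.0
    ]
    if not feasible:
        raise ValueError("no scenario fits within the screening capacity")
    best_rate = min(s.required_participation_rate for s in feasible)
    near_best = [
        s for s in feasible
        if s.required_participation_rate - best_rate <= tolerance
    ]
    return min(near_best, key=lambda s: s.required_sample_size)


# Hypothetical scenarios, for illustration only (not taken from Table IV).
EXAMPLE_SCENARIOS = [
    Scenario("narrow", 9_000, 12_000),
    Scenario("moderate_a", 16_000, 30_000),
    Scenario("moderate_b", 15_000, 27_500),
    Scenario("broad", 24_000, 46_000),
]

chosen = choose_scenario(EXAMPLE_SCENARIOS, capacity=16_000)
print(chosen.name, round(chosen.required_participation_rate, 2))
```

In this toy input the broad scenario is dropped because it exceeds the capacity, and the two moderate scenarios are treated as tied on participation rate, so the one with the smaller sample size is returned.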
Calculation of expected lung cancer mortality
Our estimates of lung cancer mortality are based on the US Cancer Prevention Study II (CPS II), a cohort study which started in 1982 and followed 508,579 men and 676,527 women, aged 30 years or more for 6 years.21 CPS II reports lung cancer mortality rates per 100,000 person-years (PY) for groups of men with attained ages 50–79 (50–59; 60–69 and 70–79 years), smoking duration of 20 years or more (20–29; 30–39; 40–49 and ≥50 years) and 1 or more cigarettes smoked per day (1–19; 20; 21–39; 40 and ≥41 cigarettes/day).21 The 5 cigarette-groups (CPS II) were not comparable with the categories in the NELSON questionnaire. The 5 CPS II cigarette-groups were therefore recategorized into 3 groups (1–20; 21–40 and ≥41) and risks in these new groups were calculated as averages of the CPS II groups, weighted by follow-up years. In the NELSON trial only age at entry is known. We accounted for aging during the 10-year follow-up period by using the age-specific risks for age at entry + 5.5 years. Assuming an exponential increase in lung cancer mortality with a factor of 2.5 in 10 years, the mean risk is reached after 5.5 years.21, 22 Thus, in theory, NELSON subjects with age at entry of 44.5 through 74.5 have a mean lung cancer mortality rate of CPS II subjects with attained ages 50–79. In NELSON no subjects aged less than 50 were included, and so NELSON subjects with age at entry of 50 through 74.5 have a mean lung cancer mortality rate of CPS II subjects with attained ages 55.5 through 79. The CPS II mortality rates were weighted for each combination of ‘(attained age) × (cigarettes/day) × (duration)’ with the number of respondents on the first NELSON questionnaire with known birth date (n = 99,608), with age at entry of 50 through 74.5. Because the CPS II monograph included only data on current smokers, the US Cancer Prevention Study I (CPS I) was used to estimate the effect of smoking cessation. 
This prospective cohort study began follow-up of 456,491 males and 594,551 females older than 30 years on July 1, 1960, with a maximum follow-up of 12 years.23 Applying the smoking cessation data to the calculated mortality rates resulted in a lung cancer mortality rate table for smoking duration (4 levels), number of cigarettes/day (3 levels) and duration of smoking cessation (5 levels). By varying the thresholds for duration of smoking, duration of smoking cessation and number of cigarettes smoked per day, the mean expected lung cancer mortality rate (per 1,000 PY) was determined for various selection scenarios. Because the groups for smoking duration and number of cigarettes smoked per day are rather broad (e.g. 1–20 cigarettes/day), linear interpolation was used to estimate the mean mortality rate for more refined selections.
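Three of the arithmetic steps in this section (the time at which the mean risk is reached, the weighted averaging of cohort rates, and the linear interpolation within broad exposure groups) can be illustrated as follows. This is a sketch under stated assumptions: the mortality rate is taken to grow as a continuous exponential over follow-up, the interpolation is assumed to run between two reference points of a group, and all numeric inputs in the example calls are hypothetical. The function names are introduced here for illustration only.

```python
from math import log


def mean_risk_time(growth_factor=2.5, follow_up_years=10.0):
    """Time (years) at which an exponentially rising rate equals its own
    average over the follow-up period, assuming continuous growth."""
    k = log(growth_factor) / follow_up_years          # growth rate per year
    mean_multiplier = (growth_factor - 1.0) / log(growth_factor)
    return log(mean_multiplier) / k


def weighted_mean_rate(rates, weights):
    """Average of group-specific rates, weighted e.g. by follow-up years
    or by the number of respondents in each group."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(r * w for r, w in zip(rates, weights)) / total


def interpolate_rate(x, x0, rate0, x1, rate1):
    """Linear interpolation of a rate between two exposure levels."""
    if x1 == x0:
        raise ValueError("x0 and x1 must differ")
    return rate0 + (x - x0) * (rate1 - rate0) / (x1 - x0)


# Mean-risk time for a 2.5-fold increase over 10 years.
print(round(mean_risk_time(), 2))

# Hypothetical rates per 1,000 PY in two groups, weighted 3:1.
print(weighted_mean_rate([1.0, 2.0], [3, 1]))

# Hypothetical interpolation at 15 cigarettes/day between 10 and 30.
print(interpolate_rate(15, 10, 1.0, 30, 3.0))
```

Under the continuous-growth assumption the mean-risk time comes out at roughly 5.4 years, close to the 5.5 years used above; the exact value depends on how follow-up time is discretised.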
Required sample size
Using the same formulas as in the American PLCO (prostate, lung, colorectal and ovarian) screening trial and the European Randomised Screening Trial on Prostate Cancer, required sample sizes to be able to demonstrate a lung cancer mortality reduction of 20%, 25% or 30% were calculated for the various selection scenarios (Appendix).24, 25 A 1:1 randomisation, a power of 80%, a one-sided significance level (α) of 0.05, 95% compliance in the screen group, a 5% contamination rate in the control group and 10 years of follow-up after randomisation were assumed.25
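A calculation of this kind can be sketched in Python as below. The sketch is not a transcription of the PLCO or ERSPC formulas: it assumes a one-sided test on the share of lung cancer deaths occurring in the screen arm (a conditional binomial comparison), takes f as the ratio of screen-arm to control-arm size, and treats the control rate as the rate that would apply without any screening. The function names and the rate of 2.5 deaths per 1,000 PY in the example are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def relative_rates(r, ps=0.95, pc=0.95):
    """Relative mortality in control (theta1) and screen (theta2) arms.

    r  : relative mortality under perfect screening (1 - r is the reduction)
    ps : compliance in the screen group
    pc : compliance in the control group (1 - contamination)
    """
    theta1 = r + (1.0 - r) * pc
    theta2 = 1.0 - (1.0 - r) * ps
    return theta1, theta2


def required_deaths(r, ps=0.95, pc=0.95, f=1.0, alpha=0.05, power=0.80):
    """Total lung cancer deaths needed (assumed conditional binomial form)."""
    q_alpha = NormalDist().inv_cdf(1.0 - alpha)  # one-sided critical value
    q_beta = NormalDist().inv_cdf(power)
    theta1, theta2 = relative_rates(r, ps, pc)
    numerator = (q_alpha * (theta1 + f * theta2)
                 + q_beta * (1.0 + f) * sqrt(theta1 * theta2)) ** 2
    return numerator / (f * (theta1 - theta2) ** 2)


def total_sample_size(r, rate_per_py, years=10.0, ps=0.95, pc=0.95,
                      f=1.0, alpha=0.05, power=0.80):
    """Participants needed in both arms together, under the assumptions above."""
    theta1, theta2 = relative_rates(r, ps, pc)
    deaths = required_deaths(r, ps, pc, f, alpha, power)
    n_control = deaths / (years * rate_per_py * (theta1 + f * theta2))
    return (1.0 + f) * n_control


# Hypothetical control rate of 2.5 lung cancer deaths per 1,000 PY.
rate = 2.5 / 1000.0
n_25 = total_sample_size(r=0.75, rate_per_py=rate)  # 25% reduction
n_20 = total_sample_size(r=0.80, rate_per_py=rate)  # 20% reduction
print(round(n_25), round(n_20), round(n_20 / n_25, 2))
```

Whatever baseline rate is entered, the ratio between the sample sizes for a 20% and a 25% reduction is about 1.61 under these assumptions, which agrees with the ratio of the enrolment figures reported for the chosen scenario (27,900 vs. 17,300).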
Before eligible subjects were invited, persons with a moderate or bad self-reported health who were unable to climb 2 flights of stairs and persons with a body weight ≥ 140 kg were excluded, because participants need to have enough cardiopulmonary reserve to undergo surgery. To avoid diagnostic problems (primary vs. metastatic disease), persons with current or past renal cancer, melanoma or breast cancer were excluded, because these tumours can give rise to lung metastases even after long follow-up. Persons with lung cancer diagnosed less than 5 years ago were excluded because the disease may relapse. Persons with lung cancer diagnosed 5 years or more ago but still under treatment were also excluded, as were persons who had a chest CT examination less than 1 year before they filled in the first NELSON questionnaire.
Ethical and legal approval
The NELSON trial was approved by the Ethics Committees of all participating centers. Furthermore the Health Council of the Netherlands advised the Minister of Health to give permission to start the trial after a positive test of the ‘comprehensibility’ of the trial information. On December 23, 2003, the Minister of Health of the Netherlands approved randomisation of persons to the NELSON trial.
During the first recruitment phase 106,931 of the 335,441 subjects (32%) who received the first NELSON questionnaire responded. Mean age of the respondents was 61.0 years (standard deviation: 6.8 years). Response rates were lower in Belgium than in the Netherlands, lower in females than in males, and overall equally distributed over the age categories (Table I). Table II shows the number of respondents for each level of smoking duration, number of cigarettes smoked per day and duration of smoking cessation. Nearly one third of the 106,931 respondents (33,909; 32%) had never smoked, 26,733 (25%) had smoked for less than 20 years and 24,783 (23%) had quit smoking more than 20 years ago. From Table III, which shows the expected lung cancer mortality per 1,000 PY, it appears that these groups have a low lung cancer mortality risk.
Table I. Characteristics of 335,441 Persons who Received the First NELSON Questionnaire (First Recruitment) and Characteristics of the 106,931 Respondents
Table III. Estimated Mortality Rates (Per 1,000 PY) from Lung Cancer in Current and Former Cigarette Smokers by Baseline Amount and Duration of Smoking and Duration of Smoking Cessation, Men, Calculated from CPS II and I, Weighted by Age Categories of NELSON Respondents1
[Table body not available; stratified by smoking duration (years) and duration of cessation, starting at 0–1 year (current smokers).]
No data were available for smoking duration <20 years.
Table IV shows the mean estimated lung cancer mortality rate (per 1,000 PY) for various selections and the corresponding required sample size to demonstrate a lung cancer mortality reduction of 20%, 25% or 30% 10 years after randomisation. When only individuals with a very ‘high risk’ are included (options A and C, Table IV), the required sample size is small but the number of available subjects is relatively small too and a high fraction of eligible subjects has to give consent to participation to reach sufficient power. When individuals with a more ‘moderate risk’ are included (options B, D and E, Table IV) the required sample size increases, but the number of eligible subjects increases relatively more. Consequently, the required participation rate is lower in these moderate risk selection scenarios compared to the high risk scenarios mentioned before. When individuals with a ‘low risk’ are included as well (options G and F), the required sample size increases further while the gain in required participation rate is limited.
Table IV. Estimated Expected Lung Cancer Mortality Rates (Per 1,000 PY) without Screening and Sample Sizes Needed for Various Selections of the 106,931 Respondents on the First NELSON Questionnaire1
One-sided α = 0.05; 1:1 randomisation; power = 80%; 95% compliance screen group; 5% contamination; 10 years of follow up after randomisation.
Number of eligible subjects from the respondents on the first NELSON questionnaire. (Numbers also include respondents who later appeared to be ineligible for participation for reasons other than smoking history (exclusion criteria) (11%)).
Required response among eligible subjects to show a lung cancer mortality difference between screen and control of 30% 10 years after randomisation.
Scenarios D, F and G had the lowest possible required participation rate (52%, 53% and 51%, respectively). Because the required sample size was lowest in scenario D, we decided to invite current and former smokers with 10 years or less of cessation, who smoked more than 15 cigarettes a day for over 25 years or more than 10 cigarettes a day for over 30 years (option D). With a power of 80%, enrolment of 17,300 subjects in NELSON is required to demonstrate a lung cancer mortality reduction of 25% or more and enrolment of 27,900 subjects in NELSON is required to demonstrate a lung cancer mortality reduction of 20% or more 10 years after randomisation.
In this study we demonstrated that selecting participants from the general population for lung cancer screening trials based on risk estimates is feasible, and helpful in minimizing sample size and costs.
In total, 15,428 participants had been randomised in the NELSON trial by October 18, 2005. Recruitment is ongoing and aims to reach 16,000 participants. In the first recruitment round of the NELSON trial, 11,103 persons gave informed consent (1.6% females). In 2005 a second recruitment round was started in which 250,606 questionnaires were sent. By October 18, 2005, 44,509 persons had responded to this questionnaire and 10,271 met our previously defined selection criteria and were invited to participate in our trial. About 4,325 of the 4,590 respondents to that invitation (52% females) have been randomised.
The major advantage of our population-based recruitment compared to volunteer-based recruitment (through media) is that differences in e.g., disease prevalence, general health and all-cause mortality between the study population and the average target population are probably limited.26 Even though there might be some ‘self-selection’ in the NELSON trial, population-based recruitment creates the possibility to determine the extent of a possible self-selection.6 We are therefore comparing the characteristics of our study population (results from our questionnaire) with the general population. These analyses are not finished yet.
Like all trial results, the NELSON trial results will, in principle, only be applicable to a population with the same characteristics as the NELSON participants. Because about half of the NELSON participants are former smokers, we did not select a group with a rare smoking history or an extremely high risk of lung cancer death. Our population-based recruitment gives insight into the risk profiles of the general population, and we estimate that about 15–25% of the general (Dutch) population aged 50–75 would be targeted for routine screening if our eligibility criteria were applied (data and calculations not shown). Therefore we think our results will be generalizable to a sufficiently large part of our population.
Another advantage of population-based recruitment is that it is less likely that potential participants overestimated their smoking history to increase their likelihood to be invited for the trial, because they were unaware of the selection criteria. However, the future will reveal if the advantages of population-based recruitment really outweigh the high costs.
In the Netherlands at present 23% of women smoke compared to 32% of men.27 In the past this difference was even greater, when even fewer women and more men smoked. Therefore, fewer women in the Dutch population have accrued a long-term exposure to cigarettes compared to men. Because the fraction of high-risk subjects among women is therefore low in the Netherlands, we estimated before the start of the trial that including an equal number of females as males would require an enormous recruitment effort. Therefore the original proposal, agreed by the Health Council and the Ministry of Health was to first invite males for the trial and then females. In that way we would still be able to demonstrate possible differences in lung cancer detection between males and females and at the same time limit the efforts to recruit our population. Of course, NELSON trial results will be generalizable to a lesser extent to females, because females comprise ±15% of the study population.
In the current study it was demonstrated that nearly 28,000 participants in NELSON are needed to demonstrate a lung cancer mortality benefit of 20% or more. However, the NLST concluded that 50,000 participants are needed to show a mortality benefit of this size. The NLST completed accrual of nearly 53,500 current and former smokers (quit < 15 years ago), aged 55–74 years with a smoking history of at least 30 packyears and compares CT screening with chest X-ray screening. The larger sample size compared to NELSON might be attributed to differences in selection criteria, a higher power (90% in NLST vs. 80% in NELSON) and a shorter follow up period (4.5 years in NLST vs. 10 years in NELSON).7, 16
In Denmark, ±4,000 male and female current and former smokers (quit < 10 years), aged 50–70, with at least 20 packyears of smoking are being recruited through the public media. Pooling of NELSON mortality data with those of the Danish trial is planned. NELSON will then be the only trial without screening in controls that is expected to have an 80% power to show a lung cancer mortality reduction of at least 25% 10 years after randomisation. However, pooling NELSON data with Danish trial data may influence power because it can result in pooling data from participants with a different risk profile. For example, the Danish trial does not include participants of 70–75 years of age, who may have an increased risk of death from lung cancer but also from other competing causes of death, compared to the participants of 50–70 years of age. Furthermore, smoking-related selection criteria in the Danish trial (≥20 packyears) are different from NELSON, although the great majority of the NELSON trial participants have a smoking history of at least 20 packyears as well. In NELSON, women comprise ±15% of the total study population. When pooling with the Danish trial data (±45% females), the fraction of females will increase. Bain et al. showed no significant difference in lung cancer mortality between males and females in a reanalysis of prospective cohort studies.28 However, other gender-related issues might be influential. For example, females have a higher risk of peripheral adenocarcinoma and a lower risk of large cell carcinoma than men28 and detection rates for these subtypes might differ. Furthermore, females have a better prognosis for non-small cell carcinoma independent of tumour stage.29
In theory, it may be possible that a difference in lung cancer mortality between the study arms is caused by a difference in treatment. However, treatment is performed according to the national guidelines for the work-up and treatment of lung cancer.30 Thus, it is unlikely that a possible difference in disease-specific mortality between the 2 arms will be caused by differences in treatment between the 2 arms.
Several other uncertainties may change power. First, all-cause mortality will be determined to a large extent by smoking-related causes of death other than lung cancer. As a result of medical progress, the risk of death due to many of these diseases (mainly cardiovascular) has decreased. Consequently, the risk of lung cancer death will probably be higher in current lung cancer screening trial participants than it was in CPS II participants and this may increase power.31 Differences between trials in radiological expertise and protocols can also result in differences in power. Use of 16-detector CT with 0.75-mm slices and double reading of all images in NELSON might enhance power. Power might change when the participation rate is not similar in all risk groups. Since risk estimates and power calculations are always an approximation of reality, in our opinion it is of great importance that more (European) centres initiate randomised CT screening trials so that pooling of data can increase power.
To be able to estimate lung cancer mortality in the present trial, data on absolute lung cancer mortality for various smoking exposure levels from a prospective cohort study were needed. To our knowledge only 4 large-scale cohort studies with such detailed data have been published: the Kaiser-Permanente cohort, the US Veterans cohort and CPS I and II.21, 23, 32, 33 In the first study, participants received a health check-up as a reward for their participation, which can be regarded as a form of screening.32 The second cohort is relatively dated (started in 1954) and we had access only to the data of the former smokers. In addition, the study population (US army veterans) is not representative of the general population.33 Because CPS II started more recently than CPS I, calculations were based on CPS II.
Detailed information on absolute lung cancer mortality was unavailable for the Netherlands and Belgium. We compared death rates of CPS II with those reported in the UK British Doctors cohort and found comparable lung cancer mortality rates among never smokers and among current smokers. Furthermore, the mortality rates stratified by the number of cigarettes smoked seemed to be comparable.21, 34 Although the cohorts are not similar enough to exclude a risk difference between the US and Europe, a significant difference seems unlikely.
In descriptive models of observational data, expected lung cancer mortality can be estimated for each individual. Examples are the model based on the randomised Carotene and Retinol Efficacy Trial (CARET) by Bach et al.,31, 35 the CPS II based descriptive model by Flanders et al.36 and the model by Hazelton et al.37 These models are useful for calculating individual lung cancer mortality and selecting study populations for CT screening trials and might be preferable to our method. However, we decided to stay as close as possible to the original mortality data.
In conclusion, we demonstrated that careful selection of participants for lung cancer screening trials by using risk estimates for subgroups of smoking exposure is feasible and helpful to minimize sample size. When pooling with Danish trial data NELSON is the only trial without screening in controls that is expected to have 80% power to show a lung cancer mortality reduction of at least 25% 10 years after randomisation.
The authors thank S.J. Otto, PhD and J. Fracheboud, MD for their contribution in data collection, Mr. A.C. de Jongh, Artex BV, Capelle ad IJssel, The Netherlands for his contributions for the handling of the mailings and data and R. Faber, MSc for his help in converting the databases. They also thank all participating screening centres and the following municipal health services for providing the addresses from the population registries: J. Toet, MSc and E.J.C. van Ameijden, PhD (GG & GD Utrecht), J.M. ten Brinke, MSc (GGD Amstelland – de Meerlanden), Ms. A.E.M. Grotenhuis and W. Nijbroek, MSc (GGD Kennemerland), E. Tromp, PhD (GGD Midden Nederland), N. de Vos, MSc (GGD Eemland), J. Broer, MD, PhD (GGD Groningen), C.A. Bos, MA, MSc (GGD Drenthe), the Local Health Cooperation (LOGO) Leuven and Hageland, Belgium. The authors certify that they have not entered into any agreement that could interfere with their access to the data on the research, nor upon their ability to analyse the data independently, to prepare manuscripts, and to publish them.
APPENDIX: Sample size calculations24, 25
The power (1 − β) is the probability of detecting a statistically significant difference in lung cancer mortality if screening does have a reducing effect on lung cancer mortality. The total number of lung cancer deaths needed to detect a lung cancer mortality difference is:
where θ1 = r + (1 − r)Pc and θ2 = 1 − (1 − r)Ps; D is the total number of lung cancer deaths needed for a one-sided α-level significance test with power 1 − β; Pc is the compliance in the control group (to what extent the control group does not receive screening); Ps is the compliance in the screen group (to what extent the screen group actually receives screening); (1 − r) is the assumed reduction in the cumulative lung cancer mortality during the trial, with 100% compliance (Ps = Pc = 1); and f is the fraction of controls needed in the screen group.
The required number of participants in the control group (Nc) is:
where Y is the duration of the trial in years from entry in the study until the end of the follow-up; Rc is the average annual disease specific rate in the control group expressed in deaths per person per year; Nc is the number of people assigned to the control group; and f is the fraction of controls needed in screen group.
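For reference, one form of the two expressions that is consistent with the symbol definitions in this appendix can be written as below. It is a reconstruction under assumptions, not a quotation of the cited sources: it assumes a one-sided test on the share of lung cancer deaths occurring in the screen arm, that Q_α and Q_β (symbols introduced here) are the standard normal quantiles for 1 − α and 1 − β, that f is the ratio of screen-arm to control-arm size, and that Rc is the rate that would apply without any screening. It should be checked against the PLCO and ERSPC design papers24, 25 before use.

```latex
D = \frac{\left[\, Q_{\alpha}\,(\theta_1 + f\,\theta_2)
        + Q_{\beta}\,(1 + f)\,\sqrt{\theta_1\,\theta_2} \,\right]^{2}}
         {f\,(\theta_1 - \theta_2)^{2}}
\qquad\text{and}\qquad
N_c = \frac{D}{Y\,R_c\,(\theta_1 + f\,\theta_2)}
```

With 1:1 randomisation f = 1, so the total number of participants is (1 + f)Nc = 2Nc.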