Results of the three rounds of the Finnish Prostate Cancer Screening Trial—The incidence of advanced cancer is decreased by screening
Article first published online: 5 APR 2010
Copyright © 2010 UICC
International Journal of Cancer
Volume 127, Issue 7, pages 1699–1705, 1 October 2010
How to Cite
Kilpeläinen, T. P., Auvinen, A., Määttänen, L., Kujala, P., Ruutu, M., Stenman, U.-H. and Tammela, T. L.J. (2010), Results of the three rounds of the Finnish Prostate Cancer Screening Trial—The incidence of advanced cancer is decreased by screening. Int. J. Cancer, 127: 1699–1705. doi: 10.1002/ijc.25368
- Issue published online: 4 AUG 2010
- Manuscript Accepted: 11 MAR 2010
- Manuscript Received: 24 NOV 2009
- The Academy of Finland. Grant Number: 123054
- The Competitive Research Funding (Pirkanmaa Hospital District)
- Finnish Cancer Organizations
- mass screening;
- prostatic neoplasms;
- prostate-specific antigen;
- randomized controlled trials
Screening for prostate cancer (PC) remains a controversial issue despite some new evidence on the mortality benefits of PC screening. We conducted a prospective, randomized screening trial in Finland to investigate whether screening decreases PC incidence. Here, we report the incidence results from three screening rounds during a 12-year period. Of the 80,144 men enrolled, 31,866 men were randomized to the screening arm (SA) and invited for screening with a prostate-specific antigen test (cut-off 4.0 ng/ml) every 4 years, while the remaining men formed the control arm (CA) that received no interventions. The mean follow-up time for PC incidence in both arms was over 9 years. The incidence rate of PC (including screen-detected and interval cancers as well as cases among nonparticipants) was 9.1 per 1,000 person-years in the SA and 6.2 in the CA, yielding an incidence rate ratio (IRR) of 1.5 (95% confidence interval 1.4–1.5). The incidence of advanced PC was 1.1 in the SA and 1.5 in the CA, IRR = 0.7 (0.6–0.8), and the difference emerged after 5–6 years of follow-up. The incidence of localized PC was 7.5 in the SA and 4.6 in the CA, IRR = 1.6 (1.5–1.7). The results from our large population-based trial indicate that screening for PC decreases the incidence of advanced PC. Compared with the CA, the PCs detected in the SA were substantially more often localized, low-grade PCs due to overdiagnosis.
Prostate cancer (PC) is the most common cancer in most industrialized countries.1 Screening for PC with prostate-specific antigen (PSA) has become a controversial public health issue, and randomized controlled trials are the only reliable way to demonstrate the effectiveness of PC screening. Recently, preliminary mortality results have been published from the European Randomized Study of Screening for Prostate Cancer (ERSPC)2 and the Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial (PLCO).3 The ERSPC trial showed a 20% decrease in prostate cancer mortality, whereas in the PLCO trial, no reduction was observed in the screening arm. In both trials, the incidence of PC was substantially higher in the screening arm compared with the control arm, at least partially due to overdiagnosis and lead-time bias. It remains to be established whether the benefits of screening for PC with PSA outweigh the harms.4
The first analysis of the ERSPC trial indicates a relative reduction in mortality, but there were differences between centers in, e.g., the mode of recruitment, screening interval and biopsy threshold. The early results did not show significant heterogeneity between centers, but the magnitude of effect is likely to differ due to the differences in design and protocol within the study.2
We report the results and cancer incidence in screening and control arms in the large population-based Finnish trial during three screening rounds.
Material and Methods
The Finnish Prostate Cancer Screening Trial is a part of the ERSPC study, which is a randomized multicenter trial. The Finnish trial comprises 80,144 men born in 1929–1944 (aged 55, 59, 63 or 67 years at entry). The subjects were identified from the Finnish Population Registry. After exclusion of men with previous PC diagnosis, a random sample of 8,000 men was allocated to the screening arm (SA) annually in 1996–1999 and the remaining men formed the control arm (CA) that received no intervention.
The men in the SA were invited to a local cancer society clinic for the screening test, i.e., blood sample for determining serum PSA concentration. Men with PSA ≥ 4 ng/ml were referred to a local urological clinic for diagnostic examinations including digital rectal examination (DRE), transrectal ultrasound and biopsy. Initially, a sextant biopsy was used, but 10–12 biopsy cores were adopted in 2002. Men with PSA level of 3.0–3.9 ng/ml were referred to an additional test, which in 1996–1998 was DRE and since 1999 a free/total PSA (F/T PSA) ratio with a cut-off point of 16%. Men with a suspicious DRE or F/T PSA ratio < 16% were referred for diagnostic examinations similar to those with PSA ≥ 4 ng/ml.
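The referral rules described above can be summarized as a small decision function. The following Python sketch is for illustration only: the function name `refer_for_diagnostics` and its parameters are hypothetical and are not part of the trial protocol or database. Treating PSA values between 3.9 and 4.0 ng/ml as belonging to the lower band is an assumption, since the text gives the band as 3.0–3.9 ng/ml.

```python
from typing import Optional


def refer_for_diagnostics(
    psa: float,
    year: int,
    dre_suspicious: Optional[bool] = None,
    ft_psa_ratio: Optional[float] = None,
) -> bool:
    """Return True if a screened man would be referred for diagnostic examinations.

    Hypothetical helper encoding the thresholds described in the text.
    psa is in ng/ml; ft_psa_ratio is a fraction (0.16 means 16%).
    """
    if psa >= 4.0:
        # Screen-positive: referred directly to the urological clinic.
        return True
    if 3.0 <= psa < 4.0:
        if year <= 1998:
            # 1996-1998: the additional test was DRE.
            return bool(dre_suspicious)
        # Since 1999: the additional test was the F/T PSA ratio, cut-off 16%.
        return ft_psa_ratio is not None and ft_psa_ratio < 0.16
    # PSA below 3.0 ng/ml: screen-negative.
    return False


print(refer_for_diagnostics(4.2, 1997))
print(refer_for_diagnostics(3.5, 2000, ft_psa_ratio=0.20))
```

The first call represents a man with PSA above the cut-off, the second a man in the 3.0–3.9 ng/ml band whose F/T PSA ratio is above 16% and who is therefore not referred.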
All the laboratory analyses were carried out at the Department of Clinical Chemistry, Helsinki University Hospital. The serum concentrations of total PSA were analyzed by both Hybritech Tandem-E (Beckman Coulter, Brea, CA) and Wallac Delfia (Wallac, Turku, Finland) assays. The free/total PSA ratio was determined with the Wallac ProStatus free/total PSA assay (Wallac).
The men in the SA were reinvited to the second and third screening rounds in a similar manner 4 and 8 years after the first screen (though men older than 71 years of age were no longer invited and thus men aged 67 years at the initial screen were invited only twice). Information on vital status and place of residence was obtained from the Population Register Centre. Men who had been diagnosed with prostate cancer or had emigrated from the study area were not reinvited. These men and those who had died were included in the analyses according to the intention-to-screen protocol. Because of organizational difficulties, there were 1,671 men in the SA who did not receive an invitation. These men are included in the analyses as “nonparticipants” and are analyzed with the SA.
Diagnosis of prostate cancer was based on histopathologic examination, as was determination of the Gleason score. According to the trial protocol, a rebiopsy was indicated if the primary histopathologic diagnosis was prostatic intraepithelial neoplasia, atypical small acinar proliferation or unconfirmed suspicion of prostate carcinoma, or if the PSA concentration was ≥ 10 ng/ml. The decision to rebiopsy a patient after a negative biopsy was made by the attending physician, who did not always comply with the protocol of the screening trial. Therefore, some rebiopsies were performed with less strict criteria and some were postponed beyond the protocol-defined time frames.
A screen-detected PC was defined as a cancer diagnosed within 1 year from a positive screening test (due to the fact that not all biopsies or their indications were recorded in the trial database). The PCs that were diagnosed between 1 and 4 years from a positive screening test were categorized separately as early recall PCs (some of these were men who chose to be biopsied at a private clinic or whose PC diagnosis was made in a rebiopsy). An interval cancer was defined as a PC diagnosed within the screening interval in a man with a screen-negative result at the previous screen. A PC with TNM stage of T1-2, N0 and M0 was categorized as localized, whereas a PC with stage T3-4 or N1 or M1 was categorized as advanced. An aggressive PC was defined as a PC with one or more of the following characteristics: Gleason score 8–10, T3-4, N1 or M1. The follow-up ended at PC diagnosis, death, emigration or the common closing date (December 31st 2007). Information on cancers detected outside the screening protocol (interval cancers, and those in nonparticipants and the control arm) was obtained from the nationwide, population-based Finnish Cancer Registry, which has 99% coverage of all solid cancers diagnosed in Finland.5
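The stage and aggressiveness definitions above can be written as simple predicates. This is a minimal Python sketch under the assumption that T, N and M categories are coded as plain integers (for example T3 as 3); the function names are hypothetical and cases with missing staging information are not handled.

```python
def is_localized(t_stage: int, n_stage: int, m_stage: int) -> bool:
    """Localized PC: T1-2, N0 and M0."""
    return t_stage in (1, 2) and n_stage == 0 and m_stage == 0


def is_advanced(t_stage: int, n_stage: int, m_stage: int) -> bool:
    """Advanced PC: T3-4 or N1 or M1."""
    return t_stage >= 3 or n_stage >= 1 or m_stage >= 1


def is_aggressive(gleason: int, t_stage: int, n_stage: int, m_stage: int) -> bool:
    """Aggressive PC: Gleason score 8-10, or any criterion for advanced PC."""
    return gleason >= 8 or is_advanced(t_stage, n_stage, m_stage)


# A T2 N0 M0 tumor with Gleason score 9 is localized but still aggressive.
print(is_localized(2, 0, 0), is_advanced(2, 0, 0), is_aggressive(9, 2, 0, 0))
```

Under these definitions every advanced PC is also aggressive, while a localized PC is aggressive only if its Gleason score is 8–10.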
The study protocol was approved by Helsinki and Tampere University Hospital Ethical committees. Permission to use cancer registry data was obtained from Research and Development Centre for Welfare and Health (STAKES, currently part of the National Institute of Health and Welfare).
Cumulative incidence was calculated separately for the SA and the CA by dividing the number of PC cases (including all PC cases in the SA) by the number of men in the respective arm in each screening interval. Cumulative hazard of PC was estimated using the Nelson-Aalen method.6, 7 Cox regression was used to calculate incidence rate ratios (IRR) and their statistical significance. All statistical analyses were performed using Stata 8.2 (StataCorp, College Station, TX).
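For readers unfamiliar with the Nelson-Aalen method, the estimator sums, over the distinct event times, the number of events divided by the number of subjects still at risk. The following Python sketch illustrates the idea on made-up toy data. It is not the Stata code used in the trial, and the function name `nelson_aalen` and the input values are hypothetical.

```python
from collections import Counter


def nelson_aalen(times, events):
    """Return [(t, H(t))] at each distinct event time.

    times:  follow-up time for each subject
    events: 1 if PC was diagnosed at that time, 0 if follow-up was censored
    """
    event_counts = Counter(t for t, e in zip(times, events) if e)
    exit_counts = Counter(times)  # everyone leaves the risk set at their own time
    at_risk = len(times)
    cum_hazard = 0.0
    estimate = []
    for t in sorted(exit_counts):
        d = event_counts.get(t, 0)
        if d:
            # Increment: events at t divided by subjects at risk just before t.
            cum_hazard += d / at_risk
            estimate.append((t, cum_hazard))
        at_risk -= exit_counts[t]
    return estimate


# Toy data (not trial data): follow-up in years, 1 = event, 0 = censored.
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 0, 1, 1, 0]
toy_estimate = nelson_aalen(toy_times, toy_events)
for t, h in toy_estimate:
    print(t, round(h, 2))
```

Subjects censored at the same time as an event are counted as at risk for that event, which is the usual convention.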
Results
In the screening arm, there were altogether 292,474 person-years (pyrs) and in the control arm 449,885 pyrs. The mean follow-up time in the SA was 9.2 years and in the CA 9.3 years (standard deviation in both groups 2.7 years). Because of the randomized design, the age distribution in both arms was similar (median 58.7 years at entry in both arms). Age proportions at entry in the SA and CA, respectively, were 55 years: 32.9% vs. 33.0%; 59 years: 26.2% vs. 26.3%; 63 years: 21.6% vs. 21.5%; 67 years: 19.2% vs. 19.2%.
In the SA, we invited 30,195 men to the first round (participation proportion 68.8%), 26,240 men (70.9%) to the second and 18,338 men (69.5%) to the third round. A total of 23,771 men (78.7%) participated in at least one screening round, and 10,327 men (52.1% of those invited to all rounds) participated in all three rounds.
A total of 2,655 PCs were detected in the SA and 2,796 PCs in the CA during follow-up (Fig. 1). The total cumulative incidence was 8.3% in the SA and 5.8% in the CA (p < 0.001). The incidence rate of PC was 9.1/1,000 pyrs in the SA and 6.2/1,000 pyrs in the CA (incidence rate ratio (IRR) 1.46, 95% confidence interval (CI) 1.4–1.5, p < 0.001). Nelson-Aalen cumulative hazard estimates of PC risk were larger for the SA and the difference widened with follow-up (Fig. 2).
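The reported rates and the rate ratio can be checked directly from the case counts and person-years given above. The Python sketch below does this with a crude log-scale Wald interval, which is an assumption made for illustration. The trial itself used Cox regression, so the interval here only approximates the published one. The function names are hypothetical.

```python
import math


def incidence_rate_per_1000(cases: int, person_years: float) -> float:
    """Incidence rate expressed per 1,000 person-years."""
    return 1000.0 * cases / person_years


def crude_irr_with_ci(cases_a, pyrs_a, cases_b, pyrs_b, z=1.96):
    """Crude incidence rate ratio (arm A vs. arm B) with a log-scale Wald CI."""
    irr = (cases_a / pyrs_a) / (cases_b / pyrs_b)
    se_log = math.sqrt(1.0 / cases_a + 1.0 / cases_b)
    lower = math.exp(math.log(irr) - z * se_log)
    upper = math.exp(math.log(irr) + z * se_log)
    return irr, lower, upper


# Counts and person-years as reported for the SA and the CA.
rate_sa = incidence_rate_per_1000(2655, 292474)
rate_ca = incidence_rate_per_1000(2796, 449885)
irr, lower, upper = crude_irr_with_ci(2655, 292474, 2796, 449885)

print(round(rate_sa, 1), round(rate_ca, 1))
print(round(irr, 2), round(lower, 2), round(upper, 2))
```

The crude figures agree with the reported rates of 9.1 and 6.2 per 1,000 pyrs and with the IRR of 1.46.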
The proportion of screen-negative men decreased and the proportion of men with screen-detected PC increased with each screening interval (Table 1). The cumulative incidence increased from the first to the second screening interval but decreased in the third screening interval (Table 2) in most age groups and both arms.
The incidence of localized PC was 7.5 in the SA and 4.6/1,000 pyrs in the CA (IRR 1.63, 95% CI 1.5–1.7, p < 0.001). The incidence of advanced PC was 1.1 in the SA and 1.5/1,000 pyrs in the CA (IRR 0.69, 95% CI 0.6–0.8, p < 0.001) (Fig. 3). When stratified by age at randomization, the difference was largest in the oldest age group. Among men aged 55 years at entry, IRR was 0.84 (CI 0.6–1.2, p = 0.33), in the 59-year-olds, IRR was 0.79 (CI 0.6–1.0, p = 0.09), in the 63-year-olds, IRR was 0.83 (CI 0.6–1.1, p = 0.14) and in the 67-year-olds, IRR was 0.57 (CI 0.5–0.7, p < 0.001). The total cumulative incidence of localized PC was 6.9% in the SA and 4.3% in the CA (p < 0.001), and the cumulative incidence of advanced PC was 1.0% in the SA and 1.4% in the CA (p < 0.001). In the CA, the proportion of localized PC was lower in all intervals compared with the SA (Table 3). The absolute effect of the reduced incidence of advanced PC can be expressed as the number needed to screen, which was 250 (1/(1.4% − 1.0%), 95% CI 181–411).
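The number needed to screen quoted above is the reciprocal of the absolute difference in cumulative incidence of advanced PC between the arms. A minimal Python sketch of that arithmetic follows, using the rounded percentages from the text. The function name is hypothetical, and the confidence interval is not reproduced because it requires the underlying counts.

```python
def number_needed_to_screen(cum_inc_control: float, cum_inc_screened: float) -> float:
    """Reciprocal of the absolute risk difference (control minus screened)."""
    risk_difference = cum_inc_control - cum_inc_screened
    if risk_difference <= 0:
        raise ValueError("No risk reduction in the screened arm")
    return 1.0 / risk_difference


# Cumulative incidence of advanced PC: 1.4% in the CA, 1.0% in the SA.
nns = number_needed_to_screen(0.014, 0.010)
print(round(nns))
```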
The incidence of low-grade PC (Gleason score 2–6) was 5.8 in the SA and 3.2/1,000 pyrs in the CA (IRR 1.82, 95% CI 1.7–1.9, p < 0.001; cumulative incidence 5.3% vs. 3.0%, p < 0.001). For Gleason score 7 cancers, the corresponding figures were 1.9 vs. 1.8/1,000 pyrs (IRR 1.02, 95% CI 0.9–1.1, p = 0.72; cumulative incidence in both groups 1.7%, p = 0.72) and for Gleason score 8–10 the incidence was 0.8/1,000 pyrs in the SA and 0.9 in the CA (IRR 0.89, 95% CI 0.8–1.0, p = 0.16; cumulative incidence 0.8% vs. 0.7%, p = 0.15). When only the Gleason 8–10 cancers were analyzed, the cumulative incidence of advanced PC was 0.31% in the SA and 0.50% in the CA (p = 0.0008).
Altogether 161 interval cancers were detected, of which 50 (78% localized) occurred after the first screen, 89 (84%) in the second interval and 22 (73%) in the third interval. The cumulative incidence of interval cancer was 0.53% and that of aggressive interval cancer 0.10%. A total of 176 (46% localized) PCs were diagnosed among either never-participants or previous round nonparticipants in the SA during the first, 220 (63%) during the second and 235 (60%) during the third screening interval (cumulative incidence 8.1%).
Discussion
The results from the Finnish Prostate Cancer Screening Trial show that the overall prostate cancer incidence rate is ∼50% higher in the screening than in the control arm during a mean follow-up time exceeding 9 years. This difference is mostly due to the high incidence of low-grade and localized cancers in the SA. Screening succeeded in decreasing the incidence of advanced PC in the SA by a quarter, which is an important intermediate indicator of PC mortality. The absolute effect in terms of reduction in advanced cancer was substantially larger than the reported mortality reduction.2 This could be due either to lower validity (not all cancers detected earlier through screening avoid PC death) or to the shorter lead-time for advanced or aggressive cancers, which would make their incidence a better indicator of long-term effect.
The Finnish trial is part of the ERSPC study, from which preliminary mortality results were recently reported.2 This study showed for the first time that screening for PC with PSA can decrease PC mortality with a best estimate of relative risk reduction of 20%. ERSPC is a multicenter study in seven European countries. There are certain differences in the screening protocol between these countries. Therefore, differences among the ERSPC centers in the magnitude of mortality reduction can be anticipated. It is not yet clear how to achieve maximal decrease in PC mortality with minimal harm to the screened population.
The main differences in screening protocol between the ERSPC centers are mode of recruitment, screening interval, invitation procedures (e.g., whether to reinvite nonparticipant men or not) and the PSA threshold leading to biopsy. The Finnish trial used a relatively high cut-off level of PSA ≥ 4.0 ng/ml and a screening interval of 4 years. Our study was population-based, i.e., based on comprehensive recruitment of all men in the source population to ensure good generalizability and obtain a realistic estimate of the screening effect achievable by mass screening. We also reinvited nonparticipants unless exclusion criteria had been met. Our trial showed an adequate participation rate for a population-based study; approximately two thirds of the invited men participated at each of the three rounds.
In both arms, the overall incidence of PC increased initially but leveled off and eventually showed some decline. This temporal pattern (period effect) parallels the secular trends in PC incidence in Finland as a whole. When men in the SA were compared with men of the same age but with 0–2 previous screens, the cumulative incidence in that interval was the same or somewhat lower in men who had been screened once before, but markedly lower if the men had been screened twice before (e.g., cumulative incidence in 63-year-old men was 3.6% in the first round, 3.5% in the second round and 2.6% in the third round). Also in the CA, when men of the same age were compared at different periods, the incidence increased during the second interval and decreased markedly subsequently. One explanation is increasing contamination in the CA, i.e., there was more opportunistic PSA testing in the control men during the second follow-up period compared with the first one. Previously, an overall 20% contamination rate has been estimated in the CA of the ERSPC trial.8 The frequency of contamination in the control group has not been analyzed in the Finnish trial.
The optimal interval for PC screening is still debated. Some recommend annual screening,9, 10 while the ERSPC study has used a 4-year screening interval with the exception of biennial screening in Sweden. No major differences in the cumulative incidence of (advanced) interval cancers were observed between the Dutch center with a 4-year interval and the Swedish center with a biennial interval,11 suggesting that the longer screening interval was not associated with substantial loss of sensitivity. This is further supported by a recent analysis showing similar test sensitivities in the Dutch (0.95) and Swedish (0.94) ERSPC centers.12
Tumor stage and grade provide an intermediate outcome measure of screening efficacy. In a successful screening program, the incidence of advanced and high-grade (hence, less curable) PCs should decrease in the screened group. Our results show a clear stage shift in the SA with a lower overall incidence of advanced PC than in the CA. The Swedish and the Dutch components of the ERSPC study have also demonstrated a favorable stage shift. In Sweden, the cumulative incidence of advanced PC after 8 years of screening with 4 screening rounds was lower in the SA than in the CA (0.48% vs. 0.63%).13 At the second screen of the Dutch trial, the PC characteristics were more favorable than at the first screen; e.g., the detection rate of stage T3-4 PCs decreased from 18.7% to 3.7%.14 However, the proportion of advanced PCs may decrease merely through overdiagnosis of latent PCs; therefore, incidence analyses provide a more valid measure of screening efficacy.
In our study, the incidence of low-grade (Gleason 2–6) cancers was roughly 2-fold in the SA compared with the CA but there was no difference between the SA and the CA in the incidence of Gleason 7 or Gleason 8–10 PCs. The Gleason scores we used in this publication were the original scores, which are subject to changes in Gleason scoring criteria over time.15 This means that the incidence of low-grade PCs has declined in recent years most likely because similar cancers are currently assigned a higher grade than before, due to a shift in classification criteria. An analysis taking this bias into account (a random sample of the original biopsies was regraded to match present-day criteria) was recently published from the Finnish trial, showing that the grade of screen-detected cancers was lower compared with that of interval cancers or control arm cancers.16 A joint publication by the ERSPC investigators also showed that in Finland a favorable change in PC grades was observed during the first two rounds.17
Recently, a study from England18 showed that PCs detected by elevated PSA are more likely to be less advanced than PCs detected by clinical signs, but no difference was observed in Gleason score 8–10 PCs. This suggests that PSA testing would have no effect on high-grade advanced cancers due to their fast-growing, aggressive nature. In our study, the cumulative incidence of advanced Gleason score 8–10 PC was significantly lower in the SA, indicating that screening could decrease the incidence even in advanced high-grade PCs.
Overdiagnosed PCs are latent cancers that are detected because of an intervention, without which the PC would not have been diagnosed during the lifetime of the subject. This is one of the most serious problems with PC screening, as overdiagnosis leads to overtreatment, which, in turn, results in adverse treatment effects, psychological stress and increased costs for the health care system.19, 20 The rate of overdiagnosis calculated by different models (for determination of lead-time and overdiagnosis) has been reported to be 23–42%.19 In the ERSPC mortality analysis, the PC incidence in the SA was 1.4-fold compared with the CA.2 Our results suggest that ∼30% of screened PCs could be overdiagnosed in the Finnish trial (calculated as the proportion of cases in the SA that are in excess of the number expected if the cumulative incidence were the same as in the CA). This is, however, a very crude approximation as overdiagnosis estimations need to be performed with models that take lead-time into account.
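The crude ∼30% figure can be reproduced from the total cumulative incidences reported in the Results (8.3% in the SA and 5.8% in the CA). The Python sketch below shows that arithmetic only. The function name is hypothetical, and, as noted above, the calculation ignores lead-time and is not a model-based overdiagnosis estimate.

```python
def crude_excess_proportion(cum_inc_screened: float, cum_inc_control: float) -> float:
    """Share of screening-arm cases in excess of the control-arm cumulative incidence."""
    return (cum_inc_screened - cum_inc_control) / cum_inc_screened


# Total cumulative incidence of PC: 8.3% in the SA, 5.8% in the CA.
excess = crude_excess_proportion(0.083, 0.058)
print(round(excess, 2))
```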
There are some limitations to our study. We do not have a reliable estimate of PSA contamination in the control arm but this should cause underestimation rather than overestimation of the observed differences between the SA and CA. For comparability between groups, we used original Gleason scores, which are subject to the change in Gleason scoring criteria over time. Finally, the participation proportion in a population-based study was not as high as can be achieved in a volunteer-based trial (which is biased in other ways). A detailed analysis of nonparticipation is needed to understand why some men choose not to participate. However, our intention-to-screen analysis should not be affected by the nonparticipants.
In conclusion, the reduction in incidence of advanced PC in the Finnish screening study was substantial—the cumulative incidence in the SA was one third lower than that in the CA and the effect was larger than that observed in PC mortality in the ERSPC. The benefits of PC screening are becoming clearer, but more information is needed on the adverse effects, costs and quality of life effects before recommendations on PC screening can be made.
- 10. American Urological Association Guidelines. Available at: http://www.auanet.org/content/guidelines-and-quality-care/clinical-guidelines.cfm, accessed on July 8th, 2009.