We found that implementation of the organized mammography screening program in Norway was associated with a statistically nonsignificant decrease in breast cancer mortality of 7% in the follow-up model and of 11% in the evaluation model.
Strengths and limitations
Our study had several strengths. First, it was based on data linkage, so that both person-years and breast cancer deaths were counted according to individual exposure time. Second, our study design was balanced in calendar time: the study group and the regional control group covered the same time period, as did the historical control group and the historical regional control group. In this way, underlying temporal changes in breast cancer mortality were taken into account.
Our study also had limitations. First, both estimates might slightly underestimate the true effect of the program. As stated above, the 7% estimate included breast cancer deaths in women diagnosed after they left the program. This means that we have included some breast cancer cases that cannot have benefited from screening. For the calculation of the 11% estimate, the follow-up period was longer than the accrual period. Owing to the lead time, the analysis will therefore include deaths from more breast cancers in the study group than in the control groups; as the lead time is not known, these numbers cannot be estimated. As all women in a given municipality were allocated the same invitation date, we might furthermore have included some breast cancers diagnosed before first invitation to screening. Second, owing to the short time interval between program start in the study and regional control groups, a large proportion of study group observations, 44% of person-years and 71% of breast cancer deaths, derived from women offered screening at age 64 or above. Third, for women offered screening below age 64 the maximal follow-up time was 6 years, which might not have allowed for the full effect of screening to materialize.6 Fourth, our results have to be interpreted in light of the widespread use of regular mammography in Norway prior to implementation of the program, and the dramatic change in hormone use during the implementation phase. Fifth, in our study design the estimated effect of the program could not be separated from a possible interaction between region and period.
Concerning the use of regular mammography, we have previously surveyed historical data on mammography activity in Norway.3 A key finding of this survey was that questionnaire data from the Norwegian Women and Cancer Study (NOWAC) from 1996 showed that at least 40% of Norwegian women aged 50–69 regularly underwent mammography prior to their first invitation to the program.3 The survey furthermore showed that this percentage increased over time, as 64% of first attendees in the organized program in 1996–2002 reported having had at least one mammography examination prior to their first participation in the program.3, 7 Finally, our survey showed that the self-reported NOWAC data were supported by mammography activity data collected by the Norwegian Radiation Protection Authority.3 High preprogram mammography activity in Norway has also been noted by other authors.8
Our previously reported data on mammography activity in Norway can be used in the interpretation of the breast cancer mortality data reported in the present study. Given that a substantial proportion of women had mammography prior to their participation in the program, the breast cancer mortality data reported here do not reflect the impact of screening versus no screening. Our evaluation reflects instead the impact of building a program on top of existing widespread regular mammography. In our survey, we discussed the possible impact of the preprogram mammography activity on the expected effect of the program on breast cancer mortality. If the Norwegian program was assumed to work in line with the randomized controlled trials on mammography screening, a 25% decrease in breast cancer mortality should be expected, given no preprogram mammography activity.9 However, with the preprogram mammography activity we found in the survey, only an effect of 11% would be measured under the assumption that mammography within and outside the organized screening worked similarly.3 The breast cancer mortality data presented here in fact corresponded very well with this estimate, which was based only on the trial results and the Norwegian mammography activity data. This might not be so surprising, as the Norwegian health care system is expected to work fairly similarly to the Swedish system, from where most randomized controlled trial data derived,9 and the Danish system, from where observational data showed similar results.2
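The dilution argument above can be illustrated with a rough sketch. It assumes a simple linear model in which breast cancer mortality falls in proportion to the share of women having regular mammography, and it treats mammography within and outside the program as equally effective. The function name, the linear model and the coverage values are illustrative assumptions; this is not necessarily the calculation used in the survey,3 and the sketch is not meant to reproduce the 11% estimate.

```python
def measured_reduction(full_effect, coverage_before, coverage_after):
    """Relative mortality reduction observed when mammography coverage rises
    from coverage_before to coverage_after.

    Assumption (hypothetical linear dilution model): mortality relative to a
    never-screened population equals 1 - full_effect * coverage.
    """
    if not 0.0 <= full_effect < 1.0:
        raise ValueError("full_effect must lie in [0, 1)")
    for coverage in (coverage_before, coverage_after):
        if not 0.0 <= coverage <= 1.0:
            raise ValueError("coverage must lie in [0, 1]")
    mortality_before = 1.0 - full_effect * coverage_before
    mortality_after = 1.0 - full_effect * coverage_after
    return 1.0 - mortality_after / mortality_before


# With no preprogram mammography, the full trial effect is recovered.
no_prior = measured_reduction(0.25, 0.0, 1.0)

# Hypothetical coverage values: the measured effect shrinks when part of the
# comparison population already had regular mammography.
diluted = measured_reduction(0.25, 0.5, 1.0)

print(f"no preprogram mammography:   {no_prior:.1%}")
print(f"with preprogram mammography: {diluted:.1%}")
```

Under these assumptions the measured reduction is always smaller than the full effect whenever preprogram coverage is above zero, which is the qualitative point made above.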
Concerning hormone use, sales data on hormones from the Norwegian Institute of Public Health from 1987, 1991, 1995, 1999 and 2002 showed the number of defined daily doses per 1,000 women aged 45–64 years per day to be higher in pilot than in control counties. Data after 2002 were less relevant, as women followed past 2002 were all above the age of 70 years. Hormone use could well be expected to have affected breast cancer mortality in Norway during the program implementation, but given the parallel development over time in the pilot and control counties, we do not expect hormone use to have distorted the present study.
Concerning the possible interaction between region and period, it should be considered that breast cancer care units were established in Norway along with the program.10 One would therefore expect breast cancer treatment, and consequently breast cancer mortality, to have improved more from the historical to the screening period in pilot than in control counties. This would represent an interaction between region and period. Patients diagnosed in the post-program period, but prior to invitation to screening, indeed had better survival than patients diagnosed in the preprogram period.11 However, as these data were not available separately for pilot and control counties, it was difficult to draw a firm conclusion on this interaction.
A previous study by Kalager et al.12 of the effect of the Norwegian program to some extent resembled our approach in using three control groups. It differed, though, by not being balanced in calendar time, by including regions with a very short follow-up time, and by estimating person-years from routine population statistics. For women aged 50–69 years at time of diagnosis, the study found the program to be associated with a 10% reduction in breast cancer mortality. In interpreting these findings, Kalager et al. emphasized the breast cancer care units introduced along with the program. As breast cancer mortality trends for women diagnosed at age 70–84 years changed in parallel with those for women of screening age, the authors indirectly suggested the mortality reduction to be explained by new treatment modalities which, when available, were used for breast cancer patients of all ages. As stated above, we agree that a possible interaction between region and period cannot be separated out from the analysis. However, the mortality in postscreening age measured by Kalager et al. was not unaffected by screening. Breast cancer deaths were categorized by age at diagnosis, and lead time will therefore affect the measured mortality for both women aged 50–69 and women aged 70–84 years. The lead time will result in more deaths being allocated to the age group of 50–69 years and fewer deaths being allocated to the age group of 70–84 years in the screening group as compared to the three control groups. The possible impact of regular mammography prior to the program was not considered by Kalager et al.
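The allocation effect of lead time can be sketched with a toy example. The ages at clinical diagnosis and the lead time of 2 years below are hypothetical values chosen only for illustration, as the true lead time is not known. The sketch counts the same breast cancer deaths by age at diagnosis, first without and then with screening advancing the time of diagnosis.

```python
from collections import Counter


def age_group(age_at_diagnosis):
    """Assign an age at diagnosis to the age groups used in the comparison."""
    if 50 <= age_at_diagnosis < 70:
        return "50-69"
    if 70 <= age_at_diagnosis < 85:
        return "70-84"
    return "other"


def deaths_by_group(clinical_ages, lead_time=0.0):
    """Count deaths by age group at diagnosis.

    Assumption: a diagnosis is advanced by lead_time years only if the
    advanced age falls within the screening ages 50-69; the number of deaths
    itself is left unchanged.
    """
    counts = Counter()
    for age in clinical_ages:
        advanced = age - lead_time
        diagnosis_age = advanced if 50 <= advanced < 70 else age
        counts[age_group(diagnosis_age)] += 1
    return counts


# Hypothetical ages at clinical diagnosis for women who later die of breast cancer.
clinical_ages = [62, 66, 69, 70, 71, 74, 80]

without_lead_time = deaths_by_group(clinical_ages)
with_lead_time = deaths_by_group(clinical_ages, lead_time=2.0)

print("without lead time:", dict(without_lead_time))
print("with lead time:   ", dict(with_lead_time))
```

In this toy example the total number of deaths is identical in both counts; only their allocation between the two age groups changes.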
In another study, Autier et al.13 analyzed age-adjusted breast cancer mortality rates from Sweden and Norway. In Sweden, screening was introduced gradually starting in 1986, whereas in Norway gradual implementation started in 1996. From 1989 to 2006, breast cancer mortality decreased by 16.0% in Sweden and by 24.1% in Norway. Based on these data combined with trends from other European countries, the authors concluded that breast cancer mortality trends were not associated with the presence of screening programs. The possible impact of regular mammography prior to the program in Norway was not considered by Autier et al.
Previous studies have shown that the start of the organized screening program left marks on the breast cancer pattern in Norway, and one may ask whether such marks are compatible with the use of regular mammography also before the start of the program. First, Zahl et al.14 showed a prevalence peak in the breast cancer incidence after start of screening in the pilot counties. Based on our data on mammography use, a prevalence peak is expected, as the proportion of women in the four pilot counties reporting prior mammography increased from 47% in 1996 to 73% in 1997–1998.3 Weedon-Fekjær et al.8 estimated the incidence increase to be 59% during the Norwegian prevalence peak, where a 78% increase would have been expected in the absence of prior screening.15 Second, Sørum et al.16 showed that the incidence of ductal carcinoma in situ (DCIS) increased with the start of the program. The most relevant comparison, taking into account that DCIS detection may also be affected by technological changes,17 is from a rate of 10 per 100,000 women-years before the program to 30 per 100,000 women-years during the early subsequent invitation rounds. This increase may, to a large extent, although not completely, be explained by the simultaneous increase in the proportion of women with previous mammography. In the 11 counties where the program was first implemented, this percentage increased from 43% in 1996 to 92% in 2002.3
The evaluation of breast cancer mortality after implementation of the organized mammography screening program in Norway is a prime example of the limitations of observational epidemiology. In the present situation, the disease outcome could be affected by mammography activity outside the program, hormone use, treatment improvement and the organized program. These factors could not be fully separated in the observational data. We therefore also have to consider other possibilities to learn about the effect of the program. First, simulation models can incorporate data on mammography activity outside the program, hormone use and treatment improvement, and on this basis estimate the effect of the program. Simulation models do, however, build on assumptions, and different models may come up with different estimates based on the same data.18 Second, short-term surrogate indicators can be measured. They reflect characteristics of the screening activity known from the randomized controlled trials to predict breast cancer mortality.19 Breast cancers detected in the Norwegian program had a favorable tumor size distribution, although the rate of advanced tumors remained constant.20 The detection rate was in line with European recommendations for initial screens and higher for subsequent screens. However, the sensitivity has been relatively low, as indicated by a high interval cancer rate compared with the background incidence rate in the second year of follow-up.21 If the Norwegian program had been implemented in an unscreened population, one might, based on the short-term indicators, roughly have expected an effect on breast cancer mortality at or slightly below the 25% reduction found in randomized controlled trials.