Keywords:

  • mammography;
  • mass screening;
  • interval cancer;
  • international collaboration;
  • comparative study

Abstract

International comparisons of interval cancers (IC) are important to better understand the relationship between programmes' performance and screening practices. In this respect, differences in (i) definition, (ii) identification and (iii) quantification of IC have received little attention. To examine these 3 comparability issues and activities involving IC, an assessment was conducted among member countries of the International Breast Cancer Screening Network, and the impact of accuracy of identification and quantification practices was estimated using 1996–98 data from the Dutch breast cancer screening programme. Information was obtained from 19 screening programmes in 18 countries, 16 of which acknowledged the coexistence of opportunistic screening. IC data were collected to evaluate performance of the screening programme (100% of programmes) and the radiologists (89%); 53% of programmes had a designated review process for IC. Most programmes (84%) agreed with the European Guidelines definition of IC, but a case situation exercise revealed substantial discrepancy in classification of cancers that occurred after a positive screen. Completeness of identification of IC appears to contribute most to international variation, and cannot be easily controlled for in methodologically rigorous comparisons. Statistically significant differences of about 4% were measured between quantification methods for IC. An operational definition of IC is proposed to enhance international comparability. Valid comparisons of IC are possible with careful attention to the definition, but observed differences in IC frequency across screening programmes should exceed 10% before they can be taken as indicative of real differences between programmes. © 2006 Wiley-Liss, Inc.

Although mammography is currently the only proven screening method for reducing mortality from breast cancer,1, 2, 3 some breast cancers will inevitably elude screening detection. Interval cancers (IC) are those tumours that are diagnosed in the time period between 2 screening rounds. They occur because of the following: rapid tumour growth leads to cancers that truly develop in the interval after screening, cancers that are present are masked by characteristics of the breast or the cancer, readers miss subtle or minimal signs on the screening mammogram, or further assessment procedures fail to diagnose the cancer radiologically detected.4, 5 From a population-based perspective, IC are a key measure of the quality of the radiology performed and of the early impact of screening programmes.6 A high incidence of IC adversely affects the ability of a screening programme to reduce mortality from breast cancer. Knowledge of IC is also necessary to determine the sensitivity of the programme and of the radiologists' performance.

The frequency of IC has been assessed in randomised studies and in established programmes that provide mammography to a defined target group (service screening), and compared against set standards of performance6 as well as across programmes.7, 8, 9, 10, 11, 12, 13, 14, 15, 16 A metaanalysis of population-based programmes showed a 6-fold variation in the level of IC in the first year after mammographic screening.17 However, several important issues affect the estimation of IC frequency and, consequently, their comparability and interpretation. Surprisingly, comparability issues have received minimal attention,15 and are only partly addressed in the current recommendations available for screening evaluation.6

Three independent aspects should be considered before comparing IC among programmes: the definition, the identification and the quantification of IC. Definitional issues involve inclusion criteria: should ductal carcinoma in situ (DCIS) or cancers diagnosed in the contralateral breast or at early recall (for programmes with a scheduled early review procedure for women following a first equivocal assessment visit) be considered as IC? Up to 28% variation in the IC rate has been reported when the most and least restrictive definitions are used.9 Incomplete follow-up in the screened population and imprecise linkage between data sources (e.g. screening programme and cancer registry) are the main limiting factors in the identification of IC. Identification issues generally lead to an under-estimation of IC occurrence. Even when a common definition of IC is applied, the rate of IC differs according to the denominator used for its quantification (e.g. women screened, women screened negative, women-years at risk). Unless the biological behaviour of breast cancer varies across countries, the radiological classification of IC per se (such as ‘true’ IC, screening errors) does not affect the comparability of IC frequency between programmes.

Comparability issues can be substantially lessened when data are gathered directly from individual screening programmes rather than using published material. For instance, a standardised definition for IC can be applied, and flexibility is enhanced with respect to data stratification (by age group, screening round, time period since last negative screen) and choice of quantification method (that is, IC rate or proportional incidence).

To provide a framework for international comparisons of screening performance, the International Breast Cancer Screening Network (IBSN) undertook a project to evaluate how IC, and other performance parameters,18 can most suitably be compared. The IBSN is an international consortium of representatives from 27 countries with population-based breast cancer screening programmes, sponsored by the US National Cancer Institute for the purpose of fostering cross-national efforts to evaluate the quality and effectiveness of screening mammography.19, 20 Prior to organising a tabulated data collection of IC, an assessment was conducted among IBSN member countries to describe how IC are defined, identified and classified across programmes; why they are assessed; and the types of data available for quantifying IC. In addition to summarizing the salient results from the assessment, this paper (i) documents the definitions and degree of completeness of IC in screening programmes internationally, (ii) describes the different assessment methods for IC and quantifies their effect on IC frequency and (iii) proposes an operational definition of IC to enhance international comparisons.

Material and methods

A short assessment form was developed within the Performance Parameters Evaluation Working Group of the IBSN. It was sent electronically in June 2003 to representatives of countries participating in the IBSN. The assessment sought to document the purpose of monitoring IC, the definition of IC used, the means available to ensure identification of IC and the perceived completeness of these methods, the data systematically available on IC, and the review process in place and IC classification system adopted by the different programmes. In particular, respondents were asked whether they agreed with the definition of an IC as a primary breast cancer diagnosed in a woman who had a screening test, with/without further assessment, which was negative for malignancy, either before the next invitation to screening or within a time period equal to a screening interval for a woman who has reached the upper age limit for screening.6 A section of the assessment consisted of case situations for which respondents were asked whether the programme would consider the case as an IC within their evaluation. The objective of the case scenarios was to present screening programmes with realistic situations, potentially conflicting with or not adequately addressed by available recommendations.

The scenarios were based on practical situations encountered in the Dutch and Swiss screening programmes, and designed to reflect a diversity of situations met in most settings. For example, in contrast to the Dutch programme, Swiss regional programmes include an early recall policy for extended assessment when a definite conclusion could not be immediately made on the potential evolution of presumably benign lesions.21 In the Netherlands, both the screening programme and the cancer registry cover the whole country, and the remarkably low recall rate indicates a setting with a risk of a higher IC rate.18

Screening programmes in the 27 countries represented in the IBSN were contacted and 18 provided information for this study. Whenever possible, one ‘pooled’ questionnaire was considered for countries where screening delivery is managed at a regional level (Australia, Canada, UK, Switzerland). Divergences across regional programmes were recorded in these countries but the view of the majority of screening programmes, as reported by the country's representative, was used for analysis. This approach could not always be applied to Italy where surveys for the Torino and Florence programmes were treated separately (so, the analysis included 19 breast cancer service screening programmes). For the US, as analyses requiring a definition of IC are yet to be performed and given the lack of agreement across individual registries, no summary data was possible at the time of the study. Administration of the assessment, as well as data entry and analysis, were performed by one of us (J-L.B.).

Responses were reviewed during a Performance Parameters Evaluation Working Group meeting held in Lausanne, Switzerland, in September 2003. An expert with no affiliation to any breast screening programme (P.S.) was also invited to present his suggested classification of the case situations. A ‘standard’ for these situations was subsequently proposed by the Working Group and adopted in May 2004 at the IBSN biennial general meeting in Oslo (http://appliedresearch.cancer.gov/ibsn/).

To demonstrate effects on IC frequencies of different quantification methods, data from the Dutch breast cancer screening programme (1996–98) were used. As part of the ongoing evaluation of the Dutch programme, records of screened women are linked annually to the cancer registry at the regional level. Positive matches are manually checked to determine whether they involve an IC or a late diagnosis of a screen-detected (SD) cancer due to a delay during the diagnostic assessment.22 The Dutch programme evaluation takes the loss to follow-up into account for women who died or moved away from their region before their next screening examination, and also the exact date of the next screening examination.10 IC rates are then expressed as the number of IC divided by the number of women-years of follow-up within the corresponding time period. It is important that the numerator and the denominator refer to the same time span when calculating the incidence of IC.

Three ways to express the IC incidence rate were compared, based on 3 possible denominators: (i) the number of screened women, (ii) the number of screen-negative women and (iii) the total number of women-years of follow-up during the screening interval. The ratio of the number of SD cancers to IC was also computed. These quantification methods would yield similar relative differences if the proportionate incidence were used (the IC rate expressed relative to the incidence that would have been expected without mass screening), since IC rate and proportionate incidence only differ by a multiplicative factor. The effect of these quantification methods was also assessed under the assumption of a 90% completeness in identification of IC. Differences in IC rates obtained were statistically tested with a p-value threshold of 0.05 (t-test with assumption of equal variances).

Results

Table I summarises the healthcare organisational context of the breast cancer screening programmes participating in this study. Programmes had more often a national than a regional coverage (61% vs. 28%), with 2 established pilot projects in Japan and Uruguay. Availability of, and linkage with, a population-based cancer registry to identify IC was systematic, although some national programmes were not entirely covered by a cancer registration system. All established programmes reported capturing over 90% of IC occurring in their setting. Most (16/18, 89%) acknowledged that screen-eligible women could undergo mammography screening outside of the organised breast cancer screening programme (opportunistic screening) and 72% had an early recall policy. However, data on concomitant opportunistic screening activity were rarely available, particularly at an individual level.

Table I. Type of Breast Cancer Screening Programme, Estimated Extent of Completeness of Interval Cancers and Level of Information Available on Opportunistic Screening (OS) in the 18 Participating Countries, 2003

Country [1]     | Programme type | Coverage of IC (%) | Presence of OS | Type of data on OS       | Early recall policy
Australia       | National [2]   | 95                 | Yes            | Aggregate                | Yes
Canada          | National [2]   | 100                | Yes            | Aggregate                | Yes
Denmark         | Regional       | 100                | Yes            | Unavailable              | No
Finland         | National       | 100                | Yes            | Unavailable              | No
France          | National       | Variable [3]       | Yes            | Individual               | Yes
Israel          | National       | 95                 | No             |                          | Yes
Italy           | Regional       | 100                | Yes            | Individual/Aggregate [4] | Yes
Japan           | Pilot          | 95                 | No             |                          | Yes
Korea (South)   | National       | Variable [3]       | Yes            | Unavailable              | No
Luxemburg       | National       | 96                 | Yes            | Aggregate                | Yes
The Netherlands | National       | 95–100             | Yes            | Aggregate                | No
Norway          | National       | 100                | Yes            | Unavailable              | No
New Zealand     | National       | 100                | Yes            | Unavailable              | Yes
Spain           | Regional       | 98                 | Yes            | Unavailable              | Yes
Switzerland     | Regional       | 95                 | Yes            | Aggregate                | Yes
United Kingdom  | National       | Variable [3]       | Yes            | Unavailable              | Yes
United States   | Regional [5]   | 90–98              | NA             |                          | Yes
Uruguay         | Pilot          | 5                  | Yes            | Individual               | Yes

Notes:
[1] Other countries represented in the IBSN are Belgium, Germany, Greece, Hungary, Iceland, Ireland, Portugal, Sweden and Turkey.
[2] Countries with a national screening coverage but a decentralised (regional) organisation of screening mammography.
[3] Countries with a national screening programme but only a regional coverage by cancer registries.
[4] OS data are available mostly on an individual basis (70% of cases) for the Florence programme and in aggregate form for the Torino programme.
[5] US Breast Cancer Surveillance Consortium (http://breastscreening.cancer.gov).

All screening programmes reported monitoring IC for assessment of programme performance, and most also sought to evaluate performance of individual radiologists (Table II). The majority of programmes agreed with the definition of IC proposed in the European Guidelines (EG).6 Reasons for disagreement were the consideration of the time period since the last screen within the definition (n = 2) and the lack of precision regarding the cause of IC (n = 1). Two-thirds of programmes could track the invitational status of individual women whereas one-third of programmes were not involved in the invitation process and assumed that all eligible women were invited. Eight programmes applied further restrictions on eligibility within the age limits considered: exclusion criteria were a personal history of breast cancer (n = 6), breast-related symptoms (n = 4), breast implant or prostheses (n = 3), and a family history of breast cancer (n = 1). However, half of the programmes reporting restrictive eligibility criteria also acknowledged that they could not ensure strict application of their criteria.

Table II. Characteristics of Data and Screening Evaluation Related to Interval Cancers in 18 Countries with Breast Cancer Screening Programmes [1]

Item | % | Number [2]

Purpose for monitoring IC
  Assess performance of programme | 100 | 19/19
  Evaluate performance of radiologists | 89 | 17/19
  Conduct epidemiological research | 68 | 13/19

Definition and identification of IC
  Agree with IC definition in the EG [3] | 84 | 16/19
  Have further criteria for ineligibility (% yes, within age range) | 42 | 8/19
  Ensure application of these criteria | 50 | 4/8
  Track invited women [4]
    At an individual level | 65 | 11/17
    Eligibility and invitation assumed | 35 | 6/17

Classification of IC
  Have designated review process for IC (% yes) [5] | 53 | 9/17
  Use EG classification system (% yes) [6] | 50 | 8/16

Data systematically available for IC [7]
  Date of birth | 100 | 17/17
  Date of last screening mammogram | 100 | 17/17
  Date of diagnosis | 94 | 16/17
  Presence of risk factors | 53 | 9/17
  Symptoms/reason prompting diagnostic mammogram | 47 | 8/17
  Histological features | 94 | 16/17
  Growth status (TNM stage) | 82 | 14/17
  Prognostic indicators (grading, hormonal receptors status,…) | 71 | 12/17
  Diagnostic mammogram (+ mode of diagnostic) | 41 | 7/17
  Radiological classification (after review) | 41 | 7/17

Notes:
[1] Two screening programmes were counted for Italy (see Material and Methods).
[2] Percentages based on specified answers (missing values omitted) and for relevant programmes only.
[3] The European Guidelines (EG) define an interval cancer as a primary breast cancer diagnosed in a woman who had a screening test with/without further assessment, which was negative for malignancy, either before the next invitation to screening or within a time period equal to a screening interval for a woman who has reached the upper age limit for screening.6
[4] Missing data for the United States (see Material and Methods) and Canada.
[5] Missing data for the United States (see Material and Methods) and Denmark.
[6] Missing data for the United States (see Material and Methods), Denmark and Norway.
[7] Missing data for the United States (see Material and Methods); in New Zealand, data to be collected were in the process of being identified.

The dates of birth, last screening mammogram and diagnosis were always available for IC cases (with the exception of one pilot programme for diagnosis date; Table II). Clinical information (histological type, TNM stage, prognostic indicators) on IC was available for most screening programmes (71–94%) whereas data on risk factors and diagnostic mammograms were less readily accessible (41–53%). Some 53% of responding programmes had a designated review process for IC. Review processes were varied and only 4 programmes followed the review strategy recommended in the EG (data not shown). Conversely, the classification of IC proposed in the EG was adopted by all European programmes that reported applying a radiological review process.

The classification of 6 case situations by breast cancer screening programmes is detailed in Table III, along with the ‘standard’ suggested by an expert. Although each case satisfied the broad definition of IC proposed in the EG, which was agreed upon by most programmes, there was substantial heterogeneity in the actual classification of these situations. In particular, programmes did not uniformly handle breast cancers subsequently diagnosed in screen-positive women who had no further assessment (case B) or detected at early recall through the follow-up of a suspicious lesion unrelated to the cancer diagnosed (case E).

Table III. Classification for Evaluation of 6 Case Situations by 18 Breast Cancer Screening Programmes [1] and an Expert Opinion

How would you classify these cases for your programme evaluation? | IC | SD | Other | Expert opinion
A. Screen-positive, assessment negative; several months later breast cancer diagnosed in same breast and at same location | 14 | 4 | 0 | IC following negative assessment
B. Screen-positive, assessment not done (woman does not show up); several months later breast cancer diagnosed in same breast and at same location | 3 | 10 | 5 | Delayed SD cancer
C. Screen-detected DCIS; during screening interval invasive breast cancer in contralateral breast [2] | 13 | 1 | 4 | Without info on laterality: second primary cancer; with info on laterality: IC (post-DCIS)
D. Screen-positive with early recall (follow-up); at early recall breast cancer diagnosed in same breast and at same location | 4 | 13 | 1 | Delayed SD cancer
E. Screen-positive with early recall (follow-up); at early recall breast cancer diagnosed in contralateral breast [2] | 9 | 8 | 1 | Without info on laterality: delayed SD cancer; with info on laterality: IC
F. Screen-negative; during screening interval DCIS diagnosed (no symptom) [3] | 13 | | 3 | IC

Notes:
SD, screen-detected cancer.
[1] Programmes in the United States were excluded (see Material and Methods).
[2] The expert opinion (‘standard’) suggested two answers depending on whether laterality was available for the programme evaluation.
[3] Two programmes without opportunistic screening were excluded (see Table I). This question was asked in a section preceding the case situations.

Table IV shows how IC frequencies depend on the completeness of cancer registration and on careful case ascertainment in the linkage procedure. Incompleteness of 10% in either the cancer registry database or record linkage could lead to more than 10% of IC not being identified. The percentage of underestimation of the IC rate also varies with the chosen quantification method (Table V). On the basis of data from the Dutch nationwide breast cancer screening programme, use of the number of women screened in the calculation of IC frequencies resulted in an underestimation of 3.8% compared with the standard calculation based on women-years of follow-up. These differences, in absolute terms, were statistically significant (−0.0819/1,000 (95% CI: −0.0823; −0.0815) and −0.0736/1,000 (95% CI: −0.0739; −0.0733) for the scenarios without and with 10% under-registration, respectively), consistent across years, and slightly larger for initial than for subsequent screens (not shown). The combined effect of identification and quantification issues resulted in differences of about 10–15% in IC frequency. The ratio between SD cancers and IC increased by 11.1% (from 2.85 to 3.16) under the scenario of 10% unidentified IC.

Table IV. Estimated Effect of Accuracy of Identification on the Proportion (%) of IC not Identified [1]

Completeness of cancer registration (%) | Completeness of record linkage (%)
                                        | 90       | 95   | 99
90                                      | 19.0 [2] | 14.5 | 10.9
95                                      | 14.5     | 9.8  | 6.0
99                                      | 10.9     | 6.0  | 2.0

Notes:
[1] The probabilities of a cancer case being not identified in either database are assumed to be independent events, and similar proportions of SD cancers and IC are assumed to be unpaired.
[2] 90% × 90% = 81% complete, that is, 19% of IC are not identified by record linkage.

Table V. Effect of 3 Quantification Methods and Under-Registration of IC on the Estimation of IC Rates [1]

                         |           | Observed                                | If 10% of IC were not identified
Denominator for IC rate  | N         | IC [2] (N) | IC rate (/1,000) | Diff. (%) | IC (N) | IC rate (/1,000) | Diff. (%)
Screened women           | 356,862   | 736        | 2.06             | 3.8       | 662.4  | 1.86             | 13.4
Screen-negative women    | 354,768   | 736        | 2.07             | 3.2       | 662.4  | 1.87             | 12.9
Women-years of follow-up | 343,241.5 | 736        | 2.14             | 0.0       | 662.4  | 1.93             | 10.0

Notes:
[1] Based on 1996–98 initial screening round results for the Dutch programme (2,094 SD cancers).
[2] IC cases, 0–23 months after screening.
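
The figures in Table V can be reproduced with a few lines of arithmetic. The Python sketch below is illustrative only: the counts are those reported for the Dutch programme (initial screens, 1996–98), whereas the function and variable names are assumptions introduced here and do not come from the programme's evaluation software. It computes the IC rate per 1,000 for each denominator, the percentage underestimation relative to the women-years standard with and without 10% unidentified IC, and the change in the SD:IC ratio.

```python
# Counts reported in Table V (Dutch programme, initial screens, 1996-98).
N_IC = 736       # interval cancers, 0-23 months after screening
N_SD = 2_094     # screen-detected cancers
DENOMINATORS = {
    "screened women": 356_862,
    "screen-negative women": 354_768,
    "women-years of follow-up": 343_241.5,
}
REFERENCE = "women-years of follow-up"  # the standard calculation


def rate_per_1000(n_cases, denominator):
    """Cases per 1,000 units of the chosen denominator."""
    return 1000.0 * n_cases / denominator


def percent_underestimation(rate, reference_rate):
    """Relative shortfall of `rate` against `reference_rate`, in percent."""
    return 100.0 * (reference_rate - rate) / reference_rate


def underestimation_by_denominator(identified_fraction):
    """Percent underestimation per denominator when only a fraction of IC is identified.

    The reference is always the women-years rate with all IC identified.
    """
    reference_rate = rate_per_1000(N_IC, DENOMINATORS[REFERENCE])
    return {
        name: percent_underestimation(
            rate_per_1000(identified_fraction * N_IC, denom), reference_rate
        )
        for name, denom in DENOMINATORS.items()
    }


rates = {name: rate_per_1000(N_IC, d) for name, d in DENOMINATORS.items()}
observed = underestimation_by_denominator(1.0)
with_10pct_missed = underestimation_by_denominator(0.9)

# SD:IC ratio with all IC identified, and with 10% of IC unidentified.
sd_ic_ratio = N_SD / N_IC
sd_ic_ratio_missed = N_SD / (0.9 * N_IC)
ratio_increase_pct = 100.0 * (sd_ic_ratio_missed / sd_ic_ratio - 1.0)

for name in DENOMINATORS:
    print(f"{name}: rate {rates[name]:.2f}/1,000, "
          f"diff {observed[name]:.1f}%, "
          f"diff with 10% missed {with_10pct_missed[name]:.1f}%")
print(f"SD:IC ratio {sd_ic_ratio:.2f} -> {sd_ic_ratio_missed:.2f} "
      f"(+{ratio_increase_pct:.1f}%)")
```

Keeping the women-years rate with complete identification as the single reference is what makes the last column of the table combine both effects, denominator choice and under-identification, in one percentage.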

Discussion

This international study showed wide variations in assessment activities involving IC across mammography screening programmes, and demonstrated the impact of identification and quantification practices on IC frequency. International comparisons of IC could lead to better understanding of the relationship between programmes' performance and screening practices. Differences in definition, quantification, or completeness of identification of IC between programmes distort the underlying differences in IC frequency and obscure interpretation of this measure. These results indicate that caution is required when comparing IC in published studies and corroborate the primary objective of this IBSN working group: to propose an operational definition and quantification method for comparing IC internationally, so that some sources of artefactual variation across programmes can be eliminated or controlled.

Cancers diagnosed in the target population can be grouped into 3 categories: (i) unscreened (those occurring in women not yet invited or invited but not screened), (ii) screen-detected (SD), and (iii) screen-undetected (IC, in a broad sense). The discrepancies observed in our case situations exercise revealed the difficulties in defining IC and highlight the need for more precise rules for the evaluation of programme IC. For every case, at least some programmes diverged from the set ‘standard’. Tumour laterality was given for most practical situations described, but should bear no influence when assessing overall programme performance. Knowledge of laterality would, however, be necessary to assess the sensitivity of radiologists' performance, and the dual classification of cases C and E based on our expert's opinion (P.S.) is a reminder of the relevance of laterality information for other types of screening evaluation (Table III).

Most of the differences observed in the definition of IC related to cancers that occurred after a positive screening test (cases A to E, Table III). These situations appear to represent about 7–26% of all IC.8, 23, 24, 25, 26 Case A undoubtedly is consistent with the EG and most definitions of IC. The responses of 3 programmes reporting agreement with the EG definition while classifying case A as a SD cancer illustrate that a theoretical definition does not always hold in practice and that case situations probably provide a more appropriate way of assessing the definitions applied by programmes. As cancers detected by radiologists but misdiagnosed during the assessment phase are rare, one can assume that the subsequent cancer diagnosis among women who opted not to present themselves for further investigation would have been correctly made, had the assessment been performed (case B).

Early recall after an unclear screening result is not encouraged and should be kept to a minimum (less than 1% of screened women).6 Further, the delay between diagnostic assessment and early recall and the frequency of repeat mammography vary across programmes with such a procedure, and compliance may not be ensured. Whenever possible, cancers resulting from early recall should be counted separately (cases D and E). A few studies have reported on the proportion of detected cancers during early recall4, 27, 28 but it appears unlikely to exceed a few percent. Their modest contribution to the pool of SD cancers makes them a candidate for inclusion as SD lesions and their diagnosis was presumably made earlier than would have been the case in the absence of a screening programme.

The heterogeneous classification of case C reflected the discrepant eligibility criteria of programmes. Case C is a rare situation: about one woman per thousand experiences a screen-detected DCIS. In 4 regions where medical surveillance of high risk subjects is mostly organised on an individual basis (Denmark, France, Florence, Switzerland), these women were reported to become ineligible for the programme. We suggest, for international comparison and uniformity purposes, to take into account only the first tumour diagnosed, whether DCIS or invasive, so that a woman cannot contribute both as a SD and an IC case.

The interesting aspect highlighted by case F lies beyond its clear classification (as an IC, since any DCIS would analogously be counted as a SD cancer if diagnosed during the screening investigation). Diagnosis of an asymptomatic cancer implies detection by an (opportunistic) screening mammography or, sometimes, a fortuitous finding. Although it remains unknown whether DCIS would have developed into a symptomatic cancer diagnosed before the next scheduled screening test, the amount of opportunistic screening in an area can affect the number of DCIS detected in the interscreening interval. Most programmes include DCIS as IC, whereas a few do not. On one hand, exclusion of DCIS detected in the interscreening interval from international comparisons of IC allows focus on the quality aspects of organised screening programmes independent of the screening activity occurring outside the programme's setting. Bulliard et al.21 pointed out that since cancer registries were rarely in the position to routinely provide incidence rates for noninvasive tumours, in situ cancers should not be counted in the numerator when IC are expressed as a proportion of the underlying breast cancer incidence rate. Further, discarding DCIS should enhance international comparability of IC by excluding an entity whose classification may differ across pathologists and which is the main contributor to over-diagnosis in screening programmes. On the other hand, inclusion of the DCIS diagnosed in the interscreening interval, which contribute between 2 and 8% of all IC,7, 10, 29 implicitly considers opportunistic screening activity as a contributory factor for international differences in IC frequency.

The IC rate is sensitive to the numerator and denominator used for its quantification. The numerator consists of the number of IC considered, and standardisation of the numerator is obtained when a common definition for IC is applied across screening programmes. By allowing for withdrawal from the screened cohort in each of the years of follow-up, the number of women-years at risk is the only method that does not underestimate the IC rate. Computations performed with Dutch data showed that simpler ways of calculating the IC rate yielded small but significant differences (3.8%). A comparable effect of the denominator on IC rate (4.3%) was reported in Scotland.8 This corroborated a minor influence of the choice of denominator in the assessment of IC.

Expression of the IC rate in terms of proportional (relative) incidence enables discarding of geographical differences in (underlying) breast cancer incidence rates when comparing results of screening programmes. This dimensionless indicator is recommended for reporting IC and has been used to establish norms to evaluate IC incidence against the background incidence.6 The proportional incidence method involves the calculation of the incidence rate that would be expected in the absence of screening, a figure that is never exactly known. The longevity of the cancer registry and the screening programme, along with the context in which screening appeared (staggered implementation of an organised programme, concomitant opportunistic screening), mostly determine whether historical or modelled rates from prescreening trends are used for the calculation of proportional incidence. Paradoxically, the way of calculating the expected incidence adds a substantial source of variability,30 which may well counteract the ‘adjustment effect’ brought by the proportional incidence. The ratio between the numbers of SD cancers and IC in comparable age groups provides a simple alternative, easy to derive from results of screening programmes. However, this indicator is highly sensitive to potential misclassification between IC and SD cancers and requires a prior standardisation of the definition of IC.
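
The proportional incidence described above can be written compactly. The notation below is an assumption introduced for illustration (it is not taken from the European Guidelines): r_IC is the IC rate obtained with a given denominator and r_exp is the incidence rate expected in the absence of screening.

```latex
\[
  \mathrm{PI} \;=\; \frac{r_{\mathrm{IC}}}{r_{\mathrm{exp}}},
  \qquad
  r_{\mathrm{IC}} \;=\; \frac{\text{number of IC}}{\text{denominator}} .
\]
% For two quantification methods a and b applied to the same programme
% with the same expected rate, r_exp cancels in the relative difference:
\[
  \frac{\mathrm{PI}_a - \mathrm{PI}_b}{\mathrm{PI}_a}
  \;=\;
  \frac{r_{\mathrm{IC},a}/r_{\mathrm{exp}} - r_{\mathrm{IC},b}/r_{\mathrm{exp}}}
       {r_{\mathrm{IC},a}/r_{\mathrm{exp}}}
  \;=\;
  \frac{r_{\mathrm{IC},a} - r_{\mathrm{IC},b}}{r_{\mathrm{IC},a}} .
\]
```

The cancellation holds only when both methods share the same expected rate. Between programmes, r_exp is estimated separately in each setting, which is where the additional variability discussed in this paragraph enters.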

The accuracy of identification of IC may potentially be the largest source of error and discrepancy between programmes: if 5% of breast cancers are not reported to the registry and another 5% of individuals are not matched by the linkage procedure, up to 10% of IC in the screened population might not be identified (Table IV). Linkage failures are mainly ascribed to deficient or inaccurate notification procedures, inaccuracy in the matching software, subjects who departed from the area prior to their breast cancer diagnosis or who did not give authorisation for record linkage, or the inevitable delay before cancer notification. When the case is recorded in both the screening and cancer registry databases, missed linkages are generally retrieved by manual check and may account for around 1% of all IC in settings where ascertainment is believed to identify virtually all IC.8, 31
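
The arithmetic behind Table IV is a product of two completeness fractions. The short Python sketch below reproduces the grid under the independence assumption stated in the table's first note; the function name is hypothetical and chosen here for illustration.

```python
def percent_unidentified(registry_completeness, linkage_completeness):
    """Share of IC (in %) missed when registration and linkage fail independently.

    Both arguments are fractions between 0 and 1. Independence of the two
    failure modes is the assumption stated for Table IV.
    """
    for value in (registry_completeness, linkage_completeness):
        if not 0.0 <= value <= 1.0:
            raise ValueError("completeness must be a fraction between 0 and 1")
    return 100.0 * (1.0 - registry_completeness * linkage_completeness)


LEVELS = (0.90, 0.95, 0.99)
grid = {
    (registry, linkage): percent_unidentified(registry, linkage)
    for registry in LEVELS
    for linkage in LEVELS
}

for registry in LEVELS:
    row = "  ".join(f"{grid[(registry, linkage)]:5.1f}" for linkage in LEVELS)
    print(f"registry {registry:.0%}: {row}")
```

With 95% completeness at both steps the product is 0.9025, so just under 10% of IC would be missed, which is the scenario described in this paragraph.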

Completeness of linkage procedures has rarely been documented32 since, apart from a few countries that can rely on a unique person identifier for matching, it remains uncertain whether the screening programme and the cancer registry have captured all IC. Unmatched records from the cancer registry are generally counted, erroneously, as unscreened cancers. Unmatched cancer records from the screening database should be rare if the programme notifies SD cancers to the registry. Our rough quantification may overestimate the extent of incomplete identification since some IC are possibly unregistered in both databases (that is, notifications in either database are not independent events) and so do not have a multiplicative effect. However, SD cancers may be successfully matched proportionately more often than IC, particularly when the screening programme transmits its cancer cases to the registry.

Our results indicated that fewer than 10% of IC remain unidentified in the evaluation of most breast cancer screening programmes. Some evidence suggests that reported estimates sometimes corresponded to the cancer registry coverage alone, so that inaccuracy in record linkage was not taken into account. In any case, if similar definition and quantification methods are used for IC, the reported variation in completeness of identification of IC suggests that differences in IC frequency should exceed 10% to be considered indicative of real differences between programmes.

The EG definition of an IC was found to be used and agreed upon by screening programmes worldwide, even if the practices of some programmes were sometimes at variance with it. To improve and facilitate international comparison of IC, we suggest the following: (i) include only the first primary breast cancer with histological confirmation, (ii) consider time since last screening examination as a stratification variable9, 32 but not as part of the operational definition of IC, and (iii) analyse in situ cancers and cancers detected at early recall separately.
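
The three suggestions above can be sketched as a simple classification rule applied to candidate IC records. This is only an illustration: the record fields, the group labels, the 12-month width of the time bands and the precedence given to in situ over early recall are all assumptions made for the example, not part of the proposed definition.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class CancerRecord:
    # Hypothetical fields describing a candidate interval cancer.
    first_primary: bool
    histologically_confirmed: bool
    in_situ: bool
    detected_at_early_recall: bool
    months_since_last_screen: int


def time_stratum(months: int, band: int = 12) -> str:
    """Label the band of months since the last screen, e.g. '0-11' or '12-23'."""
    if months < 0:
        raise ValueError("months must be non-negative")
    start = (months // band) * band
    return f"{start}-{start + band - 1}"


def classify(record: CancerRecord) -> Optional[Tuple[str, str]]:
    """Return (analysis group, time stratum), or None if the record is excluded."""
    # (i) keep only first primary cancers with histological confirmation
    if not (record.first_primary and record.histologically_confirmed):
        return None
    # (iii) in situ and early-recall cases go to separate analysis groups
    if record.in_situ:
        group = "in_situ"
    elif record.detected_at_early_recall:
        group = "early_recall"
    else:
        group = "main"
    # (ii) time since last screen only stratifies; it never excludes a case
    return group, time_stratum(record.months_since_last_screen)


example = classify(CancerRecord(True, True, False, False, 14))
print(example)
```

Keeping time since last screen out of the eligibility test means that programmes with different screening intervals can still be compared stratum by stratum.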

In summary, valid international comparisons of IC are possible with careful attention to the definition and, to a lesser extent, to the quantification of this measure. Based on some assumptions about completeness of cancer registration and linkages, identification of IC appears to contribute most to international variation and cannot be easily controlled for in methodologically rigorous comparisons. True differences in IC frequency across screening programmes should exceed 10–15% so as to override any residual, artefactual effect.

Acknowledgements

This project was conducted while Dr. Bulliard was supported by a fellowship from the Swiss National Science Foundation (Nr 32-63130.00). The International Breast Cancer Screening Network (IBSN) is acknowledged for their leadership and previous collaborative work. The authors are indebted to Andriana Koukari (Australia), Jay Onysko (Canada), Elsebeth Lynge (Denmark), Matti Hakama (Finland), Rosemary Ancelle-Park (France), Gadi Rennert (Israel), Marco Rosselli Del Turco and Patrizia Falini (Florence, Italy), Antonio Ponti (Torino, Italy), Noriaki Ohuchi (Japan), Astrid Scharpantgen (Luxemburg), Brian Cox (New Zealand), Solveig Hofvind (Norway), Won Chul Lee (South Korea), Nieves Ascunce (Spain), Sue Moss (UK), Gonzalo Pou (Uruguay), and the statistical coordinating centre of the Breast Cancer Surveillance Consortium (US) for providing information on their screening programmes. We also thank Brian Cox, Hélène Sancho-Garnier (France) and Antonio Ponti for their thoughtful comments on the manuscript.

References

  • 1
    International Agency for Research on Cancer. In: Vainio H, Bianchini F, eds. IARC Handbooks of Cancer Prevention, Vol. 7: Breast cancer screening. Lyon: IARC Press, 2002.
  • 2
    Fletcher SW, Black W, Harris R, Rimer BK, Shapiro S. Report of the International Workshop on Screening for Breast Cancer. J Natl Cancer Inst 1993; 85: 1644–56.
  • 3
    Nyström L, Andersson I, Bjurstam N, Frisell J, Nordenskjöld B, Rutqvist LE. Long-term effects of mammography screening: updated overview of the Swedish randomised trials. Lancet 2002; 359: 909–19.
  • 4
    Warren R, Allgood P, Hunnam G, Godward S, Duffy S. An audit of assessment procedures in women who develop breast cancer after a negative result. J Med Screen 2004; 11: 180–6.
  • 5
    Porter PL, El-Bastawissi AY, Mandelson MT, Lin MG, Khalid N, Watney EA, Cousens L, White D, Taplin S, White E. Breast tumor characteristics as predictors of mammographic detection: comparison of interval- and screen-detected cancers. J Natl Cancer Inst 1999; 91: 2020–8.
  • 6
    Perry N, Broeders M, de Wolf C, Törnberg S, eds. European guidelines for quality assurance in mammography screening. Luxembourg: Office for Official Publications of the European Communities, 2001.
  • 7
    Day N, McCann J, Camilleri-Ferrante C, Britton P, Hurst G, Cush S, Duffy S. Monitoring interval cancers in breast screening programmes: the East Anglian experience. J Med Screen 1995; 2: 180–5.
  • 8
    Everington D, Gilbert FJ, Tyack C, Warner J. The Scottish breast screening programme's experience of monitoring interval cancers. J Med Screen 1999; 6: 21–7.
  • 9
    Faux AM, Richardson DC, Lawrence GM, Wheaton ME, Wallis MG. Interval breast cancers in the NHS Breast Screening Programme: does the current definition exclude too many? J Med Screen 1997; 4: 169–73.
  • 10
    Fracheboud J, de Koning HJ, Beemsterboer PM, Boer R, Verbeek AL, Hendriks JH, van Ineveld BM, Broeders MJ, de Bruyn AE, van der Maas PJ. Interval cancers in the Dutch breast cancer screening programme. Br J Cancer 1999; 81: 912–7.
  • 11
    Ganry OF, Peng J, Raverdy NL, Dubreuil AR. Interval cancers in a French breast cancer-screening programme (Somme Department). Eur J Cancer Prev 2001; 10: 269–74.
  • 12
    Kavanagh AM, Mitchell H, Farrugia H, Giles GG. Monitoring interval cancers in an Australian mammographic screening programme. J Med Screen 1999; 6: 139–43.
  • 13
    Moss SM, Coleman DA, Ellman R, Chamberlain J, Forrest AP, Kirkpatrick AE, Thomas BA, Price JL. Interval cancers and sensitivity in the screening centres of the UK trial of early detection of breast cancer. Eur J Cancer 1993; 29A: 255–8.
  • 14
    Wang H, Bjurstam N, Bjorndal H, Braaten A, Eriksen L, Skaane P, Vitak B, Hofvind S, Thoresen SO. Interval cancers in the Norwegian breast cancer screening program: frequency, characteristics and use of HRT. Int J Cancer 2001; 94: 594–8.
  • 15
    Törnberg S, Codd M, Rodrigues V, Segnan N, Ponti A. Ascertainment and evaluation of interval cancers in population-based mammography screening programmes: a collaborative study in four European centres. J Med Screen 2005; 12: 43–9.
  • 16
    Anttila A, Koskela J, Hakama M. Programme sensitivity and effectiveness of mammography service screening in Helsinki, Finland. J Med Screen 2002; 9: 153–8.
  • 17
    Taylor R, Supramaniam R, Rickard M, Estoesta J, Moreira C. Interval breast cancers in New South Wales, Australia, and comparisons with trials and other mammographic screening programmes. J Med Screen 2002; 9: 20–5.
  • 18
    Yankaskas BC, Klabunde CN, Ancelle-Park R, Rennert G, Wang H, Fracheboud J, Pou G, Bulliard J-L. International comparison of performance measures for screening mammography: can it be done? J Med Screen 2004; 11: 187–93.
  • 19
    Klabunde C, Bouchard F, Taplin S, Scharpantgen A, Ballard-Barbash R. Quality assurance for screening mammography: an international comparison. J Epidemiol Community Health 2001; 55: 204–12.
  • 20
    Shapiro S, Coleman EA, Broeders M, Codd M, de Koning H, Fracheboud J, Moss S, Paci E, Stachenko S, Ballard-Barbash R; on behalf of International Breast Cancer Screening Network (IBSN) and the European Network of Pilot Projects for Breast Cancer Screening. Breast cancer screening programmes in 22 countries: current policies, administration and guidelines. Int J Epidemiol 1998; 27: 735–42.
  • 21
    Bulliard J-L, De Landtsheer J-P, Levi F. Results from the Swiss mammography screening pilot programme. Eur J Cancer 2003; 38: 1760–8.
  • 22
    Duijm LEM, Groenewoud JH, Jansen FH, Fracheboud J, van Beek M, de Koning HJ. Mammography screening in the Netherlands: delay in the diagnosis of breast cancer after breast cancer screening. Br J Cancer 2004; 91: 1795–9.
  • 23
    Amos AF, Kavanagh AM, Cawson J. Radiological review of interval cancers in an Australian mammographic screening programme. J Med Screen 2000; 7: 184–9.
  • 24
    Liston J. Are too many breast cancers missed at assessment? Breast 2000; 9: 201–7.
  • 25
    McCann J, Britton PD, Warren RM, Hunnam G. Radiological peer review of interval cancers in the East Anglian breast screening programme: what are we missing? J Med Screen 2001; 8: 77–85.
  • 26
    Saarenmaa I, Salminen T, Geiger U, Holli K, Isola J, Kärkkäinen A, Pakkanen J, Piironen A, Salo A, Hakama M. The visibility of cancer on earlier mammograms in a population-based screening programme. Eur J Cancer 1999; 35: 1118–22.
  • 27
    National Health Service. Breast Screening Programme, England: 2003–04. Sheffield: NHS, 2005.
  • 28
    Kerlikowske K, Smith-Bindman R, Abraham LA, Lehman CD, Yankaskas BC, Ballard-Barbash R, Barlow WE, Voeks JH, Geller BM, Carney PA, Sickles EA. Breast cancer yield for screening mammographic examinations with recommendation for short-interval follow-up. Radiology 2005; 234: 684–92.
  • 29
    Exbrayat C, Garnier A, Colonna M, Assouline D, Salicru B, Winckel P, Menegoz F, Bolla M. Analysis and classification of interval cancers in a French breast cancer screening programme (département of Isère). Eur J Cancer Prev 1999; 8: 255–60.
  • 30
    Prior P, Woodman CB, Wilson S, Threlfall AG. Reliability of underlying incidence rates for estimating the effect and efficiency of screening for breast cancer. J Med Screen 1996; 3: 119–22.
  • 31
    Schouten LJ, de Rijke JM, Schlangen JT, Verbeek AL. Evaluation of the effect of breast cancer screening by record linkage with the cancer registry, The Netherlands. J Med Screen 1998; 5: 37–41.
  • 32
    Kavanagh AM, Amos AF, Marr GM. The ascertainment and reporting of interval cancers within the BreastScreen Australia Program. NHMRC National Breast Cancer Centre, 1999 (available at http://www.nbcc.org.au/bestpractice/resources/ICP_reportinginternalcancers.pdf).