Abstract

The standardized assessment of Ki67 labeling index (LI) is of clinical importance to identify patients with primary breast cancer who could benefit from chemotherapy. In this study, we evaluated the interobserver concordance of Ki67 LI assessment. Six surgical pathologists participated and all the slides were prepared from archival breast cancer tissues fixed in 10% buffered formalin for 24 h and stained with MIB-1. Three independent studies were conducted. In the first study, 30 stained slides were assessed using two different methods: the scoring system, with a positive rate scored from 1 (0–9%) to 10 (90–100%) by visual estimate; and the counting method, with approximately 1000 cells counted in hot spots. In the second study, 20 tumors with Ki67 LI 5–25% were assessed, and in the third study, 15 printed photographs of stained slides were assessed to avoid variations by selecting different fields. In study 1, the counting system (intraclass correlation coefficient [ICC], 0.66 [95% confidence interval 0.52–0.78]) demonstrated a better correlation than the scoring system (ICC, 0.57 [0.42–0.72]). In study 2, the assessment for Ki67 LI of 5–25% demonstrated a correlation (ICC, 0.68 [0.50–0.81]) similar to that of study 1 (unrestricted range of Ki67 LI). In study 3, the assessment of Ki67 LI by counting yielded a good concordance (ICC, 0.94 [0.88–0.97]). In conclusion, there was better concordance with the counting system, and concordance was high when the assessed field was predetermined, indicating that the selection of the evaluation area is critical for obtaining reproducible Ki67 LI in breast cancer.

The introduction of adjuvant therapy into the treatment strategy for breast cancer patients has contributed to a significant reduction in breast cancer mortality.[1] Chemotherapy, endocrine therapy and molecular targeted therapy have been applied as adjuvant therapy, alone or in combination, based on clinical and pathological parameters, including tumor size, lymph node involvement, hormone receptor (HR) expression, human epidermal growth factor receptor 2 (HER2) status and histological grade. However, it has not been clarified which patients with early-stage breast cancer should receive chemotherapy, in particular among those with HR-positive disease.[2] Chemotherapy may result in serious adverse effects, such as secondary malignancy and cardiac toxicity; therefore, it is crucial to develop biomarkers for selecting those who could benefit from systemic chemotherapy.

Several published studies suggest the prognostic value of the cell proliferation marker Ki67 labeling index (LI) in breast cancer.[3] Ki67 LI has also been considered a promising biomarker for selecting patients who could benefit from chemotherapy. In preoperative neoadjuvant treatment, Ki67 LI is reported to be associated with pathological response in a number of studies,[4-6] although not all studies support the predictive value of Ki67 in their multivariate analyses.[7, 8] In the adjuvant setting, Ki67 LI is reported to predict the therapeutic benefit of adding taxanes to anthracycline-based regimens in patients with HR-positive disease,[9, 10] while Ki67 LI is also reported not to predict the relative efficacy of adjuvant chemotherapy consisting of cyclophosphamide, methotrexate and fluorouracil (CMF) compared to endocrine therapy alone.[11] Recently, Ki67 LI was reported as a biomarker to distinguish luminal B from luminal A subtypes and it has been widely used in the pathological evaluation of breast cancer.[12, 13] Ki67 LI has, therefore, attracted enormous interest from clinical oncologists, but it remains to be confirmed whether Ki67 LI is useful for identifying patients who could benefit from chemotherapy. Standardization of the assessment of Ki67 LI is thus considered essential to critically evaluate its clinical value and to apply it in the clinic.

We have previously reported on the standardization of biomarker assessment, in particular HER2 assessment.[14, 15] In this study, we evaluate the interobserver concordance of the assessment of Ki67 LI in the archival materials by six surgical pathologists and discuss the potential causative factors resulting in the discordance of assessment among the pathologists.

Materials and Methods

All the slides were prepared from archival 10% formalin-fixed, paraffin-embedded tissue specimens (years 2009–2010) of primary breast cancer at Kyoto University Hospital, Kyoto, Japan. Pathological assessment was performed by six surgical pathologists (A to F) who specialize in breast pathology, from six different Japanese institutions: Kyoto University Hospital, National Cancer Center Hospital, Saitama Cancer Center, Nihon University School of Medicine, The Cancer Institute Hospital of the Japanese Foundation for Cancer Research and Tohoku University School of Medicine.

For the present paper, three independent studies were undertaken to estimate the interobserver concordance.

Study 1

Six consecutive sections were cut from each of five formalin-fixed paraffin-embedded (FFPE) blocks, one block from the surgical specimen of each of five different breast cancer cases. One slide from each case was immunostained with the MIB-1 antibody (DAKO, Glostrup, Denmark) at each institution according to its routine method. A total of 30 stained slides were collected and shuffled in the data center, located in Kyoto University. The 30 slides were sent to each institute and assessed for Ki67 LI by each pathologist using two different modes of assessment. First, they used the scoring system, in which the rate of positive cells in hot spots, namely the areas where Ki67 staining in cancer nuclei is the most dense among the fields, was scored from 1 (0–9%) to 10 (90–100%) by visual estimate without counting the cell number. The second method was the counting system, for which approximately 1000 cells in total were counted in the hot spots and the positive rate was calculated. Depending on the routine assessment method of each institute, assessment was performed either directly under the microscope (three institutes) or on captured images (three institutes).
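The two modes of assessment described above can be sketched in a few lines of Python. This is only an illustration of the definitions, not the procedure the pathologists followed: the function names are hypothetical, and placing a rate of exactly 100% in the top category is an assumption based on the stated range of 90–100%.

```python
def visual_score(percent_positive: float) -> int:
    """Map an estimated positive rate (0-100%) to the 1-10 score.

    Score 1 covers 0-9%, score 2 covers 10-19%, ..., score 10 covers 90-100%.
    """
    if not 0 <= percent_positive <= 100:
        raise ValueError("percent_positive must be between 0 and 100")
    # Floor division by 10 gives bins 0..10; 100% is folded into score 10
    # (assumption, see the lead-in).
    return min(int(percent_positive // 10) + 1, 10)


def labeling_index(positive_cells: int, total_cells: int) -> float:
    """Ki67 LI (%) from a cell count in hot spots (counting system)."""
    if total_cells <= 0 or not 0 <= positive_cells <= total_cells:
        raise ValueError("need total_cells > 0 and 0 <= positive_cells <= total_cells")
    return 100.0 * positive_cells / total_cells
```

For example, a hypothetical count of 152 positive nuclei among 1000 cells gives an LI of 15.2%, which falls in score category 2 (10–19%) of the scoring system.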

Study 2

To assess the variability of Ki67 LI around 15%, which is clinically relevant to distinguish between luminal A and B subtypes of breast cancer,[12, 13] 20 tumors with Ki67 LI ranging from 5% to 25% (15 ± 10%) determined by a pathologist independent of this study, stained in a single institution (Kyoto University Hospital), were subsequently assessed by the participating pathologists using the counting system.

Study 3

To avoid variations caused by assessment in different microscopic fields and to further evaluate the variation of the threshold of immunointensity interpreted as positive by different pathologists, 15 photographs of Ki67-stained slides were taken and printed by a pathologist independent of the assessment. The photographs were assessed for Ki67 LI by each participating breast pathologist using the counting system. Some examples of the photographs are shown in Figure 1.

Figure 1. Printed photographs used for study 3. Three representative photographs of 15 illustrations used in study 3 are shown with Ki67 labeling index (LI) assessed by the counting system by six pathologists: A to F.

To assess the agreement regarding Ki67 LI, the intraclass correlation coefficient (ICC) was estimated with a 95% confidence interval (CI). There are no universally accepted standard criteria for the ICC; hence, based on its similarity to the kappa coefficient, the following criteria, applied to the lower limit of the 95% CI, were used here to aid interpretation:[16, 17] a lower limit of 0.41–0.60 as “moderate correlation”; 0.61–0.80 as “substantial correlation”; and >0.80 as “almost perfect correlation.”
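As a rough illustration of this analysis, the sketch below computes an ICC from a complete table of ratings and applies the interpretation bands to a lower confidence limit. The text does not state which form of the ICC was used, so the two-way random-effects, absolute-agreement, single-rater form (often written ICC(2,1)) is an assumption here. The function names are hypothetical, the confidence interval itself is not computed, and the label returned for lower limits below 0.41 is a placeholder, since the criteria above do not define one.

```python
from typing import Sequence


def icc_2_1(ratings: Sequence[Sequence[float]]) -> float:
    """ICC(2,1) for a complete table: rows = slides, columns = pathologists."""
    n, k = len(ratings), len(ratings[0])
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with >= 2 rows and >= 2 columns")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: slides, pathologists, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    denominator = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    if denominator == 0:
        raise ValueError("ICC is undefined when all ratings are identical")
    return (ms_rows - ms_error) / denominator


def interpret_lower_limit(lower: float) -> str:
    """Label the lower 95% CI limit of the ICC using the bands quoted above."""
    lower = round(lower, 2)  # the bands are stated to two decimal places
    if lower > 0.80:
        return "almost perfect correlation"
    if lower >= 0.61:
        return "substantial correlation"
    if lower >= 0.41:
        return "moderate correlation"
    return "not classified by these criteria"  # placeholder, not from the text
```

Because the bands are applied to the lower limit rather than to the point estimate, an ICC of 0.66 with a lower limit of 0.52 is labeled "moderate correlation" even though the point estimate alone would fall in the "substantial" band.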

The Bland–Altman plot was used to assess the agreement between the two assessment systems because all the pathologists assessed Ki67 LI using both systems.[18] All statistical analyses were performed using SAS software version 9.2 (SAS Institute, Cary, NC, USA).
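The quantities behind a Bland–Altman plot can be sketched as follows. The analyses in this study were run in SAS; this Python version is only an illustration with hypothetical names. The limits of agreement (mean difference ± 1.96 standard deviations) are the conventional choice for such plots and are an assumption here, since the text does not say whether they were calculated.

```python
from statistics import mean, stdev
from typing import List, NamedTuple, Sequence


class BlandAltman(NamedTuple):
    averages: List[float]     # horizontal axis: mean of the two methods
    differences: List[float]  # vertical axis: method A minus method B
    bias: float               # mean difference between the methods
    lower_limit: float        # bias - 1.96 * SD of the differences
    upper_limit: float        # bias + 1.96 * SD of the differences


def bland_altman(method_a: Sequence[float], method_b: Sequence[float]) -> BlandAltman:
    """Paired comparison of two methods measured on the same observations."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two equally long sequences with >= 2 pairs")
    averages = [(a + b) / 2 for a, b in zip(method_a, method_b)]
    differences = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(differences)
    spread = 1.96 * stdev(differences)  # sample standard deviation
    return BlandAltman(averages, differences, bias, bias - spread, bias + spread)
```

Each pair would be one pathologist's two readings of the same slide, both expressed as percentages; plotting `differences` against `averages` gives the scatter, and `bias` with the two limits gives the horizontal reference lines.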

Results

Study 1

The same 30 slides were used to analyze the concordance of the assessment for Ki67 LI among the different pathologists involved in this study, applying the counting and the scoring systems. The counting system demonstrated a better correlation of Ki67 LI among the six pathologists than the scoring system (ICC, 0.66 [95% CI 0.52–0.78] for the counting system, 0.57 [95% CI 0.42–0.72] for the scoring system) (Fig. 2a,b). To examine an intraclass correlation between the two assessment systems, scores (1–10) from the scoring system were multiplied by 10 and regarded as equivalent to the percentage using the counting system. The two assessment systems demonstrated a moderate correlation (ICC, 0.68 [95% CI 0.60–0.75]) (Fig. 2c).

Figure 2. Assessment of Ki67 labeling index by six different surgical pathologists (A to F). (a) Assessment by the counting system among six pathologists (A–F). The intraclass correlation coefficient (ICC) was 0.66 (95% confidence interval [CI] 0.52–0.78). (b) Assessment by the scoring system (visual estimate). The ICC was 0.57 (95% CI 0.42–0.72). (c) Correlation between the two assessment systems, shown as a Bland–Altman plot. The vertical axis shows the difference in values between the two assessment systems. The horizontal axis shows the average value of the two assessment systems. The ICC was 0.68 (95% CI 0.60–0.75).

Study 2

The assessment of Ki67 LI between 5% and 25% in 20 slides using the counting system demonstrated a moderate correlation among the six pathologists (ICC, 0.68 [95% CI 0.50–0.81]) (Fig. 3). This result is equivalent to the result from Study 1 using the specimens with an unrestricted range of Ki67 LI.

Figure 3. Assessment of Ki67 labeling index (LI) between 5% and 25% by the counting system. The ICC was 0.68 (95% CI 0.50–0.81), which was similar to that with an unrestricted range of Ki67 LI from study 1.

Study 3

Copies of the 15 printed photographs of Ki67-stained breast cancer tissues (Fig. 1) were sent to each pathologist at the same time. The assessment of Ki67 LI using the counting system in the same photographs yielded an almost perfect concordance among the six pathologists (ICC, 0.94 [95% CI 0.88–0.97]), while the scoring system showed a substantial concordance (ICC, 0.82 [95% CI 0.66–0.91]) (Fig. 4).

Figure 4. Assessment of Ki67 labeling index (LI) in 15 printed photographs. (a) Assessment by the counting system among six pathologists. The ICC was 0.94 (95% CI 0.88–0.97). (b) Assessment by the scoring system (visual estimate). The ICC was 0.82 (95% CI 0.66–0.91).

Discussion

Ki67 LI is reported in a number of studies to have prognostic value for breast cancer patients.[3, 19] However, it has not been accepted as a routine biomarker, mainly because the assay system has not been standardized.[20] The LI is also considered possibly predictive of the effects of chemotherapy, although different studies have yielded contrasting results.[4-11] Standardization of the Ki67 assessment system is therefore crucial for evaluating the clinical utility of the marker and for its clinical application. In the present study, we evaluated the interobserver concordance of the Ki67 LI assessment. The concordance was higher with the counting system than with the scoring system (visual estimate), and it was high when the same field was assessed in printed photographs. These results indicate that counting cells is useful for reproducible assessment and that identification of the fields to be assessed is pivotal for the standardized assessment of Ki67 LI in breast cancer. Standardization of the selection of the assessment area is, therefore, crucial for evaluating the clinical usefulness of Ki67 LI in breast cancer tissues.

The number of the cells to count has not been established when obtaining the Ki67 LI. In the majority of studies, 1000–2000 tumor cells were counted,[2, 7, 21-27] and, therefore, 1000 cells were counted in the present study. The International Ki67 in Breast Cancer Working Group recommended counting at least 1000 cells but they also accepted counting 500 cells, as the absolute minimum.[28] However, to the best of our knowledge, studies evaluating the association between the number of counted cells and reproducibility have not been published and further study is required to determine the optimum cell number count to obtain the Ki67 LI.
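One way to get a rough sense of how the number of counted cells could affect precision is the binomial standard error of a proportion. The sketch below is a simplified illustration, not an analysis from the present study. It assumes that every counted cell is an independent draw with the same probability of being positive and uses the normal approximation, so it ignores hot-spot selection and intratumoral heterogeneity and describes only the counting noise. The function name is hypothetical.

```python
import math


def li_sampling_half_width(li_percent: float, cells_counted: int, z: float = 1.96) -> float:
    """Approximate 95% half-width (percentage points) of an LI from counting alone.

    Assumes independent cells with a common positive probability (binomial
    model) and the normal approximation to the binomial distribution.
    """
    if not 0 <= li_percent <= 100 or cells_counted <= 0:
        raise ValueError("need 0 <= li_percent <= 100 and cells_counted > 0")
    p = li_percent / 100.0
    standard_error = math.sqrt(p * (1.0 - p) / cells_counted)
    return 100.0 * z * standard_error
```

Under these assumptions, a true LI of 15% would be estimated to within roughly ±2.2 percentage points when 1000 cells are counted and roughly ±3.1 percentage points when 500 cells are counted. Whether such differences matter in practice relative to the variation from field selection is the kind of question the further study mentioned above would need to address.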

The field to be assessed has also been controversial in obtaining the Ki67 LI. We assessed hot spots where Ki67 immunoreactivity in cancer nuclei was relatively dense, whereas an approach that assesses the whole section and records the overall average score was recommended by the Working Group.[28] The Working Group also recommended that hot spots be included in the overall score even when the average score is chosen,[28] and, therefore, the selection of hot spots is considered indispensable for the Ki67 LI assessment of breast cancer patients in a clinical setting. In addition, the results of the present study demonstrated that identification of the assessment fields is pivotal for the standardized assessment of Ki67 LI and it is possible that this could be expanded to the overall average score because the selection of the assessment area is also considered critical for the overall average score. This should be evaluated in further studies based on the results of the present study.

There are several possible factors that could lead to variability in the selection of hot spots. One possible factor is the presence of lymphocytes or other stromal cells, which could interfere with the estimation of the density of the immunopositive carcinoma cells and result in the selection of inappropriate fields as hot spots. A second factor is the difference in carcinoma cell density from site to site in the same specimen, which could result in inappropriate estimation of the rate of positive cells in a particular field. A third factor is cytoplasmic or membrane immunoreactivity of Ki67, which should by no means be counted as positive but could influence the estimation of the density of positive cells. A fourth factor is the relative immunointensity, which could affect the assessment of immunopositivity. Finally, the magnification used would affect the selection of the fields.

In the present study, we also attempted to assess the variability in relative immunointensity regarded as positive by using printed photos (Fig. 1). Considering the small variations of LI among the pathologists involved in this particular study, the variation of the threshold of immunointensity interpreted as positive is considered small in these printed photos. This should be further assessed with stained slides because printed photos may provide better contrast and clearer distinction between positive and negative.

With regard to the validation of an assay system, a number of issues other than the areas to be selected need to be considered, as specified by the Working Group, such as preanalytical and analytical validity, interpretation, scoring and data analysis.[28] Tissue microarrays (TMA) are being used more frequently in various studies, especially for biomarker assessment in large clinical trials. The present study used whole blocks from surgical pathology specimens, possibly providing larger areas for assessment than TMA. However, routine assessment in clinical laboratories is also performed using blocks from surgical specimens or core needle biopsy samples and, thus, it is critical to establish standardized methods to select the assessment fields in these sections.

In conclusion, the counting system yielded better concordance among the pathologists than the scoring system (visual estimate). The results of the present study suggest that appropriate identification of the fields to be assessed could be pivotal for obtaining accurate Ki67 LI of breast cancer tissue. Further study to standardize the selection of the hot spots among pathologists is necessary for the critical evaluation of the clinical value of Ki67 LI in breast cancer tissues.

Disclosure Statement

The authors have no conflict of interest.

References

  1. Early Breast Cancer Trialists' Collaborative Group, Peto R, Davies C et al. Comparisons between different polychemotherapy regimens for early breast cancer: Meta-analyses of long-term outcome among 100,000 women in 123 randomised trials. Lancet 2012; 379: 432–44.
  2. Colleoni M, Viale G, Zahrieh D et al. Chemotherapy is more effective in patients with breast cancer not expressing steroid hormone receptors: A study of preoperative treatment. Clin Cancer Res 2004; 10: 6622–8.
  3. Yerushalmi R, Woods R, Ravdin PM, Hayes MM, Gelmon KA. Ki67 in breast cancer: prognostic and predictive potential. Lancet Oncol 2010; 11: 174–83.
  4. Arriola E, Moreno A, Varela M et al. Predictive value of HER-2 and Topoisomerase IIalpha in response to primary doxorubicin in breast cancer. Eur J Cancer 2006; 42: 2954–60.
  5. Colleoni M, Bagnardi V, Rotmensz N et al. A nomogram based on the expression of Ki-67, steroid hormone receptors status and number of chemotherapy courses to predict pathological complete remission after preoperative chemotherapy for breast cancer. Eur J Cancer 2010; 46: 2216–24.
  6. Penault-Llorca F, Abrial C, Raoelfils I et al. Changes and predictive and prognostic value of the mitotic index, Ki-67, cyclin D1, and cyclo-oxygenase-2 in 710 operable breast cancer patients treated with neoadjuvant chemotherapy. Oncologist 2008; 13: 1235–45.
  7. Jones RL, Salter J, A'Hern R et al. Relationship between oestrogen receptor status and proliferation in predicting response and long-term outcome to neoadjuvant chemotherapy for breast cancer. Breast Cancer Res Treat 2010; 119: 315–23.
  8. von Minckwitz G, Sinn HP, Raab G et al. Clinical response after two cycles compared to HER2, Ki-67, p53, and bcl-2 in independently predicting a pathological complete response after preoperative chemotherapy in patients with operable carcinoma of the breast. Breast Cancer Res 2008; 10: R30.
  9. Hugh J, Hanson J, Cheang MC et al. Breast cancer subtypes and response to docetaxel in node-positive breast cancer: use of an immunohistochemical definition in the BCIRG 001 trial. J Clin Oncol 2009; 27: 1168–76.
  10. Penault-Llorca F, Andre F, Sagan C et al. Ki67 expression and docetaxel efficacy in patients with estrogen receptor-positive breast cancer. J Clin Oncol 2009; 27: 2809–15.
  11. Viale G, Regan MM, Mastropasqua MG et al. Predictive value of tumor Ki-67 expression in two randomized trials of adjuvant chemoendocrine therapy for node-negative breast cancer. J Natl Cancer Inst 2008; 100: 207–12.
  12. Cheang MC, Chia SK, Voduc D et al. Ki67 index, HER2 status, and prognosis of patients with luminal B breast cancer. J Natl Cancer Inst 2009; 101: 736–50.
  13. Goldhirsch A, Wood WC, Coates AS, Gelber RD, Thurlimann B, Senn HJ. Strategies for subtypes–dealing with the diversity of breast cancer: highlights of the St Gallen International Expert Consensus on the Primary Therapy of Early Breast Cancer 2011. Ann Oncol 2011; 22: 1736–47.
  14. Tsuda H, Kurosumi M, Umemura S, Yamamoto S, Kobayashi T, Osamura RY. HER2 testing on core needle biopsy specimens from primary breast cancers: interobserver reproducibility and concordance with surgically resected specimens. BMC Cancer 2010; 10: 534.
  15. Umemura S, Osamura RY, Akiyama F et al. What causes discrepancies in HER2 testing for breast cancer? A Japanese ring study in conjunction with the global standard. Am J Clin Pathol 2008; 130: 883–91.
  16. Fleiss JL, Cohen J. The equivalence of weighted kappa and the intraclass correlation coefficient as measures of reliability. Educ Psychol Meas 1973; 33: 613–9.
  17. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977; 33: 159–74.
  18. Bland JM, Altman DG. Comparing methods of measurement: why plotting difference against standard method is misleading. Lancet 1995; 346: 1085–7.
  19. Sheri A, Dowsett M. Developments in Ki67 and other biomarkers for treatment decision making in breast cancer. Ann Oncol 2012; 23 (Suppl. 10): x219–27.
  20. Harris L, Fritsche H, Mennel R et al. American Society of Clinical Oncology 2007 update of recommendations for the use of tumor markers in breast cancer. J Clin Oncol 2007; 25: 5287–312.
  21. Viale G, Giobbie-Hurder A, Regan MM et al. Prognostic and predictive value of centrally reviewed Ki-67 labeling index in postmenopausal women with endocrine-responsive breast cancer: results from Breast International Group Trial 1-98 comparing adjuvant tamoxifen with letrozole. J Clin Oncol 2008; 26: 5569–75.
  22. Bottini A, Berruti A, Bersiga A et al. Relationship between tumour shrinkage and reduction in Ki67 expression after primary chemotherapy in human breast cancer. Br J Cancer 2001; 85: 1106–12.
  23. Veronese SM, Gambacorta M, Gottardi O, Scanzi F, Ferrari M, Lampertico P. Proliferation index as a prognostic marker in breast cancer. Cancer 1993; 71: 3926–31.
  24. Jones RL, Salter J, A'Hern R et al. The prognostic significance of Ki67 before and after neoadjuvant chemotherapy in breast cancer. Breast Cancer Res Treat 2009; 116: 53–68.
  25. DeCensi A, Guerrieri-Gonzaga A, Gandini S et al. Prognostic significance of Ki-67 labeling index after short-term presurgical tamoxifen in women with ER-positive breast cancer. Ann Oncol 2011; 22: 582–7.
  26. Dowsett M, Smith IE, Ebbs SR et al. Short-term changes in Ki-67 during neoadjuvant treatment of primary breast cancer with anastrozole or tamoxifen alone or combined correlate with recurrence-free survival. Clin Cancer Res 2005; 11: 951s–8s.
  27. Nole F, Minchella I, Colleoni M et al. Primary chemotherapy in operable breast cancer with favorable prognostic factors: a pilot study evaluating the efficacy of a regimen with a low subjective toxic burden containing vinorelbine, 5-fluorouracil and folinic acid (FLN). Ann Oncol 1999; 10: 993–6.
  28. Dowsett M, Nielsen TO, A'Hern R et al. Assessment of Ki67 in breast cancer: recommendations from the International Ki67 in Breast Cancer working group. J Natl Cancer Inst 2011; 103: 1656–64.