The Biedenkopf Expert Panel Members: Fatima Cardoso, Breast Cancer Unit, Champalimaud Cancer Center, Lisbon, Portugal; Manfred Dietel, Institute of Pathology, Humboldt University, Berlin, Germany; Lutz Edler, Department of Biostatistics, German Cancer Research Center, Heidelberg, Germany; Meinhard Hahn, Department of Molecular Genetics, German Cancer Research Center, Heidelberg, Germany; Walter Jonat, Department of Gynecology and Obstetrics, University of Kiel, Kiel, Germany; Thomas Karn, Department of Obstetrics and Gynecology, Breast Unit, Goethe University, Frankfurt, Germany; Hans Kreipe, Institute for Pathology, Hannover Medical School, Hannover, Germany; Sherene Loi, Department of Translational Research, Jules Bordet Institute, Brussels, Belgium; Gunter von Minckwitz, German Breast Group, Neu-Isenburg, Germany; Achim Rody, Department of Obstetrics and Gynecology, Breast Unit, Goethe University, Frankfurt, Germany; Hans Peter Sinn, Department of Pathology, University of Heidelberg, Heidelberg, Germany; Marc Van de Vijver, Department of Pathology, Academic Medical Center, Amsterdam, Netherlands.
Prognostic and predictive markers are needed for breast cancer to guide the selection of the most appropriate therapy for individual patients. Retrospective studies on many markers have been performed, but almost none were validated in prospective therapeutic trials or in prospectively powered marker validation studies in an appropriately selected patient population. Consequently, the ASCO Tumor Marker Guidelines in 2007 deemed only the uPA/PAI-1 immunoassay and the 21-gene Recurrence Score PCR assay (Genomic Health Inc.) appropriate to consider for clinical use in assisting risk assessment in node-negative breast cancer patients, in addition to estrogen receptor (ER) and human epidermal growth factor receptor-2 (HER2) expression as predictive markers of endocrine and HER2-targeted therapies, respectively.1 The 2009 St. Gallen Consensus panel also endorsed the routine use of Ki-67 expression in addition to ER and HER2, and acknowledged the potential value of validated multigene-profiling assays in selected patients (ie, equivocal risk by clinical pathological variables); however, uPA/PAI-1 was not considered a clinically acceptable prognostic marker, and what constitutes appropriate validation was not discussed in the guideline.2
Based on clinical and molecular evidence from recent years, there is now a general consensus that breast cancer is a disease of different subtypes. The 4 main molecular subtypes can be distinguished reasonably accurately based on hormone receptor status, HER2 expression, and proliferative activity/histological grade. Genomic profiling techniques have led to several prognostic and predictive gene signatures of breast cancer that may further refine outcome prediction, particularly in clinically equivocal situations.3 The most extensively studied multigene assays include the 70-gene prognostic signature (MammaPrint; Agendia Inc., Amsterdam, Netherlands),4, 5 the 21-gene Recurrence Score (OncotypeDX; Genomic Health Inc., Santa Clara CA),6 and the 97-gene Genomic Grade Index (GGI; Ipsogen, Marseille, France).7 Numerous other prognostic or predictive signatures have been reported for breast cancer in general or for ER-positive cancers in particular, but these have not been so extensively characterized.8-15 Different molecular subtypes differ in chemotherapy sensitivity, as is apparent from several neoadjuvant chemotherapy studies.16, 17 Importantly, all current prognostic gene signatures are strongly correlated with proliferative activity of the tumor, and their clinical prognostic value is mainly based on detecting highly proliferating ER-positive tumors.18-22 Furthermore, the value of some emerging predictive markers such as TOP2A might also be blurred by their correlation with the proliferative status.23, 24
Routine pathological evaluation of tumors is done on a decentralized basis, whereas genomic assays are generally performed in a centralized manner. This complicates the comparison of the predictive performance of the new molecular techniques with routine clinical-pathological variable-based predictions. Unfortunately, despite the recent development of guidelines for tissue banking,25-27 there is still a considerable lack of uniformly collected, clinically well-annotated, and large-enough sample sets obtained in the context of prospective clinical trials that could be used for validation of biomarkers. Thus, there remains considerable uncertainty on the use of the new molecular markers in routine clinical decision making.28
In September 2009, an international panel of representatives of a number of breast cancer research groups was convened in Biedenkopf, Germany (see Table 1). The panel members (12 representatives from 3 European countries and 1 representative from the United States) comprised experts in the areas of breast pathology, genomic profiling in breast cancer, and breast cancer clinical trials and represented medical oncologists, breast surgeons, pathologists, and a biostatistician who were selected by the consensus chairs. The meeting focused on molecular markers and genomic expression signatures that were developed in recent years. Their clinical value in decision making in breast cancer therapy and their role in patient selection or stratification for future clinical trials were critically reviewed. Twelve presentations were solicited to provide an overview of current knowledge (Table 1). Instead of a central systematic literature review, the presenting panel members were charged with reviewing all available data from published studies from PubMed, as well as from abstracts published in the proceedings of meetings of the American Society of Clinical Oncology, San Antonio Breast Cancer Symposium, European Conference of Clinical Oncology, European Society of Medical Oncology, and European Breast Cancer Conference. The content of the presentations was discussed, and 5 questions were debated. The goal was to formulate a set of consensus comments on the practical use of molecular markers in breast cancer management and their incorporation into future clinical trials. The recommendations in this article were approved by all panelists.
Table 1. Panel Members and Titles of Presentations at the Meeting
Fatima Cardoso, Jules Bordet Institute, Brussels, Belgium
Molecular diagnostics in clinical trials: the TRANSBIG experience
Hans Peter Sinn, University of Heidelberg, Heidelberg, Germany
Molecular diagnostics in premalignant lesions: who is at risk?
Marc Van de Vijver, Academic Medical Center, Amsterdam, Netherlands
Prognostic signatures: ready for prime time or why not?
The following 5 questions were discussed among the panel:
1. Are currently available genomic markers useful in all breast cancers or only in specific subgroups?
2. Do we need to stratify patients, or conduct separate therapeutic trials and biomarker studies, by molecular subtype or by clinical phenotype?
3. Which tests are ready for routine use to define prognostic risk groups, and which information should be provided routinely by clinical pathology?
4. Do we need to collect tissue from all patients in clinical trials?
5. Are prospectively conducted marker evaluation studies necessary to generate level I evidence?
Are Currently Available Genomic Markers Useful in All Breast Cancers or Only in Specific Subgroups?
The current, first-generation genomic prognostic markers,29, 30 which were developed from combined analysis of all breast cancer subtypes, appear to classify almost all ER-negative patients as high risk and therefore have limited value to risk-stratify this clinical group. However, these molecular markers can subdivide ER-positive breast cancers (with or without endocrine therapy) into lower- and higher-risk groups, and therefore if clinical variables are equivocal, they may provide some clinical value.19, 20, 30-32 The panel recognized that new markers are urgently needed for the ER-negative and HER2-positive breast cancers.
Several recent studies have demonstrated that all currently available genomic prognostic signatures (MammaPrint,4 Recurrence Score,6 Genomic Grade Index,7 and others) identify an overlapping group of highly proliferative ER-positive tumors that have poor prognosis.18, 20-22 It is not yet clear whether a standardized, centralized histopathological grading, particularly if aided by Ki-67 measurements, might also allow this subgroup to be defined. Some recent data suggest that multivariate prognostic models including ER, HER2, and Ki-67, with or without tumor size and nodal status, determined in a central pathology laboratory could yield prognostic information very similar to the 21-gene Recurrence Score assay.33, 34
Do We Need to Stratify Patients or Conduct Separate Therapeutic Trials and Molecular Marker Studies by Molecular Subtype or Clinical Phenotype?
A large body of data from recent years has clearly demonstrated that the different subtypes of breast cancers, defined by gene expression analysis, by immunohistochemistry (IHC) panels, or by routine ER and HER2 assays, differ markedly in their clinical course. Different subtypes of breast cancers have different chemotherapy sensitivities (basal-like/triple-negative ≥ HER2-positive > luminal B > luminal A), have different endocrine sensitivities (luminal A > luminal B), show different annual hazards of recurrence, and have different predilections for metastatic sites. The panel agreed that not accounting for clinical/molecular subtypes during the design and the final analysis of a marker or therapeutic study can introduce substantial bias because strong confounders of clinical outcome are ignored. We recommend stratifying patients in any future clinical trials or marker studies according to phenotype. Such stratification should at least be performed in post hoc analyses, but a prospectively planned design that allows for the larger sample sizes required would be strongly preferred. Small discovery trials and phase 1 studies could be excluded from this suggestion to avoid overloading trial designs in such early studies.
A relatively high concordance (75%-90%) exists between molecular subtypes as defined by genomic methods and IHC phenotype. A simple, routine ER-, PgR-, and HER2-based equivalent of molecular classification already exists in clinical practice and is commonly employed during decision making. Immunohistochemistry results for ER, PgR, and HER2 can define triple-receptor-negative breast cancers as a reasonable surrogate for basal-like molecular class and can directly identify HER2-positive cancers. Among the ER-positive cancers, HER2-normal, low-grade cancers correspond closely to the luminal-A molecular class or MammaPrint and Oncotype DX low-risk groups. High-grade ER-positive cancers correspond closely to the luminal-B or MammaPrint and Oncotype DX high-risk groups. Because these routine markers are available on large numbers of archived samples, the clinical characteristics of these IHC-defined subsets are much better characterized than the clinical characteristics of molecular classes defined by gene expression results. Importantly, the ER- and HER2-based subgroups readily conform to current therapeutic approaches to breast cancer and therefore can be readily incorporated into clinical trials as patient stratification or even eligibility tools.
One of the biggest underlying problems is the question of which subgroup is likely to benefit from a new therapy. A biological rationale could give certain hints, but in many cases it might be necessary to study several if not all subgroups. Therefore, the panel also felt that new trial designs will need to be considered to compensate for reduced power when preplanned stratification and subset analysis are employed, and subtype-specific clinical trials should be given strong consideration.35-37
Which Tests Are Ready for Routine Use to Define Prognostic Risk Groups, and Which Information Should Be Provided Routinely by Clinical Pathology?
The panel agreed that the therapeutically and scientifically most relevant risk groups are defined by a constellation of markers rather than single markers alone. The following 4 therapeutic and prognostic risk groups are suggested:
1. Triple-negative breast cancer, defined as lack of expression of ER and HER2, determined by IHC and/or fluorescence in situ hybridization (FISH) in the case of HER2
2. HER2-positive breast cancer (either ER positive or negative), defined by HER2 IHC or FISH38
3. ER-positive/HER2-negative breast cancer.
4. The ER-positive/HER2-negative subgroup of breast cancer should be further divided into low-risk/low-proliferation and high-risk/high-proliferation groups. This division should be made by considering histological grade, Ki-67 expression, GGI, MammaPrint, or Recurrence Score. Histological grade III (G3) tumors can be assumed to be high risk, and histological grade I (G1) can be assumed to be low risk; for G2 tumors, an additional test, such as Ki-67, GGI, MammaPrint, or Oncotype DX, may be appropriate to better define prognostic risk. Several other prognostic assays are under development, but their performance characteristics need to be defined more accurately before adopting these for risk stratification.
The recently published ASCO/CAP Guideline on Hormone Receptor Testing in Breast Cancer recommended the routine testing of PgR, even though the precise role of PgR in patient management has not been strongly established.39 In contrast, the panel concluded that the controversial category of breast cancers with a verified ER-negative/PgR-positive status is extremely small. Moreover, because the added value of PgR determination to define the triple-negative group is negligible and Ki-67, as a marker of proliferative activity, appears to provide more important prognostic information in ER-positive cancers, the panel recommends replacing or at least supplementing routine PgR reporting with Ki-67 determination. It seems reasonable that classical methods such as histological grading and Ki-67 determination could even reach the precision of modern genomic methods when performed in a centralized manner.33, 34 The panel noted the lack of standard definitions to assign low or high Ki-67 status; however, most suggested thresholds of positive cells ranged between 13% and 17%.40 It is critically important to standardize these methods in local pathology departments. The panel advocates increasing the number of proficiency-testing ring studies41-43 and supports the development of quantitative approaches to reliably measure IHC staining pattern intensity based on digitized histological slides.
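As a minimal sketch of how the 4 risk groups proposed above could be assigned, the following Python function takes ER status, HER2 status, histological grade, and an optional Ki-67 value. The names `Tumor` and `risk_group`, the group labels, and the 14% Ki-67 cutoff are illustrative assumptions (the cutoff is simply one value inside the 13% to 17% range noted above); the panel did not publish an algorithm, and the sketch uses Ki-67 alone where GGI, MammaPrint, or Recurrence Score could equally serve for G2 tumors.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Tumor:
    """Routine pathology findings for one tumor (hypothetical record layout)."""
    er_positive: bool
    her2_positive: bool                    # by IHC and/or FISH
    grade: int                             # histological grade: 1, 2, or 3
    ki67_percent: Optional[float] = None   # percentage of Ki-67-positive cells


def risk_group(tumor: Tumor, ki67_cutoff: float = 14.0) -> str:
    """Assign one of the risk groups described in the text.

    ki67_cutoff is an assumed value, not a panel recommendation.
    """
    if tumor.grade not in (1, 2, 3):
        raise ValueError("histological grade must be 1, 2, or 3")

    # HER2-positive disease forms its own group, whether ER positive or negative.
    if tumor.her2_positive:
        return "HER2-positive"

    # Lack of ER and HER2 expression defines the triple-negative group here.
    if not tumor.er_positive:
        return "triple-negative"

    # ER-positive/HER2-negative: split by proliferation.
    if tumor.grade == 3:
        return "ER-positive/HER2-negative, high proliferation"
    if tumor.grade == 1:
        return "ER-positive/HER2-negative, low proliferation"

    # G2 tumors need an additional test; this sketch uses Ki-67 only.
    if tumor.ki67_percent is None:
        return "ER-positive/HER2-negative, G2 (additional test needed)"
    if tumor.ki67_percent >= ki67_cutoff:
        return "ER-positive/HER2-negative, high proliferation"
    return "ER-positive/HER2-negative, low proliferation"


if __name__ == "__main__":
    example = Tumor(er_positive=True, her2_positive=False, grade=2, ki67_percent=22.0)
    print(risk_group(example))
```

Passing a different `ki67_cutoff` makes it easy to see how sensitive the G2 assignment is to the choice of threshold, which is the standardization problem the panel highlights.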
To avoid confusion, other names such as “luminal A or B,” “basal-like,” and “HER2-positive molecular class” should be restricted to studies where these molecular classes are determined by appropriate gene expression profiling8, 15, 18 or potentially by future complex immunohistochemistry panels. Because of the lack of a standard molecular classification method, “molecular class” is defined differently in almost every publication. Until uniform methods are developed and it is proven in clinical trials that the identification of molecular class by gene expression profiling leads to more appropriate treatment choice than ER-, PgR-, and HER2-based recommendations, its diagnostic use should be considered investigational and not used in routine practice.
Do We Need to Collect Tissue From All Patients in Clinical Trials?
The panel emphasizes that collection of tissue material should be performed in clinical trials according to recently published guidelines.25-27 In international trials, care should be taken to comply with distinct regulatory requirements in different countries, and harmonization of the regulations could greatly facilitate the conduct of translational research studies.44 Trialists should aim to institute mandatory tissue collection in a time frame close to the trial, similar to the currently widely accepted collection of baseline demographic information.
Are Prospectively Conducted Marker Evaluation Studies Necessary to Generate Level I Evidence?
The panel endorsed the position of a very recent publication on the use of archived specimens in the evaluation of prognostic and predictive biomarkers.45 This proposal argues that appropriately conducted, prospectively designed, and adequately powered marker validation studies can be conducted on archived specimens that could yield level I evidence for the use of biomarkers. However, not all archived tissue repositories are equally informative. Most ad hoc or sequentially collected tissue banks are subject to various known or potential collection biases, and therefore such data may not yield level I evidence about the predictive performance of molecular markers. The highest utility tissue resource includes prospectively and systematically collected specimens in the context of large randomized clinical trials. Such “prospective-retrospective” designs adhering to specific guidelines could be more efficient than prolonged and costly randomized trials to assess the predictive accuracy of proposed novel markers.
However, the ultimate proof of clinical utility, defined as better outcome when a marker is used, will only come from prospective, randomized therapeutic trials that compare marker-based decision-making strategy with alternative strategies. The panel recommends that clinicians actively take part in the currently available trials such as the MINDACT and TAILORx studies.
Conclusions and Future Directions
Table 2 presents a comparison of the aims and recommendations of the panel from the Biedenkopf meeting with those of the St. Gallen consensus conference on the Primary Therapy of Early Breast Cancer 2009.2 The primary goal of the Biedenkopf meeting was to formulate consensus comments about how to incorporate the use of molecular markers in clinical trials. The panel strongly recommends that all patients in future clinical trials should be stratified according to their clinical phenotype or by molecular class. We recognize that there are no standard, commonly accepted methods to assign molecular class and that development of standard methods, particularly those that can be applied to archived specimens, will be critical to better defining the clinical relevance of molecular classification in the future. In contrast, the routinely available markers of ER and HER2 expression together with histological grading and perhaps aided by Ki-67, MammaPrint, GGI, or Recurrence Score measurements allow a simple and convenient classification schema that approximates the gene expression profile-based molecular type reasonably well and is readily applicable for patient stratification or selection for clinical trials.
Table 2. Comparison of Recommendations from the Biedenkopf Panel and St. Gallen Consensus Conference
Column 1 (Biedenkopf): Biedenkopf Symposium on the Incorporation of Molecular Markers into Breast Cancer Therapy
Column 2 (St. Gallen): St. Gallen Consensus Conference on the Primary Therapy of Early Breast Cancer 2009
Primary aim of the meeting:
Biedenkopf: Optimization of strategies for future clinical trials through incorporation of molecular markers.
St. Gallen: Justified recommendations on the use of predictive factors for the guidance of adjuvant systemic therapies.
Main subtypes which should be distinguished:
Biedenkopf: 1. Triple-negative breast cancer (TNBC). 2. HER2-positive breast cancer. 3. ER-positive/HER2-negative breast cancer, further divided into low-proliferation (3a) and high-proliferation (3b) groups.
Use of PgR status:
Biedenkopf: Added value of PgR determination to define the TNBC group is negligible. PgR staining offers no predictive value for endocrine therapy. PgR staining in pathology departments should be replaced by Ki-67 or other proliferation marker determination to stratify ER-positive samples into low- and high-risk groups.
St. Gallen: • ER-negative and PgR-positive results are probably artefactual. • PgR was considered valuable for prognosis but less important for predicting response to treatment. • Higher ER and PgR level as relative indication for endocrine therapy alone. Lower ER and PgR level as relative indication for chemoendocrine therapy.
Use of genomic methods:
Biedenkopf: Stratification of ER-positive/HER2-negative patients into low-risk (low-proliferation) and high-risk (high-proliferation) groups.
St. Gallen: In ER-positive/HER2-negative disease, validated multigene tests, if readily available, could assist in deciding whether to add chemotherapy to endocrine therapy in cases where its use was uncertain after consideration of conventional markers.
Readiness of genomic methods:
Biedenkopf: Recommendation to reserve the use of any of the genomic tests to clinical trials since the results of the tests and their abilities to predict response to current treatments are unknown and hence they currently cannot be used for treatment decision making.
St. Gallen: Support of the use of a validated multigene-profiling assay, if readily available, as an adjunct to high-quality phenotyping of breast cancer in cases in which the indication for adjuvant chemotherapy remained uncertain.
On the one hand, such stratification could lead to reduced power because of smaller sample sizes within each clinical/molecular subset. On the other hand, trials restricted to a specific clinical/molecular subtype may sometimes require smaller sample sizes because of a substantial effect within a given subgroup that would be diluted by inclusion of other subtypes (see Fig. 1). Several statistical designs have been described that consider interactions between treatment effect and molecular subsets.35, 36 A particularly convenient Web-based clinical-trial design tool developed by the US National Cancer Institute Biometric Research Branch is available at http://brb.nci.nih.gov.46 Clearly, the optimal solution would be accurate prospective identification of the specific subgroup in which a new treatment has the highest chance of success. However, this has been a considerable problem for several new therapeutics (eg, antiangiogenics and many tyrosine kinase inhibitors). In this respect, it is important to note that even if the simple and convenient classification scheme of routine markers can be used for stratification in a trial, further detailed studies are still needed in basic research to analyze the molecular differences between the subgroups of breast cancer. Only these studies can provide a rationale for the efficacy of a drug in a specific type of tumor.
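The dilution argument can be made concrete with a small calculation. The sketch below uses the standard normal-approximation formula for comparing two proportions, n per arm = (z_(1-alpha/2) + z_power)^2 × [p1(1-p1) + p2(1-p2)] / (p2 - p1)^2, and is not taken from the panel or from the cited design tools. All numbers are hypothetical: a subtype making up 20% of patients, a 30% control response rate in every subtype, and a treatment that raises the response rate to 50% in that subtype only.

```python
import math
from statistics import NormalDist


def n_per_arm(p_control: float, p_treated: float,
              alpha: float = 0.05, power: float = 0.80) -> int:
    """Patients per arm to detect p_control vs p_treated (two-sided test),
    using the normal-approximation formula for two independent proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return math.ceil((z_alpha + z_power) ** 2 * variance
                     / (p_treated - p_control) ** 2)


# Hypothetical scenario (assumed values, for illustration only).
prevalence = 0.20          # share of the responsive subtype among all patients
p_control = 0.30           # control response rate, assumed equal in all subtypes
p_treated_subtype = 0.50   # response rate with treatment in the responsive subtype

# In an unselected population the benefit is diluted by the unaffected subtypes.
p_treated_all = p_control + prevalence * (p_treated_subtype - p_control)

n_subtype = n_per_arm(p_control, p_treated_subtype)
n_all = n_per_arm(p_control, p_treated_all)

if __name__ == "__main__":
    print(f"subtype-specific trial: {n_subtype} patients per arm")
    print(f"all-comers trial:       {n_all} patients per arm")
```

Under these assumptions the subtype-specific trial needs far fewer randomized patients, although more patients must be screened to find those with the subtype, and the advantage disappears if the responsive subgroup is chosen incorrectly.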
The panel also emphasized that future genomic studies should focus on the discovery of predictive rather than prognostic factors in order to move farther away from the one-size-fits-all concept of therapy.
We thank the independent BANSS Foundation, Biedenkopf, Germany, for financial support and the GBG GmbH, Neu-Isenburg, Germany, for logistical support of the meeting. All members of the panel had significant input into the discussion and formulation of the article.
CONFLICTS OF INTEREST DISCLOSURES
This consensus symposium received financial support from the BANSS Foundation, a nonprofit body based in Biedenkopf an der Lahn, Germany. The founder of the BANSS Foundation, who died from breast cancer, wished to help extend the information resources available to clinicians and investigators in the field of oncology. Both the symposium and the preparation of this article were conducted independently of the diagnostic or pharmaceutical industry. The report was drafted in its entirety by the meeting participants without any paid assistance. The following authors indicated a financial or other interest relevant to the subject matter under consideration in this article: honoraria: H. Kreipe, Roche Pharma, Genomic Health; L. Pusztai, Bristol Myers Squibb Co.; research funding: M. Kaufmann, Siemens Healthcare Diagnostics Products GmbH; L. Pusztai, Bristol Myers Squibb Co.; M. van de Vijver, Hoffmann La Roche; expert testimony: H.-P. Sinn, Genomic Health Inc. (compensated). M. van de Vijver is a member of the Pathology Advisory Board to Hoffmann La Roche and a coinventor of the patent “70 gene prognosis profile in breast cancer.”