Risk stratification of ER‐positive breast cancer patients: A multi‐institutional validation and outcome study of the Rochester Modified Magee algorithm (RoMMa) and prediction of an Oncotype DX® recurrence score <26

Abstract The skyrocketing cost of health‐care demands that we question when to use multigene assay testing in the planning of treatment for breast cancer patients. A previously published algorithmic model gave recommendations for which cases to send out for Oncotype DX® (ODX) testing. This study is a multi‐institutional validation of that algorithmic model, named in this study the Rochester Modified Magee algorithm (RoMMa), in 620 additional estrogen receptor positive breast cancer cases, with outcome data on 310 cases. RoMMa correctly predicted 85% (140/164) and 100% (17/17) of cases to have a low‐ or high‐risk ODX recurrence score, respectively, consistent with the original publication. Applying our own risk stratification criteria, in patients who received appropriate hormonal therapy, only one of the 45 (2%) patients classified as low risk by our original algorithm has had a breast cancer recurrence over 5‐10 years of follow‐up. Eight of 116 (7%) patients classified as low risk by ODX have had a breast cancer recurrence with up to 11 years of follow‐up. In addition, 524 of 537 (98%) cases from our total population (n = 903) with an average modified Magee score ≤18 had an ODX recurrence score <26. Patients with an average modified Magee score ≤18 or >30 may not need to be sent out for ODX testing. By not sending these cases out for ODX testing, the potential cost savings to the health‐care system in 2018 are estimated to have been over $100,000,000.


| INTRODUCTION
The evolving era of precision medicine demands both cost-efficient and cost-effective strategies for diagnostic treatment algorithms. Challenges remain in accurately assessing which strategies are more cost-effective for identifying hormone receptor-positive breast cancer patients who will benefit from systemic chemotherapy.
Over the last decade, molecular approaches, including multigene assays for predicting prognosis and treatment response, have entered into the clinical arena of breast cancer care. [1][2][3][4][5][6][7][8] All of these multigene assays have been shown to have some prognostic and predictive value in certain subgroups of estrogen receptor (ER) positive breast cancer patients. 3,4,9 Oncotype DX® (ODX) is the most widely used of these multigene assays. ODX is an expensive test, and although the test has been suggested to be cost-effective, [10][11][12][13][14][15][16] it may not be the most cost-effective option in certain subsets of breast cancer patients. 17,18 A recent meta-analysis by Wang et al 18 suggests that a majority of published articles supporting the cost-effectiveness of ODX were generally industry-funded and incorporated study designs that can increase the risk of bias. In a majority of these studies supporting the cost-effectiveness of ODX, clinical characteristics commonly used to make chemotherapy decisions (ie, tumor size and grade) were not incorporated into simulation modeling. As such, these "supportive" studies might not reflect actual clinical practice. 18 Several studies have suggested that standard clinical, histopathological, semi-quantitative immunohistochemistry (IHC), and biomarker data can provide information similar to that provided by the ODX recurrence score (ODXRS). [19][20][21][22][23][24][25][26][27][28][29][30][31] The IHC4 score 23 uses semi-quantitative information from the immunohistochemical assessment of ER, PR, HER-2, and Ki-67 (four of the genes measured in the ODX panel) to calculate a risk score using weighting factors and an algorithm.
Recent studies have validated the use of the IHC4 score for identifying patients at low, moderate, or high risk of relapse following current endocrine therapy, and the IHC4+C score (which includes clinical and additional pathologic variables) for identifying patients at low risk who potentially can avoid adjuvant radiotherapy. 32,33 The equations used for the IHC4 and IHC4+C scores are available to the public free of charge. 34 Klein and Dabbs et al 19 published three linear equations (the new Magee equations) using different combinations of standard histopathological variables (Nottingham score [NS], ER, PR, HER-2, Ki-67, and tumor size). These new Magee equations are also available to the public free of charge (https://path.upmc.edu/onlineTools/mageeequations.html), and calculate a recurrence score, which was shown to correlate well with the ODXRS. We published an algorithm 31 based on a modification of the new Magee equations, showing that this algorithm provides similar risk information to the ODX test.
The goal of this study was to further validate our original algorithm and to examine outcome data, using cases from two separate institutions. We also present data that suggest further clinical utility of the average modified Magee equation, given the recent TAILORx findings that certain populations of patients with an ODXRS <26 may not benefit from additional systemic chemotherapy. 35

| Patients and data retrieval
A total of 903 consecutive cases (889 patients) with ER+ invasive breast cancer were included in this study from the University of Rochester and the University of Louisville. Figures 1 and 2 highlight the cases that were used for the validation and outcome evaluations. All 903 cases were used for the evaluation of an ODXRS <26.
FIGURE 1 Number of cases (patients) used for the validation and outcome evaluations
Six hundred and twenty of 903 cases (606/889 patients) were included in the validation study (Figure 1). None of these cases were used in our original 2015 publication. 31 Three hundred and ten of 903 cases (301/889 patients) were included in the outcome analysis (Figures 1 and 2). The outcome analysis included all patients who had at least 5 years of follow-up data, and all patients who had a breast cancer recurrence.
Information on ER, PR, HER-2, Ki-67, and tumor size was extracted from the pathology report. The NS was calculated using the Nottingham modification of the Bloom-Richardson system. 36 Information on age, ethnicity, lymph node status, lymphovascular invasion, hormone therapy, chemotherapy, radiation therapy, recurrence status, and mortality was extracted from the medical record. Some type of hormonal therapy or therapies (Anastrozole, Exemestane, Tamoxifen, Lupron, and/or Letrozole) was known to have been received by 240 patients. Some type of systemic chemotherapy or therapies (Carboplatin, Cyclophosphamide, Docetaxel, Doxorubicin, 5-fluorouracil, Methotrexate, Paclitaxel, and/or Vinorelbine) was known to have been received by 59 patients. Some type of radiation therapy was known to have been received by 158 patients. All tumor H&E slides and IHC were reviewed by at least two board-certified breast pathologists, with manual interpretation of ER, PR, HER-2, and Ki-67, using standard histological criteria for determining modified ER and PR H-scores. 31

| Study design
The algorithmic approach used has been previously described by Turner et al. 31 Briefly, an average modified Magee recurrence score was calculated, and all cases with an average modified Magee recurrence score of ≤12, or with a modified ER and PR H-score ≥150 and a Ki-67 <10%, were considered low risk. All cases with an average modified Magee recurrence score >30 were considered high risk. We compared our current results from the validation study with our results from the 2015 publication. We also examined different average modified Magee recurrence score groups (ie, groups with a score ≤9, groups with a score of 10, groups with a score of 11, etc) and their associated average ODXRS. Finally, we examined the association of breast cancer recurrence outcome data with the average modified Magee recurrence score, ODXRS, and clinicopathological data.
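The risk criteria summarized above can be illustrated with a minimal Python sketch. It is not the published implementation: the function name, argument names, and the order in which the rules are checked are assumptions made for illustration. The sketch also assumes that both the ER and the PR modified H-score must reach 150, and that the histologic criteria cannot be applied when Ki-67 is unavailable. It takes the average modified Magee recurrence score as an input and does not compute it.

```python
from typing import Optional


def classify_romma_original(
    avg_magee: float,
    er_h_score: float,
    pr_h_score: float,
    ki67_percent: Optional[float] = None,
) -> str:
    """Return 'low', 'high', or 'intermediate' (hypothetical helper)."""
    # Low risk: average modified Magee recurrence score <= 12.
    if avg_magee <= 12:
        return "low"
    # High risk: average modified Magee recurrence score > 30.
    if avg_magee > 30:
        return "high"
    # Low risk by histologic criteria: ER and PR H-scores >= 150, Ki-67 < 10%.
    # Assumption: both H-scores must meet the cutoff, and a missing Ki-67
    # means these criteria cannot be evaluated.
    if (
        ki67_percent is not None
        and er_h_score >= 150
        and pr_h_score >= 150
        and ki67_percent < 10
    ):
        return "low"
    # Everything else would be sent out (reflexed) for ODX testing.
    return "intermediate"


# Illustrative inputs only; these are not cases from the study.
print(classify_romma_original(8.5, 280, 250, 5))     # low
print(classify_romma_original(33.0, 120, 10, 40))    # high
print(classify_romma_original(15.4, 200, 100, 15))   # intermediate
```

Cases returned as "intermediate" are the ones for which ODX testing would still be requested under this scheme.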

| RESULTS
A summary of clinicopathological features in the patient population is detailed in Tables 1 and 2.

| Correlation and concordance between original publication and the validation population
Overall, the data in our validation population were remarkably consistent with the data in our original test population from the original publication 31 (Figure 3A,B, Table 3, and Table S1). A Pearson correlation shows a significant correlation (P < 0.0001) between the original test population and the validation population when comparing the percentage of cases with each Magee score in the two groups (Figure S1 and Table S2). A Pearson correlation also shows a significant correlation (P < 0.0001) between the original test population and the validation population when comparing the percentage of cases in both a particular Magee score group and its correlating ODX risk group (Figure S2, Table 3, and Table S3). The chi-squared (χ²) test of independence showed no significant difference between the observed frequencies in the original test population and the validation population (P = 0.351) when evaluating the number of cases with an average modified Magee score <18 or ≥18 (Table S4). The chi-squared (χ²) test of independence also showed no significant difference between the observed frequencies when evaluating cases with an average modified Magee score ≤18 in the original test population and the validation population (P = 0.559) that have an Oncotype DX score <26 or ≥26 (Table S5).

| Cases with an average modified Magee recurrence score ≤18 and >18
Five hundred and twenty-four of 537 (98%) cases with an average modified Magee score ≤18 had an ODXRS <26 (Table 4). One hundred and thirty-three of 366 (36%) cases with an average modified Magee score >18 had an ODXRS ≥26 (Table 4). One hundred and thirty-three of 146 (91%) cases with an ODXRS ≥26 had an average modified Magee score >18 (Table 4). An average modified Magee score ≤18 is highly specific and predictive for an ODXRS <26 (Table 4).
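The proportions above follow from a 2 × 2 table that can be rebuilt from the four counts quoted in this section. The short Python sketch below does that arithmetic, treating "average modified Magee score ≤18" as a positive prediction of "ODXRS <26". The variable names are illustrative, and the sensitivity figure is derived here from the same counts rather than quoted from the text.

```python
# Counts quoted in the text.
magee_le18_total = 537
magee_le18_odx_lt26 = 524
magee_gt18_total = 366
magee_gt18_odx_ge26 = 133

# Remaining cells of the 2 x 2 table, derived by subtraction.
magee_le18_odx_ge26 = magee_le18_total - magee_le18_odx_lt26   # 13
magee_gt18_odx_lt26 = magee_gt18_total - magee_gt18_odx_ge26   # 233

total_cases = magee_le18_total + magee_gt18_total               # 903
odx_lt26_total = magee_le18_odx_lt26 + magee_gt18_odx_lt26      # 757
odx_ge26_total = magee_le18_odx_ge26 + magee_gt18_odx_ge26      # 146

# "Positive" = Magee <= 18 predicting ODXRS < 26.
ppv = magee_le18_odx_lt26 / magee_le18_total          # predictive value
specificity = magee_gt18_odx_ge26 / odx_ge26_total    # ODXRS >= 26 with Magee > 18
sensitivity = magee_le18_odx_lt26 / odx_lt26_total    # ODXRS < 26 with Magee <= 18

print(f"total cases: {total_cases}")                  # total cases: 903
print(f"positive predictive value: {ppv:.1%}")        # 97.6%
print(f"specificity: {specificity:.1%}")              # 91.1%
print(f"sensitivity: {sensitivity:.1%}")              # 69.2%
```

The derived sensitivity is lower than the other two figures because 233 cases with an ODXRS <26 had an average modified Magee score >18.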

| Outcome analysis
Eighteen of 301 (6%) patients in our outcome population have had a breast cancer recurrence (supplemental Table S6). Overall, two of 66 (3%) patients classified as low risk by our original algorithm recurred, and 10 of 156 (6%) patients classified as low risk by ODX recurred (Table 5). Two of the 18 patients who recurred did not receive hormonal therapy, and these two patients were not included in our subsequent analysis of recurrence in "low risk" patients (Figure 2). For that analysis we considered all the low risk patients who recurred, except for the two who did not receive hormonal therapy (Table S6), together with only those low risk patients who did not recur and did not receive chemotherapy (Figure 2). Among patients who received radiation, none of the 28 (0%) classified as low risk by our original algorithm recurred, compared with five of the 65 (8%) classified as low risk by ODX. Among patients who did not receive radiation, one of the 17 (6%) classified as low risk by our original algorithm recurred, compared with three of the 51 (6%) classified as low risk by ODX. Seventeen of 18 (94%) patients who recurred had an average modified Magee score ≥13.5, and 13 of 18 (72%) patients who recurred had an average modified Magee score >18 (Table S6).
Patients who recurred had a significantly higher Ki-67 (P < 0.0001) than patients who did not recur (Table S7). Overall, patients who recurred had a lower PR status (P = 0.12); the difference was statistically significant in patients with positive lymph nodes (P = 0.02) and in patients who did not receive chemotherapy (P = 0.02) (Table S7). Overall, neither NS (P = 0.45) nor ER status (P = 0.54) was significantly different between patients who did and did not recur (Table S7). Patients who recurred had a significantly higher risk (OR = 6.2, P = 0.002) for having LVI (Table S7).
All the patients with a known Ki-67 who recurred (n = 13, Table S6) had some combination of a lowered modified PR H-score (≤ 210), LN involvement, LVI, or a higher Ki-67 (≥20). Twelve of the 18 patients who recurred in our population (67%) had an ODXRS of <26 (Table 5 and Table S6).
Eleven of these 12 patients (92%) were 50 years or older at the time of diagnosis (Table S6).

| Cost analysis
The list price reported in the Genomic Health 2017 Annual Report 37 for the invasive breast carcinoma ODX test was $4,620. Using our previously published 31 algorithmic approach (Figure 4) in our total population of cases between 2007 and 2018 (n = 903), 20.8% (n = 188) satisfied low risk algorithmic criteria, and 2.8% (n = 25) satisfied high risk algorithmic criteria. These 213 cases would potentially not have been sent out by our institutions for ODX testing, creating a combined potential cost savings of $984,060 for the University of Rochester and the University of Louisville.

| DISCUSSION
Challenges remain in accurately identifying which strategies are more cost-effective and more cost-efficient in identifying this unique subset of luminal A breast carcinoma patients who will also benefit from systemic chemotherapy. While molecular approaches, including multigene assays, have been shown to have some prognostic and predictive value in certain subgroups of ER+ breast cancer patients, 3,4,9 standard clinical practice has paid little attention to the suitability of breast tissue for molecular analysis, and current research suggests that alterations in the molecular integrity of breast tissue during the pre-analytic stage may result in inaccurate results and potentially substandard patient care. 45 With so many new molecular assays available, on a case-by-case basis, there remains significant uncertainty on the part of many clinicians on how to best utilize this new molecular information, what incremental value these tests provide, or how best to integrate these assay results with the available clinicopathologic features of the patient's tumor. 17,18 ODX in particular is of considerable interest.
Interestingly, four of the 16 cancer-related genes measured in the ODX panel are ER, PR, HER-2, and Ki-67. 46 Several studies have shown that linear regression equations which incorporate histopathological data, ER, PR, HER-2, and Ki-67, can also provide information that can be used to predict the ODXRS with a high degree of accuracy. 19,30,31 We previously published data supporting the use of linear regression equations to risk stratify patients into low and high risks of recurrence, 31,47,48 introducing an algorithmic approach using the modified Magee equation. 31 Hou et al have recently published data supporting our original study conclusions. 49,50 We now have additional data validating our original algorithm, which we are now calling the Rochester Modified Magee algorithm (RoMMa, Figure 4). Our validation study supports that our algorithmic approach is a cost-effective, cost-efficient alternative to ODX in risk stratifying certain breast cancer patients. Consistent with our previously published data, in the current validation population, all the patients with an average modified Magee recurrence score >30 (n = 17), or an average modified Magee recurrence score <9 with an available Ki-67 (n = 4), were correctly predicted to have a high or low ODXRS, respectively (Table 3). Our current results on a separate population of patients from two different institutions give further validation that our algorithm can be used by the clinician when considering which cases need not be sent out for ODX testing.
In our previous study 31 there was no 'two-step' discordance (a discordant high and low recurrence score using the ODX risk stratification criteria between a patient's average modified Magee recurrence score and ODXRS). There was a single case in our validation study with 'two-step' discordance (Table S1). This case was high grade (NS = 8), with a Ki-67 of 15%, and an average modified Magee score of 15.4. The ODXRS was 31. Using the RoMMa algorithmic risk stratification criteria (not the ODX risk stratification criteria), this case would have been RoMMa intermediate (and would have reflexed for ODX testing).
We examined the available outcome data of 301 patients from the original study (Table 2). In the original publication, 31 11 patients with available follow-up had an average modified Magee recurrence score ≤9 (with or without a Ki-67). Nine of these 11 patients had a low ODX score. The other two patients had an intermediate ODX score of 19 and 20. None of these 11 patients have had a breast cancer recurrence. In the current validation population all the patients with an average modified Magee recurrence score ≤9 (n = 12) were correctly predicted to have a low ODXRS. As such, the original algorithm 31 is slightly modified in that we consider patients who have an average modified Magee recurrence score ≤9 to be lowest risk (as opposed to just patients with an average modified Magee recurrence score < 9).
Bhargava et al found that 141 of 144 (98%) cases with new Magee equation scores <18, or new Magee equation scores 18-25 and mitosis score of 1, had an ODXRS <26. 51 Our data using the average modified Magee equation in a larger population support these findings. In our population, 524 of 537 (98%) cases with an average modified Magee score ≤18 had an ODXRS <26 (Table 4). Based on this finding, we additionally modified the original algorithm to include consideration for an average modified Magee score ≤18 as "low risk" given the recent TAILORx findings. 35 Sixty-seven percent (12/18) of patients who recurred in our population had an ODXRS of <26 (Table 5 and Table S6). Eleven of these 12 patients (92%) were 50 years of age or older at the time of diagnosis. Three of these 11 patients did not receive any hormonal treatment, and only two of these 11 patients received any adjuvant systemic chemotherapy. The recent TAILORx results suggest that women with early breast cancer who are older than 50 years with an ODXRS <26 can be spared adjuvant chemotherapy. 35 Our findings suggest that risk stratification with consideration for systemic chemotherapy is still important in these patients.
Our outcome data support that patients designated as low risk by our original algorithm (average modified Magee score of ≤12 or a combination of NS <6, ER/PR ≥150, and Ki-67 <10%) are unlikely to recur. Considering the population of low risk patients who recurred after receiving appropriate hormonal therapy and the population of low risk patients who did not recur and did not receive chemotherapy, one of the 45 (2%) patients classified as low risk by our original algorithm recurred, compared to eight of 116 (7%) patients classified as low risk by ODX who recurred (Figure 2).
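The pooled recurrence proportions quoted above can be rebuilt from the radiation subgroups reported in the outcome analysis (0 of 28 and 1 of 17 for the original algorithm; 5 of 65 and 3 of 51 for ODX). The following Python sketch only checks that arithmetic; the dictionary layout and the helper name `pooled` are illustrative choices, and no statistical comparison between the two groups is attempted.

```python
from fractions import Fraction

# (recurrences, patients) by radiation status, as reported in the outcome analysis.
romma_low = {"radiation": (0, 28), "no_radiation": (1, 17)}
odx_low = {"radiation": (5, 65), "no_radiation": (3, 51)}


def pooled(groups):
    """Sum events and patients across subgroups (hypothetical helper)."""
    events = sum(e for e, _ in groups.values())
    patients = sum(n for _, n in groups.values())
    return events, patients, Fraction(events, patients)


for label, groups in (("original algorithm", romma_low), ("ODX", odx_low)):
    events, patients, rate = pooled(groups)
    print(f"{label}: {events}/{patients} = {float(rate):.1%}")
# original algorithm: 1/45 = 2.2%
# ODX: 8/116 = 6.9%
```

To the nearest whole percent these are the 2% and 7% figures given in the text.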
Ninety four percent (17/18) of the patients who recurred had an average modified Magee score ≥13.5 (Table S6). Patients who recurred were more likely to have a lower PR, a higher Ki-67, LN involvement, and LVI than were patients who did not recur (Table S7). Interestingly, neither the average NS nor ER status was significantly different in the recurrence and non-recurrence populations (Table S7). Thirteen of the 18 patients who recurred had some combination of a lowered PR, a higher Ki-67, LN involvement, or LVI (Table S6). In the five other patients who recurred (patient #1, #5, #8, #9, and #11), either the Ki-67 and/or the LVI status was unknown. Our results support that the average modified Magee score, PR, Ki-67, LN, and LVI status may be helpful in predicting patients with a higher risk of recurrence, and should be considered when risk stratifying breast cancer patients for systemic chemotherapy.
ODX testing was rapidly adopted into clinical care of breast cancer patients in 2004 without any randomized trials, based on small cohort studies which suggested that the ODXRS influenced patient preference and oncologist recommendations for chemotherapy. 52 Although a number of studies have suggested that ODX is cost-effective, with the cost of the test being offset primarily by reduction in the use of adjuvant chemotherapy, [10][11][12][13][14][15][16]52 these "supportive" studies have generally been industry funded, not based on real-world population data, and may not reflect actual clinical practice. 18,52 A recent study based on real-world population-based data by Mittmann et al suggests that using the ODXRS to determine whether adjuvant chemotherapy should be added to endocrine therapy in ER+ lymph node-negative breast cancer patients was approximately $3,000 more expensive per patient than not using the test. 52 Our results suggest a potential estimated cost savings to the health-care system in 2018 of over $100,000,000 if ODX testing were avoided in certain low- and high-risk RoMMa patients. If all cases with an average modified Magee score of ≤18 were also considered, the cost savings would undoubtedly be substantially higher.
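The institutional savings reported in the cost analysis follow directly from the list price and the number of cases meeting RoMMa low- or high-risk criteria. The Python sketch below reproduces that figure. The last quantity, the number of avoided tests at which list-price savings would first exceed $100,000,000, is derived here purely as an illustration and is not a number reported in the study; the inputs actually used for the national estimate are not restated in this section.

```python
import math

LIST_PRICE_USD = 4620      # ODX list price cited in the cost analysis
TOTAL_CASES = 903          # total study population
ROMMA_LOW_RISK = 188       # cases meeting low risk algorithmic criteria
ROMMA_HIGH_RISK = 25       # cases meeting high risk algorithmic criteria

avoidable_cases = ROMMA_LOW_RISK + ROMMA_HIGH_RISK            # 213
institutional_savings = avoidable_cases * LIST_PRICE_USD      # 984060
avoidable_fraction = avoidable_cases / TOTAL_CASES            # about 0.236

# Illustrative only: smallest number of avoided tests whose list-price
# cost reaches at least $100,000,000.
min_tests_for_100m = math.ceil(100_000_000 / LIST_PRICE_USD)  # 21646

print(f"avoidable cases: {avoidable_cases} ({avoidable_fraction:.1%})")
print(f"institutional savings: ${institutional_savings:,}")
print(f"tests needed to exceed $100M at list price: {min_tests_for_100m:,}")
```

This calculation uses list price only and assumes that every case meeting the criteria would forgo testing.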
The reproducibility and accuracy of histological grading and immunohistochemical reporting create a concern for interpretation bias. The literature supports the reproducibility of ER, PR, HER-2, 53-57 and Ki-67, 58 if standardized criteria are used for their interpretation. We used standard criteria for the evaluation of ER, PR, HER-2, and Ki-67, and there was similar agreement in our study between the two institutions in the interpretation of the average modified Magee score relative to the ODX score (Table S1).
All of the patients without recurrence in our population had at least 5 years of follow-up; however, only 16 of 301 (5%) patients had 10 or more years of follow-up. Although 5-year follow-up data are acceptable for evaluation of outcome in the published literature, 10 years or more of follow-up is preferable. We continue to maintain our database of patients, with the goal of additional outcome evaluation that has 10 or more years of follow-up. Access to larger database populations, such as ECOG, NSABP B-14, NSABP B-20, and SEER-Medicare, would be helpful in providing additional longitudinal data with 10 or more years of follow-up, which we believe will further validate our findings.
It is likely that variations in adherence to hormonal, systemic, and radiation therapies occurred within our population. A prospective study design or access to data obtained in a prospective fashion would help to eliminate this bias.
The 8th edition of the American Joint Committee on Cancer (AJCC) Cancer Staging manual 59 determined it was appropriate to incorporate the ODXRS into staging for the subgroup of invasive breast carcinoma patients defined by Arm A of the TAILORx study, 60 which includes ER+, HER-2 negative, LN-negative tumors that are 1.1-5.0 cm in size (or 0.6-1.0 cm with intermediate or high histologic or nuclear grade), and have an ODXRS <11. According to the AJCC recommendations, these patients should be placed into the same prognostic category as patients with pT1-N0 M0 (stage IA) breast cancers (AJCC Prognostic Stage Group I). Our results suggest that the likelihood of breast cancer recurrence with an average modified Magee score ≤12 is comparable with an ODXRS <11 (Table 5). One of the 49 (2%) patients with an ODXRS <11 recurred (Table 5 and Table S6). One of the 55 (2%) patients with an average modified Magee score ≤12 recurred, and this patient did not receive hormonal therapy (Table 5 and Table S6). However, neither RoMMa nor the average modified Magee score can reliably predict an ODXRS <11. It is worth noting, however, that ODX testing is not a requirement for staging, and Breaux et al 61 have shown that an ODX score <11 changes the stage in only rare cases.
The large cost savings in our study is due to the assumption that none of the patients meeting low or high risk RoMMa criteria would receive ODX testing. It is likely that a number of these patients would still receive ODX testing based on many factors, not the least of these being patient and clinician concerns about not using a more accepted (and publicized) test; however, our study highlights the importance of considering other valid and less costly methods for assessing the risk of breast cancer recurrence. Additional studies testing the cost-efficiency and cost-effectiveness of integrating the RoMMa into clinical practice are necessary.

| CONCLUSIONS
Our validation results continue to support that patients who satisfy our algorithmic low-risk or high-risk criteria are likely to have a low-risk or high-risk ODX, respectively.
Specifically, patients with an average modified Magee score of ≤9 are highly likely to be low risk by ODX criteria. Patients with an average modified Magee score ≤12, or a combination of an average modified Magee score ≤18 with a NS <6, ER/PR ≥150, and Ki-67 <10% (RoMMa low risk histologic criteria), are also very likely to be lower risk by ODX criteria. Patients with an average modified Magee score of >30 are highly likely to be high risk by ODX criteria. Consideration should be given to not sending out tissue for ODX testing in patients meeting low- or high-risk RoMMa criteria, if treatment decisions will be made based on a low- or high-risk ODXRS.
Patients with an average modified Magee score of ≤18 are highly likely to have an ODXRS <26. Given the recent TAILORx findings, 35 we strongly recommend that patients with an average modified Magee score of ≤18 who meet low risk RoMMa histologic criteria not be sent out for ODX testing if the clinician is not considering chemotherapy for an ODXRS <26.
Our results also suggest that the likelihood of breast cancer recurrence in patients who satisfy RoMMa low-risk criteria is comparable with outcomes in patients with an ODXRS <11, and better than an ODXRS <18.
We suggest a "stepwise" approach when risk stratifying breast cancer patients. One approach might be to use information from the RoMMa or similarly less expensive validated models to help identify cases with already available clinical and pathological metrics that will likely elicit information that is similar to ODX. In these cases ODX may not provide any additional significant clinical utility, and would likely not be cost-effective or cost-efficient. ODX testing could then be limited to cases where the assay results would potentially provide clinical utility beyond the available clinical and pathologic metrics. The potential cost savings to the healthcare system would be significant. Support for this "stepwise" approach will require additional validation of the RoMMa in multiple patient cohorts with outcome data to help ensure that the information obtained is indeed generalizable to the broader breast cancer population.