Clinical and molecular characteristics of estrogen receptor‐positive ultralow risk breast cancer tumors identified by the 70‐gene signature

Abstract The metastatic potential of estrogen receptor (ER)‐positive breast cancers is heterogeneous and distant recurrences occur months to decades after primary diagnosis. We have previously shown that patients with tumors classified as ultralow risk by the 70‐gene signature have a minimal long‐term risk of fatal breast cancer. Here, we evaluate the previously unexplored underlying clinical and molecular characteristics of ultralow risk tumors in 538 ER‐positive patients from the Stockholm tamoxifen randomized trial (STO‐3). Out of the 98 ultralow risk tumors, 89% were luminal A molecular subtype, whereas 26% of luminal A tumors were of ultralow risk. Compared to other ER‐positive tumors, ultralow risk tumors were significantly (Fisher's test, P < .05) more likely to be of smaller tumor size, lower grade, progesterone receptor (PR)‐positive, human epidermal growth factor receptor 2 (HER2)‐negative and have low levels of the proliferation marker Ki‐67. Moreover, ultralow risk tumors showed significantly lower expression scores of multi‐gene modules associated with the AKT/mTOR‐pathway, proliferation (AURKA), HER2/ERBB2‐signaling, IGF1‐pathway, PTEN‐loss and immune response (IMMUNE1 and IMMUNE2) and higher expression scores of the PIK3CA‐mutation‐associated module. Furthermore, 706 genes were significantly (FDR < 0.001) differentially expressed in ultralow risk tumors, including lower expression of genes involved in immune response, PI3K/Akt/mTOR‐pathway, histones, cell cycle, DNA repair and apoptosis, and higher expression of genes coding for epithelial‐to‐mesenchymal transition and homeobox proteins, among others. In conclusion, ultralow risk tumors, associated with minimal long‐term risk of fatal disease, differ from other ER‐positive tumors, including luminal A molecular subtype tumors. Identification of these characteristics is important to improve our prediction of nonfatal vs fatal breast cancer.

What's new?
The metastatic potential of estrogen receptor (ER)-positive breast cancers is heterogeneous, and distant recurrences may occur months to decades after primary diagnosis. However, the long-term risk of metastatic disease in breast cancer remains largely unexplored. Using a previously-

| INTRODUCTION
Estrogen receptor (ER)-positive breast cancer patients have a steady long-term risk of fatal disease, and distant metastatic recurrence can occur anywhere between a few months to several decades after primary diagnosis. [1][2][3][4][5][6] Mammographic screening enables detection of early breast cancer and has reduced the disease mortality, but can introduce overdiagnosis of tumors that might never have come to clinical attention. 7,8 Adding molecular risk prediction tools to standard clinical breast cancer markers may improve risk assessment and reduce overtreatment in patients with low risk of metastatic disease. 9 However, to confidently offer less aggressive treatment to patients, a better understanding of the long-term risk of metastatic disease in breast cancer is needed.
The 70-gene signature (MammaPrint) was originally designed to identify breast cancer patients with high or low risk of early relapse within 5 years after primary diagnosis, and thereby which patients require adjuvant therapy. 10 The signature was developed in lymph node-negative patients under the age of 55, but has also been shown to be prognostic in lymph node-positive and older patients, and up to 25 years after primary diagnosis. [11][12][13] Genes included in the signature and upregulated in high-risk patient tumor samples have been shown to be associated with the cell cycle, invasion, metastasis and angiogenesis. 10 Clinical trials have validated that ER-positive patients of otherwise high clinical risk, but classified as low risk by the 70-gene signature, may not benefit from adjuvant chemotherapy, 14 and this molecular risk prediction tool is implemented in several breast cancer treatment guidelines. [15][16][17] Moreover, the additional information from the 70-gene signature has been shown to improve clinicians' confidence in their treatment recommendations. [18][19][20] Furthermore, we have demonstrated that the "ultralow risk" threshold derived from the 70-gene signature identifies patients with a very low long-term risk of fatal breast cancer. The 20-year disease-specific survival was 97% and 94% for ER-positive lymph-node negative patients randomized to tamoxifen vs no adjuvant therapy, respectively. 9 Consequently, it is important to understand the underlying characteristics of ultralow risk breast cancer tumors, given that the long-term risk in breast cancer is largely unexplored.
Therefore, here we aimed to assess the clinical and molecular characteristics of ultralow risk tumors in ER-positive lymph-node negative postmenopausal breast cancer patients from the Stockholm tamoxifen randomized trial (STO-3). Ultralow risk tumors were compared to other ER-positive tumors, including PAM50 molecular subtype luminal A and B tumors, regarding the clinically used breast cancer markers, as well as the expression scores of 19 multigene modules representative of specific biological processes and pathways. Furthermore, differentially expressed genes on the single-gene level in ultralow risk tumors vs other ER-positive tumors were identified and categorized by their associated Hallmark gene set to better understand the molecular characteristics of ER-positive tumors with very low long-term risk of fatal disease.

| Patients
The Stockholm breast cancer study group conducted randomized trials from 1976 to 1990 in lymph-node negative postmenopausal patients with tumors ≤30 mm in diameter. 21,22 The Stockholm tamoxifen trial (STO-3) enrolled 1780 patients who were randomized to 2 years of adjuvant tamoxifen (40 mg daily) vs no adjuvant treatment. 21,22 In 1983, patients who re-consented and were recurrence-free after 2 years of tamoxifen treatment were randomized to 3 additional years of tamoxifen. As a result, patients in the tamoxifen randomization arm were treated with tamoxifen for 2 or 5 years. No significant differences in survival in the comparison of 2 vs 5 years of tamoxifen have been observed in the STO-3 trial. 21 Molecular analysis was possible for 808 patients with available formalin-fixed paraffin-embedded (FFPE) tissue blocks from the primary breast cancer tumor. 23 The 808 patient subset was well balanced to the original STO-3 trial cohort with regard to tumor characteristics. 23 Eighty-one patients were excluded from analysis due to insufficient invasive tumor cells, leaving 727 samples available for further analysis (Figure 1). and Ki-67 was scored by breast cancer pathologists. 2 ER- and PR-positivity was defined by a threshold of 10% or greater, according to the Swedish National Guidelines, 24 HER2-positivity as intensity 3+ by IHC, and Ki-67 was categorized as low (<15%) or medium/high (≥15%).

| Tumor grade
Tumor grade was retrospectively assessed by one pathologist according to the Nottingham Histologic Score system (Elston grade). 23

| 70-Gene signature risk classification
The 70-gene signature (MammaPrint) was used to classify primary tumors into "high risk", "low risk" or "ultralow risk." The molecular risk prediction tool was originally designed to identify patients with low or high risk of early relapse, 10-13 and the ultralow risk threshold was added to identify patients with indolent tumors associated with minimal long-term risk of fatal disease. 9 From the primary tumor microarray gene expression data, the 70-gene signature risk classification was performed according to standard protocols as previously described, including the use of 465 normalization genes and over 250 probes for hybridization and printing quality control. 11,25 Patient tumor samples were classified into "high risk" (<0), "low (but not ultralow) risk" (>0, <0.355) and "ultralow risk" (≥0.355) using thresholds previously developed. 9
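The three-way classification described above can be sketched as a simple threshold rule. This is only an illustration of the stated cutoffs, not the signature's actual implementation: the function name `classify_risk` is hypothetical, and because the text does not say how an index of exactly 0 is handled, assigning it to "high risk" here is an assumption.

```python
ULTRALOW_THRESHOLD = 0.355  # ultralow risk cutoff quoted in the text


def classify_risk(index: float) -> str:
    """Map a 70-gene signature index to one of the three risk categories."""
    if index >= ULTRALOW_THRESHOLD:
        return "ultralow risk"
    if index > 0:
        return "low risk"  # low, but not ultralow
    # The text defines high risk as index < 0; treating index == 0 as
    # high risk is an assumption made for this sketch only.
    return "high risk"


if __name__ == "__main__":
    for value in (-0.20, 0.10, 0.355, 0.60):
        print(value, "->", classify_risk(value))
```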

| Multigene modules expression scores
The expression scores of 19 multigene modules, proxy-signatures for activation of biological processes or pathways and associated with clinical outcome, were analyzed in the study. [27][28][29] Each module comprises genes that are positively or negatively associated with its biological process/pathway. Continuous expression scores for each multigene module were calculated for all 652 STO-3 patients with gene expression data using R package Genefu version 2.8.0. 30 The continuous multigene expression scores were categorized into tertiles, which were then collapsed into two categories: the most aggressive tertile (low or high expression scores suggested to be associated with worse prognosis, see details in Supporting Information Materials and Methods) vs the two less aggressive tertiles combined. 27
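The tertile-based dichotomization can be illustrated with a short sketch. It assumes that tertile cut points are estimated from the analyzed scores themselves; the function name `dichotomize_by_tertile` and the flag `high_is_aggressive` are hypothetical, and the study's own module scores were computed in R with Genefu rather than with this code.

```python
from statistics import quantiles


def dichotomize_by_tertile(scores, high_is_aggressive=True):
    """Flag scores that fall in the most aggressive tertile.

    Returns a list of booleans: True for the most aggressive tertile,
    False for the two less aggressive tertiles combined.
    """
    # quantiles(..., n=3) returns the two cut points separating the tertiles.
    lower_cut, upper_cut = quantiles(scores, n=3)
    if high_is_aggressive:
        return [score > upper_cut for score in scores]
    return [score < lower_cut for score in scores]


if __name__ == "__main__":
    example_scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical module scores
    print(dichotomize_by_tertile(example_scores))
    print(dichotomize_by_tertile(example_scores, high_is_aggressive=False))
```

Which direction counts as "aggressive" differs by module, which is why the direction is passed as a parameter in this sketch.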

| Gene set enrichment analysis
Using the 27 Hallmark gene sets, a gene set enrichment analysis (GSEA) was performed to identify which gene sets are significantly enriched for genes differentially expressed in ultralow risk tumors compared to other ER-positive tumors of low or high risk. The GSEA calculates an enrichment score (ES) for each gene set to explore whether genes belonging to the gene set tend to occur at the top (or bottom) of a specific preranked gene list. 33 Genes were ranked using the t-statistics output from the analysis of differentially expressed genes.
A false-discovery rate (FDR) of 5% was used to adjust for multiple testing.
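The idea behind the enrichment score can be sketched with a running sum over the preranked list. Note that this is a simplified, unweighted (Kolmogorov-Smirnov-like) variant chosen for clarity; standard GSEA usually weights each hit by its ranking statistic and assesses significance by permutation, neither of which is shown here. The function name `enrichment_score` and the gene labels are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score for a preranked gene list.

    ranked_genes: unique gene identifiers, ordered from most positive to
                  most negative ranking statistic (e.g. t-statistic).
    gene_set:     identifiers belonging to the gene set of interest.
    """
    members = set(gene_set) & set(ranked_genes)
    n_hits = len(members)
    n_misses = len(ranked_genes) - n_hits
    if n_hits == 0 or n_misses == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")

    running_sum = 0.0
    score = 0.0
    for gene in ranked_genes:
        # Step up on a gene-set member, step down otherwise.
        if gene in members:
            running_sum += 1.0 / n_hits
        else:
            running_sum -= 1.0 / n_misses
        # The ES is the maximum deviation of the running sum from zero.
        if abs(running_sum) > abs(score):
            score = running_sum
    return score


if __name__ == "__main__":
    ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]  # hypothetical ranking
    print(enrichment_score(ranked, {"g1", "g2"}))  # set at the top of the list
    print(enrichment_score(ranked, {"g5", "g6"}))  # set at the bottom of the list
```

A positive score indicates that the gene set clusters toward the top of the ranking, and a negative score that it clusters toward the bottom.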

| Statistical analysis
Given that all ultralow risk tumors were ER-positive, only ER-positive patients were included in our study (n = 538; Figure 1). Fisher's exact test was used to compare ultralow risk tumors to other ER-positive tumors by the clinically used breast cancer markers (ie, tumor size, tumor grade, PR, HER2 and Ki-67) and the 19 multigene modules expression scores. A P-value less than .05 was considered statistically significant. The analysis to identify differentially expressed genes was conducted using R package OCplus 34 with t-statistics and FDR cutoff of 0.001.
All data preparation and analysis were done using R version 3.5.2 and SAS software version 9.4. All statistical tests were two-sided.
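For a single binary marker, the two-sided Fisher's exact test used in the group comparisons can be sketched as follows. This is a minimal pure-Python illustration for a 2 × 2 table, not the study's R or SAS code; the function name `fisher_exact_two_sided` is hypothetical, and the counts in the usage example are invented for illustration and are not STO-3 data.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P-value for the 2 x 2 table [[a, b], [c, d]].

    Rows could be, for example, ultralow risk vs other ER-positive tumors,
    and columns marker-positive vs marker-negative.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    denominator = comb(total, col1)

    def table_probability(x):
        # Hypergeometric probability of observing x in the top-left cell
        # given fixed row and column totals.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    smallest = max(0, col1 - row2)
    largest = min(row1, col1)
    observed = table_probability(a)
    # Sum the probabilities of all tables at least as extreme as the observed
    # one; the small tolerance guards against floating-point ties.
    p_value = sum(
        table_probability(x)
        for x in range(smallest, largest + 1)
        if table_probability(x) <= observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    p = fisher_exact_two_sided(3, 1, 1, 3)
    print(f"P = {p:.4f}; significant at .05: {p < 0.05}")
```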

| RESULTS
A total of 652 patients in the STO-3 trial with 70-gene signature risk classification were available for analysis. The 70-gene signature classified 15% (n = 98) of the tumors as of ultralow risk (Figure 1). Given that all ultralow risk tumors were ER-positive as determined by IHC, the analyses were focused on the 538 patients with ER-positive breast cancer tumors only (Figure 1). In Table 1 (Table S3) tumors as compared to all other ER-positive tumors of low or high risk (Table S4). Of these, 454 genes were expressed at lower levels in ultralow risk tumors and 252 genes at higher expression levels.
To further understand the biological function associated with ultralow risk definition, the 706 differentially expressed genes were categorized into different cancer-related Hallmark gene sets (Table S4). with the PI3K/Akt/mTOR pathway and immune response (Figure 3 and Table S4). Moreover, ultralow risk tumors generally expressed lower levels of genes coding for histones and genes involved in MYC-signaling, reactive oxygen species, cell cycle, DNA repair and apoptosis (Figure 3 and Table S4). Genes coding for homeobox proteins, epithelial structure, KRAS-signaling and epithelial-to-mesenchymal transition (EMT) were expressed at higher levels in ultralow risk tumors. Genes involved in different metabolic processes, protein secretion, estrogen response or P53 pathway showed both higher and lower expression levels in ultralow risk tumors.
Further gene set enrichment analysis (GSEA) showed significant (FDR < 0.05) enrichment of gene sets consistent with previous analyses (Table S5). The gene sets MYC-signaling, cell cycle, DNA-repair, unfolded protein response, PI3K/Akt/mTOR-pathway, immune response and apoptosis were downregulated in ultralow risk tumors, and EMT upregulated. Furthermore, the GSEA also showed the gene sets metabolic processes and P53-pathway to be downregulated in ultralow risk tumors, whereas myogenesis, hedgehog signaling and estrogen response were upregulated.

| DISCUSSION
We have previously shown that the ultralow risk threshold of the 70-gene signature identifies patients at minimal long-term risk of death from breast cancer. 9 To further understand the long-term risk of breast cancer, we here aimed to explore the clinical and molecular characteristics of primary ultralow risk tumors from the STO-3 trial.
Our study shows that ultralow risk tumors are significantly more likely to be of a smaller tumor size, of lower tumor grade, PR-positive, HER2-negative and Ki-67-low, compared to other ER-positive tumors.
Moreover, ultralow risk tumors exhibited substantially different expressions of "hallmark" gene sets as well as other important cancer-related biological processes or pathways.
The risk of distant recurrences in ER-positive breast cancer remains steady decades after primary diagnosis. [1][2][3][4][5][6] Consequently, it is important to identify distinct biological characteristics that predict patients' long-term recurrence risk to improve our understanding and distinguish our management of nonfatal vs fatal breast cancers, something that has proven a great challenge. Here, our findings suggest that ultralow risk tumors have differential tumor characteristics as compared to other ER-positive tumors, including luminal A molecular subtype tumors which are generally considered to be of low risk. often has been associated with better survival and reduced metastatic risk, 28,43 it has also been associated with better pathologic complete response (pCR) in ER-positive/HER2-negative patients. 29  we have confirmed that patient and tumor characteristics were well balanced to the original STO-3 cohort. 23 Further, when dealing with IHC assays, there is often some degree of subjective inaccuracy. However, the clinical IHC markers were recently re-assessed at a single medical laboratory by dedicated breast cancer pathologists. 52 Despite our relatively small study size with 98 ultralow risk tumors, we were able to identify significant and informative differences between the analyzed groups. Another clear strength of our study includes the recent performance and annotation of genome-wide gene expression analyses.

ETHICS STATEMENT
The STO-3 trial was approved by the ethics committee at Karolinska Institutet and participants provided oral consent. The trial was conducted at the Regional Cancer Center Stockholm-Gotland, Sweden, and began in 1976, which was well before trial registration started in Sweden; therefore, information on trial registration number is not available.

DATA AVAILABILITY STATEMENT
The raw RNA microarray data generated in our study is deposited on a secure Swedish server and has been assigned a DOI (https://doi.