The subtype‐specific molecular function of SPDEF in breast cancer and insights into prognostic significance

Abstract Breast cancer (BC) is a molecularly diverse disease and the most common malignancy among women worldwide. Four BC subtypes (Luminal A, Luminal B, HER2‐enriched and Basal‐like) have been robustly established by gene expression pattern‐based characterization, and they differ significantly in incidence, risk factors, prognosis and therapeutic sensitivity. Thus, there is an urgent need for mechanistic research, treatment strategies and prognostic evaluation based on patient stratification by BC subtype. The prostate‐derived ETS factor SPDEF was first identified as an activator of prostate‐specific antigen, and its involvement in many aspects of BC has since been proposed. However, the subtype‐specific molecular function of SPDEF in BC and its prognostic significance have not been clearly elucidated. Using bioinformatics and experimental evidence, this study shows for the first time that SPDEF may play diverse roles in BC with respect to expression level, clinicopathologic importance, biological function and prognostic evaluation, and that these roles depend mainly on BC subtype. In summary, our findings may help to better understand the possible mechanisms of the various BC subtypes and to identify candidate genes for prognostic and therapeutic use.


| INTRODUCTION
Breast cancer is the most common malignancy among women worldwide 1 and is also a molecularly diverse disease, with varied morphologic and biological characteristics and, consequently, varied clinical behaviour and treatment response. To facilitate oncologic decision-making, BC classification systems have been developed to provide an accurate diagnosis of the disease and a prediction of tumour behaviour. Among these, four BC subtypes have been robustly established by gene expression pattern-based characterization. 2 These subtypes, Luminal A, Luminal B, HER2-enriched and Basal-like, differ significantly in incidence, risk factors, prognosis and therapeutic sensitivity. 3,4 Therefore, there is an urgent need for mechanistic research, treatment strategies and prognostic evaluation based on patient stratification by BC subtype.
SPDEF was first identified as an activator of prostate-specific antigen, 5 and its expression is largely restricted to epithelial tissues, including the lung, stomach and colon and hormone-regulated epithelia such as the prostate, breast and ovary. 6 In the cancer literature, the role of SPDEF in BC is controversial and appears to depend on subtype. Several studies have shown that high SPDEF expression promotes luminal BC differentiation and correlates with poor overall survival in ER+ breast cancer patients. 6-9 Furthermore, SPDEF can promote the proliferation, migration and invasion of SK-BR-3 cells through the AR-PDEF pathway 10 or the SPDEF-CEACAM6 oncogenic axis. 11 Together, these observations point to a possible oncogenic function of SPDEF. Conversely, the down-regulation of SPDEF in invasive basal breast cancer cell lines supports a tumour-suppressive role. 12,13 The discrepancy between the findings on SPDEF as an oncogene and those on SPDEF as a tumour suppressor has therefore not been resolved. Further, the potential mechanisms underlying the subtype-specific functions of SPDEF remain largely unknown.
Bioinformatics analysis has been widely applied in cancer research. In the present study, we examined the global expression profiles of SPDEF, as well as its clinicopathologic and prognostic importance, in different BC subtypes using TCGA-BRCA datasets. Moreover, we verified the protein levels of SPDEF by immunohistochemical staining and analysed the relationships between SPDEF protein expression and clinicopathologic features in BC subtypes. These bioinformatics and clinical findings add a new dimension to our knowledge of SPDEF beyond its role as solely an oncogene or solely a tumour suppressor in BC. We then explored the potential functions and signalling pathways of SPDEF in BC subtypes using GO, KEGG and hallmark effector gene set analysis, which suggested potential molecular mechanisms underlying the oncogenic activity of SPDEF in non-TNBC (Luminal and HER2+) and its tumour suppressor activity in TNBC. Lastly, we constructed a prognostic risk model of SPDEF-related prognosis genes for each BC subtype, which showed high prognostic performance in survival surveillance. Overall, this study focuses on the differential expression, potential functions and prognostic value of the SPDEF gene in multiple BC subtypes, using both bioinformatics and experimental evidence.
The workflow of the study design is presented in Figure S1.

| SPDEF expression analysis in TCGA-BRCA dataset
Expression data for SPDEF in non-tumourous breast tissues and in different subtypes of BC tissues were obtained from The Cancer Genome Atlas (TCGA). The SPDEF mRNA levels in different subtypes of BC were evaluated using the edgeR software package. 14

| Validation of cell lines with RT-qPCR
Cell lines were purchased from the Cell Bank of the Chinese Academy of Sciences and cultured in special medium. RNA was extracted with TRIZOL (Takara) and reverse-transcribed into cDNA using the PrimeScript RT reagent Kit (Takara). Quantitative real-time PCR (qPCR) was used to detect the mRNA expression of SPDEF in different subtypes of BC cells. The PCR primer sequences were as follows: 5'-GAGCCACCTGAGGAGCCTGAG-3' (forward) and 5'-CTTGAGCACTTCGCCCACCAC-3' (reverse) for SPDEF; 5'-CCGGAATCCCTATCTTTAGTCC-3' (forward) and 5'-GCCTTTGTTGCTCTTCCAAAAT-3' (reverse) for TBP.

| Immunohistochemical staining
The paraffin-embedded tissues were obtained from the Pathology Department of the Affiliated Hospital of Southwest Medical University. The tissue slides were deparaffinized, rehydrated and stained with the rabbit polyclonal anti-SPDEF antibody (ABclonal, 1:300) overnight at 4 °C. Next, the slides were treated with biotinylated secondary antibody followed by incubation with streptavidin-HRP. Finally, they were stained using DAB and counterstained with haematoxylin. SPDEF staining was scored as the product of the positive-percentage score and the staining-intensity score, giving a total score ranging from 0 to 6. The percentage of SPDEF-positive stained cells was scored as 0 (0%-25%), 1 (25%-50%) and 2 (>50%). The intensity of SPDEF expression was scored as 0 for no staining (−), 1 for weak staining (+), 2 for moderate staining (++) and 3 for strong staining (+++). A total score of ≥4 indicated positive SPDEF expression.
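The scoring rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the authors' own tooling: the function names are hypothetical, and the handling of the exact 25% and 50% boundaries (assigned here to the lower category) is an assumption, since the text does not specify it.

```python
def percentage_score(positive_fraction: float) -> int:
    """Map the fraction of SPDEF-positive cells (0.0-1.0) to a score of 0, 1 or 2."""
    if not 0.0 <= positive_fraction <= 1.0:
        raise ValueError("positive_fraction must lie between 0 and 1")
    if positive_fraction > 0.50:
        return 2
    if positive_fraction > 0.25:  # assumption: exactly 25% falls in the lower bin
        return 1
    return 0


def ihc_total_score(positive_fraction: float, intensity: int) -> int:
    """Total score = percentage score x intensity score (range 0 to 6)."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0 (-), 1 (+), 2 (++) or 3 (+++)")
    return percentage_score(positive_fraction) * intensity


def is_spdef_positive(positive_fraction: float, intensity: int) -> bool:
    """A total score of 4 or more is called positive SPDEF expression."""
    return ihc_total_score(positive_fraction, intensity) >= 4
```

Under this rule, only slides with more than 50% positive cells and at least moderate staining can reach the positivity threshold, because the possible products are 0, 1, 2, 3, 4 and 6.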

| The clinicopathologic and prognostic analysis of SPDEF in BC patients
The association between SPDEF expression and overall survival was assessed by the Kaplan-Meier method. 15 The clinical significance of SPDEF expression was evaluated in combination with the patients' clinical data, and the best-performing threshold was used as the cut-off.

| GO function and KEGG pathway enrichment analysis
Aberrantly expressed genes were filtered using transcription profiles from the TCGA-BRCA database. Pearson correlation coefficients were calculated to identify the SPDEF-related genes among the differentially expressed genes (r > 0.4, P < .05). GO enrichment analysis 16 and KEGG signal transduction pathway enrichment analysis 17 of the SPDEF-related genes were then performed using R software and Bioconductor packages. 18
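As a rough illustration of the correlation filter, the sketch below computes Pearson's r between SPDEF and each candidate gene and keeps those with r > 0.4. It is not the R/Bioconductor workflow used in the study: the function names and the toy expression values are hypothetical, and the accompanying P < .05 test is deliberately left out because it requires a t-distribution routine from a statistics library.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: a constant vector is treated as uncorrelated
    return sxy / sqrt(sxx * syy)


def spdef_related_genes(spdef, expression, r_cutoff=0.4):
    """Return {gene: r} for genes whose correlation with SPDEF exceeds r_cutoff."""
    related = {}
    for gene, values in expression.items():
        r = pearson_r(spdef, values)
        if r > r_cutoff:
            related[gene] = r
    return related


# Hypothetical toy data: five samples, three made-up genes.
spdef_levels = [1, 2, 3, 4, 5]
candidates = {
    "GENE_A": [2, 4, 6, 8, 10],  # rises with SPDEF
    "GENE_B": [5, 4, 3, 2, 1],   # falls as SPDEF rises
    "GENE_C": [3, 1, 3, 1, 3],   # no linear relationship
}
related = spdef_related_genes(spdef_levels, candidates)
```

In this toy example only `GENE_A` passes the filter; negatively correlated genes are excluded because the stated criterion is r > 0.4 rather than |r| > 0.4.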

| Gene set enrichment analysis
Patients with each BC subtype were divided into high- and low-expression groups based on the median expression level of SPDEF in the TCGA-BRCA database. Hallmark effector gene sets associated with high SPDEF expression were annotated by gene set enrichment analysis (GSEA). 19,20 Hallmark effector gene sets were obtained from the Molecular Signatures Database (MSigDB). 21 A P-value <0.05 and a false discovery rate (FDR) <0.25 were used as the cut-off criteria.

| Construction of prognostic risk model of BC patients based on SPDEF-related genes
First, univariate Cox regression analysis was performed to identify significant prognostic genes among the SPDEF-related genes from the TCGA database (P < .05). Then, the least absolute shrinkage and selection operator (LASSO) Cox 22 model was used to identify the most critical SPDEF-related prognostic genes. Finally, a risk score model and a predictive prognostic signature were built by multivariate Cox regression.
According to the median value of the risk score, all patients from TCGA database were divided into the high-risk group and low-risk group to perform the evaluation of Kaplan-Meier (K-M) survival curves.
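The grouping step can be sketched as follows, assuming the risk score takes the usual Cox form of a linear combination of gene expression values weighted by the regression coefficients. The gene names, coefficients and patient values below are hypothetical, and assigning patients whose score equals the median to the low-risk group is an assumption made for the illustration.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Linear predictor: sum of coefficient x expression over the signature genes."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def stratify_by_median(scores):
    """Label each patient 'high' or 'low' risk relative to the median score."""
    cutoff = median(scores.values())
    # assumption: a score exactly equal to the median is labelled 'low'
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical coefficients and expression values, for illustration only.
coefficients = {"GENE_A": 0.5, "GENE_B": -0.25}
patients = {
    "P1": {"GENE_A": 2.0, "GENE_B": 4.0},
    "P2": {"GENE_A": 4.0, "GENE_B": 0.0},
    "P3": {"GENE_A": 0.0, "GENE_B": 4.0},
    "P4": {"GENE_A": 8.0, "GENE_B": 4.0},
}
scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
groups = stratify_by_median(scores)
```

The two resulting groups are what would then be compared with Kaplan-Meier curves; the survival analysis itself is outside the scope of this sketch.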

| Statistical analysis
Gene expression levels in breast cancer and normal breast tissues were statistically compared by Student's t test and Wilcoxon signed rank sum test. Data were analysed with GraphPad Prism 7.0 software and R-4.0.2 software and are presented as mean ± SEM. Differences were considered statistically significant when P < .05.

| The differential expressions of SPDEF in multiple subtypes of BC
We first analysed the mRNA expression of SPDEF between BC subtypes and normal (adjacent) breast tissues using the TCGA database. [...] The relationships between the protein expression of SPDEF and clinicopathologic features in BC subtypes are summarized in Table 1. Overexpression of SPDEF protein was significantly associated with lymphatic metastasis (P = .039) in Luminal A. In Luminal B and HER2+, high SPDEF expression was positively associated with TNM stage (P = .046 in Luminal B, P = .023 in HER2+) and lymph node status (P = .019 in Luminal B, P = .043 in HER2+). However, no significant association was found in TNBC.

| The clinicopathologic and prognostic importance of SPDEF in different BC subtypes
In addition, we compared the transcription levels of SPDEF among groups of patients with different BC subtypes, according to different clinicopathological characteristics (Figure 3A-D).

| The Gene Ontology functions enrichment analysis of SPDEF-related genes in various BC subtyping
To better understand the gene enrichment and functional annotation of SPDEF, we implemented GO enrichment to [...] Figure 4 and Table S1.

| Enrichment analysis identifies the SPDEF-related signalling pathway in multiple BC subtypes
The deeper molecular functions of SPDEF were obtained via

| DISCUSSION
Breast cancer is a clinically and biologically heterogeneous disease; thus, research based on BC subtypes is critical to achieve better clinical outcomes. 23 In the cancer literature, the role in BC of SPDEF, known as the prostate-derived ETS factor, has been widely reported.

YE Et al.
Prior to the present study, we summarized in a literature review the role of SPDEF as a double agent in BC, covering its expression profiles, its regulatory mechanisms in BC progression and its roles in the diagnosis, treatment and prognosis of BC. 24 [...] BC by GO analysis (Figure 4C), which is worth further exploring through experimental evidence. Meanwhile, in relation to our analysis of TNBC (Figure 4D), extracellular matrix organization has been reported to participate in the process by which GREM1 promotes the invasion and metastasis of ER-negative breast cancer. 30 [...] were also shown to be responsible for metastasis and therapy resistance in Luminal B type BC. 32,33 Additionally, a few reports have shown that the MAPK pathways are involved in the metastasis of HER2+ type BC cells, 34 and that mitochondrial oxidative phosphorylation is correlated with the promotion of chemotherapy-resistant BC stem cells. 35 Evidence from a phase 1 trial supported targeting of the PI3K/AKT/mTOR pathway for the treatment of mesenchymal TNBC. 36 Notably, another study addressed the value of ERβ-targeted therapies for the treatment of TNBC patients, 37 which is consistent with the enrichment of the early oestrogen response pathway in TNBC in our results. Taken together, SPDEF may carry out its regulatory functions in these BC subtypes through participation in the above signalling pathways. This needs to be clarified by further research.
Fifth, a prognostic risk model of SPDEF-related prognosis genes in BC subtypes has been constructed for the first time, and it showed high prognostic performance in survival surveillance. The SPDEF-based prognostic index could be an important tool for distinguishing among patients with different BC subtypes on the basis of potentially discrete outcomes (Figure 6A-H). Furthermore, this prognostic index can effectively and accurately stratify patients with different BC subtypes, which is vital for monitoring the survival of subtype-specific patients (Figure 6L). The ROC curves also revealed a high predictive value of the risk model (Figure 6M-P). Notably, using the SPDEF-related prognosis genes to construct the prognostic risk model in different subtypes of breast cancer has two advantages. On the one hand, the influence of confounding factors in the analysis could be avoided, ensuring that the SPDEF-related prognostic genes included were significantly associated with survival outcome. On the other hand, the optimum point of the performance parameters was determined, which improved the discrimination ability of the prognostic risk model.
In summary, our findings provide new insights that can guide a more detailed assessment of BC patients in subsequent clinical trials.
In conclusion, the present study indicates that the specific expression patterns and molecular functions of SPDEF might contribute to the occurrence and development of multiple BC subtypes. Further, high expression of SPDEF was associated with poor OS, and the subtype-specific risk models of SPDEF-related prognosis genes showed high prognostic performance in survival surveillance across BC subtypes. Overall, our findings may help to better understand the possible mechanisms of the various BC subtypes and to identify candidate genes for prognostic and therapeutic use.

CONFLICT OF INTEREST
The authors declare no conflict of interest.

DATA AVAILABILITY STATEMENT
All data utilized in this study are included in this article, and all data supporting the findings of this study are available on reasonable request from the corresponding authors.