Establishment of a five‐enzalutamide‐resistance‐related‐gene‐based classifier for recurrence‐free survival predicting of prostate cancer

Abstract To identify prostate cancer (PCa) patients with a high risk of recurrence is critical before delivering adjuvant treatment. We developed a classifier based on the Enzalutamide treatment resistance‐related genes to assist the currently available staging system in predicting the recurrence‐free survival (RFS) prognosis of PCa patients. We overlapped the DEGs from two datasets to obtain a more convincing Enzalutamide‐resistance‐related‐gene (ERRG) cluster. The five‐ERRG‐based classifier obtained good predictive values in both the training and validation cohorts. The classifier precisely predicted RFS of patients in four cohorts, independent of patient age, pathological tumour stage, Gleason score and PSA levels. The classifier and the clinicopathological factors were combined to construct a nomogram, which had an increased predictive accuracy than that of each variable alone. Besides, we also compared the differences between high‐ and low‐risk subgroups and found their differences were enriched in cancer progression‐related pathways. The five‐ERRG‐based classifier is a practical and reliable predictor, which adds value to the existing staging system for predicting the RFS prognosis of PCa after radical prostatectomy, enabling physicians to make more informed treatment decisions concerning adjuvant therapy.


| BACKG ROU N D
As one of the most malignant tumours in males of the world, prostate cancer (PCa) has approximately 1,00,000 diagnosed cases diagnosed every year. 1 At present, PCa has developed into the second leading cause of cancer-related deaths in males in developed countries. Genetic and demographic factors, such as race, age and family history, are closely related to the occurrence and progression of PCa. 2,3 As our understanding of the underlying biology of PCa has gradually broadened, various treatment strategies have been developed, such as radical prostatectomy, androgen deprivation therapy (ADT), radiation therapy and chemotherapy. However, the prognosis of PCa is still unsatisfactory, and most tumours recur to castration resistance within two years. 4 In 1941, Huggins and Hodges discovered that ADT can be the effective treatment to improve the prognosis of PCa. 5 Since then, ADT has been regarded as the basis of advanced PCa management.
According to the survey, about half of prostate cancer patients will receive ADT for a period of time during treatment in the United States. 5,6 Although tumour regression can be caused by ADT through inhibiting the androgen signalling in most cases, it also will inevitably develop into castration resistance that the tumour cells will lead to the progression of disease for adapting to low androgen levels. [7][8][9] Potent anti-androgens is one of the treatment options for metastatic castration-resistant prostate cancer (mCRPC) through physical competition with the receptor's natural ligand dihydrotestosterone (DHT) to target the AR or inhibiting the biosynthesis of androgen. 10 Currently, enzalutamide (MDV-3100), as a prescribed compound, is most frequently used to treat mCRPC. [11][12][13][14][15] As the direct androgen receptor inhibitor, enzalutamide ultimately eliminates the expression of androgen-responsive genes by affecting the AR signalling pathway at multiple sites, such as preventing ligand binding, inhibiting AR nuclear translocation and blocking DNA transactivation. 16,17 The multi-stage effect of enzalutamide on AR signal transduction is considered to be the main reason why its clinical activity is better than over other direct AR inhibitors, for example, flutamide, bicalutamide and nilutamide. 18 However, the response of patients to the treatment of enzalutamide is different because of the heterogeneity between PCa patients. 19 Currently, it is not easy to obtain RNA profiles from patients who received Enzalutamide treatment and progressed to drug resistance, which could provide resistance-related gene markers, and served as prognostic markers for PCa patients.
Here, we employed microarray analysis to reveal the expression differences between Enzalutamide treatment resistant and parental cells. We also analysed the DEGs of a RNA-sequencing study that focused on pre-and post-Enzalutamide treatment based on patientderived Xenograft (PDX) models. Subsequently, we correlated the differentially expressed genes (DEGs) overlapped from the two cohorts with recurrence-free survival (RFS) of PCa patients and established the RFS predicting classifier using LASSO Cox regression analysis. We validated the classifier in five independent cohorts, as well as our centre samples and obtained satisfying results. Besides, the functional and immunohistochemistry (IHC) assays proved the biological role and clinical significance of these critical candidates.

| Microarray
Total RNA was extracted from the C4-2 and C4-2R cells. Through using the Agilent 2100 bioanalyzer and RNA LabChip kits, the quality and integrity of the RNA were evaluated. In line with the instructions of manufacturer (Shanghai OE Biotech. Co., Ltd.), the microarrays were generated by using 200 ng of total RNA. Microarrays were scanned by the Scanner C slide holder. The Feature Extraction software was used to generate the raw data after generating the microarray scan images. For assembling mapped reads from each sample, StringTie (version 1.3.1) was used that the reference sequence was refined as a guide. The Cuffdiff (version 2.1.1) was used to estimate the FPKM of the genes. 20 Through using the Ballgown package, 21 the expression levels of genes in different groups were compared.
The cut-off criteria was set to the absolute [log 2 (fold change)] > 1.0 and the adjusted p-value <0.05. If the expression of genes met the above rules, it was classified as differentially expressed. (2) available information on clinicopathological features, particularly for RFS status and time.

| Data processing
For TCGA-PRAD cohort, firstly, fragments per kilobase of nonoverlapped exons per million fragments mapped (FPKM)'s number was calculated, following with that the FPKM was transferred into transcripts per kilobase million (TPM) values, which are similar to those resulting from microarrays and more comparable between samples, and then transferred to log 2 (TPM + 1) value for downstream analysis. For the MSKCC cohort, the normalized log 2 mRNA expression was download from http://cbio.mskcc.org/cance rgeno mics/prost ate/data/. The gene matrix data of GSE70380, GSE116918, GSE70769 and GSE46602 were directly downloaded from the GEO database, and the expression of all genes was prelog 2 -transferred. We also collected the preprocessed TPM data from our own AHMU-PC cohort for the external validation, the obtain details and clinical information provided in our published work. 22

| Access to differentially expressed genes
For our RNA sequence count data, the 'DESeq2' was used to iden-

| Construction of a prognostic model with the Least Absolute Shrinkage and Selection Operator Cox regression model
In this research, the DEGs from the Venn diagram were regarded as candidates. The prognostic classifier was established by using LASSO Cox regression analysis based on the 'glmnet' package. The shrinking and selecting variables were successfully achieved by the Cox regression model with the LASSO penalty. According to each prognostic genes' relative expression and its correlated coefficient, the risk score of each patient was calculated. Based on individual normalized gene expression multiplied by the LASSO Cox coefficient, a risk score formula was established, as showed below: In each cohort, the cut-off value was set to the median risk score, and these patients with risk scores higher than the median were assigned to high-risk subgroup, while the others were assigned to low-risk subgroup. Through using the 'pROC' package, the area under curve (AUC) value of the receiver operating characteristic (ROC) curve was used to test the accuracy of this prognostic model. Furthermore, the expression differences of these candidates between different clinicopathological subgroups, for example, Gleason scores (>7 vs. <=7) and pathological tumour grade (T3 + T4 vs. T1 + T2) were compared.
The survival differences between low-and high-risk subgroups were analysed by Kaplan-Meier (K-M) survival analysis through the use of 'survminer' package and two-way log-rank tests. To further verify the clinical usage of this classifier, we tested it in four external databases (including MSKCC, GSE116918, GSE70769 and GSE46602).

| Gene set enrichment analysis
According to the median risk score calculated by the classifier, the patients from the TCGA-PRAD cohort were divided into low-and high-risk subgroups. We employed the GSEA analyses to compare the difference between the two subgroups at pathway levels using 'ggplot2' package on R software. <= 10) subgroups.

| Function verification of TK1 at the cellular level
The function of UBE2T, HERPUD1, IQGAP3 and IGFBP3 had been demonstrated in Pca, while the roles of TK1 have rarely been investigated, so we verified the function of TK1 at the cellular level. The shRNA duplex sequences (shTK1: Coefficient mRNA i * Expression mRNA i 5'-CTTGAAGTAGCAGAGCCGACA-3′) and the scrambled control (scRNA) were used to silence TK1. C4-2R and 22Rv1 cells were plated into 6 cm culture dishes and were transfected with 50 nmol/L of the shRNA or scRNA when the density of cells reached with 60%-70%. After 24 h, shRNAs or scRNAs were removed and cells were incubated with culture medium. 5000 cells of C4-2R and 22Rv1 were plated into each well of 24well plates. Following with MTT assays, we harvested cells on Day 0, 2, 4 and 6. Added 50 μl of 5 mg/ml MTT into each well and the plates were incubated at 37°C with 5% CO 2 for 2 h. After the incubating, media was removed and 1 ml DMSO was added into each well to dissolve the precipitate. The plates were covered with foil and be placed on an orbital shaker for 20 min, after that, the absorbance of each well was measured at 570 nm. 1000 cells of C4-2R and 22Rv1 were plated into 6 cm culture dishes. After two-week Enzalutamide treatment, the colonies were fixed with methanol and treated with crystal violet solution. The nonoverlapping group of at least 50 cells was defined as a colony.

| Western blotting
Our previous studies had described the detailed procedures of the Western blotting. 23 In short, RIPA lysis buffer was used to lyse C4-2R or 22Rv1 to extract the total proteins, and the BCA protein assay kit was used to determine the protein concentration. 12.5% SDSpolyacrylamide gels were used to separate the samples followed by transferring onto NC membranes (Bio-Rad, Hercules, CA). 5% nonfat milk was used to block the membranes at room temperature for 1 h, following by primary antibodies and the appropriate secondary antibodies (1:5000; goat anti-rabbit; Proteintech Group, No. SA00004-2) were used to incubate the membranes in order.

| IHC validation
The study was approved by the Ethics committee of The First

| Identification of Enzalutamide-resistancerelated-genes
We extracted the total RNA from the C4-2 and C4-2R cells. Then, we performed the microarrays to detect the expression of genes from the extracted total RNA, with annotated row read genes of

| Identification of prognostic genes in predicting the RFS of Pca patients
Then, based on the 16 genes obtained from univariate Cox analysis, the key genes with the strongest RFS predictive ability were performed by using LASSO Cox regression analysis. Five genes were extracted in this process, such as UBE2T, HERPUD1, IQGAP3, IGFBP3 and TK1 (Figure 2A

| Verification of prognostic genes in predicting the RFS of Pca patients in our current work
Each sample's risk score from our owe AHMU-PC cohort was In the high-risk group, 66.7% of patients had a Gleason score higher than 7, while 71% of the patients in the low-risk group had a Gleason score less than or equal to 7 ( Figure 3C). In addition, we performed IHC staining of TK1 on prostate tissues of Pca patients. In Table 1, We found that the high expression of TK1 was observed in the envelope invasion (p = 0.01), Gleason score (p = 0.044) and pathology stage (p = 0.02) subgroups. The IHC scores of specimens with Gleason score higher than 7 were significantly higher than that of specimens with Gleason score less than or equal to 7 ( Figure 3D:

| Exploring the biological function of TK1 in C4-2R and 22Rv1
Short hairpin RNA (shRNA) was used to explore the biological func-

| Significance and stability assessment of the five-gene-based RFS predicting classifier in external validation cohorts
In order to further verify the accuracy and stability of this model, we

| Subgroup analyses, multivariate Cox regression and nomogram ROC
Through using K-M analyses, the five-gene-classifier precisely discriminated the high-and low-risk groups of patients in these subgroups, such as different age, Gleason score, PSA and tumour stage subgroups ( Figure S2).

| DISCUSS ION
In the currently study, based on 80 overlapped DEGs identified by two mRNA expression microarray, we established a five-gene-based classifier in the TCGA cohort and validated it in four independent validation cohorts, as well as our cohort data. Therefore, it has been proved that the five-gene-based classifier has a better prognostic value for RFS in PCa patients than age, Gleason score and PSA values, and has a similar prognostic value with pathological T stage.
Following the advancement of molecular biology, the arrival of the era of big data has become a reality. Researchers tried to use the large amounts of data to stratify risk and optimize chemotherapy strategies for various tumour types. [25][26][27] So far, the relationship between prognosis of PCa and differential expression of certain genes had been revealed by only a few studies. 28,29 In addition, large-scale, independent validation of the genes in these researches has not been undergo. In this study, the good prognostic value of the five-gene-based classifier for predicting the recurrence risk of PCa has been showed according to the digital expression profiling, which has been validated in the external cohorts and our data. Notably, we also validated the classifier in our cohort, which comprising the FFPE samples and obtained consistent results. All these results proved the feasible and applicable model in predicting the RFS of PCa patients.
Drug treatment resistance is a complicated process, the exploration of dysregulated genes related to the resistance and progression of PCa will help provide more therapeutic options and further improve prognosis. In the current study, a set of five genes was identified with the ability to predict clinical outcomes effectively. HERPUD1, as an endoplasmic reticulum-bound protein, is a putative component of the endoplasmic reticulum-associated protein degradation (ERAD) pathway. 30 The loss of HERPUD1 can increase the sensitivity of cells to endoplasmic reticulum stress and apoptosis. 31 Recently, study showed that the lower mRNA expression of HERPUD1 was associated with a higher incidence of metastasis after radical prostatectomy. 32 IGFBP3 as a multifunctional protein plays a key role in the regulation of IGF I/II activity, cell proliferation and apoptosis. Recent research found that overexpression of IGFBP3 was associated with tumour recurrence and poor patient survival through analysis of human prostate cancer specimens and TCGA patient cohorts. 33 As a member of the E2 family, UBE2T is responsible for the ATP-dependent ubiquitin labelling of target proteins to promo their degradation. Some studies showed that the overexpression of UBE2T could promote the proliferation of breast cancer cells by repressing BRCA1 expression. 34 Recently studies found that the high expression level of UBE2T was related to tumour formation, invasion, metastasis and poor disease-free survival of PCa patients. 35 Thymidine kinase 1 (TK1) participates in the synthesis of DNA precursor and acts as a biomarker for cancer including prostate and breast cancer, 36 but the specific role of TK1 in the occurrence and development of PCa is unclear. IQGAP3, as a member of the IQGAP family, promotes the proliferation, migration and invasiveness of cancer cells through interacting with its target proteins. 37 Some researches had found that the expression level of IQGAP3 in PCa was increased and the expression level of IQGAP3 correlated inversely with survivability, 37 but the specific role of IQGAP3 in the occurrence and development of PCa is unclear.
Further studies of gene functions and pathways particularly cell cycle, oocyte meiosis, homologous recombination, porphyrin and chlorophyll metabolism, and progesterone-mediated oocyte maturation are helpful to understand the pathophysiologic mechanism of Enzalutamide resistance in PCa ( Figure 6). The cell cycle is a complex process that guides cell proliferation through a series of checkpoints that can correct DNA damage, genetic abnormalities and other errors. 38 In most of the human cancer, the G1 progression of cell cycle is out of control and the impairing of cell cycle checkpoints results in the accumulation of genetic aberrations. 39 In the level of immunology, the cancer cells and egg cells have the same behaviour: implement and development. A study found that as reprogrammed cells, cancer cells have the initiation of the egg cells' genetic program. 40 In human cancers, the activation of meiosis can drive the oncogenesis of cancer. 41 For restarting stalled replication forks, repairing spontaneous DNA double-strand breaks and generating genetic diversity, homologous recombination (HR) is an essential pathway. The tumour cells which were defective in HR were more sensitive to many anticancer drugs. Many studies had found that the re-expression of HR in tumour cells may be one of the reasons for the drug resistance. 42 As far as we know, this is the first attempt to use Enzalutamide The Natural Science Foundation of Guangdong Province, China (2017A030313800).

CO N FLI C T O F I NTE R E S T
None.

DATA AVA I L A B I L I T Y S TAT E M E N T
Data available on request from the authors: The data that support the findings of this study are available from the corresponding author upon reasonable request.