Bioinformatic profiling identifies a platinum‐resistant–related risk signature for ovarian cancer

Abstract Most high‐grade serous ovarian cancer (HGSOC) patients develop resistance to platinum‐based chemotherapy and recur. Many biomarkers related to the survival and prognosis of drug‐resistant patients have been delved by mining databases; however, the prediction effect of single‐gene biomarker is not specific and sensitive enough. The present study aimed to develop a novel prognostic gene signature of platinum‐based resistance for patients with HGSOC. The gene expression profiles were obtained from Gene Expression Omnibus and The Cancer Genome Atlas database. A total of 269 differentially expressed genes (DEGs) associated with platinum resistance were identified (P < .05, fold change >1.5). Functional analysis revealed that these DEGs were mainly involved in apoptosis process, PI3K‐Akt pathway. Furthermore, we established a set of seven‐gene signature that was significantly associated with overall survival (OS) in the test series. Compared with the low‐risk score group, patients with a high‐risk score suffered poorer OS (P < .001). The area under the curve (AUC) was found to be 0.710, which means the risk score had a certain accuracy on predicting OS in HGSOC (AUC > 0.7). Surprisingly, the risk score was identified as an independent prognostic indicator for HGSOC (P < .001). Subgroup analyses suggested that the risk score had a greater prognostic value for patients with grade 3‐4, stage III‐IV, venous invasion and objective response. In conclusion, we developed a seven‐gene signature relating to platinum resistance, which can predict survival for HGSOC and provide novel insights into understanding of platinum resistance mechanisms and identification of HGSOC patients with poor prognosis.


| INTRODUCTION
Ovarian cancer is the most lethal gynecological malignancy in the world. It accounts for 2.5% of female malignant tumors, but 5% of cancer deaths due to low survival rates. 1 There are four main histotypes of ovarian cancer: serous, endometrioid, mucinous, and clear cell. 2 High-grade serous ovarian cancer (HGSOC) has been thought to be the first leading histological subtype of ovarian cancer death. It is characterized by advanced stage at diagnosis and rapid progress. 3 Combination chemotherapy with platinum compounds and taxanes for six cycles after cytoreductive surgery is a standard adjuvant treatment for patients with ovarian cancer. The majority of HGSOC patients responds well to platinum-based chemotherapy. And 5-year survival rates for ovarian cancer have largely improved from 40% to 47% over the last decades. 1 Despite high validity of standard treatment, many patients suffer platinum resistance after initial response, which is one of the causes for low survival rate in HGSOC. Therefore, studying prognostic signatures will be pivotal to improve clinical treatment of HGSOC patients.
Currently, several biomarkers have been used to predict patients' survival in HGSOC. For example, it has been well established that carbohydrate antigen 125 (CA125) is a dominant feature in diagnosis of ovarian cancer clinically. 4 CA125 has provided useful information on identifying patients for secondary tumor-debulking surgery, as well as treatment time for multiple conventional and novel drugs. 2 Unfortunately, some studies have failed to define CA125 levels as an independent factor of prognosis or a target to improve overall survival (OS) of patients. [5][6][7] Bioinformatics is fast becoming a key tool in helping investigators with new research ideas about cancer. Although there is a growing body of literature that recognizes the effect of mRNA expression signatures on recurrence 8,9 or OS, 8,[10][11][12] there are few integrated analysis have studied the association between gene expression related to platinum resistance and OS of HGSOC.
In this study, the Gene Expression Omnibus (GEO) database 13 was used to obtain differentially expression genes between platinum-sensitive and resistant HGSOC tissue samples. The Cancer Genome Atlas (TCGA) database 14 was used to identify a model consisted of mRNAs as a new indicator to predict outcome by analyzing the mRNA expression profiles and clinical features in HGSOC. Furthermore, the prognostic value of the indicator was also confirmed in patients with different clinical characteristics.

| Datasets and patients' information
Four datasets (GSE51373, 15 GSE32602, 16 GSE65986, 17 GSE26193 18 ) with tissue, histological type, histological grade, progression free survival (PFS), and drugs information were included for the analysis differentially expressed genes (DEGs) between platinum-sensitive and resistant samples in HGSOC. Platform used in these datasets was GPL570 (Affymetrix Human Genome U133 Plus 2.0 Array). Platinum-resistant disease was characterized by PFS of <6 months after the last platinum-based treatment. Patients with PFS of >12 months had platinum-sensitive disease. 2 Overall, 104 HGSOC patients were selected for the analysis including data from 87 platinum-sensitive samples and 17 platinum-resistant samples ( Table 1). The expression data and clinical data were downloaded from GEO Databases (https ://www.ncbi.nlm.nih.gov/geo/). To test the prognostic significance of DEGs, gene expression data (level 3) of HGSOC patients from TCGA Databases (https ://cance rgeno me.nih.gov/) with available clinical data were used.

HGSOC samples
Linear models for microarray data (limma) package 19 were used for differential expression analysis. Robust multichip average (RMA) method was used for data normalization. Imputation for microarray data (impute) package 20 was used to impute missing expression data. Surrogate variable analysis package 21 was used for removing batch effects and other unwanted variations in experiment. Expression difference was characterized by fold change (Fc) and P-valve. Fc was defined by the ratio between platinum-resistant group and sensitive group in expression value of each gene. Genes with Fc＞1.5 and P < .05 were defined as DEGs.

| Gene annotation and pathway enrichment analyses
The Database for Annotation, Visualization and Integrated Discovery (DAVID, https ://david.ncifc rf.gov/home.jsp) 22 was used to perform gene function annotation and pathway enrichment analysis. Gene annotations were performed from three aspects, namely, cellular component, biological process, and molecular function by gene ontology. KEGG (Kyoto Encyclopedia of Genes and Genomes) 23 pathway enrichment analyses were also performed to investigate the underlying functions of aforementioned DEGs. P < .05 was used as the significance level.

| Construction of the DEGs-based prognostic model and data analysis
Package survival 24 was used for Cox proportional hazard regression to construct the prognostic model. The univariate regression with P < .05 was used to identify OS related mRNAs from the DEGs. The multivariate regression was used to confirm mRNAs selected by that univariate regression could be brought into prognostic model and calculate risk score. The risk score of each patient was calculated according to the formula: risk score = h0 (t) exp (β1*X1 + β1*X1 + … βn*Xn) (X: gene expression value; β: the coefficient derived from multivariate regression; h (0): baseline hazard function, not specified). HGSOC patients were divided into high-risk group and low-risk group according to the median risk score. 25 Survival curves were created using survfit functions. And survdiff function was used to test survival curve differences. Package survival receiver operating characteristic (ROC) 26 was used to create ROC curve to assess the predictive accuracy of the risk score for time-dependent disease outcomes within 3 years. Cox regression was also performed to evaluate the effect and independence of risk score and clinicopathological parameters, including age, histological grade, stage, residual tumor size, anatomic neoplasm subdivision, and treatment outcome on OS of HGSOC patients.

| Identification of DEGs involved in platinum-resistant HGSOC and pathway enrichment analyses
The gene expression profile with accession numbers GSE51373, GSE32062, GSE65986, and GSE26193 were downloaded from GEO database. The volcano plot ( Figure  S1A) showed the difference of mRNA expression between platinum-sensitive (n = 87) and resistant group (n = 17). Compared with the sensitive group, 269 DEGs were obtained including 123 up-regulated and 146 down-regulated in the resistant group (P < .05, Fc > 1.5). The heatmap presented the expression of DEGs with P < .05 and Fc > 1.5 ( Figure S1B). The top 10 up-regulated and down-regulated DEGs (sorted by Fc) are displayed in Tables 1 and 2. Then we performed gene function analysis through the web-tool: DAVID. As shown in Figure 1, DEGs were significantly enriched in cellular components ( Figure 1A), protein homodimerization activity ( Figure 1B), signal transduction, immune response, and apoptotic process ( Figure 1C). And KEGG analysis ( Figure 1D) showed that DEGs were enriched in the PI3K-Akt signaling pathway, drug metabolism-cytochrome P450, etc, P < .05 was used as the significance level.

| Construction of DEG-based prognostic model
To explore whether DEGs are related to OS, 333 HGSOC samples with the expression profile in TCGA databases were selected for the Cox regression analysis. Univariate Cox regression showed that 12 of 269 DEGs were determined to be associated with OS in HGSOC significantly (P < .05, seven DEGs were included in the prognostic model (Table  3). Among them, LYPD6B, CD38, CMBL, and KIAA2022 were independent protective factors due to hazard ratio (HR) being less than 1. On the contrary, LYRRC17, ZHX3, and AKR1B10 were independent risky factors for OS in HGSOC patients, as the HR were greater than 1.

| The risk score as an independent prognosis indicator in HGSOC
Risk scores for each patient were calculated according to the formula mentioned in the materials and methods. All patients were divided into two groups: low-risk group (n = 167) and high-risk group (n = 166) by the median value of risk score ( Figure 2A). Patients survival status is shown in Figure 2B. The different expression patterns of the seven DEGs in the low-and high-risk groups are showed in heatmap ( Figure 2C). As the value of risk score increased, the expression of protection factors tend to decrease, while the risky mRNAs tend to increase in expression. The survival curves revealed patients with high-risk scores suffered poorer prognosis compared with the low-risk group (P < .001, Figure 3A). The most striking result to emerge from the data is that the area under ROC curve (AUC) was 0.710 ( Figure 3B), indicating that the risk score had a certain accuracy on predicting 3-year OS in HGSOC. In addition, to determine whether the prognostic function of the risk score derives from one mRNA, we drew survival curves ( Figure S2) and ROC curves ( Figure S3) of every single mRNA. It is regrettable that none of these mRNAs was an accurate predictor of OS in HGSOC with AUC for ROC <0.7, which means the prognostic value of risk score was higher than any single mRNA. Moreover the prognostic role of the novel risk score was also evaluated together with the classical clinical pathological parameters that might influence the prognosis of HGSOC. Univariate and multivariate Cox regression analysis indicated that only no-meeting-objective response (HR = 2.623, P < .001) and high-risk score (HR = 1.899, P < .001) were independent prognostic indicators for OS (Table 4).

| Prognosis effect assessment of risk score in subsets of HGSOC patients
Last but not the least, the prognostic value of the risk score in the OS of HGSOC patients was also evaluated in subsets of  Table S1) HGSOC patients with different clinical characteristics, and the results obtained from the analysis are shown in Table 5. We analyzed the correlation between risk score and OS in different subgroups, such as age (≤58 or >58), histologic grade (high or low differentiation), clinical stage (early or advanced stage), lymphatic invasion, venous invasion, treatment outcome objective response ([OR] or no-meeting-objective response), residual tumor size, and anatomic neoplasm subdivision (unilateral or bilateral). Subgroup analyses demonstrated that high-risk score was associated with poor prognostic in the patients with grade 3-4 (P < .001, Figure  4A), advanced stage (P < .001, Figure 4B), venous invasion (P = .0439, Figure 4C), and OR subsets (P < .001, Figure  4D) closely. In addition, there were strong relationships between the high-risk score and low survival rate in different age groups ( Figure 5A), lymphatic invasion ( Figure 5B), residual tumor size ( Figure 5C), anatomic neoplasm subdivision ( Figure 5D). But no significant differences were found between the subgroups in these features.

| DISCUSSION
High-grade serous ovarian cancer is the main histological type that causes death in patients with ovarian cancer. Therefore, studying signatures with prognostic value would be critical for treatment improvement of HGSOC patients. In this study, a potential platinum resistant-related risk signature was built and evaluated to predict survival in HGSOC according to the results of DEGs identification involved in platinum-resistant and Cox proportional regression analysis. The risk score computed according to coefficients and expressions of seven mRNAs, including LRRC17, ZHX3, CD38, AKR1B10, LYPD6B, KIAA2022, and CMBL, could serve as an accurate and independent prognostic indicator for HGSOC. Several single genes were shown to be a powerful in assessment of prognosis for HGSOC. For example, some studies focused their point on the prognostic value of apolipoprotein B mRNA editing enzyme catalytic subunit 3 (APOBEC3) and folate receptor 1 (FOLR1). 27,28 Similarly, a study highlighted the prognostic value of a 11-gene model which was obtained by Rebust likelihood-based study on ovarian samples from TCGA. 29 Despite those surprising findings, no biomarkers for prognosis prediction of HGSOC were yet in clinical use. However, there was no large scale genomic analysis studied the collaborative effect of multiple genes related to platinum resistance on the prognosis in HGSOC. Therefore, it is extremely important to find and design a new model that can predict the prognosis of platinum resistance in HGSOC and with which we can predict the prognosis of patients with HGSOC and take corresponding treatment measures.
Platinum-based chemotherapy combined with debulking surgery is the standard therapy for ovarian cancer patients. Despite high response rate to initial therapy, many patients would suffer platinum resistance and relapse after that, which lead to poor prognosis. So, we attempt to construct a potential platinum-resistant focused model to predict the OS in HGSOC patients. Firstly, a total of 269 DEGs between platinum-resistant and sensitive HGSOC tissue samples were obtained. To understand the functions of DEGs, gene function analysis was performed. DEGs were involved in many biological processes related to drug resistance, such as signal transduction and apoptosis process. They may also participate in the drug resistance of ovarian cancer through PI3K-Akt signaling pathway and drug metabolism-cytochrome P450. Next, a prognosis model consisted of seven DEGs (LRRC17, ZHX3, CD38, AKR1B10, LYPD6B, KIAA2022, and CMBL) was constructed based on relationships between DEGs and the OS in 333 HGSOC patients from TCGA. Among them, LYPD6B, CD38, CMBL, and KIAA2022 were regarded as independent protective factors, while others were defined as risky factors. Surprisingly, survival and ROC curve analyses revealed that the risk score generated from the prognosis model could serve as an indicator of the OS in HGSOC with certain accuracy. Moreover the independence of prognosis effect was also confirmed through comparing the novel risk score with classical clinical pathological parameters. Among the identified seven genes (LRRC17, ZHX3, CD38, AKR1B10, LYPD6B, KIAA2022, and CMBL) in our study, CD38 was well known about its correlation with hematological malignancies. High expression of CD38 was associated with shorter lymphocyte doubling time and poor prognosis in chronic lymphocytic leukemia 30 In addition, several studied argued that CD38 played vital role in multiple myeloma. [31][32][33] AKR1B10 is a member of aldo-keto reductase family 1 member B subfamily and expressed in normal epithelial cells of the digestive tract as a cytosolic reductase. 34 It has been observed that AKR1B10 was associated with survival in gastric cancer, colorectal cancer, and hepatocellular carcinoma. [35][36][37] To date, Very little is currently known about AKR1B10 in ovarian cancer. Hong et al found that the low expression of LRRC17 was a risk factor for fracture in postmenopausal women. 38 However, there is little published data on LRRC17 function in cancer. 39,40 Ochoa et al 41,42 suggested that LYPD6B enhanced the sensitivity of (α3)3(β4)2 nicotinic acetylcholine (ACh) receptors to ACh. CMBL, which is a highly expressed in liver cytosol, was the primary cysteine hydrolase to bioactivate Olmesartan Medoxomil in the liver and intestine. 43,44 Existing research has recognized the critical role of KIAA2022 in X-linked mental retardation (XLMR) and brain development. [45][46][47][48][49] KIAA2022 enhanced cell adhesion and migration by regulating adhesion molecules expression, such as N-Cadherin and β1-integrin. 47,50 ZHX3 is a member of the zinc fingers and homeoboxes (ZHX) gene family. Although it is now well established that ZHX1 and ZHX2 worked as tumor suppressors, 51-53 the role of ZHX3 in cancer is not clear. Studies showed that ZHX3 was upregulated and might be an independent indicator of the OS in renal clear cell carcinoma. 54,55 In the final part of this study, the prognosis efficacy of risk score was also evaluated in subsets of HGSOC patients  focused indicator for survival assessment in HGSOC. The risk score will provide a new direction for the evaluation of HGSOC prognosis, which is different from the traditional assessment system. It might be helpful for individualized treatment and survival improvement through further patient stratification.
Additionally, some limitations of this study should be considered. First, the sample size in this study is small, so further validations with larger samples and other experimental methods are quite essential. Second, this study was limited to the analysis of mRNA expression profile without considering interactions with miRNA, lncRNA, and other factors, further comprehensive study are expected to be done. Third, previous studies of LRRC17, ZHX3, LYPD6B, KIAA2022, and CMBL in cancer still remain paucity despite the importance of them. Extensive researches are required to answer this question.
In conclusion, to the best of our knowledge, we have identified seven genes associated with platinum resistance in HGSOC patients using the Cox regression model. Further analysis revealed that the seven-gene signature could be an independent factor predicting the prognosis of the platinum resistance in HGSOC patients. These findings may be a potential biomarker for the prognosis of platinum resistance and provide insights into theoretical guidance and decision making in clinical practice of HGSOC.