Identification of immune‐enhanced molecular subtype associated with BRCA1 mutations, immune checkpoints and clinical outcome in ovarian carcinoma

Abstract Ovarian carcinoma has the highest mortality among the malignant tumours in gynaecology, and new treatment strategies are urgently needed to improve the clinical status of ovarian carcinoma patients. The Cancer Genome Atlas (TCGA) cohort were performed to explore the immune function of the internal environment of tumours and its clinical correlation with ovarian carcinoma. Finally, four molecular subtypes were obtained based on the global immune‐related genes. The correlation analysis and clinical characteristics showed that four subtypes were all significantly related to clinical stage; the immune scoring results indicated that most immune signatures were upregulated in C3 subtype, and the majority of tumour‐infiltrating immune cells were upregulated in both C3 and C4 subtypes. Compared with other subtypes, C3 subtype had a higher BRCA1 mutation, higher expression of immune checkpoints, and optimal survival prognosis. These findings of the immunological microenvironment in tumours may provide new ideas for developing immunotherapeutic strategies for ovarian carcinoma.

not tumour cells, and it can also promote synergistic antitumour actions in combined treatment. 5 Acquired immunity generally develops from the innate immune system and adaptive immune system and their interactions as well. The innate immune system produces immune cells (eg dendritic cells and macrophages) to protect the body, while the adaptive immune system defends against special threats via specific lymphocytes (B cells and T cells) to form immunological memory. Cancer cells disrupt the regulatory pathway of T cells, recruit immunosuppressive cells and release active cytokines with an immunosuppressive effect by influencing the antigen presentation process, thereby impairing the immune system and altering immune regulation for the benefit of the tumour cells. 6,7 By the advanced stage, they have developed several mechanisms to escape immune surveillance. The stimulation of programmed death-1 (PD-1) signal transduction in tumour cells facilitates the inhibition of T cell activity, and such inhibition can be promoted when the ligand CD86 or CD80 binds to CD28 or CTLA4. Thus, the upregulation of these immune checkpoint genes can lead to the suppression of the immune microenvironment. 8 The immune checkpoint inhibitors developed for PD-1 and CTLA4 can be effective in treating several tumours by activating the function of immune cells and normalizing the tumour microenvironment. 9,10 However, the efficacy of tumour immunotherapy is affected by the immune microenvironment of tumours, and some patients show significant response to tumour immunotherapy, so there is marked individual variation in the clinical treatment outcome. 11 The complexity of the tumour immune microenvironment increases the difficulty of immunotherapy and affects its effectiveness, but the expression pattern of immune checkpoint genes in ovarian carcinoma and the potential clinical relationship are still unclear. 4 Therefore, it is imperative to study in depth the overall immune status of patients, to identify the molecular subtypes of cancer, and to improve treatment efficacy in advanced OV patients.
This study aimed to explore the overall immune status of OV patients and its clinical significance. We screened the expression data of immune genes from the TCGA database and determined four molecular subtypes of ovarian carcinoma. We then compared the clinical characteristics, immune score, BRCA1/2 variant status, prognosis and immune checkpoint expression of the different subtypes, and finally validated our analysis results using external datasets. Our study findings can be helpful for the immunological treatment of ovarian carcinoma.

| Data source and processing
We used the GDC (https ://docs.gdc.cancer.gov/API/Users_Guide/ Getti ng_Start ed/) API to download the TCGA-OV profile dataset containing a total of 374 samples and 5 samples of recurrent tumours, all of which were samples prior to standard treatment after diagnosis. Among them, Stage I, Stage II, Stage III and Stage IV have 1, 21, 292 and 57, respectively. We matched the expression profile and the clinical follow-up samples and selected both samples as the sample set of the study. Further, we extracted the immune gene sets with expression from the expression profiles and selected the expression levels in each sample to be greater than 0. The sample with more than 30% of the genes was included as an immune gene for this study. Final inclusion of 1251 genes.
The GSE26193 dataset of the GPL570 platform was downloaded using the R package GEOquery, which contained 107 samples, of which Stage I, Stage II, Stage III and Stage IV were 21, 10, 59 and 17, respectively, according to the GPL570 annotation information. According to the annotation information of GPL570, probe mapping is applied to genes. If there are multiple probes corresponding to one gene, take the median and delete probes corresponding to multiple genes. Thirteen types of immune metagenes were collected from Safonov et al 12 We downloaded 6 types of immune cells corresponding to each sample of OV from Timer (https ://cistr ome.shiny apps.io/timer/ ) and downloaded immune genes from ImmPort database (https ://immpo rt.niaid.nih. gov). We utilized the R package to estimate and calculate the immune score and matrix score of each sample.

| Molecular subtypes screening based on immune genes
We made use of the expression profile of immune genes for consistent clustering, just as Zhang et al, 13 who used R software package ConsensusClusterPlus to screen the molecular subtypes. In the study, Euclidean distance was utilized to calculate the similarity distance between samples, and K-means was used for clustering. 80% of the samples were sampled by resampling scheme. Resampling was conducted for 100 times. The optimal number of clusters was determined by the cumulative distribution function (CDF). We further utilized the R package sigclust to analyse the clustering significance between these subtypes.

| The relationship between subtypes and clinical features
Different clinical features are closely related to the development of the disease. The relationship between subtypes and disease development can be more clearly recognized by analysing the relationship between subtypes and clinical features. We extracted the information of age, grade and stage from the clinical follow-up data of the patients and observed the relationship between the subtypes and age, grade, and stage, respectively.

| The relationship between subtypes and immunity
There are key gene sets involved in the immune process discussed in previous studies. We collected 13 types of immune metagenes to analyse the relationship between these metagenes and subtypes.
The immune components of tumour tissue are closely related to the prognosis of tumour. We analysed the relationship between matrix, immune score and molecular subtypes, respectively. The score of immune infiltrating cells directly reflects that the degree of immune infiltration in tumour tissue is closely related to the occurrence and development of tumour. We further utilized variance analysis to evaluate the differences in the above scores of different subtypes.

| The relationship between subtypes and prognosis
We extracted the follow-up data of patients from the sample followup information and utilized K-M to analyse the prognostic differences of different subtypes.

| Other statistical methods
In this study, chi-square test and exact test of Fisher's were utilized for the correlation between molecular subtypes and conventional clinical variables. The OS rates of all molecular subtypes were compared using log-rank test and Kaplan-Meier curves. All of the statistical tests were two-sided tests. R software was utilized for statistical analysis.

| Identification of four molecular subtypes of ovarian carcinoma based on immune profiles
The optimal number of clustering was determined by CDF. As shown in Figure 1A, the clustering results were stable when 4 subtypes were clustered, which were obtained by the subsequent observation of the CDF delta area curve in Figure 1B. Finally, k = 4 was selected and 4 molecular subtypes were obtained. The clustering significance among the 4 subtypes was further analysed by "sigclust" in the R software package, and the results indicated no significant clustering difference between C1 and C2 subtypes (P = 1) but a very significant clustering difference between C1 and C3, C1 and C4, C2 and C3, C2 and C4, and C3 and C4 subtypes (P = 0). The stable clustering results at k = 4 were chosen on the basis of the consensus clustering results. Figure 1C shows that 274 tumour samples were divided into these 4 subtypes. Furthermore, the expression spectra of 356 immune gene sets were used to analyse the differences between different subtypes, and the genes with a higher expression level in one subtype compared to other subtypes were screened using the Kolmogorov-Smirnov test. Using FDR < 0.05 as the threshold, 124, 506, 180 and 162 genes with a higher expression level in C1, C2, C3 and C4, respectively, were eventually selected, and there was little intersection between these genes ( Figure 1D). In addition, PCA was performed on the expression spectra of the top 100 genes with significantly higher expression for each subtype, and the scatter plot of the top 2 components is shown in Figure 1E, which indicates the clear clustering of the 4 subtypes. The expression profile heatmap of these genes is shown in Figure 1F, indicating that various subtypes had a clear border and a notable expression pattern in the expression spectra of these genes.

| Relationship between 4 subtypes and clinical characteristics
The relationship between the 4 subtypes and age, tumour grade and tumour stage was analysed, as shown in Table 1. Four subtypes were not significantly correlated with age or grade but showed a significant relationship with stage, and Stage II samples of C3 subtype were evidently more than those of other subtypes.

| Relationship between 4 subtypes and immunity
To analyse the relationship between the 4 subtypes and immunity, we collected 13 immune metagenes, 12 the scores of tumour immune components (matrix score, immune score and tumour purity) and the scores of 6 types of tumour-infiltrating immune cells, and we then analysed the relationship between these three immunityrelated scores. The results showed that most of the 13 immune metagenes were highly expressed in C3, while a few were highly expressed in C3 and C4 ( Figure 2A). The immunescore of C3 subtype was significantly higher than that of the other subtypes, and the matrix score and tumour purity of C4 subtype were clearly greater than those of the other subtypes ( Figure 2B). Among 3 immune cell infiltration scores, B cell and CD8_cell scores of C3 were much higher than those of the other subtypes, and the scores of CD4_T cells, neutrophils, dendritic cells and macrophages in the C3 and C4 groups were markedly greater than those in the C1 and C2 groups( Figure 2C). Generally, most immune signatures in C3 subtype were upregulated as compared with C1 and C2 subtypes, and the upregulation of most tumour-infiltrating immune cells was also observed in C3 and C4 subtypes, which indicated that the immune microenvironment of C3 and C4 subtypes was enhanced

| Analysis of prognostic differences and BRCA variant between 4 subtypes
To determine the relationship between the 4 subtypes and prognosis, the prognostic differences between the 4 subtype samples were analysed by the Kaplan-Meier method ( Figure 3A). There was a significant difference in prognosis between the 4 subtype samples: the prognosis of C4 subtype samples was the worst, and the prognosis of C3 subtype samples was markedly better than that of the other subtype samples, while a very significant difference was found between C3 and C4 subtype samples ( Figure 3B). This indicated that the immune-enhanced  Figure 3C). The percentage of BRCA1 variant samples in C3 subtype samples was significantly higher than that in the other subtype samples (χ 2 test, P = .036). The BRCA2 variant/wild-type ratio in the 4 subtype samples was analysed ( Figure 3D), and it was greater in C4 subtype samples, but the differences with other subtype samples were not statistically significant (χ 2 test, P = .617). Furthermore, we analysed the distribution of the number of mutated genes in the 4 subtype samples and found a significant difference in the frequency of gene mutation between the 4 subtype samples; the frequency of gene mutation in C4 subtype samples was clearly higher than that in the other subtype samples ( Figure 3E).

| Relationship between 4 subtypes and the expression of 8 immune checkpoint genes
The relationship of 8 immune checkpoint genes with the 4 subtypes was further analyzed. The expression levels of PDCD1, CD274, PDCD1LG2, CTLA4, CD86 and CD80 in C3 subtype were significantly greater than those in other subtypes, and CD267 demon-

| WGCNA analysis and mining of immuneenhanced subtype-related modules
The expression profile data of 871 immune genes for the 4 subtypes were obtained to further mine the prognostic markers related to the immune microenvironment of ovarian carcinoma. The distance between transcripts was then calculated, and the weighted co-expression network was constructed using WGCNA in the R software package. [15][16][17][18] Finally, the co-expression modules were screened with the soft threshold of 2. The studies showed that the co-expression network complied with the scale-free network; that is, the log(k) of a node with the connectivity being k was negatively correlated with the log(P(k)) of the presentation probability of the node, and the correlation coefficient was >0.8. To make sure that the network was a scale-free one, β = 2 was selected ( Figure 5A,B). The expression matrix was converted into the adjacency matrix, and the latter was then converted into the topological matrix. The genes were clustered on the basis of TOM using the average-linkage hierarchical clustering method, where the standard of three was sheared according to the mixed dynamics, and the minimum number of genes in each gene (lncRNA) network module was set at 30. After F I G U R E 1 Identification of OV subtypes based on the immune genes. A, CDF curve; different colours reflect different cluster numbers, the horizontal axis represents the consensus index, the vertical axis stands for cumulative distribution function (CDF), and a bigger AUC indicates better clustering. B, CDF delta area curve of consensus clustering, indicating the relative change in area under the cumulative distribution function (CDF) curve for each category number k compared with k − 1. The horizontal axis represents the category number k, and the vertical axis represents the relative change in area under CDF curve. C, Heatmap of sample clustering at consensus k = 4; D, Intersection Venn diagram of significant high-expression genes of various subtypes; E, Expression profile PCA of top 100 significant highexpression genes, and scatter plot of top 2 components; F, gene expression heatmap of top 100 significant high-expression genes in four subtypes. Red represents high expression, and blue represents low expression   (Table S2), 30 pathways in the yellow module (Table S3), 94 pathways enriched in the blue module (Table S4) ( Figure 5E). The relationship of pathways enriched in these three modules was analysed, and a total of 121 pathways were enriched in three modules, where the pathways in the yellow modules overlapped mostly with those in the other two modules.

| Validation of external datasets
We selected the genes in the gene co-expression modules (blue, brown and yellow) closely related to various subtypes and then extracted the  Figure 6A) and found a high expression of most immune metagenes in C3 subtype, which was consistent with the training set. Next, we further analysed the immune scores of the samples ( Figure 6B) and observed that the immune score in C3 subtype was significantly higher than that in the other subtypes and that the matrix score and tumour purity in C4 subtype were clearly greater than that  Figure 6D), the differences in prognosis between 4 subtypes were marginally significant (P = .065), and the prognosis of C4 subtype was markedly poorer than that of the other subtypes; as shown by further analysis, the prognosis of C3 subtype was clearly better than that of C4 subtype ( Figure 6E), which was consistent with the validation dataset.

| Data analysis flow chart
To make our study better understand. The workflow of the proposed method was developed as shown in Figure 7. Among the four molecular subtypes, the overall immune profiles of subtypes C3 and C4 were significantly higher than in comparison with that of subtypes C1 and C2. However, there were also differences between C3 and C4. Most of the 13 kinds of immune metagenes and immune cell infiltration score (immunity) were highly expressed in C3 subtype ( Figure 3A,B), and the immune cell infiltration score (matrix and tumour purity) of C4 subtype was significantly higher than those of other subtypes ( Figure 3B). The scores of multiple types of immune-related cells such as B_cell and CD8_cell in C3 subtype were significantly higher than those in other subtypes.

| D ISCUSS I ON
These findings indicate that the immune microenvironment of C3 and C4 subtypes was strengthened. C1 and C2 subtypes had lower immunoreactive expression with lower immune scores. In the comparison of overall survival, the prognosis of C1 and C2 subtypes was poor and consistent. It was noted that the poorest prognosis was found in C4 subtype, while C3 subtype had significantly better prognosis than the other subtypes. This suggests that the immune-enhanced subtypes may not respond to the best prognosis in ovarian carcinoma. The P-value was calculated using the log-rank test, by comparing the overall survival of 4 subtypes. E, Prognostic difference between C3 and C4 subtypes. The abscissa represents survival time (d) and the ordinate represents survival probabilities BRCA1/2 is a tumour suppressor gene and encodes BRCA1/2 protein, which plays a critical role in regulating essential cellular activities, such as normal cell growth, repair, transcription and activation of DNA damage, and inhibition of chromatin remodeling. 26 (Table S2). The role of these pathways is closely related to malignant neoplasms, autoimmune reaction and inflammation. And the C4 subtype is also significantly positively correlated with the yellow module (r = .64, P = 6e−44). The related genes are significantly enriched in MAPK signalling pathway and PI3K-Akt signalling pathway (Table S3); The blue module gene is significantly enriched in Axon guidance and EGFR tyrosine kinase inhibitor resistance signalling pathway (Table S4).
The results of the validation set showed that the overall immune profiles of subtypes C3 and C4 were significantly higher than in comparison with that of subtypes C1 and C2, and most of the immune checkpoint genes were highly expressed in the C3 and C4 subtypes, consistent with the training set. The reliability of the C3/C4 subtype F I G U R E 7 The workflow of the proposed method as an enhanced subtype of the ovarian cancer immune microenvironment was further demonstrated.

| CON CLUS ION
From the above discussion, the conclusion can be reached that we identified two immune-enhanced subtypes using gene expression profiles of global immune genes through large databases of TCGA and GEO. The two subtypes are distinct in immune checkpoint molecules, immune function, BRCA mutation and clinical prognosis.
These findings of the immune microenvironment may shed new light on the strategy of immunotherapy in ovarian cancer. With the development of internet and big data era coming, constructing databases [34][35][36] and establishing powerful webserver 37,38 will provide the convenience to most scholars.

ACK N OWLED G EM ENTS
This work is supported by grants from Shengjing Freedom researchers' plan (201804). The funding body had no role in the design or conduct of the study. We sincerely thank English Editor for guidance to English copyediting of our paper.

CO N FLI C T O F I NTE R E S T
No potential conflict of interest to disclose.

AUTH O R CO NTR I B UTI O N S
Mingjun Zheng, Yuexin Hu and Rui Gou conceived the idea for the paper. All authors read and approved the final manuscript.

DATA AVA I L A B I L I T Y S TAT E M E N T
All data generated or analysed during this study are included in this article.