Seven immune‐related genes prognostic power and correlation with tumor‐infiltrating immune cells in hepatocellular carcinoma

Abstract Background Given poor prognosis and the lack of efficient therapy for advanced hepatocellular carcinoma, immunotherapy has emerged as an increasingly important role. However, there were few reports on the correlation between immune‐related genes and HCC. The purpose of this study is to construct a novel immune‐related gene‐based prognostic signature for HCC and to explore the potential mechanisms. Methods We organized expression data of 374 HCC samples and 50 nontumor samples from TCGA database. A robust signature was constructed by Cox regression analysis based on the immune‐related genes, which were filtered by differential genes analysis and Cox regression analysis. Then, the correlation analysis between the signature and clinical characteristics was conducted. And the signature was validated in ICGC database. Furthermore, the relationships between immune cell infiltration and the signature were explored by bioinformatics analysis. Results Seven genes‐based model (Risk score = BIRC5 * 0.0238 + FOS * 0.0055 + DKK1 * 0.0085 + FGF13 * 0.3432 + IL11 * 0.0135 + IL17D * 0.0878 + SPP1 * 0.0003) was constructed eventually and it was proved to be an independent prognostic factor for HCC patients. The signature‐calculated risk scores were shown to be positively correlated with the infiltration of these five immune cells, including macrophages, neutrophils, CD8+T, dendritic, and B cells. And the results suggested that high amplication of BIRC5, FGF13, IL11, IL17D, and SPP1 were more likely correlated with immune cell infiltration. Finally, PPI network, TFs‐based regulatory network and gene enrichment plots were performed to show potential molecular mechanisms. Conclusion We construct a robust immune‐related gene‐based prognostic signature with seven genes and explore potential mechanisms about it, which may contribute to the immunotherapy research for HCC.


| INTRODUCTION
Liver cancer is one of the most important factors in tumor-related death, threatening thousands of lives worldwide. 1,2 The most common pathological type in liver cancer is hepatocellular carcinoma (HCC) which arises from chronic liver inflammation and liver fibrosis mostly. 3,4 Nowadays, surgical resection and liver transplantation are the most widely used treatments for HCC internationally. However, given that these two treatments show several drawbacks, such as rapid postoperative recurrence and shortage of liver donors, the therapeutic effects are far from satisfactory. 5 Moreover these two methods have limitations in the treatment of advanced HCC, which is usually diagnosed at late stages, resulting in poor survival rate. 5 Considering the restrictions and poor outcome of existing treatments, immunotherapy, acting as a treatment strategy, is under intensive investigation currently. 6,7 Although the immunotherapy of HCC has made some progress, such as medicines of immune checkpoint inhibitors (ICIs), 6 there are still a lot needed to be explored. Therefore, it is essential to investigate immune-related genetic profiling of HCC, thereby providing direction for immunotherapy.
Immunotherapy, as one of cancer treatment methods, has been explored for a long time. The development of immunotherapy has attracted more and more attention to tumor immune microenvironment (TIME). Some studies showed that constituents in TIME, including infiltrating immune cells, secreted cytokines/chemokines and other components, were very important in the research of HCC development and the investigation of immunotherapy. The relationships between HCC prognosis and the multiplicity and function of tumor infiltrating immune cells were explored in several studies. [8][9][10][11] The diversity of immature CD11-CD27-NK cells, which are highly expressed in HCC tissue, increases with the progress of liver cancer and has been proved to be a prognostic predictor of HCC patients. 12 In addition, low CD4+ cytotoxic T cells expression was significantly correlated with prognosis of HCC patients, and was considered to be an independent prognostic predictor for survival time of HCC patients. 10 The number of neutrophils in HCC infiltrate could also predict poor prognosis for HCC patients after resection. 13 What's more, several studies have shown that some cytokines in TIME are related to development and progression of HCC. [14][15][16] Enhanced expression of serum IL6 could increase the risk of HCC development and was associated with early metastasis of HCC. 14 High expression level of IL10 in plasma could predict short survival time for HCC patients. 16 According to these evidences, we can conclude that TIME plays an important role in predicting HCC patients' prognosis. However, there is still a lack of signatures to evaluate TIME based on immune-related genes and predict HCC patients' prognosis as well. In consequence, it is important to find a powerful signature built on immune-related genes for HCC which could predict HCC patients' survival and may provide direction for HCC immunotherapy at the same time.
In this study, an immune-related gene-based risk signature was constructed, which could predict prognosis of HCC patients independently. Then, the correlations between the signature and clinical characteristics were evaluated. And the correlations between the signature and immune cell filtration were also explored. Apart from those, potential regulatory mechanisms of this signature were displayed through bioinformatics analysis.

| Data resource
RNA-sequencing data were contained from the TCGA database (The Cancer Genome Atlas database, https://cance rgeno me.nih.gov/) and the raw count data of 374 HCC samples and 50 nontumor samples were collected. In addition, the full clinical characteristics of corresponding patients were downloaded and extracted. The immune-related genes were obtained from the ImmPort database (https://immpo rt.niaid.nih.gov) and we got 2498 immune-related genes. 17,18 In addition, RNA expression sequencing data and patient clinical information of 232 HCC samples were obtained by ICGC database (https://dcc.icgc.org/) for validation.

| Development of the immune-related gene-based signature
To get immune genes involved in the onset of HCC, we performed differential expression analysis between HCC and nontumor samples, using the edgeR package in R. All differentially expressed genes were extracted according to P value<.05 and the absolute log2 fold change (FC) ≥ 2, and then immune-related genes were screened from these differentially expressed genes. Univariate Cox regression analysis, basing on differentially expressed immune-related genes, was conducted to explore the association between expression levels of these genes and HCC patients' overall survival (OS). Then, genes with statistical significance (P < .05) in univariate Cox regression analysis were included in multivariate proportional hazards regression analysis. Differentially expressed genes that were statistically significant in both univariate and multivariate analyses were considered to be independent prognostic factors for HCC and were chosen to develop immune signature. The immune-related signature was constructed in Cox proportional hazards model method of R package "survival" and then we stratified HCC patients into high-risk and low-risk groups based on the median value of risk scores.

| Validation of the immune-related genebased signature
In order to verify the predictive performance of signatures, we conducted the Kaplan-Meier (K-M) survival analysis through the "survival" package in R to reflect OS of highrisk and low-risk groups. And, the ROC curve, which showed the sensitivity and specificity of the signature for predicting prognosis, was demonstrated with R package of "survival ROC." What's more, expression data from ICGC database was used to conduct external validation. K-M analysis and ROC curve were performed in the ICGC database by using the R package. To further verify the prognostic capabilities of the signature, we calculated and compared the C-index in both TCGA and ICGC database. 19

| Relationship between clinical variables and the signature
We performed the correlation analysis the genes that make up the signature and clinicopathologic factor, including tumor grade, pathological stage, age, gender, Alpha fetoprotein (AFP) quantification, and Hepatitis B virus (HBV) infection. In addition, based on the clinical characteristics and the signature as the included variables, we conducted univariate Cox regression analysis and multivariate Cox regression analysis. Then, the figures were made to present the results of the correlation analysis intuitively.

| Immune cell infiltration analysis
We conducted an analysis about the relationships between the signature and the amount of six kinds of immune cell infiltration downloaded from TIMER (Tumor Immune Estimation Resource). 20 Besides, the correlations between copy number variations of the signature constituent genes and the immune cells are explored at TIMER. The TIMER online database offers abundances of tumor-infiltrating immune cells for multiple tumors, which contains six kinds of cells, including macrophages, neutrophils, B, CD4+T, CD8+T, and dendritic cells.

| Regulatory mechanisms analysis
The Protein-Protein Interaction (PPI) network was displayed based on differentially expressed immune-related genes that are correlated with HCC patients' prognosis via the STRING online database (https://strin g-db.org/). 21,22 Then, we performed functional enrichment analyses to explore potential molecular mechanisms of these immune genes via the GO and KEGG pathways in DAVID website (https://david.ncifc rf.gov/). 23 TFs are vital molecules which regulate the gene expression directly. Therefore, it is essential to explore the potential adjustment ability of TFs to the immune signature genes. The comprehensive list of TFs was collected from Cistrome Cancer database (http://cistr ome.org/) 24 and the regulatory network of genes in the signature and TFs was constructed by Cytoscape. 25 Gene set enrichment analysis (GSEA) (https://pypi.org/proje ct/gseap y/) was used to analyze the GO term of the genes that make up the signature. 26

| Statistical analysis
R version 3.5.1 was used to conduct the statistical analysis of this research. 18 And p value < 0.05 was regarded as statistically significant. 18

| Differential expression analysis
We conducted differential expression analysis and a total of 2062 genes (|logFC|>2, P < .05) were chosen compared gene expressions of HCC tissues with gene expressions of normal tissues. There were 77 down-regulated and 198,5 upregulated genes among those differentially expressed genes ( Figure 1A,B). Then, 116 immune-related genes, including 20 down-regulated and 96 up-regulated genes, were extracted from the differentially expressed genes ( Figure 1C, D).
Through the univariate Cox analysis with these 116 genes, a total of 27 genes that were significantly correlated with HCC patients' survival were extracted, and these genes were considered as factors participating in the occurrence and progression of HCC. To explore the potential mechanisms of these 27 genes, the PPI network based on these 27 genes was constructed using the STRING database, and there were 45 edges and 90 nodes in the PPI network ( Figure 1E). According to the numbers of nodes, FGF13, DKK1, FOS, and SPP1 were the possible core genes among the seven immune-related genes in the PPI network. The GO analysis showed that these 27 prognosis-associated differentially expressed genes were most enriched in "positive regulation of cell migration" in biological processes, "secretory granule lumen" in cellular components, and "receptor ligand activity" in molecular functions ( Figure 1F). And these 27 genes were proved to be actively involved in arginine biosynthesis KEGG according to Functional enrichment analysis ( Figure 1G). Then, the results of univariate and multivariate analysis with the 27 prognosis-related genes were performed, and nine independent prognostic factors (PTHLH, PLXNA3, BIRC5, FOS, DKK1, FGF13, IL11, IL17D, SPP1) of HCC were extracted according to the results of univariate and multivariate analysis ( Table 1). And these nine genes were analyzed by Cox proportional hazards model method of R package "survival" to find the best model. Finally, the seven immune-related genes were chosen and the signature was constructed as follows: What's more, taking the median risk score as the cut-off value, HCC patients were grouped into high-risk and low-risk groups. According to the result of K-M analysis, the higher the risk score calculated by the signature of HCC patients, the worse the prognosis (P < .001, Figure 2A). The 1-year, 3-year, and 5-year ROC curve analysis showed that the area under the curve (AUC) was 0.778, 0.754, and 0.742, respectively ( Figure 2B). These results showed that the signature we constructed has a good predictive effect on the survival of HCC patients. Then, we also depicted the distributions of risk score and survival status, which suggested that there were more deaths in high-risk group ( Figure 2C, D). What's more, heatmap showed the distribution of the expression of seven immune-related genes. Compared with the low-risk group, all the seven genes showed relatively high expression in the high-risk group ( Figure 2E). To verify the predictive power of the immune-related genes-based signature furtherly, 232 HCC patients from ICGC database were used for external validation. And we divided the patients into high risk (n = 116) and low risk groups (n = 116) at cutoff value of median risk score. Consistent with the result in TCGA dataset, K-M analysis showed that patients in the high-risk group had a worse prognosis than those in the low-risk group (P = .007698, Figure 2F). Then, ROC curves were performed and the results showed that the 1-year AUC, 2-year AUC, and 3-year AUC was 0.717, 0.636, and 0.616, respectively ( Figure 2G). There are only two patients with OS longer than 5 years, so the AUC of 5-year ROC curve is not calculated. The C-index was 0.699 and 0.754 in TCGA dataset and ICGC dataset, respectively, suggesting the strong predictive power of the signature.

| Relationship of the immune-related gene-based signature with clinical variables
Afterward, we explored the correlations between clinicopathologic factors and risk scores calculated from the immune-related gene-based signature. The correlations between the seven immune-related genes and clinical characteristics were also assessed. The figures and the table demonstrated that HBV tended to correlated with lower risk scores (P = .037, Figure 3A, Table 2). As Figure 3 showed, higher expression of BIRC5 and DKK1 were related with higher level of AFP and higher level of grade ( Figure 3B,C,D,E). Higher expression of FOS was related with non-HBV and lower level of grade ( Figure 3F,G). Higher expression of IL11 was related with non-HBV and females ( Figure 3H,I). Higher expression of FGF13 was relevant to lower level of AFP ( Figure 3J). And lower expression of IL17D was related with high level of grade ( Figure 3K). The low expression of SPP1 is closely related to the high expression of AFP, young people, and HBV infection ( Figure 3L,M,N).

| Mechanisms of the immunerelated signature
We explored the potential regulatory mechanisms of the seven immune-related genes that made up the signature, which could reflect the regulatory mechanisms of the signature. Based on 315 transcription factors (TF) and seven immune-related genes, we constructed a relevant regulatory network with correlation coefficient greater than 0.3 as cutoff value. The regulatory relationships between these TFs and immune-related genes were revealed in the regulatory network ( Figure 5A). In terms of the figure, we can see that BIRC5, FOS, DKK1, FGF13, IL11, and SPP1 were included in the regulatory network, while there was no regulatory relationship between the TFs and IL17D.
The functional enrichment analysis of the immune-related signature was conducted by GSEA. The top 10 GO terms enriched in high-risk group were demonstrated in Figure 5B, including cellular component morphogenesis, positive regulation of cell differentiation, regulation of anatomical structure morphogenesis, neuron development, neurogenesis, cell morphogenesis involved in differentiation, neuron differentiation, cell part morphogenesis, regulation of cell differentiation and regulation of cell morphogenesis.

| DISCUSSION
HCC is one of the most common cancers worldwide, and the incidence and mortality of HCC have continued to increase. 1 In the last several decades, the treatment of HCC has experienced great improvements, including the breakthroughs of immunotherapy. 6,[27][28][29] Considering the importance of the immune environment in the development of tumors and the close connection between immune environment and immunotherapy, 30 it is crucial to find immune-related biomarkers that can predict the prognosis of HCC patients, which can also provide guidance for immunotherapy research. In this investigation, we constructed a novel signature on the basis of immune-related genes for HCC patients, which may also predict HCC patients' OS independently. Most importantly, we also explored the molecular mechanisms related with the signature using the bioinformatic system and correlations between our signature and immune cell infiltrations to assess immune cells infiltration. These findings indicate that our signature is of great significance for the study of predicting the prognosis of HCC patients and for the exploration of potential immunotherapy targets of HCC.
The signature was constructed by seven immune-related genes that could predict prognosis of HCC patients. In our study, all the seven immune-related genes were proved to be independent prognostic markers for HCC patients. Researches indicated that BIRC5 is overexpressed in various malignant tumors, and high expression level of BIRC5 predicted the poor clinical prognosis of these malignant tumor patients, such as malignant breast cancer, 31 lung adenocarcinoma, 32 neuroblastic malignant tumor, 33 and malignant esophageal cancer. 34 The view consistent with our findings is that BIRC5 is highly expressed in HCC tissue and can predict the prognosis of HCC patients, indicating that BIRC5 could regulate the occurrence and development of HCC. 35 What's more, Yu Bin et al suggested that DKK1 was overexpressed in HCC tissues for the first time and they demonstrated that DKK1 overexpression is closely related to nuclear/cytoplasmic β-catenin accumulation in HCC tissues. 36,37 Studies have shown that the absence of specific c-Fos in HCC cells can prevent dEn-induced HCC, while liver-specific c-Fos Expression can lead to cell carcinogenesis and enhanced carcinogenesis of dEn. 38 Meanwhile, it has been reported that IL-11 could promote the progression of colorectal cancer, breast cancer and gastric cancer. 39 It was also showed that IL11could participate in the regulation pathways in HCC recurrence and  In addition, SPP1 may also affect the immune escape and malignant biological behavior of tumor cells by regulating EGFR activation. 42 Overexpression of SPP1 promotes the development and metastasis of HCC effectively, and may become a underlying target for the treatment of metastatic HCC. 43,44 SPP1 is a key factor that mediates the polarization process of macrophages, and could provide new ideas for tumor immunotherapy strategies. 45 At present, there is no research on FGF13 in HCC. From previous studies, exploring the function and mechanism of these seven genes could facilitate the development of new strategies for HCC treatment.
In our study, we systematically evaluated the correlation between the signature and clinicopathological factors. The results showed that patients with HBV infection tended to get high immune risk score. And it can be further confirmed that the immune-related risk signature, AFP and HBV are independent predictors for prognosis of HCC patients. In addition, the significant relationship between the signature and OS can also suggest that the differences in immune infiltration of HCC may reflect HCC patient's OS. Therefore, it could be a promising predictor for OS of HCC patients by combining the signature and other factors in the future.
Furthermore, the underlying molecular mechanisms of the immune-related gene-based signature were also investigated. The TF-based regulatory network was constructed to guide mechanism analysis of immunotherapy in the future. Among these TFs, HIF1A induces the up-regulation of PD-L1 under hypoxic conditions and it could bind to the transcriptionally active hypoxia response element (HRE) in the PD-L1 proximal promoter, which means that some TFs could play key role in tumor immunotherapy. 46 The TIME is an important aspect of the study of cancer biology and play a vital role for the occurrence, development and treatment responses of tumors. 47,48 The components of the immune system, including cells and molecules, are the basic components of TIME. Therefore, it is necessary to illustrate the relationships between HCC immune cell infiltration and the signature to display the status of HCC TIME. According to our results, high-risk patients tended to related with higher content of B cells, CD8+T cells, dendritic cells, neutrophils and macrophages. In HCC patients, the higher the risk score calculated according to the signature, the higher the amount of immune cell infiltration. And the expression and role of immune cells in HCC have also been reported in previous studies. Zheng C et al revealed the detailed characteristics of infiltrating T cells in HCC from multiple aspects, including clustering, dynamics and developmental trajectory, and the results showed that Tregs and exhausted CD8+T cells are most abundantly expressed in HCC. 49 Liver tumor infiltration lymphocytes was dominated by CD8+T cells in the tumor tissues and CD4+T cells were higher than CD8+T cells in peritumoral area. 50 Shalapour S demonstrated that suppression of CD8+T cells played a vital role in driving oncogenesis through IgAproducing cells for patients with HCC. 51 In addition, the high expression of tumor infiltrating T cell genes is associated with a better prognosis of anti-PD-1 monotherapy. 52 However, the effects and roles of immune cells in HCC have not been fully explained. Furthermore, we also explored the relationships between copy number variations of the seven immune-related genes (BIRC5, FOS, DKK1, FGF13, IL11, IL17D and SPP1) and immune cell infiltrations. And it suggested that high amplication of BIRC5, FGF13, IL11, IL17D and SPP1 were more likely correlated with immune cell infiltration. Our preliminary results could provide an orientation to explore the relationships between immune cells and HCC and thorough studies are still necessary.
Recently, we have seen successful clinical results which have promoted many relevant studies and clinical trials, including HCC immunotherapy studies. Immunotherapies targeting programmed death-1 (PD-1) and cytotoxic T-lymphocyte-associated antigen 4 (CTLA-4) are becoming effective novel treatment strategies for HCC. 28,29,53 A recent phase II trial has suggested that pembrolizumab (anti PD 1) could be a treatment option for advanced HCC. 29 In addition, some studies have revealed that tremelimumab alone and in combination with ablation show positive effect on HCC treatments. 27,53 These findings suggest an important role for TIME and the efficacy of immunotherapy for HCC. Some studies have shown that immune-related elements, such as tumor-associated macrophages, dendritic cells and B cells, have impact on survival outcomes of patients with HCC. [54][55][56] Xiao gang et al suggested that up-regulated FOS promotes the expression of PD-1 and the disruption of binding of FOS to the AP-1-binding site in the gene encoding PD-1 promoter could enhance antitumor function of T cell. 57 In addition, chronic lymphocytic leukemia (CLL) cells with down-regulation of CTLA4 exhibit FOS phosphorylation, and the microenvironment-controlled CTLA4 expression mediates the proliferation/survival of CLL cells by regulating the expression/activation of FOS. 58 As a result, with the progress of HCC immunotherapy, the signature and the relationships between the signature and immune cells we have explored could provide further research ideas.
In summary, our study constructed a seven immune-related gene-based risk signature and it could predict the prognosis of HCC patients independently. And the signature could reflect several aspects of the HCC microenvironment, which provides potential directions for research on HCC immunotherapy. Nevertheless, the immune-related genes-based signature needs to be applied in clinical patients to further verify its prognostic predictability.