A prognostic risk model based on immune‐related genes predicts overall survival of patients with hepatocellular carcinoma

Abstract Background and aims Hepatocellular carcinoma (HCC) is one of the most common heterogeneous tumors that occurs after chronic liver diseases and hepatitis virus infection. Immune‐related genes (IRGs) and their ligands regulate the homeostasis of tumor microenvironment, which is essential for the treatment of HCC and its prognosis. This study aimed to investigate the clinical value of IRGs in predicting the prognosis of HCC. Methods We downloaded RNA‐seq data and clinical information from TCGA database. Samples were randomly divided into training cohort and testing cohort. The “limma” R package was performed to identify differentially expressed IRGs (DEIRGs) between HCC group and normal group. Prognostic DEIRGs (PDEIRGs) were obtained by univariate Cox analysis. LASSO and multivariate Cox analysis were used, and a prognostic risk model was constructed. In order to better demonstrate the clinical value of our model in predicting overall survival rate, a nomogram was constructed. To further investigate the molecular mechanism of our model, gene set enrichment analysis (GSEA) was performed. Results Compared with the low‐risk group, the high‐risk group had a significantly worse prognosis. Moreover, our prognostic risk model can accurately stratify tumor grade and TNM stage. Importantly, in our model, not only immune checkpoint genes were well predicted, but also human leucocyte antigen‐I molecules were revealed. GSEA suggested that “MAPK signaling pathway,” “mTOR signaling pathway,” “NOD like receptor signaling pathway,” “Toll like receptor signaling pathway,” “VEGF signaling pathway,” “WNT signaling pathway” had significant correlations with the high‐risk group. Conclusion Overall, our study showed that our prognostic risk model can be used to assess prognosis of HCC, which may provide a certain basis for the survival rate of patients with HCC.


| INTRODUCTION
Hepatocellular carcinoma (HCC) is caused by chronic hepatitis, cirrhosis, and liver fibrosis. The vast majority of patients, including those who exceed Milan Criteria, can only receive palliative care, with low long-term survival. 1 It is worth noting that the prognosis of patients with bile duct metastasis and intrahepatic hematoma is not optimistic, 2 which may ultimately contribute to the development and treatment of HCC. Previous studies have shown that tumorinfiltrating immune cells were highly relevant for prognosis and identification of immunotherapy targets in HCC. 3 Therefore, identification of prognostic differentially expressed immune-related genes (PDEIRGs) is of great significance for improving the prognosis, evaluating therapeutic effect and overall survival (OS). However, the risk assessment of IRGs in prognosis of HCC is rarely explored and further analysis is needed.
In recent years, tumor immunotherapy has received more and more attention. The success of immunotherapy strategies such as immune checkpoint (ICI) blockade in several tumors has established the role of immunotherapy. 4 Immunotherapy can be broadly divided into ICI therapies and adoptive cell therapies (ACTs), of which ICIs mainly function through receptor/ligand recognition, 5,6 while ACTs involve the infusion of pathogen-specific T cells from a donor to recipient. 6 IRGs play a crucial role in regulating receptor/ligand activity in ICIs treatments. 5 Therefore, IRGs may be used as a reference for sensitivity indexes to tumor immunotherapy and perform personalized treatment.
At present, tumor immunotherapy for HCC has achieved remarkable progress. Cytotoxic T lymphocyte-associated antigen-4 (CTLA- 4) and programmed death-ligand 1 (PD-L1) inhibitors have effectively prolonged the OS of patients with advanced HCC (including distant metastasis). 7 However, some HCC patients were not sensitive to ICIs, which may be due to the abnormal expression of IRGs. 6 Identification of PDEIRGs may be helpful for implement individualized treatment and evaluation of prognosis in HCC patients. In this study, we constructed a prognostic risk model based on PDEIRGs and demonstrated that our prognostic risk model has an important role in predicting the prognosis of HCC patients and contributes to individualized therapy at least to a certain extent.

| Sample information
RNA sequence data and clinicopathological information of HCC patients were obtained through The Cancer Genome Atlas (TCGA) database (https://cancergenome.nih.gov/), and the RNA-seq data and clinical information were matched according to patients' ID. This study met the publication guidelines stated by TCGA. All data used in the study was obtained from TCGA, hence ethics approval and informed consent were not required. 8 IRGs and transcription factors (TFs) terms were downloaded from the ImmPort database (https://www.immport. org/home) 9 and Cistrome project (http://cistrome.org/), 10 respectively.

| Construction of the prognostic risk model
DEIRGs were identified by Wilcoxon test, and cut-off value was set to false discovery rate (FDR) <0.05, |log2 fold-change (FC)| >2. To improve reliability of our prognostic risk model, HCC patients (N = 370) were randomly divided into training cohort (N = 185) and testing cohort (N = 185; Table 1). PDEIRGs which used to construct our prognostic risk model were identified by LASSO and multivariate Cox analysis. The risk score was calculated by mRNA expression and estimated regression coefficients, and our prognostic risk model was validated with testing cohort and entire TCGA cohort. First, PDEIRGs were identified by univariate Cox analysis, then LASSO analysis was used to prevent the model from overfitting. Finally, multivariate Cox analysis was used to construct a prognostic risk model.

| Risk score calculation
To calculate risk score of each HCC patient, we calculated the estimated regression coefficients by multivariate Cox analysis. Patients were divided into high/low-risk groups based on the risk score. 11 The following computational formula was used for this analysis: gene i represents the ith gene, and coefficient of (gene i) represents the estimated regression coefficient of the ith gene.

| Kyoto Encyclopedia of genes and genomes enrichment analysis
In order to explore the potential immune molecular mechanisms and immune pathways underlying our prognostic risk model, we conducted gene set enrichment analysis (GSEA) to find enrichment items predicted to correlated with Kyoto Encyclopedia of genes and genomes (KEGG) pathways. Family-wise error rate (FWER) P < .01 and FDR q < .01 were considered statistically significant. "c2.cp.kegg.
v7.0.symbols.gmt" were applied in GSEA analysis. 16 rank test was used to analyze the survival outcomes between the high/low-risk groups using the R package "Survival" and "Survminer," Wilcoxon rank sum test was used in the training/entire TCGA cohorts and log-rank test was used in the testing cohort. Multivariate Cox analysis was used to identify whether our prognostic risk model could be used as an independent prognostic factor for the prognosis of HCC. Time-dependent ROC analysis was used to evaluate the accuracy of our prognostic risk model. 18,19 3 | RESULTS

| Expression of IRGs in HCC
The clinical information of 377 HCC patients were shown in Table S1.
The mRNA expression of 2498 IRGs in HCC tissues and adjacent tissues was examined. As shown in Figure 1A, compared with adjacent tissues, there were 116 DEIRGs in HCC tissues, among which mRNA of 96 genes were found to be significantly up-regulated, while that of 20 genes were down-regulated. In order to study the predictive value of DEIRGs in HCC, univariate Cox analysis was performed. As shown in Figure 1B, of the 116 DEIRGs, 19 genes (PDEIRGs) were significantly associated with OS of the HCC patients.

| Construction of TFs-regulatory network
To further explore PDEIRGs involve in regulating network, the relationship between PDEIRGs and differentially expressed TFs (DETFs) was analyzed. Compared with adjacent tissues, there were 31 DETFs in HCC tissues ( Figure 1C). Then, the correlation between 31 DETFs and 19 PDEIRGs were detected (correlation coefficient >0.3 and P < .05), showing that there was a significant correlation between 19 DETFs and 12 PDEIRGs. Furthermore, "Cytoscape" software was performed to construct a TFs-regulatory network to reveal a direct correlation ( Figure 1D).

| Construction of the four-PDEIRG-based prognostic risk model
Among 377 HCC patients, seven of them belonged to the same samples with different order numbers, so they were excluded. A total of 370 patients were randomly separated into a training cohort (N = 185) and testing cohort (N = 185). The baseline characteristics were summarized in Table S2. In order to study the predictive value of PDEIRGs in HCC, LASSO-modified Cox analysis was carried out in the training cohort to further narrow the scope of PDEIRGs, thereby determining the risk genes suitable for constructing the prognostic risk model (Figure 2A, Table 2). BIRC5, PLXNA3, FGF13, and GAL were selected for subsequent analysis ( Figure 2B). We calculated a risk score of each HCC patient based on the mRNA expression and regression coefficients of four genes. The following computational formula was used for this analysis: Risk score = 0.024 × BIRC5 expression+0.139 × PLXNA3 expression+0.213 × FGF13 expression +0.144 × GAL expression. It is worth noting that the regression coefficient of BIRC5 is weak, but significant, indicating that even though its regression coefficient is weak, it does affect the prognosis of HCC.
We then calculated the risk score for each HCC patient and used the "Survminer" R package to find the optimal cut-off for the risk score.
According to the risk scores of the patients, the patients in the training cohort were divided into the high-and the low-risk group. Kaplan-Meier curve and the time-dependent ROC were used to evaluate the  Using the optimal cut-off calculated from the training cohort, we divided the testing cohort and entire TCGA cohort into two risk groups. In two different cohorts, the OS was lower in the high-risk group than which in the low-risk group ( Figure 3A,D). AUC values at 1-, 3-, 5-year were 67.7%, 65.9%, and 66.6% in the testing cohort, while 70.0%, 65.2%, 63.0% in entire TCGA cohort ( Figure 3B,E). To study patients' risk in two cohorts, we plotted risk curves, ranked risk scores, and analyzed distribution ( Figure 3C,F).

| Evaluation of independent prognostic value of our prognostic risk model
In order to determine whether our prognostic risk model could be used as an independent predictive factor, we used univariate and multivariate Cox analysis. Cox analysis showed that risk score calculated from our prognostic risk model was associated with the patients' OS ( Figure 4A). To evaluate the accuracy of the risk score in predicting the survival status of HCC patients, the ROC curve of clinical parameters was plotted ( Figure 4B). These results indicated that risk score can accurately reveal prognosis and may be more accurate than other clinical parameters.

| Construction and validation of a predictive nomogram
We then used five independent prognostic factors including age, gender, TNM stage, tumor grade, and risk score to establish a nomogram to predict the 3-year OS of HCC patients ( Figure 4C). As shown in Figure 4D, the calibration plot showed that the nomogram (combined

| Correlation between our prognostic risk model and immune genes expression
In the process of new antigen presentation and T cell lysis, the key step is controlled by HLA-I, which presents intracellular polypeptides on the cells surface for T cell receptor recognition.
Down-regulation of HLA-I may reduce antigen presentation and promote immune escape, which is prevalent in a series of cancers and is associated with poor prognosis. 20,21 As shown in Figure 6,  HLA-DQB2, HLA-DRA, HLA-DRB1, and HLA-DRB6 were higher in the high-risk group.
Tumor escape from the surveillance of immune system by multiple ways, in which controlling access of ICIs is an important process of tumor immune escape. 22 At present, CTLA-4 and PD-L1 are the two important routes for tumor immune escape. The mechanism of tumor treatment by immunosuppression is to inhibit the activation of ICI pathways and avoid T cell inactivation, so as to enhance the anti-tumor immune activity. 23 In our study, we evaluated three key ICIs: PD-L1, CTLA-4, and TIM-3. We found that our prognostic risk model was positively related to them, suggesting that our model may be used for evaluation and measurement of response to ICIs in HCC (Figure 7).

| Prognostic risk model mediated multiple immune-related pathways
In order to explore the underlying molecular mechanisms and the signaling pathways of our prognostic risk model, we performed GSEA to compare the high-risk group and the low-risk group in HCC. KEGG enrichment suggested that "MAPK signaling pathway," "mTOR signaling pathway," "NOD like receptor signaling pathway," "Toll like receptor signaling pathway," "VEGF signaling pathway," "WNT signaling pathway" had significant correlations with the high-risk group ( Figure 8A).

| DISCUSSION
ICIs, such as PD-L1, play an important role in the treatment of HCC. 24 However, some patients are not sensitive to ICIs, and even worsen cases. 33 Furthermore, HLA-I is expressed in at least 10 human hepatoma cell lines, 34

FUNDING
This work is supported by the National Natural Science Foundation of China (81970753). The funder played no role in study design; collection, analysis, and interpretation of data; writing of the report; or the decision to submit the report for publication.

CONFLICT OF INTEREST
The authors declare there is no conflict. Wei Li had full access to all of the data in this study and takes complete responsibility for the integrity of the data and the accuracy of the data analysis.

TRANSPARENCY STATEMENT
Wei Li affirms that this manuscript an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that any discrepancies from the study as planned (and, if relevant, registered) have been explained.
F I G U R E 8 Several immune-related pathways were up-regulated via our prognostic risk model. A, These immunerelated pathways include MAPK signaling pathway, mTOR signaling pathway, NOD like receptor signaling pathway, Toll like receptor signaling pathway, VEGF signaling pathway, WNT signaling pathway