A novel ferroptosis phenotype‐related clinical‐molecular prognostic signature for hepatocellular carcinoma

Abstract Ferroptosis is a newly identified cell death mechanism and potential biomarker for hepatocellular carcinoma (HCC) therapy; however, its clinical relevance and underlying mechanism remain unclear. In this study, transcriptome and methylome data from 374 HCC cases were investigated for 41 ferroptosis‐related genes to identify ferroptosis activity‐associated subtypes. These subtypes were further investigated for associations with clinical and pathological variables, gene mutation landscapes, deregulated pathways and tumour microenvironmental immunity. A gene expression signature and predictive model were developed and validated using an additional 232 HCC cases from another independent cohort. Two distinct ferroptosis phenotypes (Ferroptosis‐H and Ferroptosis‐L) were identified according to ferroptosis gene expression and methylation in the patients with HCC. Patients with the Ferroptosis‐H had worse overall and disease‐specific survival, and the molecular subtypes were significantly associated with different clinical characteristics, mRNA expression patterns, tumour mutation profiles and microenvironmental immune status. Furthermore, a 15‐gene ferroptosis‐related prognostic model (FPM) for HCC was developed and validated which demonstrated accurate risk stratification ability. A nomogram included the FPM risk score, ECOG PS and hepatitis B status was developed for eventual clinical translation. Our results suggest that HCC subtypes defined by ferroptosis gene expression and methylation may be used to stratify patients for clinical decision‐making.


| INTRODUC TI ON
Hepatocellular carcinoma (HCC) is a major type of primary liver cancer, accounting for ≥90% of all cases. 1 Mortality associated with HCC remains the fourth leading cause of death among the common malignancies, and its incidence ranks fifth, with a consistently rising trend. 2,3 Although great strides have been made in HCC diagnosis and therapy, treatment options remain limited. Only early-stage HCC patients with resectable tumours can undergo hepatectomy.
For most HCC patients with unresectable tumours, sorafenib is considered as the only first-line systemic chemotherapeutic drug. 4,5 However, high recurrence rates and drug resistance remain a problem for patients with HCC, with the prognosis still poor. Thus, the identification of novel therapeutic targets and development of a prognostic model to strategize for customized therapy are urgently needed for patients with HCC.
The hallmarks of cancer include cell death resistance 6 and numerous efforts have been made to employ programmed cell death in cancer therapy. Recently, a new way of programmed cell death involving iron-dependent lipid peroxidation was defined as ferroptosis, which is genetically and biochemically distinct from other well-understood forms of regulated cell death, such as apoptosis and programmed necrosis. 7 Previous studies have identified that small-molecules including erastin could induce ferroptosis by inhibiting system X c -. 8 Later on, other chemical compounds such as sorafenib were found to activate ferroptosis in HCC, which provided the potential for pharmaceutically inducing ferroptosis in patients with HCC. 9,10 Ferroptosis has been morphologically characterized by a series of mitochondrial changes, including small and condensed mitochondria, disappearance of the inner crista and outer membrane rupture of the mitochondria. 11,12 The molecular mechanisms underlying this lethal procedure include iron-dependent lipid metabolism, which further causes an increase in reactive oxygen species and peroxidation products, as its name suggests. 13 Similar to other cell death procedures, ferroptosis is mostly inhibited in tumour cells to maintain their survival. Therefore, modifying the ferroptosis status in patients with HCC may strongly affect their prognosis.
Nevertheless, the correlation between ferroptosis and HCC prognosis remains obscure, and the underlying molecular mechanism remains unclear.
The present study identified two distinct ferroptosis phenotypes of HCC (Ferroptosis-H and Ferroptosis-L) based on the signature of mRNA expression and methylation of ferroptosis-related genes. We compared the prognosis of these subgroups and found that Ferroptosis-H had a worse prognosis. We further investigated the discrepancy in mRNA expression patterns, tumour mutation frequency and immune status. To elucidate the molecular mechanism of ferroptosis in HCC, we identified several prognosis defining genes and investigated their correlation with ferroptosis-related genes.
Furthermore, we developed a robust ferroptosis-related prognosis prediction model and constructed a nomogram that may assist clinical decision-making.

| Data acquisition and preprocessing
Using the TCGA data portal (https://portal.gdc.cancer.gov/), we downloaded a series of transcriptome data, methylation status data and mutation status data from 374 patients with HCC. The clinical information of these patients was also retrieved. Additional gene expression files from 232 HCC cases were extracted from the LIRI-JP project in the International Cancer Genomics Consortium (ICGC; https://icgc. org/) database. All transcriptome data were Log 2 -transformed for subsequent analysis. For the DNA methylation data, each CpG site was annotated and a beta-value ranging from 0 to 1 was used to evaluate the methylation status of each CpG site. The average of CpGs in the promoter region of a gene was used to evaluate the methylation level of that gene. Samples that lacked important clinicopathological variables or survival information were excluded from the final analysis.

| Clustering of ferroptosis-related genes
Forty-one ferroptosis-related genes were identified using the Kyoto Encyclopedia of Genes and Genomes (KEGG; https://www.kegg. jp/). The promoter methylation status and mRNA expression levels of these 41 genes in patients with HCC from TCGA were screened for their association with overall survival (OS). Genes with statistically significant correlation between methylation status or mRNA expression level and OS (P < .05) were extracted for further clustering analysis to identify different ferroptosis phenotypes.

| Differentially expressed gene identification and functional enrichment analysis
Each gene was measured for its adjusted p-values between two ferroptosis phenotypes using the false-discovery rate (FDR) method.
Differentially expressed genes (DEGs) between two distinct ferroptosis phenotypes from TCGA were identified with the criteria FDR<0.05 and |log 2 fold-change| >1 between two ferroptosis phenotypes. DEGs were further analysed for functional and pathway enrichment. To explore the potential mechanism and corresponding biomarkers underlying two ferroptosis phenotypes involved in HCC, the c2.cp.kegg. v7.1.symbols.gmt and h.all.v7.0.symbols.gmt annotated gene set files were selected as the reference gene set in the GSEA analysis and the threshold for significance was set at P < .05, with the FDR<0.25.

| Tumour mutation analysis
By using the maftool package in R software, 14 genes with the highest tumour mutation frequency (TMF) in patients with HCC from TCGA were analysed and visualized. We further performed at comparative analysis between two ferroptosis phenotypes in TCGA and identified genes with significantly different TMF (P-value<.05).

| Evaluation of immune status
The RNA-seq data from TCGA were extracted to evaluate immune checkpoint-related gene expression. Microenvironmental immune infiltration and stromal status were evaluated using the ESTIMATE algorithm, which provides a score of immune infiltration level, tumour purity and stromal content based on expression data. 15 Based on the LM22 signature matrix, the CIBERSORT algorithm deconstructed the proportion of each human immune cell subtype in each patient. 16 The proportion of 22 immune-related cell types in different subtypes was visualized, and by running the Wilcoxon rank-sum test, the difference in immune cell subtype proportions in the two ferroptosis-related subtypes was inspected. Moreover, the ferroptosis-specific gene expression patterns in individual tumour samples were transformed into a ferroptosis score using ssGSEA, and its relation with the proportion of each immune cell was determined by Spearman's test.

| LASSO penalty regression and co-expression network construction
We included the previously identified DEGs and significantly different TMF genes combined with the ferroptosis-related genes defined by KEGG for further analysis. The LASSO-logistic regression analysis was performed based on the expression level to identify key DEGs between the two ferroptosis phenotypes. Furthermore, by excluding the ferroptosis genes from key DEGs, we designated the remaining genes as prognosis defining genes. Moreover, we conducted a correlation test between the 41 ferroptosis-related genes and the prognosis defining genes. The genes with an absolute value of correlation coefficient >0.3 and P-value < .001 were plotted using Cytoscape (https://cytos cape.org/). The ferroptosis co-expression network with P-value <.001 and the absolute value of correlation coefficient >0.4 was illustrated using a Sankey diagram.

| Construction and validation of a ferroptosisrelated prognostic model
The expression levels of the 41 ferroptosis-related genes and previously identified prognosis defining genes with corresponding survival information were extracted from the TCGA and ICGC data sets for further analysis. However, the multivariat Cox regression analysis is unsuitable for highly correlated genes. To address this problem, the prognostic model in this study was constructed using the Lasso Cox regression analysis. 17 By proportional size of genes, the shrinkage of the regression coefficient eliminated genes with zero regression coefficients and reserved genes with a nonzero weight. Finally, a ferroptosis-related prognostic model (FPM) was developed. The risk score of FPM was calculated for each HCC patient in the TCGA and ICGC cohorts as the sum of the multiplied regression coefficient (β) by its expression level of genes retained in the FPM model [FPM risk score = ∑ EXP (mRNA) * β]. The cut-off value of the FPM risk score for risk stratification was defined as the median value in TCGA.
Patients with HCC in both TCGA and ICGC were stratified into highor low-risk subgroups separately. The log-rank test was used to compare the OS between these two subgroups in each cohort. To assess the performance of the FPM in predicting OS, time-dependent ROC was performed in each independent cohort.

| Construction and evaluation of the nomogram
Univariate Cox regression was used to evaluate the association of clinicopathological variables as well as the FPM risk score with OS in the TCGA cohort. Variables with a P-value < .001 were included in the downstream multivariate Cox regression analysis. The nomogram to predict the likelihood of OS was constructed by independent risk factors identified by multivariate Cox regression analysis (P < .05). The performance of the nomogram for predicting the OS was further evaluated by time-dependent ROC method and calibration plots. The consistency of nomogram-predicted probabilities and observed rates was evaluated using the concordance index (C-index).

| Ferroptosis-related gene profiling identifies two HCC subtypes with different prognosis
Of the 41 ferroptosis-related genes, 15 genes were identified for the expression level correlated with OS of HCC patients and 8 genes were screened out for the methylation status associated with OS of HCC patients ( Figure S1). The unsupervised consensus cluster results showed that based on the gene expression level and methylation status, patients with HCC were divided into two distinct groups (Ferroptosis-H and Ferroptosis-L) with different ferroptosis phenotypes ( Figure 1A,B).
The Kaplan-Meier method was used to investigate the relationship between the ferroptosis phenotypes and prognosis of patients with HCC. The results showed that the Ferroptosis-H phenotype had a worse OS than the Ferroptosis-L phenotype (median OS: 3.11 vs. 6.93 years, P < .001) ( Figure 1C). Moreover, we investigated the association between disease-specific survival (DSS) and the ferroptosis phenotype. The results showed that the Ferroptosis-H subgroup had poorer DSS than the Ferroptosis-L subgroup (median DSS time: 4.63 vs. 6.96 years, P = .00062) ( Figure 1D). These results indicated that the Ferroptosis-H phenotype had a worse prognosis.
Furthermore, we compared the clinicopathological characteristics between the two ferroptosis subgroups and found that patients with HCC in the Ferroptosis-H group had advanced T stage (P = .001), higher vascular invasion (P < .001), higher level of serum AFP (P = .015), and less differentiated tumours (P < .001) (Table 1, Figure 1B).

| Analysis of mutations in HCC patients with different ferroptosis phenotypes
We illustrated the gene mutation patterns of HCC patients based on ferroptosis H and L status using exome-seq data from TCGA with a waterfall plot, which illustrates genes with high mutation frequencies, mutation types and mutation distribution across genes in each patient ( Figure 3A). Statistically, 51 genes were found to have different mutation frequencies (P < 05) between the two phenotypes ( Figure 3B), among which TP53 had the highest mutation frequency (29%) and the largest difference between the two phenotypes (42% in Ferroptosis-H vs. 21% in Ferroptosis-L).
The location and rate of each mutation in TP53 were displayed in the lollipop plot ( Figure 3C). Besides TP53, other tumour suppressor genes such as C10orf90, FAT3 and TSC2 were identified as having significantly higher mutation rates in the Ferroptosis-H group. These findings indicated that the poor prognosis of the Ferroptosis-H group may result from the higher tumour mutation frequencies in the tumours.

| The landscape of tumour immune status in patients with different ferroptosis phenotypes
The microenvironmental immune and stromal status conducted by the ESTIMATE algorithm showed that the stromal status was not significantly different between the two groups, whereas the Ferroptosis-H group had a higher immune score and lower tumour purity ( Figure 4A)

| Identification of the most distinctive genes that define ferroptosis-related prognosis
To identify genes that best distinguished the Ferroptosis-H and  Table S1.

| Development and validation of the ferroptosisrelated prognostic model
To develop a reliable predictive model for patient survival, we per-  Figure 6A,B). The regression coefficients of the genes used for the risk score calculation are listed in Table S2.
Based on the FPM risk score of each HCC patient, the TCGA cohort was divided into high-or low-risk subgroups with the median value as the cut-off level (cut-off risk score = 3.2975). As shown in the KM curve, a significantly worse OS was found in the high-risk subgroup than among its low-risk counterparts (P < .001). We further

| Establishment of a nomogram based on the model
Univariate Cox regression analysis was performed to assess the association between OS and FPM risk score combined with clinicopathological variables (age, sex, family history of cancer, alcohol consumption, TNM classification, Child-Pugh grade, ECOG performance status, vascular invasion status, surgical margin status, AFP, hepatitis B, hepatitis C, albumin, liver fibrosis Ishak score, platelet count and post-operative radiotherapy) ( Figure 7A).
By including variables with a P-value < .001 in further multivariate Cox regression analyses, the FPM risk score was shown to be an independent prognostic factor for OS, with a hazard ratio of 4.101 (95%CI: 2.597-6.474; P < .001) ( Figure 7B). We further constructed a nomogram with the FPM risk score, ECOG PS and hepatitis B. These factors were confirmed as independent prognostic factors through the multivariate Cox analysis ( Figure 7C).
The nomogram showed that the FPM risk score acted as the greatest and major weight factor in this scoring system. We further evaluated the predictive performance of this nomogram using the ROC curve and the result showed a higher AUC (1-, 3-and 5 years OS was 0.842, 0.801 and 0.799, respectively) than the FPM risk score alone ( Figure 7D). The calibration plots indicated that this nomogram had a strong agreement between predicted survival probabilities and actual observed outcome in which the C-index at 1-, 3-and 5 year OS was 0.798, 0.767 and 0.764, respectively ( Figure 7E).

| D ISCUSS I ON
Successful evasion of regulated cell death is a major hallmark of cancer. 19 Numerous efforts have been made to induce various kinds of programmed cell death in cancer to achieve therapeutic purposes. 20 and ARID1A, with a significant difference. TP53 is a tumour suppressor gene in many cancers and is largely involved in various types of cell death procedures including apoptosis, necrosis and autophagy. 25,30,31 Activation of TP53 was also found to be essential for the ferroptosis process through direct transcriptional inhibition of SLC7A11. 32 Our results showed that the group with a higher mutation frequency on TP53 was also related to a higher expression of SLC7A11. It is important to note that SLC7A11 and TP53 were part of the 15-gene signature found to be highly predictive for the out- In addition, we also investigated the landscape depicted by the which thus led to a poor prognosis.
After observing the poor prognosis in the Ferroptosis-H group, we selected the key genes to represent genes with differential expression levels or with variant mutation frequency using the Lasso logistic method. Next, the correlation between 41 ferroptosis genes and 43 ferroptosis prognosis defining genes was analysed. These defining genes were differentially expressed in the ferroptosis-related subgroups and were also closely related to ferroptosis-related genes.
However, their relationship with the ferroptosis pathway has not been investigated. They may interact with ferroptosis-related genes and affect the prognosis of patients with HCC in an unrecognized manner. These findings warrant further investigation to untangle the complex interactions.
Next, FPM was developed based on 15 genes using the TCGA cohort to classify the prognosis of patients with HCC, and its robustness was verified using the ICGC cohort. The risk score may provide potential assistance for therapeutic decision-making. HCC. Nevertheless, our study has some limitations. First, our study had a retrospective design, and a more persuasive prospective study is needed to validate our results. Second, the molecular understanding based on functional experiments between ferroptosis-related genes and risk genes should be validated to accelerate the understanding of the ferroptosis in the future. Finally, the model based on RNA-seq may limit clinical application due to its high cost.
F I G U R E 6 Construction and validation of FPM. (A) Selecting the best parameters for FPM in the LASSO model (λ). (B) LASSO coefficient spectrum of 84 genes enrolled and generate a coefficient distribution map for a logarithmic (λ) sequence. In the TCGA cohort, the FPM risk score of each patient was plotted and the high-risk and low-risk group were divided using median cut-off value (C). The distribution of survival state (D), survival curve (E) and ROC curve (F) for the high-risk and low-risk group in the TCGA cohort were showed. The same cut-off value of the FPM risk score was deployed in the ICGC cohort for risk stratification (G), and the survival state (H), survival curve (I) and ROC curve (J) of the high-risk and low-risk group in the ICGC cohort were showed

ACK N OWLED G EM ENT
This study was supported by grants from the National Natural Science Foundation of China (81772628, 81703310). The results shown here are in whole or part based upon data generated by the TCGA Research Network (https://www.cancer.gov/tcga), as well as ICGC (https://daco.icgc.org/).

CO N FLI C T O F I NTE R E S T
The authors declare that there is no conflict of interest.

DATA AVA I L A B I L I T Y S TAT E M E N T
The data used to support the findings of this study are available from the corresponding author upon request.