Establishment of a prognostic ferroptosis‐related gene profile in acute myeloid leukaemia

Abstract Acute myeloid leukaemia (AML) is a heterogeneous disease with a difficult to predict prognosis. Ferroptosis, an iron‐induced programmed cell death, is a promising target for cancer therapy. Nevertheless, not much is known about the relationship between ferroptosis‐related genes and AML prognosis. Herein, we retrieved RNA profile and corresponding clinical data of AML patients from the Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) databases. Univariate Cox analysis was employed to identify ferroptosis‐related genes significantly associated with AML prognosis. Next, the least absolute shrinkage and selection operator (LASSO) regression was employed to establish a prognostic ferroptosis‐related gene profile. 12 ferroptosis‐related genes were screened to generate a prognostic model, which stratified patients into a low‐ (LR) or high‐risk (HR) group. Using Kaplan‐Meier analysis, we demonstrated that the LR patients exhibited better prognosis than HR patients. Moreover, receiver operating characteristic (ROC) curve analysis confirmed that the prognostic model showed good predictability. Functional enrichment analysis indicated that the infiltration of regulatory T cells (Treg) differed vastly between the LR and HR groups. Our prognostic model can offer guidance into the accurate prediction of AML prognosis and selection of personalized therapy in clinical practice.

lacking. 1,4,5 Hence, additional factors for risk stratification should be identified and combined with analyses of cytogenetic and molecular abnormalities to generate a more precise AML prognostic system. Iron is indispensable for multiple biological processes, including mitochondrial respiration, immune surveillance, cellular proliferation and metabolic activity. 6,7 However, massive Iron accumulation in the body can harm cells by inducing membrane lipid peroxidation termed ferroptosis, which is a novel iron-induced programmed cell death. 8 Presently, ferroptosis has gained worldwide attention as a potent therapeutic target for cancer therapy. Moreover, several reports have identified genes that modulate AML via ferroptosis. 9,10 However, the relationship between ferroptosis-related genes and AML prognosis is currently unknown.
In this report, we retrieved the RNA profile and matching clinical data of AML patients from the Gene Expression Omnibus (GEO) and Cancer Genome Atlas (TCGA) databases. Next, we established a prognostic ferroptosis-related gene profile in the training cohort (GSE37642-96). The TCGA-LAML and GSE37642-570 cohorts were selected as the external and internal validation cohorts, respectively.
Additionally, Gene Set Enrichment Analysis (GSEA) and singlesample gene set enrichment analysis (ssGSEA) was employed to explore the associated mechanisms. Lastly, a nomogram was generated integrating the risk score (RS) and patient clinical characteristics to estimate AML prognosis.

| Data collection
The RNA profile and matched patient clinical information of all three AML cohorts were collected from publicly available data sets.
GSE37642-96 and GSE37642-570 microarray data were retrieved from the GEO database (http://www.ncbi.nlm.nih.gov/geo/) and the TCGA-LAML RNA-seq data were obtained from the TCGA website (https://portal.gdc.cancer.gov/repos itory). The relative gene levels were adjusted with the 'limma' R package. In addition, 259 ferroptosis-related genes were obtained from the FerrDb data set (http://www.zhoun an.org/ferrd b/). All data from GEO, TCGA and FerrDb are publicly available, we did not require ethical approval.

| Establishment and verification of a prognostic predictive profile
The GSE37642-96 cohort was selected as the training cohort for the generation of the prognostic model, and, using univariate Cox analysis, ferroptosis-related genes significantly associated with AML prognosis were identified (P-value <0.05) in the GSE37642-96 cohort. Next, the least absolute shrinkage and selection operator (LASSO) regression analysis was used to establish the best ferroptosis-related gene weighting coefficients. The prognostic model was based on a 10-fold cross-validation estimator penalty maximum likelihood estimation. Moreover, the minimum criteria of the penalized maximum likelihood estimator were used to identify the quintessential penalty parameter λ values. The TCGA-LAML and GSE37642-570 cohorts were selected as the external and internal validation cohorts, respectively. The RSs of participants were computed with the unified formula established in the training cohort.
The patients were then assigned to either a high (HR)-or low-risk (LR) group, based on the median RS of the training cohort.

| Quantitative Real-time polymerase chain reaction
A total of 60 RNA later-reserved bone marrow specimens were collected from patients who diagnosed AML (30 patients)

| Functional enrichment analysis
GSEA v4.0.2 software (http://softw are.broad insti tute.org/gsea/ login.jsp) was performed to explore possible biological functions related to the HR and LR cohorts, using the c2.cp.kegg.v7.0.symbols gene sets. NOM p-value <0.05 was considered significant. In addition, we computed the infiltrating score of 16 immune cells and the activity of 13 immune-linked networks, with ssGSEA. 11 Gene cloud biotechnology information (GCBI) was employed to assess relationships between model-linked and other related proteins. Univariate and multivariate analyses were done to identify independent prognostic factors for AML prognosis. This, in turn, was used to construct the nomogram model to estimate the 1-, 3-and 5-year OS rates. The calibration plot was adopted to evaluate calibration. R software (Version 3.6.0) or SPSS (Version 24.0) was used for all data analyses. A two-sided p < 0.05 was set as significance threshold.

| Patient characteristics and selection
A total of three cohorts, including 693 AML patients with complete RNA and matched clinical data, were analysed in this study. Patient clinical characteristics are summarized in Table S1. The GSE37642-96 cohort was selected as the training cohort for the identification of AML prognosis related ferroptosis-related gene profile, and the TCGA-LAML and GSE37642-570 cohorts were selected as the external and the internal validation cohorts, respectively. A summary of our research design is illustrated in Figure S1.

| Construction of a prognostic ferroptosisrelated gene signature
Overall, 210 ferroptosis-related genes matched with the RNA expression data from three cohorts, were identified. Upon univariate Cox analysis of the 210 ferroptosis-related genes in the training cohort (GSE37642-96), 23 genes were considered to correlate with AML overall survival (OS) (p-value <0.05) ( Figure 1A). Then, the LASSO regression analysis was applied to construct a ferroptosis- The training cohort was then sub-categorized into the HR (n = 208) or LR sub-group (n = 209), based on the median threshold ( Figure 2G). The HR patients were strongly correlated with advanced age, reduced RUNX1_RUNX1T1_fusion, and augmented RUNX1 mutation, relative to the LR patients (Table S1 and Figure 6D). As shown in Figure 2J, the HR patients were also more prone to early demise, compared to the LR patients. In addition, the HR patients exhibited a significantly worse OS, compared to the LR patients (p < 0.001) (Figure 2A). We performed Kaplan-Meier analysis in the younger patients (≤60 years) and older patients (>60 years) in the training cohort, those HR patients exhibited a significantly worse OS, compared to the LR patients (p < 0.001) in the population ≤60 years and >60 years ( Figure S2). In addition, the timedependent ROC analysis was employed to assess the sensitivity and specificity of the ferroptosis-related gene profile. The AUCs for 1-,  Figure 2D).

| Verification of the prognostic ferroptosisrelated gene profile in the two validation cohorts
We assessed the predictability of the ferroptosis-related gene profile in the external validation cohort (TCGA-LAML) and the internal validation cohort (GSE37642-570).
Patients, belonging to the TCGA-LAML cohort, were stratified into an HR (n = 75) or LR (n = 65) sub-group, based on the median RS calculated in the training cohort ( Figure 2H). The HR patients were strongly correlated with advanced age and higher cytogenetic risk levels, relative to the LR patients (Table S1 and Figure 6E). Consistent with training cohort, the HR patients experienced an enhanced likelihood of death earlier than the LR patients ( Figure 2K). In contrast, the LR patients had favourable outcome Next, we stratified patients from the GSE37642-570 cohort into a HR (n = 110) or LR (n = 26) sub-group, based on the median RS calculated in the training cohort ( Figure 2I). We observed that the HR patients did not have strong associations with clinicopathological characteristics, relative to the LR cohorts (Table S1 and Figure 6F).
Moreover, consistent with the training cohort, HR patients experienced death earlier than the LR patients ( Figure 2L). Additionally, the LR patients had favourable outcome (p < 0.001) ( Figure 2C).
We performed Kaplan-Meier analysis in the younger patients (≤60 years) and older patients (>60 years) in the GSE37642-570 cohort, those LR patients had favourable outcome (p< 0.001) in the population ≤60 years and >60 years ( Figure S2). The AUCs for  Figure 2F).

| Validation of the expression levels of the 12 ferroptosis-related genes in clinical specimens
The expression signatures of the 12 ferroptosis-related genes were explored in 30 AML clinical specimens. The results demonstrated that HRAS, CXCL2, SLC38A1, PGD, ENPP2, ACSL3, DDIT4 and PSAT1 mRNA levels were upregulated in AML specimens, while PHKG2, HSD17B11, STEAP3 and ARNTL was downregulated ( Figure S3).

| Univariate and multivariate Cox analyses of all three cohorts
The univariate and multivariate Cox analyses for RS and other prognostic values were performed in all three cohorts. The univariate Cox analysis indicated that the RS was an independent prognostic indicator for OS in all three cohorts ( Figure 3A, B, C).
After adjusting other clinical confounding factors in multivariate Cox analysis, the RS was still determined as an independent prognostic predicting factor for AML prognosis in all participants ( Figure 3D, E, F).

| Functional enrichment analysis in all three cohorts
GSEA was performed in all three cohorts to distinguish the biological functions and networks related to RS. Significantly enriched  Figure 4D). ENPP2, PGD, HRAS, ARNTL and DDIT4 were commonly altered. We also identified 73 associated proteins from the GCBI database. Among them, eleven were related to our prognostic model, with the exception of ENPP2 ( Figure 4E).  Figure 6B). Additionally, the RS revealed a numerically but not statistically larger AUC value for the 5-year OS, relative to the age in the GSE37642-570 cohort ( Figure 6C).

| Comparisons of prognostic factors and merged risk scores
Nomogram, a highly precise assessment technology, was employed for the integration of age, cytogenetic risk category and ferroptosis-related gene profile in the TCGA-LAML cohort ( Figure 7A). Moreover, the calibration plots revealed that the nomogram can estimate the 1-, 3-and 5-year OS with great precision ( Figure 7B). The C-index for the nomogram was 0.755 ( Figure 7C).

| DISCUSS ION
Currently, AML remains among the most complicated and challenging diseases. 12 Despite the development of novel drugs and risk-stratified therapies, the 5-year OS rate is still below 40%, owing to the heterogeneous nature of this disease. 13 Hence, it is urgently required to identify new prognostic indicators to enhance the diagnosis, personalized therapy and prognosis of AML patients.
Ferroptosis, a newly discovered ferroptosis-induced programmed cell death, with associated iron accumulation and lipid peroxidation, is essential for eradicating carcinogenic cells. 14,15 Targeting ferroptosis has emerged as a promising anticancer therapy. In fact, some previous studies have suggested that leukaemia is susceptible to ferroptosis and erastin, an inducer of ferroptosis, can enhance the anticancer activity of cytarabine and doxorubicin by enhancing ferroptosis. 16,17 Moreover, several genes are known to modulate ferroptosis in AML. 9,10 Nevertheless, there are no current reports on the relationship between ferroptosis-related genes and AML prognosis.
Herein, we employed univariate Cox and LASSO regression analyses to identify OS-related ferroptosis-related genes in AML samples. A new model for predicting AML prognosis was then generated using 12 ferroptosis-related genes in the training cohort   21 It is reported to be involved in modulating cancer cells sensitivity to oxidative stress. However, the expression of oncogenic RAS in different cancers presented distinct responses (sensitive or resistant) to ferroptosis inducers. [22][23][24] Therefore, the relation between oncogenic RAS and ferroptosis is still controversial. ARNTL suppresses ferroptosis via repression of EGLN2 transcription and activation of the prosurvival transcription factor HIF1A. 25 Some previous findings suggested that the overexpression of ARNTL showed improved sensitivity to anticancer drugs. 26,27 SLC38A1 is an essential mediator of glutamine uptake and metabolism in lipid peroxidation and SLC38A1 knockout can markedly block ferroptosis. 28 However, previous study confirmed that high expression of SLC38A1 was an adverse prognostic factor for AML. 29 ENPP2 is a lipid kinase involved in the lipid metabolic pathway, and it protects cardiomyocytes from erastin-driven ferroptosis. 30 ACSL3 catalyses the conversion of exogenous monounsaturated fatty acids into fatty acyl-CoAs and is necessary for fatty acids activation and cell resistance to ferroptosis. 31 PGD, a cytosolic enzyme belonging to the carbohydrate metabolism pathway, is essential for ER structural integrity and protein secretion. 32 In summary, three of the above-mentioned genes (PHKG2, STEAP3 and SLC38A1) promote ferroptosis, whereas three genes (ARNTL, ENPP2 and ACSL3) prevent ferroptosis. However, whether these genes modulate ferroptosis and impact AML prognosis needs further investigation. Despite extensive research on the mechanisms underlying tumour susceptibility to ferroptosis, not much is known about the relationship between tumour immunity and ferroptosis. To fill this gap, we conducted GSEA analysis and surprisingly found that numerous immunelinked biological processes and pathways were enriched, which indicates a possible connection between ferroptosis and tumour immunity. Nevertheless, such a connection has rarely been reported.
Interestingly, the ssGSEA analysis revealed that the HR patients in all three cohorts exhibit elevated Treg. Prior reports have suggested that Treg cells modulate immune escape via suppression of antileukaemia activity. In addition, a high fraction of Treg in the tumour microenvironment is associated with progression and poor survival for haematological malignancies. 33,34 We, therefore, speculated that the poor prognosis of the HR group might be related to more Treg infiltration in the immune microenvironment. Moreover, some previous studies suggested that the induction of ferroptosis tumour cells at early stage may efficiently enhance immune response and the inducing of Treg cell ferroptosis may be a therapeutic strategy to improve cancer treatment. 35,36 So the use of ferroptosis inducers may help improve the patients' therapeutic effect in the HR group. Although our prognostic model exhibited better predictive power than traditional prognostic factors, certain study limitations must be considered. Our prognostic model was based on publicly available retrospective data. Hence, additional prospective investigations are needed to validate its predictive power. Second, the underlying mechanisms between ferroptosis and AML progression remain poorly understood and further examinations are required to elucidate the importance of ferroptosis-related genes in AML pathology.

| CON CLUS ION
We constructed a novel prognostic model with 12 ferroptosisrelated genes and a nomogram combined the RS, age and cytogenetic risk category in AML. We also demonstrated that our prognostic model and nomogram have high accuracy in predicting OS, which indicates that the ferroptosis-related gene profile can possibly be employed as an indicator of AML diagnosis and prognosis. Furthermore, the prognostic performance of our prognostic model and nomogram can offer guidance towards the accurate prediction and personalized therapy of AML patients in clinical practice.

ACK N OWLED G EM ENTS
The authors thank platforms of TCGA and GEO for sharing the huge amount of data. This work was supported by the follow-

CO N FLI C T O F I NTE R E S T
The authors declare no conflicts of interest in this work.

DATA AVA I L A B I L I T Y S TAT E M E N T
All data from GEO, TCGA and FerrDb are publicly available.