Prognostic value of ferroptosis‐related genes in patients with lung adenocarcinoma

Abstract Background The prevalence of lung adenocarcinomas (LUADs) has dramatically increased in recent decades. Ferroptosis is a process of iron‐dependent regulatory cell death. It is still unclear whether the expression of ferroptosis‐related genes (FRGs) is involved in the pathogenesis and survival of patients with LUAD. Methods We retrieved LUAD data from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) databases and used LASSO Cox regression analysis to select the gene signature suitable for modeling. The risk score was calculated according to the model, and the patients were divided into high‐ and low‐risk groups according to the median risk score. Functional enrichment analysis was carried out by this group, and a model for predicting clinical prognosis was established by combining this group with clinical factors. Results Gene set enrichment analysis (GSEA) and single‐sample gene set enrichment analysis (ssGSEA) analysis showed that there were several immune‐related pathways and immune infiltration differences between high‐ and low‐risk groups. A prognostic model integrating 10 ferroptosis‐related genes (FR‐DEGs), and clinical factors were constructed and validated in an external cohort. Conclusions The FR‐DEGs signature was related to immune infiltration, and a model based on FR‐DEGs and clinical factors was established to predict the prognosis of patients with LUAD.


INTRODUCTION
Lung cancer (LC) is one of the most frequently reported malignancies worldwide and the most common cause of global cancer-associated mortality, with over a million people succumbing each year. 1 Based on the histology, LC is divided into two main subtypes: small cell lung carcinoma and non-small cell lung carcinoma (NSCLC), accounting for 15% and 85% of all cases, respectively. 2 Lung adenocarcinoma (LUAD) has increased in prevalence compared to other subtypes of lung cancer. 3 Increasing numbers of nonand never-smokers are developing LUADs. 4 As a newly discovered type of programmed cell death, ferroptosis results from the accumulation of iron-dependent lipid hydroperoxides and leads to cytological changes; the features and mechanisms of ferroptosis are different from those of typical cell death processes, such as apoptosis. 5 It is closely related to the metabolism of amino acids, iron, and polyunsaturated fatty acids, and the biosynthesis of glutathione, phospholipids, NADPH, and coenzyme Q10. 6,7 Recently, ferroptosis has been found to have great potential in cancer treatment, especially in tumors resistant to traditional therapy. 8,9 Increasing evidence has shown that numerous tumor cells, including ovarian cancer cells 10 and hepatocellular carcinoma cells, 11 are sensitive to it. Furthermore, previous studies have shown that ferroptosis suppresses tumor growth and kills tumor cells 12 and plays an important role in cancers such as renal cell carcinomas and adrenocortical carcinoma cells. 12,13 However, the relationship between ferroptosis and LUAD patient prognoses has yet to be elucidated. In this study, we analyzed the differential expression of ferroptosis-related genes (FRGs) in LUAD patients from a public database to identify the enriched pathways and their biological functions and constructed an FR-DEGs and clinical factors-based model for LUAD prognosis evaluation. Previous evidence indicates that CD8+ T cells enhance ferroptosis by downregulating SLC3A2 and SLC7A11, and the induction of ferroptosis contributes to the antitumor efficacy of immunotherapy, suggesting that the immune system might function through ferroptosis. 14 Moreover, ferroptotic cancer cells might release signals such as oxidized lipid mediators to affect antitumor immunity. 15 which indicates that FRGs could become a promising prognostic marker of immunotherapy in LUAD. Although many different prognostic biomarkers and models have been established in patients with LUAD, few studies have focused on the comprehensive status of the immune pathway.

Data source
RNA sequencing (RNA-seq) data and corresponding clinical information of 535 patients with LUAD (including 59 normal lung tissues and 535 tumor tissues) patients were downloaded from The Cancer Genome Atlas (https://portal.gdc.cancer.gov/ repository)(TCGA cohort). Gene microarray data and clinical information of 226 tumor samples were obtained from the Gene Expression Omnibus (GEO) dataset (https://www.ncbi. nlm.nih.gov/geo/query/acc.cgi?acc=GSE31210) (GEO cohort). 16 Patients who met the following selection criteria were included: Model establishment and validation of prognostic ferroptosis-related gene signature A total of 60 FRGs were retrieved from the previous literature and are provided in Table S1. We performed the following process to establish the immune signature. The "limma" R package was first used to identify the differentially expressed 60 FRGs between tumor tissues and adjacent nontumorous tissues with a false discovery rate (FDR) < 0.05 in the TCGA cohort. Univariate Cox analysis of overall survival (OS) was performed to screen for FRGs with prognostic values. p-values < 0.05 were considered statistically signifcant. Least absolute shrinkage and selection operator (LASSO) Cox regression analysis was then applied to construct a prognostic model. 17,18 LASSO Cox regression was calculated using the "glmnet" package in R software, and the ideal coefficients were evaluated according to the partial likelihood deviance with tenfold cross validation. The risk scores of the patients were calculated according to the normalized expression level of each gene and its corresponding regression coefficients. The formula was established as follows: score = esum (each gene's expression Â corresponding coefficient). The patients were stratified into high-and low-risk groups based on the median value of the risk score. Based on the expression of genes in the signature, principal component analysis (PCA) was carried out with the "prcomp" function of the "stats" R package. In addition, t-SNE were performed to explore the distribution of different groups using the "Rtsne" R package. The "survivalROC" R package was used to conduct time-dependent receiver operating characteristic (ROC) curve analyses to evaluate the predictive power of the gene signature.

Functional enrichment analysis
The "clusterProfiler" R package was utilized to conduct Gene Set Enrichment Analysis (GSEA), analysis of gene ontology (GO), and Kyoto Encyclopedia of Genes and Genomes (KEGG) between the high-and low-risk groups. FDR < 0.25 was considered statistically significant. The infiltrating score of 16 immune cells and the activity of 13 immune-related pathways were calculated using single-sample gene set enrichment analysis (ssGSEA) 19 in the "gsva" R package. The annotated gene set file is provided in Table S2.

Statistical analysis
The Wilcoxon test was used to compare gene expression between tumor tissues and adjacent nontumorous tissues.
Differences in proportions were compared using the chisquare test. The Wilcoxon test was used to compare the ssGSEA scores of immune cells or pathways between thehigh-and low-risk groups. The OS between different groups was compared using Kaplan-Meier analysis with the logrank test. Univariate and multivariate Cox regression analyses were performed to identify independent predictors of OS. All statistical analyses were performed using R software (Version 4.0.2). A p-value of less than 0.05 was considered statistically significant, and all p-values were two-tailed.

RESULTS
The flow chart of this study is shown in Figure 1. A total of 535 patients with LUAD from the TCGA-LUAD cohort and 226 patients with LUAD from the GEO cohort were finally enrolled. The detailed clinical characteristics of these patients are summarized in Table 1.
Identification of FR-DEGs and their prognostic value in the TCGA cohort Most of the FRGs (45/60, 75%) were differentially expressed between tumor tissues and adjacent nontumorous tissues, and 12 were correlated with OS in the univariate Cox regression analysis (Figure 2(a)). A total of 12 prognostic ferroptosis-related DEGs were preserved (all FDR < 0.05, Figure 2(b), (c)). The interaction network among these genes indicated that FANCD2 and PEBP1 were the hub genes ( Figure 2(d)). The correlation between these genes is shown in Figure 2(e).

Establishment and validation of prognostic model
LASSO Cox regression analysis was applied to establish a prognostic model using the expression profile of the 12 genes mentioned above. A 10-gene signature was identified based on the optimal value of λ (Figure 3(a), (b)). The patients were stratified into high-or low-risk groups according to the median cutoff value (Figure 3(d)). The higher risk group was significantly associated with higher tumor stage, TP53 mutation, sex, and advanced tumor node metastasis (TNM) stage in the TCGA cohort ( Figure 4). PCA and t-SNE analysis indicated that the patients in the different risk groups were distributed in two directions ( Figure 3(e), (g)). As shown in Figure 3(f), patients with high risk had a higher probability of death earlier than those with low risk. The Kaplan-Meier curve consistently indicated that patients in the high-risk group had a significantly worse OS than their low-risk counterparts (Figure 3(c), p < 0.0001).
To test the robustness of the model constructed from the TCGA cohort, patients from the GEO cohort were also categorized into high-or low-risk groups by the median value calculated with the same formula as that from the TCGA cohort ( Figure 5(a)). Similar to the results obtained from the TCGA cohort, PCA and t-SNE analysis confirmed that patients in the two subgroups were distributed in discrete directions ( Figure 5(c), 4(d)). Likewise, patients in the highrisk group were more likely to encounter death earlier ( Figure 5(b)) and had a reduced survival time compared to those in the low-risk group ( Figure 5(e), p = 0.011).

Immune-related functional analyses in the TCGA and GEO cohort
To elucidate the immune-related pathways associated with the risk score, the GSEA analyses of GO and KEGG were conducted between the high-and low-risk groups. Interestingly, GSEA results showed that both GO and KEGG had several changes in immune-related pathways (FDR < 0.25, Figure 6). Four immune-related biological processes or molecular functions in KEGG were changed between the high-and low-risk groups in the TCGA cohort, including the intestinal immune network for IGA production, chemokine signaling pathway, TGF beta signaling pathway, and TOLL-like receptor signaling pathway (FDR < 0.25, Figure 6 (a)). Six immune-related biological processes or molecular functions in GO were changed between the high-and low-risk groups in the TCGA cohort, including somatic diversification of immune receptors, positive regulation of production of molecular, positive regulation of myeloid leukocyte cytokines, positive regulation of cytokine production, regulation of innate immune response, and activation of the innate immune response (FDR < 0.25, Figure 6(b)).
F I G U R E 6 GSEA analyses of KEGG (a) and GO (b) were conducted between the high-and low-risk groups among immune-related pathway To further explore the correlation between the risk score and immune status, we quantified the enrichment scores of diverse immune cell subpopulations, related functions, or pathways with ssGSEA. Interestingly, the score of CD8+ T cells, iDCs, macrophages, mast cells, NK cells, Th1 cells, Th2 cells, Treg, antigen-presenting cells (APC) coinhibition, cytolytic activity, HLA, inflammation-promoting, MHC class I, parainflammation, and T cell coinhibition were significantly different between the low-and high-risk groups in both TCGA (all p < 0.05, Figure 7(a), (c)) and GEO cohorts (all p < 0.05, Figure 7(b), (d)). The trend of change in the two cohorts was consistent.
Combined with clinical and ferroptosis-related risk scores to construct the prognosis prediction model We combined the clinical stage and risk score for multiple Cox regression analysis and constructed a prognosis prediction model. The nomogram was drawn according to the model (Figure 8(a)). The ROC curve of the model shows that the model has good prognosis prediction ability (12 months -area under curve [AUC]: 0.736, 24 months -AUC: 0.724, 36 months -AUC: 0.722, Figure 8(b)). According to the median risk score of the new prognostic model, we divided the patients into high-and low-risk groups. Figure 8(c) shows that the survival of high-risk patients was lower than that of low-risk patients. The same result was verified in the GEO cohort (Figure 8(d)). At the same time, the ROC curve of the geo cohort verified the accuracy of the model in predicting the prognosis (12 months -AUC: 0.912, 24 months -AUC: 0.868, 36 months -AUC: 0.774, Figure 8(e)).

DISCUSSION
In the current study, we systematically investigated the expression of 60 FRGs in LUAD tumor tissues and their association with OS. A novel prognostic model integrating 10 FRGs and clinical factors was first constructed and  Figure 8(b)). At the same time, the external verification results of the model also prove that the model has high prediction accuracy (AUC: 0.774-0.912, Figure 8(e)). The ROC curve results of the external validation set showed that the prediction accuracy of the model in the validation set was higher than that in the training set, which may be because all patients in the validation set were early-stage patients. This suggests that our model may be more suitable for patients with early-stage LUAD. In previous studies, the prognosis prediction model was only based on gene signature, completely ignoring the clinical information, and the LASSO Cox regression analysis was not used to select the best gene signature in previous studies. 20 Our model screened the best gene signature by LASSO Cox regression analysis, calculated its risk score, and divided the patients into groups. A model with more clinical value will then be established by combining this grouping with clinical information.
Functional analyses revealed that immune-related pathways were enriched for FR-DEGs signatures. Although a few previous studies [21][22][23] have indicated that several genes might regulate drug-induced ferroptosis in LUAD, their value in the survival of patients with LUAD remains largely unknown. To our surprise, most of the FRGs (75%) were differentially expressed between tumor and adjacent nontumorous tissues, and 12 were correlated with OS in the univariate Cox regression analysis. These results significantly indicated the potential role of ferroptosis in LUAD and the possibility of building a prognostic model with these FRGs. The prognostic model proposed in the present study was composed of 10 FRGs (ALOX15, ATP5MC3, CISD1, DPP4, FANCD2, GLS2, PHKG2, ACSL3, PEBP1, PGD). These genes can be roughly classified into four categories, including iron metabolism (FANCD2, CISD1, PHKG2), lipid metabolism (ACSL3, PEBP1), (anti) oxidant metabolism (ALOX15), and energy metabolism (GLS2, PGD, ATP5MC3). 8 These genes were all upregulated in LUAD tumor tissue and were associated with prognosis in the current study. Whether these genes play a role in the prognosis of LUAD patients by influencing the process of ferroptosis remains to be elucidated since few related studies on these genes have been reported.
Although the mechanisms underlying tumor susceptibility to ferroptosis have been an intense area of research in the past few years, the potential modulation between tumor immunity and ferroptosis remains elusive. We performed GSEA analyses and discovered that many immune-related biological processes and pathways were enriched. This is similar to previous studies. 20 It is reasonable to assume that ferroptosis may have a close connection with tumor immunity. Interestingly, the contents of the antigen presentation process were significantly different between the low-risk and high-risk groups in this study. One possible explanation is that ferroptotic cells release distinct signals, such as lipid mediators, to attract APCs to the site of ferroptotically dying cells. 15 Previous studies have demonstrated that increased tumor-associated macrophages 24,25 or Treg cells 26 are related to poor prognosis in LUAD patients due to their role in immune invasion. The upregulation of APC coinhibition and T cell coinhibition will decrease immune killing ability and immune escape. Moreover, higher risk scores correlated with lower iDC scores. Therefore, attenuated antitumor immunity in patients at high risk may be an explanation for their poor prognosis. It also provides a potential direction of immunotherapy for LUAD in the future. The clinical model developed in this study can better predict the prognosis of patients with early-stage lung adenocarcinoma. Nowadays, due to the popularity of nextgeneration sequencing, performing expression profiling sequencing of genes in patients can be carried out at a relatively low cost. The model of this study therefore has the potential for widespread generalization, especially in patients with early-stage lung adenocarcinoma who are able to undergo surgical resection.
However, this study has some limitations. As it was a retrospective study based on public databases, some information, such as specific treatment methods, were unavailable. Moreover, the cohort only included patients with early-stage LUAD as the validation set, which can only verify that the model has high accuracy in predicting the prognosis of patients with early-stage LUAD.
In conclusion, the FR-DEGs signature was related to immune infiltration, and we propose a model based on FR-DEGs and clinical factors that could be used to predict the prognosis of patients with LUAD.