Identification of ferroptosis‐related genes as potential biomarkers of tongue squamous cell carcinoma using an integrated bioinformatics approach

Tongue squamous cell carcinoma (TSCC) is one of the deadliest cancers of the head and neck, but the role of the ferroptosis pathway in its development is still unknown. In this study we explored the pathogenetic mechanisms associated with ferroptosis in TSCC. We identified differentially expressed genes (DEGs) of TSCC patients and used gene ontology (GO), the Kyoto Encyclopedia of Genes and Genomes (KEGG), and gene set enrichment analysis (GSEA) to annotate, visualize, and integrate these DEGs. Receiver operating characteristic curve (ROC) analysis was performed, and the STRING database was used to construct a protein–protein interaction network to evaluate the predictive value of ferroptosis‐related DEGs. A total of 219 DEGs were identified and GO, KEGG, and GSEA showed that extracellular matrix (ECM)‐receptor interaction and interleukin (IL)‐17 signaling pathways were substantially upregulated in TSCC. Univariate Cox analysis revealed that high expression of CA9, TNFAIP3, and NRAS were predictive of a worse outcome. We then constructed a prognostic model that predicted survival in the validation cohort at 1 year and 32 months. Finally, 60 cases of tongue carcinoma and normal tissues were collected, and immunohistochemistry was used to detect the expression of CA9. We found that CA9 was strongly expressed in tongue carcinoma tissues and absent in adjacent tissues. Overall, we found that ferroptosis‐related genes may affect TSCC prognosis through the ECM‐receptor interaction and IL‐17 signaling pathways. Additionally, immunohistochemistry confirmed that CA9 was highly expressed in tongue carcinoma tissues, and a model based on ferroptosis‐related genes showed a good ability to predict overall survival in TSCC.

Tongue squamous cell carcinoma (TSCC) is one of the deadliest cancers of the head and neck, but the role of the ferroptosis pathway in its development is still unknown. In this study we explored the pathogenetic mechanisms associated with ferroptosis in TSCC. We identified differentially expressed genes (DEGs) of TSCC patients and used gene ontology (GO), the Kyoto Encyclopedia of Genes and Genomes (KEGG), and gene set enrichment analysis (GSEA) to annotate, visualize, and integrate these DEGs. Receiver operating characteristic curve (ROC) analysis was performed, and the STRING database was used to construct a protein-protein interaction network to evaluate the predictive value of ferroptosisrelated DEGs. A total of 219 DEGs were identified and GO, KEGG, and GSEA showed that extracellular matrix (ECM)-receptor interaction and interleukin (IL)-17 signaling pathways were substantially upregulated in TSCC. Univariate Cox analysis revealed that high expression of CA9, TNFAIP3, and NRAS were predictive of a worse outcome. We then constructed a prognostic model that predicted survival in the validation cohort at 1 year and 32 months. Finally, 60 cases of tongue carcinoma and normal tissues were collected, and immunohistochemistry was used to detect the expression of CA9. We found that CA9 was strongly expressed in tongue carcinoma tissues and absent in adjacent tissues. Overall, we found that ferroptosis-related genes may affect TSCC prognosis through the ECM-receptor interaction and IL-17 signaling pathways. Additionally, immunohistochemistry confirmed that CA9 was highly expressed in tongue carcinoma tissues, and a model based on ferroptosis-related genes showed a good ability to predict overall survival in TSCC. Tongue squamous cell carcinoma (TSCC) accounts for 41% of all oral cancers; additionally, the tongue is rich in blood flow and metabolically active, making the metastasis rate higher than that in other types of oral squamous cell carcinoma [1]. Once metastasis begins, the 5-year survival rate drops from 85.5% to 48.5% [2]. It has been shown that promoting ferroptosis and apoptosis can inhibit the growth and metastasis of TSCC cells [3]. However, the mechanism of ferroptosis and TSCC has not been studied. Therefore, to improve the therapeutic and diagnostic capabilities for TSCC, we investigated potential prognostic targets and mechanisms of ferroptosis.
Ferroptosis is a recently identified cell death pathway characterized by iron-dependent accumulation of lipid hydroperoxides [4]. Studies have shown it to be associated with many diseases as a cell death mechanism different from autophagy or apoptosis. The authors of a study on diabetic rats found considerable iron-dependent cell death in the hippocampus, confirming that ferroptosis is associated with diabetic cognitive impairment [5]. Chen et al. [6] found that promoting ferroptosis can inhibit the progression of osteosarcoma. Ferroptosis is closely related to drug resistance, sensitivity to chemotherapy, and metastasis of cancer cells [7], making it a crucial tumor inhibition mechanism with the potential to treat cancer. A study on tongue cancer showed that the tumor suppressor drug quisinostat could inhibit tumor cell growth by markedly inducing apoptosis and ferroptosis in TSCC cells [3]. Under pathological scenarios, ferroptosis often accompanies other cell death routines; however, inhibition of apoptosis or necroptosis is not sufficient to inhibit ferroptosis [8]. Therefore, modulating ferroptosis may have therapeutic potential in various ferroptosisassociated diseases. Although ferroptosis is a critical factor in tumor prognosis, there is limited understanding of the role of ferroptosis in TSCC, especially the prognostic significance of ferroptosis-related genes is still unclear.
At present, there are few studies related to ferroptosis in TSCC. Recent studies only propose the therapeutic effect of promoting the phenotype of ferroptosis without specific mechanisms and genes [9]. Although these studies demonstrate the importance of ferroptosis in TSCC, they do not examine the prognostic value of ferroptosis-related genes for TSCC or differential expression in tumor tissues; therefore, previous studies lack a clear direction for further molecular research.
In this study we sought to identify potential diagnostic biomarkers and their biological functions associated with TSCC by mining the Gene Expression Omnibus (GEO), FerrDB (a website of genes related to ferroptosis), and The Cancer Genome Atlas (TCGA) databases. We used gene ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and gene set enrichment analysis (GSEA) to analyze biological processes involving the enriched genes to evaluate the diagnostic value of ferroptosis-related genes in TSCC. The expression of CA9 was also verified by immunohistochemistry in tongue cancer samples. Our study provides new insights into potential regulatory mechanisms in TSCC patients.

Microarray data
Expression profile and clinical information data from four tongue cancer datasets (GSE13601, GSE31056 [10], GSE9844 [11], GSE30784 [12]) were downloaded from the GEO [13] database using the R software package GEOQUERY [14] (v. 4  dataset tissue was also entirely tongue and included 150 tongue cancer and 44 normal samples for analysis. We used the normal Between Arrays function in the limma [15] package to correct batch differences between the four datasets and obtained the gene expression matrices for each dataset. The corrected data were analyzed using the UMAP [16] package for dimensionality reduction. The TCGA public database of RNA expression was downloaded. The Genomics Data Commons TCGA Head and Neck Cancer cohort (v. 07-19-2019) [17] was incorporated in the analysis of samples, including 15 cases of tongue cancer and normal samples, and 129 cases without normal pairing.
Identifying and functional analysis of differentially expressed genes (DEGs) The limma package was used to screen the four datasets for DEGs, and we used the ggplot2 [18] package to generate volcano plots (adj. P-value < 0.05 and |log2FC| > 1). We used the clusterProfiler [19] package to perform GO and KEGG enrichment analysis; adj. P-value < 0.01 was considered statistically significant. The DESeq2 [20] package was used to screen the DEGs from the TCGA dataset and used the ggplot2 package to generate volcano plots to demonstrate DEG expression (adj. P-value < 0.05 and |log2FC| > 2). We selected the 30 most up-and downregulated DEGs to display as a heatmap. GSEA [21] was performed on the gene expression matrix with the clus-terProfiler package; we selected c2.cp.kegg.v7.0.symbols.gmt, c4.all.v7.2.symbols.gmt, c5.all.v7.2.symbols.gmt, and c8. All.V7.2. Symbols. Gmt as the reference gene sets. Each analysis was performed over 10,000 permutations. The normalized enrichment score (NES) and normal Pvalues were calculated to classify the enrichment pathways of each phenotype. A false discovery rate < 0.25 was considered significantly enriched.

Differential and ferroptosis-related genes
The ferroptosis-related gene list, totaling 259 genes, was downloaded from FerrDB [22]. The Venn diagrams were prepared with Venn Diagram to visualize the intersections of the significant gene sets within groups and then determined the intersection between DEGs and ferroptosisrelated genes.

Receiver operating characteristic (ROC) analysis of ferroptosis-related genes
Survival analysis data were only available from the GSE31056 dataset and 143 tongue cancer samples from TCGA. We analyzed the relationship between ferroptosisrelated genes and the overall survival (OS) of tongue cancer patients using the survival package [23]. ROC analyses were performed using the timeROC [24] package. The discriminative power of the model was assessed using the area under the ROC curve (ROC AUC).

Construction of protein-protein interaction (PPI) network
CYTOSCAPE [25] software and the cytoHubba [26] plug-in were used to construct and visualize the PPI; each node represents a gene or protein, and the edge between nodes represents the interaction of the molecules. We obtained 97 ferroptosis-related genes using the STRING [27] (v. 10) database and construct PPI networks and analyze interactions between DEGs.
Prognostic models were established using ferroptosis-related genes We used Cox univariate linear regression to analyze the relationship between ferroptosis-related genes and the OS of tongue cancer patients in TCGA using the survival package. Survival curves were based on the Kaplan-Meier method. The hazard ratio (HR) was used to determine a gene variable's protective or harmful effects. A Pvalue < 0.01 was considered statistically significant. Independent prognostic factors were identified using the Cox multivariate regression model. The prognostic model was then applied in the validation set GSE31056.

Immunohistochemistry
For each patient, we carefully collected representative tissue cores of the TSCC tumor and adjacent section. Immunohistochemistry (IHC) staining was used to evaluate CA9 protein expression. Samples were fixed in 10% formalin for 24 h, embedded in paraffin, and cut in 5-μm sections.  water washed, dehydration, xylene transparent, neutral gum sealed.

Statistical analysis
All statistical analyses were performed as the means + standard deviation and calculated in R (v. 3.4.1, Vienna, Austria). We used AUC analysis to investigate the predictive accuracy for each gene and Kaplan-Meier analysis evaluate the effect of a single factor on OS. HR was used to identify the protective or hazardous genes. A P-value of less than 0.05 was considered statistically significant.

Identification of DEGs in TSCC
Four GEO datasets (GSE13601, GSE31056, GSE9844, GSE30784, Table 1) were normalized and batch effects within the group were removed (Fig. 1). After preprocessing, R was used to extract DEGs from the four gene expression matrices ( Fig. 2A-D). As shown in the volcano plots, upregulated genes are depicted in red and downregulated genes in blue; then we selected the 30 most up-and downregulated DEGs and depicted them as a heatmap (Fig. 2E-L).

Functional correlation analysis
We identified 219 DEGs in four GEO databases; 97 were ferroptosis-related genes in the four datasets, and seven genes (TNFAIP3, TP63, TFRC, LPIN1, NCF2, ALOX12, and TF) overlapped in the five datasets (Fig. 3, Table S1). Based on these DEGs, KEGG pathway enrichment analysis results (Fig. 4, Table 2) showed that the extracellular matrix (ECM)-receptor interaction, interleukin (IL)-17, cell cycle, DNA replication, viral protein interaction with cytokine and cytokine receptor, and p53 signaling pathways were significantly upregulated at the genetic level. GO term enrichment analysis of these DEGs, including cell component, molecular function, and biological process, were also analyzed ( Table 3).

Prognostic significance of ferroptosis-related genes
DEGs were extracted from the gene expression matrix of TCGA and displayed as a volcano plot, with upregulated genes depicted in red and downregulated genes in blue (Fig. 5A). The 30 most up-and downregulated DEGs are displayed as a heatmap (Fig. 5B) to show the pathway enrichment. ROC analysis of TCGA data showed that the BACH1, NRAS, TF, TNFAIP3, ATF3, and PGD genes were associated with survival in TSCC (Fig. 5C-H). ROC analysis of the GSE31056 dataset showed that BACH1, TF, ATF3, PGD, CAPG, and LAMP2 were associated with survival ( Fig. 6A-F).

PPI network construction
A total of 97 ferroptosis-related genes were plotted on the STRING (v. 10) website and displayed using Cytoscape (Fig. 9A). The cytoHubba [26]plug-in was Table 2. KEGG pathway enrichment analysis of DEGs. GeneRatio: The numerator is the number of genes enriched to this GO entry, and the denominator is genes from differential expression analysis. BgRatio: Background Ratio, the denominator is the number of genes that have GO annotations in all the protein-coding genes in human, the numerator is the number of genes in these genes that are commented onto this GO entry. Q value: P-value after correction. Count: the number of genes enriched to this GO entry from the input gene for enrichment analysis. *Noun explanation.  Table 3. GO enrichment analysis of DEGs. GeneRatio: The numerator is the number of genes enriched to this GO entry, and the denominator is genes from differential expression analysis. BgRatio: Background Ratio, the denominator is the number of genes that have GO annotations in all the protein-coding genes in humans, the numerator is the number of genes in these genes that are commented onto this GO entry. Q value: P-value after correction. Count: the number of genes enriched to this GO entry from the input gene for enrichment analysis. *Noun explanation.   used to calculate and visualize the top 20 genes (Fig. 9  B). These results showed four genes in the SLC family.

ID
In addition, NRAS and TNFAIP3 of the top 20 genes were overlapped by univariate COX regression analysis of OS prognosis in the TCGA database.

Construction and verification of prediction model
We first performed univariate Cox regression analysis of OS prognosis using the survival package in R in matched tumor and normal control samples from TCGA (P < 0.01). We selected three genes (P ≤ 0.001) associated with survival ( Fig. 10A-D, Table 5), CA9, TNFAIP3, and NRAS (Table S2); then we constructed a prognostic model based on their expression. We further developed this prognostic model using multivariate Cox regression analysis (Fig. 10E) Equation (1): Riskscore ¼ À0:95ðCA9Þ À 0:83ðTNFAIP3Þ The calibration graph in the validation set GSE31056 was used to test the prediction efficiency of the model (Fig. 10F). The results indicated that the OS prognostic model constructed could better predict 3-year OS.

The expression of CA9 was determined by IHC
The previous bioinformatics found CA9 high expression in TSCC, and our findings on CA9 and TSCC survival prompted us to validate CA9 expression in TSCC further. We examined the specimens from 60 independent patients and compared CA9 expression between the tongue cancer and adjacent tissues by immunohistochemistry (IHC) analysis. Consistent with previous reports, CA9 staining was negative in the cytoplasmic of adjacent tissue cells; meanwhile, CA9 staining was strongly positive in the cancer tissue cells (Fig. 11A,B, Table 6). IHC analysis revealed that the expression level of CA9 in the tongue cancer tissues was significantly enhanced.

Discussion
TSCC is the most common malignancy in the oral cavity; despite advances in diagnosis and treatment, the prognosis of advanced states has not significantly improved [28]. The reported 5-year rates of diseasefree survival ranged from 30% to 72% for the younger cohorts (≤45 years ) and 42% to 81% for the older cohorts [29], and the etiology of tongue cancer, especially the molecular mechanism, remains unclear. Ferroptosis may play a key role in tumor suppression and is a potential target for cancer therapy. An evaluation of ferroptotic cellular mechanisms and pathophysiological environments suggests that its regulation may be beneficial in treating cancer [30]. Although many experiments have shown that promoting ferroptosis is beneficial to cancer treatment, the research on ferroptosis in tongue cancer is insufficient, and the molecular mechanism of ferroptosis in tongue cancer needs to be explored further. In this study, we analyzed biological processes involving ferroptosis-related genes using GO/ KEGG/GSEA and evaluated the diagnostic value of using ferroptosis-related genes for TSCC underlying GEO, FerrDB, and TCGA,. Then we performed IHC detection of CA9, where multivariate analysis is significant, and confirmed that it was highly expressed in tumor tissues. A prognostic model developed based on ferroptosis-related genes showed a good predictive ability for OS of TSCC. Our results suggested that ferroptosis was closely related to TSCC and may affect TSCC prognosis through effects on the ECM-receptor interaction and IL-17 signaling pathways.
We analyzed the DEGs by KEGG/GO. The results showed that genes in the ECM-receptor interaction and IL-17 signaling pathways were significantly upregulated. Specific interactions between cells and ECM are mediated by transmembrane molecules that directly or indirectly control cell activities, such as adhesion, migration, differentiation, proliferation, and apoptosis. A study on gastric cancer and ferroptosis showed that the mortality risk score was associated with the ECM-receptor interaction pathway and tumor immunity [31]. In addition, the transcriptome analysis of 72 oral squamous cell carcinoma of the gingivobuccal region (OSCC-GB) patients from multiple  hospitals showed that the ECM-receptor interaction pathway was significantly enriched in tumor tissues [32]. With the deepening of research, it has been found that proinflammatory cytokine IL-17 is closely related to breast cancer, which plays an essential role in promoting tumor proliferation, invasion, and metastasis and is significantly associated with poor prognosis [33]. Studies on liver cancer have confirmed that erastin, an inducer of ferroptosis, inhibits liver cancer cell proliferation and progression, and bioinformatics analysis showed that erastin affected differentiation of Th17 cells and the IL-17 signaling pathway [34]. Taken together, our results, along with previous studies, suggest that ferroptosis may be associated with the progression of tongue cancer through the ECM-receptor interaction and IL-17 signaling pathways. Next, we explored the prognostic value of ferroptosis-related genes in tongue cancer. We selected eight ferroptosis-related genes (BACH1, NRAS, TF, TNFAIP3, ATF3, PGD, CAPG, and LAMP2) to predict survival and used AUC to determine their predictive ability. These genes were associated with the ECM-receptor interaction and IL-17 signaling pathways [35][36][37][38][39]. The genes mentioned above showed good predictive ability at the 5-year and 32-month survival. As shown in Table 5 , P = 0.0087) were significantly related to TSCC prognosis, with higher expression of these genes indicating poorer prognosis. After we included the three genes in multivariate Cox analysis, CA9 was still an independent predictor (HR = 1.263 [1.0957-1.456], P = 0.00128). In addition, through IHC, we found that CA9 was strongly positive in tongue cancer tissues but negative in adjacent tissues. Therefore, we used these genes to construct a prognosis model. Other studies have verified the predictive value of the genes used in our prediction model. CA9 can not only be used as a prognostic marker for bladder cancer [40], previous studies have shown that high CA9 expression is associated with poor prognosis of TSCC [41]. Moreover, CA9 confers resistance to ferroptosis in malignant mesothelioma under hypoxia [42]. HRAS, NRAS, and KRAS, collectively referred to as oncogenic RAS, are the most frequently mutated driver proto-oncogenes in cancer; the relationship between  oncogenic RAS and ferroptosis is still controversial [43]. As the phosphorylation gene of MAPK protein, NRAS has become one of the taxonomic markers of four subtypes of melanoma [44]; moreover, high expression of NRAS is associated with poor prognosis of lung adenocarcinoma [45]. Tumor extracellular vesicles carrying miR-125b-5p enter diffuse large B-cell lymphoma (DLBCL) cells and target TNFAIP3, thereby reducing the sensitivity of DLBCL to rituximab [46], and overexpression of TNFAIP3 is associated with a lower survival rate in breast cancer patients [47]; a study about ferroptosis after cerebral hemorrhage also find TNFAIP3 upregulation [48]. Our model predicted TSCC patient survival in the validation cohorts at 1 year (AUC = 0.889) and 32 months (AUC = 0.9). Therefore, our prognostic model may become a valuable predictive method in the future, and we intended to include detailed clinical records for further refinement.
This study combined the identification of DEGs with ferroptosis-related genes and generated a PPI network in the STRING database with the 97 ferroptosisrelated genes utilized in this study. The top 20 genes belonged to the SLC family, consistent with previous studies [9]. The SLC family is associated with the proliferation and migration of tumors, IL-17 signaling [49], and has also been shown to drive tumor metastasis [50]. Other genes among the top 20, including ATF3, TGFBR1, and EGFR, are associated with the   ECM-receptor interaction and IL-17 signaling pathways [51]. The PPI analysis suggests that ferroptosisrelated genes are associated with the progression of TSCC and further confirmed that ferroptosis-related genes might influence prognosis through ECMreceptor interaction and IL-17 signaling.
Inevitably, this study has some limitations. First, this was a retrospective study because the data sources were TCGA and GEO. Second, the exact mechanism by which the three genes in our model influence TSCC are unknown. Third, large, clinical, multicenter studies are needed to validate our model. Despite the above shortcomings, the results suggest that our model can be used as a reliable prognostic predictor of TSCC survival.

Conclusion
In conclusion, ferroptosis was closely related to TSCC and ferroptosis-related-genes may affect the prognosis of TSCC through the ECM-receptor interaction and the IL-17 signaling pathways. Additionally, we screened genes as potential prognostic markers and constructed a prognostic model based on these ferroptosis-related genes. However, more experiments are needed to validate the current findings further. Our study may provide a broader idea for exploring the molecular mechanism and therapeutic targets of TSCC.

Supporting information
Additional supporting information may be found online in the Supporting Information section at the end of the article. Table S1. 259 Ferroptosis-related genes from the FerrDB database. Table S2. Expression information of CA9, TNFAIP3, and NRAS.