N‐6 methylation‐related lncRNA is potential signature in lung adenocarcinoma and influences tumor microenvironment

Abstract Background N‐6 methylation (m6A) pushes forward an immense influence on the occurrence and development of lung adenocarcinoma (LUAD). However, the methylation on non‐coding RNA in LUAD, especially long non‐coding RNA (lncRNA), has not been received sufficient attention. Methods Spearman correlation analysis was used to screen lncRNA correlated with m6A regulators expression from the Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) repositories, respectively. Then, the least absolute shrinkage and selection operator (LASSO) was applied to build a risk signature consisting m6A‐related lncRNA. Univariate and multivariate independent prognostic analysis were applied to evaluate the performance of signature in predicting patients' survival. Next, we applied Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and gene set enrichment analysis (GSEA) to conduct pathway enrichment analysis of 3344 different expression genes (DEGs). Finally, we set up a competing endogenous RNAs (ceRNA) network to this lncRNA. Results A total of 85 common lncRNAs were selected to acquire the components related to prognosis. The final risk signature established by LASSO regression contained 11 lncRNAs: ARHGEF26‐AS1, COLCA1, CRNDE, DLGAP1‐AS2, FENDRR, LINC00968, TMPO‐AS1, TRG‐AS1, MGC32805, RPARP‐AS1, and TBX5‐AS1. M6A‐related lncRNA risk score could predict the prognostic of LUAD and was significantly associated with clinical pathological. And in the evaluation of lung adenocarcinoma tumor microenvironment (TME) by using ESTIMATE algorithm, we found a statistically significant correlation between risk score and stromal/immune cells. Conclusion M6A‐related lncRNA was a potential prognostic and therapy target for lung adenocarcinoma.


| INTRODUC TI ON
Lung cancer is now the most common malignant tumor with its morbidity and mortality are among the first malignant tumors, which has a big threat to human life. 1,2 As important one of the histological subtypes of lung cancer, the incidence of lung adenocarcinoma has gradually surpassed that of lung squamous carcinoma in recent years and become the largest subtype of lung cancer. 3 The pathogenesis of lung cancer is a complex mechanism in which RNA methylation plays a role in the progress. 4,5 Recent studies have showed that numerous RNA methylation modifications including N-6 methylation (m6A), 5-methylctisine (m5C), and pseudoguanosine are widely present in transcriptome RNA, 6 which may be one of the pathogenesis of LUAD. With the development of next-generation sequencing technology, m6A, the most prevalent modification of eukaryotes is gradually appearing to light, 6,7 and new localization techniques also make m6A research possible. 8 N-6 methylation-related methylase and demethylase can be classified into three enzymes with respective functions that perform reversible methylation on the sixth nitrogen atoms of RNA. 9 The way their works is shown below: enzymes called m6A "writers" (mainly including METTL3-14, WTAP, HAKAI, WTAP, and KIAA1429) combine their subunits to form the m6A complex, which can catalyze the forward methylation of RNA. 9,10 The methylation can be reversed by demethylase, the enzymes involved in such as ALKBH5 and FTO, also known as the "erasers". 11 In addition, there are some reading proteins else with specific structural domains as YTH domains (YTHDC1-3) and RBM domains (RBM15), which participate in the reading process of m6A. 12 The m6A regulators have been shown to play a wide and profound role in the variety of human disease and to be involved in human malignant tumor via regulating RNA metabolism. 9 Tumor microenvironment (TME) is the environment in which tumors live in and the soil for tumors grow, migrate, and invade. The interaction among stromal cells, immune cells, and tumor cells is an important factor in tumor growth and development in TME. 13 In the progress of tumor occurrence, components of TME can be recruited by tumor cells and become the environment driving force for tumor growth and metastasis. 14 For example, although macrophages serve as a barrier to the removal of the tumor cells, it has been reported that tumor-related macrophages can promote tumor progression through para-signal secretion in an existing tumor. 15,16 Most of the previous reports have studied the role of m6A modification on mRNA in tumors, but its m6A modification on ln-cRNA in tumors and tumor microenvironment remained at an unclear level. The purpose of this study was to explore the function of m6A-related lncRNA in LUAD and TME based on public repositories. We obtained the transcriptome expression data and clinical information for patients with lung adenocarcinoma from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) repository. Then, an 11 m6A-related lncRNA signature was constructed after univariate independent prognostic regression analysis and least absolute shrinkage and selection operator (LASSO) analysis, and risk score of each individual was calculated independently. Our following findings demonstrated that the potential roles of signature composed of 11 lncRNA in the prognosis of lung adenocarcinoma.

| Datasets
Expression data of 594 LUAD patients were downloaded in the Cancer Genome Atlas (TCGA) repository, which including 59 normal samples and 535 tumor samples. Simple nucleotide variation data of 560 LUAD patients were downloaded from TCGA repository for TMB calculation. GSE31210 and GSE30219 acquired from Gene Expression Omnibus (GEO) repository were also used for signature construction and multi-repository result validation. Sample with missing clinical information was removed during analyzing.

| Bioinformatics
All statistical analyses were performed in R version 4.0.3. First of all, we selected 29 most common m6A regulators and conducted Spearman correlation analysis in TCGA and GSE31210 dataset, and m6A-related lncRNA was screened by the standard correlation coefficient (|cor|) > 0.3. 1251 and 264 required lncRNA were acquired, respectively, in two databases. We found that the communal part in this lncRNA and the common 85 lncRNA were used for the signature construction. Then, after the univariate regression and LASSO analysis, prognostic model made up of 11 lncRNA was obtained. Kaplan-Meier estimator was applied to draw the survival curves of single lncRNA or between two risk groups. Next, univariate, multivariate independent prognostic analysis, and ROC curve were used to evaluate the predictive efficiency and reliability of risk model. Estimation of Stromal and Immune cells in Malignant Tumor tissues using Expression data (ESTIMATE) is an algorithm to estimate the two major components of TME-stromal cells and immune cellsbased on expression data, so that we utilize the method to calculate the stromalscore and the immunescore of each sample. After that, Wilcox test was applied to evaluate the infiltration between two risk groups. The consensus clustering method was applied to the new classification of LUAD, and the characteristics among the clusters were analyzed.
Then, we set our eyes on exploring the possible mechanism of m6A-related lncRNA. We analyzed 3344 differentially expressed genes between risk groups using R package "limma." GO, KEGG enrichment analysis, and metascape online tools were used to elucidate the signaling pathways and molecular mechanisms of possible DEGs enrichment. GSEA was used to explore pathways with significant enrichment in high-risk groups. These biological pathways, cellular components, and molecular functions may reveal the mechanisms of m6A-related lncRNA in LUAD.  Table 1. The risk score of each sample was calculated by the formula: Risk score = ∑ n j=1 Coefj * ij. According to the coef value of lncRNA, we can judge whether a lncRNA is a risk or protect lncRNA. Therefore, we constructed a Sankey diagram to depict the network of "m6A regulators-m6A-related lncRNA-risk type" (Figure 1D), as well based on the correlation analysis, we visualized the relationship between m6A regulators and m6A-related RNA by using cytoscape tool ( Figure 1E). Next, we divided all samples of TCGA into high-risk and low-risk groups according to the risk score, and then used K-M survival curve to reflect the relationship between risk score and survival rate. As shown in Figure 1F, there was a significant difference between two groups (P = 9.665e−08). The risk curve and plot told us that the number of death cases increased as the increase in risk score distinctly ( Figure 1G-I). These results indicated that risk score is a good predictor of overall survival rate in LUAD.

| Selection of m6A regulators
Then, we plotted survival curves of each m6A-related lncRNA with their grouping according to the expression level of the corresponding lncRNA. In the light of the results, the high expression of lncRNA TMPO-AS1 and DLGAP1-AS2 has a significant low overall survival rate, which is consistent with the coef value of all lncRNA.
Survival between groups of all lncRNA has a significant difference (P < 0.05) (Figure 2).

| Verification of the accuracy of the m6Arelated lncRNA signature
In this part, we used a variety method to verify the predictive reliability of this signature. Firstly, univariate and multivariate independent prognostic analysis were used to evaluate the predictive ability of risk score in TCGA cohort. It showed that risk score as a predictor has better predictive value than other clinical characteristics; these including age, gender, T stage, N stage, etc. (Figure 3A,B).

Independent prognostic analysis of validation data set GSE30219
yielded the same results as shown in Figure S1, and multiple ROC curve confirmed this conclusion ( Figure 3C). And the result of  to the risk score. We found that there were significant differences in overall survival (OS) and disease-free survival (DFS) between the high-and low-risk groups in the validation datasets GSE30219 and GSE31210 ( Figure 3G-I). In conclusion, the results confirmed that the risk score has excellent predictive ability, which can be applied to all patients with LUAD.

| Relationship between m6A-related lncRNA and clinicopathological features
In TCGA cohort, we compared the distribution of age, gender, and other clinical characteristics between high-and low-risk groups, and Student's t test was performed to analyze the differences between  Figure 4D). In the next exploration of risk score and clinical characteristics, m6A-related lncRNA signature risk score was significantly associated with lymph node metastasis and patient grade ( Figure 4E-D). There was also some association between risk score and gender, but not significant enough ( Figure 4F). However, there no significant correlations between risk score and age/tumor size ( Figure 4G-H). These findings supported to some extent by the results in Table 2. We also evaluated the role of m6A-related lncRNA

| M6A-related lncRNA can influence TME of LUAD
We believed that m6A-related lncRNA may affect TME. In order to explore the specific part of this effect, we used the ESTIMATE algorithm to estimate the purity of tumor tissue of 594 patients, and the Spearman correlation analysis was carried on between the risk score and the calculated score (stromalscore, immunescore, and ESTIMATE score). Results showed that the three calculated scores decline prominently as the risk score increased ( Figure 5A-C). This suggests that higher tumor purity means a higher prognostic risk for the patient, as the amounts of immune cells and stromal cells in the TME of the patient were reduced.
To further investigate the infiltration of immune cells in the TME, With respect to T-cell infiltration, we found that CD4+ memory T cells had a significant tendency to be activated with the increase in risk ( Figure 5J,K). We also calculated TMB in lung cancer patients and analyzed the associated with risk score. The results showed that TMB was positively correlated with risk score ( Figure 6A), and the difference was significant between high-and low-risk groups ( Figure 6B). All these results suggested that m6Arelated lncRNA affects the prognostic of patients by changing the TME and the infiltration of immune cells.

| Analysis and exploration of subtypes based on the m6A-related lncRNA
We explored a new subtype classifying method of LUAD using the consensus clustering method, and all the samples were divided into two clusters in this way ( Figure S2). We analyzed overall survival rates between two clusters and found no significant differences ( Figure 6C). Analysis of the expression levels of common immune F I G U R E 6 A, Correlations of TMB and risk score. B, Significant difference in TMB between high-and low-risk groups. C, Overall survival curve between cluster s1 and 2. D-G, Different immune checkpoints, common driver genes of lung cancer checkpoints (PD1 and PD-L1 in our study) and lung cancer driver genes (ALK, EGFR, ROS1, and MET in our study) between two clusters revealed significant differences in PD1, PD-L1, and MET gene expression ( Figure 6D-F). There was also a significant difference in risk scores between the two clusters ( Figure 6G).

| Exploration on the mechanism of m6Arelated lncRNA in LUAD
In order to explore the specific mechanism of m6A-related genes affecting LUAD, we used GSEA software to analyze the signal pathways enriched in high-and low-risk group, respectively. The screening criteria of results are |NES|>1, NOM pval < 0.05, and FDR < 0.25. GSEA tests run 1000 times. We then listed 8 pathways that were most significantly enriched in the high-risk group, which were found to be most related to the cell cycle ( Figure 7A).
Therefore, we speculate that m6A-related lncRNA promotes tumorigenesis by influencing metabolic pathways in the cell cycle. Next, we looked for different expression genes (DEGs) between the high-and low-risk groups, and a total of 3,344 DEGs were found ( Figure S3).
Metascape (http://metas cape.org/) online tool and R software were applied to explore the enrichment of DEGs pathways. The results of metascape software showed that the most enriched signaling pathways were cilium movement, neuronal activity ligand-receptor interaction, and NABA matrisome associated ( Figure 7B-D). The  Figure S4). Metascape tool also told us that the lung is the most specific tissue of these genes ( Figure 7E).
In the end, we constructed the ceRNA network of 11 lncRNA

| DISCUSS ION
Lung cancer is a major problem in the development of human medicine. The difficulty of early detection and diagnosis causes only 16% survival rate of lung cancer patients in China. 18 Therefore, it is a potential value to search for new serological or histological prognostic markers for the diagnosis, prognosis, and treatment.
As the most abundant chemical modification, N-6 methylation plays an important role in many biological processes, especially in human cancer. 19   Although METTL3 binds to promoters to promote proliferation of tumor cells in acute myeloid leukemia, 24,25 increased expression often leads to apoptosis of tumor cells in triple-negative breast cancer. 26 N-6 methylation on other RNA has also been studied for the past few years. It has been found that m6A modification on circRNA can promote tumor development by driving circRNA translation, affecting binding to RBP, or changing the methylation of downstream targets of circRNA. [27][28][29] Several reports have also verified that lncRNA is also the target of m6A affecting the tumor process.
For example, GATA3-AS1 promotes the metastasis of liver cancer, and the downregulation of tumor suppressor gene GAS5 in cervical cancer is all inseparable from the shadow of m6A. 11,30,31 LncRNA is also very important in the progression of lung cancer. Some studies have shown that LncRNA can participate in transcriptional regulation in non-small cell lung cancer and enhance tumor tolerance to chemotherapy. [32][33][34] It is important to note that the same m6A regulators may play a different role in different tissues. Therefore, the objective of this study was to explore the function of m6A-related lncRNA in LUAD.
In order to achieve this goal, we downloaded data from TCGA and GEO repositories for lncRNA screening and signature establishment.
Using Spearman analysis to get lncRNA closely related to m6A, and so as to make the signature more reliable, we extracted the common part of the two data sets. Univariate regression analysis was used for the screening of prognostic m6A-related lncRNA, and a signature composed of 11 lncRNA is obtained through LASSO. We found that the prognosis and patient's risk score according to signature are significantly related. Later, after independent prognostic analysis, ROC curve, and external datasets verification, we thought that the ability of m6A-related lncRNA signature to predict patient prognosis is trustworthy. The m6A-related lncRNA is also having a bearing on clinical traits, for instance, late-stage (stage III-IV) and lymph node metastasis showed a higher risk score. It is worth mentioning that the low expression of immune checkpoint PD1 and ALK genes also presumed a higher risk score in LUAD. After exploring the components of TME, we found that the signature is significantly having relations of tumor purity and immune cell infiltration abundance. High purity tumors indicate a high-risk and a poor prognosis, as well the activation of CD4+ memory T cells and the infiltration of macrophages also meant an increased risk of prognosis. In the later study of the mechanism of m6A-related lncRNA in LUAD, we used GSEA, GO, and KEGG to get different pathway enrichment between highand low-risk groups. The results indicated that the signal pathways In conclusion, m6A-related lncRNA is a potential diagnostic target and plays an important role in the treatment and prognosis in LUAD.

ACK N OWLED G M ENTS
Not applicable.

CO N FLI C T O F I NTE R E S T
The authors declare that there is no relative conflict of interest.

CO N S E NT FO R PU B LI C ATI O N
Written informed consent for publication was obtained from all participants.

DATA AVA I L A B I L I T Y S TAT E M E N T
All the data in this study are download from public databases as described in the passage. Data could be acquired from TCGA (https:// portal.gdc.cancer.gov/) and GEO (https://www.ncbi.nlm.nih.gov/) repositories.