A novel autophagy‐related lncRNA survival model for lung adenocarcinoma

Abstract Long non-coding RNAs (lncRNAs) are important regulators in the development of lung adenocarcinoma and are involved in the control of autophagy. LncRNAs can also serve as prognostic biomarkers in patients with lung adenocarcinoma. It is therefore important to determine the prognostic value of autophagy-related lncRNAs in lung adenocarcinoma. In this study, autophagy-related mRNAs and lncRNAs were screened from lung adenocarcinoma data in The Cancer Genome Atlas (TCGA), and an autophagy-related mRNA-lncRNA co-expression network was constructed. Univariate and multivariate Cox proportional hazard analyses were used to evaluate the prognostic value of the autophagy-related lncRNAs, yielding a survival model composed of 11 autophagy-related lncRNAs: GAS6-AS1, AC106047.1, AC010980.2, AL034397.3, NKILA, AL606489.1, HLA-DQB1-AS1, LINC01116, LINC01806, FAM83A-AS1 and AC090559.1. Kaplan-Meier analysis, univariate and multivariate Cox regression analysis and time-dependent receiver operating characteristic (ROC) curve analysis further verified that the survival model was a new independent prognostic factor for patients with lung adenocarcinoma. In addition, based on the survival model, gene set enrichment analysis (GSEA) was used to illustrate the function of genes in the low-risk and high-risk groups. The hazard ratio (HR) of the risk score was 1.256 (1.196-1.320) (P < .001) in univariate Cox regression analysis and 1.215 (1.149-1.286) (P < .001) in multivariate Cox regression analysis. The AUC value of the risk score was 0.809. The survival model of 11 autophagy-related lncRNAs has important predictive value for the prognosis of lung adenocarcinoma, and these lncRNAs may become clinical autophagy-related therapeutic targets.


| INTRODUCTION
Lung cancer has the highest morbidity and mortality of all cancers. 1 Lung adenocarcinoma (LUAD) is the most common pathological subtype of lung cancer, accounting for 45% of all lung cancers. 2 Despite continuous progress in cancer diagnosis and treatment, the mortality rate of lung cancer remains high. One reason is that some patients are already at an advanced stage when diagnosed; another is that the staging system in existing guidelines does not predict prognosis accurately, so some patients with early-stage lung cancer do not receive adjuvant therapy after surgery and subsequently develop recurrence or metastasis. 3,4 Therefore, the staging system of the existing guidelines needs to be updated.
Autophagy is a highly conserved physiological process that maintains the stability of the intracellular environment through the lysosomal degradation system. 5,6 Autophagy plays an important role in many physiological processes, such as immune response, inflammation and tumorigenesis. 7,8 In the past few decades, a growing number of studies have examined autophagy in LUAD. 9,10 Therefore, it is important to establish an autophagy-related gene set to predict the prognosis of LUAD.
Long non-coding RNAs (lncRNAs) are transcripts more than 200 nucleotides in length that do not encode proteins. 11 LncRNAs are involved in many steps of cancer occurrence and development, so they may be used as biomarkers to predict the prognosis of cancer patients. [12][13][14] In addition, a growing number of studies have shown that lncRNAs can promote the occurrence and development of tumours in many cancers. [15][16][17] Therefore, it is important to screen for autophagy-related lncRNAs associated with the prognosis of LUAD.
In this study, a gene expression data set for LUAD from The Cancer Genome Atlas (TCGA) was analysed and autophagy-related lncRNAs were screened out. An autophagy-related lncRNA signature was then identified to predict the survival prognosis of LUAD patients.

| LUAD patient data
Clinical data and gene expression data of patients with LUAD were collected from the TCGA database (https://cancergenome.nih.gov/). Normalized gene expression data were used. Data from 954 patients with LUAD were examined; after excluding patients with duplicated or missing clinical information, data from 316 patients were used for subsequent analysis.

| Screening of autophagy-related lncRNA in LUAD
A list of autophagy-related genes was obtained from the Human Autophagy Database (http://www.autophagy.lu/), and 210 autophagy-related genes were identified in the LUAD gene expression data. An autophagy-related mRNA-lncRNA co-expression network was then constructed, and 1651 autophagy-related lncRNAs were screened according to the following criteria: |correlation coefficient| > 0.4 and P < .001. 18 Pearson correlation analysis was used for this screening, performed with the limma R package.
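The correlation screen described here can be illustrated with a minimal Python sketch. It is not the authors' R workflow: it applies only the |r| > 0.4 cutoff and omits the P < .001 filter, and the identifiers `lncRNA_A`, `lncRNA_B` and `GENE_1`, together with all expression values, are hypothetical toy data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A constant profile has no defined correlation; treat it as no association.
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def screen_lncrnas(lncrna_expr, gene_expr, r_cutoff=0.4):
    """Return {lncRNA: [(gene, r), ...]} for every pair with |r| > r_cutoff.

    Assumption: only the correlation cutoff is applied; the significance
    filter (P < .001) used in the study is not implemented here.
    """
    hits = {}
    for lnc, lnc_values in lncrna_expr.items():
        for gene, gene_values in gene_expr.items():
            r = pearson_r(lnc_values, gene_values)
            if abs(r) > r_cutoff:
                hits.setdefault(lnc, []).append((gene, r))
    return hits

# Hypothetical expression profiles across five samples.
lncrna_expr = {
    "lncRNA_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "lncRNA_B": [3.0, 1.0, 4.0, 1.0, 5.0],
}
gene_expr = {"GENE_1": [2.0, 4.1, 5.9, 8.2, 10.0]}

hits = screen_lncrnas(lncrna_expr, gene_expr)
print(sorted(hits))  # names of lncRNAs that pass the correlation cutoff
```

In this toy input only `lncRNA_A` tracks `GENE_1` closely enough to pass the cutoff; a real analysis would also need the significance test before retaining a pair.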

| Identification of autophagy-related lncRNA prognostic signatures for LUAD
To identify autophagy-related lncRNAs associated with survival, we conducted univariate Cox proportional hazard analysis and Kaplan-Meier analysis, with P < .01 considered statistically significant. The survival R package was then used for multivariate Cox proportional hazard analysis to establish the optimal prognostic risk model, and the risk score was calculated by the following formula:
$$\text{Risk score} = \sum_{n} \text{coef}(\text{lncRNA}_n) \times \text{expr}(\text{lncRNA}_n)$$
LUAD patients were divided by the median risk score into a high-risk group and a low-risk group. We performed Kaplan-Meier survival analysis with the survival R package to estimate the survival difference between the two groups.
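As a minimal sketch of the scoring and grouping step, assuming the risk score is the coefficient-weighted sum of lncRNA expression values, the Python below scores four hypothetical patients and splits them at the median. The coefficients, lncRNA names and patient identifiers are invented for illustration, and assigning a score exactly equal to the median to the low-risk group is an assumption, since the text does not state how ties are handled.

```python
from statistics import median

def risk_score(expression, coefficients):
    """Sum of coef(lncRNA_n) * expr(lncRNA_n) over the lncRNAs in the model."""
    return sum(coef * expression[name] for name, coef in coefficients.items())

def split_by_median(scores):
    """Label each patient 'high' or 'low' relative to the median risk score.

    Assumption: a score equal to the median is placed in the low-risk group.
    """
    cutoff = median(scores.values())
    groups = {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}
    return cutoff, groups

# Hypothetical model coefficients and expression values (not from the study).
coefficients = {"lncRNA_1": 0.5, "lncRNA_2": -0.25}
patients = {
    "P1": {"lncRNA_1": 2.0, "lncRNA_2": 4.0},
    "P2": {"lncRNA_1": 4.0, "lncRNA_2": 2.0},
    "P3": {"lncRNA_1": 1.0, "lncRNA_2": 8.0},
    "P4": {"lncRNA_1": 6.0, "lncRNA_2": 0.0},
}

scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
cutoff, groups = split_by_median(scores)
print(scores)
print(cutoff, groups)
```

The two resulting groups are what a Kaplan-Meier comparison would then be run on.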

| ROC curve plotting and independent prognostic analysis
Univariate and multivariate Cox analyses were performed with the survival R package to evaluate the relationship of survival prognosis with clinical factors and the risk score. To estimate how accurately the clinical factors and the risk score predicted survival time, time-dependent receiver operating characteristic (ROC) curves were plotted with the survivalROC R package.
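To make the idea of an AUC at a fixed time point concrete, the Python sketch below uses a deliberately simplified definition. It is not the estimator implemented in survivalROC: patients who died by the horizon are treated as cases, patients still under follow-up after the horizon as controls, and patients censored before the horizon are simply dropped. All records are hypothetical.

```python
def auc_at_horizon(records, horizon):
    """Simplified AUC of a risk score for death by `horizon`.

    records: iterable of (risk_score, time, event), event = 1 for death,
    0 for censoring. Assumption: patients censored before the horizon are
    excluded, which a proper time-dependent ROC estimator would not do.
    """
    cases = [s for s, t, e in records if e == 1 and t <= horizon]
    controls = [s for s, t, e in records if t > horizon]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    concordant = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                concordant += 1.0   # case ranked as higher risk: concordant pair
            elif case_score == control_score:
                concordant += 0.5   # ties count as half
    return concordant / (len(cases) * len(controls))

# Hypothetical (risk score, follow-up time in years, event) records.
records = [
    (2.1, 1.0, 1),
    (1.8, 2.5, 1),
    (0.4, 6.0, 0),
    (0.9, 7.0, 1),
    (1.9, 5.5, 0),
    (0.2, 2.0, 0),  # censored before the horizon, so excluded
]

auc = auc_at_horizon(records, horizon=5.0)
print(round(auc, 3))
```

An AUC near 0.5 would mean the score ranks patients no better than chance, while values closer to 1 indicate better discrimination at that time point.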

| Construction and calibration of nomogram
The R package rms was used to construct a nomogram that contained the risk score and clinical factors such as age, gender and stage.
The nomogram can be used to predict the probable 3-year and 5-year survival of LUAD patients. The R package survival was used to plot the calibration curve of the nomogram, which intuitively demonstrates its predictive ability.
In the risk score formula, coef(lncRNAn) is the coefficient of each lncRNA correlated with survival, and expr(lncRNAn) is the expression level of that lncRNA.

| Statistical analysis
We used R software (version 3.6.2) to perform all statistical analyses.
A Sankey diagram and Cytoscape software were used to visualize the prognostic autophagy-related lncRNA-mRNA co-expression network. Functional annotation was performed using gene set enrichment analysis (GSEA, https://www.gsea-msigdb.org/gsea/index.jsp). GSEA is an important gene annotation tool 19 that analyses and annotates genome-wide data, thus avoiding the omission of key information. P < .05 was considered statistically significant.

| Autophagy-related lncRNA with significant prognostic value in LUAD
Construction of the autophagy-related mRNA-lncRNA co-expression network yielded a total of 1651 autophagy-related lncRNAs. Among them, Cox proportional hazards analysis and Kaplan-Meier analysis showed that 33 autophagy-related lncRNAs were significantly associated with the survival of LUAD patients from TCGA (P < .01), including 23 low-risk lncRNAs (hazard ratio (HR) < 1) and 10 high-risk lncRNAs (HR > 1) (Table 1).
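In a Cox model the hazard ratio of a covariate is the exponential of its regression coefficient, so the low-risk versus high-risk labelling above follows directly from the sign of the coefficient. The Python sketch below illustrates this with hypothetical coefficients; the lncRNA names and values are invented, and placing an HR of exactly 1 in the high-risk list is an arbitrary choice for the example.

```python
import math

def hazard_ratio(coef):
    """Hazard ratio for a Cox regression coefficient: HR = exp(coef)."""
    return math.exp(coef)

def classify_by_hr(coefs):
    """Split lncRNAs into low-risk (HR < 1) and high-risk (HR >= 1) lists."""
    low_risk, high_risk = [], []
    for name, beta in coefs.items():
        if hazard_ratio(beta) < 1:
            low_risk.append(name)
        else:
            high_risk.append(name)
    return low_risk, high_risk

# Hypothetical univariate Cox coefficients (not values from Table 1).
coefs = {"lncRNA_a": -0.30, "lncRNA_b": 0.25, "lncRNA_c": -0.05}

low_risk, high_risk = classify_by_hr(coefs)
print(low_risk, high_risk)
```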
Furthermore, multivariate Cox analysis selected 11 of these 33 prognostic autophagy-related lncRNAs: GAS6-AS1, AC106047.1, AC010980.2, AL034397.3, NKILA, AL606489.1, HLA-DQB1-AS1, LINC01116, LINC01806, FAM83A-AS1 and AC090559.1 (Table 2). We used these 11 lncRNAs to establish the optimal prognostic risk model and built a visual co-expression network of the prognostic autophagy-related lncRNAs and mRNAs (Figure 1). According to the risk score formula, LUAD patients were divided by the median risk score into a high-risk group and a low-risk group. Likewise, for each of the 11 lncRNAs, patients were divided by the median expression into a high expression group and a low expression group.
Kaplan-Meier survival analysis showed that the overall survival (OS) of the high-risk group was worse than that of the low-risk

| Evaluation of the survival model for LUAD patients
To evaluate whether the survival model of 11 autophagy-related lncRNAs was an independent prognostic factor for LUAD, univariate and multivariate Cox regression analyses were performed.
The hazard ratio (HR) of the risk score was 1.256 (95% CI 1.196-

| GSEA enrichment
GSEA showed different gene expression patterns between the high-risk group and the low-risk group. In the high-risk group, the expression of genes related to cell cycle and mismatch repair was higher, while in the low-risk group, the expression of genes related to cell cycle and mismatch repair was lower (Figure 7).
FIGURE 7 Gene Ontology (GO) and KEGG analyses of the 11 autophagy-related lncRNAs by GSEA
To date, the focus of precision genomic medicine has been to find accurate and specific predictors of survival and prognosis from large medical data sets with clinical outcomes. Therefore, in recent years, several studies have used bioinformatics analysis to explore prognostic factors related to autophagy. [23][24][25] In the past year, three prognostic risk models of autophagy-related genes in LUAD were established based on the TCGA database, using different screening criteria and statistical methods. 26
However, our research has the following two limitations. First, we used traditional statistical analysis methods to establish and evaluate the prognostic risk model of 11 autophagy-related lncRNAs. Although these methods have been applied and verified in many studies, more advanced methods and technologies are needed to improve our future research. Second, to further verify our bioinformatics predictions, the 11 autophagy-related lncRNAs need to be studied further, including functional experiments and investigation of molecular mechanisms.

| CONCLUSION
In conclusion, we identified a novel autophagy-related survival model

ACKNOWLEDGMENTS
Not applicable.

CONFLICT OF INTEREST
The authors declare that they have no competing interests.

ETHICAL APPROVAL
Not applicable.

DATA AVAILABILITY STATEMENT
All data used in this study were acquired from The Cancer Genome Atlas (TCGA) portal.