Characterization of a non‐coding RNA‐associated ceRNA network in metastatic lung adenocarcinoma

Abstract Lung adenocarcinoma (LUAD) is a highly malignant cancer. Although competing endogenous RNA (ceRNA)‐based profiling has been investigated in patients with LUAD, it has not been specifically used to study metastasis in LUAD. We found 130 differentially expressed (DE) lncRNAs, 32 DE miRNAs and 981 DE mRNAs from patients with LUAD in The Cancer Genome Atlas (TCGA) database. We analysed the functions and pathways of 981 DE mRNAs using the Gene Ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Based on the target DE mRNAs and DE lncRNAs of DE miRNAs, we established an lncRNA‐miRNA‐mRNA ceRNA network, comprising 37 DE lncRNAs, 22 DE miRNAs and 212 DE mRNAs. Subsequently, we constructed a protein‐protein interaction network of DE mRNAs in the ceRNA network. Among all, DE RNAs, 5 DE lncRNAs, 5 DE miRNAs and 45 DE mRNAs were confirmed found to be associated with clinical prognosis. Moreover, 3 DE lncRNAs, 4 DE miRNAs and 9 DE mRNAs in the ceRNA network were associated with clinical prognosis. We further screened 3 DE lncRNAs, 3 DE miRNAs and 3 DE mRNAs using clinical samples. These DE lncRNAs, DE miRNAs and DE mRNAs in ceRNA network may serve as independent biomarkers of LUAD metastasis.


FAN et Al.
Cancer progression is closely associated with cancer cell metastasis in LUAD. Using CRISPR/Cas9 technology, Huang et al found that catenin knockout attenuates tumour cell metastasis and that catenin effectively enhances cell proliferation and cell cycle through Wnt signalling. 6 DNA damage results in cell death, although cancer stem cells are able to survive for long periods by avoiding DNA damage. Poly (ADP-ribose) polymerase 1 promotes DNA repair in cancer cells 7 and is associated with distant metastasis in patients with LUAD. 8 High α1-antitrypsin expression predicts poor prognosis in 88 patients with LUAD and enhances lung cancer cell invasion and metastasis. 9 The serine protease inhibitor Kazal type 1, a defined prognostic biomarker, was also reported to control cancer cell growth and metastasis. 10 Although cancer metastasis is regulated by various proteins, it is important to note that there are many other factors that regulate these proteins in the tumour microenvironment.
Long non-coding RNA (lncRNAs) are a type of non-coding RNAs that are longer than microRNAs (miRNAs). Numerous recent studies have demonstrated that lncRNAs are abnormally expressed in tumour cells and can interact with miRNA or mRNA to regulate tumour progression. CAR10, a type of lncRNA, is up-regulated in patients with LUAD and increases cancer cell metastasis and epithelial-to-mesenchymal transition by binding to miRNA30 and miRNA203. 11 MALAT1, HCP5, UCA1 and AS1 have been found to promote tumour development and metastasis, and are associated with the clinical prognosis of LUAD. [12][13][14][15] MiRNAs, comprising 22 nucleotides, are another type of non-coding RNAs that regulate mRNA expression by binding to the target region in mRNA. 16 miR-NAs control tumour cell stemness and metastasis by acting either as oncogenes or as tumour suppressors. For example, miRNA-520c-3p is expressed at low levels at the tumour site and negatively regulates the biological behaviour of cancer cells through binding to Akt1 and Akt2. 17 MiRNA-126-3p and miRNA-126-5p are involved in the progression of LUAD and are positively associated with the TNM stage and metastasis. 18 In recent years, accumulating evidence has suggested that competing endogenous RNA (ceRNA) competitively binds to miRNA to regulate miRNA or mRNA expression. 19,20 However, investigation of ceRNAs relationships (either single or multiple) in isolation is not ideal, as the tumour microenvironment is very complex, and isolation studies cannot correctly mimic the complexity of this microenvironment.
The Cancer Genome Atlas (TCGA) is a public database that hosts a huge amount of sequencing data from many types of cancers, including lung, liver, colon and breast. 21 It is of great importance to study the complex crosstalk within ceRNA networks by analysing sequencing data. Sui et al analysed the lncRNA-miRNA-mRNA network using TCGA data from patients with LUAD and found many differentially expressed (DE) lncRNAs, DE miRNAs and DE mRNAs in patients with stages I-II and III-IV LUAD. 22 ceRNAs in patients with LUAD have been widely researched in recent years using TCGA data. [23][24][25] However, ceRNA networks have not been previously investigated in the context of LUAD metastasis. Therefore, our objective was to obtain an in-depth understanding of the ceRNA network in cancer metastasis, using metastatic and non-metastatic tumour data from the TCGA database.

| TCGA data from patients with LUAD
Sequencing data for lncRNA, miRNA and mRNA from patients with LUAD were collected from TCGA (https://portal.gdc.cancer.gov/), and the clinical parameters of all patients were listed (Table S1). lncRNA-seq and mRNA-seq data were normalized using the FPKM method according to the formula FPKM = (RC g × 10 9 )/ (RC pc × L). In addition, two separate files (mirnas.quantification.txt and isoforms.quantification.txt) were used to read the 'per million mapped reads' for miRNA-seq data normalization. Patients with LUAD were screened for further analyses according to the follow-

| Analysis of DE RNAS (lncRNA, miRNA and mRNA) in M0 and M1 patients
The random variance model t test was used to identify the DE RNAs (lncRNA, miRNA and mRNA) after lncRNA-seq, mRNA-seq and miRNA data in M0 and M1 patients were normalized. DE RNAs were screened using the false discovery rate (FDR) method; those with fold change > 1.2 and P < .05 were considered significantly different. [26][27][28] Hierarchical cluster analysis was performed by EPCLUST.

| DE mRNA functional enrichment and pathway analysis
DE mRNA functional enrichment was analysed by screening against the Gene Ontology (GO) database to obtain information regarding the biological functions of mRNA. 29 In addition, Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis was used to enrich the differential pathways in M0 and M1 patients on the basis of the DE mRNAs. 30 Fisher's exact test was used to select significantly different functions and pathways (P < .05).  31 We selected the DE mRNAs in the ceRNA network and predicted the protein-protein interactions (PPIs) using the String database.

| Survival analysis of DE lncRNAs, DE miRNAs and DE mRNAs
Patients with LUAD and survival information in the TCGA database were divided into two groups, low expression and high expression, according to the medians of DE lncRNAs, DE miRNAs and DE mRNAs, respectively. Then, these DE RNAs were selected to analyse survival using the univariate Cox proportional hazards regression model. However, this method was not applicable to analyse the survival curves crossed, and therefore, the two-stage procedure was used. P < .05 was considered statistically significant.

| RNA extraction
Tumour and normal tissues from the 41 patients with LUAD were collected for RNA extraction. Tissues were cut into small pieces and placed into RNase-free tubes, and RNA was extracted using Trizol (Takara).
Briefly, chloroform was used to separate RNA, which was precipitated with isopropanol. Finally, RNA was washed in 75% ethanol and the RNA concentration was measured using Nanodrop 2000 (Agilent).

| Real-time quantitative polymerase chain reaction (QRT-PCR)
cDNA was synthesized from mRNA and lncRNA according to the manufacturer´s instructions (Takara, Japan), and the miScript II RT kit (Qiagen) was used for cDNA synthesis from miRNA. Finally, the SYBR Premix Ex Taq was used for qRT-PCR according to the manufacturer's instructions using the following reaction conditions: 95°C for 30 s, followed by 40 cycles of 94°C for 5 s and 60°C for 30 s. Each 20 μL reaction included 10 μL SYBR Green (Roche), 2 μL cDNA which was diluted 5-10 times, 10 μ M primer forward 0.8 μL, 10 μM primer reverse 0.8 μL and 6.4 μL water.

| Go enrichment and KEGG pathways associated with de mRNAmRNA
In patients with LUAD, 981 DE mRNAs were annotated according to information in the GO database. We enriched the functions of these To further identify the biological processes associated with the 981 DE mRNAs, we analysed the KEGG pathways associated with these mRNAs. We found that the up-regulated mRNAs were positively associated with some KEGG pathways, such as the metabolic pathway, cell cycle, p53 and PI3K-Akt signalling pathways, and signalling pathways regulating stem cell pluripotency ( Figure 2C).
These pathways have all been reported to promote cancer progression and metastasis. 35,36 Meanwhile, down-regulated mRNAs were associated with the B cell receptors, cell adhesion molecules and chemokine signalling pathways ( Figure 2D).

| Cerna regulatory network in patients with LUAD
Next, using the miRanda, Targetscan and miRWalk websites, we pre- down-regulated miRNAs). In this ceRNA network, some of the miRNAs (such as miRNA625, miRNA20a and miRNA149) have been reported to serve as tumour suppressors or oncogenes.

| PPI regulatory network in patients with LUAD
Based on the mRNAs in the lncRNA-miRNA-mRNA ceRNA network, we established a PPI regulatory network using the String database to better present the protein-protein interaction in M1 patients with LUAD ( Figure 4). In the PPI regulatory network, the proteins, SNAP25, CCNE2, CDC5L and DICER1, were up-regulated and had a strong regulatory capacity; FCGR2B, VAV1, WAS, LYN and CSF1 were down-regulated proteins that play important roles.

| Association between survival and the expression of lncRNAs, miRNAs and mRNAs in the cerna network in patients with LUAD
To further explore whether metastasis-associated lncRNAs, miR-  with higher expression of GGTA1P had better prognosis than those with a lower expression ( Figure 6A). In contrast, patients with lower expression of RP11-284F21.9 showed a better clinical prognosis than those with higher expression ( Figure 6A). In patients with LUAD, low expression of hsa-miR-3934-3p and hsa-miR-150-5p was correlated with poor prognosis ( Figure 6B).
CBX5 and WDR76 were highly expressed in M1 patients, and survival analysis showed that patients expressing low CBX5 and WDR76 had better clinical prognosis ( Figure 6C). In addition, we performed uni-factor cox regression analysis to examine the relationship between these DE RNAs and prognostic of lung cancer patients ( Figure S2). We found that five of the six DE RNAs qRT-PCR, we were able to verify that 3 lncRNAs, 3 miRNAs and 3 mRNAs exhibited similar expression trends as those predicted using bioinformatic analyses ( Figure S4).

| D ISCUSS I ON
In this study, we aimed to investigate the factors affecting metastasis in patients with LUAD. To comprehensively analyse these factors, we obtained RNA sequencing data (lncRNA, miRNA and mRNA) from the TCGA database. We used bioinformatics to predict and explore cancer development and metastasis. First, we screened 981 DE mRNAs, 130 DE lncRNAs and 32 DE miRNAs, and found that they were dysregulated in M0 but not in M1 patients with LUAD.
Then, we further analysed the functions of these differentially expressed mRNAs based on the GO and KEGG pathway databases.
We established a ceRNA network, based on the DE lncRNAs, DE miRNAs and DE mRNAs, to visualize the interactions among these RNAs. In addition, we analysed the association between the differentially expressed RNAs in the ceRNA network and clinical prognosis in patients with LUAD. Therefore, this study helps to provide a better understanding of the mechanisms underlying cancer metastasis.  47 Reportedly, ceRNA networks can also be constructed to analyse the regulatory factors in one signalling pathway. 48 Here, we found that the DE lncRNAs (GGTA1P, LINC00125, LINC00261, RP11-284F21.9 and RP11-284F21.1) in the ceRNA network were negatively or positively associated with clinical prognosis in patients with LUAD. In triple-negative breast cancer, patients with higher GGTA1P expression exhibit better relapse-free survival. 49 LINC00261 expression plays an important role in cell proliferation, apoptosis and invasion in choriocarcinoma and gastric cancer, 50,51 and was identified as a novel prognostic biomarker in pancreatic cancer. 52 Additionally, LINC00261 is able to bind to hsa-miR-182, hsa-miR-183, hsa-miR-153 and hsa-miR-27a; further, hsa-miR-96 was able to regulate FOXO1 mRNA expression, suppressing cell proliferation, migration and invasion of endometrial cancer cells. 53 Finally, based on the survival analysis of DE RNAs in ceRNA network and experimental data, we identified 3 lncRNAs, 3 miRNAs and 3 mRNAs. We will further investigate which mRNAs play major roles in regulating metastasis, and which lncRNAs and miRNAs regulate these mRNAs.
In conclusion, we used an integrative biological approach to analyse DE mRNAs, lncRNAs and miRNAs in patients with LUAD in the M0 and M1 patient groups. Moreover, we obtained information from the GO and KEGG databases to understand the functions and pathways associated with these screened mRNAs, lncRNAs and miRNAs. The lncRNA-miRNA-mRNA ceRNA network and PPI networks were established, revealing a novel regulatory mechanism for further investigation of LUAD. Finally, we also screened the DE mRNAs, lncRNAs and miRNAs in the network with respect to the clinical prognosis (using RT-PCR), which may serve as independent biomarkers of LUAD metastasis.

ACK N OWLED G EM ENTS
The work was supported by grants from the National Natural Science Foundation of China (Grant no. 81872410 and 81602024).

CO N FLI C T O F I NTE R E S T
The authors confirm that there are no conflicts of interest.

DATA AVA I L A B I L I T Y S TAT E M E N T
The data that support the findings of this study are available from the corresponding author upon reasonable request.