Screening key lncRNAs for human rectal adenocarcinoma based on lncRNA‐mRNA functional synergistic network

Abstract Background: Rectal adenocarcinoma (READ) is one of the deadliest malignancies, and the molecular mechanisms underlying the initiation and development of READ remain largely unknown. In this study, we aimed to find key long noncoding RNAs (lncRNAs) and mRNAs in READ by RNA sequencing. Methods: RNA sequencing was performed to identify differentially expressed mRNAs (DEmRNAs) and lncRNAs (DElncRNAs) between READ and normal tissue. READ‐specific protein‐protein interaction (PPI), DElncRNA‐DEmRNA coexpression, and DElncRNA‐nearby DEmRNA interaction networks were constructed. DEmRNAs and DEmRNAs coexpressed with DElncRNAs were functionally annotated. Results: A total of 2113 DEmRNAs and 150 DElncRNAs between READ and normal tissue were identified. The PPI network identified several hub proteins, including CDK1, AURKB, CDC6, FOXQ1, NUF2, and TOP2A. The DElncRNA‐DEmRNA coexpression and DElncRNA‐nearby DEmRNA interaction networks identified some hub lncRNAs, including CCAT1, LOC105374879, GAS5, and B3GALT5‐AS1. The colorectal cancer pathway, the intestinal immune network for IgA production, and the p53 signaling pathway were three pathways significantly enriched in DEmRNAs and DEmRNAs coexpressed with DElncRNAs. MSH6 coexpressed with two DElncRNAs (LOC105374879 and CASC15) and BCL2 coexpressed with B3GALT5‐AS1 were significantly enriched in the colorectal cancer signaling pathway. TNFRSF17 coexpressed with B3GALT5‐AS1 was enriched in the intestinal immune network for IgA production. CCNB2 coexpressed with LOC105374879 was enriched in the p53 signaling pathway. Conclusion: A total of four DEmRNAs (MSH6, BCL2, TNFRSF17, and CCNB2) and three DElncRNAs (LOC105374879, CASC15, and B3GALT5‐AS1) may be involved in the pathogenesis of READ; these data may contribute to understanding the mechanisms of READ and the development of therapeutic strategies for READ.


| INTRODUCTION
Colorectal cancer is one of the most common malignant tumors causing cancer-related deaths and has one of the highest incidence rates among all types of cancer worldwide. 1 Rectal adenocarcinoma (READ) is a common type of colorectal cancer. 2 Although research has advanced the treatment, diagnosis, and prognosis of READ, its mortality remains high, which may be due to the lack of efficient biomarkers for READ and the unclear mechanisms underlying READ. Hence, identifying efficient biomarkers and deciphering the detailed molecular mechanisms underlying READ are urgently required.
In the field of gene-gene network analysis, the construction of coexpression networks has opened up enormous possibilities for exploring the role of genes in biological processes. 3 Coexpression analysis of lncRNAs and mRNAs is the most commonly used approach to screen potential target genes of lncRNAs and to study the biological functions of lncRNAs in many kinds of diseases. 4,5 High-throughput genetic analysis has shown that a large portion of the genome is transcribed, which led to the discovery of the extensive transcription of large RNA transcripts named long noncoding RNAs (lncRNAs). 6,7 An accumulating number of reports of aberrant lncRNA expression have demonstrated that lncRNAs may serve as novel independent biomarkers for the early diagnosis, prognosis, and metastasis prediction of various cancer types. 8-11 Recently, lncRNA profiling has been performed in several other types of colorectal cancer, which identified novel candidate diagnostic and prognostic biomarkers, such as SNHG6, PVT1, ZFAS1, LINC01555, RP11-610P16.1, RP11-108K3.1, and LINC01207. 12,13 However, research on lncRNA biomarkers in READ is rare.
Owing to the limited research linking lncRNAs with READ, this study aimed to further investigate this issue. In this study, RNA sequencing was performed to identify DEmRNAs and DElncRNAs between READ and normal tissue. READ-specific protein-protein interaction (PPI), DElncRNA-DEmRNA coexpression, and DElncRNA-nearby DEmRNA interaction networks were constructed. The functional annotation of DEmRNAs and DEmRNAs coexpressed with DElncRNAs was performed. Our study identified potential key genes and lncRNAs in READ and provides further insights into the mechanisms and predictive capacity of lncRNAs in READ.

| Patients
Three patients with READ were enrolled in our study. One tumor tissue sample and one paired adjacent normal sample were collected from each patient, giving three READ samples and three normal samples. The tissue samples were biopsy samples obtained from surgery. The detailed characteristics of the patients are displayed in Table 1. All the participants submitted signed informed consent forms, and the protocols were approved by the ethical committee of our hospital.

| RNA isolation, library construction, and sequencing
Total RNA was extracted from the samples using TRIzol reagent (Invitrogen, Carlsbad, CA). A Nanodrop ND-2000 spectrophotometer (Thermo Scientific, Wilmington, DE) was used to check the RNA concentration and purity. The integrity of the RNA was assessed by agarose gel electrophoresis. The RIN value was obtained with an Agilent 2100 Bioanalyzer. The criteria for cDNA library construction were as follows: (a) total RNA >5 μg; (b) concentration of RNA ≥200 ng/μL; and (c) an OD 260/280 value of 1.8-2.2.
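The three library-construction criteria above can be expressed as a single check. The following is a minimal sketch only; the function name `passes_library_qc` and its parameter names are hypothetical and are not part of the original protocol.

```python
def passes_library_qc(total_ug, conc_ng_per_ul, od_260_280):
    """Return True if an RNA sample meets the three stated criteria.

    Thresholds are taken from the text: total RNA > 5 ug,
    concentration >= 200 ng/uL, and OD 260/280 between 1.8 and 2.2.
    The function and argument names are illustrative assumptions.
    """
    enough_rna = total_ug > 5
    concentrated = conc_ng_per_ul >= 200
    pure = 1.8 <= od_260_280 <= 2.2
    return enough_rna and concentrated and pure
```

A sample with 6 μg of RNA at 250 ng/μL and an OD 260/280 of 2.0 would pass, whereas a sample with exactly 5 μg would not, because the first criterion is a strict inequality.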
Ribosomal RNA was removed with a Ribo-Zero Magnetic kit (EpiCentre, Madison, WI), and the RNA was purified and fragmented into 200-500-base pair fragments. The RNA fragments were primed with random hexameric primers, and the first cDNA strand was synthesized, with the second cDNA strand synthesized with dUTP instead of dTTP. After purification with AMPure XP Beads (Beckman Coulter, Brea, CA), end repair, adenylation of the 3′ ends and adapter ligation were performed. Polymerase chain reaction (PCR) was performed to construct a library for the high-throughput sequencing of lncRNA, and the mRNA from the second cDNA strand was digested using UNG enzyme (Illumina, Inc, San Diego, CA). All libraries used for the high-throughput

| Functional annotation
GeneCodis 3 (http://genecodis.cnb.csic.es/analysis) is an online software tool for functional annotation analysis used to reveal the biological functions related to large lists of genes. Gene Ontology (GO) classification (biological process, cellular component, and molecular function) is a major bioinformatics analysis method for annotating genes. The Kyoto Encyclopedia of Genes and Genomes (KEGG) is a database used to determine the biological systems associated with the output of high-throughput experimental technologies. GO classification and KEGG pathway enrichment analyses were performed using GeneCodis 3. A false discovery rate (FDR) <0.05 was used to indicate statistical significance.
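To illustrate what an FDR <0.05 enrichment cutoff involves, the sketch below uses a hypergeometric test followed by Benjamini-Hochberg correction. This is a common approach to enrichment analysis and is an assumption here; the text does not describe the statistics GeneCodis 3 applies internally. The function names (`hypergeom_sf`, `benjamini_hochberg`, `enrich`) and the input layout are hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k): chance of seeing at least k term genes when n genes
    are drawn from a background of N genes, K of which carry the term."""
    upper = min(K, n)
    total = comb(N, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the
    # adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def enrich(gene_list, term_to_genes, background_size, fdr_cutoff=0.05):
    """Return {term: FDR} for terms enriched in gene_list (assumed layout:
    term_to_genes maps a term name to the set of genes annotated to it)."""
    genes = set(gene_list)
    terms = list(term_to_genes)
    pvals = []
    for term in terms:
        members = term_to_genes[term]
        overlap = len(genes & members)
        pvals.append(hypergeom_sf(overlap, background_size, len(members), len(genes)))
    fdr = benjamini_hochberg(pvals)
    return {t: q for t, q in zip(terms, fdr) if q < fdr_cutoff}
```

The correction step matters because many GO terms and KEGG pathways are tested at once; filtering on raw P values alone would admit more false positives.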

| PPI network construction
The top 100 upregulated or downregulated DEmRNAs in READ were used to build a PPI network using the Biological General Repository for Interaction Datasets (BioGRID) (http://thebiogrid.org/) and Cytoscape 3.5.0 (http://www.cytoscape.org/). We used nodes to represent proteins and edges to represent the interactions between two proteins.
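The node-and-edge representation described above can be sketched in a few lines. The example below assumes that hub proteins are the nodes with the most interaction partners (highest degree), which is a common convention but is not stated in this section. The function names and the input format (a list of protein pairs) are hypothetical and do not reflect the BioGRID file format or the Cytoscape interface.

```python
from collections import defaultdict


def build_network(interactions):
    """Build an undirected adjacency map from (protein_a, protein_b) pairs."""
    adjacency = defaultdict(set)
    for a, b in interactions:
        if a == b:
            continue  # skip self-interactions; they add no partner
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def hub_proteins(adjacency, top_n=3):
    """Rank proteins by degree (number of distinct partners), highest first.

    Ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(adjacency, key=lambda p: (-len(adjacency[p]), p))
    return [(p, len(adjacency[p])) for p in ranked[:top_n]]
```

Because partners are stored in sets, an interaction reported twice or in both directions is counted once.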

| DEmRNA-DElncRNA interaction analysis
To identify DEmRNAs near DElncRNAs with cis-regulatory effects, DEmRNAs transcribed within a 100 kb window upstream or downstream of DElncRNAs in READ and normal controls were identified. In addition, DEmRNAs coexpressed with DElncRNAs were identified. Pairwise Pearson correlation coefficients between DEmRNAs and DElncRNAs were calculated. DElncRNA-DEmRNA pairs with P < 0.001 and |r| ≥ 0.98 were defined as significant mRNA-lncRNA coexpression pairs.
Heatmaps of the top 100 DEmRNAs and all DElncRNAs between READ and normal tissue are shown in Figure 1A,B, respectively. Circos plots representing the distribution of DElncRNAs and DEmRNAs on chromosomes are shown in Figure 1C.
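The two screening steps above, the 100 kb neighborhood search and the Pearson correlation filter, can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline actually used: a DEmRNA is treated as "nearby" when its interval overlaps the DElncRNA interval extended by 100 kb on each side, and only the |r| ≥ 0.98 filter is applied. The P < 0.001 filter is omitted because it requires a t distribution with n − 2 degrees of freedom (available, for example, through `scipy.stats.pearsonr`). All function names and the tuple layout are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length vectors with at least 3 samples")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("constant expression vector")
    return sxy / sqrt(sxx * syy)


def coexpressed_pairs(lnc_expr, mrna_expr, r_cutoff=0.98):
    """Return (lncRNA, mRNA, direction, r) for pairs with |r| >= r_cutoff.

    lnc_expr and mrna_expr map a gene name to its per-sample expression
    values; the sample order must match between the two (assumed layout).
    """
    pairs = []
    for lnc, x in lnc_expr.items():
        for gene, y in mrna_expr.items():
            r = pearson_r(x, y)
            if abs(r) >= r_cutoff:
                direction = "positive" if r > 0 else "negative"
                pairs.append((lnc, gene, direction, r))
    return pairs


def nearby_mrnas(lnc, mrnas, window=100_000):
    """Names of mRNAs overlapping the lncRNA interval padded by `window` bp.

    lnc and each entry of mrnas are (name, chrom, start, end) tuples.
    """
    _, chrom, start, end = lnc
    hits = []
    for name, g_chrom, g_start, g_end in mrnas:
        if g_chrom != chrom:
            continue
        if g_start <= end + window and g_end >= start - window:
            hits.append(name)
    return hits
```

Keeping the sign of r in the output allows positively and negatively coexpressed pairs to be separated later without recomputing the correlation.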

| Functional annotation of DEmRNAs in READ
DEmRNAs were used for GO and KEGG enrichment analyses. GO

| DISCUSSION
READ is one of the deadliest malignancies, and the molecular mechanisms underlying the initiation and development of READ remain largely unknown. Hence, comprehensive detailing of its mechanisms is critical. An increasing number of studies have explored the important regulatory effects of lncRNAs on tumor formation and metastasis. Here, DEmRNAs and DElncRNAs in READ were studied using RNA sequencing. A total of 2113 DEmRNAs (809 downregulated and 1304 upregulated mRNAs) and 150 DElncRNAs (81 downregulated and 69 upregulated lncRNAs) between READ and normal tissue were identified. Additionally, we constructed a READ-specific PPI network, a DElncRNA-DEmRNA coexpression network, and a DElncRNA-nearby DEmRNA interaction network. DEmRNAs and DEmRNAs coexpressed with DElncRNAs were also functionally annotated.
Coexpression networks have been used in other studies to identify important modules associated with cancer and the  Most network construction techniques can only address positive correlations in gene expression data, whereas biologically significant genes exhibit both positive and negative correlations. 3 In this study, positively correlated DEmRNAs and DElncRNAs in READ were defined as positively coexpressed DElncRNA-DEmRNA pairs, and negatively correlated DEmRNAs and DElncRNAs were defined as negatively coexpressed DElncRNA-DEmRNA pairs.
CDK1, AURKB, CDC6, FOXQ1, NUF2, and TOP2A were the hub proteins of the READ-specific PPI network. CDK1, a member of the cyclin-dependent kinase (CDK) family, is a serine/threonine kinase that promotes the G2-M transition and regulates G1 progression and the G1-S transition. 15 CDK1 is overexpressed in human colorectal cancers and is relevant to their clinical behavior: a high ratio of nuclear to cytoplasmic CDK1 expression was associated with poor overall survival, and CDK1 was an independent risk factor for outcome. 16,17 AURKB, a member of the aurora kinase family, is an important diagnostic and prognostic marker involved in the carcinogenesis of colorectal cancers. 18 FOXQ1 is frequently upregulated in colorectal cancers, and FOXQ1 knockdown suppressed the proliferation, migration, and invasion of colorectal cancer cells. 19 TOP2A is a potential predictive biomarker for anthracycline and irinotecan treatment in colorectal cancer, and a high frequency of gene gains for TOP1 and TOP2A was reported in colorectal cancers. 20 Elevated NUF2 expression was associated with poor prognosis in colorectal cancer, and the knockdown of NUF2 expression suppressed the growth of tumor cells. 21 Therefore, we speculated that CDK1, AURKB, FOXQ1, NUF2, and TOP2A might play important roles in READ. Interaction network analysis showed that AURKB was coexpressed with SNHG5 and that FOXQ1 was coexpressed with LOC105374879. Hence, we further hypothesized that SNHG5 and LOC105374879 might play important roles in READ by regulating AURKB and FOXQ1, respectively.
CCAT1 is upregulated in colorectal cancer but not in normal tissue. 22 A CCAT1-specific peptide nucleic acid-based molecular beacon was reported to serve as a powerful diagnostic tool for the specific identification of colorectal cancer. 23 GAS5 is associated not only with susceptibility to colorectal cancer but also with the metastasis of colorectal cancer to the lymph node. 24 SLCO1B3, a solute carrier organic anion transporter family member, is upregulated in colorectal cancer. 25 The overexpression of SLCO1B3 changed p53-dependent pathways and conferred apoptotic resistance in colorectal cancer. 26 SLCO1B3 protein expression was significantly correlated with proximal tumor location and the expression of mismatch repair genes, and SLCO1B3 was identified as a cell-surface marker differentially expressed in colon adenocarcinoma relative to its expression in the surrounding normal colon tissue. 27 In this study, SLCO1B3 was coexpressed with CCAT1 and GAS5. Therefore, we presumed that both CCAT1 and GAS5 might be involved in the development of READ by regulating SLCO1B3.
According to KEGG pathway enrichment analysis of DEmRNAs and DEmRNAs coexpressed with DElncRNAs, the p53 signaling pathway, intestinal immune network for IgA production and colorectal cancer pathway were three READ-related pathways. MSH6 coexpressed with two DElncRNAs (LOC105374879 and CASC15) and BCL2 coexpressed with B3GALT5-AS1 were significantly enriched in the colorectal cancer signaling pathway. TNFRSF17 coexpressed with B3GALT5-AS1 was enriched in the intestinal immune network for IgA production. CCNB2 coexpressed with LOC105374879 was enriched in the p53 signaling pathway. MSH6 is a mismatch repair gene involved in colorectal cancers, and it was reported that most patients with colorectal