Construction of circular RNA‐based competing endogenous RNA network in colon cancer

Circular RNAs (CircRNAs) are now under discussion as novel promising biomarkers for patients with various tumors. Our study aimed to identify circRNAs related to the progression of colon cancer and to further investigate their roles in the development of colon cancer base on The Cancer Genome Atlas database. The competing endogenous RNA (ceRNA) network was constructed based on 4 circRNAs, 14 miRNAs, and 229 mRNAs. Meanwhile, three hub genes were identified via the protein protein interaction networks (PPI) network. It should be noted that only RUNX1 targeted by circ‐0084615 and mir‐330‐3p was confirmed as a survival‐related gene. Further analyses indicated the possible function of RUNX1 in colon progression, such as angiogenesis, epithelial–mesenchymal transition, hedgehog signaling and so on. Further investigations were conducted into the correlation between RUNX1 expression profiles and immune cells. Finally, we validated the expression of the circ‐0084615/mir‐330‐3p/RUNX1 axis in clinical specimens. In conclusion, we constructed ceRNA network and revealed that the circ‐0084615/mir‐330‐3p/RUNX1 axis serves as a critical role in colon cancer.

some translation possibilities of circRNA, 8,9 most of them are defined as untranslated ncRNA in the cytoplasm. 10,113][14] Furthermore, circRNAs have not been extensively studied in the context of colon cancer biology or metastasis.
In this study, we explore the expression patterns and origins of circRNA in colon cancer, delving into its involvement in both the onset and advancement of the disease.Furthermore, we selected key circRNA molecules and their downstream target genes in the regulatory network that have significant prognostic significance, and conducted in-depth analysis of their biological and immune functions.

| Data acquisition
The circRNA expression profiles for human samples derived from patients with colon cancer were obtained from Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/).We selected the GSE126094, GSE142837, and GSE138589 circRNA expression profiles, which were based on the same platform: GPL19978 Arraystar Human CircRNA microarray V2.The GSE126094, GSE142837, and GSE138589 circRNA expression profiles included 10 colon cancer tissues and 10 adjacent normal tissues, 5 colon cancer tissues and 5 adjacent normal tissues and 6 colon adenocarcinoma tissues and 6 adjacent normal tissues, respectively.
In addition, transcriptome data of colon cancer used in this study were derived from The Cancer Genome Atlas (TCGA).We downloaded the transcriptome data of colon cancer from the TCGA database through the R package, including 39 cases of adjacent normal tissues and 451 cancer cases.Furthermore, relevant clinical information including age, gender, stage, tumor/lymph node/metastasis staging, and survival status was also obtained (Table 1).R software "Limma" package was utilized to correct the transcriptome data.We obtained the drug sensitivity data of colon cancer cell line from the Genomics of Drug Sensitivity in Cancer database (https://www.cancerxgene.org/).
Then, we integrate and rank all of DECs from two circRNA chips with R package RobustRankAggreg.The HTSeq-Counts of miRNAs and mRNAs in the TCGA were analyzed to identify differentially expressed miRNAs (DEmiRs) and differentially expressed genes (DEGs) by utilizing the R software "edger" package.To detect more significant genes, values of jlog 2 (fold change [FC])j > 1 and p-value <.05 were set as cutoff criteria.Venn diagrams were constructed to determine intersections via VENNY 2.1 software (http://bioinfogp.cnb.csic.es/tools/index.html).

| Construction of PPI network and identification of hub genes
Using the search tool for the retrieval of the overlapping mRNAs (STRING, v11.0, https://string-db.org/), a PPI network was built.Visualization of the network was accomplished using Cytoscape.The Cytoscape plugin "MCODE" was used to identify significant module and hub genes, with cutoff criteria of degree ≥2 and k-core ≥2.

| Survival analysis and expression comparison of hub genes
Clinical data for TCGA colon cancer including survival time, survival status, and TNM staging were also downloaded from the TCGA database (samples with missing information were excluded).Survival R package was applied in survival analyses for hub genes.For the overall survival (OS) rates, the logrank test was used to detect significant differences.The results were visualized using Kaplan-Meier survival curves, and a p-value of <.05 was considered as statistically significant.

| Gene set enrichment analysis
Gene set enrichment analysis (GSEA) (Version 3.0, the broad institute of MIT and Harvard, http://software.broadinstitute.org/gsea/downloads.jsp)was conducted to study the biological characteristics of hub genes in colon cancer.In detail, the "collapse data set to gene symbols" was set to false, the number of marks was set to 1000, the "permutation type" was set to phenotype, the "enrichment statistic" was set to weighted, and the Signal2Noise metric was used for ranking genes.High expression group was used as experimental group, and low expression group was used as reference group."c2.cp.kegg.
v7.0.symbols.gmt"gene sets database was used for enrichment analysis.A gene set size >500 and <15, an False Discovery Rate (FDR) of <0.25, and a nominal p-value of <.05 were regarded as the cutoff criteria.

| Patients and specimens
Seventeen cases of colon cancer specimens and paired nontumor bowel tissues were collected from July 2017 to July 2019.Patients with the following criteria were excluded from participation: had received adjuvant chemotherapy or radiotherapy prior to surgery and had additional cancers diagnoses.All patients were classified according to the seventh edition of the TNM staging system 23.Postoperative adjuvant therapies were performed, according to standard schedules and doses.All participating patients gave their written informed consent.This study was approved by the Ethical Committee of Nanjing Red Cross Hospital.The clinical data of all patients were showed in Table S1.

| Statistical analysis
All analyses were performed using R 3. 3 | RESULTS

| Identification of DECs
The construction process of the whole competing endogenous RNA (ceRNA) network is displayed in Figure S1.According to the defined criteria, two colon cancer and one colon adenocarcinoma microarray dataset (GSE126094, GSE142837, and GSE138589) were enrolled in the study (Table 2).After normalization of batch effect, in total, 1850 and 43 circRNAs were differentially expressed between colon cancer and normal tissues in GSE126094 and GSE142837 (Figure 1A,B), and 186 CircRNAs were differentially expressed between colon adenocarcinoma and normal tissues in GSE138589 (Figure 1C).After the intersection, four circRNAs (circ-0000467, circ-0040809, circ-0084615, and circ-0000512) were suggested to be related to tumor initiation for further analysis (Figures 1D and S2 and Table 3).

| Identification of target miRNA and target mRNA
We to predict the target mRNA of these 14 miRNAs, respectively.Two hundred twenty-nine eligible genes (mRNAs) were identified as the final target genes by crossing with DEGs (Figure 1G,H).

| Functional annotation of the circRNAstargeted mRNA
In order to fully understand the biological relevance of these 229 DEGs, we conducted the KEGG and GO analyses.Based on the results of DAVID, the top enriched BP, CPs, and MF terms were: regulation of mitotic nuclear division, DNA damage checkpoint, cyclindependent protein kinase holoenzyme complex, replication fork, and so on (Figure S4A-C).The principal enriched biological pathways were cell cycle, p53 signaling pathway, and so on (Figure S4D).

| Construction of ceRNA network and PPI network
After multiple filters, 4 circRNAs, 14 miRNAs, and 229 target genes were screened out for further study.A ceRNA (circRNA/miRNA/ mRNA) network was established by combining the circRNA/miRNA interaction and miRNA/mRNA interaction, which initially shows the overall view of the regulatory network (Figure S5).String database provided a visual PPI regulatory network of target genes.We  visualized the PPI network results in the Cytoscape software (Figure S5B).According to app MCODE, the three central genes that play crucial roles in the network are RUX1, CTSV, and CBFB (Figure S5C).Those three hub genes constituted three circRNA/ miRNA/hub genes axes including circ-0084615/miR-330-3p/RUNX1, circ-0084615/miR-125b-2-3p/CTSV, and circ-0040809/miR-500b-3p/CBFP axes (Figure S5D).It should be noted that among all three hub genes, only high expression of RUNX1 was significantly related to poor OS and disease-free survival (Figure 2).To explore the biological relevance of RUNX1 involved in colon cancer progression, we performed a GO and KEGG of RUNX1 co-expressed genes based on the TCGA colon cancer cohort, as well as GSEA analysis.GO and KEGG pathway analyses indicated that RUNX1 took part in PI3K/Akt, MAPK, Wnt pathways, and so on (Figure 3A,B).Furthermore, the most cardinal pathway identified by GSEA was angiogenesis, epithelial-mesenchymal transition (EMT), and so on (Figure 3C).(Figure 5G-I).Taken together, these data revealed that RUNX1 was closely related to the formation of an immunosuppressive microenvironment and may be a potential immunotherapeutic target.

| Validation of circ-0084615/mir-330-3p/ RUNX1 axis in clinical specimens
In order to verify the cyclization of circRNAs, convergent primers and divergent primers were designed for four DECs.Only circ-0000467 and circ-0084615 were amplified in the cDNA but not the gDNA of HCT116 (Figure 6A).Sanger sequencing confirmed that the PCR products amplified using the divergent primers contained the head-to-tail splicing sites and the sequences were consistent with the circBase database annotation (Figure 6B).Subsequently, we investigated the miR-330-5p but positively co-expressed with RUNX1.In addition, miR-330-5p was negatively co-expressed with RUNX1 (Figure 6D).

| DISCUSSION
Based on the published databases, we identified the DECs between normal bowel tissues and colon cancer.Four DECs were finally confirmed (circ-0000467/circSKA3, circ-0040809/circBANP, circ-0084615/circASPH, and circ-0000512/circRPPH1).Among them, circ-0000467 was reported to be associated with progression and survival with gastric cancer. 15Circ-0040809 was associated with the survival of bladder cancer patients.circ-0084615 and circ-0000512 were rarely reported before.Furthermore, through the integrated analysis of multiple databases, a colon cancer-centered ceRNA network (circRNA/miRNA/mRNA) was established.We also established a PPI network to prominent the hub genes that may pertain to the progression of colon cancer.In addition, a functional analysis was conducted to speculate the biological function of hub genes in colon cancer.Ultimately, a core tumor suppressor axis (circRNA-0084615/ miR-330-3p/RUNX1) was identified and validated in clinical specimens.
There remain several limitations to our research.First, the ceRNA network we proposed is primarily based on bioinformatics analysis, and the circRNA/miRNA-mediated mRNA regulatory interactions need further validation through both in vitro and in vivo studies.Second, due to the deficient relevant prognostic data, the prognostic value of circRNAs in colon cancer patients cannot be evaluated.Third, our sample size is relatively small.Our data provided new notions into circRNA-related ceRNA networks in colon cancer and identified potential therapeutic targets.
In summary, our study has identified a ceRNA network that sheds light on the potential mechanisms underlying colon cancer.This research offers a deeper understanding of the intricate ceRNA regulatory network in this disease.Survival analysis highlighted RUNX1 as a significant prognostic factor, indicating the hsa-circ-0084615/has-mir-330-3p/RUNX1 axis as a key player in the development and spread of colon cancer.In addition, RUNX1 emerges as a potential therapeutic target for halting the progression of the disease.
However, the detailed mechanisms of action require further investigation.

6 . 1 .
All statistical tests were two-sided, and a p-value of <.05 was considered statistically significant.Continuous variables that conformed to the normal distribution were compared with the use of independent t-test for comparison between groups, while continuous variables with skewed distribution were compared with the Mann-Whitney U-test.The relationship between hub genes and OS was analyzed through the Kaplan-Meier curve, which was evaluated by log-rank test.The univariate regression model was used to analyze the effects of individual variables on survival, and the multivariate cox regression model was used to confirm independent impact factors associated with survival.

Furthermore, we also
constructed the Reactome pathway and reaction analysis.RUNX1 took part in various pathways and reactions, including the Wnt signaling pathway, FOXP3, GITR, and so on (Figure S4E,F).

F I G U R E 2
The survival analysis of three hub genes (RUNX1, CTSV, and CBFB) in colon cancer via GEPIA website tool based on TCGA database.TCGA, The Cancer Genome Atlas.F I G U R E 3 (A,B) GO and KEGG analyses of RUNX1 co-expressed genes.(C) GSEA analysis of RUNX1 in colon cancer.GO, gene ontology; GSEA, gene set enrichment analysis; KEGG, Kyoto Encyclopedia of Genes and Genomes.

Figures
Figures 4A and S6 illustrate the expression profile and prognostic value of RUNX1 across 33 types of tumors.According to diagnostic Receiver Operating Characteristic (ROC) analysis, we observed that RUNX1 exhibited high accuracy in distinguishing between Colon adenocarcinoma (COAD) and adjacent noncancerous tissues (Figure4B).Then, we utilized the TCGA dataset to verify the correlation between RUNX1 expression and clinicopathological variables in COAD.Compared with normal tissues, RUNX1 showed significantly elevated expression in COAD, particularly in advanced stages (Figure4C-G).To assess the biological features of RUNX1 in COAD, we performed the enrichment analyses of KEGG and hallmark pathways.In terms of KEGG and hallmark pathways, overexpressed RUNX1 was positively associated with several tumor-related pathways, such as Janus

3. 6 |
Immune infiltration analysis of RUNX1We conducted further investigations into the correlation between RUNX1 expression profiles and immune cells.The COAD patients were categorized into high or low RUNX1 expression groups based on the median value of RUNX1 expression.We proceeded to evaluate the immune microenvironmental characteristics of RUNX1 based on immunescore and stromalscore in COAD tissues.The results showed a significant positive correlation between RUNX1 and immune, stromal, and estimated scores (Figure5A-C).It became apparent that the expression of immunosuppressive cells (such as macrophage, Myeloid-dericed Suppressor Cell (MDSC), and regulatory T cell) was higher in the high RUNX1 expression group compared with the low RUNX1 expression group (Figure5D-F).Subsequently, by analyzing the correlation between RUNX1 and immune cells, we observed a significant positive correlation between RUNX1 and immune suppressive cells, including macrophage, MDSC, and regulatory T cell F I G U R E 5 (A-C) The differential expression of tumor microenviroment scores between high and low RUNX1 groups (A, StromalScore; B, ImmuneScore; C, ESTIMATEScore); (D-F) the differential expression of immunosuppressive cells between high and low RUNX1 groups (D, MDSC; E, regulatory T cell; F, macrophage); (G-I) the correlation of RUNX1 expression profile with immunosuppressive cells in COAD.
T A B L E 1 Clinicopathological characteristics of patient samples in TCGA.
T A B L E 3 Basic characteristics of the four differently expressed circRNAs.