Identification of a stool long non‐coding RNAs panel as a potential biomarker for early detection of colorectal cancer

Abstract Background The feces of colorectal cancer (CRC) patients contain tumor colonocytes, which are constantly shed into the lumen. Stool evaluation can therefore be considered a rapid and low‐risk way to directly determine the status of the colon and rectum. As alterations in long non‐coding RNAs (lncRNAs) are important in the regulation of cancer cell fate, we aimed to assess the level of a panel of cancer‐related lncRNAs in fecal colonocytes. Methods The study population consisted of 150 subjects, including a training set, a validation set, and a group of 30 individuals with colon polyps. The expression levels of lncRNAs were evaluated by quantitative real‐time PCR (qRT‐PCR). The NPInter and EnrichR tools were used to identify the interactions and functions of lncRNAs. Results A total of 10 significantly dysregulated lncRNAs, including CCAT1, CCAT2, H19, HOTAIR, HULC, MALAT1, PCAT1, MEG3, PTENP1, and TUSC7, were chosen for designing a predictive panel. The diagnostic performance of the panel in distinguishing CRCs from the healthy group was AUC: 0.8554 in the training set and 0.8465 in the validation set. The AUC for early CRCs (I‐II TNM stages) was 0.8554 in the training set and 0.8465 in the validation set, and for advanced CRCs (III‐IV TNM stages) it was 0.9281 in the training set and 0.9236 in the validation set. The corresponding AUCs for CRCs vs polyps were 0.9228 (I‐IV TNM stages), 0.9042 (I‐II TNM stages), and 0.9362 (III‐IV TNM stages). Conclusions These data support the use of fecal colonocyte lncRNA analysis in the early detection of CRC.


| INTRODUCTION
CRC has become one of the first priorities of the World Health Organization (WHO) for mass screening because of its high morbidity and mortality rates. It develops from a slowly progressing premalignant lesion (the adenomatous polyp), which can readily be removed once it is accurately diagnosed. 1 Based on the risk level of malignancy, screening approaches for CRC are divided into two main categories, one for the average-risk population and one for the high-risk population. Each of these categories is targeted by a different screening program. 2 According to WHO guidelines, both categories should be monitored regularly using standard screening methods such as colonoscopy. 1,2 However, given the various disadvantages of these techniques, current investigations aim to replace them with noninvasive, inexpensive screening methods that offer higher specificity and sensitivity. 3 There are ongoing efforts to simplify the identification of new biomarkers from body specimens such as stool, plasma, and urine. 4

Long non-coding RNAs (lncRNAs) are an important class of ncRNAs that have a major impact on cancer progression. These RNAs, 200 nt or more in length, are transcribed by RNA polymerase II from different regions of the genome, including intronic and intergenic sites. 5,6 Accordingly, it has been inferred that lncRNA transcription usually does not depend on the presence of open reading frames, and it has been estimated that the human genome contains more than 15 000 lncRNA-related genes that could produce over 23 000 functional lncRNAs. 7 This large number suggests that this class of ncRNAs may contribute to a wide variety of regulatory activities, such as transcriptional activation/repression, epigenetic regulation, nuclear remodeling, mRNA stability/degradation, and microRNA (miRNA) sponging. 5,7 Through these mechanisms, lncRNAs are involved in multiple cancer-related signaling cascades and promote tumor development or suppression. 4 Furthermore, lncRNAs might be used as biomarkers for the early detection of metastasis in CRC and are regarded as novel biomarkers and therapeutic targets for CRC patients. 8

The diagnostic value of lncRNAs in CRC has not been completely examined because of sampling issues, especially at early stages of the disease. Because most cancers are detected at advanced stages, identifying the cancer-related biomarkers that actually initiated the malignancy is challenging. Routine tests on tissue samples for the early detection of CRC have several drawbacks: they are invasive, may lack evaluation by an expert pathologist, and are costly and time-consuming. Other biological samples such as blood, urine, and stool, which are easier to collect and analyze, therefore need to be explored. Among these samples, stool takes priority because it passes through the colon and rectum and could carry cancerous colonic cells (cancer colonocytes). Fecal collection is also easy, inexpensive, noninvasive, and feasible at all ages. Previous investigations demonstrated the existence of miRNAs in stool samples. 9 However, to the best of our knowledge, no report has been published on the analysis of fecal lncRNA expression levels. Considering the value of stool samples in the characterization of colon disorders, in this study we aimed to track alterations in the expression pattern of 30 known cancer-related lncRNAs in human feces from healthy status to advanced carcinoma. The results of this investigation introduce human fecal colonocytes as a suitable source of lncRNAs for CRC analysis.

| Subjects
The study population consisted of 150 individuals, including 60 CRC patients, 60 non-cancer individuals, and a group of 30 individuals with colon polyps, who were referred to the Taleghani Hospital, Tehran, Iran. They were divided into three cohorts: 1-Training group (

| Sample processing and RNA extraction
A total of 20 g of fecal sample was taken from each candidate over a month. Using a swab, the samples were collected either from the mucinous region of the stool, which is a rich source of colonocytes, 10 or from nonmucinous areas, to evaluate the status of the entire colon. The collections were immediately dissolved in RNALater buffer (2 mL/g) and

| Reverse transcription and PCR amplification
To ensure the absence of any possible contamination, sam-

| Quantitative real-time PCR (qRT-PCR)
The qRT-PCR was performed on the 7500 Real-Time PCR System (Applied Biosystems) using the QuantiTect SYBR Green PCR Kit (QIAGEN). The relative expression of each target was determined by normalizing to reference genes (18S rRNA, GAPDH, U6) using the $2^{-\Delta\Delta C_T}$ method. The primer sequences are listed in Table 2.
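The normalization described above can be illustrated with a minimal sketch. It assumes that the reference genes are combined by taking the arithmetic mean of their CT values, which the text does not specify; the function name `relative_expression` and all CT values are hypothetical and are not data from this study.

```python
from statistics import mean


def relative_expression(ct_target_case, ct_refs_case, ct_target_ctrl, ct_refs_ctrl):
    """Return the 2^(-ddCT) fold change of a target in a case vs a control sample."""
    # dCT: target CT minus the mean CT of the reference genes.
    # Assumption: references are combined by a simple arithmetic mean of CT values.
    d_ct_case = ct_target_case - mean(ct_refs_case)
    d_ct_ctrl = ct_target_ctrl - mean(ct_refs_ctrl)
    # ddCT: difference between the case and the control dCT.
    dd_ct = d_ct_case - d_ct_ctrl
    return 2.0 ** (-dd_ct)


# Hypothetical CT values (illustration only, not measurements from the study).
fold = relative_expression(
    ct_target_case=26.0, ct_refs_case=[14.0, 18.0, 22.0],
    ct_target_ctrl=28.0, ct_refs_ctrl=[14.0, 18.0, 22.0],
)
print(fold)
```

In this toy example the target crosses the threshold two cycles earlier in the case sample while the references are unchanged, so the computed fold change corresponds to a four-fold up-regulation.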

| Function enrichment analysis
The functional interactions between candidate lncRNAs and biomolecules (proteins, RNAs, and DNAs) were identified by NPInter

| Statistical analysis
To assess the differences in the lncRNAs expression level, we used the Mann-Whitney U test. The diagnostic lncRNA markers were se-

| Quality assessment of isolated RNA from the stool
To confirm that the samples were not contaminated with RNA from other organisms, the expression level of 18S rRNA, as the internal control, was measured along with bacterial 16S rRNA and chloroplast RuBisCO by PCR, and the amplification products were examined by gel electrophoresis.

| Identification of differentially expressed lncRNAs (DElncRNAs) in training set
TABLE 2 The primer sequences of examined genes

Tumor lncRNAs have been shown to promote or suppress CRC progression, but their functions in other cells of the tumor environment have not been properly elucidated. 4 We chose 30 known cancer-related lncRNAs and evaluated their levels in 60 fecal samples obtained from individuals with normal and cancerous colons. The normal group was chosen as the control. Using a relative expression of <0.5 or >2 as the threshold, we found 10 differentially expressed lncRNAs (DElncRNAs; Table 3; Figure 1). The combination of these 10 DElncRNAs was selected as the predictive panel for further analysis.
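The fold-change criterion above (relative expression <0.5 or >2) can be sketched as a simple filter. The helper names and the placeholder lncRNA labels and values below are hypothetical and serve only to illustrate the thresholding; the statistical testing used in the study is not reproduced here.

```python
def classify_fold_change(fold, low=0.5, high=2.0):
    """Label a relative expression value using the <0.5 / >2 thresholds."""
    if fold > high:
        return "up"
    if fold < low:
        return "down"
    return "unchanged"


def select_delncrnas(fold_changes, low=0.5, high=2.0):
    """Keep only the lncRNAs whose fold change passes either threshold."""
    selected = {}
    for name, fold in fold_changes.items():
        label = classify_fold_change(fold, low, high)
        if label != "unchanged":
            selected[name] = label
    return selected


# Placeholder names and values (illustration only, not results from the study).
example = {"lncRNA_A": 3.1, "lncRNA_B": 0.3, "lncRNA_C": 1.4, "lncRNA_D": 2.0}
panel = select_delncrnas(example)
print(panel)
```

Note that the comparisons are strict, so a value lying exactly on a threshold (such as 2.0) is treated as unchanged in this sketch.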

| Analyzing the predictive panel between CRC and polyp cases
The diagnostic power of the ten-DElncRNA panel was further esti-

These results show that our panel has higher sensitivity and specificity for distinguishing CRC from polyps than for distinguishing CRC from normal samples.
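As an illustration of how the diagnostic measures discussed here are obtained, the following minimal sketch computes the AUC, sensitivity, and specificity from one score per sample. How the ten lncRNAs are combined into a single panel score is not reproduced here; the scores, the cutoff, and the function names are hypothetical. The AUC is computed as the probability that a randomly chosen case scores higher than a randomly chosen control, which equals the Mann-Whitney U statistic divided by the number of case-control pairs.

```python
def auc_from_scores(case_scores, control_scores):
    """AUC as the fraction of case-control pairs ranked correctly (ties count 0.5)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    # 'wins' equals the Mann-Whitney U statistic of the case group.
    return wins / (len(case_scores) * len(control_scores))


def sensitivity_specificity(case_scores, control_scores, cutoff):
    """Sensitivity and specificity when scores >= cutoff are called positive."""
    true_pos = sum(score >= cutoff for score in case_scores)
    true_neg = sum(score < cutoff for score in control_scores)
    return true_pos / len(case_scores), true_neg / len(control_scores)


# Hypothetical panel scores (illustration only, not data from the study).
cases = [0.9, 0.8, 0.6, 0.4]
controls = [0.7, 0.3, 0.2, 0.1]
auc = auc_from_scores(cases, controls)
sens, spec = sensitivity_specificity(cases, controls, cutoff=0.5)
print(auc, sens, spec)
```

The pairwise loop is quadratic in the number of samples, which is acceptable for cohorts of this size; a rank-based computation would be preferable for much larger data sets.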

| Functional annotations of validated lncRNAs
We

| DISCUSSION
Identifying the lncRNAs that drive the development of cancer requires examining samples from the early phases of malignancy (such as colon polyps) and comparing them with samples from healthy individuals and patients. Researchers have tended to examine biological samples that are low-cost, noninvasive, and accessible to all individuals, such as blood plasma. 9 The problem with blood plasma is that blood circulates through all tissues of the body, which secrete various cellular products into it. This makes it difficult to detect actual cancer biomarkers.
The stool passes only through the intestines and rectum and is much less contaminated than blood plasma; it is therefore suitable for examining the status of CRC markers in different groups, including patients, individuals with suspected malignancy, and healthy people.
There has been no previous study on alterations in lncRNA expression between CRC patients and healthy individuals, so we have

TABLE 4 Expression analysis and diagnostic performance of DElncRNAs in training cohort
The Mann-Whitney U test was used to assess the differences in lncRNA levels between groups. The relative expression was considered significant when <0.5 or >2.

The lncRNA H19 gene is located on human chromosome 11p15.5 and is involved in the carcinogenesis, progression, and metastasis of CRC. 15 Up-regulation of oncolncRNA H19 correlates with tumor differentiation, the TNM stage, and poor prognosis of colon cancer. 16 OncolncRNA HOTAIR overexpression is associated with tumor in-

and hsa-miR-29a-3p was previously investigated in CRC patients and both were miRNAs associated with tumor location. 27,28 As there are no other investigations on the underlying regulatory networks of these miRNAs in CRC initiation and progression, our findings could be considered a step forward in better understanding cancer regulatory structures. A genome-wide mapping study of the cancer-associated lncRNAs MALAT1 and NEAT1 in MCF-7 breast cancer cells identified these genes as possible targets of MALAT1. 29 It has also been shown that knocking down MALAT1 in CaSki cervical cancer cells increased proliferation and invasion rates through BAX up-regulation. 30 MALAT1 could interact with and up-regulate the pre-mRNA splicing factors SRSF1 and PRPF6, and PRPF6 acts as a splicing regulator of MALAT1. 31 Similar interactions have been reported between ZFP36 and MALAT1: the MALAT1 sequence has a regulatory binding site for ZFP36, 32 and MALAT1 up-regulates ZFP36. 29 The oncolncRNA HOTAIR gene is located within the HOXD gene clusters and has a negative effect on other HOXD clusters located on other chromosomes. 33 Evidence suggests that HOTAIR may target the HOXD cluster genes at the RNA and protein levels. 34 It could also repress the HOXD genes in an alternative manner by targeting the polycomb repressive complex 2 (PRC2) family member SUZ12, inducing gene silencing through H3K27 methylation and H3K4 demethylation. 35

| CONCLUSION
Our study examined, for the first time, the feasibility of evaluating lncRNAs in human stool and introduced a panel of cancer-related lncRNAs that could distinguish CRC patients from healthy individuals or those with polyps. In addition, our results showed that measuring cancer-related lncRNAs as a panel provided higher sensitivity and specificity than measuring any of these lncRNAs alone.
Finally, a more comprehensive range of lncRNAs should be measured to further elucidate their regulatory network.

ACKNOWLEDGMENTS
The authors gratefully acknowledge the Research Institute for Gastroenterology and Liver Diseases of the Shahid Beheshti University of Medical Sciences (RIGLD), for its support of this study (Grant number: 949).

CONFLICT OF INTEREST
The authors have declared that no competing interests exist.