Long non‐coding RNA screening and identification of potential biomarkers for type 2 diabetes

Abstract Background To investigate new lncRNAs as molecular markers of T2D. Methods We used microarrays to identify differentially expressed lncRNAs and mRNAs in five patients with T2D and paired controls. Bioinformatics analysis, qRT‐PCR validation, ELISA, and receiver operating characteristic (ROC) curve analysis were then performed in 100 patients with T2D and 100 controls to evaluate the correlation between lncRNAs and T2D and to assess whether lncRNAs could be used in the diagnosis of T2D. Results We identified 68 and 74 differentially expressed lncRNAs and mRNAs, respectively. The top five upregulated lncRNAs are ENST00000381108.3, ENST00000515544.1, ENST00000539543.1, ENST00000508174.1, and ENST00000564527.1, and the top five downregulated lncRNAs are TCONS_00017539, ENST00000430816.1, ENST00000533203.1, ENST00000609522.1, and ENST00000417079.1. The top five upregulated mRNAs are Q59H50, CYP27A1, DNASE1L3, GRIP2, and lnc‐TMEM18‐12, and the top five downregulated mRNAs are GSTM4, PODN, GLYATL2, ZNF772, and CLTC. Examination of lncRNA‐mRNA interaction pairs indicated that the target gene of lncRNA XR_108954.2 is E2F2. Multiple linear regression analysis showed that XR_108954.2 (r = 0.387, p < 0.01) and E2F2 (r = 0.368, p < 0.01) expression levels were positively correlated with glucose metabolism indicators. Moreover, E2F2 was positively correlated with lipid metabolism indicators (r = 0.333, p < 0.05). The area under the ROC curve was 0.704 (95% CI: 0.578–0.830, p = 0.05) for lncRNA XR_108954.2 and 0.653 (95% CI: 0.516–0.790, p = 0.035) for E2F2. Conclusions This transcriptome analysis explored the aberrantly expressed lncRNAs and identified E2F2 and lncRNA XR_108954.2 as potential biomarkers for patients with T2D.


| INTRODUCTION
Patients with type 2 diabetes (T2D) have continuously elevated circulating glucose levels, which are the pathological basis of various diseases. 1,2 Compared with healthy individuals, patients with T2D are at a higher risk of heart disease, cerebrovascular disease, and low-position amputation. 3,4 A large proportion of public medical resources, greater than that needed for patients with hypertension, stroke, and coronary artery disease combined, is required to care for patients with T2D. 5,6 T2D is an insidious disease, and delayed diagnosis and treatment lead to failure to control blood glucose levels. Therefore, new biomarkers and diagnostic approaches are urgently required for clinical therapy.
In recent years, intensive studies pertaining to long non-coding RNAs (lncRNAs) have shown that they are widely involved in biological processes. 7 lncRNAs participate in the regulation of gene expression by binding to homologous DNA, RNA, and a variety of proteins. 8 lncRNAs have also been associated with many human diseases, including cancer, 9 cardiovascular disease, 10 diabetes, 11 and mental disorders. 12 Gao et al. found that compared with the control groups, the expression of lncRNA H19 was significantly reduced in patients with T2D as well as in insulin-resistant mice. 13 Our previous research has shown that lncRNA MEG3 is significantly downregulated in endothelial cells cultured in high glucose concentrations. Additionally, MEG3 knockdown promotes endothelial cell proliferation and reduces apoptosis at high glucose concentrations. 14 Moreover, there is increasing evidence suggesting that lncRNAs may function as novel diagnostic and therapeutic targets for many diseases. 15,16 Therefore, systematic identification of differentially expressed lncRNAs in T2D, elucidation of the underlying mechanism, and evaluation of their clinical significance are necessary in patients with T2D.
In the present study, we analyzed aberrantly expressed lncRNAs in patients with T2D and performed functional enrichment and metabolic pathway analysis to explore their pathogenesis. One of the lncRNA-mRNA pairs was chosen to validate the observed expression patterns, and the ROC curve was used to provide references for the diagnosis and treatment of T2D.

| Participants
In the screening stage, five patients with T2D and five healthy controls were recruited for the analysis of differentially expressed lncRNA/mRNA using microarray assay. Then, 100 patients and

| Sample preparation and RNA purification
Density gradient centrifugation was used to purify peripheral blood mononuclear cells (PBMCs) from the blood obtained from the participants. Total RNA was extracted using TRIzol (Invitrogen, Carlsbad, CA, USA) according to the manufacturer's instructions and quantified using a NanoDrop spectrophotometer (ND-2000, NanoDrop Products, Wilmington, DE, USA).

| Microarray assay
An Agilent Microarray (V4.0, CapitalBio; Beijing, China) was used to analyze the samples from the screening stage. A total of 41,000 lncRNAs and 34,000 mRNAs were evaluated on each slide (4 × 180 K format). The samples were labeled, hybridized, and washed following the manufacturer's standard protocols. This process included reverse transcription of total RNA into double-stranded cDNA, synthesis of cRNA, synthesis of cDNA by cRNA reverse transcription and fragmentation, fluorescent labeling, hybridization to the chip, and washing. 10 The Agilent chip scanner (G2565CA) was used to acquire the hybridization images.
Data were normalized and analyzed using GeneSpring GX soft-

| qRT-PCR validation
Differentially expressed lncRNAs and mRNAs were validated using qRT-PCR. Briefly, total RNA (1 µg) was extracted following the manufacturer's instructions. PCR was performed on an ABI 7500 System (Applied Biosystems, Carlsbad, CA, USA) using SYBR (TaKaRa Bio, Dalian, China). The 2^(−ΔΔCT) method was used to calculate the fold change, and β-actin was used for normalization. All experiments were performed in triplicate.
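As an illustration of the 2^(−ΔΔCT) calculation described above, the following minimal sketch computes a fold change from replicate CT values. The function name and the CT values are hypothetical and are not taken from the study; averaging replicates before subtraction is one common convention and is assumed here.

```python
from statistics import mean


def fold_change_ddct(target_ct_case, ref_ct_case, target_ct_ctrl, ref_ct_ctrl):
    """Relative expression by the 2^(-ddCT) method.

    Each argument is a list of replicate CT values. The reference gene
    would be beta-actin in the setting described above.
    """
    # dCT = CT(target) - CT(reference), computed from replicate means per group
    dct_case = mean(target_ct_case) - mean(ref_ct_case)
    dct_ctrl = mean(target_ct_ctrl) - mean(ref_ct_ctrl)
    # ddCT = dCT(case) - dCT(control)
    ddct = dct_case - dct_ctrl
    # One CT cycle corresponds to a two-fold difference in template amount
    return 2 ** (-ddct)


# Hypothetical triplicate CT values, for illustration only
fc = fold_change_ddct(
    target_ct_case=[24.0, 24.0, 24.0],
    ref_ct_case=[18.0, 18.0, 18.0],
    target_ct_ctrl=[25.0, 25.0, 25.0],
    ref_ct_ctrl=[18.0, 18.0, 18.0],
)
print(fc)  # 2.0: the target reaches threshold one cycle earlier in cases
```

A fold change above 1 indicates higher expression in the case group than in the control group, and a value below 1 indicates lower expression.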

| Detection of plasma E2F2 protein
Peripheral blood (2 ml) was collected in EDTA tubes from each participant, and plasma was separated by centrifugation. Plasma concentrations of E2F2 were measured using an ELISA kit (Shanghai Hengyuan Biological Technology Co., Ltd., China) following the manufacturer's protocol.

| Statistical analysis
All statistical analyses were performed using SPSS (v22.0, Chicago, IL, USA) and GraphPad Prism (v5.0, GraphPad Software Inc., San Diego, CA, USA). χ² and independent t tests were used to compare the population characteristics of patients with T2D and controls. The Mann-Whitney U test was used for non-normally distributed data. Significant GO terms and KEGG pathways were screened using Fisher's exact test. Multiple linear regression was used to construct the lncRNA-mRNA coexpression network and to assess clinical significance. Statistical significance was set at p ≤ 0.05.
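To make the enrichment screening step concrete, the sketch below shows how a one-sided Fisher's exact test can be computed for a single GO term or KEGG pathway from gene counts. It assumes an over-representation (upper-tail) test, which the text does not specify; the function name and the counts are hypothetical.

```python
from math import comb


def enrichment_p_value(k, n, K, N):
    """One-sided Fisher's exact test (hypergeometric upper tail).

    N: genes in the background set
    K: background genes annotated to the term or pathway
    n: differentially expressed genes
    k: differentially expressed genes annotated to the term or pathway
    Returns P(X >= k) under random sampling without replacement.
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the probabilities of observing k or more annotated genes
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical counts: 3 of 4 differentially expressed genes fall in a
# term that covers 5 of 20 background genes
p = enrichment_p_value(k=3, n=4, K=5, N=20)
print(round(p, 4))
```

Terms with a p value at or below the chosen threshold would be reported as enriched; in practice a multiple-testing correction is often applied across terms, although the text does not state whether one was used here.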
The specificity and sensitivity of lncRNAs were determined using receiver operating characteristic (ROC) curves.
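The ROC quantities mentioned above can be sketched as follows. This is a minimal illustration, not the SPSS or GraphPad procedure used in the study: the area under the curve is computed as the probability that a randomly chosen patient has a higher marker value than a randomly chosen control (the Mann-Whitney formulation), and sensitivity and specificity are evaluated at a single cutoff. The function names and the values are hypothetical.

```python
def roc_auc(case_values, control_values):
    """AUC as P(case value > control value), counting ties as 0.5."""
    wins = 0.0
    for c in case_values:
        for h in control_values:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


def sensitivity_specificity(case_values, control_values, cutoff):
    """Classify values >= cutoff as positive and score against true status."""
    true_pos = sum(v >= cutoff for v in case_values)
    true_neg = sum(v < cutoff for v in control_values)
    return true_pos / len(case_values), true_neg / len(control_values)


# Hypothetical relative expression values, for illustration only
cases = [3.0, 4.0, 5.0, 6.0]
controls = [1.0, 2.0, 3.0, 4.0]

auc = roc_auc(cases, controls)
sens, spec = sensitivity_specificity(cases, controls, cutoff=3.5)
print(auc, sens, spec)  # 0.875 0.75 0.75
```

An AUC of 0.5 corresponds to no discrimination between groups and 1.0 to perfect separation, which is the scale on which the reported AUC values should be read.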

| Participant demographics and clinical characteristics
A total of 105 patients with T2D and paired controls were enrolled in our two-stage study. The χ² test and t test revealed no significant differences between the T2D and control groups in age or sex distribution, whereas FPG differed between the groups. In stage one, the average FPG was 10.85 mmol/l in patients with T2D and 4.87 mmol/l in the control group. In stage two, the average FPG was 9.49 mmol/l in patients with T2D and 4.96 mmol/l in the control subjects (Table 1).

| lncRNA and mRNA microarray expression profiling
In stage one, ten blood samples were used for microarray profiling.
Screening data can be obtained from the Gene Expression Omnibus  Table 2.

| Bioinformatics analysis
Genes perform their biological functions through coordination. This is especially true for complex diseases, such as T2D, which may be the result of a phenotypic difference caused by mutations in multiple genes. 17  Additionally, we analyzed the significant pathways associated with consensus mutations in T2D patients using KEGG. The top five pathways are represented in a histogram image in Figure 2B; the most highly enriched pathway was observed to be the synaptic vesicle cycle.

| Co-expression analysis and target prediction
Next, we constructed a lncRNA-mRNA co-expression network to in-   (Figure 5A, B). These data were consistent with those obtained from microarray analysis. The ELISA showed that plasma E2F2 levels were higher in patients with T2D than in healthy controls (86.67 ± 5.83 vs. 57.19 ± 4.89 ng/l, p < 0.01) (Figure 5C).

| Correlation between lncRNA XR_108954.2 and E2F2 and clinical biochemical indicators
Multiple linear regression analysis was used to evaluate the cor- and lipid metabolism indicators (r = 0.333, p < 0.05) ( Table 3).

| Identification of novel T2D biomarkers
The diagnostic value of XR_108954.2 and E2F2 was evaluated using

| DISCUSSION
lncRNAs were once considered "junk DNA" because they do not encode proteins and were thought to have accumulated during the evolution of genes. 18 However, the rapid development of molecular biology and the application of next-generation sequencing technologies have shown that lncRNAs play important roles in diverse biological functions, including chromatin modification, transcriptional regulation, posttranscriptional regulation, cellular proliferation, differentiation, and apoptosis. 19-22 Our study identified 68 differentially expressed lncRNAs and 74 differentially expressed mRNAs in patients with T2D. 23 Bioinformatics analysis indicated that these lncRNAs may function in most biological processes associated with diabetes.
The role of lncRNAs in diabetes is attracting increasing attention, and considerable evidence indicates that lncRNAs play important roles in many of its pathophysiological mechanisms. 24-26 lncRNAs may also serve as biomarkers in the diagnosis, prognosis, and clinical management of the disease. 27 and TCONS_00004187 may serve as potential biomarkers for T2D. 23 In this study, we used a bioinformatics approach to predict  XR_108954.2 may be potential biomarkers for the diagnosis and treatment of T2D.

CONFLICT OF INTEREST
The authors have no conflicts of interest to declare.

INFORMED CONSENT
Signed informed consent was obtained from all participants prior to recruitment.

DATA AVAILABILITY STATEMENT
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.