MiR‐378a‐3p as a putative biomarker for hepatocellular carcinoma diagnosis and prognosis: Computational screening with experimental validation

Abstract
Background: Hepatocellular carcinoma (HCC) is a malignant disease with high morbidity and mortality, and the molecular mechanisms underlying its genesis and progression are complex and heterogeneous. Biomarker discovery is crucial for the personalized and precision treatment of HCC. The accumulation of reported microRNA biomarkers makes it possible to combine computational identification with experimental validation to accelerate the discovery of novel biomarkers.
Results: In the present work, we applied a rational computer‐aided biomarker discovery model to screen for diagnostic biomarkers of HCC. Two HCC‐associated networks were constructed based on the microRNA and mRNA expression profiles, and the potential microRNA biomarkers were identified based on their unique regulatory and influential power in the network. These putative biomarkers were then experimentally validated. One prominent example among these identified biomarkers is MiR‐378a‐3p: It was shown to independently regulate several important transcription factors such as PLAGL2 and β‐catenin, affecting β‐catenin signaling. This mechanism may indicate a potential tumor suppressor role of MiR‐378a‐3p and an impact of its abnormal expression on the cell growth and invasion of HCC.
Conclusions: A bioinformatics model with network topological and functional characterization was successfully applied to the identification of HCC biomarkers. The predicted microRNA biomarkers were then validated with experiments using human HCC cell lines, animal models, and clinical specimens. The results confirmed the prediction by our proposed model that miR‐378a‐3p is a putative biomarker for the diagnosis and prognosis of HCC.

MiR-378a-3p was identified and validated as a biomarker for HCC diagnosis and prognosis. It was shown to independently regulate several important transcription factors, including PLAGL2 and β-catenin, and thereby to affect β-catenin signaling. This may be the mechanism underlying its function as a tumor suppressor, and its abnormal expression could affect the cell growth and invasion of HCC.

BACKGROUND
MicroRNAs are small noncoding RNAs, 22-24 nucleotides in length, that act as important regulators in many biological systems. They regulate gene expression at the posttranscriptional level and can be detected specifically and stably in blood or tissues. Thus, microRNAs hold potential as biomarkers for disease diagnosis, prognosis, and therapy. 1 To date, most microRNA biomarkers have been studied through biological and clinical experiments that focus on the functional role of an individual microRNA, or on its expression profile, to identify potential novel biomarkers. However, few studies have applied theoretical network structures and computation to investigate a large number of biomarkers. 2,3 The advent of the "Big Data Era" provides both opportunities and challenges for microRNA biomarker discovery from massive and diverse data. Bioinformatics and computer-aided biomarker discovery have received increasing attention owing to their feasibility, guidance, and effectiveness.
In the last decade, many bioinformatics algorithms or models have been developed for identifying microRNA biomarkers, which can be divided into three categories: mathematical models, machine learning, and network analysis. A mathematical model is a description of a model using various scoring functions and statistical methods to identify microRNA biomarkers. Based on the hypothesis that a microRNA is involved in a cancer if its target genes are functionally similar to those genes associated with the studied cancer, Li et al. defined a functional consistency score to measure the correlation between microRNA target genes and cancer-associated genes for the identification of cancer-related microRNAs. 4 Moreover, Madden et al. developed an online tool called "CombiROC" to assist finding optimal combination of biomarkers through receiver operating characteristic (ROC) curve analysis. 5 Recently, machine learning has gradually become more basic and widely used in various fields of research. Zhao et al. used gene expression profiling data and prior knowledge on signaling pathways to identify the dysfunctional pathways in disease conditions; they also performed reverse inference for the identification of cancer-related microRNAs by microRNA-mRNA regulatory network. 6 Mukhopadhyay et al. proposed a packaged genetic algorithm for multiobjective optimization and identified microRNAs as potential biomarkers for cancer by support vector machine (SVM) classifiers. 7 Moreover, combinatorial biomarkers tend to be more efficient and accurate predictors than single biomarkers. Yang et al. developed a method based on the cluster analysis for identifying microRNA biomarkers through the following steps: first differentially expressed microRNAs were detected between studied samples and the control group by statistical t-test. 
The remaining microRNAs were clustered, and a representative combination of microRNA biomarkers was selected as the cancer biomarker using Fisher linear discrimination. 8 Complex network theory provides a way for researchers to study complex diseases at the systemic level. The network topology can describe the degree of action and contribution of biomolecules to complex systems, such as disease evolution. Based on this theory, Xu et al. built a microRNA target-dysregulated network and defined four topological features for microRNAs, namely Dout, NmicroRNA, Rpc-microRNA, and Rtarpc-microRNA, to measure the regulatory ability of microRNAs in cancers. This model was applied to find prostate cancer-associated microRNAs. 9 Subsequently, Chen et al. constructed a microRNA-microRNA functional similarity network and applied a random walk strategy to predict disease-associated microRNAs. Instead of using the traditional methods, they adopted global network similarity measures to optimize candidate microRNAs. 10 Nevertheless, such network-based methods still face many challenges. The inhibitory effect of microRNAs on the translation of their target mRNAs is mainly based on base-pairing interactions between the microRNAs and the 3′-untranslated region (3′-UTR) of the target mRNAs. A microRNA can have many target mRNAs, and an mRNA can be regulated by multiple microRNAs, leading to a many-to-many microRNA-mRNA regulatory mode. Extensive research efforts have focused on synergistic microRNA regulation and the "multiple-to-multiple" model between microRNAs and their targets. Meanwhile, the independent regulatory power of individual microRNAs, that is, the "one-to-multiple" mode of the microRNA-mRNA relationship, is less explored. Our study suggested that potential microRNA biomarkers with independent regulatory abilities (outlier microRNAs) deserve further investigation. 11 Based on this theory, the Pipeline of Outlier MicroRNA Analysis (POMA) framework has been developed and applied to the identification of microRNA biomarkers in prostate cancer, 11 renal clear cell carcinoma, 12 and acute myeloid leukemia. 13 However, the POMA framework and other network-based methods do not examine in depth the functions of target genes, especially their regulatory power. In this study, we applied our "single-line mRNA (gene) regulation model" to biomarker microRNA discovery in hepatocellular carcinoma (HCC). The resulting putative biomarker microRNAs were then validated through in-depth examination of their molecular mechanisms and functions.

Data collection for HCC microRNA/mRNA expressions and interactions
To construct a more comprehensive and reliable human microRNA-mRNA network, we collected expression profiles from the Gene Expression Omnibus (GEO) database, microRNA-mRNA target relationships from multiple microRNA databases, as well as computational prediction of microRNA targets.
HCC microRNA expression data were obtained from the following three datasets in GEO: GSE63046, GSE21279, and GSE36915. GSE63046 contains 24 HCC samples and 24 normal adjacent tissue samples, the latter of which consisted of 15 samples with cirrhosis and nine without cirrhosis. 14 GSE21279 included 15 different types of liver tissue samples, among which four HCC samples and three normal samples were selected for further analyses. 15 GSE36915 consists of 68 HCC and 21 nontumor liver tissues. 16 In this work, we selected nine HCC samples and nine adjacent noncirrhosis tissue samples from GSE63046 dataset and downloaded the preprocessed microRNA expression profiling data for detailed analysis. The other two datasets (GSE21279 and GSE36915) were utilized for the validation study.
In our present study, data extracted from three GEO datasets, namely, GSE14520, GSE25097, and GSE36376, were used for further analyses of the expression correlations between microRNAs and target genes: 225 HCC samples and 220 nontumor liver samples were chosen from GSE14520, which used the Affymetrix HT Human Genome U133A Array platform 17 ; 268 HCC tumor and 243 adjacent nontumor samples were selected from GSE25097 18 ; the entire collection of 240 HCC samples and 193 adjacent nontumor samples in GSE36376 was included. 19 For microRNA-mRNA target relationship data, we integrated experimentally validated microRNA-gene interactions from various databases, including miRTarBase, 20 TarBase, 21 miRecords, 22 and miR2Disease. 23 In addition to curated database of microRNA-target interactions, we downloaded computational microRNA-target prediction data from HOCTAR, 24 ExprTarget, 25 and starBase. 26,27 HOCTAR is a resource that integrates expression profiling data and results predicted by sequence-based target prediction tools such as PicTar, 28 TargetScan, 29 and miRanda. 29 Moreover, it also uses Gene Ontology to analyze each microRNA-mediated transcriptional regulation network, thus predicting biological functions of microRNAs. ExprTarget includes microRNA target prediction methods contained in HOCTAR, and also integrates microRNA and mRNA expression datasets of the HapMap cell line. 25 The starBase provides the information on microRNA-ceRNA, microRNA-ncRNA and protein-RNA interaction networks from large-scale CLIP-Seq data.
For gene-transcription factor (TF) data, we collected data from the article of Vaquerizas et al., which included a total of 1843 human TF genes. 30

Construction of human microRNA-mRNA reference network
For experimentally validated microRNA-mRNA interactions, we mainly selected the microRNA-mRNA interactions verified by low-throughput experiments (e.g., quantitative polymerase chain reaction [qPCR]). For computational microRNA-target prediction data, we selected the top 50% with highest scores in HOCTAR, the prediction score over 1 in ExprTarget, and the detection credibility readNUM ≥ 10 and BC (Biological complex) ≥ 2 in starBase. Finally, the results of at least two out of the three databases (HOCTAR, ExprTarget, and starBase) must match to positively identify the predicted interaction between microRNA and mRNA. Additionally, we unified all the microRNA names according to the rule in the latest release of the miRBase database (v21). 31,32
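The "at least two out of three databases" rule can be sketched as a simple vote over predicted pairs. The following is a minimal illustration in Python, not the authors' implementation: the function name `consensus_pairs`, the toy microRNA and gene names, and the assumption that each input set has already passed its database-specific threshold are all hypothetical.

```python
from collections import Counter

def consensus_pairs(prediction_sets, min_support=2):
    """Keep (microRNA, gene) pairs reported by at least `min_support` sources."""
    support = Counter()
    for pairs in prediction_sets:
        support.update(set(pairs))  # each source votes at most once per pair
    return {pair for pair, n in support.items() if n >= min_support}

# Toy, made-up predictions, assumed to be already filtered by each
# database's own cutoff (top 50% score, score > 1, readNUM/BC thresholds).
hoctar = {("miR-A", "GENE1"), ("miR-A", "GENE2"), ("miR-B", "GENE3")}
exprtarget = {("miR-A", "GENE1"), ("miR-B", "GENE3"), ("miR-B", "GENE4")}
starbase = {("miR-A", "GENE2"), ("miR-B", "GENE3")}

predicted = consensus_pairs([hoctar, exprtarget, starbase])

# Experimentally validated pairs (also made up here) are added directly.
validated = {("miR-C", "GENE5")}
reference_network = predicted | validated
print(sorted(reference_network))
```

In this toy example the pair supported by only one source (miR-B, GENE4) is dropped, while validated interactions bypass the vote.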

Screening of differentially expressed microRNAs and mRNAs
For the mRNA expression profile data, missing data were imputed by the k-nearest neighbors imputation approach in R, and the differential expression analysis was applied here, using the method described by Tang et al. 33 For the microRNA expression profiling data, the Student's t test was utilized to calculate the differentially expressed microRNAs between disease groups and control groups, and the p-value threshold for statistical significance was less than 0.05.

Construction of condition-specific microRNA-mRNA network for HCC
First, the differentially expressed microRNAs and mRNAs were mapped to the human microRNA-mRNA reference network to obtain two HCC-specific microRNA-mRNA networks, called HCC-Net1 and HCC-Net2, respectively. Two indices, namely, the novel out degree (NOD) and the transcription factor percentage (TFP) of microRNAs in the two condition-specific microRNA-mRNA interaction networks, were calculated to quantify the power of microRNA regulation and identify candidate microRNA biomarkers for HCC. 11,12,34,35 NOD is a novel index of network vulnerability that represents the number of genes in the network that are exclusively targeted by a given microRNA. NOD embodies the independent regulatory power of individual microRNAs; microRNAs with high NOD are more likely to be vulnerable components of the network and to reflect changes in disease status more sensitively and accurately. TFP indicates the proportion of TF genes among the genes regulated by a microRNA, and complements NOD by extending the microRNA biomarker identification model to the level of biological function. Because TFs are crucial in many biological processes, abnormal expression of TF genes may play important roles in promoting cancer development. When a microRNA with a higher TFP value is abnormally regulated, it may affect the expression of more TF genes and, in turn, downstream genes, eventually changing the state of the network. We calculated NOD and TFP values for each microRNA in HCC-Net1 and HCC-Net2, respectively. MicroRNAs with significantly higher NOD and TFP values (Wilcoxon signed-rank test, p < 0.05) were selected as candidate biomarkers for HCC.
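The two indices can be illustrated on a toy edge list. This is a minimal sketch under stated assumptions, not the authors' code: the function name `nod_and_tfp` and the toy network are hypothetical, and TFP is assumed here to be computed over all targets of a microRNA (the text does not say whether it is restricted to exclusively targeted genes).

```python
from collections import defaultdict

def nod_and_tfp(edges, tf_genes):
    """Return {microRNA: (NOD, TFP)} for a list of (microRNA, gene) edges."""
    targets = defaultdict(set)     # microRNA -> its target genes
    regulators = defaultdict(set)  # gene -> microRNAs that target it
    for mirna, gene in edges:
        targets[mirna].add(gene)
        regulators[gene].add(mirna)

    scores = {}
    for mirna, genes in targets.items():
        # NOD: genes targeted by this microRNA and by no other microRNA.
        nod = sum(1 for g in genes if len(regulators[g]) == 1)
        # TFP (assumption): share of TF genes among all of its targets.
        tfp = sum(1 for g in genes if g in tf_genes) / len(genes)
        scores[mirna] = (nod, tfp)
    return scores

# Toy network with made-up names.
edges = [
    ("miR-A", "G1"), ("miR-A", "G2"), ("miR-A", "TF1"),
    ("miR-B", "G2"), ("miR-B", "TF2"),
]
scores = nod_and_tfp(edges, tf_genes={"TF1", "TF2"})
for mirna, (nod, tfp) in sorted(scores.items()):
    print(f"{mirna}: NOD={nod}, TFP={tfp:.2f}")
```

Here G2 is shared by both microRNAs, so it contributes to neither NOD value; the significance filtering step (Wilcoxon signed-rank test) is not reproduced in this sketch.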

Biological function analysis
The biological function analyses (gene ontology or signal pathway enrichment analysis) of candidate microRNAs were performed using DAVID (the Database for Annotation, Visualization and Integrated Discovery) and IPA (Ingenuity pathway analysis) tools. The top 10 most significantly enriched pathways were selected for further literature mining and validation.

Classification performance evaluation
To verify the diagnostic effect of the selected microRNAs, we used fivefold cross-validation with an SVM model on the training dataset to evaluate their diagnostic capabilities. Sensitivity, specificity, and accuracy were used to determine the performance of our method and were calculated as follows:

$$\text{Sensitivity} = \frac{TP}{TP + FN}, \quad \text{Specificity} = \frac{TN}{TN + FP}, \quad \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

where TP, TN, FP, and FN represent true positive, true negative, false positive, and false negative, respectively. In addition, ROC curve analysis and the area under the curve (AUC) were applied to measure the classification performance of the candidate microRNA biomarkers identified by our method in both the training expression profile dataset and the validation datasets. The SVM model was implemented with the R package "e1071," and the ROC curve analysis was conducted with the R package "ROCR."
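The evaluation metrics can be sketched with the standard confusion-matrix definitions. The analysis itself was run in R with "e1071" and "ROCR"; the Python below is only an illustrative stand-in that uses the standard library, with hypothetical function names and toy labels. AUC is computed through its rank interpretation (the probability that a randomly chosen tumor sample scores higher than a randomly chosen nontumor sample), which is equivalent to the area under the ROC curve.

```python
def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy from binary labels (1 = HCC)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return sensitivity, specificity, accuracy

def auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    # Ties between a positive and a negative score count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy labels and classifier outputs (made up for illustration).
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6]

sens, spec, acc = confusion_metrics(y_true, y_pred)
toy_auc = auc(y_true, y_score)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} "
      f"accuracy={acc:.3f} AUC={toy_auc:.3f}")
```

The pairwise AUC is quadratic in the number of samples, which is acceptable for datasets of the size used here but would be replaced by a rank-based formula for large cohorts.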

Cell culture and luciferase-labeled cells
The HepG2, Huh7, SMMC-7721, Li-7, PLC/PRF5, and SK-Hep-1 human HCC cell lines; the HL-7702 (L-02) normal human liver cell line; and the 293T human embryonic kidney cell line were purchased from the Cell Bank of Type Culture Collection of the Chinese Academy of Sciences (Shanghai, China). The MHCC97L and MHCC97H human HCC cell lines were kindly provided by Dr. Yang Xu (Liver Cancer Institute and Zhongshan Hospital, Fudan University, Shanghai, China). The abovementioned cells were cultured in Roswell Park Memorial Institute (RPMI)-1640 medium (HyClone, Logan, UT, USA) containing 10% fetal bovine serum (FBS) (Gibco, Gaithersburg, MD, USA) and 100 U/mL penicillin-streptomycin (Beyotime Biotech, Beijing, China) in an incubator with 5% CO₂ at 95% humidity and 37 °C. For luciferase (Luc) labeling of cells, MHCC97H cells (0.5 × 10⁵ cells/well) were seeded into a 24-well plate. After 24 h of culture, the cells were infected with lentivirus-Luc-neomycin (Neo) (abm, Richmond, BC, Canada) at a multiplicity of infection of 20 in the presence of enhanced infection solution (GeneChem, Shanghai, China) and 10 μg/mL polybrene (GeneChem). At 72 h after infection, the infected MHCC97H cells were selected using 1000 μg/mL Neo (Beyotime Biotech) to obtain Luc-labeled MHCC97H cells.

Scratch assay
The MHCC97H or SMMC-7721 cells (5 × 10⁵ cells/well) were seeded into six-well plates, incubated overnight, and then treated with agomiR-378a-3p versus agomiRcontrol or antagomiR-378a-3p versus antagomiRcontrol (200 nM). After 48 h of treatment, scratches were generated across the entire diameter of the wells. The progression of tumor cell migration was observed and photographed in a low-power field (×100) under a microscope at 0 and 24 h after scratching. The ImageJ software (National Institutes of Health, Bethesda, MD, USA) was then used to calculate the migration distance and thus analyze the migration ability of these cells.

Tumor xenograft experiments
where A is the long diameter and B is the short diameter. Four weeks after tumor cell inoculation, the mice were killed and the tumors were removed from the body and weighed. For a s.c. tumor treatment experiment, the Luc-labeled MHCC97H cells (2 × 10⁶ cells/100 μL PBS per mouse; 12 mice) were s.c. injected into nude mice, which were then treated with agomiR-378a-3p or agomiRcontrol (six mice per group) via intratumoral injection (2 nmol per injection) once the tumors had grown to around 0.5 cm in diameter. The treatment was performed once every 3 days, four times in total, according to the company's protocols. For an orthotopic tumor treatment assay, the Luc-labeled MHCC97H cells (2 × 10⁶ cells/100 μL PBS) were s.c. injected into the nude mice. When the tumor grew to a diameter of 1 cm, the s.c. tumor was resected, soaked in PBS, cut into 1-2 mm³ pieces, and implanted orthotopically in the right lobes of the livers of the nude mice. One week after the orthotopic transplantation, the growth of orthotopic tumors was examined with a Caliper IVIS Lumina II system to detect the luciferin signal. Those mice that carried orthotopic tumors were then assigned to two groups (six mice per group). Each group was treated with 5 nmol of agomiR-378a-3p or agomiRcontrol per injection by tail vein injection. The treatment was also performed once every 3 days, four times in total. The growth of s.c. and orthotopic tumors was tracked as described above before (week 0) and every week (weeks 1-4) after treatment. In addition, the lung tissues and the s.c./orthotopic tumor tissues from these tumor-bearing mice were fixed in 10% neutral formalin and embedded in paraffin. The lung and tumor tissue sections (3-μm thick) were then prepared and used for hematoxylin and eosin (H&E) analysis of lung metastatic nodules and immunohistochemistry (IHC) analysis of β-catenin, respectively.
For a tumor lung metastasis assay, the Luc-labeled MHCC97H-agomiR-378a-3p versus MHCC97H-agomiRcontrol cells (2 × 10⁶ cells/200 μL PBS per mouse; six mice per group) were injected into nude mice through the tail vein. Four weeks later, the mice were killed to harvest their lung tissues for H&E analysis of lung metastatic nodules.

IHC analysis
The expression of PLAGL2 and β-catenin in human HCC tumor tissues was determined by IHC analysis using human HCC TMA sections. The sections were deparaffinized and rehydrated. Antigen retrieval was performed by microwaving the slides in 0.01 M citrate buffer (pH 6.0) for 10 min. Endogenous peroxidase activity was quenched by treatment with 3% H₂O₂ for 30 min followed by incubation with normal goat serum for 15 min. Subsequently, the sections were incubated with rabbit anti-PLAGL2 (1:500) and anti-β-catenin (1:100) primary antibody in a humidity chamber overnight at 4 °C. The sections were then incubated with HRP-conjugated goat anti-rabbit IgG (1:200) (cat. no. G1213; Servicebio, Wuhan, Hubei, China) secondary antibody for 1 h at room temperature, and the immunostaining signal was detected by 3,3′-diaminobenzidine. Finally, the slides were counterstained with H&E and coverslip mounted. The expression level of PLAGL2 and β-catenin was evaluated by a weighted IHC score, similarly to that described above for the ISH analysis. In addition, the HCC transplanted tumor tissue sections were subjected to IHC analysis of β-catenin.

Statistical analyses for experimental results
Statistical analyses, including Student's t test, analysis of variance, Mann-Whitney U test, Pearson's χ² test, and log-rank test, were performed with SPSS 13.0 (SPSS, Chicago, IL, USA). A p-value of less than 0.05 (*p < 0.05 and **p < 0.01) was considered statistically significant.

Construction of human microRNA-mRNA reference network
The flowchart for the computational screening of HCC microRNA biomarkers is shown in Figure 1. In this study, we integrated experimentally validated microRNA-gene interactions from four databases (miRTarBase, TarBase, miRecords, and miR2Disease), and retained only those predicted interactions with consistent results in at least two out of the three computational microRNA-target prediction methods (HOCTAR, ExprTarget, and starBase). In total, we obtained a human microRNA-mRNA reference network containing 618 microRNAs, 9526 target genes, and 48,868 microRNA-mRNA target relationships. In this directional binary network, the average degree of microRNAs was 79, and the average degree of target genes was 5.
FIGURE 1 The flowchart of the computational screening of the putative microRNA biomarkers for HCC diagnosis
Moreover, the degree of microRNAs follows a power law distribution, which means that only a small number of microRNAs regulate many target genes in the network, whereas the vast majority regulate few. The distributions of NOD and TFP values show that about 66.3% (410/618) of microRNAs have a NOD value greater than 0, meaning that these microRNAs are more likely to have greater independent regulatory power. Furthermore, previously reported microRNA biomarkers generally have significantly higher NOD and TFP values than the remaining microRNAs. 1,11,35

Screening of differentially expressed microRNAs and mRNAs
To identify differentially expressed mRNAs more accurately, six methods (t-test, COPA, OS, ORT, MOST, and LSOSS) were used to screen differentially expressed genes (DEGs) in three HCC datasets (GSE14520, GSE25097, and GSE36376). The percentage of overlapping DEGs obtained by each method across the datasets was calculated to select the best method. To account for the robustness of results across datasets, the top 40% of the DEGs identified by each method were selected for comparison in this study. On this basis, the 3341 DEGs screened by LSOSS were selected for detailed analyses. A t-test was performed to select differentially expressed microRNAs between nine HCC samples and nine normal adjacent tissue samples without cirrhosis (GSE63046), resulting in 149 differentially expressed microRNAs in total (p < 0.05). Among them, 89 exhibit higher expression and 60 show lower expression in HCC.
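The overlap-based comparison of screening methods can be sketched as follows. This is only an illustration under explicit assumptions: the text does not define the overlap percentage precisely, so it is taken here as the number of genes shared by the top 40% lists of all datasets divided by the size of the smallest list; the method labels, gene names, and function names are hypothetical.

```python
def top_fraction(ranked_genes, fraction=0.4):
    """Return the top `fraction` of a ranked gene list as a set."""
    k = max(1, int(len(ranked_genes) * fraction))
    return set(ranked_genes[:k])

def overlap_percentage(rankings_by_dataset, fraction=0.4):
    """Percent of top-ranked genes shared across all datasets (assumed definition)."""
    tops = [top_fraction(r, fraction) for r in rankings_by_dataset]
    shared = set.intersection(*tops)
    return 100.0 * len(shared) / min(len(t) for t in tops)

# Made-up rankings (most significant gene first) for two hypothetical
# methods, each applied to three datasets.
rankings = {
    "method_a": [
        ["g1", "g2", "g3", "g4", "g5"],
        ["g2", "g1", "g4", "g3", "g5"],
        ["g1", "g2", "g5", "g3", "g4"],
    ],
    "method_b": [
        ["g1", "g2", "g3", "g4", "g5"],
        ["g3", "g1", "g2", "g4", "g5"],
        ["g4", "g1", "g2", "g3", "g5"],
    ],
}
overlap = {m: overlap_percentage(r) for m, r in rankings.items()}
best_method = max(overlap, key=overlap.get)
print(overlap, best_method)
```

A method whose top-ranked genes agree across independent datasets is preferred, which is the robustness argument made in the paragraph above.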

Construction of condition-specific microRNA-mRNA network in HCC
To construct condition-specific microRNA-mRNA networks in HCC, differentially expressed microRNAs and mRNAs were mapped to the human microRNA-mRNA reference network, producing two HCC networks (HCC-Net1 and HCC-Net2). HCC-Net1 contains 85 microRNAs, 4852 target genes, and 9428 microRNA-mRNA target relationships. The average degree of microRNAs is 111, and the average degree of mRNAs is 2. Similarly, HCC-Net2 includes 520 microRNAs, 2697 genes, and 17,889 microRNA-mRNA target relationships. The average degrees of microRNAs and genes are 34 and 7, respectively. HCC-Net2 has fewer target genes and a lower average microRNA degree than HCC-Net1, possibly because the expression of target genes in HCC is not considered in the screening process of HCC-Net1, so that more target genes from the microRNA-mRNA reference network are included. Moreover, an overlap analysis showed that 76 out of the 85 microRNAs in HCC-Net1 were also present in HCC-Net2. This demonstrates consistency between the microRNAs identified in HCC-Net1 and HCC-Net2, supporting the rationality of the subsequent analyses.
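The reported average degrees follow directly from the node and edge counts, since in a microRNA-to-gene network each edge contributes one to the degree of exactly one microRNA and one gene. A short check in Python, using only the counts stated in the text (the function name `average_degrees` is illustrative):

```python
def average_degrees(n_mirnas, n_genes, n_edges):
    """Average out-degree of microRNAs and average in-degree of genes."""
    return n_edges / n_mirnas, n_edges / n_genes

# (microRNAs, target genes, microRNA-mRNA relationships) as reported.
networks = {
    "reference": (618, 9526, 48868),
    "HCC-Net1": (85, 4852, 9428),
    "HCC-Net2": (520, 2697, 17889),
}
for name, counts in networks.items():
    mir_deg, gene_deg = average_degrees(*counts)
    print(f"{name}: microRNA {mir_deg:.1f}, gene {gene_deg:.1f}")
```

Rounded to integers, these ratios reproduce the values given in the text (79 and 5 for the reference network, 111 and 2 for HCC-Net1, 34 and 7 for HCC-Net2).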

Screened HCC microRNA biomarkers
In HCC-Net1, compared to the controls, 33 microRNAs show significantly higher NOD values (p < 0.05), and

Biological function analysis of predicted HCC microRNA biomarker
To investigate the biological properties of six candidate microRNA biomarkers, we carried out an enrichment analysis for a total of 433 target genes using DAVID and IPA. As shown in Figure 2A, there are two GO terms at biological process level, 10 GO terms at cellular component level, and seven GO terms at molecular function level that are statistically enriched for target genes (p < 0.05). Among them, glucocorticoid receptor (GR) is an important signal integrator in liver metabolism and physiological stress. In mice, impairment of GR signaling causes steatosis and HCC. 38 The significantly enriched GO categories include cellular components from various types of lumens and organelles in both cytoplasm and nucleus. Molecular binding is the most significantly enriched molecular function, which includes chromatin binding, adenyl ribonucleotide binding, ribonucleotide binding, purine ribonucleotide binding, ATP binding, adenyl nucleotide binding, and purine nucleotide binding.
Moreover, Figure 2B lists the top 10 significantly enriched terms in signal transduction pathways, diseases and biological functions, and toxic functions identified using IPA (p < 0.05). Notably, the GR signaling pathway ranked first among all 115 significantly enriched signal transduction pathways. The other top 10 enriched signal transduction pathways include Ephrin receptor signaling, Paxillin signaling, Hepatocyte growth factor signaling, Integrin signaling, two types of pinocytosis signals (clathrin-mediated pinocytosis and macrophage), NGF signaling, estrogen receptor signaling, and the LPS-stimulated MAPK signaling pathway. Importantly, most of the signaling pathways we identified are closely related to HCC. For instance, the impairment of clathrin-mediated endocytosis via cytoskeletal change by epithelial to fibroblastoid conversion is associated with des-gamma-carboxy prothrombin production in HCC 39 ; Hepatocyte growth factor stimulates the formation and migration of HCC cells 40,41 ; Estrogen receptor signaling plays an important role in the induction of HCC 42 ; LPS-TLR4 signaling promotes cancer cell survival and proliferation by regulating the activity of the MAPK signaling pathway. 43 Moreover, most of the significantly enriched entries in disease and biofunction are associated with hallmarks of cancer, including cellular growth and proliferation, cell death and survival, cell cycle, and DNA replication, rearrangement, and repair (i.e., DNA replication and recombination). 44 As shown in Figure 2B, liver hyperplasia/hyperproliferation, liver failure, and HCC are among the top five significantly enriched entries in toxic function.
In summary, the functional enrichment analysis revealed crucial roles of our predicted microRNA targets in the development of HCC, suggesting these microRNAs might serve as good HCC biomarkers.

Classification evaluation of candidate microRNA biomarkers in HCC
To evaluate the diagnostic values of these six microRNAs, we performed ROC curve analysis and calculated the AUC value in GSE63046, GSE21279, and GSE36915. For the data in GSE63046, Figure 3 showed that miR-221-3p, miR-490-3p, miR-378a-3p, and miR-25-3p all had a discriminating power of AUC values larger than 0.7 between HCC patient samples and nontumor samples. For the data in GSE21279 and GSE36915, as shown in Figure 4A, the AUC values of miR-221-3p and miR-378a-3p in both datasets were greater than 0.6, whereas the AUC value of miR-490-3p was only greater than 0.6 in GSE21279. Interestingly, the AUC values of miR-101-3p and miR-381-3p in GSE63046 were both less than 0.6 and were greater than 0.6 in GSE21279. In addition, the AUC value of miR-25-3p was only greater than 0.6 in GSE36915. Therefore, we concluded that miR-221-3p and miR-378a-3p have a better robustness and classification consistency for universal diagnosis for HCC. The other four microRNAs showed the discriminating power on different datasets, implying their strong classification ability as a personalized medicine model for some specific samples due to the heterogeneity of cancer.

DISCUSSION
In this study, we integrated cancer gene expression data with a network model of microRNA-mRNA interactions and calculated NOD and TFP values to find the key microRNA biomarkers in HCC. Bioinformatics methods were used to predict the targets of microRNAs, and enrichment analyses were performed on these target genes. Subsequently, these results were verified by both in vivo and in vitro experiments. We found that miR-378a-3p acted as a tumor suppressor in HCC. Abnormal expression of miR-378a-3p affected tumor cell growth and invasion, which could help researchers develop an early diagnostic biomarker for HCC. However, further biological and clinical investigations are still needed to explore this topic. Among the six candidate microRNA biomarkers predicted by our model, miR-25-3p, miR-221-3p, and miR-101-3p have been reported to have prognostic value for HCC. MiR-25 (miR-25-3p) is highly expressed in HCC tissues and is associated with poor prognosis. 47 MiR-25-3p is also significantly elevated in HCC plasma and can be used in combination with seven other microRNAs to distinguish between HCC patients and noncancer controls. 48 MiR-221-3p plays an important role in tumor formation. It stimulates cellular growth and proliferation by targeting the cell cycle inhibitors cyclin-dependent kinase inhibitor 1C (p57) and 1B (p27). 49,50 It also inhibits cell death by regulating Bcl-2-modifying factor. 51 Moreover, miR-221-3p is overexpressed in HCC tissues and serum, 50,52,53 and a high expression level of miR-221-3p in serum is associated with tumor size, tumor stage, and poor prognosis. 54,55 MiR-101-3p is significantly downregulated in HCC samples and suppresses tumors, making it a potential biomarker in tumorigenesis. 56-59 High miR-101-3p expression can inhibit the expression of myeloid cell leukemia-1 (Mcl-1) and promote apoptosis. 
60 The ectopic miR-101 expression can imitate the inhibitory effect of nemo-like kinase on HCC, repress cancer cell growth and proliferation, 61 and inhibit the development of HCC by reducing the expression of EZH2. 62 Downregulation of miR-101-3p is associated with invasiveness and poor prognosis of HCC, 63 and low expression of plasma miR-101-3p can predict a worse disease-free survival. 64 In addition, although there is no direct evidence that miR-378a-3p, miR-490-3p, and miR-381-3p could act as potential biomarkers for HCC, they also play key roles in the occurrence and development of HCC. A genetic variant, rs1076064, in miR-378a-3p precursor RNA is positively correlated to hepatitis B virus HCC risk and prognosis. 65 The expression level of miR-378 in HCC blood and tissues of patients is significantly lower than that in the control group, 66,67 and reduced expression of miR-378 is associated with promoter hypermethylation. MiR-490-3p also shows significant downregulation in HCC tissue samples, 14 and it modulates HCC cell growth and epithelial-mesenchymal transition by targeting endoplasmic reticulum-Golgi intermediate compartment protein 3. 68 Similarly, miR-381-3p is also significantly underexpressed in HCC tissues and cell lines, whereas overexpression of miR-381-3p significantly suppresses HCC cell proliferation and invasion, and induces G0/G1 cell cycle arrest by directly targeting liver receptor homolog-1. 69 In summary, additional studies are required to explore the potentials of these microRNAs as biomarkers of HCC.

Figure legend (continued): ... variance (ANOVA), n = 6 replicates per condition. (G) Scratch assay after TCF4 or LEF1 knockdown. *p < 0.05 compared with SMMC-7721-antagomiR-378a-3p or control siRNA-transfected SMMC-7721-antagomiR-378a-3p, one-way repeated measures ANOVA, n = 6 replicates per condition. (H) Transwell invasion assay after TCF4 or LEF1 knockdown. TCF4 siRNA: **p < 0.01; LEF1 siRNA: *p < 0.05 compared with SMMC-7721-antagomiR-378a-3p or control siRNA-transfected SMMC-7721-antagomiR-378a-3p, one-way repeated measures ANOVA, n = 6 replicates per condition. (I) Western blot analysis of PLAGL2 siRNA-mediated PLAGL2 knockdown as well as β-catenin after PLAGL2 knockdown. Representative Western blot images are shown. (J) Luciferase reporter analysis of transcriptional activity of β-catenin after PLAGL2 knockdown. **p < 0.01 compared with SMMC-7721-antagomiR-378a-3p or control siRNA-transfected SMMC-7721-antagomiR-378a-3p, one-way repeated measures ANOVA, n = 3 replicates per condition, n = 3 replicates per sample. (K) A schematic model of miR-378a-3p's function during HCC growth and metastasis. (L and M) Clinical relevance of miR-378a-3p with PLAGL2 and β-catenin in HCC. (L) Representative images of ISH analysis of miR-378a-3p as well as immunohistochemistry analysis of PLAGL2 and β-catenin in HCC tissue specimens derived from two representative cases (Case 15, +, low miR-378a-3p; and Case 17, +++, high miR-378a-3p) are shown. (M) The percentage of specimens exhibiting high or low miR-378a-3p expression in relation to the expression levels of PLAGL2 and β-catenin is shown. PLAGL2: **p < 0.01; β-catenin: *p < 0.05, Pearson's χ² test

Our work integrated a rational microRNA biomarker discovery model and experimental functional analysis to yield valuable research results. For translation to clinical application, further research is warranted. First, the occurrence and development of diseases is extremely heterogeneous due to differences in genetics, living habits, occupations, and living environments in the population. As a result, finding universal biomarkers requires large sample sizes to validate their application prospects and effectiveness. Second, to identify population-personalized biomarkers, a personalized medicine model depends on more accurate information support. 
Finally, because of the complexity of diseases, rational drug design or treatment that interferes with disease development is possible only from the perspective of system control and dynamics, and must be based on in-depth analysis of the molecular mechanism.
There are also limitations in this study. First, we noticed that the heterogeneity among patients has a significant influence on the evaluation of candidate biomarkers. Thus, a large-scale analysis of patient microarray data is needed in the future to improve the robustness and accuracy of candidate biomarkers. Moreover, other data, such as real-world patient health records and multiple-level omics data, also need to be integrated to further describe the biomedical function of miR-378 and to justify its use as a biomarker for HCC.
In conclusion, we introduced a bioinformatics model that integrates network topological and functional evidence to identify microRNA biomarkers for HCC diagnosis. The predictions from our proposed model were further validated by experimental methods using human HCC cell lines, animal models, and clinical specimens. Notably, we found that miR-378a-3p is a tumor suppressor and that its abnormal expression can affect the cell growth and invasion of HCC, a finding of both theoretical and clinical significance.