Comprehensive analysis of partial epithelial mesenchymal transition‐related genes in hepatocellular carcinoma

Abstract Increasing evidence has revealed that cancer cells in an intermediate state, partial epithelial mesenchymal transition (p‐EMT), are more prone to metastasize than cells that have undergone complete EMT. We performed a comprehensive analysis of E‐cadherin and 25 p‐EMT‐related genes in hepatocellular carcinoma (HCC) to explore their roles and regulatory mechanisms, and constructed an mRNA‐miRNA‐lncRNA ceRNA subnetwork containing p‐EMT‐related genes using bioinformatic approaches. IHC was used to assess the protein expression of the key p‐EMT‐related genes P4HA2, ITGA5, MMP9, MT1X and SPP1. Complete EMT is not necessary for HCC progression. Overexpression of P4HA2, ITGA5, MMP9 and SPP1 and down‐regulation of MT1X were found in HCC tissues and were significantly associated with poor prognosis of HCC patients. By stepwise reverse prediction and validation from mRNA to lncRNA, an mRNA‐miRNA‐lncRNA ceRNA subnetwork correlated with HCC prognosis was identified by expression and survival analysis. This study implied that the key p‐EMT‐related genes P4HA2, ITGA5, MMP9, MT1X and SPP1 could be prognostic biomarkers and potential therapeutic targets for HCC patients. We successfully constructed an mRNA‐miRNA‐lncRNA subnetwork containing p‐EMT‐related genes, each component of which might be utilized as a prognostic biomarker of HCC.


| INTRODUCTION
Hepatocellular carcinoma (HCC) is one of the most common cancers worldwide and listed as the third leading cause of cancer-related mortality. 1 The long-term prognosis of patients with HCC after hepatectomy remains poor because of frequent metastasis. 2 Though efforts have been made in investigating the mechanisms of the HCC initiation and progression, detailed mechanisms of HCC pathogenesis remain largely unclear. Promising biomarkers of HCC urgently need to be identified for HCC diagnosis and therapy improvement.
It is widely acknowledged that epithelial mesenchymal transition (EMT) plays a vital role in invasion and metastasis in diverse types of cancer, including HCC. 3,4 Cancer cells during EMT lose their epithelial features and gain fully mesenchymal phenotypes, known as complete EMT. 5 Down-regulation of epithelial markers such as E-cadherin (also named CDH1) and up-regulation of mesenchymal markers such as vimentin are considered hallmarks of EMT, and a growing number of studies have identified the importance of EMT in HCC progression. 6,7 Recently, an intermediate state of EMT named partial EMT (p-EMT), during which cancer cells exhibit both mesenchymal and epithelial features, has attracted increasing attention. Recent studies have found that metastatic cancer cells express a certain level of E-cadherin, and that cancer cells in the p-EMT state have a strong tendency to adapt to the metastatic microenvironment and a higher risk of metastasis. 8 Our understanding of the p-EMT programme has been enhanced by sophisticated techniques such as lineage tracing in patient-derived xenografts, genetically engineered mouse models (GEMM), single-cell RNA sequencing and fluorescence-activated cell sorting (FACS). 9
The competing endogenous RNA (ceRNA) hypothesis, proposed by Salmena et al, 13 holds that non-coding RNAs (ncRNAs), including long non-coding RNAs (lncRNAs), can cross-talk with mRNAs by competitively binding to shared miRNAs, thereby forming a regulatory network. Accumulating evidence has indicated that the lncRNA-miRNA-mRNA ceRNA network may play an important role in cancer metastasis, including in HCC. 14 Recently, a lncRNA-miRNA-mRNA ceRNA network associated with diagnosis and prognosis of HCC has been established. 15 Nevertheless, understanding of mRNA-miRNA-lncRNA ceRNA networks that contain p-EMT-related genes and correlate significantly with prognosis of HCC remains extremely limited and needs further exploration.
In our study, we performed a comprehensive analysis of E-cadherin in HCC. Then, we analysed the expression patterns, prognostic values, gene mutations, mutual interactions and pairwise correlations of 25 p-EMT-related genes, and identified key genes after comprehensive consideration of their expression and prognostic roles using online databases. We selected key upstream miRNAs and lncRNAs regulating the key genes to construct an mRNA-miRNA-lncRNA ceRNA subnetwork associated with prognosis of HCC. Furthermore, we identified protein expression levels of the selected key p-EMT-related genes in HCC tissues and adjacent nontumour liver tissues by immunohistochemistry (IHC) staining. They may also serve as potential biomarkers for diagnostic and therapeutic improvement of HCC in the future.
1 (weak); 2 (medium); 3 (strong). The percentage of positive cells was scored from 0 to 4 (0%, 1%-25%, 26%-50%, 51%-75%, 76%-100%). An overall score ranging from 0 to 12 was calculated by multiplying the above two scores, resulting in negative (0-3) or positive (4-12) staining for each sample.
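The IHC scoring rule described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the intensity scale is assumed to run from 0 to 3 (the text implies an overall range of 0 to 12), and percentages between 0% and 1% are assumed to fall into the 1%-25% bin.

```python
def percentage_score(percent_positive):
    """Map the percentage of positive cells to the 0-4 scale given in the text."""
    if not 0 <= percent_positive <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_positive == 0:
        return 0
    if percent_positive <= 25:   # assumption: values in (0, 1) are binned here
        return 1
    if percent_positive <= 50:
        return 2
    if percent_positive <= 75:
        return 3
    return 4


def ihc_overall_score(intensity, percent_positive):
    """Multiply staining intensity (assumed 0-3) by the percentage score (0-4)."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2 or 3")
    return intensity * percentage_score(percent_positive)


def ihc_call(intensity, percent_positive):
    """Return 'negative' for overall scores 0-3 and 'positive' for 4-12."""
    score = ihc_overall_score(intensity, percent_positive)
    return "positive" if score >= 4 else "negative"
```

For instance, medium intensity (2) with 30% positive cells gives 2 x 2 = 4 and is called positive, whereas weak intensity (1) with 60% positive cells gives 1 x 3 = 3 and is called negative.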

| UALCAN database analysis
UALCAN (http://ualcan.path.uab.edu) is a newly developed interactive web resource for 31 cancer types from TCGA database. 17 In this study, UALCAN database, containing 371 primary HCC samples and 50 normal samples, was used to analyse mRNA expression levels of p-EMT-related genes in HCC. In addition, correlations between genes and clinical characteristics were investigated in UALCAN database. Transcript per million <1 was excluded due to the extremely low value. P value <.05 was considered as statistically significant.

| Oncomine database analysis
Oncomine (https://www.oncomine.org/) is an integrated online cancer microarray database. 18 In this study, Oncomine was used to analyse mRNA expression levels of the 25 p-EMT-related genes in Roessler's HCC data set. 19 Differences in mRNA expression were compared using Student's t test. The cut-offs were as follows: P value, .01; fold change, 1.5.

| Kaplan-Meier Plotter analysis
The correlation between mRNA expression levels of p-EMT-related genes and prognosis of HCC patients was evaluated using Kaplan-Meier Plotter (www.kmplot.com), an online database containing information about the effects of 54 675 genes on survival in more than 20 types of cancer. 20 Each p-EMT-related gene was first entered into the 'Liver cancer' item in this database. HCC patients were divided into high- and low-expression groups according to the median mRNA expression value, and Kaplan-Meier overall survival curves were then generated. A logrank P value <.05 was considered significant.
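To make the median split, the Kaplan-Meier estimate and the logrank comparison concrete, the following is a minimal pure-Python sketch. It is not the Kaplan-Meier Plotter implementation; the function names are hypothetical, samples equal to the median are assumed to go to the low group, and the logrank P value uses the standard chi-square approximation with one degree of freedom.

```python
import math
from statistics import median


def split_by_median(expression):
    """Label each sample 'high' if its expression exceeds the cohort median, else 'low'."""
    cut = median(expression)
    return ["high" if value > cut else "low" for value in expression]


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  : follow-up time per patient
    events : 1 if death was observed, 0 if the patient was censored
    """
    survival, curve = 1.0, []
    for t in sorted({u for u, e in zip(times, events) if e == 1}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


def logrank_p(times, events, groups):
    """Two-group logrank test ('high' vs the rest); returns the P value."""
    obs_minus_exp, variance = 0.0, 0.0
    rows = list(zip(times, events, groups))
    for t in sorted({u for u, e, _ in rows if e == 1}):
        n = sum(1 for u, _, _ in rows if u >= t)                    # at risk, both groups
        n_high = sum(1 for u, _, g in rows if u >= t and g == "high")
        d = sum(1 for u, e, _ in rows if u == t and e == 1)         # deaths at time t
        d_high = sum(1 for u, e, g in rows if u == t and e == 1 and g == "high")
        obs_minus_exp += d_high - d * n_high / n
        if n > 1:
            variance += d * (n_high / n) * (1 - n_high / n) * (n - d) / (n - 1)
    if variance == 0:
        return 1.0
    chi_square = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom
    return math.erfc(math.sqrt(chi_square / 2))
```

A typical use would be `groups = split_by_median(expression)` followed by `logrank_p(times, events, groups)`, with one Kaplan-Meier curve computed per group for plotting.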

| Functional enrichment analysis
Database for Annotation, Visualization and Integrated Discovery

| The cBioPortal database analysis
The cBioPortal (www.cbioportal.org) is an open online resource for exploring multidimensional cancer genomics data. 22 In this study, we analysed the genomic profiles of the 25 p-EMT-related genes, which included mutations, putative copy-number alterations from GISTIC and mRNA expression z-scores (RNASeq V2 RSEM) with a z-score threshold of ±1.8. The pairwise correlations of p-EMT-related genes were analysed via the cBioPortal online tool and visualized using the ggcorplot package of R software. Pearson's correlation was used.
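As an illustration of the two quantities used in this step, the sketch below computes a pairwise Pearson correlation matrix and flags samples whose mRNA z-score lies outside ±1.8. It is a stand-alone Python sketch rather than the cBioPortal or R workflow; the function names and the strict inequality at the threshold are assumptions.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length expression vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length with at least 2 samples")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(expression):
    """expression maps gene name -> list of per-sample values; returns nested dict of r."""
    genes = sorted(expression)
    return {g: {h: pearson_r(expression[g], expression[h]) for h in genes} for g in genes}


def altered_samples(z_scores, threshold=1.8):
    """Indices of samples whose mRNA z-score lies outside +/- threshold (strictly)."""
    return [i for i, z in enumerate(z_scores) if abs(z) > threshold]
```

The nested dictionary returned by `correlation_matrix` has the same shape as the symmetric matrix that a correlation heat map would display.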

| Protein-protein interaction (PPI) network
The PPI network of the p-EMT-related genes was constructed using the Search Tool for the Retrieval of Interacting Genes (STRING) database (http://string-db.org/). 23

| Prediction of miRNAs and lncRNAs
We predicted the upstream miRNAs of the five key p-EMT-related genes using miRTarBase (http://mirtarbase.mbc.nctu.edu.tw/php/index.php), an online database containing thousands of miRNA-target interactions validated experimentally by reporter assay, Western blot, microarray and next-generation sequencing experiments. 25 To ensure the credibility of the predicted results, only miRNA-target interactions validated by reporter assay were selected for further analysis. The upstream candidate lncRNAs interacting with the key miRNAs were predicted using the miRNet database (https://www.mirnet.ca/), an online platform providing comprehensive analyses of miRNA-target interactions. Selection criteria were 'Organism-H. sapiens' and 'target type-lncRNAs'. The expression levels and prognostic values of predicted miRNAs and lncRNAs were analysed between HCC tissues and normal tissues using starBase v3.0 (http://starbase.sysu.edu.cn/index.php), an online database providing differential expression, survival and coexpression analysis of RNA data from the TCGA projects. 26 P value <.05 was considered as statistically significant. We drew the Venn diagram using VENNY 2.1.0 (http://bioinfogp.cnb.csic.es/tools/venny/index.html), an interactive online tool.
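The Venn-diagram step amounts to intersecting two candidate lists, for example RNAs with significant differential expression and RNAs with significant prognostic value. A minimal Python sketch follows; the function name and the placeholder identifiers are hypothetical and are not results from this study.

```python
def venn_regions(set_a, set_b):
    """Split two candidate lists into the three regions of a two-set Venn diagram."""
    a, b = set(set_a), set(set_b)
    return {"only_a": a - b, "both": a & b, "only_b": b - a}


# Placeholder identifiers for illustration only
differentially_expressed = ["miR-A", "miR-B", "miR-C"]
prognostic = ["miR-B", "miR-C", "miR-D"]
regions = venn_regions(differentially_expressed, prognostic)
candidates = sorted(regions["both"])  # RNAs that satisfy both criteria
```

Only the overlap (`regions["both"]`) would be carried forward as candidates under this scheme.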

| Correlation analysis
We explored the correlations of mRNA-miRNA, miRNA-lncRNA and mRNA-lncRNA pairs in HCC using starBase v3.0 database and P value <.05 was considered as statistically significant.
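One way to read the three pairwise correlations together is sketched below in Python. It assumes, following the ceRNA hypothesis, that a supporting triple shows a negative miRNA-mRNA correlation, a negative miRNA-lncRNA correlation and a positive mRNA-lncRNA correlation, each with P value <.05. The function name and the example values are hypothetical, and the sign criteria are an interpretation rather than a rule stated in this section.

```python
ALPHA = 0.05  # significance threshold stated in the text


def supports_cerna(mirna_mrna, mirna_lncrna, mrna_lncrna, alpha=ALPHA):
    """Each argument is an (r, p) tuple for one pair; returns True if the triple fits.

    Assumed pattern: miRNA-mRNA r < 0, miRNA-lncRNA r < 0, mRNA-lncRNA r > 0,
    with every pair significant at the chosen alpha.
    """
    pairs = (mirna_mrna, mirna_lncrna, mrna_lncrna)
    if not all(p < alpha for _, p in pairs):
        return False
    return mirna_mrna[0] < 0 and mirna_lncrna[0] < 0 and mrna_lncrna[0] > 0
```

For example, `supports_cerna((-0.20, 0.001), (-0.15, 0.004), (0.30, 0.0001))` returns `True`, while the same triple with a non-significant miRNA-lncRNA pair returns `False`.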

| Complete EMT may not be necessary for HCC metastasis
E-cadherin has been considered a hallmark of EMT. 3 (Figure S1). Considering the results above, we reasoned that complete EMT might not be necessary for HCC progression and focused on the role of p-EMT-related genes in HCC.

FIGURE 3
Prognostic value of mRNA expression levels of distinct p-EMT-related genes in HCC patients. The correlation between mRNA expression levels of distinct p-EMT-related genes and overall survival of HCC patients

| Different mRNA expression levels of p-EMT-related genes in HCC patients
We used UALCAN to compare the mRNA expression of p-EMT-related genes in HCC tissues and normal liver tissues (Figure 2 and Figure S2). The expression patterns of these genes were further confirmed in Oncomine and GEPIA (Figures S3 and S34). Among common p-EMT-related genes, the mRNA expression levels of P4HA2, ITGA5, LAMA3, CDH13, LAMB3 and VIM were significantly higher in HCC tissues than in normal tissues in all databases. Among variable p-EMT-related genes, MMP9 and SPP1 were up-regulated in HCC tissues, while MT1X was remarkably down-regulated.

| Correlation between the mRNA expression levels of p-EMT-related genes and prognosis of HCC patients
To determine the role of p-EMT-related genes in prognosis of HCC patients, we analysed the correlation between the mRNA expres-

| Functional enrichment analysis for p-EMT-related genes
In addition to biological processes identified already, the p-EMT programme may be associated with some other biological processes and pathways. We performed functional enrichment analysis to explore the particular biological functions of the 25 p-EMT-related genes.
Consistent with previous studies, the results showed that these

| Genetic mutations in p-EMT-related genes and PPI network construction
The alterations of p-EMT-related genes were explored by using

| Identification of key upstream miRNAs
Then, we sought to establish an mRNA-miRNA-lncRNA competing endogenous RNA subnetwork associated with p-EMT-related genes in HCC, in which each component was markedly correlated with HCC prognosis. To screen key miRNAs regulating the five key p-EMT-related genes, we predicted their upstream miRNAs using miRTarBase, a database of experimentally validated miRNA-target interactions. To improve the credibility of the predicted results, only miRNA-target interactions validated by reporter assays were included in our study. A total of 20 miRNAs were predicted to regulate three of the key p-EMT-related genes (ITGA5, MMP9 and SPP1), all of which were up-regulated in HCC tissues (Figure 6A). No miRNAs targeting P4HA2 or MT1X were found in this database. Given the inverse regulatory relationship between miRNAs and their target genes, we explored the expression and prognostic values of the predicted miRNAs in HCC patients using starBase v3.0. The results showed that seven miRNAs were significantly down-regulated in HCC tissues, and only two of them (hsa-miR-148a-3p and hsa-miR-204-5p) were associated with poor prognosis in HCC patients (Figure 6B). The expression boxplots and survival curves of the two candidate miRNAs are shown in Figure 6C, and they were selected as key miRNAs for the next analysis.

| Identification of key upstream lncRNAs
It is well known that lncRNAs can act as sponges that competitively bind to miRNAs, thereby regulating expression of target genes. Hence, we predicted key upstream lncRNAs that potentially bind to the two key miRNAs using miRNet, an online database for miRNA-associated studies. A total of 80 lncRNAs were found in the database for the two miRNAs hsa-miR-148a-3p and hsa-miR-204-5p (Figure 7A).
The ceRNA hypothesis holds that lncRNAs can weaken miRNA activity and thereby up-regulate expression of miRNA target genes. 13 Based on the hypothesis, there should be a negative correlation be-

| Identification of the protein expression levels of key p-EMT-related genes in HCC
In addition to mRNA expression analysis, we further observed the

| DISCUSSION
Increasing evidence has identified the vital role of EMT in cancer progression, in which E-cadherin dysregulation has been observed in many types of cancer, 6,27 including HCC. In our study, we found that E-cadherin expression was lower in HCC tissues in a few cohorts, and no significant difference was observed in the others. Analysis TGF-β signalling, which might be a potential target for HCC therapy. 38 The protein encoded by SPP1 can be secreted into serum and is related to the attachment of osteoclasts to mineralized bone matrix. 39 SPP1 has been screened as a molecule for HCC diagnosis and prognosis. 40 Our results showed that the four p-EMT-related genes P4HA2, ITGA5, MMP9 and SPP1 were overexpressed in HCC tissues compared with normal liver tissues, and that their mRNA expression levels were all associated with prognosis of patients with HCC. In contrast, MT1X has been identified as a tumour suppressor involved in HCC progression and metastasis. 41 The expression and survival analysis showed that MT1X mRNA expression was higher in normal tissues and correlated with better prognosis of HCC patients.
Considering the importance of ceRNA networks in cancer, we constructed a novel ceRNA subnetwork associated with prognosis of HCC, containing the key p-EMT-related genes we selected, to explore the regulatory mechanisms of p-EMT-related genes. Candidate miRNAs, and lncRNAs binding to those miRNAs, were therefore predicted in turn. Twenty miRNAs targeting ITGA5, MMP9 and SPP1 were first identified using the miRTarBase database. Given the mechanism of action of miRNA on mRNA, we identified the ITGA5-miR-148a-3p pair for further analysis by performing correlation analysis for these mRNA-miRNA interactions in HCC using starBase v3.0. Low expression of miR-148a-3p was an independent risk factor for overall survival in HCC. 42 In vitro experiment showed that miR-148a-3p inhibited  45 Overexpression of SNHG20 has been reported to be an indicator of poor prognosis of HCC. 46 In addition, TUG1 is overexpressed in HCC and enhances cell proliferation and tumorigenicity by epigenetic inhibition of Kruppel-like factor 2 (KLF2) transcription. 47 Consequently, a novel mRNA-miRNA-lncRNA regulatory subnetwork associated with prognosis of HCC was constructed successfully. Interestingly, some interactions in this network had been identified in previous studies, which further supported the reliability of our results. 48 In the end, six lncRNA-miRNA-mRNA axes (SNHG3/SNHG20/NUTM2B-AS1/LINC00909/LINC00346/TUG1-miR-148a-3p-ITGA5) were retained, which might be utilized as prognostic biomarkers for HCC.
We further examined the protein expression levels of the five key p-EMT-related genes in HCC tissues and corresponding adjacent nontumour tissues using IHC staining. Consistent with the results above, the expression of P4HA2, ITGA5, MMP9 and SPP1 was higher in HCC tissues than in adjacent nontumour tissues at the protein level. In contrast, the protein expression level of MT1X was lower in HCC tissues than in nontumour liver tissues.
In conclusion, we systematically analysed the expression

ACKNOWLEDGEMENTS
The work was supported by grants from the National Natural

CONFLICT OF INTEREST
The authors report no conflicts of interest in this work.

This study was approved by the Academic Committee of Tongji
Medical College, Huazhong University of Science and Technology.
Written informed consent from the patients was obtained.

DATA AVAILABILITY STATEMENT
All the data in our study can be accessed from the online databases.