Although most studies of gene function have focused on protein-coding genes, more than 98% of the human genome does not encode proteins. The Encyclopedia of DNA Elements (ENCODE) project revealed that, while less than 2% of the genome encodes proteins, more than 80% of the genome is proposed to have biological activity, and a portion of this is actively transcribed without encoding proteins (Fig. 1). These transcripts include the long noncoding RNAs (lncRNAs). Defining their functions and their mechanistic basis has been the focus of intense recent research, and many lncRNAs may organize multicomponent complexes that regulate gene expression. LncRNAs are becoming increasingly appreciated in the pathogenesis of liver diseases and have potential diagnostic, prognostic, or therapeutic importance.
The identification of the presence of large RNA transcripts that do not code for proteins but that may have biological functions has provided an important new perspective in gene regulation. These long noncoding RNAs (lncRNAs) are being increasingly recognized to contribute to many biological processes through diverse mechanisms. The roles of these emerging genes are being recognized across kingdoms. These findings are profoundly altering our understanding of disease pathobiology and leading to the emergence of new biological concepts underlying human diseases. Strategies for the discovery and characterization of lncRNAs are highlighted. Several lncRNAs have been described in liver disease, and in liver cancers in particular. Their molecular mechanisms of action, function, and contributions to disease pathophysiology are reviewed. LncRNA genes associated with liver diseases have potential roles as biomarkers of disease diagnosis, prognosis, or therapeutic response as well as direct targets for therapeutic intervention. Conclusion: The emerging knowledge in this rapidly advancing field offers promise for new fundamental knowledge and clinical applications that will be relevant for human liver diseases. (Hepatology 2014;60:744–753)
Abbreviations
CUDR, cancer up-regulated drug resistant
EGLN2, prolyl hydroxylase 1
ENCODE, Encyclopedia of DNA Elements
HBV, hepatitis B virus
HELLP, hemolysis, elevated liver enzymes, low platelets
HOTAIR, Hox transcript antisense intergenic RNA
HOTTIP, HoxA distal transcript antisense RNA
HULC, highly up-regulated in liver cancer
IGF2, insulin-like growth factor 2
lincRNAs, long intergenic noncoding RNAs
lncRNAs, long noncoding RNAs
lncRNA-Dreh, lncRNA down-regulated expression by HBx
HEIH, highly expressed in hepatocellular carcinoma long noncoding RNA
lncRNA-LALR1, long noncoding RNA associated with liver regeneration 1
lncRNA-RoR, lincRNA-regulator of reprogramming
lncRNA MVIH, long noncoding RNA associated with microvascular invasion in HCC
MALAT-1, metastasis-associated lung adenocarcinoma transcript 1
MDR1, multidrug resistance 1
MEG3, maternally expressed 3
NFAT, nuclear factor of activated T cells
PAI-1, plasminogen activator inhibitor-1
PAI-RBP1, plasminogen activator inhibitor-1 mRNA binding protein
PRC2, polycomb repressive complex 2
RNA pol II, RNA polymerase II
TUC338, transcribed ultraconserved RNA 338
TUC339, transcribed ultraconserved RNA 339
What Are Long Noncoding RNA?
Noncoding (nc) RNA transcripts are divided into two heterogeneous groups based on an arbitrary transcript size of greater or less than 200 nucleotides (Fig. 2). The short ncRNAs include microRNAs (miRNAs), small nuclear RNAs, and many other classes of small RNAs. miRNAs are approximately 21-nucleotide ncRNAs that posttranscriptionally regulate gene expression through the RNA-induced silencing complex, which uses Argonaute family proteins to cleave target messenger RNA (mRNA) transcripts or inhibit their translation. By some estimates, up to 60% of protein-coding genes can be targeted by miRNAs. Their involvement in liver diseases has recently been reviewed.
Long ncRNAs are greater than 200 nucleotides. We use the term lncRNA to refer to transcribed RNA genes that are polyadenylated and are transcribed by RNA polymerase II (RNA pol II). They are less well understood than miRNAs and other small RNAs. Similar to protein-coding genes, lncRNAs show tissue-specific expression, chromatin marks, independent gene promoters, regulation by transcription factors, and splicing of multiple exons into a mature transcript. Indeed, chromatin marks that are observed at sites of initiation or active transcription of protein-coding genes have been used to identify candidate lncRNAs. Some ncRNAs are transcribed off a nearby or associated gene promoter (e.g., promoter/termini-associated RNAs, intronic ncRNAs), whereas others may be transcribed from independent promoters. The genomic locations of lncRNAs can be described in terms of their relationship to protein-coding genes (Fig. 3). Intergenic lncRNAs (lincRNAs) have distinct transcriptional units that do not overlap those of protein-coding genes. Other types of lncRNA include enhancer RNAs, transcribed ultraconserved genes, antisense RNAs, and many others (Fig. 2). Circular RNAs (circRNAs) are a distinct group best considered separately from linear ncRNAs, although most identified circRNAs are greater than 200 nucleotides. The diversity of genomic locations and types of ncRNA highlights the high complexity of the genetic information encoded within the human genome.
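The two classification axes described above, transcript size and genomic position relative to protein-coding genes, can be illustrated with a minimal Python sketch. The `Locus` type, the function names, the three location labels, and the coordinates in the usage note are all hypothetical simplifications introduced here for illustration; they are not an established annotation scheme, and a transcript of exactly 200 nucleotides is arbitrarily treated as short.

```python
from dataclasses import dataclass
from typing import List

SIZE_CUTOFF_NT = 200  # the arbitrary size threshold described in the text


@dataclass(frozen=True)
class Locus:
    """Hypothetical genomic interval: 0-based start, exclusive end."""
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"


def size_class(length_nt: int) -> str:
    """Label a noncoding transcript as 'long' (> 200 nt) or 'short'."""
    return "long" if length_nt > SIZE_CUTOFF_NT else "short"


def location_class(ncrna: Locus, coding_genes: List[Locus]) -> str:
    """Describe an ncRNA locus relative to protein-coding genes.

    Simplified labels (an assumption of this sketch):
      'intergenic'        - no overlap with any protein-coding gene
      'antisense'         - overlaps a gene on the opposite strand
      'sense-overlapping' - overlaps only genes on the same strand
    """
    overlapping = [
        g for g in coding_genes
        if g.chrom == ncrna.chrom and g.start < ncrna.end and ncrna.start < g.end
    ]
    if not overlapping:
        return "intergenic"
    if any(g.strand != ncrna.strand for g in overlapping):
        return "antisense"
    return "sense-overlapping"
```

For example, with a single made-up gene `Locus("chr11", 1000, 5000, "+")`, a locus at 6000 to 8000 on the same chromosome would be labeled intergenic, whereas a minus-strand locus at 4000 to 6000 would be labeled antisense.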
How Are Long Noncoding RNA Identified?
Large-scale genome-wide sequencing and next-generation sequencing identified several lncRNA transcripts, first within the mouse transcriptome, and subsequently also in humans. More recently, tiling microarrays and RNA-sequencing have increased the numbers of known lncRNAs. Quantitatively, the number of postulated lncRNA transcripts may even exceed the number of protein-coding mRNAs in mammals. Similar to data from sequencing studies in mice, the identification of 9,640 lncRNA loci in humans was reported by the ENCODE project. This effort involved a large number of sequence-based studies to map functional elements across the human genome. The elements mapped include RNA transcribed regions, protein-coding regions, transcription factor binding sites, chromatin structure, and DNA methylation sites. ENCODE project data suggest that more than 80% of the genome has biochemical functions, with a small proportion of this representing lncRNA transcripts.
Epigenetic marks associated with transcription can identify lncRNAs within the genome. Genes actively transcribed by Pol II are enriched with trimethylation of lysine 4 on histone H3 (H3K4me3) at their promoters and trimethylation of lysine 36 on histone H3 (H3K36me3) along their transcribed regions. A search for these domains in genome-wide chromatin-state maps in mice identified ∼1,600 lincRNAs that carried these epigenetic marks and were conserved across mammals. These lincRNAs undergo 5′-capping and polyadenylation and are transcribed by Pol II. Transcriptional regulation was demonstrated for selected lincRNAs associated with biological processes ranging from embryonic stem cell pluripotency to cell proliferation. A similar approach identified about 300 lincRNAs in human cells.
Some lincRNAs have been functionally validated and their association with biological processes has been shown. About 20% of lincRNAs are bound to polycomb repressive complex 2 (PRC2). Through this interaction, lincRNA are associated with chromatin remodeling and altered gene expression and with biological functions.[7, 8] LincRNAs may be involved in different kinds of biological pathways such as pluripotency, differentiation, proliferation, or cell survival. For example, lincRNA-Regulator of Reprogramming (lncRNA-RoR) plays a role in promoting survival in liver cancer cells during hypoxic stress by modulating hypoxia signaling pathways or in pluripotent or embryonic stem cells by preventing the activation of cellular stress pathways such as the p53 response.[9, 10]
A search for conserved genomic regions has identified novel lncRNAs. Ultraconserved elements are evolutionarily conserved sequences that are 100% identical across the genomes of humans, mice, and rats. A genome-wide bioinformatics study identified 481 such segments, ranging in size from 200 to ∼800 base pairs, within orthologous regions. The expression of some transcribed ultraconserved regions, or ultraconserved RNAs (ucRNAs), is significantly altered in diseases such as adult chronic lymphocytic leukemia, colorectal cancer, and neuroblastoma. We identified and cloned transcribed ultraconserved RNA 338 (TUC338), a ucRNA that is increased in both cirrhosis and hepatocellular carcinoma (HCC).
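The definition of an ultraconserved element given above (a stretch of at least 200 base pairs that is 100% identical in human, mouse, and rat) can be sketched as a scan over a multiple alignment. This is a minimal illustration only: the function name `ultraconserved_runs` is hypothetical, the input is assumed to be pre-aligned, equal-length, uppercase sequences with `-` marking gaps, and it is not the pipeline used in the original genome-wide study.

```python
from typing import List, Tuple


def ultraconserved_runs(aligned: List[str], min_len: int = 200) -> List[Tuple[int, int]]:
    """Return half-open (start, end) column ranges of the alignment in which
    every sequence carries the same base, with no gaps, for >= min_len columns.
    """
    if not aligned:
        return []
    width = len(aligned[0])
    if any(len(seq) != width for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")

    runs = []
    run_start = None
    # Iterate one column past the end so a run reaching the last column is closed.
    for i in range(width + 1):
        identical = (
            i < width
            and aligned[0][i] != "-"
            and all(seq[i] == aligned[0][i] for seq in aligned)
        )
        if identical:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_len:
                runs.append((run_start, i))
            run_start = None
    return runs
```

On real data the default `min_len=200` mirrors the size threshold in the text; a smaller value is convenient only for toy alignments.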
Although the ability to detect lncRNAs within the human genome has been facilitated by genomic sequencing and bioinformatics analyses, validation of putative candidate genes is necessary. This can be done using conventional molecular biological approaches such as cloning and expression analysis using northern blots or quantitative polymerase chain reaction (PCR). However, a major challenge remains to identify and validate functional biological roles of these lncRNA candidates. This is difficult in part because of the lack of proteins encoded by these genes, their diverse molecular functions, and the wide diversity of potential mechanisms by which these RNA genes can affect gene expression. Consequently, the functions of most of the lncRNAs that have been implicated in liver and other diseases remain poorly described. Understanding these functions will be critical to recognizing the contribution of these genes to biological processes involved in hepatic functioning.
Identification of mRNA targets or specific proteins that bind to a candidate lncRNA may provide clues to its function. Potential mRNA targets could be identified by computational searches for mRNA that may be recognized by base-pairing complementarity, by analysis of RNA following cross-linking to a lncRNA or its binding protein, or by transcriptomic analysis after modulation of lncRNA expression. Potential binding proteins could be identified by isolation of proteins bound to tagged RNA or RNA binding proteins, or by using three-hybrid systems using lncRNA as a bait. The subcellular location or patterns of expression within specific cell types, stages of development, or disease states can be assessed using in situ hybridization (Fig. 4) or northern blotting. Functional studies can also be performed by knockdown of lncRNA in model organisms such as mice using similar strategies to those used for protein-coding mRNAs.
Are Long Noncoding RNA Involved in Human Liver Diseases?
Many lncRNAs are known to contribute to liver regeneration, neoplasia, and other liver diseases. H19 and XIST were the first lncRNAs to be identified and studied in the liver. Although their discoveries predated that of the microRNAs, the latter have been more extensively studied. Other examples include two novel lncRNAs involved in liver cancers that we have cloned,[13, 16] maternally expressed 3 (MEG3), which we have shown to be a tumor suppressor gene in HCC, highly up-regulated in liver cancer (HULC), and metastasis-associated lung adenocarcinoma transcript 1 (MALAT-1), which functions in the regulation of alternative splicing and has an oncogenic role in HCC development.[19, 20] The new insights provided by these lncRNAs into disease pathogenesis or behavior merely represent the tip of the iceberg given the large numbers of lncRNAs that remain to be characterized.
The identification of disease-associated lncRNAs has significant potential for new clinically relevant biomarkers. Thus, focused efforts to characterize disease-relevant lncRNA are very likely to identify new RNA genes that will be relevant to liver diseases or biology. Several recent studies have reported the expression of selected lncRNA within the liver that may contribute to liver pathobiology or are associated with specific liver diseases, and in particular with liver cancers. Similar to expression of other mRNA, expression of lncRNA can be detected using PCR or by in situ hybridization in tissue sections.
LncRNA Associated With Liver Cancers
Several lncRNA have been identified in HCC (Table 1). Several of these lncRNA have been linked to clinical prognostic roles, raising hope that lncRNA may provide effective biomarkers that may be helpful in practice.
|lncRNA|Chromosomal Location|Size (kb)|Potential Role in Liver Cancer|Refs.|
|---|---|---|---|---|
|HULC|Chr6|0.5|Up-regulated in HCC. Higher expression associated with histological grade or HBV|[33, 34, 39]|
|TUC338|Chr12|0.59|Increased in cirrhosis and HCC. Modulates cell growth||
|HEIH|Chr5|1.7|Associated with HBV-HCC||
|MEG3|Chr14|∼1.8|De-regulated in HCC, associated with methylation. Predictive biomarker for monitoring epigenetic therapy|[17, 47]|
|H19|Chr11|2.3|Increased in HCC, or in peritumoral areas. Low peritumor/tumor expression correlates with prognosis. Suppression of tumor metastases through miR-220 dependent inhibition of EMT, drug resistance|[48-51]|
|HOTAIR|Chr12|2.3|HCC recurrence after liver transplantation (LT). Inhibition reduces invasion, increases chemosensitivity|[41, 42, 52]|
|HOTTIP|Chr7|7.9|Up-regulated in HCC. Predicts disease outcomes and tumor progression||
|MALAT-1|Chr11|8.7|Up-regulated in HCC. Associated with cancer metastasis and recurrence|[19, 53]|
|MVIH|||Microvascular invasion in HCC. Predicts recurrence-free survival, overall survival||
|LINC-ROR|Chr18|22.8|Tumor cell survival during hypoxia||
Viral Hepatitis-Related LncRNA
LncRNA expression profiling between hepatitis B virus (HBV)-HCC and normal liver tissues identified a greater than two-fold change in 4.3% of all lncRNAs analyzed. Expression of lncRNA down-regulated expression by HBx (lncRNA-Dreh) was altered in HBV X protein (HBx) transgenic mice compared with wild-type mice. This murine lncRNA can inhibit HCC growth and metastasis in vitro and in vivo, and acts through cytoskeletal modulation by repressing the expression of vimentin. The human ortholog RNA of Dreh (hDREH) is down-regulated in human HBV-related HCC compared with adjacent noncancerous tissues, with the decrement correlating with poor survival of HCC patients.
LncRNA in Genetic Liver Diseases
An interesting study linked a lincRNA to a Mendelian disorder with autosomal-recessive inheritance contributing to liver disease. The HELLP syndrome is characterized by hemolysis, elevated liver enzymes, and low platelets occurring during pregnancy. This syndrome was linked to the HELLP lncRNA, which was functionally implicated in the cell cycle. Blocking potential mutation sites identified in HELLP families decreased the invasion capacity of extravillous trophoblasts.
LncRNA Involved in Hepatic Pathophysiology
Differential expression of several lncRNA was noted during liver regeneration after partial hepatectomy in mice. Among these was lncRNA associated with liver regeneration 1 (lncRNA-LALR1). LncRNA-LALR1 enhances cell cycle progression and hepatocyte proliferation. This lncRNA facilitates cyclin D1 expression through activation of Wnt/β-catenin signaling by a mechanism involving suppression of Axin1 by recruiting CTCF to the AXIN1 promoter region. A human ortholog RNA of lncRNA-LALR1 (lncRNA-hLALR1) has been shown to be expressed in human liver tissues.
A novel biological function as a signaling mediator has been reported for TUC339. Although expression of this ucRNA is not increased in malignant hepatocytes compared with normal cells, it is highly enriched within extracellular vesicles released from HCC cells. The enrichment of TUC339 within extracellular vesicles released from tumor cells, and its detection in serum from patients with HCC, raise the potential for its use as a circulating biomarker of disease.
What Are the Molecular Functions of Long Noncoding RNA, and How Do They Modulate Gene Expression?
Among the lncRNA that have been functionally characterized, a wide and diverse range of effects on the regulation of gene expression has been recognized. These include effects on gene expression through modulation of chromatin remodeling, control of gene transcription, posttranscriptional mRNA processing, protein function or localization, and intercellular signaling (Fig. 5). This diversity precludes a function-based classification that would encompass all lncRNA. Mechanisms that have been described for selected lncRNA involved in liver diseases include widely diverse functions such as DNA imprinting, X-inactivation, DNA demethylation, gene transcription, and generation of other RNA molecules.
Within the liver, circadian oscillations in the expression of several lncRNAs may contribute to cycling histone modifications and gene transcription that can contribute to homeostasis of metabolic systems. Many lncRNAs can regulate gene expression through targeted effects on chromatin remodeling, or through X-chromosome inactivation and imprinting, as exemplified by the lncRNAs Xist and H19, respectively. Loss of imprinting at the H19 locus has been implicated in the increased H19 expression noted in HCC. The MEG3 lncRNA is a putative tumor suppressor in HCC. MEG3 directly and specifically binds to PRC2 in mouse embryonic stem cells. MEG3/Gtl2 is a single-copy gene, and its isoforms are generated from this gene by alternative splicing using different exons. Gtl2 and PRC2 interactions regulate Dlk1 by targeting PRC2 to Dlk1 in cis. Moreover, MEG3 can significantly increase p53 protein levels and stimulate p53-dependent transcription from a p53-responsive promoter. All MEG3 isoforms contain three different secondary folding motifs, M1, M2, and M3. Because M2 and M3 play a critical role in activation of p53, the RNA structural motifs are important to MEG3 function,[26, 28] emphasizing the contribution of secondary and tertiary RNA structure to lncRNA function.
Reprogramming of gene expression during cancer development by the lncRNA HOTAIR (Hox transcript antisense intergenic RNA) involves chromatin remodeling. HOTAIR can act as a molecular scaffold capable of binding to at least two distinct histone modification complexes. A 5′ domain of this lncRNA binds to the mammalian PRC2 complex responsible for H3K27 methylation, whereas a 3′ domain of the RNA binds to the histone lysine demethylase LSD1.[29, 30]
A role for lncRNAs in direct control of transcription is emerging. The HOTTIP (HoxA distal transcript antisense RNA) gene is located in contiguity with HOXA13 and controls HOXA locus gene expression by way of interaction with the WDR5/MLL complex. Studies have shown a bidirectional regulatory loop between HOTTIP and HOXA13. HEIH associates with EZH2, an important subunit of the PRC2 complex, and this association contributes to the expression of EZH2-regulated target genes. Chromatin isolation by RNA purification revealed that TUC338 interacts with genomic DNA and with plasminogen activator inhibitor-1 (PAI-1) mRNA binding protein (PAI-RBP1), an RNA-binding protein that associates with PAI-1 mRNA. TUC338 increases PAI-RBP1 and PAI-1 expression, and PAI-RBP1 promotes transformed cell growth. Furthermore, TUC338 may prevent binding of Pax6 and p53 by binding to their DNA motifs. Thus, TUC338 might modulate gene expression through interaction with trans-acting proteins and binding to cis-acting elements, thereby promoting abnormal cell proliferation in HCC. Other lncRNAs such as Evf-2 have been shown to activate transcription by directly influencing Dlx-2 activity.
Several lncRNAs such as MALAT-1 can regulate gene expression through posttranscriptional effects. The full-length, 7.5-kb MALAT-1 transcript is processed by RNase P and RNase Z to generate two noncoding RNAs. One of these is a small ncRNA, mascRNA, which is exported to the cytoplasm, whereas the other is the lncRNA MALAT-1, which is retained in the nuclear speckles. MALAT-1 may regulate alternative splicing by modulating the activity of serine/arginine (SR) splicing factors. MALAT-1 depletion increases the levels of cellular SR proteins and alters the distribution and the ratio of phosphorylated to dephosphorylated pools of SR proteins. HULC copurifies with ribosomes in the cytoplasm, and has been implicated in the regulation of tumor cell proliferation through down-regulation of p18, involving ATM/ATR and p53-dependent signaling. HULC also down-regulates several miRNAs, including miR-372. Reduction of miR-372 decreases translational repression of PRKACB, which in turn induces phosphorylation of cAMP response element binding protein (CREB). Phosphorylated CREB forms part of the RNA pol II transcriptional machinery to activate HULC expression. Thus, HULC may act as an endogenous “miRNA sponge” to regulate miRNA activities in HCC cells. LncRNA-RoR can negatively regulate p53 expression and inhibit p53-mediated cell cycle arrest and apoptosis through direct interaction with the heterogeneous nuclear ribonucleoprotein I. The effects of lncRNA-RoR on pluripotency or cell survival may therefore be mediated in part through translational regulation.
Protein Function and Localization
Some lncRNAs have also been shown to be capable of modulating protein localization, protein stability, and protein activity. NRON is a noncoding repressor of the nuclear factor of activated T cells (NFAT). This lncRNA interacts with proteins such as members of the importin-beta superfamily and regulates NFAT localization as an RNA component of a protein complex that represses NFAT activity. Cyclin D1 (CCND1) ncRNAs are transcribed from the promoter region of the CCND1 gene. CCND1 ncRNA interacts with the RNA-binding protein translocated in liposarcoma (TLS) and inhibits CREB-binding protein and histone acetyltransferase activities to repress CCND1. This lncRNA negatively regulates CCND1 transcription by recruiting TLS to the CCND1 promoter. Thus, CCND1 ncRNA might be involved in regulating protein activity.
These examples illustrate the diversity of mechanisms by which lncRNA can exert functional effects on gene expression (Fig. 6).
What Is the Diagnostic and Therapeutic Potential of Long Noncoding RNAs?
LncRNA can be useful for diagnosis or as prognostic markers.
LncRNA single nucleotide polymorphisms (SNPs) may identify disease risk. A case-control study of 1,300 HBV-positive HCC patients, 1,344 HBV persistent carriers, and 1,344 subjects with HBV natural clearance examined the association between the SNPs rs7763881 in HULC and rs619586 in MALAT1 and susceptibility to HCC or chronic HBV infection. rs7763881 was associated with decreased HCC risk, whereas the association with rs619586 was of borderline significance. Thus, the variant genotypes of rs7763881 in HULC may contribute to decreased susceptibility to HCC in HBV carriers. Variants in genes that could affect lncRNA expression or function may also be useful. The deletion allele of a polymorphism (rs10680577) within the distal promoter of prolyl hydroxylase 1 (EGLN2) was significantly associated with increased risk for HCC in two case-control cohorts, particularly in smokers. The deletion allele correlated with expression of both EGLN2 and RERT-lncRNA in vivo and in vitro, potentially through actions on the RERT-lncRNA structure, and could be a promising biomarker for early diagnosis of HCC.
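To make concrete what "associated with decreased risk" means in a case-control design, the following sketch computes an unadjusted odds ratio with an approximate 95% confidence interval (Woolf's log method) from a 2×2 table of carrier counts. It is an illustration under stated assumptions: the function name `odds_ratio_ci` is hypothetical, the text does not describe the statistical method used in the cited studies, and no counts from those studies are reproduced here.

```python
import math
from typing import Tuple


def odds_ratio_ci(case_variant: int, case_ref: int,
                  control_variant: int, control_ref: int,
                  z: float = 1.96) -> Tuple[float, float, float]:
    """Unadjusted odds ratio and approximate confidence interval.

    Arguments are counts of cases and controls with and without the variant
    genotype. z = 1.96 gives an approximate 95% interval.
    Returns (odds_ratio, lower_bound, upper_bound).
    """
    counts = (case_variant, case_ref, control_variant, control_ref)
    if min(counts) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (case_variant * control_ref) / (case_ref * control_variant)
    # Standard error of log(OR) under Woolf's approximation.
    se_log_or = math.sqrt(sum(1.0 / c for c in counts))
    log_or = math.log(odds_ratio)
    return (odds_ratio,
            math.exp(log_or - z * se_log_or),
            math.exp(log_or + z * se_log_or))
```

An odds ratio below 1 whose interval excludes 1 would be consistent with a protective association, whereas an interval that includes 1 corresponds to the kind of borderline result described for rs619586.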
Detection of circulating lncRNA raises the potential for use as markers of disease. For example, circulating HULC is detected in patients with HCC, while HULC expression is increased in liver tumor tissues and observed in the plasma of patients with HBV positivity. Other lncRNA that are deregulated in liver diseases such as TUC338 (increased) or MEG3 (decreased) in HCC may also have potential utility as diagnostic markers and warrant further exploration.[13, 17]
Predictors of Prognosis
Several lncRNAs have prognostic potential. LncRNA-HEIH is associated with cancer recurrence in HBV-related HCC, with high lncRNA-HEIH expression predicting a worse prognosis. Similarly, plasma HULC is increased with higher HCC Edmondson grades. Increased HOTAIR expression is associated with risk of recurrence after hepatectomy for HCC,[41, 42] whereas overexpression of MVIH is associated with microvascular invasion and decreased survival after hepatectomy. MALAT-1 expression predicts recurrence in patients with HCC undergoing liver transplantation, particularly in patients beyond Milan criteria. As data regarding the role of lncRNAs emerge in other liver diseases, the potential for lncRNA-based prognostic markers will expand.
New therapeutic possibilities are arising from targeting RNA for the treatment of disease. Therapeutic approaches based on H19 have entered clinical trials. An example is gene therapy using a vector carrying the diphtheria toxin A-chain gene under the control of H19 regulatory sequences that may potentially be useful for liver tumors with a high expression of H19. Likewise, enhancing lncRNA-LALR1 may be therapeutically beneficial in inducing liver regeneration following liver injury because this lncRNA has been shown to accelerate liver cell proliferation. Similar to emerging strategies to target microRNA, targeting lncRNA holds considerable promise.
Extensive investigations into the content of the human genome have identified several thousand lncRNAs. Emerging studies are implicating these in a wide range of biological processes involved in development, differentiation, growth, tissue injury, repair, regeneration, and metabolism. New candidate disease-associated lncRNA genes are being identified and their molecular mechanisms are being elucidated. Epigenetic regulation of the genome by lncRNAs is an exciting, emerging paradigm that will provide new biological insights. LncRNAs offer new possibilities for understanding disease pathogenesis, as diagnostic or prognostic biomarkers, or as direct therapeutic targets. Indeed, several lncRNAs associated with liver cancers may have potential as markers of disease progression or prognosis. The involvement of lncRNAs in diseases other than cancer remains less well characterized but is likely to be a fruitful area of investigation. Given the large number of lncRNAs, and the intense research to identify and evaluate these genes, novel lncRNAs with relevance to liver diseases are likely to be identified and their effects on disease pathophysiology defined. An improved understanding arising from focused study of these novel genes in liver diseases will undoubtedly provide useful insights and generate new hypotheses regarding disease pathogenesis that will eventually lead to novel clinical applications.