Early detection of hepatocellular carcinoma via liquid biopsy: panel of small extracellular vesicle‐derived long noncoding RNAs identified as markers

This study investigated the diagnostic potential of serum small extracellular vesicle‐derived long noncoding RNAs (EV‐lncRNAs) for hepatocellular carcinoma (HCC). Driver oncogenic lncRNA candidates were selected by a comparative analysis of lncRNA expression profiles from two whole transcriptome human HCC datasets (Catholic_LIHC and TCGA_LIHC). Expression of selected lncRNAs in serum and small EVs was evaluated using quantitative reverse transcription PCR. Diagnostic power of serum EV‐lncRNAs for HCC was determined in the test (n = 44) and validation (n = 139) cohorts. Of the six promising driver onco‐lncRNAs, DLEU2, HOTTIP, MALAT1, and SNHG1 exhibited favorable performance in the test cohort. In the validation cohort, serum EV‐MALAT1 displayed excellent discriminant ability, while EV‐DLEU2, EV‐HOTTIP, and EV‐SNHG1 showed good discriminant ability between HCC and non‐HCC. Furthermore, a panel combining EV‐MALAT1 and EV‐SNHG1 achieved the best area under the curve (AUC; 0.899, 95% CI = 0.816–0.982) for very early HCC, whereas a panel with EV‐DLEU2 and alpha‐fetoprotein exhibited the best positivity (96%) in very early HCC. Serum small EV‐MALAT1, EV‐DLEU2, EV‐HOTTIP, and EV‐SNHG1 may represent promising diagnostic markers for very early‐stage HCC.


Introduction
Primary liver cancer is one of the most commonly diagnosed cancers globally, with relatively high morbidity and mortality [1]. Hepatocellular carcinoma (HCC) is the most prevalent primary liver cancer, accounting for approximately 75% of all liver cancer cases [2]. Despite surveillance with abdominal ultrasonography in at-risk populations, only 20% of HCC patients can receive curative treatment through surgical resection, ablation, or liver transplantation [3,4]. Moreover, the sensitivity and specificity of current clinical screening and diagnostic techniques have recently been questioned, particularly for very early-stage HCC. Hence, a pressing need exists for HCC diagnostic biomarkers with improved sensitivity and specificity to increase the opportunity for curative treatment and improve patient prognosis.
Tumor-secreted extracellular vesicles (EVs) serve as critical intercellular communicators between tumor cells and stromal cells in local and distant tumor environments. EVs are detectable in all body fluids and are resistant to biological degradation; they have therefore been reported as promising biomarkers for monitoring cancer development, particularly in liquid biopsy approaches [5,6].
Long noncoding RNAs (lncRNAs), a group of noncoding RNAs more than 200 nucleotides in length, function as important players in chromatin modification, transcriptional inhibition, RNA processing, and gene activation by acting as decoys or signals [7,8]. Recent studies indicated that HCC-related lncRNAs such as HOTAIR, HULC, H19, and MALAT1 have decisive regulatory roles in the development and progression of HCC, while their dysregulation is related to various biological processes, including differentiation, proliferation, apoptosis, invasion, and metastasis [9,10]. Although lncRNAs have been primarily examined in tissues, they likely also exist in various body fluids, either free, bound to proteins, or enclosed in EVs, suggesting their usefulness as liquid-based noninvasive markers for clinical use in diagnosis and as therapeutic targets. However, the potential of dysregulated lncRNAs, particularly those in circulating tumor-derived EVs, as liquid biopsy biomarkers has not been fully explored.
The current study, therefore, sought to investigate potential lncRNA biomarkers using HCC whole transcriptome sequencing datasets from The Cancer Genome Atlas liver hepatocellular carcinoma project (TCGA_LIHC) and the Catholic University of Korea's liver hepatocellular carcinoma project (Catholic_LIHC) and to subsequently validate their applicability as HCC diagnostic biomarkers. The lncRNAs that were significantly differentially expressed between HCC tissues and nontumor liver tissues were analyzed, and six lncRNAs (DLEU2, HOTTIP, MALAT1, NEAT1, SNHG1, and TUG1) were identified as candidate biomarkers for HCC. Furthermore, serum small EV-derived DLEU2, HOTTIP, MALAT1, and SNHG1, which showed good to excellent discriminating ability to diagnose very early HCC with or without alpha-fetoprotein (AFP), were identified as potential liquid biopsy biomarkers. Overall, small EV-derived lncRNAs could help diagnose very early-stage HCC even in patients without AFP elevation.

LncRNA expression analyses
Next-generation RNA-sequencing datasets were acquired from Catholic_LIHC and TCGA_LIHC. Public microarray datasets (GSE77314, GSE94660, and GSE124535) were obtained from the National Center for Biotechnology Information Gene Expression Omnibus (GEO). The two independent whole transcriptome datasets (Catholic_LIHC and TCGA_LIHC) were analyzed to identify lncRNA candidates overexpressed in HCC, and the other datasets were then used to validate the expression of the candidate lncRNAs. Sequencing reads from all datasets were quality-controlled using FASTQC (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) and then mapped to human Gencode release version 22 using STAR software (https://code.google.com/archive/p/rna-star/). The Lnc2Cancer 2.0 database was used to select the lncRNAs with the most significant functional roles in cancer; only lncRNAs registered in Lnc2Cancer 2.0 were selected as final candidates.

Study population and definitions
Sera and clinical information were collected from the Biobank of Ajou University Hospital, a member of the Korea Biobank Network, between January 2014 and December 2018. The study population was divided into groups of healthy controls (NL), subjects with chronic hepatitis (CH), subjects with liver cirrhosis (LC), and subjects with HCC. Healthy controls were defined as 18- to 50-year-old subjects with no past medical history who attended regular medical checkups at the Ajou Health Promotion Center and had normal health status. CH was diagnosed by hepatitis C virus RNA or serum hepatitis B surface antigen detection for more than six consecutive months, as well as elevated aminotransferase or alanine aminotransferase. LC was diagnosed based on clinical data, radiologic findings, and/or histological confirmation. HCC was diagnosed according to the American Association for the Study of Liver Diseases practice guideline [11]. Very early-stage HCC denoted a single tumor of < 2 cm in diameter, equivalent to modified Union for International Cancer Control (mUICC) stage I. Overall survival was calculated as the time interval from HCC diagnosis to death from any cause. This study was approved by the Institutional Review Board of Ajou University Hospital, Suwon, South Korea (AJRIB-BMR-KSP-18-397 and AJIRB-BMR-KSP-18-299). Anonymous sera and clinical information were supplied by the Ajou Human Bio-Resource Bank, and the requirement for informed consent was waived.

Serum isolation and small EV extraction from patients
To compare lncRNA expression between serum and serum EV, blood samples were collected from patients, from which serum was isolated in 1.5 mL tubes and stored at −80°C until use. Serum EV was extracted using ExoQuick (System Biosciences, Mountain View, CA, USA). The details of the modified protocol were described in our previous study [12].

RNA extraction and quantitative reverse transcription PCR
Serum RNA was extracted using the TRIzol-LS reagent (Invitrogen), while RNA from small EVs was isolated from serum using the SeraMir™ Exosome RNA Amplification kit (System Biosciences), according to the manufacturer's instructions [12]. Next, serum RNA (500 ng) was reverse-transcribed into cDNA using the PrimeScript™ RT Master mix (TaKaRa Bio, Otsu, Japan), whereas small EV-derived RNA (50 ng) was reverse-transcribed using the miScript II RT kit (QIAGEN, Hilden, Germany). cDNA was then used as the template for quantitative reverse transcription PCR (qRT-PCR) with the amfiSure qGreen Q-PCR Master Mix (GenDEPOT), which was performed on the ABI 7300 Real Time PCR System (Applied Biosystems™, Foster City, CA, USA). Primer information is summarized in Table S1. The 2^−ΔΔCt method was used to determine the expression of each lncRNA relative to the internal control gene, HMBS. The detailed analysis methodology was described in a previous study [12].
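As a worked illustration of the relative quantification step, the following minimal Python sketch computes a 2^−ΔΔCt fold change. The function name, the Ct values, and the use of a single calibrator sample are hypothetical assumptions chosen for illustration; they are not taken from this study's analysis pipeline.

```python
def fold_change(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Relative expression by the 2^-ddCt method.

    ct_target / ct_reference: Ct of the lncRNA and of the internal control
    (HMBS in this study) in the sample of interest.
    ct_target_cal / ct_reference_cal: the same two Ct values in the
    calibrator sample (hypothetical here, e.g. a healthy control).
    """
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = ct_target_cal - ct_reference_cal
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target amplifies 3 cycles earlier (relative to
# the reference gene) in the sample than in the calibrator.
example = fold_change(ct_target=24.0, ct_reference=22.0,
                      ct_target_cal=27.0, ct_reference_cal=22.0)
print(example)  # 8.0
```

A fold change of 1 means no difference from the calibrator; each cycle of difference in ΔΔCt corresponds to a twofold change, assuming roughly 100% amplification efficiency.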

EV uptake to recipient cells
The cell culture media were collected and centrifuged at 3000 g for 15 min to remove cells and cell debris. The supernatant was then mixed with the appropriate volume of ExoQuick and incubated at 4°C overnight. The mixture was centrifuged at 1500 g for 30 min, and the supernatant was removed by aspiration. EV pellets were resuspended in 50 µL of PBS. The size of small EVs was measured by nanoparticle tracking analysis (NTA). To examine the uptake of EVs into recipient MIHA cells, MIHA cells were incubated with PBS or SNU449-derived EVs (10 µg·mL⁻¹) for 4 h at 37°C. The cells were then washed with PBS, fixed in 4% paraformaldehyde, permeabilized in 0.25% Triton X-100, and incubated with primary and secondary antibodies. For endocytosis inhibition, MIHA cells were treated with the endocytosis inhibitor [...] 2 (Sigma, St. Louis, MO, USA) or DMSO (0.1%) for 30 min at 37°C prior to the addition of SNU449-EVs. After incubation, total RNA was isolated from the cells, followed by cDNA synthesis. qRT-PCR was performed to investigate the expression levels of lncRNAs as described above.

Statistical analysis
All statistical analyses were conducted with IBM SPSS software version 22.0 (SPSS Inc., Chicago, IL, USA) and GRAPHPAD PRISM version 7.01 (GraphPad Software, San Diego, CA, USA). A P value < 0.05 was considered statistically significant. A two-sided chi-square test was used for intergroup comparisons of categorical parameters, and an independent-samples t-test or Welch's t-test was applied for continuous variables. Receiver operating characteristic (ROC) curves were constructed to determine the areas under the curve (AUCs) with 95% confidence intervals (CIs) in each comparison group. Kaplan-Meier survival curves with the log-rank test were used to compare overall survival between two patient groups.
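To make the AUC concrete, the sketch below computes it as the probability that a randomly chosen HCC sample has a higher marker level than a randomly chosen non-HCC sample (the Mann-Whitney interpretation), and attaches an approximate 95% CI using the Hanley-McNeil standard error. The analyses in this study were run in SPSS and GraphPad Prism; the function names, the choice of the Hanley-McNeil approximation, and the marker values here are assumptions for illustration only.

```python
import math


def roc_auc(cases, controls):
    """AUC = P(case value > control value), counting ties as 0.5."""
    wins = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def hanley_mcneil_ci(auc, n_cases, n_controls, z=1.96):
    """Approximate CI for an AUC (Hanley-McNeil standard error)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (auc * (1.0 - auc)
                + (n_cases - 1) * (q1 - auc ** 2)
                + (n_controls - 1) * (q2 - auc ** 2)) / (n_cases * n_controls)
    se = math.sqrt(variance)
    # Clip to the valid AUC range [0, 1].
    return max(0.0, auc - z * se), min(1.0, auc + z * se)


# Hypothetical relative expression values (fold change), not study data.
hcc = [9.1, 7.8, 6.0, 12.4, 5.2]
non_hcc = [2.0, 3.5, 6.0, 1.2]

auc = roc_auc(hcc, non_hcc)
low, high = hanley_mcneil_ci(auc, len(hcc), len(non_hcc))
print(round(auc, 3), round(low, 3), round(high, 3))
```

With so few hypothetical samples the interval is wide; the CI narrows as the number of cases and controls grows.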
Next, we analyzed the gene expression of these six candidate lncRNAs based on liver disease status in the Catholic_LIHC dataset. All six lncRNAs were differentially expressed in the five stages of liver disease (Fig. 2A). In addition, the expression of these six candidate lncRNAs in HCC and nontumor tissues in other GEO RNA-sequencing datasets (GSE77314, GSE94660, and GSE124535) confirmed the higher expression of most candidate lncRNAs in tumor tissues, save for DLEU2 and NEAT1 in GSE77314 and NEAT1 in GSE94660 (Fig. 2B).

Measurement of candidate lncRNA expression in serum and serum-derived EVs in the test cohort
To confirm the usefulness of the candidate lncRNAs as liquid biopsy biomarkers for HCC, we quantified their levels in the serum and serum-derived EVs of seven NL and nine HCC subjects. EVs were characterized by transmission electron microscopy (TEM; Fig. 3A), NTA, and immunoblotting for EV markers and ER markers (Fig. 3B and Fig. S1A). qRT-PCR analysis revealed no significant differences in the levels of serum-derived lncRNAs between the two groups (Fig. 3C), whereas the levels of serum EV-derived lncRNAs (except for TUG1) were significantly higher in subjects with HCC than in NL (Fig. 3D). Further evaluation of the six serum EV-derived lncRNAs in the test cohort (n = 44), consisting of NL (n = 8), CH (n = 7), LC (n = 10), mUICC I/II HCC (n = 9), and mUICC III/IV HCC (n = 10), identified four lncRNAs that discriminated significantly between HCC and non-HCC samples: serum EV-derived DLEU2 (EV-DLEU2), HOTTIP (EV-HOTTIP), MALAT1 (EV-MALAT1), and SNHG1 (EV-SNHG1).
To confirm that the four lncRNAs originated from HCC, we investigated whether normal liver cells are capable of taking up HCC-derived EVs. First, we compared the expression of the four lncRNAs in MIHA-EVs and SNU449-EVs. The expression of all four lncRNAs was higher in SNU449-EVs than in MIHA-EVs (Fig. S1B). After MIHA cells were treated with SNU449-EVs, the resulting images clearly showed overlapping green (CD8; EV marker), red (EEA1; endosome marker), and blue (DAPI) fluorescence, indicating successful delivery of EVs to MIHA cells (Fig. S1C). Moreover, the relative expression levels of DLEU2, HOTTIP, MALAT1, and SNHG1 were significantly higher in EV-treated MIHA cells than in PBS-treated MIHA cells, and this increase was significantly reduced by the endocytosis inhibitor (Fig. S1D). These results suggest that the four candidate lncRNAs originated from EVs and may participate in crosstalk between HCC and normal liver cells. Hence, these were the four candidates ultimately selected for further validation (Fig. 3E).

Validation of the four serum EV-lncRNAs for HCC diagnosis
We then analyzed the diagnostic performance of EV-DLEU2, EV-HOTTIP, EV-MALAT1, and EV-SNHG1 for HCC in the validation cohort, comprising 72 subjects with HCC and 67 subjects without HCC. Table 1 summarizes the clinical baseline characteristics of the validation cohort. The proportions of subjects with mUICC stage I, II, III, IVA, and IVB HCC were 39%, 13%, 28%, 14%, and 7%, respectively. The expression levels of the four EV-derived lncRNAs were significantly higher in patients with all mUICC stages compared with NL, CH, and LC subjects (Fig. 4A). Meanwhile, their expression levels did not differ significantly among mUICC or BCLC stages or according to vascular invasion (Fig. 4A and Fig. S2). ROC curve analysis further showed the high discriminatory abilities of the four EV-derived lncRNAs for the diagnosis of HCC, with EV-MALAT1 emerging as the best (AUC = 0.908, 95% CI = 0.86-0.96; Fig. 4B). At the optimal cutoff values (5.5318-fold for EV-DLEU2, 9.2165-fold for EV-HOTTIP, 7.3752-fold for EV-MALAT1, and 7.7427-fold for EV-SNHG1), EV-MALAT1 also exhibited the highest sensitivity (92.1%) and negative predictive value (92.5%), while EV-SNHG1 showed the best specificity (85.2%) and positive predictive value (87.5%; Table 2).
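The four summary measures reported at each cutoff follow directly from a 2 × 2 table of marker status against diagnosis. The sketch below shows the arithmetic in Python; the sample values are invented, and treating a value equal to the cutoff as positive (`>=`) is an assumption, since the text does not state how ties at the cutoff were handled.

```python
def diagnostic_metrics(values, is_hcc, cutoff):
    """Sensitivity, specificity, PPV, and NPV for a single marker cutoff.

    values: relative expression (fold change) per subject.
    is_hcc: True for HCC, False for non-HCC, in the same order.
    A subject is called marker-positive when value >= cutoff (assumed rule).
    """
    tp = fp = tn = fn = 0
    for value, diseased in zip(values, is_hcc):
        positive = value >= cutoff
        if positive and diseased:
            tp += 1
        elif positive and not diseased:
            fp += 1
        elif not positive and diseased:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical fold-change values evaluated at the EV-MALAT1 cutoff from the text.
values = [12.0, 8.1, 7.0, 30.5, 2.2, 7.9, 1.4, 3.3]
is_hcc = [True, True, True, True, False, False, False, False]
print(diagnostic_metrics(values, is_hcc, cutoff=7.3752))
```

Note that PPV and NPV depend on the proportion of HCC cases in the cohort, so values obtained in a case-control design do not transfer directly to a surveillance population with lower prevalence.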
Considering the uneven age distribution between the groups, we also assessed whether EV-lncRNA expression varied with patient age. EV-lncRNA levels were comparable between age groups, suggesting that the observed differences in EV-lncRNA levels reflected HCC status rather than age (Fig. S3). To assess EV-DLEU2, EV-HOTTIP, EV-MALAT1, and EV-SNHG1 as diagnostic biomarkers for very early HCC, we compared the diagnostic abilities of the EV-derived lncRNAs for mUICC stage I HCC with that of AFP. EV-DLEU2, EV-HOTTIP, EV-MALAT1, and EV-SNHG1 performed significantly better than AFP in detecting very early HCC (Fig. 5A,B). Specifically, AUCs for mUICC stage I HCC (vs. NL, CH, and LC) were 0.508, 0.885, 0.878, 0.92, and 0.904 for AFP, EV-DLEU2, EV-HOTTIP, EV-MALAT1, and EV-SNHG1, respectively (Table 2). Additionally, EV-derived lncRNAs showed a higher negative rate in the CH and LC groups and a higher positive rate in the HCC group than AFP (Fig. 5C). Furthermore, at the optimal cutoff values (5.5318-fold for EV-DLEU2, 9.2165-fold for EV-HOTTIP, 7.3752-fold for EV-MALAT1, and 7.7427-fold for EV-SNHG1), EV-DLEU2, EV-HOTTIP, EV-MALAT1, and EV-SNHG1 showed comparable positivity in AFP-positive HCC (> 20 ng·mL⁻¹) and AFP-negative HCC (≤ 20 ng·mL⁻¹). In the subgroup analysis of mUICC stage I or II, EV-derived lncRNAs had a higher positive rate in AFP-negative HCC than in AFP-positive HCC (Fig. 5D).

Performance of biomarker panels comprising different combinations of serum AFP, EV-DLEU2, EV-HOTTIP, EV-MALAT1, and EV-SNHG1 for the diagnosis of HCC and very early HCC
Next, we examined several combinations of AFP and EV-lncRNAs to identify the optimal biomarker panel for diagnosing HCC (Fig. 6A,B; Table S2). In terms of all-stage HCC diagnosis, the combination of AFP and EV-MALAT1 showed the best AUC (0.911, 95% [...] highest positivity (95% for each combination) for mUICC stage I or II HCC. In fact, even in very early HCCs with low AFP levels (≤ 20 ng·mL⁻¹), panels with EV-lncRNAs showed high positivity (88-96%; Fig. 6C).
Fig. 3 (partial caption). [...] EVs of healthy controls (n = 7) and subjects with HCC (n = 9), and (E) differential gene expression of six serum EV-lncRNAs according to liver disease status in the test cohort (n = 44). Black horizontal lines indicate sample means. Target gene expression was calculated relative to that of HMBS (Welch's t-test; *P < 0.05, **P < 0.01, and ***P < 0.001).

Prognostic implication of EV-lncRNA expression
Finally, we examined the prognostic impact of EV-lncRNA expression in the validation cohort. None of the candidate EV-lncRNAs was significantly associated with overall survival among the 72 HCC patients (Fig. S4).
Meanwhile, in patients with mUICC stage I/II HCC, the high expression of EV-MALAT1 (≥ 27.732-fold) was significantly associated with poor overall survival (log-rank P = 0.009; Fig. 7). However, in patients with mUICC stage III/IV HCC, no single EV-lncRNA was related to overall survival.
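For readers unfamiliar with how such survival curves are built, the sketch below implements the Kaplan-Meier product-limit estimator for one patient group in plain Python. In practice each group (for example, patients above and below an expression cutoff) would get its own curve and the curves would be compared with the log-rank test, which is not implemented here. The function name and the follow-up data are hypothetical and are not taken from the study cohort.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate for one group.

    times: follow-up time per patient (any consistent unit, e.g. months).
    events: True if the patient died at that time, False if censored.
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == current_time:
            if data[i][1]:
                deaths += 1
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((current_time, survival))
        # Both deaths and censored patients leave the risk set afterwards.
        at_risk -= leaving
    return curve


# Hypothetical follow-up (months) for five patients; False marks censoring.
months = [5, 8, 8, 12, 20]
died = [True, True, False, True, False]
for time_point, prob in kaplan_meier(months, died):
    print(time_point, round(prob, 3))
```

Censored patients contribute to the risk set up to their last follow-up but never lower the curve themselves, which is why the estimate drops only at observed death times.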

Discussion
This study compared the lncRNA expression between HCC and non-HCC tissues using two RNA-sequencing datasets, resulting in the identification of six lncRNAs (DLEU2, HOTTIP, MALAT1, NEAT1, [...]
MALAT1 is a representative onco-lncRNA that is upregulated in many solid carcinomas [13]. It stimulates tumor growth, metastasis, and chromosomal instability through multiple mechanisms in different tissues, including modulating alternative splicing of oncogenic mRNAs, attaching to active regions of the chromosome, and recruiting serine/arginine-rich family proteins [14-16]. Although several studies have implicated it in the prognosis and development of HCC [17-19], few studies have reported the value of circulating MALAT1 for HCC diagnosis. A recent study evaluating eight serum lncRNAs including serum MALAT1 reported an acceptable AUC (0.733, 95% CI = 0.676-0.790) with 59.7% sensitivity and 75.7% specificity in a model of HCC vs. LC, CHB, and healthy controls; however, this was lower than the AUC for AFP (0.811) [20]. Similarly, another study measuring plasma MALAT1 reported an AUC of 0.66 with 51.1% sensitivity and 89.3% specificity, which was also lower than that of AFP (sensitivity 73.3%, specificity 75%, and AUC 0.7) [21]. Meanwhile, Yuan et al. [22] quantified the levels of 10 lncRNAs, including MALAT1, in the plasma of 100 subjects with HCC, 100 subjects with CH, and 100 healthy controls, and reported no significant difference in plasma MALAT1 levels among the three groups. Therefore, to the best of our knowledge, the current study is the first to report the excellent diagnostic value of EV-derived MALAT1 in HCC, with an AUC of 0.908, sensitivity of 92.1%, and specificity of 81.6% in an HCC vs. NL/CH/LC model.
DLEU2 was also overexpressed in HCC tissues, particularly in large tumors with vascular invasion and in advanced-stage disease [23]. Recently, HBx was reported to bind the promoter region of the DLEU2 lncRNA, thereby enhancing DLEU2 transcription and inducing the accumulation of DLEU2 RNA. Moreover, the interaction of DLEU2 with HBx and the enhancer of zeste homolog 2/polycomb repressor complex 2 complex leads to sustained covalently closed circular DNA and host HCC-related gene transcription [24]. However, reports on circulating DLEU2 as a biomarker of HCC are limited. Here, we report its discriminant capacity, highlighting its potential for use as a biomarker of very early-stage HCC.
Increased expression of HOTTIP in HCC tissue compared to nontumor counterparts is reportedly associated with HCC progression and disease outcomes [25]. Additionally, HOTTIP may promote HCC carcinogenesis by targeting miR-125b [26], whereas miR-192 and miR-204 were suggested as upstream regulators that suppress HOTTIP expression in HCC [27]. Recently, a study showed significantly higher expression of HOTTIP in patients with advanced-stage HCC, old age, male gender, white race, and no cirrhosis. Moreover, HOTTIP was co-expressed with genes related to the PPAR signaling pathway [28], which plays a critical role in the pathogenesis of HCC. Studies have also reported HOTTIP as a diagnostic or prognostic biomarker in gastric and colorectal cancers [29,30]; however, to date, only one study has reported the discriminatory role of circulating HOTTIP in liver diseases. That study showed an increased expression of HOTTIP in resolved HBV patients compared to healthy controls [31].
SNHG1 is also upregulated in HCC [32] and contributes to the development and progression of HCC via direct inhibition of miR-195 expression [33]. SNHG1 also has important roles in HCC cell cycle, migration, and epithelial-mesenchymal transition by epigenetic silencing of cyclin-dependent kinase inhibitor (CKI) 1A and CKI2B [34]. Upregulated SNHG1 reduces sorafenib-induced apoptosis and autophagy in sorafenib-resistant HCC cells by triggering the Akt pathway through the solute carrier family 3 member 2 [35]. It has also been reported that plasma SNHG1 is elevated in HCC subjects and correlates with tissue SNHG1 expression. The AUC was 0.86 (95% CI = 0.78-0.91) for HCC discrimination (HCC vs. CHB and LC) in combination with AFP [32].
Collectively, this is the first study to show the diagnostic role of circulating small EV-derived MALAT1, DLEU2, HOTTIP, and SNHG1, which are reported onco-lncRNAs in HCC. Interestingly, serum lncRNA levels did not differ significantly between subjects with HCC and those without, whereas serum EV-derived lncRNA levels were significantly higher in subjects with HCC.
The proportion of EV-encapsulated vs. free noncoding RNAs found in liquid biopsies is currently a controversial issue, as researchers have shown that most microRNAs are integrated into ribonucleoprotein complexes, while only a small proportion are enclosed in EVs [36]. Meanwhile, others have demonstrated consistently higher concentrations of microRNAs in exosomes compared to that in free form within serum and saliva [37]. Our previous study also reported a higher level of small EV-derived lncRNA than free serum lncRNA [12]. Therefore, both free and EV-enclosed lncRNAs should be considered in an oncogenic liquid biopsy-based biomarker study.
Certain limitations were noted in this study. First, the detected EV-derived lncRNAs may have originated from cancer cells or from other cells, such as platelets or leukocytes. Indeed, the selective detection of tumor-derived EVs from blood using specific surface markers prior to the analysis of lncRNAs has been a challenging issue in the research community. Second, the number of enrolled patients was relatively small, while the major etiology of HCC was limited to HBV (88.9%). Therefore, our results should be confirmed with a large number of external cohorts based on various etiologies.
Fig. 5 (partial caption). [...] I or II (middle panel), and HCC with mUICC stage I (right panel) from (A) nontumor subjects, and (B) patients at high risk of developing HCC (CH and LC) (two-tailed t-test; compared to AUC of AFP, ***P < 0.001). Bar chart showing a positive rate for AFP, DLEU2, HOTTIP, MALAT1, and SNHG1 in (C) patients with various liver disease status, and (D) patients with HCC, mUICC I/II, and mUICC I. The cutoff for positivity was defined as 5.5318-fold for EV-DLEU2, 9.2165-fold for EV-HOTTIP, 7.3752-fold for EV-MALAT1, 7.7427-fold for EV-SNHG1, and 20 ng·mL⁻¹ for AFP level.

Conclusions
Serum small EV-derived MALAT1, DLEU2, HOTTIP, and SNHG1 were expressed at significantly higher levels in patients with HCC and showed good to excellent discriminating capacity that was significantly higher than that of the traditional tumor marker AFP. Furthermore, even in very early HCC with low AFP levels (≤ 20 ng·mL⁻¹), EV-derived MALAT1, DLEU2, HOTTIP, and SNHG1 showed high positivity, suggesting the utility of EV-lncRNAs as diagnostic liquid biopsy biomarkers for very early HCC, particularly in patients without AFP elevation.
Fig. 6. Combinations of AFP with four serum EV-lncRNAs for the diagnosis of HCC and early-stage HCC in the validation cohort. AUCs for the combination of two markers for diagnosing all-stage HCC (left), HCC with mUICC stage I or II (middle), and HCC with mUICC stage I (right) versus (A) all controls (normal, chronic hepatitis, and liver cirrhosis), and (B) patients at high risk of developing HCC (chronic hepatitis and liver cirrhosis) (two-tailed t-test; compared to AUC of AFP, all comparison values were determined as P < 0.0001). (C) Bar chart showing a positive rate for the combination of two markers by AFP status in subjects with all-stage HCC (top), mUICC I/II (middle), and mUICC I (bottom).

Supporting information
Additional supporting information may be found online in the Supporting Information section at the end of the article.
Fig. S1. Communication between HCC and normal liver cells through EVs.
Fig. S2. Differential gene expression of final four serum EV-lncRNAs.
Fig. S3. Age-related EV-derived lncRNA expression in the validation cohort in all patients.
Fig. S4. Prognostic power of four serum EV-lncRNA expression in the validation cohort.
Table S1. Primer sequences in this study.
Table S2. AUROCs of combination of two markers for diagnosing HCC.