Human bile contains MicroRNA-laden extracellular vesicles that can be used for cholangiocarcinoma diagnosis

Errata

This article is corrected by:

  1. Errata: Correction Volume 60, Issue 6, 2135, Article first published online: 24 November 2014

  • Potential conflict of interest: Dr. Kaloo owns stock and has intellectual property rights in Apollo. He consults for Checkmed and Pentax. Dr. Saxena consults for and received grants from Boston Scientific and also received grants from Cook Medical. Dr. Geschwind consults for and received grants from Biocompatibles/BTG, Bayer, Guerbet, Nordion/BTG, and Philips. He consults for Jennerex and received grants from Threshold. He is the founder and CEO of PreScience Labs, LLC. Dr. Thuluvath advises, is on the speakers' bureau for, and received grants from Vertex and Gilead. He advises Janssen and is on the speakers' bureau for Onyx. He received grants from Boehringer Ingelheim, Novartis, Bristol-Myers Squibb, Eisai, and Salix.

  • This study was supported by a K08 Award (DK090154-01) from the National Institutes of Health (NIH; to F.M.S.) and by an Early Detection Research Network (EDRN) Associate Membership supported by a U01 Award (CA086402) from the NIH. Dr. Meltzer is an American Cancer Society Clinical Research Professor.

  • See Editorial on Page 782

Abstract

Cholangiocarcinoma (CCA) presents significant diagnostic challenges, resulting in late patient diagnosis and poor survival rates. Primary sclerosing cholangitis (PSC) patients pose a particularly difficult clinical dilemma because they harbor chronic biliary strictures that are difficult to distinguish from CCA. MicroRNAs (miRs) have recently emerged as a valuable class of diagnostic markers; however, thus far, neither extracellular vesicles (EVs) nor miRs within EVs have been investigated in human bile. We aimed to comprehensively characterize human biliary EVs, including their miR content. We have established the presence of extracellular vesicles in human bile. In addition, we have demonstrated that human biliary EVs contain abundant miR species, which are stable and therefore amenable to the development of disease marker panels. Furthermore, we have characterized the protein content, size, numbers, and size distribution of human biliary EVs. Utilizing multivariate organization of combinatorial alterations (MOCA), we defined a novel biliary vesicle miR-based panel for CCA diagnosis that demonstrated a sensitivity of 67% and specificity of 96%. Importantly, our control group contained 13 PSC patients, 16 with biliary obstruction of varying etiologies (including benign biliary stricture, papillary stenosis, choledocholithiasis, extrinsic compression from pancreatic cysts, and cholangitis), and 3 with bile leak syndromes. Clinically, these types of patients present with a biliary obstructive clinical picture that could be confused with CCA. Conclusion: These findings establish the importance of using extracellular vesicles, rather than whole bile, for developing miR-based disease markers in bile. Finally, we report on the development of a novel bile-based CCA diagnostic panel that is stable, reproducible, and has potential clinical utility. (Hepatology 2014;60:896–907)

Abbreviations
CA19-9: carbohydrate antigen 19-9
CCA: cholangiocarcinoma
CT: computed tomography
Ct: cycle passing threshold
CTRL: control
dCCA: distal cholangiocarcinoma
ERCP: endoscopic retrograde cholangiopancreatography
EV: extracellular vesicle
EUS: endoscopic ultrasound
HCC: hepatocellular carcinoma
iCCA: intrahepatic cholangiocarcinoma
IR: interventional radiology
MRI: magnetic resonance imaging
miR: microRNA
MOCA: multivariate organization of combinatorial alterations
NTA: nanoparticle tracking analysis
pCCA: perihilar cholangiocarcinoma
PSC: primary sclerosing cholangitis
qRT-PCR: quantitative reverse-transcription polymerase chain reaction
RF: random forests
SVMs: support vector machines
TNM: tumor node metastasis
TSG101: tumor susceptibility gene 101

Cholangiocarcinoma (CCA) is a cancer that arises in the biliary tree.[1] Anatomically, CCA is divided into intrahepatic (iCCA), perihilar (pCCA), and distal (dCCA) tumors.[2] Surgery is the only curative option.[3] Unfortunately, because of the nonspecific nature of symptoms, as well as the failure of currently available tests, patients are usually diagnosed late in disease progression, when they are no longer surgical candidates.[3] All three types of CCA present diagnostic dilemmas. For example, the diagnosis of iCCA is, in part, based on lack of liver cirrhosis and absence of any other known primary solid tumors.[2] However, iCCA can also develop in cirrhotic livers, and a small iCCA arising in a cirrhotic liver may mimic hepatocellular carcinoma (HCC) in terms of its rapid uptake of contrast material.[4] Diagnosing pCCA is equally difficult, despite a variety of available diagnostic tools, including magnetic resonance imaging (MRI), computed tomography (CT), endoscopic retrograde cholangiopancreatography (ERCP), cholangioscopy, and endoscopic ultrasound (EUS). pCCA tends to display a strong desmoplastic reaction, which poses a significant diagnostic challenge, because obtaining cells from these lesions for cytologic examination is exceedingly difficult.[3] Therefore, the sensitivity of cytology performed on brush biopsy specimens is, at best, only 20%.[2] Similar to pCCA, cancers located in the distal bile duct (dCCA) display low cellularity and a strong desmoplastic reaction, rendering cytologic diagnosis very difficult.

Multiple recent studies have sought to develop more-precise markers of CCA. One such approach aimed at diagnosing CCA is based on serum proteomics.[5] Additional studies have focused on RNA expression profiles in biliary brushings[6] or on microRNA (miR) profiles in whole human bile.[7] However, there are several limitations in interpreting the results from these studies. First, these studies tended to include small numbers of patients. In addition, standardization of specimen collection, specimen manipulation, and marker derivation have received inadequate attention. For example, in previous bile-based studies, there has been scant information regarding standardization of bile processing to ensure reproducible and reliable results. The issue of unreliable and/or conflicting results is paramount in cancer marker development.[8] Therefore, it is not surprising that studies published to date present contradictory information regarding specific miR-based markers of cancer.[9] In addition to which methodologies are best for body fluid collection, storage, and processing, there are several unanswered questions, including the appropriate reference gene(s) for normalization. Serum/plasma studies thus far have employed a variety of reference genes, including miR-16, miR-142-3p, let-7a, and small RNA U6. Whereas it is difficult to predict which of these RNAs serves as the best normalizer, it is apparent that some are worse than others. U6, in particular, which is approximately 4 times longer than any miR, should be avoided in miR-based marker panels in biologic fluids because it is less stable than miR species in body fluids and displays a different dynamic of degradation.[10, 11] Unfortunately, the only previous bile-based miR panel for CCA diagnosis employed U6 as a normalizer.[7] Last, given the complex makeup of biologic fluids, it is naïve to hypothesize that a single RNA exhibits constant expression across various physiologic and pathologic states. 
Recent evidence suggests that in order to circumvent the need for an internal control, synthetic miR sequences can be spiked into biologic fluids before RNA extraction.[12]

We hypothesized that because CCAs are in direct contact with bile, an accurate tumor-derived miR profile is more likely to exist in bile than in serum. In the current study, we present analyses of human bile geared toward developing a reliable, reproducible miR-based CCA diagnostic panel. We investigated the source of miRs in human bile, the stability of miR profiles in human bile, the best bile-processing procedures, and the most stable miR panel for diagnosing CCA from human bile.

Materials and Methods

Bile Samples

Bile samples from CCA and control (CTRL) patients were obtained from the Johns Hopkins Hospital (Baltimore, MD) under an institutional review board (IRB)-approved protocol. Bile samples were obtained at ERCP or at the time of percutaneous manipulation of biliary tubes by interventional radiology (IR). Aspiration of bile was performed after cannulation of the biliary tree before injection of contrast. CCA diagnosis was established based on pathologic and radiologic evidence. Table 1A contains demographic and clinical patient information. Table 1B contains the tumor node metastasis (TNM) stage for CCA patients who had carbohydrate antigen 19-9 (CA19-9) measured. Numerous patients with benign biliary obstruction were included in the control group. These patients were included because their clinical presentation is indistinguishable from that of CCA. Thus, our marker panel was envisioned as particularly useful in this setting. Supporting Table 1 provides detailed information regarding control patients. For example, there were 13 patients with primary sclerosing cholangitis (PSC). These patients were followed for 5 years after bile specimens were collected to ensure that they were not already harboring early undiagnosed CCA. In addition, we included 15 patients with benign biliary tree obstruction and 3 with benign bile leaks.

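Table 1A. Clinical and Epidemiologic Information

Identifier | Gender | Race | Age | CA19-9 | Source | Disease | TNM
CCA1 | M | C | 57 | 147.9 | Endoscopy | pCCA | T4N1M0
CCA2 | M | C | 50 | 70.2 | IR | iCCA | T3N0M0
CCA3 | F | C | 55 | 38.1 | IR | pCCA | T4N0M0
CCA4 | F | AA | 32 | NA | Endoscopy | PSC-iCCA | T4N1M1
CCA5 | F | A | 60 | 3,330.2 | Endoscopy | pCCA | T3N0M0
CCA6 | M | C | 85 | 360.2 | Endoscopy | dCCA | T3N1M0
CCA7 | M | C | 45 | 165.6 | Endoscopy | iCCA | T4N1M1
CCA8 | F | AA | 68 | NA | IR | iCCA | T4N0M1
CCA9 | M | C | 73 | 1,969.1 | Endoscopy | pCCA | T3N1M0
CCA10 | F | C | 69 | 481.6 | IR | pCCA | T3N0M0
CCA11 | M | C | 65 | 38.1 | IR | dCCA | T3N0M0
CCA12 | M | H | 64 | 4,872.7 | IR | pCCA | T2N0M0
CCA13 | F | C | 50 | 108.7 | IR | iCCA | T4N0M0
CCA14 | F | C | 75 | 12,352.2 | IR | pCCA | T4N0M1
CCA15 | F | C | 68 | 571.8 | IR | iCCA | T4N0M1
CCA16 | M | C | 51 | 10,692.7 | IR | pCCA | T4N0M0
CCA17 | M | C | 69 | 36.5 | Endoscopy | dCCA | T4N1M0
CCA18 | F | C | 72 | 3,610.5 | IR | iCCA | T4N1M0
CCA19 | F | C | 67 | 81 | IR | iCCA | T1N0M0
CCA20 | F | C | 69 | 1 | Endoscopy | pCCA | T3N1M0
CCA21 | F | C | 69 | 371.7 | Endoscopy | pCCA | T4N2M1
CCA22 | F | C | 57 | 167 | Endoscopy | pCCA | T2N0M0
CCA23 | F | C | 67 | 81 | IR | iCCA | T1N0M0
CCA24 | F | C | 47 | 496.2 | Endoscopy | iCCA | T4N2M0
CCA25 | M | AA | 63 | 0.1 | IR | pCCA | T2N0M0
CCA26 | F | C | 61 | 30.9 | IR | pCCA | T3N0M0
CCA27 | M | C | 70 | NA | IR | iCCA | T4N0M1
CCA28 | F | C | 39 | NA | Endoscopy | PSC-pCCA | T3N0M0
CCA29 | F | A | 68 | 73.7 | Endoscopy | pCCA | T4N0M0
CCA30 | F | C | 55 | 74.3 | IR | pCCA | T4N0M0
CCA31 | F | H | 56 | 15.5 | Endoscopy | pCCA | T2N0M0
CCA32 | F | C | 67 | 58.6 | Endoscopy | iCCA | T3N1M0
CCA33 | M | AA | 63 | <1.0 | IR | pCCA | T2N0M0
CCA34 | M | C | 76 | NA | IR | pCCA | T1N0M0
CCA35 | M | C | 63 | NA | IR | pCCA | T4N2M0
CCA36 | F | C | 71 | 325.4 | Endoscopy | pCCA | T4N0M1
CCA37 | M | C | 68 | 2,119.9 | Endoscopy | iCCA | T2N2M0
CCA38 | F | AA | 75 | 77.8 | Endoscopy | PSC-dCCA | T4N0M0
CCA39 | F | C | 71 | 1,315.4 | Endoscopy | iCCA | T4N1M1
CCA40 | F | C | 69 | 574.6 | Endoscopy | PSC-pCCA | T4N2M0
CCA41 | M | C | 50 | 150 | Endoscopy | pCCA | T2N0M0
CCA42 | F | AA | 44 | <1 | Endoscopy | pCCA | T3N0M1
CCA43 | F | C | 77 | NA | Endoscopy | pCCA | T2N0M0
CCA44 | M | C | 56 | 114.4 | Endoscopy | pCCA | T4N2M0
CCA45 | M | C | 46 | 97.9 | IR | dCCA | T2N0M1
CCA46 | F | H | 43 | 109.3 | Endoscopy | pCCA | T4N0M0
CTRL1 | M | H | 79 | NA | Endoscopy | Biliary obstruction |
CTRL2 | M | C | 54 | NA | Endoscopy | Stent removal |
CTRL3 | F | AA | 68 | NA | Endoscopy | Biliary obstruction |
CTRL4 | F | C | 55 | NA | Endoscopy | SOD |
CTRL5 | F | C | 65 | 309.5 | Endoscopy | Biliary obstruction |
CTRL6 | M | C | 61 | NA | Endoscopy | Cirrhosis |
CTRL7 | M | C | 59 | NA | Endoscopy | Bile leak |
CTRL8 | M | C | 66 | 37.8 | Endoscopy | Biliary obstruction |
CTRL9 | F | A | 82 | NA | Endoscopy | Biliary obstruction |
CTRL10 | F | C | 63 | 20.7 | Endoscopy | CP |
CTRL11 | F | C | 50 | NA | Endoscopy | SOD |
CTRL12 | F | C | 53 | NA | Endoscopy | Biliary obstruction |
CTRL13 | M | AA | 69 | NA | IR | Biliary obstruction |
CTRL14 | M | C | 61 | NA | IR | Bile leak |
CTRL15 | M | C | 56 | NA | IR | Biliary obstruction |
CTRL16 | F | AA | 41 | NA | Endoscopy | Bile leak |
CTRL17 | M | C | 60 | NA | Endoscopy | Biliary obstruction |
CTRL18 | M | C | 52 | NA | Endoscopy | Stent removal |
CTRL19 | F | C | 39 | 8.6 | Endoscopy | CP |
CTRL20 | F | C | 33 | 7 | Endoscopy | CP |
CTRL21 | F | AA | 49 | NA | Endoscopy | SOD |
CTRL22 | F | C | 47 | NA | Endoscopy | SOD |
CTRL23 | F | C | 55 | NA | Endoscopy | SOD |
CTRL24 | M | C | 27 | NA | IR | Cholangitis |
CTRL25 | F | C | 45 | NA | Endoscopy | SOD |
CTRL26 | F | C | 63 | NA | Endoscopy | SOD |
CTRL27 | F | C | 66 | NA | Endoscopy | SOD |
CTRL28 | M | C | 65 | <1.0 | Endoscopy | CBD stricture |
CTRL29 | F | AA | 42 | NA | Endoscopy | SOD |
CTRL30 | F | C | 49 | NA | Endoscopy | SOD |
CTRL31 | M | A | 30 | 4.8 | Endoscopy | PSC |
CTRL32 | M | C | 35 | 20.8 | Endoscopy | PSC |
CTRL33 | M | C | 53 | 257.1 | Endoscopy | PSC |
CTRL34 | F | C | 57 | 1 | Endoscopy | PSC |
CTRL35 | M | C | 39 | 7.9 | Endoscopy | PSC |
CTRL36 | F | C | 22 | 8.4 | Endoscopy | PSC |
CTRL37 | M | C | 66 | 25.2 | Endoscopy | PSC |
CTRL38 | F | AA | 41 | 188.4 | Endoscopy | PSC |
CTRL39 | M | A | 42 | NA | Endoscopy | PSC |
CTRL40 | M | C | 68 | 230.6 | Endoscopy | CBD stricture |
CTRL41 | F | C | 58 | <1 | Endoscopy | CP |
CTRL42 | M | A | 57 | NA | Endoscopy | Biliary obstruction |
CTRL43 | F | C | 39 | NA | IR | CBD stricture |
CTRL44 | M | C | 74 | <1 | Endoscopy | Stent removal |
CTRL45 | F | C | 49 | NA | Endoscopy | PSC |
CTRL46 | M | A | 31 | 16.4 | Endoscopy | PSC |
CTRL47 | M | C | 31 | 35 | Endoscopy | PSC |
CTRL48 | M | A | 31 | 64.2 | Endoscopy | PSC |
CTRL49 | M | C | 73 | NA | Endoscopy | Cholangitis |
CTRL50 | F | AA | 56 | <1.0 | Endoscopy | Biliary obstruction |

  1. Gender, race, age, level of CA19-9 (where available), diagnosis, and TNM classification are presented.

  2. Abbreviations: M, male; F, female; C, Caucasian; AA, African American; A, Asian; H, Hispanic; NA, not available; PSC-CCA, PSC arising CCA; SOD, sphincter of Oddi dysfunction; CP, chronic pancreatitis; CBD, common bile duct.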

Bile Extracellular Vesicle Isolation

Initially, we experimented with EV isolation from 1 mL of fresh bile. Once the experimental procedures were carefully delineated, we started extracting EVs from 400 µL of bile. Bile samples were centrifuged at 300×g for 10 minutes at 4°C to pellet cells and debris. The supernatant was then centrifuged at 16,500×g for 20 minutes at 4°C to further remove cellular debris and then filtered through a 200-nm filter. Next, the supernatant was centrifuged at 120,000×g for 70 minutes at 4°C to pellet EVs.[13] EVs were utilized for immediate RNA extraction or resuspended in 50-150 µL of phosphate-buffered saline and stored at −80°C for future use.

Statistical Analyses

We used three computational packages to assess the predictive value of selected miR species for CCA diagnosis: random forests (RFs), support vector machines (SVMs), and our recently developed MOCA algorithm (see details below).[14, 15] For details, as well as for materials and methods, please see the Supporting Information (Methods; Supporting Figs. 1 and 2; Supporting Table 2).

Results

Delineation of the Source of miRs in Human Bile

The sole published report utilizing human bile for miR-based CCA diagnostic panels employed whole bile.[7] Based on RNA gel electrophoresis, as well as quantitative reverse-transcription polymerase chain reaction (qRT-PCR) values for a well-expressed miR species (miR-21), we determined that free-floating cells in bile contribute to the RNA extracted from bile (Supporting Fig. 3A). Unfortunately, the quantity and quality of the RNA contributed by free-floating cells likely depend on the number of free-floating cells in a specific bile specimen and on the degree of cell viability, and are therefore unpredictable. In addition, we demonstrate that the RNA contributed by free-floating cells is rapidly degraded at room temperature (RT), as well as by a single freeze-thaw cycle (Supporting Fig. 3A,B). These data strongly argue against using whole bile for developing a miR-based disease marker panel.

Isolation and Characterization of Human Bile Extracellular Vesicles

Extracellular vesicle preparations from human bile were imaged using transmission electron microscopy (TEM). We noted the presence of 30-110 nm vesicles, consistent with previously reported features of EVs (Fig. 1A).[16] To further confirm that these spherical structures are EVs, we assayed for the presence of tumor susceptibility gene 101 (TSG101) and CD63, molecules frequently used as extracellular vesicle markers.[17] Western blotting confirmed that the biliary-derived extracellular vesicle preparations are rich in TSG101 and CD63 (Fig. 1B). To further define human biliary EVs, we employed multiparameter nanoparticle tracking analysis (NTA). First, we noted the presence of round vesicles displaying typical Brownian motion (Fig. 1C; Supporting Movie 1). Next, we found that the majority of EVs in human bile were between 30 and 110 nm and that the mode of EV sizes was 84 nm (Fig. 1D), suggesting that the isolated EVs are most likely exosomes. Based on NTA, we also determined that bile from CCA patients contained approximately 3 × 10^11 EVs/mL of bile, and bile from control patients contained approximately 2.5 × 10^10 EVs/mL of bile. To further substantiate the presence of EVs in preparations from human bile, we stained EVs with PKH67, as previously described.[18] Next, we added stained EVs or control extracts to cells in culture. Cells took up stained EVs from human bile, but not from the controls (Supporting Fig. 4; Supporting Movie 2).

Figure 1.

Human biliary EV characterization. (A) Typical transmission electron microscopy picture demonstrating the presence of 30-110 nm spherical structures in human bile. Further characterization demonstrates that these vesicles display EV characteristics. (B) Presence of typical EV proteins (TSG101 and CD63) in EV preparations from human bile. (C) The same 30-110 nm vesicles as visualized with NTA. (D) Mode of EVs isolated from human bile was determined to be approximately 84 nm (x-axis depicts EV size and y-axis depicts EV concentration for each size).

Presence and Isolation of miR Species From Biliary Extracellular Vesicles

EVs isolated from serum have been shown to contain miR species, but no studies to date have investigated human bile EVs.[19] We performed qRT-PCR miR arrays on EV RNA isolated from 1 CCA bile specimen (Fig. 2A). Following the manufacturer's recommendation, we used a cycle passing threshold (Ct) cutoff of 40 cycles to establish which miR species were present in quantities high enough to be detected. With this cutoff, we detected 137 miR species. Of these 137 species, 74 were amplified at a Ct value of 32 or less, which the manufacturer considers reliable amplification.

Figure 2.

miR species extraction from human bile EVs. (A) Amplification curves for miR species from EVs isolated from a bile specimen (x-axis, cycle number; y-axis, measured miR expression). (B) Large variability in measured Cel-miR-39 quantity after Cel-miR-39 was spiked at equal concentrations in a cohort of 60 bile specimens (x-axis, bile specimens; y-axis, measured qRT-PCR value for Cel-miR-39).

Identification of a Normalizer for miR Species Extracted From Biliary Extracellular Vesicles

Starting with the same initial volume of bile, utilization of such a synthetic miR will normalize for any variability in RNA extraction. To gain insight into the potential presence and/or magnitude of such biases, we spiked Cel-miR-39 into EVs extracted from 60 human bile specimens. Next, we extracted RNA from these specimens and performed qRT-PCR for Cel-miR-39. The measured quantity of Cel-miR-39 differed dramatically across the 60 samples (Fig. 2B). The comparison between the highest and the lowest measured quantity of Cel-miR-39 revealed a dynamic range of 65-fold. Based on these data, we concluded that normalizing for RNA extraction efficiency by spiking Cel-miR-39 into all specimens is mandatory.
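The effect of spike-in normalization can be illustrated with a minimal sketch. It assumes the conventional 2^−ΔCt relative quantification with ideal amplification efficiency (one cycle equals a 2-fold difference); the article does not state its exact formula, and the function names and Ct values below are hypothetical.

```python
import math

def relative_quantity(ct_target: float, ct_spike: float) -> float:
    """Relative expression of a target miR versus spiked-in Cel-miR-39.

    Assumes ideal PCR efficiency, so a one-cycle difference in Ct
    corresponds to a 2-fold difference in template quantity.
    """
    return 2.0 ** -(ct_target - ct_spike)

def fold_to_cycles(fold: float) -> float:
    """Convert a fold difference in measured quantity to a Ct difference."""
    return math.log2(fold)

# Hypothetical Ct values: two specimens with the same true miR-21 abundance,
# but specimen B lost RNA during extraction (both Cts shifted by 3 cycles).
specimen_a = {"miR-21": 24.0, "cel-miR-39": 20.0}
specimen_b = {"miR-21": 27.0, "cel-miR-39": 23.0}

rq_a = relative_quantity(specimen_a["miR-21"], specimen_a["cel-miR-39"])
rq_b = relative_quantity(specimen_b["miR-21"], specimen_b["cel-miR-39"])
print(rq_a, rq_b)  # identical once extraction efficiency is normalized out

# A 65-fold spread in recovered Cel-miR-39 is roughly a 6-cycle Ct spread.
cycles_for_65_fold = round(fold_to_cycles(65), 2)
print(cycles_for_65_fold)
```

Under these assumptions, an uncorrected 65-fold difference in extraction efficiency would shift raw Ct values by about six cycles, which is larger than many biologically meaningful expression differences.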

Stability of miR Species in Biliary EVs

We aimed to determine the stability of miR species extracted from biliary EVs. To determine the effects of storing bile specimens at RT or of freeze-thaw cycles, we chose two miR species (miR-21 and miR-638) that, in our previous experiments, were found to be expressed in biliary EVs. Preliminary experiments on CCA tissues (not presented here) demonstrated good expression of miR-21 and −638. We first verified that high tissue expression translated into measurable levels in bile. Next, we utilized primers for these two miR species to perform pilot experiments in bile to document reproducibility, as well as to assess the best bile and EV handling techniques. Storing whole bile at RT for up to 48 hours and up to three freeze-thaw cycles had a negligible effect on the miR content of human bile EVs (Fig. 3).

Figure 3.

High stability and reproducibility of measured miR expression in bile EV extracts. (A) Levels of miR-21 and miR-638 are stable in EVs isolated from human bile kept at RT for 2-24 hours. (B) Stability of miR-21 and miR-638 after multiple freeze-thaw cycles.

Identification of Differentially Expressed miR Species Between Biliary EVs of CCA and Control Patients

qRT-PCR miR arrays were performed on a total of 6 specimens (RNA extracted from EVs isolated from 3 CCA and 3 control bile specimens). Fifty-four miR species amplified at a Ct of 32 or less in at least 2 CCA specimens. For each of these miR species, the average normalized value was calculated separately for CCA and control specimens. The miR species were then ordered by the fold ratio of expression in CCA versus control specimens, and the top 11 (Table 2; Fig. 4) were selected for further analyses.
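The ranking step can be sketched as follows, using the discovery-phase values of four of the selected miR species from Table 2. Treating a zero control mean as an infinite fold ratio is an assumption made here so that such species sort first, mirroring the N/A entries that head Table 2; the helper names are hypothetical.

```python
# Cel-miR-39-normalized values from Table 2:
# (three control specimens, three CCA specimens).
discovery = {
    "miR-222": ([0.0, 0.0, 0.0], [601.7, 445.4, 1061.1]),
    "miR-486-3p": ([0.0, 24.9, 121.7], [4889.1, 635.8, 58057.0]),
    "miR-191": ([6.6, 9.7, 16.1], [437.4, 185.0, 1769.0]),
    "miR-1274b": ([357.1, 166.5, 1306.0], [13240.4, 1899.1, 367.1]),
}

def mean(values):
    return sum(values) / len(values)

def fold_ratio(ctrl, cca):
    """Mean CCA over mean control; infinite when controls are all zero."""
    ctrl_mean = mean(ctrl)
    if ctrl_mean == 0:
        return float("inf")  # assumption: reported as N/A in Table 2
    return mean(cca) / ctrl_mean

folds = {name: fold_ratio(ctrl, cca) for name, (ctrl, cca) in discovery.items()}

# Order miR species from the largest to the smallest CCA/control fold ratio.
ranked = sorted(folds, key=lambda name: folds[name], reverse=True)
for name in ranked:
    print(name, round(folds[name], 1))
```

With only three specimens per group, a single outlying specimen can dominate the mean (for example, the third CCA specimen for miR-486-3p), which is one reason the selected species were subsequently evaluated in the full cohort.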

Table 1B. Clinical and Epidemiologic Information

Number | Name | TNM | 5-miR Panel | CA19-9
1 | CCA1 | T4N1M0 | 1 | 1
2 | CCA2 | T3N0M0 | 1 | 0
3 | CCA3 | T4N0M0 | 1 | 0
4 | CCA5 | T3N0M0 | 0 | 1
5 | CCA6 | T3N1M0 | 1 | 1
6 | CCA7 | T4N1M1 | 0 | 1
7 | CCA9 | T3N1M0 | 0 | 1
8 | CCA10 | T3N0M0 | 1 | 1
9 | CCA11 | T3N0M0 | 1 | 0
10 | CCA12 | T2N0M0 | 0 | 1
11 | CCA13 | T4N0M0 | 1 | 1
12 | CCA14 | T4N0M1 | 1 | 1
13 | CCA15 | T4N0M1 | 0 | 1
14 | CCA16 | T4N0M0 | 1 | 1
15 | CCA17 | T4N1M0 | 1 | 0
16 | CCA18 | T4N1M0 | 1 | 1
17 | CCA19 | T1N0M0 | 1 | 0
18 | CCA20 | T3N1M0 | 0 | 0
19 | CCA21 | T4N2M1 | 1 | 1
20 | CCA22 | T2N0M0 | 0 | 1
21 | CCA23 | T1N0M0 | 1 | 0
22 | CCA24 | T4N2M0 | 0 | 1
23 | CCA25 | T2N0M0 | 0 | 0
24 | CCA26 | T3N0M0 | 0 | 0
25 | CCA29 | T4N0M0 | 1 | 0
26 | CCA30 | T4N0M0 | 1 | 0
27 | CCA31 | T2N0M0 | 0 | 0
28 | CCA32 | T3N1M0 | 1 | 0
29 | CCA33 | T2N0M0 | 1 | 1
30 | CCA36 | T4N0M1 | 1 | 1
31 | CCA37 | T2N2M0 | 1 | 1
32 | CCA38 | T4N0M0 | 1 | 0
33 | CCA39 | T4N1M1 | 1 | 1
34 | CCA40 | T4N2M0 | 1 | 1
35 | CCA41 | T2N0M0 | 1 | 1
36 | CCA42 | T3N0M1 | 1 | 0
37 | CCA44 | T4N2M0 | 1 | 1
38 | CCA45 | T2N0M1 | 1 | 0
39 | CCA46 | T4N0M0 | 1 | 1
Total | 39 | | 28 | 23
Sensitivity, % | | | 71.7 | 58.9

  1. TNM classification for the 39 CCA specimens for which there was a recorded value for CA19-9 is displayed. A value of 1 identifies specimens that were correctly diagnosed as cancers by either the 5-miR panel or by CA19-9. There were a total of 28 CCA specimens that were diagnosed correctly by the 5-miR panel and 23 CCA specimens diagnosed correctly by CA19-9. Calculated sensitivity for CCA was 71.7% for the 5-miR panel, 58.9% for CA19-9, and 89.7% for the combination 5-miR panel and CA19-9.

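The sensitivities summarized for Table 1B can be recomputed from the per-specimen calls. The sketch below is illustrative only: the combined test is assumed to be positive when either the 5-miR panel or CA19-9 is positive, and row 24 (CCA26), which shows a single 0 in the table, is read as negative for both tests because that is the only reading consistent with the column totals of 28 and 23.

```python
# (5-miR panel call, CA19-9 call) for the 39 CCA specimens, in Table 1B
# row order; 1 = correctly diagnosed as cancer.
# Assumption: row 24 (CCA26) is taken as (0, 0) to match the column totals.
calls = [
    (1, 1), (1, 0), (1, 0), (0, 1), (1, 1), (0, 1), (0, 1), (1, 1), (1, 0), (0, 1),
    (1, 1), (1, 1), (0, 1), (1, 1), (1, 0), (1, 1), (1, 0), (0, 0), (1, 1), (0, 1),
    (1, 0), (0, 1), (0, 0), (0, 0), (1, 0), (1, 0), (0, 0), (1, 0), (1, 1), (1, 1),
    (1, 1), (1, 0), (1, 1), (1, 1), (1, 1), (1, 0), (1, 1), (1, 0), (1, 1),
]

def sensitivity(true_positives: int, total: int) -> float:
    """Percentage of cancers that the test calls positive."""
    return 100.0 * true_positives / total

n = len(calls)
panel_tp = sum(panel for panel, _ in calls)
ca199_tp = sum(ca199 for _, ca199 in calls)
either_tp = sum(1 for panel, ca199 in calls if panel or ca199)

print(n, panel_tp, ca199_tp, either_tp)
print(sensitivity(panel_tp, n), sensitivity(ca199_tp, n), sensitivity(either_tp, n))
```

The ratios 28/39 and 23/39 evaluate to about 71.79% and 58.97%, which the table reports as 71.7% and 58.9%; 35/39 gives the 89.7% quoted for the combination.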
Figure 4.

Expression of 11 selected miR species across 96 samples. Heatmap segregates the 50 control samples (left side) and 46 CCA samples (right side). Bright red was used for all coordinates with a z-score ≥0.5; therefore, these coordinates correspond to a CCA classification by MOCA. See Table 3B for mean expression values and diagnostic thresholds for all 11 miR species.

Comparison of Mathematical Models to Analyze Extracellular Vesicle miR Profiles

To assess the predictive value of selected miR species for CCA diagnosis, we used three distinct mathematical approaches: RFs, SVMs, and the MOCA algorithm (see Materials and Methods).[14, 15] Table 3A shows the sensitivity and specificity for diagnosing CCA from selected miR species using each of the three mathematical approaches. For each approach, the miR species facilitate reasonably accurate CCA diagnosis. The sum of sensitivity and specificity is approximately equal for MOCA and RFs, with RFs achieving greater sensitivity (76% vs. 67%) and MOCA achieving greater specificity (96% vs. 88%). SVMs achieved a slightly greater specificity than RFs (90% vs. 88%), but, in general, MOCA and RFs outperformed SVMs. MOCA is unique among the three methods, in that it selects biomarkers that can be used for subsequent clinical diagnosis, whereas the other methods generate black-box models that are highly dependent on the training data. Furthermore, biomarkers selected by MOCA have potential clinical utility, because the corresponding thresholds for CCA diagnosis are well above the sample means (Table 3B); with SVMs and RFs, there is no guarantee that samples were partitioned using expression values within the resolution of the experiment. Because MOCA predictions were consistent, provided a clear biological relationship between predictors and classification, and distinguished cancers from controls with clinically relevant resolution, we performed all further analyses using MOCA results.
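The comparison of summed sensitivity and specificity can be made explicit with Youden's J statistic (sensitivity + specificity − 1). This statistic is not used in the article; it is introduced here only as a convenient restatement of the sum, applied to the percentages reported in Table 3A.

```python
# Reported (sensitivity %, specificity %) for each approach, from Table 3A.
reported = {"MOCA": (67, 96), "RF": (76, 88), "SVM": (65, 90)}

def youden_j(sensitivity_pct: float, specificity_pct: float) -> float:
    """Youden's J = sensitivity + specificity - 1, on a 0-1 scale."""
    return (sensitivity_pct + specificity_pct) / 100 - 1

j_by_method = {
    method: round(youden_j(sens, spec), 2)
    for method, (sens, spec) in reported.items()
}
print(j_by_method)
```

MOCA and RFs are nearly tied on this summary measure and differ mainly in how they trade sensitivity against specificity, whereas SVMs score lower overall.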

Table 2. qRT-PCR Values of the Selected 11 miR Species

miR Species | Ctrl 1 | Ctrl 2 | Ctrl 3 | Mean Ctrl | CCA 1 | CCA 2 | CCA 3 | Mean CCA | Fold C/N
miR-222 | 0 | 0 | 0 | 0 | 601.7 | 445.4 | 1,061.1 | 702.7 | N/A
miR-126 | 0 | 0 | 0 | 0 | 130.7 | 74.4 | 387.9 | 197.7 | N/A
miR-486-3p | 0 | 24.9 | 121.7 | 48.9 | 4,889.1 | 635.8 | 58,057 | 21,194.1 | 434
miR-484 | 21 | 23.9 | 0 | 15 | 748.2 | 173.6 | 4,063.6 | 1,661.8 | 111
miR-19a | 0 | 3.6 | 0 | 1.2 | 218.3 | 51.7 | 95.6 | 121.9 | 101
miR-19b | 32.6 | 24.9 | 0 | 19.2 | 1,422.6 | 367.8 | 3,874.5 | 1,888.3 | 98.4
miR-16 | 3.8 | 135.7 | 140.7 | 93.4 | 3,726.4 | 1,451.4 | 20,463 | 8,546.8 | 91.5
miR-191 | 6.6 | 9.7 | 16.1 | 10.8 | 437.4 | 185 | 1,769 | 797.2 | 73.8
miR-31 | 14.7 | 0 | 0 | 4.9 | 329.8 | 55.5 | 31.6 | 138.9 | 28.4
miR-1274b | 357.1 | 166.5 | 1,306 | 609.8 | 13,240.4 | 1,899.1 | 367.1 | 5,168.9 | 8.5
miR-618 | 176,446 | 53,588 | 0 | 76,678 | 550,472 | 133,020 | 55.6 | 227,849 | 3

  1. Expression of the selected 11 miR species is displayed for the 3 CTRL and 3 CCA specimens used in the discovery phase. Values are normalized to Cel-miR-39.

  2. Abbreviations: fold C/N, fold difference CCA/CTRL; N/A, not applicable.

Table 3A. Characteristics of Mathematical Approaches for Data Analysis

Approach | Sensitivity (%) | Specificity (%)
MOCA | 67 | 96
RF | 76 | 88
SVM | 65 | 90

  1. Predictive value achieved from each of three distinct mathematical approaches is shown. Statistical sensitivity and specificity of CCA diagnosis achieved using selected miR species using the MOCA algorithm, SVM, and RF.

Table 3B. Characteristics of Mathematical Approaches for Data Analysis

miR | Mean | Threshold
126 | 74.17 | 221.88
618 | 13.3 | 26.29
31 | 428.02 | 811.93
222 | 35.63 | 60.88
16 | 70.91 | 144.76
486-3p | 46.78 | 82.30
484 | 29.26 | 54.49
19b | 66.34 | 135.98
191 | 5.17 | 9.23
19a | 41.40 | 85.70
1274b | 61.00 | 165.87

  1. Eleven differentially expressed miRs selected for analysis of predictive value are shown. The table displays the 11 miR species, the mean expression across 96 samples for that miR, and the threshold above which a sample would be classified as CCA by MOCA. In all cases, the threshold is significantly greater than the mean across all samples for the corresponding miR.

Table 3C. Characteristics of Mathematical Approaches for Data Analysis

Marker | Sensitivity (%) | Specificity (%)
191 U 486-3p U 1274b U 16 U 484 | 67 | 96
191 U 486-3p U 1274b U 16 | 65 | 96
191 U 486-3p U 1274b | 59 | 96
191 U 486-3p | 57 | 96

  1. Markers comprising five, four, three, or two miR species and corresponding predictive value are shown. Statistical sensitivity and specificity of CCA classification achieved by four representative multi-miR markers. Multi-miR markers were combined using the union (U) Boolean set operation.

Utilization of MOCA for Human CCA Diagnosis Based on Bile EV miR Expression

Table 3C shows representative, highly predictive biomarkers of CCA that comprised five, four, three, or two species as well as the corresponding statistical sensitivity and specificity. All biomarkers comprising two or more miR species are the result of combining those constituent miRs using the union Boolean set operation (see Materials and Methods). Several of the markers that combine six miRs have predictive values equal to that of the first marker in Table 3C (marker 1); however, marker 1 is a subset of any higher-order marker that has an equivalent predictive value, and therefore use of these high-order markers is superfluous and not considered here. Markers combining more than six miRs have an overall decreasing predictive value owing to a substantial decrease in specificity. Conversely, lower-order markers (those combining three, two, or a single miR species) had decreasing predictive value owing to a decrease in sensitivity (Table 3C). For example, the marker comprising miR-191, miR-486-3p, and miR-1274b has a sensitivity and specificity of 59% and 96%, respectively. The marker combining miR-191 and miR-486-3p has a sensitivity of 57% and a specificity of 96%. No single miR marker consistently passed the 0.05 false-discovery rate during 10-fold cross-validation; single-miR markers were therefore not considered further.
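The union operation used to build multi-miR markers can be sketched as a Boolean OR over per-miR threshold calls. This is a simplified illustration rather than the MOCA implementation: thresholds are taken from Table 3B, the specimen values are invented, and the function names are hypothetical.

```python
# Diagnostic thresholds for the five panel miRs, from Table 3B.
THRESHOLDS = {
    "miR-191": 9.23,
    "miR-486-3p": 82.30,
    "miR-1274b": 165.87,
    "miR-16": 144.76,
    "miR-484": 54.49,
}

def single_call(mir: str, expression: float) -> bool:
    """A single miR calls CCA when its expression exceeds its threshold."""
    return expression > THRESHOLDS[mir]

def union_call(sample: dict, panel: list) -> bool:
    """Union (Boolean OR): the marker calls CCA if any member miR does."""
    return any(single_call(mir, sample[mir]) for mir in panel)

two_mir = ["miR-191", "miR-486-3p"]
five_mir = ["miR-191", "miR-486-3p", "miR-1274b", "miR-16", "miR-484"]

# Hypothetical specimens; expression values are invented for illustration.
specimen_x = {"miR-191": 4.0, "miR-486-3p": 300.0, "miR-1274b": 50.0,
              "miR-16": 60.0, "miR-484": 20.0}
specimen_y = {"miR-191": 4.0, "miR-486-3p": 30.0, "miR-1274b": 50.0,
              "miR-16": 60.0, "miR-484": 90.0}

print(union_call(specimen_x, two_mir), union_call(specimen_x, five_mir))
print(union_call(specimen_y, two_mir), union_call(specimen_y, five_mir))
```

Because adding a miR to a union can only add positive calls, sensitivity cannot fall and specificity cannot rise as the marker grows, which is consistent with the trends described for lower-order and higher-order markers.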

The distribution of CCA classification for the four representative multi-miR markers from Table 3C, and expression of each miR for the corresponding individual markers, are shown in Fig. 5A,B. This representation is useful to determine when a marker uniquely diagnoses CCA, for which samples there is a consensus of diagnoses, and which markers complement one another to create highly predictive multi-miR biomarkers. For instance, miR-486-3p is unique among the five miRs, in that it is the only miR that accurately diagnoses samples CCA23, 36, 39, and 43. Similarly, miR-191 is the only marker that accurately diagnoses CCA1 and 46; taken together, this complementarity explains why miR-191 and miR-486-3p combine to make a highly predictive two-miR marker. Similarly, miR-1274b is unique in accurately classifying CCA6 and 37, which explains why this marker complements miR-191 and miR-486-3p to make a highly predictive three-miR marker. MiR-486-3p and miR-16 make a complementary CCA diagnosis more than any other marker pair, and therefore these miRs combine to have the highest predictive value of any two-miR marker (sensitivity, 57%; specificity, 98%). Only CCA3 is accurately classified by four miRs, and no CCA samples are accurately classified by all five miRs from Fig. 5.

Figure 5.

Bile specimen classification by multi-miR markers with high predictive value. (A and B) Classification across 46 CCA samples by each of the markers from Table 3C (color bars in A) and expression for each of the corresponding miRs (heatmap in B). In (B), bright red was used for all coordinates with a z-score ≥0.5; therefore, these coordinates correspond to a CCA classification by MOCA. For each multi-miR marker, (A) displays the correct diagnosis with a solid rectangle. (C and D) Classification across 50 control (CTRL) samples by each of the markers from Table 3C. For each multi-miR marker, (D) displays the correct diagnosis with a solid rectangle.

The origin of high specificity among the most predictive markers from Table 3C is shown in Fig. 5C,D. For example, miR-486-3p never makes a false-positive classification. MiRs −16 and −1274b both classify CTRL37 as CCA; this is the only false-positive classification for either miR-16 or −1274b. Similarly, miR-484 makes only a single false-positive classification (CTRL5). Also, miR-191 makes two false-positive classifications, which are coincident with miRs −16 and −1274b in classifying CTRL37 as CCA and with miR-484 in classifying CTRL5 as CCA. Because these five miRs rarely make a false-positive classification, and because they are complementary in making true-positive classifications, MOCA was able to combine them into multi-miR biomarkers with reasonably well-balanced predictive value.

Discussion

The field of CCA is in urgent need of better diagnostic methods. Whereas the overall survival of CCA patients is dismal, there is a large discrepancy between survival of patients diagnosed early and the vast majority of patients, who are diagnosed late in their disease. The data presented herein delineate a 5-miR panel with superior diagnostic accuracy for CCA, when compared to the currently available diagnostic methods. In addition, these studies are the first to identify and characterize EVs in human bile. The presence of miR-laden EVs in human bile has physiologic, as well as potential pathologic, implications. It was recently demonstrated that HCC cells release EVs rich in miR species, which are believed to function in intrahepatic cell-cell signaling.[20] In addition, it was shown that exosomes, a type of EVs, exist in rat bile, interact with cholangiocytes, and are able to modulate intracellular growth mechanisms.[21] These findings suggest a new paradigm, wherein liver and biliary tree cells communicate through EVs and extracellular vesicle-transported miR species.[21, 22] Our study adds to this paradigm and is the first to put forward the hypothesis that human bile acts as a physiologic and pathologic conduit allowing the communication of information, in the form of EV-transported miR species, between various cells within the liver and biliary tree. These findings open a broad new avenue of investigation for understanding normal physiologic signaling, as well as potential implications in disease, such as CCA.

The current study furnishes strong evidence that RNA isolated from human bile derives from free-floating cells, as well as biliary EVs. Although our vesicle isolation protocol is geared toward exosome isolation through employment of differential ultracentrifugation, further studies are needed to definitively conclude that our findings are specific for exosomes, and not extracellular vesicles in general. In addition, this study establishes that RNA originating from these free-floating cells is rapidly degraded, both by storing bile at room temperature (even for 1 hour) and by a single freeze-thaw cycle. We conclude that any bile-based miR panel developed from whole, cell-containing bile will be unpredictably biased by bile processing and therefore destined to have limited clinical applicability.[7] In contrast, we demonstrate that bile EVs contain abundant miR species that are stable and therefore usable for the development of bile-derived miR-based diagnostic panels.

Utilization of bodily fluids for the development of disease markers is appealing because these fluids can be obtained by noninvasive (urine), minimally invasive (blood), or moderately invasive (bile) procedures. Nevertheless, the mere presence of EVs in these bodily fluids, along with the fact that they contain miR species, is not sufficient to develop stable, reproducible diagnostic panels. The field of miR-based disease markers is plagued by contradictory, and often irreproducible, results.[9] An intuitive explanation for this problem is a lack of standardization in specimen collection, storage, and processing. An additional difficulty is normalization of miR values in biologic fluids. In contrast to tissues, biologic fluids contain few long RNA species (such as messenger RNA), and these species are quickly degraded. Therefore, attempts to normalize to a standard housekeeping gene are destined to fail. A multitude of biologic fluid-derived miR panels, including the sole bile-based miR panel published to date,[7] utilize the small RNA U6 as a normalizer. However, there is a growing body of evidence suggesting that utilization of U6 to normalize miR expression in body fluids introduces biases that have the potential of rendering the results irreproducible and/or unusable in a clinical laboratory setting.[9] Although there is no indication that any other intrinsic RNA species performs better than U6, there is accumulating evidence that U6 is not appropriate as a normalizer.[10, 11] In addition, the use of any other intrinsic miR species as a normalizer is fundamentally based on the assumption that a miR with constant expression across physiologic and pathologic states exists. There is no evidence to date to suggest that such a miR species exists in biliary EVs. Presently, there are no data to suggest that any intrinsic RNA normalizer, whose expression is stable in normal and, more importantly, in diseased states, exists in biologic fluids. Therefore, we propose that biologic fluid-derived miR levels should be normalized to initial volume of fluid analyzed. In addition, our study establishes that RNA extraction efficiency varies significantly among specimens, arguing that studies lacking a spike-in control may produce unreliable results.
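The normalization proposed above can be sketched as follows. This is an illustrative calculation only: it assumes that extraction efficiency is estimated as the recovered fraction of a known spike-in amount, and that the corrected miR quantity is then divided by the starting fluid volume. The function name, parameter names, and the two example specimens are hypothetical and do not reproduce the exact computation used in this study.

```python
def normalize_mir(measured_copies, input_volume_ml, spike_in_added, spike_in_recovered):
    """Return miR copies per mL of starting fluid, corrected for extraction loss.

    Assumption (for illustration): extraction efficiency equals
    spike_in_recovered / spike_in_added, and the miR of interest is
    lost during extraction in the same proportion as the spike-in.
    """
    if input_volume_ml <= 0 or spike_in_added <= 0 or spike_in_recovered <= 0:
        raise ValueError("volume and spike-in amounts must be positive")
    # measured / (recovered / added) / volume, rearranged to a single division
    return measured_copies * spike_in_added / (spike_in_recovered * input_volume_ml)


# Two HYPOTHETICAL specimens with different volumes and extraction efficiencies.
specimen_a = normalize_mir(measured_copies=500, input_volume_ml=1.0,
                           spike_in_added=1000, spike_in_recovered=500)
specimen_b = normalize_mir(measured_copies=400, input_volume_ml=2.0,
                           spike_in_added=1000, spike_in_recovered=200)

print(specimen_a, specimen_b)
```

In this toy example, the raw measurements differ (500 vs. 400 copies), yet both specimens yield the same concentration once recovery and input volume are accounted for, which is the kind of bias a spike-in control is meant to remove.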

Cancer diagnosis and clinical outcome prediction studies have created a relatively new field for applied mathematics. However, many previous studies have been limited to a single analytic method. Our current study compares SVMs and RFs, two of the most commonly used mathematical models for cancer diagnosis,[23] to MOCA, a recently developed mathematical model. One limitation of SVMs and RFs is a lack of transparency in the classifiers that result from training. Because these models are “black boxes,” it is difficult to apply real biological principles to subsequent clinical testing, and continued use of the initial training model is required. Conversely, MOCA returned the combination of miR species that optimized CCA diagnosis and the predictive value that each miR contributed to that combined biomarker. Furthermore, MOCA returned the same biomarkers regardless of how the 10-fold cross-validation data were split; SVMs and RFs varied as a function of data split, and for RFs, the model was dependent on the “seed” employed (see Materials and Methods).

We deliberately tuned our analytic technique for high specificity because the clinical consequences of a false-positive CCA diagnosis would be calamitous. With specificity at 96%, our 5-miR panel displayed a sensitivity of 67%. Thus, the overall performance of this panel is superior to CA19-9, as well as to any CCA diagnostic method currently employed in clinical practice. Further studies are needed to verify these findings in larger cohorts and explore the potential utility of combining our new 5-miR panel with other marker panels or with established clinical modalities, such as cytology.

We foresee potential clinical utility of our marker panel in patients with obstruction in the biliary tree (Supporting Table 1). The typical patient who would benefit from this marker panel is a PSC patient. In our analyses, 12 of 13 patients with PSC and no cancer were correctly diagnosed as not having CCA. All patients were followed for 5 years from the date of bile collection to confirm that they did not develop CCA during that period. The PSC patient who was classified as CCA by our panel continues to be followed at our hospital, and there is still no evidence of CCA. The two potential explanations for this seemingly false-positive result are (1) the method is not perfect and the results need to be integrated clinically and with other laboratory results and (2) the biologic state that induced a positive test is reversed, or reversible. Further studies are needed in order to answer this question.

Among patients with a known diagnosis of CCA, 4 had a preexisting diagnosis of PSC. Three of these four were diagnosed correctly by our marker panel (Supporting Table 3). The single patient misdiagnosed as CCA negative did not have CA19-9 levels drawn. Of the remaining 3 patients correctly diagnosed by our marker panel, 2 had CA19-9 levels recorded. One of these had a CA19-9 level of 574.6 units/mL, suggestive of CCA. However, the other had a CA19-9 level of only 77.8 units/mL, which, although above the upper normal limit, is below the accepted cut-off value of 129 units/mL currently utilized to diagnose CCA.[24]

Our marker panel outperformed CA19-9 levels in non-PSC patients. Among 39 CCA patients with recorded CA19-9 values (Table 1B), our marker panel correctly diagnosed 28, translating to a sensitivity of 71%. The accepted cut-off CA19-9 value of 100 units/mL (in patients without PSC) correctly diagnosed only 23 patients, for a sensitivity of only 58%.[2] This calculated sensitivity of CA19-9 is slightly higher than, but similar to, that found in a previous study (53%). Thus, among every 100 CCA patients, our method is expected to diagnose 13 patients more than would CA19-9 levels. Notably, among 11 CCA patients correctly diagnosed by our 5-miR panel, but misdiagnosed by CA19-9, there were 8 CCA patients without lymph node or distant metastatic implants (N0M0). These patients are potentially cured by resection. In contrast, of the 6 patients correctly diagnosed with CCA by CA19-9, but not by our 5-miR panel, only 2 were N0M0 (Supporting Table 4). In further support of our hypothesis that our panel can better identify early tumors, the two T1N0M0 cancers in our cohort were correctly diagnosed by our panel, but not by CA19-9 levels. Diagnosing these early cancers is crucial, because the only current curative option for CCA patients is surgery, which can only be performed early in the course of this disease. We conclude that CA19-9 tends to diagnose advanced CCA, for which surgery may no longer be an option, whereas our marker panel tends to diagnose early CCA, where surgery is still an option. Therefore, the difference in sensitivity between our marker panel and CA19-9 levels (71% vs. 58%) may actually underestimate our marker panel's potential effect on patient survival. Clearly, prospective studies are needed to answer this important question. Finally, in our 39-patient CCA cohort, 36 were correctly diagnosed by either our marker panel or CA19-9, translating into a combined sensitivity of 89.7%. Thus, even if treatments for advanced CCA improve, combining our marker panel with CA19-9 levels may become a valuable diagnostic strategy.
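The single-test sensitivity comparison above is simple arithmetic on the reported counts, sketched below for readers who wish to check it. The counts (28 and 23 of 39) come from the text; the helper function and variable names are illustrative. The 71% and 58% figures quoted above correspond to these ratios truncated to whole percentages.

```python
def sensitivity(correctly_diagnosed, total_cases):
    """Fraction of true disease cases that a test correctly identifies."""
    if total_cases <= 0:
        raise ValueError("total_cases must be positive")
    return correctly_diagnosed / total_cases


TOTAL_CCA = 39  # CCA patients with recorded CA19-9 values

panel_sens = sensitivity(28, TOTAL_CCA)   # 5-miR panel
ca199_sens = sensitivity(23, TOTAL_CCA)   # CA19-9 at the 100 units/mL cut-off

# Expected additional diagnoses per 100 CCA patients, rounded to a whole patient.
extra_per_100 = round(100 * (panel_sens - ca199_sens))

print(f"panel: {panel_sens:.3f}, CA19-9: {ca199_sens:.3f}, extra per 100: {extra_per_100}")
```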

Ancillary