Circulating U2 small nuclear RNA fragments as a novel diagnostic biomarker for pancreatic and colorectal adenocarcinoma



Improved non-invasive strategies for early cancer detection are urgently needed to reduce morbidity and mortality. Non-coding RNAs, such as microRNAs and small nucleolar RNAs, have been proposed as biomarkers for non-invasive cancer diagnosis. Analyzing serum derived from nude mice implanted with primary human pancreatic ductal adenocarcinoma (PDAC), we identified 15 diagnostic microRNA candidates. Of those, miR-1246 was selected based on its high abundance in serum of tumor carrying mice. Subsequently, we noted a cross reactivity of the established miR-1246 assays with RNA fragments derived from U2 small nuclear RNA (RNU2-1). Importantly, we found that the assay signal discriminating tumor from controls was derived from U2 small nuclear RNA (snRNA) fragments (RNU2-1f) and not from miR-1246. In addition, we observed a remarkable stability of RNU2-1f in serum and provide experimental evidence that hsa-miR-1246 is likely a pseudo microRNA. In a next step, RNU2-1f was measured by qRT-PCR and normalized to cel-54 in 191 serum/plasma samples from PDAC and colorectal carcinoma (CRC) patients. In comparison to 129 controls, we were able to classify samples as cancerous with a sensitivity and specificity of 97.7% [95% CI = (87.7, 99.9)] and 90.6% [95% CI = (80.7, 96.5)], respectively [area under the ROC curve 0.972]. Of note, patients with CRC were detected with our assay as early as UICC Stage II with a sensitivity of 81%. In conclusion, this is the first report showing that fragments of U2 snRNA are highly stable in serum and plasma and may serve as a novel diagnostic biomarker for PDAC and CRC in future prospective screening studies.

Despite significant improvements in surgery and pharmacotherapy the prognosis of advanced gastrointestinal cancers, such as pancreatic ductal adenocarcinoma (PDAC) and colorectal cancer (CRC), remains dismal. In both cancer types, reliable disease detection in early stages represents an important clinical challenge. Since robust tumor markers for early cancer detection are not available for both PDAC and CRC, identification of novel biomarkers is a central research goal. Serum levels of CA19-9, currently the most widely used diagnostic marker for PDAC, are hampered by a low sensitivity and specificity for PDAC.1 In recent years, additional serum markers such as CEACAM1, MIC-1, TIMP-1, osteopontin and many others have been suggested for PDAC diagnosis, but none of them has managed to enter clinical routine.2–5 Detection of early cancer currently provides the only chance for cure which is illustrated by the fact that five-year survival rates reach 24% if the tumor is localized and small (<2 cm).6 Unfortunately, the detection of early stage PDAC proved challenging and has not been achieved to date.

Unlike pancreatic carcinoma, early CRC can be cured by surgery and adjuvant chemotherapy in many patients, but the outcome of advanced CRC disease remains poor. In addition, an efficient tool for the early detection of CRC exists, namely colonoscopy. Colonoscopy, albeit considered an invasive diagnostic method, was shown to be able to reduce CRC mortality and most CRC-related deaths via detection of early-stage cancer and precancerous lesions.7 Unfortunately, the compliance rates for colonoscopy screening are still rather low.8 Consequently, also for CRC detection, non-invasive diagnostic methods are highly desirable. Several non-invasive screening tests, including fecal occult-blood testing (FOBT) and stool DNA test, have been available for years for CRC.9 However, none of these methods have been established as well-accepted screening tools, because of their low sensitivity.10

In the past, several DNA-, RNA- and protein-based blood tests have been explored for early detection of cancer, including PDAC and CRC. However, the vast majority of these tests have been abandoned for various reasons such as impaired reproducibility, poor specificity and sensitivity as well as instability of the target molecules in peripheral blood. Recently, it has been shown that analysis of non-coding RNAs (ncRNAs) circulating in peripheral blood has the potential to overcome some of the previous limitations. NcRNAs include small nucleolar RNAs (snoRNAs), microRNAs (miRNAs), piwi-associated RNAs, small Cajal body-specific RNAs (scaRNAs) and small nuclear RNAs (snRNAs). To date, primarily miRNAs have been evaluated for diagnostic purposes. The main advantages of miRNAs are their proven high stability and abundance in serum or plasma, and the availability of highly specific and sensitive detection assays for the majority of human miRNAs. Furthermore, Mitchell et al. showed that miRNAs originating from the tumor are indeed present in the circulation and stable for a prolonged time.11 Additional work has provided evidence that binding to proteins such as Argonaute2, NPM1 or HDL,12–15 or to some extent their inclusion in microvesicles,16–18 is likely to be the key to the protection of miRNAs from degradation and thus to their high stability in serum and plasma.

A number of studies have addressed miRNA detection in serum, plasma or whole blood of colorectal or pancreatic cancer patients in order to develop miRNA-based biomarkers.19–29 With two exceptions,23, 27 the reported miRNA biomarkers reached only modest levels of sensitivity and specificity, both for PDAC and CRC, hampering their clinical utility. Apart from miRNAs, there is currently only one available report, indicating that another ncRNA family member, the so called snoRNAs, can also be detected in patient plasma and may serve as diagnostic cancer biomarker for non-small-cell lung cancer.30 For other ncRNAs, a diagnostic utility for cancer detection has not been demonstrated.

Here sera from mice carrying human PDAC xenograft tumors were successfully used for the identification of circulating miRNAs. In subsequent analyses, we found that the miRNA showing the best discriminatory value between cancer and healthy controls was in fact fragmented human U2 snRNA. snRNAs, including U2 snRNA, have so far not been shown to have biomarker potential for cancer. U2 snRNA, together with several proteins forms the U2 small nuclear ribonucleoprotein (snRNP), which plays a key role in the splicing of pre-mRNA catalyzed by the spliceosome.31 The spliceosome is a large dynamic macromolecular machinery, which assembles by the highly coordinated, sequential association of four small nuclear RNPs (snRNPs), among them U2 snRNP. Together with U6, U2 forms part of the RNA network that brings the reactive sites into close proximity of the pre-mRNA and is thus at the catalytic core of the spliceosome. The splicing-active U2 snRNP additionally contains the heteromeric splicing factors SF3a and SF3b. Interestingly, rare missense mutations in SF3B1 and SF3A1 have been described in pancreatic carcinoma and very recently SF3B1 mutations were linked to the pathogenesis of myelodysplastic syndrome.32, 33

For the first time, our report shows that U2 snRNA fragments can be used for the discrimination of PDAC and CRC patients from controls with high sensitivity and specificity.


AUC: area under the curve; CI: confidence interval; CRC: colorectal carcinoma; Ct: cycle threshold; miRNA: microRNA; PDAC: pancreatic ductal adenocarcinoma; qRT-PCR: quantitative reverse transcription polymerase chain reaction; ROC: receiver operating characteristic

Material and Methods

Sample collections

Between 2002 and 2011, consecutive serum and plasma samples (n = 361) were obtained from patients of the Department of Internal Medicine, Knappschaftskrankenhaus, Ruhr-University Bochum, Germany, from the Colorectal Cancer Research Group, Center for Molecular Clinical Cancer Research Department for Molecular Medicine, Aarhus University Hospital, Denmark and from Department of Medicine, Ernst-Moritz-Arndt-University, Greifswald, Germany. The local ethical committees had approved sample collections. Written informed consent of all patients and blood donors was documented according to the local ethics guidelines. The study was conducted according to the declaration of Helsinki. Additional details on the sample collection such as diagnoses, clinical staging, age and gender are given in the Supporting Information (Supporting Information Material and Methods and Tables S2–S7). Blood samples were centrifuged (10 min, 3,000g, room temperature) within 30 min after collection to remove cells and debris and were stored at −80°C until further processing.

Xenograft tumors derived from early passage human PDACs (n = 8) were grown in NMRI-nu/nu mice. Serum was collected via cardiac puncture once tumor size reached 1 cm³ and from tumor-free age-matched control mice (n = 8). The tissue collection was performed according to a protocol approved by the ethics committee of the Ruhr-University Bochum (permission no. 3534-09 and 2392-04). All pancreatic specimens included in this study were reviewed by a pathologist (J.M.). Animal experiments were performed according to the guidelines of the local Animal Use and Care Committees.

RNA isolation

Total RNA was extracted using a mirVana RNA isolation kit (Ambion, Austin, TX) according to the manufacturer's instructions. Briefly, serum or plasma samples were thawed on ice and 170 μl (human samples) or 300 μl (mouse samples) were diluted with an equal volume of mirVana PARIS ×2 denaturing solution and subsequently incubated for 5 min on ice. Prior to the incubation step, 25 fmol of synthetic Caenorhabditis elegans miRNA-54 (cel-miR-54, Qiagen, Hilden, Germany) were added to each human serum or plasma sample as a spike-in control.11 The molar amount of spiked-in synthetic C. elegans miRNA-54 was determined in pilot experiments to reach a Ct value of 21 in our PCR set-up. Subsequently, an equal volume of acidic phenol-chloroform (Ambion) was added to each sample, followed by centrifugation for 5 min at 10,000g. Next, 3 μl of glycogen (20 mg/ml) (Roche, Mannheim, Germany) were added to the aqueous phases, which were then mixed with 1.25 volumes of 100% ethanol. Passage through a mirVana PARIS column and the washing steps were carried out following the manufacturer's protocol. Finally, RNA was recovered in 100 μl of RNase-free water.

Reverse-transcription and quantification by real-time polymerase chain reaction

To quantify the concentration of hsa-miR-29a, -1246 (RNU2-1f), -1290 and cel-miR-54, Qiagen miRNA assays (Qiagen, Hilden, Germany) were used following the manufacturer's protocols. In brief, 2 μl of total RNA were used for reverse-transcription reactions (37°C for 60 min, followed by 4°C). Real-time PCR was performed using an Opticon 2 system with a CFD-3220 Opticon 2 detector (MJ Research, Waltham, MA). PCR cycling conditions consisted of an initial step at 95°C for 15 min followed by 40 cycles of 94°C for 15 s, 55°C for 30 s and 70°C for 30 s. Fluorescence was measured at the last step of each cycle. Melting curves were obtained after each PCR run and showed single PCR products. Data from the qRT-PCR were analyzed using Opticon Monitor Analysis software (version 2.01, MJ Research). All cDNA samples, non-RT (without reverse transcriptase) and no-template controls were assayed in duplicate. Mean cycle threshold (Ct) values and deviations between the duplicates were calculated for all samples. Samples with Ct deviations above 0.5 between duplicates were re-assayed. We chose to use a synthetic miRNA, cel-54, not present in human serum as a reference molecule for normalization, and determined the linear correlation between the logarithm of the amount of input synthetic miRNA and the cycle threshold value on qRT-PCR for both RNU2-1f and cel-54 (Supporting Information Fig. 1). From these analyses, we determined the amount of spiked-in synthetic cel-54 miRNA to be well within the linear amplification range of the assay. The amount of RNU2-1f was normalized relative to the amount of cel-miR-54 (ΔCt = Ct(cel-miR-54) − Ct(RNU2-1f)).
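The normalization described above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code used in the study: the function names and the example Ct readings are hypothetical, and the only values taken from the text are the 0.5 Ct limit between duplicates and the definition Ct(cel-miR-54) − Ct(RNU2-1f).

```python
from statistics import mean

# Limit stated in the text: duplicates differing by more than 0.5 Ct are repeated.
MAX_DUPLICATE_DEVIATION = 0.5


def mean_ct(duplicates):
    """Return the mean Ct of a duplicate pair, or None if the pair must be repeated."""
    first, second = duplicates
    if abs(first - second) > MAX_DUPLICATE_DEVIATION:
        return None
    return mean(duplicates)


def normalized_score(ct_cel54, ct_rnu2_1f):
    """Ct(cel-miR-54) - Ct(RNU2-1f); higher values indicate more RNU2-1f."""
    spike = mean_ct(ct_cel54)
    target = mean_ct(ct_rnu2_1f)
    if spike is None or target is None:
        return None  # at least one duplicate pair needs to be re-assayed
    return spike - target


# Hypothetical duplicate Ct readings for one sample (not data from the study).
score = normalized_score((21.0, 21.2), (23.4, 23.6))
print(round(score, 2))
```

Because a lower Ct means more template, subtracting the target Ct from the spike-in Ct yields a score that rises with RNU2-1f abundance.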

Figure 1.

An overview of the study design is shown. Microarray analysis was performed on a set of mouse sera for biomarker discovery. The top candidate RNU2-1f was analyzed in a large series of serum and plasma samples and following exclusion of UICC Stage I CRC and adenoma, statistical analyses were performed to define the RNU2-1f assay characteristics.

miRNA expression analyses and data processing

Total RNAs isolated from 300 μl of mouse serum (n = 6 of either group) were hybridized to the human microRNA Microarray (G4471A Human, AMADID 29297, Sanger 14, Agilent Technologies, Böblingen, Germany). MicroRNA labeling, hybridization and washing were carried out according to the manufacturer's instructions. Images of hybridized microarrays were acquired with a DNA microarray scanner (Agilent G2505B) and features were extracted using the Agilent Feature Extraction image analysis software (AFE) version A. with default protocols and settings. The gene expression data from our study have been deposited in the NCBI Gene Expression Omnibus (GEO) database (accession number GSE34052).

Data analyses

The AFE algorithm generates a single intensity measure for each microRNA, referred to as the total gene signal (TGS), which was used for further data analyses using the GeneSpring GX software package version 11.5.1. AFE-TGS were normalized by the quantile method. Subsequently, data were filtered on normalized expression values. Only entities where at least 1 out of 12 samples had values within the selected cutoff (75th–100th percentile) were included in the further data analysis.
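For readers unfamiliar with quantile normalization, the following sketch shows the basic idea in plain Python. It is an illustration under simplifying assumptions (no special handling of ties, hypothetical input values) and does not reproduce the GeneSpring GX implementation.

```python
def quantile_normalize(columns):
    """Quantile-normalize a list of per-sample signal lists of equal length.

    Each value is replaced by the mean, across samples, of the values that
    share its rank. Ties are broken by position, a simplification.
    """
    n = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # Mean signal at each rank across all samples.
    rank_means = [sum(col[r] for col in sorted_cols) / len(columns) for r in range(n)]
    normalized = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Two hypothetical samples with three probes each.
result = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
print(result)
```

After normalization every sample has the same distribution of values, so differences between arrays in overall signal intensity no longer drive the groupwise comparison.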


The first step in our analysis was a groupwise comparison of measured miRNA serum levels in tumor carrying mice and control mice. We conducted two-sided two-sample t-tests per variable, assuming equal variances, using the GeneSpring GX software package version 11.5.1. The p-values were adjusted for multiple testing according to Benjamini and Hochberg [FDR]34 and results were considered statistically significant at adjusted p-values below 0.05. Furthermore, only miRNAs with fold change ≥ 1.5 in the microarray analyses were considered worthy of more in-depth analyses.
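The selection rule above (Benjamini-Hochberg adjusted p < 0.05 and fold change ≥ 1.5) can be illustrated as follows. This is a sketch only: the miRNA names, p-values and fold changes are hypothetical, and fold change is assumed here to be expressed as an absolute ratio of at least 1 in either direction.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_candidates(names, p_values, fold_changes, alpha=0.05, min_fc=1.5):
    """Keep entries with adjusted p < alpha and absolute fold change >= min_fc."""
    adjusted = benjamini_hochberg(p_values)
    return [
        name
        for name, q, fc in zip(names, adjusted, fold_changes)
        if q < alpha and fc >= min_fc
    ]


# Hypothetical inputs, not values from the study.
names = ["miR-A", "miR-B", "miR-C", "miR-D"]
p_values = [0.001, 0.02, 0.04, 0.30]
fold_changes = [4.0, 1.2, 2.0, 3.0]
adjusted = benjamini_hochberg(p_values)
selected = select_candidates(names, p_values, fold_changes)
print(selected)
```

In this toy example miR-B passes the significance filter but fails the fold-change filter, whereas miR-C fails after adjustment although its raw p-value is below 0.05.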

The resulting top candidate RNU2-1f was evaluated with the statistical software R (version 2.13.1 [R]), and in particular the packages pROC and PropCIs. The former computes receiver operating characteristic (ROC) curves and the area under the curve (AUC), whereas the latter provides Clopper-Pearson [KI] CIs for sensitivity, specificity and predictive values. ROC curves and the AUC were analyzed to assess the feasibility of using RNU2-1f concentrations as diagnostic tools for detecting colorectal or pancreatic cancer. Based on the ROC curve, we determined the optimal cutoff concentration to classify an observation as cancerous or healthy. We decided to optimize Youden's index [cut],35 which is equivalent to the maximization of the sum of sensitivity and specificity. To prevent overfitting effects, we divided our data into a training and a test set. The optimal classification cutoff was determined on the training set and the quantities of interest were assessed on the independent test data.
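The cutoff selection by Youden's index can be sketched as below. The study used the R package pROC; this plain Python version is only an illustration of the principle, with hypothetical scores and labels, and it evaluates candidate cutoffs at the observed scores rather than at midpoints between them.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    labels: 1 for cancer, 0 for control.
    A sample is called positive when its score is >= cutoff.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical training scores: five controls followed by five cancers.
train_scores = [-6.0, -5.1, -4.2, -3.5, -2.8, -2.9, -1.5, -0.7, 0.4, -3.1]
train_labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
cutoff, j = youden_cutoff(train_scores, train_labels)
print(cutoff, round(j, 2))
```

Choosing the cutoff on the training set only, and then freezing it before looking at the test set, is what keeps the test-set sensitivity and specificity free of optimization bias.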


Discovery of a candidate miRNA blood biomarker for pancreatic cancer in xenografted mice and validation in mouse sera

The overall strategy and study design to identify novel miRNA-based biomarkers are illustrated in Figure 1. Serum miRNA expression data were collected from six xenografted and six control mice using Agilent miRNA microarrays (G4471A Human, Sanger 14). Following array processing, normalization and filtering of the raw array data, the pairwise comparison (fold change ≥ 1.5) identified 21 differentially expressed miRNAs (15 increased and 6 decreased) in serum of PDAC carrying mice compared to tumor free control mice (Supporting Information Table 1). Only miRNAs with an increased serum concentration in tumor carrying mice were considered potential biomarker candidates. From our list of 15 candidates, we chose two miRNAs for validation via qRT-PCR based on their exclusive and robust (normalized gTotalProbeSignal >12) expression in tumor bearing mice (miR-1290 and miR-1246) and one miRNA (miR-29a) based on its robust expression only. The remaining miRNAs were considered suboptimal candidates, because they generally only reached low mean signal levels in tumor sera (mean normalized gTotalProbeSignal ≤ 6, Supporting Information Table 1). All three miRNAs were validated successfully via qRT-PCR in mouse sera (Supporting Information Fig. 2). Interestingly, the lack of expression of miR-1246 in the control mice, already suggested by the very low array hybridization signal (mean normalized gTotalProbeSignal 6.4, Supporting Information Table 1), was confirmed via qRT-PCR (mean Ct 30.2), indicating that the miR-1246 detected in the carcinoma carrying mice is originating from the human carcinoma.

Figure 2.

Distribution of identified RNU2-1f fragments in patient sera. Upon cloning of PCR products derived from the Qiagen miR-1246 qRT-PCR assay the indicated sequence fragments were identified at the given frequencies. Common sequence region between human mature miR-1246 and RNU2-1 is shown in bold letters. Sequence mismatches between RNU2-1 and the precursor miR-1246 are marked with asterisks.

Of note, in the course of our experiments we noted a cross reactivity of the miR-1246 assay with sequence fragments of the human U2 snRNA. The details of these experiments are described below and led to the conclusion that the measured signal was largely not derived from miR-1246 but from fragments of RNU2-1 (RNU2-1f).

MiR-1246 qRT-PCR is cross reactive with U2 snRNA sequence fragments

The mature miR-1246 has previously been reported to be expressed in lung, breast, colorectal and ovarian carcinomas as well as in osteosarcoma cell lines.36 We sought to confirm and extend these data to colon and pancreatic carcinoma cell lines. The Qiagen qRT-PCR assay generated PCR fragments of roughly the expected size using RNA preparations from patient serum, serum from xenografted mice as well as from supernatants of cell lines. Surprisingly no such PCR product was generated using cell lysates from a panel of colon and pancreatic carcinoma cell lines (Supporting Information Fig. 3). However, a larger PCR product, 180 bp in size, was detected in all cell lysates at high levels. We cloned this PCR product and sequence analysis revealed that it perfectly matched the RNU2-1 sequence in the NCBI database (NCBI Reference Sequence: NR_002716.3). Importantly, we noted that the entire mature miR-1246 sequence is comprised within the human U2 snRNA sequence (Fig. 2). Thus, we sought to determine the origin of the PCR products in patient sera that are generated by the Qiagen miR-1246 assay. We cloned the products and sequenced 129 PCR fragments in total, derived from control and cancer patient sera (Fig. 2). Twenty-nine fragments displayed perfect homology with the mature miR-1246 or the corresponding sequence within RNU2-1, whereas two products were 1 or 2 bp shorter. The majority (98/129) of these cloned PCR fragments, however, displayed extensions by one to seven base pairs at the 3-prime end which in all instances were complementary to the U2 snRNA sequence but not to pre-miR-1246. We also cloned and sequenced PCR products from the Qiagen miR-16 and -196a assays (data not shown). 
In all instances we found either the mature sequence as deposited in miRBase or sequence variants consistent with described isomiRs. Next, we cloned a 346 bp fragment of the primary miR-1246 into a lentiviral expression vector and assessed its functionality in a human pancreatic carcinoma cell line (Supporting Information Fig. 4). We noted that, although the primary transcript could be detected in the vector transduced cells, no functional mature miRNA was produced in this experimental setting, suggesting that processing of the pri-miR-1246 to the mature miR-1246 may not occur in human cells.

Figure 3.

Box plots depicting the distribution of the Ct(cel-54) − Ct(RNU2-1f) assay data for the various groups included in our analyses. Assay cutoff of −2.995 is indicated by the broken line. Legend: PDAC, pancreatic ductal adenocarcinoma; CRC, colorectal carcinoma; C, colon carcinoma; R, rectal carcinoma; CRC I–IV, UICC Stages I–IV CRC; HC, healthy controls; DC, diseased controls; CRP, C-reactive protein.

Figure 4.

Receiver operator characteristic curve plots for training data (gray lines) and test data (black lines). The 95% CIs of sensitivity and specificity are visualized with a black box. The diagonal dashed line is random chance. The boundaries of the confidence intervals clearly exceed random chance for the independent test set. Legend: TPR, true positive rate; FPR, false positive rate; AUC, area under the curve.

In summary, our data show that RNU2-1 fragments (RNU2-1f) are present in patient serum and in supernatant from cancer cell lines and are detected by the Qiagen miR-1246 assay. Furthermore, neither RNU2-1 fragments nor mature miR-1246 are present at a detectable level in whole cell lysates of colon and pancreatic carcinoma cell lines. Lastly, end-point PCR analysis with primer pairs specific for the precursor form of miR-1246 revealed that pre-miRNA-1246 is also not expressed in these cell lines (Supporting Information Fig. 5).

Figure 5.

qRT-PCR data distribution using Ct(cel-54) − Ct(RNU2-1f) assay separated by training and test set for PDAC, CRC and UICC Stage II CRC. Dotted line indicates threshold of −2.995. Mean values are indicated by horizontal lines.

Lack of discrimination of loop design qRT-PCR between miR-1246 and RNU2-1f

Two papers applying loop design qRT-PCR strategies have previously reported significant levels of the mature miR-1246 in various cell lines.36, 37 Likewise, using the TaqMan miR-1246 assay, which is also based on a loop design, we also obtained low Ct values in the range of 14–20 in various cancer cell lines (data not shown). Based on our sequencing data as shown above, we expected that the majority of RNA templates in the miR-1246 loop design cDNA synthesis were RNU2-1 fragments. To experimentally address this hypothesis, we performed cDNA synthesis and miR-1246 TaqMan amplification with a number of synthetic oligonucleotide molecules, namely the mature miR-1246 sequence as well as molecules extended at the 3′ end by 1, 10 and 30 basepairs into either the miR-1246 precursor or the RNU2-1 sequence. The one bp extension of the template had barely any effect (Supporting Information Fig. 6). Similar data were reported by Lee et al. for isomiRs.38 As expected, the longer extensions did reduce amplification efficiencies, but not to the extent that abundant molecules such as U2 RNA would become undetectable. From these data we conclude that loop design qRT-PCRs are not suitable to reliably discriminate between miR-1246 and RNU2-1f. Furthermore, positive loop design qRT-PCR signals from cell lysates are likely only derived from RNU2-1f and not from miR-1246.

Figure 6.

Decline of RNU2-1f abundance following surgical treatment. RNU2-1f levels were monitored via qRT-PCR in three PDAC patients (Patients 1–3) and three CRC patients (Patients 4–6) at the indicated time points following surgical resection of the tumor. In all cases RNU2-1f abundance dropped below the diagnostic threshold of −2.995 (dashed line).

Stability of endogenous RNU2-1f in human serum

The fact that RNU2-1f can be readily detected in patient serum and plasma and that its level does not change upon prolonged storage of the blood at room temperature (Supporting Information Fig. 7) suggested that RNU2-1f is as stable in blood as miRNAs. It was previously shown that miRNAs are stable in plasma either because they are components of ribonucleoprotein complexes or because they are embedded within membrane vesicles such as exosomes.12–15 To unravel the basis for the observed RNU2-1f stability, we performed a protease digestion protocol previously applied to distinguish between these two possibilities. We found that RNU2-1f stability resembled the stability of a “vesicle type” let-7a miRNA upon protease treatment, whereas argonaute-bound miR-16 was highly susceptible to protease treatment (Supporting Information Fig. 8).13 In addition, we observed resistance of RNU2-1f toward nuclease treatment, as previously described for miRNAs (Supporting Information Fig. 8).12

Apoptosis induction raises abundance of RNU2-1f

We hypothesized that the highly stable RNU2-1f could originate from a vesicle-like structure such as apoptotic bodies. Therefore, we induced apoptosis in HCT116 and Panc1 cells via curcumin treatment. In line with our previous tests of RNU2-1f in a series of pancreatic and colorectal carcinoma cell line supernatants (Supporting Information Fig. 3B), RNase insensitive RNU2-1f was readily detectable under normal growth conditions (Cts between 21 and 22). Furthermore, its abundance was markedly increased in curcumin treated cells (Supporting Information Fig. 9). To further corroborate this finding, annexin V positive extracellular vesicles likely produced during apoptosis were enriched from the supernatant derived from HCT116 cells using an immune-affinity purification strategy. Again, RNU2-1f was readily detectable in this preparation (Ct 25) and its level was 16-fold increased following curcumin treatment (data not shown).

Detection of human pancreatic and colorectal carcinoma based on RNU2-1f prevalence in serum or plasma

Having validated the tumor-specific release of RNU2-1f in our mouse xenograft model, we sought to determine whether this new RNU2-1f biomarker candidate could be used in the human setting. Apart from PDAC, the tumor type in our original discovery strategy, we also included CRC, because it is the most frequent gastrointestinal cancer type and because many available diagnostic markers are indicative of more than one specific cancer type. In a small pilot study with 10 serum samples each from CRC patients, PDAC patients and healthy controls, RNU2-1f differentiated cancer from controls (data not shown). As a result, we set up a comprehensive analysis with a cohort of 361 serum and plasma samples (80 PDAC, 132 CRC, 20 colon adenomas and 129 controls) to more thoroughly assess the variability in RNU2-1f abundance among the various disease and control groups (Fig. 3, Supporting Information Fig. 10 and Tables 2–7). Because both serum and plasma samples are included in our study, a comparison of the RNU2-1f abundance in plasma and serum samples for both cancers and controls was performed in the samples where both serum and plasma were available (Supporting Information Fig. 12). This comparison revealed no statistically significant difference, either between the controls or between the UICC Stage II carcinoma samples, which was also reflected by similar mean expression values for cancer and control samples in both sets (Supporting Information Fig. 11). These data indicate, in agreement with the strong correlation reported by Mitchell et al. for miRNAs, that RNU2-1f abundance is largely similar in serum and plasma, and thus both sources can be used for testing RNU2-1f in patients. From the subsequent statistical analyses we excluded the 20 adenoma and 21 Stage I CRC cases because the RNU2-1f levels in their plasma were not different from the levels measured in controls (Fig. 3).
The remaining cohort of 320 samples was randomly assigned to a training set (N = 213) and a test set (N = 107), maintaining the proportion of healthy and diseased subjects in the overall study population. By maximizing the sum of sensitivity and specificity (Youden's index), we derived a threshold of −2.995 for the Ct(cel-54) − Ct(RNU2-1f) assay for the training set samples. We applied this model and threshold to the independent test set samples to derive unbiased performance estimates (Supporting Information Table 8). The threshold dichotomizes diagnostic calls as follows: a score greater than or equal to −2.995 classifies a sample as cancerous (diagnostic positive), whereas a score less than −2.995 classifies a sample as non-cancerous (diagnostic negative). Our results show sensitivity and specificity to be 97.7% [95% CI = (87.7, 99.9)] and 90.6% [95% CI = (80.7, 96.5)], respectively, with an area under the ROC curve of 0.972 (Fig. 4). In a next step, we analyzed the performance characteristics of our classifier on our test set for CRC and PDAC separately. Figure 5 shows the overall better performance of the assay for PDAC and still a very good separation of CRC from controls. This performance is reflected by the fact that our assay correctly classified 24/25 (96%) PDAC samples and 34/39 (87.2%) CRC (UICC Stages II–IV) samples contained in the test set. Combining the data from the training and test set and using the established diagnostic cutoff resulted in the correct classification of 78/80 (97.5%) PDAC samples and 92/111 (82.8%) CRC (UICC Stages II–IV) samples (Fig. 5).
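The decision rule and the exact (Clopper-Pearson) confidence interval used for the performance estimates can be sketched as follows. The study computed the intervals with the R package PropCIs; the Python version below is an independent illustration that finds the interval limits by bisection on the binomial distribution. Only the cutoff of −2.995 comes from the text; the function names, the example scores and the counts (42 correct calls out of 43) are hypothetical.

```python
from math import comb

CUTOFF = -2.995  # threshold reported in the text for Ct(cel-54) - Ct(RNU2-1f)


def classify(score, cutoff=CUTOFF):
    """Return True (diagnostic positive) when score >= cutoff."""
    return score >= cutoff


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(func, target, increasing):
    """Find p in [0, 1] with func(p) == target for a monotone func."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided confidence interval for the proportion k/n."""
    lower = 0.0 if k == 0 else _bisect(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2, True
    )
    upper = 1.0 if k == n else _bisect(
        lambda p: binom_cdf(k, n, p), alpha / 2, False
    )
    return lower, upper


# Hypothetical scores and counts, for illustration only.
calls = [classify(s) for s in (-1.2, -2.995, -3.4)]
lower, upper = clopper_pearson(42, 43)
print(calls, round(lower, 3), round(upper, 3))
```

The exact interval is asymmetric around the point estimate when the proportion is close to 1, which is why intervals for high sensitivities or specificities extend much further downward than upward.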

Influence of “other” diseases and age on RNU2-1f abundance in serum

Since our control collection contained not only healthy persons but also patients who were hospitalized for diseases other than cancer, we were able to ask whether other diseases, including acute or chronic inflammation, could lead to a false positive test result in our analysis. As a surrogate marker for inflammatory activity, we used C-reactive protein (CRP). As shown in Figure 3, neither the control status, i.e. healthy control (HC; Supporting Information Table 7) or diseased control (DC; Supporting Information Tables 5 and 6), nor the CRP level (Supporting Information Tables 5 and 6) had a strong influence on the mean normalized Ct value for RNU2-1f. In addition, the abundance of RNU2-1f is not age-dependent, as shown in Supporting Information Figure 9 for our control cohort.

Detection of human Stage II colorectal carcinoma based on RNU2-1f abundance in serum

An important finding of our analyses is the fact that CRCs can be detected via RNU2-1f as early as the UICC Stage II. The use of our test set and cutoff resulted in a correct classification of 17 of our 21 Stage II CRC samples (Fig. 5). This result is in line with our observation that altogether (training and test set) 34 of 42 (81%) UICC Stage II carcinomas were above our threshold of −2.995 (Fig. 5) and therefore correctly classified as cancer.

RNU2-1f abundance drops rapidly following surgical treatment

The serum bank in some instances comprised follow-up samples from cancer patients. For three patients, we were able to compare RNU2-1f levels upon tumor diagnosis and at days 30 or 200 after surgery; all of them showed a strong decline to normal levels. In addition, we were able to prospectively collect serum samples within a 5–14 day period following surgical resection of the tumor; all three again displayed a decline of RNU2-1f levels below the diagnostic threshold within this time period (Fig. 6).


Since the discovery that ncRNAs, such as miRNAs, are highly stable in the blood, the search for ncRNAs able to discriminate healthy subjects from cancer patients and/or provide prognostic information, such as likelihood of disease progression or even treatment response, has been an important focus in ncRNA research. Making use of a xenograft cancer model for pancreatic carcinoma, we identified ncRNA fragments derived from U2-snRNA (herein called RNU2-1f) to be released from PDAC into the mouse serum. These initial experiments suggested RNU2-1f to be a promising diagnostic marker for a blood-based test. Subsequently, we could separate PDAC and CRC patients from control subjects with high sensitivity and specificity by means of qRT-PCR quantification of serum RNU2-1f. Furthermore, for CRCs RNU2-1f abundance paralleled the tumor stage. Our additional finding that RNU2-1f levels drop rapidly following surgical removal of the tumor, together with our data from the animal model showing that RNU2-1f is only present in the circulation of tumor bearing mice, suggests that the rise in RNU2-1f abundance in the blood stream of cancer patients is caused either in part or completely by RNU2-1f release from the tumor mass.
We also observed a high stability of RNU2-1f in serum toward nuclease and a partial stability toward proteinase treatment, suggesting that RNU2-1f is not protected by binding to a protein such as Argonaute2, as shown for many miRNAs, but is more likely protected by its inclusion in a vesicle-like structure.13 One likely vesicle structure is the apoptotic body: apoptotic bodies contain RNPs (including U2-snRNA) and are released into peripheral blood via the compromised capillary network characteristic of tumor tissues.39, 40 Our finding that whole cell lysates of in vitro cultured cancer cells did not contain RNU2-1 fragments at a PCR-detectable level, whereas conditioned media from the same cells contained abundant RNU2-1 fragments, is consistent with the hypothesis that apoptosis and the production of vesicular blebs are required to enrich the cell culture media with RNU2-1 fragments. Furthermore, we were able to markedly raise the abundance of RNase-resistant RNU2-1f by inducing apoptosis in an in vitro system and to show that RNU2-1f is contained in the annexin V-positive extracellular vesicle fraction, which is likely produced during apoptotic processing steps. These data support a link between apoptosis and RNU2-1f biogenesis. They also add weight to the hypothesis that apoptotic bodies are one likely carrier of RNU2-1f, but they do not exclude other possible vesicular carrier systems for RNU2-1f, such as exosomes. In addition, the high prevalence of RNU2-1 fragments that were, according to our sequencing data, only a few base pairs longer than the so-called Sm protein binding region suggests that this region of the U2-snRNA molecule is less amenable to degradation, e.g., during the apoptotic snRNP degradation process.
Sm proteins form a heteroheptameric ring structure surrounding this sequence region and thus may protect it from degradation.41 Clearly, more work is required to fully characterize the mechanism of RNU2-1 processing to RNU2-1f and its release into the bloodstream.

Another important aspect of our study was the discovery that currently established assays for miR-1246 are cross-reactive with RNU2-1 fragments. Furthermore, in contrast to the published literature, our qRT-PCR data suggest that miR-1246 is either not expressed in cancer cell lines or present at best at a very low copy number.36, 37 Lastly, expressing a transcript containing a relevant part of the primary miR-1246 sequence failed to generate functional miR-1246 in a pancreatic carcinoma cell line. This is consistent with reports classifying miR-1246 as a pseudo-miRNA precursor.42, 43

The high sensitivity and specificity for detecting pancreatic cancer shown here for RNU2-1f places this marker in the top rank of currently published diagnostic markers for PDAC. Only plasma levels of miR-18a and the combination of two miRNAs (miR-16 and -196a) with CA19-9 have been described to reach a similar performance.23, 27 Importantly, the latter combination was also able to detect 85.2% of Stage I PDACs, whereas CA19-9 detected only 55.6% in the same collection.27 The number of Stage I PDACs was too low in our study population to evaluate the accuracy of the RNU2-1f assay in the setting of early PDAC. However, RNU2-1f abundance in the sera of the three Stage I and the three Stage II PDACs included in our study reached the diagnostic threshold set for PDAC in all instances. Sera from patients with chronic pancreatitis (CP), a known risk factor for PDAC, were not included in our sample. Therefore, we are currently collecting sera from patients with Stage I PDAC and from patients with CP to address the question of whether RNU2-1f, alone or in combination with other miRNAs or biomarkers such as CA19-9, has the potential to identify early stage PDAC (i.e., tumor size < 1 cm) and possibly also precursor lesions, and to separate CP from PDAC. Lastly, it will be interesting to test patients with jaundice in different clinical settings, such as CP, choledocholithiasis and PDAC, in order to address the influence of jaundice on the performance of our RNU2-1f assay.

Patients with UICC Stage II CRC treated by surgery without any adjuvant chemotherapy exhibit a cure rate reaching 87%.44 Under these circumstances, it is remarkable that our RNU2-1f assay was able to correctly identify the majority of CRCs as early as UICC Stage II, suggesting that it could be a new non-invasive screening tool for detecting early CRCs with a good prognosis in settings where the patient does not choose to undergo screening colonoscopy. Similar rates for detecting an early CRC stage have to date not been reported for non-invasive blood-based diagnostic strategies. We also found that the amount of RNU2-1f released into the circulation by smaller CRCs (< 1 cm) or adenomas is not sufficiently high to be detected in the blood stream above the background of control serum. Recently, miRNA analyses have been shown to be feasible also in stool.45, 46 Given the observed release of RNU2-1f from tumor cells and its high stability in serum, it seems plausible to test RNU2-1f abundance in patient stool, with the aim of improving detection rates for Stage I CRC and/or adenomas. Lastly, the decline in RNU2-1f abundance that we observed following surgical treatment suggests that RNU2-1f may also serve as a marker for early prediction of response to chemotherapy and radiation therapy regimens. Altogether, our study provides the rationale for future investigations of RNU2-1f as a diagnostic biomarker in large prospective clinical studies of CRC and PDAC risk groups, and for comparing its performance with currently used non-invasive tests.