Accurate discrimination of pancreatic ductal adenocarcinoma and chronic pancreatitis using multimarker expression data and samples obtained by minimally invasive fine needle aspiration

Abstract

To augment cytological diagnosis of pancreatic ductal adenocarcinoma (PDAC) in tissue samples obtained by minimally invasive endoscopic ultrasound-guided fine needle aspiration, we investigated whether a small set of molecular markers could accurately distinguish PDAC from chronic pancreatitis (CP). Expression levels of 29 genes were first determined by quantitative real-time RT-PCR in a training set of tissues in which the final diagnosis was PDAC (n = 20) or CP (n = 10). Using receiver operator characteristic curve analysis, we determined that the single gene with the highest diagnostic accuracy for discrimination of CP vs. PDAC in the training study was urokinase plasminogen activator receptor (UPAR; AUC value = 0.895, 95% CI = 0.728–0.976). In the set of test tissues (n = 14), the accuracy of UPAR decreased to 79%. However, we observed that the addition of 6 genes (EPCAM2, MAL2, CEA5, CEA6, MSLN and TRIM29; referred to as the 6-gene classifier) to UPAR resulted in high accuracy in both training and testing sets. Excluding 3 samples (out of 44; 7%) for which results of the UPAR/6-gene classifier were “undefined,” the accuracy of the UPAR/6-gene classifier was 100% in training samples (n = 29), 92% in 12 test samples (p = 0.004 that results were randomly generated; p = 0.046 that the UPAR/6-gene classifier was comparable to UPAR alone; χ2 test), 100% in 3 samples for which the initial cytological diagnosis was “suspicious” and 98% (40/41) overall. Our results provide evidence that molecular marker expression data can be used to augment cytological analysis. © 2006 Wiley-Liss, Inc.

Although the incidence of pancreatic ductal adenocarcinoma (PDAC) is only 1–2% and relatively low compared to other cancers, nearly all patients die from PDAC within 1–2 years.1 Because of its lethality, PDAC now ranks fourth as a cause of death from cancer. While smoking is the major known risk factor for this cancer, explaining 20–30% of all cases, two background diseases also increase the risk of pancreatic cancer: pancreatitis and diabetes. Chronic pancreatitis (CP) is a persistent form of pancreatitis that often presents as a discrete mass, typically of the pancreatic head.

Compared to the rest of the GI tract, a nonoperative biopsy specimen of the pancreas is difficult to obtain. Thus, surgical exploration is frequently used to reliably distinguish between CP and PDAC. However, because surgical resection of the head of the pancreas (Whipple procedure) has a 27–50% complication rate and a 2% mortality rate,2 differential diagnosis of CP and PDAC often relies on a combination of factors including the patient's symptoms, serological measurements of carbohydrate antigen 19-9 (CA 19-9),3 and radiological examination. However, since both PDAC and CP have abundant stroma due to an active desmoplastic reaction, each may present as a morphologically identical discrete mass in computed tomography (CT) or magnetic resonance imaging examination.4

At present, the most sensitive method for nonsurgical detection of PDAC involves obtaining pancreatic specimens through minimally invasive endoscopic ultrasound-guided fine needle aspiration (EUS-FNA), followed by cytopathological examination.5, 6 EUS-FNA is especially well-suited to visualize small lesions, and unlike other methods, the entire pancreas is readily imaged.7, 8, 9 Although the accuracy of EUS-FNA for detection of PDAC is relatively high due to the improvement of endoscopes and on site cytological assessment, there is still a relatively large number of cases in which the diagnosis remains problematic due to sample inadequacy, nondefinitive (suspicious) diagnosis or discordance in pathological interpretation.7, 10, 11 Difficulties in morphological differentiation of CP and PDAC are attributed to extensive fibrosis that is ultimately caused by a variety of genetic alterations, many of which are common between both disease states.12, 13, 14, 15, 16, 17

Recent developments using various molecular detection technologies including serial analysis of gene expression, cDNA microarrays, tissue arrays and the human genome-sequencing project have led to the discovery of multiple genes that are overexpressed in CP and PDAC.4, 17, 18, 19, 20, 21, 22 In the present study, we selected 29 genes known to be differentially expressed in PDAC or CP from published literature studies and from our own microarray analysis. The expression levels of these genes were measured in samples obtained by EUS-FNA using quantitative real-time RT-PCR analysis. We hypothesized that by focusing on the most highly informative genes, an algorithm could be developed that would accurately discriminate PDAC from CP. We found that a combination of 7 or fewer genes can accurately discriminate PDAC from CP in samples obtained by EUS-FNA.

Material and methods

Tissue specimens

The study was approved by the Institutional Review Board of the Medical University of South Carolina (MUSC) and all patients provided informed consent. EUS-FNA samples were obtained from 26 patients with pancreatic cancer and 18 patients with severe CP seen at the Digestive Disease Center at MUSC from July 2003 to February 2006. EUS-FNA was performed under conscious sedation with midazolam (0–5 mg) and meperidine (0–200 mg). Fine needle sampling was performed with a 22-gauge needle (Echo-tip, Wilson-Cook, Winston Salem, NC) under EUS and color Doppler with a curved linear array echoendoscope (GF-UC30P or GF-UCT30, Olympus America, Melville, NY). Pancreatic tissues were punctured with an occluding stylet in place, which minimizes contamination with tissues (e.g., intestinal wall) that were traversed in the process of obtaining the FNA. After insertion of the needle into the pancreas the stylet was removed, the needle was moved to and fro for ∼30 sec and then withdrawn. Needle samples were expressed using a 10 cm³ air-filled syringe onto a separate glass slide and a direct smear was made by an on-site cytotechnician. If an adequate amount of material was available (typically ∼80% of cases), a second slide was prepared, fixed in alcohol and stained later. For on-site evaluation, each slide was air-dried and direct smears were prepared for immediate interpretation by staining with a Romanowsky stain (Diff-Quik). On-site evaluation of smears was performed to assess cellular adequacy and the presence of malignancy. If no cellular malignancy was detected, another EUS-FNA sample was obtained from the same plane as the first. This procedure was repeated until cellular malignancy was detected, or until a maximum of 7 passes had been obtained. The final pass was used exclusively for molecular analysis. Final interpretation of all material consisted of reviewing the slides prepared on-site and stained later, and examination of thin-layer cytology or cytospin/cell block material prepared from the cellular material kept in the Hank's solution.

RNA isolation

Total cellular RNA was isolated from EUS-FNA samples reserved for molecular analysis as described above using 1 mL guanidinium thiocyanate-phenol-chloroform solution (RNA STAT-60®; TEL-TEST, Friendswood, TX). Tissues were homogenized using a model 395 Type 5 polytron (Dremel, Racine, WI) only if their weight was ≥0.1 g. Total RNA was isolated as per the manufacturer's instructions with the exception that 1 μL of a 50 mg/mL solution of glycogen (Sigma, St. Louis, MO) was added to the aqueous phase immediately prior to addition of isopropanol. Final RNA pellets were dissolved in 25 μL of DEPC-treated water. RNA yield was determined by spectroscopy, and typically ranged from 2.5 to 10 μg/FNA sample. Complementary DNA was typically made from 3 μL of total RNA using M-MLV reverse transcriptase (Promega, Madison, WI) and oligo d(T).12, 13, 14, 15, 16

Real-time RT-PCR

Real-time RT-PCR was performed on a Gene Amp 7300 or 7500 Sequence Detection System (PE Biosystems, Foster City, CA). With the exception of the SYBR Green I master mix (purchased from Qiagen, Valencia, CA), all reaction components were purchased from PE Biosystems. Standard reaction volume was 10 μL and contained 1× SYBR RT-PCR buffer, 3 mM MgCl2, 0.2 mM each of dATP, dCTP and dGTP, 0.4 mM dUTP, 0.1 U UngErase enzyme, 0.25 U AmpliTaq Gold, 0.35 μL cDNA template and 50 nM of oligonucleotide primer. Initial steps of RT-PCR were 2 min at 50°C for UngErase activation, followed by a 15-min hold at 95°C. Cycles (n = 40) consisted of a 15-sec melt at 95°C, followed by a 1-min annealing/extension at 60°C. The final step was a 60°C incubation for 1 min. All reactions were performed in triplicate. The primer pairs for the gene analysis are listed in Supplemental Table A. For a given real-time RT-PCR sample, the point at which fluorescence starts to increase above background in the PCR reaction is referred to as the cycle threshold (Ct) value. The Ct value is therefore inversely related to the amount of specific mRNA species in the original tissue sample. In our analyses, results were normalized to a reference control gene (β2-microglobulin; β2M) by subtracting the Ct value of β2M from the Ct value of a candidate gene measured in the same sample (ΔCt). High ΔCt values are correlated with low levels of gene expression, whereas low ΔCt values are correlated with high levels of gene expression.
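The normalization step can be illustrated with a minimal Python sketch. It assumes that triplicate Ct readings are averaged before subtraction (as stated in the legend to Figure 1) and, for the relative expression value, that PCR efficiency is close to 100%. The Ct readings and the function name `delta_ct` are hypothetical and are not taken from the study.

```python
from statistics import mean


def delta_ct(target_cts, reference_cts):
    """Return dCt = mean Ct(target gene) - mean Ct(reference gene) for one sample."""
    return mean(target_cts) - mean(reference_cts)


# Hypothetical triplicate Ct readings for a single FNA sample (not study data).
upar_cts = [27.1, 27.3, 27.2]
b2m_cts = [20.0, 20.2, 20.1]

d_ct = delta_ct(upar_cts, b2m_cts)

# Assuming ~100% amplification efficiency, expression relative to the
# reference gene is 2 ** -dCt, so a lower dCt means higher expression.
relative_expression = 2 ** -d_ct

print(f"dCt = {d_ct:.2f}, relative expression = {relative_expression:.4f}")
```

Because ΔCt is a difference of cycle numbers, a change of 1 unit corresponds to an approximately 2-fold change in expression under the efficiency assumption above.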

Statistical analysis

To evaluate and compare the performance of the urokinase plasminogen activator receptor (UPAR)/6-gene classifiers studied in this research, bootstrap procedures were employed to estimate the statistics of interest, namely, the expected error rate, sensitivity, specificity, positive predictive value and accuracy. The bootstrap is a well-established computational statistical method for estimating the reliability of an estimator.23 It can be used to estimate the distribution of an estimator, e.g., the error rate of a classifier, based on N observed cases X = {X1, X2, … , XN}, which are distributed according to an unknown probability distribution F. This is achieved by drawing a total of B bootstrap samples of size N from the original data by sampling with replacement. A statistic, ŝb, is then computed from each bootstrap sample, and the collection of these estimates is used to derive accuracy measures for the estimator, such as its confidence interval and expected value under the original probability distribution F.

Classification performance metrics

We defined the true positive (tp) rate as the fraction of samples that were predicted by molecular analysis as “PDAC” and labeled as “PDAC” by cytological analysis, the false positive (fp) rate as the fraction of samples that were predicted as “PDAC” by molecular analysis but labeled as “CP” by cytological analysis, the true negative (tn) rate as the fraction of samples predicted as “CP” and labeled as “CP,” and the false negative (fn) rate as the fraction of samples predicted as “CP” but labeled as “PDAC.” Positive predictive value, sensitivity, specificity, accuracy and error rate were defined as follows:

$$\text{Positive predictive value} = \frac{tp}{tp + fp}$$

$$\text{Sensitivity} = \frac{tp}{tp + fn}$$

$$\text{Specificity} = \frac{tn}{tn + fp}$$

$$\text{Accuracy} = \frac{tp + tn}{tp + fp + tn + fn}$$

$$\text{Error} = \frac{fp + fn}{tp + fp + tn + fn} = 1 - \text{Accuracy}$$
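As a worked illustration of these metrics, the following Python sketch computes them from raw counts rather than fractions (the ratios are identical). The function name `classification_metrics` is hypothetical, and the error rate is taken as 1 − accuracy, which is an assumption of this sketch. The counts used are those reported in the Results for UPAR alone at a ΔCt threshold of 8.3 in the training set (19 of 20 samples below the threshold were PDAC; 9 of 10 samples above it were CP).

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute the performance metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    return {
        "ppv": tp / (tp + fp),          # positive predictive value
        "sensitivity": tp / (tp + fn),  # PDAC samples called PDAC
        "specificity": tn / (tn + fp),  # CP samples called CP
        "accuracy": accuracy,
        "error": 1 - accuracy,          # assumed definition: 1 - accuracy
    }


# UPAR alone, training set: 19 PDAC and 1 CP below the threshold,
# 9 CP and 1 PDAC above it.
metrics = classification_metrics(tp=19, fp=1, tn=9, fn=1)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts the sketch gives a sensitivity of 0.95 and a specificity of 0.90, matching the values quoted for UPAR in the training set.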

Estimation of confidence intervals

With B bootstrap estimates derived by sampling, the estimates were sorted and a confidence interval with a Type-I error α was obtained by finding the values that corresponded to the α/2 and (1 − α/2) percentiles of the sorted estimates. For example, if α = 10%, one can derive a 90% confidence interval by finding the values that correspond to the 5th and 95th percentiles of the sorted estimates.
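The percentile procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the study: the per-sample coding (1 = misclassified, 0 = correct), the number of resamples (B = 2000), the fixed random seed and the function names are all hypothetical. The 41-case input with a single error simply mirrors the 40/41 overall result reported in the Abstract.

```python
import random


def bootstrap_ci(outcomes, statistic, n_boot=2000, alpha=0.10, seed=0):
    """Percentile bootstrap: return (mean estimate, lower bound, upper bound)."""
    rng = random.Random(seed)
    n = len(outcomes)
    # Draw n_boot resamples of size n with replacement and sort the estimates.
    estimates = sorted(
        statistic([outcomes[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return sum(estimates) / n_boot, lower, upper


def error_rate(sample):
    """Fraction of misclassified cases in a (re)sample."""
    return sum(sample) / len(sample)


# Illustrative input: 41 classified cases, one of them misclassified.
outcomes = [1] + [0] * 40
mean_err, ci_low, ci_high = bootstrap_ci(outcomes, error_rate)

print(f"mean error = {mean_err:.3f}, 90% CI = {ci_low:.3f}-{ci_high:.3f}")
```

The interval bounds depend on B, the seed and the exact percentile indexing convention, so they should not be expected to reproduce the published values exactly.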

Results

Pancreatic tissue specimens (n = 44) were obtained by EUS-FNA from 44 patients, of whom 17 were diagnosed using cytological criteria as CP, 3 as “suspicious” and 24 as PDAC. Of the 3 patients diagnosed as suspicious, 2 were determined to have PDAC after surgical resection. Of the 17 patients diagnosed with CP, 2 underwent further staging/surgical procedures. One underwent a Whipple procedure for removal of suspected PDAC and was determined to be free of PDAC. A core biopsy from the other patient revealed the presence of PDAC. Thus, the final tissue diagnosis of the 44 patients was 27 with PDAC and 17 with CP (Table I). The high rate of suspicious or discrepant tissue diagnoses (5/20 (25%)) for our cohort of presumed or suspected CP patients (n = 20) emphasizes the need for objective tissue diagnostic criteria. Compared to the CP population, those with a final tissue diagnosis of PDAC included a significantly higher percentage of white (vs. black; p = 0.017) and older (>60 years; p = 0.021) patients (Table I).

Table I. Patient Information

| Characteristic | CP¹ | PDAC | p² |
| --- | --- | --- | --- |
| Gender | | | |
| Male | 13 (76) | 18 (67) | 0.770 |
| Female | 4 (24) | 9 (33) | |
| Ethnic group | | | |
| White | 10 (59) | 22 (81) | 0.017 |
| Black | 7 (41) | 5 (19) | |
| Age | | | |
| 40–50 | 5 (29) | 5 (19) | 0.021³ |
| 51–60 | 5 (29) | 5 (19) | |
| 61–70 | 5 (29) | 8 (30) | |
| >70 | 2 (18) | 9 (33) | |
| Tumor size | | | |
| T1 | N/A⁴ | 0 (0) | N/A |
| T2 | N/A | 2 (7) | |
| T3 | N/A | 12 (44) | |
| T4 | N/A | 13 (48) | |

¹ Final tissue diagnosis of chronic focal pancreatitis; values for no. of patients and (%) are listed.
² χ2 test result for statistical differences between CP and PDAC.
³ ≤60 vs. >60.
⁴ Not applicable.

Of the 44 specimens, 7 CP and 7 PDAC samples were randomly selected and set aside for the testing phase of our study. The remaining 30 samples (20 PDAC, 10 CP) from 30 patients were used for training. All tissues used for training were evaluated by real-time RT-PCR using a panel of 30 genes, 27 of which were selected on the basis of overexpression in PDAC or metastatic cancer (Table II). The majority of the 27 genes were selected on the basis of overexpression in microarray data obtained from multiple laboratories. In general, only genes whose expression was at least 5-fold higher in PDAC compared to either normal or CP tissue were considered. Other genes were selected on the basis of immunohistochemical expression data, while a small number were selected based on biological function. Two pancreas-specific genes (PNLIPRP2 and CTRB1) served as negative controls and were selected on the basis of decreased expression in PDAC compared to CP (Table II). For an internal reference control gene, we evaluated β2-microglobulin, β-actin, and GAPDH. In the set of training tissues, we observed that the correlation coefficient (R²) between β2-microglobulin and β-actin was 0.948; the R² value for β2-microglobulin and GAPDH was 0.715, while the R² value for β-actin and GAPDH was 0.707. On the basis of these results, we concluded that β2-microglobulin or β-actin could be used as an internal reference control gene. The results described below used the β2-microglobulin gene as an internal reference control for normalization of gene expression.

Table II. Diagnostic Accuracy of Individual Genes for Discrimination of CP vs. PDAC

| Gene | Accession no. | Sens.¹ | Spec.² | AUC³ | Lower limit⁴ | Upper limit⁴ | p⁵ | References⁶ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| UPAR⁷ | U08839 | 95 | 90 | 0.895 | 0.728 | 0.976 | <0.0001 | 4, 20, 21, 24, 25, 26, 27 |
| SHH | NM_000193 | 75 | 80 | 0.827 | 0.646 | 0.939 | 0.0002 | 18 |
| MUC4 | NM_018406 | 85 | 70 | 0.812 | 0.628 | 0.930 | 0.0006 | 4, 28, 29, 30 |
| MSLN | NM_005823 | 80 | 78 | 0.806 | 0.617 | 0.927 | 0.0016 | 4, 19, 20 |
| CTRB1 | NM_001906 | 85 | 80 | 0.802 | 0.617 | 0.924 | 0.0002 | 31 |
| TRIM29 | L24203 | 60 | 100 | 0.795 | 0.609 | 0.919 | 0.0018 | 4, 20 |
| DAG1⁷ | NM_004393 | 65 | 90 | 0.787 | 0.600 | 0.914 | 0.0027 | |
| MAL2⁷ | NM_052886 | 65 | 90 | 0.785 | 0.597 | 0.913 | 0.0030 | 21 |
| EpCAM2⁷ | NM_002353 | 50 | 100 | 0.770 | 0.581 | 0.903 | 0.0061 | 4, 19, 20 |
| CEA5 | M29540 | 75 | 80 | 0.765 | 0.575 | 0.899 | 0.0076 | 4, 19, 20, 21, 22 |
| TSPAN1 | NM_005727 | 40 | 100 | 0.745 | 0.554 | 0.885 | 0.0162 | 4, 19, 20, 21, 22 |
| SNCG | NM_003087 | 85 | 60 | 0.737 | 0.546 | 0.880 | 0.0210 | 21, 32 |
| MMP19⁷ | NM_002429 | 85 | 60 | 0.733 | 0.540 | 0.876 | 0.0247 | |
| Survivin | U75285 | 60 | 90 | 0.732 | 0.540 | 0.876 | 0.0247 | 33, 34 |
| CEA6⁷ | M18728 | 50 | 90 | 0.730 | 0.538 | 0.874 | 0.0267 | 4, 19, 20, 22 |
| S100P⁷ | X65614 | 45 | 100 | 0.727 | 0.535 | 0.873 | 0.0288 | 4, 19, 20, 21, 22 |
| PDEF⁷ | AF071538 | 70 | 70 | 0.705 | 0.511 | 0.856 | 0.0542 | 4, 19, 35 |
| ELF3⁷ | NM_004433 | 95 | 40 | 0.692 | 0.498 | 0.847 | 0.0738 | |
| PNLIPRP2⁷ | M93283 | 95 | 50 | 0.687 | 0.493 | 0.843 | 0.0592 | 4, 19, 20, 22 |
| MUC1⁷ | NM_002456 | 70 | 70 | 0.655 | 0.460 | 0.818 | 0.1611 | 17 |
| IHH | L38517 | 55 | 85 | 0.645 | 0.450 | 0.810 | 0.1924 | 22, 36 |
| SPARC | NM_003118 | 55 | 90 | 0.640 | 0.445 | 0.806 | 0.2094 | 37, 38 |
| FOXM1 | NM_202002 | 35 | 90 | 0.625 | 0.430 | 0.794 | 0.2657 | 39 |
| CK19⁷ | Y00503 | 30 | 100 | 0.620 | 0.426 | 0.790 | 0.2863 | 4, 19, 20, 21, 22 |
| FXYD3⁷ | X93036 | 40 | 100 | 0.615 | 0.421 | 0.786 | 0.3077 | 19, 20, 21, 22, 40 |
| TFF1⁷ | X52003 | 55 | 80 | 0.595 | 0.401 | 0.769 | 0.4024 | 4, 19, 20, 21, 22 |
| XAG⁷ | NM_006408 | 35 | 100 | 0.570 | 0.377 | 0.748 | 0.5392 | 19, 21 |
| EpCAM1⁷ | NM_002354 | 55 | 80 | 0.565 | 0.373 | 0.744 | 0.5688 | |
| Maspin⁷ | U04313 | 85 | 30 | 0.465 | 0.282 | 0.655 | 0.7556 | 20 |

¹ Sensitivity (%) for detection of PDAC using optimum threshold values determined by MedCalc software.
² Specificity (%) for detection of CP using optimum threshold values determined by MedCalc software.
³ Area under the ROC curve values were obtained using MedCalc Software (MedCalc, Mariakerke, Belgium).
⁴ 95% confidence interval values.
⁵ Probability that the respective AUC value ≤0.5.
⁶ With the exception of PNLIPRP2 and CTRB1 (which are downregulated in PDAC), genes are upregulated in PDAC according to the indicated references.
⁷ Genes identified from unpublished microarray data as highly overexpressed in pancreatic metastatic cancer.

The expression of a test gene relative to β2-microglobulin is measured in ΔCt units, a log2-based scale that is inversely related to mRNA expression levels. In the set of training tissues, we observed that although the ratio of mean expression ΔCt values of the 27 test genes in CP tissues compared to PDAC was 0.36, the ratio of expression levels for the negative control CTRB1 and PNLIPRP2 genes in the respective tissues was much higher (ratios = 310 and 14, respectively; not shown). This result provides evidence that in general, the expression levels of the selected genes are consistent with previous studies.

The diagnostic accuracy of UPAR for discrimination of CP vs. PDAC is 0.895

Of the 27 test genes, we observed that UPAR had the highest diagnostic accuracy based on ROC curve analysis and was overexpressed in the majority of PDAC samples compared to CP (Table II). Of those samples for which the ΔCt value for UPAR was below 8.3 (n = 20), 19 out of 20 (95%) samples were PDAC. In contrast, for samples where the UPAR ΔCt value was above 8.3 (n = 10), 9 samples (90%) were CP. Thus, using the UPAR marker alone at a ΔCt threshold of 8.3, sensitivity and specificity values for diagnosis of PDAC vs. CP were 95% and 90%, respectively (Table II; see also Fig. 1).
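To make the relationship between the AUC and a single ΔCt threshold concrete, the sketch below computes the empirical area under the ROC curve with the rank-based (Mann–Whitney) formulation and then evaluates sensitivity and specificity at the 8.3 cutoff. The AUC values in this study were obtained with MedCalc; the rank-based calculation here is a stand-in chosen for brevity, and the ΔCt values and function names are hypothetical.

```python
def roc_auc(pdac_values, cp_values):
    """Empirical AUC: probability that a PDAC dCt is lower than a CP dCt.

    Lower dCt means higher expression, so PDAC samples are expected to
    have lower values for an overexpressed marker. Ties count as half.
    """
    wins = 0.0
    for p in pdac_values:
        for c in cp_values:
            if p < c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(pdac_values) * len(cp_values))


def sens_spec_at_threshold(pdac_values, cp_values, threshold):
    """Call a sample PDAC when its dCt is below the threshold."""
    sensitivity = sum(v < threshold for v in pdac_values) / len(pdac_values)
    specificity = sum(v > threshold for v in cp_values) / len(cp_values)
    return sensitivity, specificity


# Hypothetical UPAR dCt values (not study data).
pdac = [5.9, 6.4, 7.0, 7.6, 8.9]
cp = [7.9, 8.8, 9.5, 10.2]

auc = roc_auc(pdac, cp)
sens, spec = sens_spec_at_threshold(pdac, cp, threshold=8.3)

print(f"AUC = {auc:.2f}, sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

The AUC summarizes performance over all possible thresholds, whereas sensitivity and specificity describe one chosen operating point, which is why both are listed for each gene in Table II.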

Figure 1.

Real-time RT-PCR measurements of genes whose diagnostic accuracy for discrimination of CP vs. PDAC is ≥0.72. Real-time RT-PCR was performed as described in Materials and Methods using the indicated primer pairs on cDNA prepared from EUS-FNA samples in which the final tissue diagnosis was PDAC (diamonds; left side of each panel; n = 20) or CP (circles; right side of each panel; n = 10). Only genes for which the AUC value for discrimination of CP vs. PDAC was ≥0.72 are shown (Table II). ΔCt values were obtained by subtracting the mean Ct value of β2-microglobulin from the mean Ct value for each respective gene in triplicate reactions. The numbers of samples for which ΔCt values were >20 (i.e., gene expression was minimal or undetectable) are listed above the gene names. The filled circle represents the single CP sample whose UPAR ΔCt value was <8.3, while the filled diamond represents the single PDAC sample whose UPAR ΔCt value was >8.3. The final diagnostic disposition of each gene is indicated at the top of the figure.

To determine whether UPAR could be used for accurate molecular discrimination, we used 2 approaches. First, we determined in a test set whether UPAR alone was a reliable marker. In the set of test tissues (n = 14), we observed that UPAR alone accurately classified 11/14 (79%) samples, a value that was significantly higher than expected by chance (p = 0.032; χ2 test). This result provides evidence that UPAR alone can be used for classification of FNA samples. In the second approach, we analyzed the training set to determine whether any genes could be combined with UPAR to correctly classify the misclassified CP and PDAC samples. For reclassification of the CP sample whose UPAR was <8.3 (sample no. 15), we calculated the difference in gene expression levels (ΔΔCt values) between sample no. 15 and the mean of the PDAC samples whose UPAR was <8.3. Potential discriminatory genes were identified based on (i) a calculated ΔΔCt value >4 (i.e., a >16-fold difference in gene expression), (ii) detection of a signal in all PDAC samples and (iii) an AUC value for discrimination of CP vs. PDAC ≥ 0.72. Using these criteria, we identified MAL2, EPCAM2, CEA5, TRIM29, CEA6 and MSLN; summing the ΔCt values of these 6 genes yielded a “molecular score” used for diagnosis (Fig. 2). Similar approaches have been described for other systems.41 For reclassification of the PDAC sample whose UPAR was >8.3 (sample no. 6; shown as a filled diamond in Fig. 1), we calculated the difference in gene expression levels between sample no. 6 and the mean of the CP samples whose UPAR was >8.3. Using a selection procedure analogous to that described above, we identified MAL2 as a potential classifier gene. When the set of 6 markers was combined with UPAR (referred to as the UPAR/6-gene classifier), we were able to accurately classify all 29 samples for which a definitive molecular score was obtained (29/29 = 100%; Fig. 2). One of the 30 samples in the training set (3%) was classified as “undefined,” a category reserved for those samples bearing a molecular score between PDAC and CP (Fig. 2).
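The three selection criteria can be written as a short filter. The following Python sketch is illustrative only: the ΔCt values are hypothetical, the sign convention (misclassified sample minus group mean) is an assumption, and "detection of a signal" is approximated here as ΔCt ≤ 20, following the Figure 1 legend in which ΔCt > 20 denotes minimal or undetectable expression. The AUC values are taken from Table II.

```python
from statistics import mean


def select_candidates(misclassified, group, auc,
                      min_ddct=4.0, min_auc=0.72, undetected=20.0):
    """Return (gene, ddCt, fold difference) for genes meeting all criteria."""
    selected = []
    for gene, group_values in group.items():
        # ddCt: misclassified sample minus the mean of the comparison group.
        ddct = misclassified[gene] - mean(group_values)
        # Assumed detection rule: dCt <= 20 in every sample of the group.
        detected_in_all = all(v <= undetected for v in group_values)
        if ddct > min_ddct and detected_in_all and auc[gene] >= min_auc:
            # One dCt unit is ~2-fold, so ddCt > 4 is a > 16-fold difference.
            selected.append((gene, ddct, 2 ** ddct))
    return selected


# Hypothetical dCt values for the misclassified CP sample and for PDAC samples.
sample_dct = {"MAL2": 14.0, "MSLN": 15.5, "CK19": 16.0}
pdac_dct = {
    "MAL2": [8.0, 9.0, 10.0],
    "MSLN": [10.0, 11.0, 12.0],
    "CK19": [9.0, 10.0, 11.0],
}
auc_values = {"MAL2": 0.785, "MSLN": 0.806, "CK19": 0.620}  # from Table II

selected = select_candidates(sample_dct, pdac_dct, auc_values)
for gene, ddct, fold in selected:
    print(f"{gene}: ddCt = {ddct:.1f}, fold difference = {fold:.0f}")
```

In this toy input CK19 shows a large ΔΔCt but is rejected because its AUC falls below 0.72, which illustrates why the criteria are applied jointly.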

Figure 2.

Molecular discrimination of PDAC vs. CP. Labels with numbers describe the 3 steps involved in molecular diagnosis. (a) Algorithm used for molecular classification of PDAC and CP. Molecular discrimination of PDAC vs. CP is first based on expression levels of the UPAR gene. For samples in which the ΔCt value of UPAR is <8.3, expression levels of the 6 indicated genes are added to determine whether the sample is classified as PDAC or CP. For samples in which the ΔCt value is >8.3, diagnosis of PDAC vs. CP is based on expression levels of the Mal2 gene. (b) Diagnosis of PDAC and CP using multimarker analysis. ΔCt values of the genes listed in (a) were added and plotted as a function of cytological diagnosis (PDAC, CP, or suspicious) and UPAR expression levels (>8.3 or <8.3) for training (n = 30; 20 PDAC and 10 CP) and test (n = 14; 7 PDAC and 7 CP) cases. Symbol type reflects final tissue diagnosis, where triangles = PDAC and squares = CP. Thick horizontal lines indicate thresholds for molecular analysis determined from training samples. Molecular diagnosis of individual tissues is based on the criteria indicated to the right of the test set data. *Sample from patient with a suspicious CT that underwent a Whipple procedure. Final tissue diagnosis was CP. **Sample was diagnosed as PDAC by molecular criteria but CP by cytological evaluation.
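The decision procedure in panel (a) can be sketched as follows. Only the UPAR cutoff of 8.3 and the gene list come from the text; the score thresholds that separate the PDAC, undefined and CP ranges are shown in the figure rather than stated numerically, so the values used below are hypothetical placeholders. The existence of an undefined zone in the MAL2 branch and the handling of a UPAR ΔCt of exactly 8.3 are likewise assumptions of this sketch.

```python
SIX_GENES = ("EPCAM2", "MAL2", "CEA5", "CEA6", "MSLN", "TRIM29")
UPAR_THRESHOLD = 8.3  # dCt cutoff for UPAR stated in the text


def classify(dct, score_pdac_max, score_cp_min, mal2_pdac_max, mal2_cp_min):
    """Classify one sample from its dCt values as PDAC, CP or undefined.

    All four threshold arguments are hypothetical; they must be read from
    training data. A lower score means higher expression, hence PDAC.
    """
    if dct["UPAR"] < UPAR_THRESHOLD:
        # High UPAR expressers: sum the dCt values of the six genes.
        score = sum(dct[gene] for gene in SIX_GENES)
        low, high = score_pdac_max, score_cp_min
    else:
        # Low UPAR expressers (dCt >= 8.3 here, by assumption): use MAL2 alone.
        score = dct["MAL2"]
        low, high = mal2_pdac_max, mal2_cp_min
    if score <= low:
        return "PDAC"
    if score >= high:
        return "CP"
    return "undefined"


# Hypothetical thresholds and samples, for illustration only.
thresholds = dict(score_pdac_max=72.0, score_cp_min=77.0,
                  mal2_pdac_max=11.0, mal2_cp_min=13.0)

high_upar = {"UPAR": 7.0, "EPCAM2": 10, "MAL2": 11, "CEA5": 12,
             "CEA6": 12, "MSLN": 12, "TRIM29": 11}
low_upar = {"UPAR": 9.0, "MAL2": 14.0}

print(classify(high_upar, **thresholds))  # six-gene sum is 68
print(classify(low_upar, **thresholds))   # MAL2 branch
```

Keeping an explicit undefined band between the two thresholds means that borderline samples are reported as such instead of being forced into one of the two diagnoses.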

When the UPAR/6-gene classifier was used for the testing set, a definitive molecular score was obtained for 12 samples, of which 11 were correctly classified (92%; p = 0.004; χ2 analysis). The accuracy rate of the UPAR/6-gene classifier was significantly higher compared to UPAR alone, but only at a p value of 0.046 (χ2 test). Of the 2 samples that were diagnosed as undefined by the UPAR/6-gene classifier, 1 (sample no. 37; molecular score = 74.4) was originally diagnosed as CP by cytological criteria. However, a subsequent core biopsy performed independent of molecular analysis revealed the presence of PDAC. Hence, the undefined diagnosis from molecular analysis was consistent with the varied pathological analysis of this sample. We conclude from the analysis of the test data set that the accuracy of the UPAR marker alone is high, but the accuracy of the UPAR/6-gene classifier set is slightly higher.
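The p values for comparison against chance can be checked with a short calculation. The sketch below assumes a one-degree-of-freedom χ2 goodness-of-fit test against a 50:50 expectation with no continuity correction; the text does not state the exact test configuration, so this is an inference, although it gives values close to those reported (0.032 for 11/14 and 0.004 for 11/12). The function name is hypothetical. The comparison between the two classifiers (p = 0.046) is not reproduced here because its construction is not described.

```python
import math


def chi2_vs_chance(correct, total):
    """Chi-square goodness-of-fit of correct/incorrect counts against 50:50.

    Returns (chi-square statistic, p value) for 1 degree of freedom.
    """
    expected = total / 2
    wrong = total - correct
    chi2 = ((correct - expected) ** 2 + (wrong - expected) ** 2) / expected
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


chi2_upar, p_upar = chi2_vs_chance(correct=11, total=14)   # UPAR alone
chi2_cls, p_cls = chi2_vs_chance(correct=11, total=12)     # UPAR/6-gene classifier

print(f"UPAR alone: chi2 = {chi2_upar:.2f}, p = {p_upar:.4f}")
print(f"UPAR/6-gene: chi2 = {chi2_cls:.2f}, p = {p_cls:.4f}")
```

With only 12 to 14 test samples the expected counts are small, so an exact binomial test would be a reasonable alternative to the χ2 approximation.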

Because the accuracy rate of the UPAR/6-gene classifier set was higher in the test set compared to UPAR alone, we used bootstrap analysis to estimate its error rate as described in Materials and Methods. We determined that the mean error rate of the UPAR/6-gene classifier set was 0.02 (90% CI between 0 and 0.0682) with a sensitivity rate of 100% and a specificity rate of 94% (Table III). However, we note that since the 44 samples used in the bootstrap experiment comprise both training and test cases, the estimated performance metrics reflect a mixture of training and test error, and we may expect a higher error rate when these rules are applied to new cases.

Table III. Estimated Parameters of Molecular Assay using the UPAR/6-Gene Classifier As Determined By Bootstrapping Measurements of Entire Dataset

| Parameter | Mean |
| --- | --- |
| Positive predictive value | 0.96 |
| Sensitivity | 1.00 |
| Specificity | 0.94 |
| Accuracy | 0.98 |
| Error | 0.02¹ |

¹ 90% confidence interval = 0–0.0682.

Discussion

Of the genes examined in the present study, we observed that UPAR had the highest accuracy for discrimination of CP vs. PDAC (Table II). The diagnostic accuracy of UPAR alone for discrimination of CP vs. PDAC was 0.895 in the training study, and 0.856 for the entire data set (95% CI = 0.717–0.943). These results indicate that the UPAR gene is informative for diagnosis of CP vs. PDAC. UPAR encodes the receptor for urokinase plasminogen activator (UPA), which is a serine proteinase. It is involved in localizing and promoting cell-surface plasminogen activation, plasmin formation and localized degradation of the extracellular matrix.42 Overexpression of UPAR has been found in many cancers23, 43, 44, 45, 46 including pancreatic cancer, and contributes to tumor invasion and metastasis.24, 25 Recent studies by Ellis and colleagues have shown that pancreatic metastatic disease is dependent upon the UPA/UPAR system, and that system activation and subsequent metastasis (i) are mediated by insulin-like growth factor-1 and hepatocyte growth factor and (ii) can be inhibited by monoclonal antibodies against the protein product of UPAR.47 Interestingly, studies in colon cancer have shown that of the different cell types present at the invasive margin of tumors, UPAR was expressed most highly by macrophages.48 Thus, the high levels of UPAR observed in PDAC samples obtained by FNA in the present study may result from expression not only by tumor cells, but also by cell types that are recruited by the tumor.

As determined by bootstrap analysis, the overall accuracy of discrimination was increased to 0.980 (90% CI = 0.912–1.00) by the addition of a 6-gene classifier set to UPAR (referred to as the UPAR/6-gene classifier). In our set of test tissues (n = 14), the UPAR/6-gene classifier outperformed UPAR alone, but only at a p value of 0.046 (χ2 test). In the UPAR/6-gene classifier strategy, we first used a decision tree in which samples were classified as high UPAR expressers (ΔCt value < 8.3) or low UPAR expressers (ΔCt value > 8.3). If a sample expressed UPAR at high levels, discrimination of PDAC vs. CP was accomplished by simply adding the scores of 6 genes (EpCAM2, CEA5, CEA6, Trim29, MSLN and Mal2) as shown in Figure 2. If the sample expressed UPAR at a relatively low level, the Mal2 gene was used for classification. First, because of the small sample size used in the present study, we acknowledge that 1 or more of the 6 genes in this classifier set are probably not necessary due to functional redundancy. Second, we acknowledge that some of the genes that were discarded from consideration, such as CTRB1 or SHH, may prove to be very informative markers. Additional studies are needed to fully validate the 6 markers used for classification. Nonetheless, we note that overexpression of many of the genes in the panel, such as Mal2 (ref. 49), EpCAM2 (ref. 50), CEA6 (refs. 51, 52, 53, 54) and CEA5 (refs. 55, 56), is prognostic for other cancers.

Despite the limitations of the current approach, the molecular assay described in this report appears to be accurate and robust, and may aid cytological diagnosis used for clinical decisions. For example, the UPAR/6-gene classifier was able to correctly classify all 3 samples in which the original cytological diagnosis from EUS-FNA was “suspicious.” Further, while EUS-FNA sample no. 37 was diagnosed by cytological criteria as CP, molecular analysis indicated an undefined pathology. A subsequent core biopsy of patient no. 37 was diagnosed as PDAC, a result consistent with the molecular diagnosis. Finally, in further support of our molecular assay, we note that the molecular score for sample no. 39 (marked as “*” in Fig. 2b) was 78.4, a value clearly in the CP range. Because of an abnormal CT, patient no. 39 underwent a Whipple procedure for removal of suspected PDAC. Pathological analysis of the resected tissue did not detect PDAC, providing evidence that the cytological and molecular diagnoses of CP were correct, and that the Whipple procedure should not have been performed.

Acknowledgements

We thank Mr. Victor Fresco of the DNA Microarray and Bioinformatics Core Facility for microarray analysis, and Ms. Margaret Romano of the Hollings Cancer Center Tissue Procurement Bank. This work was supported by a Clinical Research Award from the American College of Gastroenterology (B.J.H.).
