Quantification of cytokeratin 20, carcinoembryonic antigen and guanylyl cyclase C mRNA levels in lymph nodes may not predict treatment failure in colorectal cancer patients



Conventional histopathologic staging of primary colorectal cancers does not allow accurate prognostic stratification within a given tumour stage. Therefore, PCR-based assays are increasingly used to try to predict more accurately the likelihood of disease progression for the individual patient. Real-time reverse transcription PCR (RT-PCR) assays were used to detect and quantitate cytokeratin 20 (ck20), carcinoembryonic antigen (CEA) and guanylyl cyclase C (GCC) mRNA in 149 lymph nodes (LN) from 17 patients with benign disease and 302 LN from 42 patients with colorectal cancer who had curative (R0) resections. None of the markers were specific, with ck20, CEA and GCC mRNA detected in 47%, 89% and 13% of 149 LN, respectively, from patients with benign disease. The sensitivity of all 3 markers was very high, with mRNA detected in 93%, 100% and 97% of 30 histologically involved LN, respectively. There was significant overlap in the mRNA levels of all 3 markers between histologically involved and uninvolved LN. There was no association between mRNA levels and distant recurrence (median follow-up: 3.94 years, range 3.35–5.12). We conclude that the use of molecular techniques to detect occult disease in LN may suffer from the same limitations as conventional methods. Instead, accurate prognostic stratification requires careful assessment of the likely metastatic potential of the primary cancer. © 2003 Wiley-Liss, Inc.

Tumour invasion of the lymphatic circulation represents an important episode in colorectal cancer metastasis, and its detection in draining lymph nodes (LN) by histopathologic staging is a poor prognostic factor because it frequently denotes the presence of systemic disease.1 However, it is predictive only for populations of patients within a stage and there is considerable prognostic heterogeneity within each tumour stage. This is emphasised by the observation that approximately one third of patients with completely excised N0 tumours (i.e., no histopathologic evidence of tumour deposits in the LN) experience treatment failure and die.2 Conversely, one third of patients with N1 tumours (i.e., histopathologic evidence of tumour deposits in the LN) are cured after complete surgical resection. Achieving accurate stratification of individuals into prognostic groups within a given stage has assumed some importance with the recent emergence of more effective adjuvant chemotherapy protocols that have had a positive impact on patient survival.

Molecular analysis of LN using PCR-based assays may be one solution for increasing the accuracy of postoperative staging.3 Indeed, DNA-based assays can identify an additional percentage of cancer patients with LN harbouring occult disease. However, even in this “gold standard” scenario, a significant percentage of patients with PCR-negative LN succumb to their disease,4 whereas a significant percentage of patients with PCR-positive LN survive for at least 5 years.5 RNA-based assays target tissue-specific rather than tumour-specific markers; hence, the detection of a target mRNA is not proof of the presence of a cancer cell. Several reports have described the use of cytokeratin 20 (ck20), carcinoembryonic antigen (CEA) and guanylyl cyclase C (GCC) as markers for the detection of occult residual disease.6, 7, 8 However, CEA9 and ck2010 mRNA have been detected in normal LN, and low-level ectopic expression of GCC in blood has been demonstrated in CD34+ progenitor cells.11

Technical issues are a major reason for these conflicting data: tissue handling, RNA extraction procedures and the use of conventional, gel-based, nonquantitative RT-PCR protocols combine with differences in data interpretation to generate results that vary significantly between studies. Therefore, the aims of our study were (i) to use real-time fluorescence-based technology to rigorously evaluate the expression patterns of 3 of the most commonly used markers for the detection of occult residual disease in LN and (ii) to assess their usefulness as prognostic markers in our series of patients.


Patients and samples

Our study was approved by the Local Research Ethics Committee of the East London and City Health Authority. LN were analysed from 17 patients with benign disease (n = 149) and 42 patients with colorectal cancer who had curative (R0) resections (n = 302). In addition, 30 histologically positive LN were used as positive controls. Median follow-up was 3.94 years (range 3.35–5.12). Cancer patient details are listed in Table I.

Table I. Patient Details

Patient | Sex | MSI | Age | Site | Differentiation | Dukes | TNM | R status | Follow-up (years) | Chemotherapy | Radiotherapy
8 | M | +ve | 55 | Hepatic flexure | P | B | T3N0M0 | 0 | 4.48 | N | N
15 | F | −ve | 86 | Splenic flexure | W | B | T3N0M0 | 0 | 4.07 | N | N

Differentiation: M, moderately differentiated; P, poorly differentiated; W, well differentiated.

RNA extraction

Sterile scalpels were used to cut individual LN, which were then homogenised in 2-ml cryotubes using a bead mill (Glen Creston Ltd, Stanmore, Middx, UK), with 3 steel beads per 0.5 ml RNeasy lysis buffer (Qiagen, Crawley, West Sussex, UK). This minimised the possibility of sample contamination. All RNA extractions were carried out in a separate room in a class 2 containment hood as previously described12 using Qiagen's RNeasy extraction kits. RNA was quantitated, in triplicate, using either a GeneQuant II spectrophotometer (Pharmacia, Milton Keynes, UK) or the Ribogreen assay (Molecular Probes, Leiden, The Netherlands). RNA quality was assessed using RNAChips with the Agilent 2100 Bioanalyzer (Agilent, Palo Alto, CA).

Primers and probe design

Primers and probes for each gene were designed using Primer Express software (Applied Biosystems, Warrington, UK) and are shown in Table II.

Table II. Oligonucleotide Primers and Probes, Amplicon Sizes, Y-Intercepts and Slopes from Standard Curves for the qRT-PCR Amplification of the Different Marker Genes and GAPDH
Target | Primers (5′–3′) | Probes (5′FAM-3′TAMRA) | Amplicon size | Standard curve
 R: GCAGGACACACCGAGCATTT  y-intercept: 43.5
 R: CCAGCTGAGAGACCAGGAGAA  y-intercept: 43.4
 R: GAAGATGGTGATGGGATTTC  y-intercept: 46.9

RT-PCR reaction and conditions

All 5′-nuclease assays were performed using a one-tube/one-enzyme RT-PCR protocol as previously described.12 Reactions were carried out, and results recorded and analysed, using the ABI Prism 7700 Sequence Detection System (Applied Biosystems, Warrington, UK).

Generation of standard curves

mRNA levels were quantitated relative to amplicon-specific standard curves (“absolute” quantification).13 The standard curves also provide a measure of the sensitivity of the assay, which was able to detect as few as 13 copies of GCC, 40 copies of ck20 and 100 copies of CEA.
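As an illustration of how a standard curve converts a threshold cycle (Ct) into a copy number, the following minimal sketch fits Ct = slope × log10(copies) + intercept to a dilution series and then inverts the fit. The dilution series and the slope of −3.4 are illustrative assumptions. Only the y-intercept of 43.5 is taken from the values listed in Table II. The function names are hypothetical and do not describe the instrument software used in the study.

```python
import math

def fit_standard_curve(copies, cts):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the starting copy number."""
    return 10 ** ((ct - intercept) / slope)

def amplification_efficiency(slope):
    """Per-cycle efficiency implied by the slope (1.0 means perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0

# Illustrative dilution series (assumed, not data from the study).
ASSUMED_SLOPE = -3.4       # assumption for illustration only
TABLE_II_INTERCEPT = 43.5  # one of the y-intercepts listed in Table II
dilutions = [1e2, 1e3, 1e4, 1e5, 1e6, 1e7]
observed_cts = [TABLE_II_INTERCEPT + ASSUMED_SLOPE * math.log10(c) for c in dilutions]

slope, intercept = fit_standard_curve(dilutions, observed_cts)
unknown_ct = 30.0
estimated_copies = copies_from_ct(unknown_ct, slope, intercept)
```

A steeper (more negative) slope implies lower amplification efficiency, which is why slopes and intercepts are reported per amplicon rather than shared between markers.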


Single housekeeping genes are unreliable for normalising quantitative RT-PCR assays, as their mRNA levels vary significantly between individuals.14 Instead, copy numbers were normalised relative to total RNA concentration and expressed as copy numbers/μg RNA.13, 15
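The normalisation to total RNA can be sketched as follows. The function name and the example input values are hypothetical and serve only to show the unit conversion to copies/μg RNA.

```python
def copies_per_microgram(copies_in_reaction, rna_input_ng):
    """Normalise a copy number to the mass of total RNA in the reaction.

    copies_in_reaction: copy number read off the standard curve
    rna_input_ng: total RNA added to the reaction, in nanograms
    """
    if rna_input_ng <= 0:
        raise ValueError("RNA input must be positive")
    # 1 microgram = 1000 nanograms
    return copies_in_reaction * 1000.0 / rna_input_ng

# Hypothetical example: 5000 copies measured from 100 ng of total RNA.
normalised = copies_per_microgram(5000, 100)
```

Because the denominator is the measured RNA mass, the accuracy of this normalisation depends directly on the accuracy of the RNA quantification step.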

RT-PCR quality controls

All serial dilutions were carried out in duplicate. The reactions to generate standard curves were repeated twice, each time in triplicate. All clinical samples were tested in triplicate and the average value of the triplicates was used for quantification. If any of the replicates resulted in discordant Cts (> 1 cycle apart), the assay was repeated. All GCC assays resulting in Cts were repeated at least once. To minimise the risk of false positives, the dilutions for the standard curves were dispensed after the unknown samples had been dispensed and the tubes sealed. The quality of isolated nucleic acids was determined by analysis on the Agilent 2100 Bioanalyzer and their performance in an RT-PCR assay by amplification of GAPDH mRNA. To exclude false positives, 2 no-template controls were included with every amplification run; one was prepared before opening all the tubes and dispensing the various reagents, the other at the end of the experiment. This allowed the monitoring of any contamination arising during the handling of the reagents.
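The replicate check described above could be expressed along the following lines. This is a sketch under stated assumptions: replicates are treated as discordant when their Cts span more than 1 cycle, a well without amplification is represented as None, and a mixture of amplified and non-amplified wells is flagged for repetition. The last two conventions are assumptions, not details given in the text.

```python
def summarise_replicates(cts, max_spread=1.0):
    """Return (mean_ct, needs_repeat) for one sample's replicate Cts.

    cts: list of Ct values; None marks a well with no amplification
         (an assumed convention).
    max_spread: largest tolerated difference between replicate Cts.
    """
    detected = [ct for ct in cts if ct is not None]
    if not detected:
        # Target not detected in any well: nothing to average.
        return None, False
    if len(detected) < len(cts):
        # Some wells amplified and others did not (assumed to need a repeat).
        return None, True
    spread = max(detected) - min(detected)
    mean_ct = sum(detected) / len(detected)
    return mean_ct, spread > max_spread

concordant = summarise_replicates([24.1, 24.3, 24.6])
discordant = summarise_replicates([24.0, 25.5, 24.2])
```

The sketch averages Ct values; averaging the copy numbers derived from each replicate would be an equally plausible reading of "the average value of the triplicates".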


All statistical calculations for Mann-Whitney U-test and Spearman's Rank correlation were carried out using Winstat for Excel (R. Fitch Software, Staufen, Germany) and Statsdirect (Ashwell, Herts, UK).
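For readers unfamiliar with the two tests, the following pure-Python sketch shows how Spearman's rank correlation coefficient and the Mann-Whitney U statistic are computed from ranks. It is not the implementation used by the statistics packages named above, it omits p-values, and the example inputs are hypothetical.

```python
import math

def midranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the midranks of x and y."""
    rx, ry = midranks(x), midranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

def mann_whitney_u(group_a, group_b):
    """Mann-Whitney U statistic (the smaller of U1 and U2)."""
    a, b = list(group_a), list(group_b)
    ranks = midranks(a + b)
    rank_sum_a = sum(ranks[:len(a)])
    u1 = rank_sum_a - len(a) * (len(a) + 1) / 2.0
    u2 = len(a) * len(b) - u1
    return min(u1, u2)

# Hypothetical copy numbers for two markers measured in the same four LN.
rho = spearman_rho([1e2, 5e2, 3e3, 4e4], [2e3, 9e3, 8e4, 6e5])
```

Both statistics depend only on ranks, which suits copy numbers that span several orders of magnitude.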


Specificity and sensitivity of target mRNA detection

GAPDH (glyceraldehyde-3-phosphate dehydrogenase) mRNA was detected in all samples, confirming that the RT-PCR assay was valid. The specificity of the assay was established by screening 149 LN obtained from 17 patients with benign disease for expression of ck20, CEA and GCC mRNA. These were detected in 76, 124 and 17 LN, respectively, which translates into specificities of 47%, 13% and 89% (Fig. 1a). The sensitivity of the assay was established by screening 30 histologically positive LN, with target mRNAs detected in 28 (93%), 30 (100%) and 29 (97%) samples, respectively (Fig. 1b).
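The arithmetic behind these figures can be sketched as follows, treating histologically positive LN as true positives and LN from patients with benign disease as true negatives. The function names are hypothetical; the counts used are those quoted in the paragraph above for the positive control LN and for GCC in the benign-disease LN.

```python
def sensitivity(detected_in_positives, total_positives):
    """Fraction of histologically positive LN in which the marker was detected."""
    return detected_in_positives / total_positives

def specificity(detected_in_negatives, total_negatives):
    """Fraction of benign-disease LN in which the marker was NOT detected."""
    return (total_negatives - detected_in_negatives) / total_negatives

# Counts quoted in the text: detection in 30 histologically positive LN.
positive_control_counts = {"ck20": 28, "CEA": 30, "GCC": 29}
sensitivities = {
    marker: round(100 * sensitivity(count, 30))
    for marker, count in positive_control_counts.items()
}

# GCC was detected in 17 of 149 LN from patients with benign disease.
gcc_specificity = round(100 * specificity(17, 149))
```

A marker detected in a tissue-specific rather than tumour-specific manner can score well on sensitivity while scoring poorly on specificity, which is the pattern described here for CEA.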

Figure 1.

Specificity and sensitivity of tissue-specific markers. ck20 (▪) was detected in 47%, CEA (□) in 89% and GCC (▓) in 13% of LN from patients undergoing surgery for benign disease (a). ck20 (solid black bar) was detected in 93%, CEA (solid white bar) in 100% and GCC (vertically striped bar) in 97% of 30 histologically positive LN (b).

Quantification of target mRNA in LN from noncancer control patients

The copy numbers of the 3 markers in individual LN are shown in Figure 2. GCC mRNA was completely absent in 9 patients. Four patients expressed background levels only of GCC (< 1 × 10^3) in at least one LN. In 3 patients, there were additional LN expressing high levels of GCC; in each case these LN also expressed high levels of ck20 and CEA mRNA. One patient did not express either ck20 or GCC in any LN (n = 18); however, CEA mRNA was detected in 7/18 LN. There were significant correlations between ck20 and CEA (R = 0.56, p < 0.001) and CEA and GCC mRNA levels (R = 0.60, p < 0.001), but not between ck20 and GCC.

Figure 2.

Quantification of ck20, CEA, and GCC mRNA levels in LN from patients with noncancer related surgery (Benign), in histologically negative LN from R0 patients (Hist −ve) and histologically positive LN (Hist +ve). The horizontal bars show median copy numbers. (*p < 0.05, **p < 0.001, ***p < 0.0001).

Quantification of target mRNA in histologically positive LN

GAPDH mRNA was detected in all samples, confirming that the RT-PCR assay was valid. Median mRNA copy numbers for all 3 markers were significantly higher in the 30 histologically positive LN compared to the 149 control LN (Table III and Fig. 2). There were significant correlations between ck20 and CEA (R = 0.74, p < 0.0001), ck20 and GCC (R = 0.89, p < 0.0001) and CEA and GCC mRNA levels (R = 0.87, p < 0.0001).

Detection of target mRNA in histologically negative LN from colorectal cancer patients

A total of 302 LN (median 8, range 1–22) were obtained from 42 tumours staged as R0, i.e., the patients had undergone curative resections. ck20 mRNA was detected in 225 (75%), CEA mRNA in 297 (98%) and GCC mRNA in 65 (22%) of LN.

Quantification of target mRNA in histologically negative LN from colorectal cancer patients

Median levels of all 3 markers in histologically negative LN were significantly higher than those recorded in the noncancer control LN for ck20 (p < 0.001), CEA (p < 0.0001) and GCC (p < 0.05) (Table III). In addition, the mRNA copy number range for all 3 markers was significantly greater, with maximum expression levels 10- to 100-fold higher (Fig. 2). As with the histologically positive control LN, there were significant correlations between ck20 and CEA (R = 0.86, p < 0.0001), ck20 and GCC (R = 0.55, p < 0.01) and CEA and GCC (R = 0.63, p < 0.001) mRNA levels.

Table III. ck20, CEA and GCC mRNA Copy Numbers

LN group | ck20 copy number | ck20 range | CEA copy number | CEA range | GCC copy number | GCC range
Noncancer controls | 4.9 × 10^2 | 1.0 × 10^2–4.8 × 10^4 | 5.4 × 10^4 | 1.3 × 10^3–5.1 × 10^6 | 5.0 × 10^2 | 1.0 × 10^2–8.6 × 10^3
Hist +ve LN | 4.2 × 10^5 | 1.0 × 10^2–2.2 × 10^7 | 2.6 × 10^8 | 1.1 × 10^6–5.1 × 10^9 | 2.1 × 10^5 | 2.1 × 10^2–1.1 × 10^7
Hist −ve LN | 1.1 × 10^3 | 1.0 × 10^2–4.5 × 10^5 | 3.5 × 10^5 | 3.3 × 10^4–1.2 × 10^8 | 7.0 × 10^2 | 1.4 × 10^2–3.1 × 10^5

Both median copy numbers and the expression range are significantly higher in the histologically negative LN extracted from colorectal cancer patients compared with the noncancer (negative) controls.

Prognostic significance of detection of ck20, CEA or GCC mRNA

Median follow-up of this patient group was 3.94 years (range 3.35–5.12). To date, 5 patients (1 staged with histologically positive and 4 with histologically negative LN) have died from metastatic colorectal cancer. To test the biologic relevance of detecting and quantitating ck20, CEA and GCC mRNA, patients were stratified according to the mRNA copy numbers of ck20, CEA and GCC, either individually or combined. There was no correlation between copy number of any of the 3 markers and survival, with the highest copy numbers recorded for patients that have been disease free for 3 or more years (Table IV). A comparison of mRNA levels between all LN from patients that died due to distant recurrence and those that have not developed distant recurrence revealed no differences in mRNA copy numbers (Fig. 3).

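Table IV. Copy Number Ranking for ck20, CEA, and GCC

Rank | Patient no. | ck20 | Patient no. | CEA | Patient no. | GCC
1 | 4 | 1.4 × 10^6 | 26 | 8.6 × 10^8 | 26 | 1.9 × 10^6
2 | 26 | 4.0 × 10^5 | 4 | 2.0 × 10^8 | 4 | 9.2 × 10^5
3 | 4 | 2.1 × 10^5 | 19 | 1.6 × 10^8 | 4 | 2.4 × 10^5
4 | 37 | 1.6 × 10^5 | 4 | 1.4 × 10^8 | 19 | 1.3 × 10^5
5 | 26 | 7.1 × 10^4 | 26 | 8.8 × 10^7 | 26 | 1.1 × 10^5
6 | 37 | 6.4 × 10^4 | 27 | 6.3 × 10^7 | 37 | 7.9 × 10^4
7 | 15 | 6.0 × 10^4 | 37 | 4.5 × 10^7 | 37 | 5.5 × 10^4
8 | 37 | 4.5 × 10^4 | 27 | 3.4 × 10^7 | 27 | 3.8 × 10^4
9 | 20 | 3.3 × 10^4 | 26 | 2.9 × 10^7 | 37 | 3.2 × 10^4
10 | 25 | 2.6 × 10^4 | 37 | 2.8 × 10^7 | 26 | 2.3 × 10^4

Figure 3.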

Comparison of mRNA levels in LN from patients who succumbed to distant metastases and those who remain disease-free: ck20 (a); CEA (b); GCC (c).


The identification of LN metastasis by conventional histopathologic staging is a pivotal determinant of tumour stage, but it is not predictive for the individual patient. The limited reproducibility and lack of sensitivity of immunohistochemistry make it unreliable for identifying patients at risk of treatment failure.16 In contrast, the sensitivity and specificity of PCR- and RT-PCR-based assays have led to their widespread evaluation as tools for the development of a more accurate colorectal cancer staging system.3

In colorectal cancer, CEA,6 ck2017 and GCC7, 18 have been reported as suitable molecular markers for prognostic RT-PCR assays. However, there are conflicting findings with respect to the specificity of CEA,19 and ck20 has been detected in healthy controls.10, 20, 21 GCC appears to be the most specific, at least in the sense that no reports have contradicted the initial observation of its prognostic potential and subsequent reports have extended those findings. Nevertheless, GCC expression in CD34+ progenitor cells, which may circulate through the lymphatic system, could interfere with its specificity.11

One important explanation for conflicting results lies with the technique itself. Despite its perception as a simple, sensitive, specific and speedy technique, there are numerous problems associated with RT-PCR assays22 that continue to impede the translation of results into clinical practice.23 The development of real-time fluorescence-based PCR assays addresses many of these problems,13 although other problems have been recognized.15

Our aim was to use a “best practice” assay to quantitate CEA, ck20 and GCC mRNA levels in a large number of LN from noncancer patients as well as in LN from a clinically well-defined group of colorectal cancer patients. The main criterion for selection was not whether cancers had metastasised to the LN or not but whether they had undergone curative surgery. This avoided any likelihood of distant recurrences arising from cancer left behind during surgery. The underlying rationale was to test the hypothesis that LN and distant metastasis may be separate events. All RNA samples were carefully quality assessed, and mRNA levels were quantitated using amplicon-specific standard curves (“absolute quantification”) and normalised against total RNA. This, rather than normalisation against an internal housekeeping gene or rRNA standard, has been shown to be the most biologically relevant method of comparing mRNA copy numbers between individuals.14 These steps ensured that copy numbers obtained from the different markers could be compared reliably and meaningfully between different individuals.

Our findings are unequivocal. First, CEA is not specific, as it can be detected in most LN from all 17 individual controls. Interestingly, there was one patient where its expression was absent in 11/18 LN, suggesting individual-specific patterns of expression. It remains to be seen whether the identification of different CEA splice variants in healthy controls and cancer patients will lead to an improvement of the specificity of this marker.24 Second, the ck20 results confirm our previous, preliminary data25 that revealed its lack of both specificity and sensitivity in LN. Again, the importance of considering variation in expression patterns between and even within individuals was highlighted by the fact that in several patients some of the LN did not express ck20, but others did at high levels. Third, GCC displays significantly higher specificity than either CEA or ck20 while retaining high sensitivity. However, its expression was detected in at least one LN in 8/17 patients undergoing surgery for noncancer related conditions.

Median mRNA levels of individual markers were lowest in the LN obtained from patients with benign disease and highest in patients with histologically positive LN. However, there was considerable overlap and it was impossible to set a cutoff level of expression that would suggest the presence of tumour cells in those LN. This is borne out by a previous report that also found considerable overlap between histologically positive LN and histologically negative but RT-PCR positive LN on the one hand, and histologically and RT-PCR negative LN on the other.26
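The difficulty of setting a cutoff can be illustrated with the copy number ranges in Table III. In the minimal sketch below, a separating cutoff exists only if every value in the lower group falls below every value in the higher group. The assignment of the table columns to ck20, CEA and GCC assumes the order given in the table title, and the function and variable names are hypothetical.

```python
# (minimum, maximum) copy numbers per LN group, read from Table III.
TABLE_III_RANGES = {
    "ck20": {"benign": (1.0e2, 4.8e4), "hist_neg": (1.0e2, 4.5e5), "hist_pos": (1.0e2, 2.2e7)},
    "CEA": {"benign": (1.3e3, 5.1e6), "hist_neg": (3.3e4, 1.2e8), "hist_pos": (1.1e6, 5.1e9)},
    "GCC": {"benign": (1.0e2, 8.6e3), "hist_neg": (1.4e2, 3.1e5), "hist_pos": (2.1e2, 1.1e7)},
}

def separating_cutoff_exists(lower_group_range, higher_group_range):
    """True only if the lower group's maximum is below the higher group's minimum."""
    return lower_group_range[1] < higher_group_range[0]

cutoff_possible = {
    marker: {
        "benign_vs_hist_pos": separating_cutoff_exists(groups["benign"], groups["hist_pos"]),
        "hist_neg_vs_hist_pos": separating_cutoff_exists(groups["hist_neg"], groups["hist_pos"]),
    }
    for marker, groups in TABLE_III_RANGES.items()
}
```

Comparing ranges is a coarse check that ignores how the values are distributed within each group, but for these ranges no marker admits a threshold that cleanly separates the groups.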

When patients are stratified according to the individual mRNA levels detected in their LN, it is apparent that there is no correlation between survival and copy numbers (Table IV). The top ranks for all 3 markers contain LN from only one patient (27) staged as histologically positive (3.4 × 10^7 for CEA and 3.8 × 10^4 for GCC). All other top ranking LN are from patients staged as histologically negative who are alive and well. For ck20, the first LN from a patient who was staged as histologically negative but has died from a distant recurrence (patient 18) is at position 13 (1.2 × 10^4 copies/μg total RNA) with a second one at position 33 (5.7 × 10^3 copies/μg total RNA). The first LN from the remaining patients to have died from distant recurrences are at positions 23 (7.7 × 10^3 copies/μg total RNA), 37 (4.7 × 10^3 copies/μg total RNA) and 62 (2.3 × 10^3 copies/μg total RNA), respectively. Stratification according to CEA or GCC produced similar results, with LN from patients 4, 26 and 37 appearing repeatedly in the top 10; there were no top-ranking histologically negative LN derived from patients that died from distant recurrence. The results were no different when all 3 markers were considered simultaneously (detailed results not shown). A comparison of the copy numbers of the 3 genes between LN from patients who died from their disease and those who survived confirms that there is no difference in their mRNA levels (Fig. 3). As a whole, this suggests that although RT-PCR assays can identify LN harbouring colon-derived cells that are not detected by conventional histopathology, these are not likely to progress to distant metastases.
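The ranking used for this stratification can be sketched as follows, using the ck20 column of Table IV as input. The function names are hypothetical, and only the 10 top-ranked LN listed in the table are included, so positions beyond 10 cannot be reproduced from these data.

```python
# (patient number, ck20 copies per microgram total RNA) from Table IV.
CK20_TOP10 = [
    (4, 1.4e6), (26, 4.0e5), (4, 2.1e5), (37, 1.6e5), (26, 7.1e4),
    (37, 6.4e4), (15, 6.0e4), (37, 4.5e4), (20, 3.3e4), (25, 2.6e4),
]

def rank_lymph_nodes(nodes):
    """Sort individual LN by copy number, highest first."""
    return sorted(nodes, key=lambda node: node[1], reverse=True)

def first_rank_for(ranked_nodes, patient_ids):
    """1-based position of the first LN from any of the given patients, or None."""
    for position, (patient, _copies) in enumerate(ranked_nodes, start=1):
        if patient in patient_ids:
            return position
    return None

ranked = rank_lymph_nodes(CK20_TOP10)
# Patient 18 died from a distant recurrence; no LN from this patient
# appears among the 10 highest ck20 copy numbers.
rank_of_patient_18 = first_rank_for(ranked, {18})
rank_of_patient_37 = first_rank_for(ranked, {37})
```

Ranking individual LN rather than patients means a single patient can occupy several positions, as patients 4, 26 and 37 do here.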

A recent microarray analysis of the expression signature of solid cancers suggests that the metastatic potential of human tumours is encoded in the bulk of a primary tumour.27 This lends support to the contention that current RT-PCR detection strategies are flawed because they do not provide any information about the metastatic potential of the cells they detect. Even unequivocal detection of tumour cells by mutant allele-specific PCR fails to predict relapse within 5 years of surgery in 27%5 and more recently in 70%4 of patients. Indeed, a patient in the LN negative group also died. Similarly, when RT-PCR assays were used to detect tissue-specific mRNA expression, tumour recurrence was observed in only 14–50% of patients whose LN were RT-PCR positive for CEA, ck20 or both markers.6, 28, 29, 30, 31 This supports a previous assertion that RT-PCR may simply be detecting cells of no biologic significance.32 The interpretation of these results is not helped by the fact that researchers often fail to distinguish between local recurrence—often a reflection of inadequate surgical technique—and distant treatment failure, which is a result of the biologic outcome of the tumour host interaction.

In conclusion, our data suggest that the pivotal issue is not the detection and/or quantification of occult disease per se, but the identification of a tumour's underlying potential for successful growth at a distant site. This is likely to be answered by studying the transcriptome of the cancer to obtain an expression profile that reveals molecular markers predictive of successful metastasis.