Molecular pathogenesis of colorectal cancer

Implications for molecular diagnosis



Colorectal cancer is the third leading cause of cancer-related death in both men and women in industrialized countries. Major advances have been made in our understanding of molecular events leading to formation of adenomatous polyps and cancer. Most colorectal cancers are sporadic, but a significant proportion (5–6%) has a clear genetic background. It is now widely accepted that colorectal carcinogenesis is a multistep process involving the inactivation of a variety of tumor-suppressor and DNA-repair genes and simultaneous activation of certain oncogenes. In addition, epigenetic alterations through aberrant promoter methylation and histone modification have been found to play a major role in the evolution and progression of a large proportion of sporadic colon cancers. Consequently, it is now apparent that individual colorectal cancers may evolve through diverse molecular pathways. In this article, the authors have summarized the current knowledge of molecular pathogenesis in common hereditary syndromes and sporadic forms of colorectal cancer. Novel molecular diagnostic tools for the early diagnosis and prevention of colorectal cancer that have emerged from these insights are discussed. Cancer 2005. © 2005 American Cancer Society.

Colorectal neoplasia is one of the most common malignancies in the western world. The lifetime risk of colorectal cancer in the US population is 5–6%. Over 50% of the population will develop an adenomatous polyp by the age of 70 years, but only one tenth of these will proceed to cancer.1 Over the past 15 years, evidence from many laboratories has demonstrated that colorectal cancer is a progressive, multistep genetic disease. However, epidemiologic studies have suggested that environmental factors support the development of colon cancer.2–4

It appears that some individuals are more prone to colorectal cancer than others. It has been suggested that about 25% of colon cancer patients have some degree of familial background, and another 15% have a strong family history involving a first or second degree relative.1 Perhaps as many as 5% of colon cancers are caused by single-gene syndromes, most commonly familial adenomatous polyposis (FAP) or Lynch syndrome, also known as hereditary nonpolyposis colorectal cancer, or HNPCC.5 Definite disease-causing mutations have been identified for both syndromes. FAP is usually caused by germline mutations of the tumor-suppressor gene adenomatous polyposis coli (APC).6, 7 Somatic (or acquired) APC alterations are seen in most (85–90%) colorectal neoplasms, whether these cancers are of familial or sporadic origin.4 Lynch syndrome is caused by mutations in DNA mismatch–repair genes, of which hMLH1 and hMSH2 are most commonly involved.5 Less common hereditary colorectal cancer syndromes are listed in Table 1.8, 9

Table 1. Hereditary Syndromes Predisposing to Colorectal Cancer

Syndrome                          Gene responsible
Strongly predisposed to cancer
  FAP                             APC (dominant) and MYH (recessive)
  Lynch syndrome (HNPCC)          hMLH1, hMSH2, hMSH6, PMS1, PMS2, hMLH3, EXO1
Weaker cancer risks
  Peutz-Jeghers syndrome          STK11 (or LKB1)
  Juvenile polyposis              SMAD4/MADH4 or BMPR1A
  Cowden disease                  PTEN/MMAC1
  Bannayan-Ruvalcaba syndrome     PTEN
  Li-Fraumeni syndrome            p53
  Bloom syndrome                  Blm

FAP: familial adenomatous polyposis; HNPCC: hereditary nonpolyposis colorectal cancer.

In general, colon carcinoma results from the cumulative effect of multiple sequential genetic alterations (multistep carcinogenesis). These alterations can either be acquired, as happens in the sporadic forms, or be inherited, as in genetic cancer predisposition syndromes. In FAP and Lynch syndrome, the germline mutation either provides the first mutation in a critical tumor-suppressor gene in every cell from birth (FAP), or it creates a situation that can lead to accumulation of mutations at a greatly accelerated rate (in Lynch syndrome).

Familial Adenomatous Polyposis

Clinical features

FAP is an autosomal dominantly inherited disease that leads to the development of multiple adenomatous polyps and affects about 1 in 7000 individuals. The penetrance of the disease for polyps is essentially 100% and most will develop cancer if there is no intervention. In the classic forms of FAP, hundreds to thousands of adenomatous polyps develop in the large bowel by the second and third decades of life. Most of these patients develop benign adenomas, but some of these will progressively accumulate additional mutations, and inactivation of the p53 gene mediates the conversion of benign lesions to malignant ones.10 In addition, patients with FAP frequently develop extracolonic manifestations in the course of the disease, such as retinal lesions, osteomas, duodenal or ampullary polyps, desmoid tumors, or brain tumors. Although most FAP patients develop periampullary adenomas, only about 5–10% will progress to cancer in this region. Desmoid tumors are benign but can be lethal. These occur in 10–15% of FAP patients, are more common in some families, and are the major nonmalignant cause of morbidity and mortality in FAP.

Molecular pathogenesis

In 1987, the observation of a patient with multiple congenital abnormalities that were not present in the parents led investigators to the discovery of the APC locus on chromosome 5q.11 Subsequent molecular studies demonstrated significant linkage to markers close to chromosome 5q21. In 1991, the disease-causing APC gene was identified by two research groups.6, 7

The APC protein interacts with the intracellular protein β-catenin, which, when activated, translocates to the nucleus and stimulates cell proliferation by transcriptional activation of c-myc, cyclin D1 and the peroxisome-proliferator-activated receptor delta (PPARδ). β-catenin is also part of the intercellular adhesion complex, and influences homotypic cell adhesion. Once β-catenin levels rise, a cell proliferation program is activated. In concert with other factors, the APC protein halts cellular proliferation and promotes apoptosis by phosphorylating β-catenin, leading to its ubiquitination and degradation through the proteasome pathway.12 In the case of an inactivating mutation, APC-mediated β-catenin degradation is lost and nuclear concentrations of β-catenin remain high, which results in the formation of the adenoma.

Approximately 30% of FAP patients do not have identifiable germline APC mutations.13 It has been observed that a proportion of FAP cases in individuals with wild-type APC sequences is caused by biallelic mutations of the MYH gene.14, 15 In these cases, FAP is an autosomal recessive disorder, which may appear as a sporadic disease clinically. The MYH gene encodes a DNA glycosylase that is involved in base excision repair (BER) and primarily targets repair of oxidative DNA damage. Colorectal tumors from patients with MYH mutations show an excess of acquired G > T transversion mutations in the APC gene because of a reduced ability to remove 8-oxo-guanine adducts.

Interestingly, the number of polyps in FAP is determined, in part, by the location of the germline mutation within the APC gene. Disease-causing germline mutations in APC are always either deletions or premature truncations. FAP patients tend to present with different phenotypic variants, depending upon the location of the truncating mutation.16 Mutations between codons 1250 and 1464 will lead to the most severe form of the disease with a very high number of polyps (> 5000 at maturity) and an early age of manifestation. Mutations within the first 4 exons, however, present with an attenuated phenotype, as do mutations at the 3′ extreme of the gene.17 These patients develop polyps later in life, and the penetrance of the disease is lower (Fig. 1).

Figure 1.

Schematic structure of the APC gene. The APC gene encodes 2843 amino acids, and the location of the mutation correlates with the phenotype. The mutation cluster region spans codons 1250–1464 of exon 15. (CHRPE: congenital hypertrophy of the retinal pigment epithelium).

The APC gene can also harbor a premutation that increases colorectal cancer susceptibility. In the case of the I1307K mutation, an isoleucine at codon 1307 is converted to a lysine.18 This missense mutation has no identifiable impact on the function of the APC protein. However, from a genetic perspective, this polymorphism converts the AAATAAAA sequence to the relatively unstable microsatellite sequence (A8), a region of hypermutability that is prone to “slippage errors” incurred by DNA polymerase during replication. The I1307K mutation thereby creates a predisposition to a second frameshift mutation in the A8 tract and to other mutations in the vicinity of this poly-A sequence. Eventually a third mutation occurs, inactivating the wild-type allele, which completes the biallelic inactivation of APC. The I1307K sequence alteration has been identified in 6% of Ashkenazi Jews, and it is associated with a 2–3-fold increased risk of developing colorectal cancer.
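The sequence change described above can be illustrated with a minimal Python sketch. It assumes that the alteration is a single T-to-A substitution at the fourth position of the AAATAAAA tract (the only single-base change that yields A8); the function names are hypothetical and serve only this illustration.

```python
def longest_homopolymer_run(seq: str) -> int:
    """Return the length of the longest run of one repeated base in seq."""
    longest = 0
    current = 0
    previous = ""
    for base in seq:
        # Extend the current run if the base repeats, otherwise start a new run.
        current = current + 1 if base == previous else 1
        previous = base
        longest = max(longest, current)
    return longest


def apply_substitution(seq: str, index: int, new_base: str) -> str:
    """Return seq with the base at the 0-based index replaced by new_base."""
    return seq[:index] + new_base + seq[index + 1:]


wild_type = "AAATAAAA"                           # tract quoted in the text
variant = apply_substitution(wild_type, 3, "A")  # assumed T>A at position 4

print(wild_type, longest_homopolymer_run(wild_type))
print(variant, longest_homopolymer_run(variant))
```

In this sketch the longest uninterrupted adenine run doubles from four to eight bases, which is the mononucleotide repeat that the text describes as prone to polymerase slippage.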

Molecular diagnosis

The clinical diagnosis of FAP can be confirmed by sequence analysis of the APC gene, which is available from multiple commercial vendors. Direct DNA sequencing is successful in finding a pathogenic mutation in most, but not all, cases. Southern blotting may be necessary to find large genomic rearrangements, such as genetic deletions, and MYH sequencing is often helpful in those who do not have detectable APC alterations. Once a disease-causing mutation is known, affected family members or relatives at risk can be tested with a high degree of reliability for the presence of this mutation after careful genetic counseling.

Lynch Syndrome or Hereditary Nonpolyposis Colorectal Cancer (HNPCC)

Clinical features

Lynch syndrome is an autosomal dominant disorder clinically characterized by early onset of colorectal cancer and extracolonic cancers in multiple members of a family, but it is not accompanied by a large number of colorectal polyps. Over two-thirds of the colorectal cancers occur in the proximal colon, and the average age of diagnosis of these cancers is in the early 40s, the same average age as in FAP. Synchronous and metachronous tumor formation, especially in the setting of a colorectal cancer in the proximal colon, suggests the diagnosis of Lynch syndrome. Besides their location, tumors of the Lynch syndrome spectrum have typical pathologic features such as lymphocytic infiltration, mucinous character, and poor differentiation. Frequently, colonic neoplasms in Lynch syndrome develop without detection of antecedent adenomas. This is thought to be because of an accelerated adenoma–carcinoma sequence in this disease. Therefore, Lynch syndrome is quite different from FAP with respect to number and location of polyps. Besides their higher risk for malignancies of the colon and rectum, individuals with Lynch syndrome also frequently acquire tumors at extracolonic sites, particularly the endometrium, ovary, stomach, urinary tract, pancreas, small bowel, and brain.

Molecular pathogenesis

In 1992, it was first noted that some colorectal cancers had a large number of deletion or insertion mutations at short simple repetitive sequences called microsatellites.19 Microsatellite sequences are widely prevalent throughout the genome and routinely used for mapping and linkage analyses. Within a short period of time, it was found that 12–15% of all colorectal cancers displayed a phenotype that was defined as “microsatellite instability” or MSI.20 These tumors feature thousands of deletion or insertion mutations in microsatellite sequences. Most of these occur in noncoding regions of the genome, but a few occur in exons, and these lead to losses of critical growth-regulatory genes. Thus, unlike FAP, in which the mutation abrogates the function of a critical tumor-suppressor gene, in Lynch syndrome, the mutation typically abrogates one of the DNA housekeeper genes, leading to a “mutator phenotype.”

In families with Lynch syndrome, linkage analyses in 1993 led to the discovery of loci on chromosomes 2p16 and 3p21–23. It was then recognized that the disease-causing genes belonged to the DNA mismatch–repair system and were homologues of the bacterial MutS and MutL genes, subsequently recognized as hMSH2 and hMLH1.21 In short order, several additional genes related to the human DNA MMR system were identified, including hPMS1, hPMS2, hMSH3, hMSH6, hMLH3 and EXO1, but not all have been unequivocally linked to Lynch syndrome.

The human DNA mismatch–repair system requires the concerted action of several proteins, which form complexes to achieve specific types of DNA repair during the replication of DNA.22 The most commonly involved genes in Lynch syndrome are hMLH1 and hMSH2, which account for 90% of all known mutations in Lynch syndrome families. Germline mutations are found less commonly in hPMS2 and hMSH6 and have been postulated to occur in EXO1 and hMLH3,23, 24 although the clinical significance of mutations in the latter two genes remains to be established. Interestingly, mutations in hMSH6 result in only a partial deficiency of mismatch–repair, and in 14% of these tumors, a lower level of MSI, called the MSI-L phenotype, is found.25 Mutations in hMSH6 result in an attenuated form of Lynch syndrome with a later age of onset and delayed penetrance until age 70 years. There is a particularly elevated risk of endometrial cancer in female carriers of hMSH6 mutations.26

In tumors with hMLH1 and hMSH2 mutations, high level MSI (MSI-H) is observed in the tumor DNA. The level of MSI is usually assessed using five microsatellite markers; a standard panel of five markers defined by the National Cancer Institute in 1998 is commonly used, but other panels of DNA sequences have been validated for this purpose. MSI-H is defined when two or more of the five markers are mutated, or unstable.27 The MSI-L phenotype is present when only one of five markers is unstable. When none of the five markers is mutated, this is called microsatellite stable, or MSS.
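The five-marker classification rule can be expressed as a short Python sketch. It is illustrative only: the function name is hypothetical, the markers are treated as anonymous true/false instability calls rather than named loci, and "two of five" is read here as "two or more of five."

```python
def classify_msi(unstable_flags):
    """Classify a tumor from five per-marker calls (True = unstable marker)."""
    if len(unstable_flags) != 5:
        raise ValueError("this sketch assumes a five-marker panel")
    unstable = sum(bool(flag) for flag in unstable_flags)
    if unstable >= 2:
        return "MSI-H"   # two or more unstable markers (assumed reading)
    if unstable == 1:
        return "MSI-L"   # exactly one unstable marker
    return "MSS"         # no unstable markers


print(classify_msi([True, True, False, False, False]))
print(classify_msi([False, False, True, False, False]))
print(classify_msi([False, False, False, False, False]))
```

The three calls cover the three categories in the order MSI-H, MSI-L, MSS. Panels with a different number of markers would need a proportion-based threshold, which this sketch does not attempt.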

The consequence of a defective DNA mismatch–repair system is the sequential acquisition of multiple mutations in coding and noncoding microsatellite sequences. There is a growing list of genes found to be involved in tumors with the MSI phenotype (Table 2).21, 28 It is generally believed that transformation to malignancy is associated with mutations of these target genes, and that these mutations rapidly accumulate within the genome and are preferentially selected for once DNA MMR activity is lost.29

Table 2. Target Genes in Malignancies of the MSI Phenotype
Gene    Target sequence

Molecular diagnosis

Historically, the diagnosis of Lynch syndrome relied on family history, and the Amsterdam and Bethesda criteria were developed to help identify affected families. The Amsterdam criteria were actually developed to identify uniform families for research purposes and may prove to be too stringent for clinical purposes. The purpose of the Bethesda criteria was to identify those tumors that should be tested for MSI. Once an individual is identified or suspected to have Lynch syndrome, genetic counseling and testing are performed. Patients who meet the Amsterdam or Bethesda criteria or who have a strong familial colorectal cancer risk may undergo initial screening by testing the tumor tissue for MSI (Fig. 2). However, in a colon cancer demonstrating MSI-H, one must take into consideration that although approximately 15% of all colon cancers show MSI, only 20–25% of these are from patients with Lynch syndrome. When a colorectal cancer is MSI-H, the individual at risk is tested for a pathogenic mutation by DNA sequencing in the most commonly involved DNA mismatch–repair genes, hMSH2 and hMLH1. Mutations in these two mismatch–repair genes account for most, perhaps 90%, of the genetically characterized Lynch syndrome cases.30 Genetic testing for the hMLH1 and hMSH2 genes is commercially available at specialized laboratories.

Figure 2.

MSI analysis using a panel of five validated markers. Normal tissue (N) is compared with tumor tissue (T) from the same individual. All tumors show different allele sizes represented by insertion or deletion mutations in the microsatellite sequences of the respective markers.

Another approach to screen patients who may have Lynch syndrome is the immunohistochemical (IHC) staining of colorectal cancer specimens for mismatch–repair proteins, as this may help identify the defective mismatch–repair protein (Fig. 3). Pathogenic mutations in a mismatch–repair gene usually lead to loss of detectable protein expression in tumor tissue. However, some missense mutations will result in the formation of a defective protein that will, nonetheless, be visible (and falsely appear normal) by IHC; this is particularly true for mutations in the hMLH1 gene. Once the search for the disease-causing mutation is successful, family members at risk can be directly screened for this mutation, and close clinical surveillance for cancer can then be instituted.

Figure 3.

hMLH1 and hMSH2 expression analysis. Representative expression analysis performed by immunohistochemistry in tumors that normally express hMLH1 and hMSH2. (A) Normal tissue expresses hMLH1, while the tumor lacks hMLH1 expression. (B) Normal tissue expresses hMSH2, and the tumor does not express the protein. Absence of protein expression implicates that gene as a cause of the MSI.

A recent study from Ohio of over 1000 colorectal cancers indicates that 19.5% will have MSI (12.7% MSI-H, and 6.8% MSI-L), and that IHC testing for hMSH2, hMLH1, and hMSH6 expression will identify 93% of those. IHC is more readily available to nonreferral centers than is MSI analysis. By using an approach in which all tumors were tested for MSI, and subjecting high-risk patients (i.e., those who were < 50 yrs old, had multiple cancers or a positive family history) to IHC analysis, the investigators found that 2.2% of unselected colorectal cancer cases had detectable germline mutations in one of the DNA mismatch–repair genes. There were no instances in which the MSI-L tumors represented Lynch syndrome, so the MSI-L finding does not appear to be relevant in the context of this disease. More importantly, this systematic approach revealed that over half of Lynch syndrome patients with colorectal cancer in the community are older than 50 years, and only 22% had clinical histories that met the Amsterdam criteria. This group also reported that mucinous cancers may give falsely negative MSI test results.31

Mutations in mismatch–repair genes may result in truncated proteins, but they may also be missense mutations or other variants such as mutations in splice sites. The inability to detect hMSH2 and hMLH1 mutations in half of the patients with Lynch syndrome implies that the genetic screening techniques currently used are still imperfect. Also, there are no rapid methods to screen through the putative Lynch syndrome genes, such as hMLH3, hPMS2, or EXO1.

Large deletions in DNA MMR genes in which the breakpoints occur outside of exons are not detectable by routine DNA sequencing, but there are approaches that will detect these mutations. The wild-type allele (from the unaffected parent) will mask the mutant allele in the process of sequencing. To avoid this problem, the “conversion” technique has been used to separate paternal and maternal alleles, which permits the differential analysis of the two alleles, leading to detection of many unique pathogenic mutations.32

It has more recently been recognized that germline mutations in hPMS2 may lead to Lynch syndrome, and loss of the hPMS2 protein can be detected using IHC.33–35 Unfortunately, germline mutations in this gene are difficult to find because of the presence of multiple pseudogenes, which obscure their detection, and there is no commercially available test for this gene.

Finally, some familial clusters of colorectal cancer do not have MSI in their tumors, and no germline mutation can be found in hMSH2, hMLH1, or hMSH6 by careful testing, including the use of conversion technology to unmask large deletions. These familial clusters, which make up about 40% of bona fide colorectal cancer families, do not appear to have increased risks for cancer outside the colon, have a reduced penetrance for the cancer phenotype, and have a later onset of disease compared with patients in Lynch syndrome families. This has been termed “familial colorectal cancer–type X.”36

Sporadic Colorectal Cancer

Nonhereditary colorectal cancer is among the most common malignancies worldwide. It is characterized by a stepwise progression from normal colonic epithelium to malignant growth, with sequential molecular abnormalities associated with each step.4 In the early 1990s, it was elegantly shown that colorectal tumors arise as a consequence of the accumulation of activated oncogenes and inactivated tumor-suppressor genes.37 Research in the field has subsequently revealed that there are multiple processes involved in the generation of the necessary mutations and different groups of target genes that mediate different biologic behaviors seen in tumor cells.

Historic background and molecular pathogenesis

Adenoma–carcinoma sequence.

In 1988, it was reported that the use of restriction fragment length polymorphism (RFLP) analysis allowed a characterization of the frequency and locations of chromosomal losses in colorectal cancers.38 It was shown that in most cases of sporadic colon cancers, multiple chromosomal segments were deleted, that a few cancers had no detectable deletions whereas some had deletions at > 50% of the loci examined, and that each cancer appeared to have a unique pattern of deletions. When compared with normal tissue, allelic imbalances were noted in tumor tissues, which gave rise to the term “loss of heterozygosity” (LOH). It was shown that specific chromosomal loci showed particularly high frequencies of allelic losses, such as on chromosomes 17p, 18q and 5q.37 It was next demonstrated that the tumor-suppressor gene residing on chromosome 17p was the p53 gene.39 In colorectal cancer, 75% of tumors had undergone LOH at the 17p locus, and, in most cases, the other p53 allele had been inactivated by mutation, which is consistent with the prediction that biallelic inactivation would be found for tumor-suppressor genes in cancers. Allelic deletions of the APC gene on chromosome 5q have been observed in up to 50% of carcinomas and in about 30% of adenomas,38, 40 and truncating mutations are found in 63% and 60% of adenomas and carcinomas, respectively.41

The story was more complicated for 18q, the most common site for LOH in colorectal cancers. In 1990, a candidate gene on chromosome 18q called the deleted in colorectal carcinoma gene (DCC) was identified by sequencing of this region. Although deletions were found in over 70% of the tumors, inactivating mutations were not found in the residual DCC allele, and its role in multistep carcinogenesis was not firmly established. Subsequently, the SMAD-4 and SMAD-2 genes were found to be deleted on 18q, and these genes have been more clearly shown to have characteristics of tumor suppressors; thus, the role of DCC in colorectal cancer has remained enigmatic.

The presence of 18q alterations has subsequently been proposed as a prognostic marker for survival. Stage II colon cancers (i.e., invasive into the muscularis propria) are associated with significantly better overall survival than Stage III malignancies (i.e., with regional lymph node involvement). However, in Stage II cancers with LOH at 18q, the outcome appears to be equivalent to that seen in Stage III colorectal cancers.42, 43

Before the characterization of inactivating alterations in tumor-suppressor genes, activating mutations of the oncogene k-ras were reported in colorectal cancer. The oncogene k-ras on chromosome 12 encodes a protein that transmits extracellular growth signals to the nucleus. Mutations of k-ras are found in 50% of large polyps and colorectal cancers.38, 44 Mutations in k-ras are always activating missense alterations that lead to continuous growth signals. Missense mutations are found exclusively in codons 12, 13, and 61. It has been demonstrated that mutations in k-ras correlate with the size of lesions and progression to dysplasia. In small adenomas (< 2 cm), k-ras mutations are found in 14%, whereas 33% of the larger adenomas (> 2 cm) carry these alterations. In moderately dysplastic adenomas, k-ras mutations are found in 33%, whereas highly dysplastic lesions have mutations in up to 50%.37

k-ras mutations have also been evaluated for their usefulness as a diagnostic marker. As tumor DNA is shed in the stool, it can be extracted and amplified by polymerase chain reaction (PCR). Mutations of k-ras in the stool correspond well with alterations found in the tumor.45 In initial pilot studies, analyses of DNA extracted from stool samples detected almost 90% of all tumors carrying k-ras mutations. However, because only a subset of tumors carries k-ras mutations, the overall sensitivity of this method was only 33%, which indicated that substantial improvement would be required to develop a valid test for use in the clinical setting. A second problem was that k-ras mutations are not specific for colorectal cancer and can also be found in dysplastic (or inflamed) lesions of the pancreas, upper gastrointestinal tract, and in respiratory neoplasms.

Microsatellite instability

Tumors that develop through inactivation of tumor-suppressor genes and multiple allelic losses are referred to as having “chromosomal instability” (CIN) and evolve through the “suppressor” pathway.46 Although this was the initial model for multistep tumor progression, we now know that there are other mechanisms of tumor development in the colon. While looking for LOH events using microsatellite markers, three independent research groups detected a second, completely unrelated type of genomic instability.20, 47, 48 This type of hypermutability, termed MSI, is present in about 15% of all colorectal tumors and in more than 90% of cancers in Lynch syndrome. In fact, attempts to detect LOH at the first locus linked to Lynch syndrome (a microsatellite located in an intron of hMSH2) actually demonstrated MSI, and it was properly interpreted that a novel mechanism of tumorigenesis was involved in this hereditary disease.46

It has been recognized that most colorectal cancers in Lynch syndrome have the MSI phenotype, and that this group makes up perhaps 3% of colorectal cancers. However, about 15% of all colorectal cancers are MSI. Most of the nonfamilial group with MSI (about 75% of those with MSI) do not express hMLH1 in tumor tissue. Neither mutations nor LOH of hMLH1 were found in these tumors, which led to a search for yet another mechanism involved in the evolution of colorectal cancer.

In 1998, evidence began to accumulate that hMLH1 was inactivated by epigenetic modification of its promoter in most of the sporadic tumors with MSI.49, 50 Genes can be inactivated when the promoter undergoes transcriptional silencing by hypermethylation at CpG islands (i.e., clusters of cytosine residues followed by a guanosine), which are present in about 50% of promoters in the human genome.51 This is a common mechanism by which cells regulate gene expression and is the means by which genes are silenced during inactivation of one of the X chromosomes in women.

Multiple molecular pathways

As alluded to above, it has become clear that colorectal carcinogenesis is genetically complex and that multiple molecular pathways exist in colorectal tumorigenesis in addition to the classic suppressor and mutator mechanisms.52 In an analysis of 209 Stage II and Stage III sporadic colorectal cancers, approximately 14% were classified as MSI-H, whereas the remainder were MSI-L or microsatellite stable (MSS). It was shown that 51% of the cancers showed characteristics of the suppressor pathway (defined by finding LOH events), and 3.4% of tumors presented with characteristics of both suppressor and mutator pathways (LOH and MSI-H). More noteworthy, however, 38% of tumors did not demonstrate characteristics of either type of genomic instability (Fig. 4). These data suggested that some colorectal cancers evolve through an overlap of the suppressor and the mutator pathway, but more importantly, a significant proportion of colon cancers may have progressed through an as yet undefined molecular mechanism of genomic instability (Fig. 5).

Figure 4.

Exclusiveness and overlap among subsets of genomic instability. Summary of genomic instability patterns for 209 sporadic colon cancers. Tumors were screened for allelic losses (LOH) and MSI.

Figure 5.

Multiple pathways to colorectal cancer. This figure summarizes the different pathways and genetic targets that can lead to colorectal cancer. In most cases, the wnt signaling pathway is disrupted by either mutations or deletions of the APC gene or by mutations in the β-catenin gene. Tumor formation can proceed through the CIN pathway, characterized by multiple allelic losses of tumor-suppressor genes and mutations of the oncogene, k-ras. Alternatively, mutations of MMR genes lead to the MSI phenotype as in the Lynch syndrome or with acquired inactivation of the hMLH1 gene. Another mechanism of colorectal carcinogenesis occurs through the CpG island methylator phenotype (CIMP), which silences genes through promoter methylation. CIMP can progress through silencing the hMLH1 gene causing the MSI phenotype (CIMP+/MSI+). Alternatively, a variety of tumor-suppressor genes other than hMLH1 can be silenced through promoter methylation (CIMP+/MSI-, or CIMP+/MSI-L).

Epigenetic gene silencing

Epigenetic gene silencing by promoter methylation has been found to inactivate numerous tumor-suppressor genes in a variety of tumors.53 As mentioned, hypermethylation of the promoter region of hMLH1 is responsible for a lack of expression of this protein, and accounts for virtually all MSI in nonhereditary colorectal cancers.49 The tumors with this phenotype tend to occur in patients > 70 years old, whereas the majority of Lynch syndrome colorectal cancers occur in people < 55 years old. Moreover, it was reported that both alleles of hMLH1 were hypermethylated in five of six MSI colon cancer cell lines that lacked identifiable mutations in mismatch–repair genes.50 This distinct pathway of colorectal tumorigenesis, involving transcriptional silencing of selected genes in cancers, has been termed the CpG island methylator phenotype or CIMP.54 It has been proposed to be distinct from the increase in methylation observed with advancing age, although it is still not clear that the aging-related (Type A) and cancer-related (Type C) processes are actually different.55

These new findings have focused attention on the contributions of epigenetics to tumorigenesis, and there is growing evidence that epigenetically mediated gene silencing can selectively produce loss of key functions in cancer (Table 3).56

Table 3. Genes Silenced by Hypermethylation in Colorectal Cancer

Function                      Genes silenced
Cell cycle control            Rb, p16 (INK4a), p15, p73, RARβ2
Apoptosis                     DAP-kinase, p14ARF, APAF1, caspase-8, TMS1, PTEN
Cell adhesion and invasion    E-cadherin, VHL, APC, TIMP3, thrombospondin1, HPP1
Growth factors                estrogen receptor, androgen receptor, RASSF1A, endothelin-B-receptor

Molecular and genetic profiles and therapeutic strategies

Sporadic colon tumors evolving through the mutator pathway are always MSI-H, and develop through methylation-based silencing of the promoter of hMLH1.49 It has been shown that patients with these tumors have a better prognosis in terms of overall survival compared with patients who have MSS tumors.27 However, when treated with 5-fluorouracil (5-FU) chemotherapy, these tumors do not respond to treatment as well as MSS tumors.57 There is a trend toward reduced survival in patients with MSI-H tumors subjected to 5-FU chemotherapy, although this remains the standard treatment regimen for advanced stage colorectal cancers.58 In vitro studies have demonstrated that MSI-H colorectal cancer cell lines that are deficient in mismatch–repair by either mutational inactivation or promoter hypermethylation of the hMLH1 gene are relatively resistant to 5-FU treatment when compared with mismatch–repair proficient cell lines.59, 60 However, the restoration of mismatch–repair function by the insertion of an intact hMLH1 gene (by transfer of chromosome 3) renders the cells sensitive to 5-FU treatment. The same result was obtained by treatment of cells that were mismatch–repair deficient through promoter hypermethylation of hMLH1 with the DNA demethylating agent 2-deoxy-5-azacytidine. By demethylating the hMLH1 gene, the cells reexpressed hMLH1 transcripts and protein after a 24-hour treatment with the demethylating agent, and the cells regained sensitivity to 5-FU. These findings have significant clinical implications, as the findings could lead to customized treatment options for MSI-H sporadic colorectal cancers by first “sensitizing” them with a demethylating agent and then initiating a conventional treatment regimen.

These findings demonstrate the potential importance of molecular characterization of colorectal cancer, especially for choosing optimal chemotherapeutic regimens. A study was conducted to optimize the characterization of colorectal cancers evolving through the mutator pathway.61 These tumors are currently best identified with five consensus microsatellite markers, which were established at an international consensus conference on MSI in colorectal cancer sponsored by the National Cancer Institute (NCI) in 1997.27 The study compared the diagnostic power of standard microsatellite analysis, hMLH1 expression assessed by immunohistochemistry, and hMLH1 promoter hypermethylation in a total of 27 MSI-H colorectal cancers. Relying solely on expression analysis or on promoter methylation of the hMLH1 gene may miss a significant number of MSI-H cancers that are detected by conventional microsatellite testing (Table 4).61

Table 4. Correlation of MMR Gene Expression and Methylation Status in MSI-H Cancers50
Tumors | Methylated no. (%) | Unmethylated no. (%) | Total no.
Lack of hMLH1 expression | 14 (93) | 1 (7) | 15
hMLH1 expression | 5 (42) | 7 (58) | 12
Total | 19 (70) | 8 (30) | 27
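
The counts in Table 4 can be tallied to show how many of the 27 MSI-H cancers each single test would flag. The following is a minimal Python sketch of that arithmetic; the dictionary layout and the helper name `detected_by` are illustrative assumptions, and the "either test" row is a derived figure, not one reported in the study.

```python
# Counts of MSI-H cancers from Table 4, keyed by
# (hMLH1 expression, hMLH1 promoter methylation status).
table4 = {
    ("absent", "methylated"): 14,
    ("absent", "unmethylated"): 1,
    ("present", "methylated"): 5,
    ("present", "unmethylated"): 7,
}

total = sum(table4.values())  # all cancers are MSI-H by microsatellite testing


def detected_by(rule):
    """Number of MSI-H cancers that a surrogate test (rule) would flag."""
    return sum(n for (expr, meth), n in table4.items() if rule(expr, meth))


by_expression = detected_by(lambda expr, meth: expr == "absent")
by_methylation = detected_by(lambda expr, meth: meth == "methylated")
by_either = detected_by(lambda expr, meth: expr == "absent" or meth == "methylated")

for label, n in [
    ("Loss of hMLH1 expression alone", by_expression),
    ("hMLH1 promoter methylation alone", by_methylation),
    ("Either test (derived, not reported)", by_either),
]:
    missed = total - n
    print(f"{label}: {n}/{total} flagged, {missed} missed ({missed / total:.0%})")
```

Under these counts, expression analysis alone would miss 12 of 27 cancers and methylation analysis alone would miss 8 of 27, which is the shortfall relative to microsatellite testing that the text describes.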

Molecular Tests for Cancer Diagnosis and Prognosis

Insight into molecular defects in colorectal cancer has not only elucidated the pathogenic role of such lesions but has also allowed the use of new molecular diagnostic tools for early diagnosis, screening, and prevention. The effectiveness of screening depends in large part on accessibility and on the screening tools themselves. Several methods to screen for colorectal cancer in asymptomatic individuals have been available for some time and have proven their efficacy in several controlled trials.62 Stool-based screening tests have been in use for decades but have had a relatively modest effect on colorectal cancer mortality because of low patient compliance and insufficient test sensitivity and specificity.63, 64

DNA-based fecal tests

Recently, attention has turned to DNA-based stool tests, which promise to be more accurate than conventional methods of colorectal cancer screening. DNA shed from tumors has been shown to be sufficiently stable in stool to be extracted and amplified by PCR in order to screen for cancer-associated genetic alterations. The first pilot study of this approach attempted to detect fecal k-ras mutations.45 This test is highly feasible, and the assay itself is sensitive, because tumor DNA carrying k-ras mutations is shed into the stool and the mutations occur at a limited number of “hot spots” in codons 12 and 13. The main limitation is low sensitivity for cancer overall, as k-ras mutations occur in only approximately 50% of colorectal cancers.
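
Because the hot spots are confined to two codons, the logic of a mutation call is simple to illustrate. The Python sketch below is a toy example only and is not the PCR-based method used in the cited study. It assumes the wild-type codons GGT (codon 12) and GGC (codon 13), both encoding glycine, taken from the human k-ras reference sequence rather than from this article; the function name `call_hotspots` and the input format are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Assumed wild-type codons at the two k-ras hot spots (both glycine).
WILD_TYPE = {12: "GGT", 13: "GGC"}


def call_hotspots(observed):
    """Return amino-acid changes (e.g. 'G12D') at codons 12 and 13.

    `observed` maps codon number to the three-base codon read from a sample.
    Silent changes that still encode glycine are not reported.
    """
    calls = []
    for codon_number, wild_codon in WILD_TYPE.items():
        codon = observed[codon_number].upper()
        wild_aa = CODON_TABLE[wild_codon]
        observed_aa = CODON_TABLE[codon]
        if observed_aa != wild_aa:
            calls.append(f"{wild_aa}{codon_number}{observed_aa}")
    return calls


print(call_hotspots({12: "GAT", 13: "GGC"}))  # ['G12D']
```

A sample with wild-type codons at both positions returns an empty list, which in this sketch corresponds to a negative test regardless of whether a tumor is present.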

Another single gene study looked for APC mutations in fecal DNA.65 Testing for truncating mutations in this gene, the investigators identified APC alterations in 26 of 46 subjects with neoplasia (57%), and in none of the controls.

The same laboratory studied the microsatellite marker BAT-26 in feces of patients with proximal sporadic colorectal cancers, looking for evidence of MSI.66 By using a PCR-based method to detect microsatellite mutations, the researchers selected 18 of 46 cancers that were MSI-H, and identical mutations were observed in stool in 17 of these 18 cases. However, as stated above, genetic alterations in colorectal cancer are highly heterogeneous, and multiple rather than single genes may help achieve a more sensitive assay.

A few studies of tests for multiple genetic mutations in fecal DNA have been published (Table 5). Dong et al. used a panel of three genetic targets (p53, k-ras, and the mononucleotide repeat marker BAT-26) to detect tumor-associated alterations in feces from patients with colorectal cancers.66 This panel, which was designed to cover both the suppressor and the mutator pathways, detected alterations in 36 of 51 (71%) patients with colorectal cancer in this pilot study.

Table 5. Multitargeted DNA-based Stool Tests for the Detection of Colorectal Cancer
Study (reference) | Cases | DNA markers | Sensitivity | Specificity
Ahlquist 2000 (68) | 22 cancers; 11 adenomas; 28 controls | k-ras, p53, APC, BAT 26, long-DNA | 91% (20/22) cancers; 82% (9/11) adenomas | 93% (26/28)
Dong 2001 (67) | 51 cancers | k-ras, p53, BAT26 | 71% (36/51) | nd
Calistri 2003 (69) | 56 cancers; 38 controls | k-ras, p53, APC, 5 MSI loci, long DNA | 51% (27/56); 59% (long-DNA, k-ras and p53) | 97% (37/38)
Imperiale 2004 (70) | 2507 subjects; 31 cancers | k-ras, p53, APC, BAT26, DNA integrity (long DNA) | 51.6% | nd
Petko 2005 (72) | 42 adenomas; 44 HPs; 25 controls | methylation of p16, hMLH1, MGMT | nd | nd
HP, hyperplastic polyps; nd, not done.

A study conducted by Ahlquist et al. targeted mutations in k-ras, APC, and p53 as well as the MSI marker BAT-26.67 Hot-spot mutations in the k-ras, APC, and p53 genes were targeted for detection. In addition, “long DNA,” which is presumably exfoliated into the stool by nonapoptotic dysplastic colonocytes, served as a molecular target because these DNA fragments are longer than those shed from normal colonocytes. The sensitivity of the assay was 91% for colon cancers and 82% for adenomas. Specificity was 93% and increased to 100% when k-ras was excluded.
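
These percentages rest on small groups (20 of 22 cancers, 9 of 11 adenomas, and 26 of 28 controls, as listed in Table 5). As a rough illustration of how much uncertainty such sample sizes carry, the Python sketch below recomputes the point estimates and adds approximate 95% Wilson score intervals. The intervals are an assumption of this example and were not reported in the cited study; the function name `wilson_interval` is illustrative.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Counts as listed in Table 5 for the Ahlquist et al. study.
results = {
    "Sensitivity, cancers": (20, 22),
    "Sensitivity, adenomas": (9, 11),
    "Specificity, controls": (26, 28),
}

for label, (k, n) in results.items():
    low, high = wilson_interval(k, n)
    print(f"{label}: {k}/{n} = {k / n:.0%} (approx. 95% CI {low:.0%} to {high:.0%})")
```

The width of the intervals is the reason pilot figures of this kind need confirmation in larger screening populations.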

A study by Calistri et al. sought mutations in p53 exons 5–8, k-ras exons 1–2, four fragments of APC exon 15, and 5 microsatellite loci in feces and tumors of patients with colorectal cancers.68 In addition, long DNA was evaluated by amplifying longer sequences of both p53 and APC. The most frequent alterations in tumors were k-ras (34%), p53 (34%), MSI (13%), and APC mutations (13%). The most frequently detected alterations in stool were long DNA (51%), k-ras mutations (11%), p53 mutations (6%), MSI (6%), and APC mutations (2%). Interestingly, multiple molecular abnormalities in feces of tumor patients were not frequently found. Molecular alterations were not observed in stools of healthy individuals, except in one case with long DNA, demonstrating the specificity of this assay.
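
The per-marker stool detection rates also bound what a combined panel can achieve: the combined rate cannot be lower than the best single marker, and it cannot exceed the sum of the individual rates, which would be reached only if no patient were positive for more than one marker. The Python sketch below applies this to the three markers for which Table 5 reports a combined rate of 59% (long DNA, k-ras, and p53). The function name is illustrative, and the bounds are a general property of unions rather than an analysis from the cited study.

```python
def combined_sensitivity_bounds(marker_rates):
    """Lower and upper bounds on the detection rate of a marker panel.

    Lower bound: the best single marker (complete overlap between markers).
    Upper bound: the sum of the rates, capped at 1 (no overlap at all).
    """
    rates = list(marker_rates)
    return max(rates), min(1.0, sum(rates))


# Stool detection rates reported in the text for Calistri et al.
stool_rates = {"long DNA": 0.51, "k-ras": 0.11, "p53": 0.06}

lower, upper = combined_sensitivity_bounds(stool_rates.values())
print(f"Combined detection rate must lie between {lower:.0%} and {upper:.0%}")
```

The reported 59% lies between these bounds and closer to the upper one than to the lower, which is consistent with the observation that multiple abnormalities were seldom found in the same stool sample.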

DNA-based stool tests have the potential to improve the diagnostic yield of conventional colorectal cancer screening methods. Moreover, compared with fecal occult blood tests, DNA-based assays also have the potential to detect asymptomatic adenomas, most of which are not detectable by occult blood testing.67 Imperiale and colleagues compared a high-throughput fecal DNA test with a standard fecal occult blood test.69 The fecal DNA panel consisted of 21 mutations commonly found in colorectal cancer. The fecal DNA test detected 51.6% of 31 invasive cancers, whereas the fecal occult blood test detected only 12.9% of these cases (P = 0.003). The recognition that a large proportion of colorectal cancers progress through the methylator pathway of carcinogenesis suggested a further approach to diagnosis: in a proportion of both colon polyps and invasive cancers, promoter methylation of common tumor-suppressor genes can be detected in human DNA extracted from stool.70, 71 However, high specificity notwithstanding, the sensitivity of these assays must be improved before they can serve as reliable screening approaches.

In the final analysis, a general application of this approach will require evaluation of the test under clinical conditions, in which stool samples are first collected by patients and sent to the processing laboratories, which may add variables not anticipated in the pilot studies discussed above. A DNA-based stool test is currently commercially available as PreGen-Plus, from Exact Sciences (Marlborough, MA).

Molecular markers and prognosis

Molecular markers have also been used to predict survival and response to chemotherapy. MSI-H cancers have been shown to have a better overall prognosis in terms of disease-free and overall survival when compared with MSS/MSI-L cancers.60 The latter cancers tend to have a worse outcome, particularly when there are alterations on chromosome 18q in Stage III colorectal cancers.43 Similarly, MSI-H cancers have a worse prognosis in the presence of mutations of the target gene TGFβ-RII.

In vitro, MSI-H colorectal cancer cell lines have been shown to be relatively resistant to 5-FU and other cytotoxic drugs compared with mismatch–repair-proficient cell lines.59, 60 These findings have been confirmed in a retrospective clinical study, which showed that 1) untreated MSI-H cancers have a better outcome than MSS cancers, and 2) 5-FU–based chemotherapy does not improve the outcome of MSI-H colorectal cancers, but rather impairs survival, compared with MSS cancers.57, 58 MSI analysis may serve as a predictive factor in the future and may influence the selection of treatment strategies. The possibility that a different therapeutic approach may be helpful was suggested by a study investigating the response of MSI-H tumors to irinotecan.72 Patients with MSI-H cancers responded significantly better than patients with MSI-L or MSS tumors, suggesting that this treatment may be more suitable for colorectal cancers originating from the mutator pathway.

Another approach has been investigated by the Vogelstein laboratory, which studied the role of chromosomal imbalances in 180 colorectal cancer patients with no evidence of lymph-node involvement or distant metastases. DNA from the tumors was tested for imbalances of chromosomes 8p and 18q by digital single-nucleotide polymorphism (SNP) analysis. Patients whose tumors retained alleles of both chromosomes had a 5-year survival of 100%, compared with 74% for patients whose tumors showed allelic imbalance of either 8p or 18q, and 58% for those whose tumors had allelic imbalances on both chromosomes. Most importantly, these findings were independent of tumor stage.73


In summary, the molecular characterization of colorectal cancer has opened a wide spectrum of screening tools for early diagnosis and prevention of colorectal cancer. Many new molecular markers have been defined and are currently under investigation to help optimize treatment regimens and to predict survival. These molecular screening tests must be compared with conventional screening methods and validated in prospective clinical trials. For these novel and provocative techniques to move into clinical practice, each must be optimized and shown to provide cost-effective improvements in patient management.