Model-based prediction of defective DNA mismatch repair using clinicopathological variables in sporadic colon cancer patients
Colon cancers with defective DNA mismatch repair (MMR) have a favorable prognosis and may lack benefit from 5-fluorouracil–based adjuvant chemotherapy. The authors developed models to predict MMR deficiency in sporadic colon cancer patients using routine clinical and pathological data.
TNM stage II and III colon carcinomas (n = 982) from 6 5-fluorouracil–based adjuvant therapy trials were analyzed for microsatellite instability and/or MMR protein expression. Tumor-infiltrating lymphocytes (TILs) were quantified (n = 326). Logistic regression and a recursive partitioning and amalgamation analysis were used to identify predictive factors for MMR status.
Defective MMR was detected in 147 (15%) cancers. Tumor site and histologic grade were the most important predictors of MMR status. Distal tumors had a low likelihood of defective MMR (3%; 13 of 468); proximal tumors had a greater likelihood (26%; 130 of 506). By using tumor site, grade, and sex, the logistic regression model showed excellent discrimination (c statistic = 0.81). Proximal site, female sex, and poor differentiation showed a positive predictive value (PPV) of 51% for defective MMR. In a patient subset (n = 326), a model including proximal site, TILs (≥2/high-power field), and female sex showed even better discrimination (c statistic = 0.86), with a PPV of 81%.
Defective MMR is rare in distal, sporadic colon cancers, which should generally not undergo MMR testing. Among proximal, poorly differentiated tumors in female patients, 51% show defective MMR (the PPV of this combination); substituting TILs for grade increases the PPV to 81%. These data can increase the efficiency of MMR testing to assist in clinical decisions. Cancer 2010. © 2010 American Cancer Society.
Although the majority of sporadic colorectal cancers (CRCs) show chromosomal instability, an alternative pathway of tumorigenesis is characterized by defective DNA mismatch repair (MMR).1 Microsatellite instability (MSI) is a molecular marker of defective function of the MMR system.1, 2 Most CRCs with high-frequency MSI are sporadic, wherein the MMR defect develops because of inactivation of the hMLH1 gene by DNA methylation.1, 3, 4 Of the approximately 150,000 incident CRC cases that were diagnosed in the United States in 2008,5 at least 20,000 patients are expected to have sporadic MMR-deficient tumors. The identification of patients with sporadic MMR-deficient tumors has potentially important clinical implications. Multiple retrospective studies,6-9 including a population-based study10 and a meta-analysis,11 have demonstrated that patients with MMR-deficient colon cancers have a more favorable stage-adjusted prognosis compared with patients whose tumors have intact MMR. Although the explanation for the better outcome of MMR-deficient colon cancers is poorly understood, we previously reported that a higher density of TILs, most of which are CD3+ T lymphocytes,12 was associated with better disease-free survival in cases with defective versus intact MMR.13 Evidence also indicates that patients with colon cancers with defective MMR do not benefit from 5-fluorouracil–based adjuvant therapy, in contrast to patients whose tumors exhibit intact MMR.11, 14-18 More recently, a retrospective study and pooled analysis has validated the prognostic and predictive impact of MMR status in patients treated in 5-fluorouracil–based adjuvant therapy trials in North America and Europe.19 In these studies, MMR status has been determined by MSI testing or by immunohistochemical evaluation of MMR protein expression.
MSI testing requires a molecular laboratory and is analyzed by detecting instability at selected microsatellite loci in tumor tissue.2, 20 Analysis of tumors for MMR proteins by immunohistochemistry (IHC) is an alternative or complementary test, performed in a routine pathology laboratory, and has been shown to be highly concordant with MSI testing.21
MMR-deficient colon cancers have distinct clinical and pathological features that include proximal colon predominance, poor differentiation or medullary-type pattern with or without mucinous histology, and intratumoral lymphocytic infiltration.6-8, 15, 22-25 Prior studies have examined histopathological features in an effort to prioritize sporadic colon cancers for MSI testing, yet morphological prediction of high-frequency MSI had low sensitivity.22 Sporadic MMR-deficient tumors are distinct epidemiologically from Lynch syndrome cases in that they are associated with older age at diagnosis, female sex, and cigarette smoking.4, 25, 26 However, established criteria to identify and select sporadic colon cancers for MSI testing are lacking. This limitation represents an impediment to the use of MMR status for prognostication and patient management in clinical practice. In contrast, Amsterdam and Bethesda criteria were developed to identify patients with Lynch syndrome.27, 28 Screening for Lynch syndrome detected germline MMR mutations in only 2.8% of CRC cases.29 The revised Bethesda guidelines for Lynch syndrome recommend MSI testing for all CRCs in patients diagnosed before age 50 years and for those cancers diagnosed between ages 50 and 59 years with particular pathology features.28, 30 In a recent study, an algorithm was developed for identifying Lynch syndrome patients using morphological tumor features that was restricted to patients under the age of 60 years.31 However, there is no such algorithm or predictive model to identify sporadic MMR-deficient colon cancers.
The potential clinical utility of MMR status as a prognostic and predictive marker provides a strong impetus for developing models to predict the presence of defective MMR. Therefore, the objective of our study was to develop a predictive model for identifying MMR deficiency in sporadic colon cancer patients. Because only a subset of colon cancers show defective MMR, universal screening would be neither specific for sporadic cases nor cost-effective. Furthermore, such a model should use routine clinical and pathological data to have the most utility in clinical practice.
MATERIALS AND METHODS
The study population consisted of 982 patients with surgically resected TNM stage II and III colonic adenocarcinomas enrolled in 6 randomized 5-fluorouracil–based phase 3 adjuvant studies conducted by Mayo Clinic and the North Central Cancer Treatment Group (NCCTG). Details of these completed clinical trials have been previously reported.32, 33 Clinical and pathological data, collected in accordance with study protocol requirements, were obtained from the NCCTG study database. Available paraffin-embedded tumor blocks from a nonrandom subset (n = 982) of all study participants that contained sufficient tumor tissue for analysis were used. Primary tumor site was defined relative to the splenic flexure, and tumors located at the splenic flexure were included in the distal category. The current analysis was in accordance with the original informed consent document. Of the 982 patients, 821 were randomized to treatment arms, and 161 received observation alone.
Analysis of DNA Mismatch Repair
Primary colon carcinomas were analyzed for defective DNA MMR by molecular analysis of MSI, as described below, and/or by determination of MMR protein expression by IHC.
Tumors (n = 423) had been analyzed for MSI by detecting instability at selected microsatellite loci as previously described.6, 8, 25, 33 In 200 of these cases, IHC was also performed as described below. The tumors analyzed were composed of at least 60% tumor cells and had a minimum of 5 markers that successfully amplified for both normal DNA and tumor DNA. For patients with MSI data, tumors were classified as high-frequency MSI, low-frequency MSI, or microsatellite stable (MSS) as previously described.8, 25 Given the preponderance of evidence indicating that tumors with low-frequency MSI are not biologically distinct from MSS tumors, we grouped them together in all analyses.7, 8, 25, 34-36 In 1 adjuvant study (NCCTG 91-46-53),33 MMR status was determined by analysis of instability at BAT26 and MMR protein expression (hMLH1, hMSH2, hMSH6). In this study (n = 396), defective MMR was defined by the presence of MSI at the marker BAT26 coupled with absence of protein expression for hMLH1, hMSH2, or hMSH6. For patients who were stable at BAT26 and showed normal protein expression for all 3 proteins, 16 additional microsatellite markers were used as previously described.33 If >30% of the markers showed MSI, then that tumor was classified as high-frequency MSI, and additional IHC testing was done with antibody for PMS2.
MMR protein expression
Paraffin-embedded tumors (n = 758) were analyzed for hMLH1 and hMSH2 proteins; hMSH6 (n = 387) and PMS2 were analyzed in a tumor subset. The IHC methods and antibodies used were as previously described.37 Slides were scored by a pathologist as either positive or negative based on the presence or absence of nuclear staining for each MMR protein in the tumor cells. Each slide contained a unique number that enabled blinding with respect to patient identity and clinical characteristics.
All colon cancers were evaluated by pathologists at Mayo Clinic or at NCCTG participating institutions. Tumor histologic grade was categorized as grade 1, well differentiated; grade 2, moderately differentiated; grade 3, poorly differentiated; and grade 4, undifferentiated.38 As required by the study protocols, representative blocks of normal and tumor tissue were sent to Mayo Clinic, Rochester for confirmation of the diagnosis of invasive adenocarcinoma of the colon. Histologic parameters were evaluated on H & E-stained slides without knowledge of MSI status.
Tumor-infiltrating lymphocytes (TILs) were analyzed in a patient subset (n = 326) by morphology in H & E-stained tissue sections at light microscopy at Mayo Clinic. The number of TILs was determined by counting 5 consecutive high-power fields (HPFs; ×40). The TIL count was expressed as the mean per HPF and categorized as follows: 1+ (0-1), 2+ (2-4), 3+ (5-8), and 4+ (≥9).13
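As an illustration of the TIL scoring step, the following minimal Python sketch averages the counts from 5 HPFs and applies the ≥2/HPF cutoff that the models use to define increased TILs. The function names are hypothetical, and applying the cutoff directly to the mean (rather than to the 1+ to 4+ category) is an assumption made for the sketch.

```python
from statistics import mean

def tils_per_hpf(counts):
    """Mean TIL count per high-power field from 5 consecutive HPF counts."""
    if len(counts) != 5:
        raise ValueError("expected counts from 5 consecutive HPFs")
    return mean(counts)

def tils_increased(counts, cutoff=2.0):
    """True when the mean TIL count per HPF meets the cutoff (>=2/HPF here).

    Assumption: the dichotomy is applied to the mean per HPF.
    """
    return tils_per_hpf(counts) >= cutoff
```

For example, counts of 3, 2, 4, 1, and 5 give a mean of 3 per HPF, which would be scored as increased TILs under this assumption.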
The chi-square test was used to test for an association between clinicopathological features, and Wilcoxon rank sum tests were used to compare continuous data between the levels of 2-level categorical variables. Logistic regression and a recursive partitioning and amalgamation analysis were used to identify important predictive factors of MMR status. Factors explored included age, sex, histologic grade, tumor site, TNM stage, lymph node metastases, T classification, and TILs. Model discrimination (ie, ability to discriminate patients with different MMR outcomes) was evaluated using the c statistic. The c statistic is analogous to the area under the receiver operating characteristic curve, and it ranges from 0.5 to 1, where a c statistic of 1 provides perfect discrimination, and a c statistic of 0.50 provides predictions that are no better than chance alone. A c statistic value that ranges from 0.70 to 0.80 shows acceptable discrimination, whereas a c statistic value that ranges from 0.80 to 0.90 shows excellent discrimination.39 By using MMR status as the diagnosis of interest, we also assessed the sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) for specific model categories of interest. Statistical tests were 2-sided, with P ≤ .05 considered significant. P values were not adjusted for multiple comparisons. Statistical analyses were performed using SAS (SAS Institute, Cary, NC).
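To make the c statistic concrete, the sketch below computes it as the proportion of (defective, intact) patient pairs in which the defective-MMR patient received the higher predicted score, with tied scores counted as one half. This is the standard pairwise definition and not the SAS procedure used in the study; the function name and the toy scores in the usage note are illustrative assumptions.

```python
def c_statistic(scores, outcomes):
    """Concordance (c) statistic for a binary outcome.

    scores   -- predicted probabilities or risk scores, one per patient
    outcomes -- 1 for defective MMR, 0 for intact MMR
    """
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one patient in each outcome group")
    # Count concordant pairs; a tie contributes half a pair.
    concordant = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return concordant / (len(pos) * len(neg))
```

With toy scores 0.8 and 0.3 for two defective cases and 0.6 and 0.2 for two intact cases, 3 of the 4 pairs are concordant, giving a c statistic of 0.75.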
A total of 982 stage II and III colon cancer patients were included in this study from 6 completed adjuvant therapy trials. Defective MMR, defined as either high-frequency MSI and/or loss of MMR protein expression, was detected in 147 (15%) patients. Both MSI and MMR protein data were obtained on 200 tumors, and the results were concordant in 98% of cases. High-frequency MSI was detected in 75 of 423 (17.7%) tumors tested, and loss of an MMR protein was found in 72 of 559 (12.9%). MLH1 loss was observed in 60 of these 72 (83.3%) cases, MSH2 and/or MSH6 loss was detected in 9 cases, and 2 cases showed loss of PMS2.
A comparison of clinicopathological features in tumors with defective versus intact MMR is shown in Table 1. Defective MMR status was significantly associated with proximal tumor site, poor/undifferentiated histology, female sex, older age at diagnosis, and fewer lymph node metastases compared with tumors with intact MMR (Table 1). We found that 91% of cancers with defective MMR were located in the proximal colon. Patients whose tumors showed defective MMR were more likely to be older than patients whose tumors had intact MMR, using cutpoints of age 60 (P = .0591) or age 70 (P = .0053) years.
Table 1. Clinicopathological Variables Stratified by DNA MMR Status in Human Colon Carcinomas
|Variable||Intact MMR, No. (%)||Defective MMR, No. (%)||P|
|Histologic grade|| || ||<.0001|
| Well/moderate||637 (76.5%)||70 (48.3%)|| |
| Poor/undifferentiated||196 (23.5%)||75 (51.7%)|| |
|Sex|| || ||.0008|
| Female||380 (45.5%)||89 (60.5%)|| |
| Male||455 (54.5%)||58 (39.5%)|| |
|Tumor site|| || ||<.0001|
| Distal||455 (54.7%)||13 (9%)|| |
| Proximal||377 (45.3%)||132 (91%)|| |
|TNM stage|| || ||.0027|
| III||690 (82.6%)||106 (72.1%)|| |
| II||145 (17.4%)||41 (27.9%)|| |
|Age, y|| || ||.0107|
| Mean [SD]||62.1 [10.50]||64.1 [11.11]|| |
| Median||63.0||66.5|| |
| Range||25.0-88.0||31.1-86.2|| |
|Lymph node metastases|| || ||.0081|
| 0||145 (18.3%)||41 (29.1%)|| |
| 1-3||429 (54%)||71 (50.4%)|| |
| >3||220 (27.7%)||29 (20.6%)|| |
|Age group, y|| || ||.0270|
| <50||104 (83.9%)||20 (16.1%)|| |
| 50-59||196 (90.3%)||21 (9.7%)|| |
| 60-64||147 (85.5%)||25 (14.5%)|| |
| 65-69||179 (86.5%)||28 (13.5%)|| |
| ≥70||209 (79.8%)||53 (20.2%)|| |
|T stage|| || ||.1053|
| T1, 2||108 (13.1%)||12 (8.3%)|| |
| T3, 4||718 (86.9%)||133 (91.7%)|| |
We used clinical and pathological variables to construct a model to predict defective MMR, with the objective of increasing the efficiency of MMR testing. Logistic regression and recursive partitioning and amalgamation analyses identified tumor site as the most important predictor of MMR status, followed by histologic grade (Table 2). Distal tumor site was associated with a low likelihood of defective MMR (3% rate overall; 13 of 468), whereas proximal tumors had a greater likelihood of defective MMR (26%; 130 of 506). Regardless of the combination of covariates, the likelihood of a distal colon cancer showing defective MMR did not exceed 5%; thus, all distal tumors were combined into a single category for prediction purposes (Table 3). Patients with proximal colon cancers could be further categorized based on histologic grade and sex (see Table 3). By using the 3 factors of tumor site, histologic grade, and sex, the logistic regression model showed excellent discrimination (c statistic = 0.81) for MMR status across the 5 subgroups (Table 3). Recursive partitioning and amalgamation analysis gave similar results, where tumor site and grade were the most important variables for predicting MMR status (data not shown). The PPV for defective MMR using the combination of proximal tumor site, poor/undifferentiated histopathology, and female sex was 51% (95% confidence interval, 40%-61%). Sensitivity for this same category was 32%, specificity was 95%, and NPV was 89%. In an effort to restrict this model to sporadic colon cancer patients, we excluded patients (n = 26) whose tumors showed loss of MSH2, MSH6, or PMS2 expression, along with patients whose tumors were negative for MLH1 and who were <60 years of age. This subset of 941 patients yielded similar results to those observed in the full study cohort, with a c statistic of 0.84 and a PPV of 48% for the combination of proximal site, poor/undifferentiated histology, and female sex.
As an alternative approach, we limited our analysis to all patients at least 60 years of age and found that the PPV for this model was 56% (c statistic was 0.84).
Table 2. Univariate Association of Covariates With Defective Versus Intact MMR Status (N=982)
|Tumor site (proximal vs distal)||12.26 (6.82-22.07)||0.764 (0.729)|
|Histologic grade (poor/undifferentiated vs well/moderate)||3.84 (2.65-5.59)||0.682 (0.641)|
|Sex (male vs female)||0.54 (0.38-0.77)||0.618 (0.575)|
|Age ≥70 years||1.64 (1.13-2.39)||0.602 (0.555)|
|LN metastasis|| ||0.593 (0.568)|
| 1-3 vs. 0||0.76 (0.46-1.26)|| |
| >3 vs 0||0.61 (0.34-1.11)|| |
|T stage (3, 4 vs 1, 2)||1.58 (0.84-2.95)||0.585 (0.524)|
|Stage (II vs III)||1.51 (0.96-2.39)||0.582 (0.553)|
Table 3. Positive Predictive Value of Covariate Combinations for MMR Status
|Covariate combination||Intact MMR, No. (%)||Defective MMR, No. (%)||Risk category|
|Distal||455 (97%)||13 (3%)||Low risk|
|Proximal, well/moderate, male||151 (85%)||26 (15%)||Average risk|
|Proximal, well/moderate, female||123 (78%)||35 (22%)||Slightly elevated risk|
|Proximal, poor/undifferentiated, male||57 (71%)||23 (29%)||Moderately elevated risk|
|Proximal, poor/undifferentiated, female||45 (49%)||46 (51%)||High risk|
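The sensitivity, specificity, PPV, and NPV reported for the highest-risk category can be recomputed from the Table 3 counts by treating that category as a positive test and all other categories as a negative test. The sketch below does this in Python. The dictionary keys and function names are illustrative, and the Wilson score interval is an assumption: the method behind the reported 95% confidence interval is not stated, although the Wilson interval for 46 of 91 rounds to the same 40%-61%.

```python
from math import sqrt

# (intact MMR, defective MMR) counts for each Table 3 category
TABLE3 = {
    "distal": (455, 13),
    "proximal, well/moderate, male": (151, 26),
    "proximal, well/moderate, female": (123, 35),
    "proximal, poor/undifferentiated, male": (57, 23),
    "proximal, poor/undifferentiated, female": (45, 46),
}

def diagnostic_metrics(table, positive_category):
    """Treat one category as test-positive and all others as test-negative."""
    fp, tp = table[positive_category]
    tn = sum(i for k, (i, d) in table.items() if k != positive_category)
    fn = sum(d for k, (i, d) in table.items() if k != positive_category)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (assumed method, see lead-in)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

metrics = diagnostic_metrics(TABLE3, "proximal, poor/undifferentiated, female")
ppv_low, ppv_high = wilson_interval(46, 91)
```

Rounded to whole percentages, `metrics` gives a sensitivity of 32%, a specificity of 95%, a PPV of 51%, and an NPV of 89%, matching the values reported in the text.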
We then evaluated the ability of TILs, a feature of MMR-deficient colon cancers,23, 40 to increase the utility of the predictive model. TIL density within colon cancers was determined in a patient subset in which 58 of 326 (18%) patients analyzed showed defective MMR. By using logistic regression and recursive partitioning and amalgamation analyses, we found that both TILs and tumor site were important predictors of MMR status, followed by histologic grade and sex (Table 4). We then integrated TILs into the predictive model, at which point tumor grade did not contribute to prediction and was therefore removed. This simplified model included tumor site, TILs, and sex and provided excellent discrimination (c statistic = 0.86) with only 4 categories. The observed rate of defective MMR varied from a low of 3% in the distal patient group to a high of 81% in the proximal, female sex, and increased TILs group (Table 5). Specifically, the combination of proximal site, female sex, and TILs (≥2/HPF) yielded a PPV of 81% for detecting defective MMR (Table 5). This same group yielded a sensitivity of 38%, a specificity of 98%, and an NPV of 88%. This model showed the highest efficiency for selecting patients for MMR testing as compared with the model developed without TILs on the full study cohort. Male sex in this model with proximal site and increased TILs (≥2/HPF) resulted in a PPV of 50%. When excluding patients whose tumors showed loss of MSH2, MSH6, or PMS2 and those who were negative for MLH1 and <60 years of age, modeling of this subset of 312 patients yielded results similar to those of the full study cohort (c statistic = 0.86, PPV of 81%). Alternatively, when including only those patients aged ≥60 years, the PPV was still 81% (c statistic = 0.84). We emphasize that TILs were determined in routine histological sections and can be analyzed at the time of diagnostic pathology assessment. Therefore, we believe that this model is practical and clinically relevant.
Table 4. Univariate Association of Covariates With Defective Versus Intact MMR Status in Colon Cancers (N=326)
|TILs (≥2/HPF)||14.88 (7.56-29.27)||0.791 (0.753)|
|Tumor site (proximal vs distal)||15.80 (5.56-44.90)||0.755 (0.734)|
|Histologic grade (poor/undifferentiated vs well/moderate)||4.02 (2.21-7.32)||0.680 (0.656)|
|Sex (female vs male)||2.53 (1.39-4.62)||0.632 (0.610)|
|Age ≥70 years||2.21 (1.23-3.98)||0.607 (0.589)|
|LN metastasis|| ||0.566 (0.541)|
| 1-3 vs 0||0.74 (0.37-1.49)|| |
| >3 vs 0||0.70 (0.31-1.56)|| |
|Stage (II vs III)||1.41 (0.74-2.68)||0.549 (0.538)|
|T stage (3,4 vs 1,2)||1.98 (0.58-6.79)||0.543 (0.523)|
Table 5. Covariate Models and MMR Status in Colon Cancers
|Distal||144 (97%)||4 (3%)||70%||Low risk|
|Proximal, TILs 1+||109 (83%)||22 (17%)||82%||Average risk|
|Proximal, TILs 2-4+, male||10 (50%)||10 (50%)||84%||High risk|
|Proximal, TILs 2-4+, female||5 (19%)||22 (81%)||88%||Very high risk|
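The 4 categories of Table 5 amount to a simple decision rule on tumor site, TILs, and sex. A minimal Python sketch of that rule follows; it returns the Table 5 risk label together with the observed rate of defective MMR in that category. The function and constant names are hypothetical, and treating any mean TIL count below 2/HPF as the 1+ category is an assumption of the sketch. The rates are observed frequencies in this cohort, not validated probabilities.

```python
# (intact MMR, defective MMR) counts and risk label for each Table 5 category
TABLE5 = {
    "distal": (144, 4, "Low risk"),
    "proximal_low_tils": (109, 22, "Average risk"),
    "proximal_high_tils_male": (10, 10, "High risk"),
    "proximal_high_tils_female": (5, 22, "Very high risk"),
}

def tils_model_category(proximal, tils_per_hpf, female, cutoff=2.0):
    """Return (risk label, observed defective-MMR rate) for one patient."""
    if not proximal:
        key = "distal"  # distal tumors form a single category
    elif tils_per_hpf < cutoff:
        key = "proximal_low_tils"  # assumed to correspond to TILs 1+
    elif female:
        key = "proximal_high_tils_female"
    else:
        key = "proximal_high_tils_male"
    intact, defective, label = TABLE5[key]
    return label, defective / (intact + defective)
```

For a female patient with a proximal tumor and 3 TILs per HPF, the rule returns "Very high risk" with an observed rate of 22 of 27.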
Currently, there are no guidelines used to detect patients whose sporadic colon cancers exhibit MMR deficiency, the majority of which occur in patients over the age of 60 years, as shown here. The importance of recognizing sporadic colon cancers with defective MMR is gaining acceptance because of the recognition that MMR status can be used for prognostication and may influence treatment decisions. A recent retrospective pooled analysis has validated the prognostic and predictive impact of the MMR status in colon cancer patients from completed adjuvant therapy trials conducted in the United States and Europe.19 Based on these data and prior studies, we anticipate that the clinical demand for MSI testing in sporadic colon cancer patients is likely to increase in the near future. Consistent with population-based studies, the prevalence of defective MMR in stage II and III colon cancers was 15%.10 We found that the MMR phenotype was significantly associated with lower tumor stage, proximal site, poor/undifferentiated histology, female sex, and older age, as has been consistently reported. In addition, 91% of tumors with defective MMR were located in the proximal colon, in contrast to those with intact MMR. Of note, prior data suggest that sporadic tumors with defective MMR are more frequently proximal compared with those in Lynch syndrome.41, 42 Sporadic MMR-deficient cancers are molecularly distinct from Lynch syndrome cases in that they show widespread DNA hypermethylation and appear to originate within serrated polyps with BRAF mutations.41, 43
The screening of every colorectal cancer patient for MSI and/or MMR protein expression has been advocated, but would be an expensive approach. Therefore, the goal of this study was to develop a model to accurately identify patients whose sporadic colon cancer would be likely to demonstrate MMR deficiency, thereby increasing the efficiency of MSI testing to enable such information to be used in clinical decision making. Among stage II and III colon cancer patients who participated in adjuvant chemotherapy trials, we found a 15% prevalence of defective MMR that is consistent with population-based studies.10, 31 We used logistic regression and recursive partitioning and amalgamation analyses of clinical and routine pathological information from the adjuvant studies in an attempt to predict a patient's MMR tumor status. We found that tumor site was the most important predictor of defective MMR status, followed by histological grade in the full study cohort. Defective MMR was rare in distal and increased in proximal tumors, such that distal colon cancers do not warrant screening for defective MMR based on our models. The combination of proximal site, poor differentiation, and female sex resulted in a 51% likelihood of defective MMR. In a study that evaluated several morphological features of MMR-deficient colon cancers, interobserver agreement among 5 expert pathologists was good only for evaluation of poor differentiation (kappa = 0.69).22 These data support the use of tumor differentiation in our models, and we emphasize that our study used routine diagnostic pathology reports generated by multiple pathologists from institutions affiliated with our cooperative group clinical trials mechanism. We emphasize that our predictive models apply only to sporadic colon cancer patients, and are not intended to have utility in Lynch syndrome cases.
On the basis of studies that have demonstrated that increased TILs are a marker for colon cancers with defective MMR,23 and in an attempt to further improve on the PPV, we analyzed and dichotomized TIL counts in routine histological tumor sections from a patient subset (n = 326). Substituting TILs (≥2/HPF) for tumor grade in our 3-variable model increased the PPV, in the cohort with the greatest likelihood of a defective MMR tumor (proximal, TILs ≥2/HPF, female), to 81%. Substituting male for female sex reduced the PPV to 50%, which remains a high rate sufficient to justify MMR testing. PPV is the most important measure of a diagnostic method, as it indicates the probability that a positive test reflects the underlying condition being tested for. Therefore, our model with the category of proximal tumor site, high TILs (≥2/HPF), and female sex allows the accurate prediction of the presence of defective MMR in about 8 of 10 cases with those characteristics, and in males in 5 of 10 cases. If validated, this model suggests that TIL counts should become a component of the diagnostic pathology review of all proximal colon cancers in clinical practice.
Our findings suggest that anatomic tumor site and TILs are the most important discriminating variables for MMR status.40, 44 In prior studies that evaluated several morphological features of MMR-deficient colon cancers and involved expert gastrointestinal pathologists, poor differentiation, medullary carcinoma, and intraepithelial lymphocytosis were the best discriminators between high-frequency MSI and microsatellite-stable cancers, with high specificity but very low sensitivity and poor interobserver agreement except for evaluation of poor differentiation.22 These findings underscore the need for a simplified strategy, as proposed here, for identifying patients whose sporadic colon cancers may exhibit defective MMR. Although an independent review of the histopathology by gastrointestinal pathologists and the evaluation of a greater number of morphological tumor features have the potential to further improve the PPV of our model, such an approach cannot easily be implemented in routine clinical practice, and a simpler approach is therefore more practical. Based on the aforementioned factors, we believe that our proposed model is likely to be generalizable and reproducible for selecting sporadic colon cancer cases for MSI testing. Nonetheless, independent validation of our model is warranted. Our model is intended to be used as an adjunct to clinical judgment and is not meant to be the exclusive indicator of which patients should receive testing for MMR.
In summary, anatomic subsite and TILs were the most predictive markers of defective MMR status in sporadic cases. Based on the data presented here, we recommend that only proximal colon cancers be tested for MMR deficiency. If proximal tumors with poor differentiation from female subjects are tested, approximately half will show defective MMR. If we substitute TIL counts for histologic grade, we can accurately predict MMR-deficient cancer in 80% of the female patients and 50% of the male patients whose tumors are proximal with increased TILs. The use of the proposed predictive models can greatly increase the efficiency of MMR testing in sporadic colon cancers, which has obvious implications for reducing costs. We emphasize that the proposed models include data that can be obtained at the time of routine diagnostic pathology review. The identification of patients whose colon cancers exhibit defective MMR for use in clinical decision making would represent an important step toward individualized cancer medicine.
CONFLICT OF INTEREST DISCLOSURES
Funded in part by National Cancer Institute grant CA104683-02 (F.S.) and Mayo Clinic Cancer Center core grant CA15083.