Genetic investigation of DNA-repair pathway genes PMS2, MLH1, MSH2, MSH6, MUTYH, OGG1 and MTH1 in sporadic colon cancer
Version of Record online: 6 APR 2007
Copyright © 2007 Wiley-Liss, Inc.
International Journal of Cancer
Volume 121, Issue 3, pages 555–558, 1 August 2007
How to Cite
Schafmayer, C., Buch, S., Egberts, J. H., Franke, A., Brosch, M., El Sharawy, A., Conring, M., Koschnick, M., Schwiedernoch, S., Katalinic, A., Kremer, B., Fölsch, U. R., Krawczak, M., Fändrich, F., Schreiber, S., Tepel, J. and Hampe, J. (2007), Genetic investigation of DNA-repair pathway genes PMS2, MLH1, MSH2, MSH6, MUTYH, OGG1 and MTH1 in sporadic colon cancer. Int. J. Cancer, 121: 555–558. doi: 10.1002/ijc.22735
- Issue online: 25 MAY 2007
- Manuscript Accepted: 12 FEB 2007
- Manuscript Received: 29 NOV 2006
- German National Genome Research Network (NGFN) through the POPGEN Biobank. Grant Number: BmBF 01GR0468
- The GEM Platform. Grant Number: BmBF 01GS0426
- the National Genotyping Platform, and the NGFN network for environmental disorders
- sporadic colon cancer;
- MSI genes;
- DNA-repair pathway
Mutations in DNA repair genes have previously been identified as causative factors for hereditary nonpolyposis colon cancer (HNPCC). Recent evidence also supports an association between DNA sequence variation in these genes and sporadic colorectal carcinoma (CRC). Genetic investigation of DNA repair genes PMS2, MLH1, MSH2, MSH6, MUTYH, OGG1 and MTH1, as possible susceptibility factors for sporadic CRC, was done using both a haplotype tagging and a candidate (i.e. coding) single nucleotide polymorphism (SNP) approach. Some 1,068 patients with operated CRC (median age at diagnosis: 59 years) were compared to 738 sex-matched control individuals (median age: 67 years). Haplotype tagging SNPs, previously reported risk variants and all known coding SNPs with a minor allele frequency >0.005 were genotyped in PMS2 (N = 10), MLH1 (N = 11), MSH2 (N = 18), MSH6 (N = 15), MUTYH (N = 7), OGG1 (N = 11) and MTH1 (N = 3). No evidence for an association between CRC and any of the 7 genes was detected, neither with the tagging or coding SNPs nor in a sliding window haplotype analysis (all nominal p-values >0.05). The previously reported risk variants D132H in MLH1 and R154H in OGG1 were not even observed in the German population. Genetic CRC risk factors so far identified in DNA repair genes seem to be rare and population-specific. Their association with the disease could not be replicated in German CRC samples. It remains to be elucidated by more systematic, large-scale experiments whether common variants in the same genes, but present across populations, represent risk factors for sporadic CRC. © 2007 Wiley-Liss, Inc.
Colorectal cancer (CRC) occurs both in conjunction with heritable syndromes and in the form of “sporadic” disease. However, epidemiological studies have also revealed a consistent familial clustering of colorectal cancer outside the recognized syndromes.1 Estimates of the familiality of sporadic CRC, as expressed by the relative recurrence risk to siblings, range from 1.7 (ref. 2) to 6.2, the latter being observed for index patients <55 years.3 The genes and mutations responsible for this familial component of sporadic CRC remain largely unknown.
The systematic genetic investigation of hereditary nonpolyposis colorectal cancer (HNPCC) has led to the identification of the DNA-mismatch repair (MMR) genes as constituting a major pathway to colorectal cancer in syndromic cases. Thus, HNPCC has been shown to be caused by germline mutations in MSH2 (ref. 4), MLH1 (refs. 5, 6), PMS1 (ref. 7), PMS2 (ref. 7) and MSH6 (ref. 8). To date, more than 200 different predisposing mutations in these genes have been described in HNPCC patients, the majority of which occurred in MSH2 and MLH1.9
Since the causative role of the MMR pathway is well established in hereditary CRC, the same genes are also attractive candidates for sporadic CRC. Indeed, there is increasing evidence that variation in DNA repair pathway genes is relevant outside the syndromic CRC families as well. For instance, MLH1 variant D132H was reported to be associated with CRC in Israeli patients.10 This mutation, which was present in 1.3% of Israeli CRC patients, was however not detected in more than 1,100 American cases, suggesting a pronounced ethnic difference in the genetic predisposition to CRC.11 A significant disease association with mutation R154H in the OGG1 gene was reported for 254 CRC patients from Korea.12 Finally, variations in MUTYH, mainly Y165C and G382D, have been shown to be present in up to 0.8% of Scottish patients with an age at diagnosis <55 years.13
In principle, the investigation of putative susceptibility loci for CRC is possible either by the direct sequencing in both cases and controls or by haplotype tagging analysis. Disease association studies using tagging single nucleotide polymorphisms (SNPs) that are selected on the basis of HAPMAP data provide a particularly efficient means to evaluate large numbers of loci with respect to their possible disease association.14 Haplotype analysis is capable of detecting the effects of hitherto uncovered, infrequent mutations, although the power of this approach decreases with a decreasing frequency of the risk allele15 (Fig. 1). In the present study, we have therefore chosen a combined coding and tagging SNP approach to investigate the possible aetiological role of 7 DNA repair genes in a large sample of sporadic CRC patients from Germany.
Patients and phenotypes
Patients with histologically proven CRC were identified via the regional cancer registry of Schleswig-Holstein, and from the 2002–2005 records of surgical departments in Northern Germany, namely at Kiel, Eckernförde, Rendsburg, Schleswig, Flensburg, Husum, Heide, Niebüll, Neumünster, Itzehoe, Rotenburg, Stade, Reinbek, Bad Oldesloh, Detmold, Neustadt, Hamburg-Harburg, Hamburg-Altona, Hamburg-Eilbek, Hamburg-Bergedorf and Lüneburg. All patients were contacted by mail and invited to participate in the study. Patients who did not respond were sent one written reminder. Individuals who agreed to participate were contacted by the POPGEN project team (http://www.popgen.de).17 They were interviewed by mail questionnaire, and a venous EDTA blood sample was obtained either at the POPGEN office or by the patient's general practitioner. The study was restricted to individuals of German ethnicity, i.e. only cases and controls with both parents born in Germany were included. Patients fulfilling either the clinical Amsterdam or Bethesda criteria for HNPCC were excluded from the study,18 as were patients with a history of inflammatory bowel disease. The study protocols were approved by the institutional ethics committee and the local data protection officers. Written informed consent was obtained from all study participants. The first 1,068 participants (including 543 males) were included in the study. The median age at diagnosis was 59 years (range: 18–74 years) and the median age at recruitment was 63 years (range: 22–84 years). Control individuals (N = 738, including 369 males) were taken from the population-based control pool of the POPGEN project, identified on the basis of the local population registry.17 The median age at the time of recruitment of the controls was 67 years (range: 48–81 years). Individuals with a history of malignant disease or inflammatory bowel disease were excluded. 
DNA was prepared from all samples using the FlexiGene chemistry (Qiagen, Hilden, Germany) according to the manufacturer's protocols.
DNA samples were evaluated by gel electrophoresis and adjusted to 20–30 ng/μl DNA content using the Picogreen fluorescent dye (Molecular Probes, Invitrogen, Carlsbad, CA). One microliter of genomic DNA was amplified with the GenomiPhi (Amersham, Uppsala, Sweden) whole genome amplification kit and fragmented at 99°C for 3 min. One hundred nanograms of DNA were dried overnight in TwinTec hardshell 384-well plates (Eppendorf, Hamburg, Germany) at room temperature. Genotyping was performed on these plates using the SNPlex chemistry (Applied Biosystems, Foster City, CA) on an automated platform with TECAN Freedom EVO and 384-well TEMO liquid handling robots (TECAN, Männedorf, Switzerland). Genotypes were reviewed manually using the Genemapper 4.0 software (Applied Biosystems). Genotypes of the nonsynonymous polymorphisms MUTYH Gly382Asp, OGG1 Arg154His and MLH1 Asp132His were determined using the TaqMan biallelic discrimination system. Reactions were completed and read in a 7900 HT TaqMan sequence detector system (Applied Biosystems). Amplification was performed with the TaqMan universal master mix. The thermal cycling conditions were 1 cycle of 10 min at 95°C, followed by 45 cycles of 15 sec at 95°C and 1 min at 60°C. All process data were logged and administered in a database-driven LIMS.19 The mean call rate of all assays was 99.5%, with a minimum of 97.6%.
SNP selection and data analysis
SNPs were taken from HAPMAP (www.hapmap.org) by automated selection of haplotype tagging SNPs for Caucasians, using the CEU dataset (Mendel errors: 0, minor allele frequency: 0.01, HWE cutoff: 0.01). In addition, coding SNPs with a minor allele frequency >0.005 in Caucasians, as reported in dbSNP or in the scientific literature, were also included.
The study was performed using a case-control design and entailed a sliding window haplotype analysis with COCAPHASE, as implemented in the UNPHASED suite of programs (http://www.rfcgr.mrc.ac.uk/~fdudbrid/software/unphased/).20 Single marker genotypic and allelic tests for association were performed on sex-matched genotypes using χ² statistics for contingency tables. Nominal p-values are reported for all statistical tests. The use of nominal p-values not corrected for multiple testing is addressed in the context of the negative study results in the discussion.
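The single marker tests described above can be sketched as follows. This is a minimal illustration, not the software used in the study: the function names and the genotype counts are hypothetical, and the sex matching mentioned in the text is not reproduced. Genotype counts are given as (AA, Aa, aa); the genotypic test uses the 2 x 3 table directly (2 degrees of freedom), and the allelic test uses the derived 2 x 2 table of allele counts (1 degree of freedom).

```python
import math


def chi2_test(table):
    """Pearson chi-square test for a 2 x k table of counts (k = 2 or 3).

    Columns with a zero total are dropped, so the degrees of freedom are
    (number of non-empty columns - 1). Both rows are assumed to be non-empty.
    Returns (statistic, degrees of freedom, nominal p-value).
    """
    cols = [j for j in range(len(table[0])) if table[0][j] + table[1][j] > 0]
    rows = [[row[j] for j in cols] for row in table]
    row_totals = [sum(r) for r in rows]
    col_totals = [rows[0][j] + rows[1][j] for j in range(len(cols))]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(len(cols)):
            expected = row_totals[i] * col_totals[j] / n
            stat += (rows[i][j] - expected) ** 2 / expected
    df = len(cols) - 1
    if df == 0:
        p = 1.0  # monomorphic marker: nothing to test
    elif df == 1:
        p = math.erfc(math.sqrt(stat / 2.0))  # chi-square upper tail, 1 df
    else:
        p = math.exp(-stat / 2.0)  # chi-square upper tail, 2 df
    return stat, df, p


def allele_counts(genotypes):
    """Convert (AA, Aa, aa) genotype counts to (A, a) allele counts."""
    n_AA, n_Aa, n_aa = genotypes
    return (2 * n_AA + n_Aa, 2 * n_aa + n_Aa)


# Hypothetical genotype counts for 1,068 cases and 738 controls
# (illustrative only, not taken from the study data).
cases = (700, 320, 48)
controls = (500, 210, 28)

g_stat, g_df, g_p = chi2_test([cases, controls])
a_stat, a_df, a_p = chi2_test([allele_counts(cases), allele_counts(controls)])
print(f"genotypic: chi2 = {g_stat:.3f}, df = {g_df}, nominal p = {g_p:.3f}")
print(f"allelic:   chi2 = {a_stat:.3f}, df = {a_df}, nominal p = {a_p:.3f}")
```

The closed-form tail probabilities for 1 and 2 degrees of freedom keep the sketch free of external dependencies; a statistics library would normally be used instead.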
The SNP panel was genotyped in 1,068 cases and 738 controls. Haplotype-tagging SNPs for the candidate genes were selected on the basis of the available HAPMAP genotype data (http://www.hapmap.org).14, 15, 21 All coding SNPs previously reported to be associated with sporadic CRC were genotyped. Of the other coding SNPs, only those with a reported minor allele frequency >0.005 in Caucasians were included in the marker panel. In total, 11 SNPs each in the MLH1 and OGG1 genes, 7 SNPs in MUTYH, 18 SNPs in MSH2, 15 SNPs in MSH6, 3 SNPs in MTH1 and 10 SNPs in the PMS2 gene were genotyped. Figure 2 shows the distribution of these markers across the respective genes together with the regional haplotype structure as generated by HAPLOVIEW from the HAPMAP Caucasian genotypes (category CEU). The markers provide good coverage and tag all major haplotype blocks of the candidate genes (see Methods section). The results of the single marker and haplotype analyses are reported in Supplementary Table I. Many of the coding SNPs were in fact found to be monomorphic in the patient sample. In Supplementary Table I, markers are provided in genomic 5′ to 3′ orientation, regardless of the orientation of the transcript. None of the markers showed any significant departure from Hardy-Weinberg equilibrium in controls, suggesting robust genotyping in all instances.
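A check for departure from Hardy-Weinberg equilibrium in controls can be sketched as below. The text does not state which test was applied, so the 1-df χ² goodness-of-fit test shown here is an assumption, and both the function name and the genotype counts are hypothetical.

```python
import math


def hwe_chi2(n_AA, n_Aa, n_aa):
    """Chi-square goodness-of-fit test (1 df) for Hardy-Weinberg proportions.

    Returns (statistic, p-value). A monomorphic marker returns (0.0, 1.0).
    """
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 0.0, 1.0
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_AA, n_Aa, n_aa)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, math.erfc(math.sqrt(stat / 2.0))


# Hypothetical genotype counts in 738 controls (illustrative only).
stat, p_value = hwe_chi2(500, 210, 28)
print(f"HWE chi2 = {stat:.3f}, p = {p_value:.3f}")
```

For rare variants with small expected genotype counts, an exact test would be the more cautious choice than this large-sample approximation.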
For the MLH1 gene, the CRC-associated variant previously detected in a Jewish population10 was absent from the German patients and controls, as was coding variant Y718H. This result corroborates previous reports of a lack of the former in Caucasian non-Jewish Americans.11
Variants G382D and Y165C in the MUTYH gene exhibited allele frequencies of 0.006 and 0.003 in the colon carcinoma patients, which were not significantly different from those in the controls. The other coding SNP, Q531R, was monomorphic, whereas marker M22V did not show any significant association with the CRC phenotype. Furthermore, a joint analysis of G382D and Y165C, including compound heterozygotes, or of all coding mutations, was negative (p > 0.10). A subgroup analysis of patients <55 years of age, as reported by Farrington et al.,13 was also negative.
Variant R154H in OGG1,12 previously reported in Korean patients, was not observed in the German population, nor was SNP Q229R. The other exonic coding variants in OGG1, namely V288A, C326S and S85A, exhibited similar frequencies in cases and controls.
Exonic variants S127N and F390L in the MSH2 gene were absent, and coding variant D322G was not overrepresented in CRC cases. Similarly, the A623P and V886I mutations in MSH6 were not detected in Germans. Rare variants A878V and V396L were not associated with the CRC phenotype. In the PMS2 gene, variant K277T was not detected and the nonsynonymous SNPs I622M and S597T showed no frequency difference between cases and controls. Intronic tagging SNP rs12534423 showed a genotypic p-value of 0.039 and an allele-wise p-value of 0.054, but this finding was not supported by the haplotype analysis or by the analysis of neighboring SNPs.
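As an illustration of how the nominal genotypic p-value of 0.039 would fare under a gene-wise multiplicity correction, the sketch below applies the Bonferroni and Šidák adjustments for the 10 SNPs genotyped in PMS2. Neither correction is reported as having been applied in the study; the choice of these two methods and the function names are assumptions made for illustration, and the Šidák form additionally assumes independent tests.

```python
def bonferroni(p, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p * n_tests)


def sidak(p, n_tests):
    """Sidak-adjusted p-value (assumes the tests are independent)."""
    return 1.0 - (1.0 - p) ** n_tests


nominal_p = 0.039   # genotypic p-value reported for rs12534423
n_pms2_snps = 10    # number of SNPs genotyped in PMS2, as stated in the text

print(f"Bonferroni: {bonferroni(nominal_p, n_pms2_snps):.3f}")
print(f"Sidak:      {sidak(nominal_p, n_pms2_snps):.3f}")
```

Because tagging SNPs within one gene are correlated, both adjustments are conservative here; they serve only to show that the nominal value would not remain below 0.05.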
For all loci, a sliding window haplotype analysis was performed to look for associations with other potentially causative variants that were not included in the genotyped SNP panel. None of these analyses yielded a significant association (i.e. all p > 0.05).
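The windowing step of such an analysis can be sketched as follows. Only the enumeration of adjacent-marker windows is shown; the haplotype frequency estimation and testing performed by COCAPHASE are not reproduced. The window size is not stated in the text, so the value of 3 used here, like the marker labels and the function name, is hypothetical.

```python
def sliding_windows(markers, size):
    """Return every run of `size` adjacent markers, in map order."""
    if size < 1:
        raise ValueError("window size must be at least 1")
    return [tuple(markers[i:i + size]) for i in range(len(markers) - size + 1)]


# Hypothetical labels for the 10 SNPs genotyped in PMS2, in genomic order.
pms2_markers = [f"snp{i}" for i in range(1, 11)]

for window in sliding_windows(pms2_markers, 3):
    # Each window would be passed to a haplotype association test.
    print("-".join(window))
```

Each marker set produced this way corresponds to one haplotype test, so the number of tests per gene grows with the number of markers and shrinks with the window size.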
DNA-MMR genes are attractive candidates for sporadic CRC because the causative role of altered MMR in familial CRC is well established. In the present study, we have tried to replicate previously reported associations of functional SNPs in MMR and base excision repair genes.10, 11, 12, 13 In addition, we also used haplotype tagging SNPs to increase the amount of genetic information extracted at the respective loci. The international HAPMAP project (www.hapmap.org) has generated a wealth of genotype and marker information that greatly facilitates the design of candidate gene studies.14 The tagging approach is able to detect signals from hitherto uncovered regulatory or functional mutations in a given gene region.15, 22, 23 It thus offers potential advantages even over a direct mutation search of the coding region of a gene because disease susceptibility may also be conferred by variations located elsewhere as, for instance, in splice sites or intronic enhancers.24, 25 Tagging SNPs were thus selected from the public HAPMAP (www.hapmap.org) resources.14 The genotype and allele frequencies of these SNPs in our samples were not significantly different from those reported for the Caucasian HAPMAP individuals (p > 0.10; Supplementary Table I), thereby supporting the appropriateness of selecting variants from this resource.
Our study utilized more than 1,000 patients with histologically proven colorectal cancer. Prior to the genotyping, we carried out power estimation at a nominal significance level of 0.05 (ref. 16), assuming single marker allelic odds ratios of between 1 and 2, and population frequencies of 0.1, 0.2 and 0.5 for the potential susceptibility variants. As is shown in Figure 1, the power to detect allelic odds ratios >1.5 was >80% for our sample under all assumptions made. For more frequent susceptibility variants, even odds ratios as small as 1.4 would have been detectable with the same power. This notwithstanding, none of the single marker allelic p-values passed the 5% threshold. In particular, we also did not replicate previously reported associations of coding SNPs with sporadic CRC. One genotypic p-value was 0.039, but in view of the lack of support from the haplotype analysis and the multiplicity of tests performed, we did not interpret this result as sufficient evidence for a true disease association. Correction for multiple testing would be necessary and appropriate if global hypotheses are at stake, and it usually reduces power to detect violations of component null hypotheses. In the present case, a positive result would have had to withstand at least gene-wise multiplicity correction. However, all of our results were negative, which means that multiplicity correction was no longer an issue.
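A power estimate of the kind described above can be approximated with a two-sided test for a difference in allele frequencies between cases and controls. The sketch below uses a normal approximation and treats the stated population frequency as the control allele frequency; both are assumptions made for illustration, and this is not necessarily the method of ref. 16. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def case_allele_freq(control_freq, odds_ratio):
    """Allele frequency in cases implied by an allelic odds ratio."""
    odds = odds_ratio * control_freq / (1.0 - control_freq)
    return odds / (1.0 + odds)


def allelic_power(n_cases, n_controls, control_freq, odds_ratio, alpha=0.05):
    """Approximate power of a two-sided allelic test (normal approximation)."""
    p0 = control_freq
    p1 = case_allele_freq(p0, odds_ratio)
    alleles_cases = 2 * n_cases        # two alleles per individual
    alleles_controls = 2 * n_controls
    se = sqrt(p1 * (1 - p1) / alleles_cases + p0 * (1 - p0) / alleles_controls)
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    z = abs(p1 - p0) / se
    return nd.cdf(z - z_crit) + nd.cdf(-z - z_crit)


# Sample sizes from the study; frequencies and odds ratios span the ranges
# named in the text.
for freq in (0.1, 0.2, 0.5):
    for odds_ratio in (1.2, 1.4, 1.5, 2.0):
        power = allelic_power(1068, 738, freq, odds_ratio)
        print(f"freq = {freq:.1f}, OR = {odds_ratio:.1f}, power = {power:.3f}")
```

Under these assumptions the approximation agrees with the statement that odds ratios of 1.5 are detectable with more than 80% power at all three frequencies, although the exact values will differ from those in Figure 1.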
Additional efforts to improve the power of our study had been made. Many polygenic disorders are characterized by a strong correlation between the age at diagnosis of affected relatives, as has been documented for instance in breast cancer26 and Alzheimer disease.27 It is indeed plausible that the genetic influence upon the development of such disorders is reflected by the age at which individuals first develop the disorder.28 Therefore, we focused our study on patients with a relatively early age at diagnosis (median: 59 years, range: 18–74 years) since a stronger familial component has been documented for CRC in younger patients.3 On the other hand, patients with conditions like inflammatory bowel disease, known to predispose to CRC, were systematically excluded. Confounding by population affiliation was minimized by confining the study to patients of German extraction, as was determined by the birth places of both parents.
One major reason for the lack of replication is probably the presence of population genetic differences between the patient samples studied so far (i.e. Korea, Scotland and Israel). The most obvious indication for this difference is the absence of many known risk variants from the German population. For instance, MLH1 variant D132H (rs28930073), which was originally identified in Israeli Jews,10 was not present in our samples. This corroborates a previous report in which the mutation was shown to be absent in American Caucasians.11 Similarly, the Korean variant R154H of the OGG1 gene12 was also not detected in our patient sample. Variants in the MUTYH gene were present in German patients at frequencies similar to those reported in the original Scottish study,13 but no significant disease association was detected. This discrepancy may reflect a true difference in genetic risk between Scottish and central European Caucasians, as has been observed for instance for CARD15 variants in Crohn disease.29 Power may also have been an issue here since the investigated sample in the study by Farrington et al. was larger than the one reported here, particularly as regards younger patients.
In summary, mutations in DNA repair genes that have so far been identified as being associated with CRC are rare and seem to exert their effects in a population-specific fashion. It remains to be elucidated in more systematic, large-scale experiments whether common variants, present across populations, may represent risk factors for sporadic CRC. For these studies, the haplotype tagging approach utilized here should turn out to be particularly useful and may facilitate important discoveries in the future.
The cooperation of all patients and of their families and physicians is gratefully acknowledged. Particular thanks are due to Mrs. Huberta von Eberstein and the POPGEN team, and to the heads of the surgical departments involved, namely Mrs. Ilka Vogel (Städtisches Krankenhaus Kiel), Mr. Hermann Dittrich (Rendsburg), Mr. Jürgen Belz (Husum), Mr. Rainer Quäschling (Eckernförde), Mr. Hodjat Shekarriz (Schleswig), Mr. Volker Mendel (Flensburg), Mr. Werner Neugebauer (Flensburg), Mr. Friedrich Kallinowski (Heide), Mr. Anton Schafmayer (Lüneburg), Mr. Sebastian Debus (Hamburg Harburg), Mr. Jiri Klima (Niebüll), Mr. Wolfgang Teichmann (Hamburg Altona), Mr. Lutz Steinmüller (Hamburg Eilbek), Mr. Marco Sailer (Hamburg Bergedorf), Mr. Benno Stinner (Stade), Mr. Hans Fred Weiser (Rotenburg), Mr. Eick Hartwig Egberts (Detmold), Mr. Guido Schürrmann (Itzehoe), Mr. Albrecht Eggert (Reinbek), Mr. Nikolas Schwarz (Neumünster), Mr. Günter Fröschle (Bad Oldesloh) and Mr. Hendrik Schimmelpenning (Neustadt). Ms. Tanja Wesse, Ms. Birthe Petersen, Ms. Lena Bossen, Mrs. Susan Ehlers, Ms. Meike Davids and Mr. Rainer Vogler are gratefully acknowledged for expert technical and computer support.
This article contains supplementary material available via the Internet at http://www.interscience.wiley.com/jpages/0020-7136/suppmat .