Identification of a novel single nucleotide polymorphism in the first tandem repeat sequence of the thymidylate synthase 2R allele
Article first published online: 2 FEB 2007
Copyright © 2007 Wiley-Liss, Inc.
International Journal of Cancer
Volume 120, Issue 9, pages 1930–1934, 1 May 2007
How to Cite
Lincz, L. F., Scorgie, F. E., Garg, M. B. and Ackland, S. P. (2007), Identification of a novel single nucleotide polymorphism in the first tandem repeat sequence of the thymidylate synthase 2R allele. Int. J. Cancer, 120: 1930–1934. doi: 10.1002/ijc.22568
- Issue published online: 28 FEB 2007
- Manuscript Accepted: 1 DEC 2006
- Manuscript Received: 25 JUL 2006
- Hunter Medical Research Institute
- Hunter Haematology Unit
- thymidylate synthase;
- gene polymorphism;
Thymidylate synthase (TS) activity is an important determinant of response to chemotherapy with fluoropyrimidine prodrugs, and its expression is largely determined by the number of functional upstream stimulatory factor (USF) E-box consensus elements present in the 5′ regulatory region of the TYMS gene. Two known polymorphisms in this area, a variable number of tandem repeat (VNTR) polymorphism consisting of 2 or 3 repeats (2R/3R) of a 28-bp sequence and a further G > C single nucleotide substitution within the second repeat of the 3R, result in genotypes with between 2 and 4 functional repeats in most humans. Here, we identify a further G > C SNP in the first repeat of the TYMS 2R allele, which effectively abolishes the only functional USF protein binding site in this promoter. The frequency of the new allele was found to be 4.2% (95% CI = 1.4–9.6%), accounting for 8.8% (95% CI = 2.9–19.3%) of all 2R alleles in our patient cohort. Thus, we observed that the lowest number of inherited functional binding sites is 1 instead of 2 as previously thought, and could potentially be 0 in a homozygous individual. This would severely decrease TS expression and may have implications for predicting the efficacy and toxicity of commonly used fluorouracil-based regimens.
Thymidylate synthase (TS) catalyses the reaction that provides the sole intracellular source of thymidylate and is thus the rate-limiting step for DNA synthesis, making it an ideal target for cancer chemotherapeutics. Many fluoropyrimidine prodrugs, such as 5-fluorouracil (5-FU) and capecitabine, exert their antitumor effects by inhibiting TS, and are commonly used in the treatment of colon, breast, head and neck cancers.1 Individual response and associated side effects to these drugs are often variable and have been shown to depend at least in part on the expression level of TS in both malignant and normal tissues. Low TS expression has consistently been shown to predict better efficacy of 5-FU treatment in colorectal cancer, but has also been associated with increased toxicity because of cytotoxic damage to normal tissues.2, 3, 4 Thus, much work has been done in recent years to establish inherited and acquired determinants of TS protein levels.
Expression of TS is determined largely by polymorphic features in the 5′-upstream promoter/enhancer region of the TYMS gene that influence its transcription and translation. The first of these is a variable number of tandem repeat (VNTR) polymorphism consisting of multiple repeats of a 28-bp sequence.5 Although there have been reports of up to 9 copies of the VNTR in certain populations,6, 7, 8 the majority of TYMS alleles contain only 2 or 3 repeats, and have been designated 2R or 3R, respectively. The frequency of these alleles differs with ethnicity, and Caucasians demonstrate a higher frequency of 2R (0.40) compared to Chinese (0.18).9 In vitro studies have shown that 3.6 times less mRNA and protein is produced from a gene with a 2R promoter compared to a 3R, and tumor tissue with the 2R/3R genotype has been shown to produce significantly less cellular TS protein than that with a 3R/3R.6, 10 It is not surprising, then, that colorectal cancer patients with the VNTR 2R/2R genotype show better response and overall survival, but suffer increased toxicity to fluoropyrimidine therapy.10, 11, 12
It has recently been shown that the influence of the VNTR on expression of TS is due to the presence of a potential upstream stimulatory factor (USF) family E-box transcription factor consensus element in each repeat (Fig. 1). In vitro studies have confirmed that the presence of 2 functional consensus elements normally present in a 3R allele confers greater transcriptional ability over the 1 functional element found in a wild-type 2R allele.13 A single nucleotide polymorphism (SNP) at the 12th nucleotide of the VNTR has been identified in vivo as a G > C change in the second repeat of the 3R allele13, 14 and a C > G change in the last repeat of the 2R allele.14 The 3R G > C has been shown to be present in approximately half of all Caucasian 3R alleles, and is commonly referred to as 3RC to distinguish it from the wild-type 3RG.13 When both the VNTR and 3R G > C SNP are considered together, tumors from colorectal cancer patients with the homozygous wild-type 3RG/3RG genotype have been shown to have significantly higher TS expression compared to 2R/2R, 2R/3RC or 3RC/3RC individuals,15, 16 and patients with these low expression genotypes experience significantly higher rates of response and survival after 5-FU-based chemotherapy.14, 16 The C > G SNP in the last repeat of the 2R creates a second E-box consensus sequence and potential functional USF-binding site.14 In this case, the wild-type allele has been designated 2C and the variant 2G; however, in vitro studies have shown no functional difference between the two, and thus no further genotyping studies have been performed.14
We genotyped a series of colorectal cancer patients for the VNTR and 3R G > C SNP and identified a novel banding pattern by restriction fragment length polymorphism (RFLP). Here, we report the in vivo identification of a G > C SNP in the first repeat of the 2R allele and propose a universal nomenclature for the classification of TYMS alleles based on the number of functional USF consensus elements in the 5′ region.
Material and methods
Patients and samples
DNA was extracted from peripheral blood collected from 61 colorectal cancer patients. The cohort was all Caucasian and included 25 females and 36 males with a median age of 63 years (range 38–82 years). Patients were classified as having disease stage B (n = 8), C (n = 51), D (n = 1) and unknown (n = 1). Informed consent was received from all participants and the research was approved by the Hunter Area Health Research Ethics Committee.
Genotyping for the VNTR was performed in 20-μl reactions using the previously described forward and reverse primers 5′-CGTGGCTCCTGCGTTTCC-3′ and 5′-GAGCCGGCCACAGGCAT-3′.17 Reactions consisted of 200 ng DNA, 1.5 mM MgCl2, 250 μM dNTP, 0.5 μM of each primer (Geneworks, Adelaide, Australia), 1% DMSO and 1 U of Red Hot DNA polymerase (ABgene, Surrey, United Kingdom) in 1× Buffer IV as supplied by the manufacturer. Amplifications were performed in a microtube thermocycler (Corbett Research, Sydney, Australia) by initial denaturation at 96°C for 5 min, then 40 cycles of 96°C for 15 sec, 54°C for 30 sec and 72°C for 10 sec, followed by a final elongation of 72°C for 5 min. Products were electrophoresed through a 3% Ultrapure Agarose-1000 high resolution gel (Invitrogen, Carlsbad, CA) in 1× TBE buffer and visualized with ethidium bromide staining to distinguish between the 2R (210 base pairs; bp) and the 3R (238 bp) alleles.
The G > C SNP was genotyped by RFLP with HaeIII after performing the above reaction. Products were electrophoresed through a 4% high resolution agarose gel in 1× TBE buffer and SNPs identified by their banding pattern: 2RGC = 66, 47, 46, 44 and 7 bp; 2RCC = 113, 46, 44 and 7 bp; 3RGGC (3RG) = 66, 47, 46, 44, 28 and 7 bp; 3RGCC (3RC) = 94, 47, 46, 44 and 7 bp.
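The banding patterns listed above can be sanity-checked with a short script. The sketch below is only an illustration, not part of the published method: it stores the fragment sizes and amplicon lengths quoted in the text, confirms that each digest accounts for the full 210-bp or 238-bp product, and reports the largest band for each allele. The names `EXPECTED_FRAGMENTS`, `AMPLICON_SIZE`, `amplicon_for` and `largest_band` are hypothetical helpers introduced here.

```python
# Expected HaeIII fragment sizes (bp) for each allele, as listed in the text.
EXPECTED_FRAGMENTS = {
    "2RGC": [66, 47, 46, 44, 7],
    "2RCC": [113, 46, 44, 7],
    "3RGGC": [66, 47, 46, 44, 28, 7],
    "3RGCC": [94, 47, 46, 44, 7],
}

# Undigested PCR product sizes (bp) reported for the two VNTR alleles.
AMPLICON_SIZE = {"2R": 210, "3R": 238}


def amplicon_for(allele):
    """Return the undigested product size for an allele name such as '2RCC'."""
    return AMPLICON_SIZE[allele[:2]]


def largest_band(allele):
    """Return the largest expected restriction fragment for an allele."""
    return max(EXPECTED_FRAGMENTS[allele])


for allele, fragments in EXPECTED_FRAGMENTS.items():
    # A complete digest must account for every base of the amplicon.
    assert sum(fragments) == amplicon_for(allele), allele
    print(allele, "largest band:", largest_band(allele), "bp")
```

In this listing the 113-bp band of 2RCC equals the 66-bp and 47-bp fragments of 2RGC joined together, which is what the loss of a single HaeIII site would produce.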
PCR amplification of the repeat region was performed as mentioned earlier and products separated by 3% agarose gel electrophoresis. Individual 2R and 3R bands were excised from the gel and purified using a Wizard SV Gel and PCR Clean-up System from Promega (Madison, WI) according to the manufacturer's instructions. Isolated nucleic acid was directly sequenced using the above primers by the Biomolecular Research Facility (TUNRA, University of Newcastle, Newcastle, Australia) and SUPAMAC (Sydney University Prince Alfred Macromolecular Analysis Centre, Camperdown, Australia). Sequencing results were compared to each other and to the reported TYMS sequence (Accession No. D00517),18 using the BLAST2 tool on the NCBI website.19
Identification of a novel SNP in the first repeat of the TYMS 2R allele
We performed successful genotyping on 59 of 61 colorectal cancer patients to identify the number of 28-bp tandem repeats (2R or 3R) in the TYMS gene and the associated G > C SNP in the second repeat of the 3R as previously described.13, 14 Two samples contained insufficient DNA and could not be analyzed. During the course of these experiments, we identified 4 patients who were heterozygous 2R/3R for the VNTR and displayed an abnormal RFLP pattern upon digest with HaeIII. All 4 digests produced a band corresponding to 113 bp that was clearly distinct from the largest expected fragment of 94 bp used to identify the previously reported 3RC polymorphic variant (Fig. 2). To determine the origin of this odd restriction fragment, we sequenced the 2R and 3R PCR products from these 4 patients and compared them with the previously published sequences. As controls, we also sequenced similar fragments from patients whose RFLP results displayed expected banding patterns.
All 3R fragment sequences agreed with previously published sequences for 3RG or 3RC. Similarly, 2R fragments from patients with normal RFLP banding patterns produced sequences that agreed with the wild-type 2R sequence. However, sequencing of the 2R fragments from the four 2R/3R patients with irregular RFLP results revealed, in every case, a G > C change at the 12th nucleotide within the first 28-bp repeat (Fig. 3; also indicated in Fig. 1 by [*]). This change was at an identical position in the USF E-box binding site within the repeat sequence to the SNP previously described for the last repeat of the 2R and second repeat of the 3R alleles.13, 14 It was distinct in that it was located in the first 2R repeat and occurred in the wild-type 2R allele, effectively destroying the only functional USF binding site on this allele.
Frequency of the VNTR and associated SNPs in colorectal cancer patients
We designated the new allele 2RCC to indicate the SNPs present in the first and second repeats, respectively, of the 2R allele. To maintain consistency, we adopted this notation for the 3R alleles as well, so that 3RG became 3RGGC and 3RC became 3RGCC. The observed G > C substitution in the first repeat of the 2R resulted in the loss of an HaeIII site that could easily be detected by RFLP (Fig. 4). Thus, we proceeded to genotype all 2R/2R homozygous patients to determine the frequency of this allele in our cohort. We found only one other carrier of the allele, designated 2RGC/2RCC, who was heterozygous for the new SNP, and would thus carry only 1 functional USF E-box (Table I). We did not find anyone who was homozygous for 2RCC, nor any carriers of the C > G polymorphism in the last repeat of the 2R that has been described by Kawakami et al.14 The 2RCC allele frequency was calculated to be 4.2% (95% CI = 1.4–9.6%) in our patient cohort, accounting for 8.8% (95% CI = 2.9–19.3%) of all 2R alleles.
| # Functional USF E-boxes | Genotypes¹ | N (%) |
| Allele frequency | 2RGC | 52 (44.1) |
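The allele frequencies and confidence limits quoted above can be approximately reproduced from the counts given in the text: 59 genotyped patients (118 alleles), 52 wild-type 2RGC alleles, and 5 heterozygous carriers of 2RCC (5 alleles). The sketch below assumes an exact (Clopper-Pearson) binomial interval; the source does not name the method used, so this choice, along with the helper names `binom_cdf`, `solve_decreasing` and `exact_ci`, is an assumption made for illustration.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_decreasing(f, target, lo=0.0, hi=1.0, iters=60):
    """Bisection: find p in [lo, hi] with f(p) = target, for f decreasing in p."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # f is still too large, so the root lies at larger p
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(x, n, alpha=0.05):
    """Clopper-Pearson interval for x successes in n trials (assumed method)."""
    lower = 0.0 if x == 0 else solve_decreasing(
        lambda p: binom_cdf(x - 1, n, p), 1 - alpha / 2)
    upper = 1.0 if x == n else solve_decreasing(
        lambda p: binom_cdf(x, n, p), alpha / 2)
    return lower, upper


N_ALLELES = 2 * 59        # two alleles per successfully genotyped patient
N_2RCC = 5                # five heterozygous carriers, one 2RCC allele each
N_2R = 52 + N_2RCC        # wild-type 2RGC alleles plus 2RCC alleles

for label, total in (("all alleles", N_ALLELES), ("2R alleles", N_2R)):
    lower, upper = exact_ci(N_2RCC, total)
    print(f"2RCC among {label}: {N_2RCC / total:.1%} "
          f"(95% CI {lower:.1%} to {upper:.1%})")
```

Under this assumption the point estimates and limits should land close to the reported 4.2% (1.4–9.6%) and 8.8% (2.9–19.3%); a different interval method would give slightly different limits.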
To simplify genotyping of the TYMS 5′ region, we propose a strategy of classification based on the number of functional USF-1 binding sites as employed in Table I. This nomenclature would encompass all VNTR SNPs reported so far, and could easily accommodate any future findings.
This is the first report, to our knowledge, of the existence of a VNTR G > C SNP in the first repeat of the 2R allele. This single-base change effectively abolishes the only functional USF binding site in this variant and has been shown by Mandola et al. to dramatically decrease transcriptional activity compared to the wild-type 2R.13 These experiments were conducted in vitro as a proof of principle, but the occurrence of this polymorphism in vivo was not demonstrated by these authors.
The variant 2RCC allele was quite common, with a frequency of 4.2% in our small cohort of patients. The calculated CI indicates that the true allele frequency is greater than 1%, and that 2RCC could account for up to 19% of all 2R alleles in colorectal cancer patients. Although we identified 5 carriers of the allele, none was homozygous (2RCC/2RCC). This is not surprising, since our results indicate that the population frequency of homozygotes is only between 0.02 and 0.92%. However, based on available in vitro data,13 a homozygous individual would have severely decreased TS expression, and given the importance of TS activity for DNA synthesis, it could be speculated that a homozygous genotype may be incompatible with life.
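The homozygote range quoted above is consistent with squaring the limits of the allele-frequency confidence interval, as would be expected under Hardy-Weinberg proportions. The source does not state how the range was derived, so the following sketch is an assumption about the arithmetic rather than a description of the authors' calculation; `homozygote_frequency` is a hypothetical helper.

```python
# Reported 95% CI limits for the 2RCC allele frequency, as fractions.
ALLELE_FREQ_CI = (0.014, 0.096)


def homozygote_frequency(q):
    """Expected homozygote frequency for allele frequency q (Hardy-Weinberg assumed)."""
    return q * q


low, high = (homozygote_frequency(q) for q in ALLELE_FREQ_CI)
print(f"Expected 2RCC/2RCC frequency: {low:.2%} to {high:.2%}")
# prints "Expected 2RCC/2RCC frequency: 0.02% to 0.92%"
```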
The first repeat 2R G > C SNP identified in this study is distinct from the C > G SNP in the last repeat of 2R previously reported by Kawakami et al.14 The latter polymorphism resulted in the presence of a second putative USF-1 binding site in the 2R allele but was found to have no functional advantage over the wild-type gene, which contained only 1 functional binding site, when examined in vitro.14 However, their assays did not include the addition of exogenous USF-1, which has been shown to increase transcription of TS in similar experiments13 and may be required to activate gene expression to observe an effect. Alternatively, the additional 6 bp insertion in the last repeat and/or the proposed stem-loop structure formed by association of the inverted repeats5 may negatively affect binding of regulatory molecules to this region.
We did not find any individuals with the last repeat 2R variant (which we have designated 2RGG), but this may be beyond the limit of our detection system, since the restriction fragments produced by this variant would be difficult to resolve by agarose gel electrophoresis. Alternatively, the allele may only occur at very low frequencies within the Caucasian population and would easily be missed in our small sample. The actual Caucasian frequency is not known, and the original discovery was made in a Japanese cohort for which the frequency was never determined.14
We have used the notation 2RGC to indicate the wild-type 2R allele. However, this nomenclature is somewhat cumbersome and would become more complex if similar SNPs were identified in all repeats of the 2R and 3R alleles. Thus, we propose a classification system based on the number of functional repeats present in each genotype. Considering the allelic variants identified so far, this would range from 0 for a homozygous 2RCC to a maximum of 4 for a homozygous 3RGGC individual. However, if a C > G SNP is identified in the last repeat of the 3R allele, then the possibility of 6 putative functional repeats would exist (i.e., 3RGGG/3RGGG) and could be easily accommodated by this system.
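The proposed classification can be illustrated with a small scoring function. The sketch below assumes that each letter after "2R" or "3R" records the base at the SNP position of one repeat, and that a G marks a putative functional USF E-box while a C does not. This is a simplified reading of the nomenclature: as noted above, the extra G in the last repeat of 2RGG showed no functional advantage in vitro, so the scores for such alleles count putative rather than confirmed sites. The function names are hypothetical.

```python
import re

# Allele names such as "2RGC" or "3RGCC": repeat count, then one letter per repeat.
ALLELE_PATTERN = re.compile(r"^([23])R([GC]+)$")


def functional_repeats(allele):
    """Count repeats with G at the SNP position (putative functional E-boxes)."""
    match = ALLELE_PATTERN.match(allele)
    if match is None:
        raise ValueError(f"unrecognised allele name: {allele!r}")
    repeat_count, snps = int(match.group(1)), match.group(2)
    if len(snps) != repeat_count:
        raise ValueError(f"{allele!r} should list {repeat_count} SNP letters")
    return snps.count("G")


def genotype_score(genotype):
    """Sum putative functional repeats over both alleles, e.g. '2RGC/3RGGC'."""
    first, second = genotype.split("/")
    return functional_repeats(first) + functional_repeats(second)


for genotype in ("2RCC/2RCC", "2RGC/2RCC", "3RGGC/3RGGC", "3RGGG/3RGGG"):
    print(genotype, genotype_score(genotype))
```

With these rules the four example genotypes score 0, 1, 4 and 6, matching the range described in the text.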
It has been proposed that assessment of TS expression is most useful as an indicator of resistance to therapy for tumors with high levels of TS, since these are unlikely to respond to 5-FU.2 Given the complex metabolism required to generate cytotoxicity from 5-FU, it is probable that defects in other pathways contribute to cellular resistance in the absence of TS overexpression, and thus underexpression of TS would not necessarily guarantee treatment success. This has been demonstrated in vitro by comparing cell response to treatment with 5-FU vs. FUdR (the precursor of FdUMP, which is the ultimate TS inhibitor). Cells with low, intermediate and high TS expression were found to demonstrate a statistically significant linear trend in sensitivity to FUdR, but not to 5-FU.20 Because of this requirement for further cellular processing of 5-FU, analysis of genotype may have wider implications for predicting patient toxicity than for tumor response to pyrimidine treatment. Further studies will be required to determine if patients who have inherited only 1 functional USF binding site in the TYMS promoter region demonstrate excessive sensitivity to fluoropyrimidine treatment.
Note: During the revision of this manuscript, the same polymorphic variant of TYMS was identified by Gusella et al. and published in Pharmacogenomics J, 2006.21
- 7. Length polymorphism of thymidylate synthase regulatory region in Chinese populations and evolution of the novel alleles. Biochem Genet 2002; 40: 41–51. Available from: http://www.springerlink.com/openurl.asp?genre=article&id=doi:10.1023/A:1014589105977
- 8. McLeod. Novel thymidylate synthase enhancer region alleles in African populations. Hum Mutat 2000; 16: 528. Available from: http://dx.doi.org/10.1002/1098-1004(200012)16:6<528::AID-HUMU11>3.0.CO;2-W
- 16. Single nucleotide polymorphism in the 5′ tandem repeat sequences of thymidylate synthase gene predicts for response to fluorouracil-based chemotherapy in advanced colorectal cancer patients. Int J Cancer 2004; 112: 733–7.
- 20. Polymorphic tandem repeat sequences of the thymidylate synthase gene correlates with cellular-based sensitivity to fluoropyrimidine antitumor agents. Cancer Chemother Pharmacol 2005; 56: 465–72. Available from: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15918040