Keywords:

  • thymidylate synthase;
  • TYMS;
  • gene polymorphism;
  • 5-fluorouracil

Abstract

Thymidylate synthase (TS) activity is an important determinant of response to chemotherapy with fluoropyrimidine prodrugs, and its expression is largely determined by the number of functional upstream stimulatory factor (USF) E-box consensus elements present in the 5′ regulatory region of the TYMS gene. Two known polymorphisms in this area, a variable number of tandem repeats (VNTR) polymorphism consisting of 2 or 3 repeats (2R/3R) of a 28-bp sequence and a further G > C single nucleotide substitution within the second repeat of the 3R, result in genotypes with between 2 and 4 functional repeats in most humans. Here, we identify a further G > C SNP in the first repeat of the TYMS 2R allele, which effectively abolishes the only functional USF protein binding site in this promoter. The frequency of the new allele was found to be 4.2% (95% CI = 1.4–9.6%), accounting for 8.8% (95% CI = 2.9–19.3%) of all 2R alleles in our patient cohort. Thus, the lowest number of inherited functional binding sites observed is 1 instead of 2 as previously thought, and could potentially be 0 in a homozygous individual. This would severely decrease TS expression and may have implications for predicting the efficacy and toxicity of commonly used fluorouracil-based therapy regimens. © 2007 Wiley-Liss, Inc.

Thymidylate synthase (TS) catalyses the reaction that provides the sole intracellular source of thymidylate and is thus the rate-limiting step for DNA synthesis, making it an ideal target for cancer chemotherapeutics. Many fluoropyrimidine prodrugs, such as 5-fluorouracil (5-FU) and capecitabine, exert their antitumor effects by inhibiting TS and are commonly used in the treatment of colon, breast, and head and neck cancers.1 Individual response to these drugs and the associated side effects are often variable and have been shown to depend at least in part on the expression level of TS in both malignant and normal tissues. Low TS expression has consistently been shown to predict better efficacy of 5-FU treatment in colorectal cancer, but has also been associated with increased toxicity because of cytotoxic damage to normal tissues.2, 3, 4 Thus, much work has been done in recent years to establish inherited and acquired determinants of TS protein levels.

Expression of TS is determined largely by polymorphic features in the 5′-upstream promoter/enhancer region of the TYMS gene that influence its transcription and translation. The first of these is a variable number of tandem repeats (VNTR) polymorphism consisting of multiple repeats of a 28-bp sequence.5 Although there have been reports of up to 9 copies of the VNTR in certain populations,6, 7, 8 the majority of TYMS alleles contain only 2 or 3 repeats, and have been designated 2R or 3R, respectively. The frequency of these alleles differs with ethnicity, and Caucasians demonstrate a higher frequency of 2R (0.40) compared to Chinese (0.18).9 In vitro studies have shown that 3.6 times less mRNA and protein is produced from a gene with a 2R promoter compared to a 3R, and tumor tissue with the 2R/3R genotype has been shown to produce significantly less cellular TS protein than that with a 3R/3R.6, 10 It is not surprising, then, that colorectal cancer patients with the VNTR 2R/2R genotype show better response and overall survival, but suffer increased toxicity, with fluoropyrimidine therapy.10, 11, 12

It has recently been shown that the influence of the VNTR on expression of TS is due to the presence of a potential upstream stimulatory factor (USF) family E-box transcription factor consensus element in each repeat (Fig. 1). In vitro studies have confirmed that the 2 functional consensus elements normally present in a 3R allele confer greater transcriptional ability than the 1 functional element found in a wild-type 2R allele.13 A single nucleotide polymorphism (SNP) at the 12th nucleotide of the VNTR has been identified in vivo as a G > C change in the second repeat of the 3R allele13, 14 and a C > G change in the last repeat of the 2R allele.14 The 3R G > C has been shown to be present in approximately half of all Caucasian 3R alleles, and is commonly referred to as 3RC to distinguish it from the wild-type 3RG.13 When both the VNTR and 3R G > C SNP are considered together, tumors from colorectal cancer patients with the homozygous wild-type 3RG/3RG genotype have been shown to have significantly higher TS expression compared to 2R/2R, 2R/3RC or 3RC/3RC individuals,15, 16 and patients with these low expression genotypes experience significantly higher rates of response and survival after 5-FU-based chemotherapy.14, 16 The C > G SNP in the last repeat of the 2R creates a second E-box consensus sequence and potential functional USF-binding site.14 In this case, the wild-type allele has been designated 2C and the variant 2G; however, in vitro studies have shown no functional difference between the two, and thus no further genotyping studies have been performed.14

Figure 1. Location of SNPs identified in the VNTR of the 2R and 3R alleles of TYMS. The TYMS 5′ region contains a variable number of tandem repeats, with the last one just prior to the ATG start (bold and underlined) containing a 6-bp insertion (triangle). Each 28-bp repeat contains an inverted repeat sequence (CGCCGCG; underlined) and a putative E-box binding site (CACTTG) for upstream stimulatory factors (USF-1/USF-2). The latter are only functional when a G is present in the last nucleotide position (solid box). The last repeat of both 2R and 3R wild-type alleles normally contains a C in this position, and thus the binding site is abolished (dotted box). Polymorphisms have previously been identified in the second repeat of the 3R13 and the last repeat of the 2R.14 Sequencing results in the present study indicate that a similar G > C bp change is present in the first repeat of the 2R allele of some individuals (*).

We genotyped a series of colorectal cancer patients for the VNTR and 3R G > C SNP and identified a novel banding pattern by restriction fragment length polymorphism (RFLP). Here, we report the in vivo identification of a G > C SNP in the first repeat of the 2R allele and propose a universal nomenclature for the classification of TYMS alleles based on the number of functional USF consensus elements in the 5′ region.

Material and methods

Patients and samples

DNA was extracted from peripheral blood collected from 61 colorectal cancer patients. The cohort was all Caucasian and included 25 females and 36 males with a median age of 63 years (range 38–82 years). Patients were classified as having disease stage B (n = 8), C (n = 51), D (n = 1) and unknown (n = 1). Informed consent was received from all participants and the research was approved by the Hunter Area Health Research Ethics Committee.

Genotyping

Genotyping for the VNTR was performed in 20-μl reactions using the previously described forward and reverse primers 5′-CGTGGCTCCTGCGTTTCC-3′ and 5′-GAGCCGGCCACAGGCAT-3′.17 Reactions consisted of 200 ng DNA, 1.5 mM MgCl2, 250 μM dNTP, 0.5 μM of each primer (Geneworks, Adelaide, Australia), 1% DMSO and 1 U of Red Hot DNA polymerase (ABgene, Surrey, United Kingdom) in 1× Buffer IV as supplied by the manufacturer. Amplifications were performed in a microtube thermocycler (Corbett Research, Sydney, Australia) by initial denaturation at 96°C for 5 min, then 40 cycles of 96°C for 15 sec, 54°C for 30 sec and 72°C for 10 sec, followed by a final elongation of 72°C for 5 min. Products were electrophoresed through a 3% Ultrapure Agarose-1000 high resolution gel (Invitrogen, Carlsbad, CA) in 1× TBE buffer and visualized with ethidium bromide staining to distinguish between the 2R (210 base pairs; bp) and the 3R (238 bp) alleles.

The G > C SNP was genotyped by RFLP with HaeIII after performing the above reaction. Products were electrophoresed through a 4% high resolution agarose gel in 1× TBE buffer and SNPs identified by their banding pattern: 2RGC = 66, 47, 46, 44 and 7 bp; 2RCC = 113, 46, 44 and 7 bp; 3RGGC (3RG) = 66, 47, 46, 44, 28 and 7 bp; 3RGCC (3RC) = 94, 47, 46, 44 and 7 bp.
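The banding patterns listed above can be checked for internal consistency with a short script. The following is a minimal sketch, not the authors' analysis: the fragment sizes and amplicon lengths are taken from the text, while the function names and the idea of a "diagnostic band" (a fragment size produced by only one of the four listed alleles) are illustrative assumptions. Co-migrating fragments are treated as a single band, which is a simplification of how a real gel is read.

```python
# HaeIII fragment sizes (bp) for each allele, as listed in the text.
HAEIII_FRAGMENTS = {
    "2RGC": [66, 47, 46, 44, 7],
    "2RCC": [113, 46, 44, 7],
    "3RGGC": [66, 47, 46, 44, 28, 7],
    "3RGCC": [94, 47, 46, 44, 7],
}

# Undigested PCR product sizes (bp) for the 2R and 3R alleles, from the text.
AMPLICON_BP = {"2R": 210, "3R": 238}


def amplicon_size(allele):
    """Return the undigested product size for an allele name such as '3RGCC'."""
    return AMPLICON_BP[allele[:2]]


def genotype_bands(allele_a, allele_b):
    """Distinct band sizes expected when both alleles are digested together."""
    bands = set(HAEIII_FRAGMENTS[allele_a]) | set(HAEIII_FRAGMENTS[allele_b])
    return sorted(bands, reverse=True)


def diagnostic_bands(allele):
    """Fragment sizes that only this allele produces among the listed alleles."""
    others = set()
    for name, fragments in HAEIII_FRAGMENTS.items():
        if name != allele:
            others.update(fragments)
    return sorted(set(HAEIII_FRAGMENTS[allele]) - others, reverse=True)


if __name__ == "__main__":
    for allele, fragments in HAEIII_FRAGMENTS.items():
        # Digest fragments should add up to the undigested amplicon length.
        print(allele, sum(fragments), amplicon_size(allele), diagnostic_bands(allele))
    print(genotype_bands("2RCC", "3RGCC"))
```

Under these assumptions the fragments of every allele sum to the corresponding amplicon length, and the 113-bp fragment is specific to 2RCC, just as the 94-bp fragment is specific to 3RGCC.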

Sequencing

PCR amplification of the repeat region was performed as mentioned earlier and products separated by 3% agarose gel electrophoresis. Individual 2R and 3R bands were excised from the gel and purified using a Wizard SV Gel and PCR Clean-up System from Promega (Madison, WI) according to the manufacturer's instructions. Isolated nucleic acid was directly sequenced using the above primers by the Biomolecular Research Facility (TUNRA, University of Newcastle, Newcastle, Australia) and SUPAMAC (Sydney University Prince Alfred Macromolecular Analysis Centre, Camperdown, Australia). Sequencing results were compared to each other and to the reported TYMS sequence (Accession No. D00517),18 using the BLAST2 tool on the NCBI website.19

Results

Identification of a novel SNP in the first repeat of the TYMS 2R allele

We performed successful genotyping on 59 of 61 colorectal cancer patients to identify the number of 28-bp tandem repeats (2R or 3R) in the TYMS gene and the associated G > C SNP in the second repeat of the 3R as previously described.13, 14 Two samples contained insufficient DNA and could not be analyzed. During the course of these experiments, we identified 4 patients who were heterozygous 2R/3R for the VNTR and displayed an abnormal RFLP pattern upon digest with HaeIII. All 4 digests produced a band corresponding to 113 bp that was clearly distinct from the largest expected fragment of 94 bp used to identify the previously reported 3RC polymorphic variant (Fig. 2). To determine the origin of this odd restriction fragment, we sequenced the 2R and 3R PCR products from these 4 patients and compared them with the previously published sequences. As controls, we also sequenced similar fragments from patients whose RFLP results displayed expected banding patterns.

Figure 2. Genotyping for VNTR and SNP variants by RFLP. PCR products were separated by electrophoresis to determine VNTR genotypes. The 2R and 3R alleles were identified by bands of 210 and 238 bp, respectively, as indicated in the top panel. Subsequent digestion with HaeIII (bottom panel) was used to genotype the G > C SNPs within the 28-bp repeats. In addition to previously described banding patterns corresponding to the 3RC (3RGCC) and 3RG (3RGGC) alleles, variant patterns containing a 113-bp band were observed (indicated by the arrows). These were later shown to correspond to a G > C substitution in the first repeat of the 2R. Alleles with this polymorphism are labeled 2RCC to distinguish them from the wild-type 2RGC.

All 3R fragment sequences agreed with previously published sequences for 3RG or 3RC. Similarly, 2R fragments from patients with normal RFLP banding patterns produced sequences that agreed with the wild-type 2R sequence. However, sequencing of the 2R fragments from all four 2R/3R patients with irregular RFLP results revealed a G > C change at the 12th nucleotide within the first 28-bp repeat (Fig. 3; also indicated in Fig. 1 by [*]). This change was at the same position in the USF E-box binding site within the repeat sequence as the SNPs previously described for the last repeat of the 2R and second repeat of the 3R alleles.13, 14 It was distinct in that it was located in the first 2R repeat and occurred in the wild-type 2R allele, effectively destroying the only functional USF binding site on this allele.

Figure 3. DNA sequence analysis of the TYMS 2R PCR fragment from an individual with the 2RCC/3RGCC genotype. The G > C base pair substitution in the first repeat of the 2R allele (indicated) was identified in 4 carriers of the 2R/3R and 1 with 2R/2R genotype. The potential E-box binding sites (CACTTG) in each repeat are indicated by a dotted box to signify the abolishment of this consensus sequence.

Frequency of the VNTR and associated SNPs in colorectal cancer patients

We designated the new allele 2RCC to indicate the nucleotides present at the SNP position in the first and second repeats, respectively, of the 2R allele. To maintain consistency, we adopted this notation for the 3R alleles as well, so that 3RG became 3RGGC and 3RC became 3RGCC. The observed G > C substitution in the first repeat of the 2R resulted in the loss of an HaeIII site that could easily be detected by RFLP (Fig. 4). Thus, we proceeded to genotype all 2R/2R homozygous patients to determine the frequency of this allele in our cohort. We found only one other carrier of the allele, with genotype 2RGC/2RCC, who was heterozygous for the new SNP and would thus carry only 1 functional USF E-box (Table I). We did not find anyone who was homozygous for 2RCC, nor any carriers of the C > G polymorphism in the last repeat of the 2R that has been described by Kawakami et al.14 The 2RCC allele frequency was calculated to be 4.2% (95% CI = 1.4–9.6%) in our patient cohort, accounting for 8.8% (95% CI = 2.9–19.3%) of all 2R alleles.
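The text does not state how the confidence intervals were calculated. The sketch below assumes an exact binomial (Clopper-Pearson) interval, which is a common choice for small counts. The function names are illustrative, and the counts (5 2RCC alleles among 118 alleles in total, and among 57 2R alleles) are taken from Table I.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_decreasing(f, target, iterations=100):
    """Bisection on [0, 1] for f(p) == target, where f decreases as p grows."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # p is still too small
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for a proportion x/n."""
    # Lower bound: the p at which P(X >= x) equals alpha / 2.
    lower = 0.0 if x == 0 else _solve_decreasing(
        lambda p: binom_cdf(x - 1, n, p), 1 - alpha / 2
    )
    # Upper bound: the p at which P(X <= x) equals alpha / 2.
    upper = 1.0 if x == n else _solve_decreasing(
        lambda p: binom_cdf(x, n, p), alpha / 2
    )
    return lower, upper


if __name__ == "__main__":
    for label, x, n in [("2RCC among all alleles", 5, 118),
                        ("2RCC among 2R alleles", 5, 57)]:
        low, high = clopper_pearson(x, n)
        print(f"{label}: {x / n:.1%} (95% CI {low:.1%} to {high:.1%})")
```

If the authors did use an exact method, the printed intervals should lie close to the reported values. An approximate method such as a Wald interval would give somewhat different bounds.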

Figure 4. HaeIII restriction map of TYMS promoter region polymorphisms. Identified polymorphic 3R and 2R alleles for TYMS are shown. The G > C substitutions in the 28-bp tandem repeats cause the loss of an HaeIII restriction site and destroy the USF-1 E-box consensus element. The corresponding size of the DNA fragments is indicated. The 28-bp tandem repeats are boxed and shown as containing a functional (G; hatched) or nonfunctional (C; shaded) USF-1 binding site. Researchers who reported each allele are referenced in brackets. (*) indicates the new allelic variant described in the present study.

Table I. TYMS Genotype Frequencies in Colorectal Cancer Patients

No. of functional USF E-boxes    Genotype¹        N (%)
0                                2RCC/2RCC        0 (0.0)
1                                2RGC/2RCC        1 (1.7)
                                 2RCC/3RGCC       3 (5.1)
                                 Total            4 (6.8)
2                                2RGC/2RGC        11 (18.6)
                                 3RGCC/3RGCC      6 (10.2)
                                 2RGC/3RGCC       19 (32.2)
                                 2RCC/3RGGC       1 (1.7)
                                 Total            37 (62.7)
3                                3RGGC/3RGCC      5 (8.5)
                                 2RGC/3RGGC       10 (16.9)
                                 Total            15 (25.4)
4                                3RGGC/3RGGC      3 (5.1)

Allele frequency                 Allele           N (%)
                                 2RGC             52 (44.1)
                                 2RCC             5 (4.2)
                                 3RGGC            22 (18.6)
                                 3RGCC            39 (33.1)

¹ Genotypes based on VNTR and G > C SNP for alleles that have been identified in patients genotyped for the present study; 3RGGC formerly known as 3RG; 3RGCC formerly known as 3RC.
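The allele frequencies in Table I follow from the genotype counts by simple allele counting: each patient contributes two alleles. A minimal sketch of that arithmetic, using the genotype counts from the table and illustrative function names:

```python
from collections import Counter

# Genotype counts from Table I (59 successfully genotyped patients).
GENOTYPE_COUNTS = {
    ("2RCC", "2RCC"): 0,
    ("2RGC", "2RCC"): 1,
    ("2RCC", "3RGCC"): 3,
    ("2RGC", "2RGC"): 11,
    ("3RGCC", "3RGCC"): 6,
    ("2RGC", "3RGCC"): 19,
    ("2RCC", "3RGGC"): 1,
    ("3RGGC", "3RGCC"): 5,
    ("2RGC", "3RGGC"): 10,
    ("3RGGC", "3RGGC"): 3,
}


def allele_counts(genotype_counts):
    """Count alleles; a homozygote contributes two copies of the same allele."""
    counts = Counter()
    for (allele_a, allele_b), n in genotype_counts.items():
        counts[allele_a] += n
        counts[allele_b] += n
    return counts


def allele_frequencies(genotype_counts):
    """Return each allele's share of all alleles, as a percentage."""
    counts = allele_counts(genotype_counts)
    total = sum(counts.values())
    return {allele: 100 * n / total for allele, n in counts.items()}


if __name__ == "__main__":
    counts = allele_counts(GENOTYPE_COUNTS)
    for allele, pct in allele_frequencies(GENOTYPE_COUNTS).items():
        print(f"{allele}: {counts[allele]} ({pct:.1f}%)")
    # Share of 2RCC among 2R alleles only.
    two_r_total = counts["2RGC"] + counts["2RCC"]
    print(f"2RCC among 2R alleles: {100 * counts['2RCC'] / two_r_total:.1f}%")
```

The same counts give the proportion of 2R alleles that carry the new variant, since 2RGC and 2RCC are the only 2R alleles observed in this cohort.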

To simplify genotyping of the TYMS 5′ region, we propose a strategy of classification based on the number of functional USF-1 binding sites as employed in Table I. This nomenclature would encompass all VNTR SNPs reported so far, and could easily accommodate any future findings.
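The proposed classification can be expressed as a small rule over the allele names used in this study. The sketch below is an illustration under stated assumptions: it treats every repeat carrying a G at the SNP position as a putative functional site, so a last-repeat G (as in the hypothetical 2RGG or 3RGGG alleles) is counted even though its functional effect has not been demonstrated. The function names are not part of the nomenclature.

```python
def functional_sites(allele):
    """Count putative functional USF E-boxes in an allele name such as '3RGCC'.

    The name is read as <repeat number>R<one letter per repeat>, where G marks
    a repeat with an intact E-box and C marks a repeat in which it is abolished.
    """
    repeats, snps = allele.split("R", 1)
    if len(snps) != int(repeats) or set(snps) - {"G", "C"}:
        raise ValueError(f"unrecognised allele name: {allele!r}")
    return snps.count("G")


def genotype_class(genotype):
    """Total putative functional E-boxes for a genotype such as '2RGC/3RGCC'."""
    allele_a, allele_b = genotype.split("/")
    return functional_sites(allele_a) + functional_sites(allele_b)


if __name__ == "__main__":
    for genotype in ["2RCC/2RCC", "2RCC/3RGCC", "2RGC/3RGCC",
                     "3RGGC/3RGCC", "3RGGC/3RGGC"]:
        print(genotype, genotype_class(genotype))
```

Because the class is derived from the allele names alone, a newly reported variant only needs a name in the same notation to be placed in the scheme.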

Discussion

This is the first report, to our knowledge, of the existence of a VNTR G > C SNP in the first repeat of the 2R allele. This single-base change effectively abolishes the only functional USF binding site in this variant and has been shown by Mandola et al. to dramatically decrease transcriptional activity compared to the wild-type 2R.13 These experiments were conducted in vitro as a proof of principle, but the occurrence of this polymorphism in vivo was not demonstrated by these authors.

The variant 2RCC allele was quite common, with a frequency of 4.2% in our small cohort of patients. The calculated CI indicates that the true allele frequency is greater than 1%, and that 2RCC could account for up to 19% of all 2R alleles in colorectal cancer patients. Although we identified 5 carriers of the allele, none was homozygous (2RCC/2RCC). This is not surprising, since our results indicate that the expected population frequency of homozygotes is only between 0.02 and 0.92%. However, based on available in vitro data,13 a homozygous individual would have severely decreased TS expression, and given the importance of TS activity for DNA synthesis, it could be speculated that a homozygous genotype may be incompatible with life.
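The homozygote range quoted above is consistent with squaring the bounds of the allele frequency CI, that is, with assuming Hardy-Weinberg proportions. The text does not name this assumption, so the sketch below should be read as one plausible reconstruction of the arithmetic, with an illustrative function name.

```python
def homozygote_frequency(allele_freq):
    """Expected homozygote frequency (q squared) under Hardy-Weinberg proportions."""
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    return allele_freq ** 2


if __name__ == "__main__":
    # Point estimate and 95% CI bounds for the 2RCC allele frequency, from the text.
    for label, q in [("lower bound", 0.014),
                     ("point estimate", 0.042),
                     ("upper bound", 0.096)]:
        print(f"{label}: q = {q:.3f}, expected homozygotes = "
              f"{100 * homozygote_frequency(q):.2f}%")
```

At the point estimate this corresponds to fewer than 2 expected homozygotes per 1,000 individuals, so finding none among 59 patients is unremarkable.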

The first repeat 2R G > C SNP identified in this study is distinct from the C > G SNP in the last repeat of 2R previously reported by Kawakami et al.14 The latter polymorphism resulted in the presence of a second putative USF-1 binding site in the 2R allele but was found to have no functional advantage over the wild-type gene, which contained only 1 functional binding site, when examined in vitro.14 However, their assays did not include the addition of exogenous USF-1, which has been shown to increase transcription of TS in similar experiments13 and may be required to activate gene expression to observe an effect. Alternatively, the additional 6 bp insertion in the last repeat and/or the proposed stem-loop structure formed by association of the inverted repeats5 may negatively affect binding of regulatory molecules to this region.

We did not find any individuals with the last repeat 2R variant (which we have designated 2RGG), but this may be beyond the limit of our detection system, since the restriction fragments produced by this variant would be difficult to resolve by agarose gel electrophoresis. Alternatively, the allele may only occur at very low frequencies within the Caucasian population and would easily be missed in our small sample. The actual Caucasian frequency is not known, and the original discovery was made in a Japanese cohort for which the frequency was never determined.14

We have used the notation 2RGC to indicate the wild-type 2R allele. However, this nomenclature is somewhat cumbersome and would become more complex if similar SNPs were identified in all repeats of the 2R and 3R alleles. Thus, we propose a classification system based on the number of functional repeats present in each genotype. Considering the allelic variants identified so far, this would range from 0 for a homozygous 2RCC to a maximum of 4 for a homozygous 3RGGC individual. However, if a C > G SNP is identified in the last repeat of the 3R allele, then the possibility of 6 putative functional repeats would exist (i.e., 3RGGG/3RGGG) and could be easily accommodated by this system.

It has been proposed that assessment of TS expression is most useful as an indicator of resistance to therapy for tumors with high levels of TS, since these are unlikely to respond to 5-FU.2 Given the complex metabolism required to generate cytotoxicity from 5-FU, it is probable that defects in other pathways contribute to cellular resistance in the absence of TS overexpression, and thus underexpression of TS would not necessarily guarantee treatment success. This has been demonstrated in vitro by comparing cell response to treatment with 5-FU vs. FUdR (the precursor of FdUMP, which is the ultimate TS inhibitor). Cells with low, intermediate and high TS expression were found to demonstrate a statistically significant linear trend in sensitivity to FUdR, but not to 5-FU.20 Because of this requirement for further cellular processing of 5-FU, analysis of genotype may have wider implications for predicting patient toxicity than for predicting tumor response to fluoropyrimidine treatment. Further studies will be required to determine whether patients who have inherited only 1 functional USF binding site in the TYMS promoter region demonstrate excessive sensitivity to fluoropyrimidine treatment.

Note: During the revision of this manuscript, the same polymorphic variant of TYMS was identified by Gusella et al. and published in Pharmacogenomics J, 2006.21
