Keywords:

  • capecitabine;
  • susceptibility;
  • pharmacogenomics;
  • genome-wide;
  • meta-analysis

Abstract

BACKGROUND:

Capecitabine, an oral 5-fluorouracil (5-FU) prodrug, is widely used in the treatment of breast, colorectal, and gastric cancers. Identifying genomic predictors of capecitabine sensitivity could permit its more informed use by guiding the selection of patients most likely to experience antitumor efficacy or, alternatively, those at greatest risk of developing toxicities.

METHODS:

The objective of this study was to perform capecitabine sensitivity genome-wide association studies (GWAS) using 503 well-genotyped human cell lines from individuals representing multiple world populations. A meta-analysis that included all ethnic populations then enabled the identification of novel germline determinants (single nucleotide polymorphisms [SNPs]) of capecitabine susceptibility.

RESULTS:

First, an intrapopulation GWAS of Caucasian individuals identified reference SNP 4702484 (rs4702484) (within adenylate cyclase 2 [ADCY2]) at a level reaching genome-wide significance (P = 5.2 × 10−8). This SNP is located upstream of the 5-methyltetrahydrofolate-homocysteine methyltransferase reductase (MTRR) gene. The MTRR enzyme is involved in the methionine-folate biosynthesis and metabolism pathway, which is the primary target of 5-FU-related compounds; however, the authors were unable to identify a direct relation between rs4702484 and MTRR expression in a tested subset of cells. In the meta-analysis, 4 SNPs comprised the top hits, which, again, included rs4702484 and 3 additional SNPs (rs8101143, rs576523, and rs361433) that approached genome-wide significance (P values from 1.9 × 10−7 to 8.8 × 10−7). The meta-analysis also identified 1 missense variant (rs11722476; serine to asparagine) within switch/sucrose nonfermentable-related, matrix-associated, actin-dependent regulator of chromatin (SMARCAD1), a novel gene for association with capecitabine/5-FU susceptibility.

CONCLUSIONS:

Toward the goal of individualizing cancer chemotherapy, the current study identified novel SNPs and genes associated with capecitabine sensitivity that are potentially informative and testable in any patient regardless of ethnicity. Cancer 2011. © 2011 American Cancer Society.


INTRODUCTION

Variation in drug response is both clinically expected and relatively poorly predicted.1, 2 Chemotherapy in particular is plagued by highly variable response rates as well as significant toxicity.1 Capecitabine is a chemotherapeutic agent widely used in the treatment of breast, colorectal, and gastric cancers.3 It is an oral, 5-fluorouracil (5-FU) prodrug designed to have limited toxicity because of preferential activation in tumor cells.4 However, toxicities can result, including, in particular, gastrointestinal toxicity and hand-foot syndrome.5 Identifying genetic predictors of capecitabine susceptibility could permit more informed use of this therapy by guiding the selection of patients potentially most likely to experience antitumor efficacy or, alternatively, by recognizing patients who may be at particular risk of developing toxicities. It is noteworthy that capecitabine is particularly well suited for pharmacogenomic study because, compared with other oncologic drugs, it is often used as single-agent therapy.6, 7

Toward the discovery of genetic variants that govern chemotherapeutic susceptibility in patients, we developed a human cell-based model.8 The model uses lymphoblastoid cell lines (LCLs) collected from individuals across the globe as part of the International HapMap Project (available at http://hapmap.ncbi.nlm.nih.gov/downloads/index.html.en; May 1, 2010), for which genotype information is publicly available for each individual.9 Previous genome-wide discovery work has used populations of these cells for several pharmacologic agents, including cytarabine,10 daunorubicin,11 etoposide,12 cisplatin, and carboplatin.13-15 LCLs offer a model for genome-wide discovery without the confounders of diet, comedications, and comorbidities.16 Genetic variants discovered using this model have been validated in clinical settings.17

Pharmacoethnicity,18 the concept that different ethnic populations have different responses to the same drug, makes population-based studies particularly informative. Although genetics may not be the only factor contributing to different responses across different ethnic groups, it is likely an important component. Whereas the discovery of ethnic-specific polymorphisms is useful, the ultimate goal for clinical translation of pharmacogenomics remains the discovery of genetic polymorphisms (single nucleotide polymorphisms [SNPs]) that are informative and testable in any patient, regardless of ethnicity.1, 13, 18 Therefore, to identify SNPs for testing in the clinical setting, our objective was to interrogate capecitabine susceptibility using genome-wide association in samples from over 500 individuals and to perform a cross-population meta-analysis to characterize the genetic determinants of sensitivity in individuals representing diverse global backgrounds.

MATERIALS AND METHODS

Phenotyping

HapMap cell lines from 6 different panels were purchased from Coriell Institute for Medical Research (http://www.coriell.org; May 28, 2010) and were used for susceptibility phenotyping: 84 LCLs from unrelated Asian individuals from HapMap Phase I (individuals from Tokyo, Japan and Beijing, China) (ASN); 84 LCLs representing Caucasian individuals from Utah in the United States with northern/western European ancestry (CEU) in trio structure (2 parents plus their child) from HapMap Phase I (CEU1); 80 LCLs from a CEU population in trio structure from HapMap Phase III (CEU3); 87 LCLs from Yoruba individuals from Ibadan, Nigeria (YRI) in trio structure from HapMap Phase I (YRI1); 86 LCLs from a YRI population in trio structure from HapMap Phase III (YRI3); and 82 LCLs from an African-American population from the Southwest United States in trio structure (ASW). LCLs were cultured in RPMI 1640 media (Cellgro, Herndon, Va) containing 15% heat-inactivated fetal bovine serum (Hyclone, Logan, Utah) and 20 mM L-glutamine. Cell lines were diluted 3 times per week to a concentration of 300,000 to 350,000 cells/mL and maintained in a 37°C, 5% CO2 humidified incubator.

The use of capecitabine in LCLs is hindered by the lack of expression of cytidine deaminase,19 which is required for the conversion of capecitabine to its active form. To circumvent this step in enzymatic activation, 5′-deoxy-5-fluorouridine (5′DFUR), a major metabolite of capecitabine, was used to evaluate capecitabine sensitivity in a short-term cellular growth-inhibition assay.20 LCLs in the exponential growth phase with >85% viability (Vi-Cell XR viability analyzer; Beckman Coulter, Fullerton, Calif) were plated in triplicate at a density of 1 × 10^5 cells/mL in 96-well, round-bottom plates (Corning, Corning, NY) 24 hours before drug treatment. Drug was added immediately after preparation of stock at concentrations of 2.5 μM, 10 μM, 20 μM, and 40 μM and was left on cells for 72 hours. AlamarBlue was added 24 hours before absorbance reading at wavelengths of 570 nm and 600 nm (Synergy-HT multidetection plate reader; BioTek, Winooski, Vt). The percentage of cells that survived was quantified relative to a control well without drug and, at each concentration, represents 2 separate experiments each performed in triplicate. The area under the survival curve (AUC), representing sensitivity to the drug, was calculated for each cell line using the trapezoidal rule and was log2-transformed for all data analysis. For comparisons between populations, AUC values were corrected for cellular growth rate20, 21 by subtracting from each cell line's AUC phenotype the cellular growth rate multiplied by the linear regression coefficient for growth rate. For performing the genome-wide association studies (GWAS), uncorrected AUCs were used so that any growth rate-associated variants also potentially could be identified.
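The AUC phenotype and the growth-rate correction described above can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' code: the function names are hypothetical, the integration is assumed to run over the untransformed concentration axis, and the regression coefficient is assumed to come from a simple least-squares fit of log2 AUC on growth rate.

```python
import math

# Drug concentrations (uM) taken from the text.
CONCENTRATIONS_UM = [2.5, 10.0, 20.0, 40.0]


def survival_auc(concentrations, percent_survival):
    """Area under the survival curve by the trapezoidal rule.

    Assumption: integration is over the linear concentration axis.
    """
    if len(concentrations) != len(percent_survival):
        raise ValueError("one survival value is needed per concentration")
    area = 0.0
    for i in range(1, len(concentrations)):
        width = concentrations[i] - concentrations[i - 1]
        mean_height = (percent_survival[i] + percent_survival[i - 1]) / 2.0
        area += width * mean_height
    return area


def log2_auc(concentrations, percent_survival):
    """log2-transformed AUC, the phenotype used for analysis."""
    return math.log2(survival_auc(concentrations, percent_survival))


def growth_corrected(log2_aucs, growth_rates):
    """Subtract (regression coefficient x growth rate) from each phenotype.

    Assumption: the coefficient is the ordinary least-squares slope of
    log2 AUC regressed on growth rate across cell lines.
    """
    n = len(log2_aucs)
    mean_x = sum(growth_rates) / n
    mean_y = sum(log2_aucs) / n
    sxx = sum((x - mean_x) ** 2 for x in growth_rates)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(growth_rates, log2_aucs))
    slope = sxy / sxx
    return [y - slope * x for x, y in zip(growth_rates, log2_aucs)]
```

For the GWAS itself, the uncorrected `log2_auc` values would be used; `growth_corrected` applies only to the interpopulation comparisons.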

Genotyping

Genotypes were downloaded from the HapMap Consortium release 27. Fewer genotypes were available for LCLs from Phase III HapMap (ASW, YRI3, CEU3) compared with Phase I samples (for which >2 million SNPs were available). To make these populations more comparable, imputation was performed for Phase III lines individually using BEAGLE open-source software (http://faculty.washington.edu/browning/beagle/beagle.html; August 1, 2010).22 For CEU3 and YRI3, CEU1 and YRI1 (HapMap r22), respectively, were used as reference. To measure the accuracy of imputation at each SNP, the correlation coefficient R2 was calculated as described previously after 100 imputations.22 Imputed genotypes with an R2 >0.80, a minor allele frequency (MAF) >0.05, no Mendelian errors, and in Hardy-Weinberg equilibrium (P > .001) were carried through the rest of the analysis. For ASW, the same process was followed using both YRI1 and CEU1 as references.
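The quality filter applied to imputed genotypes can be expressed as a simple predicate. The sketch below is illustrative only: the `ImputedSnp` record and its field names are hypothetical, and the imputation R2 and the Hardy-Weinberg P value are assumed to have been computed upstream.

```python
from dataclasses import dataclass


@dataclass
class ImputedSnp:
    """Hypothetical per-SNP summary; field names are illustrative."""
    snp_id: str
    r2: float                 # imputation accuracy (R2)
    minor_allele_count: int
    total_allele_count: int
    mendelian_errors: int
    hwe_p: float              # Hardy-Weinberg equilibrium P value

    @property
    def maf(self) -> float:
        return self.minor_allele_count / self.total_allele_count


def passes_qc(snp, min_r2=0.80, min_maf=0.05, min_hwe_p=0.001):
    """Thresholds from the text: R2 > 0.80, MAF > 0.05,
    no Mendelian errors, and HWE P > .001."""
    return (snp.r2 > min_r2
            and snp.maf > min_maf
            and snp.mendelian_errors == 0
            and snp.hwe_p > min_hwe_p)


def filter_snps(snps):
    """Keep only the SNPs carried through to the association analysis."""
    return [s for s in snps if passes_qc(s)]
```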

Genome-Wide Association Studies Analyses

Each of the 6 HapMap panels was analyzed independently by a GWAS. Because a GWAS assumes normality in the data, we first ensured this for each population. For 5 of the 6 panels, log2-transformed AUC phenotypes achieved normal distributions. Because log2-transformation in the ASW did not yield a normal distribution (Shapiro-Wilk test), the ASW population required rank normalization to achieve normality. Rank normalization was performed using the rntransform function in the R GenABEL package (The R Project for Statistical Computing, Vienna, Austria).
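For readers unfamiliar with rank normalization, the following Python sketch shows one common form of the rank-based inverse normal transformation. It is not a port of rntransform: the 0.5 rank offset and the average-rank handling of ties are assumptions chosen for illustration.

```python
from statistics import NormalDist


def rank_normalize(values):
    """Rank-based inverse normal transformation.

    Each value is replaced by the standard normal quantile of
    (rank - 0.5) / n, where tied values share their average rank.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0  # 1-based average rank
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((r - 0.5) / n) for r in ranks]
```

The transformed values follow the standard normal distribution by construction, at the cost of discarding the original spacing between observations.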

For CEU1, CEU3, YRI1, and YRI3, >2 million SNPs (MAF >5% within the panel, no Mendelian errors, and in Hardy-Weinberg equilibrium; P > .001) were tested for association using the quantitative trait disequilibrium test (QTDT).23 In ASW, local ancestry at each SNP was estimated using HAPMIX, a method that analyzes data from dense genotyping chips to estimate local ancestry (http://www.stats.ox.ac.uk/~myers/software.html; June 11, 2010).24 Phased genotypes from CEU1 and YRI1 were used as the ancestral populations to estimate ancestry. GWAS was performed using QTDT with local ancestry (a fractional predicted number of chromosomes) as a covariate.

A genomic control value25 was calculated for each GWAS. Correction for residual inflation of the test statistic was done for studies with λ>1. Resulting P values (possibly adjusted) were carried forward to the meta-analysis.
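A minimal sketch of the genomic control step, assuming the usual definition of λ as the median observed 1-degree-of-freedom chi-square statistic divided by its expected median under the null (approximately 0.455). The function names are illustrative, and P values are assumed to be 2-sided and strictly between 0 and 1.

```python
import math
from statistics import NormalDist, median

_STANDARD_NORMAL = NormalDist()
CHI2_1DF_MEDIAN = 0.4549364  # median of a chi-square variable with 1 df


def p_to_chi2(p_value):
    """Convert a 2-sided P value to a 1-df chi-square statistic."""
    z = _STANDARD_NORMAL.inv_cdf(p_value / 2.0)
    return z * z


def genomic_lambda(p_values):
    """Inflation factor: observed median chi-square over expected median."""
    return median([p_to_chi2(p) for p in p_values]) / CHI2_1DF_MEDIAN


def genomic_control_adjust(p_values):
    """Return (lambda, P values), rescaling the statistics only if lambda > 1."""
    lam = genomic_lambda(p_values)
    if lam <= 1.0:
        return lam, list(p_values)
    adjusted = [
        2.0 * _STANDARD_NORMAL.cdf(-math.sqrt(p_to_chi2(p) / lam))
        for p in p_values
    ]
    return lam, adjusted
```

Dividing every chi-square statistic by λ makes all P values more conservative, which is why the correction is applied only when λ exceeds 1.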

Meta-Analysis

To identify SNPs associated with capecitabine-induced cytotoxicity, we conducted a meta-analysis to assimilate the results of the GWAS from individual populations. We used METAL (Center for Statistical Genetics, University of Michigan, Ann Arbor, Mich),26 which combines P values across the studies for each SNP using a study-specific weight (sample size) and the direction of effect (β). At each SNP, the direction of effect and the P values from the individual studies were converted into signed Z scores. Z scores were combined with weights proportional to the square root of the sample size for each study.
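The sample-size-weighted scheme just described can be written compactly. The sketch below is an illustration of the calculation, not METAL itself: the function names are hypothetical, and each study is assumed to supply a 2-sided P value, an effect estimate whose sign gives the direction, and a sample size.

```python
import math
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()


def signed_z(p_value, beta):
    """Convert a 2-sided P value and an effect direction to a signed Z score."""
    z = abs(_STANDARD_NORMAL.inv_cdf(p_value / 2.0))
    return math.copysign(z, beta)


def combine_studies(studies):
    """Combine per-study results for one SNP.

    studies: iterable of (p_value, beta, sample_size) tuples.
    Each Z score is weighted by sqrt(sample_size); the weighted sum is
    divided by the square root of the sum of squared weights, which
    equals sqrt(total sample size).
    Returns (combined Z, combined 2-sided P value).
    """
    studies = list(studies)
    numerator = sum(math.sqrt(n) * signed_z(p, b) for p, b, n in studies)
    denominator = math.sqrt(sum(n for _, _, n in studies))
    z = numerator / denominator
    p_combined = 2.0 * _STANDARD_NORMAL.cdf(-abs(z))
    return z, p_combined
```

Because the Z scores are signed, modest associations in the same direction reinforce one another across populations, whereas associations in opposite directions cancel.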

RESULTS

Phenotype Variation Across Ethnic Groups

Figure 1 provides the susceptibility phenotype results, grouped by ethnic population, for all 503 included cell lines upon exposure to the capecitabine metabolite 5′DFUR. Interpopulation comparisons reveal that the YRI population was most sensitive to the growth-inhibitory effects of capecitabine, with a median AUC that was significantly lower than that in the CEU population (P = 8.5 × 10−8) and the ASN population (P = 6 × 10−3) but not significantly different from that in the ASW population. The CEU population was the most resistant. Because previous work has indicated significant differences in the growth rate across HapMap populations21 and because the growth rate is a significant confounder of pharmacologic endpoints, including for 5′DFUR,20 the AUC measurement is corrected for the growth rate to allow the most appropriate comparisons.

Figure 1. Cellular sensitivity to capecitabine is depicted in 503 HapMap lymphoblastoid cell lines. The population of Yoruba individuals from Ibadan, Nigeria (YRI) was most sensitive to the growth-inhibitory effects of 5′-deoxy-5-fluorouridine (capecitabine) with a median area under the survival curve (AUC) (representing sensitivity to the drug) that was significantly lower than the AUC for the population of Caucasians from Utah in the United States with northern/western European ancestry (CEU) (P = 8.5 × 10−8) and the population of unrelated Asians from Tokyo, Japan and Beijing, China (ASN) (P = 6 × 10−3), but it was not significantly different from the AUC for the population of African-American individuals from the Southwest United States (ASW).

Individual Population Genome-Wide Association Studies Reveal Top Single Nucleotide Polymorphism Finding in Caucasians

Because some interpopulation differences in sensitivity were observed, we first conducted individual, intrapopulation GWAS for each ethnic population to identify any potentially important, population-restricted SNPs. Such SNPs, in fact, may be important toward explaining interethnic susceptibility differences13, 27 like those illustrated in Figure 1.

Manhattan plots that illustrate GWAS results for each population are provided in Figure 2A (CEU) and in Figure 3 (YRI, ASW, ASN). Although there are interesting findings for each population, the most intriguing result was produced by the CEU GWAS. The top CEU signal—considerably stronger than any other signal in the entire CEU GWAS—identified the reference SNP rs4702484 at a level that approximated genome-wide significance (P = 5.2 × 10−8). This SNP, which is located in an intronic region of adenylate cyclase 2 (ADCY2), has a CEU MAF of 12%. In addition, as observed on the chromosomal plot of this region (Fig. 2B), rs4702484 is located just upstream of the 5-methyltetrahydrofolate-homocysteine methyltransferase reductase (MTRR) gene, and it is known that the enzyme for MTRR is involved in the methionine-folate biosynthesis and metabolism pathway.28 It has been demonstrated that 5-FU-related compounds target other enzymes in this pathway29; however, to our knowledge, a potential pharmacogenomic relation with MTRR would be novel.

Figure 2. Genome-wide association studies (GWAS) in Caucasian individuals from Utah in the United States with northern/western European ancestry (CEU) identified a novel variant, reference single nucleotide polymorphism 4702484 (rs4702484), which was associated with capecitabine sensitivity at a level reaching genome-wide significance (P = 5.2 × 10−8). (A) This Manhattan plot depicts capecitabine susceptibility GWAS results in the CEU population. (B) This is a zoom-in view of the chromosome 5 (chr5) region around the top CEU GWAS SNP, rs4702484. The location within adenylate cyclase 2 (ADCY2) is shown in addition to its close proximity to (upstream of) 5-methyltetrahydrofolate-homocysteine methyltransferase reductase (MTRR). The r2 patterns designate linkage disequilibrium probabilities with other single nucleotide polymorphisms in the region. FASTKD3 indicates Fas-activated serine/threonine kinase domains 3; C5orf149, chromosome 5 open reading frame 149.

Figure 3. These Manhattan plots show the results from genome-wide association studies for capecitabine sensitivity in different populations, including (A) Yoruba individuals from Ibadan, Nigeria (YRI); (B) African-American individuals from the Southwest United States (ASW); and (C) unrelated Asian individuals from Tokyo, Japan and Beijing, China from HapMap Phase I (ASN).

By using whole-genome messenger RNA (mRNA) expression data that we previously generated in our CEU1 LCLs using Affymetrix Exon Array 1.0 (Affymetrix, Santa Clara, Calif),19 we investigated in an exploratory manner whether there was a statistical correlation between rs4702484 and MTRR mRNA expression levels in 30 CEU LCL trios (90 samples) using QTDT software; however, we were unable to identify a direct relation in this limited subset.

Meta-Analysis Results From Population-Based Genome-Wide Association Studies

Next, we performed a meta-analysis of the individual population data to identify the most significant SNPs that incorporated all populations. Figure 4 illustrates the full meta-analysis results in a Manhattan plot. Although none of the top SNP associations reached the traditional cutoff for genome-wide association statistical significance,30 4 SNPs approached this threshold (rs8101143, rs576523, rs4702484, and rs361433; P values from 1.9 × 10−7 to 8.8 × 10−7). An additional 23 top SNPs had P values <10−5 (Table 1). It is noteworthy that the previously identified top CEU SNP, rs4702484, remained highly significant (and ranked third overall) in the meta-analysis (meta-analysis P = 6.4 × 10−7). This SNP did not have an MAF >0.05 in the YRI or ASW populations; in the ASN population, the P value was .34.

Figure 4. This Manhattan plot depicts the meta-analysis of individual lymphoblastoid cell lines from the Caucasian-American, Nigerian, African-American, and Asian populations. The plot shows meta-analysis results from capecitabine susceptibility genome-wide association studies, including all of the combined world populations that were used (n = 503 individuals).

Table 1. Single Nucleotide Polymorphisms From the Multipopulation Meta-Analysis of Genome-Wide Association Findings for Capecitabine Susceptibility (a)

Chromosome  Gene       SNP         Location  Meta-Analysis P  Overall Rank (by P)  No. of Individual Samples Tested
1           -          rs576523    -         2.3 × 10−7       2                    172
1           PDE4DIP    rs2863344   Intron    1.9 × 10−6       9                    156
1           SLC44A5    rs1249675   Intron    6.5 × 10−6       17                   491
2           -          rs4848143   -         9.3 × 10−6       27                   329
3           -          rs6771019   -         2.1 × 10−6       10                   83
3           -          rs9824150   -         6.6 × 10−6       18                   244
4           -          rs11941399  -         1.3 × 10−6       6                    328
4           SMARCAD1   rs3106136   Intron    7.2 × 10−6       23                   484
4           SMARCAD1   rs183993    Intron    8.3 × 10−6       25                   502
5           ADCY2      rs4702484   Intron    6.4 × 10−7       3                    248
6           -          rs2524276   -         6.3 × 10−6       16                   246
6           LOC643281  rs12198063  Intron    7.6 × 10−6       24                   245
7           -          rs361433    -         8.8 × 10−7       4                    251
7           -          rs2882834   -         1.3 × 10−6       5                    494
7           -          rs6971109   -         6.8 × 10−6       20                   503
10          -          rs705469    -         6.8 × 10−6       19                   495
10          SH2D4B     rs6586111   Intron    7.1 × 10−6       21                   487
10          SH2D4B     rs7915642   Intron    8.5 × 10−6       26                   489
10          -          rs705471    -         7.2 × 10−6       22                   493
11          SOX6       rs16932455  Intron    1.5 × 10−6       7                    329
11          SOX6       rs12577378  Intron    1.8 × 10−6       8                    330
11          SOX6       rs7947008   Intron    2.6 × 10−6       11                   336
11          SOX6       rs12576205  Intron    4.9 × 10−6       14                   257
11          SOX6       rs16932445  Intron    4.9 × 10−6       15                   257
13          -          rs6490525   -         3.9 × 10−6       13                   246
18          -          rs9953852   -         3.1 × 10−6       12                   252
19          -          rs8101143   -         1.9 × 10−7       1                    474

Abbreviations: ADCY2, adenylate cyclase 2; LOC643281, chromosome 6 location 6q14.1; PDE4DIP, phosphodiesterase 4D interacting protein; rs, reference single nucleotide polymorphism number; SH2D4B, SH2 domain containing 4B; SLC44A5, solute carrier family 44, member 5; SMARCAD1, switch/sucrose nonfermentable-related, matrix-associated, actin-dependent regulator of chromatin; SNP, single nucleotide polymorphism; SOX6, sex-determining region Y box 6.

(a) Although samples from 503 individuals were tested overall for capecitabine susceptibility, the number of samples analyzed for each SNP (far right column) was less if the SNP was monomorphic or rare (minor allele frequency <5%) in some ethnic groups or, in infrequent cases, if HapMap or imputation genotypes were unavailable.

We undertook an analysis to determine the proportion of SNPs identified from each intrapopulation GWAS that remained strongly significant in the meta-analysis (Fig. 5). By using an arbitrary cutoff for significance of P < 10−4, the ASN-only GWAS identified 200 top SNPs, 14 (7%) of which remained significant in the across-population meta-analysis. For the CEU populations, 161 SNPs were identified by CEU-only GWAS, of which 29 (18%) also were significant in the meta-analysis. For the YRI population, the proportion was 12% (33 of 279 SNPs). For the ASW population, 1 SNP remained significant in the meta-analysis out of 147 that had been identified in the ASW-only GWAS.

Figure 5. These circles depict the proportion of single nucleotide polymorphisms (SNPs) identified from each intrapopulation genome-wide association study (GWAS) that remained strongly significant in the meta-analysis. For example, the GWAS that included only unrelated Asian individuals from Tokyo, Japan and Beijing, China from HapMap Phase I (ASN) identified 200 top SNPs (the sum of 14 + 186), 14 (7%) of which remained significant in the meta-analysis. An arbitrary cutoff of significance of P < 10−4 was used. None of the SNPs from an individual population that remained strongly significant in the meta-analysis were the same in multiple populations. CEU indicates Caucasian individuals from Utah in the United States with northern/western European ancestry; YRI, Yoruba individuals from Ibadan, Nigeria; ASW, African-American individuals from the Southwest United States.

Top Meta-Analysis Single Nucleotide Polymorphisms Considering Directionality in All Populations

For analysis of the top meta-analysis SNPs that had potential importance, we more closely interrogated all SNPs that had P values < 10−4, consistent with many of our previous cell-based analyses.8, 11, 13, 31 In the meta-analysis, 321 SNPs met this threshold. Upon inspection, it became apparent that these SNPs generally fell into 1 of 3 categories: 1) directional agreement across all populations regarding the demonstrated association independent of the significance of the association, 2) the meta-analysis P value was entirely driven by the association within a single population, or 3) the direction of the genotype-phenotype association for the SNP was opposite for 1 or more of the ethnic populations.

One-third of the top SNPs (108 of 321 SNPs) fit the description of consistent genotype-phenotype association direction across all 6 individually evaluated populations (ASN, CEU1, CEU3, ASW, YRI1, and YRI3). A representative example of this is provided for rs8101143 (P = 1.9 × 10−7; top-ranked in the meta-analysis) (Fig. 6A). This SNP was not identified by any of our single-population GWAS, because the strength of the association was low in any single ethnic population (P > 10−4) but was strong when multiple populations were considered in the meta-analysis, apparently as a result of reproducible, consistent-directional effects in all of the included populations. Many additional SNPs (n = 193) also fell into this general category, in that the direction of the genotype-phenotype association was consistent across all ethnically distinct individual populations in which the variant was common (MAF >5%) and present (not monomorphic). It is acknowledged that the meta-analysis methodology itself is designed to favor the identification of “consistent-direction” SNPs among the top signals.

Figure 6. These are examples of single nucleotide polymorphisms (SNPs) that were identified in the meta-analysis. (A) Reference SNP 8101143 (rs8101143) (P = 1.9 × 10−7; ranked first in the meta-analysis) illustrates genotype-phenotype associations in the same direction across all 6 individually evaluated populations. ASN indicates unrelated Asian individuals from HapMap Phase I (individuals from Tokyo, Japan and Beijing, China); CEU1, Caucasian individuals from Utah in the United States with northern/western European ancestry (CEU) in trio structure (2 parents plus their child) from HapMap Phase I; CEU3, CEU individuals in trio structure from HapMap Phase III; ASW, African-American individuals from the Southwest United States; YRI1, Yoruba individuals from Ibadan, Nigeria (YRI) in trio structure from HapMap Phase I; YRI3, YRI individuals in trio structure from HapMap Phase III. (B) SNP rs576523, which is only polymorphic in the YRI population (and it is noteworthy that it was still the second most significant SNP in the overall meta-analysis, P = 2.3 × 10−7), was a SNP with a strong signal in the meta-analysis despite having a genotype-phenotype association present in only 1 population, because the variant was monomorphic in all other populations. (C) SNP rs6971109 illustrates a top hit in the meta-analysis despite the finding that the consensus direction of the genotype-phenotype association for the SNP was opposite in 1 or more of the ethnic populations. The circular symbol indicates that the directionality of the association in that population was opposite to that of the other populations (which all are similarly denoted with diamonds). The dashed vertical line in each chart indicates a P value of .05.

Another category was genotype-phenotype associations present only in 1 ethnically restricted population because the variant was monomorphic or rare in all other populations. An example is provided for SNP rs576523 (Fig. 6B), which is polymorphic only in the YRI population, and it is worth noting that rs576523 was the second most significant SNP in the meta-analysis (P = 2.3 × 10−7). In fact, 3 of the top 10 SNPs in Table 1 are considered polymorphic in only 1 population (rs576523 in YRI, rs2863344 in CEU, and rs6771019 in YRI). These results include several of the “surviving” SNPs, which are depicted in Figure 5, and they likely represent some of the strongest findings given the strengths of the associations, although they are population-restricted.

The third group includes SNPs in which the consensus direction of the genotype-phenotype association for the SNP was opposite for 1 or more of the ethnic populations, yet the strength of the association in the consensus direction was robust enough to achieve a meta-analysis P value that reached top significance (eg, an association was positive for CEU, YRI, and ASW but opposite for ASN, yet the meta-analysis P value was <10−4). An example of this is provided in Figure 6C for SNP rs6971109. Such SNPs may have relevance for general testing in most individuals with the knowledge that, in a single ethnic population, the association may not be relevant. Only 20 of the top 321 SNPs fit this group, and in none of those 20 was the opposite direction-outlier population's association statistically significant.
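The 3 categories described in this section can be summarized as a simple decision rule. The sketch below is a hypothetical illustration, not the procedure used in the study: the function name, the category labels, and the convention of passing None for a population in which the SNP was monomorphic or rare are all assumptions.

```python
def classify_direction(betas_by_population):
    """Assign a top meta-analysis SNP to 1 of 3 illustrative categories.

    betas_by_population maps a population label to the estimated effect
    (beta) of the SNP in that population, or to None when the SNP was not
    tested there (monomorphic or minor allele frequency < 5%).
    An effect of exactly zero is grouped with negative effects.
    """
    tested = {pop: beta for pop, beta in betas_by_population.items()
              if beta is not None}
    if not tested:
        raise ValueError("the SNP was not tested in any population")
    if len(tested) == 1:
        return "single-population"
    directions = {beta > 0 for beta in tested.values()}
    if len(directions) == 1:
        return "consistent-direction"
    return "opposite-direction"
```

For example, a SNP tested only in YRI1 and YRI3 with effects in the same direction would be labeled "consistent-direction" under this rule, so a real analysis would likely merge panels from the same ethnic population before classifying.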

Potential Functional Role of Top Meta-Analysis Single Nucleotide Polymorphisms

Of the 321 top SNPs with P < 10−4, most (162 SNPs) were located in uncharacterized regions of the genome (ie, not annotated to any known gene based on location), a finding that was consistent in our previous studies using an unbiased genome-wide approach to chemotherapy susceptibility pharmacogenomics. Many others were located in introns (145 SNPs), although none were located at known splice sites. Eight SNPs were located either near known genes (rs263003 with presenilin associated, rhomboid-like [PARL]; rs3106134 with switch/sucrose nonfermentable-related, matrix-associated, actin-dependent regulator of chromatin [SMARCAD1]; and rs972249 with keratin 40 [KRT40]) or in the 3′ or 5′ untranslated region (UTR) (rs7448390 and rs17101607 located in the 3′ UTR of YIPF5; rs11635570 located in the 3′ UTR of mitochondrial methionyl-transfer RNA formyltransferase [MTFMT]; rs17039288 located in the 5′ UTR of myelin transcription factor 1-like [MYT1L]; and rs3738414 located in the 5′ UTR of V-set domain-containing T-cell activation inhibitor 1 [VTCN1]). One SNP (rs10907177) was a synonymous coding variant in a poorly characterized gene region (chromosome 1 open reading frame 159 [C1orf159]).

Perhaps most interesting was a missense SNP (rs11722476) within SMARCAD1 (Fig. 7A). This SNP had a meta-analysis P value of 6.7 × 10−5. The guanine-to-adenine (G→A) DNA change results in a serine-to-asparagine amino acid change in the SMARCAD1 protein. The genotype-phenotype association for this SNP in the n = 503 individuals is illustrated in Figure 7B.

Figure 7. Meta-analysis of the cross-population genome-wide association studies identified reference single nucleotide polymorphism (SNP) 11722476 (rs11722476), a missense SNP within switch/sucrose nonfermentable-related, matrix-associated, actin-dependent regulator of chromatin (SMARCAD1). (A) This is a zoom-in view of the genomic region on chromosome 4 around this SNP. The finding that several other signals within SMARCAD1 were identified is illustrated by the striking number of SNPs that were observed at P values less than the arbitrary cutoff of P < 10−4. Chr4 indicates chromosome 4; HPGDS, hematopoietic prostaglandin D synthase. (B) The genotype-phenotype association for this SNP in n = 503 individuals. AUC indicates area under the survival curve (representing sensitivity to the drug); 5′DFUR, 5′-deoxy-5-fluorouridine; A, adenine; G, guanine.

Effect of Thymidine Phosphorylase

Because inactive 5′DFUR requires activation to 5-FU through a final anabolizing enzyme, thymidine phosphorylase, and because thymidine phosphorylase levels can be affected by the relative expression of the thymidine phosphorylase gene (TYMP) in various human tissues,32 we also interrogated whether TYMP levels within LCLs correlated with 5′DFUR susceptibility in our study. By using our broad gene expression data from the HapMap CEU1 and YRI1 populations,19 we identified a significant relation between 5′DFUR AUC and TYMP expression in both CEU1 (P = 1.5 × 10−4) and YRI1 (P = 2.4 × 10−6). In both populations, the direction of the relation indicated that higher TYMP expression was correlated with a lower AUC (β = −3.97 for CEU1; β = −5.10 for YRI1), as may be hypothesized. However, the overall proportion of AUC variation explained by TYMP was only 0.15 for CEU1 and 0.22 for YRI1, supporting our GWAS findings reported above that there are other important sources of genetic variability in determining capecitabine (5′DFUR) sensitivity.

DISCUSSION

In this report, we describe a novel human cell-based approach to identifying germline pharmacogenomic polymorphisms that govern susceptibility to the widely used chemotherapy drug capecitabine. By using the inherently powerful genetic information encapsulated within the HapMap Project and a high-throughput, cell-based chemotherapy susceptibility testing method, we were able to perform the largest known GWAS for capecitabine susceptibility pharmacogenomics in over 500 individual samples. This comprehensive, “across-populations” chemotherapy study is distinguished from previous cell-based pharmacogenomic approaches by the novel idea of conducting a meta-analysis across multiple ethnic populations. This exercise permitted a built-in method for statistical veracity testing of resulting SNPs, because the meta-analysis assimilated raw P values from individual population-based GWAS, and only the top SNPs that had strong associations after consideration of all individual populations achieved robust meta-analysis P values. Therefore, many of the top SNPs from our meta-analysis may be considered clinically testable for replication in the ethnically diverse human populations that typically are encountered in true clinical practice and in clinical research settings.

Our 2 most compelling genetic findings deserve particular discussion. SNP rs4702484 on chromosome 5 was identified among the top 3 SNPs in our meta-analysis in addition to being identified by our CEU intrapopulation GWAS (at a P value that reached genome-wide significance). Although this SNP is intronic within ADCY2, close analysis of the surrounding genomic region demonstrates that the SNP is quite proximal to the methionine-folate pathway gene MTRR (see Fig. 2B). Therefore, we hypothesize that the SNP may affect upstream regulation of MTRR (through polymorphism of its extended promoter region), a mechanism that would fit with the purported site of activity of 5-FU–related compounds like capecitabine within the folate metabolism/nucleotide biosynthesis pathway. Although other capecitabine/5-FU pathway candidate genes (eg, thymidylate synthetase [TYMS], methylenetetrahydrofolate reductase [MTHFR], and dihydropyrimidine dehydrogenase [DPYD]) have been well studied previously,33-35 we could identify no previous positive reports linking MTRR pharmacogenetic variation to 5-FU–related phenotypes (1 recent study reported a negative finding with a different MTRR polymorphism36). This increases the potential novelty of our finding. Perhaps most noteworthy, if this relation is confirmed by ongoing functional studies of MTRR, then it would be an example of a GWAS approach identifying a de facto “candidate gene” that previously was largely ignored by classic candidate gene methods.

Separately, although several other top meta-analysis SNPs are interesting for possible functional and clinical importance, the identification of SNP rs11722476 (a missense SNP within SMARCAD1) was especially intriguing. First, there was a noticeably large (and disproportionate) number of repetitive, strong signals within SMARCAD1 among our top 321 meta-analysis SNPs (see Fig. 7A), and there were 2 signals within SMARCAD1 (potentially in linkage disequilibrium [LD]) among the most significant (P < 10−5) overall SNPs (Table 1). These findings, along with the identification of the missense SNP in this region described above, together suggest the importance of this gene for capecitabine. It is noteworthy that SMARCAD1, a member of a helicase superfamily that includes proteins essential to genome replication, repair, and expression,37 has been mentioned previously (although only tangentially) in reports that would be consistent with an important 5-FU–related correlation. One study that used deletion mapping of chromosome 4q22-35 indicated that SMARCAD1 frequently was deleted in head and neck cancers,38 which are often treated successfully with 5-FU. This may suggest that SMARCAD1 gene dosage effects underlie one mechanism of 5-FU susceptibility for these tumors. A second, unrelated study indicated (through expression analysis) that SMARCAD1 is expressed at particularly high levels in endocrine tissues.39 Breast tissue would be considered highly endocrine-responsive (estrogen/progesterone receptors); therefore, although still speculative, this may begin to suggest a role for this SMARCAD1 polymorphism in explaining the sensitivity of breast cancers to capecitabine. Of course, such hypotheses would need to be confirmed by formal molecular studies, highlighting that GWA studies often are excellent for permitting new hypothesis generation.

None of the typically studied capecitabine/5-FU pathway candidate genes themselves (TYMS, TYMP, MTHFR, and DPYD, among others) were identified among the top SNP signals in our study. This may have been caused in part by tissue-restricted down-regulation of some of these genes in LCLs. However, it also illustrates the idea that a combined approach (genome-wide plus candidate gene methods) ultimately may provide the most comprehensive view of drug susceptibility pharmacogenomics, perhaps especially in oncology. In addition, there may be differences in the pharmacogenomics of capecitabine compared with 5-FU (just as there are differences in the toxicity profiles for these 2 drugs), and there has been considerably less clinical investigation into the pharmacogenomics of capecitabine. One recent study implicated a role for TYMS in patients with colon cancer who received a regimen that contained capecitabine.40 Another earlier study also provided the strongest evidence for TYMS among the typically studied capecitabine candidate genes.41

Our study has recognized limitations. First, although SNP rs4702484 did achieve genome-wide statistical significance in the CEU-only GWAS, none of the meta-analysis results achieved P values below the generally accepted GWA cutoff of approximately 5 × 10−8.30 In this sense, the overall statistical power gained by combining 6 panels in the meta-analysis probably was less than we expected. This may be because of different underlying LD structures between the individual populations.42, 43 At the same time, it has been argued that additional factors in the comprehensive evaluation of GWAS findings need to be considered beyond just the P value threshold,44 and we suggest that SNPs in our meta-analysis with approximate P values ≤10−6 have a greater likelihood of true importance, especially when the association directionality is consistent across all 6 individually tested panels. The finding that 4 SNPs indeed achieved P values of approximately 10−7 (approaching traditional genome-wide significance) despite our sample size (approximately 500 individuals), which typically would be considered too small for conducting a GWAS, may emphasize the potential relevance of these findings. We believe it also simultaneously validates the utility of the meta-analysis approach. Second, the majority of our identified top SNPs were located in regions of the genome without an obvious functional explanation. There are 2 possible explanations: Either this simply reflects the greater statistical probability of identifying variants in noncoding regions of the genome, because those regions inherently comprise a vastly greater total percentage of the genome; or it signals the novelty and the advantage of GWAS studies like ours for interrogating chemotherapeutic pharmacogenomic traits, in which an unbiased approach may be exactly what is desired, because candidate-gene or single-gene methods often have fallen short.18 Third, although the genetic information in this study is from human individuals in the HapMap Project, and although meta-analysis approaches, when conducted properly, may obviate the need for additional replication in a separate population, the phenotypes were derived in a cell-line model; therefore, the results require validation in clinical populations of patients who are receiving capecitabine.
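Two pieces of arithmetic sit behind the thresholds discussed in this paragraph, and the short sketch below works through them. Both rest on stated assumptions rather than on the study's own methods: the cutoff of approximately 5 × 10−8 is conventionally motivated as a Bonferroni correction of α = 0.05 for roughly 1 million independent tests, and the directional-consistency calculation assumes 6 independent panels with no true effect, so that each direction is equally likely in each panel. The function names are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests

def prob_all_same_direction(n_panels):
    """Chance that n independent panels all show the same effect direction
    (either all positive or all negative) when there is no true effect."""
    return 2 * 0.5 ** n_panels

threshold = bonferroni_threshold(0.05, 1_000_000)
same_sign = prob_all_same_direction(6)
print(f"genome-wide threshold = {threshold:.1e}")
print(f"chance of 6 of 6 concordant directions under the null = {same_sign}")
```

Under these assumptions, full directional agreement across 6 panels would occur by chance in about 1 of 32 null SNPs, which is why concordance adds some support to a SNP whose P value falls short of the genome-wide cutoff but does not replace that cutoff.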

Finally, it should be mentioned that, although our cross-population method offers the advantage of identifying relatively common SNPs that are testable in all or most individuals regardless of ethnic background, we did observe allelic heterogeneity among the top SNPs (and often one or more populations were monomorphic at a given identified locus). In addition, despite the value of a cross-population meta-analysis, one of our strongest results was achieved from a single-population analysis, underscoring the value of individual population analyses as well.

In summary, we conducted a large, cell-based meta-analysis of genome-wide association findings for capecitabine chemotherapy susceptibility across multiple divergent human populations. The resulting list of novel SNPs and related genes, along with SNPs in previously identified capecitabine/5-FU pathway candidate genes, deserves study in clinical settings toward the goal of identifying underlying genetic factors that influence toxicity from, and perhaps response to, this commonly used cancer agent. The ongoing multicenter clinical study examining comprehensive capecitabine toxicity pharmacogenomics (www.clinicaltrials.gov study identifier NCT00977119) plans to use these data for precisely that purpose.

FUNDING SOURCES

This study was supported by The University of Chicago Breast Cancer Specialized Program of Research Excellence (SPORE) grant P50 CA125183 (M.E.D.), by National Institutes of Health (NIH)/National Institute of General Medical Sciences (NIGMS) grant UO1 GM61393 to the Pharmacogenomics of Anticancer Agents Research Group (M.E.D., N.J.C.), by NIH/National Cancer Institute grant F32 CA136123 (P.H.O.), by NIH grant TL1 RR25001 (A.L.S.), and by NIH/NIGMS grant K08 GM089941 (R.S.H.).

REFERENCES
