Chemotherapeutic agents are known to have a narrow therapeutic range, which often results in life-threatening toxicity. Hence, it is clinically important to use a pharmacogenomics approach to identify patients who are at high risk of severe toxicity from particular chemotherapies. In this study, we carried out multiple genome-wide association studies (GWAS) of 13 122 cancer patients who received different chemotherapy regimens, including cyclophosphamide- and platinum-based (cisplatin and carboplatin), anthracycline-based (doxorubicin and epirubicin), and antimetabolite-based (5-fluorouracil and gemcitabine) treatment, antimicrotubule agents (paclitaxel and docetaxel), and topoisomerase inhibitors (camptothecin and etoposide), as well as combination therapy with paclitaxel and carboplatin, to identify genetic variants that are associated with the risk of severe neutropenia/leucopenia in the Japanese population. In addition, we used a weighted genetic risk scoring system to evaluate the cumulative effects of the suggestive genetic variants identified from GWAS in order to predict the risk levels of individuals who carry multiple risk alleles. Although we failed to identify genetic variants that surpassed the genome-wide significance level (P < 5.0 × 10⁻⁸) through GWAS, probably due to insufficient statistical power and complex clinical features, we were able to shortlist some suggestively associated loci. The current study is at a relatively preliminary stage, but it highlights the complexity and problematic issues associated with retrospective pharmacogenomics studies. Nevertheless, we hope that verification of these genetic variants through local and international collaborations could improve the clinical outcome for cancer patients.
It is now widely recognized that the efficacy and toxicity of a medication can differ markedly among individuals. These inter-individual differences could be explained in part by common and/or rare genetic variants in the human genome. Pharmacogenomics aims to discover how genetic variations in the human genome can affect a drug's efficacy or toxicity, and thus brings great promise for personalized medicine, in which genetic information can be used to predict the safety, toxicity, and/or efficacy of drugs. Pharmacogenomic study of chemotherapeutic agents is particularly important because these drugs are known to have a narrow therapeutic window; in general, a higher concentration causes toxicity and a lower concentration reduces the efficacy of the drug. Two well-described examples are the association of genetic variants in TPMT with 6-mercaptopurine-induced myelosuppression in the treatment of pediatric acute lymphoblastic leukemia, and that of UGT1A1 variants with camptothecin-related neutropenia and diarrhea in the treatment of colorectal and lung cancers. The US Food and Drug Administration has recommended the use of variants in these two genes for predicting severe adverse reactions before the drugs are given.[2-7]
With advances in various technologies in the life sciences, it is now possible to accurately genotype more than a million common genetic variations using genome-wide high-density SNP arrays, or to characterize all genetic variants in the genome using next-generation DNA sequencing methods. One of the greatest drawbacks of GWAS is the requirement for a large number of samples to achieve high statistical power; this issue could be addressed through the establishment of Biobank Japan in 2003 (http://biobankjp.org/). Biobank Japan collected approximately 330 000 disease cases (200 000 individuals) with one or more of 47 different diseases, including cancers, from a collaborative network of 66 hospitals throughout Japan, with the major aim of identifying genetic variants associated with susceptibility to complex diseases or with drug toxicity. Using samples from Biobank Japan, a significant number of insightful findings on common genetic variants associated with complex diseases, including cancer, have been published in recent years.[10-19] With a reasonable number of samples, it is also feasible to carry out pharmacogenomics studies on chemotherapy-induced toxicity.
Neutropenia and leucopenia are two of the most common adverse drug events after treatment with chemotherapeutic agents; they often cause life-threatening infections and delays in the treatment schedule that subsequently affect the treatment outcome. Although prophylactic granulocyte colony-stimulating factor has been given to patients as a preventive measure, the underlying mechanism and the risk factors that predispose patients to neutropenia have not been fully elucidated. In this study, we carried out a total of 17 sets of GWAS using 13 122 cancer patients, who received various drug regimens, to identify genetic variants associated with the risk of chemotherapeutic agent-induced severe neutropenia/leucopenia in the Japanese population.
Subjects and Method
A total of 13 122 DNA samples from cancer patients who received various chemotherapeutic agents, stored in Biobank Japan (University of Tokyo, Tokyo, Japan), were used in this study. Among them, 805 patients developed severe neutropenia and/or leucopenia (≥grade 3), and 4804 patients were not reported to develop any adverse reactions after being given chemotherapeutic agents. The samples were classified into subgroups according to the drugs used: an alkylating agent (cyclophosphamide); platinum-based (cisplatin and carboplatin); anthracycline-based (doxorubicin and epirubicin); antimetabolite-based (5-fluorouracil and gemcitabine); antimicrotubule-based (paclitaxel and docetaxel); and topoisomerase inhibitor-based (camptothecin and etoposide). The grade of toxicity was classified in accordance with the US National Cancer Institute's Common Toxicity Criteria version 2.0. Adverse events were identified from the medical records collected by the medical coordinator. The patients' demographic details are summarized in Table 1. Participants in this study provided written informed consent, and this project was approved by the ethical committees of the Institute of Medical Sciences, University of Tokyo and the RIKEN Center for Genomic Medicine (Yokohama, Japan).
Table 1. Demographic details of cancer patients treated with chemotherapeutic agents, whose DNA samples are stored in Biobank Japan (The University of Tokyo, Tokyo, Japan)
Individuals who did not develop any adverse drug reactions after chemotherapy.
[Table 1 body not recovered. Surviving row labels: Age, years (mean); Paclitaxel + carboplatin.]
Genotyping and quality controls
DNA obtained from the patients' blood was genotyped using the Illumina OmniExpress BeadChip (San Diego, CA, USA), which contains 733 202 SNPs. Sample quality control included identity-by-state analysis to evaluate cryptic relatedness for each sample, and principal component analysis to assess population stratification and exclude genetically heterogeneous samples from further analysis.[21, 22] Our standard SNP quality control was then carried out by excluding SNPs deviating from Hardy–Weinberg equilibrium (P ≤ 1.0 × 10⁻⁶), non-polymorphic SNPs, SNPs with a call rate of <0.99, and those on the X chromosome.[21, 22] Q–Q plots and lambda values, calculated by comparing the observed P-values from Fisher's exact test under the allelic model against the expected P-values, were used to further evaluate population substructure.
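The lambda value mentioned above is commonly computed as a genomic inflation factor: the median of the observed test statistics divided by the median expected under the null. The following is a minimal sketch of that calculation, assuming each P-value is converted to an equivalent chi-square statistic with 1 degree of freedom; the function name `genomic_inflation` is hypothetical, and the exact procedure used in this study is not specified in the text.

```python
from statistics import NormalDist, median

# Median of the chi-square distribution with 1 degree of freedom.
# If Z is standard normal, P(Z**2 <= m) = 0.5 when m = (0.75 quantile of Z)**2.
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2  # about 0.455


def genomic_inflation(p_values):
    """Return lambda = median(observed chi-square) / median(expected chi-square).

    Assumption: each two-sided P-value is mapped to a 1-df chi-square
    statistic via chi2 = z**2, where z is the p/2 quantile of the
    standard normal distribution.
    """
    norm = NormalDist()
    stats = [norm.inv_cdf(p / 2.0) ** 2 for p in p_values if 0.0 < p <= 1.0]
    return median(stats) / CHI2_1DF_MEDIAN


# Uniformly spread P-values (no inflation) give lambda close to 1.
n = 1001
uniform_p = [(i - 0.5) / n for i in range(1, n + 1)]
print(round(genomic_inflation(uniform_p), 3))
```

Values close to 1 suggest little inflation of the test statistics; values well above 1 would point to residual population stratification or cryptic relatedness.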
Genome-wide case–control association analyses were evaluated using Fisher's exact method considering allelic, dominant, and recessive genetic models. Manhattan plots of the study were generated using the minimum P-value among the three genetic models for each SNP.
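The per-SNP test described above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' pipeline: genotype counts are given as hypothetical tuples `(n_RR, n_Rr, n_rr)` with `R` the risk allele, the test is taken to be two-sided, and the function names are invented for this example.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one
    # (small relative tolerance guards against floating-point ties).
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def min_p_three_models(case_gt, ctrl_gt):
    """Test allelic, dominant, and recessive models; return the best one.

    case_gt and ctrl_gt are (n_RR, n_Rr, n_rr) genotype counts.
    """
    (c2, c1, c0), (k2, k1, k0) = case_gt, ctrl_gt
    tables = {
        # risk vs non-risk allele counts
        "allelic": (2 * c2 + c1, 2 * c0 + c1, 2 * k2 + k1, 2 * k0 + k1),
        # carriers of at least one risk allele vs non-carriers
        "dominant": (c2 + c1, c0, k2 + k1, k0),
        # risk-allele homozygotes vs everyone else
        "recessive": (c2, c1 + c0, k2, k1 + k0),
    }
    p_values = {model: fisher_exact_two_sided(*t) for model, t in tables.items()}
    best = min(p_values, key=p_values.get)
    return best, p_values[best], p_values


# Hypothetical genotype counts for one SNP.
best_model, p_min, all_p = min_p_three_models((30, 15, 5), (10, 20, 20))
print(best_model, p_min)
```

Taking the minimum over three models increases the number of tests per SNP, which is one reason the resulting Pmin values are best read as a ranking device rather than as calibrated P-values.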
Scoring system using wGRS
The scoring analysis was carried out using SNPs with Pmin of <1.0 × 10⁻⁵ after exclusion of SNPs in strong linkage disequilibrium (r² > 0.8) in each GWAS. The wGRS were calculated according to De Jager et al. Briefly, we first calculated the weight of each SNP as the natural log of the odds ratio for each allele/genotype, considering the associated genetic model. For an additive model, we assigned a score of 2 to an individual with two risk alleles, 1 to an individual with one risk allele, and 0 to an individual with no risk allele. For a dominant model, we assigned a score of 1 to an individual with one or two risk alleles, and 0 to an individual with no risk allele. For a recessive model, we assigned a score of 1 to an individual with two risk alleles, and 0 to an individual with no or one risk allele. The cumulative genetic risk score was then determined by multiplying the score of each SNP by its corresponding weight and summing across all SNPs considered in each GWAS set. We classified the genetic risk scores into four groups defined by the mean and SD: group 1, <mean − 1SD; group 2, mean − 1SD to mean; group 3, mean to mean + 1SD; and group 4, >mean + 1SD. Odds ratio (OR), 95% confidence interval (CI), P-value, sensitivity, and specificity were calculated using group 1 as a reference. To calculate the OR when one of the cells in the contingency table is zero, we applied the Haldane correction, which avoids a calculation error by adding 0.5 to all of the cells of the contingency table.
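The scoring steps described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names are hypothetical, the use of the sample SD and the handling of scores that fall exactly on a group boundary are assumptions (the text does not specify them), and the ORs in the usage example are made up.

```python
from math import log
from statistics import mean, stdev


def genotype_score(n_risk_alleles, model):
    """Map a risk-allele count (0, 1, or 2) to the score for a genetic model."""
    if model == "additive":
        return n_risk_alleles
    if model == "dominant":
        return 1 if n_risk_alleles >= 1 else 0
    if model == "recessive":
        return 1 if n_risk_alleles == 2 else 0
    raise ValueError(f"unknown model: {model}")


def wgrs(genotypes, snps):
    """Weighted genetic risk score for one individual.

    genotypes: risk-allele count per SNP.
    snps: (odds_ratio, model) per SNP; the weight is ln(odds_ratio).
    """
    return sum(log(odds_ratio) * genotype_score(g, model)
               for g, (odds_ratio, model) in zip(genotypes, snps))


def assign_groups(scores):
    """Split scores into groups 1-4 at mean - SD, mean, and mean + SD.

    Assumption: sample SD; boundary values go to the group shown below.
    """
    m, sd = mean(scores), stdev(scores)

    def group(score):
        if score < m - sd:
            return 1
        if score < m:
            return 2
        if score <= m + sd:
            return 3
        return 4

    return [group(s) for s in scores]


def haldane_odds_ratio(a, b, c, d):
    """OR for the table [[a, b], [c, d]]; add 0.5 to every cell if any is zero."""
    if 0 in (a, b, c, d):
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    return (a * d) / (b * c)


# Hypothetical SNP panel: (odds ratio, associated model).
panel = [(2.0, "additive"), (3.0, "dominant"), (4.0, "recessive")]
print(wgrs([2, 1, 1], panel))  # 2*ln(2) + ln(3) + 0
```

Using ln(OR) as the weight means that a SNP with a larger estimated effect contributes more to the score than one with a weaker effect, rather than each risk allele counting equally.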
After subdividing the patients by administered drugs/major drug subgroups, as described above, a total of 17 GWAS analyses were carried out by comparing the allele/genotype frequencies between the patients who had developed severe neutropenia/leucopenia (grade 3/4) and those who had not developed any adverse drug reactions. The Q–Q plot of each GWAS and the calculated lambda values of below 1.00 indicated no significant population stratification in any of these GWAS analyses (Fig. S1). Although we could not identify any SNPs that surpassed the genome-wide significance threshold (P-value < 5 × 10⁻⁸) for association with the risk of neutropenia/leucopenia induced by a particular drug or regimen, several possible candidate loci were identified. The results of the GWAS are summarized in Table 2, Table S1, and Figure S2; the results of wGRS are summarized in Table S2.
Table 2. Association analysis of single nucleotide polymorphisms (SNPs) with different chemotherapeutic drugs/drug subgroups known to induce severe neutropenia/leucopenia
SNPs used for weighted genetic risk score analyses. BP, SNP genomic location; CHR, chromosome; inf, infinity; L95, lower 95% confidence interval; N/A, not applicable; NRA, non-risk allele; OR, odds ratio; P_allelic, P-value from allelic model; P_dom, P-value from dominant model; P_min, minimum P-value among the three models; P_rec, P-value from recessive model; RA, risk allele; RAF, risk allele frequency; rel.loci, distance of the SNP from the gene; U95, upper 95% confidence interval.
Among these datasets, GWAS were carried out using samples from patients who were given: (i) any kind of platinum-based chemotherapy (428 cases vs 743 controls); (ii) cisplatin-based chemotherapy (176 cases vs 471 controls); or (iii) carboplatin-based chemotherapy (261 cases vs 262 controls). The SNPs showing the most significant association with chemotherapy-induced severe neutropenia/leucopenia were: rs4886670 (Pmin = 9.86 × 10⁻⁷, OR = 1.61, 95% CI = 1.33–1.94) near RPL36AP45 for (i); rs10253216 (Pmin = 1.68 × 10⁻⁷, OR = 1.48, 95% CI = 1.16–1.89) near AGR2 for (ii); and rs11071200 (Pmin = 8.51 × 10⁻⁷, OR = 8.24, 95% CI = 2.89–23.5) in PRTG for (iii) (Table 2, Table S1, Fig. S2b). For the anthracycline-based regimens, we carried out GWAS with individuals given all anthracycline-based (184 cases vs 459 controls), doxorubicin-based (83 cases vs 66 controls), and epirubicin-based (83 cases vs 370 controls) chemotherapy, and identified three SNPs, rs10040979 (Pmin = 4.60 × 10⁻⁷, OR = 1.45, 95% CI = 1.12–1.88) in EBF1, rs11857176 (Pmin = 8.08 × 10⁻⁷, OR = 1.80, 95% CI = 1.13–2.87) near a hypothetical gene LOC100302666, and rs4149639 (Pmin = 2.89 × 10⁻⁷, OR = 4.44, 95% CI = 2.57–7.68) in TNFRSF1A, as most significantly associated with the risk of high-grade neutropenia/leucopenia, respectively (Table 2, Table S1, Fig. S2c). In the case of antimicrotubule agents, we carried out three different GWAS with individuals who were treated with antimicrotubule (371 cases vs 825 controls), paclitaxel-based (218 cases vs 364 controls), or docetaxel-based (147 cases vs 233 controls) regimens. We identified three SNPs, rs11651483 (Pmin = 3.37 × 10⁻⁷, OR = 1.36, 95% CI = 1.12–1.64) in RICH2, rs922106 (Pmin = 9.28 × 10⁻⁷, OR = 1.68, 95% CI = 1.28–2.21) in LRRC8B, and rs3747851 (Pmin = 5.61 × 10⁻⁷, OR = 2.38, 95% CI = 1.69–3.34) in DAB2IP, as those most significantly associated with an increased risk of severe neutropenia/leucopenia, respectively (Table 2, Table S1, Fig. S2e).
Our previous report by Kiyotani et al. identified four SNPs associated with gemcitabine-induced hematological toxicities. Three of the four SNPs were evaluated in the current study and showed suggestive association: rs12046844 (Pmin = 5.84 × 10⁻⁴, OR = 2.53, 95% CI = 1.45–4.43), rs6430443 (Pmin = 8.61 × 10⁻⁴, OR = 6.33, 95% CI = 1.90–22.2; r² = 0.895 with rs1901440), and rs11719165 (Pmin = 1.16 × 10⁻², OR = 2.36, 95% CI = 1.18–4.70) (Table S4). However, it should be noted that some of the samples used in this study overlapped with those in the study reported by Kiyotani et al., as both sourced samples from Biobank Japan.
Lastly, we also attempted to identify genetic variants associated with severe neutropenia/leucopenia induced by combined treatment with paclitaxel and carboplatin (150 cases vs 166 controls), as this combined treatment is commonly used as the standard therapy for both ovarian and lung cancers. We found the most significant association with the SNP rs12310399 (Pmin = 2.46 × 10⁻⁷, OR = 1.85, 95% CI = 1.33–2.58) near the FGD6 gene (Table 2, Table S1, Fig. S2a); FGD6 is suggested to activate CDC42, a member of the Ras-like family of Rho and Rac proteins, and has a critical role in regulating the actin cytoskeleton. The second strongest association was observed at the locus encoding RXRA (Pmin = 7.38 × 10⁻⁷, OR = 2.58, 95% CI = 1.77–3.77), an important transcription factor. We also calculated the cumulative genetic scores using SNPs at six loci and found that individuals in group 4 could have 188-fold (95% CI = 36.1–979) higher odds of developing severe neutropenia/leucopenia than those in group 1, with a sensitivity of 95.9% and a specificity of 88.9% (Table S2). Because this drug combination is of clinical importance, we further investigated the association of these six selected loci in 161 individuals registered in Biobank Japan who developed grade 1/2 neutropenia/leucopenia. Interestingly, the association results for the six loci were moderate for grade 1/2 neutropenia/leucopenia, with allele frequencies and ORs intermediate between those of individuals without any adverse reactions and those with neutropenia/leucopenia of ≥grade 3 (Table S3). In addition, as shown in Table 3 and Figure 1, the higher the calculated score, the higher the proportion and grade of neutropenia/leucopenia. The intermediate scores for patients with grade 1/2 neutropenia/leucopenia could imply that this scoring system is useful for prediction.
Table 3. Weighted genetic risk score (wGRS) analysis of cancer patients who received combination treatment with paclitaxel and carboplatin
[Table 3 body not recovered. Surviving comparison headers: G3/4 versus G0; G1/2 versus G0.]
95%_CI, 95% confidence interval; G0, individuals who did not develop any adverse drug reaction; G1G2, grade 1 and grade 2 neutropenia (mild); G3G4, grade 3 and grade 4 neutropenia (severe); OR, odds ratio; REF, reference.
Furthermore, we used a simulation to estimate how many samples are required to validate this scoring result. We first estimated the incidence of neutropenia/leucopenia following combined treatment with paclitaxel and carboplatin. In Biobank Japan, a total of 477 individuals received this combined treatment; among them, 166 individuals (approximately 35%) did not develop any adverse drug reactions, 161 (approximately 35%) developed mild neutropenia/leucopenia (grade 1 or 2), and 150 (approximately 30%) developed severe neutropenia/leucopenia (grade 3 or higher). The frequency of severe neutropenia/leucopenia is in agreement with a multicenter study reported by Guastalla et al. Assuming that 100 patients who receive this combination therapy are prospectively registered, the incidences of the adverse drug reactions are estimated as shown in Table 4. If we categorize the patients by wGRS according to the proportions indicated in Table 3 (and our hypothesis is right), this small set of patients should provide enough statistical power for validation. Even if two individuals in each of group 1 and group 4 are incorrectly predicted, the calculated P-value is still 0.03 by Fisher's exact test.
Table 4. Simulation of weighted genetic risk score (wGRS) analysis for a prospective study of 100 patients who received combination treatment with paclitaxel and carboplatin
Estimated verification samples (n = 100; 35 expected to have grade 1/2 neutropenia)
95%_CI, 95% confidence interval; G0, individuals without any adverse drug reaction; G3G4, grade 3 and 4 neutropenia (severe); OR, odds ratio.
In this study, we carried out GWAS analyses for a total of 17 subsets of chemotherapies to identify genetic variants that might be associated with chemotherapy-induced neutropenia/leucopenia of grades 3 and 4; however, we could not identify any SNPs that surpassed the genome-wide significance threshold (P-value < 5 × 10⁻⁸). Through this study, we encountered several important issues that are now common problems in pharmacogenomics studies using retrospective clinical data, including confounding factors and heterogeneous treatments for individual patients (who are often given different combinations of drugs, different dosages, and different treatment periods). These issues increase the complexity of studies, generate noise in the analyses, and diminish the statistical power of case–control association studies. We understand that our current approach was not an ideal study design, but it is not easy to perfectly standardize therapy in the daily clinical practice of cancer treatment. Several factors contribute to the variability in treatments: (i) doctors or hospitals have preferences for a particular regimen among the various recommended standard treatments; (ii) the dosage or schedule is modified (adjusted) according to the patient's condition (performance status, results of laboratory tests, etc.); and (iii) although we have been collecting clinical information, it is difficult to collect complete clinical information in some hospitals, particularly those that do not use electronic medical records. One could argue that this kind of study should be performed with a prospective design; however, due to the very rapid advances in the development of novel molecular-targeted drugs and new regimens in oncology, the protocols have been and will be modified or improved.
Hence, spending many years and a huge budget on a prospective study may result in a clinically useless outcome, because by the time the results of the association studies are available, the study protocol may have been replaced with a new one and the results can no longer be applied. Nevertheless, retrospective pharmacogenomic studies could be improved by implementing electronic medical record systems that include detailed descriptions of patients' conditions and their responses to various drugs.
Although we understand the pitfalls of study designs like that of our present study, we need to seek possible ways to identify candidate genetic variants that might contribute to improvement in the clinical management of cancer patients, including the management of chemotherapy-induced severe neutropenia/leucopenia. Some of the candidate genes that we identified are of interest, considering their known functions as well as their relations to drug actions. For example, the proto-oncogene AGR2, whose genetic variants showed suggestive association with cisplatin-induced neutropenia/leucopenia, encodes an anterior gradient 2 homolog (Xenopus laevis) that is known to play a critical role in cell migration, cell differentiation, and cell growth. Cells stably expressing AGR2 showed resistance to cisplatin in vivo, compared with control cells (empty vector), in a xenograft animal model. The second example is TNFRSF1A, which showed suggestive association with anthracycline-based and epirubicin-induced neutropenia/leucopenia. This gene encodes TNFRSF1A, a major receptor for TNF-α. The soluble TNFRSF1A level was found to be elevated after 1 month of anthracycline-based chemotherapy. Additionally, both TNF-α and TNFRSF1A are known to play a critical role in doxorubicin-induced cardiotoxicity, in which doxorubicin stimulates an increase in circulating TNF and upregulates TNFRSF1A.[29, 30] Furthermore, genetic variants in PDE4D, which encodes cAMP-specific phosphodiesterase 4D, showed suggestive association with gemcitabine-induced severe neutropenia/leucopenia. Ablation of PDE4D has been reported to impair neutrophil function, with altered chemotaxis and adhesion capability, and to reduce neutrophil recruitment to the site of inflammation. In addition, RXRA, genetic variants of which showed suggestive association with severe neutropenia/leucopenia induced by combined treatment with paclitaxel and carboplatin, encodes retinoid X receptor alpha.
Disruption of this gene in mouse models moderately alters lymphocyte proliferation and survival, and affects the T helper type 1/type 2 balance. All of these genes might provide important insights into the mechanisms of chemotherapy-induced severe neutropenia/leucopenia; however, further validation is essential.
As already described, the GWAS approach can provide a list of genetic variants that might be associated with complex phenotypes (drug responsiveness or drug-induced adverse reactions) in pharmacogenomics studies. One clinically important aim of identifying the associated genetic variants is to establish a prediction model that identifies individuals who are at risk of adverse reactions to certain drugs or protocols. In this study, we applied the wGRS system, by which we could distinguish high-risk patients from low-risk individuals by counting the number of risk alleles of the suggestively associated SNPs and weighting them by the estimated effect size of each SNP. One of the best examples from this study was a scoring system using six candidate SNP loci that were identified through the GWAS of severe neutropenia/leucopenia caused by combination treatment with paclitaxel and carboplatin; among 53 individuals in the high-risk group (group 4) by this scoring method, 47 (89%) developed high-grade neutropenia/leucopenia. In contrast, among 50 individuals in the low-risk group (group 1), only 2 (4%) developed high-grade neutropenia/leucopenia, and the odds of the severe adverse reaction in individuals belonging to group 4 were calculated to be 188 times higher than in those categorized in group 1 (Table 3). Interestingly, individuals who developed grade 1/2 (mild) neutropenia/leucopenia were found to show intermediate risk scores between patients with severe neutropenia/leucopenia and those without any adverse reactions. Hence, we suggest that wGRS is an applicable method to evaluate the clinical utility of possible variants with specific phenotypes. However, the data are preliminary and require verification in independent test samples before any definitive conclusions can be drawn.
Nevertheless, considering that the OR of the high-risk group is very high, the number of samples required for the verification (if our hypothesis is right) is not so large. In fact, we simulated a prospective study design using a model of 100 patients under the assumption that 35% of individuals will not develop any adverse drug reactions, 35% will develop mild neutropenia/leucopenia (grade 1/2), and 30% will develop severe neutropenia/leucopenia (grade 3/4). As shown in Table 4, a study of 100 patients should have very strong statistical power for verification. If this is verified, as we expect, it should improve the quality of life of cancer patients and also contribute to reducing medical care costs by avoiding unnecessary adverse events. To achieve success in pharmacogenomics and personalized medicine, however, both local and international collaborative efforts are essential.
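The group 4 versus group 1 comparison for paclitaxel plus carboplatin can be checked by hand from the counts given in the text (47 of 53 individuals in group 4 and 2 of 50 in group 1 with high-grade neutropenia/leucopenia). The sketch below assumes that the remaining 6 and 48 individuals are those without any adverse reaction, and that the confidence interval is a standard log-scale (Woolf) interval; neither assumption is stated explicitly in the text, and the function name is hypothetical.

```python
from math import exp, log, sqrt


def odds_ratio_summary(case_high, ctrl_high, case_low, ctrl_low, z=1.96):
    """OR with a log-scale (Woolf) 95% CI, plus sensitivity and specificity.

    case_high / ctrl_high: cases and controls in the high-risk group.
    case_low / ctrl_low: cases and controls in the reference (low-risk) group.
    """
    odds_ratio = (case_high * ctrl_low) / (ctrl_high * case_low)
    se = sqrt(1 / case_high + 1 / ctrl_high + 1 / case_low + 1 / ctrl_low)
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    sensitivity = case_high / (case_high + case_low)
    specificity = ctrl_low / (ctrl_low + ctrl_high)
    return odds_ratio, lower, upper, sensitivity, specificity


# Group 4: 47 cases, 6 controls; group 1: 2 cases, 48 controls.
or_, lo, hi, sens, spec = odds_ratio_summary(47, 6, 2, 48)
print(f"OR = {or_:.0f}, 95% CI = {lo:.1f}-{hi:.0f}")
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

Under these assumptions the calculation gives an OR of 188 with a 95% CI of 36.1 to 979, a sensitivity of 95.9%, and a specificity of 88.9%, which matches the values reported for this comparison. Note that sensitivity and specificity here are computed only over the two extreme groups, so they do not describe performance for individuals in groups 2 and 3.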
We would like to express our heartfelt gratitude to all the patients who participated in this study. We convey our sincere appreciation to Dr Teruhiko Yoshida and Dr Hiromi Sakamoto from the National Cancer Center Research Institute for their kind support. Our thanks also go to the members of the laboratory for statistical analysis and the laboratory for genotyping development at the Center for Genomic Medicine for their kind support and fruitful discussions. We would like to extend our gratitude to the staff of Biobank Japan for their outstanding assistance. This work was carried out as part of the Biobank Japan Project, supported by the Ministry of Education, Culture, Sports, Science and Technology, Japan. In addition, this project was supported by the JSPS postdoctoral fellowship.
The authors have no conflict of interest.
GWAS, genome-wide association study
SNP, single nucleotide polymorphism
TNFRSF1A, tumor necrosis factor receptor superfamily, member 1A