• Open Access

Genome-wide association study of chemotherapeutic agent-induced severe neutropenia/leucopenia for patients in Biobank Japan


  • Siew-Kee Low,

    1. Laboratory for Statistical Analysis, Center for Genomic Medicine, RIKEN, Yokohama, Japan
    2. Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo, Tokyo, Japan
  • Suyoun Chung,

    1. Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo, Tokyo, Japan
    2. Department of Medicine, The University of Chicago, Chicago, Illinois, USA
  • Atsushi Takahashi,

    1. Laboratory for Statistical Analysis, Center for Genomic Medicine, RIKEN, Yokohama, Japan
  • Hitoshi Zembutsu,

    1. Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo, Tokyo, Japan
  • Taisei Mushiroda,

    1. Laboratory for Pharmacogenetics, Center for Genomic Medicine, RIKEN, Yokohama, Japan
  • Michiaki Kubo,

    1. Laboratory for Genotyping Development, Center for Genomic Medicine, RIKEN, Yokohama, Japan
  • Yusuke Nakamura

    Corresponding author
    1. Laboratory of Molecular Medicine, Human Genome Center, Institute of Medical Science, The University of Tokyo, Tokyo, Japan
    2. Department of Medicine, The University of Chicago, Chicago, Illinois, USA
    3. Laboratory for Statistical Analysis, Center for Genomic Medicine, RIKEN, Yokohama, Japan

To whom correspondence should be addressed.

E-mails: yusuke@ims.u-tokyo.ac.jp; ynakamura@bsd.uchicago.edu


Chemotherapeutic agents are known to have a narrow therapeutic range that often results in life-threatening toxicity. Hence, it is clinically important to use a pharmacogenomics approach to identify patients who are at high risk of severe toxicity from a given chemotherapy. In this study, we carried out multiple genome-wide association studies (GWAS) of 13 122 cancer patients who received different chemotherapy regimens, including cyclophosphamide- and platinum-based (cisplatin and carboplatin), anthracycline-based (doxorubicin and epirubicin), and antimetabolite-based (5-fluorouracil and gemcitabine) treatment, antimicrotubule agents (paclitaxel and docetaxel), and topoisomerase inhibitors (camptothecin and etoposide), as well as combination therapy with paclitaxel and carboplatin, to identify genetic variants that are associated with the risk of severe neutropenia/leucopenia in the Japanese population. In addition, we used a weighted genetic risk scoring system to evaluate the cumulative effects of the suggestive genetic variants identified from GWAS in order to predict the risk levels of individuals who carry multiple risk alleles. Although we failed to identify genetic variants that surpassed the genome-wide significance level (P < 5.0 × 10⁻⁸) through GWAS, probably due to insufficient statistical power and complex clinical features, we were able to shortlist some suggestively associated loci. The current study is at a relatively preliminary stage, but it does highlight the complexity and problematic issues associated with retrospective pharmacogenomics studies. Nevertheless, we hope that verification of these genetic variants through local and international collaborations could improve the clinical outcome for cancer patients.

It is now widely recognized that the efficacy and toxicity of a medication can differ markedly among individuals. These inter-individual differences could be explained in part by common and/or rare genetic variants in the human genome. Pharmacogenomics aims to discover how genetic variations in the human genome can affect a drug's efficacy or toxicity, and thus brings great promise for personalized medicine in which genetic information can be used to predict the safety, toxicity, and/or efficacy of drugs.[1] Pharmacogenomics studies of chemotherapeutic agents are particularly important because these drugs are known to have a narrow therapeutic window; in general, a higher concentration causes toxicity and a lower concentration reduces the efficacy of the drug. Two well-described examples are the association of genetic variants in TPMT with 6-mercaptopurine-induced myelosuppression in the treatment of pediatric acute lymphoblastic leukemia and that of UGT1A1 variants with camptothecin-related neutropenia and diarrhea in the treatment of colorectal and lung cancers. The US Food and Drug Administration has recommended that variants in these two genes be considered for the prediction of severe adverse reactions prior to use of the drugs.[2-7]

With advances in various technologies in the life sciences, it is now possible to accurately genotype more than a million common genetic variations using genome-wide high-density SNP arrays, or to characterize all genetic variants in the genome using next-generation DNA sequencing methods. Although one of the greatest drawbacks of GWAS is the requirement for a large number of samples to achieve high statistical power,[8] this issue could be overcome by the establishment of Biobank Japan in 2003 (http://biobankjp.org/).[9] Biobank Japan collected approximately 330 000 disease cases (200 000 individuals) with one or more of 47 different diseases, including cancers, from a collaborative network of 66 hospitals throughout Japan, with the major aim of identifying genetic variants associated with susceptibility to complex diseases or with drug toxicity. Using samples from Biobank Japan, a significant number of insightful findings on common genetic variants associated with complex diseases, including cancer, have been published in recent years.[10-19] With a reasonable number of samples, it is also feasible to carry out pharmacogenomics studies on chemotherapy-induced toxicity.

Neutropenia and leucopenia are two of the most common adverse drug events after treatment with chemotherapeutic agents; they often cause life-threatening infections and delays in the treatment schedule that subsequently affect the treatment outcome. Although prophylactic granulocyte colony-stimulating factor has been given to patients as a preventive measure,[20] the underlying mechanism and the risk factors that cause neutropenia have not been fully elucidated. In this study, we carried out a total of 17 sets of GWAS using 13 122 cancer patients, who received various drug regimens, to identify genetic variants associated with the risk of chemotherapeutic agent-induced severe neutropenia/leucopenia in the Japanese population.

Subjects and Methods

Study subjects

A total of 13 122 DNA samples from cancer patients who received various chemotherapeutic agents, stored in Biobank Japan (University of Tokyo, Tokyo, Japan), were used in this study. Among them, 805 patients developed severe neutropenia and/or leucopenia (≥grade 3), and 4804 patients were not reported to develop any adverse reactions after being given chemotherapeutic agents. The samples were classified into subgroups according to the drugs used: an alkylating agent (cyclophosphamide); platinum-based (cisplatin and carboplatin); anthracycline-based (doxorubicin and epirubicin); antimetabolite-based (5-fluorouracil and gemcitabine); antimicrotubule-based (paclitaxel and docetaxel); and topoisomerase inhibitor-based (camptothecin and etoposide). The grade of toxicity was classified in accordance with the US National Cancer Institute's Common Toxicity Criteria version 2.0. The adverse event description is based on the medical records collected by the medical coordinator. The patients' demographic details are summarized in Table 1. Participants in this study provided written informed consent, and this project was approved by the ethics committees of the Institute of Medical Science, The University of Tokyo and the RIKEN Center for Genomic Medicine (Yokohama, Japan).

Table 1. Demographic details of cancer patients treated with chemotherapeutic agents, whose DNA samples are stored in Biobank Japan (The University of Tokyo, Tokyo, Japan)
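| Category | Controlsᵃ | Grade 1/2 | Grade 3/4 | Category | Controlsᵃ | Grade 1/2 | Grade 3/4 |
|---|---|---|---|---|---|---|---|
| All | 4804 | 1253 | 805 | Drug subtype | | | |
| Age, years (mean) | 62.9 | 58.7 | 59.6 | Alkylating agent | 346 | 266 | 176 |
| Cancer subtype | | | | Carboplatin | 262 | 207 | 261 |
| Lung cancer | 587 | 259 | 266 | Anthracycline | 459 | 240 | 184 |
| Breast cancer | 876 | 388 | 204 | Doxorubicin | 66 | 85 | 83 |
| Ovarian cancer | 140 | 124 | 74 | Epirubicin | 370 | 132 | 83 |
| Gastric cancer | 827 | 100 | 56 | Antimetabolite | 2249 | 512 | 294 |
| Esophageal cancer | 208 | 65 | 53 | 5-Fluorouracil | 952 | 331 | 177 |
| Colorectal cancer | 1573 | 161 | 50 | Gemcitabine | 226 | 111 | 80 |
| Endometrial cancer | 78 | 72 | 45 | Antimicrotubule agent | 825 | 468 | 371 |
| Cervical cancer | 129 | 57 | 35 | Paclitaxel | 364 | 321 | 218 |
| Prostate cancer | 91 | 13 | 21 | Docetaxel | 233 | 143 | 147 |
| Pancreatic cancer | 83 | 36 | 20 | Topoisomerase inhibitor | 187 | 123 | 106 |
| Liver cancer | 366 | 16 | 9 | Camptothecin | 155 | 106 | 59 |
| Gallbladder cancer | 56 | 9 | 1 | Etoposide | 39 | 19 | 54 |
| | | | | Paclitaxel + carboplatin | 166 | 161 | 150 |

ᵃ Individuals who did not develop any adverse drug reactions after chemotherapy.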

Genotyping and quality controls

DNA obtained from the patients' blood was genotyped using the Illumina OmniExpress BeadChip (San Diego, CA, USA), which contains 733 202 SNPs. Sample quality control included identity-by-state analysis to evaluate cryptic relatedness for each sample and principal component analysis to assess population stratification, so that genetically heterogeneous samples could be excluded from further analysis.[21, 22] Our standard SNP quality control was then carried out by excluding SNPs deviating from Hardy–Weinberg equilibrium (P ≤ 1.0 × 10⁻⁶), non-polymorphic SNPs, SNPs with a call rate of <0.99, and those on the X chromosome.[21, 22] Q–Q plots and lambda values, calculated by comparing the observed P-values from Fisher's exact test under the allelic model with the expected P-values, were used to further evaluate population substructure.
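
As an illustration of the lambda value mentioned above, the following minimal sketch computes a genomic inflation factor from a list of P-values. It assumes the common definition (median observed 1-df chi-square statistic divided by its expected median) and converts each two-sided P-value to a chi-square statistic; the text does not spell out the exact formula, and the function name `genomic_inflation` and the input P-values are hypothetical.

```python
from statistics import NormalDist, median

def genomic_inflation(p_values):
    """Return lambda = median observed 1-df chi-square / its expected median."""
    nd = NormalDist()
    # Expected median of a 1-df chi-square: square of the 75th-percentile z-score.
    expected_median = nd.inv_cdf(0.75) ** 2
    chi2 = []
    for p in p_values:
        p = min(max(p, 1e-300), 1.0)   # keep p/2 strictly inside (0, 1)
        z = nd.inv_cdf(p / 2.0)        # two-sided P-value -> z-score (sign is irrelevant)
        chi2.append(z * z)
    return median(chi2) / expected_median

# Hypothetical, perfectly uniform P-values: no inflation, so lambda is close to 1.
uniform_p = [(i + 0.5) / 1001 for i in range(1001)]
lam = genomic_inflation(uniform_p)
```

Because Fisher's exact test is conservative, P-values from it tend to give lambda values slightly below 1 even without stratification, which would be consistent with values below 1.00.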

Statistical analysis

Genome-wide case–control association analyses were carried out using Fisher's exact test under allelic, dominant, and recessive genetic models. Manhattan plots were generated using the minimum P-value among the three genetic models for each SNP.
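
The three genetic models can be sketched for a single SNP as follows. This is a minimal standard-library illustration, not the software used in the study: the genotype counts are hypothetical, the helper names are invented for the example, and the two-sided P-value sums the probabilities of all tables no more likely than the observed one (one common convention).

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are at least as extreme as the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)

def model_tables(case_gt, ctrl_gt):
    """Build 2x2 tables from genotype counts (RR, Rr, rr); R is the risk allele."""
    c2, c1, c0 = case_gt
    k2, k1, k0 = ctrl_gt
    return {
        "allelic": (2 * c2 + c1, 2 * c0 + c1, 2 * k2 + k1, 2 * k0 + k1),
        "dominant": (c2 + c1, c0, k2 + k1, k0),
        "recessive": (c2, c1 + c0, k2, k1 + k0),
    }

def min_p(case_gt, ctrl_gt):
    """Return the best-fitting model name and the P-values of all three models."""
    pvals = {m: fisher_exact_two_sided(*t) for m, t in model_tables(case_gt, ctrl_gt).items()}
    return min(pvals, key=pvals.get), pvals

# Hypothetical genotype counts (RR, Rr, rr) for cases and controls.
best_model, pvals = min_p((10, 20, 30), (5, 15, 40))
```

The minimum of the three P-values corresponds to the Pmin used for the Manhattan plots and for SNP selection.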

Scoring system using wGRS

The scoring analysis was carried out using SNPs with Pmin of <1.0 × 10⁻⁵ after exclusion of SNPs in strong linkage disequilibrium (r² > 0.8) in each GWAS. The wGRS were calculated according to De Jager et al.[23] Briefly, we first calculated the weight of each SNP as the natural log of the odds ratio for each allele/genotype, considering the associated genetic model. For an additive model, we assigned a score of 2 to an individual with two risk alleles, 1 to an individual with one risk allele, and 0 to an individual with no risk allele. For a dominant model, we assigned a score of 1 to an individual with one or two risk alleles, and 0 to an individual with no risk allele. For a recessive model, we assigned a score of 1 to an individual with two risk alleles, and 0 to an individual with no or one risk allele. The cumulative genetic risk score was then determined by multiplying the score (number of risk alleles/genotype) for each SNP by its corresponding weight and summing across all SNPs considered in each GWAS set. We classified the genetic risk scores into four groups defined by the mean and SD: group 1, <mean − 1SD; group 2, mean − 1SD to mean; group 3, mean to mean + 1SD; and group 4, >mean + 1SD. Odds ratio, 95% confidence interval, P-value, sensitivity, and specificity were calculated using group 1 as a reference. To calculate the OR when one of the cells in the contingency table was zero, we applied the Haldane correction, which avoids an error in the calculation by adding 0.5 to all of the cells of the contingency table.
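
The scoring steps described above can be sketched as follows. This is a minimal illustration under stated assumptions: the SNP identifiers, odds ratios, and genotypes are hypothetical, the sample SD is used, and scores that fall exactly on a boundary are assigned to the lower group (the text does not specify either choice).

```python
from math import log
from statistics import mean, stdev

def genotype_score(n_risk_alleles, model):
    """Score for one SNP given the number of risk alleles (0, 1, or 2)."""
    if model == "additive":
        return n_risk_alleles
    if model == "dominant":
        return 1 if n_risk_alleles >= 1 else 0
    if model == "recessive":
        return 1 if n_risk_alleles == 2 else 0
    raise ValueError(f"unknown model: {model}")

def wgrs(genotypes, snps):
    """Weighted genetic risk score: sum of ln(OR) x genotype score over SNPs.

    genotypes: dict snp_id -> risk allele count; snps: dict snp_id -> (OR, model).
    """
    return sum(log(odds_ratio) * genotype_score(genotypes[snp], model)
               for snp, (odds_ratio, model) in snps.items())

def assign_groups(scores):
    """Assign groups 1-4 using cut-offs at mean - 1SD, mean, and mean + 1SD."""
    m, sd = mean(scores), stdev(scores)  # sample SD is an assumption
    groups = []
    for s in scores:
        if s < m - sd:
            groups.append(1)
        elif s <= m:
            groups.append(2)
        elif s <= m + sd:
            groups.append(3)
        else:
            groups.append(4)
    return groups

# Hypothetical SNP panel and one individual's risk allele counts.
snps = {"snpA": (2.0, "additive"), "snpB": (1.5, "dominant"), "snpC": (3.0, "recessive")}
score = wgrs({"snpA": 2, "snpB": 1, "snpC": 1}, snps)  # 2*ln(2) + ln(1.5) + 0
groups = assign_groups([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
```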


Results

After subdividing the patients by administered drug or major drug subgroup, as described above, a total of 17 GWAS analyses were carried out by comparing the allele/genotype frequencies between the patients who had developed severe neutropenia/leucopenia (grade 3/4) and those who had not developed any adverse drug reactions. The Q–Q plot of each GWAS and the calculated lambda values of below 1.00 indicated no significant population stratification in any of these GWAS analyses (Fig. S1). Although we could not identify any SNPs that surpassed the genome-wide significance threshold (P-value < 5 × 10⁻⁸) for association with the risk of neutropenia/leucopenia induced by a given type of drug or regimen, several possible candidate loci were identified. The results of the GWAS are summarized in Table 2, Table S1, and Figure S2; the results of wGRS are summarized in Table S2.

Table 2. Association analysis of single nucleotide polymorphisms (SNPs) with different chemotherapeutic drugs/drug subgroups known to induce severe neutropenia/leucopenia
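| CHR | SNP | BP | RA | NRA | RAF_Case | RAF_Ctrl | P_allelic | P_dom | P_rec | P_min | OR | L95 | U95 | Gene | rel.loci |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 16 | rs2519974ᵃ | 22889186 | T | C | 0.503 | 0.381 | 2.52E-04 | 4.35E-06 | 2.77E-01 | 4.35E-06 | 1.647 | 1.264 | 2.146 | HS3ST2 | 0 |
| 1 | rs10922438ᵃ | 198469162 | T | C | 0.214 | 0.106 | 6.01E-06 | 1.71E-05 | 7.10E-02 | 6.01E-06 | 2.301 | 1.608 | 3.293 | ATP6V1G3 | 23190 |
| 19 | rs3745571ᵃ | 6475613 | T | C | 0.778 | 0.670 | 4.05E-04 | 7.72E-06 | 1.00E+00 | 7.72E-06 | 1.730 | 1.276 | 2.345 | DENND1C | 0 |
| All platinum-based drugs | | | | | | | | | | | | | | | |
| 15 | rs4886670ᵃ | 75449674 | A | C | 0.320 | 0.227 | 9.86E-07 | 1.43E-05 | 8.14E-04 | 9.86E-07 | 1.605 | 1.330 | 1.937 | RPL36AP45 | 29318 |
| 19 | rs33428ᵃ | 30937843 | G | A | 0.481 | 0.403 | 2.71E-04 | 2.78E-06 | 3.11E-01 | 2.78E-06 | 1.375 | 1.160 | 1.629 | ZNF536 | 0 |
| 14 | rs12589282ᵃ | 22937656 | G | T | 0.535 | 0.437 | 6.30E-06 | 5.42E-03 | 4.11E-06 | 4.11E-06 | 1.480 | 1.250 | 1.752 | TRA@ | 0 |
| 3 | rs3845905ᵃ | 66525963 | G | A | 0.915 | 0.850 | 4.12E-06 | 7.45E-05 | 3.65E-04 | 4.12E-06 | 1.894 | 1.433 | 2.503 | LRIG1 | 0 |
| 5 | rs1895302ᵃ | 169542600 | C | T | 0.551 | 0.478 | 6.95E-04 | 7.41E-06 | 4.11E-01 | 7.41E-06 | 1.340 | 1.132 | 1.587 | FOXI1 | 5871 |
| 1 | rs16825455ᵃ | 21837755 | T | C | 0.686 | 0.605 | 8.85E-05 | 8.62E-06 | 1.90E-01 | 8.62E-06 | 1.425 | 1.193 | 1.702 | ALPL | 0 |
| 7 | rs10253216ᵃ | 16861849 | T | C | 0.565 | 0.468 | 2.18E-03 | 1.68E-07 | 1.00E+00 | 1.68E-07 | 1.478 | 1.155 | 1.891 | AGR2 | −17111 |
| 4 | rs11944965ᵃ | 63424089 | T | C | 0.807 | 0.678 | 3.45E-06 | 1.68E-06 | 6.65E-02 | 1.68E-06 | 1.986 | 1.475 | 2.676 | LOC644534 | 47600 |
| 7 | rs7797977ᵃ | 16862235 | C | A | 0.668 | 0.580 | 4.06E-03 | 5.23E-01 | 2.17E-06 | 2.17E-06 | 1.457 | 1.127 | 1.883 | AGR2 | −17497 |
| 18 | rs2406342ᵃ | 74488280 | T | G | 0.605 | 0.475 | 3.59E-05 | 2.48E-06 | 6.71E-02 | 2.48E-06 | 1.697 | 1.323 | 2.177 | ZNF236 | −47836 |
| 20 | rs6077251ᵃ | 7752366 | T | C | 0.271 | 0.153 | 2.50E-06 | 3.64E-06 | 3.06E-02 | 2.50E-06 | 2.065 | 1.537 | 2.773 | SFRS13AP2 | 59982 |
| 8 | rs11774576ᵃ | 27740417 | A | G | 0.702 | 0.581 | 6.78E-05 | 2.82E-06 | 2.62E-01 | 2.82E-06 | 1.699 | 1.307 | 2.208 | SCARA5 | 0 |
| 11 | rs4627050ᵃ | 18822037 | G | A | 0.781 | 0.649 | 4.43E-06 | 9.01E-06 | 2.17E-02 | 4.43E-06 | 1.932 | 1.452 | 2.572 | PTPN5 | −8648 |
| 1 | rs12142335ᵃ | 108302922 | A | G | 0.040 | 0.004 | 9.91E-06 | 8.45E-06 | 1.00E+00 | 8.45E-06 | 9.713 | 3.175 | 29.710 | VAV3 | 0 |
| 15 | rs11071200ᵃ | 55950082 | T | G | 0.060 | 0.008 | 1.25E-06 | 8.51E-07 | 1.00E+00 | 8.51E-07 | 8.241 | 2.888 | 23.520 | PRTG | 0 |
| 5 | rs3822735ᵃ | 35799994 | G | A | 0.862 | 0.752 | 7.24E-06 | 1.68E-06 | 3.50E-01 | 1.68E-06 | 2.062 | 1.500 | 2.834 | SPEF2 | 0 |
| 3 | rs1623879ᵃ | 58027197 | G | A | 0.441 | 0.321 | 7.69E-05 | 1.89E-02 | 3.75E-06 | 3.75E-06 | 1.669 | 1.297 | 2.148 | FLNB | 0 |
| 15 | rs936229ᵃ | 75132319 | G | A | 0.713 | 0.595 | 7.27E-05 | 4.41E-06 | 3.01E-01 | 4.41E-06 | 1.685 | 1.302 | 2.180 | ULK3 | 0 |
| 13 | rs7989332ᵃ | 21050575 | G | T | 0.853 | 0.738 | 5.47E-06 | 1.16E-04 | 1.74E-03 | 5.47E-06 | 2.056 | 1.507 | 2.806 | CRYL1 | 0 |
| 3 | rs3845905ᵃ | 66525963 | G | A | 0.921 | 0.828 | 5.99E-06 | 2.20E-05 | 1.50E-02 | 5.99E-06 | 2.433 | 1.645 | 3.598 | LRIG1 | 0 |
| 8 | rs1714746ᵃ | 4105147 | G | A | 0.554 | 0.435 | 1.59E-04 | 6.63E-02 | 7.44E-06 | 7.44E-06 | 1.610 | 1.261 | 2.056 | CSMD1 | 0 |
| 16 | rs12446319ᵃ | 81774798 | A | G | 0.253 | 0.143 | 8.97E-06 | 1.27E-04 | 3.16E-04 | 8.97E-06 | 2.026 | 1.480 | 2.774 | CMIP | 29431 |
| 1 | rs1277203ᵃ | 109392837 | A | G | 0.730 | 0.626 | 3.50E-04 | 3.51E-02 | 9.38E-06 | 9.38E-06 | 1.615 | 1.243 | 2.098 | AKNAD1 | 0 |
| All anthracycline-based drugs | | | | | | | | | | | | | | | |
| 5 | rs10040979ᵃ | 158424391 | G | A | 0.701 | 0.618 | 4.68E-03 | 5.35E-01 | 4.60E-07 | 4.60E-07 | 1.452 | 1.120 | 1.883 | EBF1 | 0 |
| 2 | rs12615435ᵃ | 200638509 | T | G | 0.883 | 0.773 | 3.95E-06 | 4.09E-06 | 9.17E-02 | 3.95E-06 | 2.214 | 1.555 | 3.154 | LOC348751 | 0 |
| 5 | rs7720283ᵃ | 158459721 | C | T | 0.775 | 0.706 | 1.29E-02 | 3.37E-01 | 4.15E-06 | 4.15E-06 | 1.431 | 1.078 | 1.898 | EBF1 | 0 |
| 1 | rs1367448ᵃ | 68633924 | C | T | 0.633 | 0.526 | 5.02E-04 | 5.32E-06 | 5.12E-01 | 5.32E-06 | 1.554 | 1.212 | 1.993 | LOC100289178 | 0 |
| 6 | rs2505059ᵃ | 98495952 | G | A | 0.538 | 0.398 | 5.41E-06 | 2.29E-04 | 2.04E-04 | 5.41E-06 | 1.765 | 1.383 | 2.252 | MIR2113 | 23455 |
| 12 | rs4149639ᵃ | 6442001 | C | T | 0.120 | 0.047 | 7.39E-06 | 1.52E-05 | 8.16E-02 | 7.39E-06 | 2.763 | 1.781 | 4.287 | TNFRSF1A | 0 |
| 19 | rs1654260ᵃ | 20329111 | A | G | 0.625 | 0.488 | 8.49E-06 | 3.03E-03 | 2.18E-05 | 8.49E-06 | 1.749 | 1.365 | 2.240 | LOC100421704 | −3576 |
| 15 | rs11857176ᵃ | 78164706 | A | G | 0.657 | 0.515 | 1.74E-02 | 1.00E+00 | 8.08E-07 | 8.08E-07 | 1.800 | 1.127 | 2.874 | LOC100302666 | −6274 |
| 2 | rs4380275ᵃ | 773278 | G | A | 0.392 | 0.152 | 4.99E-06 | 1.54E-05 | 2.05E-02 | 4.99E-06 | 3.604 | 2.041 | 6.365 | LOC339822 | 6559 |
| 11 | rs2512987ᵃ | 86414282 | T | C | 0.681 | 0.417 | 7.02E-06 | 6.04E-03 | 3.42E-05 | 7.02E-06 | 2.985 | 1.855 | 4.803 | ME3 | −30604 |
| 12 | rs4149639ᵃ | 6442001 | C | T | 0.163 | 0.042 | 2.89E-07 | 8.31E-07 | 3.32E-02 | 2.89E-07 | 4.443 | 2.571 | 7.677 | TNFRSF1A | 0 |
| 5 | rs2964475ᵃ | 5407814 | C | A | 0.615 | 0.415 | 4.13E-06 | 3.95E-04 | 1.10E-04 | 4.13E-06 | 2.248 | 1.592 | 3.174 | KIAA0947 | −14993 |
| 13 | rs1923834ᵃ | 28360487 | G | A | 0.916 | 0.770 | 9.40E-06 | 4.61E-06 | 3.33E-01 | 4.61E-06 | 3.236 | 1.823 | 5.744 | GSX1 | −6293 |
| 10 | rs908366ᵃ | 126144839 | A | G | 0.518 | 0.328 | 7.04E-06 | 1.08E-04 | 8.69E-04 | 7.04E-06 | 2.199 | 1.564 | 3.092 | LHPP | −5502 |
| 3 | rs1553091ᵃ | 187716886 | G | A | 0.452 | 0.358 | 2.65E-02 | 8.04E-01 | 7.46E-06 | 7.46E-06 | 1.480 | 1.053 | 2.080 | LOC100505844 | −22691 |
| All antimetabolite drugs | | | | | | | | | | | | | | | |
| 18 | rs7228133ᵃ | 4539085 | C | A | 0.733 | 0.686 | 2.26E-02 | 6.64E-01 | 1.70E-06 | 1.70E-06 | 1.255 | 1.034 | 1.522 | LOC284215 | 243085 |
| 21 | rs8127977ᵃ | 26826514 | A | G | 0.804 | 0.722 | 1.30E-05 | 2.11E-06 | 2.67E-01 | 2.11E-06 | 1.587 | 1.282 | 1.966 | NCRNA00158 | −22501 |
| 12 | rs894734ᵃ | 54319727 | G | A | 0.849 | 0.776 | 3.84E-05 | 3.97E-06 | 6.63E-01 | 3.97E-06 | 1.619 | 1.279 | 2.050 | HOXC13 | −12849 |
| 13 | rs9580312ᵃ | 22754093 | G | A | 0.480 | 0.409 | 1.35E-03 | 8.09E-06 | 6.25E-01 | 8.09E-06 | 1.330 | 1.120 | 1.581 | LOC100506622 | −30331 |
| 21 | rs2055011ᵃ | 19481354 | C | T | 0.184 | 0.143 | 1.12E-02 | 2.12E-01 | 8.82E-06 | 8.82E-06 | 1.347 | 1.075 | 1.686 | CHODL | −135796 |
| 12 | rs12582168ᵃ | 124894184 | C | T | 0.333 | 0.256 | 8.50E-05 | 9.31E-06 | 2.84E-01 | 9.31E-06 | 1.454 | 1.210 | 1.748 | NCOR2 | 0 |
| 7 | rs10488226ᵃ | 12713070 | A | C | 0.195 | 0.107 | 1.09E-05 | 3.54E-06 | 2.98E-01 | 3.54E-06 | 2.026 | 1.500 | 2.737 | LOC100505995 | −12175 |
| 2 | rs6740660ᵃ | 224943685 | G | A | 0.966 | 0.894 | 4.10E-06 | 8.83E-06 | 2.40E-01 | 4.10E-06 | 3.386 | 1.870 | 6.131 | SERPINE2 | −39649 |
| 4 | rs1567482ᵃ | 36026747 | G | A | 0.952 | 0.875 | 6.26E-06 | 1.44E-05 | 9.14E-02 | 6.26E-06 | 2.846 | 1.716 | 4.719 | LOC651644 | 39948 |
| 2 | rs6706693ᵃ | 192465598 | A | G | 0.328 | 0.219 | 1.62E-05 | 2.12E-03 | 9.45E-06 | 9.45E-06 | 1.743 | 1.362 | 2.232 | OBFC2A | −77200 |
| 18 | rs9961113ᵃ | 75605399 | C | T | 0.625 | 0.403 | 1.43E-06 | 3.83E-04 | 3.73E-05 | 1.43E-06 | 2.473 | 1.706 | 3.584 | LOC100421527 | −260017 |
| 5 | rs2547917ᵃ | 58713680 | A | G | 0.350 | 0.212 | 9.06E-04 | 8.79E-02 | 3.33E-06 | 3.33E-06 | 1.997 | 1.345 | 2.965 | PDE4D | 0 |
| 15 | rs12900463ᵃ | 85415386 | C | T | 0.219 | 0.115 | 2.24E-03 | 1.02E-01 | 4.03E-06 | 4.03E-06 | 2.154 | 1.342 | 3.457 | ALPK3 | 0 |
| 22 | rs9609078ᵃ | 31153276 | T | C | 0.089 | 0.009 | 4.32E-06 | 9.97E-06 | 2.59E-01 | 4.32E-06 | 10.890 | 3.528 | 33.610 | OSBP2 | 0 |
| 5 | rs6863418ᵃ | 173625154 | A | G | 0.175 | 0.055 | 1.37E-05 | 6.98E-06 | 4.55E-01 | 6.98E-06 | 3.623 | 2.042 | 6.429 | HMP19 | 88972 |
| 20 | rs6037430ᵃ | 344079 | G | A | 0.894 | 0.730 | 9.74E-06 | 1.75E-05 | 7.92E-02 | 9.74E-06 | 3.109 | 1.805 | 5.359 | NRSN2 | 8567 |
| All antimicrotubule drugs | | | | | | | | | | | | | | | |
| 17 | rs11651483ᵃ | 12777402 | C | T | 0.729 | 0.665 | 1.69E-03 | 2.60E-01 | 3.37E-07 | 3.37E-07 | 1.357 | 1.120 | 1.643 | RICH2 | 0 |
| 6 | rs4235898ᵃ | 77266188 | A | G | 0.830 | 0.740 | 1.05E-06 | 3.34E-06 | 6.50E-03 | 1.05E-06 | 1.718 | 1.377 | 2.142 | LOC100131680 | −103976 |
| 13 | rs4771859ᵃ | 93088651 | G | A | 0.764 | 0.709 | 5.49E-03 | 2.60E-01 | 1.47E-06 | 1.47E-06 | 1.328 | 1.088 | 1.623 | GPC5 | 0 |
| 1 | rs12145418ᵃ | 216716320 | T | G | 0.334 | 0.274 | 3.04E-03 | 3.17E-01 | 2.35E-06 | 2.35E-06 | 1.331 | 1.104 | 1.604 | ESRRG | 0 |
| 6 | rs9386485ᵃ | 106329055 | T | C | 0.596 | 0.492 | 2.65E-06 | 2.85E-05 | 8.59E-04 | 2.65E-06 | 1.524 | 1.279 | 1.817 | PRDM1 | −205140 |
| 16 | rs12935229ᵃ | 77328895 | A | G | 0.249 | 0.182 | 1.83E-04 | 2.23E-02 | 4.40E-06 | 4.40E-06 | 1.495 | 1.214 | 1.840 | ADAMTS18 | 0 |
| 7 | rs6961860ᵃ | 17085321 | G | A | 0.557 | 0.495 | 5.33E-03 | 8.87E-01 | 4.67E-06 | 4.67E-06 | 1.283 | 1.078 | 1.527 | LOC100131425 | 156806 |
| 14 | rs12882718ᵃ | 86902054 | T | C | 0.737 | 0.643 | 5.91E-06 | 5.55E-06 | 2.03E-02 | 5.55E-06 | 1.555 | 1.283 | 1.884 | LOC100421119 | −42891 |
| 12 | rs1043763ᵃ | 122630909 | T | C | 0.668 | 0.574 | 1.36E-05 | 6.51E-06 | 3.10E-02 | 6.51E-06 | 1.496 | 1.248 | 1.795 | MLXIP | 1920 |
| 2 | rs4591358ᵃ | 196365890 | C | T | 0.302 | 0.215 | 6.60E-06 | 6.89E-04 | 1.89E-04 | 6.60E-06 | 1.578 | 1.297 | 1.920 | LOC391470 | 81627 |
| 14 | rs8022296ᵃ | 97987857 | G | A | 0.663 | 0.608 | 1.06E-02 | 6.54E-01 | 7.29E-06 | 7.29E-06 | 1.269 | 1.058 | 1.521 | LOC100129345 | 111127 |
| 4 | rs6817170ᵃ | 154374984 | G | A | 0.377 | 0.286 | 9.29E-06 | 4.61E-05 | 3.27E-03 | 9.29E-06 | 1.517 | 1.264 | 1.822 | KIAA0922 | −12514 |
| 1 | rs922106ᵃ | 90025519 | T | G | 0.298 | 0.202 | 2.17E-04 | 1.95E-02 | 9.28E-07 | 9.28E-07 | 1.679 | 1.277 | 2.207 | LRRC8B | 0 |
| 6 | rs9386485ᵃ | 106329055 | T | C | 0.624 | 0.477 | 1.17E-06 | 1.43E-05 | 5.24E-04 | 1.17E-06 | 1.821 | 1.429 | 2.320 | PRDM1 | −205140 |
| 8 | rs2444896ᵃ | 99022009 | T | G | 0.727 | 0.603 | 1.58E-05 | 2.43E-06 | 7.41E-02 | 2.43E-06 | 1.754 | 1.355 | 2.269 | MATN2 | 0 |
| 2 | rs4666360ᵃ | 20335709 | C | T | 0.216 | 0.114 | 4.62E-06 | 3.24E-06 | 2.27E-01 | 3.24E-06 | 2.136 | 1.546 | 2.950 | RPS16P2 | 19625 |
| 9 | rs3138083ᵃ | 35648950 | A | G | 0.220 | 0.117 | 3.78E-06 | 1.24E-05 | 1.09E-02 | 3.78E-06 | 2.136 | 1.551 | 2.942 | SIT1 | 345 |
| 17 | rs3786094ᵃ | 9875205 | C | T | 0.528 | 0.422 | 5.30E-04 | 1.80E-01 | 5.83E-06 | 5.83E-06 | 1.531 | 1.206 | 1.944 | GAS7 | 0 |
| 15 | rs4886670ᵃ | 75449674 | A | C | 0.353 | 0.229 | 7.26E-06 | 5.38E-05 | 4.26E-03 | 7.26E-06 | 1.835 | 1.412 | 2.383 | RPL36AP45 | 29318 |
| 5 | rs792975ᵃ | 172271007 | T | C | 0.654 | 0.519 | 7.66E-06 | 1.19E-04 | 6.08E-04 | 7.66E-06 | 1.746 | 1.366 | 2.232 | ERGIC1 | 0 |
| 9 | rs3747851ᵃ | 124521260 | T | C | 0.337 | 0.176 | 5.61E-07 | 1.12E-05 | 5.63E-04 | 5.61E-07 | 2.377 | 1.693 | 3.339 | DAB2IP | 0 |
| 7 | rs4727963ᵃ | 122759980 | C | T | 0.772 | 0.618 | 7.99E-06 | 1.04E-06 | 1.69E-01 | 1.04E-06 | 2.094 | 1.505 | 2.914 | SLC13A1 | 0 |
| 14 | rs1756650ᵃ | 87741025 | G | A | 0.211 | 0.162 | 9.95E-02 | 9.10E-01 | 1.74E-06 | 1.74E-06 | 1.386 | 0.954 | 2.014 | GALC | 658333 |
| 13 | rs488248ᵃ | 106596719 | T | C | 0.918 | 0.795 | 3.29E-06 | 3.23E-05 | 1.17E-02 | 3.29E-06 | 2.896 | 1.802 | 4.655 | LOC728192 | −432192 |
| 6 | rs12660691ᵃ | 130008445 | A | C | 0.935 | 0.819 | 3.62E-06 | 4.99E-06 | 1.62E-01 | 3.62E-06 | 3.199 | 1.899 | 5.391 | ARHGAP18 | 0 |
| 18 | rs4553720ᵃ | 62170726 | T | C | 0.377 | 0.281 | 7.77E-03 | 4.56E-01 | 6.77E-06 | 6.77E-06 | 1.547 | 1.131 | 2.116 | LOC284294 | 79890 |
| 6 | rs2157460ᵃ | 130021128 | T | C | 0.932 | 0.820 | 7.07E-06 | 1.07E-05 | 1.62E-01 | 7.07E-06 | 3.013 | 1.806 | 5.025 | ARHGAP18 | 0 |
| 2 | rs837841ᵃ | 130034012 | T | G | 0.714 | 0.662 | 1.49E-01 | 5.26E-01 | 8.34E-06 | 8.34E-06 | 1.279 | 0.930 | 1.759 | LOC151121 | −33653 |
| All topoisomerase inhibitors | | | | | | | | | | | | | | | |
| 5 | rs10074959ᵃ | 104208013 | T | C | 0.321 | 0.237 | 3.23E-02 | 8.08E-01 | 1.13E-06 | 1.13E-06 | 1.524 | 1.048 | 2.217 | RAB9P1 | −227162 |
| 7 | rs1035147ᵃ | 12094966 | T | G | 0.981 | 0.877 | 4.01E-06 | 4.75E-06 | 5.56E-01 | 4.01E-06 | 7.294 | 2.587 | 20.559 | TMEM106B | −155882 |
| 1 | rs303386ᵃ | 99589379 | A | G | 0.585 | 0.444 | 1.10E-03 | 4.30E-06 | 3.88E-01 | 4.30E-06 | 1.766 | 1.256 | 2.483 | LOC100129620 | 0 |
| 14 | rs7494275ᵃ | 56231800 | C | A | 0.543 | 0.406 | 1.86E-03 | 4.26E-01 | 8.50E-06 | 8.50E-06 | 1.732 | 1.232 | 2.433 | RPL13AP3 | −1163 |
| 3 | rs480409ᵃ | 7010081 | G | A | 0.495 | 0.348 | 6.08E-04 | 1.63E-01 | 9.28E-06 | 9.28E-06 | 1.842 | 1.307 | 2.596 | GRM7 | 0 |
| 6 | rs17318866ᵃ | 3837198 | G | A | 0.966 | 0.790 | 1.47E-06 | 1.78E-06 | 1.91E-01 | 1.47E-06 | 7.559 | 2.689 | 21.263 | FAM50B | −12434 |
| 2 | rs17027130ᵃ | 41273631 | C | T | 0.644 | 0.387 | 2.54E-06 | 8.03E-04 | 3.91E-05 | 2.54E-06 | 2.865 | 1.844 | 4.452 | LOC729984 | −110074 |
| 1 | rs303386ᵃ | 99589379 | A | G | 0.627 | 0.429 | 3.34E-04 | 3.61E-06 | 1.06E-01 | 3.61E-06 | 2.238 | 1.448 | 3.460 | LOC100129620 | 0 |
| 20 | rs6039763ᵃ | 10183517 | A | G | 0.370 | 0.090 | 1.27E-05 | 1.54E-06 | 4.61E-01 | 1.54E-06 | 5.966 | 2.502 | 14.230 | LOC100131208 | 0 |
| 1 | rs2506991ᵃ | 48098406 | G | A | 0.593 | 0.269 | 1.39E-05 | 1.11E-02 | 2.28E-06 | 2.28E-06 | 3.948 | 2.101 | 7.418 | LOC388630 | 127794 |
| 2 | rs12987465ᵃ | 49715021 | A | G | 0.593 | 0.359 | 1.87E-03 | 6.26E-01 | 6.61E-06 | 6.61E-06 | 2.597 | 1.424 | 4.737 | FSHR | −333355 |
| 7 | rs3095008ᵃ | 20255705 | T | C | 1.000 | 0.846 | 1.74E-05 | 9.39E-06 | 1.00E+00 | 9.39E-06 | inf | N/A | N/A | MACC1 | 0 |
| Paclitaxel + carboplatin | | | | | | | | | | | | | | | |
| 12 | rs12310399ᵃ | 95490248 | A | G | 0.708 | 0.567 | 2.60E-04 | 1.08E-01 | 2.46E-07 | 2.46E-07 | 1.852 | 1.329 | 2.580 | FGD6 | 0 |
| 9 | rs10785877ᵃ | 137125501 | T | C | 0.833 | 0.660 | 7.38E-07 | 1.54E-05 | 1.98E-04 | 7.38E-07 | 2.580 | 1.766 | 3.769 | RXRA | −92815 |
| 19 | rs995834ᵃ | 28866596 | C | T | 0.580 | 0.425 | 1.28E-04 | 1.20E-06 | 1.51E-01 | 1.20E-06 | 1.871 | 1.364 | 2.566 | LOC100420587 | 307385 |
| 1 | rs922107ᵃ | 90022796 | G | A | 0.323 | 0.211 | 1.55E-03 | 9.07E-02 | 2.73E-06 | 2.73E-06 | 1.788 | 1.250 | 2.558 | LRRC8B | 0 |
| 7 | rs1425132ᵃ | 37562368 | T | C | 0.740 | 0.666 | 4.55E-02 | 7.35E-01 | 4.68E-06 | 4.68E-06 | 1.430 | 1.013 | 2.017 | LOC442668 | 62349 |
| 1 | rs6429703ᵃ | 15339960 | T | C | 0.200 | 0.078 | 8.40E-06 | 2.59E-05 | 5.61E-02 | 8.40E-06 | 2.942 | 1.802 | 4.804 | RP1-21O18.1 | 0 |

ᵃ SNPs used for weighted genetic risk score analyses. BP, SNP genomic location; CHR, chromosome; inf, infinity; L95, lower 95% confidence interval; N/A, not applicable; NRA, non-risk allele; OR, odds ratio; P_allelic, P-value from allelic model; P_dom, P-value from dominant model; P_min, minimum P-value among the three models; P_rec, P-value from recessive model; RA, risk allele; RAF, risk allele frequency; rel.loci, distance of the SNP from the gene; U95, upper 95% confidence interval.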

Among these datasets, GWAS were carried out using samples from patients who were given: (i) any kind of platinum-based chemotherapy (428 cases vs 743 controls); (ii) cisplatin-based chemotherapy (176 cases vs 471 controls); or (iii) carboplatin-based chemotherapy (261 cases vs 262 controls). The SNPs showing the most significant association with chemotherapy-induced severe neutropenia/leucopenia were: rs4886670 (Pmin 9.86 × 10⁻⁷, OR = 1.61, 95% CI = 1.33–1.94) near RPL36AP45 for (i); rs10253216 (Pmin 1.68 × 10⁻⁷, OR = 1.48, 95% CI = 1.16–1.89) near AGR2 for (ii); and rs11071200 (Pmin 8.51 × 10⁻⁷, OR = 8.24, 95% CI = 2.89–23.5) in PRTG for (iii) (Table 2, Table S1, Fig. S2b). For the anthracycline-based regimens, we carried out GWAS with individuals given any anthracycline-based (184 cases vs 459 controls), doxorubicin-based (83 cases vs 66 controls), or epirubicin-based (83 cases vs 370 controls) chemotherapy, and identified three SNPs, rs10040979 (Pmin 4.60 × 10⁻⁷, OR = 1.45, 95% CI = 1.12–1.88) in EBF1, rs11857176 (Pmin 8.08 × 10⁻⁷, OR = 1.80, 95% CI = 1.13–2.87) near a hypothetical gene LOC100302666, and rs4149639 (Pmin 2.89 × 10⁻⁷, OR = 4.44, 95% CI = 2.57–7.68) in TNFRSF1A, to be most significantly associated with the risk of high-grade neutropenia/leucopenia, respectively (Table 2, Table S1, Fig. S2c). In the case of antimicrotubule agents, we carried out three different GWAS with individuals who were treated with any antimicrotubule (371 cases vs 825 controls), paclitaxel-based (218 cases vs 364 controls), or docetaxel-based (147 cases vs 233 controls) regimen. We identified three SNPs, rs11651483 (Pmin 3.37 × 10⁻⁷, OR = 1.36, 95% CI = 1.12–1.64) in RICH2, rs922106 (Pmin 9.28 × 10⁻⁷, OR = 1.68, 95% CI = 1.28–2.21) in LRRC8B, and rs3747851 (Pmin 5.61 × 10⁻⁷, OR = 2.38, 95% CI = 1.69–3.34) in DAB2IP, to be most significantly associated with an increased risk of severe neutropenia/leucopenia, respectively (Table 2, Table S1, Fig. S2e).

Our previous report by Kiyotani et al.[24] identified four SNPs to be associated with gemcitabine-induced hematological toxicities. Three of the four SNPs were included in the current study and showed suggestive association: rs12046844 (Pmin 5.84 × 10⁻⁴, OR = 2.53, 95% CI = 1.45–4.43), rs6430443 (Pmin 8.61 × 10⁻⁴, OR = 6.33, 95% CI = 1.90–22.2; r² = 0.895 with rs1901440), and rs11719165 (Pmin 1.16 × 10⁻², OR = 2.36, 95% CI = 1.18–4.70) (Table S4). However, it should be noted that some of the samples used in this study overlapped with those in the study reported by Kiyotani et al., as both sourced samples from Biobank Japan.
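
The odds ratios and confidence intervals quoted above can be approximately recovered from the risk allele frequencies in Table 2. The sketch below assumes an allelic odds ratio with a Woolf (log-based) 95% confidence interval, and it assumes that every case and control listed for the all-platinum analysis contributed two called alleles; the text does not state which interval method was used, so this is a plausibility check only.

```python
from math import exp, log, sqrt

def allelic_or_ci(raf_case, raf_ctrl, n_case, n_ctrl, z=1.96):
    """Allelic odds ratio and Woolf 95% CI from risk allele frequencies."""
    a = raf_case * 2 * n_case         # risk alleles in cases
    b = (1 - raf_case) * 2 * n_case   # non-risk alleles in cases
    c = raf_ctrl * 2 * n_ctrl         # risk alleles in controls
    d = (1 - raf_ctrl) * 2 * n_ctrl   # non-risk alleles in controls
    odds_ratio = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of ln(OR)
    return odds_ratio, exp(log(odds_ratio) - z * se), exp(log(odds_ratio) + z * se)

# rs4886670, all platinum-based drugs: RAF 0.320 vs 0.227, 428 cases vs 743 controls.
or_est, lower, upper = allelic_or_ci(0.320, 0.227, 428, 743)
```

For this SNP the result comes out close to the Table 2 values (OR 1.605, 95% CI 1.330–1.937); small differences are expected because the allele frequencies are rounded to three decimals.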

Lastly, we also attempted to identify genetic variants associated with severe neutropenia/leucopenia induced by combined treatment with paclitaxel and carboplatin (150 cases vs 166 controls), as this combination is commonly used as the standard therapy for both ovarian and lung cancers. We found the most significant association with the SNP rs12310399 (Pmin 2.46 × 10⁻⁷, OR = 1.85, 95% CI = 1.33–2.58) near the FGD6 gene (Table 2, Table S1, Fig. S2a), which is suggested to activate CDC42, a member of the Ras-like family of Rho and Rac proteins, and has a critical role in regulating the actin cytoskeleton. The second strongest association was observed at the locus encoding RXRA (Pmin 7.38 × 10⁻⁷, OR = 2.58, 95% CI = 1.77–3.77), an important transcription factor. We also calculated the cumulative genetic scores using SNPs at six loci and found that individuals in group 4 could have 188 times (95% CI = 36.1–979) higher risk of developing severe neutropenia/leucopenia than those in group 1, with a sensitivity of 95.9% and a specificity of 88.9% (Table S2). Because this drug combination is of clinical importance, we further investigated the association of these six selected loci using 161 individuals registered in Biobank Japan who developed grade 1/2 neutropenia/leucopenia. Interestingly, the association results for the six loci were moderate for grade 1/2 neutropenia/leucopenia, with allele frequencies and ORs intermediate between those of individuals without any adverse reactions and those with neutropenia/leucopenia of ≥grade 3 (Table S3). In addition, as shown in Table 3 and Figure 1, the higher the calculated score, the higher the proportion and grade of neutropenia/leucopenia. The intermediate scores for patients with grade 1/2 neutropenia/leucopenia could imply the possible usefulness of this scoring system for prediction.
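
A comparison of one wGRS group against the reference group, as in the group 4 versus group 1 result above, could be computed along the following lines. This is a sketch with hypothetical counts, not the study data. It assumes a Woolf confidence interval, applies the Haldane correction only when a cell is zero, and assumes that sensitivity is the share of cases falling in the higher-score group and specificity the share of controls falling in the reference group; the text does not define these two measures explicitly.

```python
from math import exp, log, sqrt

def compare_to_reference(case_grp, ctrl_grp, case_ref, ctrl_ref, z=1.96):
    """OR, 95% CI, sensitivity, and specificity for one wGRS group vs the reference."""
    cells = [case_grp, ctrl_grp, case_ref, ctrl_ref]
    if 0 in cells:
        # Haldane correction: add 0.5 to every cell when any cell is zero.
        cells = [x + 0.5 for x in cells]
    a, b, c, d = cells
    odds_ratio = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of ln(OR)
    return {
        "or": odds_ratio,
        "ci": (exp(log(odds_ratio) - z * se), exp(log(odds_ratio) + z * se)),
        "sensitivity": case_grp / (case_grp + case_ref),
        "specificity": ctrl_ref / (ctrl_ref + ctrl_grp),
    }

# Hypothetical counts: cases/controls in the high-score group, then in the reference group.
res = compare_to_reference(40, 5, 2, 30)
# Hypothetical table with an empty cell, which triggers the Haldane correction.
res_zero = compare_to_reference(10, 0, 5, 20)
```

With small reference groups the interval becomes very wide, which is one reason a very large odds ratio from a handful of individuals needs independent validation.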

Table 3. Weighted genetic risk score (wGRS) analysis of cancer patients who received combination treatment with paclitaxel and carboplatin
| wGRS group | Score | G3G4 | G1G2 | G0 | %_G3G4 | %_G1G2 | %_G0 | G3/4 versus G0 | G1/2 versus G0 |
|---|---|---|---|---|---|---|---|---|---|
| Total | | 149 | 161 | 164 | | | | | |

95%_CI, 95% confidence interval; G0, individuals who did not develop any adverse drug reaction; G1G2, grade 1 and grade 2 neutropenia (mild); G3G4, grade 3 and grade 4 neutropenia (severe); OR, odds ratio; REF, reference.

Figure 1.

Proportions of cancer patients who developed no adverse reaction (G0), mild neutropenia/leucopenia (G1/2), or severe neutropenia/leucopenia (G3/4) in each of the weighted genetic risk score (wGRS) score groups. All patients received combined treatment with paclitaxel and carboplatin and were registered with Biobank Japan. The total numbers of patients in scores 1, 2, 3, and 4 are 71, 171, 159, and 73, respectively.

Furthermore, we used simulation to estimate how many samples would be required to validate this scoring result. We started by estimating the incidence of neutropenia/leucopenia induced by the combined treatment with paclitaxel and carboplatin. In Biobank Japan, a total of 477 individuals received this combined treatment; among them, 166 individuals (35%) did not develop any adverse drug reactions, 161 (35%) developed mild neutropenia/leucopenia (grade 1 or 2), and 150 (30%) developed severe neutropenia/leucopenia (grade 3 or higher). The frequency of severe neutropenia/leucopenia is in agreement with a multicenter study reported by Guastalla et al.[25] Assuming that 100 patients who receive this combination therapy are prospectively registered, the incidences of the adverse drug reactions are estimated as shown in Table 4. If we categorize the patients by wGRS according to the proportions indicated in Table 3 (and our hypothesis is right), even this small set of patients should provide enough statistical power for validation. Even if two individuals in both group 1 and group 4 are incorrectly predicted, the calculated P-value is still 0.03 by Fisher's exact test.

Table 4. Simulation of weighted genetic risk score (wGRS) analysis for a prospective study of 100 patients who received combination treatment with paclitaxel and carboplatin
Estimated verification samples (n = 100; 35 expected to have grade 1/2 neutropenia)

| wGRS group | G3G4 | G0 | OR | 95%_CI | P-value |
|---|---|---|---|---|---|

95%_CI, 95% confidence interval; G0, individuals without any adverse drug reaction; G3G4, grade 3 and 4 neutropenia (severe); OR, odds ratio.



In this study, we carried out GWAS analyses for a total of 17 subsets of chemotherapies to identify genetic variants that might be associated with chemotherapeutic-induced neutropenia/leucopenia with grades 3 and 4, however, we could not identify any SNPs that surpassed the genome-wide significant threshold (P-value < 5 × 10−8). Through this study, we encountered several important issues, which are now common problems in pharmacogenomics studies using retrospective clinical data, including confounding factors and heterogeneous treatments for individual patients (often given different combinations of drugs, different dosage of drugs, and different time-periods of treatment), that increase the complexity of studies and generate various noises in the analyses, and diminished the statistical power in the case–control association studies. We understand that our current approach was not an ideal study design, but it is not easy to perfectly standardize therapy in the daily clinical practice of cancer treatment. There are several factors contributing to the variability in treatments: (i) there is some preference by doctors or by hospitals to select a particular regimen among the various recommended standard treatments; (ii) the modifications (adjustments) of the dosage or schedule according to the patient's conditions (performance status, results of laboratory tests, etc.); and (iii) although we have been collecting the clinical information, it is not perfect to collect complete clinical information in some hospitals, particularly those that do not use electronic medical records. One can say that this kind of study should be performed as a prospective design, however, due to the very rapid advances in the development of novel molecular-targeted drugs and new regimens in the oncology area, the protocols have been and will be modified or improved. 
Hence, spending many years and a huge budget on a prospective study may yield a clinically useless outcome, because by the time the results of the association study become available, the study protocol may already have been replaced by a new one and the results can no longer be applied. Nevertheless, retrospective pharmacogenomic studies could be improved by implementing electronic medical record systems that include detailed descriptions of patients' conditions and their responses to various drugs.

Although we understand the pitfalls of study designs like the present one, we need to seek possible ways to identify candidate genetic variants that might contribute to improvement in the clinical management of cancer patients, including the management of chemotherapy-induced severe neutropenia/leucopenia. Despite these limitations, some of the candidate genes that we identified are of interest, considering their known functions as well as their relations with drug actions. For example, the proto-oncogene AGR2, whose genetic variants were suggested to be associated with cisplatin-induced neutropenia/leucopenia, encodes an anterior gradient 2 homolog (Xenopus laevis) that is known to play a critical role in cell migration, cell differentiation, and cell growth.[26] Stable expression of AGR2 confers resistance to cisplatin in vivo, compared with control cells (empty vector), in a xenograft animal model.[27] The second example is TNFRSF1A, which was suggested to be associated with anthracycline-based and epirubicin-induced neutropenia/leucopenia. This gene encodes TNFRSF1A, which is a major receptor for TNF-α. The soluble TNFRSF1A level was found to be elevated after 1 month of anthracycline-based chemotherapy.[28] Additionally, both TNF-α and TNFRSF1A are known to play a critical role in doxorubicin-induced cardiotoxicity, in which doxorubicin stimulates an increase in circulating TNF and upregulates TNFRSF1A.[29, 30] Furthermore, genetic variants in PDE4D, which encodes phosphodiesterase 4D, cAMP-specific, showed suggestive association with gemcitabine-induced severe neutropenia/leucopenia. Ablation of PDE4D has been reported to impair neutrophil function, with altered chemotaxis and adhesion capability, and to reduce neutrophil recruitment to the site of inflammation.[31] In addition, RXRA, whose genetic variants were suggested to be associated with severe neutropenia/leucopenia induced by combined treatment with paclitaxel and carboplatin, encodes retinoid X receptor alpha.
Disruption of this gene in mouse models moderately alters lymphocyte proliferation and survival, and affects the T helper type 1/type 2 balance.[32] All of these genes might provide important insights into the mechanisms of severe neutropenia/leucopenia induced by various chemotherapies; however, further validation is essential.

As already described, the GWAS approach can provide a list of genetic variants that might be associated with complex phenotypes (drug responsiveness or drug-induced adverse reactions) in pharmacogenomics studies. One of the clinically important aims of identifying the associated genetic variants is to establish a prediction model that identifies individuals who are at risk of adverse reactions to certain drugs or protocols. In this study, we applied the wGRS system, with which we could distinguish high-risk patients from low-risk individuals by counting the number of risk alleles of the suggestively associated SNPs and weighting them by the estimated effect size of each SNP. One of the best examples from this study was a scoring system using six candidate SNP loci that were identified through the GWAS of severe neutropenia/leucopenia caused by combination treatment with paclitaxel and carboplatin; among 53 individuals assigned to the high-risk group (group 4) by this scoring method, 47 (89%) developed high-grade neutropenia/leucopenia. In contrast, among 50 individuals in the low-risk group (group 1), only 2 (4%) developed high-grade neutropenia/leucopenia, and the odds of a severe adverse reaction in individuals belonging to group 4 were calculated to be 188 times higher than in those categorized into group 1 (Table 3). Interestingly, individuals who developed grade 1/2 (mild) neutropenia/leucopenia were found to show risk scores intermediate between those of patients with severe neutropenia/leucopenia and those of patients without any adverse reactions. Hence, we suggest that wGRS is an applicable method for evaluating the clinical utility of candidate variants for specific phenotypes. However, the data are preliminary and require verification in independent test samples before any definitive conclusions can be drawn.
Nonetheless, because the OR of the high-risk group is very high, the number of samples required for verification (if our hypothesis is right) is not so large. In fact, we simulated a prospective study design using a model of 100 patients under the assumption that 35% of individuals will not develop any adverse drug reactions, 35% will develop mild neutropenia/leucopenia (grade 1/2), and 30% will develop severe neutropenia/leucopenia (grade 3/4). As shown in Table 4, a study of 100 patients should have very strong statistical power to verify our findings. If they are verified, as we expect, this approach should improve the quality of life of cancer patients and also help reduce medical care costs by avoiding unnecessary adverse events. However, to achieve success in pharmacogenomics and personalized medicine, both local and international collaborative efforts are essential.
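The group 4 versus group 1 comparison for the paclitaxel and carboplatin scoring example can be reproduced from the counts quoted in the text. The sketch below is illustrative only and rests on two assumptions: the individuals in each group who did not develop high-grade neutropenia/leucopenia serve as the comparison category, and each SNP is weighted by the natural logarithm of its per-allele OR, which is a common convention for weighted genetic risk scores and not necessarily the exact weighting used in this study. The function names, the per-allele ORs, and the genotype are hypothetical.

```python
import math


def odds_ratio(cases_a, total_a, cases_b, total_b):
    """Odds ratio of group A versus group B from case counts and group sizes."""
    noncases_a = total_a - cases_a
    noncases_b = total_b - cases_b
    # Cross-product form of (cases_a / noncases_a) / (cases_b / noncases_b).
    return (cases_a * noncases_b) / (noncases_a * cases_b)


def weighted_grs(risk_allele_counts, per_allele_odds_ratios):
    """Sum of risk-allele counts (0, 1 or 2 per SNP), each weighted by ln(OR).

    The ln(OR) weighting is an assumed convention, not taken from the study.
    """
    return sum(
        count * math.log(or_value)
        for count, or_value in zip(risk_allele_counts, per_allele_odds_ratios)
    )


# Counts reported in the text: 47 of 53 in group 4 and 2 of 50 in group 1
# developed high-grade neutropenia/leucopenia.
or_group4_vs_group1 = odds_ratio(cases_a=47, total_a=53, cases_b=2, total_b=50)
print(f"OR, group 4 vs group 1: {or_group4_vs_group1:.0f}")

# Hypothetical per-allele ORs and one hypothetical genotype, for illustration only.
example_ors = [2.0, 1.5, 3.0]
example_genotype = [2, 1, 0]
print(f"Example wGRS: {weighted_grs(example_genotype, example_ors):.3f}")
```

With the reported counts, the cross-product (47 × 48)/(6 × 2) gives the OR of 188 cited above. In a score of this kind, individuals would then be binned into risk groups by score thresholds; the thresholds are not reproduced here because they are not given in this section.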


We would like to express our heartfelt gratitude to all the patients who participated in this study. We convey our sincere appreciation to Dr Teruhiko Yoshida and Dr Hiromi Sakamoto from the National Cancer Center Research Institute for their kind support. Our thanks also go to the members of the Laboratory for Statistical Analysis and the Laboratory for Genotyping Development at the Center for Genomic Medicine for their kind support and fruitful discussions. We would like to extend our gratitude to the staff of Biobank Japan for their outstanding assistance. This work was carried out as part of the Biobank Japan Project, supported by the Ministry of Education, Culture, Sports, Science and Technology, Japan. In addition, this project was supported by a JSPS postdoctoral fellowship.

Disclosure Statement

The authors have no conflict of interest.


GWAS, genome-wide association study

OR, odds ratio

SD, standard deviation

SNP, single nucleotide polymorphism

TNFRSF1A, tumor necrosis factor receptor superfamily, member 1A

TNF-α, tumor necrosis factor-α

wGRS, weighted genetic risk score