• bladder cancer;
  • genetic susceptibility;
  • nucleotide excision repair;
  • SNP;
  • gene-smoking interaction




Growing evidence suggests that single nucleotide polymorphisms (SNPs) in nucleotide excision repair (NER) pathway genes play an important role in bladder cancer etiology. However, only a limited number of genes and variations in this pathway have been evaluated to date.


In this study, the authors applied a comprehensive pathway-based approach to assess the effects of 207 tagging and potentially functional SNPs in 26 NER genes on bladder cancer risk using a large case-control study that included 803 bladder cancer cases and 803 controls.


In total, 17 SNPs were associated significantly with altered bladder cancer risk (P < .05), of which, 7 SNPs retained noteworthiness after they were assessed with a Bayesian approach for the probability of false discovery. The most noteworthy SNP was reference SNP 11132186 (rs11132186) in the inhibitor of growth family, member 2 (ING2) gene. Compared with the major allele-containing genotypes, the odds ratio was 0.52 (95% confidence interval, 0.32-0.83; P = .005) for the homozygous variant genotype. Three additional ING2 variants also exhibited significant associations with bladder cancer risk. Significant gene-smoking interactions were observed for 3 of the top 17 SNPs. Furthermore, through an exploratory classification and regression tree (CART) analysis, potential gene-gene interactions were identified.


In this large association study of the NER pathway and the risk of bladder cancer, several novel predisposition variants were identified along with potential gene-gene and gene-environment interactions in modulating bladder cancer risk. The results reinforce the importance of a comprehensive, pathway-focused, and tagging SNP-based candidate gene approach to identify low-penetrance cancer susceptibility loci. Cancer 2012;. © 2011 American Cancer Society.

Bladder cancer ranks ninth in worldwide cancer incidence. It is the seventh most common cancer in men and the 17th most common in women.1 In the United States, it is the fifth most common cancer, the fourth most common in men, and the 11th most common in women.2 Environmental exposures account for the majority of bladder cancer cases. For example, tobacco smoking causes approximately 50% of bladder cancer incidence in men and 33% in women. In addition, occupational exposures to aromatic amines and polycyclic aromatic hydrocarbons and polluted drinking water containing arsenic and chlorination by-products also contribute significantly to the development of bladder cancer.3 Other environmental risk factors, such as dietary factors, hair dye use, artificial sweeteners, and phenacetin-containing analgesic drugs, also have been reported, although the associations have not been consistent across different studies.3

Although the molecular mechanisms underlying these bladder cancer etiologic factors are not fully understood, it is widely recognized that environmental carcinogens induce DNA damage that leads to genomic instability. Tobacco carcinogens mainly induce bulky DNA adducts, and the nucleotide excision repair (NER) pathway is the major cellular pathway to repair bulky DNA adducts. Other cellular DNA repair pathways, such as the base excision repair (BER) and double-strand break repair (DSBR) pathways, also play important roles in the prevention of bladder carcinogenesis through repairing single-strand and double-strand DNA breaks caused by smoking, reactive oxygen species, ionizing radiation, and other DNA-damaging agents.4-7 In bladder cancer, the NER pathway is 1 of the most commonly studied pathways that repairs bulky DNA lesions, such as pyrimidine dimers, photoproducts, larger chemical adducts, and cross-links. The NER pathway is also critical for the maintenance of genomic stability. There are 2 types of NER processes in human cells: global genomic repair (GGR) and transcription-coupled repair (TCR). 
Both processes have 4 major steps: 1) recognition of DNA lesions by a complex of interactive proteins, including xeroderma pigmentosum group C-complementing protein/Rad23 homolog B (XPC-RAD23B), xeroderma pigmentosum complementation group A (XPA), and replication protein A (RPA) in GGR or excision repair cross-complementing rodent repair deficiency complementation group 6 (ERCC6) and Cockayne syndrome type A (CSA) in TCR; 2) unwinding of DNA strands within the region of lesions by the transcription factor IIH (TFIIH) complex, including the proteins xeroderma pigmentosum D protein (XPD) and xeroderma pigmentosum complementation group B (XPB); 3) elimination of damaged DNA fragments by a protein complex that includes ERCC1, XPF, and XPG; and 4) synthesis of new DNA strands by various DNA polymerases.8 Defects in these critical genes have been identified frequently in many cancers, including bladder cancer.9, 10

Single nucleotide polymorphisms (SNPs) of NER pathway genes have been implicated in bladder cancer etiology.9 For example, we and others previously reported that SNPs in major NER genes, such as XPC, XPD, cyclin H (CCNH), and RAD23B, are associated significantly with bladder cancer risk.11-15 However, those studies mostly took a limited candidate gene approach, which evaluates a small number of genes and potential functional SNPs. In the current study, we performed a comprehensive pathway-based study to assess the effects of 207 haplotype-tagging and potentially functional SNPs in 26 major genes in the NER pathway on bladder cancer risk in a total of 803 patients with bladder cancer and a group of 803 healthy controls. Then, we conducted a series of exploratory analyses to evaluate the cumulative effects and interactions of these variants on the risk of bladder cancer.



Study Population and Epidemiology Data

The study participants were derived from an ongoing hospital-based bladder cancer case-control study. The cases were patients with newly diagnosed, histologically confirmed, and previously untreated bladder cancer who were recruited at The University of Texas MD Anderson Cancer Center and Baylor College of Medicine between 1999 and 2007. There were no age, sex, ethnicity, or disease-stage restrictions on the recruitment. The controls were healthy individuals with no prior history of cancer (except nonmelanoma skin cancer) who were recruited from the Kelsey-Seybold Clinic, which is the largest multispecialty, managed-care physician group in the Houston metropolitan area. Controls were matched to cases by age (±5 years), sex, and ethnicity. To control for confounding of population stratification, we restricted both cases and controls to self-reported non-Hispanic Caucasians for the current analysis. These cases and controls also were included in our recent genome-wide association study of bladder cancer, and there was no evidence of population substructure among cases and controls.16 The potential controls were surveyed first by a Kelsey-Seybold staff member during clinical registration using a short questionnaire to elicit their willingness to participate in the study and to provide preliminary demographic data for matching. They were contacted by telephone at a later date to schedule an interview appointment at a Kelsey-Seybold Clinic location convenient to the participant. The response rate for the ongoing study was 92% for cases and 76.7% for controls. All cases and controls in this study completed a structured, 45-minute questionnaire administered by trained staff interviewers from The University of Texas MD Anderson Cancer Center. The questionnaire collected information about demographics, smoking history, family history of cancer, and medical history. 
At the end of the interview, a 40-mL blood sample was drawn into a coded heparinized tube and sent to the laboratory for immediate DNA extraction and molecular analyses. This study was approved by all relevant institutional review boards, and signed informed consent was obtained from each participant.

Selection of Genes and Polymorphisms

A comprehensive list of genes in the NER pathway was developed through an interrogation of the Gene Ontology (GO) database (accessed May 3, 2011) and a PubMed-based literature review, as previously described.17 Tagging SNPs were selected by the binning algorithm of LDSelect software (accessed May 3, 2011) with a correlation coefficient (r2) threshold of 0.8 and minor allele frequencies >0.05 within 10 kb upstream of the 5′ untranslated region (5′-UTR) and 10 kb downstream of the 3′-UTR of each gene. We also included a few nonsynonymous SNPs that were identified in the National Center for Biotechnology Information SNP database (dbSNP) (accessed May 3, 2011). Because this was a gene-centered candidate region search, we achieved 100% coverage of our targeted genomic regions. The number of SNPs for each gene region was as follows: CCNH, 3 SNPs; cyclin-dependent kinase 7 (CDK7), 2 SNPs; damaged DNA binding protein 2 (DDB2), 9 SNPs; ERCC1, 2 SNPs; ERCC2, 8 SNPs; ERCC3, 5 SNPs; ERCC4, 8 SNPs; ERCC5, 14 SNPs; ERCC6, 12 SNPs; ERCC8, 12 SNPs; general transcription factor IIH, polypeptide 1 (GTF2H1), 6 SNPs; GTF2H3, 3 SNPs; GTF2H4, 6 SNPs; GTF2H5, 3 SNPs; ING2, 6 SNPs; ligase I, DNA, ATP-dependent (LIG1), 9 SNPs; MMS19 nucleotide excision repair homolog (MMS19L), 4 SNPs; menage a trois homolog 1, cyclin H assembly factor (MNAT1), 9 SNPs; RAD23A, 3 SNPs; RAD23B, 17 SNPs; RPA1, 15 SNPs; RPA2, 3 SNPs; RPA3, 28 SNPs; XPA binding protein 2 (XAB2), 5 SNPs; XPA, 7 SNPs; and XPC, 8 SNPs.
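
As a quick consistency check, the per-gene counts listed above can be tallied against the stated totals of 26 genes and 207 SNPs. The short Python sketch below only transcribes the counts from the text; the variable names are illustrative and are not part of the original analysis.

```python
# SNP counts per gene region, transcribed from the list in the text above.
snps_per_gene = {
    "CCNH": 3, "CDK7": 2, "DDB2": 9, "ERCC1": 2, "ERCC2": 8, "ERCC3": 5,
    "ERCC4": 8, "ERCC5": 14, "ERCC6": 12, "ERCC8": 12, "GTF2H1": 6,
    "GTF2H3": 3, "GTF2H4": 6, "GTF2H5": 3, "ING2": 6, "LIG1": 9,
    "MMS19L": 4, "MNAT1": 9, "RAD23A": 3, "RAD23B": 17, "RPA1": 15,
    "RPA2": 3, "RPA3": 28, "XAB2": 5, "XPA": 7, "XPC": 8,
}

n_genes = len(snps_per_gene)          # number of gene regions listed
n_snps = sum(snps_per_gene.values())  # total SNPs across all regions

print(n_genes, n_snps)  # prints: 26 207
```

The tally agrees with the 26 genes and 207 SNPs reported in the study design.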


Genomic DNA was extracted from peripheral blood using the Qiagen Whole Blood DNA Extraction Kit (Qiagen, Valencia, Calif). The genotyping was performed using the iSelect Infinium II platform according to Illumina's protocol, as described previously (Illumina, Inc., San Diego, Calif).12, 17 Briefly, 750 ng of DNA from each sample were amplified 1000-fold to 1500-fold overnight. The amplified DNA was fragmented, precipitated, and resuspended before it was hybridized to the iSelect Beadchip, which contains SNP locus-specific oligonucleotide primers (50 base pairs long) covalently attached to the bead surface. After specific hybridization of genomic DNA to the bead array, each SNP locus-specific primer (attached to beads) was extended with a single hapten-labeled dideoxynucleotide in a single base extension reaction. Incorporated haptens were converted to fluorescent signal by multilevel immunohistochemistry staining and were imaged using the BeadStation Scanner (Illumina, Inc.). The genotype for each SNP was autocalled using the BeadStudio software package (Illumina, Inc.) and processed for further statistical analyses. The average call rate for SNPs was >99%. Individuals with >5% missing genotypes and SNPs with >5% missing calls were excluded from downstream analyses. A randomly selected 2% of samples was run in duplicate, and the concordance of SNP genotype calls was >99.9% for duplicated samples.
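
The two missing-data exclusion rules described above can be sketched in a few lines of Python. This is a minimal illustration on toy data, not the authors' pipeline: the function names, the data layout (a dictionary of samples with `None` marking a missing call), and the order of the two filters (samples first, then SNPs) are all assumptions, since the text does not specify them.

```python
MAX_MISSING = 0.05  # exclusion threshold stated in the text (>5% missing)

def missing_rate(calls):
    """Fraction of missing (None) genotype calls in a list."""
    if not calls:
        return 1.0
    return sum(c is None for c in calls) / len(calls)

def apply_qc(genotypes):
    """genotypes: dict sample_id -> dict snp_id -> call (None = missing).

    Assumed order: drop samples with >5% missing genotypes first,
    then drop SNPs with >5% missing calls among the remaining samples.
    """
    kept = {s: g for s, g in genotypes.items()
            if missing_rate(list(g.values())) <= MAX_MISSING}
    snp_ids = sorted({snp for g in kept.values() for snp in g})
    kept_snps = [snp for snp in snp_ids
                 if missing_rate([g.get(snp) for g in kept.values()]) <= MAX_MISSING]
    return {s: {snp: g.get(snp) for snp in kept_snps} for s, g in kept.items()}

# Toy data (hypothetical): 40 samples x 40 SNPs, all called as "AA".
samples = [f"s{i}" for i in range(40)]
snps = [f"snp{j}" for j in range(40)]
toy = {s: {snp: "AA" for snp in snps} for s in samples}
for snp in snps[:10]:
    toy["s0"][snp] = None      # s0 is missing 10/40 calls (25%)
for s in samples[1:6]:
    toy[s]["snp0"] = None      # snp0 is missing in 5 further samples

clean = apply_qc(toy)
print(len(clean), len(clean["s1"]))  # samples kept, SNPs kept
```

In the toy data, sample `s0` is removed for excess missingness, after which `snp0` is missing in 5 of the 39 remaining samples and is removed as well.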

Statistical Analysis

Statistical analyses were performed using Intercooled STATA software (version 10.1; Stata Corp., College Station, Tex) and SAS/Genetics (version 9.0; SAS Institute, Cary, NC). The chi-square test was used to assess the differences between cases and controls with regard to categorical variables, such as sex and smoking status. The Student t test was used to test for continuous variables, including age and pack-years. Hardy-Weinberg equilibrium (HWE) was tested using a goodness-of-fit chi-square test. Odds ratios (ORs) and 95% confidence intervals (CIs) were calculated using multivariate logistic regression to identify SNPs that were associated significantly with bladder cancer risk while adjusting for confounding factors, including age, sex, and smoking status. For each SNP, genotypes containing the homozygous major allele were used as the reference group to calculate the ORs and 95% CIs for genotypes containing the variant allele. The definitions of smoking status were the same as previously described.18 P values were calculated using the likelihood-ratio test to compare the models with and without the variables of interest in multivariate logistic regression. Dominant, recessive, and additive models were tested for each SNP, and the reported P value was the smallest from the above 3 tests. We assessed the noteworthiness of an observed significant association using a Bayesian false-discovery probability (BFDP) approach proposed by Wakefield.19 This approach has been adopted and used in several epidemiologic studies to control for multiple testing.20-25 We used all P values from different inheritance models (additive, recessive, and dominant) to perform the BFDP analyses. We adjusted all 207 tests at the SNP level and assumed a range of prior probability from .01 to .05. We set the prior probability that the OR is >2.0 as .025. 
An association was declared as significant if BFDP was <0.8 for a prior probability of .05 (considered to be a moderate prior for a pathway-based association study).19 The test for interaction between genotypes and smoking was done by including an interaction term in the logistic regression. We tested for interactions between SNPs and smoking status (never and ever) and between SNPs and pack-years of smoking (as a continuous variable). For the SNPs with significant main effects, we reported results of stratified analyses by smoking status and the interactions of these SNPs with smoking status. We also reported SNPs with nominally significant interactions with pack-years of smoking (P < .05). We also performed exploratory classification and regression tree (CART) analyses to identify potential gene-gene interactions using the HelixTree software package (version 4.1.0; Golden Helix, Bozeman, Mont). CART is a binary recursive partitioning method that produces a decision tree to identify subgroups at different risk levels. Specifically, the recursive partitioning algorithm starts at the first node (with the entire dataset) and determines the first locally optimal split and each subsequent split of the dataset with multiplicity-adjusted P values to control tree growth. We used P = .001 to grow the tree and q = .05 to prune the over-grown tree and control the final tree size. All P values in this study were 2-sided.
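
To make the BFDP criterion concrete, the sketch below implements Wakefield's approximate Bayes factor in Python and applies it to the odds ratio reported in the abstract for rs11132186 (0.52; 95% CI, 0.32-0.83). Several choices here are assumptions rather than details given in the text: the standard error is back-calculated from the published confidence interval, and the prior variance of the log OR is set so that an OR of 2.0 sits at the upper 97.5% point of a normal prior centered on the null. The function name `bfdp` is illustrative.

```python
import math

def bfdp(or_hat, ci_low, ci_high, prior_assoc=0.05, or_upper=2.0):
    """Bayesian false-discovery probability via Wakefield's approximate Bayes factor.

    Assumptions (not stated in this form in the text):
    - the SE of log(OR) is recovered from the 95% CI,
    - the prior on log(OR) is N(0, W) with P(OR > or_upper) = .025.
    """
    theta = math.log(or_hat)                                   # estimated log odds ratio
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)   # SE from the CI width
    v = se ** 2                                                # sampling variance
    w = (math.log(or_upper) / 1.96) ** 2                       # prior variance
    z2 = theta ** 2 / v                                        # squared Wald statistic
    # Approximate Bayes factor: evidence for the null relative to the alternative.
    abf = math.sqrt((v + w) / v) * math.exp(-0.5 * z2 * w / (v + w))
    prior_odds_null = (1 - prior_assoc) / prior_assoc
    return abf * prior_odds_null / (1 + abf * prior_odds_null)

example = bfdp(0.52, 0.32, 0.83)
print(round(example, 2))
```

Under these assumptions the value falls below the 0.8 threshold but need not match the published BFDP exactly, because the published figure was derived from the fitted models rather than from rounded summary statistics.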



Characteristics of the Study Population

Our participants consisted of 803 Caucasian patients with bladder cancer and 803 age (±5 years) and sex frequency-matched Caucasian controls (Table 1). Cases had a significantly higher percentage of current smokers and recent quitters than controls (23.3% vs 8.3%; P = 5.2 × 10^−21). Among ever-smokers, cases reported significantly higher levels of cigarette consumption than controls (mean pack-years, 43.0 vs 29.9; P = 2.8 × 10^−12).

Table 1. Selected Host Characteristics of Bladder Cancer Cases and Normal Controls

Variable | Cases, N=803 | Controls, N=803 | P^a
Age, mean±SD, y | 64.7±11.1 | 63.8±10.9 | .10
Sex, no. (%) | | |
 Men | 640 (79.7) | 639 (79.6) | .95
 Women | 163 (20.3) | 164 (20.4) |
Smoking status, no. (%) | | |
 Never | 212 (26.4) | 355 (44.2) | 5.2×10^−21
 Former | 404 (50.3) | 381 (47.4) |
 Current and recent quitter | 187 (23.3) | 67 (8.3) |
Pack-years, mean±SD | 43.0±30.7 | 29.9±27.9 | 2.8×10^−12

  • Abbreviations: SD, standard deviation.
  • a: P values were derived from the chi-square test for categorical variables (sex and smoking status) and from t tests for continuous variables (age and pack-years).
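
The chi-square P value for smoking status can be reproduced from the counts in Table 1. The Python sketch below is an illustration rather than the authors' STATA code; it computes the Pearson statistic for the 2 × 3 table by hand and uses the fact that, with 2 degrees of freedom, the chi-square upper tail probability is exactly exp(−x/2). The function name is illustrative.

```python
import math

# Smoking status counts from Table 1: never, former, current and recent quitter.
cases = [212, 404, 187]
controls = [355, 381, 67]

def chi_square_2xk(row_a, row_b):
    """Pearson chi-square statistic for a 2 x k contingency table."""
    total_a, total_b = sum(row_a), sum(row_b)
    grand = total_a + total_b
    stat = 0.0
    for a, b in zip(row_a, row_b):
        col = a + b
        for obs, row_total in ((a, total_a), (b, total_b)):
            expected = row_total * col / grand
            stat += (obs - expected) ** 2 / expected
    return stat

stat = chi_square_2xk(cases, controls)
# Three categories give df = 2, for which the upper tail is exp(-x / 2).
p_value = math.exp(-stat / 2)
print(f"chi-square = {stat:.2f}, P = {p_value:.1e}")
```

The statistic is approximately 93.4, and the resulting P value is on the order of 5 × 10^−21, in line with the value reported in Table 1.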

Main-Effect Analysis of Individual NER SNPs

Among the 207 NER SNPs, 17 (8.2%) were associated significantly with bladder cancer risk at the 5% level (Table 2), among which 1 SNP (reference SNP [rs] number rs4151330) had a statistically significant deviation from HWE (P < .05) in controls, consistent with what was expected by chance. The most noteworthy finding in our study was related to the ING2 gene. Four of the 6 ING2 SNPs conferred a significantly altered risk of bladder cancer; the most significant was rs11132186, an SNP located in the 3′ region of ING2. Under a recessive genetic model, the homozygous variant genotype of rs11132186 was associated with a reduced risk of bladder cancer (OR, 0.52; 95% CI, 0.32-0.83; P = .005). Another ING2 SNP in the 3′ region, rs11735038, also conferred a reduced bladder cancer risk (OR, 0.66; 95% CI, 0.49-0.90; P = .008) under a recessive genetic model. The other 2 significant ING2 SNPs were rs6854224 in the 5′ region (OR, 0.70; 95% CI, 0.53-0.93; P = .013; recessive model) and rs11732255 in the 3′ region (OR, 0.84; 95% CI, 0.72-0.98; P = .025; additive model) (Table 2). Two additional SNPs exhibited best-fitting P values < .01 and conferred an increased risk of bladder cancer: rs11039130 in the 5′ region of DDB2 (OR, 1.64; 95% CI, 1.14-2.35; P = .007) and rs4150667 in the intron of GTF2H1 (OR, 1.55; 95% CI, 1.12-2.15; P = .008). In addition, we assessed the noteworthiness of these significant associations using BFDP. We observed that 7 of the 17 significant SNPs had a BFDP value <0.8 (range, 0.72-0.80), suggesting the potential noteworthiness of their associations with bladder cancer risk (Table 2). Table 3 provides the detailed distributions of different genotypes (homozygous major allele, heterozygote, and homozygous variant) of these 7 SNPs in cases and controls and the associations of each genotype with bladder cancer risk as well as the best fitting model for each SNP.
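
To illustrate how the recessive and dominant genetic models group genotypes, the Python sketch below computes crude (unadjusted) odds ratios for rs11132186 from the genotype counts reported in Table 3. It is a simplification: the published estimates come from logistic regression adjusted for age, sex, and smoking status, so the crude values are expected to be close to, but not identical with, the reported ones. The function name `crude_or` and the Woolf-type confidence interval are choices made for this example.

```python
import math

def crude_or(case_exposed, case_ref, ctrl_exposed, ctrl_ref):
    """Unadjusted odds ratio with a Woolf (log-based) 95% confidence interval."""
    or_hat = (case_exposed * ctrl_ref) / (case_ref * ctrl_exposed)
    se = math.sqrt(1 / case_exposed + 1 / case_ref + 1 / ctrl_exposed + 1 / ctrl_ref)
    low = math.exp(math.log(or_hat) - 1.96 * se)
    high = math.exp(math.log(or_hat) + 1.96 * se)
    return or_hat, low, high

# rs11132186 genotype counts from Table 3 (cases / controls).
cases = {"GG": 498, "GT": 274, "TT": 31}
controls = {"GG": 471, "GT": 278, "TT": 54}

# Recessive model: homozygous variant (TT) vs major allele-containing genotypes (GG + GT).
recessive = crude_or(cases["TT"], cases["GG"] + cases["GT"],
                     controls["TT"], controls["GG"] + controls["GT"])

# Dominant model: variant allele-containing genotypes (GT + TT) vs homozygous major (GG).
dominant = crude_or(cases["GT"] + cases["TT"], cases["GG"],
                    controls["GT"] + controls["TT"], controls["GG"])

print("recessive: %.2f (%.2f-%.2f)" % recessive)
print("dominant:  %.2f (%.2f-%.2f)" % dominant)
```

The crude recessive estimate is about 0.56, near the adjusted value of 0.52, whereas the dominant grouping gives a weaker association, which is consistent with the recessive model being reported as the best fit for this SNP.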

Table 2. Overall and Stratified Analysis by Smoking Status of Significant Single Nucleotide Polymorphisms

SNP | Gene | Position in Gene | Best Fit Model | Overall OR (95% CI)^a | Overall P | BFDP^b | HWE | Never Smokers OR (95% CI)^a | Ever Smokers OR (95% CI)^a | P Interaction^c
rs11132186 | ING2 | 3′ of gene | Recessive | 0.52 (0.32-0.83) | .005 | 0.72 | 0.1412 | 0.37 (0.15-0.92) | 0.59 (0.34-1.04) | .327
rs11039130 | DDB2 | 5′ of gene | Recessive | 1.64 (1.14-2.35) | .007 | 0.79 | 0.5940 | 1.69 (0.95-2.98) | 1.60 (1.01-2.56) | .806
rs11735038 | ING2 | 3′ of gene | Recessive | 0.66 (0.49-0.90) | .008 | 0.73 | 0.2054 | 0.78 (0.60-1.01) | 0.67 (0.47-0.97) | .769
rs4150667 | GTF2H1 | Intron | Recessive | 1.55 (1.12-2.15) | .008 | 0.73 | 0.0714 | 0.92 (0.65-1.29) | 1.90 (1.24-2.89) | .197
rs10065575 | CCNH | 5′ of gene | Dominant | 0.76 (0.61-0.94) | .010 | 0.78 | 0.7101 | 0.55 (0.38-0.79) | 0.89 (0.69-1.16) | .034^d
rs1051315 | RPA1 | 3′ UTR | Dominant | 0.65 (0.46-0.92) | .013 | 0.80 | 0.4360 | 0.65 (0.37-1.12) | 0.66 (0.43-1.01) | .989
rs6854224 | ING2 | 5′ of gene | Recessive | 0.70 (0.53-0.93) | .013 | 0.79 | 0.1650 | 0.70 (0.49-0.99) | 0.67 (0.47-0.95) | .755
rs1685404 | DDB2 | 3′ of gene | Dominant | 0.78 (0.63-0.95) | .016 | NS | 0.2868 | 1.01 (0.78-1.31) | 0.67 (0.52-0.86) | .038^d
rs3744767 | RPA1 | 3′ UTR | Dominant | 1.28 (1.04-1.58) | .018 | NS | 0.1194 | 1.25 (0.88-1.77) | 1.31 (1.01-1.69) | .796
rs4453140 | ERCC6 | 5′ of gene | Dominant | 0.78 (0.63-0.96) | .019 | NS | 0.5680 | 0.53 (0.37-0.77) | 0.95 (0.73-1.23) | .009^d
rs10978792 | RAD23B | Intron | Dominant | 1.53 (1.06-2.20) | .021 | NS | 0.4174 | 1.55 (0.89-2.68) | 1.52 (0.94-2.46) | .956
rs11732255 | ING2 | 3′ of gene | Additive | 0.84 (0.72-0.98) | .025 | NS | 0.9866 | 0.82 (0.63-1.07) | 0.85 (0.71-1.02) | .720
rs1799787 | ERCC2 | Intron | Additive | 1.19 (1.02-1.39) | .029 | NS | 0.9281 | 1.14 (0.64-2.05) | 1.29 (1.06-1.57) | .214
rs3176639 | XPA | Intron | Recessive | 0.69 (0.49-0.97) | .032 | NS | 0.2289 | 0.96 (0.74-1.25) | 0.59 (0.39-0.90) | .261
rs1885094 | MNAT1 | 5′ of gene | Dominant | 0.81 (0.66-0.99) | .037 | NS | 0.1655 | 0.92 (0.65-1.29) | 0.75 (0.58-0.97) | .436
rs4151330 | MNAT1 | Intron | Recessive | 1.41 (1.01-1.98) | .044 | NS | 0.0344 | 0.97 (0.68-1.36) | 1.64 (1.08-2.50) | .250
rs1124303 | XPC | Intron | Dominant | 0.76 (0.58-1.00) | .049 | NS | 0.3793 | 0.74 (0.47-1.15) | 0.78 (0.56-1.10) | .896

  • Abbreviations: BFDP, Bayesian false-discovery probability; CCNH, cyclin H; CI, confidence interval; DDB2, damaged DNA binding protein 2; ERCC2/ERCC6, excision repair cross-complementing rodent repair deficiency, complementation groups 2 and 6; GTF2H1, general transcription factor IIH, polypeptide 1; HWE, Hardy-Weinberg equilibrium (for controls); ING2, inhibitor of growth family, member 2; MNAT1, menage a trois homolog 1, cyclin H assembly factor; NS, nonsignificant; OR, odds ratio; RAD23B, Rad23 homolog B; RPA1, replication protein A1; rs, reference single nucleotide polymorphism; SNP, single nucleotide polymorphism; UTR, untranslated region; XPA, xeroderma pigmentosum, complementation group A; XPC, xeroderma pigmentosum, complementation group C.
  • a: Adjusted for age, sex, and smoking status, as appropriate.
  • b: Assumes a prior probability of .05 and a noteworthiness threshold of .8.
  • c: P values for the interaction between SNPs and smoking status are shown.
  • d: P < .05.

Table 3. Risk Estimates of Genotypes of the Significant Single Nucleotide Polymorphisms

SNP | Gene | Genotype | No. of Cases/Controls | OR (95% CI)^a
rs11132186 | ING2 | GG | 498/471 | 1.00 (Reference)
 | | GT | 274/278 | 0.93 (0.75-1.15)
 | | TT | 31/54 | 0.50 (0.31-0.81)
 | | TT vs GG+GT | | 0.52 (0.32-0.83)
rs11039130 | DDB2 | CC | 396/421 | 1.00 (Reference)
 | | CT | 321/325 | 1.05 (0.86-1.31)
 | | TT | 85/57 | 1.68 (1.16-2.44)
 | | TT vs CC+CT | | 1.64 (1.14-2.35)
rs11735038 | ING2 | TT | 353/321 | 1.00 (Reference)
 | | TA | 358/360 | 0.94 (0.75-1.16)
 | | AA | 91/122 | 0.64 (0.47-0.89)
 | | AA vs TT+TA | | 0.66 (0.49-0.90)
rs4150667 | GTF2H1 | CC | 360/356 | 1.00 (Reference)
 | | CT | 341/374 | 0.89 (0.72-1.10)
 | | TT | 102/73 | 1.47 (1.04-2.07)
 | | TT vs CC+CT | | 1.55 (1.12-2.15)
rs10065575 | CCNH | GG | 299/247 | 1.00 (Reference)
 | | GA | 361/392 | 0.77 (0.61-0.97)
 | | AA | 143/164 | 0.73 (0.54-0.97)
 | | GA+AA vs GG | | 0.76 (0.61-0.94)
rs1051315 | RPA1 | GG | 738/702 | 1.00 (Reference)
 | | GT | 62/99 | 0.63 (0.45-0.89)
 | | TT | 3/2 | 1.66 (0.27-10.14)
 | | GT+TT vs GG | | 0.65 (0.46-0.92)
rs6854224 | ING2 | TT | 317/292 | 1.00 (Reference)
 | | TC | 376/368 | 0.99 (0.79-1.24)
 | | CC | 105/142 | 0.70 (0.51-0.95)
 | | CC vs TT+TC | | 0.70 (0.53-0.93)

  • Abbreviations: A, adenine; C, cytosine; CI, confidence interval; CCNH, cyclin H; DDB2, damaged DNA binding protein 2; G, guanine; GTF2H1, general transcription factor IIH, polypeptide 1; ING2, inhibitor of growth family, member 2; OR, odds ratio; rs, reference single nucleotide polymorphism; RPA1, replication protein A1; SNP, single nucleotide polymorphism; T, thymine.
  • a: Adjusted for age, sex, and smoking status.
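
The Hardy-Weinberg goodness-of-fit test described in the statistical methods can be checked against the control genotype counts in Table 3. The Python sketch below is an illustration, not the authors' code: it estimates the allele frequency from the observed genotypes, derives the expected counts under equilibrium, and obtains the P value for 1 degree of freedom through the complementary error function. The function name is illustrative.

```python
import math

def hwe_chi_square(n_major_hom, n_het, n_minor_hom):
    """Goodness-of-fit chi-square test for Hardy-Weinberg equilibrium (df = 1)."""
    n = n_major_hom + n_het + n_minor_hom
    p = (2 * n_major_hom + n_het) / (2 * n)   # major allele frequency
    q = 1 - p                                 # minor allele frequency
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_major_hom, n_het, n_minor_hom)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For df = 1, the chi-square upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# rs11132186 control genotype counts from Table 3: GG, GT, TT.
stat, p_value = hwe_chi_square(471, 278, 54)
print(f"chi-square = {stat:.3f}, P = {p_value:.4f}")
```

For rs11132186 this yields a P value of approximately 0.141, matching the HWE value listed for this SNP in Table 2.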

Stratified Analyses by Smoking Status

We also performed stratified analysis by smoking status (data not shown). There were significant overlaps of top SNPs between analysis of the overall population and smokers; for example, the top 2 SNPs in smokers, DDB2 rs1685404 (OR, 0.67; 95% CI, 0.52-0.86; P = .0021; under dominant model) and GTF2H1 rs4150667 (OR, 1.90; 95% CI, 1.24-2.89; P = .0024; under recessive model), were the eighth and fourth most significant SNPs in overall analysis, respectively. For never-smokers, the most significant SNP, XPA rs10817938 (OR, 0.26; 95% CI, 0.12-0.60; P = .0003; under dominant model), was not significant in overall analysis (P = .67). Then, we tested interactions between significant SNPs and smoking status in modulating bladder cancer risk. Many of the top SNPs exhibited similar effects on bladder cancer risk in never-smokers and ever-smokers, but there were significant SNP-smoking interactions for a few SNPs (Table 2). CCNH rs10065575 was the third most significant SNP in never-smokers (P = .0012) but was not significant in ever-smokers, and the test of interaction revealed a significant interaction of this SNP with smoking status (P = .034) (Table 2). Significant interactions with smoking status also were observed for DDB2 rs1685404 (P = .038) and ERCC6 rs4453140 (P = .009) (Table 2).

SNP-Smoking (Pack-Years) Interactions

We also performed a detailed analysis of the interactions between all assayed SNPs and pack-years of smoking (as a continuous variable). Table 4 lists the 17 SNPs that had nominally significant interactions with pack-years (P < .05). The most significant SNP was rs4151150 in MNAT1, which exhibited a significant, negative interaction with pack-years (β = −0.026; P = .0018). Among the 17 SNPs, only rs1685404 in DDB2 had a significant main effect. The other 2 SNPs that had significant main effects as well as significant interactions with smoking status (CCNH rs10065575 and ERCC6 rs4453140) (Table 2) were not significant in this analysis, suggesting that quantitative measures of smoking provide more information than smoking status. Because we did not have sufficient power to detect SNP-smoking interaction, this analysis was exploratory, and no multiple testing correction was applied.

Table 4. The Interaction Between Single Nucleotide Polymorphisms and Pack Years of Smoking

SNP | Gene | Model | OR (95% CI)^a | P | β | P Interaction^b
rs4151150 | MNAT1 | Dominant | 0.80 (0.46-1.40) | .4375 | −0.026 | .0018
rs807812 | XAB2 | Dominant | 1.03 (0.84-1.27) | .7759 | 0.013 | .0029
rs10817938 | XPA | Dominant | 0.92 (0.63-1.34) | .6666 | 0.028 | .0059
rs794078 | XAB2 | Additive | 1.02 (0.86-1.21) | .8525 | 0.010 | .0072
rs4253082 | ERCC6 | Dominant | 0.95 (0.76-1.19) | .6562 | −0.011 | .0079
rs12926685 | ERCC4 | Recessive | 0.77 (0.57-1.04) | .0909 | −0.013 | .0112
rs1131636 | RPA1 | Dominant | 1.14 (0.92-1.41) | .2323 | −0.011 | .0133
rs4253211 | ERCC6 | Dominant | 0.89 (0.69-1.16) | .4042 | −0.011 | .0168
rs794083 | XAB2 | Recessive | 1.14 (0.86-1.52) | .3547 | −0.012 | .0197
rs7503021 | RPA1 | Dominant | 1.17 (0.86-1.61) | .3230 | −0.013 | .0232
rs1685404 | DDB2 | Dominant | 0.78 (0.63-0.95) | .0161^c | −0.009 | .0249
rs2291120 | DDB2 | Dominant | 0.87 (0.69-1.11) | .2630 | −0.010 | .0290
rs4253162 | ERCC6 | Dominant | 0.85 (0.65-1.10) | .2159 | 0.012 | .0342
rs158937 | ERCC8 | Dominant | 1.11 (0.86-1.45) | .4169 | 0.012 | .0414
rs794085 | XAB2 | Additive | 1.05 (0.89-1.23) | .5844 | 0.007 | .0433
rs7072383 | ERCC6 | Dominant | 0.85 (0.66-1.11) | .2357 | 0.011 | .0447
rs50871 | ERCC2 | Additive | 0.94 (0.81-1.09) | .3950 | 0.006 | .0474

  • Abbreviations: CI, confidence interval; DDB2, damaged DNA binding protein 2; ERCC2-ERCC8, excision repair cross-complementing rodent repair deficiency, complementation groups 2-8; MNAT1, menage a trois homolog 1, cyclin H assembly factor; OR, odds ratio; RPA1, replication protein A1; rs, reference single nucleotide polymorphism; SNP, single nucleotide polymorphism; XAB2, XPA binding protein 2; XPA, xeroderma pigmentosum, complementation group A.
  • a: Adjusted for age, sex, and smoking status.
  • b: P values for the interaction between SNPs and pack-years of smoking (as continuous variable).
  • c: P < .05.
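
One way to read the β column in Table 4 is as the coefficient of the SNP × pack-years product term on the log-odds scale. This interpretation is an assumption, because the table does not define β explicitly. Under it, the genotype-associated odds ratio is multiplied by exp(β × Δ) for every Δ additional pack-years, as the short Python sketch below shows for rs4151150 (β = −0.026). The function name is illustrative.

```python
import math

def or_multiplier(beta_interaction, delta_pack_years):
    """Factor by which the genotype odds ratio changes over delta_pack_years,
    assuming beta is the SNP x pack-years coefficient on the log-odds scale."""
    return math.exp(beta_interaction * delta_pack_years)

beta = -0.026  # rs4151150 in MNAT1, from Table 4
for delta in (10, 20, 40):
    print(delta, round(or_multiplier(beta, delta), 3))
```

Under this reading, a negative β means that the odds ratio associated with the variant genotype shrinks as cumulative smoking exposure increases; for example, it falls by roughly one-quarter over 10 pack-years.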

Haplotype Analysis

We performed a haplotype analysis for ING2 and DDB2, the 2 genes with several significant SNPs (Table 5). We calculated the pairwise D′ to measure linkage disequilibrium between SNPs in each gene and observed no significant linkages (D′ range, 0.13-0.51 for ING2 SNPs and 0.18-0.31 for DDB2 SNPs; data not shown). In total, 12 haplotypes and 9 haplotypes with a frequency of >1% were identified for ING2 and DDB2, respectively. For ING2, compared with the most common haplotype (Haplotype 1 [H1]), 3 minor haplotypes, H8, H10, and H11, were associated with a decreased risk of bladder cancer (OR, 0.60 [95% CI, 0.39-0.93], 0.43 [95% CI, 0.20-0.90], and 0.42 [95% CI, 0.20-0.89], respectively). For DDB2, compared with H1, H6 was associated with a decreased risk of bladder cancer (OR, 0.67; 95% CI, 0.50-0.91; P = .009) (Table 5).

Table 5. Association of Haplotypes of Nucleotide Excision Repair Genes with Bladder Cancer Risk

Haplotype^a | Haplotype Sequence | Haplotype Frequency, % | No. of Cases/Controls | OR (95% CI)^b | P
ING2 | | | | |
 H2 | M-M-M-M-M-V | 21.8 | 290/268 | 0.90 (0.72-1.12) | .34
 H3 | M-V-V-V-V-M | 11.7 | 151/149 | 0.85 (0.65-1.12) | .25
 H4 | V-V-V-V-M-M | 7.7 | 92/104 | 0.79 (0.57-1.09) | .15
 H5 | V-V-V-M-V-M | 5.6 | 70/73 | 0.83 (0.58-1.20) | .33
 H6 | M-M-M-V-M-M | 5 | 66/61 | 0.93 (0.63-1.37) | .71
 H7 | M-M-V-M-M-V | 4.1 | 49/55 | 0.78 (0.51-1.20) | .26
 H8 | V-V-V-V-V-M | 4 | 45/57 | 0.60 (0.39-0.93) | .02
 H9 | M-V-M-V-M-M | 2.9 | 36/39 | 0.78 (0.48-1.28) | .33
 H10 | M-M-V-M-M-M | 1.4 | 11/24 | 0.43 (0.20-0.90) | .02
 H11 | V-V-M-V-M-M | 1.4 | 12/23 | 0.42 (0.20-0.89) | .02
 H12 | V-V-V-M-M-M | 1.3 | 15/18 | 0.68 (0.32-1.40) | .29
 Other, n=14 | | 2.2 | 24/32 | 0.59 (0.33-1.03) | .07
DDB2 | | | | |
 H1 | M-V-M-M-M-M-M-M-M | 19.4 | 276/297 | 0.84 (0.67-1.05) | .13
 H3 | M-M-V-M-M-V-V-V-V | 13.3 | 201/193 | 0.95 (0.74-1.22) | .67
 H4 | M-V-M-V-M-M-M-M-M | 11.5 | 162/178 | 0.81 (0.62-1.05) | .11
 H5 | M-M-V-M-M-V-M-M-M | 10.2 | 149/154 | 0.82 (0.62-1.08) | .15
 H6 | M-M-V-M-M-M-M-V-M | 8.2 | 105/137 | 0.67 (0.50-0.91) | .009
 H7 | M-M-M-M-V-M-V-V-M | 3.4 | 56/45 | 1.03 (0.67-1.59) | .88
 H8 | V-M-M-M-M-M-M-V-M | 1.6 | 24/23 | 1.05 (0.58-1.92) | .87
 H9 | M-M-M-M-M-M-M-M-M | 1.4 | 21/19 | 0.89 (0.46-1.71) | .72
 Other, n=26 | | 4.2 | 61/62 | 0.93 (0.63-1.38) | .72

  • Abbreviations: CI, confidence interval; DDB2, damaged DNA binding protein 2; ING2, inhibitor of growth family, member 2; M, major allele; OR, odds ratio; V, variant allele.
  • a: The ING2 haplotype includes 6 single nucleotide polymorphisms (SNPs) in the order of reference SNP (rs) numbers rs6854224, rs4862213, rs6830958, rs11132186, rs11735038, and rs11732255. The DDB2 haplotype includes 9 SNPs in the order of rs11039130, rs2029298, rs2291120, rs10742797, rs1685404, rs2957873, rs3824866, rs901746, and rs1050244.
  • b: Adjusted for age, sex, and smoking status.

CART Analysis

CART analysis uses a binary recursive partitioning method to identify subgroups of high-risk individuals and detects higher order interactions among a large number of variables. Figure 1A depicts the resulting tree structure generated by CART analysis. The initial split was rs1051315 in the RPA1 gene. In individuals who had the homozygous major allele-containing genotype of rs1051315, the tree structure was generated further according to the genotype information for CCNH rs10065575, ERCC2 rs1799787, RPA1 rs3744767, ING2 rs6854224, MNAT1 rs1885094, and MNAT1 rs4151330, resulting in different subgroups (terminal nodes), each with a distinct combination of genotypes and a different risk estimate. Figure 1B summarizes the risk estimates for individuals in each terminal node. Compared with individuals in Terminal Node 1, individuals in Terminal Node 3 exhibited a significantly increased risk of bladder cancer (OR, 2.58; 95% CI, 1.56-4.26; P = 2.1 × 10^−4), whereas individuals in Terminal Node 6 had a significantly reduced risk (OR, 0.42; 95% CI, 0.26-0.67; P = 4.0 × 10^−4). We tested the interactions between SNPs identified from this CART analysis. There were significant interactions between CCNH rs10065575 and ING2 rs6854224 (P = .014), between ING2 rs6854224 and MNAT1 rs4151330 (P = .017), and between CCNH rs10065575 and MNAT1 rs1885094 (P = .043). These interactions resulted in Terminal Nodes 4 through 7. Because of the post-hoc data-mining nature of CART analysis, these results are exploratory.


Figure 1. A classification and regression tree (CART) analysis is illustrated. (A) The tree structure of the CART analysis illustrates the interaction effects between the 17 top variants identified in the discovery stage in modulating bladder cancer risk. RPA1 indicates replication protein A1; rs, reference single nucleotide polymorphism; WW, homozygous wild-type; VV, homozygous variant type; CCNH, cyclin H; ERCC2, excision repair cross-complementing rodent repair deficiency, complementation Group 2; ING2, inhibitor of growth family, member 2; MNAT1, menage a trois homolog 1, cyclin H assembly factor. (B) The risk estimate for each terminal node identified from CART analysis is illustrated using terminal node 1 as the reference group. OR indicates odds ratio; CI, confidence interval.




In this study, we assessed the effects of a comprehensive panel of 207 SNPs in 26 genes in the NER pathway on the risk of bladder cancer. The ING2 gene was the most noteworthy finding, and 4 of 6 evaluated ING2 SNPs exhibited significant associations with the risk of bladder cancer. Furthermore, we observed potential gene-smoking interaction and higher order interactions among these NER SNPs in the modulation of bladder cancer susceptibility.

Our findings for ING2 are biologically plausible. ING2 is a member of the inhibitor of growth (ING) gene family and encodes a putative tumor suppressor protein involved in the regulation of DNA repair, cell cycle progression, apoptosis, and epigenetic functions in a p53-dependent manner. The ING2 gene was cloned first in 1998 as a homolog of the first ING family member, ING1.26 The ING2 gene is 6 kb in length and is located in chromosome region 4q35.1. The implication of ING2 in NER was established first by Wang et al, who observed that overexpression of the ING2 gene significantly enhanced the repair of ultraviolet-induced DNA damage.27 This function of ING2 depends on the normal functions of the p53 protein, because small-interfering RNA-mediated degradation of either ING2 or p53 abolished the observed repair capacity.27 Furthermore, Wang et al demonstrated that the ING2 protein is not a component of the NER core protein complex; instead, ING2 enhances NER through recruiting XPA to the core complex.27 The function of ING2 in DNA repair also is mediated by interaction with trimethylated and dimethylated H3K4, which stabilizes the mammalian Sin3 homolog A—histone deacetylase 1 protein complex to enhance the transcriptional activity of many relevant genes.28 In addition to its involvement in NER, ING2 also has been implicated in G1-phase cell cycle arrest through increasing the transcriptional activation ability of p53.29 Furthermore, ING2 interacts with phosphoinositides to activate the p53-dependent apoptosis pathway.30 In our study, all 4 significant ING2 SNPs, which are not in strong linkage, were associated with reduced risks of bladder cancer. Among them, 3 SNPs are located in the 3′ region of gene, and 1 is located in the 5′ region of ING2, 5.5 kb upstream of the translation start codon. 
Because the promoter and enhancer sequences of ING2 have not been well characterized experimentally, it remains to be determined whether these SNPs have any functional significance. It is more likely that they are tagging SNPs rather than the causal variants. Therefore, high-density mapping in combination with functional characterization is warranted to further elucidate the molecular mechanisms underlying the association between ING2 SNPs and bladder cancer risk observed in the current study.

Another noteworthy gene we identified in this study was DDB2. The DDB2 gene contains 8 exons and spans approximately 24 kb in chromosomal region 11p12-p11.31 As a protein that responds to DNA damage induced by genotoxic agents, such as ultraviolet irradiation, DDB2 interacts with cullin 4A (CUL4A); ring-box 1, E3 ubiquitin protein ligase (RBX1); and constitutive photomorphogenic homolog subunit 2 (COPS2) to form a protein complex that binds to chromatin and initiates the NER process.32 DDB2 enhances the DNA binding activity of DDB1 and serves as a crucial component in the p53-mediated DNA repair process.33, 34 Two SNPs of DDB2 exhibited a significant association with bladder cancer risk. One SNP, rs11039130, which is located approximately 7 kb upstream of the transcription start site, was associated with a 1.64-fold increased risk under a recessive model. The other SNP, rs1685404, which is located in the 3′ region of the gene, conferred a reduced bladder cancer risk under a dominant model. Several studies have reported that transcription of the DDB2 gene is tightly regulated by a wide array of transcription factors, such as the E2F family, breast cancer 1 (BRCA1), specificity protein 1 (Sp1), Myc, and nuclear factor 1 (NF1).35-37 However, whether rs1685404 is a direct causative locus or a surrogate remains to be determined through further fine mapping and functional characterization.

We performed stratified analyses according to smoking status and tested interactions between SNPs and smoking (smoking status and pack-years). Because of the limited power to detect SNP-smoking interactions, the current analysis was exploratory, and further validation with a larger sample size is needed. It is noteworthy that the effects of the CCNH and ERCC6 SNPs were stronger in never-smokers, whereas the effect of the DDB2 SNP was evident only in ever-smokers, and other SNPs had similar effects in never-smokers and ever-smokers. This observation is in line with a recent genome-wide association study (GWAS) of bladder cancer in which the N-acetyltransferase 2 (NAT2) slow-acetylator genotype was associated with an increased risk of bladder cancer in ever-smokers but not in never-smokers, whereas the effect of the glutathione S-transferase μ1 (GSTM1) null genotype was strongest in never-smokers and grew progressively weaker in former and current smokers.38 Eight other GWAS-confirmed SNPs demonstrated similar effects among never-smokers and ever-smokers.38 It is intriguing that different genotypes in carcinogen metabolism and DNA repair exhibited differential effects on bladder cancer risk among never-smokers and ever-smokers. Other exposures may explain such interactions. Occupational exposure is the second major environmental risk factor for bladder cancer. Our previous publication indicated that prolonged exposure to diesel fuel or fumes on a regular basis and exposures to tar/mineral oil, dry cleaning fluids, leather and tanning solutions, rubber products, glues, pesticides, insecticides, herbicides, fertilizers, arsenic, zinc, radioactive materials, and aromatic amines all were associated with an increased risk of bladder cancer.39 It would be interesting to assess the NER SNPs and bladder cancer risk in the context of these different DNA-damaging exposures.
Because only a small percentage of our study population had these occupational exposures, the power to detect significant associations in exposed individuals was limited. Future studies are warranted to address this question.
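The stratified comparison discussed above can be illustrated with a minimal sketch. It assumes an unadjusted analysis in which the SNP effect is estimated separately in never-smokers and ever-smokers, and the multiplicative interaction is tested as the ratio of the two stratum-specific odds ratios with a Wald test. The counts are hypothetical and are not data from this study, and the function names are illustrative only; an adjusted analysis would typically use a product term in a logistic regression model instead.

```python
import math

def log_odds_ratio(cases_carrier, controls_carrier, cases_noncarrier, controls_noncarrier):
    """Return the log odds ratio for carriers vs noncarriers and its standard error (Woolf method)."""
    a, b, c, d = cases_carrier, controls_carrier, cases_noncarrier, controls_noncarrier
    estimate = math.log((a * d) / (b * c))
    std_err = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return estimate, std_err

def interaction_test(stratum_never, stratum_ever):
    """Wald test for a difference in SNP effect between two smoking strata.

    Returns the ratio of odds ratios (ever vs never), the z statistic,
    and a two-sided P value from the normal distribution.
    """
    est_never, se_never = log_odds_ratio(*stratum_never)
    est_ever, se_ever = log_odds_ratio(*stratum_ever)
    diff = est_ever - est_never
    z = diff / math.sqrt(se_never ** 2 + se_ever ** 2)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return math.exp(diff), z, p_value

# Hypothetical counts: (cases carrier, controls carrier, cases noncarrier, controls noncarrier)
never_smokers = (30, 60, 70, 40)
ever_smokers = (100, 110, 300, 290)

ratio_of_ors, z_stat, p_interaction = interaction_test(never_smokers, ever_smokers)
print(f"ratio of ORs = {ratio_of_ors:.2f}, z = {z_stat:.2f}, P = {p_interaction:.4f}")
```

With small strata, the standard errors grow quickly, which is one way to see why the power to detect interactions is limited.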

We also conducted exploratory CART analyses to assess potential higher order gene-gene interactions within the NER pathway genes. The process of tumorigenesis in sporadic cancers is a multifactorial and multistep process that involves complicated interactions of various low-penetrance genetic and environmental components. The CART analysis identified subsets of individuals with different cancer risks based on different combinations of genotypes, and the OR for individuals in each terminal node ranged from 0.42 to 2.58 (Fig. 1). A paired interaction analysis supported some of the SNP-SNP interactions. These data suggest that gene-gene interactions play an important role in bladder cancer etiology. Nevertheless, because CART analysis is a post-hoc data-mining tool that was applied to the same dataset, the results are preliminary and should be interpreted with caution.
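To make the idea of recursive partitioning concrete, the following is a minimal sketch of a single CART splitting step. It assumes a dominant coding (carrier vs noncarrier of the variant allele) and Gini impurity as the splitting criterion, which is a common default in CART software; the criterion and coding used in this study are not restated here. The SNP names and subjects are hypothetical toy data.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 case labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)

def best_split(subjects, snps):
    """Pick the SNP whose carrier/noncarrier split gives the lowest weighted Gini impurity.

    subjects: list of (genotypes, is_case), where genotypes maps a SNP name to
    the number of variant alleles (0, 1, or 2) and is_case is 1 or 0.
    Returns (snp_name, weighted_impurity), or None if no SNP splits the data.
    """
    best = None
    n = len(subjects)
    for snp in snps:
        carriers = [y for g, y in subjects if g[snp] > 0]
        noncarriers = [y for g, y in subjects if g[snp] == 0]
        if not carriers or not noncarriers:
            continue  # this SNP does not separate the subjects at all
        score = (len(carriers) * gini(carriers) + len(noncarriers) * gini(noncarriers)) / n
        if best is None or score < best[1]:
            best = (snp, score)
    return best

# Hypothetical toy data: 4 cases and 4 controls, 2 SNPs
toy_subjects = [
    ({"snp_a": 0, "snp_b": 0}, 1),
    ({"snp_a": 0, "snp_b": 1}, 1),
    ({"snp_a": 0, "snp_b": 0}, 1),
    ({"snp_a": 1, "snp_b": 1}, 0),
    ({"snp_a": 2, "snp_b": 0}, 0),
    ({"snp_a": 1, "snp_b": 1}, 0),
    ({"snp_a": 0, "snp_b": 0}, 0),
    ({"snp_a": 1, "snp_b": 1}, 1),
]

first_split = best_split(toy_subjects, ["snp_a", "snp_b"])
print(first_split)
```

A full tree repeats this step within each resulting subgroup until a stopping rule is met, and the terminal nodes are then compared with a reference node. Because the same data choose the splits and estimate the node-level risks, the resulting estimates tend to be optimistic, which is the reason for the caution noted above.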

The strengths of our study include a large and homogeneous study population, a comprehensive panel of genes in the relatively well characterized NER pathway, and the use of a haplotype-tagging SNP-based genotyping approach. A limitation of this study is that, although we applied a BFDP approach (1 of several available statistical methods) to control for multiple testing, it is possible that some of our reported SNPs are false-positive findings. The main reason for our choice of BFDP is that the noteworthy threshold defined by the BFDP approach accounts for the costs of false discovery and nondiscovery. The alternative approaches to correct for multiple testing, such as the Bonferroni correction for independent tests and the P(ACT) method to compute P values adjusted for correlated tests,40 do not consider the cost of nondiscovery and are more conservative. Regardless of the method used to correct for multiple testing, the ultimate way to eliminate false-positive findings is through independent validation. We reported both the SNPs that remained significant after multiple testing correction by the BFDP method and the nominally significant SNPs in main-effect and SNP-smoking interaction analyses. External validation in independent epidemiologic studies with adequate sample sizes is warranted to confirm the results from our study.
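The contrast between BFDP and a Bonferroni correction can be sketched as follows. This is a minimal illustration based on the commonly used approximate Bayes factor formulation of BFDP, applied to the odds ratio and confidence interval reported for rs11132186. The prior variance (chosen so that 95% of true odds ratios fall between roughly 0.67 and 1.5), the prior probability of association, and the cost ratio of 4 are assumptions made for this example only and are not necessarily the settings used in this study.

```python
import math

def se_from_ci(lower, upper, z_crit=1.96):
    """Standard error of the log odds ratio recovered from a 95% confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)

def approx_bayes_factor(odds_ratio, std_err, prior_var):
    """Approximate Bayes factor for the null vs the alternative hypothesis.

    Assumes a normal prior with mean 0 and variance prior_var on the log odds ratio.
    Values below 1 favor the alternative (a true association).
    """
    v = std_err ** 2
    z_squared = (math.log(odds_ratio) / std_err) ** 2
    shrinkage = prior_var / (v + prior_var)
    return math.sqrt((v + prior_var) / v) * math.exp(-0.5 * z_squared * shrinkage)

def bfdp(abf, prior_alt):
    """Bayesian false-discovery probability given the prior probability of a true association."""
    prior_odds_null = (1 - prior_alt) / prior_alt
    return abf * prior_odds_null / (abf * prior_odds_null + 1)

def noteworthy_threshold(cost_ratio):
    """BFDP cutoff when a false nondiscovery is cost_ratio times as costly as a false discovery."""
    return cost_ratio / (1 + cost_ratio)

# Reported estimate for rs11132186: OR 0.52 (95% CI, 0.32-0.83)
std_err = se_from_ci(0.32, 0.83)
prior_var = (math.log(1.5) / 1.96) ** 2   # assumed prior variance
abf = approx_bayes_factor(0.52, std_err, prior_var)

for prior_alt in (0.01, 0.05, 0.10):      # assumed prior probabilities
    print(f"prior = {prior_alt:.2f}, BFDP = {bfdp(abf, prior_alt):.3f}")
print(f"noteworthy if BFDP < {noteworthy_threshold(4):.2f}")   # assumed cost ratio of 4

# For comparison, a Bonferroni threshold across the 207 SNPs tested
bonferroni_alpha = 0.05 / 207
print(f"Bonferroni alpha = {bonferroni_alpha:.6f}")
```

The BFDP depends on the assumed prior probability, so noteworthiness should be read alongside the prior that was used. The Bonferroni threshold, by contrast, depends only on the number of tests and ignores the cost of missing a true association.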



This study was supported by National Cancer Institute grants CA131335, CA74880, CA91846, and CA127615.

