Nonsynonymous single nucleotide polymorphisms in DNA damage repair pathways and lung cancer risk




Several reports have revealed the association between single nucleotide polymorphisms (SNPs) and the development of cancer. Although many SNPs have been investigated, they were tested individually. In this study, nonsynonymous SNPs present in DNA damage response genes were comprehensively analyzed for lung cancer susceptibility.


The authors selected 37 nonsynonymous SNPs in 23 genes involved in DNA damage repair pathways. Fifty lung adenocarcinoma patients resected at their institution between 2002 and 2005 and 50 individuals without any known history of cancer were recruited for a case-control study.


Three variants (XRCC1 194Trp homozygotes, POLδ1 119His homozygotes, and RAD9 239Arg heterozygotes) tended to be associated with lung cancer risk. The authors then tested whether combinations of these 3 SNPs were significantly associated with the risk of lung cancer. Compared with carriers of either XRCC1 194Trp homozygote or RAD9 239Arg heterozygote variants, noncarriers were at a significantly decreased risk for lung cancer (odds ratio [OR], 0.282; 95% confidence interval [CI], 0.089-0.893). Similar results were found for the combination of POLδ1 119His homozygotes and RAD9 239Arg heterozygotes (OR, 0.277; 95% CI, 0.077-0.993). Moreover, compared with carriers that had at least 1 of the 3 variants, noncarriers showed the largest decrease in risk (OR, 0.263; 95% CI, 0.090-0.767).


Analysis of the presence of XRCC1 194Trp homozygote, POLδ1 119His homozygote, and RAD9 239Arg heterozygote variants revealed that their coassociation leads to a significant risk for the development of lung adenocarcinoma. Inclusive analyses of different SNPs were important in this cancer risk study. Cancer 2010. © 2010 American Cancer Society.

The accumulation of changes in cancer-related genes triggered by DNA damage, such as those arising from point mutations, gene amplifications, or translocations, can lead to carcinogenesis. DNA damage caused by irradiation or exposure to chemicals leads to the activation of DNA repair proteins that induce a series of reactions, including cell cycle arrest and subsequent DNA repair. Conversely, reduced or impaired activity of DNA damage repair proteins could lead to cell proliferation with the introduction of DNA replication errors, resulting in the generation and accumulation of mutations that contribute to carcinogenesis.

Previously, we examined the association between hRAD9, a component of the double strand break repair pathway, and lung cancer and reported that it plays an important role in cell cycle control of nonsmall cell lung cancer (NSCLC) cells and influences the NSCLC phenotype.1 Further studies demonstrated that a nonsynonymous single nucleotide polymorphism (SNP) of hRAD9 is closely related to lung cancer risk.2

Nonsynonymous SNPs encode variant residues in proteins that can alter their activity. Such changes in DNA damage repair genes can contribute to carcinogenesis by the mechanisms discussed above. Several recent reports have documented the association between SNPs and cancer risk.2-20 However, there are few reports concerning the combination of SNPs in different proteins and cancer risk.3-6 In this study, a detailed analysis of nonsynonymous SNPs in DNA damage repair genes was carried out to predict the risk for developing lung cancer.


Study Population

Lung cancer patients (N = 50), diagnosed with lung adenocarcinoma and treated by surgery, were recruited between 2002 and 2005 at our institution. In each case, the diagnosis of lung adenocarcinoma was confirmed histologically. As mentioned in our previous study,2 we investigated patients with lung adenocarcinoma because its origin might be affected more by genetic factors than by environmental factors, whereas squamous cell carcinoma appears to be strongly correlated with smoking. A portion of each tissue sample was frozen immediately after surgical resection and stored at −80°C until used. Controls (N = 50) were matched to cases by age, sex, ethnicity (all Japanese), smoking status, and total pack-years (1 pack-year defined as 20 cigarettes per day for 1 year) (Table 1). They were recruited from volunteers and cardiothoracic patients without any known history of cancer, from whom peripheral venous blood was collected. Informed consent for this study was obtained from each patient, and this study was approved by the ethical committees of Kobe University Hospital.

Table 1. Characteristics of Lung Cancer Cases and Controls
Characteristics | Lung Cancer (n=50) | Controls (n=50) | P
Sex (men/women) | 26/24 | 34/16 | .10
Mean age±SD | 68.2±8.8 | 65.8±13.0 | .27
Smoking status | | |
Former or current smokers | 28 | 26 |
Mean pack-years±SD | 25.5±29.2 | 22.1±28.6 | .54

SD indicates standard deviation.
P values were calculated using Student t test and chi-square tests for independent samples comparing cases and controls.
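The sex comparison in Table 1 can be checked from the published counts alone. The following is a minimal sketch, assuming a Pearson chi-square test without continuity correction (the paper does not say which variant StatView applied); the function name `chi_square_2x2` is ours.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Rows: cases, controls; columns: men, women (counts from Table 1).
stat, p = chi_square_2x2(26, 24, 34, 16)
print(f"chi-square = {stat:.3f}, P = {p:.3f}")
```

Under that assumption the P value comes out close to the .10 reported in the table.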

Selection of Genes and SNPs

SNPs are registered in the National Institute of Environmental Health Sciences (NIEHS) SNPs program.22 To evaluate the association between the risk of lung cancer and altered activities of DNA damage response proteins, we selected 37 nonsynonymous SNPs present in 23 different genes involved in base excision, double strand break, and mismatch repair pathways (Table 2). Nonsynonymous SNPs likely to alter protein structure or function can be predicted using bioinformatics approaches, and several tools have been developed to predict whether a nonsynonymous SNP compromises the activity of a protein.23-26 In this study, we used PolyPhen to evaluate the functional significance of SNPs, because this method has been used most frequently.23 On the basis of the PolyPhen prediction, SNPs predicted to be possibly or probably damaging were considered functional, whereas those predicted to be benign were classified as nonfunctional.27 Furthermore, we highlighted rare nonsynonymous SNPs, because it has been reported that the minor allele frequency is inversely related to the proportion of nonsynonymous SNPs predicted to be protein disturbing, and that targeting rare SNPs would be a better strategy for identifying causal SNPs than targeting common SNPs.27
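The functional/nonfunctional rule described above amounts to a simple mapping from the PolyPhen call to a label. A minimal sketch follows; the function name `classify_snp` and the exact call strings are illustrative assumptions, and the three example predictions are the ones the paper itself reports in its Discussion.

```python
# PolyPhen calls treated as "functional" under the rule stated in the text.
FUNCTIONAL_CALLS = {"possibly damaging", "probably damaging"}

def classify_snp(polyphen_call):
    """Return 'functional' or 'nonfunctional' for a PolyPhen prediction string."""
    call = polyphen_call.strip().lower()
    if call in FUNCTIONAL_CALLS:
        return "functional"
    if call == "benign":
        return "nonfunctional"
    raise ValueError(f"unrecognized PolyPhen call: {polyphen_call!r}")

# Predictions reported in the paper for the SNPs used in the combination analysis.
predictions = {
    "RAD9 His239Arg": "possibly damaging",
    "XRCC1 Arg194Trp": "possibly damaging",
    "POLδ1 Arg119His": "benign",
}
labels = {snp: classify_snp(call) for snp, call in predictions.items()}
for snp, label in labels.items():
    print(f"{snp}: {label}")
```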

Table 2. Target SNPs
  1. SNP indicates single nucleotide polymorphism.

Base Excision Repair Pathway (19 SNPs in 12 Genes)
ADPRT(Ser383Tyr), LIG1(Thr614Ile), MBD4(Asp568His), POLI(Arg71Gly), POLI(His449Arg), POLI(Phe507Ser), POLI(Cys535Arg), POLβ(Pro242Arg), POLδ1(Arg119His), POLδ1(Arg177His), RFC1(Gln954Lys), RFC2(Ala198Val), RFC5(Ala13Thr), SMUG1(Gly15Val), SMUG1(Arg105Trp), TDG(Gly199Ser), XRCC1(Pro161Leu), XRCC1(Arg194Trp), XRCC1(Tyr576Ser)
Double Strand Break Repair Pathway (15 SNPs in 9 Genes)
BRCA1(Gln356Arg), CHK2(Phe447Ile), CHK2(Pro85Leu), MRE11A(Asp468Gly), MRE11A(Met698Val), NFκB1(Met506Val), NFκB1(His711Gln), NFκB2(Gly351Arg), RAD9(Cys3Phe), RAD9(His239Arg), RAD17(Arg476Leu), RAD51C(Ile144Thr), RAD51C(Arg249Cys), RAD51C(Thr287Ala), RAD51LC(Glu233Gly)
Mismatch Repair Pathway (3 SNPs in 2 Genes)
MSH3(Phe712Leu), PMS1(Gly501Arg), PMS1(Tyr793His)
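The per-pathway totals in Table 2 can be confirmed by parsing the entries directly. This is a small sketch for checking the bookkeeping; the regular expression and the `tally` helper are ours and assume every entry has the form GENE(RefNNNAlt).

```python
import re

TABLE_2 = {
    "base excision": (
        "ADPRT(Ser383Tyr), LIG1(Thr614Ile), MBD4(Asp568His), POLI(Arg71Gly), "
        "POLI(His449Arg), POLI(Phe507Ser), POLI(Cys535Arg), POLβ(Pro242Arg), "
        "POLδ1(Arg119His), POLδ1(Arg177His), RFC1(Gln954Lys), RFC2(Ala198Val), "
        "RFC5(Ala13Thr), SMUG1(Gly15Val), SMUG1(Arg105Trp), TDG(Gly199Ser), "
        "XRCC1(Pro161Leu), XRCC1(Arg194Trp), XRCC1(Tyr576Ser)"
    ),
    "double strand break": (
        "BRCA1(Gln356Arg), CHK2(Phe447Ile), CHK2(Pro85Leu), MRE11A(Asp468Gly), "
        "MRE11A(Met698Val), NFκB1(Met506Val), NFκB1(His711Gln), NFκB2(Gly351Arg), "
        "RAD9(Cys3Phe), RAD9(His239Arg), RAD17(Arg476Leu), RAD51C(Ile144Thr), "
        "RAD51C(Arg249Cys), RAD51C(Thr287Ala), RAD51LC(Glu233Gly)"
    ),
    "mismatch": "MSH3(Phe712Leu), PMS1(Gly501Arg), PMS1(Tyr793His)",
}

# Gene name, then a three-letter residue, a position, and a three-letter residue.
ENTRY = re.compile(r"([^\s,()]+)\(([A-Za-z]{3}\d+[A-Za-z]{3})\)")

def tally(snp_list):
    """Return (number of SNPs, number of distinct genes) for one pathway."""
    entries = ENTRY.findall(snp_list)
    return len(entries), len({gene for gene, _ in entries})

counts = {pathway: tally(snps) for pathway, snps in TABLE_2.items()}
total_snps = sum(n_snps for n_snps, _ in counts.values())
total_genes = sum(n_genes for _, n_genes in counts.values())
print(counts, total_snps, total_genes)
```

The totals agree with the 37 SNPs in 23 genes stated in the text.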

Polymerase Chain Reaction and Sequencing

Genomic DNA was extracted from the frozen specimens or peripheral venous blood using DNA extraction kits (Qiagen, Tokyo, Japan). Polymerase chain reactions (PCRs) were performed to amplify regions (about 500-1000 bp) that included the nonsynonymous SNPs. For this purpose, DNA (100 ng) and primers (200 nM) were mixed with PCR reagents according to the manufacturer's instructions (Accuprime SuperMix II, Invitrogen, Carlsbad, Calif). Amplification was carried out for 35 cycles of 94°C for 30 seconds (denaturation), 55°C for 30 seconds (annealing), and 68°C for 2 minutes (extension). The PCR products were purified using a Qiagen purification kit and then mixed with sequencing primers (Table 3). Sequencing analyses were performed by Bio Matrix Research Inc. (Chiba, Japan).
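For readers who want the cycling conditions in machine-readable form, the program can be written down as a short data structure. This is only a sketch: the `Step` class is a hypothetical container, and the total covers the programmed holds stated in the text, not ramp times or any initial denaturation or final extension, which the paper does not describe.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int
    seconds: int

CYCLES = 35
PROGRAM = [
    Step("denaturation", 94, 30),
    Step("annealing", 55, 30),
    Step("extension", 68, 120),
]

# Programmed hold time only; ramping between temperatures is not included.
hold_seconds = CYCLES * sum(step.seconds for step in PROGRAM)
print(f"{CYCLES} cycles, {hold_seconds / 60:.0f} min of programmed holds")
```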

Table 3. List of DNA Primers Used for PCR and Sequencing in This Study
SNP | Forward Primer | Reverse Primer
  1. PCR indicates polymerase chain reaction; SNP, single nucleotide polymorphism.


Statistical Analysis

Statistical analyses were carried out using StatView software (version 5; SAS Institute, Cary, NC). Student t test and chi-square tests were used to compare sex, age, and smoking status between the cancer and noncancer groups and thereby assess the validity of the case-control design. Multiple logistic regression analysis was carried out to evaluate the association between genotype and lung cancer risk, expressed as odds ratios (ORs) with 95% confidence intervals (CIs), using wild type homozygotes as the reference. Age, sex, and smoking pack-years were included as covariates.
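The group comparisons of continuous variables can be approximated from the summary statistics in Table 1. The sketch below assumes a pooled-variance Student t test and uses a normal approximation for the two-sided P value (reasonable at 98 degrees of freedom, but not exact); the helper name `pooled_t` is ours, and small differences from the published P values are expected because the raw data are not available.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student t statistic (pooled variance) and degrees of freedom from summary data."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df

def approx_two_sided_p(t):
    """Normal approximation to the two-sided P value of a t statistic."""
    return math.erfc(abs(t) / math.sqrt(2))

# Cases vs controls, values from Table 1.
t_age, df = pooled_t(68.2, 8.8, 50, 65.8, 13.0, 50)
t_pack, _ = pooled_t(25.5, 29.2, 50, 22.1, 28.6, 50)
p_age = approx_two_sided_p(t_age)
p_pack = approx_two_sided_p(t_pack)
print(f"age: t = {t_age:.2f}, P ~ {p_age:.2f}; pack-years: t = {t_pack:.2f}, P ~ {p_pack:.2f}")
```

Both approximate P values are well above .05, in line with the nonsignificant differences reported in Table 1.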


No significant differences were noted in the distribution of age, sex, ethnicity, smoking status, and total smoking pack-years between cancer patients and controls. We investigated 37 nonsynonymous SNPs present in lung cancer patients, but no variants were detected for 27 of them. Furthermore, 6 SNPs could not be genotyped using our methods. The variant alleles detected among our analyzed samples were the XRCC1 194Trp, TDG 199Ser, RAD9 239Arg, and POLδ1 119His alleles. The frequencies and distribution of the genotypes and the odds ratios for the association of each polymorphism with the risk of lung cancer are described in Table 4. No significant association was found between any single polymorphism and the risk of lung cancer.

Table 4. Genotype Frequencies of SNPs in Damage Repair Genes and Risk of Lung Adenocarcinoma
Genes and SNPs | Genotype | Controls | Cases | OR | 95% CI
  1. OR indicates odds ratios; CI, 95% confidence interval.

  2. ORs are adjusted for age, sex, and smoking pack-years.


We then chose 3 variants that tended to be associated with lung cancer risk (XRCC1 194Trp homozygote, POLδ1 119His homozygote, and RAD9 239Arg heterozygote) and investigated the association between various combinations of these 3 polymorphisms and the risk of lung cancer (Table 5). XRCC1 194Trp homozygote or POLδ1 119His homozygote carriers were grouped and compared with those that carried neither variant. Noncarriers showed a decreased risk for lung cancer compared with the carriers, but the difference was not significant (OR, 0.275; 95% CI, 0.068-1.109). The same analysis was performed for the remaining 2 combinations: XRCC1 194Trp homozygote and RAD9 239Arg heterozygote, and POLδ1 119His homozygote and RAD9 239Arg heterozygote. In both, noncarriers were at a significantly decreased risk compared with carriers (OR, 0.282; 95% CI, 0.089-0.893 and OR, 0.277; 95% CI, 0.077-0.993, respectively). Finally, compared with carriers who had at least 1 of the 3 variants (XRCC1 194Trp homozygote, POLδ1 119His homozygote, and RAD9 239Arg heterozygote), noncarriers showed the lowest risk of developing lung cancer (OR, 0.263; 95% CI, 0.090-0.767).

Table 5. Combination of Variants and Risk of Lung Adenocarcinoma
Genes and SNPs | Controls | Cases | OR | 95% CI
XRCC1/POLδ1/RAD9 | 6 | 17 | 1.00 | Reference

SNP indicates single nucleotide polymorphism; OR, odds ratio adjusted for age, sex, and pack-years; CI, 95% confidence interval; XRCC1, carrier of XRCC1 homozygote variant; POLδ1, carrier of POLδ1 homozygote variant; RAD9, carrier of RAD9 heterozygote variant; XRCC1/POLδ1, carrier of XRCC1 homozygote or POLδ1 homozygote variant; XRCC1/RAD9, carrier of XRCC1 homozygote or RAD9 heterozygote variant; XRCC1/POLδ1/RAD9, carrier that had any of the variants (XRCC1 homozygote, POLδ1 homozygote variant, or RAD9 heterozygote variant).
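As a rough check on the any-variant comparison, an unadjusted odds ratio can be computed from the carrier counts in the reference row of Table 5. The sketch below assumes that row reads 6 controls and 17 cases, derives noncarrier counts by subtraction from 50 per group, and uses a Woolf (log-scale) interval. It is not the authors' adjusted logistic regression, so it should only approximate the published OR of 0.263 (95% CI, 0.090-0.767).

```python
import math

def odds_ratio_ci(group_cases, group_controls, ref_cases, ref_controls, z=1.96):
    """Unadjusted odds ratio of a group vs a reference, with a Woolf 95% CI."""
    or_ = (group_cases * ref_controls) / (group_controls * ref_cases)
    se = math.sqrt(1 / group_cases + 1 / group_controls + 1 / ref_cases + 1 / ref_controls)
    lo = math.exp(math.log(or_) - z * se)
    hi = math.exp(math.log(or_) + z * se)
    return or_, lo, hi

N_CASES = N_CONTROLS = 50
carrier_cases, carrier_controls = 17, 6      # assumed reading of the Table 5 reference row
noncarrier_cases = N_CASES - carrier_cases   # 33
noncarrier_controls = N_CONTROLS - carrier_controls  # 44

# Noncarriers compared with carriers (carriers are the reference, OR = 1.00).
or_, lo, hi = odds_ratio_ci(noncarrier_cases, noncarrier_controls,
                            carrier_cases, carrier_controls)
print(f"unadjusted OR = {or_:.3f} (95% CI, {lo:.3f}-{hi:.3f})")
```

The unadjusted estimate lands close to the adjusted value, which suggests that adjustment for age, sex, and pack-years changed the estimate only slightly.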

Furthermore, we analyzed the combination of these polymorphisms in smokers and nonsmokers separately to avoid overlap between genetic and environmental factors in cancer development. Among the smokers, no significant associations were found. However, significant associations were detected among the nonsmokers (Table 6).

Table 6. Combination of Variants and Risk of Lung Adenocarcinoma
Nonsmokers (n=46)
Genes and SNPs | Control (n=24) | Cases (n=22) | OR | 95% CI
XRCC1/POLδ1/RAD9 | 2 | 9 | 1.00 | Reference

Smokers (n=54)
Genes and SNPs | Control (n=26) | Cases (n=28) | OR | 95% CI
XRCC1/POLδ1/RAD9 | 4 | 8 | 1.00 | Reference

SNP indicates single nucleotide polymorphism; OR, odds ratio adjusted for age, sex, and pack-years; CI, confidence interval; XRCC1, carrier of XRCC1 homozygote variant; NA, not applicable to the statistical analysis; POLδ1, carrier of POLδ1 homozygote variant; RAD9, carrier of RAD9 heterozygote variant; XRCC1/POLδ1, carrier of XRCC1 homozygote or POLδ1 homozygote variant; XRCC1/RAD9, carrier of XRCC1 homozygote or RAD9 heterozygote variant; XRCC1/POLδ1/RAD9, carrier that had any of the variants (XRCC1 homozygote, POLδ1 homozygote variant, or RAD9 heterozygote variant).
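The contrast between smokers and nonsmokers can be illustrated with stratum-specific unadjusted odds ratios. This sketch assumes the reference rows of Table 6 read 2 controls and 9 cases among nonsmokers and 4 controls and 8 cases among smokers, with noncarrier counts obtained by subtraction from the stratum sizes. The Woolf interval and the helper `woolf_or` are our choices, not the adjusted model used in the paper.

```python
import math

def woolf_or(group_cases, group_controls, ref_cases, ref_controls, z=1.96):
    """Unadjusted odds ratio vs a reference group with a log-scale 95% CI."""
    or_ = (group_cases * ref_controls) / (group_controls * ref_cases)
    se = math.sqrt(sum(1 / x for x in (group_cases, group_controls, ref_cases, ref_controls)))
    return or_, math.exp(math.log(or_) - z * se), math.exp(math.log(or_) + z * se)

# Per stratum: total cases, total controls, carrier cases, carrier controls.
STRATA = {
    "nonsmokers": {"cases": 22, "controls": 24, "carrier_cases": 9, "carrier_controls": 2},
    "smokers": {"cases": 28, "controls": 26, "carrier_cases": 8, "carrier_controls": 4},
}

results = {}
for name, s in STRATA.items():
    noncarrier_cases = s["cases"] - s["carrier_cases"]
    noncarrier_controls = s["controls"] - s["carrier_controls"]
    or_, lo, hi = woolf_or(noncarrier_cases, noncarrier_controls,
                           s["carrier_cases"], s["carrier_controls"])
    # An interval that does not contain 1 corresponds to significance at the .05 level.
    results[name] = {"or": or_, "ci": (lo, hi), "ci_excludes_1": hi < 1 or lo > 1}
    print(f"{name}: OR = {or_:.3f} (95% CI, {lo:.3f}-{hi:.3f})")
```

With these counts, the interval excludes 1 only among nonsmokers, which matches the pattern described in the text.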


We previously reported that the His239Arg SNP of hRAD9 was associated with the development of lung adenocarcinoma.2 However, using this SNP alone, we could not define the associated risk of carcinogenesis for all of the lung adenocarcinoma patients examined. Therefore, we increased the number of putative target SNPs to include a broader range of DNA damage response genes and to evaluate their effects on cancer susceptibility more precisely.

In this study, we failed to detect a significant association between any single SNP and the risk of lung cancer, but did detect a significant association when we evaluated the risk associated with combinations of 2 or 3 SNPs. Because many of the DNA damage response proteins function in an intracellular signaling pathway, an alteration in the activity of any 1 protein could compromise the efficiency of the overall pathway.

For these reasons, it is likely that the combined analyses of the multiple SNPs present in distinct DNA damage repair genes contributed to the detection of a more significant link to carcinogenesis.

According to Build 126 of the dbSNP database, there are >56,000 nonsynonymous SNPs in the human genome.27, 28 Because almost all SNPs arise by independent events, analyses of multiple SNPs, as carried out in Table 5, should be used to evaluate whether their summed effects contribute to the susceptibility to a specific disease. Among the 50 cases and 50 controls analyzed in this study, no individual had all 3 variants, and only 1 case had 2 of the 3 variants (XRCC1 194Trp homozygote and RAD9 239Arg heterozygote). This suggests that any 1 of these 3 variants is enough to expose carriers to the risk of lung cancer. Coexistence of these SNPs might further reduce or impair the activity of DNA damage repair proteins and lead to early onset of various cancers or multiple primary cancers. However, in this study, the patient who had both the XRCC1 194Trp homozygote and RAD9 239Arg heterozygote variants was not young (60 years old) and did not have multiple cancers. Further studies are essential to clarify the relationship between the simultaneous presence of these SNPs and cancer susceptibility.

In the analysis of SNP combinations, XRCC1 194Trp and POLδ1 119His homovariants were examined, because we expected a more significant change in the activity of the homovariant proteins compared with that of heterovariants. In contrast, RAD9 239Arg was detected only in the heterozygous state; for this reason, the heterovariant was included in the combination analysis. In fact, the frequency of this SNP in the RAD9 gene was quite low, and a homovariant was not detected in the NIEHS database program.21 Potentially, variants rarely detected in the normal population might be candidates that lead to the formation of proteins with aberrant functions. It has been suggested that rare SNPs are major contributors to the increase in susceptibility to diseases, including cancers. Hence, examining such SNPs in case-control association studies might have more clinical relevance than examining common SNPs.27

The frequency of RAD9 239Arg is quite low: the NIEHS SNPs program registry lists it as 0.01. In addition, RAD9 239Arg was predicted to be possibly damaging by PolyPhen. For this reason, we started our investigation with nonsynonymous SNPs that had a small minor allele frequency and were predicted to be probably or possibly damaging by PolyPhen. However, we could not find other SNPs in our cases and controls under the conditions used to detect RAD9 239Arg. Therefore, the criteria for selection of SNPs were expanded step by step to increase the number of SNPs. As a result, we found additional SNPs for our analysis (XRCC1 194Trp was predicted to be possibly damaging, and its frequency was 0.12; POLδ1 119His was predicted to be benign, and its frequency was 0.14). Although our statistical analyses suggested that these nonsynonymous SNPs might affect lung cancer development, further biological investigations are required to clarify the functional aspects of the variant proteins translated from them.
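The registry frequencies quoted above help explain why no RAD9 239Arg homovariant was observed. As an illustration only, the sketch below assumes Hardy-Weinberg proportions, which the paper does not test, and applies the NIEHS frequencies to a sample of 100 individuals, although those frequencies need not match the Japanese study population.

```python
def hardy_weinberg(q):
    """Expected genotype frequencies for a variant allele of frequency q."""
    p = 1 - q
    return {
        "wild type homozygote": p * p,
        "heterozygote": 2 * p * q,
        "variant homozygote": q * q,
    }

# Minor allele frequencies as listed in the text (NIEHS SNPs program registry).
ALLELE_FREQ = {"RAD9 239Arg": 0.01, "XRCC1 194Trp": 0.12, "POLδ1 119His": 0.14}
SAMPLE_SIZE = 100  # 50 cases plus 50 controls

expected = {
    snp: {genotype: freq * SAMPLE_SIZE for genotype, freq in hardy_weinberg(q).items()}
    for snp, q in ALLELE_FREQ.items()
}
for snp, counts in expected.items():
    print(snp, {genotype: round(n, 2) for genotype, n in counts.items()})
```

Under these assumptions, about 2 RAD9 239Arg heterozygotes and essentially no homovariants would be expected in 100 individuals, so analyzing the heterovariant is the only practical option at this sample size.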

The accumulation of mutations and formation of chromosomal aberrations are distinct features of cancer cells. DNA repair and cell cycle checkpoint pathways are important mechanisms for maintaining genomic integrity and avoiding the accumulation of mutations and consequent carcinogenesis. Reduced or impaired activity of DNA damage repair proteins could lead to cell proliferation with the introduction of DNA replication errors, resulting in the generation and accumulation of mutations that contribute to carcinogenesis. In particular, an inherited reduction in the function of these proteins might cause cancer after long-term exposure to environmental risk factors such as air pollution. In fact, as a result of the worldwide antismoking campaign, the number of adenocarcinoma patients with no smoking history has increased, whereas the number of squamous cell carcinoma patients with no smoking history has decreased. Our results may explain at least part of the mechanism of lung cancer development in the nonsmoking population.

In contrast with smokers, significant associations between a combination of the SNPs and lung cancer development were detected among the nonsmokers. This result suggests that smoking was a much stronger risk factor that masked the effect of these SNPs on lung cancer susceptibility. Smoking could lead to cancer because of its toxic effect regardless of whether individuals have these SNPs, whereas the SNPs had an effect on cancer development only in individuals exposed to relatively minor environmental factors. Examination of these SNPs would therefore be more useful for predicting lung cancer susceptibility in nonsmokers.

Although there have been many studies analyzing the relationship between SNPs and cancer,7-11, 13, 15-17, 19 only a few of these involved the combination of multiple SNPs to evaluate this relationship.3-6 The methods used in those studies, however, differed from ours. As noted above, we examined the contribution of multiple SNPs to cancer susceptibility.

In this study, the sample size was small, because our focus was to analyze as many kinds of SNPs as possible. Our approach revealed a statistically significant relationship, although it involved a relatively small population. In addition, this study was based on bioinformatics and clinical data. Further studies including molecular studies are essential to confirm the validity of our results. In the future, further analyses should be undertaken with larger sample sizes and other histological subtypes such as squamous cell carcinoma.

In conclusion, the XRCC1 194Trp homozygote, POLδ1 119His homozygote, and RAD9 239Arg heterozygote variants are implicated in the risk of developing lung adenocarcinoma, particularly in the nonsmoking population. We will use this result for early lung cancer screening of nonsmokers. These findings suggest that the combined analysis of multiple SNPs should be used to investigate their contribution to cancer susceptibility.


Supported by grant 18591547 from the Japan Society for the Promotion of Science (to Y.M.).