A recent candidate gene association study identified a single nucleotide polymorphism (SNP) in the PPP2R2B gene (rs319217, A/G) that manifests allelic differences in the cellular responses to treatment with chemotherapeutic agents (Vazquez et al., Nat Rev Drug Discov 2008;7:979-87). This gene encodes a regulatory subunit of protein phosphatase 2A (PP2A), one of the major Ser/Thr phosphatases implicated in the negative control of cell growth and division. Given the tumor suppressor activities of PP2A, here we evaluate whether this genetic variant associates with the age of diagnosis and recurrence of breast cancer in women. To investigate the linkage disequilibrium in the vicinity of this SNP, PPP2R2B haplotypes were analyzed using HapMap data for 90 Caucasians. It is found that the A variant of rs319217 tags a haplotype that appears to be under positive selection in the Caucasian population, implying that this SNP is functional. Subsequently, associations with cellular responses were investigated using data reported by the NCI anticancer drug screen and associations with breast cancer clinical variables were analyzed in a cohort of 819 Caucasian women. The A allele associates with a better response of tumor derived cell lines, lower risk of breast cancer recurrence, later time to recurrence, and later age of diagnosis of breast cancer in Caucasian women. Taken together these results indicate that the A variant of the rs319217 SNP is a marker of better prognosis in breast cancer.
Several studies support the hypothesis that single nucleotide polymorphisms (SNPs) can affect the predisposition to cancer development and the response to cancer therapy1–13 (cancer functional SNPs, or functional SNPs for short). However, the identification of all or most functional SNPs is a challenging problem given the large number of genetic variations in the human population. Genome-wide association studies, designed to screen for potential genetic associations, are also limited, both by the assumptions implicit in such analyses and because the associated loci may not themselves be functional but merely linked to the functional locus.
Recently we performed a candidate gene association study to uncover functional SNPs using data generated by the NCI anticancer drug screen (NCI60 screen).14 Specifically, statistical computing methods were developed to analyze correlations between the response of tumor derived cell lines to standard chemotherapeutic agents and the genotype of SNPs within candidate genes in the p53 pathway.15 The analysis identified six SNPs with significant genotype-drug response associations. Genetic variants of two of these SNPs, residing in the YWHAQ and CD44 genes, have been recently shown to associate with cancer risk and response to chemotherapy in patients with soft tissue sarcomas,16 thus validating the NCI60 candidate gene study.
The analysis of the NCI60 data also predicted a functional SNP in the PPP2R2B gene, encoding for a regulatory subunit of protein phosphatase 2A (PP2A). PP2A is a ubiquitously expressed heterotrimeric protein that accounts for a large fraction of phosphatase activity in eukaryotic cells17 and it has been implicated in breast tumorigenesis and progression. Phosphorylation of PP2A is associated with progression of breast tumors.18 The inactivating glycine 90 to aspartate somatic mutation in the structural subunit encoded by PPP2R1B, observed at higher frequency in breast cancer patients, results in reduced protein function.19 Both tumorigenicity and functional haploinsufficiency have been attributed to mutation in the A subunit of PP2A where mutation in this scaffold subunit promotes degradation of the regulatory subunit.20, 21 Furthermore, a case-control study using haplotype analysis of tagged SNPs in PPP2R1A and PPP2R2A demonstrated associations protective for breast cancer and modified risk for women with proliferative breast lesions.22 However, this study did not include evaluation of PPP2R2B. Therefore, we further investigate the hypothesis that SNPs in PPP2R2B will associate with breast cancer phenotypes, focusing on a putative functional genetic variant in the PPP2R2B gene.
Material and Methods
The haplotype analysis was based on HapMap genotypes for 90 Caucasians of Northern and Western European ancestry (HapMap-CEU).23 Genotypes for 500 SNPs within PPP2R2B were available. Among them, 43 SNPs with genotype calls having relative mutual information of 0.7 or higher with the rs319217 genotypes were selected. The ancestral allele information was available from dbSNP.24 The relative mutual information, $M_{ij} = I_{ij}/I_{ii}$, is a measure of linkage disequilibrium, defined as the ratio between the mutual information, $I_{ij} = \sum_{a,b} (m_{ij,ab}/n) \ln[n\, m_{ij,ab}/(g_{i,a}\, g_{j,b})]$, of a probe SNP ($j$) and a reference SNP ($i$) (here rs319217) and the self-mutual information ($I_{ii}$) of the reference SNP ($i$), where $n$ is the number of samples, $g_{i,a}$ is the number of samples with genotype $a$ at locus $i$, and $m_{ij,ab}$ is the number of samples with genotype $a$ at locus $i$ and genotype $b$ at locus $j$. The relative mutual information takes values between 0 (independent SNPs) and 1 (perfectly linked SNPs). Based on the genotypes for the selected 43 SNPs, the associated haplotypes were estimated using the SNPHAP program (http://www-gene.cimr.cam.ac.uk/clayton/software/).
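As an illustration, the relative mutual information defined above could be computed along the following lines. This is a minimal sketch and not the code used in the study: the function names are hypothetical, genotypes are assumed to be given as one label per sample, and the toy genotype lists in the usage note are invented.

```python
import math
from collections import Counter


def mutual_information(ref, probe):
    """I_ij = sum over (a, b) of (m_ab / n) * ln[n * m_ab / (g_ref_a * g_probe_b)].

    ref and probe are equal-length sequences of genotype labels, one per sample.
    Only genotype pairs that are actually observed contribute to the sum.
    """
    if len(ref) != len(probe):
        raise ValueError("genotype lists must have the same length")
    n = len(ref)
    g_ref = Counter(ref)            # g_{i,a}: samples with genotype a at the reference SNP
    g_probe = Counter(probe)        # g_{j,b}: samples with genotype b at the probe SNP
    m = Counter(zip(ref, probe))    # m_{ij,ab}: samples with the genotype pair (a, b)
    return sum(
        (m_ab / n) * math.log(n * m_ab / (g_ref[a] * g_probe[b]))
        for (a, b), m_ab in m.items()
    )


def relative_mutual_information(ref, probe):
    """M_ij = I_ij / I_ii, with the reference SNP supplying the normalization."""
    i_ii = mutual_information(ref, ref)
    if i_ii == 0:
        raise ValueError("reference SNP is monomorphic in these samples")
    return mutual_information(ref, probe) / i_ii
```

With invented data, a probe SNP whose genotypes mirror the reference SNP (for example `["AA", "AG", "GG"]` against `["CC", "CT", "TT"]`) gives a value of 1, while a probe whose genotypes are distributed independently of the reference gives 0. Under the selection rule described above, a probe SNP would be retained when this value is 0.7 or higher.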
Cellular drug responses
The mutational status of p53, the genotypes of 109,687 SNPs (Affymetrix 125K chip) and the GI50 data for the NCI60 cell panel were obtained from the NCI/NIH Developmental Therapeutics Program web site, http://www.dtp.nci.nih.gov. The genotypes of the rs319217 SNP in the NCI60 samples were determined using accurate allelic discrimination assays (Applied Biosystems).16 A univariate test was undertaken for 132 drugs to evaluate allelic differences in the GI50s. Specifically, the average log GI50 [$X = -\log_{10}(\mathrm{GI50})$] was calculated for the cells with each of the three genotypes of a given locus (AA, Aa and aa), for cells either wild-type or mutant for p53. Subsequently, the probability (p value) was computed that, just by chance, the difference for each of the following groupings was equal to or larger than the actual measurement: (a) $X_{a} - X_{AA}$, (b) $X_{aa} - X_{A}$, (c) $X_{AA} - X_{a}$, (d) $X_{A} - X_{aa}$, (e) [$X_{aa} - X_{aA}$ and $X_{aA} - X_{AA}$] and (f) [$X_{AA} - X_{aA}$ and $X_{aA} - X_{aa}$]. These probabilities were estimated using a permutation test ($10^{6}$ permutations) that preserved the allele or genotype group sizes but permuted the samples among the groups. Results with p < 0.05 were considered significant and those with p < 0.1 marginally significant. A multiple hypothesis test was performed for allelic differences in the GI50s across the entire panel of drugs. This test took advantage of the fact that 132 well-characterized compounds were tested against the NCI60 cell panel, which provided a set of independent measurements. A Fisher's exact test was used to compute the statistical significance of observing h univariate hits for a SNP on a total of D = 132 drugs, given that overall H significant hits are observed after testing S reference SNPs on the D drugs. All 109,687 Affymetrix genotyped SNPs were chosen as a reference set.
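The size-preserving permutation test described above can be sketched as follows for a single two-group comparison. This is an illustrative sketch under stated assumptions, not the study's implementation: the function name, the default number of permutations, the fixed random seed and the small floating-point tolerance are all choices made here for the example, and the p value is taken as the plain fraction of permutations that reach the observed difference.

```python
import random
from statistics import fmean


def permutation_p_value(group_x, group_y, n_permutations=10_000, seed=0):
    """One-sided permutation p value for mean(group_x) - mean(group_y).

    Samples are shuffled among the two groups while the group sizes are kept
    fixed. The returned value is the fraction of permutations whose difference
    in means is equal to or larger than the observed difference.
    """
    rng = random.Random(seed)
    observed = fmean(group_x) - fmean(group_y)
    pooled = list(group_x) + list(group_y)
    n_x = len(group_x)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = fmean(pooled[:n_x]) - fmean(pooled[n_x:])
        # Small tolerance so that ties are not lost to floating-point rounding.
        if diff >= observed - 1e-12:
            hits += 1
    return hits / n_permutations
```

Here `group_x` and `group_y` would hold the −log10(GI50) values of the cell lines in the two genotype groups being compared, for example carriers of one allele against homozygotes for the other. The study used many more permutations than the default chosen here.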
Breast cancer cohort
The case cohort consisted of 819 Caucasian women diagnosed with breast cancer. It was derived from patients evaluated at The Cancer Institute of New Jersey who were invited to participate in this study from 2004 to 2009. Greater than 95% of eligible individuals gave consent for participation. Eligibility included being at least 18 years of age and having a history of biopsy-proven breast cancer verified by pathology records and confirmed on review by our institutional breast pathologist. Samples were collected retrospectively for cases diagnosed before 2004. Asynchronous primary breast tumors and recurrent tumors were not considered in age at diagnosis analyses. For those women with more than one primary, age of diagnosis analysis was performed based on the first primary. In 5% of cases, slides were not available for review and pathological features were based on available pathology reports from other institutions. Clinical information was abstracted through chart review. Estrogen receptor alpha and progesterone receptor were measured by immunohistochemistry for over 99% of tumors and were considered negative if staining was less than 10%. For those measured by protein, tumors with less than 5 fmol/mg were considered negative. BRCA1/2 testing was performed where clinically indicated through Myriad Genetic Laboratories using standard assays including full sequencing and rearrangement tests unless otherwise indicated. Patients with known BRCA1/2 mutations were excluded from age and recurrence association analyses due to potential confounding bias. Time to recurrence was defined as the time between the date of biopsy-proven diagnosis and the date of biopsy-proven recurrent disease. Local, regional and distant recurrences were defined as biopsy-proven recurrence in-breast, in lymph node basins, or in other organs beyond the breast or lymph nodes, respectively. Patients with stage IV disease at diagnosis were excluded from the recurrence analysis.
DCIS was not included in analyses of recurrence as a function of chemotherapy, because chemotherapy is not standard of care for DCIS. Because there were no associations between genotype and risk of noninvasive vs. invasive ductal carcinomas, DCIS was included in the analysis of risk of recurrence, particularly as several individuals experienced distant recurrence in the absence of evidence of local or regional recurrence. Investigations were performed with approval by the University of Medicine and Dentistry of New Jersey/Robert Wood Johnson Medical School Institutional Review Board. Demographics of the breast cancer cohort are depicted in Table 1.
Table 1. Demographics of breast cancer cohort1
Genomic DNA was extracted from 1 ml of peripheral blood, obtained through venipuncture, using a spin column-based method according to manufacturer's protocol (QIAGEN, Valencia, CA). Genotyping for the human PPP2R2B SNP (rs319217) was performed using an Applied Biosystems TaqMan assay on the ABI 7000 Sequence Detection System (Applied Biosystems, Foster City, CA). Briefly, reactions were performed with 5–10 ng genomic DNA in a 20 μl vol. PCR cycling conditions were 50°C for 2 min, 95°C for 10 min, followed by 40 cycles of 92°C for 15 s and 60°C for 1 min. The assay failed in <1% of cases. The alleles at the locus are A and G where G is defined as the ancestral allele.
Statistical analysis of the clinical data
A permutation test was used to determine the statistical significance of the observed difference in the average age of tumor diagnosis between patients carrying the A allele and patients homozygous for the G allele (GG). The statistical significance of the enrichment of GG carriers among patients manifesting recurrence events was computed using Fisher's exact test. The odds ratio (OR) estimates were computed using, first, a Bayesian estimate of the probability density function of the fraction of GG homozygotes in patients manifesting recurrence events, and second, a binomial test to compute the probability of observing (n) GG genotypes in (N) patients manifesting recurrence events. The analysis of recurrence-free survival was performed using Kaplan-Meier analysis and Cox's multivariate proportional hazards regression model with the SPSS 17.0 software (SPSS, Chicago, IL).
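To make the enrichment test concrete, a one-sided Fisher's exact test on a 2 × 2 table of genotype (GG vs. A carrier) against recurrence status can be written directly from the hypergeometric distribution. The sketch below is illustrative only: the function names are hypothetical, and the plain sample odds ratio shown is a simpler stand-in for, not a reproduction of, the Bayesian odds ratio estimate described above.

```python
from math import comb


def fisher_exact_enrichment(gg_recur, gg_no_recur, other_recur, other_no_recur):
    """One-sided Fisher's exact p value for enrichment of GG among recurrences.

    With all table margins fixed, this is the hypergeometric probability of
    observing gg_recur or more GG carriers among the patients with recurrence.
    """
    n_gg = gg_recur + gg_no_recur
    n_other = other_recur + other_no_recur
    n_recur = gg_recur + other_recur
    total = n_gg + n_other
    upper = min(n_gg, n_recur)
    tail = sum(
        comb(n_gg, k) * comb(n_other, n_recur - k)
        for k in range(gg_recur, upper + 1)
    )
    return tail / comb(total, n_recur)


def sample_odds_ratio(gg_recur, gg_no_recur, other_recur, other_no_recur):
    """Plain sample odds ratio of recurrence for GG carriers vs. other genotypes."""
    return (gg_recur * other_no_recur) / (gg_no_recur * other_recur)
```

For an invented table with 3 of 4 GG carriers and 1 of 4 A carriers experiencing recurrence, `fisher_exact_enrichment(3, 1, 1, 3)` evaluates to 17/70 (about 0.24) and `sample_odds_ratio(3, 1, 1, 3)` to 9.0. These counts are made up for illustration and are not taken from the cohort.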
Genetic variant and haplotype analysis
The PPP2R2B gene is a relatively long gene of about 500 kb. It is composed of 10 exons with long first and second introns (Fig. 1). The SNP of interest (rs319217) is located in the first intron and it is linked to several other SNPs spanning a region around the second exon (Fig. 1 and Supporting Information Fig. 1). There are two major haplotypes associated with the genotype calls of those SNPs in strong linkage disequilibrium with rs319217, overall accounting for 95% of the predicted haplotypes in the HapMap Caucasian samples. The rs319217 SNP locus displays two alleles, A or G, each tagging one of the two major haplotypes. G is the ancestral allele and it is at lower frequency in the Caucasian population (about 0.3), compared to that for the derived allele A (about 0.7). Together, the facts that the derived allele is at high frequency and tags a very long haplotype appear to be inconsistent with a neutral model of evolution. Under neutrality, high frequencies can only be achieved after a long period of evolution, and after such a long period the linkage disequilibrium with nearby SNPs should have significantly decreased due to recombination events. The observation of both a high frequency and a long haplotype is thus indicative of a selective sweep acting on the haplotype tagged by the A variant. This hypothesis is supported by the statistical analysis of Haplotter,25 reporting that the PPP2R2B gene is under selection in the Caucasian population with a statistical significance of 0.045. Note that, based on the same Haplotter analysis, this gene is not under selection in Africans and, therefore, this is a population-specific effect.
The evidence for selection indicates that rs319217 is a functional SNP or that it is linked to some functional genetic variation. Indeed, for a genetic variant to be selected for, it must lead to some phenotype affecting reproduction. The fact that PP2A is one of the major Ser/Thr phosphatases opens several possibilities. In particular, PP2A has been implicated in the negative regulation of cell growth and division, which are essential processes during development. PP2A activity and regulation are necessary during different developmental stages.26–29
Allelic differences in drug responses in cancer cell lines
The first evidence of an association between the rs319217 alleles and cancer-related phenotypes comes from the response of tumor-derived cell lines to standard chemotherapeutic agents.1 This evidence is recapitulated in Figure 2, showing the average response of tumor-derived cell lines carrying the A or G alleles to 132 standard chemotherapeutic agents, where the red color indicates that the A allele associates with a higher sensitivity to the drug (lower GI50), the green color indicates the same but for the G allele, and black represents the absence of an association. This heat map shows that in general the A allele associates with a better response (predominance of the red color). The probability that this would happen for a randomly chosen SNP is 3.3 × 10−6, 2.2 × 10−32 and 4.9 × 10−2 when considering all cell lines, cell lines with a wild-type p53, and cell lines carrying a mutant p53, respectively, indicating that this association is independent of the p53 status.
Allelic differences in breast cancer recurrence
The NCI60 cell response analysis indicates that patients carrying the A variant of rs319217 should manifest a better response to cancer treatment, and specifically to chemotherapy. To test this prediction, recurrence events following breast cancer treatment were analyzed in a cohort of 667 Caucasian women diagnosed with breast cancer (Fig. 3). Among these patients, there were 247, 324 and 96 individuals with the AA, AG and GG genotypes, respectively. Patients carrying the GG genotype at rs319217 manifest a higher frequency of recurrence events compared to patients carrying the A variant, with an average odds ratio of 1.78 (CI 0.97–2.69) and a statistical significance of p = 0.043 (Fig. 3a). Thus, as predicted by the analysis of cell response to chemotherapeutic agents, patients carrying the A variant appear to have a better response to chemotherapeutic treatment.
Stratified analysis was also performed according to ER, PR and Her2 status, the type of therapy received (chemotherapy, radiation, adjuvant hormonal therapy), the stage at diagnosis and the cancer subtype. No association was observed when stratifying by whether or not patients received radiation therapy. The association became more significant in the group of patients receiving adjuvant hormonal therapy (p = 0.024), while no association was observed in the group of patients that did not receive hormonal therapy (p = 0.43). Regarding the status of the different receptors, no association was observed except for the group of progesterone receptor-negative patients (p = 0.019).
The noted association became more significant (p = 0.016) when the analysis was restricted to patients that received chemotherapy as part of their treatment (n = 348), resulting in an average odds ratio of 2.39 (CI 1.01–3.71) (Fig. 3b), while no significant association was observed in the group of patients that did not receive chemotherapy (p = 0.41). The majority of the patients receiving chemotherapy were treated with a combination of a topoisomerase II inhibitor (doxorubicin or epirubicin) and an alkylating agent (mainly cyclophosphamide). In some cases these patients received an additional agent, which was either an antimitotic agent (Docetaxel or Paclitaxel) or a DNA antimetabolite (5-Fluorouracil). On the other hand, some patients received a combination of an alkylating agent (cyclophosphamide), a DNA antimetabolite (5-Fluorouracil) and an RNA/DNA antimetabolite (Methotrexate). All these classes of agents are represented in the 132 standard agents used in the NCI60 analysis (Fig. 2), allowing the comparison between associations in the tumor-derived cell line responses and the patient responses to chemotherapy. The patient data are concordant with the cell line study, both indicating that the A variant is associated with a better response to standard chemotherapeutic treatments.
To further assess the impact of the PPP2R2B SNP on breast cancer outcome, Kaplan-Meier analysis of recurrence-free survival was performed. The analysis revealed that GG carriers had the worst recurrence-free survival with a mean survival time of 7,059 days (232.1 months), followed by patients heterozygous for the G-allele [mean survival time of 6,406 days (210.6 months)] and subsequently patients with an AA genotype [mean survival time of 5,682 days (186.8 months); p = 0.050, log rank test]. To exclude potential biases from other independent prognostic factors, a multivariate Cox's regression survival analysis was performed, adjusted for tumor stage and histological subtype including receptor status. This analysis further confirmed the initial results, showing a relative risk (RR) for recurrence of 1.81 for patients with the GG genotype when compared with individuals homozygous for the A-allele (p = 0.028; Fig. 3b). In line with the previous observations, this trend was also more pronounced when the analysis was restricted only to those patients who had received chemotherapy (RR = 1.85, p = 0.042), while no significant association was observed in the group of patients not receiving chemotherapy (RR = 1.63, p = 0.52).
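For readers less familiar with the Kaplan-Meier (product-limit) estimator behind the recurrence-free survival curves, a minimal sketch is given below. The analyses in this study were run in SPSS; this Python function is only an illustrative stand-in, its name and input layout are assumptions, and the follow-up times in the usage note are invented.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of recurrence-free survival.

    times:  follow-up time for each patient (for example, days from diagnosis).
    events: True if recurrence was observed at that time, False if the patient
            was censored (no recurrence by the end of follow-up).
    Returns a list of (time, survival) pairs, one per distinct recurrence time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                n_events += 1
            n_removed += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Patients with an event or censored at time t leave the risk set.
        at_risk -= n_removed
    return curve
```

For four invented patients with follow-up times 1, 2, 3 and 4, of whom the second is censored, the function returns survival estimates of 0.75, 0.375 and 0.0 at times 1, 3 and 4. In practice one curve would be estimated per genotype group (AA, AG, GG) and the curves compared with a log rank test.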
Allelic differences in age of diagnosis of breast cancer
Previous studies suggest that genetic variants implicated in the variable response to cancer treatment can also alter the rate and age at which individuals manifest cancer. Thus, the association between the rs319217 genotypes and the age of diagnosis of breast cancer in Caucasian women was evaluated. Genotypes and age of diagnosis were obtained for 760 patients, stratified into 284, 365 and 111 patients with the AA, AG and GG genotypes, respectively. On average, women carrying the GG genotype at rs319217 are diagnosed with breast cancer 3.0 years earlier than those carrying the A variant, with a statistical significance of p = 0.0069. This association is a consequence of the increase in the incidence rate in patients with the GG genotype (Fig. 4a), particularly after the age of 50 years; for comparison, the average age of menopause is 51 for women in the U.S.
Data were further stratified according to ER, PR and Her2 status, the stage at diagnosis and the cancer subtype for additional analysis. The noted association became stronger when restricting the analysis to women diagnosed with ER positive breast cancer (Fig. 4b). In this case patients carrying the GG genotype are diagnosed 4.4 years earlier (p = 0.0031). The genotype-specific associations were preserved for ductal carcinomas. The same trend was noted for invasive lobular carcinomas, but due to small group size (n = 82), statistical significance was not reached (p = 0.09). No genotype-specific associations were observed by Her2 status or stage at diagnosis.
Taken together, our results indicate that the A variant of the rs319217 SNP associates with a better response to chemotherapy. In both tumor-derived cell lines and breast cancer patients, carriers of the A variant manifest a better response to standard chemotherapy. In addition, we observed that breast cancer patients carrying the GG genotype were diagnosed 3.0 years earlier than those carrying the A variant, with a 4.4-year difference for ER positive breast cancers.
The rs319217 SNP resides within the PPP2R2B gene, encoding a regulatory subunit of PP2A. PP2A is a ubiquitously expressed heterotrimeric protein that accounts for a large fraction of phosphatase activity in eukaryotic cells.17 PP2A phosphatase activity has been shown to interact directly with the p53 pathway, causing the dephosphorylation of key residues of p53 and MDM2, resulting in the regulation of p53 activity levels in cells.30–34 ADP ribosylation factor like 2 (Arl2) modifies chemosensitivity to conventional drugs, e.g., taxanes and doxorubicin, used to treat breast cancer in the adjuvant setting. The mechanism implicated is through PP2A effects on p53 phosphorylation.35 PP2A can antagonize Ras signaling by dephosphorylating c-Myc and RalA and by negatively regulating the PI3 kinase/Akt signaling pathway.17 Further evidence of the importance of PP2A activity in suppressing cellular transformation lies in the wide range of mechanisms that transformed cells have evolved to inhibit its activity. For example, inhibition of PP2A activity has been shown to be mediated by the small tumor antigen of DNA tumor viruses,36 by up-regulation of the c-Myc–specific inhibitor CIP2A,37 by BCR/ABL via SET up-regulation,38 through biallelic mutational inactivation of the Aβ subunit,39 or by decreased expression of the Aα subunit.20 PP2A has also been implicated in inhibition of nuclear telomerase in human breast cancer cells.40
The various forms of PP2A contain an active core dimer, made up of the catalytic (C) subunit and a structural scaffold A subunit. The scaffold subunit mediates interaction of the core dimer with a great variety of regulatory (B) subunits in a tissue-specific manner. The rs319217 SNP resides within the gene PPP2R2B, encoding a regulatory subunit. Interestingly, a recent report showed that a genetic variant of PPP2R5E, encoding another regulatory subunit of PP2A, also alters the age of diagnosis and survival in patients with soft tissue sarcomas.41 Furthermore, both PPP2R2B (here) and PPP2R5E41 appear to be under natural selection in the Caucasian population, as indicated by the analysis of their haplotypes. Haplotype analysis of tagged SNPs in PPP2R1A and PPP2R2A demonstrated that certain genotypes were protective for breast cancer, while others modified the risk for women with proliferative breast lesions.22 No previous reports have described associations for SNPs in PPP2R2B with either risk of breast cancer or its recurrence. Data from this cohort are consistent with associations between PPP2R2B alleles and breast cancer outcomes. However, longer follow-up is necessary to determine the relationship between survival curves while accounting for late relapses.
The role of PP2A in breast cancer suggests how functional SNPs may play a role in breast cancer biology. For instance, reduced stability of ERalpha mRNA has been linked with reduced PP2A activity.42 Direct interactions between ERalpha and PP2A through the catalytic subunit of PP2A result in ER-dependent gene expression, even in the absence of estrogen.43 In addition, loss of PP2A expression in human breast cancer results in endocytosis of e-cadherin, a key player in the beta-catenin pathway.44 Reduced or absent e-cadherin expression is observed in breast tumors. As e-cadherin is a cell–cell adhesion molecule, its reduced membrane expression may contribute to the metastatic potential of breast cancer cells. PP2A also has demonstrated effects on chemosensitivity. Reduced PP2A expression and activity have been observed in adriamycin resistant MCF-7 human breast cancer cells, while adenoviral E1A-mediated sensitization of a human breast cancer cell line to paclitaxel appears to occur through PP2A upregulation.45, 46
The association reported here has not been detected in recent genome-wide association studies (GWAS) searching for common genetic variants associated with breast cancer prognosis.4, 5, 47 This can be attributed to several factors, including the multiple-hypothesis testing burden of GWAS and the lack of appropriate stratification of the patient population. Our preliminary NCI60 analysis offered the advantage of providing a candidate SNP (rs319217), the allele associated with better prognosis (A allele), and the specific context (response to chemotherapy).
The associations reported here remain to be validated in other patient cohorts and, importantly, the regulatory changes associated with PPP2R2B rs319217 or linked SNPs are yet to be determined. The latter is a challenging task given that the variant and ancestral alleles of rs319217 tag two long haplotypes spanning a 60 kb region. Furthermore, more exhaustive searches for other genetic variants closely linked to these SNPs, but not included in the HapMap project, will be necessary to develop a complete list of candidate functional SNPs that merit further experimental investigation into the molecular and cellular mechanisms behind the significant allelic differences in haplotype structure, cancer risk, and response to cancer treatment noted in this report. Nevertheless, these data suggest that PPP2R2B harbors genetic variants that can affect human cancer and may be under evolutionary selection pressure.