Department of Breast Surgery, Cancer Center and Cancer Institute, Shanghai Medical College, Fudan University, Shanghai, People's Republic of China
Corresponding author: Zhi-Ming Shao, MD, PhD, Department of Breast Surgery, Fudan University Shanghai Cancer Center, 399 Ling-Ling Rd, Shanghai, 200032, People's Republic of China; Fax: (011) 86-21-64434556; email@example.com
We thank Jiong Wu, Jin-Song Lu, Guang-Yu Liu, Gen-Hong Di, and Zhen-Zhou Shen for their excellent data handling.
The GATA3 gene (GATA-binding protein 3) is one of the most frequently mutated genes in breast cancer. The objective of the current study was to determine the clinicopathologic characteristics of patients with breast cancer harboring GATA3 mutations.
The authors examined the somatic mutation status of GATA3 and performed survival analysis in The Cancer Genome Atlas (TCGA) cohort (n = 934) and the Fudan University Shanghai Cancer Center (FUSCC) cohort (n = 308). Patient characteristics, including age; menopausal status; tumor laterality; tumor size; lymph node status; tumor grade; molecular subtypes; adjuvant radiotherapy, chemotherapy, and endocrine therapy; and prognosis, together with PIK3CA (phosphatidylinositol-4,5-bisphosphate 3-kinase, catalytic subunit alpha) and TP53 (tumor protein p53) mutation status, were collected.
GATA3 mutations were detected in 8.8% of patients (82 of 934 patients) in the TCGA cohort and 14.9% of patients (46 of 308 patients) in the FUSCC cohort. GATA3 mutations were found to be significantly associated with luminal-like breast cancer (P = .002 in the TCGA cohort and P < .001 in the FUSCC cohort), and were largely mutually exclusive with PIK3CA mutations (P = .001 in the TCGA cohort and P = .003 in the FUSCC cohort) and TP53 mutations (P < .001 in both cohorts). Furthermore, GATA3 mutations were correlated with improved overall survival in the entire population (P = .025 in the TCGA cohort and P = .043 in the FUSCC cohort) as well as in patients with luminal-like disease who received adjuvant endocrine therapy.
Breast cancer is the most frequently diagnosed cancer among females, accounting for approximately 23% of total cancer cases and approximately 14% of cancer deaths. Clinically, this heterogeneous disease is categorized into 4 basic therapeutic groups including luminal A, luminal B, human epidermal growth factor receptor 2 (HER2)-enriched, and basal-like. Recent advances in next-generation sequencing have provided opportunities to further characterize the molecular architecture of breast cancer in depth.[2-4] A comprehensive molecular analysis of breast cancer revealed that TP53 (tumor protein p53), PIK3CA (phosphatidylinositol-4,5-bisphosphate 3-kinase, catalytic subunit alpha), and GATA3 were the most frequently mutated genes, which significantly extends our knowledge of likely genomic drivers in breast cancer.
GATA-binding protein 3 (GATA3) belongs to a family of 6 mammalian GATA dual zinc-finger transcription factors (GATA1-6) that bind to the consensus 5′-(A/T)GATA(A/G)-3′ motif. GATA3 is critical for the normal development of various organs, including the mammary gland.[7-9] In patients with breast cancer, the expression of GATA3 is closely related to estrogen receptor (ER) status, and significantly associated with a favorable prognosis.[10-13] GATA3 is the third most frequently mutated gene in breast cancer, with a mutation rate of approximately 10%. It is interesting to note that these mutations are observed for the most part in ER-positive tumors, suggesting the involvement of GATA3 mutations in the pathogenesis of luminal-like breast cancer.
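As an illustration only, the consensus binding motif described above can be written as a regular expression and scanned for on both strands of a DNA sequence. The function names and the example sequences are hypothetical and are not part of the study's methods; the sketch simply makes the (A/T)GATA(A/G) notation concrete.

```python
import re

# Consensus GATA motif from the text: 5'-(A/T)GATA(A/G)-3'.
# A lookahead is used so that overlapping sites are all reported.
GATA_MOTIF = re.compile(r"(?=([AT]GATA[AG]))")
MOTIF_LENGTH = 6


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_gata_sites(seq):
    """Return sorted (position, strand, site) tuples.

    Positions are 0-based and refer to the leftmost base of the site
    on the forward strand, whichever strand the motif was found on.
    """
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in GATA_MOTIF.finditer(seq)]
    rc = reverse_complement(seq)
    for m in GATA_MOTIF.finditer(rc):
        forward_start = len(seq) - m.start() - MOTIF_LENGTH
        hits.append((forward_start, "-", m.group(1)))
    return sorted(hits)


# Hypothetical toy sequences, for illustration only.
print(find_gata_sites("CCAGATAGCC"))
print(find_gata_sites("TTATCA"))
```

Scanning the reverse complement matters because a transcription factor site can lie on either strand; a forward-only search would miss the second toy example.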
To describe the characteristics of GATA3-mutated breast cancer in depth, we analyzed GATA3 mutations in 934 breast cancer cases from The Cancer Genome Atlas (TCGA) cohort and an additional 308 cases from the Fudan University Shanghai Cancer Center (FUSCC) cohort. We also examined their associations with clinicopathologic characteristics and molecular features including the TP53 and PIK3CA status of these tumors. The current study allowed us to define the molecular and clinical relevance of GATA3 mutations in patients with breast cancer.
MATERIALS AND METHODS
Patients and Samples
For the TCGA cohort, the somatic mutation profiles (for a total of > 30,000 somatic mutations) were studied in 934 primary breast tumors from female patients with no pretreatment who were chosen from the updated TCGA database according to parameters mentioned in a previous study. Only patients with fully characterized (somatic mutation profiles) tumors and intact overall survival (OS) information were included in the study. Follow-up was completed on June 21, 2013. The median length of follow-up was 19 months (range, 1 month-226.5 months) and 99 patients had died at the end of follow-up. Extended demographics for these patients, characterized by the TCGA consortium, are shown in Table 1.
Table 1. Clinicopathologic Characteristics of Patients With Breast Cancer in the TCGA and FUSCC Cohorts
Abbreviations: ER, estrogen receptor; FUSCC, Fudan University Shanghai Cancer Center; HER2, human epidermal growth factor receptor 2; MT, mutant; NA, not applicable; PIK3CA, phosphatidylinositol-4,5-bisphosphate 3-kinase, catalytic subunit alpha; PR, progesterone receptor; TCGA, The Cancer Genome Atlas; TP53, tumor protein p53; WT, wild-type.
Bold type indicates statistical significance.
The FUSCC cohort consists of 308 female patients with histologically confirmed invasive breast cancer who had undergone prior surgical resection and received no pretreatment. Only those patients with operable unilateral tumors and without any evidence of metastasis at the time of diagnosis were enrolled. All patients were treated at FUSCC between January 1, 2003 and December 31, 2009. Follow-up was completed on June 30, 2013. The median length of follow-up was 62 months (range, 5 months-97 months). Seventy-five patients in the FUSCC cohort had disease-free survival (DFS) events and 55 had died at the end of follow-up. The molecular subtypes of breast cancer according to immunohistochemical (IHC) profiles were categorized as follows: luminal A (ER positive or progesterone receptor [PR] positive, HER2 negative, and Ki-67 < 14%), luminal B (ER positive or PR positive and HER2 positive or Ki-67 ≥ 14%), HER2-enriched (ER negative, PR negative, and HER2 positive), and basal-like (ER negative, PR negative, HER2 negative, and CK5/6-positive or epidermal growth factor receptor [EGFR]-positive).[14, 15] Tumors with IHC phenotypes of ER negative, PR negative, HER2 negative, CK5/6 negative, and EGFR negative were excluded. The staining and interpretation of ER, PR, HER2, Ki-67, CK5/6, and EGFR have been previously described.[14, 16] Patients' clinicopathologic characteristics are listed in Table 1. In general, the 2 cohorts were well balanced with regard to baseline characteristics. This study was approved by the Ethical Committee of FUSCC, and each participant provided written informed consent.
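The IHC-based subtype rules listed above can be expressed as a small decision function. The following is a minimal sketch, assuming each marker has already been scored as positive or negative and Ki-67 is given as a percentage; the function name and argument names are hypothetical, and the cutoffs are only those stated in the text.

```python
def ihc_subtype(er, pr, her2, ki67_pct, ck56, egfr):
    """Assign an IHC surrogate subtype from marker status.

    er, pr, her2, ck56, egfr: booleans (True = positive).
    ki67_pct: Ki-67 labeling index as a percentage.
    Returns None for tumors negative for all five markers, which the
    text describes as excluded from the cohort.
    """
    if er or pr:
        # Hormone receptor positive: luminal A requires HER2 negative
        # and Ki-67 < 14%; any other combination is luminal B.
        if not her2 and ki67_pct < 14:
            return "luminal A"
        return "luminal B"
    if her2:
        return "HER2-enriched"
    if ck56 or egfr:
        return "basal-like"
    return None


# Hypothetical marker profiles, for illustration only.
print(ihc_subtype(er=True, pr=False, her2=False, ki67_pct=10, ck56=False, egfr=False))
print(ihc_subtype(er=False, pr=False, her2=False, ki67_pct=40, ck56=True, egfr=False))
```

Returning None rather than a fifth label keeps the excluded five-marker-negative phenotype visibly separate from the 4 analyzed subtypes.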
Sample Collection and Processing
Tumor samples and paired normal breast tissues were obtained from patients undergoing surgical treatment at FUSCC in accordance with the appropriate Institutional Review Boards. Generally, tumor tissues were macrodissected to avoid the influence of stromal tissues (< 5%). We used QIAamp DNA Mini Kits (Qiagen, Valencia, Calif) to extract DNA from the tissues. The quality and concentration of extracted DNA were determined using NanoDrop 2000 (Thermo Fisher Scientific, Wilmington, Del). The extracted DNA was then used for mutation analysis.
The detection of mutations in frozen tumor tissues and paired blood DNA was performed by Sanger sequencing for all the exons of GATA3, PIK3CA, and TP53. Polymerase chain reaction (PCR) amplification was performed on an ABI 9700 Thermal Cycler (Invitrogen, Carlsbad, Calif) using standard conditions. After the amplification, products were purified using a QIAquick PCR Purification Kit (Qiagen) and directly sequenced on an ABI PRISM 3730 automated sequencer (Invitrogen). All detected mutations were confirmed by resequencing of tumor and matched normal blood DNA from new PCR product.
Our definition of DFS events included the first recurrence of disease at a local, regional, or distant site; the diagnosis of contralateral breast cancer; and death from any cause. OS was calculated from the date of diagnosis to the date of death or last follow-up. Patients without events or death were censored at the time of last follow-up.
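To make the role of censoring concrete, the following is a minimal sketch of the Kaplan-Meier product-limit estimator applied to follow-up times with an event indicator. It is not the software used in the study, and the follow-up data shown are invented purely for illustration.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times: follow-up time for each patient (e.g., months).
    events: 1 if the event (e.g., death) occurred at that time,
            0 if the patient was censored at last follow-up.
    Returns a list of (time, survival probability) pairs, one per
    distinct time at which at least one event occurred.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing this follow-up time; censored
        # patients still count as at risk at their censoring time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Hypothetical follow-up data (months, event flag), for illustration only.
curve = kaplan_meier([5, 10, 10, 20, 30], [1, 0, 1, 0, 1])
for t, s in curve:
    print(t, round(s, 3))
```

Censored patients never lower the curve themselves; they only shrink the number at risk for later event times, which is why incomplete follow-up can still contribute information.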
Frequency tabulation and summary statistics were provided to characterize the data distribution. The Pearson chi-square test (when none of the cells of the contingency table had an expected count of <5) or the Fisher exact test (when any cell in the contingency table had an expected count of <5) was used to assess the association between 2 categorical variables. For multivariate analyses, a binary logistic regression model was used. Survival curves were constructed using the Kaplan-Meier method, and the univariate survival difference was determined by the log-rank test. Adjusted hazard ratios (HRs) with 95% confidence intervals (95% CIs) were calculated using Cox proportional hazards models. All statistical analyses were performed using Stata statistical software (version 10.0; StataCorp, College Station, Tex). A 2-sided P value < .05 was considered to be statistically significant.
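The rule for choosing between the Pearson chi-square test and the Fisher exact test can be sketched for a 2 x 2 table as follows. This is an illustrative standard-library implementation, not the Stata procedure used in the study; it assumes no continuity correction and the common two-sided Fisher definition (summing all tables no more probable than the observed one). The first example uses the GATA3 mutation counts reported for the 2 cohorts (82 of 934 and 46 of 308); the second table is hypothetical.

```python
from math import comb, erfc, sqrt


def expected_counts(table):
    """Expected cell counts of a 2 x 2 table under independence."""
    (a, b), (c, d) = table
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    return [[r * k / n for k in cols] for r in rows]


def chi_square_p(table):
    """Pearson chi-square statistic (1 df, uncorrected) and P value."""
    (a, b), (c, d) = table
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return stat, erfc(sqrt(stat / 2))


def fisher_exact_p(table):
    """Two-sided Fisher exact P value for a 2 x 2 table."""
    (a, b), (c, d) = table
    r1, r2, c1 = a + b, c + d, a + c
    total = comb(r1 + r2, c1)

    def prob(x):
        return comb(r1, x) * comb(r2, c1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


def association_p(table):
    """Pick the test by the expected-count rule and return (name, P)."""
    if min(min(row) for row in expected_counts(table)) < 5:
        return "fisher", fisher_exact_p(table)
    return "chi-square", chi_square_p(table)[1]


# Mutated vs nonmutated tumors in the TCGA and FUSCC cohorts.
print(association_p([[82, 934 - 82], [46, 308 - 46]]))
# Hypothetical small table that triggers the Fisher exact test.
print(association_p([[8, 2], [1, 5]]))
```

With the cohort counts, all expected cell counts are well above 5, so the chi-square branch is taken; the resulting P value is close to .002.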
RESULTS
GATA3 Mutation Prevalence and Spectra
First, we investigated the prevalence and spectra of GATA3 mutations in both the TCGA and FUSCC cohorts. We summarized the GATA3 mutation spectra according to molecular subtype in the TCGA (see online supporting information) and the FUSCC (see online supporting information) cohorts. In both cohorts, approximately 50% of the GATA3 mutations were clustered in exons 5 and 6, which encode the second zinc-finger (ZnF2) and the C-terminal domain. However, the mutation spectrum of GATA3 in the TCGA cohort differed markedly from that in the FUSCC cohort.
In the TCGA group, 934 patients with breast cancer were screened for somatic mutations in GATA3. A total of 84 somatic mutations in GATA3 were detected in 8.8% of patients (82 patients). Among the 82 GATA3-mutated tumors, 97.6% (80 tumors) had 1 somatic mutation and 2.4% (2 tumors) had 2 somatic mutations. The latter group comprised 2 cases (TCGA ID numbers A0SU and A25B), each having 1 missense point mutation and 1 frameshift insertion. Of the 84 GATA3 mutations, 7.1% (6 mutations) were missense point mutations, whereas 53.6% (45 mutations) were frameshift insertions and 39.3% (33 mutations) were frameshift deletions. Among the 4 main molecular subtypes of breast cancer, GATA3 mutations were more frequently detected in luminal A (19.6%) and luminal B (10.5%) breast cancer (Fig. 1A).
In the FUSCC group, somatic GATA3 mutations were detected in 14.9% of tumors (46 of 308 tumors). Each of the 46 tumors harboring GATA3 mutations had a single GATA3 mutation. Of the 46 GATA3-mutated tumors, 78.3% (36 tumors) harbored missense point mutations, whereas 13.0% (6 tumors) harbored frameshift insertions and 8.7% (4 tumors) harbored frameshift deletions. Compared with the TCGA group, the percentage of point mutations in the FUSCC group was significantly higher (P < .001, chi-square test). Among the 4 main molecular subtypes of breast cancer in the FUSCC group, GATA3 mutations were once again more frequently detected in luminal A (22.0%) and luminal B (18.2%) breast cancer (Fig. 1B).
Clinicopathologic and Genetic Associations of GATA3 Mutations
Clinicopathologic and genetic correlations of GATA3 mutation were evaluated (Table 1). In both cohorts, GATA3 mutations were detected only in ER-positive breast cancer and were found to be significantly associated with the luminal subtype of breast cancer (P < .001 in the TCGA cohort and P = .013 in the FUSCC cohort). The GATA3 mutation rate in the FUSCC group (14.9%) was significantly higher than that in the TCGA set (8.8%) (see online supporting information), both overall (P = .002) and in the ER-positive subset of breast cancer (P < .001). It should be noted that all of the patients with GATA3-mutant tumors in the FUSCC cohort received adjuvant endocrine therapy. GATA3 mutations were almost mutually exclusive with PIK3CA mutations (P = .014 in the TCGA cohort and P = .016 in the FUSCC cohort) and TP53 mutations (P < .001 in the TCGA cohort and P = .003 in the FUSCC cohort) (Table 1) (see online supporting information). Moreover, GATA3 mutations had a tendency to occur in younger patients (aged ≤ 40 years vs aged > 40 years; P = .056 in the TCGA cohort and P = .083 in the FUSCC cohort).
In multivariate logistic regression analysis of the clinicopathologic features independently associated with GATA3 mutation, tumor size (P = .006), molecular subtype (P = .002), PIK3CA mutation (P = .001), and TP53 mutation (P < .001) were found to be significantly associated with GATA3 mutations in the TCGA cohort (Table 2). Multivariate analyses in the FUSCC cohort yielded results similar to those in the TCGA set (Table 2). Taken together, GATA3 mutations are mainly detected in patients with luminal-like breast cancer, and are to a large extent mutually exclusive with PIK3CA and TP53 mutations.
Table 2. Multivariate Logistic Regression Analysis of Factors That Might Affect the Presence of GATA3 Mutations in the TCGA Cohort and the FUSCC Cohort
Abbreviations: 95% CI, 95% confidence interval; FUSCC, Fudan University Shanghai Cancer Center; GATA3, GATA-binding protein 3;HER2, human epidermal growth factor receptor 2; MT, mutant; NA, not applicable; OR, odds ratio; PIK3CA, phosphatidylinositol-4,5-bisphosphate 3-kinase, catalytic subunit alpha; TCGA, The Cancer Genome Atlas; TP53, tumor protein p53; WT, wild-type.
Adjusted by multivariate logistic regression analysis including all factors, as categorized in Table 2.
Bold type indicates statistical significance.
GATA3 Mutations and Clinical Outcome
We next asked whether GATA3 mutations influenced patients' DFS and OS. In the TCGA cohort, the Kaplan-Meier plot demonstrated that GATA3 mutations were associated with improved OS in the subgroup of patients with ER-positive breast cancer (P = .041; data not shown), but not in the overall cohort (P = .389; data not shown). Although univariate analysis identified lymph node status as the only significant prognostic factor for OS (HR, 2.22; 95% CI, 1.44-3.41 [P < .001]) (Table 3), multivariate analysis after adjustment for other potential prognostic factors indicated that lymph node status (HR, 7.01; 95% CI, 2.08-23.69 [P = .002]) and GATA3 mutations (HR, 0.28; 95% CI, 0.09-0.85 [P = .025]) were independent prognostic factors for OS. Neither PIK3CA (HR, 1.78; 95% CI, 0.43-4.22 [P = .089]) nor TP53 (HR, 1.60; 95% CI, 0.76-3.37 [P = .103]) mutations were found to be associated with OS.
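A hazard ratio and its confidence interval are linked through the standard error of the log hazard ratio, so a reported estimate can be sanity-checked against its P value. The sketch below assumes a Wald-type 95% CI that is symmetric on the log scale, which is the usual output of a Cox model but is an assumption here; the function name is hypothetical. Because published limits are rounded, only approximate agreement should be expected.

```python
from math import erfc, log, sqrt


def wald_from_ci(hr, lower, upper, z_crit=1.959964):
    """Recover SE, z statistic, and 2-sided P from an HR and its 95% CI.

    Assumes the CI was built as exp(log(HR) +/- z_crit * SE).
    """
    se = (log(upper) - log(lower)) / (2 * z_crit)
    z = log(hr) / se
    p = erfc(abs(z) / sqrt(2))  # two-sided normal tail probability
    return se, z, p


# Multivariate GATA3 estimate for OS in the TCGA cohort, as reported above.
se, z, p = wald_from_ci(0.28, 0.09, 0.85)
print(round(se, 3), round(z, 2), round(p, 3))
```

For the GATA3 estimate, the recovered P value is close to the reported P = .025, with the small difference attributable to rounding of the published limits.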
Table 3. Univariate and Multivariate Cox Proportional Hazards Analysis of Overall Survival for Patients With Breast Cancer in the TCGA Cohort.
In the FUSCC cohort, Kaplan-Meier analysis revealed that GATA3 mutations were associated with improved DFS (P = .024) (Fig. 2A) and OS (P = .033) (Fig. 2B) in all cases. In the subgroup of patients with ER-positive (luminal-like) breast cancer who underwent adjuvant endocrine therapy, mutant GATA3 also indicated favorable DFS (P = .046) (Fig. 2C), although we only observed a borderline significance in association with better OS (P = .062) (Fig. 2D).
Cox proportional hazards analysis demonstrated similar results indicating that mutant GATA3 genotype is significantly associated with improved DFS (HR, 0.25; 95% CI, 0.08-0.86 [P = .032]) and OS (multivariate HR, 0.45; 95% CI, 0.14-0.96 [P = .043]) (Table 4). Taken together, GATA3 mutations may be associated with improved survival in patients with breast cancer.
Table 4. Multivariate Cox Proportional Hazards Analysis of DFS and OS for Patients With Breast Cancer in the FUSCC Cohort.
DISCUSSION
To the best of our knowledge, the current retrospective study represents the first comprehensive survey reporting the clinicopathologic characteristics and outcomes of patients with GATA3-mutant breast cancer. The findings demonstrate that tumors harboring GATA3 mutations are associated with unique clinicopathologic and molecular features, as well as improved survival.
GATA3 is a highly conserved gene/protein whose absolute expression level is crucial for its function. GATA3 is widely expressed throughout embryonic development and in adult tissues including T lymphocytes; endothelial cells; and the placenta, kidney, adrenal gland, and central nervous system. In mammary tissue, GATA3 is produced by cells of luminal origin and is a master regulator gene/protein of luminal epithelial cell differentiation.[7, 8] GATA3 is believed to have a tumor suppressive effect, and loss of GATA3 function is implicated in the pathogenesis of luminal-type breast cancer.
The results of the current analysis suggest that GATA3 mutations were mainly detected in luminal A and luminal B tumors, which is consistent with the hypothesis that GATA3 mutations might be important in the etiology of luminal-like breast cancers. In accordance with previous studies,[19, 20] approximately one-half of the GATA3 mutations identified in patients with breast cancer were clustered in exons 5 and 6, exons that encode ZnF2 and the C-terminal domain. Experimental evidence has shown that ZnF2 of GATA3 is required for DNA binding. Therefore, GATA3 mutations that affected ZnF2 might in turn affect DNA binding and result in a functional haploinsufficiency of GATA3,[22, 23] which would likely cause a perturbation in the developmental state of these cells and contribute to tumorigenesis. The data from the current study also demonstrated the different mutation spectra of GATA3 between the TCGA cohort and the FUSCC cohort, indicating that there might be some difference in the mutational evolution of luminal-like breast cancer in different populations. Because few Asian (or Chinese) patients were included in the TCGA cohort, it does raise questions regarding ethnic diversity for future research. In conclusion, the current study data provide clinical evidence that GATA3 insufficiency in breast tumors could indeed be an important factor in the development of luminal-type breast cancer.
Another interesting question that should be addressed herein is how GATA3 mutations are associated with a favorable prognosis. Because we observed the coexistence of GATA3 mutations and ER expression, and because luminal-like tumors are the ones likely to be GATA3-mutated, it is not surprising that GATA3-mutated tumors are associated with a favorable prognosis in the overall population. More interestingly, we also found that patients with GATA3-mutated, ER-positive tumors tended to have better survival compared with patients with wild-type GATA3, ER-positive tumors. There are several potential explanations. First, as a transcription factor, GATA3 induces a variety of genes during luminal differentiation, some of which play significant roles in chemoresistance, including trefoil factor 1, phosphoserine aminotransferase 1, anterior gradient 2 homolog, and cyclin D1.[27, 28] Mutations of GATA3 (especially in the zinc-finger region) might result in functional haploinsufficiency and downregulation of the expression level of these genes. However, the precise mechanism of the sensitivity of GATA3-mutant tumors to chemotherapy remains to be investigated. Furthermore, because the majority of the patients with ER-positive tumors received adjuvant endocrine therapy, GATA3 mutations might be predictive of better outcomes in patients who receive endocrine therapy (much like ER is predictive). There may be too few patients in the current study to address this question, but it could be a point of interest for future research.
A major strength of the current study is that the samples involved were obtained from 2 large populations, and patients were carefully followed over a long period of time. One limitation of this study is that molecular breast cancer subtypes in the FUSCC cohort were defined using IHC-based surrogates. Consistent with existing recommendations, IHC surrogates can provide only an approximation of a breast cancer's genomically defined subtype. In addition, information regarding disease recurrence and metastasis is unavailable in the TCGA cohort, and therefore only OS could be evaluated.
In conclusion, GATA3 mutations appear to occur mainly in patients with luminal-like breast cancer, are mostly mutually exclusive with PIK3CA and TP53 mutations, and are associated with a favorable prognosis. The results of the current study highlight the importance of considering GATA3 mutation-associated heterogeneity in the treatment of patients with breast cancer, and define a subgroup in whom limited therapy may be appropriate. Further preclinical and clinical studies are needed to validate these findings in other independent cohorts before GATA3 mutations can become a guiding factor when making a therapeutic selection for the treatment of breast cancer.
This work was supported by grants from the National Natural Science Foundation of China (grants 81001169, 81370075, and 81372848), the Shanghai United Developing Technology Project of Municipal Hospitals (grant SHDC12010116), the Shanghai Key Laboratory of Breast Cancer (grant 12DZ2260100), and the Zhuo-Xue Project of Fudan University (to Dr. Yu).
CONFLICT OF INTEREST DISCLOSURES
The authors have declared no conflicts of interest.