Cancer Cell Biology
CENP-F expression is associated with poor prognosis and chromosomal instability in patients with primary breast cancer
Article first published online: 4 JAN 2007
Copyright © 2006 Wiley-Liss, Inc.
International Journal of Cancer
Volume 120, Issue 7, pages 1434–1443, 1 April 2007
How to Cite
O'Brien, S. L., Fagan, A., Fox, E. J.P., Millikan, R. C., Culhane, A. C., Brennan, D. J., McCann, A. H., Hegarty, S., Moyna, S., Duffy, M. J., Higgins, D. G., Jirström, K., Landberg, G. and Gallagher, W. M. (2007), CENP-F expression is associated with poor prognosis and chromosomal instability in patients with primary breast cancer. Int. J. Cancer, 120: 1434–1443. doi: 10.1002/ijc.22413
- Issue published online: 30 JAN 2007
- Manuscript Accepted: 14 SEP 2006
- Manuscript Received: 24 MAR 2006
- Cancer Research Ireland
- British Association for Cancer Research
- Enterprise Ireland
- Health Research Board of Ireland
- Programme for Research in Third Level Institutions (PRTLI)
- breast cancer;
- DNA microarrays;
- tissue microarrays;
- chromosomal instability
DNA microarrays have the potential to classify tumors according to their transcriptome. Tissue microarrays (TMAs) facilitate the validation of biomarkers by offering a high-throughput approach to sample analysis. We reanalyzed a high-profile breast cancer DNA microarray dataset containing 96 tumor samples using a powerful statistical approach, between-group analysis (BGA). Among the genes we identified was centromere protein-F (CENP-F), a gene associated with poor prognosis. In a published follow-up breast cancer DNA microarray study, comprising 295 tumor samples, we found that CENP-F upregulation was significantly associated with worse overall survival (p < 0.001) and reduced metastasis-free survival (p < 0.001). To validate and expand upon these findings, we used 2 independent breast cancer patient cohorts represented on TMAs. CENP-F protein expression was evaluated by immunohistochemistry in 91 primary breast cancer samples from cohort I and 289 samples from cohort II. CENP-F correlated with markers of aggressive tumor behavior, including ER negativity and high tumor grade. In cohort I, CENP-F was significantly associated with markers of chromosomal instability (CIN), including cyclin E, increased telomerase activity, c-Myc amplification and aneuploidy. In cohort II, CENP-F correlated with VEGFR2, phosphorylated Ets-2 and Ki67 and, in multivariate analysis, was an independent predictor of worse breast cancer-specific survival (p = 0.036) and overall survival (p = 0.040). In conclusion, we identified CENP-F as a biomarker associated with poor outcome in breast cancer and showed several novel associations of biological significance. © 2006 Wiley-Liss, Inc.
In a study by van't Veer et al. published in 2002, a 70-gene prognosis classifier was identified via DNA microarray analysis of primary breast cancer that could be used to predict metastatic potential.1 This work received considerable attention worldwide, and formed the basis for a clinical trial assessing the utility of DNA microarray technology in guiding treatment decisions for breast cancer patients.2 However, some concerns have been raised about the data analysis methods and sample distributions utilized for this study.3, 4, 5 To address such concerns, we reanalyzed this key DNA microarray dataset, validated our findings in 2 independent patient cohorts, and used tissue microarray (TMA) technology to explore biomarker associations with other known tumor variables. For the reanalysis, we used the supervised method of between-group analysis (BGA), which is based on carrying out an ordination (e.g. principal component analysis) of groups of samples rather than of individual samples. We previously demonstrated the successful application of BGA to DNA microarray data, and identified clinically important genes that were missed in previous analyses.6 In the present study, we trained and cross-validated a gene classifier which maximally discriminated between patients with a good or poor prognosis. CENP-F was among the genes highly expressed in breast tumors of patients with poor prognosis, a finding that we validated in a related DNA microarray dataset.5 Since little is known about the function of CENP-F in cancer, we examined its association with other known tumor parameters. Finally, we used immunohistochemistry on TMAs from 2 independent primary breast cancer cohorts to validate CENP-F protein expression as a prognostic marker, and identified coexpressed proteins that indicate a possible functional role of CENP-F in breast cancer.
Material and methods
Public DNA microarray datasets
DNA microarray data analysis
For the van't Veer dataset,1 the expression data arising from analysis of ∼25,000 human genes in 78 samples were filtered according to the original criteria.1 In brief, genes were excluded if they did not display at least a 2-fold difference in expression and a p value of less than 0.01 in more than 3 samples. BGA, using correspondence analysis to ordinate the good and poor prognosis groups,6 was applied to the resulting dataset of ∼5,000 genes and used to classify the remaining 19 test samples. Sample 54 was removed from the analysis as it contained >20% missing values. The 96 pooled training and test samples were then randomly recategorized into 77 training and 19 test samples. This random re-split was repeated 100 times, with BGA performed at each iteration. The top 100 genes associated with good prognosis and the top 100 genes associated with poor prognosis were then selected. BGA was performed using the ADE4 module from Bioconductor (http://www.bioconductor.org). Analysis was performed using the statistical package R (http://www.r-project.org); the relevant R scripts are available on request. Data were downloaded from http://microarray-pubs.stanford.edu/wound_NKI/explore.html. For the van de Vijver dataset,5 CENP-F mRNA expression was categorized as negative/low, unchanged, or high expression relative to pooled cRNA from each patient sample, acting as reference cRNA. Tumor samples were classified according to CENP-F mRNA expression based on absolute expression analysis P values (alpha level of 0.05), following the method of Moody et al.7
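The random re-split procedure can be illustrated with a minimal Python sketch. It is not the authors' R code: the between-group step is simplified to the two-group, PCA-style case, where ordinating the two group centroids leaves a single axis joining them, whereas the study used correspondence analysis. The function names, the synthetic expression matrix and the midpoint threshold are all assumptions made for illustration.

```python
import random


def between_group_axis(samples, labels):
    """Return (axis, threshold) for a two-group between-group ordination.

    Simplification: with two groups, the ordination of group centroids has
    one axis, the difference between the 'poor' and 'good' centroids.
    """
    n_genes = len(samples[0])
    centroids = {}
    for group in ("good", "poor"):
        members = [s for s, lab in zip(samples, labels) if lab == group]
        centroids[group] = [sum(m[g] for m in members) / len(members)
                            for g in range(n_genes)]
    axis = [p - q for p, q in zip(centroids["poor"], centroids["good"])]
    projected = {grp: sum(a * c for a, c in zip(axis, cen))
                 for grp, cen in centroids.items()}
    # Assumed decision rule: midpoint between the projected centroids.
    threshold = (projected["poor"] + projected["good"]) / 2
    return axis, threshold


def classify(sample, axis, threshold):
    """Project one sample onto the axis and assign a prognosis group."""
    score = sum(a * x for a, x in zip(axis, sample))
    return "poor" if score > threshold else "good"


def resplit_accuracies(samples, labels, n_test=19, n_iter=100, seed=0):
    """Randomly re-split into training/test sets and record test accuracy."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    accuracies = []
    for _ in range(n_iter):
        rng.shuffle(indices)
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        axis, threshold = between_group_axis(
            [samples[i] for i in train_idx], [labels[i] for i in train_idx])
        hits = sum(classify(samples[i], axis, threshold) == labels[i]
                   for i in test_idx)
        accuracies.append(hits / n_test)
    return accuracies


# Synthetic stand-in data (not the published dataset): 96 samples x 50 genes,
# with the first 5 genes shifted upwards in the 'poor' group.
data_rng = random.Random(1)
labels = ["good"] * 48 + ["poor"] * 48
samples = [[data_rng.gauss(1.0 if (lab == "poor" and g < 5) else 0.0, 1.0)
            for g in range(50)] for lab in labels]

accuracies = resplit_accuracies(samples, labels)
print("lowest:", min(accuracies), "highest:", max(accuracies))
```

Recording the accuracy of every split, rather than that of a single training/test partition, is what exposes how strongly the result depends on which samples happen to fall in the test set.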
Patients and tumour samples for TMA analysis
Patients from the 2 independent primary breast cancer cohorts used in this study have been described previously.8 In brief, cohort I consisted of 114 patients diagnosed with primary invasive breast cancer in Northern Sweden during 1988–1991. Samples were available from 91 patients for analysis of CENP-F expression.
Cohort II consisted of 512 consecutive breast cancer patients diagnosed at the Department of Pathology, Malmö University Hospital, Sweden during 1988–1992. Samples were available from 289 patients for analysis of CENP-F expression. Compared with the 223 missing samples, the 289 available tumor samples included a higher proportion of larger tumors (p < 0.001), ER-negative tumors (p < 0.001), high-grade tumors (p < 0.001) and node-positive patients (p = 0.019). There was no significant difference in patient age (p = 0.367), histological type (p = 0.494) or PR status (p = 0.204) between available and unavailable samples. Ethical approval for the use of human tissue samples for research was obtained from the Review Boards at Umeå and Lund universities, respectively.
Construction of TMAs and immunohistochemistry
TMAs were prepared separately for each cohort as previously described.9 The tissue was deparaffinised, rehydrated and microwave-treated for 10 min in citrate buffer (pH 6.0). For detection of CENP-F, we used a rabbit polyclonal antibody (Abcam, Cambridge, UK; ab5) at a dilution of 1:100. Antibody specificity was confirmed by comparing the immunohistochemical staining of cell lines with corresponding Western blot reactivity. Nuclear staining immunoreactivity was determined by estimating the percentage of distinctly positive tumor cell nuclei. Based on previous studies of CENP-F,10 we used a 10% cut-off point to categorise CENP-F expression, where 0–9% = “<10%”; and 10–100% = “≥10%”. The results were separately scored by 2 observers and results compared. Any discrepancies in scoring were rescored by both observers together and a consensus reached. Evaluation of Ki67, VEGF-A, VEGFR1, VEGFR2, p53, phospho-ERK 1/2 and phospho-Ets-2 has been described elsewhere.8, 11, 12, 13
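The scoring rule can be sketched in a few lines of Python. The 10% cut-off and the two category labels come from the text. The core identifiers and percentages below are invented, and treating a "discrepancy" as a disagreement in category (rather than in the raw percentage) is an assumption.

```python
def categorise(percent_positive, cutoff=10):
    """Map a percentage of positive tumor cell nuclei to one of two categories."""
    if not 0 <= percent_positive <= 100:
        raise ValueError("percentage must lie between 0 and 100")
    return f"≥{cutoff}%" if percent_positive >= cutoff else f"<{cutoff}%"


def discordant_cores(scores_a, scores_b, cutoff=10):
    """Return the core IDs that the two observers placed in different categories."""
    return sorted(
        core for core in scores_a
        if categorise(scores_a[core], cutoff) != categorise(scores_b[core], cutoff)
    )


# Hypothetical scores (percent positive nuclei) keyed by made-up core IDs.
observer_1 = {"core_01": 5, "core_02": 12, "core_03": 40, "core_04": 9}
observer_2 = {"core_01": 8, "core_02": 8, "core_03": 35, "core_04": 10}

# Cores listed here would be re-scored jointly to reach a consensus.
print(discordant_cores(observer_1, observer_2))
```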
Cell culture
The breast cancer cell lines T47D, BT474, MDA-MB-231 and SK-BR3 were obtained from the European Collection of Cell Cultures, Wiltshire, UK. T47D, BT474 and MDA-MB-231 cell lines were grown in DMEM (Sigma, MO) supplemented with 10% FCS (Invitrogen, CA), L-glutamine (2 μM), penicillin (50 IU/ml) and streptomycin sulphate (50 μg/ml). SK-BR3 cells were grown in McCoy's 5a Medium (Sigma, MO) supplemented with 10% FCS. Cells were maintained in humidified air with 5% CO2. Metaphase-arrested cells were obtained by incubating cells in the presence of nocodazole (1 μM) for 16 h.
Cell line array
The breast cancer cell lines T47D, SK-BR3, BT474 and MDA-MB-231 were used to optimize the anti-CENP-F antibody for immunohistochemical analysis. Cell lines were fixed in PFA for 30 min and resuspended in 70% ethanol overnight before being embedded in paraffin and arrayed.
Western blotting
Cultured cells were washed in 10 ml PBS, harvested and lysed in RIPA buffer containing 20 mM Tris pH 7.5, 150 mM NaCl, 1% Nonidet P-40, 0.5% sodium deoxycholate, 1 mM EDTA, 0.1% SDS and protease inhibitor cocktail (Sigma, MO). Protein levels were determined using the bicinchoninic acid (BCA) method (Pierce, IL). Samples containing 30 μg of protein were separated on a 3–8% Tris-acetate gel (Invitrogen, CA) by SDS-PAGE under reducing conditions. After electrophoresis, proteins were transferred to a polyvinylidene fluoride (PVDF) membrane, Immobilon-P (Millipore, MA). Membranes were blocked in 5% non-fat milk for 1 h. CENP-F expression was detected using a rabbit polyclonal anti-human CENP-F antibody (1:1500, clone ab5 from Abcam). Membranes were washed and incubated for 1 h with horseradish peroxidase-conjugated anti-rabbit antibody (1:10,000; Promega, UK). Antigen-antibody complexes were detected using ECL Plus reagent (Amersham Biosciences, Buckinghamshire, UK). Expression of cyclin E was measured by Western blotting and densitometry, as described previously.11, 14
TMA statistical analysis
The χ2 test for trend, Fisher's exact and Mann–Whitney tests were used for comparison of CENP-F expression with all other known parameters. Kaplan–Meier plots were used for survival analysis and the curves compared using the log-rank test.15 Cox proportional hazards regression was used to estimate proportional hazard ratios and conduct multivariate analyses. All calculations were performed with SPSS v11.0 (SPSS, IL).
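For readers unfamiliar with the survival methods named above, the following Python sketch shows a Kaplan–Meier estimate and a two-group log-rank test written from the textbook definitions. It is an illustration only: the study's calculations were run in SPSS, and the follow-up times and event indicators below are invented.

```python
import math


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events[i] is 1 if the endpoint occurred at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_at_t = sum(1 for tt, _ in data if tt == t)
        deaths = sum(1 for tt, e in data if tt == t and e)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= n_at_t  # events and censored cases both leave the risk set
        i += n_at_t
    return curve


def logrank(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square, p value) with 1 df."""
    pooled = zip(list(times_a) + list(times_b), list(events_a) + list(events_b))
    event_times = sorted({t for t, e in pooled if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Invented follow-up times (years) for two hypothetical expression groups.
low_times, low_events = [6, 7, 8, 9, 10], [1, 1, 1, 1, 1]
high_times, high_events = [1, 2, 3, 4, 5], [1, 1, 1, 1, 1]

print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
print(logrank(high_times, high_events, low_times, low_events))
```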
Results
Identification of alternative candidate biomarkers for primary breast cancer following reanalysis of DNA microarray data
When implementing BGA, we used the same filter criteria as van't Veer et al. in their original analysis,1 which reduced the number of genes from ∼25,000 to just over 5,000. The original 78 training breast tumor samples were initially used to identify discriminating genes, and the 19 test samples were used to validate the identified genes. The classification accuracy we achieved was comparable to the original analysis,1 with 84% of the test set being ascribed to the correct prognosis group (data not shown).
To remove a possible training and test sample selection bias, we randomly recategorized patient breast tumor samples into training and test samples. By applying BGA iteratively, the classification accuracy ranged from 36 to 84% with a median classification accuracy of 68% (Fig. 1). A discrimination score was calculated for each gene by averaging the contribution (or weight) of a gene in each BGA over 100 iterations. Thus, this approach should produce a more robust gene ranking, as genes with more discrimination power should re-occur more frequently at higher rankings. Each of the 5,000 genes was ranked according to its average BGA co-ordinate. Supplementary Tables SIa and SIb detail the top 100 genes associated with good prognosis and the top 100 genes associated with poor prognosis, respectively, identified using BGA. Genes were then categorised into gene ontology (GO) categories (Table SII and Figs. S1 and S2). Genes involved in the cell cycle (p ≤ 0.001) and movement/motor activity (p ≤ 0.001) were significantly over-represented while genes involved in development (p ≤ 0.001), signal transduction (p ≤ 0.001) and cell communication (p ≤ 0.034) were significantly under-represented, in the poor prognosis group. Tables SIIIa and SIIIb detail the functional categories associated with each of the top 100 genes associated with good prognosis and the top 100 genes associated with poor prognosis.
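The averaging and ranking step can be sketched as follows. This is a hedged illustration rather than the authors' script: the per-iteration weights are invented, the gene names are placeholders, and the sign convention (negative toward good prognosis, positive toward poor prognosis) is an assumption.

```python
from statistics import mean


def discrimination_scores(weights_per_iteration):
    """Average each gene's signed weight over all re-split iterations."""
    genes = weights_per_iteration[0].keys()
    return {g: mean(run[g] for run in weights_per_iteration) for g in genes}


def top_lists(scores, k):
    """Return the k genes at each end of the averaged ranking."""
    ranked = sorted(scores, key=scores.get)  # most negative first
    good = ranked[:k]                        # assumed good-prognosis end
    poor = ranked[::-1][:k]                  # assumed poor-prognosis end
    return good, poor


# Invented weights from three iterations; the study used 100 iterations
# and roughly 5,000 genes.
runs = [
    {"gene_1": 0.9, "gene_2": -0.4, "gene_3": 0.1, "gene_4": -0.8},
    {"gene_1": 0.7, "gene_2": -0.6, "gene_3": -0.1, "gene_4": -0.6},
    {"gene_1": 0.8, "gene_2": -0.5, "gene_3": 0.3, "gene_4": -0.7},
]

scores = discrimination_scores(runs)
good, poor = top_lists(scores, k=2)
print(good, poor)
```

A gene such as the placeholder `gene_3`, whose weight changes sign between iterations, ends up with an average near zero and drops down the ranking. This is the behaviour the averaging is meant to produce.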
Among the genes we identified as highly associated with poor prognosis were CENP-F, which encodes a kinetochore-associated protein implicated in the regulation of cell division;16, 17 S100A9, previously associated with inflammation and more recently with tumor development and metastasis;18, 19, 20 survivin, an inhibitor of apoptosis and mitotic regulator;21 cathepsin L2, a cysteine protease;22 BUB1, a checkpoint kinase regulating the anaphase promoting complex or cyclosome;23 carbonic anhydrase IX, a hypoxia-regulated enzyme involved in tumor cell survival;24 CART, a neuropeptide;25 and adrenomedullin, an angiogenic peptide.26 Genes we identified as being highly associated with good prognosis included ER, PR, keratin 18, and serpinA3, a protease inhibitor,27 as well as lipophilin B and mammaglobin A, which form a heterodimeric complex and are overexpressed in breast cancer, but whose function remains unknown.28, 29 A heatmap was generated that depicts the association of the prognostic genes we identified with their respective class (Fig. 2). Her2 expression was not a predictor of outcome in our reanalysis and ranked halfway through the 5,000 significant genes in the dataset analyzed (data not shown).
CENP-F is associated with poor prognosis in a related primary breast cancer DNA microarray dataset
CENP-F ranked 66 out of 100 genes associated with poor prognosis, with only small differences in correlation with survival between genes. We analyzed the expression of CENP-F in a related primary breast cancer DNA microarray dataset derived from 295 breast tumors.5 Tumor samples were classified according to CENP-F mRNA expression based on absolute expression analysis P values (alpha level of 0.05), following the method used by Moody et al.7 We found that CENP-F mRNA was overexpressed in 63 (21%) of the 295 tumors, while 108 tumors showed decreased expression and 124 tumors showed no change in expression, relative to reference RNA. High CENP-F expression was associated with increased tumor size (p = 0.028), high tumor grade (p < 0.001) and ER-negative tumors (p < 0.001) (Table I). In addition, CENP-F mRNA expression was significantly related to reduced overall survival (p < 0.001) and reduced metastasis-free survival (p < 0.001) (Fig. 3).
Table I. CENP-F mRNA expression in relation to clinico-pathological variables
|Variable||CENP-F (negative/low) (n = 108)||CENP-F (high) (n = 63)||P value (χ2 test)|
|Age (years)|
|Median (range)||43 (32–52)||45 (26–52)|
|<Median (26–44)||49 (45)||33 (52)||0.376|
|>Median (44–53)||59 (55)||30 (48)|
|Tumor size (mm)|
|Median (range)||20 (2–0)||23 (10–2)|
|T1 (1–20)||65 (60)||27 (43)||0.028|
|T2 (>20)||43 (40)||36 (57)|
|Tumor grade|
|1||45 (42)||6 (10)||<0.001|
|2||38 (35)||18 (29)|
|3||25 (23)||29 (61)|
|Lymph node status|
|Negative||65 (60)||35 (56)||0.553|
|Positive||43 (40)||28 (44)|
|ER status|
|ER−||13 (12)||25 (40)||<0.001|
|ER+||95 (88)||38 (60)|
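As a worked check on how the P values in Table I arise, the sketch below applies a Pearson χ2 test to the tumor size rows: 65 and 27 T1 tumors, and 43 and 36 T2 tumors, in the negative/low and high CENP-F groups, respectively. It assumes no continuity correction was applied, which the paper does not state. Under that assumption the result comes out close to the reported p = 0.028.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (chi-square statistic, p value) with 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Rows: T1, T2. Columns: CENP-F negative/low, CENP-F high (counts from Table I).
chi2_size, p_size = chi2_2x2(65, 27, 43, 36)
print(round(chi2_size, 2), round(p_size, 3))

# Rows: Negative, Positive (the rows reported with p = 0.553 in Table I).
chi2_neg_pos, p_neg_pos = chi2_2x2(65, 35, 43, 28)
print(round(chi2_neg_pos, 2), round(p_neg_pos, 3))
```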
Expression of CENP-F in primary breast cancer
Our next aim was to validate our findings using CENP-F expression at the protein level. The specificity of the anti-CENP-F antibody was first established using Western blotting, in parallel with immunohistochemical analysis of formalin-fixed, paraffin-embedded cell lines mimicking the handling of the primary tumors. CENP-F is maximally expressed in the G2/M phase of the cell cycle.16, 17 Each of the 4 breast cancer cell lines used in the study was treated with the anti-mitotic agent nocodazole, which arrests cells at the G2/M phase of the cell cycle. As shown in Figure 4a, the anti-CENP-F antibody detected a protein of ∼350 kDa by Western blotting, with M-phase arrested, nocodazole-treated cells showing increased CENP-F expression when compared with untreated cells, as expected. Nuclear expression of CENP-F was detected by immunohistochemistry in a proportion of cells in each of the cell lines examined, in the absence of nocodazole (Fig. 4b).
CENP-F expression was then assessed in 2 different breast cancer cohorts (I and II) arranged in TMAs. For breast cancer cohort I, 90 (99%) out of 91 tumors available for analysis expressed nuclear CENP-F in various amounts (Fig. 5). In cohort II, 289 samples were available for analysis and nuclear staining was seen in 206 (71%) out of 289 specimens. Patients from cohort I were significantly younger (p = 0.007) and their tumors significantly larger (p = 0.006) than those available for analysis from cohort II, which could contribute to the differences in the proportion of CENP-F-positive tumors between the cohorts. In addition, cohort I included more ER-negative breast carcinomas than cohort II (29% versus 20%), but this difference was not statistically significant. There was no significant difference in grade, nodal status or PR status between the samples analyzed in cohorts I and II.
CENP-F protein expression correlates with clinico-pathological parameters in primary breast cancer
In patient cohorts I and II, we analyzed potential associations between CENP-F expression and known clinico-pathological parameters such as tumor size, tumor type, grade, hormone receptor status, patient age and the presence of lymph node metastases. We used a 10% cut-off point for CENP-F expression to categorize samples into groups, in accordance with previous studies.10 CENP-F expression was associated with ER-negative tumors (p = 0.028) in cohort II, and with high grade tumors in both cohort I (p = 0.002) and cohort II (p < 0.001) (Table II). CENP-F expression was not associated with tumor size, patient age, lymph node status, histological type or PR status in either cohort.
|Variable||Cohort I||p-value||Cohort II||p-value|
|N||CENP-F < 10%||CENP-F ≥ 10%||n||CENP-F < 10%||CENP-F ≥ 10%|
|Median (range)||56 (33–82)||62 (30–84)||65 (28–92)||64 (27–96)|
|Tumour size (mm)||86||0.212||263||0.660|
|Median (range)||20 (10–60)||25 (12–100)||19 (1–100)||20 (0–100)|
|Ductal||33 (94)||49 (88)||122 (72)||70 (73)|
|Lobular||2 (6)||3 (5)||29 (17)||12 (13)|
|Medullary||0||2 (3.5)||7 (4)||8 (8)|
|Tubular||0||0||6 (3.5)||4 (4)|
|Mucinous||0||2 (3.5)||6 (3.5)||2 (2)|
|Negative||19 (56)||25 (50)||104 (63)||46 (51)|
|Positive||15 (44)||25 (50)||61 (37)||44 (49)|
|ER–||6 (17)||20 (36)||29 (16)||27 (27)|
|ER+||29 (83)||35 (64)||154 (84)||72 (73)|
|PR–||19 (56)||24 (48)||61 (38)||42 (45)|
|PR+||15 (44)||26 (52)||101 (62)||51 (55)|
|1||17 (48)||19 (34)||37 (20)||12 (12)|
|2||9 (26)||5 (9)||92 (50)||30 (29)|
|3||9 (26)||32 (57)||57 (30)||60 (59)|
CENP-F protein expression correlates with tumour biological parameters in primary breast cancer
CENP-F expression was associated with the proliferation marker Ki67 in cohort II (Table III; p < 0.001) but not in cohort I (Table III; p = 0.198). In addition, CENP-F expression was associated with markers of chromosomal instability (CIN) including cyclin E over-expression (p = 0.021), survivin nuclear expression (unpublished data; p = 0.001), c-Myc amplification (p = 0.003), increased telomerase activity (p = 0.002) and aneuploidy (p = 0.025) in cohort I, indicating a link between CENP-F and CIN in these tumors (Table III). VEGF-A was not associated with CENP-F expression in either cohort I (p = 0.070) or cohort II (p = 0.959). However, in cohort II, significant associations were observed between CENP-F expression and tumor-specific VEGFR2 expression (p = 0.001) and phosphorylated Ets-2 (p = 0.001) but not with phosphorylated Erk1/2 (p = 0.190) (Table III). Finally, we found no association between CENP-F expression and tumor-specific VEGFR1, p53 or Her2 overexpression (Tables II and III).
|Variable (n)||CENP-F <10%||CENP-F ≥10%||p value|
|Low||5 (14)||3 (5.6)|
|Intermediate||23 (66)||29 (53.7)|
|High||7 (20)||22 (40.7)|
|Cyclin E (91)||0.021|
|Low||29 (83)||33 (59)|
|High||6 (17)||23 (41)|
|Telomerase activity (86)||0.002|
|Myc amplification (71)||0.003|
|Low||27 (96)||29 (67)|
|Intermediate/High||1 (4)||14 (33)|
|Diploid||19 (61)||19 (35)|
|Aneuploid||12 (39)||35 (65)|
|Negative||7 (20)||5 (9)|
|<50%||22 (63)||21 (37.5)|
|>50%||6 (17)||30 (53.5)|
|Grade 0–2||29 (85)||42 (78)|
|Grade 3||5 (15)||12 (22)|
|p53 status (89)||0.095|
|p53 –||27 (82)||36 (64)|
|p53 +||6 (18)||20 (36)|
|≤10%||11 (32)||10 (18)|
|>10%||23 (68)||45 (82)|
|Neg/ Low||76 (50)||23 (26)|
|Intermediate||55 (37)||44 (51)|
|High||20 (13)||20 (23)|
|Neg/Low||54 (37)||35 (39)|
|Intermediate||65 (44.5)||35 (39)|
|High||27 (18.5)||19 (22)|
|Neg/Low||28 (16)||15 (15)|
|Intermediate||75 (42)||34 (35)|
|High||74 (42)||49 (50)|
|Negative||59 (37)||18 (18)|
|Low||43 (27)||34 (35)|
|Intermediate||41 (26)||24 (25)|
|High||15 (10)||22 (22)|
|Phospho-Erk 1/2 (231)||0.190|
|Negative||66 (43)||32 (35.5)|
|Intermediate||29 (19)||13 (19)|
|High||16 (10)||14 (15.5)|
|≤10%||76 (42)||15 (15)|
|>10%||106 (58)||83 (85)|
CENP-F protein expression correlates with clinical outcome in primary breast cancer
Clinical follow-up data were available for all patients in cohorts I and II.8 Patient cohorts were analyzed separately. In agreement with previous findings,10 we found that using a cut-off of 10% CENP-F expression best separated patients on the basis of survival. In cohort I, CENP-F expression showed a significant association with overall survival (p = 0.05; Fig. 6a). In cohort II, expression of CENP-F correlated significantly with both breast cancer specific survival (p = 0.009) and overall survival (p = 0.04) (Fig. 6b and 6c).
Patients from cohort I had overall survival information only, and patient numbers were too low to carry out multivariate Cox regression analysis. In a univariate analysis of cohort I, CENP-F expression was associated with worse overall survival, with this association approaching significance (HR, 2.03; 95% CI, 0.97–4.24; p = 0.059). Univariate and multivariate Cox regression analyses were conducted on cohort II (Table IV). CENP-F expression, ER and tumor size were all significantly associated with breast cancer-specific survival (p = 0.011; 0.006; 0.001, respectively) but patient age, VEGFR2 and phospho-Ets-2 were not (p = 0.088; 0.408 and 0.544, respectively). In a multivariate analysis including ER and tumor size, CENP-F expression was an independent predictor of breast cancer-specific survival (p = 0.036) (Table IV). Tumor grade was not included in the multivariate model as our data suggests that CENP-F is involved in CIN; thus, tumor grade may be on a causal pathway between CENP-F and survival and should, therefore, not be included in multivariate models with CENP-F.30, 31 Using a similar approach, we carried out univariate analysis for overall survival on CENP-F expression, ER, VEGFR2, phospho-Ets-2, tumor size and patient age (Table V). CENP-F expression, tumor size and patient age were significantly associated with overall survival in a univariate analysis (p = 0.047; <0.001; <0.001, respectively), and in a multivariate analysis CENP-F retained its prognostic significance (p = 0.040) together with patient age (p < 0.001) (Table V). Ki67 showed only borderline significant association with overall survival in cohort II (p = 0.05) and was not significant when added to multivariate models for breast cancer specific or overall survival (data not shown).
Table IV. Cox regression analysis of breast cancer-specific survival in cohort II
|Variable||Univariate HR||95% CI||p-Value||Multivariate HR||95% CI||p-Value|
|CENP-F (<10% vs. ≥10%)||1.93||1.16–3.21||0.011||1.76||1.04–2.98||0.036|
|Tumour size (mm) (continuous)||1.01||1.01–1.02||0.001||1.01||1.00–1.02||0.029|
|ER (pos vs. neg)||0.49||0.29–0.81||0.006||0.54||0.31–0.97||0.037|
|Patient age (continuous)||0.99||0.97–1.01||0.088||n/a|
|VEGFR2 (grade 3 vs. 0–2)||0.80||0.38–1.67||0.544||n/a|
|Phospho-Ets-2 (grade 3 vs. 0–2)||0.79||0.39–1.59||0.513||n/a|
Table V. Cox regression analysis of overall survival in cohort II
|Variable||Univariate HR||95% CI||p-Value||Multivariate HR||95% CI||p-Value|
|CENP-F (<10% vs. ≥10%)||1.38||1.01–1.90||0.047||1.40||1.02–1.92||0.040|
|Tumour size (mm) (continuous)||1.01||1.01–1.01||<0.001||1.00||0.997–1.01||0.261|
|Patient age (continuous)||1.06||1.05–1.07||<0.001||1.05||1.03–1.06||<0.001|
|ER (pos vs. neg)||0.73||0.52–1.01||0.057||n/a|
|VEGFR2 (grade 3 vs. 0–2)||1.19||0.81–1.76||0.380||n/a|
|Phospho-Ets-2 (grade 3 vs. 0–2)||1.06||0.72–1.56||0.780||n/a|
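The hazard ratios, confidence intervals and p values from the Cox models are linked through the standard error of the log hazard ratio. The sketch below recovers an approximate p value from a reported hazard ratio and its 95% CI. It assumes the interval is a Wald interval, symmetric on the log scale, which is the usual form of Cox regression output but is not stated in the paper. Because the published figures are rounded, the recovered values match the reported ones only approximately.

```python
import math


def wald_p_from_ci(hr, lower, upper, z_crit=1.959964):
    """Recover (standard error, z, two-sided p) from a hazard ratio and 95% CI.

    Assumes a Wald interval: log(HR) +/- z_crit * SE.
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# CENP-F, breast cancer-specific survival in cohort II:
# univariate (HR 1.93, 95% CI 1.16-3.21; reported p = 0.011)
# multivariate (HR 1.76, 95% CI 1.04-2.98; reported p = 0.036)
se_uni, z_uni, p_uni = wald_p_from_ci(1.93, 1.16, 3.21)
se_multi, z_multi, p_multi = wald_p_from_ci(1.76, 1.04, 2.98)
print(round(p_uni, 3), round(p_multi, 3))
```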
Discussion
DNA microarrays offer new possibilities for the elucidation of individual genes and groups of genes that are preferentially expressed in tumor subgroups. The 70-gene prognosis classifier identified by van't Veer et al.1 contained a large number of unknown or unexpected genes and none of the well-known prognostic markers in breast cancer such as ER, Her-2, uPA or PAI-1.32 This dataset forms the basis of a clinical trial, which aims to validate the efficacy of using the identified classifier for tailoring of treatment options. However, the methodology used to obtain this gene signature has been criticized.3, 4
Given the complexity of selecting a relatively small number of informative genes from the many thousands of genes represented on a DNA microarray, reanalysis of such data using alternative approaches to identify discriminating genes is warranted. Here, we used the statistical method of BGA, a powerful method for the analysis of cancer microarray data,6 to reanalyze this breast cancer dataset from van't Veer et al.1 Our reanalysis approach revealed genes involved in key processes such as checkpoint control, apoptosis and angiogenesis, most of which were previously unidentified in the original analysis.
The classification accuracy we achieved using the same training and test samples as van't Veer et al.1 was comparable to published results, i.e. 84%. However, when training and test samples were selected randomly, the classification accuracy varied widely from a maximum accuracy of 84% to as low as 36%, with a median classification accuracy of 68%. Similar findings have been published by others,3, 4 suggesting a bias in the selection of the original training and test samples.
In our reanalysis of the van't Veer dataset,1 CENP-F was among the genes that were highly associated with poor prognosis that could be studied at the protein level using TMAs. CENP-F is a cell cycle-regulated protein associated with kinetochores, the site at which chromosome-microtubule interactions are monitored and the source of checkpoint signals.33 CENP-F is maximally expressed at the G2/M phase of the cell cycle16, 17 and has been implicated in kinetochore assembly and/or the spindle checkpoint.34, 35 More recently, CENP-F has been shown to play a central role in the recruitment of the checkpoint proteins, BubR1 and Mad1, resulting in a sustained checkpoint response.36
In a related DNA microarray dataset that we reanalyzed containing 295 breast tumor samples,5 over-expression of CENP-F mRNA was associated with larger tumor size, as well as ER-negative, high grade tumors. CENP-F mRNA expression correlated significantly with worse overall survival and a decreased probability of remaining metastasis-free.
Two different primary breast cancer cohorts were used to further investigate the role of CENP-F, as each cohort has unique data available. CENP-F protein expression correlated with reduced breast cancer-specific survival and overall survival in both univariate and multivariate analyses. The strong correlation between CENP-F expression and breast cancer-specific survival highlights the usefulness of CENP-F as a breast cancer-specific marker of poor outcome. Our findings are in agreement with a previous report analyzing CENP-F expression and disease-free survival in node-negative breast cancer patients.10
In cohort I, parameters relating to cell cycle deregulation and CIN had previously been analyzed.14, 37 CENP-F expression was associated with cyclin E over-expression, survivin nuclear expression and c-Myc amplification. Cyclin E is involved in centrosome duplication leading to CIN,38, 39, 40 while constitutive expression of cyclin E has been shown to result in CIN41, 42 and is associated with poor prognosis in breast cancer.14 Survivin has been reported to activate the cyclin E/Cdk2 complex, resulting in an accelerated S phase shift.43 CENP-F expression was also associated with c-Myc amplification, which has been shown to activate cyclin E/Cdk2, leading to cell cycle progression and proliferation.44 In addition, CENP-F expression correlated significantly with high telomerase activity. Telomerase activation is associated with telomere dysfunction, a major mechanism underlying CIN of human cancer.45, 46 Furthermore, a significant proportion of tumors over-expressing CENP-F were aneuploid, strengthening the relation between CENP-F expression and markers of CIN. Additional studies, including fluorescence in situ hybridization (FISH) analysis, will determine if these associations have functional significance. FISH analysis could not be performed in this study because of insufficient sample availability.
While tumor VEGF-A expression did not correlate with CENP-F expression in either patient cohort I or II, we found a significant correlation between tumor cell VEGFR2 expression and CENP-F in cohort II. CENP-F is a phosphoprotein but it is not known which kinases target CENP-F for phosphorylation or the role of phosphorylation in CENP-F regulation. It is tempting to speculate that CENP-F may be a target for phosphorylation through cyclin E or VEGFR2, as CENP-F is significantly associated with expression of both of these proteins. However, further studies will need to be carried out to establish a functional link.
In line with other publications, CENP-F was associated with proliferation47, 48, 49 and ER negativity5, 10 in cohort II. Furthermore, CENP-F expression was associated with the transcription factor phospho-Ets-2. Ets-2 expression in breast cancer may be linked to proliferation;8 however, the downstream target genes are unknown. CENP-F regulates gene transcription and proliferation through association with the transcription factor ATF4.50 Our results suggest that CENP-F may be a potential candidate for Ets-2 co-transcriptional regulation.
CENP-F is a farnesylated protein and is targeted by farnesyl transferase inhibitors (FTIs),51, 52 resulting in CENP-F inactivation. Originally generated to inhibit oncogenic RAS, FTIs are effective anti-neoplastic agents. It is now becoming apparent that RAS is not the only target of FTIs; however, the role of other molecular targets and their mechanism of action remains elusive.53 FTIs have been shown to be effective in clinical trials of patients with metastatic breast carcinoma, especially in Her2-positive patients,54, 55 and CENP-F-positive breast cancer has a pathologic response to preoperative chemotherapy.56 FTI-sensitive cells pause at the G2/M phase of the cell cycle51, 57 and have misaligned chromosomes,58 similar to cells depleted of CENP-F by RNAi.59, 60 The anti-neoplastic activity, involving inhibition of proliferation and apoptosis, may be partly due to CENP-F inhibition. Thus, CENP-F may be an important, clinically significant target in breast cancer and CENP-F farnesylation a useful biomarker of tumor response.
Acknowledgements
We are grateful for excellent technical assistance from Ms. Elise Nilsson. We thank Prof. Leslie Daly and Prof. Paul McKeigue for statistical advice. The cross-national component of this work was facilitated by the Marie Curie Transfer of Knowledge Industry–Academia Partnership research programme, TargetBreast (www.targetbreast.com). The UCD Conway Institute is funded by the Programme for Research in Third Level Institutions (PRTLI), as administered by the Higher Education Authority (HEA) of Ireland. Prof. Robert Millikan was supported, in part, by a Fulbright Scholarship Award during the execution of this work.
References
- 29. a gene preferentially expressed in breast tissue and up-regulated in breast cancer. Int J Cancer, in press.
- 54. Farnesyltransferase inhibitors and their potential in the treatment of breast carcinoma. Semin Oncol 2005; 30(5, Suppl 16): 79–92.