Brain metastases (BMs) are a common occurrence in patients with breast cancer, and accurately predicting survival in these patients is critical to appropriate management. A survival nomogram for breast cancer patients with BM was constructed, and its performance was compared with that of current predictive models of survival.
A Cox proportional hazards regression with a nomogram representation was used to model survival in a population of 261 women with breast cancer and BMs treated from 1999 to 2008. The model was validated internally by 10-fold cross-validation and bootstrapping, and concordance (c) indices were calculated. The predictive performance of the nomogram described here is compared to current prognostic models, including recursive partitioning analysis, graded prognostic assessment, and diagnosis-specific graded prognostic assessment.
The c-index for the model described here was 0.67. It outperformed recursive partitioning analysis, graded prognostic assessment, and diagnosis-specific graded prognostic assessment, based on c-index comparisons.
Breast cancer is among the most common cancers in the United States, with a lifetime risk in women of 12%.1 The overall incidence of brain metastases (BMs) in patients with breast cancer has been reported as high as 30%,2 although subgroups defined by clinical parameters and molecular markers [including estrogen receptor (ER), progesterone receptor (PR), human epidermal growth factor receptor type-II (HER2), and others] have unique rates of BM that may differ significantly from this aggregate statistic.3-6 From the time of development of BMs, the median survival in breast cancer patients has historically been 3 to 6 months.7 More recently, the combination of early detection and advances in local and systemic therapy has led to improvements in breast cancer outcomes.8 The result has been an improvement in primary disease control,2, 9-12 a decline in overall mortality,8, 13, 14 and an increase in the number of long-term survivors15 over the past 2 decades. Prognostic models that accurately predict patient survival in the modern era of breast cancer therapy are essential to optimize the management of breast cancer patients with BM.
The first widely accepted model for prospective survival prediction in patients with BM was developed in 1997 by the Radiation Therapy Oncology Group (RTOG) using a recursive partitioning analysis (RPA) strategy16 in a population of patients with BM from various primary malignancies. The resulting model used 4 primary covariates, namely age, Karnofsky performance score (KPS),17 control of primary tumor, and presence of extracranial metastases, to define 3 “RPA classes” with distinct mean survivals for patients with BM. This system was replaced 11 years later by the graded prognostic assessment (GPA).18, 19 Also developed using patients with BM from various primary malignancies, the GPA system removed the subjective, binary variable of “primary disease control,” added consideration for the number of BM, and converted age, KPS, and number of brain lesions to semiquantitative, categorical variables with 3 possible values (0, 0.5, or 1). This resulted in a system with 9 potential scores (0-4 in increments of 0.5) defining 4 prognostic groups (0-1.0, 1.5-2.5, 3.0, and 3.5-4.0), each with distinct mean survival times.18
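The GPA scoring scheme described above can be sketched in a few lines of code. This is an illustrative sketch only: it assumes that four per-factor scores (age, KPS, number of BM, and extracranial metastases) have already been assigned, and the function names and the 1-based group numbering (in the order the score ranges are listed above) are our own conventions rather than part of the published GPA.

```python
from typing import Sequence

# Prognostic-group score ranges as quoted in the text: 0-1.0, 1.5-2.5, 3.0, 3.5-4.0.
GPA_GROUPS = [(0.0, 1.0), (1.5, 2.5), (3.0, 3.0), (3.5, 4.0)]
ALLOWED_FACTOR_SCORES = {0.0, 0.5, 1.0}


def gpa_total(factor_scores: Sequence[float]) -> float:
    """Sum four per-factor scores, each of which must be 0, 0.5, or 1."""
    if len(factor_scores) != 4:
        raise ValueError("expected four factor scores")
    if any(score not in ALLOWED_FACTOR_SCORES for score in factor_scores):
        raise ValueError("each factor score must be 0, 0.5, or 1")
    return float(sum(factor_scores))


def gpa_group(total: float) -> int:
    """Return the 1-based index of the score range that contains `total`."""
    for index, (low, high) in enumerate(GPA_GROUPS, start=1):
        if low <= total <= high:
            return index
    raise ValueError(f"{total} is not a valid GPA total")
```

For example, per-factor scores of 1, 0.5, 0.5, and 1 sum to 3.0, which falls in the third of the four score ranges.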
Recognition that modern management strategies, clinical courses, and outcomes are unique for patients with BM from different primary malignancies has catalyzed interest in disease-specific prognostic models. In 2010 the GPA methodology was adapted to construct diagnosis-specific GPA classes (DS-GPA) to predict survival in patients with BM from lung, renal, melanoma, gastrointestinal, and breast cancers.20 Analysis of 642 breast cancer patients suggested that the only factor contributing significantly to survival was KPS, so the DS-GPA model for breast cancer is based solely upon this parameter. More recently the breast cancer DS-GPA has been modified to include hormone receptor status (ER/PR), HER2 amplification status, and number of BM.21, 22
The revised DS-GPA for breast cancer represents an improvement over the KPS-only model. Notwithstanding, potential limitations in the GPA analytic approach and the incremental method of constructing the revised breast cancer DS-GPA led us to believe that alternate methods to model survival prospectively in these patients may outperform the revised DS-GPA. We used a nomogram approach to construct a model for survival in breast cancer patients with BM and tested it against other common models of survival.
MATERIALS AND METHODS
Patients managed by the Brain Tumor and Neuro-Oncology Center (BTNC) at the Cleveland Clinic from 1999-2008, with a histologic diagnosis of breast adenocarcinoma and ≥1 brain lesions consistent with metastases, were eligible. A start date of 1999 corresponds with the introduction of trastuzumab therapy. Patients with metastases to the spine or leptomeningeal carcinomatosis were excluded. Patients were also excluded if they were under 18 years of age at the time of diagnosis or if they harbored more than one primary malignancy of different histologic type. Approval of the Cleveland Clinic Institutional Review Board (IRB) was obtained. The study population comprised 261 patients.
Data were extracted, compiled, and verified against patients' historical medical records. This included relevant demographic data, data regarding survival and time to progression, radiographic data regarding the location, size, and number of metastases, clinical data regarding extent of primary disease, and molecular data, including ER, PR, and HER2 status. KPS was used as an objective descriptor of functional status at the time of diagnosis of CNS disease, and the Social Security Death Index was used to verify dates of death. The RPA class,16 GPA class,18, 19 and the original20 and modified21 DS-GPA classes for breast cancer were calculated using these data. These classes could be calculated for 148 (56.7%), 253 (96.9%), 251 (96.2%), and 116 (44.4%) of 261 patients, respectively. Missing values for these classes were imputed by predictive mean matching. Data are summarized in Table 1.
Table 1. Demographic Characteristics Are Shown for the Study Population
Variables reported: CNS survival (diagnosis of metastases to death or censoring, mo); outcome (n, %); Karnofsky performance status at diagnosis of brain metastases; number of CNS metastases at diagnosis; maximum dimension of largest brain metastasis (mm); extra-CNS metastatic disease (n, %); estrogen receptor status (n, %); progesterone receptor status (n, %); HER2 amplification status (n, %), where negative denotes not amplified; breast cancer stage at initial diagnosis; original DS-GPA class; modified DS-GPA class.
Factors available in our database for which prior evidence suggests possible influence on neurologic survival in patients with BM were treated as variables in subsequent data analysis. These factors included age,16 KPS,16 presence of extracranial metastatic disease,18 number of CNS metastases,18 largest dimension of CNS metastasis, ER status,21 PR status,21 HER2 status,10-12, 21 and stage of primary malignancy. To be included, all variable values required validation at the Cleveland Clinic through direct examination by a staff physician (eg, age, KPS, presence of metastatic disease), direct examination of the neuroimaging (eg, number and size of CNS lesions), or molecular or pathologic testing (eg, ER and PR immunohistochemistry or FISH analysis of HER2 status), which explains why some variables are absent in some cases.
Survival (“neurologic survival”) was calculated as the interval between diagnosis of BM and death. Patients alive at the time of analysis had a censored survival point entered as the last date of available clinical data. Complete data, including all of the aforementioned variables, was available for 63 of 261 patients (24.1%). The remainder of patients had one or more missing values imputed after log-transformation of the survival time.23 After imputation, survival time was transformed back to original scale.
We used the method previously described and validated by Kattan and colleagues to construct our survival nomogram24-27 (Fig. 1). This method is based on a Cox proportional hazard regression, and the proportional hazards assumption was verified by examination of residual plots. All measured covariates available in the database were used to construct the survival model. Restricted cubic splines23 were used for continuous variables (age, KPS, number of BM, and largest dimension of CNS metastasis) to allow for possible nonlinear relationships between these variables and survival. Ten-fold cross-validation and bootstrapping with 200 resamples were used to validate the nomogram.24, 26, 27 Calibration was assessed by plotting predicted versus observed probabilities (Fig. 2).
The concordance index (c-index), which quantifies the level of agreement between the predicted survival probability and the actual chance of survival,24, 27 was calculated for the primary and bootstrapped models. Interpretation of this index is similar to that of the area under a receiver operating characteristic curve: an index of 1.0 indicates a model that is perfectly concordant with the dataset; an index of 0.0 suggests perfect discordance.23, 28 An index of 0.5 suggests 50% concordance, which is consistent with perfectly random association. The c-indices were also calculated for RPA, GPA, and both the original and the newly modified DS-GPA models for the patients in this sample to evaluate the performance of our nomogram against current standards for survival prediction in this patient population.
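For readers unfamiliar with the c-index, the following is a minimal sketch of one common formulation (Harrell's c for right-censored data). It is not the code used in this study. It assumes that each patient has a follow-up time, an indicator of whether death was observed, and a predicted risk score in which a higher value means a worse predicted prognosis; the function and argument names are illustrative.

```python
from typing import Sequence


def concordance_index(times: Sequence[float],
                      events: Sequence[bool],
                      risk: Sequence[float]) -> float:
    """Harrell's c-index for right-censored survival data.

    A pair (i, j) is usable when patient i died (event observed) strictly
    before patient j's follow-up time. The pair is concordant when patient i
    also has the higher predicted risk; ties in predicted risk count as 0.5.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable
```

Under this definition, predictions that rank every usable pair correctly give 1.0, predictions that rank every pair incorrectly give 0.0, and a classifier that assigns every patient the same risk gives 0.5. Coarse categorical systems produce many tied predictions, each of which contributes 0.5 for its pair.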
A total of 261 patients treated between January 1, 1999, and December 31, 2008, for brain metastases from breast cancer comprised the study group. The mean age at diagnosis was 47.4 years and the median KPS at diagnosis was 90. The median number of brain lesions was 2, and the mean was 2.7 (σ = 2.8). The median RPA and GPA classes were 2 and 2, respectively. The median DS-GPA and modified DS-GPA classes were 3 and 2, respectively. These data, as well as additional data regarding the characteristics of the brain lesions, are presented in Table 1.
Of the 261 patients comprising the study group, 82 patients (31.4%) were confirmed as HER2 amplified (HER2+), 90 (34.5%) were HER2 unamplified (HER2-), and HER2 status was not assayed or was not confirmed at our institution in the remaining 89 patients (34.1%). A total of 72 HER2+ patients received adjuvant chemotherapy targeted at the HER2 receptor. This represents 27.2% of the total population but 87.8% of the HER2+ population. Of these 72 HER2+ patients, 59 (79.7%) were treated with trastuzumab, 6 patients (8.1%) were treated with lapatinib, and 9 (12.2%) received both. Three of the 90 HER2- patients (3.3%) were treated with trastuzumab despite documented negative HER2 status.
The nomogram (Fig. 1) is used by drawing a vertical line connecting the value of each variable with the point score at the top of the diagram. The scores for each variable are then summed to give a total points score, which is plotted along the “total points” line at the bottom of the nomogram. A vertical line drawn downward through the 1-, 3-, and 5-year survival scales allows for calculation of the probabilities associated with each survival interval.
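The graphical procedure above corresponds to a simple calculation, sketched here for illustration. Every number in this sketch is invented and is not read from Fig. 1. The sketch assumes that total points are proportional to the linear predictor of the Cox model, that more points mean a higher hazard, and that survival follows the usual Cox relationship S(t | x) = S0(t) ** exp(linear predictor). The constant names and the per-variable points in the example are hypothetical.

```python
import math
from typing import Dict

# Hypothetical values for illustration only; NOT taken from the published nomogram.
HYPOTHETICAL_POINTS_PER_UNIT_LP = 50.0  # nomogram points per unit of linear predictor
HYPOTHETICAL_BASELINE_SURVIVAL = {1: 0.70, 3: 0.35, 5: 0.20}  # S0(t) at 0 total points


def total_points(points_by_variable: Dict[str, float]) -> float:
    """Sum the per-variable point scores read from the top axis."""
    return sum(points_by_variable.values())


def survival_probability(points: float, years: int) -> float:
    """Map total points to a survival probability at 1, 3, or 5 years."""
    linear_predictor = points / HYPOTHETICAL_POINTS_PER_UNIT_LP
    baseline = HYPOTHETICAL_BASELINE_SURVIVAL[years]
    return baseline ** math.exp(linear_predictor)


# Example with made-up per-variable points for a single patient.
example_points = total_points({"KPS": 20.0, "HER2 status": 10.0, "number of BM": 15.0})
example_one_year = survival_probability(example_points, 1)
```

The three survival scales at the bottom of a nomogram play the role of `survival_probability` evaluated at 1, 3, and 5 years for the same total points.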
The concordance index (c-index) of this model is 0.67 from 10-fold cross validation and 0.66 from bootstrapping. By comparison, the concordance index is 0.51 for RPA, 0.58 for GPA, 0.57 for original DS-GPA, and 0.61 for modified DS-GPA. The nomogram is a more accurate predictor of survival in these patients than the RPA, GPA, original DS-GPA, or modified DS-GPA classifications. These data are summarized in Table 2.
Table 2. Performance of the Nomogram Is Shown Relative to Current Predictive Models
Two independent strategies were used to validate this model. Ten-fold cross-validation26, 27 refers to the process of dividing the original patient sample into 10 equal groups, then removing 1 group and reconstructing the model using the reduced sample set. The new model is then tested for predictive accuracy against the excluded fraction, the process is repeated 10 times (each time with a different excluded subset), and the concordance index is calculated based on these results. This process is repeated 200 times to reduce the effect of random splits, and an overall c-index is calculated. This validation method resulted in a c-index of 0.67 for our model. Alternately, a strategy of resampling with replacement (bootstrapping)26, 27 constructs a sample of equal size to the original cohort but with the possibility of repetition or exclusion of any given patient, typically resulting in a novel sample containing approximately two-thirds of the original samples.29 This process is repeated 200 times to account for potentially uneven sampling. This validation method resulted in a c-index of 0.66 for our model. Further refinements are likely when other groups apply our nomogram to their unique patient populations.27
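The two resampling schemes described above can be sketched as follows. This is an illustrative sketch rather than the code used for our analysis: it generates only the patient indices for each split or resample and leaves model fitting and c-index calculation to the caller. The function names are our own.

```python
import random
from typing import List, Tuple


def k_fold_indices(n: int, k: int,
                   rng: random.Random) -> List[Tuple[List[int], List[int]]]:
    """Shuffle patients 0..n-1 and return k (training, held-out) index splits."""
    order = list(range(n))
    rng.shuffle(order)
    folds = [order[i::k] for i in range(k)]  # fold sizes differ by at most 1
    splits = []
    for held_out in range(k):
        test = folds[held_out]
        train = [idx for f, fold in enumerate(folds) if f != held_out
                 for idx in fold]
        splits.append((train, test))
    return splits


def bootstrap_indices(n: int, rng: random.Random) -> List[int]:
    """Draw n patient indices with replacement."""
    return [rng.randrange(n) for _ in range(n)]


def mean_unique_fraction(n: int, resamples: int, rng: random.Random) -> float:
    """Average fraction of distinct patients appearing in a bootstrap resample."""
    total = 0.0
    for _ in range(resamples):
        total += len(set(bootstrap_indices(n, rng))) / n
    return total / resamples
```

With 261 patients and 10 folds, the groups cannot be exactly equal and contain 26 or 27 patients each. The expected fraction of distinct patients in a bootstrap resample is 1 - (1 - 1/n)^n, which is close to 0.63 for n = 261 and is consistent with the approximate two-thirds figure quoted above.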
Nomograms for Survival Prediction
Nomograms have been used successfully for at least a decade to predict clinical outcomes based upon combinations of clinical and laboratory data24-27 and have several advantages over traditional, categorical predictors, such as TNM stage or RPA/GPA class. Nomograms provide a simple, graphical representation of the factors that influence the modeled outcome, which makes an underlying survival model that is otherwise challenging to conceptualize easier to interpret.27 This representation allows clinicians and patients to understand the pertinent disease features and their relative roles in outcome prediction, and it facilitates straightforward calculation of outcome probabilities given a set of known covariate values. This probabilistic output may be preferable to a categorical class assignment, as it may reflect the fundamental nature of the underlying disease process more realistically. In addition, the capability of nomograms to generate individualized outcome prediction enables their use in identification and stratification of patients for inclusion in clinical trials.27
Recent data suggest that nomogram-based strategies may outperform traditional, categorical predictive models for a variety of outcomes associated with cancer.24, 30, 31 The most familiar nomograms in clinical use are those for predicting disease recurrence, continence, and erectile function after radical prostatectomy for patients with prostate cancer,24-26 the application of which has become standard-of-care in the management of this disease. Additional successes of nomogram-based strategies in modeling outcomes for bladder cancer,30 sarcoma,32 melanoma,33 and gastrointestinal tumors34 have led some authors to suggest their use as an alternative to, or as a replacement for, traditional, TNM staging.30-34 Their theoretical and practical advantages, combined with their recent successes in predicting outcomes in other cancer types, prompted us to investigate a survival nomogram as an alternative to the current, categorical systems (RPA, GPA) to predict survival in breast cancer patients with BMs.
Comparative Predictive Ability
The c-index of the nomogram that we constructed (c = 0.67) suggests that it outperforms current, categorical predictors of survival by 9.8% to 31.4%. In contrast, the relatively low c-indices of the RPA (c = 0.51) and GPA (c = 0.58) suggest that these methods may be only slightly better than the flip of a coin for predicting survival in patients with BM from breast cancer. This may be attributable to the fact that these systems fail to consider the primary diagnosis as a covariate. This potential explanation is consistent with the findings of Sperduto et al,20 who observed that the GPA model is reasonably accurate in patients with lung cancer metastases but not in patients with BM from other primary malignancies. This finding prompted them to argue in favor of and, subsequently, to develop disease-specific prognostic indices.
Notwithstanding, the original diagnosis-specific GPA (DS-GPA) for breast cancer did not outperform the nonspecific GPA (c = 0.57 and c = 0.58, respectively) in our population. This suggests that the DS-GPA may not have the originally-predicted degree of external validity, a finding that may be attributable to some combination of sample characteristics, management strategies used for these patients, mathematical limitations of the GPA methodology, or effects related to the selection of primary covariates. With regard to the latter, inclusion of molecular features of known importance in breast cancer survival, including ER/PR and HER2 status, in a modified version of the DS-GPA21 slightly improved the predictive ability of this model in our population (from c = 0.57 to c = 0.61). Despite that improvement, the nomogram proposed here outperforms the modified DS-GPA (Table 2).
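The 9.8% to 31.4% range quoted in this section is a relative difference in c-index, which can be reproduced from the values reported in the Results. The short sketch below shows the arithmetic; the function and variable names are illustrative.

```python
def relative_improvement(new: float, old: float) -> float:
    """Percentage by which `new` exceeds `old`."""
    return (new - old) / old * 100.0


NOMOGRAM_C = 0.67  # c-index from 10-fold cross-validation
COMPARATOR_C = {
    "RPA": 0.51,
    "GPA": 0.58,
    "original DS-GPA": 0.57,
    "modified DS-GPA": 0.61,
}

# Relative improvement of the nomogram over each comparator, to 1 decimal place.
gains = {name: round(relative_improvement(NOMOGRAM_C, c), 1)
         for name, c in COMPARATOR_C.items()}
```

The smallest gain is over the modified DS-GPA (9.8%) and the largest is over the RPA (31.4%). These are relative differences in c-index and should not be read as absolute differences in predictive accuracy.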
Predictive Variables in the Survival Model
An additional advantage of the nomogram representation is that it facilitates straightforward interpretation of the underlying Cox proportional hazard modeling. In our model, the four covariates with the largest potential contribution to the total score are 1) largest dimension of the largest BM (larger is less favorable), 2) number of CNS metastases at diagnosis (more is less favorable), 3) KPS (lower is less favorable), and 4) HER2 amplification status (unamplified is less favorable). Recent evidence supports a potential effect on survival for each of these factors that is consistent with their observed influence on the total score calculated using the nomogram.9, 10, 12, 19-21 Current survival models only account for a limited number of these factors: the RPA and original DS-GPA classifications incorporate only 1 factor (KPS), the GPA includes 2 factors (KPS and number of BM), and the modified DS-GPA considers 3 factors (KPS, number of BM, and HER2 status). The observed c-indices for these classifiers increase as the number of these factors increases (RPA [n = 1, c = 0.51] < DS-GPA [n = 1, c = 0.57] < GPA [n = 2, c = 0.58] < modified DS-GPA [n = 3, c = 0.61]). This may suggest that predictive accuracy of previous models improves as these models become progressively more similar (in terms of their measured covariates) to our proposed model, which includes all 4 of these factors.
Our model also includes 5 additional covariates (age, presence of non-CNS metastases, ER status, PR status, and breast cancer stage at initial diagnosis), each of which may have a role in survival for patients with BM.2, 16, 18, 21 These covariates are clinical metrics available to clinicians managing breast cancer patients in a modern treatment environment, which enhances the clinical utility of the index. Their cumulative analysis appears to result in an improved predictive ability of our model relative to those currently used for survival prediction in this patient population.
In addition, several of the factors that we have identified as predictors of survival have also been identified as predictors of the likelihood of developing BM in breast cancer patients. These include age, ER negativity, HER2 negativity, and number of metastatic sites, each of which is a statistically significant covariate both in our survival model and in the likelihood of metastases model developed by Graesslin et al.35 Although the overall significance of this overlap in covariates remains unknown, the independent identification of these factors in 2 related models may serve as validation for their predictive significance.
Our model was constructed from a population of patients who were all managed at a single institution since 1999. This allowed us to study specifically patients treated in the trastuzumab era and managed by an oncology team with a consistent and aggressive strategy for management of the patients' primary and metastatic disease. However, this also has the potential to negatively impact external validity, particularly in environments with less aggressive management protocols or where therapy targeted at the HER2 receptor is not consistently used. The effect of sample size (n = 261) on external validity must also be considered. Although this represents a considerable cohort for a single institution, a much larger, multi-institutional cohort may be useful to refine the model, explore the effects of additional covariates, and improve generalizability, and we recognize this as the next step in its development. We note, however, that the only other breast cancer-specific survival models, the DS-GPA and modified DS-GPA, were developed from a cohort of similar magnitude (n = 642), which suggests that the effects of sample size on the external validity of our model should not be dramatically different from similar effects on current prognostic models.
At least one missing value was imputed in 74.3% of the patients in our cohort. This reflects the “real world” of clinical medicine, where ideal data is not always available for complete analysis. This also reflects the rigor with which we required every variable to be identified only based on analysis done at our institution (see “Variable Selection” in the Materials and Methods section), rather than rely on outside records or word of mouth, in an effort to ensure the utility of our data. Imputation is common in survival regression analysis, and the mathematical validity of this approach has been demonstrated and discussed elsewhere.23 Similarly, not all patients had adequate data for calculation of RPA, GPA, or DS-GPA scores, and missing values were again imputed. This may have some minor effect on the calculated c-indices of the predictive models compared here, although the difference in c-index between imputed and nonimputed risk groupings is small.
Also of note are the suggestions by the nomogram of 2 covariate effects that are discordant with current models of breast cancer BM prognosis. First, the nomogram suggests that PR negativity is relatively more favorable than PR positivity. There are 2 potential explanations for this finding. The first is that, contrary to current models, PR negativity is not an independent, negative prognostic factor in these patients. The second possibility is that this is a statistical effect that would reverse if the sample size were altered. With the available data there is no way to distinguish between the 2 etiologies of this observation, and this is one reason why we encourage other groups to collect and analyze their data to validate and to further refine the nomogram.
The second observation is that stage IV breast cancer at the time of original diagnosis has a more favorable impact on survival than lower stages at the time of diagnosis. Because the nomogram is designed to predict neurologic survival (survival from the time a BM is identified), this finding suggests that patients in whom CNS metastatic disease is present at the time of initial diagnosis survive longer than those in whom BM develop in a delayed fashion. This may suggest that aggressive management strategies for patients initially diagnosed with non–stage IV cancer help to delay the appearance of BM until late in their overall course, while those not afforded the benefit of early, systemic therapy may develop BM earlier in the course of their disease. Alternately, this observation may be influenced by the presence or by the extent of metastatic disease outside of the CNS at the time of diagnosis, making the “stage IV” group a heterogeneous mixture of patients with a wide range of metastatic burden. Finally, PR status may be less influential at this point. Our database and analysis were not constructed specifically to evaluate these possibilities and additional, focused investigation is necessary regarding this potentially important finding.
Finally, we note that the list of potential covariates that we have included in our model is not exhaustive. Additional factors, both clinical and molecular, may prove valuable for improving the predictive power of subsequent revisions of this nomogram. These may include, for instance, size of the primary tumor, node status, Ki-67 index, and levels of plasminogen activator inhibitor-1, thymidine kinase, and cathepsin D, all of which have proven valuable in models of primary breast cancer survival.36, 37 This dimension of the analysis, too, will be improved by additional study of more comprehensive clinical and molecular databases.
We used a Cox proportional hazards regression in conjunction with a nomogram representation to construct a predictive model of survival of breast cancer patients with BM that substantially outperforms current predictive models. This model is based on a combination of 9 clinical and molecular features that should be readily available to clinicians treating patients with breast cancer, and our validation simulations suggest that this model should be highly reproducible in similar patient populations. In addition, the nomogram can predict individualized 1-, 3-, and 5-year survival for novel patients, and it provides a straightforward representation of the relative effects of each of the 9 covariates on neurologic survival.
This research was supported in part by grant W81XWH-062-0033 from the US Department of Defense Breast Cancer Research Program to R.J. Weil. We thank the Melvin Burkhardt chair in neurosurgical oncology and the Karen Colina Wilson research endowment within the Brain Tumor and Neuro-oncology Center at the Cleveland Clinic Foundation for additional support and funding.