The aim of the current study was to establish the predictive accuracy of the Kattan postoperative nomogram for nonmetastatic renal cell carcinoma (RCC) by comparing predictions with actual disease recurrence in patients who underwent surgery in a single center in France.
Between 1985 and 2000, 844 patients were treated for RCC. The following data were collated: age, symptoms, histology, tumor size, grade, TNM 1997 stage, recurrence, and progression. For each patient a prognostic score (predicted probability) for recurrence-free survival (RFS) at 5 years was calculated using the Kattan nomogram. The discriminating ability of the model was assessed by Harrell's concordance index (c-index). Bootstrapping was used to estimate confidence intervals. Survival was estimated by the Kaplan-Meier method and Cox proportional hazards regression analysis.
In all, 565 patients (median age, 62 years) were included. At the time of the last follow-up, 81 patients had died and 101 had experienced RCC recurrence. The c-index for RFS (Kattan nomogram) was only 0.607 (95% confidence interval [CI]: 0.576–0.635). The 5-year RFS rate and cancer-specific survival rate were 81.5% and 84.7%, respectively. Of the 4 variables included in the nomogram, only TNM stage was associated with recurrence in a multivariate analysis (Cox analysis) (P = .022).
Renal cell carcinoma (RCC) is the most common kidney malignancy in adults. Incidence has increased by over 30% in the past 2 decades.1, 2 About 30,000 new cases are expected each year in the US and 20,000 in the European Union. Survival depends on disease stage at presentation. The 5-year survival rate is 50% to 90% for localized disease and 0% to 13% for metastatic disease.1, 2 Treatment for nonmetastatic RCC is usually partial or radical nephrectomy, or in situ tumor resection.3 However, about 30% of patients will develop metastases.1, 2 Treatment outcomes for metastatic disease are poor, and reliable prognostic indicators that distinguish between patients with different prognoses are needed to predict outcomes and help choose treatments and modalities of follow-up.4
Several prognostic models have been developed to predict disease recurrence and survival after nephrectomy for nonmetastatic RCC, using different covariates, tools (nomograms or prognostic categories), and endpoints.5–8 Nomograms are graphic charts that provide outcome probabilities for individual patients and are mainly used to inform patients of the risks and benefits of a treatment or diagnostic procedure.9 Their use is increasingly common in oncology, especially urological oncology, for example, for counseling patients with kidney, prostate, or bladder cancer.6, 9–11
Currently, a postoperative nomogram proposed by Kattan and colleagues, based on the analysis of a Memorial Sloan-Kettering database, is the most widely used model to predict treatment failure and tumor recurrence after surgery for kidney cancer.6, 9, 12, 13 It was applied recently in a 6-center European study and found to be more accurate than 3 other models (University of California-Los Angeles Integrated Staging System [UISS], Mayo Clinic Stage, Size, Grade, and Necrosis [SSIGN] score, and the Yaycioglu model).12 However, the concordance index (c-index) for tumor recurrence was not assessed in the 3 largest of the 6 centers.12 The generalization of the Kattan nomogram to external cohorts of patients with characteristics different from the original dataset has therefore yet to be validated. The aim of this study was to establish the accuracy of the Kattan nomogram in predicting RCC recurrence in a representative patient population who underwent surgery in a large single center.
MATERIALS AND METHODS
Between 1985 and 2000, 3 surgeons from the Department of Urology at our institution removed 844 kidney tumors from 816 patients. The medical records were reviewed retrospectively by a single urologist (M.R.) and the following data were collated from patient charts: sex, age at diagnosis, clinical presentation (incidental, local, or systemic), Eastern Cooperative Oncology Group (ECOG) performance status, surgical technique, disease recurrence, and progression. From the late 1980s, pathological data were registered prospectively and extracted directly from the records of the pathological department: tumor size, type and site, pathological stage (TNM 1997),14 and Fuhrman grade. Patients with any of the following characteristics were excluded from the study: performance status >3 and/or metastatic disease at diagnosis (n = 30), a large tumor (pT4) (n = 11), bilateral synchronous disease (i.e., Von Hippel Lindau) (n = 61), preoperative lymph node invasion (n = 19), benign disease on final pathological exam (e.g., cyst or oncocytoma) (n = 64), collecting duct carcinoma (n = 10), tumor with unclassified histology (n = 11), chronic renal insufficiency (n = 12), or solitary kidney (n = 22). Patients who did not undergo surgery or were lost to follow-up were also excluded (n = 11).
The remaining 565 patients who were eligible for study had previously undergone surgery, which consisted of open radical nephrectomy, partial nephrectomy, or in situ tumor resection after a subcostal or flank incision. Tumor resection was the chosen option when lesions did not exceed 40 mm in diameter. All patients underwent urine culture and cytology, abdominal ultrasonography, intravenous urography, and/or a renal computed tomography (CT) scan. After undergoing surgery, patients were followed up annually by abdominal ultrasonography and abdominal computed tomography to detect local RCC recurrence.
The postoperative nomogram developed by Kattan et al.6 was used to calculate the probability that a patient would be free from recurrence at 5 years of follow-up. The 4 variables included in the nomogram were clinical symptoms, histology, tumor size, and 1997 TNM stage. The predictive accuracy of the nomogram was measured by the area under the receiver operating characteristic (ROC) curve as given by Harrell et al.'s15, 16 concordance index (c-index) for censored data (c-index of 0.5, no discrimination; c-index of 1.0, perfect discrimination). The 95% confidence interval (95% CI) of the c-index was calculated by bootstrapping, i.e., by testing the entire dataset using 1000 replicate models obtained from samples drawn with replacement from the original dataset.13, 15, 16
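To make the two quantities described above concrete, the following is a minimal illustrative sketch of a c-index for right-censored data with a bootstrap confidence interval. It is not the procedure actually run for this study: the function names, the toy data, the handling of ties (pairs with equal follow-up times are skipped, tied predictions count as one half), and the percentile method for the interval are all assumptions made for illustration.

```python
import random


def harrell_c_index(times, events, predicted_rfs):
    """Concordance index for right-censored data (illustrative sketch).

    times         -- follow-up time for each patient
    events        -- 1 if recurrence was observed, 0 if censored
    predicted_rfs -- predicted probability of remaining recurrence-free
                     (higher value = better predicted prognosis)
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a pair is only usable if the earlier time is an observed event
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if predicted_rfs[i] < predicted_rfs[j]:
                    concordant += 1.0  # earlier recurrence had the lower predicted RFS
                elif predicted_rfs[i] == predicted_rfs[j]:
                    concordant += 0.5  # tied predictions count as one half
    if usable == 0:
        raise ValueError("no usable pairs in this sample")
    return concordant / usable


def bootstrap_ci(times, events, predicted_rfs, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the c-index (assumed method)."""
    rng = random.Random(seed)
    n = len(times)
    stats = []
    for _ in range(n_boot):
        # resample patients with replacement
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            stats.append(harrell_c_index(
                [times[k] for k in idx],
                [events[k] for k in idx],
                [predicted_rfs[k] for k in idx],
            ))
        except ValueError:
            continue  # resample contained no usable pair; skip it
    if not stats:
        raise ValueError("no bootstrap sample contained a usable pair")
    stats.sort()
    m = len(stats)
    lower = stats[int(alpha / 2 * m)]
    upper = stats[min(m - 1, int((1 - alpha / 2) * m))]
    return lower, upper


# Hypothetical toy data (not study data): 5 patients, 3 observed recurrences.
toy_times = [2, 4, 6, 8, 10]
toy_events = [1, 1, 0, 1, 0]
toy_predicted = [0.3, 0.5, 0.9, 0.85, 0.8]

c_index = harrell_c_index(toy_times, toy_events, toy_predicted)
ci_low, ci_high = bootstrap_ci(toy_times, toy_events, toy_predicted)
```

In the toy data, 7 of the 8 usable pairs are ordered correctly by the predictions, so the sketch returns a c-index of 0.875. The pairwise loop is quadratic in the number of patients, which is acceptable for a cohort of a few hundred.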
Survival was evaluated on censored data by the Kaplan-Meier method. Prognostic factors were established by univariate analysis using the log rank test. A P-value <.05 was considered significant. Relations between all predictor variables included in the Kattan nomogram (symptoms, pathological stage, tumor size, and histological type) and survival were evaluated by Cox proportional hazards regression analysis.6, 13 The primary endpoint was recurrence-free survival defined as either the time from surgery to detection of the first local recurrence or of distant metastases, or the time from surgery to the close of the study. Patients who did not experience disease progression were censored at the date of the last follow-up or at time of death without disease. Disease-specific survival was evaluated from the date of surgery to the last follow-up visit or death from cancer. A scatterplot was used to compare for each patient the probability of remaining free of RCC at 5 years as estimated by the Kattan nomogram and as given by Cox proportional hazards analysis. All tests were carried out with SPSS v.14.0 software (Chicago, IL).
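As an illustration of how censored follow-up enters the Kaplan-Meier estimate, a minimal sketch of the product-limit calculation is given below. The analyses in this study were run in SPSS; the function names and the toy data here are hypothetical, and the sketch assumes the usual convention that patients censored at a given time are still counted as at risk for events at that time.

```python
def kaplan_meier(times, events):
    """Product-limit estimate (illustrative sketch).

    times  -- follow-up time for each patient
    events -- 1 if the event (e.g., recurrence) was observed, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # group all patients sharing the same follow-up time
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_removed += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        # events and censored patients both leave the risk set after time t
        n_at_risk -= n_removed
    return curve


def survival_at(curve, t):
    """Read the step function: estimated survival probability at time t."""
    s = 1.0
    for time, surv in curve:
        if time <= t:
            s = surv
        else:
            break
    return s


# Hypothetical toy data (not study data): 6 patients, 3 observed events.
toy_times = [1, 2, 2, 3, 4, 5]
toy_events = [1, 1, 0, 0, 1, 0]
km_curve = kaplan_meier(toy_times, toy_events)
rfs_at_3 = survival_at(km_curve, 3)
```

For the toy data the estimate steps down to 5/6 at time 1, to 2/3 at time 2, and to 1/3 at time 4; the censored patients reduce the number at risk without producing a step.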
Complete baseline and follow-up data were available for 565 patients. Patient characteristics are given in Table 1. The median age was 62 years. The male to female ratio was 2:5. By the end of follow-up, 81 patients (14.3%) had died from all causes and 101 patients (17.9%) had experienced RCC recurrence. ECOG performance status was normal (=0) in 407 patients (72%). Fuhrman grade was established as follows: Grade 1 in 218 patients (38.6%), Grade 2 in 258 (45.7%), and Grade 3/4 in 89 (15.7%).
Table 1. Patient Characteristics From the Current Study (N = 565) and for the Patients Included in the Reference Study6 (N = 601)
The c-index describing the predictive accuracy of the nomogram was 0.607, with a 95% CI of 0.576 to 0.635 obtained by bootstrapping.
The 5-year recurrence-free and disease-specific survival rates were 81.7% and 84.7%, respectively. Figure 1 illustrates the actual probability of remaining free from RCC after surgery. Of the 4 variables included in the nomogram, only TNM stage was statistically significant (P = .022) in a multivariate analysis for recurrence-free survival (Table 2). The scatterplot in Figure 2 illustrates the difference between recurrence as predicted by the Kattan nomogram and as given by Cox proportional hazards analysis.
Table 2. Prognostic Factors for Recurrence in Patients With Renal Cell Carcinoma
Only variables entered into the Kattan nomogram were tested in the multivariate analysis.
The nomogram and Cox analysis were in agreement only above a prediction threshold of 80%. Below this threshold (i.e., for patients at higher real risk of recurrence), the nomogram predicted a worse prognosis than the Cox analysis gave.
Nomograms are considered to be the most accurate paper-based method for explaining predicted probabilities to patients.9, 12, 13 However, in our study the Kattan nomogram showed poor performance in predicting overall RCC recurrence (c-index <0.7) even though our patient population was nearly as large as that used by Kattan et al. to derive their model (n = 565 vs. 601) and even though we included only patients from a single center. Given that the characteristics of the 2 populations were very similar and that our center has recruited a large series of representative patients with kidney cancer, the low predictive accuracy of the nomogram is of interest. It has been reported previously that the performance of a predictive model is overestimated when simply determined by the sample of subjects that was used to construct the model.17 Several approaches have been proposed to estimate the performance of the model in independent subjects, which is more accurate than results based on a naive evaluation of the training sample.15, 17 One simple approach is to randomly split the training data into 2 parts: 1 to develop the model of prediction and another to measure its performance.17, 18 A more sophisticated approach is to use cross-validation. Nevertheless, the most efficient validation has been claimed to be achieved by computer-intensive resampling techniques15, 17 such as the bootstrap used by Kattan and colleagues.6 However, the fact that data were collected from a large database over a long period of time (>15 years) is a well-known cause of potential bias because of inappropriate restaging or regrading. This point has to be considered carefully when interpreting our results and could also be a reason for the poor applicability of the Kattan model in our population.
Moreover, in our study some variables included in the nomogram were not significant prognostic factors for recurrence, as in the original article.6 In Kattan et al.'s population, tumor size and histology were statistically significant variables; in our patients, only tumor stage was significant for disease-free survival. The inclusion of nonsignificant variables in the model artificially enhances prediction performance with regard to the initial dataset, but hinders subsequent application of the model to other datasets, i.e., overtraining of the model.17, 19 This was demonstrated by Kattan and colleagues,20 who developed a more accurate nomogram that was specific to a histological tumor type (conventional clear cell RCC) and by taking other nonsignificant variables (tumor necrosis and vascular invasion) into account.
However, the methodology used by Kattan and colleagues is not in question, and whether the Cox proportional hazards analysis is the most appropriate statistical method for determining prediction models and nomograms is a moot point.13 Kattan et al.6, 13 derived their nomogram from a rigorous methodology and from an extensive comparison of prediction models (tree-based methods, neural networks, recursive partitioning techniques, etc.) in which Cox proportional hazards analysis proved to be as good as, if not better than, the newer machine-learning techniques.
Nomograms are currently exerting a strong influence on clinical practice.6, 10, 11 However, in our study the predicted prognosis for recurrence-free survival by the nomogram was worse than the outcomes actually observed in many of our patients, highlighting the caution that needs to be exercised when routinely applying the nomogram. Although the Kattan model was reported to be currently the best among all other existing models, the same nomogram is not necessarily the best for all patient populations.12 The discrepancy also has important implications that go beyond patient counseling, as it may influence a clinician's decision as to whether or not to include a patient in a clinical trial of adjuvant therapy.12, 21 Currently, there is no satisfactory treatment for patients with advanced-stage kidney carcinoma. Immunotherapy has limited efficacy and new antiangiogenic agents are still undergoing clinical trials in Europe.22, 23 Because the appropriateness of applying the nomogram may depend on factors related to catchment areas, patient recruitment, and management, each center participating in a trial should externally validate the application of the nomogram to their patient population before selecting patients for inclusion in a trial.17 Moreover, any new nomograms in the near future should be constantly upgraded and derived from a multitude of datasets as new prognostic factors and improved modeling techniques become available.13, 18
The Kattan postoperative nomogram is a decision aid to be used with caution when applied to different patient populations. However, until new dynamic models become available, current nomograms and data for conventional prognostic factors may still be of significant benefit in certain clinical decision-making settings. They may also be considered valid decision-making criteria when choosing cancer treatments.