D.Y.C.H., W.X., B.I.R., and T.K.C. were involved in the conception, study design, statistical analysis, and writing of the article. D.Y.C.H., G.A.B., U.V., M.-H.T., J.K., F.D., L.W., C.K., B.I.R., and T.K.C. provided and collected patient data. All authors approved the final article.
This is a project of the International Metastatic Renal Cell Carcinoma (mRCC) Database Consortium.
See editorial on pages 2586–7, this issue.
The majority of metastatic renal cell carcinoma (mRCC) clinical trials that examined targeted agents used progression-free survival (PFS) as the primary endpoint. Whether PFS can be used as a predictor of overall survival (OS) is unknown.
Patients from 12 cancer centers who received targeted therapy for mRCC were identified. Landmark analyses for progression at 3 months and 6 months after drug initiation were performed to minimize lead-time bias. A proportional hazards model was used to assess the utility of PFS for predicting OS.
In total, 1158 patients were included. The median follow-up was 30.6 months, the median age was 60 years, and the median Karnofsky performance status was 80%. For the entire cohort, the median PFS was 7.6 months, and the median OS was 19.7 months. In the landmark analysis, the median OS for patients who progressed at 3 months was 7.8 months compared with 23.6 months for patients who did not progress (log-rank test; P < .0001). Similarly, for patients who progressed at 6 months, the median OS was 8.6 months compared with 26 months for patients who did not progress (P < .0001). Compared with those who did not progress, for the patients who progressed at 3 months and at 6 months, the hazard ratios for death adjusted for adverse prognostic factors were 3.05 (95% confidence interval, 2.42-3.84) and 2.96 (95% confidence interval, 2.39-3.67), respectively. Similar results were demonstrated with landmark analyses at 9 months and at 12 months and in the bootstrap validation. Kendall tau rank correlation and a Fleischer model demonstrated a statistically significant dependent correlation.
The treatment of metastatic renal cell carcinoma (mRCC) has been revolutionized by the introduction of agents that target the vascular endothelial growth factor (VEGF) and mammalian target of rapamycin (mTOR) pathways. Progression-free survival (PFS) has been used as the primary endpoint for the majority of clinical trials that investigate these agents. Trials involving sunitinib, sorafenib, bevacizumab, everolimus, and pazopanib demonstrated a PFS benefit, which subsequently led to their regulatory approval.1-6 With the exception of 1 trial that involved temsirolimus in patients with a poor prognosis,7 the vast majority of these trials did not prove an overall survival (OS) benefit and cited issues of crossover and contamination of the control arm. This may have occurred because patients in the control arm were allowed to receive the active agent or were treated with similar second-line and third-line agents, which may have diluted any apparent OS benefit. Because of the emergence of second-line and third-line targeted therapies, an OS benefit may be difficult to prove in a trial setting.
In the current study, we sought to determine whether there is an association between PFS and OS, which is the traditional gold standard of all endpoints, and whether or not there is any dependence between these 2 endpoints. This has implications for future drug evaluation and trial design.
MATERIALS AND METHODS
In total, 1158 patients with mRCC who received treatment with contemporary targeted therapy (sunitinib, sorafenib, bevacizumab, or temsirolimus) were included in this study. They were identified from consecutive, population-based patient samples during 2005 through 2009 at 12 international cancer centers in Canada (Toronto Sunnybrook Odette Cancer Center [n = 158], Princess Margaret Hospital [n = 95], British Columbia Cancer Agency [n = 100], Queen Elizabeth II Health Sciences Center [n = 76], and Alberta Health Services Cancer Care [n = 134]), the United States (Dana Farber Cancer Institute and Brigham and Women's Hospital [n = 177], Cleveland Clinic [n = 113], Karmanos Cancer Center [n = 89], and Beth Israel Deaconess Medical Center [n = 74]), Singapore (National Cancer Center [n = 80]), and Denmark (Aarhus University Hospital [n = 62]).
Patients may have been treated on clinical trial or off protocol and may have been treated at major academic centers or community oncology centers. Baseline patient characteristics and outcome data were collected using uniform data-collection templates in this large retrospective analysis.8 Regulatory approval from local institutional review boards or research ethics boards was obtained for each center.
The primary endpoint was OS, which was defined as the time from drug initiation to the date of death from any cause or was censored at the last follow-up. PFS was defined as the time from drug initiation to the time of disease progression or death or was censored at the last follow-up. We defined disease progression according to standard Response Evaluation Criteria in Solid Tumors (RECIST).9
Landmark analyses of PFS at 3 months and 6 months after drug initiation were performed to minimize lead-time bias. These time points were defined a priori because they were likely to coincide with the routine radiographic assessment of patients who receive VEGF-targeted therapy. Thus, it would be clinically useful to determine whether disease progression at these early time points predicts a difference in OS. Studies in other tumors, such as prostate cancer, have justified the use of 3-month and 6-month landmark analyses.10 Patients who died before 3 months or 6 months were excluded from the 3-month or 6-month landmark analysis, respectively, to provide the most conservative estimate of effect.10
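The construction of a landmark cohort can be illustrated with a minimal Python sketch. The `Patient` record and the `landmark_groups` helper are hypothetical names introduced here for illustration and are not taken from the study's analysis code. The sketch assumes that patients whose follow-up ended before the landmark, whether by death or by censoring, are excluded (the censoring exclusion follows the reasons listed in the Table 2 footnote), and that the remaining patients are grouped by whether progression was documented on or before the landmark time.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Patient:
    os_months: float                      # drug initiation to death or last follow-up
    died: bool                            # True if os_months marks a death, False if censored
    progression_months: Optional[float]   # time of documented progression, None if none observed


def landmark_groups(patients: List[Patient], landmark: float) -> Tuple[List[Patient], List[Patient]]:
    """Split patients into (progressed, not_progressed) at the landmark time."""
    progressed, not_progressed = [], []
    for p in patients:
        # Exclude deaths before the landmark and follow-up that never reached it.
        if p.os_months < landmark:
            continue
        if p.progression_months is not None and p.progression_months <= landmark:
            progressed.append(p)
        else:
            not_progressed.append(p)
    return progressed, not_progressed
```

Calling `landmark_groups(patients, 3.0)` and `landmark_groups(patients, 6.0)` would give the two pairs of groups whose OS is then compared.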
The method of Kaplan and Meier was used to estimate the OS of patients stratified by disease progression at 3 months or 6 months. Exploratory analyses also were performed at 9 months and at 12 months, but those analyses were subject to lead-time bias. A proportional hazards model was used to assess the significance of progression at 3 months or 6 months in predicting OS when adjusted for poor prognostic factors, including a Karnofsky performance status (KPS) <80%, time from diagnosis to treatment <1 year, anemia, hypercalcemia, thrombocytosis, and neutrophilia.8 Similar analyses were performed that adjusted for Memorial Sloan-Kettering Cancer Center (MSKCC) poor prognostic criteria.11 The proportionality assumption was assessed graphically using plots of log(-log[survival]) versus the log of survival time and was met. Missing values in explanatory variables were handled by case deletion.
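For readers unfamiliar with the Kaplan-Meier estimator, the following is a minimal pure-Python sketch of the product-limit calculation and of reading a median survival time from the resulting curve. The function names are hypothetical, and the sketch is not the SAS procedure used in the study; it omits confidence intervals and the log-rank comparison.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time for right-censored data.

    times  : follow-up times
    events : True if the time is an observed death, False if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all observations that share the same time.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the conditional survival at t.
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve


def median_survival(curve):
    """First time at which estimated survival is 0.5 or lower, or None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None
```

Applied separately to the progressed and nonprogressed groups at a landmark, `median_survival(kaplan_meier(times, events))` gives the kind of group-wise median OS that is compared in a landmark analysis.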
The correlation between PFS and OS was estimated using the statistical model for dependence between PFS and OS developed by Fleischer et al.12 That model follows a parametric approach, which investigates how much of the variability in OS can be explained by the variability in PFS. In the model, OS is partitioned as a sum of the time to progression (TTP ∼ Exp[λ1]) and postprogression survival (PPS ∼ Exp[λ3]) if patients progress before dying. In patients who have not experienced progression, PFS is equal to OS (Exp[λ2]). The correlation between PFS and OS is obtained by:
\[
\operatorname{Corr}(\mathrm{PFS}, \mathrm{OS}) = \frac{\lambda_3}{\sqrt{\lambda_1^2 + 2\lambda_1\lambda_2 + \lambda_3^2}}
\]
where model parameters λ can be estimated using a plug-in method for the estimated medians. Because this model requires an exponential distribution for time variables, we graphically assessed this assumption using the plot of negative log of survival distribution (LS) against time. This curve should be approximately linear through the origin if the exponential model is appropriate.12
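As a minimal sketch of the plug-in calculation, the Python code below converts medians to exponential rates and evaluates the closed form that follows from the exponential assumptions described above, namely λ3 / sqrt(λ1² + 2λ1λ2 + λ3²). The function names, the input medians, and the assumed fraction of PFS events that are progressions (used here to split the PFS rate into λ1 and λ2) are all hypothetical and are chosen only for illustration; they are not the study data or the study's estimation code.

```python
import math


def rate_from_median(median_months):
    """Plug-in exponential rate: for Exp(rate), median = ln(2) / rate."""
    return math.log(2.0) / median_months


def fleischer_correlation(lam1, lam2, lam3):
    """Corr(PFS, OS) when TTP ~ Exp(lam1), death without progression ~ Exp(lam2),
    and postprogression survival ~ Exp(lam3)."""
    return lam3 / math.sqrt(lam1 ** 2 + 2.0 * lam1 * lam2 + lam3 ** 2)


# Hypothetical inputs for illustration only (not the study data).
pfs_rate = rate_from_median(8.0)   # PFS = min(TTP, death) ~ Exp(lam1 + lam2)
frac_progressed = 0.9              # assumed share of PFS events that are progressions
lam1 = frac_progressed * pfs_rate
lam2 = (1.0 - frac_progressed) * pfs_rate
lam3 = rate_from_median(9.0)       # assumed median postprogression survival
corr = fleischer_correlation(lam1, lam2, lam3)
```

Because λ3 appears in both the numerator and the denominator, a shorter postprogression survival (larger λ3) pushes the correlation toward 1, and a longer one pulls it toward 0.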
We also investigated the nonparametric Kendall tau rank correlation for bivariate censored data,13 which has been used in previous studies of surrogate endpoints in prostate cancer.10 This is a more conservative measure and does not require the data to conform to a particular distribution. However, the Kendall tau treats PFS and OS as independent events and ignores the reality that OS inherently contains PFS information. The Kendall tau correlation was calculated using the cenken function from the R NADA library (http://cran.r-project.org/web/packages/NADA/index.html), modified for time-to-event data in which both variables are right censored.
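The idea of a rank correlation under right censoring can be sketched in Python as follows. This is not the modified cenken routine used in the study. It is one simple convention, assumed here for illustration, in which a pair of patients contributes to the statistic only when its ordering is known with certainty on both variables, and pairs with an unknown or tied ordering contribute zero. The function names are hypothetical.

```python
from itertools import combinations


def censored_sign(t_i, e_i, t_j, e_j):
    """+1 if patient i certainly outlasts j, -1 if j certainly outlasts i, else 0.

    The ordering is certain only when the shorter time is an observed event.
    Ties are treated as unknown in this sketch.
    """
    if t_j < t_i and e_j:
        return 1
    if t_i < t_j and e_i:
        return -1
    return 0


def kendall_tau_censored(x, x_event, y, y_event):
    """Tau-a style statistic: sum of pairwise sign products over all pairs."""
    n = len(x)
    total = 0
    for i, j in combinations(range(n), 2):
        total += (censored_sign(x[i], x_event[i], x[j], x_event[j])
                  * censored_sign(y[i], y_event[i], y[j], y_event[j]))
    return total / (n * (n - 1) / 2)
```

With no censoring, this reduces to the ordinary Kendall tau-a. With censoring, pairs of unknown order shrink the statistic toward zero, which is one reason such a measure can be regarded as conservative.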
The standard errors of the correlation statistics (Fleischer model and Kendall tau) were estimated using the bootstrap method with 300 replications. The bootstrap confidence intervals (CIs) were computed based on the percentiles of the bootstrap distribution of the statistic. For the landmark analysis, 300 bootstrap samples were generated similarly at each landmark time point. We refit the Cox regression models and calculated the adjusted hazard ratios as in the original analysis. The bootstrap means and CIs were computed and compared with the model using the original study population at each landmark time point.
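The percentile bootstrap described above can be sketched in Python as follows. The `bootstrap_summary` helper is a hypothetical name, the default of 300 replications mirrors the number stated in the text, and the simple index-based percentile rule is an assumption of this sketch rather than the rule used in the study's R code. The `statistic` argument stands in for whichever quantity is being resampled, such as a correlation or an adjusted hazard ratio.

```python
import random
import statistics


def bootstrap_summary(data, statistic, n_boot=300, alpha=0.05, seed=0):
    """Bootstrap mean, standard error, and percentile CI of `statistic`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    replicates = []
    for _ in range(n_boot):
        # Resample patients with replacement, keeping the original sample size.
        sample = [data[rng.randrange(n)] for _ in range(n)]
        replicates.append(statistic(sample))
    replicates.sort()
    # Simple index-based percentiles of the bootstrap distribution.
    lower = replicates[int((alpha / 2) * (n_boot - 1))]
    upper = replicates[int((1 - alpha / 2) * (n_boot - 1))]
    return statistics.mean(replicates), statistics.stdev(replicates), (lower, upper)
```

Each element of `data` would be one patient record, so that PFS, OS, and covariates are resampled together and the dependence between them is preserved within each bootstrap sample.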
Bootstrap analysis was done using the statistical software package R version 2.8.0 (http://cran.r-project.org). All other statistical analyses were undertaken using SAS version 9.2 (SAS Institute Inc., Cary, NC). All statistical tests were 2-sided, and P values <.05 were considered significant.
RESULTS
In total, 1158 patients were included in the current analysis. At the time of the analyses, the median follow-up of the entire cohort was 30.6 months, and 81% of patients had progressed. The median KPS was 80% and the median age was 60 years. Thirty-one percent of patients had received previous immunotherapy, 80% had previously undergone nephrectomy, 77% had >1 metastatic site, and 8% had brain metastases. The risk groups according to the criteria published by Heng et al8 were favorable, intermediate, and poor for 21.1%, 50.2%, and 28.7% of patients, respectively. MSKCC risk groups11 were favorable, intermediate, and poor in 21.2%, 56.2% and 22.6% of patients, respectively (Table 1).
Table 1. Patient and Disease Characteristics at the Initiation of Targeted Therapy (N=1158)a
Total Cohort (n=1158), %
Median age [range], y
Median KPS [range]
>1 Metastatic site
Targeted therapy as first-line therapy
Type of targeted therapy
MSKCC risk category
Heng criteria risk
KPS indicates Karnofsky performance status; MSKCC, Memorial Sloan-Kettering Cancer Center.
a The risk category was unknown in 195 patients (MSKCC) and in 125 patients (Heng criteria; Heng 20098).
In the landmark analysis, the median OS for patients who progressed at 3 months was 7.8 months compared with 23.6 months for the patients who did not progress (log-rank test; P < .0001) (Fig. 1). Similarly, for the patients who progressed at 6 months versus those who did not progress, the median OS was 8.6 versus 26.0 months, respectively (P < .0001). For the patients who progressed at 3 months and 6 months, compared with those who did not progress, the hazard ratios for death adjusted for adverse prognostic factors8 were 3.05 (95% CI, 2.42-3.84) and 2.96 (95% CI, 2.39-3.67), respectively. Hazard ratios when adjusted for the MSKCC prognostic groups (data not shown) and additional landmark analyses at 9 and 12 months (Table 2) all revealed consistent findings. Bootstrap analyses that were performed at 3 months, 6 months, 9 months, and 12 months produced very similar hazard ratios, confirming the high internal validity of these results. All of the aforementioned hazard ratios had a P value <.0001.
Table 2. Results of Landmark and Bootstrap Analyses for Overall Survival by Progression Status at 3 Months, 6 Months, 9 Months, and 12 Months After the Initiation of Targeted Therapy (N=1158)
OS indicates overall survival; HR, hazard ratio; CI, confidence interval.
All OS times were compared in a log-rank analysis and revealed P values <.0001. All adjusted HRs that were calculated using the Heng criteria revealed P values <.0001.
Reasons for exclusion: 1) death before landmark time (n=109 [3 mo], n=214 [6 mo], n=285 [9 mo], and n=362 [12 mo]), 2) follow-up for survival or disease not reaching landmark time (n=42 [3 mo], n=80 [6 mo], n=135 [9 mo], n=175 [12 mo]), and 3) missing OS/progression-free survival data (n=5).
For all patients (n = 1158), the median PFS was 7.6 months (95% CI, 6.8-8.2 months), and the median OS was 19.7 months (95% CI, 18.1-21.6 months). The median postprogression survival (PPS) for patients who progressed before death was 8.5 months (95% CI, 7.8-9.3 months). The negative log survival distribution curves followed a straight line through the origin, indicating that the exponential distribution assumption was satisfied (data not shown). By using the Fleischer model, the estimated correlation between PFS and OS was 0.66 (bootstrap standard error [SE], 0.025; 95% CI, 0.61-0.71). This means that approximately 44% (r² = 0.66²) of the variability in OS can be explained by PFS. The calculated Kendall tau correlation was 0.42 (bootstrap SE, 0.016; 95% CI, 0.39-0.45; P < .0001), which suggested a statistically significant dependence between PFS and OS.
DISCUSSION
In the landmark analyses at each time point, we demonstrated that patients with mRCC who progressed on contemporary targeted therapy had an approximately 3-fold higher risk of death compared with patients at the same time point who remained progression free. A significant positive relation between PFS and OS was observed, as demonstrated by the Kendall tau statistic (0.42) and the Fleischer model (0.66). The Kendall tau statistic is a nonparametric measure of association between 2 censored variables and likely underestimated the true correlation between PFS and OS, because it ignored the inherent reality that OS contains PFS. The Fleischer model was developed specifically to describe this relation and, thus, may provide a better estimate.12
To determine whether our results were consistent with expected values, we used a recent simulation study by Broglio and Berry,14 which suggested that the correlation between PFS and OS relies on the duration of postprogression survival (PPS). When the median PPS is short, the hazard ratios for progression and OS are highly correlated. According to simulation results, the estimated correlation was 0.88 when the median PPS was 9 months, which is higher than what we observed in our data (0.66). However, the Broglio and Berry14 simulation correlation was from data that perfectly followed an exponential distribution analyzed in a simulated meta-analysis format instead of using individual PFS and OS data for each patient as was done in our study.
The current analysis is important, because the use of OS as the gold-standard endpoint has become difficult in contemporary clinical trials. Many patients with mRCC in the control arms of randomized controlled trials eventually cross over to the active treatment or receive second-line or third-line drugs in the same drug category, which contaminates the control arm. This contamination can be as high as 62%, as documented in studies that examined bevacizumab plus interferon4 and sunitinib.15 There may be ethical concerns about not allowing patients on the control arm of a clinical trial to cross over at the first hint of clinical activity on the experimental arm. However, if patients do cross over, then the OS benefit may be diluted. Thus, PFS may be the only endpoint that is not affected by contamination and crossover.
Intermediate endpoints like PFS and disease-free survival have been validated in other cancers, such as adjuvant and metastatic treatments for colorectal cancer,16-18 and similar analyses have been published for prostate cancer.10 Although most regulatory agencies and public health systems prefer to use OS data to make important approval decisions, PFS may be a suitable endpoint if it can be correlated with OS. This is especially true if there is a large magnitude of difference in PFS between the 2 treatment arms. It is important to validate these surrogate endpoints individually in each disease.
The strength of our study is that, to our knowledge, it represents the largest collection of patients that includes only those who received targeted therapy. Although 35.2% of patients in our study in fact did receive active second-line and/or third-line therapies, we still were able to demonstrate a correlation between PFS and OS using our study design. Most important, there was consistently high internal validity, as demonstrated by bootstrap validation.
Limitations of this study include its retrospective nature, for which missing data and selection bias may be encountered. However, standardized data collection templates were used, and only 5 patients had missing PFS and/or OS data. In addition, these patients constituted a consecutive series of patients from each institution, thereby limiting selection bias. In fact, the inclusion of consecutive patients created a patient population that was diverse and that was not limited to clinical trials (approximately 15% were included in a randomized controlled trial, not including expanded access programs); thus, this may be a more generalizable study. Finally, there was no central radiology review to determine progression in this study. However, this actually may better reflect everyday physician practices in the routine management of metastatic RCC and, thus, would make the results more applicable.
We conclude that disease progression at 3 months and at 6 months is associated with and can independently predict OS and that there is a dependent relation between PFS and OS. PFS may be the only endpoint that is not hindered by the issues of contamination and crossover that were observed for OS in most trials of targeted therapy, in which subsequent second-line and third-line targeted therapies were commonplace. These results require prospective evaluation.