See referenced original article on pages 52–60, this issue.
Progression-free survival: Does a correlation with survival justify its role as a surrogate clinical endpoint?
Article first published online: 8 OCT 2013
© 2013 American Cancer Society
Volume 120, Issue 1, pages 7–10, 1 January 2014
How to Cite
Becker, A., Eichelberg, C. and Sun, M. (2014), Progression-free survival: Does a correlation with survival justify its role as a surrogate clinical endpoint? Cancer, 120: 7–10. doi: 10.1002/cncr.28378
- Issue published online: 17 DEC 2013
- Manuscript Accepted: 3 SEP 2013
- Manuscript Revised: 25 AUG 2013
- Manuscript Received: 28 JUL 2013
With a better understanding of the vascular endothelial growth factor and mammalian target of rapamycin pathways, the treatment of patients with metastatic renal cell carcinoma (mRCC) using targeted therapies has improved tremendously since the cytokine era. To date, 7 molecules (sunitinib,[2, 3] sorafenib, bevacizumab,[5-8] temsirolimus, everolimus, pazopanib,[11, 12] and axitinib[13, 14]) have been approved by the US Food and Drug Administration (FDA). It is interesting to note that, of those, only temsirolimus and, in subanalyses, sunitinib demonstrated improved overall survival (OS) compared with established therapies or placebo. All other trials demonstrated an improvement in progression-free survival (PFS) only, without a corresponding improvement in OS (Table 1).[2-14]
|Trial||Study||Primary Endpoint||OS, Months||PFS, Months|
|SU11248 Sunitinib vs IFN||Motzer 2007||PFS||NR||SS (11 vs 5; P < .001)|
|Motzer 2009||OS||SS (26.4 vs 21.8; P = .051)||NR|
|TARGET Sorafenib vs placebo||Escudier 2007||OS||NS (19.3 vs 15.4; P = .02)||SS (5.5 vs 2.8; P < .01)|
|AVOREN IFN + Bevacizumab vs placebo||Escudier 2007||OS||NR||NR|
|Escudier 2010||OS||NS (23.3 vs 21.3; P = .3)||SS (10.2 vs 5.4; P = .001)|
|CALGB 90206 IFN + Bevacizumab vs IFN alone||Rini 2008||OS||NR||SS (8.5 vs 5.2; P < .001)|
|Rini 2010||OS||NS (18.3 vs 17.4; P = .1)||NR|
|ARCC Temsirolimus vs IFN||Hudes 2007||OS||SS (10.9 vs 7.3; P = .008)||SS (5.5 vs 3.1; P < .001)|
|RECORD-1 Everolimus vs placebo||Motzer 2010||PFS||NS (14.8 vs 14.4; P = .2)||SS (4.9 vs 1.9; P < .01)|
|VEG105192 Pazopanib vs placebo||Sternberg 2010||PFS||NR||SS (9.2 vs 4.2; P < .001)|
|Sternberg 2013||OS||NS (22.9 vs 20.5; P = .2)|
|AXIS Axitinib vs Sorafenib||Rini 2011||PFS||NS (20.1 vs 19.2; P = .4)||SS (6.7 vs 4.7; P < .0001)|
|Motzer 2013||OS||NS (20.1 vs 19.2; P = .4)||NR|
Recently, an increasing number of clinical trials, not limited to metastatic kidney cancer, have focused on PFS as the primary endpoint instead of the previously established standard clinical endpoint of OS.[15-18] In fact, according to a review of anticancer drugs approved by the FDA, approximately 23% of all new approvals between 2005 and 2007 were based on randomized controlled trials (RCTs) that focused on PFS or time to disease progression instead of OS. In this context, many believed that the oncology community was demonstrating a tendency toward accepting a delay in disease progression as a laudable objective, regardless of whether survival was increased. Accordingly, many investigators sought to evaluate whether PFS may represent a viable surrogate endpoint for OS. Although this postulation has been extensively examined in other solid tumors,[15-18] to the best of our knowledge it remains poorly explored within the context of mRCC. To our knowledge, only 1 other study, conducted by Heng et al, has tested this hypothesis, and it relied on a retrospective series of nonrandomized patients.
To circumvent the paucity of data and the retrospective nature of the earlier database, Halabi et al, in the current issue of Cancer, used data originating from 2 phase 3 RCTs to test the correlation between PFS and OS, with the aim of assessing whether PFS may represent a valid surrogate endpoint in patients with mRCC who are treated with bevacizumab in combination with interferon-α. Specifically, data from the Cancer and Leukemia Group B (CALGB) 90206 trial served as the training set for their analyses (n = 732 patients), whereas data from the Avastin for Renal Cell Cancer (AVOREN) trial served as the testing set (n = 649 patients). The authors entered disease progression at 3 months and 6 months as a binary variable in proportional hazards models for the prediction of OS. Moreover, concordance between PFS and OS was tested using Kendall's tau, in which a value of 1.0 would mean a perfect correlation, whereas a value of 0.0 would indicate that PFS and OS are completely independent. Results from the training set demonstrated that the hazard ratio (HR) for mortality was 2.6 (95% confidence interval, 2.1-3.1) and 2.8 (95% confidence interval, 2.3-3.4), respectively, for patients who experienced disease progression at 3 months and 6 months compared with those who did not (both P < .001). HRs in the testing set revealed a similar effect. Finally, the estimated Kendall's tau was 0.53 and 0.50, respectively, for the training and testing sets. Based on their findings, Halabi et al arguably concluded that, at least among patients treated with bevacizumab plus interferon-α, PFS may be considered an appropriate surrogate endpoint for OS.
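To make the concordance measure concrete, the following is a minimal sketch of Kendall's tau computed on a handful of paired PFS and OS times. The patient values are hypothetical and uncensored, and the simple tau-a formula used here is an assumption for illustration; it is not the censoring-aware estimator that an analysis of trial data would require, and it is not taken from Halabi et al.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a for paired, uncensored observations.

    Counts concordant and discordant pairs and divides their
    difference by the total number of pairs.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical PFS and OS times (months) for 6 patients; not trial data.
pfs_months = [2.0, 3.5, 5.0, 8.0, 11.0, 14.0]
os_months = [9.0, 7.0, 15.0, 20.0, 18.0, 30.0]

tau = kendall_tau_a(pfs_months, os_months)
print(f"Kendall's tau = {tau:.2f}")  # 13 concordant, 2 discordant pairs: 0.73
```

A value near 0.5, as reported for both the training and testing sets, therefore indicates that concordant patient pairs clearly outnumber discordant ones, but far from uniformly.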
Certainly, there are many advantages and reasons to advocate the use of PFS as a surrogate endpoint for OS. First, the evaluation of OS requires a large study cohort and a lengthy follow-up, which may often be impractical given the cost and time burdens of clinical trials. In contrast, disease progression events occur more frequently and time to disease progression is easy to measure, thereby overcoming the limitations associated with OS. Second, the most cited issue related to the lack of an increase in OS is the influence of postprogression therapy, in which many patients with mRCC treated in the control arms of RCTs eventually cross over to the experimental arm or are contaminated by second-line or third-line drugs in the same drug category. Crossover after disease progression from the control arm of an RCT is ethically necessary when the experimental arm demonstrates clinical activity. However, allowance for crossover may compromise the ability to assess a difference in OS. For example, in the RCT comparing everolimus versus placebo in patients with mRCC who had previously developed disease progression while receiving at least 1 targeted therapy, the authors reported an observed HR for PFS of 0.3 in favor of everolimus. However, approximately 80% of patients allocated to placebo subsequently received everolimus, which likely resulted in a lack of detectable OS benefit between the 2 groups (median OS, 14.8 months vs 14.4 months; P = .162). Similarly, given that many alternative second-line, third-line, and even subsequent lines of treatment are available for mRCC, it becomes even more improbable that an improved PFS will result in an OS benefit, even when study designs do not allow crossover.[22, 24] Such contamination has been reported to be as high as 59%.[2, 5]
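The dilution of an OS difference by crossover can be sketched with simple arithmetic. The toy model below assumes that mean OS is the sum of mean PFS and mean postprogression survival, and that control patients who cross over gain a fixed number of months after progression. The function name `mean_os` and every number are hypothetical, chosen only for illustration, and are not estimates from any trial.

```python
def mean_os(mean_pfs, mean_postprogression, crossover_fraction=0.0,
            crossover_gain=0.0):
    """Mean OS (months) under a toy additive model.

    OS = PFS + postprogression survival, plus the average gain
    contributed by the fraction of patients who cross over.
    """
    return mean_pfs + mean_postprogression + crossover_fraction * crossover_gain


# Hypothetical inputs (months); not taken from any trial.
experimental_pfs = 5.0
control_pfs = 2.0
postprogression = 10.0
gain_from_late_treatment = 3.0

experimental_os = mean_os(experimental_pfs, postprogression)

# Without crossover, the PFS gain carries over fully to OS.
control_os_no_crossover = mean_os(control_pfs, postprogression)
diff_no_crossover = experimental_os - control_os_no_crossover

# With 80% crossover, most of the OS difference disappears.
control_os_crossover = mean_os(control_pfs, postprogression,
                               crossover_fraction=0.8,
                               crossover_gain=gain_from_late_treatment)
diff_crossover = experimental_os - control_os_crossover

print(f"OS difference without crossover: {diff_no_crossover:.1f} months")
print(f"OS difference with 80% crossover: {diff_crossover:.1f} months")
```

Under these assumptions the same 3-month PFS gain yields an OS difference of 3.0 months without crossover but only 0.6 months with 80% crossover, which is the pattern the crossover argument predicts.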
Despite the aforementioned arguments in favor of using PFS as a surrogate endpoint for OS, most investigators and health policy decision-makers remain apprehensive of such an option. Only recently, approval of the vascular endothelial growth factor receptor tyrosine kinase inhibitor tivozanib was denied by the FDA, although tivozanib demonstrated a significant advantage in PFS of nearly 3 months compared with sorafenib. The results of the TIVO-1 trial (Tivozanib Versus Sorafenib in first line Advanced RCC) were presented at the 2013 American Society of Clinical Oncology Genitourinary Cancers Symposium. Surprisingly, despite the advantage in PFS, the primary endpoint, patients treated with tivozanib were found to have a slightly shorter OS (28.8 months vs 29.3 months in the sorafenib arm; P = .1). This discrepancy emphasizes the importance of validating potential surrogate endpoints, such as PFS, before their use (eg, by means of a meta-analysis). Such validation needs to satisfy 2 simultaneous conditions: 1) the surrogate must be correlated with the clinical endpoint (in this case, OS); and 2) the surrogate must fully capture the net effect of the intervention on the endpoint of clinical efficacy. Both Halabi et al and Heng et al have reported on what appears to be an unequivocal correlation between PFS and OS, therefore satisfying the first condition. However, to the best of our knowledge, the second condition has yet to be tested in patients with mRCC, and is conceptually more difficult to assess. An easy example in the urology setting is the level of prostate-specific antigen, which is clearly correlated with the level of disease. That correlation makes it useful for early detection and prognosis, but it is unreliable for predicting the net benefit of any intervention. Thus, although PFS may be fundamentally associated with OS, can it accurately predict OS?
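The gap between the 2 conditions can be illustrated with a small constructed example. In the sketch below, PFS and OS are perfectly concordant within each arm (condition 1 holds), yet the treatment shifts PFS by 3 months without changing OS at all (condition 2 fails). The data are invented for illustration, censoring is ignored, and the helper `kendall_tau` is a plain tau-a written for this example.

```python
from itertools import combinations
from statistics import mean


def kendall_tau(x, y):
    """Kendall's tau-a for paired, uncensored observations."""
    pairs = list(combinations(zip(x, y), 2))
    score = 0
    for (x1, y1), (x2, y2) in pairs:
        product = (x1 - x2) * (y1 - y2)
        # +1 for a concordant pair, -1 for a discordant pair, 0 for a tie
        score += (product > 0) - (product < 0)
    return score / len(pairs)


# Invented data (months). Treatment delays progression in every patient
# but leaves each patient's survival time unchanged.
control = {"pfs": [2.0, 4.0, 6.0, 8.0], "os": [10.0, 14.0, 18.0, 22.0]}
treated = {"pfs": [5.0, 7.0, 9.0, 11.0], "os": [10.0, 14.0, 18.0, 22.0]}

# Condition 1: PFS is correlated with OS within each arm.
tau_control = kendall_tau(control["pfs"], control["os"])
tau_treated = kendall_tau(treated["pfs"], treated["os"])

# Condition 2: the effect on PFS should translate into an effect on OS.
pfs_effect = mean(treated["pfs"]) - mean(control["pfs"])
os_effect = mean(treated["os"]) - mean(control["os"])

print(f"tau (control) = {tau_control:.1f}, tau (treated) = {tau_treated:.1f}")
print(f"PFS effect = {pfs_effect:.1f} months, OS effect = {os_effect:.1f} months")
```

Here a patient-level correlation of 1.0 coexists with a treatment effect on OS of 0.0 months, so a strong correlation alone says nothing about whether a PFS gain will carry over to survival.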
A recurring question is why agents that increase PFS fail to improve survival. Some authors have suggested that PFS is more likely to be affected by evaluation and attribution biases, as well as measurement errors, which makes it easily detectable but less specific. Indeed, unlike OS, the assessment of disease progression is at a higher risk of false-positive results. Specifically, the concept of disease progression, originally developed to describe a change in tumor burden, was intended to provide a preliminary indication of activity against the tumor during phase 2 trials. The fact that many clinical trials are now designed to measure disease progression as a time-to-event endpoint[15-18] does not render it a reliable endpoint that is adequate to capture a clinically meaningful benefit. This is particularly worrisome if the motivation to consider PFS as a surrogate endpoint is partly driven by the pharmaceutical industry, in which, given the lower costs and shorter time frame, such agents are allowed to be declared active when the actual benefit is not observable.
As treatment of mRCC continues to advance, the studies of both Halabi et al and Heng et al provide valuable confirmation that disease progression or time to disease progression is fundamentally associated with OS. However, association alone is not enough to justify considering PFS a surrogate endpoint that is meaningful for the patient, for whom the term “meaningful” should ultimately convey benefit. Moreover, because only 49% of all patients in the CALGB trial received secondary treatments, the association between PFS and OS needs to be confirmed in contemporary patients, who typically receive sequential therapy with multiple biologically active agents. A rigorous evaluation of this topic is needed before PFS can be reliably considered a valid surrogate endpoint for OS in patients with mRCC.
No specific funding was disclosed.
CONFLICT OF INTEREST DISCLOSURES
The authors made no disclosures.
- 25. Overall survival in patients from a phase III study of tivozanib hydrochloride versus sorafenib in patients with renal cell carcinoma [abstract]. J Clin Oncol. 2013;31(suppl 6). Abstract 350.
- 29. New guidelines to evaluate the response to treatment in solid tumors. European Organization for Research and Treatment of Cancer, National Cancer Institute of the United States, National Cancer Institute of Canada. J Natl Cancer Inst. 2000;92:205–216.