Evaluation of prediagnostic prostate-specific antigen dynamics as predictors of death from prostate cancer in patients treated conservatively

Authors


  • Dr. Hans Lilja holds patents for free PSA and hK2 assays. Dr. Peter Scardino has stock in Claros Diagnostics.

Abstract

Prostate-specific antigen (PSA) dynamics have been proposed to predict outcome in men with prostate cancer. We assessed the value of PSA velocity (PSAV) and PSA doubling time (PSADT) for predicting prostate cancer-specific mortality (PCSM) in men with clinically localized prostate cancer undergoing conservative management or early hormonal therapy. From 1990 to 1996, 2,333 patients were identified, of whom 594 had two or more PSA values before diagnosis. We examined 12 definitions for PSADT and 10 for PSAV. Because each definition required PSA measurements at particular intervals, the number of patients eligible for each definition varied from 40 to 594 and the number of events from 10 to 119. Four PSAV definitions, but no PSADT definitions, were significantly associated with PCSM after adjustment for PSA in multivariable Cox proportional hazards regression. All four could be calculated only for a proportion of events, and the enhancements in predictive accuracy associated with PSAV had very wide confidence intervals. There was no clear benefit of PSAV in men with low PSA and Gleason grade 6 or less. Although there is evidence that certain PSAV definitions help to predict PCSM in this cohort, the value of incorporating PSAV in predictive models to assist in determining eligibility for conservative management is, at best, uncertain.

Widespread screening with prostate-specific antigen (PSA) in the United States has led to increased diagnosis of small, low-stage cancers, some of which would become lethal if left untreated but many of which are unlikely to affect quality or length of life. Concerns about potential overtreatment of insignificant cancers are underscored by recent findings from one of the randomized trials of prostate cancer screening.1 Results showed that PSA-based screening reduced prostate cancer-specific mortality (PCSM), but that screening of 1410 men, and treating of 48 men, would be required to prevent one prostate cancer death.1 Better biomarkers are required to differentiate potentially lethal from more indolent cancers that would be suitable for conservative management.

PSA dynamics (both velocity and doubling time) have been advocated as a means of improving diagnosis and assessment of prostate cancer. PSA velocity (PSAV) has been statistically associated with biopsy outcomes.2, 3 Pretreatment PSAV has also been associated with recurrence and death after both surgical and radiation therapy.4, 5 PSA doubling time (PSADT) after biochemical recurrence has been demonstrated to predict time to metastasis and death from prostate cancer, thereby acting as a surrogate endpoint for cancer-specific mortality.6, 7 Hence, PSA dynamics may be suitable to help to predict outcome for men treated conservatively. This could include all stages of disease, from active surveillance of localized cancer to watchful waiting for locally advanced or recurrent cancer.

In contrast to reports advocating the use of PSA dynamics, we recently found that although pretreatment PSA dynamics were sometimes associated with outcome after radical prostatectomy, there was no improvement in predictive accuracy beyond that of pretreatment PSA alone.8 Although many authors have investigated whether outcome is statistically associated with PSAV or PSADT, scant attention has been paid to whether dynamics improve our ability to predict over and above either a single PSA measurement or a multivariable model including stage, grade and PSA.9, 10

In this study, we examined whether PSA dynamics could distinguish potentially lethal cancers from those that are insignificant. Our study population is a unique cohort in which patients not selected for aggressive curative therapy were managed conservatively by watchful waiting until disease progression occurred, at which point they received therapy. Thus, this cohort is very different from contemporary active surveillance cohorts. It does, however, provide a unique heterogeneous population of men with clinically localized prostate cancer to investigate the role of PSA dynamics in the natural history of prostate cancer. By using this cohort, we assessed the various published definitions of PSAV and PSADT as predictors of death from prostate cancer. To do so, we tested each definition for association with outcome, both alone and after adjustment for the prediagnostic PSA level, and we tested whether any PSA dynamic definition could improve predictive accuracy compared with the prediagnostic PSA level.

Abbreviations

CI: confidence interval; HSP: heat shock protein; PCSM: prostate cancer-specific mortality; PSA: prostate-specific antigen; PSAV: PSA velocity; PSADT: PSA doubling time; TAPG: Trans-Atlantic Prostate Group; TURP: transurethral resection of the prostate

Material and Methods

Patients

The study population [Trans-Atlantic Prostate Group (TAPG) cohort] has been described in detail.11 Briefly, potential cases were identified from six cancer registries in Great Britain if they were younger than 76 years (median 71 years; interquartile range 67–74 years) at the date of diagnosis and had probable clinically localized prostate cancer diagnosed by transurethral resection of the prostate (TURP) or needle biopsy. Diagnosis between 1990 and 1996 (inclusively) and a baseline PSA were required. Patients treated by radical prostatectomy or radiation therapy within 6 months of diagnosis were excluded. Additional exclusions were those with objective evidence of metastatic disease (by radiology or histology), clinical indications of metastatic disease or a PSA measurement more than 100 ng/mL at or within 6 months of diagnosis. These last two exclusions were a pragmatic method of focusing the study on patients who were very likely to have truly localized disease at presentation. Men who had hormone therapy before diagnostic biopsy were also excluded because of the influence of hormone treatment on interpreting Gleason grade. We also excluded men who died within 6 months of diagnosis or had less than 6 months of follow-up. In January 2005, the cancer registries were queried to obtain the most up-to-date survival data. Where available, death certificates for deceased patients were reviewed to verify cause of death, and outcomes were determined through medical records and cancer registry data. Deaths were divided into two categories, death from prostate cancer and death from other causes, according to the standardized World Health Organization criteria. Patients still alive at last follow-up were censored at that date.

We defined conservative management or watchful waiting as not receiving any therapy (excluding hormone manipulation) within 6 months of diagnosis. Of the 2333 patients eligible, 1663 received watchful waiting treatment. Of these, we identified 594 patients who had two or more PSA values 2 or more months apart before diagnosis; these patients constituted the study group. Of the 594 eligible patients, at last follow-up, 119 patients had died from prostate cancer and 165 patients had died from other causes. The median follow-up for survivors was 9 years.

PSA dynamics definitions

We evaluated 20 definitions (Supporting Information Table) that have been used as predictive tools in the published literature: 11 for PSADT6, 12–22 and 9 for PSAV.4–5, 17, 19, 23–25 We also used the two dynamics tools from the Memorial Sloan-Kettering Cancer Center (MSKCC) website.26 The definitions are exactly the same as those used in our previous study.8 PSA dynamics were calculated using all PSA measurements available before prostate cancer diagnosis.
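As a rough illustration of what a PSA dynamic is, the sketch below computes the two generic quantities: velocity as the least-squares slope of PSA against time, and doubling time as ln(2) divided by the slope of ln(PSA) against time. This is not any one of the 22 published definitions evaluated here, which differ in which PSA values are eligible and over what intervals; the function names and the example values are hypothetical.

```python
import math


def fit_line(times, values):
    """Ordinary least-squares fit of values on times; returns (slope, intercept)."""
    n = len(times)
    if n < 2:
        raise ValueError("need at least two measurements")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("measurements must be taken at different times")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, mean_v - slope * mean_t


def psa_velocity(times_yr, psa):
    """Generic PSAV in ng/mL/yr: slope of PSA (ng/mL) against time (years)."""
    return fit_line(times_yr, psa)[0]


def psa_doubling_time(times_yr, psa):
    """Generic PSADT in years: ln(2) / slope of ln(PSA) against time.

    Returns math.inf when PSA is flat or falling (assumption of this sketch;
    published definitions handle this case in different ways).
    """
    slope, _ = fit_line(times_yr, [math.log(p) for p in psa])
    return math.log(2) / slope if slope > 0 else math.inf


# Hypothetical patient: three prediagnostic PSA values over one year.
velocity = psa_velocity([0.0, 0.5, 1.0], [4.0, 5.0, 6.0])
doubling = psa_doubling_time([0.0, 0.5, 1.0], [4.0, 5.0, 6.0])
```

Even in this simple form, the result depends on which measurements are passed in, which is one reason why different definitions can give different values for the same patient.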

Statistical methods

Univariate Cox proportional hazards regression was used to evaluate the association between different definitions of PSA dynamics and PCSM. Multivariable Cox regression was used to determine which definitions of PSA dynamics were significantly associated with PCSM, controlling for a single prediagnostic PSA value. The single PSA value used in the predictive model was the final PSA before diagnosis that was used in the dynamic calculation under consideration.
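For orientation, the multivariable model described above can be written in the standard Cox form. The notation is ours rather than the authors': $D$ stands for whichever PSA dynamic is under consideration and $h_0(t)$ is the unspecified baseline hazard.

```latex
h(t \mid \mathrm{PSA}, D) = h_0(t)\,\exp\!\left(\beta_{\mathrm{PSA}}\,\mathrm{PSA} + \beta_{D}\,D\right),
\qquad
\mathrm{HR}_{D} = \exp(\beta_{D})
```

Under this form, a dynamic is "significantly associated with PCSM after adjustment for PSA" when the null hypothesis $\beta_{D} = 0$ is rejected, and $\mathrm{HR}_{D}$ is the hazard ratio per one-unit increase in $D$ (or for the higher versus the lower category when $D$ is binary).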

To measure predictive accuracy, we used the concordance index (c-index), which is a similar concept to the area under the receiver operating characteristic curve that can be used to quantify discrimination for survival time data. To determine the enhancement in predictive accuracy associated with each PSA dynamic beyond that of PSA, we evaluated the c-index of (i) PSA alone, (ii) each PSA dynamic alone and (iii) PSA plus each PSA dynamic. We corrected for overfit and obtained confidence intervals (CIs) of the difference in predictive accuracy for c-indices using bootstrap methods.27 Each definition required a particular number of PSA measurements taken at particular intervals; hence, most definitions were not calculable for all patients in our study cohort. Therefore, to allow for comparison between (i), (ii) and (iii), we calculated c-indices in the subset of patients for whom the respective dynamic was calculable. Hence, these estimates for different definitions are not directly comparable. We repeated the analyses using a competing risk regression model instead of a Cox regression model, which accounts for the competing risk of dying from causes other than prostate cancer. None of the results from the competing risk model were substantially different from those reported here (data not shown). All statistical analyses were conducted using Stata 10.0 (Stata Corp., College Station, TX) and R (R Foundation for Statistical Computing, www.r-project.org) with the cmprsk statistical package. The study was conducted under the Health Insurance Portability and Accountability Act guidelines and received the approval of the Institutional Review Board.
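The c-index comparison described above can be sketched as follows. This is a minimal illustration, not the authors' Stata or R code: it computes Harrell's c-index for right-censored data and a simple percentile bootstrap interval for the gain of one set of risk scores over another. It assumes the risk scores (for example, linear predictors from fitted models) are already available, and it does not refit the models in each resample or apply the overfit correction used in the study. All function names are hypothetical.

```python
import random


def c_index(times, events, scores):
    """Harrell's concordance index for right-censored data.

    times  : follow-up time for each patient
    events : 1 if the event was observed, 0 if the patient was censored
    scores : predicted risk; a higher score should mean an earlier event
    Returns None when no pair of patients can be ordered.
    """
    concordant, usable = 0.0, 0
    for i in range(len(times)):
        if not events[i]:
            continue  # the patient with the shorter time must have had the event
        for j in range(len(times)):
            if times[i] < times[j]:
                usable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5  # tied predictions count as half
    return concordant / usable if usable else None


def bootstrap_c_index_gain(times, events, scores_base, scores_new,
                           n_boot=1000, alpha=0.05, seed=0):
    """Observed gain in c-index (new minus base) with a percentile bootstrap CI."""
    base = c_index(times, events, scores_base)
    new = c_index(times, events, scores_new)
    if base is None:
        raise ValueError("no usable pairs in the data")
    rng = random.Random(seed)
    n = len(times)
    gains = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample patients with replacement
        t = [times[k] for k in idx]
        e = [events[k] for k in idx]
        b = c_index(t, e, [scores_base[k] for k in idx])
        if b is None:
            continue  # skip resamples in which no pair can be ordered
        gains.append(c_index(t, e, [scores_new[k] for k in idx]) - b)
    if not gains:
        raise ValueError("no usable bootstrap resamples")
    gains.sort()
    lower = gains[int((alpha / 2) * (len(gains) - 1))]
    upper = gains[int((1 - alpha / 2) * (len(gains) - 1))]
    return new - base, lower, upper
```

Here `scores_base` would play the role of the model with PSA alone and `scores_new` the model with PSA plus a dynamic. An interval that includes zero means the data cannot distinguish the observed gain from no gain.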

Results

Patient characteristics are summarized in Table 1, and PSA dynamics results are summarized in Table 2. Depending on the definition, the percentage of patients for whom a dynamic could be calculated varied from 7 to 100% (2 to 36% of the whole sample). For nine definitions, the dynamic was calculable for one third or less of our study cohort (Table 2). Four PSADT and two PSAV definitions were calculable for more than 90% of our study cohort.

Table 1. Characteristics of 594 patients in this study
Table 2. Summary of PSA, PSA doubling time (12 definitions), and PSA velocity (10 definitions), calculated at the time of diagnosis

Results of univariate analyses are shown in Table 3. A single PSA measurement alone was highly associated with PCSM (p < 0.001). Several PSAV definitions (4 of 10), but no PSADT definitions, were significantly associated with PCSM (all p < 0.003). On multivariable analysis controlling for prediagnostic PSA, three of the four PSAV definitions that were significant on univariate analysis remained significant predictors of PCSM (all p < 0.05, Table 3); in addition, D'Amico A as a continuous variable was a significant predictor of PCSM with adjustment for prediagnostic PSA (p = 0.01).

Table 3. Univariate and multivariable Cox proportional hazards regression to evaluate the association between different definitions of PSA dynamics and prostate cancer-specific mortality

The predictive accuracies of a single PSA alone, each dynamic definition and a single PSA plus each dynamic are summarized in Table 4. Very small enhancements in predictive accuracy above that of PSA alone were observed for four PSADT definitions that were not significantly associated with PCSM (either on univariate or multivariable analysis); these enhancements are likely explained by sampling variability. Two of the four PSAV definitions that were significantly associated with PCSM with adjustment for PSA apparently increased predictive accuracy above that of PSA alone. The enhancement in predictive accuracy with 95% CI for these definitions is given in Table 4. One of these definitions (D'Amico B categorized as >2 ng/mL/yr) included only 15 events and, therefore, had very wide CIs. D'Amico A >2 ng/mL/yr increased the c-index from 0.677 to 0.685 (difference of 0.008; 95% CI −0.018 to 0.036). All of the 95% CIs include zero, which indicates that none of the definitions significantly enhanced the predictive accuracy of PSA alone.

Table 4. Individual assessment of the enhancement in predictive accuracy associated with each definition of PSA dynamics for prediction of prostate cancer-specific mortality, after adjustment for the pretreatment PSA level

Our original intention was to include the most promising definitions in a multivariable model including Gleason grade. However, this was possible only for the D'Amico A definition because of the very small number of events for other definitions. After adjusting for both prediagnostic PSA and biopsy Gleason grade, D'Amico A PSAV remained significantly associated with PCSM (entered as continuous: hazard ratio of 0.989 per 1 ng/mL/yr; 95% CI 0.979–0.998; p = 0.019; entered as binary: hazard ratio of 1.71 for >2 vs. ≤2 ng/mL/yr; 95% CI 1.04–2.79; p = 0.034). To characterize these results in a clinical context, we considered a low-risk group of patients who would reasonably be considered for conservative management (155 patients with biopsy Gleason grade ≤6 and PSA ≤20 ng/mL). This subgroup had 10 PCSM events, with an 8-year probability of PCSM of 7% (95% CI 4–13). The critical use of PSAV in this setting would be to determine those patients at low risk of PCSM (who should receive conservative management) and those at high risk (who may benefit from curative therapy). D'Amico A PSAV was calculable for 110 of these patients (71%) but was increased (>2 ng/mL/yr) in only 9% (10/110), one of whom died of disease. Because only a very small number of patients were reclassified, the likelihood of PCSM in men with PSAV ≤2 ng/mL/yr remained high: 8% at 8 years. We believe that any man told he had a 1 in 10 chance of death within 8 years would not be comfortable with conservative management. Our results were not importantly affected by the more restrictive criteria that might be used to select patients suitable for active surveillance rather than watchful waiting (e.g., Gleason grade ≤6 and PSA ≤10). Of note, D'Amico B PSAV was calculable for only 59 of these patients (38%) and was >2 ng/mL/yr for only 2 (3%); both of these patients were alive at last follow-up (7.1 and 9.9 years after prostate cancer diagnosis).

As a sensitivity analysis, we repeated all analyses but included in the calculations any PSA values up to 6 months after diagnosis but before hormonal manipulation or TURP. This analysis expanded our cohort to 862 patients, of whom 188 died from prostate cancer. In general, there was no important difference in results. Five PSAV definitions (D'Amico A PSAV as continuous and categorized as >2 ng/mL/yr, D'Amico B PSAV as continuous and categorized as >2 ng/mL/yr and Smith PSAV), and no PSADT definitions, were significantly associated with PCSM with adjustment for prediagnostic PSA. None of the definitions importantly enhanced the predictive accuracy of PSA alone.

Additional sensitivity analyses focused on the four velocity definitions that showed promise from our main analyses (D'Amico A PSAV as continuous and categorized as >2 ng/mL/yr and D'Amico B PSAV as continuous and categorized as >2 ng/mL/yr). We performed analyses for the outcome of metastases or PCSM and analyses in which patients who received curative therapy were censored at the time of that treatment. Our key results were essentially unchanged: the two D'Amico A velocity definitions were statistically significant on multivariable analysis controlling for PSA (p < 0.05 for all analyses); the two D'Amico B velocity definitions were statistically significant on univariate (p < 0.001 for all analyses) but not on multivariable analysis controlling for PSA (p > 0.08 for all analyses). None of the four velocity definitions enhanced the predictive accuracy over and above PSA alone.

We compared our results using this TAPG cohort with those previously published using a cohort from MSKCC (Table 5).8 The MSKCC cohort used metastasis or biochemical recurrence after radical prostatectomy as the outcome and included 2938 patients. Importantly, D'Amico A PSAV did not even achieve a significant univariate association with metastases in the MSKCC cohort (p = 0.4 entered as continuous and p = 0.19 entered as binary for >2 ng/mL/yr). D'Amico B PSAV was significantly associated with metastases and biochemical recurrence in the MSKCC cohort; however, this association did not translate into a significant improvement in predictive accuracy for any of the outcomes. The only apparent enhancement observed was for D'Amico B PSAV >2ng/mL/yr for metastases (c-index for PSA plus dynamic vs. with PSA alone: 0.754 vs. 0.724), but this was based on only 16 events, and this enhancement was not independently replicated in the TAPG cohort.

Table 5. PSA dynamics significantly associated with prostate cancer-specific mortality with adjustment for PSA: comparison of results against previously published analyses with an MSKCC cohort8

Discussion

For a PSA dynamic to be of value for clinical decision making or patient counseling in a pretreatment setting, we propose that it must improve predictive accuracy beyond that of a single pretreatment PSA alone. Were this not to be the case, the clinician can just use the patient's latest PSA value for decision making. Here we report that among men with clinically localized prostate cancer treated conservatively, a small number (4 of 22) of definitions of prediagnostic PSA dynamics were significantly associated with PCSM on univariate analysis. Three of these four definitions, plus one other, were statistically significant in a multivariable model controlling for a single prediagnostic PSA value; two of these definitions also seemed to improve predictive accuracy. However, the definitions could be applied only to a subset of men; accordingly, our results are based on a small number of events with the improvements in predictive accuracy (assessed by c-index) associated with wide 95% CIs. For example, D'Amico B PSAV >2 ng/mL/yr enhanced the predictive accuracy by 0.003; however, this analysis was based on only 15 events, and we cannot exclude the possibility that this definition leads to only a very small enhancement in predictive accuracy. Any estimates of the risk of PCSM obtained from the current data set with D'Amico B PSAV would be highly variable, and it would be extremely difficult to determine how this PSAV definition should be implemented in clinical practice. None of the definitions significantly enhanced the predictive accuracy above that of PSA alone, and none showed an apparent enhancement in both the TAPG and MSKCC cohorts.

Of all 22 definitions used in our study, only the Stephenson PSADT was initially described in a cohort undergoing watchful waiting. In that cohort, a PSADT of less than 120 months correlated with disease progression.20 There was no difference between a PSADT of less than 48 months and PSADT of 48 to 120 months, which seems to be contrary to the concept that a more aggressive disease can be identified by a quicker rise in PSA. When applied to the TAPG cohort, this definition was not associated with outcome on univariate analysis (p = 0.7). One possible explanation for why we did not validate Stephenson PSADT is that, in the Stephenson study, a rapidly rising PSA might have been used to make treatment decisions such as whom and when to rebiopsy, resulting in verification bias. Other groups who have suggested that PSA dynamics are of benefit in an active surveillance cohort incorporated the dynamic in the treatment algorithm, that is, patients with a rapidly rising PSA were considered to have progressed. Claiming that “PSA velocity predicts progression” therefore becomes little more than the claim that “PSA velocity predicts PSA velocity.”

In this study, PSA dynamics were not used in treatment decisions. Our results are therefore unlikely to be affected by this type of verification bias.

This study adds to a growing body of literature showing that PSA dynamics add little or nothing to our ability to predict various outcomes across the spectrum of prostate cancer. Previously, we8, 28, 29 and others25, 30, 31 have demonstrated that PSA dynamics lead to, at best, only trivial improvements in predictive accuracy beyond that of a single PSA value alone, either before diagnosis or before radical prostatectomy. Furthermore, in a systematic review of the literature,10 only two articles compared the predictive accuracy of a model containing PSA and PSAV with that of PSA alone: one found no improvement; the other found a very small improvement but was flawed because of verification bias.

In the setting of conservative management, Fall et al.32 assessed PSAV and PSADT to predict outcome in the watchful waiting arm of the Scandinavian Prostate Cancer Group No. 4 trial of watchful waiting vs. radical prostatectomy. The PSA dynamics were calculated from PSA values in the 2-year period after randomization and assessed in three models as predictors of lethal prostate cancer (metastasis or disease-specific death). They found that both PSA and PSA dynamics were associated with outcome; however, neither PSA alone nor PSA dynamics was an accurate predictor. Although the authors did not examine whether the addition of PSA dynamics to PSA alone could improve the ability of PSA to predict outcome, we believe our results, from a cohort of patients treated in a clinical practice setting, confirm the finding from this randomized trial that PSAV does not contribute meaningful information when trying to predict outcome in patients treated conservatively.

PSA dynamics may be associated with outcome yet may not importantly improve the predictive ability of PSA alone, in part because PSA and PSA dynamics are highly correlated.33 If two variables are highly correlated, using both provides little additional information over using only one, and PSA is itself a predictive variable (c-index 0.647 for PCSM). We also note that both PSADT and PSAV were strongly influenced by the method of calculation (Table 2), as has been noted previously.8, 34, 35 In this study, for example, the median PSAV was 0.01 ng/mL/yr by the MSKCC definition and 0.33 ng/mL/yr by D'Amico A. The fact that two definitions, calculated for essentially the same patient group, differed so greatly demonstrates the fragility and variability of PSA dynamics and how criteria for selecting PSA values critically influence the results.

Our study is subject to certain limitations. First, our sample size for some definitions was small, but for 12 of the definitions, we were able to analyze reasonable numbers (>50) of clinically important events (deaths from prostate cancer), and we believe that our analysis would have revealed any definition that contributed strong predictive power above that of a single PSA. Second, all PSA values used in the calculations were obtained from clinical charts. Hence, the PSA values could have come from different assays, which would add to error in the calculated dynamics. That said, this reflects the normal diversity of clinical practice. If it were the case that PSA dynamics aided in the selection of patients for conservative management only if the PSA assays were rigorously controlled, this would have limited practical value.

In our analyses, we tested 22 hypotheses for each endpoint. Given this multiple testing, and considering that no definition showing an apparent enhancement in predictive accuracy had been identified previously in an independent cohort of patients, it is plausible that these positive results were simply due to chance. We interpret these results as providing little evidence that any PSA dynamic improves the predictive ability for disease-specific mortality over that of a single PSA value alone. In contrast, the clinical outcome of this cohort of men with untreated prostate cancer has been assessed with respect to tissue-based biomarkers. In particular, expression of heat shock protein (HSP)-27 has been identified as a powerful (p < 0.001) prognostic indicator of poor clinical outcome in individual cases at diagnosis.36 Like PSA, this protein is also under the control of the androgen receptor37 but is more robust. Although powerful, the disadvantage of HSP-27 is that it is currently tissue-based and indicative of active management only when positive at diagnosis.
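The multiple-testing concern raised at the start of this paragraph can be made concrete with a back-of-the-envelope calculation. It assumes, purely for illustration, that the $m = 22$ tests are independent and each is carried out at $\alpha = 0.05$; in practice the definitions are computed from overlapping PSA values and are therefore correlated, so the figures below are only indicative.

```latex
P(\text{at least one false positive}) = 1 - (1 - \alpha)^{m} = 1 - 0.95^{22} \approx 0.68,
\qquad
E[\text{number of false positives}] = m\,\alpha = 22 \times 0.05 = 1.1
```

Under these assumptions, roughly one nominally significant definition per endpoint would be expected even if no PSA dynamic carried any information.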

In conclusion, we see no justification for calculation of PSA dynamics to help to predict the outcome of patients undergoing conservative management. Instead, we recommend using a single pretreatment or prediagnostic PSA value in an independently validated predictive nomogram. We believe that future research should focus on the four specific definitions of PSAV that were significantly associated with PCSM. These results must be independently replicated in data sets with large numbers of events. In particular, researchers should focus on the question of whether PSAV enhances prediction in comparison with standard predictors alone.

Acknowledgements

This study was supported in part by funds provided by National Institutes of Health/National Cancer Institute SPORE in Prostate Cancer; Cancer Research UK, The Orchid Appeal, The Boxer Family Fellowship and The Sidney Kimmel Center for Prostate and Urological Cancers and David H. Koch through the Prostate Cancer Foundation. The authors thank Janet Novak, PhD, of Helix Editing for substantive editing of the manuscript, which was paid for by Memorial Sloan-Kettering Cancer Center.

Appendix

Members of the Trans-Atlantic Prostate Group included the listed authors and the investigators designated by an asterisk.

Thames Cancer Registry: Henrik Møller*, Shirley Bell (deceased), K. Linklater, J. Ottey, V. Fisher; Ashford & St. Peter's, M. Hall, N. Harvey Hills; Barnet & Chase Farm, H. Reid; Brighton and Sussex, N. Kirkham, P. Thomas; Bromley, D. Nurse; Dartford & Gravesham, I. Dickinson, P. Thebe; East & North Hertfordshire, D. Hanbury, M. Ali-Izzi; Eastbourne, C. Moffatt; Epsom & St. Helier, M. Bailey, L. Temple; Essex Rivers Healthcare, W. Aung, C. Booth; Frimley Park, B. Montgomery, P. Denham; Greenwich Healthcare, N. Cetti, P. Pinto; Guy's & St Thomas's, A. Chandra, T. O'Brien; Hammersmith Hospitals, N. Livni; Havering Hospitals, I. Saeed; Hillingdon, F. Barker, T. Beaven; King's Healthcare, G. Muir, Z. Khan; Kingston, C. Jameson; Lewisham, A. Giles; Mayday Healthcare, N. Arsanious, A. Arnaout; The Medway, E. Boye; Mid Essex Hospitals; Mid Kent, M. Boyle; North West London Hospitals, M. Jarmulowicz; Royal Free Hampstead, R.J. Morgan, A. Bates; St Bartholomew's and The Royal London Hospitals, F. Chinegwundoh, R.T.D. Oliver, D. Berney; Royal Surrey County, S. De Sanctis; Southend, M. Chappell; St George's, London, R. Kirby, C. Corbishley; St Mary's, London, A. Patel, M. Walker; West Hertfordshire, J. Crisp, W. Riddle; Worthing & Southlands Hospitals, J. Grant.

Northern & Yorkshire Cancer Registry & Information Service: David Forman*, C. Storer, C. Bennett, C. Spink; Airedale, I. Appleyard, J. O'Dowd; Hull & East Yorkshire, J. Hetherington, A. MacDonald; The Leeds Teaching Hospitals, P. Whelan, P. Quirke, P. Harnden.

Oxford Cancer Intelligence Unit: Monica Roche*, Sandra Edwards, S. Bose, P. Hall; Heatherwood & Wexham Park, M. Ali, O. Karim; Milton Keynes, E. Walker, S. Jalloh; Northampton, M. Miller, A. Molyneux; Oxford Radcliffe, S. Brewster, D. Davies; Royal Berkshire & Battle, P. Malone, C. McCormick; Stoke Mandeville, J. Greenland, A. Padel.

Welsh Cancer Intelligence & Surveillance Unit: John Steward*, Shelagh Reynolds, Lynda Roberts, Judith Adams; Ceredigion and Mid Wales, J. Edwards, C.G.B. Simpson; Conwy & Denbighshire, A. Dalton, V. Srinivasan; NE Wales, A. De Bolla, C. Burdge; Gwent Healthcare, W. Bowsher, M. Rashid; Swansea, M. Lucas, C. O'Brien; Cardiff & Vale, M. Varma.

Scottish Cancer Registry: David Brewster*; The Lothian University Hospitals, J. Royle, K. Grigor; North Glasgow University Hospitals, D. Kirk, A. Milano, R. Reid.

Merseyside & Cheshire Cancer Registry: Lyn Williams*, R. Iddenden; Royal Liverpool University Hospital, C.S. Foster, P. Cornford.

Memorial Sloan Kettering Cancer Center: H. Lilja*, S. Eggener*.
