Keywords:

  • clinically insignificant prostate cancer;
  • clinically meaningful prostate cancer;
  • tumor volume;
  • nomogram

Abstract

  1. Top of page
  2. Abstract
  3. MATERIALS AND METHODS
  4. RESULTS
  5. DISCUSSION
  6. REFERENCES

BACKGROUND.

Overtreatment of prostate cancer (PCa) is a concern, especially in patients who might qualify for the diagnosis of insignificant prostate cancer (IPCa). The ability to identify IPCa prior to definitive therapy was tested.

METHODS.

In a cohort of 1132 men a nomogram was developed to predict the probability of IPCa. Predictors consisted of prostate-specific antigen (PSA), clinical stage, biopsy Gleason sum, core cancer length and percentage of positive biopsy cores (percent positive cores). IPCa was defined as organ-confined PCa (OC) with tumor volume (TV) <0.5 cc and without Gleason 4 or 5 patterns. Finally, an external validation of the most accurate IPCa nomogram was performed in the same group.

RESULTS.

IPCa was pathologically confirmed in 65 (5.7%) men. The 200 bootstrap-corrected predictive accuracy of the new nomogram was 90% versus 81% for the older nomogram. However, in cutoff-based analyses of patients who were qualified by our and the older nomograms as high probability for IPCa, respectively 63% and 45% harbored aggressive PCa variants at radical prostatectomy (Gleason score 7-10, ECE, SVI, and/or LNI).

CONCLUSIONS.

Despite a high accuracy, currently available models for prediction of IPCa are incorrect in 10% to 20% of predictions. The rate of misclassification is even further inflated when specific cutoffs are used. As a consequence, extreme caution is advised when statistical tools are used to assign the diagnosis of IPCa. Cancer 2008. © 2008 American Cancer Society.

Active surveillance (AS) of prostate cancer with delayed intervention represents an attractive management option, as it delays and possibly avoids the morbidity and potential mortality associated with radical prostatectomy (RP) or various radiotherapy alternatives.1-5 The favorable stage, grade, and prostate-specific antigen (PSA) migration increased the proportion of men who might fulfill the inclusion criteria for AS (clinical stages T1c or T2a, PSA <10 ng/mL, and biopsy Gleason 6).6, 7 Despite the appealing overall and cause-specific survival rates in AS series, 85% and 99.2%, respectively, at 8 years, there are concerns about the pathologic outcomes of men who failed AS. In 1 series of 24 patients who were treated with delayed RP after AS failure, 58% had extracapsular extension and 8% had lymph node metastases at the time of surgery.2 These figures may indicate that the criteria used for selection of AS candidates might have been too broad. The alternatives include the Epstein criteria for prediction of pathologically insignificant prostate cancer (IPCa), defined as tumor volume 0.5 cc or less, confined to the prostate and without any high-grade components (Gleason sum 6 or less).8 These pathologic characteristics can be predicted using the following clinical definition: no more than 2 positive biopsy cores with up to 50% cancer involvement, biopsy Gleason sum of 6 or less, and PSA density less than 0.15.8 Unfortunately, the Epstein criteria are not perfect and 20% of patients who fulfill these criteria might have unfavorable pathologic cancer characteristics at RP. Kattan et al9 proposed a nomogram for prediction of IPCa according to the Epstein et al8 criteria at RP. Its accuracy was 79%. In consequence, this model shares the same limitations as the Epstein criteria when AS selection is considered.
Recently, similar models have been proposed by other investigators.10, 11 However, their accuracy was invariably inferior to 80%, which undermines their practical applicability relative to the Epstein et al8 and Kattan et al9 models.

Based on the limitations of existing models for prediction of IPCa that would potentially qualify for AS, we attempted to develop a more accurate model to discriminate between indolent and clinically significant PCa. Moreover, we examined the performance characteristics of the most accurate existing model for prediction of indolent PCa and compared them with those of the newly developed tool.

MATERIALS AND METHODS

Patient Population

Between January 1992 and July 2003, clinical and pathologic data were prospectively gathered in 1153 referred, nonscreened, consecutive patients. All had biopsy-proven, clinically localized PCa and underwent RP at our institution. Of these, 21 patients were excluded because of missing PSA and biopsy Gleason data. Analyses targeted 1132 evaluable patients with complete records for pretreatment serum PSA, clinical stage, primary and secondary biopsy Gleason scores, cumulative length of the cores and of cancer in millimeters in all biopsy cores, percentage of positive biopsy cores (percent positive cores), and tumor volume (TV) assessment at final pathology.

Clinical and Pathologic Evaluation

Clinical stage was assigned by the attending urologist according to the 1992/2002 TNM system. Between 6 and 10 needle biopsy cores were obtained under transrectal-ultrasound (TRUS) guidance. Pretreatment PSA (Abbott Axym PSA assay, Abbott Park, Ill) was measured before digital rectal examination (DRE) and TRUS. Each biopsy core was individually interpreted and the distribution of cores was recorded for each patient. The length of PCa was quantified in each core and graded according to the Gleason system. To calculate the percent positive cores we divided the number of positive cores by the number of cores taken at TRUS biopsy. All prostatectomy specimens were step-sectioned at 3 mm according to the Stanford protocol and were graded according to the Gleason system. All tumors were identified and computer-assisted planimetry was used to define the cumulative tumor volume that accounts for all tumor foci within the prostate.12 Clinically insignificant PCa was defined as organ-confined disease with TV of less than 0.5 cc without Gleason patterns 4 or 5.8 No patient received neoadjuvant androgen deprivation therapy.
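The two derived quantities described above (percent positive cores and the pathologic definition of clinically insignificant PCa) can be written down directly. The following is a minimal Python sketch; the function and argument names are illustrative assumptions and are not taken from the study, whose analyses were run in S-PLUS.

```python
def percent_positive_cores(positive_cores: int, total_cores: int) -> float:
    """Percentage of biopsy cores that contain cancer."""
    if total_cores <= 0 or not 0 <= positive_cores <= total_cores:
        raise ValueError("need total_cores > 0 and 0 <= positive_cores <= total_cores")
    return 100.0 * positive_cores / total_cores


def is_insignificant_pca(organ_confined: bool, tumor_volume_cc: float,
                         gleason_patterns: tuple) -> bool:
    """Pathologic IPCa: organ-confined, TV < 0.5 cc, no Gleason pattern 4 or 5."""
    has_high_grade = any(pattern >= 4 for pattern in gleason_patterns)
    return organ_confined and tumor_volume_cc < 0.5 and not has_high_grade


print(percent_positive_cores(1, 8))             # 12.5
print(is_insignificant_pca(True, 0.2, (3, 3)))  # True
print(is_insignificant_pca(True, 0.2, (3, 4)))  # False
```

The strict inequality on tumor volume follows the definition used in this section (TV of less than 0.5 cc).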

Statistical Analyses

Predictors consisted of pretreatment serum PSA, biopsy Gleason sum, cumulative length of cancer in all biopsy cores, and percent positive cores. All variables, continuous and categorical, were explored with respect to possible cutoff values that could be more informative than the unaltered variable format. Cutoff values were identified using the minimum P-value approach according to Mazumdar and Glassman.13 Because variable stratification can inflate the type I error, we used 200 bootstrap resamples to reduce overfit bias in the multivariable analyses. Multivariable logistic regression models (LRM) addressed the presence of IPCa at radical prostatectomy. Fast backward variable selection was used to define the most parsimonious and most accurate model. Subsequently LRM coefficients were used to generate a nomogram predicting the probability of insignificant PCa at RP. The area under the receiver operating characteristics curve (AUC) was used to quantify the accuracy of nomogram predictions. The extent of over- or underestimation of the observed versus the nomogram predicted insignificant PCa rate at final pathology was explored graphically using loess calibration plots.
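For readers less familiar with the AUC as a measure of discrimination, it can be read as the probability that a randomly chosen patient with IPCa receives a higher predicted probability than a randomly chosen patient without IPCa. The sketch below computes it by direct pairwise comparison in Python. It is an illustration only: the function name and the toy inputs are assumptions, and it does not reproduce the bootstrap correction described above, which requires refitting the model in each resample.

```python
def auc(predicted, observed):
    """Area under the ROC curve via pairwise concordance.

    predicted: model probabilities; observed: 1 for IPCa, 0 otherwise.
    Ties between a positive and a negative case count as half-concordant.
    """
    positives = [p for p, y in zip(predicted, observed) if y == 1]
    negatives = [p for p, y in zip(predicted, observed) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes must be present")
    concordant = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                concordant += 1.0
            elif pos == neg:
                concordant += 0.5
    return concordant / (len(positives) * len(negatives))


# Toy data, not study data: 3 of the 4 positive/negative pairs are concordant.
print(auc([0.8, 0.4, 0.6, 0.2], [1, 1, 0, 0]))  # 0.75
```

An AUC of 0.5 corresponds to chance and 1.0 to perfect discrimination; the accuracies quoted in this paper are AUC values expressed as percentages.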

Various nomogram probability cutoffs were tested to assess the ability to identify patients with or without insignificant PCa and to assist the clinician with this task. Finally, the most accurate Kattan et al nomogram,9 the full model, consisting of PSA, primary and secondary biopsy Gleason patterns, percentage of positive cores, ultrasound-derived prostate volume, millimeters of cancer tissue, and millimeters of noncancer tissue, was graphically explored and externally validated. Of 1132 patients, 232 had unavailable information for prostate volume and could not be used for external validation. In consequence, the external validation was based on 900 evaluable patients (79.5%). The AUCs of the new IPCa nomogram and the Kattan et al nomogram were then compared using the Mantel-Haenszel test. All statistical tests were performed using S-PLUS Professional, v. 1 (MathSoft, Seattle, Wash). Moreover, all tests were 2-sided with a significance level at 0.05.

RESULTS

The characteristics of the 1132 assessable patients are shown in Table 1. Data are stratified according to the presence or absence of clinically insignificant PCa at RP. Moreover, Table 1 shows the characteristics of 900 patients who fulfilled the inclusion criteria of Kattan et al and whose data were used for the external validation of the Kattan et al insignificant PCa nomograms.9

Table 1. Descriptive Characteristics of the Entire Cohort (n=1132), the Kattan Nomogram Validation Cohort (n=900), and Stratified According to Pathologic Clinically Insignificant (n=65) Versus Significant Prostate Cancer (n=1067)

| Variables | Entire Cohort | Clinically Insignificant Cohort* | Clinically Significant Cohort | P | Kattan Validation Cohort |
|---|---|---|---|---|---|
| No. of patients (%) | 1132 (100%) | 65 (5.7%) | 1067 (94.3%) | | 900 (79.5%) |
| PSA, ng/mL (%) | | | | <.001 | |
| Mean (median) | 9.6 (7.2) | 6.4 (5.2) | 9.7 (7.4) | | 9.6 (7.3) |
| Range | 0.6-49.8 | 0.6-23.4 | 0.9-49.8 | | 0.6-49.8 |
| Clinical stage (%) | | | | .03 | |
| T1c | 678 (59.9) | 51 (78.5) | 627 (58.8) | | 530 (58.9) |
| T2a | 219 (19.3) | 6 (9.2) | 213 (20.0) | | 187 (20.8) |
| T2b | 164 (14.5) | 4 (6.2) | 160 (15.0) | | 128 (14.2) |
| T2c | 50 (4.4) | 3 (4.6) | 47 (4.4) | | 39 (4.3) |
| T3 | 21 (1.9) | 1 (1.5) | 20 (1.9) | | 16 (1.8) |
| Biopsy Gleason sum (%) | | | | <.001 | |
| 5 (2+3) | 19 (1.7) | 2 (3.1) | 17 (1.6) | | 15 (1.7) |
| 5 (3+2) | 20 (1.8) | 4 (6.2) | 16 (1.5) | | 15 (1.7) |
| 6 (3+3) | 634 (56.0) | 58 (89.2) | 576 (54.0) | | 511 (56.8) |
| 7-10 (>3+>3) | 459 (40.5) | 1 (1.5) | 458 (42.9) | | 359 (39.9) |
| Biopsy cancer length, mm (%) | | | | | |
| Mean [median] | 10.0 [6.8] | 1.8 [1.1] | 10.5 [7.5] | | 9.6 [6.6] |
| Range | 0.1-65.3 | 0.1-8.4 | 0.1-65.3 | | 0.1-65.3 |
| 0.1-≤1.2 | 149 (13.2) | 38 (58.5) | 111 (10.4) | <.001 | 160 (17.8) |
| >1.2-≤2.4 | 123 (10.9) | 7 (10.8) | 116 (10.9) | | 90 (10.0) |
| >2.4-≤3.0 | 68 (6.0) | 13 (20.0) | 55 (5.2) | | 45 (5.0) |
| >3.0-≤5.5 | 165 (14.6) | 5 (7.7) | 160 (15.0) | | 116 (12.9) |
| >5.5 | 627 (55.4) | 2 (3.1) | 625 (58.6) | | 489 (54.3) |
| Percentage of positive cores (%) | | | | <.001 | |
| Mean [median] | 33.6 [31.3] | 16.7 [12.5] | 34.6 [37.5] | | 33.1 [31.2] |
| Range | 12.5-100.0 | 12.5-62.5 | 12.5-100 | | 12.5-100.0 |
| Organ confinement (%) | 691 (61.0) | 64 (98.5) | 627 (58.8) | <.001 | 553 (61.4) |
| RP Gleason sum (%) | | | | <.001 | |
| 4 | 2 (0.2) | | 2 (0.2) | | 2 (0.2) |
| 5 | 121 (10.7) | 4 (6.2) | 117 (11.0) | | 100 (11.1) |
| 6 | 352 (31.1) | 61 (93.8) | 291 (27.3) | | 283 (31.4) |
| 7-10 | 657 (58.0) | | 657 (61.5) | | 515 (57.2) |
| Tumor volume at final pathology, cc | | | | <.001 | |
| Mean [median] | 5.4 [3.6] | 0.2 [0.2] | 5.7 [3.9] | | 5.3 [3.5] |
| Range | 0.01-50.1 | 0.01-0.48 | 0.04-50.1 | | 0.01-50.1 |
| Clinically insignificant prostate cancer (%) | 65 (5.7) | 65 (100) | | | 51 (5.7) |

  • PSA indicates prostate-specific antigen; RP, radical prostatectomy.

  • * Clinically insignificant prostate cancer is defined as tumor volume <0.5 cc and no Gleason 4 or 5.3

  • Clinical stage is according to the AJCC 2002 TNM staging system.

  • Biopsy cancer length (mm) represents the cumulative length of cancer on all biopsy cores.

  • Percentage of positive cores represents the percentage of biopsy cores affected by cancer tissue (%).

The mean and median pretreatment PSA levels were 9.6 and 7.2 ng/mL. Clinical stages T1c and T2a jointly accounted for 79.2% (n = 897) of cases. Of all patients, 634 (56.0%) and 459 (40.5%) had biopsy Gleason sums of 6 and 7-10, respectively. Tumor volumes in excess of 0.5 cc were recorded in 1059 (93.6%). Mean and median values were 10 and 6.8 mm for the cumulative length of PCa on all biopsy cores and 33.6% and 31.3% for the percentage of positive cores. The pathological criteria of IPCa were fulfilled by 65 (5.7%) of patients.8

Of IPCa predictors, no statistically significant cutoff was identified for serum PSA. Conversely, 4 statistically significant cutoffs were identified for the variable defining the length of cancer tissue in biopsy cores (Table 1). These cutoffs were used in all subsequent analyses. Table 2 shows the univariate and multivariate logistic regression models predicting IPCa. Predictor variables consisted of PSA, clinical stage, biopsy Gleason sum, cumulative PCa length on all biopsy cores, and percent positive cores. In univariate analyses all examined variables were highly statistically significant predictors of IPCa at final pathology (P ≤ .001). In multivariate analyses all examined variables were independent predictors of IPCa at final pathology (P ≤ .036). After fast backward variable removal PSA, biopsy Gleason sum, cumulative cancer length and percent positive cores remained in the model and contributed to 90.4% accuracy for prediction of IPCa at RP. Figure 1A shows the regression coefficient-based nomogram, which was devised from these predictor variables.

Figure 1. Nomogram and calibration plots. (A) Preoperative nomogram to predict presence of clinically insignificant prostate cancer at final pathology (n = 1132). (B) Internal validation: calibration plot of newly developed nomogram to predict presence of clinically insignificant prostate cancer at final pathology. (C) External validation: calibration plot of the previously reported full model of the Kattan et al nomogram to predict presence of clinically insignificant prostate cancer at final pathology (n = 900). PSA, prostate-specific antigen (ng/mL). Biopsy cancer length (mm): cumulative biopsy cancer core length in mm; % + Cores (%): percentage of positive biopsy cores (%). Nomogram instructions: To obtain nomogram-predicted probability of clinically insignificant prostate cancer, locate patient values at each axis. Draw a vertical line to the ‘Point’ axis to determine how many points are attributed for each variable value. Sum the points for all variables. Locate the sum on the ‘Total Points’ line to be able to assess the individual probability of clinically insignificant prostate cancer at final pathology on the probability (‘Probability of presence of clinically insignificant PCa’) line. Calibration plot instructions: The calibration plot shows the performance of the nomogram. Specifically, nomogram-predicted probabilities are compared with the observed rates of clinically insignificant prostate cancer at final pathology. X-axis represents nomogram-predicted probability of clinically insignificant prostate cancer at final pathology. Y-axis shows observed rate of clinically insignificant prostate cancer at final pathology. Perfect prediction would correspond to a slope of 1 (diagonal 45-degree broken line). Solid line indicates bootstrap corrected nomogram performance.

Table 2. Univariate and Multivariate Models to Predict Clinically Insignificant Prostate Cancer* at Final Pathology (n=1132)

| Variables | Univariate Models: OR | Univariate Models: P | Multivariate Model: OR | Multivariate Model: P |
|---|---|---|---|---|
| PSA | 0.9 | <.001 | 0.9 | .036 |
| Biopsy Gleason sum | | .001 | | .01 |
| 3+2 vs 2+3 | 2.1 | .4 | 4.0 | .2 |
| 3+3 vs 2+3 | 0.9 | .8 | 1.0 | 1.0 |
| >3+3 vs 2+3 | 0.02 | .001 | 0.07 | .048 |
| Biopsy cancer length (mm) | | <.001 | | <.001 |
| >1.2-≤2.4 vs 0.1-≤1.2 | 0.2 | <.001 | 0.2 | .001 |
| >2.4-≤3.0 vs 0.1-≤1.2 | 0.7 | .3 | 1.2 | .7 |
| >3.0-≤5.5 vs 0.1-≤1.2 | 0.09 | <.001 | 0.2 | .001 |
| >5.5 vs 0.1-≤1.2 | 0.009 | <.001 | 0.05 | <.001 |
| Percentage of positive cores (%) | 0.9 | <.001 | 1.0 | .005 |
| Predictive accuracy (%) | | | 90.4 | |

  • Clinical stage is according to AJCC 2002 TNM Staging system. Biopsy cancer length is the cumulative length of cancer on all biopsy cores (mm). Percentage of positive cores is the percentage of biopsy cores affected by cancer tissue.

  • PSA indicates prostate-specific antigen; OR, odds ratio.

  • * Clinically insignificant prostate cancer is defined as tumor volume <0.5 cc and no Gleason score of 4 or 5.3
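To illustrate how a logistic regression model of the kind summarized in Table 2 turns predictor values into a probability, the Python sketch below takes the natural log of each multivariate odds ratio as the regression coefficient and applies the logistic function. Several things here are assumptions rather than reported results: the intercept is not given in the text and is set to a placeholder of 0, the odds ratios for PSA and percent positive cores are assumed to be per unit (ng/mL and percentage point), and the odds ratios are used as rounded in the table. The absolute probabilities are therefore not those of the published nomogram; only the direction of each effect is meaningful.

```python
import math

# Multivariate odds ratios as printed in Table 2 (rounded in the source).
OR_PSA_PER_NG_ML = 0.9            # assumed to be per 1 ng/mL
OR_PCT_POSITIVE_CORES = 1.0       # assumed to be per percentage point
OR_GLEASON = {"2+3": 1.0, "3+2": 4.0, "3+3": 1.0, ">3+3": 0.07}
OR_CANCER_LENGTH_MM = {
    "0.1-1.2": 1.0,   # reference category
    ">1.2-2.4": 0.2,
    ">2.4-3.0": 1.2,
    ">3.0-5.5": 0.2,
    ">5.5": 0.05,
}

# HYPOTHETICAL: the intercept is not reported; 0.0 is a placeholder only.
HYPOTHETICAL_INTERCEPT = 0.0


def ipca_probability(psa, gleason, length_category, pct_positive_cores,
                     intercept=HYPOTHETICAL_INTERCEPT):
    """Logistic-model probability from log odds ratios (illustrative only)."""
    log_odds = (
        intercept
        + math.log(OR_PSA_PER_NG_ML) * psa
        + math.log(OR_GLEASON[gleason])
        + math.log(OR_CANCER_LENGTH_MM[length_category])
        + math.log(OR_PCT_POSITIVE_CORES) * pct_positive_cores
    )
    return 1.0 / (1.0 + math.exp(-log_odds))


low_volume = ipca_probability(5.0, "3+3", "0.1-1.2", 12.5)
high_volume = ipca_probability(5.0, "3+3", ">5.5", 12.5)
print(low_volume > high_volume)  # True: longer biopsy cancer length lowers the probability
```

A printed nomogram encodes the same sum graphically: each axis rescales one coefficient times its predictor value into points, and the total points axis maps the sum back to a probability.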

The performance characteristics of the new nomogram are shown in Figure 1B, where the nomogram-predicted probability is represented on the x-axis and the observed rate of insignificant PCa is plotted on the y-axis. The relationship between the predicted rates of IPCa at RP and the observed rate closely approximates a 45-degree line, which indicates virtually perfect agreement across the entire range of predictions.

The application of the Kattan et al nomogram9 to the subset of 900 assessable patients yielded an AUC of 80.6%, which was statistically significantly lower than the AUC of 90.4% (P < .001) that was recorded for the newly developed nomogram. The performance characteristics of the Kattan et al nomogram are shown in Figure 1C and indicate that the nomogram tended to overestimate the probability of insignificant PCa in the current study cohort.

Table 3 shows the effect of applying the newly developed nomogram-derived probabilities of IPCa in the study population. For example, if a nomogram-predicted probability cutoff of 10% was implemented, 942/1132 (83.2%) of the cohort would fall below that cutoff. These 942 patients would have been classified by the nomogram as having a low probability of IPCa. However, 13 of these 942 harbored pathologically confirmed IPCa and would have been incorrectly classified. These 13 IPCa patients corresponded to 20% of all 65 IPCa cases. Of the entire cohort of 1132 patients, 190 cases would fall above the 10% cutoff and would have been classified by the nomogram as having a high probability of pathologically confirmed IPCa. Of these 190, 52 indeed had pathologically confirmed IPCa (27.4%). These 52 patients accounted for 80% of all 65 IPCa patients. Conversely, the remaining 138 of the 190 patients (72.6%) had unfavorable PCa characteristics at RP, despite having been labeled by the nomogram as having a high probability of IPCa. Specifically, 56 of these 138 (40.6%) had pathologic Gleason 7-10, 23 (16.7%) had extracapsular extension, and 8 of the 138 (5.8%) had either seminal vesicle invasion or lymph node invasion. Despite these adverse characteristics, these patients would have been considered as having IPCa if nomogram predictions were considered as the gold standard.

Table 3. Analysis of Nomogram-Derived Cutoffs Used to Determine Presence of Clinically Insignificant (IPCa; n=65) Versus Presence of Significant Prostate Cancer (PCa; n=1067)

| Nomogram-Derived Probability of IPCa, % | Patients Below Probability Threshold in Whom Significant PCa Is Suspected, No. (%) | Patients Below Probability Threshold Without IPCa (True Negatives), No. (%) | Patients Below Probability Threshold With IPCa (False Negatives), No. (%) | Patients Above Probability Threshold in Whom IPCa Is Suspected, No. (%) | Patients Above Probability Threshold Without IPCa (False Positives), No. (%) | Patients Above Probability Threshold With IPCa (True Positives), No. (%) | Positive Predictive Value, % | Negative Predictive Value, % |
|---|---|---|---|---|---|---|---|---|
| <3 | 782 (69.1) | 779 (73.0) | 3 (4.6) | 350 (30.9) | 288 (27.0) | 62 (95.4) | 17.7 | 99.6 |
| <5 | 864 (76.3) | 858 (80.4) | 9 (13.8) | 268 (23.7) | 212 (19.6) | 56 (86.2) | 20.1 | 99.3 |
| <10 | 942 (83.2) | 929 (87.1) | 13 (20.0) | 190 (16.8) | 138 (12.9) | 52 (80.0) | 19.6 | 98.6 |
| <15 | 966 (85.3) | 949 (88.9) | 17 (26.2) | 166 (14.7) | 118 (11.1) | 48 (73.8) | 28.9 | 98.2 |
| <20 | 986 (87.1) | 965 (90.4) | 21 (32.3) | 146 (12.9) | 102 (9.6) | 44 (67.7) | 30.1 | 97.9 |
| <25 | 1020 (90.1) | 991 (92.9) | 29 (44.6) | 112 (9.9) | 76 (7.1) | 36 (55.4) | 32.1 | 97.2 |
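The cutoff-based quantities discussed for the 10% threshold follow from the four cell counts alone. The short Python sketch below recomputes them from the counts given in the text for the new nomogram (929 true negatives, 13 false negatives, 138 false positives, 52 true positives); the function name and the returned dictionary keys are illustrative assumptions.

```python
def cutoff_metrics(tn: int, fn: int, fp: int, tp: int) -> dict:
    """Percentages derived from a 2x2 classification table at one cutoff."""
    return {
        "ppv": 100.0 * tp / (tp + fp),          # IPCa among those above the cutoff
        "npv": 100.0 * tn / (tn + fn),          # non-IPCa among those below the cutoff
        "sensitivity": 100.0 * tp / (tp + fn),  # share of all IPCa cases above the cutoff
        "specificity": 100.0 * tn / (tn + fp),  # share of all non-IPCa cases below the cutoff
    }


# Counts for the new nomogram at the 10% cutoff, as stated in the text.
metrics = cutoff_metrics(tn=929, fn=13, fp=138, tp=52)
for name, value in metrics.items():
    print(f"{name}: {value:.1f}%")
```

With these counts the sketch gives a positive predictive value of 27.4% (52 of 190), a negative predictive value of 98.6% (929 of 942), and a sensitivity of 80.0% (52 of 65).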

Table 4 shows the effect of applying the most accurate version of the Kattan et al9 nomogram for prediction of IPCa in the study population. For example, if a Kattan et al nomogram-predicted probability cutoff of 10% was implemented, 648/900 (72.0%) of the cohort would fall below the 10% cutoff and would have been classified as having a low probability of IPCa. However, within these 648 patients, 12 had pathologically confirmed IPCa and would have been incorrectly classified as non-IPCa. These 12 patients corresponded to 23.5% of all 51 IPCa patients. Of the entire cohort of 900 patients, 252 cases would fall above the 10% cutoff and would have been classified by the Kattan et al nomogram9 as having a high probability of IPCa. Of these 252, 39 had pathologically confirmed IPCa (correct classification 15.5%). These 39 patients accounted for 76.5% of all 51 IPCa patients. Conversely, 213 of the 252 patients (84.5%) did not have IPCa characteristics at RP, despite having been labeled as high probability for IPCa by the nomogram. Specifically, 73 of 213 (34.3%) had Gleason 7-10 at RP, 18 of 213 (8.5%) had extracapsular extension, and 4 of 213 (1.9%) had either seminal vesicle invasion or lymph node invasion. Despite these adverse characteristics, these patients would have been considered as having IPCa if the Kattan et al nomogram9 predictions were considered the gold standard.

Table 4. Analysis of the Kattan et al Nomogram-Derived Cutoffs Used to Determine Presence of Clinically Insignificant (IPCa; n=51) Versus Presence of Significant Prostate Cancer (PCa; n=849)

| Nomogram-Derived Probability of IPCa, % | Patients Below Probability Threshold in Whom Significant PCa Is Suspected, No. (%) | Patients Below Probability Threshold Without IPCa (True Negatives), No. (%) | Patients Below Probability Threshold With IPCa (False Negatives), No. (%) | Patients Above Probability Threshold in Whom IPCa Is Suspected, No. (%) | Patients Above Probability Threshold Without IPCa (False Positives), No. (%) | Patients Above Probability Threshold With IPCa (True Positives), No. (%) | Positive Predictive Value, % | Negative Predictive Value, % |
|---|---|---|---|---|---|---|---|---|
| <3 | 511 (56.8) | 502 (59.1) | 9 (17.6) | 389 (43.2) | 347 (40.9) | 42 (82.4) | 10.8 | 99.1 |
| <5 | 548 (60.9) | 538 (63.4) | 10 (19.6) | 352 (39.1) | 310 (36.6) | 41 (80.4) | 11.6 | 98.2 |
| <10 | 648 (72.0) | 636 (74.9) | 12 (23.5) | 252 (28.0) | 213 (25.1) | 39 (76.5) | 15.5 | 98.2 |
| <15 | 679 (85.3) | 666 (78.4) | 13 (25.5) | 221 (14.7) | 183 (21.6) | 38 (74.5) | 17.2 | 98.1 |
| <20 | 732 (81.3) | 713 (84.0) | 19 (37.3) | 168 (18.7) | 136 (16.0) | 32 (62.7) | 19.0 | 97.4 |
| <25 | 762 (84.7) | 742 (87.4) | 20 (39.2) | 138 (15.3) | 107 (12.6) | 31 (60.8) | 22.5 | 97.4 |

DISCUSSION

Overtreatment of indolent PCa represents a serious concern in the Western world. The contemporary stage and grade migration keeps increasing the proportion of cancers that have indolent pathologic characteristics and possibly increases the rate of PCa overtreatment. Epstein et al8 pioneered the approach to prediction of IPCa. Kattan et al9 followed with a methodologically different approach aimed at predicting the same concept. The work of both groups allowed selecting patients whose disease characteristics might not require active therapy. AS might represent the ideal treatment option for individuals with IPCa.2, 4, 5, 14 However, the success of AS rests on the concept that IPCa cases can be accurately distinguished from PCas that do not fulfill the IPCa criteria.15 The stakes related to AS candidate selection are high, because incorrect classification of individuals at risk might result in disease progression, which might not be salvageable with definitive therapy. A recent publication of the results of an AS cohort has cast doubt on our ability to correctly identify patients who qualify for AS. In that series, of AS failures that were salvaged with RP due to progression, as many as 58% had extraprostatic extension and/or seminal vesicle invasion and 8% had lymph node metastases.2 These sobering statistics indicate that AS selection criteria might not yet be ideal and that those who fail AS are largely not ‘salvageable’ with available treatment modalities.

Several investigators addressed the task of discriminating between insignificant and significant prostate cancers.8 The pioneering work of Epstein et al8 defined the criteria used for identification of men with IPCa. Those consisted of PSAD ≤0.15, up to 2 positive biopsy cores, and no more than 50% core involvement with Gleason ≤6 PCa.8 These characteristics correctly predicted the presence of pathologically IPCa (tumor volume <0.5 cc, pathologic Gleason 6 or less, and organ-confined disease) in 73% of cases.8 It is of interest that the cohort of pathologically IPCa consisted of 41 cases within 157 consecutive radical prostatectomies.8 In a subsequent validation of this model, Epstein et al16 demonstrated 94% accuracy to detect 17 pathologically IPCa out of 163 RP cases. A contemporary validation of this model was performed by Bastian et al17 in a cohort of 237 T1c PCa patients treated with RP at Johns Hopkins University Hospital. Within that cohort, the Epstein criteria correctly predicted the presence of organ-confined PCa in 91.6% of cases.17 A formal validation of pathologically confirmed IPCa was not performed in this latest study. Interestingly, 8.4% of cases that fulfilled the IPCa criteria according to Epstein et al were nonorgan-confined at RP.17 Jeldres et al18 performed a European validation of the Epstein criteria in a cohort of 366 patients using the same methodology as Bastian et al. In their series 20% of patients who fulfilled the Epstein IPCa criteria had unfavorable findings at RP, which consisted of either pathologic Gleason 7 disease (88, 24%) or of nonorgan-confined pathologic stage (30, 8.3%).
Taken together, those data indicate that pathologically confirmed IPCa (tumor volume <0.5 cc, pathologic Gleason 6 or less, and organ-confined disease) may be expected to be correctly predicted in 73% of patients, according to data from 157 consecutive RP cases.8 As a consequence, 27% of patients with PCa characteristics that are more aggressive than the pathologically confirmed IPCa definition may be incorrectly classified as IPCa. Moreover, between 8.4% and 20% of patients who fulfill the IPCa characteristics prior to definitive therapy may be expected to demonstrate nonorgan-confined disease that may not be curable.

The limitations of the Epstein criteria for prediction of pathologically confirmed IPCa were addressed by several investigators.10, 11 Kattan et al9 derived several nomograms for prediction of pathologically confirmed IPCa. The most accurate of these nomograms relies on PSA, clinical stage, primary and secondary biopsy Gleason, prostate volume, and millimeters of cancer and of noncancer tissue in the biopsy specimens. More recently, Nakanishi et al10 attempted to further improve the accuracy of the existing tools, especially in patients with a single positive core at biopsy. In a cohort of 258 men their model predicted with 73% accuracy. It is noteworthy that the Kattan et al nomogram was externally validated by Steyerberg et al11 in a cohort of 247 European patients and was 76% accurate versus 79% in internal validation. The Nakanishi et al nomogram awaits its external validation. Taken together, the nomogram studies indicate that these statistical tools are similar to the original Epstein et al criteria in their ability to predict pathologically confirmed IPCa. The accuracy of the nomograms ranges from 73% to 79% versus 73% for the Epstein model. Therefore, in the best-case scenario the clinician may expect either 80% accuracy when organ-confined PCa presence is predicted with the original Epstein criteria or 76% to 79% accuracy when pathologically confirmed IPCa is predicted.

Accuracy of 76% to 80% is laudable. However, is this sufficient in the context of IPCa, where an incorrect diagnosis may have disastrous consequences, as described by Klotz?2 We believe that the answer to this question is no. It might be argued that the decision to receive definitive therapy versus AS represents 1 of the most important milestones in the natural history of PCa. In consequence, nomograms for prediction of IPCa should be held to the highest scrutiny and should have virtually perfect accuracy and performance characteristics. These should result in virtually no misclassification between IPCa and other PCas. This hypothesis prompted us to attempt developing a more accurate model for prediction of IPCa than those described by Epstein et al8 and Kattan et al.9

The patient population used to test our hypothesis reflects the stage and grade distribution at biopsy, as well as the rate of IPCa at RP, that can be observed in European men, who were shown to differ from their North American counterparts.18 This may affect nomogram accuracy and performance. Indeed, we have shown that the Epstein criteria that were defined in the US perform less well in European patients (80% accuracy) than they did in North American men (92% accuracy).17, 18 However, we found that the accuracy of the Kattan et al nomogram9 did not deteriorate when it was applied to our patient cohort (81% accuracy), relative to its performance in North American men (79% accuracy). Interestingly, our newly developed nomogram demonstrated better discriminant characteristics than all other models, which was evidenced by 90% accuracy, after controlling for overfit bias with 200 bootstrap resamples. Although bootstrapping was shown to represent the best alternative to external validation,19 our accuracy results are not obtained from an external validation cohort and may still be affected by residual overfit bias.

Despite these highly encouraging discrimination characteristics, close examination of the discriminant characteristics of our newly developed nomogram, as well as of those of the Kattan et al9 nomogram, revealed worrisome findings (Tables 3 and 4). The use of various nomogram-derived probability cutoffs for prediction of IPCa resulted in misclassification of important proportions of patients. A strikingly important proportion of patients who were classified by our nomogram (63%) and by the Kattan et al9 nomogram (45%) as high probability of IPCa harbored aggressive PCa variants at RP (Gleason 7-10, ECE, SVI, and/or LNI). From a practical perspective this type of misclassification implies that those men, despite aggressive PCa characteristics, could see themselves being offered AS. Obviously, such practice could result in truly disastrous cancer control outcomes.
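The 63% and 45% figures can be traced back to the counts reported in the Results for the 10% cutoff. The Python sketch below sums the three adverse-pathology groups among the false-positive patients of each nomogram. It assumes, as the reported percentages suggest, that the three groups are counted without overlap; the variable and function names are illustrative.

```python
def adverse_share(gleason_7_to_10: int, ece: int, svi_or_lni: int,
                  false_positives: int) -> float:
    """Percentage of false-positive patients with any listed adverse feature.

    Assumes the three groups are mutually exclusive as tabulated.
    """
    return 100.0 * (gleason_7_to_10 + ece + svi_or_lni) / false_positives


# Counts at the 10% cutoff, as stated in the Results.
new_nomogram = adverse_share(56, 23, 8, false_positives=138)
kattan_nomogram = adverse_share(73, 18, 4, false_positives=213)

print(round(new_nomogram))     # 63
print(round(kattan_nomogram))  # 45
```

Both percentages use the patients labeled high probability for IPCa who did not have IPCa at RP as the denominator (138 and 213, respectively).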

The nomogram misclassification was substantially less pronounced in patients who were classified by the nomograms as low probability for IPCa. At the 10% cutoff, between 20% and 23.5% of all pathologically confirmed IPCa cases fell into the low-probability group and might have been overtreated with definitive therapy.

Our findings need to be interpreted with caution. First, we demonstrated some pitfalls in the applicability of nomograms and other models for prediction of IPCa. However, we could not identify a valid alternative to model-based prediction. Clearly, several studies demonstrated that a nomogram-based approach is better than random assignment of risk, use of individual clinical experience, or reliance on risk strata.20, 21 Unfortunately, a satisfactory model for prediction of IPCa has not yet been identified. Second, the cutoff approach for testing of predictive models is not synonymous with accuracy, which is defined by the area under the curve. We applied the cutoff approach on exceptional grounds. The exception might be justified by the extra scrutiny that is warranted when models predicting IPCa are evaluated. It is important to emphasize that neither nomogram was designed to be used with a specific cutoff. Therefore, we plead guilty for bending the rules of nomogram testing. Third, the practical implications of our findings indicate that the diagnosis of IPCa cannot be made with perfect certainty, despite the existence of models that are 80% or more accurate. Despite these highly laudable characteristics a nonnegligible proportion of patients may be qualified as IPCa, but in reality they may harbor aggressive PCa variants. Finally, do our results imply that the term IPCa should be removed from the urologic dictionary? It clearly should not. However, it should be noted that the proportion of men with IPCa is relatively small, at least at RP, where patient selection might be biased toward clinically meaningful PCa. However, an RP is necessary to confirm the pathological IPCa characteristics. Therefore, this selection bias might be unavoidable in any study that wishes to address the relationship between pretreatment variables and the rate of pathologically confirmed IPCa.

Conclusions

In an era when less aggressive treatments are being encouraged for men with PCa (HIFU or AS), our findings indicate that even sophisticated statistical tools are unable to convincingly predict IPCa and that we need to be cautious when evaluating our patients for such treatments. Specifically, despite high accuracy, currently available models for prediction of IPCa are incorrect in 10% to 20% of cases. The rate of misclassification is even further inflated when specific cutoffs are used. In consequence, we advise extreme caution when the assignment of the diagnosis of IPCa is based on statistical models.

REFERENCES
