Comprehensive report on prostate cancer misclassification by 16 currently used low-risk and active surveillance criteria


Rein Jüri Palisaar, Department of Urology, Ruhr-University Bochum, Marienhospital Herne, Widumer Strasse 8, 44627 Herne, Germany. e-mail:


Study Type – Prognosis (case series)

Level of Evidence 4

What's known on the subject? and What does the study add?

Prostate cancer characterisation, based on laboratory findings, clinical examination and histopathological cancer features that are used to define selection criteria for active surveillance (AS), is not ideal. Consequently, a panel of strict or more lenient criteria to select patients for AS has been published. Studies investigating the relationship between pretreatment variables and final pathology have previously shown the risk of cancer misclassification for some criteria.

No previous study has presented an overview of cancer selection using the panel of 16 currently used AS criteria evaluated in the present study. In an exactly defined cohort after radical prostatectomy, each set of criteria was used as a diagnostic test to separate patients with more favourable cancer features (pT2, no Gleason upgrade between biopsy grading and final pathology) from those with unfavourable cancer features (pT3, pN+, Gleason upgrade). To the best of our knowledge, a comparison of test quality measures for AS criteria (sensitivity, specificity, positive and negative predictive value and likelihood ratio) has not yet been reported. Moreover, we showed that tumour characterisation by a formally sufficient 12-core biopsy in the present dataset harboured a risk of ≈20% that unfavourable cancer features were missed, regardless of whether strict or more lenient selection criteria for AS were chosen.


  • To evaluate final histopathological features among men diagnosed with prostate cancer eligible for low-risk (LR) or active surveillance (AS) criteria.


  • Retrospective application of 16 definitions for AS or LR prostate cancer to a contemporary (January 2008 to March 2011) open retropubic radical prostatectomy (RRP) series of 1745 patients.
  • Exclusion criteria: neoadjuvant hormones, radiotherapy, inadequate histopathological reports, <10 biopsy cores.
  • Report on the number of men with insignificant tumours (defined as: ≤pT2, Gleason score ≤6, tumour volume <0.5 mL) and men who had unfavourable tumour characteristics on final pathology (defined as: extracapsular extension or seminal vesicle invasion or lymph node metastasis or Gleason upgrading).
  • Sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) were calculated.


  • Eligibility of patients in the final study cohort (n = 1070) varied from 5.1% to 92.7% depending on the AS or LR criteria used.
  • Final pathology revealed 77 insignificant cancers and 578 patients who had unfavourable histopathological criteria.
  • The detection rate for insignificant cancers on final pathology was variable, ranging from 7.8% to 28.3% depending on the AS- or LR-prediction tool used; unfavourable tumour characteristics were found in up to 33.5% on final pathology.
  • The sensitivity, specificity, PPV and NPV were 8.5–97.9%, 24.7–97.8%, 67.7–89.1% and 45.3–78.2%, respectively.
  • The likelihood ratio to correctly identify a patient with LR disease on final pathology ranged from 1.3 to 8.


  • AS or LR criteria have a significant risk of cancer misclassification.
  • Better prediction tools are needed to improve these criteria.
  • Re-biopsy might improve safety and should be considered more frequently in patients who opt for AS.

Abbreviations: AS, active surveillance; EAU, European Association of Urology; BMI, body mass index; ECE, extracapsular extension; PPV, positive predictive value; NPV, negative predictive value; pN+, lymph node metastases; RRP, open retropubic radical prostatectomy; SVI, seminal vesicle invasion.


In 2008, prostate cancer incidence was 110.5/100 000 while the mortality rate was only 21.1/100 000 in Europe [1]. This indicates that a large proportion of men might harbour small, well-differentiated tumours that may never develop symptomatic or life-threatening disease. The European Randomized Study Of Screening For Prostate Cancer reported that 1410 men need to be screened and 48 patients to be treated to prevent the death of one man from prostate cancer [2]. Studies evaluating quality-adjusted life years suggest that a reasonable number of patients might benefit from active surveillance (AS) rather than initial surgery or radiotherapy [3]. Hence AS has become an important method to reduce overtreatment and treatment-related morbidity. Consequently, after initial cancer diagnosis it is important to identify patients with a low risk (LR) of cancer progression. To date no biomarker exists to safely predict prostate cancer prognosis. Common preoperative variables (e.g. clinical stage, serum PSA concentration, biopsy criteria) are used in clinical practice to individually counsel patients about prognosis and treatment. Several AS and LR definitions have been published to stratify patients at LR from others who might benefit from definitive treatment. These AS- or LR-prediction tools use preoperative variables and are based on varying threshold values aiming to separate favourable from unfavourable cancer features.

For patients considering AS as a treatment option, it is important to investigate whether current AS or LR definitions are adequate. Validation studies with contemporary cohorts are necessary to show prediction accuracy and reliability.

In the present study, we evaluated 16 currently used prostate cancer LR or AS criteria (Epstein et al. [4,5], Dall'Era et al. [6], D'Amico et al. [7], Patel et al. [8], Soloway et al. [9], van den Bergh et al. [10], van As et al. [11], Klotz [12], Roemling et al. [13], Meng et al. [14], Hardie et al. [15], Choo et al. [16], Khatami et al. [17], Al Otaibi et al. [18], Berglund et al. [19], European Association of Urology [EAU] guidelines [20]), using the characteristics of systematic prostate biopsies of ≥10 cores in a contemporary open retropubic radical prostatectomy (RRP) series. We retrospectively applied each of these criteria to our RRP series and compared pre- and postoperative cancer features. Misclassification, defined as Gleason upgrading (biopsy vs final pathology), pT3 or pN+ disease on final pathology, was investigated for each patient eligible for any of the tested criteria. In addition, we assessed sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) to report the reliability of each previously described LR or AS criterion.


Between January 2008 and March 2011, 1745 patients underwent RRP for clinically localized prostate cancer at a single institution. Prostate cancer was detected in all men by TRUS-guided biopsy either by the referring urologist (54.4%) or by a urologist at our department (45.6%). Exclusion criteria were defined as follows: salvage RRP, neoadjuvant hormonal therapy, an insufficient prostate biopsy histopathological report, <10 biopsy cores taken, missing clinical data (body mass index [BMI], prostate volume, clinical stage), laboratory (PSA concentration) or histopathological (cancer volume, Gleason score) reports. A histopathological report was considered sufficient if the following information was provided: Gleason grading for each biopsy core, anatomical localisation and number of biopsy cores taken, number of positive biopsy cores, length of each biopsy core and cancer length per core. Pathological evaluation was performed by dedicated genitourinary pathologists. Cases with a Gleason pattern 1 or 2 on biopsy underwent a second pathological review at our institution or were excluded from analysis. Histopathological final prostate specimen processing was performed according to the Stanford protocol [21]. The final study population included 1070 men. Patient characteristics are given in Table 1.

Table 1.  Clinical, laboratory, histopathological characteristics
Variable | N (%) | Mean (range) | 25/50/75% percentile
  I | 137 (12.8) | |
  II | 700 (65.4) | |
  III | 233 (21.8) | |
Clinical stage: | | |
  1c | 558 (54.7) | |
  2 a,b,c | 414 (38.3) | |
  3 | 71 (6.6) | |
Pathological stage: | | |
  2 a,b,c | 892 (83.4) | |
  3a | 90 (8.4) | |
  3b | 85 (7.9) | |
  4 | 3 (0.3) | |
LN status: | | |
  N0 | 615 (57.5) | |
  N1 | 58 (5.4) | |
  Nx | 397 (37.1) | |
Margin status: | | |
  Negative | 988 (92.3) | |
  Positive | 82 (7.7) | |
  None | 271 (25.3) | |
  Unilateral | 268 (25.0) | |
  Bilateral | 531 (49.6) | |
Biopsy Gleason sum: | | |
  ≤3 + 3 | 683 (63.7) | |
  3 + 4 | 227 (21.2) | |
  4 + 3 | 82 (7.7) | |
  ≥4 + 4 | 78 (7.4) | |
Pathological Gleason sum: | | |
  ≤3 + 3 | 502 (46.9) | |
  3 + 4 | 346 (32.3) | |
  4 + 3 | 112 (10.5) | |
  ≥4 + 4 | 110 (10.3) | |
Age, years | | 65 (42–77) | 60/65/69
BMI, kg/m² | | 27 (19–43) | 25/27/29
Biopsy number | | 12.4 (10–32) | 10/12/12
Total core length, mm | | 157.6 (47–514) | 121/148/183
Prostate volume, mL | | 39 (9–170) | 26/35/46
Tumour volume, mL | | 2.7 (0.1–59) | 1.1/1.9/3.4
PSA concentration, ng/mL | | 9.4 (0.6–83) | 5.2/7/10.7
PSA density, ng/mL² | | 0.28 (0.01–3.32) | 0.13/0.20/0.32

In all, 16 LR or AS selection criteria [4–20] were retrospectively applied to our study cohort. The LR and AS criteria are shown in Tables 2 and 3. We report the rate of patients who had Gleason upgrading, extracapsular extension (ECE), seminal vesicle invasion (SVI) or lymph node metastasis (pN+) on final pathology. In addition, the mean and median tumour volume on final pathology are reported. The rate of pathologically insignificant cancers (defined as: tumour volume <0.5 mL, no Gleason 4/5, no ECE, no SVI) and the rate of unfavourable tumour characteristics (defined as: Gleason upgrading on final pathology compared with biopsy Gleason grading, presence of ECE, SVI or pN+ on final pathology) are shown.

Table 2.  AS selection criteria
VariableWhole patient cohortChoo et al.[16]Meng et al.[14]Klotz [12]Hardie et al.[15]Roemling et al.[13]Khatami et al.[17]Berglund et al.[19]Al Otaibi et al.[18]EAU guidelines [20]
Gl, Gleason; #no Gl 4 cancer, <0.5 mL tumour volume, organ-confined cancer; *upgrading in comparison to biopsy Gleason.

Clinical stage ≤2≤2a≤2a1,21c,2 cT1–2a<2acT1c–cT2a
PSA concentration, ng/mL ≤15<10≤10≤20<15, density < 0.25.5, density < 0.15<10 <10
Biopsy features Gleason score ≤ 7Gleason score ≤ 7Gl-score ≤ 6Gleason score ≤ 7Gl-sum ≤3 + 3<2.9 mm cancer lengthGleason grade ≤ 3no Gleason grade 4Gleason score ≤6
    ≤3 positive cores <3 positive cores ≤3 positive cores<2 positive cores<2 positive cores
    ≤50% cancer per core   <50% cancer per core<50% cancer per core<50% cancer per core
N (%):          
 No of patients fulfilling criteria1070853 (79.7)690 (64.5)259 (24.2)899 (84.0)190 (17.8)55 (5.1)308 (28.8)297 (27.8)135 (12.6)
 ECE90 (8.4)49 (5.7)32 (4.5)3 (1.2)52 (5.8)1 (0.5)0,03 (1.0)6 (2.0)2 (1.5)
 SVI85 (7.9)31 (3.6)17 (2.5)3 (1.2)44 (4.9)1 (0.5)0,04 (1.3)4 (1.3)0,0
 pN+58 (5.4)37 (4.3)23 (3.3)6 (2.3)41 (4.6)6 (3.2)1 (1.8)7 (2.3)11 (3.7)5 (3.7)
 pGl-score ≥ 7569 (53.2)390 (45.7)300 (43.5)52 (20.1)432 (48.1)29 (15.3)13 (23.6)65 (21.1)71 (23.9)25 (18.5)
 pGl-sum ≥ 4 + 3222 (20.7)97 (11.4)69 (10)10 (3.9)111 (12.3)14 (7.4)1 (1.8)12 (3.9)18 (6.9)6 (4.4)
 pGl-score ≥ 8110 (10.3)26 (3.0)15 (2.2)2 (0.8)33 (3.7)1 (0.5)1 (1.8)3 (1.0)6 (2.0)2 (1.5)
 Overall unfavourable         
  pGl 7578 (54.0)401 (47.0)301 (43.6)53 (20.5)*432 (48.1)32 (16.8)*13 (23.6)66 (21.4)*74 (24.9)*26 (19.3)*
  pGl 4 + 3310 (29.0)164 (19.2)113 (16.4)20 (7.7)183 (20.4)6 (3.2)2 (3.6)23 (7.5)34 (11.4)12 (8.9)
  pGl 8247 (23.1)118 (13.8)*75 (10.9)*13 (5.0)135 (15.0)*9 (4.7)1 (1.8)15 (4.5)24 (8.1)9 (6.7)
 Insignificant prostate cancer#77 (7.2)71 (8.3)61 (8.8)48 (18.5)75 (8.3)44 (23.2)9 (16.4)51 (16.6)56 (18.9)32 (23.7)
Mean/median tumour volume, mL2.75/1.902.29/1.702.09/1.501.88/1.182.37/1.751.67/1.002.14/0.841.81/1.202.03/1.201.12/0.92
Sensitivity, % 89.374.741.992.637.88.549.245.322.2
Specificity, % 52.269.690.845.394.597.888.587.295.5
PPV, %
NPV, % 59.445.364.765.562.055.767.265.259.0
Likelihood ratio
Table 3.  AS or LR selection criteria
 Whole patient cohortEpstein et al.[4,5]D'Amico et al.[7]Patel et al.[8]van den Bergh et al.[10]Dall'Era et al.[6]Soloway et al.[9]van As et al.[11]
Gl, Gleason; #no Gl 4 cancer, <0.5 mL tumour volume, organ-confined cancer; *upgrading in comparison to biopsy Gleason.

Clinical stage T1c<2b≤3<2c<2b≤2<2b
PSA concentration, ng/mL Density <0.15≤10 ≤10, density <0.2<10, density <0.15≤15<15
Biopsy features Gl-grade <4Gl-grade <4Gl-grade ≤ 7Gl-grade <4Gl-grade <4Gl-grade <4Gl-sum ≤3 + 4
<3 positive cores<3 positive cores<33% positive cores<3 positive cores<50% positive cores
<50% cancer per core<50%cancer per core
N (%):        
 No of patients fulfilling criteria1070.099 (9.3)514 (48.0)992 (92.7)174 (16.3)176 (16.4)334 (31.2)644 (60.2)
 ECE90 (8.4)014 (2.7)78 (7.9)1 (0.6)2 (1.1)8 (2.4)18 (2.8)
 SVI85 (7.9)011 (2.1)62 (6.3)1 (0.6)1 (0.6)6 (1.8)14 (2.2)
 pN+58 (5.4)3 (3.0)13 (2.5)47 (4.7)5 (2.9)3 (1.7)12 (3.6)23 (3.6)
 pGl-score ≥ 7569 (53.2)12 (12.1)136 (26.5)493 (49.7)25 (14.4)26 (14.8)108 (32.3)245 (38.0)
 pGl-sum ≥ 4 + 3222 (20.7)2 (2.0)26 (5.1)152 (15.3)4 (2.3)3 (1.7)33 (9.9)48 (7.5)
 pGl-score ≥ 8110 (10.3)08 (1.6)40 (4.0)007 (2.1)12 (1.9)
 Overall unfavourable        
  pGl 7578 (54.0)13 (13.1)*145 (28.2)*507 (51.1)27 (15.5)*29 (16.5)*112 (33.5)*252 (39.1)
  pGl 4 + 3310 (29.0)5 (5.1)54 (10.5)243 (24.5)11 (6.3)9 (5.1)47 (14.1)85 (13.2)*
  pGl 8247 (23.1)3 (3.0)39 (7.6)186 (18.8)*7 (4.0)6 (3.4)29 (8.7)59 (9.2)
 Insignificant prostate cancer#77 (7.2)28 (28.3)59 (11.5)77 (7.8)41 (23.6)36 (20.5)53 (15.9)70 (10.9)
Mean/median tumour volume, mL2.75/1.901.64/0.842.00/1.442.61/1.811.66/0.991.62/1.051.96/1.202.05/1.44
Sensitivity, %17.575.097.929.929.945.173.617.5
Specificity, %97.874.924.795.395.080.672.697.8
PPV, %86.971.881.384.583.567.786.886.9
NPV, %58.277.978.261.561.463.352.858.2
Likelihood ratio8.

Preoperative factors potentially separating favourable (defined as: pT2, no Gleason pattern 4) and unfavourable pathology were evaluated using uni- and multivariate analyses (Table 4).

Table 4.  Uni- and multivariate analyses to assess differences between favourable and unfavourable cancer features
Variable | OCC, Gl-sum ≤3 + 3 | ECE or SVI or N+ or pGl-score ≥ 7 | Univariate analysis P | Multivariate analysis P | Odds ratio*
Age, years | 63.6 (6.52) | 64.8 (6.59) | 0.03 | 0.015 | 1.03
BMI, kg/m² | 27.1 | 27.1 | 0.79 | n.a. |
No. of biopsy cores | 12.4 | 12.5 | 0.51 | n.a. |
No. of positive cores | 2.9 | 4.4 | <0.01 | 0.752 |
% positive cores | 24.0 | 36.9 | <0.01 | 0.135 |
Total core length, mm | 158 | 157.2 | 0.77 | n.a. |
Total cancer length, mm | 8.6 | 20.6 | <0.01 | 0.586 |
% cancer length | 6.0 | 13.8 | <0.01 | 0.163 |
Maximum single cancer length, mm | 4.8 | 8.9 | <0.01 | 0.024 | 1.05
Prostate volume, mL | 41. | | | |
PSA concentration, ng/mL | 7.4 | 11.1 | <0.01 | 0.285 |
PSA density, ng/mL² | 0.21 | 0.34 | <0.01 | 0.10 |
No. biopsy cores >50% cancer involvement, n (%) | | | | |
  0 | 418 (85) | 321 (55.5) | < | |
  1 | 49 (10) | 135 (23.4) | | |
  2 | 17 (3.5) | 66 (11.4) | | |
  3 | 8 (1.6) | 32 (5.5) | | |
  ≥4 | | 24 (4.1) | | |
Biopsy Gl-sum, n (%) | | | | |
  ≤3 + 3 | 462 (93.9) | 221 (38.2) | <0.01 | <0.01 | 4.76
  3 + 4 | 23 (4.7) | 207 (35.8) | | |
  4 + 3 | 6 (1.2) | 79 (13.7) | | |
  ≥4 + 4 | 1 (0.2) | 71 (12.3) | | |

Gl, Gleason; OCC, organ-confined cancer; *Note that the odds ratio relates to the specific increment in the respective variable, i.e. for Gleason sum for each increase of one point, etc.

Patients correctly classified by a specific criterion were compared with misclassified patients for tumour volume, PSA concentration, PSA density, age, BMI, number of biopsy cores, total biopsy length, prostate volume and whether the biopsy was performed in our department or externally.

To investigate the reliability of the AS- and LR-selection criteria in a current clinical cohort of patients, we considered each previously described criterion as a diagnostic test aiming to separate favourable (pT2, no Gleason pattern 4) from unfavourable tumours on final pathology. Sensitivity was calculated as the proportion of individuals with favourable tumour characteristics (organ-confined cancer, N0 or Nx status and concordance between biopsy Gleason pattern and final pathology) who were correctly identified by the investigated criteria. Specificity was calculated as the proportion of individuals with unfavourable tumour characteristics (Gleason upgrading between biopsy and final specimen processing, ECE, SVI, LN+) who were not selected by the specific criterion.

The PPV is given as the proportion of individuals who were selected by a specific criterion and were proven by final pathology to have Gleason grading concordance, pT2, N0 or Nx status. Correspondingly, the NPV is given as the proportion of individuals who were not selected and had overall unfavourable tumour characteristics on final pathology. The likelihood ratio (sensitivity/[1 − specificity]) for a positive test result (fulfilled selection criterion) was also calculated (Tables 2 and 3).
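As a worked illustration of these definitions, the following minimal Python sketch recomputes the test-quality measures from a 2×2 table. The counts are derived from Table 3 under the assumption that the 'Overall unfavourable, pGl 7' row is the reference standard (578 of 1070 patients unfavourable; 13 of 99 Epstein-eligible and 145 of 514 D'Amico-eligible patients unfavourable). The function and class names are illustrative and are not part of the original analysis.

```python
from dataclasses import dataclass


@dataclass
class CriterionQuality:
    sensitivity: float
    specificity: float
    ppv: float
    npv: float
    lr_positive: float


def criterion_quality(n_total, n_unfavourable, n_selected, n_selected_unfavourable):
    """Treat a selection criterion as a diagnostic test for favourable pathology.

    'Positive' means the patient fulfils the criterion; the 'condition' is
    favourable final pathology (assumed definition, see lead-in).
    """
    n_favourable = n_total - n_unfavourable
    tp = n_selected - n_selected_unfavourable  # selected and favourable
    fp = n_selected_unfavourable               # selected but unfavourable
    fn = n_favourable - tp                     # favourable but not selected
    tn = n_unfavourable - fp                   # unfavourable and not selected
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return CriterionQuality(
        sensitivity=sensitivity,
        specificity=specificity,
        ppv=tp / (tp + fp),
        npv=tn / (tn + fn),
        lr_positive=sensitivity / (1.0 - specificity),
    )


if __name__ == "__main__":
    # Counts read from Table 3 (assumed mapping of rows to the 2x2 table).
    examples = {"Epstein": (99, 13), "D'Amico": (514, 145)}
    for name, (selected, selected_unfavourable) in examples.items():
        q = criterion_quality(1070, 578, selected, selected_unfavourable)
        print(
            f"{name}: sens {q.sensitivity:.1%}, spec {q.specificity:.1%}, "
            f"PPV {q.ppv:.1%}, NPV {q.npv:.1%}, LR+ {q.lr_positive:.2f}"
        )
```

Under these assumptions the sketch reproduces the values listed in Table 3 for the Epstein criteria (17.5%, 97.8%, 86.9% and 58.2%) and the D'Amico criteria (75.0%, 74.9%, 71.8% and 77.9%). The unrounded positive likelihood ratio for the Epstein criteria comes out at about 7.8, which is consistent with the rounded value of 8 quoted in the text.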

Uni- and multivariate regression analyses were used to evaluate which variables might help differentiate patients with overall unfavourable characteristics (N+, pT3, Gleason pattern 4) and patients with organ-confined cancer and a Gleason 3 pattern (Table 4). A P≤ 0.05 was considered to indicate statistical significance.

Misclassified and correctly identified patients were compared for numerical and categorical data using the unpaired t-test or chi-squared test, respectively (Tables 5 and 6).

Table 5.  Comparison of correctly identified and misclassified patients by AS or LR selection criteria*
VariableEpstein et al.[4,5]D'Amico et al.[7]Patel et al.[8]van den Bergh et al.[10]Dall'Era et al.[6]Soloway et al.[9]van As et al.[11]
  1. −, no statistically significant differences (P > 0.05); *note that in case of significant differences the value for correctly identified patients is depicted first.

 Tumour volume, mL1.5 vs 2.11.8 vs 2.52.1 vs 4.51.5 vs 2.51.5 vs 2.11.8 vs 2.61.7 vs 2.6
 PSA concentration, ng/mL5.8 vs 6.58.0 vs 13.56.6 vs 8.16.8 vs 7.75.8 vs 6.5
 PSA density, ng/mL20.17 vs 0.210.23 vs 0.410.166 vs 0.240.19 vs 0.23
 BMI, kg/m2
 No. of biopsy cores
 Total core length, mm
 Prostate volume, mL
Biopsy (intern vs extern)++
Rate of upgrading to:       
 Gleason score ≥ 7, %     18.9 vs 30.5 
 Gleason score ≥ 8, %     4.7 vs 11.04.8 vs 11.2
Table 6.  Comparison of correctly identified and misclassified patients by AS selection criteria*
VariableChoo et al.[16]Meng et al.[14]Klotz [12]Hardie et al.[15]Roemling et al.[13]Khatami et al.[17]Berglund et al.[19]Al Otaibi et al.[18]EAU guidelines [20]
  1. −, no statistically significant differences (P > 0.05), *note that in case of significant differences the value for correctly identified patients is depicted first.

 Tumour volume, mL2.1 vs 3.52.0 vs 2.91.7 vs 2.32.1 vs 3.81.5 vs 2.61.1 vs 2.30.92 vs 1.7
 PSA concentration, ng/mL6.9 vs 8.35.9 vs 6.55.8 vs 6.57.3 vs 9.55.8 vs 6.78.0 vs 10.4
 PSA density, ng/mL20.20 vs 0.260.16 vs 0.210.21 vs 0.290.15 vs 0.200.18 vs 0.310.15 vs 0.19
 BMI, kg/m2
 No. of biopsy cores
 Total core length, mm
 Prostate volume, mL
Biopsy (intern vs extern)+++
Rate of upgrading to:         
 Gleason score ≥ 7, % 17.4 vs 26.4   12.8 vs 28.1  
 Gleason score ≥ 8, %8.3 vs 13.04.2 vs 6.713.6 vs 16.2  1.8 vs 8.3  


The median patient age was 65 years. In all, 10, 11, 12, 13–19 and >20 biopsy cores were obtained in 351, 67, 417, 163 and 72 patients, respectively. The median number of biopsy cores was 12 (interquartile range 10–12). Detailed patient characteristics are shown in Table 1.


While 65.2% of cases showed concordance when comparing biopsy Gleason score with final pathology Gleason score, 5.8% were down- and 29% were up-graded. Of the total cohort, 83.4% had organ-confined disease while 8.4%, 7.9% and 0.3% had ECE, SVI or pT4 cancers, respectively. There was lymph node involvement in 5.4%. Men were found to be eligible for LR or AS criteria in 5.1–92.7% depending on which criterion was used. In patients identified by LR criteria or eligible for AS, unfavourable tumour characteristics were found in 54% (pathological Gleason score ≥7 or ECE or SVI or pN+) and 29% (pathological Gleason score ≥8 or ECE or SVI or pN+), respectively.

The rates of insignificant cancers varied from 7.8% to 28.3% (Tables 2 and 3). The misclassification rate by AS and LR criteria ranged from 10.9% (Meng et al. [14]) to 33.5% (Soloway et al. [9]). Detailed results for each AS and LR definition are given in Tables 2 and 3.


The sensitivity, specificity, PPV and NPV were 8.5–97.9%, 24.7–97.8%, 67.7–89.1% and 45.3–78.2%, respectively. Likelihood ratio ranged between 1.3 (Patel et al. [8]) and 8 (Epstein et al. [4,5]) (Tables 2 and 3).


There was no statistically significant difference between groups for age, BMI, the number of biopsy cores taken, total biopsy core length and prostate volume. The tumour volume on final pathology was significantly higher in misclassified patients in 14 of 16 criteria (Tables 5 and 6). Applying the D'Amico et al. [7], Patel et al. [8], Soloway et al. [9], van As et al. [11], Choo et al. [16], Klotz [12], Hardie et al. [15] and Berglund et al. [19] criteria, misclassified patients had significantly higher PSA concentrations and PSA density (Tables 5 and 6).


Maximum cancer length per core, number of biopsy cores with >50% cancer involvement and the biopsy Gleason grade were independent predictors to stratify favourable from unfavourable cancers on final pathology (Table 4).


Detailed histopathological findings on prostate biopsy cores are important variables used in prediction tools to counsel patients diagnosed with prostate cancer. It has been reported in various studies that unfavourable tumour characteristics are associated with worse oncological outcomes after treatment [22–24]. Therefore, it is important to investigate if prediction tools provide reliable information to stratify patients eligible for AS from those who might benefit more from definitive treatment.

At our institution >50% of patients presented with external biopsy reports to discuss their treatment options and prognosis. Hence, in the present study, the biopsy scheme and the number of biopsy cores were variable. Only patients with ≥10 cores and a sufficient pathological report were included.

We retrospectively applied the information from a median of 12 cores to currently used AS or LR criteria. We found pathologically insignificant tumours in 7.8–28.3% of patients and misclassified unfavourable tumour characteristics in 10.9–33.5% of men depending on the AS or LR criteria applied. The likelihood ratio to correctly identify patients eligible for the 16 AS and LR criteria ranged from 1.3 to 8.0. In the best case (Epstein criteria) a patient who was selected for AS by this specific criterion had an eight-fold higher likelihood of having a pT2 cancer and no Gleason upgrading on final pathology. In contrast, it is questionable whether we can rely on a test showing a likelihood ratio of 1.3 (Patel et al. [8]) for clinical decision-making. A formally sufficient systematic 12-core biopsy in the present dataset harboured a risk of ≈20% that unfavourable cancer features were missed, regardless of whether strict or more lenient criteria were chosen. The treating urologist must be aware of this variability when using AS- or LR-prediction tools.
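The practical meaning of these likelihood ratios can be illustrated by converting a pre-test probability into a post-test probability via odds. The sketch below assumes that the pre-test probability of favourable pathology equals the cohort prevalence (492 of 1070, as 578 patients had unfavourable features) and uses the rounded likelihood ratios of 1.3 and 8 quoted above; the function name is hypothetical.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Apply a likelihood ratio to a pre-test probability using odds."""
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


if __name__ == "__main__":
    # Assumed pre-test probability: share of favourable pathology in the cohort.
    prevalence_favourable = (1070 - 578) / 1070
    for lr in (1.3, 8.0):
        p = post_test_probability(prevalence_favourable, lr)
        print(f"LR+ {lr}: {prevalence_favourable:.1%} -> {p:.1%}")
```

Under these assumptions, a likelihood ratio of 8 raises the probability of favourable pathology from about 46% to about 87%, whereas a likelihood ratio of 1.3 raises it only to about 53%, which is little better than the pre-test estimate.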

Multivariate analysis showed that maximum single cancer length, number of biopsy cores with >50% cancer involvement and biopsy Gleason grade were significantly associated with unfavourable cancer features on final pathology. When we combined these parameters using varying threshold values, we were still unable to distinguish favourable from unfavourable characteristics. Considering only patients with 1 mm maximum single cancer length or Gleason score ≤ 6 on biopsy (n= 174, 16.3%), the misclassification rate was 19% and 9.8% (ECE or SVI or LN+ or Gleason score 7–10/8–10), respectively. Pathologically insignificant tumours were detected in only 21.9% of cases using this criterion. These findings reflect that tumour characterisation by prostate biopsy features, which are used to select patients for AS in daily clinical routine, is not ideal.

To the best of our knowledge, this is the first report showing comprehensive information about 16 selection criteria used in a current prostate cancer series. To assess the reliability of a specific criterion we calculated sensitivity, specificity, NPV and PPV, which have not yet been reported. The present study aimed to highlight the difficulties in separating more favourable from unfavourable characteristics for AS. We think this provides important background information for counselling individuals diagnosed with prostate cancer.

Suardi et al. [25] reported a series of 4485 patients who underwent RP between 1992 and 2007. They retrospectively applied five AS criteria to their cohort. A Gleason score 7–10 at final pathology was the definition of unfavourable. There was misclassification in up to 56% of patients. Suardi et al. did not provide information on the number of biopsy cores taken in their series. As the Suardi et al. study included patients from 1992, it is likely that a large proportion had a sextant biopsy scheme only. For the D'Amico et al. criteria [7] they reported a substantial misclassification rate of 39.4%. Applying the D'Amico et al. criteria to the present cohort there was misclassification in 28.2% of patients. Misclassification was lower in the present series but one third of patients were still misclassified when at least 10 biopsy cores were taken.

The present data are fairly consistent with Conti et al. [26] who reported an upgrading rate of 23–35% in 1097 patients after RP. Patients included in that study had at least six biopsy cores and 41% had >12 cores taken.

Mufarrij et al. [27] reported a substantial upgrading rate of 45.9–47.2% in their single surgeon series. They thought that extended biopsy was standard of practice in their series (2000–2008), but did not report the actual core number.

To reduce a high misclassification rate Berglund et al. [19] recommended a 14-core re-biopsy in a study cohort of 104 patients who opted for AS. Cancer diagnosis was based on externally performed biopsy with an enormous variation of 2–27 biopsy cores. There was upgrading in 27% of cases and in 26% of men diagnosed with prostate cancer at initial biopsy, re-biopsy histology was negative for prostate cancer.

Ploussard et al.[28] included 411 patients eligible for AS criteria who underwent a 21-core biopsy regimen. They reported results according to core numbers 6, 12 and 21 within the same individual. In all, 297 patients underwent immediate RP. The misclassification rate differed enormously according to biopsy number and was shown to be lower when more cores were taken. Even in a seemingly ‘optimal’ setting (extended uniform 21-core biopsy scheme performed by selected senior urologists at a single centre, central pathology review) the ability to predict final tumour characteristics using transrectal prostate biopsy remains restricted. Even in patients who showed cancer exclusively in a 21-core scheme compared with a 12-core regimen, the misclassification rate was ≈16% (pT3, Gleason score ≥ 8). It is important to consider that the Ploussard et al. 21-core regimen comprised 6 cores taken from the transition zone, which is less likely to harbour aggressive cancer. In the present cohort, none of the patients had aggressive cancer in the transition zone on final pathology.

Based on the present findings and the reports discussed, the chance of missing relevant cancer features might decrease if an early re-biopsy were performed. Capitanio et al. [29] showed that an 18-core protocol compared with a 12-core regimen lowered the rate of upgrading from 47.9% to 23.5%. Similarly, Numao et al. [30] showed that a 26-core scheme compared with a 12-core regimen was better at predicting final cancer grade. Extended biopsy protocols should be considered to provide the best accuracy [31,32]. Saturation biopsy might also be performed as transperineal biopsy. Merrick et al. [33] reported on transperineal template-guided saturation biopsy. Their results, relying on a median of 50 biopsy cores, suggest better Gleason grade accuracy compared with 6- and 12-core transrectal schemes. However, this alternative approach has to be performed under general anaesthesia and is therefore less commonly used. In the future, imaging techniques, e.g. contrast medium-enhanced ultrasonography [34,35], real-time elastography [36], computerised TRUS [37] or MRI [38,39], might improve the accuracy of cancer characterisation before treatment. To date, imaging techniques have not been evaluated as a basis for indicating or deferring local therapy of prostate cancer.

Several limitations apply to the present study. First, the study population underwent no uniform biopsy scheme. However, this reflects daily routine worldwide: web-based prediction tools are available to calculate individual prognosis based on clinical, laboratory and histopathological findings. Urologists should be aware of the dilemma that these tools cannot control for biopsy quality, pathology report standards and the detection method used to measure PSA concentration.

Secondly, we focused on histopathological findings as an outcome surrogate marker for misclassification but did not analyse prostate cancer-specific mortality as an endpoint. Thus, it is important to state that according to the final pathology a substantial proportion of men in the preoperative LR cohort who opted for RRP might be overtreated. However, in the present study ≈20% of men eligible for AS might have based their decision on inadequate tumour characterisation.

Thirdly, we obtained no central review of all biopsy slides to exclude interobserver variability for Gleason grading. Nevertheless, we excluded patients with insufficient pathological biopsy reports and obtained a second pathological opinion in cases with Gleason pattern 1 or 2 (n= 25, 2.4%).

In conclusion, studies investigating the relationship between pretreatment variables and final pathology are necessary to clinically validate published LR or AS criteria and to carefully counsel patients. Current AS or LR disease prediction tools are associated with pathological tumour characteristics but the treating urologist must be aware of the critical risk of misclassification. Currently available AS and LR criteria failed in ≈20% of cases to distinguish men likely to harbour insignificant cancer from patients who had unfavourable tumour characteristics, which most probably should not be managed by AS.


None declared.