Aim : Immunohistochemical analysis of protein expression is central to most clinical translational studies and defines patient treatment or selection criteria for novel drugs. Interobserver variation is rarely analysed, although it is recognised as a key source of potential inaccuracy. Our aim was therefore to examine observer variation and to propose revised standards.
Methods and results : We analysed inter- and intra-observer variation, by interclass correlation coefficient (ICCC) and κ statistics, in 8661 samples. Intra-observer assessment of nuclear, cytoplasmic and membrane staining for seven proteins in 1323 samples resulted in an ICCC of 0.94 and a κ-value of 0.787. Interobserver reproducibility, assessed on 28 proteins by seven observer pairs in 8661 carcinomas, gave an ICCC of 0.90 and a κ-value of 0.70. No significant effect of either antibody or cellular compartmentalization was observed.
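As an illustration of the two agreement measures named above, the following is a minimal sketch of how they might be computed for a pair of observers. The abstract does not state which form of the correlation coefficient was used, so the two-way random-effects, absolute-agreement, single-rater form (Shrout and Fleiss ICC(2,1)) is assumed here. The function names and the example scores are hypothetical and are not taken from the study.

```python
from collections import Counter


def icc_2_1(scores):
    """Two-way random, absolute-agreement, single-rater ICC (assumed form).

    scores: list of rows, one row per sample, one column per observer.
    """
    n = len(scores)      # number of samples
    k = len(scores[0])   # number of observers
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


def cohen_kappa(a, b):
    """Cohen's kappa for two observers' categorical scores."""
    n = len(a)
    p_observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    categories = count_a.keys() | count_b.keys()
    # Agreement expected by chance from each observer's marginal frequencies
    p_expected = sum(count_a[c] * count_b[c] for c in categories) / n ** 2
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical continuous scores: observer 2 reads one unit higher throughout
continuous = [[1, 2], [2, 3], [3, 4]]
icc = icc_2_1(continuous)

# Hypothetical categorical scores (0 = negative, 1 = positive)
kappa = cohen_kappa([0, 0, 1, 1], [0, 1, 1, 1])
```

In this sketch a systematic offset between observers lowers the absolute-agreement coefficient even though the rankings match perfectly, while κ only registers whether the assigned categories coincide.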
Conclusion : We have demonstrated that ICCC is a consistent method for assessing observer variation when a continuous scoring system is used, whereas κ statistics depend on a categorical system. Given the importance of accurate assessment of protein expression in diagnostic and experimental medicine, we suggest raising thresholds for observer variation: an ICCC of 0.7 should be regarded as the minimum acceptable standard, 0.8 as good and ≥ 0.9 as excellent.
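The proposed thresholds could be applied as a simple grading rule, sketched below. Treating the three stated values as lower bounds of bands is an interpretation of the text, and the function name `grade_icc` and the label for values under 0.7 are hypothetical.

```python
def grade_icc(icc):
    """Map a correlation coefficient to the suggested quality bands.

    Assumes each stated threshold is the lower bound of its band.
    """
    if icc >= 0.9:
        return "excellent"
    if icc >= 0.8:
        return "good"
    if icc >= 0.7:
        return "minimum acceptable"
    # Label for values under 0.7 is an assumption; the text names no band here
    return "below minimum standard"


# Values reported in the study for intra- and inter-observer agreement
intra_grade = grade_icc(0.94)
inter_grade = grade_icc(0.90)
```

Under this reading, both the intra-observer (0.94) and interobserver (0.90) results reported above fall in the highest band.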