• observer variations
• kappa statistics
• jaundice
• diagnostic accuracy
• liver disease

ABSTRACT— Five observers each examined 20 jaundiced patients, recording clinical signs and symptoms on a form that also gave the definitions used for the study. The balanced design of the study allowed examination for order effects; none were found, except for a tendency for agreement on indicants with more than two categories to improve as the study progressed. Agreement was corrected for chance using kappa statistics, which showed that 80% of the indicants had agreement significantly greater than expected by chance. Certain indicants (dark urine, variability of jaundice, abdominal pain, character of liver edge and presence of spleen) showed no evidence of significant agreement, even though each was frequently observed in both states – present or absent. The percentage of correct clinical diagnoses reached by the observers (without biochemical or any other information) varied between 65% and 84%. The consensus diagnosis was correct in 80% of cases. Agreement was higher if the diagnosis was simplified to a ‘Medical’ or ‘Surgical’ diagnosis, with the observers' accuracy between 90% and 100%.
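
To illustrate what correcting agreement for chance involves, the following is a minimal sketch of a multi-observer kappa in Python. It assumes the Fleiss form of the statistic, which suits a design in which every patient is rated by the same number of observers; the abstract does not state which kappa variant was used. The function name `fleiss_kappa` and the example ratings are hypothetical and are not data from the study.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Chance-corrected agreement among several observers (Fleiss form).

    ratings: one list per patient, holding one category label per observer.
    Every patient must be rated by the same number of observers.
    Hypothetical helper for illustration; not the study's own procedure.
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(row) != n_raters for row in ratings):
        raise ValueError("each patient needs the same number (>= 2) of ratings")

    counts = [Counter(row) for row in ratings]
    categories = sorted({label for row in ratings for label in row})

    # Observed agreement: for each patient, the proportion of observer
    # pairs that chose the same category, averaged over patients.
    per_subject = [
        (sum(c[k] ** 2 for k in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_subject) / n_subjects

    # Expected agreement: chance that two ratings coincide given the
    # overall proportion of ratings falling in each category.
    total = n_subjects * n_raters
    p_expected = sum((sum(c[k] for c in counts) / total) ** 2 for k in categories)

    if p_expected == 1.0:
        # Only one category was ever used, so kappa is undefined.
        return float("nan")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical example: 4 patients, 5 observers, sign present (1) or absent (0).
example = [
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 0, 1],
    [0, 0, 1, 0, 0],
]
kappa = fleiss_kappa(example)
```

A value of 1 indicates complete agreement, 0 indicates agreement no better than chance, and negative values indicate agreement worse than chance. Testing whether kappa is significantly greater than zero, as reported above, additionally requires a standard error, which this sketch omits.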