Inter- and intrarater reliability in scoring the signs of Parkinson's disease with the original Columbia scale and a modified version of it, the Sydney scale, was assessed among five neurologists participating in a long-term study of Parkinson's disease. Scoring was done from video recordings of 41 patients whose disability ranged from mild to severe. Although all the neurologists were familiar with the scales and had received training designed to produce uniformity of scoring, interrater reliability was poor. Across raters, the mean score for the Columbia scale varied from 18.6 to 30 and for the Sydney scale from 15.2 to 23.2. By contrast, intrarater reliability was good. This study highlights the limitations of clinical rating scales in Parkinson's disease when more than one rater is used. In designing clinical trials, every effort should be made to ensure that the same patient is always assessed by the same rater.