Summary. The objective structured clinical examination (OSCE) is increasingly used as a method of clinical assessment, yet its measurement characteristics have not been well documented. Evidence is accumulating that many OSCEs may be too short to achieve reliable results. This paper reports detailed psychometric analyses of OSCEs administered as part of a well-established final-year examination. Generalizability theory guided the investigation of test reliability.
At the present test length, the OSCE components showed low reliabilities relative to the written components. Satisfactory reliabilities could potentially be achieved if test length were increased to approximately 6 hours, a duration which would create significant logistic problems for most medical schools.
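The relation between test length and reliability can be illustrated with the classical Spearman-Brown formula. This is a minimal sketch only: the analyses reported here were guided by generalizability theory, the function names are hypothetical, and the numbers in the usage lines are illustrative values, not results from this examination.

```python
def spearman_brown(reliability, length_factor):
    """Projected reliability when a test is lengthened by `length_factor`.

    Assumes the added material behaves like the existing material
    (the classical Spearman-Brown assumption).
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


def length_factor_needed(reliability, target):
    """Factor by which a test must be lengthened to reach `target` reliability."""
    return (target * (1 - reliability)) / (reliability * (1 - target))


# Illustrative values only (not taken from the study):
current = 0.5   # hypothetical reliability at the present test length
target = 0.8    # hypothetical threshold for a satisfactory reliability
factor = length_factor_needed(current, target)
print(f"lengthen the test by a factor of about {factor:.1f}")
```

Under these assumed values the required lengthening is severalfold, which is the kind of projection that makes a short OSCE hard to defend on reliability grounds alone.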
Several strategies for dealing with this practical problem were explored. Firstly, it was shown that more careful selection of stations based on their psychometric characteristics can significantly improve reliability. Secondly, where rater availability limits any increase in test length, more can be gained by using one rater per station and having more stations than by using two raters per station. Finally, OSCE scores can, with advantage, be combined with other test scores obtained by less resource-intensive methods. By adopting such strategies, a reliable assessment of clinical competence could be obtained in about 4 hours of testing time, divided equally between an OSCE constructed of practical and clinical stations and a written test.
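The trade-off between raters and stations can be sketched as a generalizability-theory decision study. The sketch assumes a design in which different raters are nested within stations, which may differ from the design analysed in this paper; the function name and the variance components are hypothetical and chosen only for illustration.

```python
def g_coefficient(var_person, var_person_station, var_residual,
                  n_stations, n_raters):
    """Generalizability coefficient for relative decisions.

    Assumed design: persons crossed with stations, raters nested in stations.
    Relative error variance =
        var_person_station / n_stations
        + var_residual / (n_stations * n_raters)
    """
    relative_error = (var_person_station / n_stations
                      + var_residual / (n_stations * n_raters))
    return var_person / (var_person + relative_error)


# Hypothetical variance components (not estimates from the study).
components = dict(var_person=1.0, var_person_station=4.0, var_residual=2.0)

# The same 20 rater assignments, deployed in two different ways.
two_raters = g_coefficient(**components, n_stations=10, n_raters=2)
one_rater = g_coefficient(**components, n_stations=20, n_raters=1)
print(f"10 stations x 2 raters: {two_raters:.2f}")
print(f"20 stations x 1 rater:  {one_rater:.2f}")
```

In this assumed design the residual term depends only on the total number of rater assignments, so spreading a fixed pool of raters over more stations reduces the person-by-station error without increasing the rater-related error.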