Objectives This study aimed to explore how faculty attendings score, and what they think of, students’ written responses to professionally challenging situations.
Methods In this mixed-methods study, 10 pairs of faculty attendings (attending physicians in internal medicine) marked responses to a professionalism written examination taken by 40 medical students and were then interviewed regarding their scoring decisions. Quantitatively, inter-rater scoring agreement was calculated for each pair and students’ global scores were compared with a previously developed theoretical framework. Qualitatively, interviews were analysed using grounded theory.
Results Inter-rater reliability in scoring was poor. There was also no correlation between faculty’s scores and our previous theoretical framework; this lack of correlation persisted despite modifications to the framework. Qualitative analysis of faculty attendings’ interviews yielded three major themes: faculty preferred responses in which students expressed insight, showed responsibility, and ultimately put the patient first. Faculty also expressed difficulty in deciding what was more important (the behaviour or the rationale behind it) and in assigning numerical scores to students’ responses. Interestingly, they did not downgrade students for mentioning implications for themselves as long as these were balanced by other considerations.
Conclusions By making the rationales behind students’ behaviours explicit, this study attempted to overcome some of the instability that results when we judge behaviours. However, between-faculty agreement was still poor. This reinforces concerns that professionalism, as a subtle and complex construct, does not reduce easily to numerical scales. Instead of concentrating on creating the ‘perfect’ evaluation instrument, educators should perhaps begin to explore alternative approaches, including those that do not rely on numerical scales.