Disclosures: The authors have no disclosures or conflicts of interest to report.
A Core Competency–based Objective Structured Clinical Examination (OSCE) Can Predict Future Resident Performance
Article first published online: 12 OCT 2010
© 2010 by the Society for Academic Emergency Medicine
Academic Emergency Medicine
Special Issue: CORD/CDEM Educational Advances Supplement
Volume 17, Issue Supplement s2, pages S67–S71, October 2010
How to Cite
Wallenstein, J., Heron, S., Santen, S., Shayne, P. and Ander, D. (2010), A Core Competency–based Objective Structured Clinical Examination (OSCE) Can Predict Future Resident Performance. Academic Emergency Medicine, 17: S67–S71. doi: 10.1111/j.1553-2712.2010.00894.x
Supervising Editor: Terry Kowalenko, MD.
- Issue published online: 12 OCT 2010
- Received April 23, 2010; revision received June 15, 2010; accepted July 19, 2010.
Keywords: educational measurement; competency-based education
Objectives: This study evaluated the ability of an objective structured clinical examination (OSCE) administered in the first month of residency to predict future resident performance in the Accreditation Council for Graduate Medical Education (ACGME) core competencies.
Methods: Eighteen Postgraduate Year 1 (PGY-1) residents completed a five-station OSCE in the first month of postgraduate training. Performance was graded in each of the ACGME core competencies. At the end of 18 months of training, faculty evaluations of resident performance in the emergency department (ED) were used to calculate a cumulative clinical evaluation score for each core competency. The correlations between OSCE scores and clinical evaluation scores at 18 months were assessed on an overall level and in each core competency.
Results: There was a statistically significant correlation between overall OSCE scores and overall clinical evaluation scores (R = 0.48, p < 0.05) and in the individual competencies of patient care (R = 0.49, p < 0.05), medical knowledge (R = 0.59, p < 0.05), and practice-based learning (R = 0.49, p < 0.05). No correlation was noted in the systems-based practice, interpersonal and communication skills, or professionalism competencies.
Conclusions: An early-residency OSCE has the ability to predict future postgraduate performance on a global level and in specific core competencies. Used appropriately, such information can be a valuable tool for program directors in monitoring residents’ progress and providing more tailored guidance.
The early identification of a resident’s unique strengths and weaknesses as a physician allows for a more tailored and individualized educational curriculum. The Accreditation Council for Graduate Medical Education (ACGME) requires the documentation of performance improvement over the course of training.1 Identification of a resident who is struggling is the first and probably the most time-sensitive of all the steps required to ensure a steady course toward competency and the ultimate goal of graduation as a competent physician capable of independent practice.
The ACGME provides a “toolbox” of possible resident assessment instruments. Of these, assessments of live performance in the clinical setting are the most realistic and valid tools. In emergency medicine (EM), faculty evaluation of a resident’s clinical performance in the emergency department (ED) is the mainstay of a resident’s overall assessment, and residency directors rely on those evaluations to identify performance deficiencies. Generally, such identification hinges on documentation from multiple faculty members over a course of time. Recognizing patterns of performance requires the analysis of a significant quantity of individual faculty evaluations.
The typical structure of EM residency programs poses a challenge to EM educators and residency directors. The first year of postgraduate training in EM generally involves a significant number of off-service (non-ED) rotations, the combined total of which may constitute more than half of the Postgraduate Year 1 (PGY-1) experience. During this time, the EM resident is performing clinical work outside of the ED, under the supervision of faculty and residents in other medical specialties. Resident performance is evaluated and assessed during these rotations by non-EM faculty for different skill sets and in a context foreign to the clinical care of the ED. While off-service evaluations may be helpful to residency directors, their utility is likely limited in comparison to clinical evaluations from EM faculty based on resident performance in the ED. Additionally, the remaining “ED” months may be distributed among various secondary training sites, including pediatric departments and community hospitals. While faculty in those departments may have significant educational expertise, departmental boundaries and physical distance often pose a challenge in ensuring consistency in evaluation practices.
Given the typical structure of the PGY-1 year, in which only a minority of time is spent in the ED, it is easy to understand why resident weaknesses may not be appreciated until the PGY-2 year. For a resident struggling in a particular area, this delay can have significant consequences. In a PGY-1 to -3 residency, the most common format of postgraduate EM training, this time limitation is especially compelling. In this format, it could take half of residency training to identify a significant weakness, leaving relatively little time for remediation.
The objective structured clinical examination (OSCE) is perhaps the oldest form of simulation education, first introduced in 1975.2 Its use in medical education has steadily increased since that time, and the OSCE is now a fixture of medical school curricula, used both as an educational tool and as an evaluation instrument. An OSCE is a simulated patient encounter using specially trained actors who function in the role of standardized patients (SPs). Traditionally, OSCEs are conducted in mock examination rooms similar to what one would encounter in an office-based medical practice. Often these rooms have video and audio recording capability, so that faculty can observe the encounter without being physically present in the examination room. There is ample literature documenting and supporting varied uses of OSCEs and SPs in both undergraduate and graduate medical education. In EM postgraduate education specifically, OSCEs and SPs have been used to teach communication competencies3,4 and clinical skills such as ultrasound.5
Prior studies have shown the validity of OSCEs in predicting future clinical performance. Among medical students, poor performance on OSCEs predicted future poor performance in both clinical clerkship examinations6 and clinical performance assessment by supervising faculty.7 To the best of our knowledge, however, no studies have examined the ability of an OSCE administered during residency to predict future clinical performance in postgraduate training.
We hypothesized that performance on an interactive and management-focused OSCE held during the first month of the PGY-1 year would correlate with future clinical performance as assessed by cumulative faculty evaluations midway (18 months) through residency training. We further hypothesized that a correlation would be noted within specific ACGME core competencies. Such a correlation would identify the OSCE as a useful predictive tool in identifying a resident’s specific clinical strengths and weaknesses.
Methods
Study Design
This was a prospective cohort study. Written consent for participation was obtained from all 18 residents. This study was approved by the institutional review board.
Study Setting and Population
Our study population consisted of 18 PGY-1 residents in their first month of training (July 2008).
Study Protocol
A five-station OSCE was developed and piloted in 2007. The cases were developed by the Department of Emergency Medicine Education Committee and represented the broad spectrum of disease, acuity, and patient demographics seen in the ED (Table 1).
Table 1

| Presentation | Behavior | Demographic | Final Diagnosis | Critical Actions |
| --- | --- | --- | --- | --- |
| Dyspnea and abdominal pain after MVC | Normal | Varies | Pneumothorax, splenic laceration | Cervical spine evaluation, tube thoracostomy, abdominal CT scan |
| Altered mental status | Obtunded | Elderly male or female | Septic shock, pneumonia | Oxygen delivery, IV fluid resuscitation, antibiotic therapy |
| Seizure | Confused | College-age male or female | Bacterial meningitis | Fingerstick glucose, lumbar puncture, antibiotic therapy |
| “Took pills” | Agitated and uncooperative | Varies | Acetaminophen overdose, depression | Activated charcoal, acetaminophen level, NAC therapy |
| First-trimester vaginal bleeding with abdominal pain | Anxious | Young female | Missed abortion, intimate partner violence | Ultrasound, communication of bad news, detection of and counseling for IPV |
The OSCE cases were interactive in nature, and residents performed patient management through verbalization of orders to a case-facilitator stationed in each room. These case-facilitators were specially trained SPs who were familiar with the intricacies of each case. All SPs were trained in the case material by the Emory Clinical Skills Center coordinators in conjunction with the authors. The design and flow of the cases were similar to an American Board of Emergency Medicine oral examination session in that results of diagnostic studies that the resident ordered were returned in a simulated real-time manner, and updates on a patient’s vital signs and clinical status were also provided, based on management steps taken by the resident. Given the unique format of this OSCE compared to more traditional models, a brief orientation program was given to the residents immediately prior to the OSCE. Residents were instructed to act as if they were caring for real patients in the ED, performing simultaneous evaluation and management as indicated by the acuity of the case. Each case began with a triage form, and the residents were given 15 minutes to perform evaluation and management and reach a disposition. Cases were observed by EM faculty in a monitoring room via live video and audio feed, and the evaluation form was completed by the faculty member immediately following the case.
We utilized a core-competency assessment form developed by the Council of Emergency Medicine Residency Directors (CORD) Standardized Direct Observation Tool (SDOT) study group.8 This form (a portion of the CORD SDOT, viewable at http://www.cordtests.org/SDOT.htm) evaluates the six ACGME core competencies (patient care, medical knowledge, practice-based learning and improvement, interpersonal and communication skills, professionalism, and systems-based practice) on a five-point Likert scale, anchoring numerical scores with “needs improvement” (1), “meets expectations” (3), and “above expectations” (5). This instrument was chosen to mirror the global assessment of live performance used by EM faculty to evaluate resident performance in the ED. Although familiar with ACGME core competency assessment, faculty members were trained in the use of the scoring instrument prior to the start of the OSCE and were also given a list of three to four critical management steps expected to be performed in each case to help guide their evaluation of the resident. Combined scores for each core competency were calculated at the end of the OSCE, as was a final OSCE score, which was an equal weighting of all core competencies across cases. OSCE performance data are not part of our residents’ official performance records, and OSCE scores were not released to the other faculty or residency directors.
To obtain a rigorous evaluation procedure, a trial of the process was performed in July 2007. Based on this trial, a number of modifications were made to improve the reliability of the OSCE evaluation when our study began in 2008. In 2007 the OSCE was held in four blocks over a total of two days, each block using a different set of standardized patients and facilitators. To eliminate the variability in each case experience, we modified the schedule to allow all residents to complete the OSCE in one day using the same set of actors and facilitators. Additionally, in 2007 a different set of faculty evaluators was used for each of the four blocks, and significant interrater variability was identified. In 2008, one faculty member was chosen to evaluate each of the five cases throughout the entire OSCE day, so that all 18 residents had the same faculty evaluator scoring performance on each of the cases. Thus, faculty member A scored all 18 residents on Case A, faculty member B scored all 18 on Case B, etc. Finally, in 2007 we did not “screen” our faculty evaluators and in fact also used two senior residents on an administrative rotation as evaluators. In 2008, we recruited specific faculty based on their educational expertise. Our cohort of evaluators in 2008 included two senior faculty, an associate program director, a medical director, and two faculty with administrative roles within the student clerkship.
OSCEs were performed on one day in mid-July 2008. Over the subsequent 18 months of postgraduate training, residents’ clinical performance was evaluated using our ACGME core competency–based global assessment instrument. OSCE scores were compared to EM faculty evaluations from adult ED rotations only, as this was the experience most similar to the OSCE encounters. Pediatric ED and off-service evaluations were not included. All faculty completing clinical evaluations of residents were blinded to the residents’ performance on the OSCE. At the end of 18 months (December 31, 2009), the cumulative composite scores for each core competency were calculated based on an equal weighting of all completed faculty evaluations. In addition, an overall clinical performance score was calculated through an equal weighting of all core competencies.
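The composite-score arithmetic described above (equal weighting of all completed evaluations within each competency, then equal weighting of the competencies for the overall score) amounts to a mean of means. A minimal sketch, in which the evaluation scores and competency abbreviations are entirely hypothetical:

```python
from statistics import mean

# Hypothetical Likert scores (1-5) from three completed faculty evaluations;
# the keys abbreviate the six ACGME core competencies.
evals = [
    {"PC": 3, "MK": 3, "PBL": 2, "ICS": 4, "PROF": 4, "SBP": 3},
    {"PC": 4, "MK": 3, "PBL": 3, "ICS": 4, "PROF": 5, "SBP": 3},
    {"PC": 3, "MK": 4, "PBL": 3, "ICS": 3, "PROF": 4, "SBP": 4},
]

# Cumulative composite score per competency: equal weighting of all evaluations.
composite = {c: mean(e[c] for e in evals) for c in evals[0]}

# Overall clinical performance score: equal weighting of all competencies.
overall = mean(composite.values())
```

Equal weighting here simply means an unweighted average, so every evaluation and every competency contributes the same amount to the final score.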
Data Analysis
Using Pearson’s test, the correlation between the final OSCE score and the overall clinical performance score was determined. The same correlation was then determined for each core competency. Based on our sample size of 18, this study had the power to detect a correlation of 0.60 (one-tailed α of 0.05 and β of 0.20).
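The quoted power figure can be sanity-checked with the standard Fisher z approximation for correlation power analysis. This is an assumed reconstruction, since the authors do not report their method:

```python
import math
from statistics import NormalDist

def detectable_r(n, alpha=0.05, power=0.80):
    """Smallest Pearson correlation detectable with a one-tailed test,
    via the Fisher z approximation (the SE of Fisher z is 1/sqrt(n - 3))."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # one-tailed critical value
    z_beta = NormalDist().inv_cdf(power)       # quantile for power 1 - beta
    c = (z_alpha + z_beta) / math.sqrt(n - 3)  # required Fisher z
    return math.tanh(c)                        # back-transform z to r

# For n = 18 this approximation yields r of roughly 0.57, in the
# vicinity of the 0.60 quoted in the text.
print(round(detectable_r(18), 2))
```

Larger samples shrink the detectable correlation, which is why the authors note that a multicenter cohort would strengthen the analysis.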
Results
Mean overall OSCE and clinical final scores, as well as mean scores for each core competency, are listed in Table 2, as are the correlations between OSCE and clinical scores. For each competency, the mean scores on the global evaluations were higher than those on the OSCE. After 18 months of training, residents had an average of 30 clinical evaluations from EM faculty (range 18 to 39). There was a statistically significant correlation between the overall OSCE score and the overall clinical score (R = 0.48, p < 0.05). The strongest, statistically significant correlations were seen in the individual competencies of patient care (R = 0.49, p < 0.05), medical knowledge (R = 0.59, p < 0.05), and practice-based learning and improvement (R = 0.49, p < 0.05). There was no significant correlation in systems-based practice (R = 0.35), interpersonal and communication skills (R = 0.04), or professionalism (R = –0.09).
| | Patient Care | Medical Knowledge | Practice-based Learning | Interpersonal and Communication Skills | Professionalism | Systems-based Practice | Overall |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Pearson’s correlation (R) | 0.49* | 0.59* | 0.49* | 0.04 | –0.09 | 0.35 | 0.48* |

*p < 0.05.
Discussion
Residents’ overall performance in a July PGY-1 OSCE correlated with their overall, cumulative clinical evaluations at the end of 18 months of training. The strongest correlations among individual core competencies were noted in patient care, medical knowledge, and practice-based learning and improvement. OSCE scores on systems-based practice, interpersonal and communication skills, and professionalism did not correlate with faculty assessment of those competencies.
Mean scores in all core competencies improved from the time of the OSCE administration to the cumulative evaluation at 18 months. While there may be confounding factors, the performance improvement in all categories from the start of residency training through its midpoint is evidence for the reliability of the evaluation instrument. There was greater variation in resident OSCE performance than in resident clinical performance in all core competencies. This may reflect the greater variation in experience and knowledge among incoming EM residents prior to the start of postgraduate training or it may be the result of averaging numerous faculty evaluations to arrive at composite scores. It may, though, also reflect a more thoughtful evaluation by faculty participating in an educational activity without the competing time demands of patient care. Regardless of the cause, however, the tight clustering of clinical core competency scores makes it more difficult to demonstrate a strong correlation to OSCE scores, particularly with our small sample size.
We did expect to see a stronger correlation between OSCE and clinical scores in the competencies of interpersonal and communication skills and professionalism, given the known usefulness of OSCEs in assessing communication skills. The lack of correlation may be explained by the smaller standard deviation (SD) in performance in these categories on the OSCE, suggesting that in its current form the OSCE does not discriminate resident performance well in these areas. Future modifications, including more challenging communication and counseling tasks within the cases, could improve this discriminatory ability. We did not anticipate a strong correlation in practice-based learning and improvement or systems-based practice, given the difficulty of assessing these competencies by observing only a single case encounter, although the correlations for both exceeded those for interpersonal and communication skills and professionalism.
While the overall correlation and the correlations within several of the core competencies are statistically significant, their strength is moderate. Future studies with a larger sample size may show stronger correlations, although even a moderate correlation allows OSCE scores to be used as a tool to gauge areas of potential clinical weakness. Given that the correlation with future performance is moderate, this should be done with caution, and it would be inappropriate to label a resident as having “a problem” in any core competency based on poor performance in the OSCE. It is, however, reasonable to state that those who perform poorly are at risk of future poor performance on clinical evaluations by faculty. This information, taken in the context of its limitations, can be very useful not only to resident educators but to residents themselves. A discussion of performance with each resident should be a component of an intern OSCE program. We have not yet studied the correlation of OSCE scores with performance as a senior resident and at graduation, although this can be evaluated once those data are available.
Finally, an intern OSCE has other advantages that program directors should consider. It is an instructive tool as well as an evaluative one. It can be used to highlight core values of the program and high-yield principles of EM at the very start of residency training, when the variation in resident knowledge and experience is arguably at its highest. Most medical schools have established OSCE and standardized patient programs that are available to both students and residents, making this a feasible program for EM programs around the country. Practical concerns that programs should consider include the cost of standardized patients, the scheduled availability of the intern class, and the recruitment of faculty evaluators.
Limitations
This study had a small sample size representing data from a single class of residents. This may affect the power to detect statistically significant correlations. In addition, the small sample size limits our ability to comment on more specific metrics, such as correlations among quartiles or other subgroups.
The study was conducted at a single institution, and a collaborative study would both increase our sample size and demonstrate reproducibility at multiple sites. We have continued our OSCE with subsequent intern classes and plan to continue this analysis as our sample size and power increase.
While we attempted to minimize variation in scoring by having each faculty member observe all residents in a single case, we did not assess the accuracy of the raters or the interrater reliability of our OSCE scoring instrument. This was due to manpower and other practical limitations, although future studies could use video review by multiple faculty members to ensure more accurate performance assessment.
Emergency department clinical evaluations from the five faculty members who assessed resident performance in the OSCE were not excluded, and that group may have developed biases based on resident performance in the OSCE. However, given the large number of clinical faculty (approximately 100), and the extended time period during which clinical evaluations were performed, we believe that this effect is minimal.
Finally, while we used faculty evaluation of resident performance in the ED as our comparison, such evaluations may lack objectivity, are subject to numerous biases, and may not represent a true criterion standard in assessment of residents’ clinical skills. Future studies could compare performance on an early-residency OSCE with performance in subsequent OSCEs or simulation encounters.
Conclusions
Residents’ performance on a five-station objective structured clinical examination during their first month of training correlated with their cumulative ED evaluations after 18 months. In addition to the overall correlation, the objective structured clinical examination assessments of patient care, medical knowledge, and practice-based learning and improvement correlated statistically with subsequent performance on those measures. While the correlations were only moderate, these initial results suggest that an objective structured clinical examination held in the first months of residency can be a valuable tool for program directors to focus their resources in a resident’s training.
- 1Accreditation Council for Graduate Medical Education. Common Program Requirements. Available at: http://www.acgme.org/acWebsite/dutyHours/dh_dutyhoursCommonPR07012007.pdf. Accessed Jul 20, 2010.