ACADEMIC EMERGENCY MEDICINE 2011; 18:1081–1089 © 2011 by the Society for Academic Emergency Medicine
Objectives: The objective was to critically appraise and highlight medical education research studies published in 2010 that were methodologically superior and whose outcomes were pertinent to teaching and education in emergency medicine (EM).
Methods: A search of the English language literature in 2010 querying PubMed, Scopus, Education Resources Information Center (ERIC), and PsychInfo identified 41 EM studies that used hypothesis-testing or observational investigations of educational interventions. Five reviewers independently ranked all publications based on 10 criteria, including four related to methodology, that were chosen a priori to standardize evaluation by reviewers. This method was used previously to appraise medical education research published in 2008 and 2009.
Results: Five medical education research studies met the a priori criteria for inclusion and are reviewed and summarized here. Comparing the literature of 2010 to 2008 and 2009, the number of published educational research papers increased from 30 to 36 and then to 41. The number of funded studies remained fairly stable over the past 3 years at 13 (2008), 16 (2009), and 9 (2010). As in past years, research involving the use of technology accounted for a significant number of publications (34%), including three of the five highlighted studies.
Conclusions: Forty-one EM educational studies published in 2010 were identified. This critical appraisal reviews and highlights five studies that met a priori quality indicators. Current trends and common methodologic pitfalls in the 2010 papers are noted.
Medical education research attempts to provide an evidence basis for pedagogic techniques and methodologies. Publication of this research exposes educators to new educational theories, methods, and innovations in research that can be used to improve teaching, provide a foundation for future medical education research, and advance the field of medical education as a discipline. The execution of medical education research requires in-depth knowledge of educational theory, research methodology, and current educational needs and opportunities. Medical education research, which focuses on the scientific investigation and assessment of the effects of teaching and educational efforts, can often provide an explanation as to why success or failure occurs in a particular educational situation.1
Educational research in emergency medicine (EM) has benefited recently from increased attention and emphasis. Both the Society for Academic Emergency Medicine (SAEM) and the Council of Emergency Medicine Residency Directors (CORD) have recently announced grants for educational research. SAEM, CORD, the American College of Emergency Physicians (ACEP), and the American Academy of Emergency Medicine (AAEM) all provide opportunities to report results in their journals and to present research at their academic meetings. In 2009 and 2010, Academic Emergency Medicine published education-focused supplements sponsored by CORD and the Clerkship Directors in Emergency Medicine (CDEM).
Medical education scholars have suggested the use of methodologies and metrics adapted from traditional bench and clinical research to perform and assess medical education research.2–6 The Research in Medical Education (RIME) Symposium of the Association of American Medical Colleges (AAMC) developed criteria for evaluating the quality of educational research submitted for publication and presentation at the national AAMC meeting. In 2009 and 2010, we used revised RIME criteria to scientifically appraise and rank all of the EM educational research published in the prior year and highlighted the studies that received the best scores based on a priori criteria.2,3 We also assessed trends in EM education research methods.
The reviewers used the previously published criteria to critically analyze and rank the EM educational research published in 2010. We here highlight the 2010 studies that are pertinent to teaching and education in EM and that are methodologically superior. This article is intended to serve as an unbiased summary of excellent educational research. It is hoped that educators and researchers in EM will find this a valuable resource for their own efforts.
A medical librarian performed the literature search in the medical and social sciences literature domains and supplied medical subject heading (MeSH) and keyword terms. MEDLINE was searched through PubMed using a Boolean search strategy that incorporated the following MeSH terms: emergency medicine and medical education, medical student, internship, house staff, resident, undergraduate medical education, graduate medical education, and continuing medical education. Keyword variants for the MeSH terms were included in the search for comprehensiveness. Boolean searches of other databases, including Scopus (“medical education” and emergency), Education Resources Information Center (ERIC; emergency medicine), and PsychInfo (emergency medicine and education) were performed using keyword searching and where possible using the databases’ controlled vocabularies. Publications were limited to English language papers published in 2010. Searches were run in February 2011.
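The Boolean strategy described above can be illustrated with a short sketch. The exact query string used by the librarian is not reported, so the function name `build_query`, its parameters, and the way the terms are combined are assumptions for illustration only; the search terms themselves are copied from the text, and the bracketed field tags are standard PubMed search tags.

```python
def build_query(education_terms, anchor="emergency medicine",
                year=2010, language="english"):
    """Combine one anchor term with OR-ed education terms (hypothetical layout).

    The anchor is AND-ed with the group of education terms, then limited
    by language and publication year, mirroring the limits in the text.
    """
    education = " OR ".join(f'"{t}"[MeSH Terms]' for t in education_terms)
    return (f'"{anchor}"[MeSH Terms] AND ({education}) '
            f'AND {language}[Language] AND {year}[Date - Publication]')


# Terms taken verbatim from the search description above.
terms = [
    "medical education", "medical student", "internship", "house staff",
    "resident", "undergraduate medical education",
    "graduate medical education", "continuing medical education",
]
print(build_query(terms))
```

Keyword variants and the Scopus, ERIC, and PsychInfo searches would need their own strings, because each database uses a different controlled vocabulary.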
Inclusion and Exclusion Criteria
Publications on the education of medical students, residents, academic and nonacademic attending physicians, and other emergency providers were included. Medical education studies were defined as hypothesis-testing investigations and measurements of educational interventions using either quantitative or qualitative methods. Publications were excluded if they were opinion, comments, literature reviews, descriptive papers, or reports on education of prehospital personnel or if the study could not be generalized to EM training outside of the country in which it was performed.
Data Collection and Analysis
One author (PS) screened abstracts of all retrieved publications and applied the exclusion criteria. Two authors (GK, ML) reviewed and approved the selection. Retrieved publications were maintained in an EndNote X2 (Thomson Reuters, New York City, NY) database. All differences in opinion were resolved by discussion. Publications that met inclusion criteria were posted in a shared folder online for all five reviewers to score independently.
Using the criteria developed in 2009 and then modified in 2010,2,3 papers were scored in 10 categories. Categories for methodology were “study design,” “implementation of study design,” “data collection,” and “data analysis.” Additional categories were “introduction,” “discussion,” “limitations,” “innovation of project,” “relevance of project,” and “clarity of writing.” Each of the categories was scored based on predefined criteria to make scoring as objective as possible (Table 1). Possible scores ranged from 0 to 28.
| Domain | Item | Item Score | Maximum Domain Score |
|---|---|---|---|
| Introduction | | | |
| | 1. One point for each: | | |
| | Description of background literature | 1 | |
| | Clearly frame the problem | 1 | |
| Study design | | | |
| | 1. Pick most appropriate score: | | |
| | Not appropriate for hypothesis | 0 | |
| | Appropriate design, but not best method | 1 | |
| | Excellent design for question asked | 2 | |
| Implementation of study design | | | 4 |
| | 1. Pick most appropriate score: | | |
| | No pretest, posttest | 0 | |
| | Both experimental and control group with nonrandom assignment | 3 | |
| | Both experimental and control group with random assignment | 4 | |
| Data collection (institutions + response rate) | | | 4 |
| | 1. Institutions—pick most appropriate score: | | |
| | More than two institutions | 2 | |
| | 2. Response rate—pick most appropriate score: | | |
| | Response rate <50% or not reported | 0 | |
| | Response rate 50%–74% | 1 | |
| | Response rate ≥75% | 2 | |
| Data analysis (add appropriateness + sophistication) | | | 3 |
| | 1. Appropriateness of analysis—pick most appropriate score: | | |
| | Data analysis inappropriate for study design or type of data | 0 | |
| | Data analysis appropriate for study design and type of data | 1 | |
| | 2. Sophistication of analysis—pick most appropriate score: | | |
| | Descriptive analysis only | 1 | |
| | Beyond descriptive analysis | 2 | |
| Discussion | | | |
| | 1. One point for each: | | |
| | Data supports conclusion | 1 | |
| | Conclusion clearly addresses hypothesis/objective | 1 | |
| | Conclusions placed in context of literature | 1 | |
| Limitations | | | |
| | 1. Pick most appropriate score: | | |
| | Limitations not identified accurately | 0 | |
| | Some limitations identified | 1 | |
| | Limitations well addressed | 2 | |
| Innovation of project | | | 2 |
| | 1. Pick most appropriate score: | | |
| | Previously described methods | 0 | |
| | New use for known assessment | 1 | |
| | New assessment methodology | 2 | |
| Relevance of project | | | 2 |
| | 1. Pick most appropriate score: | | |
| | Impractical to most programs | 0 | |
| | Relevant to some | 1 | |
| Clarity of writing | | | 3 |
| | 1. Pick most appropriate score: | | |
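Two of the Table 1 items are defined fully enough to be expressed as simple rules. The following is a minimal sketch, not the reviewers' actual scoring tool; the function names and the use of fractions (0 to 1) for response rates are assumptions made for illustration.

```python
def response_rate_score(response_rate=None):
    """Score the response-rate item of the data collection domain.

    response_rate is a fraction between 0 and 1, or None when the paper
    does not report it (assumed input convention).
    """
    if response_rate is None or response_rate < 0.50:
        return 0  # <50% or not reported
    if response_rate < 0.75:
        return 1  # 50%-74%
    return 2      # >=75%


def data_analysis_score(appropriate, beyond_descriptive):
    """Add the appropriateness item (0 or 1) to the sophistication item (1 or 2)."""
    appropriateness = 1 if appropriate else 0
    sophistication = 2 if beyond_descriptive else 1
    return appropriateness + sophistication


print(response_rate_score(0.72))        # survey with a 72% response rate
print(data_analysis_score(True, True))  # appropriate, beyond descriptive
```

Encoding the thresholds this way makes it explicit that an unreported response rate is scored the same as a rate below 50%.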
Reviewers were excluded from scoring their own publications or publications from their own institution. Also, reviewers did not score papers that they had previously reviewed as part of the editorial process of a journal. Publications were listed alphabetically by first author and each reviewer was assigned a different place to start on the list in an attempt to prevent bias resulting from reviewer fatigue. Each reviewer independently reviewed and rated the publications, and a total rating score was calculated for each article. All rating scores were entered into Microsoft Excel 2007 (Microsoft Inc., Redmond, WA). Using each reviewer’s total rating score for each article, a rank list of all publications was created for each reviewer. The rankings were then averaged to prevent overvaluing of any one reviewer’s scoring. The a priori criteria for papers to be included here as “top papers” were: 1) the average of all reviewers’ rankings of an article placed the article’s rank in the top 10 and 2) there was a minimum of 80% agreement among reviewers that the article was in the individual reviewer’s top 10 ranking.
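The ranking procedure described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the spreadsheet the reviewers used: the function names are hypothetical, tied scores are assumed to share the average of their ranks, and a reviewer who did not score a paper (for example, because of a conflict) is simply left out of that paper's average and agreement calculation.

```python
from statistics import mean


def rank_scores(scores):
    """Rank one reviewer's papers: rank 1 is the highest total score.

    Tied scores share the average of the ranks they span (assumption).
    """
    ordered = sorted(scores, key=scores.get, reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        shared_rank = (i + 1 + j + 1) / 2
        for k in range(i, j + 1):
            ranks[ordered[k]] = shared_rank
        i = j + 1
    return ranks


def top_papers(reviewer_scores, top_n=10, min_agreement=0.8):
    """Apply both a priori criteria to {reviewer: {paper: total_score}}."""
    per_reviewer = {r: rank_scores(s) for r, s in reviewer_scores.items()}
    papers = {p for s in reviewer_scores.values() for p in s}
    mean_rank = {
        p: mean(rk[p] for rk in per_reviewer.values() if p in rk) for p in papers
    }
    # Criterion 1: the averaged rank places the paper in the overall top N.
    overall = sorted(papers, key=lambda p: (mean_rank[p], p))
    selected = []
    for paper in overall[:top_n]:
        # Criterion 2: enough reviewers placed the paper in their own top N.
        votes = [rk[paper] <= top_n for rk in per_reviewer.values() if paper in rk]
        if sum(votes) / len(votes) >= min_agreement:
            selected.append(paper)
    return selected


# Toy data: five reviewers, three papers, and a "top 2" cutoff.
toy = {
    "R1": {"A": 20, "B": 18, "C": 10},
    "R2": {"A": 19, "B": 12, "C": 15},
    "R3": {"A": 22, "B": 17, "C": 11},
    "R4": {"A": 21, "B": 16, "C": 9},
    "R5": {"A": 18, "B": 10, "C": 14},
}
print(top_papers(toy, top_n=2))
```

In the toy data, paper B has the second-best averaged rank but only three of five reviewers place it in their own top two, so it fails the 80% agreement criterion. Averaging ranks rather than raw scores keeps a reviewer who scores consistently high or low from dominating the list.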
A total of 329 papers satisfied the search inclusion criteria. Forty-one papers7–47 survived the exclusion criteria and were scored by each of five reviewers, with a range of scores from 9.25 to 22.75. Five papers that met a priori criteria and had a mean rank of at least seven were considered methodologically superior and are highlighted for review. They are presented in alphabetical order by the surname of the first author.
Review of Publications
Andreatta PB, Maslowski E, Petty S, et al. Virtual reality triage training provides a viable solution for disaster-preparedness. Acad Emerg Med. 2010; 17:870–6.8
Background. Comprehensive training and assessment of emergency workers’ disaster triage knowledge and skills can be logistically complex and resource-intensive. The objective of this study was to compare the use of a fully immersive virtual reality (VR) disaster drill to a live disaster drill using standardized patients (SP) in the teaching and assessment of EM residents’ knowledge and application of disaster triage, using the Simple Triage and Rapid Treatment (START) algorithm.
Methodology. Volunteer EM residents were administered a pretest of use of the START triage algorithm. Residents were then randomized to the VR or live SP disaster drill groups. Both groups performed triage of the same 14 victims from the same mass casualty disaster scenario. Residents’ performance was observed, timed, and scored on a START triage rating scale. Two weeks after the educational intervention, all residents completed a posttest. Measured outcomes included pretest scores, triage rating scores, and posttest scores. Descriptive data, Cohen’s d effect sizes, and Pearson correlation coefficients were calculated to analyze differences and associations between the various outcome measures.
Results. Fifteen EM residents with no prior START training completed all phases of the study. Based on pretest scores, both educational groups were comparable in their baseline knowledge of the START triage algorithm. Triage performance ratings of the VR and SP groups were similar. The mean pretest scores of the two groups were not significantly different, but there was a trend toward improved posttest scores in the SP group.
Strengths of the study. This study used randomization to assess the relative effectiveness of two educational interventions, both designed to teach complex, high-acuity, infrequently used triage skills. The authors used pretest scores to ensure that both groups were comparable in their baseline skills prior to the intervention. Although the sample size was small, this application of immersive virtual reality has rarely been described in EM training.
Relevance for future educational advances. Virtual reality, as a resource for training in high-intensity, low-frequency events, is costly and not readily available to many programs. However, the future collaborative use of virtual reality–based education may help to defray costs, while providing standardized, repeatable education and assessment opportunities for complex clinical training.
Gravel J, Roy M, Carrière B. 44-55-66-PM, a mnemonic that improves retention of the Ottawa Ankle and Foot Rules: a randomized controlled trial. Acad Emerg Med. 2010; 17:859–64.19
Background. The Ottawa Ankle Rules (OAR) have been evaluated and found to be 100% sensitive in predicting the need for ankle radiographs. However, there is some discordance between the prevalence of knowledge of the rules and their implementation in clinical practice. The authors posit that this discrepancy may be due to an inability to recall the components of these decision rules. The objective of this study was to determine if the use of a mnemonic would improve knowledge of the OAR.
Methodology. This was a single-blinded, randomized controlled trial performed in an urban, tertiary care pediatric ED. After enrollment, residents and medical students answered a questionnaire to indicate their knowledge and application of knowledge of the OAR. They were then randomized to one of two educational groups: receiving a mnemonic to recall the components of the OAR or a description of the rules. Participants were retested on the same knowledge questionnaire at 3 weeks and 5 to 9 months after intervention. Differences in mean scores between the intervention and control groups were measured using Student’s t-test, and differences in proportions of perfect scores were compared using chi-square analysis.
Results. Seventy-two percent of participants completed all phases of the study. At 3 weeks, both intervention and control groups demonstrated improvement in their knowledge of the OAR compared to group baselines. The groups’ scores were not statistically different. At long-term retesting, randomization to the intervention (mnemonic) group was associated with higher scores on the retest of knowledge of the OAR.
Strengths of the study. This study uses randomization of learners as a method to analyze the effects on knowledge retention of a simple educational intervention. Pre- and posttesting was used to assess the change in knowledge and persistence of recall. The authors also attempted to measure any cross-contamination of groups during their 3-week retest of knowledge.
Relevance for future educational advances. This study demonstrates the successful use of randomization in a study of an educational intervention to improve knowledge, as demonstrated by recall on a posttest. Future advances should look to analyze the application of knowledge in the clinical setting, the next step in enhancing the use of clinical decision rules.
Harvey A, Nathens AB, Bandiera G, LeBlanc VR. Threat and challenge: cognitive appraisal and stress responses in simulated trauma resuscitations. Med Educ. 2010; 44:587–94.21
Background. Individuals vary in their responses to acute, stressful situations. The authors note that performance impairment may be exaggerated in individuals who have an enhanced subjective and physiologic response to stress. The objective of this study was to determine the relationship between residents’ cognitive appraisals, subjective levels of anxiety, and physiologic responses during simulated trauma resuscitations.
Methodology. For the purpose of the study, “cognitive appraisal” was defined as one’s subjective assessment of a situation as a challenge or a threat, based on the perceived acute demands relative to the available resources. Subjective levels of anxiety were measured using the State-Trait Anxiety Inventory (STAI). Physiologic responses to stress were assessed through the measurement of the peak and change values in salivary cortisol levels. Advanced Trauma Life Support–certified residents voluntarily participated in two simulated trauma resuscitations of different complexity: a low-stress (LS) relatively stable scenario and a high-stress (HS) unstable scenario. Residents’ baseline STAI scores and salivary cortisol levels were measured and compared to scores and levels during and immediately postscenario. Residents gave a cognitive appraisal of each scenario immediately after completing the simulated resuscitation. All residents completed both scenarios in a crossover design, with a washout time between measurements. Dependent variables were the subjective STAI and cognitive appraisal scores and cortisol levels. Independent variables were the LS versus HS scenarios and time. Absolute peaks and changes in mean scores and cortisol levels between baseline and postscenario, and between LS and HS scenarios, were compared using one-way t-tests. Pearson correlation coefficients were calculated to assess the relation between mean scores and levels and the complexity of the resuscitation scenario.
Results. Thirteen residents completed both LS and HS scenarios. STAI scores were significantly higher in the HS scenario groups. Cognitive appraisals suggested that residents perceived HS scenarios as threats, compared to LS scenarios, which were perceived as challenges. Peak cortisol levels were higher during and after HS scenarios. There was a statistically significant positive correlation between peak cortisol level, change in cortisol level, cognitive appraisals of threat, and the HS scenario.
Strengths of the study. This innovative study demonstrates a positive association between residents’ subjective assessments of a potentially stressful acute care event and their physiologic stress response.
Relevance for future educational advances. Training in clinical EM, particularly in the acquisition of complex resuscitation skills, can be stressful. Identifying residents who feel threatened by high-acuity complex clinical scenarios may allow educators to train residents on coping strategies. The resulting effect on clinical performance could translate into improved patient care and a smoother progression towards competency.
Hill C, Reardon R, Joing S, Falvey D, Miner J. Cricothyrotomy technique using gum elastic bougie is faster than standard technique: a study of emergency medicine residents and medical students in an animal lab. Acad Emerg Med. 2010; 17:666–9.22
Background. Cricothyrotomy is an infrequently performed but critical procedure in EM. A number of techniques to facilitate the successful performance of this procedure have been described. The authors compared the speed, efficacy, and ease of a novel variation of the procedure using a gum elastic bougie (bougie-assisted cricothyrotomy technique [BACT]) to the standard open technique of cricothyrotomy.
Methodology. This was a prospective, randomized comparison of two cricothyrotomy techniques performed by inexperienced EM residents and medical students on anesthetized domestic sheep. Volunteer participants were randomized to technique. All participants were shown an instructional video and allowed to familiarize themselves with the equipment. Participants were timed in their performance of the assigned technique and rated the difficulty of the procedure. Time to completion, failure of the procedure, and the participant’s perceptions of difficulty were compared using Wilcoxon rank sum tests.
Results. Twenty-one participants completed the study. The mean time to insert an endotracheal tube was significantly shorter with the BACT than with the standard open technique. The BACT was also rated as significantly easier to perform. Failure rates in the two groups were similar.
Strengths of the study. This study used a prospective, randomized method to compare two procedural techniques. As a result, the participants in each group were similarly matched by level of training and experience with the procedure. This simple and elegant study was able to show statistically significant differences in procedure time and ease of performance.
Relevance for future educational advances. The authors successfully demonstrated the ease of performing the BACT after a simple educational intervention.
Ten Eyck RP, Tews M, Ballester JM, Hamilton GC. Improved fourth-year medical student clinical decision-making performance as a resuscitation leader after a simulation-based curriculum. Sim Healthcare. 2010; 5:139–45.40
Background. The objective of this study was to compare the effect of simulation-based instruction versus case-based group discussion on the performance of fourth-year medical students as resuscitation team leaders.
Methodology. This was a randomized, controlled, single-blinded study of fourth-year medical students. Each student completed an initial individual simulation-based resuscitation case as a team leader. Students were then randomized to one of two educational groups: simulation-based or case-based group discussion of the same standardized case curriculum. After completion of the curriculum, all students completed a second, follow-up individual simulation case, again as a team leader. Eight behavioral outcomes were assessed during all simulations. Mean scores on the simulations were measured between groups based on the educational method. Change in the performance of individual students was measured using paired comparisons of individual initial and follow-up simulation skills.
Results. Sixty-eight students completed all phases of the study. Between-group comparisons of mean performance during the follow-up simulated case indicated better performance on four of the eight behavioral outcomes by the simulation-based educational group, compared to the group discussion group. Within-subject comparison of individual student performance on the initial versus follow-up simulation showed significant improvement in the performance of six of eight outcome behaviors.
Strengths of the study. This study randomized an educational intervention to study the effects of both an initial, discrete simulation experience and an entire simulation-based educational experience, on the team leader performance of individual students and education-assigned groups. All students received the same curriculum, isolating the measured outcomes to the effects of simulation-based teaching.
Relevance for future educational advances. Simulation-based teaching and assessment has become accepted for the purposes of team-based patient care, communication, and clinical leadership skills. This study is further evidence that simulation-based teaching, even in a single encounter, can have an effect on subsequent performance in a simulated assessment (Table 2).
| Variable | All Publications (n = 41) | Highlighted (n = 5) |
|---|---|---|
| Both students and residents | 5 | 2 |
| Experimental or quasi-experimental | 10 | 5 |
| Topics of study | | |
| Location of study | | |
Trends in Medical Education Research in 2010
As in past years,2,3 we performed an observational analysis for various trends in research for the publications meeting our inclusion criteria. The areas identified this year were funding, learner group (medical student, resident, other), study methodology (survey, observational, quasi-experimental/experimental), topic of research, and location of research.
A correlation between quality of study design and funding has been reported in the literature.48 In this year’s papers, nine of the 41 studies (22%) and two of the five highlighted studies (40%) received some type of funding.19,21 Each of this year’s featured studies employed a methodologically superior experimental or quasi-experimental design.8,19,21,22,40 The number of funded studies remained fairly stable over the past three years at 13 (2008), 16 (2009), and nine (2010). Funding sources for EM medical education articles published in 2008–2010 are listed in Data Supplement S1 (available as supporting information in the online version of this paper).
The majority of studies appeared in EM journals, with four published in journals that specialize in the topic of the research28,34,40,43 and one in a journal focused on medical education.21 Of note, these topical journals accounted for two (40%) of the featured research studies.21,40 EM researchers conducted 38 of the 41 studies (93%), 12 of which (29%) involved collaboration with authors from other specialties. A new trend this year was that 27% of the studies were conducted outside the United States (six in Canada and five in other countries). Nine studies were multi-institutional;7,9–11,17,21,31,33,38 however, the primary methodology of these studies was survey.
As in past years, research involving the use of technology accounted for a significant number of publications (34%), including 60% of the highlighted studies. Simulation (22%)8,12,16,21,26,27,35,40,41 and ultrasound (15%)7,13,26,27,39,44 were the technologies employed. As in past years, simulation was chosen to teach and evaluate critical events, especially those that occur infrequently in daily practice, or to introduce inexperienced learners to advanced topics. Andreatta et al.8 compared two simulated methods of training residents to triage disaster victims, one using standardized patients and the other a technologically based virtual reality simulation. Franc-Law et al.16 showed that medical students could perform disaster triage more effectively when trained using simulation instead of conventional methods, while Ten Eyck et al.40 demonstrated that students could effectively serve as team leaders in simulated resuscitations when trained in a simulated environment. Both Loukas et al.26 and Mallin et al.27 demonstrated an improvement in their learners’ ability to gain vascular access on a simulator. Pediatric EM fellows favorably reviewed a simulation-based curriculum in acute care;12 medical students demonstrated improved performance in managing emergency situations on an objective structured clinical examination (OSCE) evaluation;35 and critical care nurses retained their skills in securing a difficult airway on a simulator 1 month after training.41 Harvey et al.21 measured increased salivary cortisol levels in simulation participants whose scenario included a high degree of stress and concluded that interventions addressing stress management skills should be developed.
Twenty-three percent of the studies evaluated the efficacy of a new curriculum, including four of the five featured studies.8,19,22,40 Eight studies focused on workplace issues in the ED.9,18,28,29,32,33,40,43 Residency selection remained an important topic of research this year, and six studies addressed issues relating to predicting which applicants will become successful residents.14,20,31,36,42,45
Common Reasons for Lower Rating Scores
The papers meeting inclusion criteria over the past 3 years are all valuable and have survived the peer review process to be published. In selecting papers that are methodologically superior, the reviewers have noted several trends among educational research papers that score lower using the criteria in Table 1.
Although survey-based studies receive a lower score in the “study design” category, several received an even lower score because they reported a low response rate of <75%. This creates a significant selection bias and makes the results inconclusive. Surveys at a single institution and those with only a postintervention survey also score lower.
Many studies appropriately used objective outcome measures, such as medical knowledge based on a pre- and posttest written exam or observed demonstration of a skill. Some such studies, however, received lower ratings because they enrolled few learners (<30). Studies with small sample sizes have low statistical power. Enrolling more learners over the course of several months or enlisting other sites to create a multi-institutional study can help overcome this methodologic limitation.
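The effect of sample size on power can be illustrated with a rough calculation. The sketch below uses a normal approximation for a two-sided, two-group comparison of means; the effect size of 0.5 standard deviations, the alpha of 0.05, and the group sizes are illustrative assumptions and are not taken from any of the reviewed studies.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n_per_group, effect_size=0.5, alpha=0.05):
    """Approximate power of a two-sided, two-sample comparison of means.

    Uses the normal approximation and ignores the negligible chance of
    rejecting in the wrong direction.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    noncentrality = effect_size * sqrt(n_per_group / 2)
    return z.cdf(noncentrality - z_crit)


for n in (15, 30, 64):
    print(n, round(approx_power(n), 2))
```

Under these assumptions, a study with 15 learners per group would detect a medium-sized effect less than one-third of the time, whereas roughly 64 per group are needed to reach the conventional 80% power.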
Limitations to this analysis of the literature remain similar to those from previous years. Although this year’s article search was meant to be extensive in reviewing the MEDLINE, Scopus, ERIC, and PsychInfo literature databases, it is possible that the article inclusion criteria may have been too narrow, missing some publications.
When rating any research it is possible for bias to exist. Although reviewers did not assess papers that they had been involved in writing or ones they had previously reviewed for a journal, the selection and scoring of publications was not blinded, which may have led to bias. To minimize bias, the reviewers attempted to standardize their individual article ratings through a priori discussions of the rating definitions and rating agreements. The use of rankings limited the variance inherent to individual reviewer ratings.
Comparing the literature of 2010 to 2008 and 2009, the number of published educational research papers meeting our criteria increased from 30 to 36 and then to 41. The number of funded studies increased from 13 in 2008 to 16 in 2009 and then decreased to 9 in 2010. Hopefully the new educational research funding opportunities from SAEM and CORD can establish a more reliable trend toward high-quality projects and papers for 2011. Support of researchers performing medical education research focused on EM will assist academic EPs in implementing innovative educational approaches, based on the most valid and effective evidence.
This critical appraisal of the EM literature provides a snapshot of exemplary educational research in 2010 and highlights advances and trends of research in the field. Each of the highlighted research publications contributes to the growing field of medical education research relevant to EM, while addressing the methods to control, justify, or minimize the limitations that are inherent to this focus. Our highlighting the unique strengths of these high-quality publications is meant to encourage educators to conduct methodologically sound educational research.