Many reported studies of medical trainees and physicians have demonstrated major deficiencies in correctly identifying heart sounds and murmurs, but cardiologists had not been tested. We previously confirmed these deficiencies using a 50-question multimedia cardiac examination (CE) test featuring video vignettes of patients with auscultatory and visible manifestations of cardiovascular pathology (virtual cardiac patients). Previous testing of 62 internal medicine faculty yielded scores no better than those of medical students and residents.
In this study, we tested whether cardiologists outperformed other physicians in cardiac examination skills, and whether years in practice correlated with test performance.
To obviate cardiologists' reluctance to be tested, the CE test was installed at 19 US teaching centers for confidential testing. Test scores and demographic data (training level, subspecialty, and years in practice) were uploaded to a secure database.
The 520 tests revealed mean scores (out of a possible 100, ± 95% confidence interval) in descending order: 10 cardiology volunteer faculty (86.3 ± 8.0), 57 full-time cardiologists (82.0 ± 3.3), 4 private-practice cardiologists (77.0 ± 6.8), and 19 noncardiology faculty (67.3 ± 8.8). Trainees' scores, in descending order, were: 150 cardiology fellows (77.3 ± 2.1), 78 medical students (63.7 ± 3.5), 95 internal medicine residents (62.7 ± 3.2), and 107 family medicine residents (59.2 ± 3.2). Faculty scores were higher among those trained earlier, with longer practice experience.
This study was supported by grants 1R43HL062841-01A1, 2R44HL062841-02, and 2R44HL062841-03 from the National Heart, Lung, and Blood Institute, Bethesda, MD. The funding source was not involved in the design and conduct of the study; collection, management, analysis, and interpretation of the data; and preparation, review, or approval of the manuscript. The authors have no other funding, financial relationships, or conflicts of interest to disclose.
Cardiac physical examination is a declining art,1–5 even though it is a sensitive, specific, and cost-effective tool for screening for cardiac disorders.6–8 At Harbor-UCLA, where we have archived cardiac examination (CE) findings for >10 years with video,9 we found that 25 of 254 patients recorded had new physical findings on bedside recording that materially changed the diagnosis and treatment plan (unpublished data). For example, diminished femoral arterial pulses were noted in a young patient with aortic regurgitation, indicating the presence of previously unsuspected aortic coarctation.
Ideally, CE involves integration of patient history with inspection, palpation, auscultation, and, when appropriate, maneuvers to bring out pathophysiological findings.10–14 Although objective assessment of competence in patient examination remains a challenge,15,16 there is a growing body of literature documenting an alarming lack of competency in current medical trainees and practicing physicians,1,10,17–19 despite the fact that auscultatory training in internal medicine programs has been increasing.20
Classroom testing with a validated multimedia test of CE had been reported in a previous study, but only 7 cardiologists (3 faculty level, 4 private practice) had participated.1 To enlist more academic cardiologists, we provided custom CD-ROMs that could be installed at academic centers that would upload test scores to a central secure database via the Internet. To ensure that remote testing was statistically similar to classroom testing, mean scores from each training level obtained by both methods of testing were compared.
Cardiac Examination Test
A 50-question interactive multimedia CE test (Blaufuss Multimedia, Rolling Hills Estates, CA) was developed to assess competent recognition of auscultatory and visual manifestations of cardiac pathology. This CE test had been used in a previous study1 to evaluate CE skills of 860 medical students, trainees, physicians, and faculty in proctored examinations with answers submitted on printed forms. Of 62 full-time academic physicians previously tested in a classroom setting, only 3 were cardiologists. In preparation for an invited address to the Association of University Cardiologists, the senior author (JMC) obtained permission from the association members to facilitate testing of cardiologists at their respective institutions. For remote testing, CD-ROMs containing the 50-question test were installed in 1 or more testing stations at 19 academic medical centers. Each testing station had a desktop or laptop personal computer and listening devices consisting of headphones or speaker pads to be used with conventional stethoscopes. Participants began the test by entering demographic data including training level, specialty, professional status, and years since completion of training. Although each institution was asked to test primarily members of the cardiology and medical faculty, trainees were invited to participate as well.
The test features a preliminary section in which heart sounds and intracardiac pressures are synchronized with fully labeled animated depictions of the normal left heart, mitral stenosis, and aortic stenosis (Figure 1A–C). These seamlessly looped scenes may be manipulated by stop and slow-motion animation controls. The animations of mitral and aortic stenosis play actual heart sounds recorded from the apex and base, and can be “morphed” to a normal heart for comparison. This preliminary section of the test was designed to reacquaint the participants with the expected auscultatory findings produced by these lesions: the loud first heart sound, opening snap and diastolic rumble of mitral stenosis, and the ejection sound, mid-systolic murmur, and delayed carotid upstroke of congenital aortic stenosis. Other preliminary questions review the effects of respiration on heart sounds, the differentiation of carotid from jugular venous pressure waves, and the use of carotid pulsation for timing auscultatory events. The remaining 22 questions relate to virtual patient examinations (VPEs): video scenes of actual patients who had pathological heart sounds and murmurs recorded through the stethoscope (Figure 1D). Although VPEs are accompanied by a brief medical history, the test questions did not request specific cardiac diagnoses, but rather identification of carotid vs venous pulsations, presence and timing of heart sounds, and systolic and/or diastolic timing of murmurs. Correct timing of events requires coordinated inspection and auscultation. The VPEs were chosen to depict clear, unambiguous findings and had heart sounds and murmurs similar to those depicted in the preliminary section previously described.
Test content was determined using a published survey of internal medicine residency program directors that identified important cardiac findings,21 and Accreditation Council for Graduate Medical Education training requirements for internal medicine residents22 and cardiology fellows.23 We tested for recognition of: (1) sounds (ejection sound, absent apical first sound, opening snap, and split sounds) and (2) murmurs (systolic [holosystolic, middle, and late], diastolic [early, middle, and late], and continuous murmur). Examinees were not asked for a diagnosis, but rather for bedside findings that provided pertinent diagnostic information. Six academic cardiologists reviewed the test, and minor content revisions were made accordingly.
Before taking the test, participants were informed that 0 points would be awarded for unanswered multiple-choice or true/false questions, 2 points would be awarded for every correct answer, and 1 point would be deducted for each incorrect answer. A score of 100 points is awarded if all 50 questions are answered correctly. At the end of the test, the automatically calculated score is revealed to the participant and then transmitted, along with demographic data, via the Internet to a secure central database.
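The scoring rule above can be expressed as a short function. This is an illustrative sketch only, not the test software's actual implementation; the response encoding (`"correct"`, `"incorrect"`, `"unanswered"`) is assumed for demonstration.

```python
def ce_test_score(responses):
    """Score a 50-question CE test: +2 for each correct answer,
    -1 for each incorrect answer, 0 for each unanswered question."""
    points = {"correct": 2, "incorrect": -1, "unanswered": 0}
    return sum(points[r] for r in responses)

# All 50 questions answered correctly yields the maximum score of 100.
print(ce_test_score(["correct"] * 50))  # → 100
```

Note that the penalty for an incorrect answer makes guessing costly: 40 correct, 5 incorrect, and 5 unanswered questions yield 75 points, not 80.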
To test for differences in CE competency by training level, we compared the mean test scores of students, trainees, and physicians, using 1-way analysis of variance (ANOVA; F test). The Levene statistic was computed to test for homogeneity of group variances. After a significant F score, a priori pairwise mean comparisons were made using the Newman-Keuls test (for homogeneous group variances) or the Games-Howell test (for heterogeneous group variances). Linear regression for years in practice vs test scores (for cardiology and internal medicine faculty) was performed using the least squares method. To determine whether there was a difference in remote vs classroom testing, the 520 remote test scores were compared with the 823 classroom test scores from a previous study, by performing a 2-factor ANOVA with the univariate generalized linear model. Tests of between-subjects effects used the dependent variable of test score against the factors of training level and remote vs classroom testing. Analyses were performed using SPSS version 17.0 (SPSS Inc., Chicago, IL).
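To illustrate the first step of the analysis above, the one-way ANOVA F statistic can be computed directly from per-group score lists. This is a minimal pure-Python sketch with made-up score samples; the study's actual analysis was performed in SPSS.

```python
from statistics import mean

def one_way_anova_f(groups):
    """One-way ANOVA F statistic for a list of groups,
    each group being a list of test scores."""
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of observations
    grand = mean(s for g in groups for s in g)
    # Between-groups and within-groups sums of squares.
    ssb = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ssw = sum((s - mean(g)) ** 2 for g in groups for s in g)
    # F = mean square between / mean square within.
    return (ssb / (k - 1)) / (ssw / (n - k))

# Hypothetical score samples for three training levels.
f_stat = one_way_anova_f([[62, 64, 63], [77, 78, 76], [82, 83, 81]])
```

A large F indicates that between-group variation dominates within-group variation, after which the pairwise post hoc comparisons described above locate which groups differ.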
From August 6, 2004 to October 15, 2008, we received 527 test scores with demographic data from 19 medical centers. From these 527 participants, 7 had specified a training level of “other” and were not used in this analysis. Likewise, 37 of the 860 classroom test scores collected in a previous study between July 10, 2000 and January 5, 2004 were marked as “other” and were not used.
Table 1 provides the mean values, standard deviations, and 95% confidence intervals for each training level group for both remote and classroom testing. The sample size of remotely tested groups over classroom testing was 57 vs 3 for cardiology faculty, 10 vs 0 for cardiology volunteer clinical faculty, 4 vs 4 for private-practice cardiologists, and 150 vs 85 for cardiology fellows. Students, residents, and noncardiology faculty were also tested remotely. Figure 2 plots mean CE test scores by training level. With remote testing, mean scores were highest for cardiology volunteer clinical faculty (86.3), cardiology faculty (82.0), cardiology fellows (77.3), and cardiologists in private practice (77.0). Mean scores for medical students (63.7), internal medicine (62.7) and family medicine (59.2) residents, and noncardiology faculty (67.3) were ≥10 points lower than scores for cardiologists and fellows. The difference between the cardiology groups and the others was significant (P < 0.05).
Table 1. CE Test Scores From Remote and Classroom Testing
Abbreviations: CE, cardiac examination; CI, confidence interval; FM, family medicine; IM, internal medicine; NT, not tested; SD, standard deviation.
[Table 1 data not reproduced: mean scores, SDs, and 95% CIs by training level for remote testing (N = 520) and classroom testing (N = 823), covering cardiology faculty, cardiology volunteer faculty, cardiology private practice, cardiology fellows, noncardiology faculty, noncardiology private practice, IM and FM residents, and medical students; key values appear in the text.]
Comparison of the 2 studies (classroom vs remote) for the different training levels via 2-factor ANOVA revealed that the P value for the interaction between training level and classroom vs remote testing was 0.28, indicating that there was no significant difference in test scores obtained for each training level by either test method. Therefore, we combined the 2 sets of results for a common overall assessment of training level differences. Comparison of differences in test scores by training level had a P value < 0.0001, whereas differences by classroom vs remote testing had a P value < 0.01. The partial eta squared (η2, or the relative contribution of each factor to the observed difference in scores) was heavily weighted toward training level (partial η2 = 0.161) over that of remote vs classroom testing (0.016).
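Partial eta squared, as used above, is simply the effect sum of squares divided by the sum of the effect and error sums of squares. The one-line sketch below illustrates the formula; the numeric inputs are made up for demonstration and are not the study's actual sums of squares.

```python
def partial_eta_squared(ss_effect, ss_error):
    """Partial eta squared: the proportion of variance attributable to a
    factor, relative to that factor plus error:
    SS_effect / (SS_effect + SS_error)."""
    return ss_effect / (ss_effect + ss_error)

# Illustrative values: an effect accounting for 16.1% of the
# combined effect-plus-error variance.
eta_sq = partial_eta_squared(16.1, 83.9)
```

On this scale, a partial η² of 0.161 for training level vs 0.016 for testing method means training level accounted for roughly ten times as much of the score variation as the testing method did.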
We did not test noncardiology private-practice physicians in the new remote-testing study, nor cardiology volunteer faculty in the earlier classroom-testing study, and were therefore unable to measure directly how these groups differed between the 2 studies. We assume that the between-study differences for these 2 groups would be similar to the differences observed in the training levels that were tested in both studies.
Based on the results of the 2-factor ANOVA, we had reasonable confidence that remote and classroom testing could be combined to make comparisons across all training levels, from medical students to cardiology faculty. Table 2 shows the results of Student-Newman-Keuls comparisons of mean CE test scores, which confirmed the observed difference in scores between cardiology faculty, cardiology volunteer faculty, cardiologists in private practice, and cardiology fellows vs every other training level (P < 0.05).
Table 2. Stratification of Test Scores Into Distinctively Similar Groups (N = 1343)
Subset for α = 0.05
Test scores from remote and classroom testing were similar (P = 0.28, Figure 2), allowing them to be combined for the analysis in this table. Mean scores that fall into distinct groupings after statistical comparisons (by the Student Newman-Keuls test) are listed in columns. Groups 1–3 show overlapping mean scores for cardiologists and cardiology fellows. Group 4 shows that mean scores for noncardiologists, trainees and students were significantly lower (P < 0.05).
Abbreviations: FM, family medicine; IM, internal medicine.
[Table 2 data not reproduced: training-level groups arranged into homogeneous subsets at α = 0.05, with the cardiology groups (faculty, volunteer faculty, private practice, and fellows) occupying the higher-scoring subsets.]
To measure the effect of years in practice on CE test scores, 91 faculty and volunteer faculty cardiologists and internists for whom years since completion of training were available were grouped into 5-year increments. The results are plotted in Figure 3. The highest scores were obtained by the most senior faculty members, with scores decreasing linearly with more recent training (P < 0.004).
Remote and confidential testing of full-time academic faculty cardiologists provided access to a previously untested population cohort that performed (as would be expected) better than any other group of examinees with the exception of cardiologists on the voluntary teaching faculty. Remote testing appeared to be a viable alternative to classroom testing by installing the CE test on computers at teaching centers where cardiologists were able to take the test and upload their answers confidentially.
When the number of years after completing training is compared with test scores (Figure 3), we observed generally higher CE test scores with increasing seniority. There are 2 potential interpretations. The more comforting explanation is that performance improves with years of experience. A less comforting explanation is that older physicians benefited from superior teachers at the time of their medical school and residency training, and that later experience had little impact on CE skills. Our earlier report of classroom testing showed no improvement in CE test scores after the second year of medical school; in fact, internal medicine faculty tested no better than third-year medical students.1 For this reason, we believe it is less likely that the increase in CE test scores is simply due to years of experience. Rather, we believe that competence in bedside CE requires: (1) preparation, (2) supervised exposure to patients with cardiac findings, and (3) patient exposure with critical reinforcement.
Preparation in CE begins with acquisition of a basic knowledge of cardiac physiology and pathophysiology in the first 3 years of medical school. Supervised exposure to patients with cardiac findings is principally confined to the third year in most medical schools,14 but it can be enhanced by taking fourth-year electives offering supervised exposure by competent mentors. “Critical reinforcement” implies a commitment to confirming or refuting one's bedside diagnostic impressions by critical review and correlation with available imaging and/or hemodynamic studies performed on that patient. Cardiology fellows indeed benefit from exposure to patients with a wider spectrum of cardiac findings than trainees would be expected to encounter in family and internal medicine. Moreover, fellows augment this natural advantage in patient population by ready access to special studies that provide the necessary critical reinforcement to improve their examination skills. We propose that this mode of exposure is the means by which cardiology fellows obtained better CE test scores, and why cardiologists continue to improve over time.
On the other hand, patient exposure without critical reinforcement seems to be the norm for the average medical resident, explaining their lack of advancement in scores despite clinical encounters with hundreds of patients. Some form of critical reinforcement is needed for all trainees to improve their cardiac examination skills.
The lack of improvement in scores achieved by full-time faculty members in internal medicine may also represent the absence of critical reinforcement during their own training and therefore suspension of any further advancement in this complex skill. Unfortunately, internists have become largely responsible for teaching CE to medical students and residents.23 Cardiac examination is unlikely to improve at any training level without first improving the skill levels of the faculty who teach it.
Compounding the problem of critical reinforcement is the substantial drop in available teaching material: teaching hospitals no longer have a ready population of patients with heart valve disease from which to learn. Although schools and training programs continue to teach CE on the inpatient services, patients with heart valve disease are now almost exclusively managed as outpatients, where they are followed, treatment plans are made, and preoperative assessments are done. When these patients do arrive at the hospital for same-day surgery, they are either wheeled directly into the surgical suite or spend very little time in their hospital rooms while being transported elsewhere for tests, minimizing the time available for obtaining a cardiac history and bedside examination and preventing access on teaching rounds. Further, cardiac patients are often festooned with electrocardiographic electrode patches strategically placed over the precordium, inhibiting stethoscopic access to desired listening areas; in addition, ambient beeps emanating from monitors, infusion pumps, and ventilators add to a cacophony that inhibits contemplative auscultation.
The consequence of cursory CEs that miss important findings is delay in proper diagnosis and appropriate treatment. These "objective findings" in the medical record can and do affect subsequent workup and therapy. This is especially true when the initial workup in a patient's chart contains the statement "S1, S2, no MRG" (first heart sound; second heart sound; no murmurs, rubs, or gallops). This terse entry too often becomes a "rubber stamp" entry on all subsequent history and physical entries in the chart. Although we have documented 10% of patients in our collection as having materially different cardiac findings, it is likely that this error rate underestimates the true magnitude of the problem. Our patients have come from a teaching hospital that has emphasized CE competency in its trainees and attending physicians; other hospitals may have even higher error rates. Further, we do not know the false-negative rate, where patients have important cardiac findings that remain undetected, because we do not routinely screen patients with normal findings.
Fortunately, convenient “on demand” access to patients with important CE findings is possible by the use of patient surrogates. Several approaches that have been used are: (1) audiotapes and compact disks, (2) manikins, and (3) multimedia VPEs. Audio-only programs are disadvantaged by the inability to use visible or palpable timing signals to distinguish systole from diastole, or to discern invaluable ancillary diagnostic information inherent in the appearance of the patient or the quality of the carotid, jugular, and precordial pulses. Manikins (ie, artificial patients) have also been employed, but are costly, space consuming, and require skilled maintenance for optimal performance. As simulations, manikins are the result of editorial choices on what to leave in, and what to leave out, of the training experience. At best, they present a particular lesion as an archetype (sacrificing the spectrum of presentations a disease may have in nature). At worst, manikins may introduce errors based on an incomplete understanding of the disease.24 In addition, the simulated sounds and murmurs are never very realistic and probably do no more than help the learner with timing.
Some form of objective, reproducible, and realistic testing is necessary if CE skills are to improve. The traditional objective structured clinical examination, which employs standardized patients (healthy individuals trained to deliver a medical history and symptoms), cannot adequately portray abnormal heart sounds and murmurs with altered carotid and jugular venous pulsations. Some medical schools have attempted to supplement these stations with audio recordings of heart sounds, but the lack of a timing reference (either a carotid, jugular, or apical pulsation) severely limits their utility. In contrast, the National Board of Medical Examiners has incorporated actual heart sounds and animated virtual patients into its US Medical Licensing Examination. We believe that this form of testing, with synchronized pulsations that have realistic contours that match the pathology portrayed, is superior to using standardized patients. For residents, the dearth of an inpatient population with bedside cardiac findings means that what patient encounters they do have must be supplemented with VPEs. For any training level, a minimum of 5 hours of practice appears to be necessary to create any significant improvement in CE skills,10 and 12 hours of practice is probably necessary for the skills to improve without further intervention.14
Computer technology has advanced rapidly over the past decade, making multimedia programs reliable, reproducible, and affordable, and capable of being transmitted or downloaded over the Internet. The same technology used for CE testing can also be used for CE training. These programs can be studied by individuals or small groups, often a few steps away from the patient's bedside. The immediacy of critical reinforcement can make the acquisition of CE skills less dependent on a dwindling population of master clinicians, and more widely available to physicians who need them.
Remote testing of previously tested training groups (cardiology volunteer faculty, cardiology fellows, residents, students, and noncardiology faculty) revealed slightly higher mean CE test scores than classroom testing, although this difference was generally not significant. For medical students, however, mean scores from remote testing were significantly higher (63.7 vs 56.8, P < 0.05). One explanation is that remote administration allowed students more time to contemplate their answers, because they could complete the test at their own pace, as opposed to classroom administration of the test, which requires that all participants complete answers at the same pace. A more likely explanation is that the medical students tested remotely were a special population of students with slightly greater interest or skill in cardiology; 60 of 78 students tested remotely were in their fourth year, presumably during their elective cardiology clerkship. Noncardiology faculty tested remotely did so at the behest of a cardiologist, whereas the majority of noncardiology faculty members tested in a classroom setting were attendees of a heart sounds workshop at an annual meeting of the American College of Physicians. Therefore, we believe that the observed differences in remote and classroom testing were probably the result of mild sample bias. For both testing methods, some selection bias was probably at work, because people who were more confident in their skill would be more likely to volunteer to take the test, whereas those less confident would be more likely to avoid it. Therefore, the actual skill levels for each training level are probably worse than reported here.
Academic cardiologists outperform other medical faculty in an objective and reproducible multimedia test of CE, reinforcing the validity of using VPEs to test this skill. Remote and confidential testing of CE skills is a viable alternative to low-stakes proctored examinations in the classroom, yielding comparable scores, and may facilitate wider assessment of CE skills, especially for individuals, including faculty, who are reluctant to be tested yet would benefit from confidential feedback. Higher CE test scores for cardiology fellows were again confirmed, whereas test scores for medical students, residents, and noncardiology faculty did not differ significantly, again confirming lack of significant improvement for noncardiologists at any training level past medical school. Cardiac examination test scores appear to diminish with more recently trained faculty, raising the troublesome prospect that CE skills in medicine will continue to decline as the older generation of master clinicians retires and is replaced by more poorly trained physicians.
The authors would like to thank Drs. Donald D. Brown, FACC; Lawrence S. Cohen, FACC; James C. Fang, FACC; Gottlieb C. Friesinger, FACC; Allan S. Jaffe, FACC; Joseph V. Messer, FACC; James A. Shaver, FACC; Rebecca Shunk, MD; and Robert J. Siegel, FACC for their commitment in installing and encouraging trainees and faculty to take the CE test at their respective institutions. The authors also thank Peter Christenson, PhD at Harbor-UCLA for his expertise and helpful suggestions in statistical analysis. Video recording of patients and testing of medical students and postgraduate trainees were performed under the aegis of the Institutional Review Board of the Los Angeles Biomedical Research Institute at Harbor-UCLA Medical Center. All authors had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.