Presented at the Doris Duke Clinical Research Fellows Meeting, Chapel Hill, NC, May 2009; and the American College of Emergency Physicians Scientific Assembly, Boston, MA, October 2009.
Original Research Contribution
The Relationship Between Shift Work, Sleep, and Cognition in Career Emergency Physicians
Article first published online: 5 JAN 2012
© 2012 by the Society for Academic Emergency Medicine
Academic Emergency Medicine
Volume 19, Issue 1, pages 85–91, January 2012
How to Cite
Machi, M. S., Staum, M., Callaway, C. W., Moore, C., Jeong, K., Suyama, J., Patterson, P. D. and Hostler, D. (2012), The Relationship Between Shift Work, Sleep, and Cognition in Career Emergency Physicians. Academic Emergency Medicine, 19: 85–91. doi: 10.1111/j.1553-2712.2011.01254.x
Funded by the Doris Duke Clinical Research Fellowship Program.
PDP—The project described was supported by Award Number KL2 RR024154 from the National Center for Research Resources. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Center for Research Resources or the National Institutes of Health. No other disclosures. CWC—supported in part by an NHLBI grant U01HL077871. CM—NIH CTSA award funded the support for the statistical analysis. No other disclosures. The rest of the authors have no disclosures or conflicts of interest to report.
Supervising Editor: Jacob Ufberg, MD.
- Issue published online: 17 JAN 2012
- Received March 22, 2011; revision received July 22, 2011; accepted July 28, 2011.
Objectives: The 24-hour physician coverage of the emergency department (ED) requires shift work, which can result in desynchronosis and cognitive decline. We measured changes in cognition and sleep disturbance in attending emergency physicians (EPs) before and after day and overnight shifts.
Methods: Thirteen EPs were tested before and after day and overnight shifts using the Paced Auditory Serial Addition Test (PASAT), the University of Southern California Repeatable Episodic Memory Test (REMT), the Trail Making Test (TMT), and the Stroop Color-Word Test. Sleep quality and fatigue were assessed using the Pittsburgh Sleep Quality Index (PSQI) and Chalder Fatigue Questionnaire (CFQ). Saliva samples were collected from each physician immediately before and after day shifts and night shifts for neurohormonal assays.
Results: Significantly fewer words were recalled on the REMT after both day (−2.4, 95% confidence interval [CI] = −4.4 to −0.4) and overnight shifts (−4.6, 95% CI = −6.4 to −2.8). There was a significant postshift increase in words recalled from the last third of the REMT list after overnight shifts (6.6, 95% CI = 2.8 to 10.4). Sleep quality was worse in EPs (mean PSQI = 4.8, SD ± 2.5) compared to the normal population, with 31% of subjects reporting poor sleep quality. Postshift fatigue was correlated with the perceived difficulty of the shift. Salivary cortisol and melatonin demonstrated diurnal variation consistent with normal circadian rhythms. Morning cortisol peak was decreased or delayed in samples from physicians after a night shift.
Conclusions: These data indicate that short-term memory declines after day and overnight shifts and confirm the high incidence of disturbed sleep in this population.
Emergency departments (EDs) provide care for patients at all times. To provide continuous service, emergency physicians (EPs) typically work rotating day, afternoon, and overnight shifts. It is well known that shift work has detrimental effects on health,1 mood, concentration, and metabolism as consequences of both sleep deprivation and desynchronosis or dysregulation of the circadian rhythm.2 This has implications not only for patient safety, but also for physician career longevity and workforce turnover.
Most studies of desynchronosis in health care providers have used simulated shift work or have focused on medical trainees. There are few data about performance of career health care workers under actual working conditions.3,4 Furthermore, circadian variations in melatonin and cortisol are often used as reliable biomarkers of circadian phase and entrainment, but are usually studied in laboratories with controlled lighting, diet, and sampling, rather than during real-life shift work. There are few data about the ability to detect circadian changes using cortisol and melatonin sampled in field conditions.
This study examined whether attending EPs had detectable declines in cognition following shifts, reduced sleep quality, or changes in hormonal rhythms as a consequence of shift work. Specifically, we describe attention, memory, executive function, and impulsiveness in attending EPs before and after day and overnight shifts. Additionally, we determined whether a simple collection method for cortisol and melatonin could detect changes in the circadian phase of cortisol and melatonin levels between physicians on day shifts and night shifts as a marker of neurohormonal circadian rhythm.
This was a prospective, repeated-measures study of cognition at a university hospital ED. It was approved by the institutional review board, and written informed consent was obtained from all participants.
Study Setting and Population
The adult ED serves a Level I trauma center and stroke center and treats approximately 52,000 patients annually. Participants (n = 13) were a sample of attending EPs who worked at the university ED at least one overnight shift and at least one daylight shift per month (Table 1). Physicians taking prescription medications known to affect cognition (e.g., modafinil) were excluded. Caffeine use was not restricted. Twenty-five attending physicians worked at the university hospital during the study period. Fifteen were eligible to participate between January 2009 and April 2009. One physician declined to participate, and one was subsequently excluded for prescription medication use.
|Age (yr)||38.2 ± 7.0 (31–52)|
|Years as EM attending||8.1 ± 6.6 (1–17)|
|Number of day shifts per month||7.1 ± 3.4 (4–14)|
|Number of overnight shifts per month||2.5 ± 1.4 (1–4)|
|Body mass index|
|Underweight or normal (BMI < 25)||6 (46.2)|
|Overweight or obese (BMI ≥ 25)||7 (53.8)|
|Nonsmoker||13 (100)|
|1–3 drinks/week||7 (53.8)|
|4–10 drinks/week||3 (23.1)|
|11+ drinks/week||1 (7.7)|
|General health ranking|
|Ever told by doctor you have|
|High blood pressure||2 (15.4)|
|Lung/breathing problems||2 (15.4)|
Each subject was tested 30 minutes before and immediately after at least one daylight shift and at least one overnight shift (a set). Within a set, any two testing periods were at least 72 hours apart. If a subject agreed to participate in multiple sets, the sets were required to be at least 14 days apart. Subjects were allowed to participate in up to three sets. If a subject had consecutive day shifts or overnight shifts scheduled, he or she was tested on his or her first shift of that series. The regularly scheduled shifts in the ED did not vary during the study period. Overnight shifts were defined as those spanning midnight and both beginning before typical bedtime and ending after typical waking times. Daylight shifts allowed the physician to sleep during the night both the day before and the day following the shift.
Subjects completed the four neuropsychiatric tests in a private room adjacent to the ED before and after the shifts. During every session, the tests were given in the same order and in the same fashion, and each subject was asked to rank the perceived difficulty of the shift by marking a 10-cm visual analog scale. The test administrator (MSM) was experienced at proctoring these tests and administered the tests every time for consistency. Most stimuli (e.g., lists of words and numbers) were delivered from digital recordings to ensure consistency of presentation.
Additionally, at the time of enrollment, subjects completed a background questionnaire with demographic information, the 19-item Pittsburgh Sleep Quality Index (PSQI),5 and the 11-item Chalder Fatigue Questionnaire (CFQ).6 Subjects also completed the CFQ and PSQI monthly for the duration of their enrollment (1 to 3 months per subject). The CFQ was also completed at the end of each shift.
The 2-second Paced Auditory Serial Addition Test (PASAT)7 assessed sustained and divided attention and rate of information processing. Subjects were presented with 61 single-digit numbers on an audio recording at a rate of one every 2 seconds and were required to add each new number to the one immediately prior to it. The score is the number of correct sums given (out of 60 possible). Decreasing scores would indicate deteriorating vigilance during repeated tasks or impaired ability to respond simultaneously to multiple tasks.
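The PASAT scoring rule described above can be sketched in a few lines. This is an illustrative sketch only, not the scoring software used in the study; the function name `pasat_score` and the use of `None` for an omitted response are assumptions made for the example.

```python
def pasat_score(digits, responses):
    """Count correct PASAT sums.

    digits: the presented single-digit numbers, in order (61 in the study).
    responses: the subject's spoken sums, one per digit after the first
               (60 in the study); None marks an omitted response
               (an assumption of this sketch).
    """
    if len(responses) != len(digits) - 1:
        raise ValueError("expected one response per digit after the first")
    # The correct answer at each step is the current digit plus the previous one.
    expected = [prev + cur for prev, cur in zip(digits, digits[1:])]
    return sum(1 for exp, resp in zip(expected, responses) if resp == exp)
```

With 61 digits there are 60 adjacent pairs, which is why the maximum score is 60.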
The University of Southern California Repeatable Episodic Memory Test (REMT)8 was used to measure immediate memory span, new learning, recognition memory, and susceptibility to interference. The subject listens to a recording of a list of 15 words that he or she must then recall. This is performed three times, each time with the same words in a different order. The subject is then read a list of 45 words from which he or she must identify the words that were in the original list of 15 words. Finally, the subject hears a different list of 15 words. Following this, he or she is read sets of three words from which he or she must identify the words that were in the list. The scores include the total number of correctly recalled words, number of perseverations (words repeated during the recollection period), and number of intrusions (words recalled that were not on the list). Recall consistency was calculated as the number of correctly recalled words on trial 2 also recalled on trial 1, plus the number of correctly recalled words on trial 3 recalled on trial 2 divided by the total number of correctly recalled words on trials 1 and 2. Serial position measures are computed as the percentage of correctly recalled words from each 15-word list from the primacy region (words 1 to 5), the midlist region (words 6 to 10), or the recency region (words 11 to 15). To minimize learning effect, all seven published word lists were used in this study. Recall of more words near the end of the list potentially indicates the individual is not moving new information into long-term memory. For the EP, this might correspond to failing to recall details of the patient’s medical history or details of the physical exam later in the course of care.
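The recall consistency and serial position definitions above can be illustrated with a short sketch. It is not the authors' scoring code: the function names are hypothetical, and reporting recall consistency as a percentage is an assumption inferred from the magnitude of the values reported in the Results.

```python
def recall_consistency(trial1, trial2, trial3):
    """Recall consistency as defined in the text, expressed as a percentage.

    Each argument is the collection of correctly recalled words on that trial.
    Numerator: words on trial 2 also recalled on trial 1, plus words on
    trial 3 also recalled on trial 2. Denominator: correct words on
    trials 1 and 2 combined.
    """
    t1, t2, t3 = set(trial1), set(trial2), set(trial3)
    denominator = len(t1) + len(t2)
    if denominator == 0:
        return 0.0
    return 100.0 * (len(t2 & t1) + len(t3 & t2)) / denominator


def serial_position_percentages(presented, recalled):
    """Share of correctly recalled words that came from each list region.

    presented: the 15 words in the order they were read on this trial.
    recalled: the words the subject produced (intrusions are ignored).
    """
    if len(presented) != 15:
        raise ValueError("expected a 15-word list")
    recalled_set = set(recalled)
    counts = {"primacy": 0, "midlist": 0, "recency": 0}
    for position, word in enumerate(presented, start=1):
        if word in recalled_set:
            if position <= 5:
                counts["primacy"] += 1
            elif position <= 10:
                counts["midlist"] += 1
            else:
                counts["recency"] += 1
    total = sum(counts.values())
    if total == 0:
        return {region: 0.0 for region in counts}
    return {region: 100.0 * n / total for region, n in counts.items()}
```

Because the same words are presented in a different order on each trial, the serial position function has to be given the presentation order for the specific trial being scored.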
The Trail Making Test (TMT) (forms A and B only) assessed motor speed and attention and has been used in patients with brain pathology to evaluate general brain functions.9 The subject is required to trace a trail through the symbols in correct order. The number of correct connections is counted, and the overall time to complete the task is recorded in seconds. When combined with the Stroop Color-Word Test, the battery is an assessment of executive function that, if reduced, could impair planning, decision-making, and troubleshooting skills.
The Stroop Color-Word Test measured cognitive processing, cognitive flexibility, resistance to interference from outside stimuli, and ability to cope with cognitive stress.10 In the Color Task, the individual reads aloud a list of 112 color names in which no name is printed in its matching color. In the Color-Word Task, the individual names the color of ink in which the color names are printed. The number of words read aloud accurately and the number of instances in which the subject self-corrects are recorded. Reduced performance on this test would correspond to a physician being more susceptible to distraction.
The PSQI measures sleep quality across seven constructs: subjective sleep quality, sleep latency (the time from lying down for sleep to the start of actual sleep), sleep duration, habitual sleep efficiency (the proportion of actual sleep to time spent in bed), sleep disturbances, use of sleeping medications, and daytime dysfunction.5 The construct-specific scores are calculated on a scale of 0 to 3 and are aggregated into a global PSQI score ranging from 0 to 21. A global score of 6 or higher is suggestive of poor sleep quality. Previous psychometric testing showed good internal consistency and validity in distinguishing good from poor sleepers.5
The CFQ measures two constructs: physical and mental fatigue.6 Each item uses a four-point Likert scale (always, sometimes, rarely, never), scored as 0, 0, 1, 1, and the item scores are summed to give a total score. Scores of 4 or higher indicate severe mental and physical fatigue. Psychometric testing of the survey has shown good reliability and acceptable validity.6 For the purposes of this study, we modified the items to reference the EPs’ work setting (e.g., base version: “Do you have problems with tiredness?”; our version: “Do you have problems with tiredness during your emergency medicine shifts?”). One item was omitted from the version administered after shifts because it did not apply to that situation.
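The bimodal CFQ scoring described above (0, 0, 1, 1 per item, summed, with 4 or higher flagged as severe) can be sketched as follows. This is a minimal illustration, not the instrument's official scoring tool; it assumes each response is coded as the position (0 to 3) of the selected option in the order the questionnaire lists them, and the function names are hypothetical.

```python
SEVERE_FATIGUE_CUTOFF = 4  # "Scores of 4 or higher" per the text


def cfq_bimodal_score(responses):
    """Sum of bimodally scored items.

    responses: one integer per item, 0-3, giving the position of the chosen
    option (an assumed coding). The first two options score 0 and the last
    two score 1.
    """
    for response in responses:
        if response not in (0, 1, 2, 3):
            raise ValueError("each response must be 0, 1, 2, or 3")
    return sum(1 for response in responses if response >= 2)


def has_severe_fatigue(responses):
    """True when the bimodal total reaches the cutoff."""
    return cfq_bimodal_score(responses) >= SEVERE_FATIGUE_CUTOFF
```

The same functions apply to the shortened postshift version, since the score is a simple sum over however many items were administered.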
Laboratory Assays. Cortisol levels were assayed by enzyme immunoassay according to the manufacturer’s instructions (Salimetrics, State College, PA). Melatonin levels were assayed by ELISA (Direct Saliva Melatonin ELISA kit, Buhlman, Schönenbuch, Switzerland).
Each subject contributed at least one daylight shift score (preshift – postshift) and one overnight shift score. For subjects providing two or three scores, the “replicates” provided information on the within-person variability of the change score. We first used simple plots of day and night shift pre/post scores to assess patterns of change and detect any unusual observations. Using the first set of observations for each physician, we estimated the mean change for each shift and its corresponding standard deviation (SD) and 95% confidence interval (CI). Using all replicates, we fit a linear mixed model to the postshift score, with fixed effects for shift (day/night) and preshift score and a random physician effect to account for repeated measures within physician. From this model we estimated the within-physician variability and an adjusted difference in scores along with its 95% CI.
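One way to write down the two analyses described above is sketched here. The notation is introduced for illustration and does not come from the original text; in particular, the normality assumptions are the conventional ones for such a model and are assumed here, and the text does not state exactly how the adjusted difference δ reported in Table 2 is derived from the fitted coefficients.

```latex
% First-set analysis: mean change and its 95% CI for one shift type,
% where d_i is the change score of physician i and n = 13
% (so t_{0.975,\,12} \approx 2.179).
\bar{d} \;\pm\; t_{0.975,\,n-1}\,\frac{s_d}{\sqrt{n}},
\qquad
s_d = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(d_i - \bar{d}\right)^2}

% All-replicates analysis: linear mixed model for physician i, session j.
y^{\text{post}}_{ij}
  = \beta_0
  + \beta_1\,\text{shift}_{ij}
  + \beta_2\,y^{\text{pre}}_{ij}
  + b_i
  + \varepsilon_{ij},
\qquad
b_i \sim \mathcal{N}\!\left(0,\sigma_b^2\right),\quad
\varepsilon_{ij} \sim \mathcal{N}\!\left(0,\sigma^2\right)
```

Here shift is an indicator (day or night), b_i is the random physician effect, and σ² corresponds to the within-physician variability mentioned in the text.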
Cortisol and melatonin levels were compared between overnight shift and daylight shift samples to test whether subjects shifted their neurohormonal circadian rhythms when working on night shifts. Time was blocked into epochs: morning (04:01–10:00), midday (10:01–16:00), evening (16:01–22:00), and overnight (22:01–04:00). The influence of time and shift were examined using a general linear model. Post hoc comparisons between daylight shift and overnight shift data were made using t-tests. When data within time epochs deviated from normality, a nonparametric Wilcoxon test was used for comparisons. Nonnormal data are presented as median (interquartile range).
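The time blocking described above can be expressed as a small helper. This is a sketch for illustration, with a hypothetical function name; it works at minute resolution and ignores seconds, which is an assumption about how sample times were recorded.

```python
from datetime import time


def time_epoch(sample_time: time) -> str:
    """Map a clock time to one of the four 6-hour epochs used in the analysis."""
    minutes = sample_time.hour * 60 + sample_time.minute
    if 4 * 60 + 1 <= minutes <= 10 * 60:
        return "morning"    # 04:01-10:00
    if 10 * 60 + 1 <= minutes <= 16 * 60:
        return "midday"     # 10:01-16:00
    if 16 * 60 + 1 <= minutes <= 22 * 60:
        return "evening"    # 16:01-22:00
    return "overnight"      # 22:01-04:00, wrapping past midnight
```

For example, `time_epoch(time(0, 23))` falls in the overnight block, while `time_epoch(time(10, 0))` is still counted as morning because each block includes its closing boundary.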
Testing took 20 to 25 minutes and was well tolerated by the subjects both before and after their shifts. The REMT total words recalled decreased after both daylight (31.3 [SD ± 5.2] pre vs. 28.9 [SD ± 5.0] post) and overnight shifts (31.6 [SD ± 5.2] pre vs. 27 [SD ± 6.2] post). REMT recall consistency decreased significantly after both daylight (82.2 [SD ± 9.7] pre vs. 72.6 [SD ± 13.0] post) and overnight shifts (78.5 [SD ± 11.0] pre vs. 68.2 [SD ± 15.0] post). After overnight shifts, there was a decrease in the tendency to recall words from the midlist region (29.4 [SD ± 5.4] pre vs. 24.6 [SD ± 8.0] post) and an increase in the tendency to recall words from the recency region (37.5 [SD ± 4.4] pre vs. 44.2 [SD ± 10.3] post; Figure 1).
In the three trials of recall on the REMT, there was a decrease in number of words recalled in all trials after overnight shifts and in trials 1 and 3 after daylight shift (Figure 1). Forced intrusions increased after overnight shifts (1.3 [SD ± 1.3] pre vs. 2.4 [SD ± 2.8] post). Finally, there was an increase in REMT yes–no intrusions after day shifts (0.2 [SD ± 0.5] pre vs. 0.3 [SD ± 0.8] post). The Stroop Color-Word Test score decreased after overnight shifts (111.2 [SD ± 2.4] pre vs. 106.8 [SD ± 10.7] post). No changes were seen for day shifts. There were no postshift decrements on the PASAT or TMT (Table 2).
|Variable||Adjusted Estimate of δ (95% CI), Day Shift||Adjusted Estimate of δ (95% CI), Night Shift|
|PASAT||0.3 (−1.2 to 1.8)||0.1 (−1.7 to 1.9)|
|REMT total correct||−2.4 (−4.4 to −0.4)*||−4.6 (−6.4 to −2.8)*|
|REMT trial 1||−1 (−1.9 to −0.2)*||−1.1 (−2 to −0.2)*|
|REMT trial 2||−0.6 (−1.4 to 0.3)||−1.4 (−2.3 to −0.6)*|
|REMT trial 3||−0.9 (−1.7 to −0.1)*||−2 (−2.9 to −1.2)*|
|Primacy||−1.3 (−4.5 to 1.9)||−1.8 (−5.7 to 2.2)|
|Midlist||−1.6 (−5.1 to 2)||−4.8 (−8.4 to −1.3)*|
|Recency||2.9 (−0.1 to 6)||6.6 (2.8 to 10.4)*|
|Recall consistency||−9.9 (−16 to −3.8)*||−10.3 (−16.4 to −4.2)*|
|Yes-No correct||−0.3 (−0.8 to 0.1)||−0.5 (−1.3 to 0.2)|
|Yes-No intrusion||0.4 (0.3 to 0.6)*||0.4 (−0.2 to 1)|
|Forced intrusion||0.2 (−0.4 to 0.9)||1.1 (0 to 2.1)*|
|TMT A||−1.4 (−2.8 to 0.1)||−1.2 (−2.4 to 0)|
|TMT B||−1.6 (−4 to 0.8)||−2.8 (−6.1 to 0.6)|
|Stroop Color||0.1 (−0.1 to 0.2)||−0.1 (−0.3 to 0.2)|
|Stroop Color-Word||−0.2 (−0.8 to 0.3)||−3.9 (−5 to −2.7)*|
Sleep and Fatigue
Subjects completed two or three monthly PSQI and CFQ surveys, leaving 33 surveys available for analysis. Each subject also completed a separate CFQ survey at the end of each observed shift.
The mean global PSQI score across all 3 months of the study was 4.8 (SD ± 2.5). The mean global score for these subjects was 2.1 points higher than that of a normative sample of healthy controls tested in a previous study.5 Four (30.8%) subjects reported “poor” sleep quality (PSQI ≥ 6) at one or more times during the course of the study.
The mean CFQ score across all 3 months of the study was 1.8 (SD ± 2.0) with scores ranging from 0 to 10. Five (38.5%) subjects had severe mental and physical fatigue during the course of the study. Two of these five subjects had severe mental and physical fatigue for 2 consecutive months.
The mean CFQ score after a shift was 1.3 (SD ± 1.9), with scores ranging from 0 to 7. Postshift fatigue score was positively associated with perceived level of difficulty of the shift (r = 0.306, p = 0.02). An increase in the number of patients seen during a shift was associated with greater perceived difficulty of the shift (r = 0.639, p < 0.001).
Salivary cortisol and melatonin were measured in 80 samples from the workplace. Mean time of sampling for night shift (morning, 6:57 [SD ± 0:35]; evening, 18:37 [SD ± 1:25]; night, 0:23 [SD ± 1:40]) and day shift (morning, 8:12 [SD ± 0:58]; midday, 14:00 [SD ± 2:00]; evening, 18:26 [SD ± 1:52]) differed slightly.
Figure 2 depicts cortisol and melatonin levels for daylight shift and overnight shift versus time of collection. Cortisol and melatonin levels varied significantly with time, reflected by a significant main effect of time block (cortisol, F[3,68] = 43.0, p < 0.001; melatonin, F[3,68] = 7.36, p < 0.001). Cortisol levels were greatest in the morning time block (04:01–10:00). Melatonin levels were lowest in the midday (10:01–16:00) and evening (16:01–22:00) time blocks relative to overnight and morning.
While there was no significant main effect of shift on cortisol levels, a significant interaction between shift and time for cortisol (F[1,68] = 9.47, p = 0.003) reflected the fact that morning cortisol levels were lower following the overnight shift (mean difference, 0.058, 95% CI = −0.02 to 0.22). Evening cortisol levels tended to be higher (mean difference, 0.053, 95% CI = 0.005 to 0.010) in the overnight shift group relative to the daylight shift group. There was no significant effect of shift and no significant interaction between shift and time on melatonin levels.
There has been increased interest in the effects of sleep deprivation among physicians as it relates to patient care, given the link between preventable medical errors and long work hours.11 Concern about cognitive decline with sleep deprivation led to the 2003 Accreditation Council for Graduate Medical Education recommendation for an 80-hour workweek for resident physicians. This report identifies similar effects of shift work on cognition among EPs, who not only provide patient care, but also may be responsible for resident supervision and the flow of patients through the ED.
Subjects recalled fewer words on the REMT after both daylight and overnight shifts, and this short-term memory impairment generally occurred across all trials. After overnight shifts, physicians tended to remember things that they heard most recently. Although the clinical implication of these results is uncertain, these results may play a role in errors of omission if facts learned during patient examination or interview (e.g., medical history, recent changes in medications) are not recalled at the time of order entry. These results are consistent with laboratory studies showing that short-term memory is affected by shift work and sleep deprivation.12
Physicians are more susceptible to intrusions in REMT testing after their shifts. Likewise, intrusions on the Stroop Color-Word Test increased after overnight shifts, implying that physicians were more susceptible to interferences after overnight shifts. This increased susceptibility to interference may play a role in errors of commission during the course of patient care. It is of interest that these deficiencies in memory occur after both daylight and overnight shifts. Sleep is known to be important for consolidation of declarative and procedural memory.13 It is often assumed that a rested worker arriving for a daylight shift is immune to these decrements and it has been shown that some degree of mental stimulation improves cognitive processes.14 However, extreme mental stress reduces memory function in healthy adults.15,16 The results of this study suggest that both daylight shifts and overnight shifts in emergency medicine are sufficiently stressful to create this effect. Further investigations are required to define the magnitude and time course of this work-induced decrement.
We did not find any changes in the PASAT or TMT scores. This result contrasts with other studies in which tests that demand constant attention and contain monotony are the most sensitive to fatigue and sleep deprivation.17 One reason we did not detect a change may be that lapses of attention are more pronounced in sustained or continuous work,3,18 particularly lasting longer than 3 minutes.19 In this study, the PASAT only lasted 2 minutes, so it is possible that a longer test of sustained attention may have yielded different results.
Our results indicate that short-term memory is the cognitive function most affected by both day and overnight shifts and is readily measured in a short time interval following the shift. At a minimum, future studies of interventions to mitigate the adverse effects of shift work should include validated tests of short-term memory. Longer tests of attention should be explored in future studies, but standard tests of executive function may have less value in this population.
A proportion of EPs in this study reported disturbed sleep and chronic fatigue. In addition, acute fatigue after daylight and night shifts was related to the perceived difficulty of the shift more than to the circadian time of the shift. Salivary cortisol and melatonin displayed similar temporal variations for physicians working daylight or overnight shifts, and the interaction between shift and cortisol levels provided only weak statistical evidence for a phase advance in salivary cortisol after a single overnight shift. Cortisol is affected by stress and sleep20 and is increased with nocturnal sleep deprivation.21,22 It is unlikely that differences in sampling time account for the observed lower morning cortisol in night shift samples. Mean morning sample times for subjects completing the night shift were earlier than times for subjects beginning the day shift and closer to the expected peak of cortisol (06:00). Therefore, differences in sampling time would be expected to cause the opposite effect (night shift higher). These data were obtained without dietary restriction or lighting control, demonstrating the potential for using these minimally intrusive techniques in ecologic studies of circadian rhythm and sleep disorders in health care providers. The absence of a strong effect of shift on circadian phase for cortisol and melatonin indicates that many physicians begin night-shift work without resetting their endogenous rhythms. Future studies could compare changes over consecutive night shifts and compare phases for providers with good versus poor sleep hygiene.
Sleep quality scores indicated that a relatively large proportion of physicians (31%) suffered poor sleep quality associated with fatigue. These data suggest that sleep disruption continues routinely beyond training years and may be a widespread issue among health care providers. Chronic decreases in sleep quality and increases in fatigue may affect career satisfaction and career longevity. These observations were made in attending physicians, who were working shifts typical for their specialty, rather than trainees who might be expected to tolerate disruption during relatively short-term periods of training. The long-term health effects of chronic sleep disruption as a result of career choice remain to be determined. Future studies will need to examine the relationship of sleep quality to clinical performance and to career satisfaction.
A limitation of this study is that the scheduled shift lengths varied from 6 to 8 hours. Moreover, most physicians did not end their shifts at the times scheduled; therefore, the shift lengths were often longer and could vary by more than 2 hours. Longer shift lengths are common at many institutions. Additional research is required to sort out the effects of shift length and timing among EPs. Samples for neurohormonal assays were only collected before and after each shift, rather than at set times, perhaps obscuring the true diurnal variation. Finally, learning effects may have confounded some tests such as the PASAT and the TMT.23,24 Neurocognitive testing among healthy adults is difficult even under controlled circumstances. We cannot say whether the absence of change in these tests means that they were not repeated enough times to display a practice effect or whether decrements resulting from the shift work canceled an anticipated improvement from repeated administration.
Emergency physicians exhibit some decrement in cognitive performance during the course of both day and night shifts. Short-term memory is the cognitive domain most likely to decline in EPs during shift work. Physicians are more susceptible to intrusions (distractions) at the ends of shifts. These effects are exacerbated when working an overnight shift without significant advancement of the circadian rhythm (as assessed by cortisol peak). Reported sleep disruption is common in career EPs who are required to work a mixture of daylight and overnight shifts. Awareness of these vulnerabilities should prompt exploration of system improvements to reduce chances of error: for example, increased use of memory aids, scribes, or other adjuncts toward the ends of shifts or decreased tolerance for interruptions during patient care on night shifts. Individual physicians should also consider these stressors when designing their personal sleep hygiene.
The authors thank Ms. Mary Margaret Murtha and Ms. Maureen Morgan from the University of Pittsburgh Department of Emergency Medicine for their administrative assistance, Ms. Julia Morley from the University of Pittsburgh, and Dr. Meryl Butters from the University of Pittsburgh Department of Psychiatry.
- 10. Stroop JR. Studies of interference in serial verbal reactions. J Exp Psychol. 1935; 18:643–62.