Sleep restriction for the duration of a work week impairs multitasking performance

Authors


Marja-Leena Haavisto, Organisational Psychology, Technical Research Centre of Finland, P.O. Box 1000, FI-02044 VTT, Finland. Tel.: +358 40 7056443; fax: +358 20 7225888; e-mail: marja-leena.haavisto@vtt.fi

Summary

It is important to develop shift schedules that minimise the chance of sleep-related human error in safety-critical domains. Experimental data on the effects of sleep restriction (SR) play a key role in this development work. To provide such data, we conducted an experiment in which cognitively demanding, long-duration task performance, simulating task performance at work, was measured under SR and following recovery. Twenty healthy male volunteers, aged 19–29 years, participated in the study. Thirteen of them first had two baseline days (8-h sleep opportunity per day), then five SR days (4-h sleep opportunity) and finally two recovery days (8-h sleep opportunity). Seven controls were allowed to sleep for 8 h each night. On each experimental day, multitask performance was tested in 50-min sessions, physiological sleepiness was evaluated during multitask performance using electroencephalogram (EEG)/electrooculogram (EOG) recordings, and psychomotor vigilance task performance and Karolinska Sleepiness Scale ratings were recorded. Sleep–wake rhythm was monitored throughout the experiment. Multitask performance progressively deteriorated as the SR was prolonged and as the time spent on the task increased. The effect was significant at group level, but individual differences were large: performance did not deteriorate markedly in all participants. Similar changes were also observed in EEG/EOG-defined sleepiness. The recovery of performance and sleepiness from the SR continued over the two recovery sleep opportunities. In all, our findings emphasise the importance of shift systems that do not restrict sleep for several consecutive days.

Introduction

In many safety-critical occupations, work consists of multiple tasks that need to be performed simultaneously. To cope with these tasks, called multitasks, workers are required to switch their attention between subtasks and make decisions on priorities (Navon and Gopher, 1979; Wickens, 2002; Wickens et al., 2003). Examples of workers who must multitask are drivers, pilots, traffic controllers and process operators, who simultaneously search for information from several sources and have multiple parallel subtasks under processing.

Employees in safety-critical occupations often work under acute and/or cumulative sleep loss because of irregular and long work hours (Folkard and Lombardi, 2006; Härmä et al., 2002; Sallinen et al., 2003). The combination of multitasking and sleep deprivation is potentially hazardous: in transportation and industry, where multitasking is common, restricted sleep opportunities are a major cause of accidents (Caldwell, 2005; Philip and Åkerstedt, 2006). For example, the National Transportation Safety Board (NTSB) in the USA has estimated that fatigue-related accidents involving heavy trucks account for up to 30% of fatal accidents (NTSB, 1990). Previous research supports the view that restricted sleep markedly degrades performance in multitasking (Caldwell and Caldwell, 1998; Caldwell and Ramspott, 1998; Elsmore, 1994; Sallinen et al., 2008). However, from the viewpoint of work life, at least three important questions have so far been inadequately addressed by previous research.

The first question concerns cumulative sleep loss, that is, partial sleep loss across several consecutive days. Until now, studies of cumulative sleep restriction (SR) have shown that performance on short-duration tasks, usually vigilance tasks, gradually degrades in the course of SR (Belenky et al., 2003; Van Dongen et al., 2003; Webb and Agnew, 1974). Recently it has been estimated that the extension of wakefulness to >20 h a day (i.e. < 4 h of sleep per night) leads to an escalation of performance impairment (McCauley et al., 2009). Although several studies have addressed multitask performance after acute sleep loss (Caldwell and Caldwell, 1998; Caldwell and Ramspott, 1998; Elsmore, 1994; Sallinen et al., 2008), multitask performance after cumulative SR has, to the best of our knowledge, been addressed in a single study (Balkin et al., 2004).

Second, in many safety-critical occupations performance needs to be maintained at a high level for long uninterrupted periods. The time-on-task effect, that is, a progressive deterioration of performance in the course of the task session, is affected by SR (Dinges and Kribbs, 1991). Thus, it can be argued that the duration of the task may be a critical factor in occupational safety. Until now, the time-on-task effect on multitasking has been studied in two acute sleep deprivation studies. In one of these studies, the time-on-task effect was markedly increased after the first 10 min on the task (Caldwell and Ramspott, 1998), whereas in the other the effect only approached significance in a 70-min test session (Sallinen et al., 2008). Studies on the time-on-task effect on multitasking under cumulative SR are lacking.

The third understudied question regards recovery from partial cumulative sleep loss. This information is crucial when planning shift work schedules, particularly when considering the number of days off between two consecutive shift spells.

Until now, recovery of multitasking performance has been examined in one study showing that 8-h sleep after 1 night of partial sleep loss is not sufficient for full recovery (Sallinen et al., 2008). Recently, Banks et al. (2007) and McCauley et al. (2009) have shown that a single recovery night of extended sleep after cumulative SR improves participants’ performance, but leaves their vulnerability to sleep deprivation at an elevated level. The pace of recovery after SR is affected by at least two determinants. The first is the severity of SR: the more severe the sleep loss, the longer the period of recovery required (Lamond et al., 2007). The second is the nature of the sleep loss: recovery from cumulative sleep loss appears to require more time than recovery from acute sleep loss (Axelsson et al., 2008; Belenky et al., 2003; Lamond et al., 2007). In addition, the methods used to measure the recovery process matter; for example, the recovery of subjective sleepiness occurs earlier than the recovery of physiological sleepiness (Lamond et al., 2007). With this in mind, there is a clear need to ascertain how multitasking performance recovers from cumulative sleep loss.

This study was aimed at elucidating the three above-mentioned practical questions. We hypothesised that cumulative sleep loss, reflecting a restriction of sleep for the duration of a work week, leads to progressive impairments in multitasking. We predicted that the degree of impairment would be affected by both the duration of the task (the time-on-task effect) and the number of days of restricted sleep (extent of sleep loss). We were also interested in the recovery process, and asked whether 2 days, simulating a weekend, are sufficient for recovery of long-duration multitask performance. We curtailed young male volunteers’ sleep opportunity for 5 days, and thereafter allowed them to sleep normally for 2 nights while measuring their multitask performance in two 50-min sessions every day. To examine the overall effect of the SR and the recovery days, we also measured participants’ subjective, behavioural and physiological sleepiness in the course of the experiment. The results from the SR group were compared with those from a group that was offered a sleep opportunity of 8 h each night throughout the experiment.

Materials and methods

Participants

After signing a form for informed consent, 20 healthy men (aged 19–29 years) with 7–9 h of habitual sleep and sleep need voluntarily participated in the study. The measure of habitual sleep need was based on each participant’s subjective evaluation in a questionnaire. The Ethics Committee of the Hospital District of Helsinki and Uusimaa approved the study. Prior to the study, participants were screened to exclude those with extreme circadian types, sleep disorders, psychiatric illness, chronic or recent acute medical conditions, a history of drug or alcohol dependence, having crossed time zones during the 4 weeks preceding the beginning of the study, habitual napping, and shift work and/or night work. Positive criteria for selection included regular lifestyles with habitual bedtime before 24:00 hours and wake-up time after 06:00 hours. The Nordic Sleep Questionnaire (Partinen and Gislason, 1995), medical screening questionnaires, and clinical blood and urine laboratory tests were used to identify and exclude individuals with drug dependence, sleep disorders and other conditions. In the second phase, the participants were examined by a physician to ensure fitness for the experiment.

Procedures prior to the experiment

At least 2 weeks before the experiment, the participants slept an adaptation night in the laboratory, and a polysomnogram, including electroencephalography (EEG), bilateral electrooculography (EOG), submental electromyography and electrocardiography, was recorded to exclude persons suffering from organic sleep disorders.

The participants were instructed to maintain a regular sleep–wake cycle for 2 weeks prior to the study, which was verified by wrist-worn actigraphy and sleep diary recordings. In addition, they were instructed to maintain a regular nutrition schedule, and to refrain from caffeine, alcohol and tobacco for 2 weeks before arriving in the sleep laboratory. According to the actigraphy data, mean sleep duration during the 14 days preceding the experiment was 6 h 53 min (SD = 35.0 min) in the SR group and 7 h (SD = 51.4 min) in the control group.

Design and experimental procedures

Participants were randomly assigned to the experimental or control group. Participants spent 10.5 consecutive days in the laboratory (Fig. 1). The first day was an adaptation/training day (A) and the second served as baseline (BL). During A, participants became familiar with the day schedule, and practised the tasks and self-rating scales. Thirteen of the participants underwent the sleep restriction condition, comprising the training and baseline days (8 h in bed per night, 23:00–07:00 hours), 5 days of partial sleep loss (4 h in bed per night, 03:00–07:00 hours), and two recovery days (8 h in bed per night, 23:00–07:00 hours). The remaining seven volunteers were allowed to sleep for 8 h per night throughout the experiment.

Figure 1.

 Study experimental design, showing nightly time in bed across days: adaptation (A), baseline (BL), sleep restriction days (SR1–SR5) and recovery (R1–R2), and measurements of psychomotor vigilance task performance (PVT) and multitasking in the 4-h SR group. Karolinska Sleepiness Scale (KSS) ratings were collected at the beginning and end of each task session. The control group had a similar schedule, with the exception that they had the opportunity to sleep 8 h per night throughout the study. One of the two forenoon sessions contained a 10-min break, and the data collected during this session have been excluded from this study.

No alcohol, tobacco or caffeine was allowed during the laboratory visit. The fixed meal hours and amount of calories per meal were as follows: breakfast 600 kcal at 07:30 hours; lunch 800 kcal at 12:30 hours; snack 300 kcal at 15:30 hours; dinner 700 kcal at 18:00 hours; and snack 200 kcal at 21:30 hours. The participants were under behavioural monitoring, and their sleepiness was measured by EEG and EOG 24 h per day.

The study was designed to simulate a typical work week with daily working time between 07:00 and 16:30 hours. The participants completed a 50-min multitask session at 10:00, 11:40 and 14:00 hours on each of the 9 days. One of the two forenoon sessions contained a 10-min rest pause, and the data from this session were not included in this study. The scheduling of the forenoon multitask sessions with and without the 10-min break was counterbalanced across the days and participants. To avoid the knowledge-of-results effect, the participants were not given feedback on their performance during the experimental days.

The psychomotor vigilance task (PVT) was administered each day at 07:10, 11:00 and 15:00 hours. The Karolinska Sleepiness Scale (KSS) was rated at the beginning and end of each multitask session. Two participants belonging to the same group stayed at the laboratory at a time. Between the task sessions, participants were allowed to read, watch TV or movies, and interact with each other; the laboratory staff helped them to stay awake. In order to avoid light exposure, going outdoors was not permitted. In addition, physical exercise was not allowed during the laboratory experiment. Illumination in the sleeping room and in the test room ranged from 150 to 400 lux, and in the living room from 350 to 600 lux. The temperature ranged from 19 to 23 °C.

EEG recording

The electroencephalogram was recorded from the 10–20 system derivations Fp1-A2, Fp2-A1, C3-A2, C4-A1, O1-A2 and O2-A1. The recordings were conducted with a digital recorder (Embla, Flaga HF, Reykjavik, Iceland), using a sampling rate of 200 Hz with a bandwidth of 0.5–90 Hz. Electrode impedances were checked and corrected at the beginning of each recording. The sleep periods were visually scored and classified in 30-s epochs into sleep stages according to the criteria of Rechtschaffen and Kales (1968).

EOG/EEG-defined sleepiness

The daytime EEG and EOG data were recorded from the same locations and at the same sampling rate as in the night measurements. The EEG and EOG recordings during the multitask sessions were scored into the following four categories in 20-s epochs: (1) wakefulness; (2) drowsiness indicated by slow eye movements accompanied by theta activity of < 5 s period in EEG; (3) microsleep indicated by theta activity for 5–10 s in EEG; and (4) stage 1 sleep indicated by theta activity for at least a 10-s period in EEG (Sallinen et al., 2004, 2008). All categories but category 1 were defined as increased sleepiness during multitasking. The data of EOG/EEG-defined sleepiness and multitask performance from each 50-min task session were divided into five 10-min segments for illustrations and statistical analyses.
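The four-category scoring rule and the division into 10-min segments can be illustrated with a minimal Python sketch. It assumes that each 20-s epoch has already been reduced to two hypothetical inputs, the longest continuous theta duration (in seconds) and a flag for slow eye movements; these inputs and all function names are illustrative and are not part of the scoring procedure described above. Two boundary choices are also assumptions: theta activity of exactly 10 s is assigned to stage 1 sleep, and theta activity shorter than 5 s without slow eye movements is scored as wakefulness.

```python
from typing import List

WAKE, DROWSY, MICROSLEEP, STAGE1 = 1, 2, 3, 4
EPOCH_S = 20                                # epoch length used for daytime scoring
EPOCHS_PER_SEGMENT = (10 * 60) // EPOCH_S   # 30 epochs in each 10-min segment


def classify_epoch(theta_s: float, slow_eye_movements: bool) -> int:
    """Assign one 20-s epoch to categories 1-4 from theta duration and SEMs."""
    if theta_s >= 10:          # assumption: exactly 10 s counts as stage 1
        return STAGE1
    if theta_s >= 5:
        return MICROSLEEP
    if slow_eye_movements and theta_s > 0:
        return DROWSY
    return WAKE                # assumption: brief theta without SEMs is wake


def sleepiness_by_segment(categories: List[int]) -> List[float]:
    """Percentage of epochs in categories 2-4 for each 10-min segment."""
    percentages = []
    for start in range(0, len(categories), EPOCHS_PER_SEGMENT):
        segment = categories[start:start + EPOCHS_PER_SEGMENT]
        sleepy = sum(1 for c in segment if c != WAKE)
        percentages.append(100.0 * sleepy / len(segment))
    return percentages
```

A 50-min session contains 150 epochs, so `sleepiness_by_segment` returns five percentages, one per 10-min segment.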

Performance measures

Multitasking

A computerised multitask entitled Brain@Work (see Fig. 2) consisting of four subtasks was developed at the Finnish Institute of Occupational Health. It represents a modified version of a multitask entitled SYNWORK that has been previously used in sleep deprivation studies and found to be sensitive (Caldwell and Caldwell, 1997; Caldwell and Ramspott, 1998; Elsmore, 1994). The four subtasks of the multitask are introduced at scheduled intervals, and the participant must be able to choose the optimal moment for the performance of each subtask. The Brain@Work has been employed in our recent acute sleep deprivation study in a similar manner as in this study (Sallinen et al., 2008).

Figure 2.

 A computer screen showing the four subtasks of the Brain@Work multitask.

In the short-term memory subtask, at the beginning of each task session for a period of 10 s the participants were shown a string of target letters that they were instructed to learn. During the test sessions, the participants were presented with probe letters one at a time at 7-s intervals. The participants’ task was to classify each probe letter as either a target or a non-target by clicking on the icons ‘Yes’ or ‘No’.

In the arithmetic subtask, the instruction was to indicate the sum of two numbers by clicking on digits on a digit pad on the computer screen. The duration of each trial was 7 s. In short-term memory and arithmetic subtasks, the participants obtained 10 points for each correct response, −10 points for each false response, and −20 points for not responding before stimulus offset.

In the visual monitoring subtask, participants were instructed to return a moving dot to the centre of the innermost circle by clicking on the ‘Reset’ icon. The number of points obtained (2, 4, 6 or 10) was greater the nearer the dot was to the outermost edge upon responding. If the participant did not react before the dot reached the outermost circle, which took 10 s, 10 points were deducted for each elapsed second. In the auditory monitoring subtask, the participants were instructed to discriminate between a non-target tone of 1000 Hz (80% probability) and a target tone of 1200 Hz (20% probability). The tones (intensity 62 dB SPL, duration 50 ms) were presented at 1.5-s intervals. The participants were instructed to press ‘Enter’ whenever they heard the target tone. Points were awarded and subtracted in the same way as in the memory and arithmetic subtasks.
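The point rules of the subtasks can be summarised in a short Python sketch. It is an illustration under stated assumptions, not the Brain@Work implementation: the function names are hypothetical, a missing response is represented by `None`, and the visual monitoring display is assumed to be divided into four zones indexed from the innermost (0) to the outermost (3), because the exact zone boundaries are not given in the text.

```python
from typing import Optional

# Points for resetting the dot, from the innermost to the outermost zone.
# The mapping of dot position to zone is an assumption made for illustration.
VISUAL_ZONE_POINTS = (2, 4, 6, 10)


def score_discrete_trial(response: Optional[str], correct_answer: str) -> int:
    """Memory, arithmetic and auditory trials: +10 correct, -10 false, -20 missed."""
    if response is None:       # no response before stimulus offset
        return -20
    return 10 if response == correct_answer else -10


def score_visual_reset(zone: int, overdue_s: int = 0) -> int:
    """Visual monitoring: zone-dependent reward, or -10 per overdue second."""
    if overdue_s > 0:          # dot had already reached the outermost circle
        return -10 * overdue_s
    return VISUAL_ZONE_POINTS[zone]
```

For example, resetting the dot in the outermost zone yields 10 points, whereas reacting 4 s after the dot has reached the outermost circle costs 40 points.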

Less than 2 weeks before the experiment, each participant practised the multitask for 30 min. The level of difficulty in the task was defined individually for the experiment (see below). This procedure made it possible to set task difficulty equally according to each participant’s capacity. The participants performed the multitask for an average duration of 75 min (SD = 23.3 min) in the SR group and for 74 min (SD = 55.5 min) on average in the control group during the performance level adjustment procedure. In all, the participants practised the multitask for an average of 105 min prior to the experiment to flatten the practice effect. Moreover, on the first day of the experiment (Day A in Fig. 1), the participants practised the multitask for 140 min.

To achieve comparable task difficulty between individuals, difficulty level was determined in two phases. First, the difficulty levels of the short-term memory and mental arithmetic subtasks were defined. With this procedure, the effects of individual differences on the performance of individual tasks could be controlled for (Baddeley et al., 1997). In the short-term memory subtask, the number of the target letters was set two letters shorter than the smallest number of letters that the participant failed twice to repeat. In the arithmetic subtask, the number of digits was one less than the smallest number of digits found too difficult for the participant in the additions task (<80% of the additions correct). Second, the temporal intensity of the arithmetic and short-term memory subtasks was adjusted by manipulating the inter-stimulus interval (ISI, i.e. the time between two successive items) in 5-min multitasking sessions. In the first 5-min session, the ISI was 6.5 s, and it was shortened by 0.5 s per session all the way down to 0.5 s if needed. The adjustment of temporal intensity was finished when the participant failed to obtain at least 70% of the total score twice.
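The second phase of the adjustment can be sketched as a simple descending staircase in Python. The text does not state whether a failed 5-min session was repeated at the same ISI, whether the two failures had to be consecutive, or which ISI was finally used in the experiment; the sketch therefore assumes that a failed session is retried at the same ISI, that failures are counted cumulatively, and that the procedure returns the ISI at which it stopped. The callable `run_session` is a hypothetical stand-in for running one 5-min multitask session and returning the fraction of the total score obtained.

```python
from typing import Callable


def adjust_isi(run_session: Callable[[float], float],
               start_isi: float = 6.5,
               step: float = 0.5,
               min_isi: float = 0.5,
               criterion: float = 0.70,
               max_failures: int = 2) -> float:
    """Shorten the ISI session by session until the stopping rule is met."""
    failures = 0
    isi = start_isi
    while True:
        if run_session(isi) < criterion:
            failures += 1
            if failures >= max_failures:
                return isi      # second failure: adjustment is finished
            continue            # assumption: retry the same ISI once
        if isi <= min_isi:
            return isi          # shortest ISI reached without a second failure
        isi = round(isi - step, 1)
```

With a simulated participant who obtains 90% of the score at ISIs of 3.0 s or longer and 50% at shorter ISIs, the procedure stops at 2.5 s.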

Vigilance

A 10-min PVT was used to evaluate behavioural alertness (Dinges and Powell, 1985). During this task, the participants were instructed to attend to the timer presented on the computer screen and press the response button as quickly as possible whenever the timer started running. The ISI varied from 2000 to 10 000 ms. The mean number of lapses [reaction times (RTs) longer than 500 ms] and the slowest 10% of all responses (mean 1/RT × 1000 from slowest 10% RTs per trial) were used as dependent measures.
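The two dependent measures can be computed from the reaction times of one PVT trial as in the following Python sketch. The function name is hypothetical, and the way the slowest 10% is counted when the number of responses is not a multiple of 10 (integer division, with at least one response) is an assumption, as the text does not specify it.

```python
from typing import List, Tuple


def pvt_metrics(rts_ms: List[float]) -> Tuple[int, float]:
    """Return (number of lapses, mean 1/RT x 1000 of the slowest 10% of RTs)."""
    lapses = sum(1 for rt in rts_ms if rt > 500)     # RT longer than 500 ms
    n_slowest = max(1, len(rts_ms) // 10)            # assumption: floor, min 1
    slowest = sorted(rts_ms, reverse=True)[:n_slowest]
    mean_reciprocal = sum(1000.0 / rt for rt in slowest) / n_slowest
    return lapses, mean_reciprocal
```

Because the second measure is a reciprocal, lower values indicate slower responses.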

Subjective measures

Subjective sleepiness was measured using the nine-point KSS (Åkerstedt and Gillberg, 1990). The scale varies from very alert (1) to very sleepy/fighting sleep/effort to keep awake (9).

Statistical analysis

Statistical analyses were performed using linear mixed-model ANOVAs, as this technique is suited to the analysis of individual differences over time (Bliese et al., 2006; Van Dongen et al., 2003). Traditional repeated-measures ANOVAs are inappropriate for distinguishing stable changes from error variance across measurement points. Moreover, repeated-measures ANOVAs assume equal variances within each group. Our mixed models included group (SR, control) and day (BL, SR1, SR2, SR3, SR4, SR5, R1, R2) as fixed effects, and time-on-task (1st, 2nd, 3rd, 4th, 5th 10-min multitask interval) as a random effect. Separate analyses were performed to test the effects of the SR days (days BL, SR1, SR2, SR3, SR4 and SR5) and the recovery days (days BL, R1, R2) on the dependent variables. The time-of-day factor was left out of the models, as preliminary analysis showed no difference between the forenoon and the afternoon sessions. A random effect on the intercept and random slopes for centred (to the value of the 3rd 10-min segment in each 50-min task session) multitask score or percentage of EEG/EOG-defined sleepiness were included in the models to account for individual differences in the dependent variables. For intraclass correlation (ICC) calculations, we used only random intercept models without a random slope for the time-on-task factor. In the analysis, we extended the compound symmetry correlation structure for the repeated observations using the linear covariance structure parameter (PARM) of the SAS/MIXED procedure to account for the different correlations between measurements on the same day and on different days. The data were analysed using PROC MIXED in SAS 9.1 (SAS Institute Inc., 2004).
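The role of centring on the third 10-min segment can be illustrated with a simplified Python sketch. It is not the mixed model fitted in SAS: it fits an ordinary least-squares line to the five segment scores of a single session, with the segment index centred on segment 3 (an assumed coding of −2 to 2). Under this coding the intercept equals the predicted mid-session score and the slope expresses the time-on-task effect per 10-min segment. The function name and the example scores are hypothetical.

```python
from typing import List, Tuple


def time_on_task_fit(segment_scores: List[float]) -> Tuple[float, float]:
    """Least-squares (intercept, slope) with time centred on the 3rd segment."""
    n = len(segment_scores)
    t = [i - 3 for i in range(1, n + 1)]      # 1..5 becomes -2..2
    t_mean = sum(t) / n
    y_mean = sum(segment_scores) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean)
              for ti, yi in zip(t, segment_scores))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept, slope
```

For invented segment scores of 1000, 900, 800, 700 and 600, the fitted slope is −100 points per segment and the intercept, 800, is the predicted score in the third segment.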

An ICC for multitask performance was computed from the estimated between-subjects variance (systematic interindividual variance) and the within-subjects variance (residual intraindividual variance) separately for the experimental and the control group. These variance components analyses were performed by mixed-model ANOVA, with day as a fixed linear effect (five SR days) and participants as a random effect (random intercept model). Confidence limits for the ICC were calculated with the ESTIMATE statement of the SAS/NLMIXED procedure, in which the limits are based on the t-distribution rather than on the standard normal distribution.
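Assuming the usual variance-ratio definition, ICC = between-subjects variance / (between-subjects variance + within-subjects variance), the calculation can be sketched in Python as follows. The function name is hypothetical; the example uses the variance components reported for the SR group in the Results.

```python
def icc(between_var: float, within_var: float) -> float:
    """Share of total variance attributable to stable between-subject differences."""
    return between_var / (between_var + within_var)


# Variance components reported for the SR group during the SR days.
sr_icc = icc(7865102, 1802889)
print(f"ICC (SR group) = {100 * sr_icc:.1f}%")
```

With these inputs the ratio is approximately 0.814, in agreement with the ICC of 81.4% reported for the SR group.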

Results

Sleep length in the experiment

Mean sleep durations for the different days of the experiment in the SR and the control group are given in Fig. 3, which shows that the experimental manipulation of sleep opportunities resulted in the planned changes in sleep duration. In the SR group, the mean sleep length was 7 h 19 min (SD = 17.4 min) on the baseline night, fell to 3 h 52 min (SD = 2.4 min) on the SR days, and returned to baseline level (7 h 40 min, SD = 8.3 min) in the recovery phase. The SR group slept, on average, 47.2% less on the SR nights than at baseline. The control group slept, on average, 7 h 20 min (SD = 13.4 min) per night without marked variation between the nights.
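The reported 47.2% reduction follows directly from the group means, as the short Python check below shows. The helper names are hypothetical, and the cumulative figure is only a rough derived estimate that assumes each of the five SR nights equalled the SR-night mean.

```python
def to_minutes(hours: int, minutes: int) -> int:
    """Convert a duration given in hours and minutes to minutes."""
    return 60 * hours + minutes


def percent_reduction(baseline_min: float, restricted_min: float) -> float:
    """Relative reduction in sleep duration, in per cent of baseline."""
    return 100.0 * (baseline_min - restricted_min) / baseline_min


baseline = to_minutes(7, 19)      # 439 min on the baseline night
restricted = to_minutes(3, 52)    # 232 min on average on the SR nights

reduction = percent_reduction(baseline, restricted)
cumulative_loss = 5 * (baseline - restricted)   # minutes over five SR nights

print(f"Reduction per night: {reduction:.1f}%")
print(f"Approximate cumulative loss: {cumulative_loss // 60} h {cumulative_loss % 60} min")
```

Under this assumption the cumulative sleep loss relative to baseline amounts to roughly 17 h over the five SR nights.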

Figure 3.

 Mean (and SEM) total daily sleep durations in hours in the sleep restriction group (SR) and control group. Error bars are included but are obscured by data points. BL, baseline; R, recovery.

Effects of restricted sleep

All results of the mixed-model ANOVAs are presented in Table 1.

Table 1. Results of mixed-model ANOVAs for the main and interaction effects of group, day and time-on-task on multitasking, EEG/EOG-defined sleepiness, subjective sleepiness (KSS) and vigilance performance (PVT)

Measure              df        F-value   P-value

Sleep restriction days
Multitasking
  Group              1,1157    2.67      0.1024
  Day                5,1157    2.34      0.0395
  Time-on-task       1,18      5.14      0.0359
  G × D              5,1157    6.27      <0.0001
  G × T              1,1157    6.68      0.0099
  D × T              5,1157    6.27      <0.0001
  G × D × T          5,1157    8.40      <0.0001
EEG/EOG-defined sleepiness during multitasking
  Group              1,1157    2.81      0.0937
  Day                5,1157    1.74      0.1216
  Time-on-task       1,18      2.27      0.1492
  G × D              5,1157    1.81      0.1089
  G × T              1,1157    2.49      0.1146
  D × T              5,1157    3.91      0.0016
  G × D × T          5,1157    4.62      0.0004
KSS
  Group              1,18      2.75      0.1143
  Day                5,90      7.87      <0.0001
  G × D              5,90      4.90      0.0005
PVT lapses
  Group              1,18      3.02      0.0995
  Day                5,90      2.74      0.0238
  G × D              5,90      2.56      0.0327

Recovery days
Multitasking
  Group              1,569     1.16      0.2817
  Day                2,569     1.04      0.3530
  Time-on-task       1,18      0.02      0.8997
  G × D              2,569     4.12      0.0168
  G × T              1,569     1.22      0.2697
  D × T              1,569     2.77      0.0634
  G × D × T          2,569     2.73      0.0659
EEG/EOG-defined sleepiness during multitasking
  Group              1,569     0.17      0.6845
  Day                2,569     0.94      0.3928
  Time-on-task       1,18      2.24      0.1515
  G × D              2,569     2.37      0.0947
  G × T              1,569     4.12      0.0430
  D × T              2,569     2.73      0.0659
  G × D × T          2,569     4.19      0.0157
KSS
  Group              1,18      0.10      0.7555
  Day                2,36      1.68      0.2012
  G × D              2,36      0.18      0.8371
PVT lapses
  Group              1,18      0.64      0.4328
  Day                2,36      0.97      0.3899
  G × D              2,36      0.28      0.7569

D, day; EEG, electroencephalogram; EOG, electrooculogram; G, group; KSS, Karolinska Sleepiness Scale; PVT, psychomotor vigilance task performance; T, time-on-task.

Multitask performance

In the SR group, multitask performance was gradually impaired over the SR days compared with the control group (group × day interaction, P < 0.0001; Fig. 4a). The group × day × time-on-task interaction was also significant (P < 0.0001), meaning that the deterioration of performance within the task sessions progressed more steeply in the SR group than in the control group over the SR days.

Figure 4.

 (a) Mean (and SEM) total score in 50-min multitask performance at 10-min intervals over the experimental days. The daily means of two 50-min sessions are presented in the sleep restriction group (SR) and the control group. (b) Predicted average multitask score for the SR group and the control group. BL, baseline; R, recovery.

Figure 4b displays a predicted average line for multitask performance (time-on-task as a random effect). On each SR day, the slope given by the daily parameter estimates (group × day × time-on-task) was steeper in the SR group (from the baseline estimate = 19.20, t = 0.65, P = 0.5150 to the fifth SR day estimate = −205.80, t = −6.98, P < 0.0001) than in the controls (from the baseline estimate = 9.22, t = 0.23, P = 0.8185 to the fifth SR day estimate = 4.16, t = 0.10, P = 0.9176). Figure 5 presents examples of individuals who performed at a high, middle and low level during the SR days. In two individuals performance deteriorated by >80% from the baseline level and in four individuals by at most 7%; in the rest of the individuals the deterioration fell between these two extremes.

Figure 5.

 Total multitask performance scores at 10-min intervals for three individuals of the sleep restriction (SR) group whose performance was either markedly (low performance), moderately (middle) or only slightly (high performance) impaired during the experimental days. BL, baseline; R, recovery.

The ICC indicated that the rank order of the participants in multitasking remained substantially stable during the SR days, not only in the control group [ICC = 96.0%, 95% confidence interval (CI) = 89.4–100, between-subject variance = 2129113, SE = 1148545, t = 1.85, P = 0.1132 and within-subject variance = 98386, SE = 26295, t = 3.74, P = 0.0096] but also in the SR group (ICC = 81.4%, 95% CI = 66.2–96.5, between-subject variance = 7865102, SE = 3227152, t = 2.44, P = 0.0313 and within-subject variance = 1802889, SE = 353576, t = 5.10, P = 0.0003).

EEG/EOG-defined sleepiness while multitasking

A significant group × day × time-on-task interaction (P = 0.0004) revealed that the groups differed in how much EEG/EOG-defined sleepiness increased within the 50-min multitask sessions during the SR days. Figure 6 shows that the time-on-task effect increased more in the SR group than in the control group as the number of SR days increased. There were only two individuals whose EEG/EOG-defined sleepiness increased by 25% or more from the baseline to the fifth SR day. Notably, these were the same individuals whose performance deteriorated by at least 80% during the SR. The other individuals in the SR group showed an increase of only 0–6% in physiological sleepiness during multitasking.

Figure 6.

 Mean (and SEM) electroencephalogram (EEG)/electrooculogram (EOG)-defined sleepiness at 10-min intervals during the multitask sessions over the experimental days. The daily means of forenoon and afternoon sessions are presented in the sleep restriction group (SR) and the control group. BL, baseline; R, recovery.

PVT

The number of PVT lapses increased more in the SR group than in the control group (P = 0.0327; Fig. 7). The estimated number of lapses increased from 0.92 ± 0.73 (BL) to 3.54 ± 0.73 (SR5) in the SR group, and from 0.62 ± 1.00 to 0.90 ± 1.00 in the control group. The slowest 10% of all responses tended to be slower in the SR group, but the group difference was not statistically significant (P = 0.16).

Figure 7.

 Mean number (and SEM) of lapses per day in the PVT in the sleep restriction group (SR) and the control group. BL, baseline; R, recovery.

Subjective sleepiness

Self-rated sleepiness increased more in the SR group than in the controls after the baseline day (P = 0.0005; Fig. 8). The estimated ratings of sleepiness (KSS) during the SR days increased from 4.65 ± 0.29 (BL) to 6.2 ± 0.29 (SR5) in the SR group, and from 4.9 ± 0.39 to 5.0 ± 0.39 in the control group.

Figure 8.

 Mean daily ratings (and SEM) of subjective sleepiness (Karolinska Sleepiness Scale, KSS) for the sleep restriction group (SR) and the control group. The mean values represent KSS ratings collected before and after the multitask sessions in the forenoon and afternoon. BL, baseline; R, recovery.

Effects of recovery sleep

Multitask performance

Figure 4a shows that the multitask performance in the SR group improved close to the level of the controls following the first recovery sleep and that the group difference diminished further following the second recovery sleep. A significant group × day interaction (P = 0.0168) indicated that the recovery process was still continuing during the second night of recovery. The daily parameter estimates for the time-on-task lines decreased in magnitude from R1 (estimate = −37.63, t = −3.09, P = 0.0021) to R2 (estimate = −3.61, t = −0.30, P = 0.7673) in the SR group, meaning that the time-on-task effect almost disappeared during R2. The control group showed no comparable changes.

EEG/EOG-defined sleepiness while multitasking

A significant group × time-on-task × day interaction (P = 0.0157) demonstrated that the difference between the groups in physiological sleepiness was dependent on both the time spent on the task and the number of recovery days. Figure 6 shows that the group difference in the time-on-task effect decreased from R1 to R2.

PVT

The groups did not differ in the number of PVT lapses during the recovery period.

Subjective sleepiness

The sleepiness ratings on the recovery days showed no group differences.

Discussion

The main findings of this study were that, at group level, multitask performance was significantly impaired by 5 days of partial SR, and that this impairment increased as a function of the time spent on the task. However, within the group exposed to the SR, only a few individuals showed large impairments in their performance. Most of the sleep-deprived individuals showed only moderate deterioration, and some individuals’ performance remained virtually unchanged. In addition, EEG/EOG-defined sleepiness during multitasking increased significantly at group level in the course of the SR, but there were only a few individuals whose sleepiness increased markedly.

The finding that multitask performance progressively deteriorated with the increasing number of SR days is in line with previous studies in which participants’ performance on less complicated and shorter tasks deteriorated in the course of SR (Belenky et al., 2003; Dinges et al., 1997; Van Dongen et al., 2003). From the viewpoint of work life, a strength of this study was that the task used included two important characteristics of real operational tasks that are often performed under SR, namely high demands on cognitive processes such as divided attention (Gopher, 1996; Wickens et al., 2003) and the requirement of performance over long periods. In practice, our finding suggests that a high number of sleep-limiting shifts in a row substantially increases the risk of human error in operational tasks that require multitasking. However, studies conducted in authentic work conditions are needed to verify this conclusion, as, for example, expertise based on long experience and awareness of the consequences of a performance error probably also play a role in how well a person actually performs at work under restricted sleep.

The progressively increasing time-on-task effect on multitask performance was observed in the SR group but not in the control group, indicating that it was totally dependent on the sleep loss preceding the performance. The time-on-task effect on multitasking or simulator performance has also been found in previous studies on acute sleep loss (Åkerstedt et al., 2005; Caldwell and Ramspott, 1998; Sallinen et al., 2008). The practical significance of this finding is high. In safety-critical occupations, for example, among air-traffic controllers, most task sessions performed under sleep loss are relatively long in duration. Interrupting the working period with a break could theoretically be of advantage. However, our previous work has shown that it is unlikely that, for example, a rest pause with light neck-and-shoulder exercise would be an effective remedy (Sallinen et al., 2008). Cognitive performance during SR is, to some extent, improved by napping (Mollicone et al., 2008; Purnell et al., 2002; Sallinen et al., 1998), and thus naps during breaks could be of advantage. Another strategy is to use stimulants. However, continuous use often leads to tolerance and increased dosages, which may affect sleep following the shift.

The time-on-task effect on multitasking intensified in the course of the SR. Previous studies have shown that 20 h of wakefulness per day – the same amount that was used in this study – is sufficient for escalation of performance impairments (McCauley et al., 2009). Our new finding was that most of the deterioration occurred during the latter part of the 50-min task session, implying that the escalating negative effects of extended wakefulness on multitask performance are dependent on the time spent on the task.

There are various brain mechanisms that may explain the observed deterioration in multitasking. The increases in EEG/EOG-defined sleepiness during the deterioration of multitasking suggest that at least the arousal mechanisms played a key role. The level of thalamic activation that regulates arousal and attention has been found to decrease in association with sleep deprivation (Chee et al., 2006, 2008; Coull et al., 1998; Thomas et al., 2000). This deactivation could at least to some extent explain the observed decrements in multitask performance under SR. On the other hand, multitasking places special demands on the cognitive processes required in the subtasks and in coordinating attention switching between the subtasks (D’Esposito et al., 1995; Dux et al., 2006; Just et al., 2001, 2008). SR-induced changes in brain mechanisms underlying these cognitive processes may thus also explain the observed deterioration in multitasking.

Large individual differences in multitasking were observed on the SR days. Importantly, the individual differences were substantially stable within this period: the same individuals showed either sensitivity or tolerance to the curtailment of sleep. Only two individuals out of 13 exhibited severe decrements in multitask performance and, interestingly, the same individuals also showed the most severe increase in EEG/EOG-defined sleepiness during the task performance. Our finding of large individual differences in multitasking within a single SR period is in line with a study by Van Dongen et al. (2006) in pilots. In their study, differences in flight-simulator performance between sleep-deprived pilots were large and stable in the course of a single experiment (Van Dongen et al., 2006).

The individual differences observed in this study cannot be explained by differences in cognitive aptitude. Prior to the experiment, the cognitive demand of the task was adjusted individually to be 70% of the maximal capacity of each participant. The individual adjustment was carried out because previous studies have shown that the ability to divide attention between two or more simultaneous tasks differs greatly from one person to another (Damos, 1993). The adjustment protocol ensured that all individuals were able to perform the multitask and that the starting level for each individual was equally demanding. The observed differences suggest that there are individual differences in tolerance for SR (Leproult et al., 2003; Van Dongen et al., 2003, 2004). In our study, individual differences remained stable throughout the SR days, but we did not examine whether the differences would have been replicable in a repeated exposure to SR.

It is not self-evident how and to what extent the individual differences in response to sleep loss should be accounted for in shift work: whether they should be used as a selection criterion when recruiting new personnel or whether they should be used as a basis for adjusting work demands, including working hours, on an individual basis. First, the seriousness of the safety hazard associated with the task in question is one aspect to be considered in this context. Second, it would be important to establish the possibilities for obtaining reliable data on a person’s sensitivity to sleep deprivation before starting shift work. For the moment, it seems that the only reliable way of identifying persons with high vulnerability to sleep loss is to subject them to such conditions, as no potential baseline predictor has turned out to be reliable enough for this purpose (Van Dongen and Belenky, 2009). In addition, a Bayesian forecasting technique based on closed-loop feedback of measured performance can be used for predicting changes in a sleep-deprived worker’s job performance (Van Dongen and Belenky, 2009).

Electroencephalogram/electrooculogram-defined sleepiness during multitasking responded to SR similarly to multitask performance itself: sleepiness increased as a function of the time spent on the task and the number of SR days, but only two individuals out of 13 showed a marked increase. Previous studies have found no clear associations between changes in cognitive performance and concomitant physiological sleepiness under sleep loss (Galliaud et al., 2008; Stenuit and Kerkhofs, 2008; Wilson et al., 2007). This may be partly due to the different durations of the tasks used in the previous studies: in this study, the task duration was much longer than those used earlier. It can be assumed that the association is more obvious when the effect of sleep loss on both measures has built up close to its maximum at the end of the task performance. In all, our findings suggest that increased physiological sleepiness is at least one of the factors underlying impaired long-duration multitasking under cumulative sleep loss.

Both standard measures of sleepiness, the KSS ratings and number of PVT lapses, were affected by the SR. On the fifth SR day, the mean level of the KSS ratings was close to the level (≥7) that is known to be associated with electrophysiological signs of extreme sleepiness and impaired driving performance (Åkerstedt and Gillberg, 1990; Ingre et al., 2006). A somewhat surprising result was that the slowest 10% of the PVT response times were not significantly affected by the SR. A reason for this result may be that the PVT was always presented immediately after a long-duration multitask session. This protocol may have affected the level of arousal at which the PVT task was initiated.

Recovery of long-duration multitasking from the cumulative SR proceeded gradually. Following the first recovery sleep period, the level of performance clearly improved as compared with the last SR day, but still remained below that of the control group. Performance returned to the baseline level after the second recovery sleep period. In the course of the gradual recovery process, the time-on-task effect and individual differences decreased. In a previous study, full recovery from a 7-day SR was not reached after three recovery nights (Belenky et al., 2003). Interestingly, extension of sleep duration for several days prior to SR improved the rate of performance recovery (Rupp et al., 2009). The relationship between the severity of the preceding SR (accumulated sleep loss) and the pace of the recovery process warrants further research. The question is of practical significance when planning shift work schedules: how many recovery days must be included in the schedule after a certain number of sleep-limiting shifts?
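
As a rough illustration of what accumulated sleep loss amounts to in the protocol used here, the following sketch computes the nominal sleep deficit and daily wakefulness from the scheduled sleep opportunities (8 h at baseline, 4 h on each of the five SR days). Treating the 8-h baseline opportunity as the reference sleep need is an assumption of this sketch, and the function names are hypothetical; the figures describe the schedule only, not measured sleep.

```python
# Illustrative arithmetic only. The 8-h baseline is treated as the reference
# sleep need, which is an assumption of this sketch, not a measured value.
BASELINE_SLEEP_H = 8    # sleep opportunity on baseline and recovery days
RESTRICTED_SLEEP_H = 4  # sleep opportunity on SR days
SR_DAYS = 5             # number of consecutive SR days


def accumulated_sleep_loss(baseline_h, restricted_h, days):
    """Hours of sleep opportunity lost relative to baseline over `days` days."""
    return (baseline_h - restricted_h) * days


def daily_wakefulness(sleep_h):
    """Scheduled hours awake per 24-h day for a given sleep opportunity."""
    return 24 - sleep_h


if __name__ == "__main__":
    loss = accumulated_sleep_loss(BASELINE_SLEEP_H, RESTRICTED_SLEEP_H, SR_DAYS)
    awake = daily_wakefulness(RESTRICTED_SLEEP_H)
    print(f"Nominal accumulated sleep loss: {loss} h")
    print(f"Scheduled wakefulness per SR day: {awake} h")
```

Under these assumptions the nominal deficit is 20 h, and each SR day involves 20 h of scheduled wakefulness. The same calculation could be applied to a candidate shift schedule to compare the severity of alternative sequences of sleep-limiting shifts.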

The recovery process of EEG/EOG-defined sleepiness during multitasking resembled that of multitask performance: the recovery process continued over the two recovery sleep opportunities and was characterised by decreases in the time-on-task effect and in individual differences. This finding is in line with our recent study, where both long-duration multitask performance and EEG/EOG-defined sleepiness responded similarly to an 8-h sleep opportunity after only 2 h of sleep on the previous night (Sallinen et al., 2008). When considering the time needed for recovery, it is important to note that the participants of this study were provided with optimal sleeping conditions free from many sleep-disturbing factors normally present in everyday life. Thus, it is possible that recovery takes even longer under real working conditions than in our laboratory environment.

There are several limitations in our study that should be taken into account when interpreting the results. First, our sample consisted of only young healthy men, which limits the possibilities to generalise the results to other age groups, women, persons with health problems, and experienced shift workers. Second, the long-term stability of the observed individual differences in response to SR remains open, as the participants were exposed to SR only once. Third, the recovery process from 5 days of SR was followed only for 2 days. However, shift schedules often contain repetitive spells of sleep-limiting shifts with only a day or two off between them. In this context, the question of how many recovery days are needed to prevent any carry-over effect from one period of SR to the next is of importance, but cannot be answered on the basis of our results. Finally, the laboratory conditions of our study limit the possibilities to generalise the results to everyday life and thus field studies are needed to verify our findings.

Conclusions

In conclusion, this study demonstrates that complex and long-duration cognitive performance gradually degrades in young healthy men when their sleep is restricted to 4 h per night for 5 days. This degradation is characterised by a strong time-on-task effect and large individual differences, and is accompanied by an increase in physiological sleepiness. Recovery from restricting sleep to 4 h per night for a period of a work week takes at least two 8-h sleep opportunities.

Acknowledgements

This study was supported by the European co-funded, 6th FW, Integrated project SENSATION (IST, 507231), Finnish Work Environment Fund and National Technology Agency of Finland. We would like to thank Outi Fischer, Hannele Huhta, Seija Karas, Hannele Kataja, Nina Lapveteläinen, Mari Marjamäki, Johanna Parikka, Teppo Valtonen, Riitta Velin and students from the medical faculty of Helsinki University for invaluable assistance in recruiting and screening the participants, and in running the experiments. We would also like to thank Jaana Hiltunen, Kati Hirvonen, Anu Holm and Mika Letonsaari for technical help in this study. In addition, we received generous help and important comments from Risto Näsänen, PhD, and Ritva Akila, neuropsychologist.

Ancillary