Trait interindividual differences in the sleep physiology of healthy young adults



This article is corrected by:

  1. Errata: Corrigendum Volume 23, Issue 3, 361, Article first published online: 14 May 2014

Hans P. A. Van Dongen PhD, Sleep and Performance Research Center, Washington State University, Spokane, PO Box 1495, Spokane, WA 99210-1495, USA. Tel.: +1-509-358-7755; fax: +1-509-358-7810; e-mail:


Despite decades of sleep research by means of polysomnography (PSG), systematic interindividual differences in PSG-assessed sleep parameters have been scarcely investigated. The present study is the first to quantify interindividual variability in standard PSG-assessed variables of sleep structure in terms of stability and robustness as well as magnitude. Twenty-one carefully screened healthy young adults were studied continuously in a strictly controlled laboratory environment, where their PSGs were recorded for eight nights interspersed with three separate 36 h sleep deprivation periods. All PSG records were scored blind to subject and condition, using conventional criteria, and delta power in the non-REM sleep EEG was computed for four electrode derivations. Interindividual differences in sleep variables were examined for stability and robustness, respectively, by comparing results across equivalent nights (e.g. baseline nights) and across experimentally differentiated nights (baseline nights versus recovery nights following sleep deprivation). Among 18 sleep variables analyzed, all except slow-wave sleep (SWS) latency were found to exhibit significantly stable and robust – i.e. trait-like – interindividual differences. This was quantified by means of intraclass correlation coefficients (ICCs), which ranged from 36% to 89% across physiologic variables, and were highest for SWS (73%) and delta power in the non-REM sleep EEG (78–89%). The magnitude of the trait interindividual differences was considerable, consistently exceeding the magnitude of the group-average effect on sleep structure of 36 h total sleep deprivation. Notably, for non-REM delta power – a putative marker of sleep homeostasis – the interindividual differences were from 9.9 to 12.8 times greater than the group-average increase following sleep deprivation relative to baseline. 
Physiologic sleep variables did not vary among subjects in a completely independent manner – 61.1% of their combined variance clustered in three trait dimensions, which appeared to represent sleep duration, sleep intensity, and sleep discontinuity. Any independent functional significance of these sleep physiologic phenotypes remains to be determined.


Interindividual differences in normal, non-pathologic sleep have sparked interest with recent reports implicating their role in a variety of health outcomes (Ayas et al., 2003a,b; Gottlieb et al., 2006; Hasler et al., 2004; Verkasalo et al., 2005). Yet, despite the availability of standardized scoring criteria for human sleep for almost 40 years (Rechtschaffen and Kales, 1968), there is a scarcity of in-depth investigations of interindividual differences in normal sleep physiology (Van Dongen et al., 2005). According to the 2003 National Sleep Disorders Research Plan published in the USA, there is a need for ‘an assessment of normal human sleep phenotypes and the normal range of variation in this phenotype in adults and children’ (National Center on Sleep Disorders Research, 2003, p. 5). This paper aims to help meet this need, by reporting the magnitude of variability among normal young adults in a laboratory study designed specifically to examine trait interindividual differences in sleep physiology.

Interindividual differences represent a trait if they are significant, stable over time, and robust to experimental challenges (Van Dongen et al., 2005) – or if genetic influences can be demonstrated, as in the case of twin studies. Much of the currently available evidence about trait interindividual differences in sleep architecture comes from a series of twin studies conducted by Linkowski and colleagues (reviewed in Linkowski, 1999). These studies involved a total of 21 monozygotic and 20 dizygotic healthy male twin pairs (ages 16–35 years) recorded by means of polysomnography (PSG) for three to four nights. Non-REM sleep (stages 2 through 4) was consistently found to be influenced by genetic factors. Mixed evidence of genetic control was found for wakefulness after sleep onset (WASO) and REM density, and the findings for stage 1 and REM sleep amounts were inconclusive. Negative results were found for total sleep time (TST), sleep efficiency (SE), sleep latency, REM latency, and number of sleep stage transitions. None of Linkowski’s twin studies, nor earlier pilot studies of twins (Hori, 1986; Webb and Campbell, 1983; Zung and Wilson, 1966), reported the actual magnitude of interindividual variability in sleep variables.

Merica and Gaillard (1985), following up on the pioneering work of Webb and Agnew (1968), conducted the first comprehensive investigation of interindividual differences in PSG-recorded sleep variables using non-twins. They were also the first to use the intraclass correlation coefficient (ICC) to quantify the stability of interindividual differences in sleep variables over time. However, their work involved repeated measures across different experiments, which may have resulted in increased error variance within individuals. Indeed, despite including as many as 147 subjects, Merica and Gaillard (1985) found evidence of systematic interindividual differences exceeding intraindividual error variance for stage 4 sleep only. This sole positive result reported by them is in sharp contrast with the range of findings in the twin study results, and highlights the importance of strictly controlling sources of variance when studying interindividual differences (Van Dongen et al., 2004b).

Feinberg and colleagues examined the night-to-night stability of the spectrum of the EEG in non-REM and REM sleep (Feinberg et al., 1978, 1980; Palagini et al., 2000; Tan et al., 2000, 2001). In their more recent studies, they recorded 16 healthy young adults for five consecutive baseline nights (Tan et al., 2000) and 19 healthy young adults for four non-consecutive baseline nights (Tan et al., 2001). Various frequency bands were extracted from the spectrum of the sleep EEG, including the delta band [which captures slow-wave sleep (SWS)] and the sigma band (which contains sleep spindles). It was observed that the various frequency bands as well as the shape of the entire spectrum were highly stable within individuals for both non-REM and REM sleep. In a further report (Palagini et al., 2000), it was also found that the stability within individuals for power in the delta and sigma bands was robust to administration of GABAergic agents. Although the statistical approach (based on correlation matrices) did not allow for quantitative conclusions, Feinberg and colleagues pointed out that these findings imply the existence of trait-like interindividual differences in the sleep EEG (Tan et al., 2001).

Achermann and colleagues compared the EEG of baseline sleep to that of recovery sleep after 40 h of total sleep deprivation (Finelli et al., 2001; Tinguely et al., 2006). In eight healthy adult males, evidence was found for robust interindividual differences (‘fingerprints’), preserved from baseline to recovery sleep, across frequency bands in the spectrum of the non-REM and REM sleep EEGs (and even the waking EEG during sleep deprivation) – as well as their topographical distribution over the scalp. The magnitude of interindividual variability in the non-REM sleep EEG spectrum appeared to exceed the magnitude of the average response to sleep deprivation (Finelli et al., 2001), although quantitative evidence was not reported. Achermann and colleagues normalized their data, which may have affected the results for interindividual variability. Furthermore, for their analyses they used a Manhattan distance technique, which was not developed or validated for the analysis of interindividual differences in repeated-measures designs. Thus, there is some uncertainty in the interpretation of the results.

De Gennaro et al. (2005) also compared interindividual differences in the spectrum of the non-REM sleep EEG, focusing specifically on the sleep spindle range, across nocturnal PSG recordings made under diverse experimental conditions. For ten healthy adult males, De Gennaro and colleagues recorded baseline sleep and four sleep periods each interrupted for psychophysiologic testing and affected by various degrees of selective SWS deprivation and recovery thereof. In line with earlier work by Werth et al. (1997) indicating that sleep spindle frequencies are a person-specific characteristic, De Gennaro and colleagues found that interindividual differences in the shape of the spectrum were robust across the experimental conditions. However, outcome variables were again normalized, which may have confounded the results.

Finally, Buckelmüller et al. (2006) recorded eight healthy adult males over two pairs of baseline nights separated by 1 month. Variability was greater between subjects than within subjects, implying systematic interindividual differences, for sleep stages 2 through 4 and REM sleep, and for parts of the EEG spectrum during non-REM sleep as well as REM sleep. A Euclidean distance technique was employed which, like the Manhattan distance technique, was not designed for use in repeated-measures designs, leaving questions regarding the statistical validity of the results. Also, results were not reported for other variables commonly derived from PSG. Thus, the full extent of interindividual differences in standard sleep parameters was not revealed.

Although the studies discussed above provide converging evidence for trait-like interindividual differences in the physiology of sleep, no studies to date have quantified the magnitude of systematic interindividual variability in sleep parameters. Using appropriate experimental and statistical techniques, we report here the quantitative results of the first experimental study to simultaneously assess all three criteria needed to fully establish trait interindividual differences – i.e. their magnitude, stability, and robustness – for standard PSG-assessed variables of sleep physiology, in healthy young men and women studied under a high degree of experimental control.



A total of N = 21 subjects completed the experiment. They were 10 males and 11 females, ranging in age from 22 to 40 years (mean ± SD: 29.4 ± 5.3 years). Eleven subjects reported being African-American, nine Caucasian, and one Asian. All subjects were physically and psychologically healthy and free of traces of drugs (other than contraceptives), as assessed by physical examination, blood chemistry, urine analysis, and history. Subjects reported being good sleepers, habitually sleeping between 7 and 9 h per night and regularly getting up between 06:30 and 08:30 hours. They were instructed to maintain their habitual sleep/wake pattern and not to take any naps, as monitored by actigraphy (Actiwatch-L; Mini Mitter, Bend, OR, USA) and diary, and to abstain from caffeine, tobacco, alcohol, and drugs during the 7 days before the experiment. The study was approved by the Institutional Review Board (IRB) of the University of Pennsylvania, and all subjects gave written informed consent.

Experimental design

Subjects spent 11 consecutive days in a controlled laboratory environment in the General Clinical Research Center of the Hospital of the University of Pennsylvania. The laboratory experimental protocol is illustrated in Fig. 1.

Figure 1.

 Schematic of the 11-day laboratory experimental protocol. Gray areas represent nighttime sleep opportunities; white areas represent periods of scheduled wakefulness. Bottom tic marks indicate midnight (long) and noon (short). The experiment began with a 12 h adaptation night (A). Subsequently, there were three iterations (labeled 1 through 3) of the following pattern: 12 h scheduled wakefulness (W); 12 h TIB for baseline sleep (B); 36 h total sleep deprivation (SD); and 12 h TIB for recovery sleep (R). The experiment ended with an additional 12 h wakefulness period (W) and a 12 h predeparture sleep period (P). Every 12 h wakefulness period began at 10:00 hours and ended at 22:00 hours; and every 36 h sleep deprivation period began at 10:00 hours and ended at 22:00 hours the next day. Each of the eight scheduled 12 h sleep periods began at 22:00 hours and ended at 10:00 hours, and was recorded polysomnographically.

Upon entering the laboratory in the afternoon, subjects practiced a neurobehavioral performance task (NPT) battery adapted from an earlier study (for details, see Van Dongen et al., 2004a). Subjects went to bed at 22:00 hours for a 12 h adaptation sleep period. At 10:00 hours the next morning, they were awakened to begin a 12 h scheduled waking period, during which they performed a low workload (0.2 h) version of the NPT battery every 2 h. They went to bed again at 22:00 hours for a 12 h baseline sleep period. At 10:00 hours the next morning, subjects were awakened to begin a 36 h sleep deprivation period. Every 2 h they performed either a moderate workload (0.5 h) or a high workload (1.0 h) version of the NPT battery. At 22:00 hours at the end of the 36 h sleep deprivation, subjects went to bed for a 12 h recovery sleep opportunity.

The pattern of 36 h sleep deprivation with moderate or high workload, preceded by a 12 h waking period with low workload and a 12 h baseline sleep period and followed by a 12 h recovery sleep period, was repeated three times (see Fig. 1). One of the three sleep deprivation periods involved the high workload (1.0 h) version of the NPT battery; the other two sleep deprivations involved the moderate workload (0.5 h) version of the battery. These conditions occurred within subjects, in randomized, counterbalanced order. This experimental design was driven by the objective to study interindividual differences and workload effects on neurobehavioral performance and sleep variables in a single experiment – here the emphasis is on the interindividual differences in sleep variables only.

At 10:00 hours after the third occurrence of a recovery night following 36 h sleep deprivation, another 12 h period of scheduled wakefulness with low workload began. Subjects went to bed again at 22:00 hours for a final 12 h recovery sleep period. At 10:00 hours after this predeparture sleep period, subjects performed the low workload (0.2 h) NPT battery one final time, and went home.
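The timing of the protocol can be checked with a short sketch that lays the blocks out on a running clock. The block labels follow Fig. 1; the function name and the tuple layout are illustrative assumptions, not part of the study materials.

```python
from typing import List, Tuple

Block = Tuple[str, int, int, bool]  # (label, start hour, end hour, is_sleep)


def build_protocol(start_clock: int = 22) -> List[Block]:
    """Lay out the protocol blocks on a running clock (hours since midnight of day 1)."""
    blocks = [("A", 12, True)]                       # adaptation night
    for _ in range(3):                               # three iterations of W, B, SD, R
        blocks += [("W", 12, False), ("B", 12, True),
                   ("SD", 36, False), ("R", 12, True)]
    blocks += [("W", 12, False), ("P", 12, True)]    # final wake day, predeparture night
    schedule: List[Block] = []
    t = start_clock
    for label, duration, is_sleep in blocks:
        schedule.append((label, t, t + duration, is_sleep))
        t += duration
    return schedule


schedule = build_protocol()
for label, start, end, is_sleep in schedule:
    kind = "sleep" if is_sleep else "wake"
    print(f"{label:>2} ({kind:5}): {start % 24:02d}:00 to {end % 24:02d}:00, {end - start} h")
```

Under these assumptions, every one of the eight sleep opportunities runs from 22:00 to 10:00 hours, and the scheduled blocks span 252 h from the first bedtime to the final awakening.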

Throughout the experiment, the conditions in the laboratory were controlled in terms of environmental circumstances and scheduled activities. Ambient temperature was maintained at 21 ± 1 °C. Light exposure was fixed at <50 lux during scheduled wakefulness, and <1 lux (darkness) during scheduled sleep. Subjects were behaviorally monitored by trained staff members continuously. Between performance test bouts, subjects were allowed only non-vigorous activities. They had no interactions with people outside the laboratory. Standardized meals were given at 11:00, 15:00, and 19:00 hours each day, and also at 23:00, 03:00, and 07:00 hours during sleep deprivation. Food was controlled in terms of calories and nutrients (proteins, fats, and carbohydrates). The amount of food subjects received during the 36 h sleep deprivation periods matched their normal 2 day caloric requirement based on height and weight. Caffeine, alcohol, and tobacco were prohibited.


All eight scheduled sleep periods (see Fig. 1) were PSG-recorded with digital equipment (Vitaport 3; TEMEC Instruments, Kerkrade, the Netherlands), using a sampling rate of 128 Hz. The PSG montage included frontal (Fz), central (C3, C4), and occipital (Oz) EEG (referenced against A1/A2), bilateral EOG, submental EMG, and ECG. The A1 and A2 leads were bridged to reduce the impact of possible mastoid electrode detachment. The bipolar EEG/EOG/EMG signals were conditioned with analog low-pass filters (30 Hz cutoff frequency for the EEG and EOG; 70 Hz cutoff frequency for the EMG) and high-pass filters (1.0 s time constant for the EEG; 3.0 s time constant for the EOG; 0.15 s time constant for the EMG). Electrode impedances were checked and logged at the beginning of each PSG recording. All records were scored visually in 30 s epochs using conventional criteria (Rechtschaffen and Kales, 1968), by a single trained technician supervised by a registered polysomnographic technologist. Sleep scoring occurred blind to which subjects and which nights were associated with the records. Equipment problems resulted in the loss of 14% of the PSG records. A total of 144 records were available for analysis; none of these suggested the presence of any sleep disorders.

The following sleep variables were extracted from the PSG records: time in bed (TIB), TST, SE (defined as TST over TIB), durations of sleep stages 1 and 2 (S1 and S2, respectively), duration of SWS (stages 3 and 4 combined), duration of rapid eye movement (REM) sleep, duration of movement time (MT), sleep latency to stages 1 and 2 (SL1 and SL2, respectively), SWS latency and REM latency (SWSL and REML, respectively, as measured from stage 1 sleep onset), WASO (defined as any wakefulness between sleep onset and scheduled rising time), and the number of completed non-REM/REM sleep cycles (#Cycles). In addition, a transition index (TI) was calculated as the average number of sleep stage transitions per hour.
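As a minimal sketch of how several of these variables follow from a scored record, the example below derives them from a list of 30 s epoch labels. The label coding ("W", "1" to "4", "R", "M"), the function names, and the choice to express TI per hour of TIB are assumptions made for illustration; the scoring software used in the study may define details differently.

```python
from typing import Dict, List, Optional, Set

EPOCH_H = 30 / 3600                  # one 30 s scoring epoch, in hours
SLEEP = {"1", "2", "3", "4", "R"}    # assumed labels; "W" = wake, "M" = movement time


def first_epoch(stages: List[str], targets: Set[str], start: int = 0) -> Optional[int]:
    """Index of the first epoch at or after `start` whose label is in `targets`."""
    for i in range(start, len(stages)):
        if stages[i] in targets:
            return i
    return None


def sleep_summary(stages: List[str]) -> Dict[str, Optional[float]]:
    """Compute a subset of the sleep variables for one scored night."""
    tib = len(stages) * EPOCH_H
    tst = sum(s in SLEEP for s in stages) * EPOCH_H   # MT and wake are excluded
    onset = first_epoch(stages, {"1"})                # stage 1 sleep onset
    if onset is None:
        raise ValueError("no stage 1 epoch found")

    def latency(targets: Set[str], origin: int) -> Optional[float]:
        idx = first_epoch(stages, targets, origin)
        return None if idx is None else (idx - origin) * EPOCH_H

    # Assumption: every change of label, including to and from wake, is a transition
    transitions = sum(a != b for a, b in zip(stages, stages[1:]))
    return {
        "TIB": tib,
        "TST": tst,
        "SE": 100 * tst / tib,
        "SL1": onset * EPOCH_H,
        "SL2": latency({"2"}, 0),
        "SWSL": latency({"3", "4"}, onset),    # measured from stage 1 onset
        "REML": latency({"R"}, onset),         # measured from stage 1 onset
        "WASO": sum(s == "W" for s in stages[onset:]) * EPOCH_H,
        "TI": transitions / tib,
    }


# Toy record of 24 epochs (12 min), for illustration only
night = ["W"] * 4 + ["1"] * 2 + ["2"] * 6 + ["3"] * 4 + ["W"] * 2 + ["R"] * 6
summary = sleep_summary(night)
print(summary)
```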

After combined automated and visual artifact rejection, spectral analysis of the sleep EEG was performed with Fast Fourier Transform (FFT) on 4 s subepochs, cosine tapered and averaged with 1 s overlap within every 30 s epoch (Vitascore; TEMEC Instruments). Grand average peak-to-peak power in the delta band (0.75–4.50 Hz) was calculated across all epochs of non-REM sleep (stages 2 through 4), weighted for the number of artifact-free subepochs in each epoch. This procedure was applied to each of the four EEG derivations. For Fz and Oz, delta power results were available for 136 records; for C3 and C4, they were available for 135 records.
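The subepoch procedure can be illustrated with a dependency-free sketch. It assumes a Hann window as the cosine taper and a conventional one-sided power scaling, neither of which is confirmed by the description of the Vitascore software, so absolute values would not be comparable with those reported here. A direct discrete Fourier transform at the delta bins stands in for the FFT to keep the example self-contained.

```python
import math
from typing import Sequence

FS = 128            # sampling rate (Hz), as reported for the recordings
WIN = 4 * FS        # 4 s subepoch
STEP = 3 * FS       # consecutive subepochs overlap by 1 s

# Hann window, one common cosine taper (the exact taper used is an assumption)
TAPER = [0.5 - 0.5 * math.cos(2 * math.pi * n / (WIN - 1)) for n in range(WIN)]
TAPER_ENERGY = sum(w * w for w in TAPER)


def epoch_delta_power(epoch: Sequence[float], lo: float = 0.75, hi: float = 4.50) -> float:
    """Average delta-band power across the tapered 4 s subepochs of one 30 s epoch."""
    df = FS / WIN                                    # 0.25 Hz frequency resolution
    bins = [k for k in range(1, WIN // 2) if lo <= k * df <= hi]
    powers = []
    for start in range(0, len(epoch) - WIN + 1, STEP):
        seg = [x * w for x, w in zip(epoch[start:start + WIN], TAPER)]
        band = 0.0
        for k in bins:                               # direct DFT at the delta bins only
            re = sum(v * math.cos(2 * math.pi * k * n / WIN) for n, v in enumerate(seg))
            im = sum(v * math.sin(2 * math.pi * k * n / WIN) for n, v in enumerate(seg))
            band += 2 * (re * re + im * im) / (TAPER_ENERGY * WIN)
        powers.append(band)
    return sum(powers) / len(powers)


def night_delta_power(epoch_powers: Sequence[float], clean_subepochs: Sequence[int]) -> float:
    """Grand average across epochs, weighted by the number of artifact-free subepochs."""
    return sum(p * n for p, n in zip(epoch_powers, clean_subepochs)) / sum(clean_subepochs)


# Synthetic check: a unit-amplitude 2 Hz sine lies in the delta band, a 10 Hz sine does not
t = [n / FS for n in range(30 * FS)]
slow = epoch_delta_power([math.sin(2 * math.pi * 2 * s) for s in t])
fast = epoch_delta_power([math.sin(2 * math.pi * 10 * s) for s in t])
print(slow, fast)
```

With this scaling, the 2 Hz test signal yields a band power close to half its squared amplitude, whereas the 10 Hz signal contributes almost nothing to the delta band.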

Statistical analyses

For the analysis of systematic interindividual variability, methodology to separate between-subjects variance (interindividual variability) from within-subjects variance (intraindividual variability or error variance) is needed (Van Dongen et al., 2004b). To this end, statistical analyses employed mixed-effects analysis of variance (ANOVA) using SAS 9.1 (SAS Institute, Cary, NC, USA). The main analysis combined all eight PSG-recorded nights in a single mixed-effects ANOVA, controlling for the effects of sleep deprivation (moderate and high workload), for the adaptation and predeparture nights, and for any order effects. As there were no statistically significant differences between moderate and high workload in this analysis for any of the sleep variables after sleep deprivation, the workload conditions were not differentiated for the results reported here. A random effect on the intercept served to estimate systematic interindividual differences across the eight nights. This allowed calculation of the between-subjects and within-subjects variances. The between-subjects variance was tested against zero using a Wald Z-test.
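The separation of variance components can be illustrated with a deliberately simplified stand-in for the mixed-effects model: a balanced one-way random-effects ANOVA estimated by the method of moments. This is not the SAS analysis used in the study; it ignores the fixed effects for night type and order, requires the same number of nights per subject, and uses hypothetical subject labels and data.

```python
from statistics import mean
from typing import Dict, List, Tuple


def variance_components(data: Dict[str, List[float]]) -> Tuple[float, float]:
    """Method-of-moments estimates for a balanced one-way random-effects model.

    Returns (between-subjects variance, within-subjects variance).
    """
    groups = list(data.values())
    k = len(groups)                   # number of subjects
    n = len(groups[0])                # nights per subject
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch requires balanced data")
    grand = mean(x for g in groups for x in g)
    ms_between = n * sum((mean(g) - grand) ** 2 for g in groups) / (k - 1)
    ms_within = sum((x - mean(g)) ** 2 for g in groups for x in g) / (k * (n - 1))
    # A negative estimate is truncated at the lower boundary of zero
    var_bs = max(0.0, (ms_between - ms_within) / n)
    return var_bs, ms_within


# Hypothetical total sleep times (h) for three subjects over three nights
tst = {"s1": [7.0, 7.2, 6.8], "s2": [9.0, 9.2, 8.8], "s3": [8.0, 8.2, 7.8]}
var_bs, var_ws = variance_components(tst)
print(var_bs, var_ws)
```

In this toy data set the subjects differ far more from each other than from night to night within themselves, so nearly all variance is attributed to the between-subjects component.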

The stability of interindividual differences was quantified with the ICC, which was calculated as the between-subjects variance divided by the sum of the between- and within-subjects variances (Van Dongen et al., 2004b). A 95% confidence interval for the ICC was calculated using the exact method prescribed by Rao (1997). ICC estimates were interpreted using published benchmark ranges (Landis and Koch, 1977) of ‘slight’ (0.0–0.2), ‘fair’ (0.2–0.4), ‘moderate’ (0.4–0.6), ‘substantial’ (0.6–0.8), and ‘almost perfect’ (0.8–1.0).
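A short sketch of the ICC calculation and the benchmark labels follows. The function names are illustrative, and the handling of boundary values (upper limits treated as inclusive) is an assumption, because the published ranges share their end points. The variance components are taken from Table 2; the confidence interval is not computed here.

```python
def icc(var_bs: float, var_ws: float) -> float:
    """Between-subjects variance as a proportion of total variance."""
    return var_bs / (var_bs + var_ws)


def landis_koch(value: float) -> str:
    """Benchmark label for an ICC value (upper limits inclusive, by assumption)."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("ICC expected in the range 0 to 1")
    bounds = [(0.2, "slight"), (0.4, "fair"), (0.6, "moderate"),
              (0.8, "substantial"), (1.0, "almost perfect")]
    for upper, label in bounds:
        if value <= upper:
            return label
    return "almost perfect"


# (VARbs, VARws) as listed in Table 2
components = {"TST": (0.49, 0.57), "S2": (0.46, 0.36), "delta C4": (21367.0, 2555.0)}
for name, (bs, ws) in components.items():
    value = icc(bs, ws)
    print(f"{name}: ICC = {value:.2f} ({landis_koch(value)})")
```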

The between-subjects standard deviation (SDbs) was calculated as the square root of the between-subjects variance (and the standard error was computed using the delta method). The size of the 95% ‘confidence’ interval, called ‘reference’ interval (RI) in this case,* for systematic interindividual variability was estimated by multiplying the SDbs by 3.92 (representing 1.96 times the standard deviation on both sides of the mean to capture the 95% range between the 2.5 and 97.5 centiles). For comparison purposes, the magnitude of the main effect of sleep deprivation in the mixed-effects ANOVA was computed. This provided an estimate of the size of the group-average effect of 36 h sleep deprivation, i.e. the average difference between the 12 h recovery nights and the 12 h baseline nights. It was tested against zero using a one-sample t-test (Van Dongen et al., 2004c).
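These calculations can be sketched as follows, using the frontal and C3 delta power estimates from Tables 1 and 2 as inputs. The function name is illustrative, and the first-order delta-method formula, SE(SD) ≈ SE(VAR) / (2 × SD), is assumed to be the variant intended.

```python
import math
from typing import Tuple


def between_subjects_span(var_bs: float, se_var_bs: float) -> Tuple[float, float, float]:
    """Return (SDbs, standard error of SDbs, 95% reference interval)."""
    sd = math.sqrt(var_bs)
    se_sd = se_var_bs / (2 * sd)      # first-order delta method for the square root
    return sd, se_sd, 3.92 * sd       # 1.96 SD on either side of the mean


# Delta power at Fz: VARbs ± SE from Table 2, sleep deprivation effect from Table 1
sd_fz, se_fz, ri_fz = between_subjects_span(291400.0, 99695.0)
ratio_fz = ri_fz / 213.0

# Delta power at C3
sd_c3, se_c3, ri_c3 = between_subjects_span(24714.0, 8719.0)
ratio_c3 = ri_c3 / 48.0

print(round(sd_fz), round(se_fz), round(ri_fz), round(ratio_fz, 1))
print(round(sd_c3), round(se_c3), round(ri_c3), round(ratio_c3, 1))
```

The resulting ratios of about 9.9 (Fz) and 12.8 (C3) correspond to the range of ratios quoted for non-REM delta power in the summary.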

The main analysis described above was repeated with age and gender as covariates, to estimate the proportion of interindividual variability captured by these demographics. Likewise, the analysis was repeated with ethnicity as a covariate (excluding the one Asian subject, as a minimum of two subjects per ethnic group was needed for this analysis). For delta power, the main analysis was also repeated with electrode impedance as a covariate, to ascertain that systematic interindividual differences in these measures were not merely due to irrelevant recording differences.

Finally, the raw data of the 18 sleep variables (a total of 2558 data points) were subjected to principal components analysis (McPherson, 2001) to examine which variables covaried across the eight nights and the 21 subjects in order to reduce the dimensionality of the data set. The scree plot was inspected to determine how many dimensions to retain. To help simplify the relationships between the retained dimensions and the original sleep variables, an orthogonal variance maximizing (varimax) rotation was performed (Cooper, 1983). For a provisional interpretation of the rotated dimensions (recognizing that generalizability of the interpretation beyond the present study could not be confirmed), sleep variables with absolute factor loadings greater than a predetermined level of 0.5 were considered. For every rotated dimension, a new ‘composite’ sleep variable was constructed from the factor scores. Each of the three resulting composite variables was then analyzed using mixed-effects ANOVAs as described in the paragraphs above.


The results of the main analysis combining all eight PSG-recorded nights in a single mixed-effects ANOVA are shown in Tables 1 and 2 and Fig. 2. Table 1 displays the group averages for the 18 sleep variables for the baseline nights and for the recovery nights following 36 h sleep deprivation, as well as the group-average differences between the recovery and baseline nights. For all sleep variables, these differences were in the direction of subjects getting more and deeper sleep during the recovery nights, and they were statistically significant for every variable except MT and REML (see Table 1).

Table 1.   Sleep variables at baseline and during recovery after 36 h total sleep deprivation

Variable            Baseline       Recovery       Difference      t        P
TIB (h)             12.00          12.00          0.00
TST (h)             9.09 ± 0.19    10.89 ± 0.18   1.80 ± 0.15     12.24    <0.001
SE (%)              75.7 ± 1.5     90.4 ± 1.5     14.7 ± 1.2      12.00    <0.001
S1 (h)              0.76 ± 0.06    0.53 ± 0.06    −0.23 ± 0.05    −4.36    <0.001
S2 (h)              5.15 ± 0.17    5.88 ± 0.17    0.73 ± 0.12     6.24     <0.001
SWS (h)             1.15 ± 0.12    1.82 ± 0.12    0.68 ± 0.06     10.87    <0.001
REM (h)             2.03 ± 0.09    2.67 ± 0.09    0.64 ± 0.07     8.92     <0.001
MT (h)              0.19 ± 0.01    0.19 ± 0.01    0.00 ± 0.01     −0.36    0.722
WASO (h)            1.66 ± 0.20    0.77 ± 0.20    −0.89 ± 0.13    −6.83    <0.001
SL1 (h)             1.01 ± 0.13    0.16 ± 0.13    −0.86 ± 0.12    −7.36    <0.001
SL2 (h)             1.15 ± 0.13    0.19 ± 0.13    −0.97 ± 0.12    −8.31    <0.001
SWSL (h)            0.44 ± 0.05    0.17 ± 0.05    −0.27 ± 0.07    −4.00    <0.001
REML (h)            1.24 ± 0.10    1.19 ± 0.10    −0.05 ± 0.09    −0.60    0.548
#Cycles             5.8 ± 0.2      6.8 ± 0.2      1.0 ± 0.2       5.72     <0.001
TI (h⁻¹)            17.2 ± 0.8     14.5 ± 0.8     −2.7 ± 0.6      −4.43    <0.001
ΔFz (10⁻¹² V²)*     901 ± 122      1114 ± 121     213 ± 41        5.14     <0.001
ΔC3 (10⁻¹² V²)*     264 ± 36       312 ± 36       48 ± 17         2.85     0.005
ΔC4 (10⁻¹² V²)*     241 ± 33       287 ± 33       46 ± 10         4.49     <0.001
ΔOz (10⁻¹² V²)*     1128 ± 121     1325 ± 121     197 ± 39        5.06     <0.001

  1. Group averages ± standard errors are displayed, as well as a statistical significance test for the difference between recovery and baseline sleep.

  2. *Δ indicates delta power in the non-REM sleep EEG at a specified electrode derivation. Note that the absolute values displayed here are specific for the methodology used to record and calculate delta power. There is no generally accepted standard for such methodology, which should be kept in mind when making direct comparisons with absolute results from other studies.

  3. t-test with 113 degrees of freedom; 105 for ΔFz and ΔOz; 104 for ΔC3 and ΔC4.

Table 2.   Interindividual variabilities, between- and within-subjects variance components, and intraclass correlation coefficients

Variable           SDbs          95% RI   Z      P-value   VARbs               VARws            ICC
TST (h)            0.70 ± 0.14   2.73     2.53   0.006     0.49 ± 0.19         0.57 ± 0.08      0.46 (0.29–0.66)
SE (%)             5.8 ± …       …        …      …         … ± 13.4            39.8 ± 5.3       0.46 (0.29–0.66)
S1 (h)             0.21 ± 0.05   0.81     2.19   0.014     0.04 ± 0.02         0.08 ± 0.01      0.36 (0.20–0.57)
S2 (h)             0.68 ± 0.13   2.65     2.63   0.004     0.46 ± 0.17         0.36 ± 0.05      0.56 (0.39–0.74)
SWS (h)            0.53 ± …      …        …      …         … ± 0.10            0.10 ± 0.01      0.73 (0.60–0.86)
REM (h)            0.35 ± 0.07   1.38     2.55   0.005     0.12 ± 0.05         0.13 ± 0.02      0.48 (0.31–0.68)
MT (h)             0.05 ± …      …        …      …         … ± 0.001           0.003 ± 0.001    0.44 (0.27–0.64)
WASO (h)           0.80 ± …      …        …      …         … ± 0.24            0.45 ± 0.06      0.59 (0.43–0.76)
SL1 (h)            0.48 ± 0.11   1.90     2.30   0.011     0.23 ± 0.10         0.36 ± 0.05      0.40 (0.23–0.61)
SL2 (h)            0.48 ± 0.10   1.87     2.29   0.011     0.23 ± 0.10         0.36 ± 0.05      0.39 (0.23–0.60)
SWSL (h)*                                        >0.99                         0.12 ± 0.01
REML (h)           0.37 ± 0.08   1.46     2.45   0.007     0.14 ± 0.06         0.20 ± 0.03      0.40 (0.24–0.62)
#Cycles            0.7 ± …       …        …      …         … ± 0.18            0.75 ± 0.10      0.36 (0.21–0.58)
TI (h⁻¹)           2.9 ± 0.6     11.3     2.56   0.005     8.3 ± 3.3           10.1 ± 1.3       0.45 (0.29–0.66)
ΔFz (10⁻¹² V²)     540 ± 92      2116     2.92   0.002     291 400 ± 99 695    42 025 ± 5802    0.87 (0.79–0.94)
ΔC3 (10⁻¹² V²)     157 ± 28      616      2.83   0.002     24 714 ± 8719       6900 ± 958       0.78 (0.66–0.89)
ΔC4 (10⁻¹² V²)     146 ± 25      573      2.93   0.002     21 367 ± 7298       2555 ± 355       0.89 (0.82–0.95)
ΔOz (10⁻¹² V²)     538 ± 92      2108     2.93   0.002     289 188 ± 98 819    37 055 ± 5117    0.89 (0.81–0.94)

  1. The magnitude of systematic interindividual differences across the eight PSG-recorded nights is shown as the between-subjects standard deviation (SDbs) ± standard error. The SDbs was multiplied by 3.92 to calculate the 95% reference interval (RI) of between-subjects variability, which represents the span of interindividual differences. The statistical significance of the systematic interindividual differences is also shown (Wald Z-test of between-subjects variance). Between- and within-subjects variance components (VARbs and VARws, respectively) ± standard error, controlling for the effects of sleep deprivation in the recovery nights, are displayed next. The stability of the interindividual differences is indicated as the intraclass correlation coefficient (ICC), with numbers in parentheses reflecting the lower and upper limits of the 95% confidence interval.

  2. *For SWSL, between-subjects variance regressed to its lower boundary of zero and was not statistically significant.

  3. Δ indicates delta power in the non-REM sleep EEG at a specified electrode derivation.

Figure 2.

 Magnitude of stable and robust interindividual differences versus magnitude of group-average response to 36 h total sleep deprivation. The black bars indicate the span of systematic interindividual differences across the eight nights (i.e. 95% reference interval of between-subjects variability). The white bars display the group-average effect of sleep deprivation (i.e. the absolute difference between recovery sleep and baseline sleep). Results are shown for total sleep time (TST), stage 2 sleep (S2), slow-wave sleep (SWS), REM sleep, sleep latency to stage 2 (SL2), and delta power in the non-REM sleep EEG at the frontal derivation (Fz; right-hand ordinate). Error bars represent standard errors.

Table 2 quantifies the variance associated with systematic interindividual differences in sleep physiologic variables across the eight PSG-recorded nights, controlling for the effects of sleep deprivation in the recovery nights as shown in Table 1. The magnitude of the systematic interindividual differences is displayed as the between-subjects standard deviation (SDbs), as well as the 95% RI of between-subjects variability. The latter were found to be substantial (e.g. sleep stages varied among subjects by approximately 1–2.5 h). The Wald Z-tests and accompanying P-values in Table 2 indicate that the SDbs measures were statistically significant for 17 of the 18 sleep variables, even when controlling for multiple comparisons using the false discovery rate procedure (Curran-Everett, 2000). Among all sleep variables considered, only SWSL did not show statistically significant interindividual variability. For select sleep parameters, Fig. 2 illustrates the magnitude of the interindividual differences, by comparison with the size of the effect of 36 h sleep deprivation (i.e. the group-average difference between the 12 h recovery nights and the 12 h baseline nights, taken from Table 1). With the exception of SWSL, the magnitude of the interindividual differences was greater than the magnitude of the effect of sleep deprivation for every sleep variable analyzed.
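The false discovery rate step can be sketched with the Benjamini-Hochberg step-up rule, which is assumed here to correspond to the procedure cited. Only the P-values that are legible in Table 2 are used, with 0.99 as a stand-in for the SWSL entry listed as >0.99, so the example illustrates the logic and does not reproduce the full 18-variable analysis.

```python
from typing import Dict


def benjamini_hochberg(pvals: Dict[str, float], q: float = 0.05) -> Dict[str, bool]:
    """Flag which tests remain significant at false discovery rate q (step-up rule)."""
    ranked = sorted(pvals.items(), key=lambda item: item[1])
    m = len(ranked)
    cutoff = 0
    for i, (_, p) in enumerate(ranked, start=1):
        if p <= q * i / m:            # largest rank meeting its threshold wins
            cutoff = i
    return {name: rank <= cutoff for rank, (name, _) in enumerate(ranked, start=1)}


# P-values of the Wald Z-tests as legible in Table 2 (SWSL: stand-in for >0.99)
p_table2 = {
    "TST": 0.006, "S1": 0.014, "S2": 0.004, "REM": 0.005, "SL1": 0.011,
    "SL2": 0.011, "SWSL": 0.99, "REML": 0.007, "TI": 0.005,
    "dFz": 0.002, "dC3": 0.002, "dC4": 0.002, "dOz": 0.002,
}
significant = benjamini_hochberg(p_table2)
print([name for name, keep in significant.items() if not keep])
```

For this subset, every variable except SWSL remains significant after the correction, in line with the pattern described above.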

Table 2 (right-hand side) shows the estimates for between-subjects and within-subjects variance and the ICC values derived from them. The ICC values are a measure of the stability of the interindividual differences across the eight PSG-recorded nights. By experimental design, these nights involved both equivalent conditions (e.g. the three baseline nights) and variations (baseline versus recovery nights). As a consequence, the ICC values can be interpreted as evidence of both stability (across equivalent nights) and robustness (across varying experimental states). Using the benchmarks suggested by Landis and Koch (1977), the ICC values were ‘fair’ for S1, SL1, SL2, REML, and #Cycles; ‘moderate’ for TST, SE, S2, REM, MT, WASO, and TI; and ‘substantial’ or ‘almost perfect’ for SWS and for non-REM delta power at all four EEG derivations. Thus, interindividual differences in sleep physiology were at least fairly stable and robust, and particularly so for SWS and delta power.

After controlling for electrode impedance, more than 95% of the between-subjects variance in delta power at each of the four EEG derivations remained, and ICC values changed by no more than 1%. Indeed, electrode impedance was not statistically significant as a covariate for any of the four EEG derivations (F < 1.2, P > 0.27), confirming that the systematic interindividual differences observed in delta power were biologic in nature and not merely a recording artifact.

Repeating the main analysis with age and gender as covariates, age was found to contribute significantly to interindividual differences in delta power, reducing power at C4 (F = 4.0, P = 0.049) and Oz (F = 5.1, P = 0.026). A trend was found for C3 (F = 3.2, P = 0.075), but the effect of age at Fz did not reach statistical significance in our sample. Gender contributed significantly to interindividual differences in TST (F = 5.8, P = 0.018), SE (F = 6.1, P = 0.015), and S1 (F = 4.0, P = 0.048), with women on average getting more sleep overall but less S1 sleep than men. Across all sleep variables with significant interindividual variability (i.e. all variables considered except SWSL), age and gender differences together captured <30% of the between-subjects variance, leaving at least 70% (and in most cases more than 90%) of the systematic interindividual variability in physiologic measures of sleep unexplained.

Repeating this same analysis with ethnicity as a covariate (replacing age and gender), ethnicity was found to contribute significantly to the interindividual differences in S2 (F = 9.5, P = 0.003), REM (F = 6.1, P = 0.015), and delta power at C3 (F = 4.4, P = 0.038) and C4 (F = 6.3, P = 0.014). African-Americans on average exhibited more S2 sleep, less REM sleep, and less non-REM delta power than Caucasians. The effect of ethnicity captured 37.1% of the between-subjects variance for S2 sleep, and 43.6% of the between-subjects variance for REM sleep. Across all other sleep variables with significant interindividual variability, ethnicity captured <20% of the between-subjects variance.

Examination of the scree plot in principal components analysis of the 18 sleep variables exposed three main independent dimensions of sleep physiology. These dimensions were retained and subjected to orthogonal varimax rotation – the factor loadings are shown in Table 3. Of the overall variance in the data set, the first dimension captured 25.3%; the second dimension captured 24.1%; and the third dimension captured 11.7%. (For comparison, if all 18 sleep variables were completely independent of each other, each dimension would have been expected to capture 1/18, i.e. 5.6% of the variance.) Using absolute factor loadings >0.5 to interpret each of the retained dimensions, the first dimension appeared to be dominated by variables related to sleep duration (positive loadings of TST, SE, S2, #Cycles; negative loadings of WASO, SL1, SL2). The second dimension involved non-REM sleep intensity (positive loadings of SWS and delta power at all four EEG derivations). The third dimension seemed to be related to sleep discontinuity (positive loadings of TI, MT, S1; negative loading of REM). Neither SWSL nor REML loaded substantially on any of the dimensions retained.
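The comparison with chance in the paragraph above is simple arithmetic and can be checked directly. The following is a minimal sketch, not part of the original analysis; the names `captured`, `N_VARIABLES` and `chance_share` are illustrative, and the percentages are copied from the text.

```python
# Percent of overall variance captured by each retained dimension,
# as reported in the text.
captured = {"dimension_1": 25.3, "dimension_2": 24.1, "dimension_3": 11.7}

N_VARIABLES = 18  # number of sleep variables entered into the analysis


def chance_share(n_variables: int, n_dimensions: int = 1) -> float:
    """Percent of variance expected for n_dimensions components if all
    n_variables were completely independent (1/n_variables each)."""
    return 100.0 * n_dimensions / n_variables


total_captured = sum(captured.values())
per_dimension_chance = chance_share(N_VARIABLES)
three_dimension_chance = chance_share(N_VARIABLES, len(captured))

print(f"captured by the three dimensions: {total_captured:.1f}%")
print(f"expected by chance, one dimension: {per_dimension_chance:.1f}%")
print(f"expected by chance, three dimensions: {three_dimension_chance:.1f}%")
```

Rounded to one decimal, the three dimensions together capture 61.1% of the variance, against 5.6% per dimension (16.7% for three dimensions) under complete independence.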

Table 3. Results of principal components analysis.
Columns: Dimension 1 | Dimension 2 | Dimension 3
Notes: Factor loadings of the 18 sleep variables following varimax rotation are shown for the three dimensions retained in the analysis. Δ indicates delta power in the non-REM sleep EEG at a specified electrode derivation.


Composite variables constructed from the factor scores for the three dimensions all exhibited statistically significant between-subjects variance (Z ≥ 2.22, P ≤ 0.013) – even when controlling for multiple comparisons (Curran-Everett, 2000). ICC values, which can be interpreted here as the proportion of stable and robust between-subjects variance, were 0.40 (‘fair’) for the first dimension (sleep duration); 0.91 (‘almost perfect’) for the second dimension (non-REM sleep intensity); and 0.54 (‘moderate’) for the third dimension (sleep discontinuity).
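The descriptive labels attached to the ICC values above can be expressed as a simple lookup. The sketch below assumes that the labels follow the commonly used Landis and Koch benchmark ranges (upper bounds 0.20, 0.40, 0.60, 0.80, 1.00); these cut-offs are an assumption, as they are not restated in this section, and `BANDS` and `icc_label` are illustrative names.

```python
# Assumed benchmark ranges (upper bound inclusive) for describing ICC values.
BANDS = [
    (0.20, "slight"),
    (0.40, "fair"),
    (0.60, "moderate"),
    (0.80, "substantial"),
    (1.00, "almost perfect"),
]


def icc_label(icc: float) -> str:
    """Return the descriptive label for an ICC value between 0 and 1."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC must lie between 0 and 1")
    for upper_bound, label in BANDS:
        if icc <= upper_bound:
            return label
    raise AssertionError("unreachable: the last band ends at 1.0")


# ICC values of the three composite variables reported in the text.
for name, icc in [("sleep duration", 0.40),
                  ("non-REM sleep intensity", 0.91),
                  ("sleep discontinuity", 0.54)]:
    print(f"{name}: ICC = {icc:.2f} ({icc_label(icc)})")
```

Under these assumed cut-offs, the lookup reproduces the labels given in the text for all three dimensions.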

Age and gender differences did not significantly mediate the interindividual differences associated with any of the three composite variables, although for the second dimension (sleep intensity) a trend was found for a diminishing effect of age (F = 2.8, P = 0.10). Ethnicity only contributed significantly to the interindividual differences in the composite variable of the first dimension (sleep duration), with African-Americans on average scoring higher (representing more sleep) than Caucasians (F = 4.0, P = 0.049). This effect captured 15.1% of the between-subjects variance, leaving 84.9% of the trait interindividual variability associated with the composite variable of the first dimension unexplained.


This study involved repeated PSG recordings of sleep under controlled laboratory conditions before and after multiple exposures to total sleep deprivation. A main outcome was that almost all standard PSG-assessed sleep variables demonstrated considerable trait interindividual variability. Before discussing this interindividual variability further, the group-average results are briefly reviewed first. At 12 h TIB, average baseline sleep was similar to that seen on the third (last) day of a recently published study involving extended laboratory bed rest (Klerman and Dijk, 2005). Because TIB was 12 h, average baseline SE was low at 75.7%, but this represented 9.09 h TST – even as average sleep latency exceeded 1 h due to the early bedtime (22:00 hours). During recovery after 36 h total sleep deprivation, subjects slept an average of 1.8 h more than at baseline. As expected (see Bonnet, 2005), differences from baseline to recovery sleep were consistently in the direction of subjects getting more and deeper sleep, with SE increasing to 90.8% on average. Delta power in the non-REM sleep EEG increased by 12.1–23.5% depending on the EEG derivation (with the greatest increase occurring frontally; cf. Cajochen et al., 1999). The 12 h TIB period for recovery sleep allowed both SWS and REM sleep to increase relative to baseline.
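The group-average figures above are linked through the definition of sleep efficiency. As a rough check, the sketch below assumes SE = TST / TIB and uses the rounded percentages from the text; the variable and function names are illustrative.

```python
TIB_HOURS = 12.0  # time in bed for baseline and recovery nights

# Group-average sleep efficiency (percent), as reported in the text.
se_baseline = 75.7
se_recovery = 90.8


def tst_hours(se_percent: float, tib_hours: float) -> float:
    """Total sleep time in hours, assuming SE = TST / TIB."""
    return se_percent / 100.0 * tib_hours


tst_baseline = tst_hours(se_baseline, TIB_HOURS)
tst_recovery = tst_hours(se_recovery, TIB_HOURS)
extra_sleep = tst_recovery - tst_baseline

print(f"baseline TST: {tst_baseline:.2f} h")
print(f"recovery TST: {tst_recovery:.2f} h")
print(f"additional sleep during recovery: {extra_sleep:.1f} h")
```

Because the SE values are themselves rounded, the computed baseline TST can differ from the reported 9.09 h in the second decimal; the difference between recovery and baseline agrees with the reported 1.8 h.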

Despite the differences between baseline and recovery nights, systematic interindividual differences were seen for every PSG-assessed sleep variable but one. The main goal of the study was to determine whether these interindividual differences constitute traits. A trait can be established by demonstrating that the interindividual differences are significant, stable when measured repeatedly, and robust when manipulated experimentally (Van Dongen et al., 2005). These properties were assessed simultaneously by examining the interindividual differences across all eight sleep periods, which included both equivalent nights (e.g. the three baseline nights) and experimentally differentiated nights (baseline nights versus recovery nights following 36 h total sleep deprivation). The ICC was used to quantify the sleep physiologic traits – the ICC could only be high if there were significant interindividual differences with stability across equivalent nights and robustness across non-equivalent nights. Among the 18 sleep variables evaluated (see Table 1), only SWSL did not meet this criterion. For the other 17 sleep variables considered, ICC values ranged from 0.36 to as high as 0.89 (see Table 2). Thus, for various aspects of sleep physiology, interindividual variability was fairly to substantially trait-like. Trait variability was particularly strong for SWS and for quantitatively assessed delta power in the non-REM sleep EEG.
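The logic of the ICC as a trait index can be illustrated with a simplified calculation. The sketch below is not the analysis used in the study: it assumes a balanced one-way random-effects layout (subjects by nights), ignores the fixed effect of condition (baseline versus recovery), and uses made-up numbers purely for illustration. The ICC is taken as the between-subjects variance divided by the sum of the between-subjects and within-subjects variances.

```python
from statistics import mean


def icc_oneway(scores: list[list[float]]) -> float:
    """One-way random-effects ICC for balanced data.

    scores[i][j] is the value for subject i on night j.
    """
    n_subjects = len(scores)
    n_nights = len(scores[0])
    if n_subjects < 2 or n_nights < 2:
        raise ValueError("need at least two subjects and two nights")
    if any(len(row) != n_nights for row in scores):
        raise ValueError("data must be balanced")

    grand_mean = mean(v for row in scores for v in row)
    subject_means = [mean(row) for row in scores]

    # Mean squares from a one-way analysis of variance.
    ms_between = n_nights * sum(
        (m - grand_mean) ** 2 for m in subject_means
    ) / (n_subjects - 1)
    ms_within = sum(
        (v - m) ** 2 for row, m in zip(scores, subject_means) for v in row
    ) / (n_subjects * (n_nights - 1))

    # Variance components; a negative between-subjects estimate is set to 0.
    var_between = max((ms_between - ms_within) / n_nights, 0.0)
    var_within = ms_within
    total = var_between + var_within
    return var_between / total if total > 0 else 0.0


# Hypothetical values (e.g. minutes of SWS) for 3 subjects over 4 nights.
example = [
    [60.0, 64.0, 58.0, 62.0],
    [90.0, 88.0, 93.0, 91.0],
    [75.0, 70.0, 78.0, 73.0],
]
print(f"ICC = {icc_oneway(example):.2f}")
```

When subjects differ consistently and each subject's nights resemble one another, the ICC approaches 1; when night-to-night variation within subjects dominates, it approaches 0.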

These results confirm and extend findings from previous studies of either the stability or the robustness of interindividual differences in sleep variables (Buckelmüller et al., 2006; De Gennaro et al., 2005; Feinberg et al., 1978, 1980; Finelli et al., 2001; Merica and Gaillard, 1985; Palagini et al., 2000; Tan et al., 2000, 2001; Tinguely et al., 2006; Webb and Agnew, 1968). Despite experimental and/or analytical concerns, these earlier studies were right in pointing out systematic interindividual variability in various sleep parameters, and actually exposed only a portion of the full scope of the phenomenon. Our study contributes to the existing literature by establishing, for the first time, that the interindividual differences in sleep variables constitute physiologic traits in the healthy young adult population. To what extent these traits are stable over long time periods (e.g. from year to year) remains to be determined.

The high level of experimental control before and during the laboratory phase of the study, and the ample time allowed for baseline and recovery sleep (12 h TIB), should have helped to avoid confounds for the trait estimates because of enduring states (e.g. by ruling out any systematic carry-over effects from prior sleep debt). Nevertheless, it should be recognized that any state-specific effects on sleep which we did not manage to control for, and which could have persisted across the three sleep deprivation periods and 11 consecutive days inside the laboratory (i.e. systematic error variance), may have contributed to the observed trait-like interindividual differences. Furthermore, the origin of the exposed sleep traits was not revealed (cf. Webb et al., 1976) – they could have been acquired during development, result from environmental factors or habit formation, and/or reside in the genome. For a limited number of sleep variables, published studies of twins (Hori, 1986; Linkowski, 1999; Webb and Campbell, 1983; Zung and Wilson, 1966) have indicated the presence of genetic influences on human sleep architecture. This suggests that the trait variability observed in the present experiment was, at least partially, genetically controlled. Elegant work by Rétey et al. (2005) recently pinpointed a gene encoding adenosine deaminase, which among other things is associated with brain energy metabolism, as one gene potentially involved.

Our study was the first to focus on the magnitude of the trait interindividual differences in sleep variables. The otherwise relatively homogeneous sample of carefully screened, drug-free, healthy, normal sleepers aged 22 through 40 revealed a remarkable degree of trait heterogeneity in almost every sleep variable examined. This trait heterogeneity was quantified by the size of the 95% RI for systematic interindividual differences (see Table 2), which represents the range that would be estimated to capture 95% of the population of young adult, healthy normal sleepers. To put the 95% RI for each sleep variable in context, it was compared with the group-average effect of sleep deprivation (see Fig. 2). This comparison revealed that, aside from SWSL, the magnitude of the interindividual differences consistently exceeded the magnitude of the group-average effect of 36 h total sleep deprivation – even for sleep parameters known to be substantially affected by sleep deprivation. As the effects of the sleep-deprived state on sleep structure have significant physiologic correlates (Dinges and Chugh, 1997), it may be hypothesized that the state-independent, trait interindividual differences in sleep structure also have physiologic relevance. In any case, the observed trait differences were far from trivial in size, and overall represented the leading category of attributable variance in the sleep data.

In describing the results of an extended bedrest study, Klerman and Dijk (2005) presented an alternative viewpoint by suggesting that interindividual variability in sleep duration may primarily reflect variations in self-selected sleep restriction (i.e. habit). However, this hypothesis was not substantiated, as their study did not assess the magnitude of trait-like interindividual differences for comparison with the variability attributable to habitual sleep-restriction state. In contrast, our study demonstrated explicitly that trait interindividual differences in sleep duration and other sleep variables are a dominant source of variance – with or without prior sleep loss. Consequently, while self-imposed sleep schedules certainly play a role in the distribution of sleep parameters in the general population, physiologic trait variability appears to be a major factor driving interindividual variability in sleep duration. The work of Aeschbach et al. (1996) has indicated that biologic differences in tolerance for homeostatic sleep pressure may also be involved.

Differences among subjects in age, gender, and ethnicity contributed to the observed interindividual variability in sleep variables. Even across the limited age range of the sample (22–40 years), greater age was found to be associated with reduced delta power in the non-REM sleep EEG, as also observed by others (e.g. Åström and Trojaborg, 1992; Ehlers and Kupfer, 1989; Gaudreau et al., 2001). At least for delta sleep expression, therefore, trait interindividual differences were modulated by age (see also Ohayon et al., 2004). Of the 21 subjects in our study, 11 were female. On average, females had increased TST (and thus greater SE) and less S1 sleep. Although the literature about the effects of gender on sleep architecture is mixed, these findings echo those of recent studies that examined gender differences in sleep variables over an age range similar to that of our subjects (e.g. Goel et al., 2005; Roehrs et al., 2006).

Because of the ethnic make-up of our sample, which was representative of the population from which the sample was drawn, ethnicity analyses were limited to a comparison between African-Americans and Caucasians. On the whole, African-Americans displayed more S2 sleep and less REM sleep. The effect of ethnicity on S2 sleep has been noted in the literature before (Rao et al., 1999; Redline et al., 2004). The difference in REM sleep has not been previously observed, and one study actually reported an opposite effect (Profant et al., 2002), although this discrepancy may well be a consequence of reduced TIB relative to the 12 h allowed in our experiment. African-Americans also exhibited less delta power in the non-REM sleep EEG. The latter finding may correspond with the reduced SWS reported for African-American subjects in the literature (Profant et al., 2002; Rao et al., 1999; Stepnowsky et al., 2003), although in the present study, the visually scored SWS variable did not differ significantly between the ethnic groups. Interaction effects among age, gender, and ethnicity were not specifically examined because of insufficient statistical power. Still, these demographic variables collectively did not come close to explaining the bulk of the systematic interindividual differences in sleep architecture for our sample, and large trait interindividual variability within age, gender, and ethnic groups remained.

Some interdependency should be expected among the various sleep variables, e.g. among the different sleep stages which compete for the available sleep time (Merica and Gaillard, 1985). Indeed, a principal components analysis indicated that most of the 17 trait-like sleep variables were ‘hanging together’ in three independent trait dimensions (see Table 3). These phenotypic dimensions of sleep physiology together explained 61.1% of the overall variance in the principal components analysis, considerably more than the 16.7% that would be expected by chance. The three phenotypes appeared to have straightforward physiologic interpretations: sleep duration, sleep intensity, and sleep discontinuity. Generalizability of these trait dimensions will need to be confirmed in additional studies with larger sample sizes. Further research is also needed to find out if any other trait dimensions may exist in the sleep physiology of healthy young adults. For example, experiments aimed at manipulating circadian phase may dissociate interindividual differences in sleep variables influenced directly by circadian timing (e.g. REM sleep and REML; Czeisler et al., 1980).

The high ICC value for the sleep intensity trait (ICC = 0.91) may have theoretical implications. Not only was the variability in this dimension highly stable and robust; for delta power in the non-REM sleep EEG (i.e. slow-wave activity), the systematic interindividual differences also surpassed the group-average sleep deprivation effect by an order of magnitude (Fig. 2). The two-process model of sleep regulation postulates that non-REM delta power is a marker of sleep homeostasis within individuals (Borbély and Achermann, 2005), which is consistent with the increase of delta power following sleep deprivation in the present study. However, the theory does not attribute any functional role to the trait-like quality of non-REM delta power – within the theory, the systematic interindividual differences remain unexplained (Tan et al., 2000). Interestingly, SWS latency (SWSL), which has also been found to reflect sleep homeostasis (Dinges, 1986), was affected by sleep deprivation (Table 1) without displaying trait interindividual differences (Table 2). As such, SWSL may be a more specific marker of sleep homeostatic state.

Besides their theoretical relevance, trait interindividual differences in sleep variables also have practical implications (Van Dongen et al., 2005). As the magnitude of trait differences was found to be substantial (see Table 2) and greater than state-specific changes from sleep deprivation as observed during the experiment (see Fig. 2), interindividual variability is likely to represent a considerable portion of the variance in any study dealing with standard PSG-assessed sleep variables and/or non-REM delta power. In studies involving repeated measures, this source of variance can be utilized to enhance statistical power using modern statistical techniques (e.g. mixed-effects models – see Burton et al., 1998; Van Dongen et al., 2004b). On the other hand, systematic interindividual differences may pose a problem in investigations relying on between-subjects correlations to find associations between sleep physiology and other phenomena of interest. Studies on the role of sleep (stages) in learning and memory consolidation tend to fall in this category. The large natural variability in sleep architecture may lead to spurious, sample-dependent correlations having no functional significance – which may explain the scattered findings in this latter area of research (cf. Vertes and Siegel, 2005).

This study revealed the large systematic variability that may be encountered in the standard PSG-assessed sleep parameters of normal, healthy, young adults. Thus, what is experienced as ‘normal sleep’ appears to cover a wide terrain. Consequently, there may be significant overlap between the physiology of normal sleep and that of pathologic sleep, reducing the sensitivity and specificity of clinical tests based on sleep parameters (cf. Youngstedt, 2003). To get a better sense of the scope of this problem, there is a need for studies of the magnitude of trait interindividual differences in the sleep of clinical populations (Van Dongen et al., 2005). Such research may also reveal how, over the long term, interindividual differences in sleep physiology may contribute to adverse health outcomes – or, conversely, how they may play a role in resilience to poor health.


We thank the staff of the Unit for Experimental Psychiatry in the University of Pennsylvania School of Medicine; the nurses and staff of the General Clinical Research Center of the Hospital of the University of Pennsylvania; registered polysomnographic technologist Claire Fox; and research assistants Alicia Levin, Kristen Vitellaro, and Allison Stakofsky for their significant contributions to the experiment. We are also grateful to Peter Achermann for sleep physiological expertise, and to Poduri Rao and Greg Maislin for statistical advice. This work was supported by NIH grant R01-HL70154 awarded to Hans Van Dongen, and in part by NIH grant R01-NR04281 awarded to David Dinges and NIH grant M01-RR00040 awarded to the General Clinical Research Center of the Hospital of the University of Pennsylvania.