Event-related potential measures of the disruptive effects of trains of auditory stimuli during waking and sleeping states


David S. Michaud, PhD, Research Scientist, Acoustics Division, Consumer & Clinical Radiation Protection, Healthy Environments and Consumer Safety Branch, 775 Brookfield Road, Address locator 6301B, Ottawa, Ontario K1A 1C1, Canada. Tel.: 1-613-954-6670; fax: 1-613-941-1734; e-mail: dmichaud@hc-sc.gc.ca


Acoustic backup alarms have been reported to particularly disrupt sleep. The present study simulated backup alarms by presenting trains of five consecutive 500 ms duration audible tones, with the time between the onset of each tone being 1 s and the time between trains (offset to onset) between 15 and 20 s. In different conditions, the tones were set at either 80 or 60 dB sound pressure level (SPL). Twelve young adults spent two consecutive nights in the laboratory. Stimuli were presented only on the second night. Measures of traditional sleep architecture (sleep stages) were not affected by the acoustic trains. Event-related potentials were also measured following presentation of the stimuli. In the waking state, the initial 80 dB stimulus elicited a large amplitude N1, peaking at about 100 ms, followed by a positive peak, P3, peaking at about 325 ms. N1 was attenuated following presentation of the 60 dB stimulus. The amplitude of N1 was much reduced following presentation of the subsequent second to fifth stimuli in the train. During non-rapid eye movement (NREM) sleep, the initial 80 dB stimulus elicited a large and later negativity (N350) that was reduced in amplitude for the 60 dB stimulus. A K-Complex (a composite N350 and a much larger N550) was elicited following 35% of the initial 80 dB tones and 12% of the initial 60 dB tones. The amplitude of N550 did not, however, significantly vary as a function of stimulus SPL. During REM sleep, N1 continued to be elicited by the initial louder stimulus, but the later positive wave was not apparent. A late negativity peaking at about 350 ms was, however, apparent. When queried the next morning, subjects rarely indicated that the stimulus presentations disturbed their sleep. This might be because of the absence of the late positivity. The presence of the long latency negativities (N350 and N550) might serve to protect sleep from obtrusive sound during sleep.


Stimuli that occur very infrequently are quite obtrusive. These stimuli will affect ongoing cognitive activities in the waking state and may disturb sleep. The major complaint from residents living near the construction of the Central Artery Tunnel in Boston (the Big Dig project) was the night-time use of backup alarms on construction vehicles (Thalheimer, 2000). An acoustic backup alarm consists of a series of repetitive on–off tones that are emitted while the vehicle is reversing. When the vehicle ceases to reverse and subsequently advances, the alarm is not sounded. The backup alarm is thus experienced as a train of repetitive tones followed by a long period of silence after which the train is again sounded as the vehicle again reverses. The time between the presentation of the final stimulus in the train and the subsequent initial stimulus is thus very long. The time between the presentation of the initial and the subsequent signals is however very short.

Näätänen's (1992) comprehensive model of auditory processing provides an elegant explanation of why such stimulus trains can be so distracting. In it, he discusses the routes through which an observer can become aware of external acoustic input. The route discussed in this study implicates the passive and automatic switching of attention toward an unattended source. This involves what Näätänen calls the transient feature detector system, located in the auditory cortex. As the name implies, this system detects certain transient aspects of the auditory stimulation, such as abrupt onsets of infrequently presented stimuli or offsets of long-duration stimuli. The output of this system is proportional to the ‘obtrusiveness’ of the stimulus.

Näätänen often employs event-related potentials (ERPs) to provide real-time measures of the extent of auditory processing. ERPs are the changes in the electrical activity of the brain that are elicited by an external stimulus or an internal psychological event. These changes are often very small and difficult to observe in the ongoing EEG. Signal averaging techniques are therefore used to reduce the amplitude of the background ‘noise’, allowing the ERP to emerge from the EEG. The ERP consists of a series of negative- and positive-going components.
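The benefit of signal averaging can be illustrated with a small simulation. The sketch below is not the analysis code used in this study; the waveform shape, noise level and number of sweeps are hypothetical values chosen only to show that averaging N sweeps reduces the background noise by roughly the square root of N while the time-locked response is preserved.

```python
import math
import random

def simulate_sweep(rng, n_samples=100, noise_sd=20.0):
    """One hypothetical single-trial sweep: a fixed ERP plus Gaussian 'EEG' noise (uV)."""
    sweep = []
    for i in range(n_samples):
        # Hypothetical time-locked response: a -10 uV half-sine in samples 20-39.
        erp = -10.0 * math.sin(math.pi * (i - 20) / 20) if 20 <= i < 40 else 0.0
        sweep.append(erp + rng.gauss(0.0, noise_sd))
    return sweep

def average_sweeps(sweeps):
    """Point-by-point mean across sweeps."""
    n = len(sweeps)
    return [sum(s[i] for s in sweeps) / n for i in range(len(sweeps[0]))]

def residual_noise_sd(wave):
    """Standard deviation of the samples outside the ERP window (samples 20-39)."""
    noise = wave[:20] + wave[40:]
    mean = sum(noise) / len(noise)
    return math.sqrt(sum((x - mean) ** 2 for x in noise) / len(noise))

rng = random.Random(0)  # fixed seed so the illustration is repeatable
sweeps = [simulate_sweep(rng) for _ in range(400)]
sd_single = residual_noise_sd(sweeps[0])
sd_avg = residual_noise_sd(average_sweeps(sweeps))
print(f"noise SD, single sweep: {sd_single:.1f} uV; 400-sweep average: {sd_avg:.1f} uV")
```

With 400 sweeps the residual noise should fall to about one-twentieth of its single-sweep value, which is why a response of a few microvolts becomes visible only in the average.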

A negative component, N1, peaking at about 100 ms, is thought to reflect the output of the transient feature detection system in the auditory cortex (Näätänen, 1992). As such, the amplitude of N1 is proportional to the obtrusiveness of the stimulus. Importantly, when the output of the transient feature detection system reaches a certain critical threshold, a signal is sent to the central executive, interrupting ongoing cognitive activity. Attention is then switched to the auditory channel. This disruption may cause deterioration in performance of the relevant cognitive task. The actual switch of attention away from relevant cognitive activity and toward the auditory channel is thought to be reflected by a later positive component, the P3a, peaking 250–300 ms after stimulus onset (Escera et al., 1998). This passively elicited P3a is maximum over central areas of the scalp. N1 is much affected by the rate of stimulus presentation. Its amplitude increases dramatically when the rate of stimulus presentation is very slow (inter-stimulus interval > 10 s) (Näätänen and Picton, 1987). With very slow rates of stimulus presentation, the scalp distribution of N1 changes from its usual fronto-central maximum to a more central maximum (Budd et al., 1998). There is some evidence that this is associated with frontal activation and the switch of attention to the obtrusive stimulus event (Giard et al., 1994).

A number of studies have employed ERP methodology to examine the extent of auditory information processing during sleep. In their review, Campbell and Colrain (2002) indicate that the amplitude of N1 has been consistently shown to gradually decrease during the sleep onset period and to reach the baseline levels during definitive sleep (stage 2). The amplitude of N1 remains at baseline levels throughout non-REM (NREM) sleep. The reduction in the amplitude of N1 occurs independently of stimulus SPL and the rate of stimulus presentation (Armitage et al., 1990; Cote et al., 2001). Very different results emerge during REM sleep. N1 can easily be elicited during REM sleep although it is reduced to only 25–50% of its waking amplitude (Campbell et al., 1992). Importantly, particularly loud, rare and/or psychologically relevant stimuli might continue to elicit a P3-like wave over parietal areas of the scalp during REM sleep, but its amplitude is also much reduced compared with the waking state. The frontal aspects of this P3 are however absent. Instead, a negativity peaking at this latency (300–400 ms) is often observed at anterior sites (Cote et al., 2001; Perrin et al., 1999).

Although the waking N1 and the P3 ERPs are difficult to elicit in NREM sleep, a series of other longer latency waveforms can be elicited. A negative wave peaking at about 350 ms (N350) appears during the initial sleep onset period. It is maximum over the central areas of the scalp and appears to be identical to the frequently observed sleep-associated vertex sharp wave (Colrain et al., 2000b). Stimuli that are very obtrusive in the waking state may elicit a K-Complex during NREM sleep (see Colrain, 2005 for a historical review of the K-Complex). This consists of a composite of a large amplitude N350 and a much larger amplitude and later N550. The N350 and N550 are thought to be unique to NREM sleep and cannot be elicited in either the waking state or in REM sleep (reviewed by Bastien et al., 2002). A loud acoustic stimulus will elicit many more K-Complexes than a low intensity stimulus (Bastien and Campbell, 1992). The time between stimulus presentations is exceedingly important. Very few K-Complexes can be elicited when stimuli are presented more rapidly than every 5 s while more than 50% of stimuli can elicit a K-Complex when they are presented more slowly than every 15 s (Bastien and Campbell, 1994). It would therefore be expected that a K-Complex would be elicited more often following presentation of the initial stimulus in a train of auditory stimuli. While the amplitude of the waking N1 and P3 ERPs may reach 10–20 μV, the amplitude of the large N550 during NREM sleep often exceeds 100 μV. A single K-Complex is thus easily visible in the background EEG of stage 2 sleep. The N350 is not unique to the K-Complex. It can be elicited when the N550 component cannot. Although the N350 is maximum over the central regions of the scalp, the N550 is maximum over more fronto-central areas (Colrain et al., 1999), although there may be some variation in its scalp distribution between stages 2 and 4 of sleep (Cote et al., 1999).
Although stimulus characteristics will affect how often the K-Complex is elicited, they do not affect the actual amplitude of the N550 component (Bastien and Campbell, 1992). It is not known why, on some stimulus presentations, the K-Complex will be elicited while, on others, the identical stimulus will not elicit it.

The purpose of this study is to determine the effects of ordinal position on the processing of auditory stimuli in both the waking and sleeping states. A train of five consecutive tones was presented with an inter-stimulus (tone) interval of 1 s. The time between trains was varied to be between 15 and 20 s. In different conditions, the sound pressure level (SPL) of the stimuli was also varied. It was expected that the initial stimulus in the train of repetitive stimuli would be particularly obtrusive, especially when it was loud and thus, processed differently. During the waking state, it was expected that the initial stimulus would elicit an unusually large amplitude N1, perhaps followed by a later positive wave, the P3a. A similar type of pattern of responding was expected to be observed during REM sleep, although the amplitude of the ERPs would be reduced. The auditory stimuli were expected to elicit a unique set of ERPs during NREM sleep. The N350 should be elicited by the initial stimulus in the train. The initial stimulus was also expected to elicit more K-Complexes, particularly when it was loud.



Twelve subjects (five females and seven males) between the ages of 19 and 25 volunteered to participate in this study. None of the subjects had previously participated in a sleep study. Each spent two consecutive nights in the sleep laboratory. No one reported a history of psychiatric, neurological or audiological disorder. Additionally, no one was found to have disordered sleep as indicated by the Pittsburgh Sleep Quality Index (PSQI). All subjects were asked to read and sign a consent form that provided details of the experimental paradigm and procedures. Each subject received an honorarium for participation in this study. This study was conducted according to ethical guidelines established by the Canadian Tri-Council (natural, medical, and social sciences) research board.


The EEG was recorded from gold electrodes placed at midline frontal, central and parietal scalp sites (Fz, Cz, Pz, respectively) and referenced to the left mastoid. Horizontal eye movements (hEOG) were recorded from electrodes placed at the outer canthus of each eye. Vertical eye movements and blinks (vEOG) were recorded from electrodes placed at the supra- and infra-orbital ridges of the left eye. The high frequency filter was set at 35 Hz. The time constant was 1 s. The physiological signals were sampled at 256 Hz and stored on hard disk for later off-line scoring and reconstruction.

Auditory stimuli

A train of five consecutive 1000 Hz, 500 ms pure tones was presented monaurally to the right ear of each subject. An Eartone 3A earphone insert produced the auditory stimuli. Use of an insert earphone controlled for consistency of stimulus presentation in spite of head movement and ear position during the night. The offset-to-onset period of the tones was also 500 ms. Each tone had a 2 ms rise-and-fall time. The sound pressure level (SPL) of a stimulus was set at either 80 or 60 dB in different conditions. The time between trains of five consecutive stimuli (offset to onset) varied between 15 and 20 s. A total of 30 trains of stimuli (i.e. 150 total stimuli) were presented in a block. On average, approximately 15 blocks were presented to each subject over the entire night. The auditory stimuli were synthesized using a 16-bit waveform generator card. Stimulus level was verified using a Bruel and Kjaer 2209 sound level meter and Bruel and Kjaer 4152 artificial ear with 2 cm³ coupler. The calibration of the artificial ear was checked with a Bruel and Kjaer 4220 pistonphone. Background sound level in the sleep room was <35 dBA during sleeping hours.
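The timing of one stimulus block can be made concrete with a short sketch. It is an illustration only, not the presentation software used in the study: the function names are hypothetical, and the 15–20 s interval between trains is assumed here to be drawn uniformly, because the text gives only the range. The second function shows the sound-pressure ratio implied by the 20 dB difference between the two conditions.

```python
import random

TONE_MS = 500            # tone duration (from the text)
GAP_MS = 500             # offset-to-onset gap within a train (from the text)
TONES_PER_TRAIN = 5
TRAINS_PER_BLOCK = 30

def block_onsets(seed=0):
    """Return (train, position, onset_ms) for every tone in one block.

    Assumption: the inter-train interval (offset to onset) is uniform
    between 15 and 20 s; the distribution is not stated in the text.
    """
    rng = random.Random(seed)
    onsets = []
    t = 0  # onset of the first tone of the current train, in ms
    for train in range(TRAINS_PER_BLOCK):
        for pos in range(TONES_PER_TRAIN):
            onsets.append((train + 1, pos + 1, t + pos * (TONE_MS + GAP_MS)))
        last_offset = t + (TONES_PER_TRAIN - 1) * (TONE_MS + GAP_MS) + TONE_MS
        t = last_offset + rng.randint(15000, 20000)
    return onsets

def pressure_ratio(level_a_db, level_b_db):
    """Sound-pressure ratio implied by a difference between two levels in dB SPL."""
    return 10 ** ((level_a_db - level_b_db) / 20)

schedule = block_onsets()
print(len(schedule), "tones per block")
print("80 dB vs 60 dB pressure ratio:", pressure_ratio(80, 60))
```

Within a train the onsets fall at 0, 1, 2, 3 and 4 s, so the first tone of each train follows at least 15 s of silence whereas the second to fifth tones follow only 500 ms of silence.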

The stimuli were presented when the subject was awake, sitting in an upright position, prior to sleep on the two nights of the study. Both the 80 and 60 dB conditions were run while the subject read a book or magazine; subjects were instructed to ignore the auditory stimuli. Horizontal eye movements were monitored to verify compliance with reading. The entire procedure was repeated a second time to ensure replicability of results. Subjects were informed that the stimuli would also be presented at different times during the sleep period on both nights. In fact, stimuli were presented only during the second night (Experimental night). Following lights out, subjects spent 480 min (±5 min) in bed. On the second night of the study, stimulus presentation did not begin until at least 10 min had been spent in definitive sleep [stage 2 or slow wave sleep (SWS)]. During sleep, the time between stimulus presentation blocks was randomized to be between 10 and 30 min. At least two blocks of stimuli were presented in stage 2, in stages 3 and 4 (combined to form SWS) and in REM sleep. If time permitted, additional blocks of stimuli were presented. Emphasis was placed on the presentation of the 80 dB stimulus blocks. For this reason, there was insufficient time to permit the presentation of the 60 dB stimulus blocks during all stages of sleep for all subjects. For the 60 dB stimuli, stages 2, 3 and 4 were also combined to form NREM sleep. At times, the stimulus elicited an arousal (a change from deeper to lighter sleep) and/or awakening. If the arousal lasted more than 10 s, stimulus presentation was paused until the subject returned to sleep.

Single trials were sorted according to electrode position, stimulus SPL and stage of sleep. The EEG was reconstructed into discrete 8000 ms ‘sweeps’ beginning 100 ms prior to presentation of the initial stimulus in the train and averaged. The average thus contained the ERPs to the first, second, …, fifth stimulus presentations. The average of the EEG in the 100 ms prior to each stimulus presentation was used to form a baseline from which all ERPs were measured. In NREM sleep, a K-Complex was often elicited by the initial stimulus in the sequence. All single trial stimulus presentations in NREM sleep were therefore also manually sorted according to the presence or absence of a K-Complex and averaged separately. A set of criteria was used to define a K-Complex. Following stimulus presentation, a negative peak measuring at least 75 μV had to occur between 450 and 650 ms, followed by a positive peak between 800 and 1200 ms. Furthermore, the negative peak had to have a fronto-central maximum scalp distribution. These criteria were applied to prevent random background activity from being classified as a stimulus-elicited K-Complex. This was especially necessary during SWS, in which isolated, large amplitude delta waves might be mistaken for a K-Complex (Bastien and Campbell, 1992). In the waking state, sweeps were rejected from the ongoing average if either the EOG or EEG exceeded ±100 μV.
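The baseline correction and the K-Complex criteria described above can be sketched as follows. This is a hedged illustration rather than the procedure actually used, since trials in this study were sorted manually. Two readings are assumptions: a 'fronto-central maximum' is taken to mean that the negative peak at Fz or Cz is larger than at Pz, and a 'positive peak' is taken to mean any maximum above baseline in the 800–1200 ms window. The function names are hypothetical, and negativity is represented by negative values in microvolts.

```python
FS = 256        # samples per second (from the recording description)
PRE_MS = 100    # pre-stimulus period included at the start of each sweep

def to_index(ms):
    """Convert a post-stimulus latency in ms to a sample index within the sweep."""
    return round((ms + PRE_MS) * FS / 1000)

def baseline_correct(sweep):
    """Subtract the mean of the 100 ms pre-stimulus period from every sample."""
    n_pre = to_index(0)
    base = sum(sweep[:n_pre]) / n_pre
    return [v - base for v in sweep]

def is_k_complex(fz, cz, pz):
    """Apply the stated criteria to baseline-corrected single-trial sweeps (uV)."""
    neg = slice(to_index(450), to_index(650) + 1)
    pos = slice(to_index(800), to_index(1200) + 1)
    fz_neg, cz_neg, pz_neg = min(fz[neg]), min(cz[neg]), min(pz[neg])
    front = min(fz_neg, cz_neg)
    # Criterion 1: negative peak of at least 75 uV between 450 and 650 ms.
    if front > -75.0:
        return False
    # Criterion 2 (assumed reading): the peak is larger at Fz or Cz than at Pz.
    if front >= pz_neg:
        return False
    # Criterion 3 (assumed reading): a maximum above baseline at 800-1200 ms,
    # checked at whichever of Fz and Cz showed the larger negativity.
    site = fz if fz_neg <= cz_neg else cz
    return max(site[pos]) > 0.0
```

A rule of this kind can only pre-screen trials; isolated delta waves in SWS that happen to fall in the 450–650 ms window would still need to be checked by eye.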

The sleep recordings were scored in two ways. The all-night continuous sleep recording was divided into 20 s ‘epochs’ and staged according to standard Rechtschaffen and Kales (1968) criteria by an experienced scorer. An epoch was scored as an arousal if the EEG showed a movement from deeper to lighter sleep (SWS to stage 2, or stage 2 to stage 1/awake) that lasted more than 10 s (i.e. 50% of the epoch). These arousals were scored independently of stimulus presentation (i.e. whether they were preceded by a stimulus presentation or not). One-quarter of the recordings were also scored by a second experienced rater. The overall agreement on staging was 84%. Most of the disagreement occurred for the scoring of stage 3 compared with stage 4 sleep. All stimulus block conditions were also staged. As mentioned, 30 stimulus trains were presented in each block. Each train was assigned a 20 s epoch beginning 5 s before and ending 15 s after the onset of the initial stimulus. The 20 s epoch was staged by both scorers. If the two scorers disagreed on the staging of an epoch, it was rejected from further analysis (i.e. the epoch would not be averaged with other epochs). Because stages 3 and 4 were collapsed in the ERP analyses, relatively few data were rejected.
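The agreement and rejection rules can be expressed in a few lines. The sketch below is illustrative only: the stage labels and the example scorings are hypothetical, and it assumes that agreement for the stimulus epochs is judged after stages 3 and 4 are collapsed into SWS, which is how the final sentence above is read here.

```python
def percent_agreement(scorer_a, scorer_b):
    """Percentage of epochs given the same stage by both scorers."""
    same = sum(a == b for a, b in zip(scorer_a, scorer_b))
    return 100.0 * same / len(scorer_a)

def collapse(stage):
    """Treat stages 3 and 4 as SWS, as in the ERP analyses."""
    return "SWS" if stage in ("3", "4") else stage

def accepted_epochs(scorer_a, scorer_b):
    """Indices of epochs on which the scorers agree once 3 and 4 are collapsed."""
    return [i for i, (a, b) in enumerate(zip(scorer_a, scorer_b))
            if collapse(a) == collapse(b)]

# Hypothetical scorings of five 20 s stimulus epochs.
a = ["2", "3", "4", "REM", "2"]
b = ["2", "4", "4", "REM", "1"]
print(percent_agreement(a, b))   # agreement on the uncollapsed stages
print(accepted_epochs(a, b))     # epochs retained for averaging
```

In this made-up example the stage 3 versus stage 4 disagreement lowers the raw agreement but does not cost an epoch, which mirrors the observation that relatively few data were rejected.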

Event-related potentials in the waking and sleeping states varied markedly. Specific components were identified in the grand averages (the average of all subjects’ averages). The N1 and later positivities could only be observed in the waking state and during REM sleep. The N550 could only be observed in NREM sleep when a K-Complex was elicited. A negative wave peaking at about 350 ms was apparent in both NREM sleep and REM sleep. In the waking and REM sleep states, N1 was defined as the maximum negative peak between 80 and 120 ms. Although N1 is often measured at Fz, when stimuli are presented very slowly, it is maximum over central areas of the scalp (Budd et al., 1998; Näätänen and Picton, 1987). A later P3a could not, however, be easily identified. In the waking state, a P3b (P300) was visible in the grand-averaged waveforms. This was defined as the maximum positive peak at Pz between 300 and 400 ms. During NREM sleep, N1 was at or near baseline and was difficult to discern in the background noise. Instead, a large amplitude negative wave peaking at about 350 ms was apparent. This N350 was measured at Cz, where it was largest, and defined as the maximum negative peak occurring between 300 and 400 ms. The K-Complex contained a very large amplitude negative wave peaking between 450 and 650 ms. This N550 was measured at Fz, where it was largest. An N350 was also apparent during REM sleep. It was, however, measured at Fz, where it was largest.
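The component definitions above amount to finding the most extreme value of a given polarity within a latency window at a given electrode. A minimal sketch follows, with hypothetical function names. Measuring N1 at Cz is an assumption based on the remark that N1 is largest centrally at slow presentation rates, and the N350 entry uses the NREM site (Cz); for REM sleep the site would be Fz.

```python
FS = 256        # samples per second
PRE_MS = 100    # pre-stimulus period at the start of each averaged waveform

# (electrode, latency window in ms, polarity), taken from the definitions above.
COMPONENTS = {
    "N1":   ("Cz", (80, 120), -1),    # site assumed, see lead-in
    "P3b":  ("Pz", (300, 400), +1),
    "N350": ("Cz", (300, 400), -1),   # NREM definition; Fz was used in REM
    "N550": ("Fz", (450, 650), -1),
}

def peak(wave, window_ms, polarity):
    """Return (latency_ms, amplitude_uV) of the most extreme sample of the
    requested polarity inside the window; no test for a local extremum is made."""
    lo = round((window_ms[0] + PRE_MS) * FS / 1000)
    hi = round((window_ms[1] + PRE_MS) * FS / 1000)
    idx = max(range(lo, hi + 1), key=lambda i: polarity * wave[i])
    return (idx * 1000 / FS - PRE_MS, wave[idx])

def measure(averages, name):
    """Measure one named component from a dict of baseline-corrected averages."""
    site, window, polarity = COMPONENTS[name]
    return peak(averages[site], window, polarity)
```

For example, `measure({"Cz": cz_average}, "N1")` would return the latency and amplitude of the largest negativity between 80 and 120 ms in a hypothetical averaged Cz waveform `cz_average`.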

The number of subjects available for statistical analyses varied as a function of the stage of sleep (wakefulness, stage 2, SWS or REM) and the SPL of the stimulus. As explained in the Results section, because of the varying number of subjects, the stage factor was analyzed in separate analyses. In general, two-way repeated measures ANOVAs were run to compare the effects of ordinal position of the stimulus and its SPL. Results were considered to be significant at an α value of 0.05. Greenhouse and Geisser (1959) correction factors were applied when appropriate.


Sleep architecture was compared on the first (Baseline) and second (Experimental) nights. Again, it should be stressed that stimuli were not presented during the first night. Neither the absolute time spent in stages 1–4 or REM sleep, nor the relative sleep time spent in the respective stages significantly varied between the two nights (P > 0.05 in all cases). These results are summarized in Table 1. Similarly, the latency to the various stages did not significantly vary. The total number of arousals (independent of duration), and the number of arousals that exceeded 1, 2 or 5 min in duration were computed. The number of arousals and the duration of each arousal did not significantly vary between the baseline (night 1) and experimental (night 2) sleep period times (P > 0.05 in all cases).

Table 1.  Percentage of time (SD) spent in the different stages of sleep. Acoustic stimuli were presented only during the Experimental night (night 2) and not during the Baseline night (night 1).

Stage
1        7.57 (6.41)      8.30 (7.12)
2        48.09 (12.73)    49.88 (8.87)
3        5.60 (2.13)      5.59 (2.29)
4        12.60 (6.00)     12.16 (5.84)
Wake*    9.82 (4.45)      5.61 (8.31)

*Percentage of time spent awake does not include time before sleep onset.


Waking state

The grand-averaged waveforms following presentation of the 80 and 60 dB SPL trains of stimuli are illustrated in Fig. 1. An on–off negativity was apparent in the ERP waveforms. An initial N1 was apparent at about 100 ms, followed by a sustained negativity that terminated at stimulus offset. An off-response was thus apparent approximately 600 ms after stimulus onset (i.e. about 100 ms after stimulus offset). As may be observed, the ERPs following presentation of the initial stimulus were markedly different from those following the second to fifth stimuli in the train for both the 80 and 60 dB stimulus intensities.

Figure 1.

Waking grand-averaged ERPs elicited by a train of five consecutive 500 ms tones. The time between tones was 1 s. The time between trains was varied between 15 and 20 s. In separate conditions, the intensity of the tones was set to be either 80 or 60 dB SPL. The initial stimulus in the train elicited a large amplitude negativity, N1, peaking at about 110 ms after stimulus onset. A positive wave (P3) is apparent at parietal sites 300–350 ms. The long duration stimulus was associated with a sustained negativity. An off-response is apparent 100 ms after stimulus offset. In this and all other figures, negativity at the scalp relative to the reference is recorded as an upward deflection.

A two-way ANOVA with repeated measures on ordinal position (first, …, fifth) and stimulus SPL (60 dB versus 80 dB) was run. The N1 to the initial stimulus was largest at Cz. N1 to the initial stimulus in the train peaked at 110 ms. N1 latency did not differ as a function of ordinal position (P > 0.05). A significant main effect of stimulus position was observed for the amplitude of N1 (P < 0.01). N1 was much larger following presentation of the initial stimulus in the train. A significant main effect was also found for stimulus SPL (P < 0.01). The amplitude of N1 declined by about 40% following presentation of the 60 dB SPL stimulus. A late positive wave, P3, maximum at Pz, peaking at about 325 ms, was also visible following presentation of the initial stimulus. It was not apparent for the second to fifth stimulus presentations. The amplitude of this P3 did not significantly differ as a function of stimulus SPL.

Stage REM sleep

Ten subjects were tested in the 80 dB condition during REM sleep. Only six subjects passed a sufficient amount of time in stage REM to permit the presentation of the 60 dB SPL condition.

A one-way ANOVA was therefore run on the 80 dB data to compare the effects of ordinal position of the stimulus. As may be observed in the upper portion of Fig. 2, during stage REM, the grand-averaged ERP following the initial stimulus was again markedly different from that to the second through fifth stimuli. N1 was significantly larger to the initial stimulus (P < 0.01). This N1 was reduced by approximately 50% compared with the waking state. This was followed by a negativity, peaking at approximately 350 ms over fronto-central areas of the scalp. This N350 was significantly larger in response to the first stimulus compared with subsequent stimuli (P < 0.01). A late positivity was not apparent.

Figure 2.

Grand-averaged ERPs elicited by the train of five consecutive tones during REM sleep. A large, but reduced N1 is still apparent to the initial stimulus following presentation of the 80 dB tone. This is followed by a large frontal maximum negative wave, N350, peaking at about 350 ms. There is little evidence of a P3 component. Only a much attenuated N1 is visible following presentation of the initial 60 dB tone. The large N350 is still visible.

A two-way ANOVA was run to compare the effects of stimulus SPL and ordinal position in the six subjects having sufficient data. A significant main effect of stimulus position was again found (P < 0.01). N1 was significantly larger to the initial stimulus compared with the subsequent stimuli. A significant interaction between SPL and ordinal position was observed for the amplitude of N1 (P < 0.01). It was significantly larger following the initial 80 dB than the 60 dB SPL stimulus (Fig. 2, lower portion). The amplitude of N1 was not affected by stimulus SPL for the other ordinal positions. A significant interaction for the amplitude of the N350 was also apparent (P < 0.05). N350 was significantly larger to the initial stimulus than to any of the subsequent stimuli, regardless of stimulus SPL. The N350 was nevertheless significantly larger following initial presentation of the 80 dB than the 60 dB stimulus. An off-response occurring approximately 100 ms after stimulus offset was again apparent following each of the stimulus presentations.

Stage 2 sleep

All 12 subjects spent enough time in stage 2 to allow presentation of at least two SPL blocks. Complete data were available for only six subjects in the 60 dB condition. On approximately 35% of the 80 dB trains, a K-Complex was elicited by the initial stimulus in the train. A K-Complex was elicited on only 12% of the 60 dB stimulus presentations. Stimulus trains were thus sorted and averaged according to the presence or absence of a K-Complex following the initial stimulus. Fig. 3 illustrates the grand average of all trains in which the initial stimulus elicited a K-Complex (upper portion) and of those in which it did not (lower portion). In this figure, the 8 s sweep time is shown. Although a clear ERP is apparent following the initial stimulus (whether it elicited a K-Complex or not), the ERPs to the subsequent stimuli in the train were difficult to discern. For this reason, only the ERPs to the initial stimulus were further quantified. An expanded view of the grand averages following presentation of the initial stimulus in trains in which a K-Complex was not elicited is presented in the left portion of Fig. 4. The large N1, peaking at about 100 ms, that was elicited during wakefulness and REM sleep was at baseline level during stage 2 sleep. When the K-Complex was not elicited, a large negative wave peaking at about 350 ms was still quite apparent. The N350 was also apparent on trials in which the K-Complex was elicited (Fig. 5). It was followed by the exceedingly large N550 peak. The amplitude of the N350 following presentation of the 80 dB stimulus was significantly larger when a K-Complex was elicited than when it was not (P < 0.01). The effects of stimulus SPL were analyzed by a two-way ANOVA (SPL × presence–absence of K-Complex) in the six subjects having a sufficient amount of data. The N350 was significantly larger for the 80 dB stimulus and was larger when either the 60 or 80 dB stimulus also elicited the K-Complex (P < 0.05 in both cases). The N550 was almost eliminated when the K-Complex was not elicited. The K-Complex was elicited much more often by the 80 dB stimulus. However, as is apparent in Fig. 5, the amplitude of N550 was essentially identical in the 80 and 60 dB conditions.

Figure 3.

Grand-averaged ERPs following presentation of the 80 dB tones during stage 2 sleep. Note the difference in the amplitude and time calibration scales compared with previous figures. A long 8 s sweep time is displayed. This thus incorporates the responses to all five stimuli within the train. On approximately 35% of trains, a large amplitude K-Complex was elicited by the initial stimulus in the train (upper portion of the figure). Its amplitude exceeded 100 μV. The lower portion of the figure displays the grand average of all trials in which a K-Complex was not elicited. A large amplitude (approximately 25 μV) negative wave peaking at about 350 ms (N350) is apparent following presentation of the initial stimulus. The ERPs to the remaining four stimuli in the train are difficult to discern in the large amplitude background EEG of NREM sleep.

Figure 4.

The effects of stimulus SPL on NREM trials when no K-Complex was elicited. In both stage 2 and SWS (stages 3 and 4), N350 was nevertheless elicited by the initial stimulus in the train. N350 is reduced in amplitude following presentation of the 60 dB SPL stimulus.

Figure 5.

The effects of stimulus SPL on NREM trials when a K-Complex was elicited. In both stage 2 and SWS, a K-Complex was elicited by more than a third of the 80 dB initial presentations. The major peak, N550, did not differ in amplitude between stages 2 and 4. A K-Complex was elicited on fewer than 15% of presentations of the initial 60 dB stimulus. When it was elicited, the amplitude of the N550 did not differ from that observed following presentation of the louder 80 dB tone. Note the difference in the amplitude calibration compared with Fig. 4.

Slow wave sleep

The initial stimulus in the sequence again elicited a K-Complex on many trials. Because of its very large amplitude, only a few trials needed to be averaged for the signal to emerge from the large amplitude background EEG of SWS. In contrast, when the K-Complex was not elicited, the resulting small amplitude ERP required many more trials in order to become visible. The duration of SWS was also much shorter than that of stage 2. Fewer data were therefore available for analysis following averaging than in the case of either stage 2 or REM. A total of nine subjects had enough SWS to permit at least two replications of the 80 dB condition. Only five subjects spent enough time in SWS to permit a sufficient number of 60 dB trials to be presented. Only descriptive analyses of the 60 dB condition are therefore reported. A K-Complex was elicited on 34% and 22% of the initial presentations of the 80 and 60 dB stimuli, respectively. The ERPs following presentation of the second to fifth stimuli were especially difficult to discern. Statistical analyses were therefore again restricted to the initial stimulus presentation. As is apparent in the right portion of Fig. 4, the N350 was elicited by the 80 dB stimulus independently of the occurrence of the K-Complex. It was, however, significantly (P < 0.01) larger when a K-Complex was elicited (Fig. 5). Although statistical comparisons were not carried out, in those subjects in which a sufficient number of 60 dB stimuli could be presented, N350 was much reduced in amplitude compared with 80 dB trials. When a K-Complex was elicited, its amplitude did not differ markedly between 80 and 60 dB conditions.

Stimulus-elicited arousals

At times, the presentation of the acoustic stimuli was followed by an arousal (as mentioned in the Methods, a movement to a lower stage of sleep or an awakening-movement). An arousal was considered to be stimulus-elicited if the arousal occurred within 15 s of the initial stimulus presentation (i.e. 10 s after the end of the sequence). This definition of an arousal thus identifies brief, stimulus-induced arousals whereas that used in the standard sleep scoring system identifies longer duration arousals independent of stimulus presentation. Arousals were relatively rare, occurring on only 1–3% of trials following presentation of the loud, 80 dB stimuli. The small number of arousals only permitted a descriptive analysis of these results. The number of arousals decreased from light (stage 2) to deep (SWS) sleep, the proportion of arousals during REM sleep being intermediate. For the 80 dB stimulus, the number of arousals was independent of occurrence of a K-Complex. The proportion of arousals was affected by the SPL of the stimulus. They were exceedingly rare (fewer than 0.5% of train presentations) when the lower 60 dB SPL stimulus was employed.
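The 15 s rule for attributing an arousal to a stimulus train can be sketched as below. The function names and the example onset times are hypothetical; the only element taken from the text is the window of 15 s measured from the onset of the initial stimulus.

```python
def stimulus_elicited(train_onsets_s, arousal_onsets_s, window_s=15.0):
    """For each train, report whether an arousal began within window_s of the
    onset of its initial tone."""
    return [any(0.0 <= a - t <= window_s for a in arousal_onsets_s)
            for t in train_onsets_s]

def arousal_rate(train_onsets_s, arousal_onsets_s):
    """Percentage of trains followed by a stimulus-elicited arousal."""
    flags = stimulus_elicited(train_onsets_s, arousal_onsets_s)
    return 100.0 * sum(flags) / len(flags)

# Hypothetical onset times in seconds.
trains = [0.0, 22.0, 45.0, 66.0]
arousals = [30.5, 90.0]
print(stimulus_elicited(trains, arousals))
print(arousal_rate(trains, arousals))
```

In this made-up example only the arousal at 30.5 s falls within 15 s of a train onset (the train at 22 s); the arousal at 90 s begins 24 s after the last train and would be counted only by the standard, stimulus-independent scoring.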


A crucial factor affecting the brain's response to external stimulation in both the waking and sleeping states is the rate of stimulus presentation. In most studies, the rate of stimulus presentation is varied across separate conditions. For example, Bastien and Campbell (1994) presented auditory stimuli every 5 s in one condition, every 10 s in another and every 30 s in a third. A problem with this approach is that the researcher cannot be assured of a constant brain state across the different conditions. The present study instead manipulated the rate of stimulus presentation within the same condition by the use of stimulus trains: trains of five consecutive auditory stimuli were presented every 15–20 s, whereas the time between stimuli within a train was only 1 s. Whatever differences were found could not therefore be attributed to momentary fluctuation of the state of arousal between or within conditions.
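The timing of the stimulus trains can be sketched as follows. The tone duration (500 ms), the 1 s onset-to-onset interval and the 15–20 s offset-to-onset gap between trains are taken from the description of the stimuli; the function name, the uniform distribution of the gap and the fixed random seed are assumptions made for illustration only.

```python
import random


def train_schedule(n_trains, n_tones=5, soa_s=1.0, tone_dur_s=0.5,
                   gap_range_s=(15.0, 20.0), seed=0):
    """Return a list of trains, each a list of tone onset times in seconds.

    The gap between trains is measured from the offset of the last tone
    of one train to the onset of the first tone of the next, and is drawn
    uniformly from gap_range_s (the distribution is an assumption).
    """
    rng = random.Random(seed)
    schedule = []
    t = 0.0
    for _ in range(n_trains):
        train = [t + i * soa_s for i in range(n_tones)]
        schedule.append(train)
        train_offset = train[-1] + tone_dur_s
        t = train_offset + rng.uniform(*gap_range_s)
    return schedule


for train in train_schedule(3):
    print([round(onset, 2) for onset in train])
```

Within a train the stimuli follow one another at a rapid, fixed rate, whereas the first stimulus of each train follows a long silent interval. Both presentation rates are therefore sampled within the same condition and the same stretch of sleep.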

Most ERP studies use short duration stimuli (less than about 50 ms). A relatively long duration (500 ms) stimulus was used in this study. It elicited an on-response at the onset of the stimulus, followed by a ‘sustained negativity’ lasting the duration of the stimulus and an off-response at the termination of the stimulus, similar to that described by Picton et al. (1978). This sustained negativity was maximum over fronto-central areas of the scalp and as a result probably overlapped and summated with other fronto-central ERPs, most notably the later P2 and P3a deflections. The response to the initial stimulus in the train was quite different from that to subsequent stimuli in the train in both the waking and sleeping states and, within sleep, in both REM and NREM. In the Näätänen (1992) model of auditory processing, the time between stimulus presentations is claimed to play a critical role in the decision of whether central processing should be interrupted or not. As predicted, in the waking state, the amplitude of N1 was much larger to the first stimulus in the train than to the subsequent second to fifth stimulus repetitions. The amplitude of the N1 to the initial stimulus was attenuated when the lower 60 dB SPL stimulus was presented, confirming many previous reports (see Näätänen and Picton, 1987 for a review). When the amplitude of N1 exceeds a certain critical threshold, it is thought that the subject's attention will be automatically diverted away from the task at hand and toward the auditory stimulus. This switch of attention is associated with a centro-frontal positive component, the P3a (Escera et al., 1998). The P3a was not observed in this study, perhaps because of the overlapping effects of the sustained negativity. Rather, the initial stimulus in the train elicited a longer latency positivity that peaked at about 325 ms.
This positivity was maximum over the parietal region of the scalp and was thus less affected by the spatial overlap with the sustained negativity. This late, parietal positivity is more consistent with a P3b (or P300) than a P3a. The P3b has been associated with active (rather than passive) and overt detection of a rare stimulus event (Donchin & Coles, 1988). It can, however, be elicited in inattentive subjects if stimulus presentation is exceedingly rare and obtrusive (see Cote et al., 2001). The appearance of the P3b deflection does provide strong evidence that the initial stimulus must have been quite ‘obtrusive’, interrupting ongoing cognitive activity.

Clear identification of the central maximum P2 component was not possible in either the waking or sleeping states, probably because its summation with the sustained negativity effectively canceled its appearance at the scalp (Picton et al., 1978). A number of studies have reported that P2 may be larger in sleep than in wakefulness (reviewed by Crowley and Colrain, 2004). Further, Crowley and Colrain propose that P2 may reflect inhibition of auditory processing in the sleep state.

During NREM sleep, N1 was difficult to observe, even to the initial 80 dB stimulus in the train, confirming the results of many studies (Campbell and Colrain, 2002). A series of later negativities was elicited by the initial stimulus. These consisted of either the N350 when a K-Complex was not elicited or a composite N350–N550 when it was. ERPs to the subsequent stimuli in the train were difficult to discern in the background EEG, even following averaging. The initial 80 dB stimulus elicited a K-Complex on about 35% of trials in both stage 2 and stage 4. Very few K-Complexes were elicited by the subsequent stimuli in the train. This could reflect habituation of the K-Complex, or alternatively a very long recovery period of the network responsible for the generation of the K-Complex, as outlined by Bastien and Campbell (1994). They point out that it is quite difficult to distinguish between habituation and recovery effects. A classical set of criteria must be met in order for a response to be considered to have habituated (Thompson and Spencer, 1966). The most often cited criterion is that the response must gradually decline in amplitude upon repetition of the stimulus. What is typically found in K-Complex studies is an entire absence of a response rather than a gradual decline in its amplitude. The absence of subsequent K-Complexes is what was observed in the present study. Further criteria include a response recovery upon presentation of a novel stimulus inserted in the train of stimuli and subsequently dishabituation of the previously habituated response following presentation of the changed stimulus. Oddball studies in which a changed (or ‘deviant’) stimulus is presented at random among more frequently occurring ‘standard’ stimuli have indeed indicated that the deviant stimulus will elicit more K-Complexes than the frequently occurring standard (Bastuji et al., 1995, 2003; Colrain et al., 2000a; Niiyama et al., 1994). Bastien et al. (2002) have noted that this may not necessarily be evidence of habituation. The deviant stimulus may activate a different, non-refractory neuronal population than the frequent stimulus.

Bastien and Campbell (1992) have also indicated that the intensity of the auditory stimulus has a major effect on the probability of a K-Complex being elicited. In the present study, only a limited number of subjects were tested with the 60 dB stimulus. In these subjects, it was quite apparent that this low SPL stimulus elicited many fewer K-Complexes. When the 60 dB stimulus did elicit a K-Complex, the amplitude of the large N550 did not vary from that elicited by the 80 dB stimulus, replicating similar findings by Bastien and Campbell (1992). Once the K-Complex is elicited, the amplitude of N550 does not vary. Stimulus parameters may affect how often the K-Complex will be elicited. They will not affect the magnitude of the response. On trials in which the K-Complex was not elicited by the initial stimulus, a large amplitude N350 was apparent. The amplitude of N350 was significantly smaller compared with trials in which the K-Complex was elicited. The amplitude of N350 did vary as a function of stimulus SPL. The N350 was attenuated following presentation of the 60 dB SPL stimulus. Unfortunately, time permitted the testing of too few subjects to allow for definitive conclusions. This is nevertheless consistent with the Bastien and Campbell (1992) finding of a similar relationship between stimulus SPL and the amplitude of the N350. They suggested that the amplitude of N350 must reach a certain critical amplitude before the larger N550 K-Complex can be triggered.

External stimuli have often been reported to disrupt or fragment sleep. In the present study, in the morning following sleep, subjects often reported hearing the stimuli during the night but did not indicate that the presentations disturbed their sleep. A large number of studies have indicated that fragmented sleep, marked by many brief arousals, affects its recuperative value (Bonnet, 1986; Martin et al., 1999; Stepanski, 2002). Often, these brief ‘microarousals’ are not detected by the usual 20–30 s sleep stage scoring criteria. Consistent with this, in the present study, neither the absolute nor the relative quantity of the different sleep stages was affected by the presentation of the auditory stimuli. Similarly, the number of arousals and the duration of these arousals were not affected by the presentation of stimuli. These findings bear a resemblance to several field reports showing little if any ‘objective’ effects of automobile, rail and aircraft noise on sleep (see Fidell et al., 1994; Hume et al., 2003). On the other hand, laboratory reports indicate larger sleep disturbances from transportation noise sources (reviewed in Pearsons et al., 1995). The inability to replicate other laboratory results is a subject for further research but could be related to the characteristics of the stimulus. Most field and laboratory studies of noise examine the effects of acoustically complex long duration stimuli whereas in the present study, acoustically simple (pure tones), relatively short duration stimuli, were employed. Moreover, in the present study, any comparison of the two nights of sleep must take into account the well-documented confound of the first-night effect (Agnew et al., 1966). It has been observed that sleep on the first night in the laboratory is typically poorer than on subsequent nights. Whatever disruptive effects the stimuli might have had could be masked by the poor quality of sleep on the first night when stimuli were not presented. 
Finally, the subjects employed in the present study were young adult students. Although the questionnaire results indicated that they were normal sleepers, the sleep habits of this population may be more variable than those of older adults (Taub, 1981). It is possible that they may have deeper sleep and thus higher arousal thresholds than older adults.

The failure to find an effect of the acoustic stimuli on the different sleep stages runs counter to what might be expected from the frequent complaints about the disruptive effects of backup acoustic alarms on sleep (Thalheimer, 2000). It is unclear if these complaints are associated with awakenings and/or an inability to initiate/resume sleep from the wakened state. In this study, the presentation of the stimuli was stopped if subjects awakened and therefore we were unable to assess the impact these alarms may have had on returning to the sleeping state. It is also possible that the source of such complaints might be best detected through an analysis of microarousals. An arousal from sleep is defined as a temporary intrusion of wakefulness into sleep or at least a sudden transient elevation of vigilance level (Atlas Task Force, 1992). Unfortunately, there is little agreement on what physiological measures constitute a spontaneous or stimulus-induced arousal (Halász et al., 2004). Certainly, the appearance of wake-like ERPs would be considered to be atypical of sleep. Bastuji et al. (2003) recorded ERPs during naps using an oddball paradigm. When forcibly awakened, the subjects were asked to recall if they had heard stimuli during the nap. The odd (or rare) stimulus was associated with the occurrence of a P3b during the actual nap if subjects were able to later recall having heard the stimuli. However, when a K-Complex was elicited by the odd stimulus, subjects could later not recall hearing the stimuli. The sleeping ERPs may thus systematically predict later waking recall of disturbance. During non-REM sleep, the K-Complex is most often elicited by highly obtrusive (as in the present study) or psychologically relevant stimuli. For this reason, the K-Complex has long been considered to be a sign of cortical arousal (Roth et al., 1956; Sassin and Johnson, 1968). 
More recent evidence suggests that the K-Complex may serve a protective role, protecting sleep from needless disturbance. During stage 4, the EEG shifts toward slower frequencies shortly after the K-Complex is elicited (Bastien et al., 2000), suggesting that sleep is deeper following a K-Complex. Nicholas et al. (2002) have noted that the number of K-Complexes increases following a night of fragmented sleep. Crowley et al. (2002) have pointed out that an identical stimulus will elicit more K-Complexes in younger than in older subjects. This paucity of K-Complexes in the elderly may explain why their sleep is lighter and marked by frequent awakenings. Halász et al. (2004) suggested that part of the reason for the controversy in determining the role of the K-Complex is that NREM has generally been considered to be a uniform state. They propose that sleep consists of a cyclic pattern of ascending and descending slopes of arousal. The role of the K-Complex may well vary depending on whether it was recorded during an ascending or descending slope.

Several laboratories have described the N350 during NREM sleep (see Bastien et al., 2002 for a review). Colrain et al. (2000b) suggest that it may, in fact, be identical to the frequently described vertex sharp wave, a position emphasized earlier by Harsh et al. (1994). Harsh et al. postulate that the N350 may also reflect inhibition of auditory processing during the sleep onset period. It first appears when subjects no longer overtly respond to external auditory targets (Harsh et al., 1994; Ogilvie et al., 1991). Colrain et al. indicate that it appears during stage 1 sleep dominated by deeper theta rather than lighter alpha activity. This is again consistent with a loss of consciousness and increasing sleepiness. Voss and Harsh (1998) examined how a subject's personality characteristics might influence the N350. Individual differences in coping styles were categorized as Monitors (information seeking) or Blunters (information avoiding). Monitors made more detections of target stimuli both in wakefulness and in stage 1. During stage 1, the Monitors had a larger P3b response but a smaller N350 response. The Blunters, who did not appear to be conscious of the target stimulus (fewer behavioral detections), had a larger N350. This intriguing result may well explain why the sleep of certain individuals is more disturbed by external environmental noise. The Bastien and Campbell (1992) position also suggests that N350 may serve to inhibit information processing. They suggest that the amplitude of N350 might reflect the extent of this inhibition. When the stimulus is particularly obtrusive, the N350 inhibitory mechanism may not prove sufficient. A secondary protective system, the K-Complex, marked by the large N550, is then elicited. Crowley et al. (2002) have also reported a reduced N350 in the elderly. This may further explain their failure to initiate K-Complexes and their vulnerability to the disruptive effects of environmental noise.

During REM sleep, a wake-like N1 was evident following presentation of the initial stimulus. This was, however, much reduced compared with that elicited in the waking state. Crucially, neither a P3a nor a P3b could be recorded even for the loudest stimulus. Rather, a relatively large frontal negative wave peaking at about 350 ms was apparent, at least for the louder 80 dB stimulus. This was somewhat surprising. Many studies have claimed that N350 is unique to NREM sleep (see e.g. Bastien et al., 2002). Both Cote et al. (2001) and Perrin et al. (1999) reported a prominent posterior P3 to highly salient stimuli but also discussed an anterior negativity related perhaps to an absence of frontal affect. This frontal negativity may explain why subjects do not awaken to stimuli so obtrusive that they elicit a P3 during stage REM. The extent to which the N350s elicited in NREM and REM sleep represent similar generator sources or share a common functional significance is not known. Recordings with large multi-channel arrays will be needed to verify the scalp distribution of the different N350s.

The extent to which environmental noise from aircraft, rail and automotive sources actually disrupts sleep remains equivocal. On the one hand, residents living near these sources frequently complain about disrupted sleep. On the other hand, in field studies, the more objective macro measures of sleep, such as the quantification of sleep stages, generally have failed to show large effects even in apparently high noise areas. Perhaps the disruptive effects of environmental noise were too brief to be detected by macro measures. Few of these studies have examined so-called short duration microarousals.

The present study examined the short duration ERPs (occurring within about 500 ms following stimulus onset) as a measure of the extent of information processing during sleep. The stimuli that were employed simulated backup acoustic alarms, known to produce many subjective complaints about sleep disruption. The initial stimulus in the train of acoustic stimuli had a marked effect on the ERPs in the waking state and in both REM and NREM sleep. The subsequent stimuli in the train had very little effect. During NREM sleep, the initial stimulus elicited either a large N350 or a composite N350/N550 (i.e. a K-Complex). During stage REM, a negative wave, N1, peaking at about 100 ms, was elicited by the initial stimulus in the sequence. A later positive wave, P3a, thought to reflect distraction by the obtrusive stimulus, was however not apparent. A relatively large negative wave, peaking at about 350 ms was however apparent. The presence of an N1 and N350 but the absence of a P3-like wave could explain an apparent contradiction – while subjects did report some awareness that the stimuli had been presented, they did not perceive them to be disruptive. Further, whatever disruptions occurred to sleep were very brief and could not be detected by the standard sleep scoring procedures. Certainly, the results of the present study indicate that if trains of auditory stimuli do disrupt sleep, it is because of the very large effects of the initial rather than the subsequent stimuli in the train.


  • 1

    Event-related potentials are also sometimes called evoked potentials (EPs). Some laboratories restrict the use of the EP label to refer to responses that are generated in precortical regions such as the brainstem or thalamus, while those that are generated in cortical regions are labeled as ERPs. The EP label is also used to describe responses that are primarily affected by the physical parameters of the stimulus (thus, sensory EPs) while those that are affected by the more salient-cognitive aspects of the stimulus or psychological ‘events’ are called ERPs (or cognitive ERPs). This was not, however, the intended use of the ERP label when it was initially developed in the 1960s and 1970s.

  • 2

    In this article, the Näätänen and Picton (1987) definition of a component will be employed. It is ‘…the contribution to the recorded waveform of a particular generator process, such as the activation of a localized area of cerebral cortex by a specific pattern of input …Whereas the peaks and deflections of [the scalp-recorded potential] can be directly measured from the average waveform, the components contributing to these peaks can usually be inferred only from the results of the experimental manipulation’ (p. 376). In this definition, an ERP component must be associated with an independent intra-cranial generator. A different component will thus have a different intra-cranial generator. Donchin et al. (1978) define a component differently: ‘Functionally different [neuronal] aggregates need not be anatomically distinct neuronal populations. But it is assumed that neuronal aggregates whose activity will be represented by an ERP component have been distinctly affected by one or more experimental variables’ (p. 353). While the Näätänen and Picton definition places particular emphasis on a component's distinct intra-cranial generation, the Donchin et al. definition places more emphasis on the component's functional role.