Pupillometry and P3 index the locus coeruleus–noradrenergic arousal function in humans


  • The authors declare no conflicts of interest in conducting the research presented here. This research was supported by an Irish Research Council for Science, Engineering and Technology (IRCSET) “Embark Initiative” grant, awarded to P.R.M., an IRCSET Enterprise Partnership Scheme Fellowship to J.H.B., and an IRCSET Empower Fellowship to R.G.O'C. The authors also acknowledge funding support via the HEA PRTLI Cycle 3 program of the EU Structural Funds and the Irish Government's National Development Plan 2002–2006. We thank Elisa Tatti for her assistance with data collection, Robert Whelan for assistance with stimulus coding, and Mark Bellgrove for his valuable comments on an early draft of the manuscript.

  • [The copyright line for this article was changed on 13 July 2017 after original online publication.]

Address correspondence to: Peter R. Murphy, Room 3.60, Trinity College Institute of Neuroscience, Lloyd Building, Trinity College Dublin, Dublin 2, Ireland. E-mail: murphyp7@tcd.ie


The adaptive gain theory highlights the pivotal role of the locus coeruleus–noradrenergic (LC-NE) system in regulating task engagement. In humans, however, LC-NE functional dynamics remain largely unknown. We evaluated the utility of two candidate psychophysiological markers of LC-NE activity: the P3 event-related potential and pupil diameter. Electroencephalogram and pupillometry data were collected from 24 participants who performed a 37-min auditory oddball task. As predicted by the adaptive gain theory, prestimulus pupil diameter exhibited an inverted U-shaped relationship to P3 and task performance such that largest P3 amplitudes and optimal performance occurred at the same intermediate level of pupil diameter. Large phasic pupil dilations, by contrast, were elicited during periods of poor performance and were followed by reengagement in the task and increased P3 amplitudes. These results support recent proposals that pupil diameter and the P3 are sensitive to LC-NE mode.

Recent theoretical and empirical work has highlighted the pivotal role of the brain's locus coeruleus–noradrenergic (LC-NE) neuromodulatory system in regulating task engagement and optimizing performance according to environmental contingencies (Aston-Jones & Cohen, 2005). The LC is a small nucleus located in the dorsal pons and is the sole source of cortical NE, and its efferent projections innervate widely distributed areas of the cerebral cortex (Berridge & Waterhouse, 2003). High-frequency phasic LC activity is elicited by salient or task-relevant stimuli, and the resultant release of NE to the cerebral cortex potentiates stimulus processing by selectively increasing neuronal gain within task-relevant regions (Aston-Jones & Bloom, 1981; Aston-Jones & Cohen, 2005; Foote, Aston-Jones, & Bloom, 1980; Sara, 2009). The role of phasic LC activity in facilitating stimulus processing is supported by animal studies that highlight the phasic LC response as an important antecedent to appropriate behavioral responding in stimulus detection paradigms (Aston-Jones, Rajkowski, Kubiak, & Alexinsky, 1994; Clayton, Rajkowski, Cohen, & Aston-Jones, 2004; Rajkowski, Majczynski, Clayton, & Aston-Jones, 2004).

Based primarily on such intracranial recordings from animals, the adaptive-gain theory of LC-NE function (Aston-Jones & Cohen, 2005) states that relative levels of tonic and phasic LC activity relate to task performance in a manner that reflects the classic Yerkes–Dodson arousal curve (Yerkes & Dodson, 1908): Performance and phasic LC responding are optimal at an intermediate level of tonic LC activity, but shifts toward either end of the tonic activity continuum are associated with declining performance and nonspecific or attenuated phasic responses. More generally, the “phasic” LC mode, characterized by intermediate tonic activity, is hypothesized to drive exploitation of the current environment, whereas the “tonic” mode, characterized by high tonic activity, induces exploration of different environments and potentially rewarding opportunities (Aston-Jones & Cohen, 2005; Cohen, McClure, & Yu, 2007; Usher, Cohen, Servan-Schreiber, Rajkowski, & Aston-Jones, 1999). In the animal literature, the LC has been consistently shown to exhibit fluctuations between these modes of activity during simple attentional tasks, and such fluctuations correspond to significant periodicity in task performance (Aston-Jones et al., 1994; Rajkowski et al., 2004; Rajkowski, Kubiak, & Aston-Jones, 1994).

The importance of this LC arousal function in humans has been highlighted by pharmacological and genetic studies that corroborate the role of NE as a critical determinant of engagement and task performance on tests of attention (Coull, 2001; Greene, Bellgrove, Gill, & Robertson, 2009; Minzenberg, Watrous, Yoon, Ursu, & Carter, 2008; Nieuwenhuis, van Nieuwpoort, Veltman, & Drent, 2007; Smith & Nutt, 1996). Our understanding of the functional dynamics of LC-NE activity in humans has been hampered, however, by an absence of reliable, noninvasive neurophysiological markers that have sufficient temporal resolution to index the tonic and phasic shifts that are observed to occur within this system. Validating such indices will allow for the elucidation of prominent models of task engagement and performance in humans and expedite the development of novel biomarkers for the treatment of associated clinical conditions (e.g., attention-deficit hyperactivity disorder; Arnsten, 2009; Brennan & Arnsten, 2008).

The present study seeks to evaluate the utility of two candidate psychophysiological markers of the LC-NE system: the P3 event-related potential (ERP) and pupil diameter. The P3 has been one of the most heavily investigated ERPs, peaking 300–600 ms after a task-relevant stimulus and with a maximal distribution over centro-parietal midline electrode sites (Sutton, Braren, Zubin, & John, 1965). Despite the large amount of interest in this component as a clinical and neuro-cognitive marker (Polich, 2007), its precise functional origins are poorly understood. Recent evidence from animal, genetic, and pharmacological studies, however, suggests that the P3 may represent a cortical electrophysiological correlate of the phasic LC response (Nieuwenhuis, Aston-Jones, & Cohen, 2005; Nieuwenhuis, De Geus, & Aston-Jones, 2011). This hypothesis is driven in part by the remarkable similarities between the antecedent conditions and the classes of stimuli shown to drive both the LC phasic response and the P3 (see Nieuwenhuis et al., 2005, for an extensive review). Generally, task-relevant stimuli consistently evoke robust P3 components (e.g., Polich, 2007). Furthermore, those stimuli that are accompanied by a large P3 have a higher chance of being detected and responded to appropriately compared to those which fail to elicit a P3 (Hillyard, Squires, Bauer, & Lindsay, 1971; Parasuraman & Beatty, 1980), linking this component to task performance in a manner consistent with the LC phasic response. In one study that recorded monkey LC neuron activity and cortical ERPs simultaneously, both phasic LC activity and fronto-parietal ERPs analogous to the human P3 were selectively evoked by target stimuli and followed closely related time courses (Aston-Jones, Chiang, & Alexinsky, 1991). Pharmacological (Swick, Pineda, & Foote, 1994) and lesion (Pineda, Foote, & Neville, 1989) studies with primates also point to a causal role for the LC-NE system in P3 generation. 
There has been little investigation of this LC-P3 hypothesis in humans, and although genetic evidence has emerged linking P3 amplitude to a collection of single-nucleotide polymorphisms (SNPs) that code for NE synthesis and expression in the human brain (Liu et al., 2009), research on the effects of pharmacological NE manipulation on the P3 has yielded ambiguous results (e.g., Halliday et al., 1994; Studer et al., 2010; Turetsky & Fein, 2002). To date, the precise relationship of the P3 to real-time fluctuations in human LC activity and to the patterns of task performance that accompany such fluctuations has yet to be investigated in detail.

Whereas the P3 may index the phasic LC response, pupil diameter has been hypothesized to reflect both the tonic and phasic aspects of LC-NE activity. Although it has proven difficult to isolate a direct anatomical connection between LC and the pupillary dilator muscle (cf. Nieuwenhuis et al., 2011), baseline pupil diameter and intracranial recordings of tonic LC activity in the monkey have been found to correlate remarkably well, such that large pupil diameter appears to equate to high tonic LC activity (Rajkowski, Kubiak, & Aston-Jones, 1993). Pharmacological up-regulation of tonic NE has been found to increase baseline pupil diameter and decrease pupillary variability, which suggests a strong causal noradrenergic influence over pupil diameter dynamics (Hou, Freeman, Langley, Szabadi, & Bradshaw, 2005; Phillips, Szabadi, & Bradshaw, 2000). Furthermore, the well-documented pupil dilatory response that occurs to a wide range of task-relevant stimuli and events (Beatty, 1982) is consistent with the LC phasic response. More recently, a prestimulus measure of baseline pupil diameter has been shown to relate to task engagement in a manner explicitly predicted by the adaptive gain model of LC-NE function (Gilzenrat, Nieuwenhuis, Jepma, & Cohen, 2010; Jepma & Nieuwenhuis, 2011): High prestimulus pupil diameter predicted task disengagement and exploration of different reward opportunities (indicative of elevated tonic LC activity), whereas low prestimulus pupil diameter corresponded to task engagement and exploitation of the current source of the reward. In a simple auditory oddball task, Gilzenrat et al. also highlighted a negative linear relationship between prestimulus pupil diameter and both attentional engagement, as indexed by reaction times, and the size of phasic pupil dilations. 
The finding of an inverse relationship between baseline pupil diameter and phasic pupil dilation is consistent with the observed differentiation between tonic and phasic modes of LC activity in the animal literature and also corresponds well to the performance dynamics predicted from this model (Aston-Jones & Cohen, 2005).

Although two previous studies measuring P3 and phasic pupil dilation concurrently found comparable relationships between these measures and stimulus probability (Friedman, Hakerem, Sutton, & Fleiss, 1973) and monetary feedback (Steinhauer, 1982; cited in Steinhauer & Hakerem, 1992), these studies did not report any detailed fluctuations in task performance or how such periodicity in performance related to either measure. In the present study, we provide the first detailed examination of the relationships between single-trial measurements of the P3 potential and pupil diameter in the context of extended performance of an auditory version of the oddball task—a paradigm widely used in animal studies of LC function (Aston-Jones et al., 1991, 1994; Rajkowski et al., 1994, 2004; Swick, Pineda, & Foote, 1994; Swick, Pineda, Schacher, & Foote, 1994). The goals of our study were twofold: first, to further support the use of prestimulus pupil diameter as an index of fluctuations in task engagement predicted by the adaptive gain theory and, second, to establish the extent to which the P3 component shows sensitivity to these same changes.



Thirty-three participants took part in this study. Nine participants were excluded because of excessive artifacts in their pupil data, which precluded the reliable analysis of these data sets. This left a final sample of 24 participants (12 female, 1 left-handed), with a mean age of 24.4 years (SD=4.4). This final sample did not differ significantly from the excluded participants on any reported behavioral measures, and did not show differences in P3 or N1 amplitudes (all p values >.1). All participants had normal or corrected-to-normal vision and no history of psychiatric illness or head injury. They provided written informed consent before testing began, and all procedures were approved by the Trinity College Dublin ethics committee and in accordance with the Declaration of Helsinki.

Auditory Oddball Task

The auditory oddball task is a simple and well-established paradigm for the investigation of arousal effects on cognitive performance and has been shown to reliably evoke both pupillary dilations (Beatty, 1982) and robust P3 components (Polich, 2007). Here, stimuli were presented through headphones using the Presentation software suite (NeuroBehavioral Systems, San Francisco, CA). They consisted of 60-ms-duration sinusoidal tones of frequencies of 1000 Hz (targets) and 500 Hz (standards). Targets were pseudorandomly interspersed throughout the task and constituted 20% of the total number of trials. Participants were instructed to respond to target tones with a right index finger mouse click as quickly and accurately as possible while ignoring presentation of the nontarget standard tones.

Participants completed a practice run of the task to ensure that they were well acquainted with the instructions before beginning. They were seated comfortably at a distance of ∼50 cm from a 20-in. LED monitor (Dell P2011H; Dell Inc., Ireland) with their head supported by a chin rest and were instructed to maintain gaze on a white fixation cross presented over a black background at the center of the monitor (font size=48). The study was conducted in a dark room with the only ambient light provided by the fixation cross.

The total duration of the task was 37 min with no breaks. Tones were presented at an interstimulus interval (ISI) that varied pseudorandomly between 2.1 and 2.9 s, with an average of 178 target tones over the whole task (712 standards). To allow target-evoked pupil responses to return to baseline, the stimuli were ordered such that at least three standard tones were presented between targets, leaving a minimum intertarget interval of 8 s.
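As a quick consistency check on the design figures above, the trial counts and timing can be recomputed. This is only a sketch: it assumes that the mean ISI is the midpoint of the stated 2.1 to 2.9 s range, which the text does not state explicitly, and all variable names are illustrative.

```python
# Consistency check on the oddball design figures given in the text.
# Assumption (not stated in the source): the mean ISI equals the midpoint
# of the 2.1-2.9 s range.

n_targets = 178               # average number of target tones
n_standards = 712             # average number of standard tones
isi_min, isi_max = 2.1, 2.9   # seconds

n_trials = n_targets + n_standards
target_proportion = n_targets / n_trials

mean_isi = (isi_min + isi_max) / 2        # assumed mean ISI in seconds
duration_min = n_trials * mean_isi / 60   # approximate task length in minutes

# Three standards between two targets means four ISIs separate the targets.
min_intertarget_s = 4 * isi_min

print(n_trials, round(target_proportion, 2), round(duration_min, 1))
print(round(min_intertarget_s, 1))
```

Under these assumptions the figures are mutually consistent: 890 trials at a mean ISI of 2.5 s last roughly 37 min, targets make up 20% of trials, and four ISIs of at least 2.1 s give at least 8.4 s between consecutive targets.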

Data Acquisition and Processing

Continuous electroencephalogram (EEG) was acquired using an ActiveTwo system (BioSemi, The Netherlands) from 64 scalp electrodes, configured to the standard 10/20 setup and digitized at 512 Hz. Vertical and horizontal eye movements were recorded using two vertical electrooculogram (EOG) electrodes placed above and below the left eye and two horizontal EOG electrodes placed at the outer canthus of each eye, respectively. Continuous EEG data were re-referenced off-line to the average reference, high-pass filtered at 0.53 Hz, and low-pass filtered at 35 Hz. Data from the 64 scalp electrodes for each participant were then subjected to temporal independent component analysis (ICA) using infomax (Bell & Sejnowski, 1995), as implemented in EEGLAB (Delorme & Makeig, 2004), for removal of EOG and other noise transients.

Continuous pupil diameter was recorded using an Eyestart eye-tracker (ASL, Bedford, MA). Pupil diameter of the left eye was sampled at a rate of 50 Hz with a spatial resolution of greater than 0.01 mm. As a preliminary preprocessing measure, artifacts and blinks were interpolated using a linear interpolation algorithm in the ASL Results software suite. All participants' data were visually inspected after interpolation, and those with excessive artifacts still remaining (e.g., blinks of long duration or excessively noisy periods of data) were excluded from further analyses (n=9). Continuous pupil diameter data sets from the remaining participants were up-sampled to 512 Hz for compatibility with the EEG data.

Event markers emitted by the stimulus presentation computer were recorded simultaneously during EEG and pupil diameter acquisition. Before combining data streams from the respective modalities for analysis, 3-s epochs were extracted around each stimulus marker from −1 to +2 s relative to stimulus presentation. EEG data set epochs were baseline corrected relative to the mean activity in the 100 ms directly preceding stimulus presentation, whereas epochs from the pupil data sets were baseline corrected to the prestimulus interval of 1 s. All further processing was carried out using a combination of in-house MATLAB scripting and EEGLAB (Delorme & Makeig, 2004).
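The epoching and baseline-correction steps described above can be illustrated with a minimal sketch. The authors used MATLAB and EEGLAB; the Python functions below are hypothetical stand-ins written for illustration only, and they assume that each signal is a plain list of samples and that the stimulus onset is given as a sample index.

```python
from statistics import mean

def extract_epoch(signal, onset_idx, fs, pre_s=1.0, post_s=2.0):
    """Return the samples from pre_s before to post_s after stimulus onset."""
    start = onset_idx - int(round(pre_s * fs))
    stop = onset_idx + int(round(post_s * fs))
    if start < 0 or stop > len(signal):
        raise ValueError("epoch extends beyond the recording")
    return signal[start:stop]

def baseline_correct(epoch, fs, pre_s=1.0, baseline_s=0.1):
    """Subtract the mean of the last baseline_s seconds before stimulus onset.

    baseline_s=0.1 mirrors the EEG baseline (100 ms) and baseline_s=1.0
    mirrors the pupil baseline (1 s) described in the text.
    """
    onset = int(round(pre_s * fs))          # index of stimulus onset in the epoch
    n_base = int(round(baseline_s * fs))    # number of baseline samples
    baseline = mean(epoch[onset - n_base:onset])
    return [x - baseline for x in epoch]
```

For data sampled at 512 Hz, `extract_epoch(signal, onset_idx, 512)` returns 1536 samples, of which the first 512 precede the stimulus.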

EEG/pupil diameter data sets were subject to further artifact rejection criteria applied between −100 and +800 ms relative to the stimulus for the EEG epochs and between −1 and +2 s for the pupil epochs. Any epochs with an EEG amplitude exceeding ±90 μV or with a peak pupil diameter exceeding ±2 mm were rejected. To eliminate instances of brief, high-amplitude noise in the up-sampled pupil data, any epoch in which the difference between two consecutive samples exceeded ±0.03 mm was rejected. Epochs were also removed from each data set if any pupil diameter data point exceeded the combined mean of that epoch plus two neighboring epochs to either side by 5 standard deviations or more (for a similar approach, see Porter et al., 2010). Finally, all epochs on which participants responded to standard tones (false alarms; M=1.50, SD=1.69), failed to respond to target tones (misses; M=0.54, SD=1.67), or responded within the first 100 ms after target presentation (quick responses; M=0.04, SD=0.20) were also removed from the data. A total of 19 participants had no misses on the task, and 15 participants had one or zero false alarms, which precluded any analysis of target detection accuracy. After applying the above criteria, a mean of 167 (SD=9.87) target trials remained per participant.
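The amplitude and sample-to-sample criteria above can be sketched as simple threshold checks. This is an illustrative Python sketch rather than the authors' MATLAB code: the function names are hypothetical, inputs are assumed to be lists of baseline-corrected samples already restricted to the relevant time window, and the 5-standard-deviation neighboring-epoch criterion is not implemented here.

```python
def reject_eeg_epoch(eeg_uv, threshold_uv=90.0):
    """True if any EEG sample in the tested window exceeds +/- threshold_uv."""
    return any(abs(v) > threshold_uv for v in eeg_uv)

def reject_pupil_epoch(pupil_mm, peak_mm=2.0, max_step_mm=0.03):
    """True if the pupil epoch fails the peak or the consecutive-sample criterion."""
    # Criterion 1: peak (baseline-corrected) pupil diameter beyond +/- peak_mm.
    if any(abs(v) > peak_mm for v in pupil_mm):
        return True
    # Criterion 2: jump between two consecutive samples beyond max_step_mm.
    steps = (abs(b - a) for a, b in zip(pupil_mm, pupil_mm[1:]))
    return any(s > max_step_mm for s in steps)
```

An epoch would be kept only if both functions return `False` for its EEG and pupil data, respectively.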


Target stimuli evoked an auditory N1 component with a central topography as well as a large positive component over centro-parietal scalp areas (the P3). The P3 component was the primary focus of this study, but the N1 was included as a control to evaluate the unique sensitivity of the P3 to changes in task engagement. In accordance with the spatial topography of both components in the grand average (see Figure 1a for grand-average P3 topography), the P3 was analyzed at electrode Pz and the N1 at electrode Cz. Similarly, the widths of the latency windows used to identify component amplitudes were informed by the duration of each component in the grand average. The majority of ERP studies to date have averaged across trials in order to eliminate extraneous noise from their measures, but this approach fails to take account of the fact that task engagement has been shown to fluctuate significantly over a relatively short time scale (<1 min; e.g., Jung, Makeig, Stensmo, & Sejnowski, 1997; Makeig & Jung, 1995, 1996; O'Connell, Dockree, Robertson, Bellgrove, Foxe, & Kelly, 2009). We therefore isolated single-trial measures of the P3 and pupil diameter to allow a better characterization of their dynamics and relationships to task performance. A denoising procedure was used to obtain reliable single-trial measures of N1 and P3 amplitude (see Spencer, 2004, for a discussion). The EP_den_v2 plug-in for MATLAB (Quian Quiroga & Garcia, 2003) uses wavelet decomposition of the average ERP as a denoising template and applies the wavelet coefficients that are correlated with the ERPs of interest back to each single trial. Wavelet denoising in this way has optimal resolution in both the frequency and time domains and allows the effective removal of extraneous noise from the single-trial ERPs. This process was applied separately to the P3 and N1 components. 
Because there was substantial variability in amplitude around the onset of the denoised N1, we defined this component as the peak-to-peak measure of the maximum voltage (in microvolts) between 70 and 110 ms poststimulus minus the minimum voltage 100–200 ms poststimulus. By contrast, the activity at denoised P3 onset was relatively homogeneous (e.g., Figure 1a), and it was therefore defined as the peak amplitude 250–600 ms poststimulus. A wide latency window was used for the P3 because of the substantial latency differences in this component across different periods of the task.

Figure 1.

 Grand-average P3 ERPs and phasic pupil dilation waveforms sorted into quintiles according to pretarget pupil diameter. Target tones evoked both P3 components (a; accompanied by grand-average topography) and large phasic pupil dilations (b), sorted here into quintiles according to pretarget pupil diameter. See Method for a detailed description of the sorting procedure.

Target tones also evoked significant dilatory responses in the pupil (Figure 1b), and visual inspection of the raw data indicated that, despite baselining, there remained substantial variability in pupil diameter at the onset of dilation. We therefore defined pupil dilation (in millimeters) as the peak-to-peak measure of the maximum dilation between 0.4 and 2 s poststimulus minus the minimum pupil diameter 0–0.4 s poststimulus.

We also examined a marker of prestimulus, baseline pupil diameter. As in recent research (Gilzenrat et al., 2010; Jepma & Nieuwenhuis, 2011), prestimulus pupil diameter on each epoch was calculated by averaging the 1 s of pupil diameter data preceding tone presentation on that epoch. Thus our analyses included both baseline and stimulus-evoked or phasic changes in pupil diameter.
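The two pupil measures defined above can be sketched as follows. This is an illustrative Python sketch with hypothetical function names; it assumes that each epoch is a list of samples spanning −1 to +2 s around the tone, so that stimulus onset falls `pre_s * fs` samples into the epoch.

```python
from statistics import mean

def prestimulus_diameter(raw_epoch, fs, pre_s=1.0):
    """Mean pupil diameter over the pre_s seconds preceding tone onset.

    Note: this must be computed on data that have not been baseline
    corrected to the same interval, otherwise it is zero by construction.
    """
    onset = int(round(pre_s * fs))
    return mean(raw_epoch[:onset])

def phasic_dilation(epoch, fs, pre_s=1.0, split_s=0.4, end_s=2.0):
    """Peak-to-peak dilation: max in 0.4-2 s minus min in 0-0.4 s poststimulus."""
    onset = int(round(pre_s * fs))
    split = onset + int(round(split_s * fs))
    end = onset + int(round(end_s * fs))
    return max(epoch[split:end]) - min(epoch[onset:split])
```

Because the dilation measure is a difference between two points within the same epoch, it is unaffected by whether a constant baseline has been subtracted from that epoch.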

Lastly, for measures of task performance, we calculated reaction time (RT; in milliseconds) and RT coefficient of variation (CV). The latter is a stringent measure of performance variability that has demonstrated sensitivity to the efficiency of frontal top-down control networks (Bellgrove, Hester, & Garavan, 2004; Stuss, Murphy, Binns, & Alexander, 2003), calculated by dividing the standard deviation in RTs for a group of epochs by their mean.
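The RT CV calculation described above amounts to a single ratio. The sketch below is illustrative: the function name is hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, because the text does not specify which estimator was used.

```python
from statistics import mean, stdev

def rt_cv(rts_ms):
    """Coefficient of variation: SD of the RTs divided by their mean.

    Assumption: sample SD (n - 1 denominator); the source does not say.
    """
    if len(rts_ms) < 2:
        raise ValueError("need at least two reaction times")
    return stdev(rts_ms) / mean(rts_ms)
```

Because the standard deviation is scaled by the mean, RT CV indexes variability independently of overall response speed.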

Analysis and Statistics

Our primary analyses focused on sorting and binning each participant's epochs according to different variables of interest: pretarget pupil diameter, P3 amplitude, pupil dilation amplitude, and time on task. As a general guiding principle, the selection of sorting variables was determined by the relative onset latencies of the measures in question. On the basis of the assumption that earlier processing stages can affect later ones but not vice versa, measures were only used as “sorting” variables if they occurred earlier or simultaneously in time compared to the sorted variables. Because the present study sought to elucidate the relationship of these measures to the hypothesized Yerkes–Dodson LC-NE arousal function, we chose to bin epochs into quintiles: This facilitated the investigation of possible quadratic trends in the data while also ensuring sufficient epochs per bin (M=33; SD=1.97). To illustrate, sorting according to pretarget pupil diameter meant binning the 20% of each participant's epochs with the lowest pretarget pupil diameters into Quintile 1, and so on, up to the 20% of that participant's epochs with the highest pretarget pupil diameters in Quintile 5. Our analyses proceeded in four stages: First, we sought to examine the relationship between pretarget pupil diameter and task performance; second, we probed how phasic pupil dilations related to task performance dynamics; third, we investigated the extent to which the P3 component related to these measures; and last, we investigated time-on-task effects across measures. Each comparison was analyzed using repeated measures analysis of variance (ANOVA) with five levels of quintile. As an exception, the P3 and N1 measures were incorporated into the same 5 × 2 ANOVA, with five levels of quintile and two levels of component. This analysis enabled the investigation of ERP effects specific to the P3 component. 
Greenhouse–Geisser corrected degrees of freedom were used in cases of violated sphericity with corrected degrees of freedom reported. We also conducted planned comparisons of the first and fifth quintiles for all measures in order to highlight relationships that may only exist at the high and low extremes of the sorting variables. This enabled indirect comparison of our results to those from the recently published study by Gilzenrat et al. (2010) where appropriate.
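The sorting-and-binning procedure described above can be sketched for a single participant as follows. This is an illustrative Python sketch with a hypothetical function name; how epochs were allocated when the epoch count was not divisible by five is not described in the text, so the integer-division split used here is an assumption.

```python
from statistics import mean

def quintile_means(sort_values, measure_values, n_bins=5):
    """Sort epochs by sort_values, split into n_bins bins, average the measure per bin."""
    if len(sort_values) != len(measure_values):
        raise ValueError("inputs must have the same length")
    n = len(sort_values)
    if n < n_bins:
        raise ValueError("need at least one epoch per bin")
    # Epoch indices ordered from the lowest to the highest sorting value.
    order = sorted(range(n), key=lambda i: sort_values[i])
    bin_means = []
    for b in range(n_bins):
        lo = b * n // n_bins          # assumed handling of uneven bin sizes
        hi = (b + 1) * n // n_bins
        bin_means.append(mean(measure_values[i] for i in order[lo:hi]))
    return bin_means
```

For example, `quintile_means(pretarget_diameter, p3_amplitude)` would return one mean P3 amplitude per pretarget pupil diameter quintile, and the five values per participant would then enter the repeated measures ANOVA.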

As part of the second stage of our analysis, a detailed examination of the relationship of phasic pupil dilations to task performance dynamics was conducted. Epochs containing the 20% largest pupil dilations (i.e., those constituting Quintile 5 when epochs were sorted by pupil dilation) were isolated for each participant, and changes in our behavioral and physiological measures were examined in the three epochs before and the one epoch after these maximum dilations (Epoch −3 to Epoch +1). For those measures specifically evoked by target stimuli (pupil dilation, P3, RT, RT CV), the groups of five epochs isolated for this analysis therefore consisted of consecutive target epochs and spanned an average time range of approximately −30 to +10 s relative to Epoch 0. Prestimulus pupil diameter could also be extracted prior to standard tones, which allowed for a more temporally confined picture of its dynamics preceding and following Epoch 0; consequently, Epochs −3 to +1 in this analysis consisted of the three standard epochs before and the one standard epoch after the target epochs containing maximum dilations and spanned an average time range of approximately −8.5 to +2 s relative to these dilations. Any maximum-dilation epoch flanked by one or more target tones within this 10.5-s range was excluded from analysis. In all of these analyses, separate repeated measures ANOVAs were used to examine the pre- and postmaximum pupil dilation trends in each measure.
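The selection of target epochs surrounding the largest dilations can be sketched as follows. This is an illustrative Python sketch with a hypothetical function name; it operates on a list of per-target dilation values in presentation order, and skipping windows that run past the first or last target is an assumption that the text does not address. The additional exclusion rule for the standard-epoch analysis is not implemented.

```python
def peri_dilation_windows(dilations, n_before=3, n_after=1, n_bins=5):
    """Index windows (Epoch -3 to Epoch +1) around the largest-dilation epochs."""
    n = len(dilations)
    n_top = n // n_bins   # the 20% of epochs with the largest dilations
    ranked = sorted(range(n), key=lambda i: dilations[i], reverse=True)
    top = sorted(ranked[:n_top])   # back into presentation order
    windows = []
    for idx in top:
        if idx - n_before < 0 or idx + n_after >= n:
            continue   # assumed: drop windows running off either end of the task
        windows.append(list(range(idx - n_before, idx + n_after + 1)))
    return windows
```

Each returned window lists five consecutive target indices, with the maximum-dilation epoch (Epoch 0) in the fourth position.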


Prestimulus Pupil Diameter and the LC-NE Arousal Function

Our first analyses focused on the sorting and binning of epochs according to pretarget pupil diameter in order to investigate the extent to which this measure might show an inverted-U relationship to task performance in a manner consistent with the LC arousal function. Behaviorally, there was no effect of pretarget pupil diameter quintile on RT (p=.436), but there was a significant main effect on RT CV, F(4,92)=2.56, p<.05, η2=.1, which was driven by a U-shaped quadratic trend, F(1,23)=8.81, p<.01, η2=.28, centered on an intermediate level of pretarget pupil diameter (Quintile 3; Figure 2a).
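A single-degree-of-freedom quadratic trend of the kind reported above can be computed from each participant's five quintile means. The sketch below is illustrative and is not the authors' analysis code: it assumes that the quintiles are treated as five equally spaced levels, uses the standard orthogonal polynomial weights for that case, and tests the per-participant contrast scores against zero.

```python
from math import sqrt
from statistics import mean, stdev

# Orthogonal polynomial weights for a quadratic trend over 5 equally spaced levels.
QUADRATIC = (2, -1, -2, -1, 2)

def quadratic_trend_f(quintile_rows):
    """Return (F, (df1, df2)) for the quadratic contrast.

    quintile_rows: one row of five quintile means per participant.
    F is the squared one-sample t statistic of the contrast scores.
    """
    scores = [sum(w * x for w, x in zip(QUADRATIC, row)) for row in quintile_rows]
    n = len(scores)
    t = mean(scores) / (stdev(scores) / sqrt(n))
    return t * t, (1, n - 1)
```

With 24 participants this yields F on (1, 23) degrees of freedom, matching the degrees of freedom of the quadratic trend reported here; positive mean contrast scores correspond to a U shape and negative scores to an inverted U.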

Figure 2.

 Pupil diameter and task engagement. a: Reaction time coefficient of variation (RT CV) exhibited a U-shaped relationship to pretarget pupil diameter: Epochs marked by an intermediate pretarget pupil diameter were associated with good performance, indicative of increased engagement in the task (second-order polynomial line of best fit drawn in black). b: Phasic pupil dilations were strongly inversely related to their corresponding pretarget pupil diameters, and (c) larger pupil dilations were marked by relatively poor task performance. Error bars depict standard error of the mean.

In contrast to Gilzenrat et al. (2010), we did not observe any significant difference in RT or RT CV when comparing epochs from the largest and smallest pretarget pupil diameter quintiles (p=.8 and p=.6, respectively), although our numerical trends were in the same direction. This remained the case when we compared the highest and lowest pupil diameter quartiles in an identical manner to the analysis carried out by Gilzenrat et al. (RT, p=.9; RT CV, p=.9).

Pupil Dilation and Phasic Reorienting

We next investigated the relationship of phasic pupil dilations to our other physiological measures and to task performance. First, we replicated the earlier finding of Gilzenrat et al. (2010) that the amplitude of phasic pupil dilations had a strong inverse relationship with pretarget pupil diameter (Figure 2b; significant main effect of pretarget pupil diameter quintile, F(1.6,36)=85.28, p<.001, η2=.79).

Although there was a visible trend toward a linear relationship between pupil dilation and RT when epochs were sorted according to the former (Figure 2c), there was no significant main effect of quintile (p=.3) and no significant first versus fifth quintile difference (p=.1). Similarly, there was no main effect of quintile on RT CV (p=.1), although here there was a significant difference, F(1,23)=12.85, p<.01, η2=.36, between Quintile 1 (M=0.185, SD=0.056) and Quintile 5 (M=0.227, SD=0.078).

To better understand the functional significance of phasic pupil dilations, we investigated changes in our behavioral and psychophysiological measures before and after trials on which the largest dilations occurred. Behaviorally, the maximum-dilation epochs appeared to be preceded by a progressive slowing of RT and followed by a significant improvement in performance (Figure 3a). The trend of increasing RTs from Epoch −3 to Epoch 0 neared significance, F(3,69)=2.47, p=.069, η2=.1, and there was a significant speeding of RT from Epoch 0 to Epoch +1, F(1,23)=6.28, p<.05, η2=.21. The same analyses were conducted on the RT CV data, and although similar numerical trends were apparent across the five epochs, neither the main effect from Epoch −3 to Epoch 0 (p=.6) nor the decrease in RT CV from Epoch 0 to Epoch +1 (p=.095) reached significance.

Figure 3.

 Large pupil dilations characterized by task disengagement followed by reengaging. a: Maximum pupil dilations (epochs extracted from pupil dilation Quintile 5) were preceded by progressively poor task performance as indexed by reaction times (RT) and followed immediately by an improvement in performance. b: These dilations were also preceded by a progressive decrease in prestimulus pupil diameter on the standard trials directly before target presentation. Error bars depict standard error of the mean.

Maximum dilations were also preceded by a gradual decline in prestimulus pupil diameter (Figure 3b). This decrease (from standard Epoch −3 to Epoch 0) was highly significant, F(2,46.7)=51.65, p<.001, η2=.69, as was the subsequent increase in pupil diameter from Epoch 0 to Epoch +1, F(1,23)=69.08, p<.001, η2=.75.

P3 and the LC-NE Arousal Function

Having established the relationship between our pupillometry measures and task performance, we applied the same analysis techniques to the P3. The auditory N1 component was also included in these analyses in order to gauge the unique sensitivity of the P3. When epochs were again sorted by pretarget pupil diameter, combined P3/N1 analysis (Figure 4a) revealed a significant main effect of quintile, F(2.7,61.5)=3.54, p<.05, η2=.13, and a significant Component × Quintile interaction, F(3,68.8)=2.86, p<.05, η2=.11. When we unpacked this effect by separate post hoc ANOVAs for each component, it emerged that P3 amplitude had a significant inverted U-shaped relationship with pretarget pupil diameter (main effect of quintile, F(2.5,57.9)=4.22, p<.05, η2=.16; quadratic trend, F(1,23)=11.41, p<.01, η2=.33), whereas there was no relationship between pretarget pupil diameter and the N1 (p=.6). This indicates that P3 amplitude and task performance (RT CV) showed quadratic relationships to pretarget pupil diameter centered on the same intermediate level: P3 amplitude was largest where RT variability was lowest.

Figure 4.

 P3 modulated by task engagement. a: There was a quadratic relationship between pretarget pupil diameter and P3 amplitude that closely mirrored the relationship between pretarget pupil diameter and task engagement (second-order polynomial line of best fit drawn in black). The P3 and pupil dilation exhibited opposite relationships to task performance (compare b and Figure 2c) and were not directly related to each other (c). d: There was also a significant increase in P3 amplitude on epochs directly following large pupil dilations. Error bars depict standard error of the mean.

The relationship between P3 amplitude quintile and RT CV did not reach significance (p=.3), nor did post hoc comparisons. However, there was a significant relationship between P3 amplitude quintile and RT (Figure 4b), F(4,92)=2.64, p<.05, η2=.1, with faster RTs observed at increasing P3 amplitudes. Therefore the P3 and phasic pupil dilation exhibited opposite relationships to task performance. This behavioral dissociation between the P3 and pupil dilation was also reflected in a direct comparison between the two measures: No significant relationship was observed between P3 and phasic pupil dilation when epochs were sorted by P3 amplitude (Figure 4c; p=.8).

Lastly, we investigated P3 dynamics in the epochs surrounding the largest pupil dilations in order to elucidate further the relationship between these two measures (Figure 4d). The P3 and N1 amplitude data on Epochs −3 to 0 relative to maximum dilations were entered into a 4 × 2 ANOVA with four levels of epoch and two levels of component. No main effect of epoch was found (p=.3), and there was no Component × Epoch interaction (p=.9). However, there was a significant increase in amplitude for both components from Epoch 0 to Epoch +1, main effect of epoch: F(1,23)=8.18, p<.01, η2=.26. There was no Component × Epoch interaction in this comparison (p=.2), indicating that this ERP “boosting” effect after large pupil dilatory responses was not specific to the P3.
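As a rough illustration of this epoch-locked comparison, the sketch below averages a trial-wise amplitude series at fixed offsets around selected target epochs. The selection rule for the "largest" dilations (the top 20% here) and all function names are hypothetical; the criterion actually used is not restated in this section.

```python
import numpy as np

def largest_dilation_epochs(dilation, top_fraction=0.2):
    """Indices of target epochs whose phasic dilation falls in the top
    `top_fraction` of the distribution (hypothetical selection rule)."""
    dilation = np.asarray(dilation, dtype=float)
    cutoff = np.quantile(dilation, 1.0 - top_fraction)
    return np.flatnonzero(dilation >= cutoff)

def amplitude_around_events(amplitude, event_idx, offsets=(-3, -2, -1, 0, 1)):
    """Mean amplitude at each epoch offset relative to the event epochs.
    Events whose full window falls outside the recording are dropped."""
    amplitude = np.asarray(amplitude, dtype=float)
    event_idx = np.asarray(event_idx, dtype=int)
    lo, hi = min(offsets), max(offsets)
    inside = (event_idx + lo >= 0) & (event_idx + hi < len(amplitude))
    valid = event_idx[inside]
    return {k: float(amplitude[valid + k].mean()) for k in offsets}
```

Passing the indices returned by `largest_dilation_epochs` to `amplitude_around_events`, once with P3 amplitudes and once with N1 amplitudes, gives the per-participant values at Epochs −3 to +1 that a Component × Epoch analysis would then compare.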

Time-on-Task Effects

Vigilance models have often interpreted time-on-task performance decrements in terms of decreasing arousal, and the LC has often been implicated in this process (e.g., Coull, 1998; Paus et al., 1997). Therefore, for our final analysis, we investigated the effects of time on task on each measure.

As stated above, 19 of our total sample of participants (n=24) performed the entire task at ceiling. Even when the five participants whose performance was below 100% accuracy were isolated (mean misses=2.6, SD=3.0), they showed no effect of time-on-task quintile on performance accuracy (p=.6). Although there were trends toward an RT decrement with time on task (Figure 5a), neither RT (p=.06) nor RT CV (p=.07) showed significant main effects of quintile. Further analyses did reveal that the first 20% of epochs during the task (Quintile 1) were characterized by significantly faster RTs (M=421 ms, SE=18), F(1,23)=9.17, p<.01, η2=.29, and less RT variability (M=0.17, SE=0.05), F(1,23)=8.65, p<.01, η2=.27, when compared with the final 20% of epochs (Quintile 5; RT: M=447 ms, SE=22; RT CV: M=0.22, SE=0.07).
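For readers who wish to reproduce this kind of comparison, a minimal sketch follows. It assumes that RT CV is the standard deviation of RT divided by the mean RT within each block, and that time-on-task quintiles are consecutive, near-equal blocks of target epochs in presentation order; the function name is illustrative.

```python
import numpy as np

def rt_stats_by_time_quintile(rts, n_bins=5):
    """Split RTs (in presentation order) into n_bins consecutive blocks and
    return the mean RT and the coefficient of variation (SD / mean) per block."""
    blocks = np.array_split(np.asarray(rts, dtype=float), n_bins)
    mean_rt = np.array([b.mean() for b in blocks])
    # Sample SD (ddof=1) is an assumption; the population SD is also common.
    cv = np.array([b.std(ddof=1) / b.mean() for b in blocks])
    return mean_rt, cv
```

Comparing the first and last elements of each returned array across participants corresponds to the Quintile 1 versus Quintile 5 contrast described above.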

Figure 5. Time-on-task effects. Measures of performance (RT, RT CV) showed trends toward a time-on-task decrement (a). Both P3 amplitude and pupil dilation decreased with time spent on the task, whereas pretarget pupil diameter significantly increased (b). Data are expressed in terms of quintile z-scores relative to Quintile 1, averaged across participants.

Robust time-on-task effects were found across our psychophysiological measures (Figure 5b). Pupil dilation and pretarget pupil diameter exhibited inverse time-on-task relationships with respect to one another: The former decreased as the task progressed, F(4,92)=13.28, p<.001, η2=.37, whereas the latter increased, F(2.2,51)=11.95, p<.001, η2=.34.
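The effect sizes in this paragraph are consistent with partial eta squared computed from the F ratio and its degrees of freedom. The sketch below makes that check explicit. That η2 denotes partial eta squared is an assumption here, since this section does not say so, and the function name is illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom:
    eta_p^2 = (F * df_effect) / (F * df_effect + df_error)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Values taken from the paragraph above.
dilation_effect = partial_eta_squared(13.28, 4, 92)     # reported as .37
diameter_effect = partial_eta_squared(11.95, 2.2, 51)   # reported as .34
```

The same relation also holds for tests with fractional (sphericity-corrected) degrees of freedom, although rounding of the reported degrees of freedom can shift the second decimal place.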

Combined P3/N1 analysis revealed a significant main effect of time-on-task quintile, F(4,88)=4.89, p<.01, η2=.17, and a significant Component × Quintile interaction, F(2.9,64.8)=2.8, p<.05, η2=.12. Post hoc ANOVAs were then conducted separately for the P3 and N1 components to decompose this effect and showed that whereas N1 amplitude exhibited little change as the task progressed (no effect of quintile: p=.2), the P3 became significantly smaller, F(4,92)=5.16, p<.01, η2=.18. This indicates that time on task had a unique effect on P3 amplitude.


To our knowledge, the present study constitutes the first detailed investigation in humans of the interrelationships between performance dynamics on a widely used attentional task and two putative psychophysiological indices of LC-NE system activity: the P3 ERP and pupil diameter. In so doing, we demonstrate that pupil diameter and the P3 closely mirror the changes in task engagement that are predicted by the adaptive gain theory of LC-NE function (Aston-Jones & Cohen, 2005). Baseline, prestimulus pupil diameter exhibited a significant inverted U-shaped relationship with both P3 amplitude and task performance such that the largest P3 amplitudes and optimal performance occurred at the same intermediate level of prestimulus diameter. Our results therefore provide indirect evidence in humans that the P3 may index LC-NE mode. In addition, large phasic pupil dilations, hypothesized to be a physiological marker of the LC phasic response (Gilzenrat et al., 2010), were preceded by a progressive degradation in task performance and immediately followed by a reengagement in the task and P3 components of increased amplitude.

Based on extensive primate research, Aston-Jones and Cohen (2005) have proposed the influential adaptive gain theory of LC-NE function, which states that task engagement is modulated by tonic LC activity in a manner that mirrors the classic Yerkes–Dodson arousal curve. According to this model, the low end of the tonic LC activity spectrum is associated with a drowsy, inattentive state whereas high tonic activity is marked by distractibility and explorative behavior. In contrast, intermediate tonic LC activity is associated with optimal performance and task engagement. On a simple detection task like the oddball, the predicted behavioral consequences of shifts toward either end of the spectrum are essentially the same: diminished performance. In keeping with this model, we found that task performance was best when prestimulus pupil diameter was at an intermediate level but declined at the highest and lowest diameters. Although other neurotransmitter systems have been shown to exhibit U-shaped relationships to behavior (e.g., dopamine; Arnsten, 2009), two established findings support the claim that our measures specifically indexed LC-NE dynamics: (1) the long-confirmed primary role of this system in attentional tasks like the oddball (Aston-Jones et al., 1994; Aston-Jones, Rajkowski, & Kubiak, 1997) and (2) the demonstrated relationship, via electrophysiology in the monkey (Rajkowski et al., 1993) and pharmacological manipulation in humans (Hou et al., 2005), between pupil diameter and tonic LC activity. Our observation of a quadratic relationship between pupil diameter and task performance supports the contention that prestimulus pupil diameter is a useful measure of task engagement and a valid proxy for tonic LC activity in humans (Gilzenrat et al., 2010; Jepma & Nieuwenhuis, 2011).

In contrast to Gilzenrat et al. (2010), we did not observe a linear improvement in performance when comparing epochs with the highest and lowest pretarget pupil diameters, although numerical trends were in the same direction. One possible reason for this discrepancy lies in a subtle difference in task design: Our testing was conducted in near total darkness, allowing the complete dynamic range of the pupil to be expressed (Einhauser, Stout, Koch, & Carter, 2008), whereas Gilzenrat et al. tested participants under a moderate degree of ambient lighting. This latter protocol may have placed an upper limit on the extent to which the pupil was physically capable of dilating, with the potential consequence of obscuring any U-shaped trends. Similarly, whereas Gilzenrat et al. explored simple linear relationships between oddball performance and the highest and lowest extremes of prestimulus pupil diameter, the inclusion of several intermediate levels in the present study allowed us to uncover a more complex U-shaped relationship, which we contend is entirely consistent with the adaptive gain theory (Aston-Jones & Cohen, 2005). This theory does, however, particularly emphasize the impact of two specific modes of LC-NE activity on the regulation of cognitive control states: the “phasic” mode, in which tonic activity is relatively low and phasic responses are large, and the “tonic” mode, in which tonic activity is relatively high and phasic responses are diminished. These modes, respectively, represent the intermediate and high ends of the LC-NE arousal curve and have been associated with qualitatively distinct patterns of exploitative versus exploratory behavior during complex decision-making tasks (Aston-Jones & Cohen, 2005; Cohen et al., 2007; Rajkowski et al., 2004; Usher et al., 1999). Whereas the tonic versus phasic “mode” distinction appears to have strong explanatory power for such tasks, highly routine and monotonous paradigms like the attentional oddball require continual engagement and are likely to induce disengagement because of periodic shifts toward both the high and low ends of the tonic LC continuum (Robertson & Garavan, 2004). Our finding that epochs marked by particularly low pretarget pupil diameters were associated with poor task performance highlights the need to incorporate instances of low arousal when relating LC-NE function to behavior, particularly in the realm of attention.

The postulated role of the LC-NE system in vigilance (cf. Coull, 1998) prompted us to investigate time-on-task effects on each of our measures. Consistent with findings from the animal literature of diminished phasic LC responses with prolonged task performance (Aston-Jones et al., 1994; Rajkowski et al., 1994), we found that both phasic pupil dilation and the P3 significantly decreased with time on task. In contrast, tonic prestimulus pupil diameter significantly increased with time on task, which is difficult to interpret within a traditional vigilance framework. Models of vigilance are based on tasks that heavily tax endogenous attentional resources and induce time-on-task performance decrements in both accuracy and response speed as the demand on a neural “vigilance network” increases (e.g., Coull, Frackowiak, & Frith, 1998; Paus et al., 1997). However, target detection accuracy on our auditory oddball was at ceiling, and there were no main effects of time on task on RT or RT CV (although trends toward a vigilance decrement did exist). These findings, coupled with the observed increase in prestimulus pupil diameter with time on task, suggest that participants did not suffer from the gradual diminution of arousal, which is hypothesized to be a hallmark of extreme vigilance (Coull et al., 1998; Parasuraman, 1984). Indeed, increased pupil diameter may even point to a time-on-task trend toward the right side of the LC-NE arousal curve and increased distractibility as opposed to diminished arousal. This presents an interesting question for future research that will require paradigms capable of disentangling periods of inattentive behavior arising from both low and high arousal states (e.g., Makeig & Jung, 1996).

Although the auditory oddball did not yield significant main effects of time on task on behavior, our more detailed quintile sorting analysis showed significant periodic fluctuations in task performance that were masked by the time-on-task analysis, as revealed by the significant inverted U-shaped relationship between RT CV and prestimulus pupil diameter. Such fluctuations are consistent with the high periodicity in attentional performance and arousal reported elsewhere, which unfolds over a relatively short timescale (Jung et al., 1997; Makeig & Jung, 1995, 1996; O'Connell, Dockree, Robertson, Bellgrove, Foxe, & Kelly, 2009), and they highlight an important caveat in the interpretation of linear time-on-task effects using similar experimental paradigms.

The observation that large phasic pupil dilations were accompanied by poor task performance appears inconsistent with the prediction that large LC-NE phasic responses should be synchronous with high task engagement (Aston-Jones & Cohen, 2005; Gilzenrat et al., 2010). However, our detailed examination of epochs preceding and following large pupil dilations revealed that such dilations were followed by significantly improved RTs on the next target trial. One of the few studies to putatively localize the LC via functional magnetic resonance imaging (fMRI) found that human phasic LC responses were only evoked by a significant “attentional challenge” and served to maintain good task performance in the face of draining cognitive resources (Raizada & Poldrack, 2007). In the present study, the argument that large phasic pupil dilations were characterized by such an attentional challenge is supported by the finding that they were preceded by a progressive worsening of performance and a progressive decrease in prestimulus pupil diameter. These markers point toward decreased engagement and increased drain on endogenous attentional resources. Importantly, a combined pupillometry–fMRI study (Critchley, Tang, Glaser, Butterworth, & Dolan, 2005) has found that phasic pupil dilations were largest after errors on trials of maximum difficulty, and the anterior cingulate cortex, anterior insula, and dorsal pons (which contains the LC) were the only brain areas significantly related to these dilations. These brain areas are heavily interconnected (Aston-Jones & Cohen, 2005; Aston-Jones, Ennis, Pieribone, Nickell, & Shipley, 1986; Gompf et al., 2010; Sara & Herve-Minvielle, 1995) and have previously been identified as critical nodes in a performance monitoring network (Mottaghy et al., 2006; Sridharan, Levitin, & Menon, 2008; Ullsperger, Harsay, Wessel, & Ridderinkhof, 2010). 
It is therefore possible that the periodic large pupil dilations we observed reflect phasic LC activations that are driven by higher cortical performance monitoring regions and serve to reengage participants in the task. This proposal could be tested indirectly, using pupil diameter, by employing task paradigms that allow for the analysis of error trials in addition to RT trends for correct responses (e.g., Hajcak, McDonald, & Simons, 2003; O'Connell, Dockree, Bellgrove, Turin, Ward, Foxe, & Robertson, 2009).

A great majority of P3 research has examined this component's relationship to aspects of attention and memory (Polich, 2007), and P3 abnormalities have been linked to a variety of clinical disorders (Barry, Johnstone, & Clarke, 2003; Szuromi, Czobor, Komlosi, & Bitter, 2010; van Tricht et al., 2010) and to the severity of cognitive deficits associated with ageing (Fjell, Walhovd, Fischl, & Reinvang, 2007). Despite its utility as a clinical marker, the neurophysiological origins of the P3 are not well understood. Based on similarities in their antecedent conditions, as well as pharmacological studies in humans and animals, it has recently been proposed that the P3 may represent the electrophysiological correlate of the LC phasic response (Nieuwenhuis et al., 2005, 2011). The present study represents an indirect test of this LC-P3 hypothesis using pupil diameter as a proxy for tonic LC activity. Our results indeed suggest that the P3 potential may be related to LC-NE mode. The P3 exhibited a relationship to prestimulus pupil diameter that is reflective of the well-documented relationship between the LC phasic response and tonic LC firing rate: Largest responses were elicited at intermediate levels of tonic activity. Taking into account the important role of the LC-NE system in regulating autonomic nervous system activity and the sleep/wake cycle (Berridge & Waterhouse, 2003; Nieuwenhuis et al., 2011), this possible link between the LC-NE arousal function and P3 amplitude may partly account for previous findings that show fluctuations across a variety of physiological measures of “arousal state” (e.g., heart rate, circadian phase, sleep deprivation) to affect P3 morphology (Polich & Kok, 1995). More generally, our results tentatively corroborate previous pharmacological, genetic, and animal research pointing to the LC-NE system as an important generator of the P3 (Liu et al., 2009; Nieuwenhuis et al., 2005; Studer et al., 2010; Turetsky & Fein, 2002).

To the extent that the P3 is sensitive to shifts in tonic LC-NE mode, as measured by pretarget pupil diameter, our results are consistent with the LC-P3 hypothesis (Nieuwenhuis et al., 2005, 2011). However, the proposal that the P3 indexes the phasic LC response was not supported when the P3 and phasic pupil dilation measures were directly compared within the same trial, and, contrary to predictions, the two measures exhibited opposite relationships to task performance. These findings suggest that P3 and pupil dilation do not index the same neural process, as has been previously hypothesized (Nieuwenhuis et al., 2005, 2011), although two measurement issues may have confounded the observed relationship between P3 and pupil dilation. First, it may be the case that the extraneous sources of variance inherent to both measures and divergent susceptibilities to different classes of artifact during recording obscured a more direct relationship between them (see also Nieuwenhuis et al., 2011). More fundamentally, pupil dilation and P3 have markedly contrasting latencies, and it is possible that they may reflect different combinations of distinct information processing stages. For example, it has sometimes been noted (e.g., Porter et al., 2010), and is evident in our data, that there is an apparent “double bump” in the dilatory response, possibly reflecting separate stimulus-evoked and cognitive- or response-related processes. The largest phasic pupil dilations observed in the current study occurred on the later of these peaks and may therefore reflect a neural process separable from that manifest in the stimulus-locked P3. For example, it may be the case that an element of motor processing manifests in the pupil dilatory response that is absent in the P3 and that this obscures a relationship between these measures. 
Isolating the largest dilations did reveal a significant increase in P3 amplitude on the subsequent target trial, indicating that the neural processes underlying the two measures are in some way related. If such pupillary responses are driven by performance monitoring processes taking place after a slow target response, as suggested above, they may underlie the restoration of a phasic mode of firing to the LC that is reflected in an enhanced P3 on subsequent trials. This possibility is consistent with the earlier suggestion that the P3 is a sensitive electro-cortical index of tonic LC mode. In this case, however, the amplitude of the earlier auditory N1 potential also increased after large pupil dilations, suggesting that this enhancing effect was not restricted to the P3.

Because of its location and size, the localization of the LC using standard fMRI techniques has proven challenging, and recent attempts (Keren, Lozar, Harris, Morgan, & Eckert, 2009; Minzenberg et al., 2008; Raizada & Poldrack, 2007; Schmidt et al., 2009; Shibata et al., 2006; van Marle, Hermans, Qin, & Fernandez, 2010) have met with varying degrees of success (Astafiev, Snyder, Shulman, & Corbetta, 2010). We believe the findings of our study should promote future attempts to index LC activity by measuring pupil diameter and the P3 in conjunction with fMRI and may allow researchers to test further hypotheses regarding the role of this nucleus in regulating human cognitive function.