How Reliably Do Eye Parameters Indicate Internal Versus External Attentional Focus?

Eye behavior is increasingly used as an indicator of internal versus external focus of attention, both in research and application. However, available findings are partly inconsistent, which might be attributed to the different nature of the employed internal and external cognition tasks. The present study therefore investigated how consistently different eye parameters respond to internal versus external attentional focus across three task modalities: numerical, verbal, and visuo-spatial. Three eye parameters robustly differentiated between internal and external attentional focus across all tasks: blinks, pupil diameter variance, and fixation disparity variance were consistently increased during internally directed attention. We also observed substantial attentional focus effects on other parameters (pupil diameter, fixation disparity, saccades, and microsaccades), but they were moderated by task type. Single-trial analysis of our data using machine learning techniques further confirmed these results: classifying the focus of attention by means of eye tracking works well across participants, but generalizing across tasks proves challenging. Based on the effects of task type on eye parameters, we discuss which eye parameters are best suited as indicators of internal versus external attentional focus in different settings.


Introduction
We spend a lot of our waking time absorbed in thought, ignoring the world around us (Killingsworth & Gilbert, 2010), for example, when we imagine future events or daydream. This internal focus of attention (also referred to as internally directed cognition) is a crucial part of human cognition (Chun, Golomb, & Turk-Browne, 2011). One way to assess and investigate the direction of attentional focus is by means of eye tracking. Eye behavior is an intuitive and rich source of information on human cognition; hence, it is increasingly used as an indicator of internal and external focus of attention, both in research (e.g., mind wandering and creativity research: Konishi, Brown, Battaglini, & Smallwood, 2017; Salvi, Bricolo, Franconeri, Kounios, & Beeman, 2015; Unsworth, Robison, & Miller, 2018) and in application (e.g., driver safety: Palinko, Kun, Shyrokov, & Heeman, 2010; Tsai, Viirre, Strychacz, Chase, & Jung, 2007). Ongoing technical developments make it easier and cheaper to assess eye behavior, also making it a promising parameter for automatic moment-to-moment detection of attentional focus using machine learning approaches (Huang, Li, Ngai, Leong, & Bulling, 2019; Vortmann, Schult, Benedek, Walcher, & Putze, 2019).

Pupil diameter
Pupil diameter (also referred to as tonic, baseline, or median pupil diameter) is the parameter most commonly used as an indicator of internal versus external focus of attention, both in research and application (e.g., Franklin et al., 2013; Konishi et al., 2017; Palinko et al., 2010). Studies found both larger (Ceh et al., 2020; Franklin et al., 2013; Jubera-García et al., 2020; Konishi et al., 2017; Makovac et al., 2019; Pelagatti et al., 2018; Smallwood et al., 2011; Walcher et al., 2017) and smaller pupil diameter (Grandchamp et al., 2014) during internal compared to external attentional focus. Furthermore, some studies found no effect of attentional focus on pupil diameter (Jubera-García et al., 2020). Pupil diameter is known to be sensitive to workload, with higher workload eliciting larger pupil diameter (Piquado et al., 2010; Unsworth & Robison, 2017). Hence, a likely reason for the inconsistent findings is that workload differed between the internal and external activities in those studies. Generally, it is challenging to find activities with external and internal attentional focus that are comparable in workload. In external activities, part of the required information can be "outsourced" to the external world, for example, the operands of a multiplication. In internal activities, all information needs to be held in mind, placing higher demands on working memory and, consequently, causing pupils to dilate more strongly than during external activities.
Hence, we expect larger pupil diameter during internal compared to external tasks, due to the intrinsically higher workload, and we extend the literature by investigating whether task type moderates the effect of internal versus external attentional focus on pupil diameter.

Fixation disparity
When we look at something in three-dimensional space, both eyes are oriented toward the attended object, which involves a convergence of eyes to focus on closer objects and a divergence to focus on objects farther away. Together with form changes of the lenses and the modification of the pupil diameter, these changes of the angle of eye vergence represent the near-response-triad (Myers & Stark, 1990). The angle of eye vergence can be measured easily in terms of fixation disparity, which reflects the difference between the horizontal positions of the left and right eye on the screen. Fixation disparity near 0 represents focus on the screen, positive values reflect a farther focus (looking through the screen), and negative values reflect a closer focus.
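Fixation disparity, as described above, can be computed directly from binocular gaze samples. A minimal sketch with hypothetical sample values; the sign convention (right minus left gaze position) is an assumption chosen to match the description above, where positive values mean a focus behind the screen:

```python
def fixation_disparity(gaze_x_left, gaze_x_right):
    """Horizontal fixation disparity in screen units.

    Assumed sign convention: positive = uncrossed disparity (gaze axes
    intersect behind the screen, i.e., farther focus), negative = crossed
    disparity (focus in front of the screen), ~0 = focus on the screen.
    """
    return gaze_x_right - gaze_x_left

# Hypothetical binocular samples (on-screen x coordinates in mm):
print(fixation_disparity(500.0, 500.0))  # 0.0  -> focus on the screen plane
print(fixation_disparity(498.0, 502.0))  # 4.0  -> farther (uncrossed) focus
print(fixation_disparity(502.0, 498.0))  # -4.0 -> closer (crossed) focus
```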
During external attentional focus, fixation disparity needs to be close to zero to obtain unblurred vision of the screen. When attention is turned inward, focusing on the screen is no longer relevant, allowing for stronger variability in the disparity of the eyes. In fact, fixation disparity can even follow imagined distances (Laeng & Sulutvedt, 2014), suggesting that our eyes align with internal cognition, too. A first study using fixation disparity for automated moment-to-moment detection of internal versus external attentional focus showed that this measure may even outperform common eye parameters. In our lab, we found larger fixation disparity (looking beyond the screen) during internal compared to external attentional focus in some studies (Benedek et al., 2017), while in others, we found no differences (Ceh et al., 2020; Walcher et al., 2017). Hence, the present study tests whether differences in the disparity of the eyes between internal and external attentional focus depend on the task type. As the few studies to date report mixed evidence, we formulate no directed hypothesis. Intuitively, however, fixation disparity should be close to zero in all external tasks (i.e., focus on screen) and might be larger or smaller, or might simply exhibit stronger variability, during internal tasks (i.e., focus off screen).

Variation in pupil diameter and fixation disparity
During internal attentional focus, eye behavior is less restricted to adapt to the characteristics of the external visual stimuli (e.g., luminance and distance) as those stimuli are irrelevant. Consequently, more stimulus-independent variation in eye behavior is possible (see also Perceptual Decoupling Hypothesis; Smallwood & Schooler, 2006). Additionally, internal processes and internal visual representations (e.g., luminance and distance of imagined objects) can elicit changes in eye behavior (Laeng & Sulutvedt, 2014) and further increase variation in eye behavior.
To our knowledge, only a few studies have investigated the effect of attentional focus on variation in pupil diameter and fixation disparity. In our lab, we found larger variation in pupil diameter and in fixation disparity during internal compared to external attentional focus, as well as no effects (Ceh et al., 2020; Walcher et al., 2017). The present study extends this sparse literature and tests the influence of task type on this effect. Following the reasoning above, we expect more variation in pupil diameter and fixation disparity in the internal compared to the external tasks.

Blinks
Blinks are an easily assessable eye behavior and are frequently used as an indicator of internal versus external attentional focus. Blinks interrupt visual input and temporarily suppress visual processing (Berman, Horovitz, Morel, & Hallett, 2012). Hence, when tasks require processing of external visual information, blinks are reduced (Fukuda, 2001; Shin et al., 2015; Shultz, Klin, & Jones, 2011). During internal attentional focus, the interruption of irrelevant visual input through blinks can even be beneficial.
A variety of studies found more blinks during internal compared to external attentional focus (Annerer-Walcher et al., 2018; Grandchamp et al., 2014; Salvi et al., 2015; Smilek et al., 2010; Walcher et al., 2017). Some studies found no effects for blink rate but did for blink duration (Ceh et al., 2020). Overall, internal attentional focus seems robustly associated with more and/or longer blinks. The present study tests whether the increase in blink rate during internal versus external attentional focus is also robust across different task types.

Saccades
Findings regarding the effect of internal versus external attentional focus on saccade rate are mixed: Some studies find more (Annerer-Walcher et al., 2018; Walcher et al., 2017), whereas others find fewer saccades (Benedek et al., 2017; Ceh et al., 2020) during internal compared to external attentional focus.
There are external tasks that require many eye movements (e.g., visual search), whereas other external tasks require steady fixation. The same applies to internal attentional focus, as external and internal activities share cognitive processes (e.g., map-like representations underlying search in both external and internal environments; see Todd & Hills, 2020). If an internal task has many spatial components, for example, when imagining a supermarket and mentally searching for the dairy section, saccadic activity is high (Johansson et al., 2006). Hence, the saccade rate should depend mainly on the "eye movement requirements" of a task. If neither the internal nor the external task requires eye movements, the internal task is still likely to be associated with more saccades, because eye behavior is less restricted by external visual stimuli and consequently more random or internally coupled eye behavior can occur (Ferreira, Apel, & Henderson, 2008). The present study investigates the influence of task type on the effect of attentional focus on saccade rate. We expect that especially visuo-spatial tasks elicit more saccades than the other task types, as they require frequent eye movements.

Microsaccades
Microsaccades are small eye movements occurring during fixation; they counteract visual fading and correct gaze position after saccades (Martinez-Conde, Otero-Millan, & Macknik, 2013). Only a few studies have assessed microsaccade rate during internal and external attentional focus. Most found fewer microsaccades (Benedek et al., 2017; Ceh et al., 2020; Walcher et al., 2017), while one study found increased microsaccadic activity during internal compared to external attentional focus (Annerer-Walcher et al., 2018). Previous studies showed that task difficulty influences microsaccade rate in internal tasks, with more difficult tasks being associated with fewer microsaccades (Gao, Yan, & Sun, 2015; Siegenthaler et al., 2014). One can argue that higher task difficulty leaves fewer resources for maintaining stable vision when it is not needed (i.e., during internal attentional focus). Hence, when task difficulty varies between internal and external tasks, these differences can be assumed to impact microsaccadic activity.
Besides task difficulty, the spatial requirements of the task seem to influence the microsaccade rate: In external tasks, where information is presented in a predefined area for a short time, microsaccades are reduced compared to sustained fixation, presumably to reduce the chance of missing the stimulus due to saccadic suppression (Poletti & Rucci, 2016). External tasks requiring many saccades toward targets typically involve more microsaccades to correct gaze positions after saccades (Martinez-Conde et al., 2013). Hence, the microsaccade rate is expected to be sensitive to the difficulty and gaze demands of internal and external tasks.

Machine learning and eye tracking
Beyond traditional aggregated analysis on large batches of trials, methods of machine learning have been applied to single trials of eye tracking data. In machine learning, algorithms learn to classify data into two or more categories by learning statistical relationships between input features and class labels in provided training data. While a large number of algorithms are available, recent years saw a dominance of neural network architectures (LeCun, Bengio, & Hinton, 2015), including recurrent neural networks, which are tailored toward the processing of time-series data (Hochreiter & Schmidhuber, 1997). Classification approaches are important for real-world applications as well as for real-time analysis of cognitive processes, for example, to study timings or causal relations, especially in variable experimental settings. Examples of such applications range from the robust detection of fixations and saccades (Startsev, Agtzidis, & Dorr, 2019) to the detection of user confusion (Sims, 2020) or personality traits (Berkovsky et al., 2019). Recently, machine learning on a single-trial basis has also been applied to the differentiation of internal and external attention. For example, Vortmann et al. (2019) studied the differentiation of these attentional states from eye tracking data and EEG and showed successful classification with an accuracy of 64% from eye tracking alone. However, their analysis was person-dependent; that is, a separate set of parameters was identified per participant, limiting the generalizability of the approach and leaving only little training data per model. In this paper, we study the feasibility of person-independent classification of attentional focus.

Present study
Available research has shown that many eye parameters discriminated between internal and external attentional focus, but findings have not been consistent across studies. This raises the question whether the observed effects are due to differences in attentional focus and/or differences in task demands of internal/external tasks. Therefore, this study investigated how robustly specific eye parameters exhibit attentional focus effects across different tasks. To this end, we compared eye behavior between tasks relying on external and internal attentional focus across three content modalities, that is, numerical, verbal, and visuo-spatial (see Fig. 1).

The numerical and verbal tasks required fixation of a specified area, while the visuo-spatial tasks were expected to elicit more eye movements (to targets). This allowed us to test how differences in task type and the inherent required eye movements influence the effect of attentional focus on eye behavior.
We expected pupil diameter to be larger during internal compared to external attentional focus across all task types, due to the inherently higher workload of internal tasks. However, the size of this effect could vary across task types. As the literature is still inconclusive for fixation disparity, we expected no specific direction (smaller or larger fixation disparity) of the effect of attentional focus on fixation disparity. Regarding variability of pupil diameter and fixation disparity, we expected larger variation in all internal tasks compared to external tasks, as eye behavior is less constrained by external visual stimulation during internal attentional focus. The size of this effect could be affected by task type. Further, we expected that internal attentional focus is associated with more blinks across all tasks, as this effect was most robust in the literature. And finally, we expected that attentional focus affects saccades and microsaccades and that the direction of this effect depends on the task type (i.e., inherent eye movement requirements).
Real-world application mandates the identification of attentional states based on single observations. Therefore, in separate analyses, we applied machine learning techniques to explore the feasibility of correctly classifying attentional focus from single trials.

Method
We provide our materials, data, and analysis scripts in the Open Science Framework (OSF, https://osf.io/scmry/).

Participants
The final sample consisted of 157 adults (114 females) aged 19-48 years (M = 24.24, SD = 4.68). Participants received € 10/h for their participation. The majority of participants were students (88.5%). One hundred and forty participants had normal vision; 17 participants had corrected-to-normal vision (soft contact lenses); and none reported strabismus or other medical conditions affecting vision.
We excluded nine additional participants from analyses (6 females; mean age = 23.88). Two participants were excluded due to technical issues. Seven participants had excessive missing data (data from only one block and/or more than 50% missing data) due to eye tracker malfunction. All participants gave written informed consent.
This study was part of a larger test session in which participants also performed other tasks addressing different research questions. The session included several cognitive tests, the present paradigm, an eye tracking paradigm on distraction during idea generation (Annerer-Walcher et al.), and an eye tracking paradigm on voluntary control of eye vergence. Additionally, participants completed several personality tests during a separate online session. To achieve high power for all intended analyses in the present study and the other studies within this test session, we aimed at obtaining valid data from approximately 150 participants for each eye tracking paradigm. We provide a full list of all questionnaires and tasks administered in the Supplementary Material. Interested researchers can use these additional data for analyses beyond our research question.

Apparatus
The study took place in a sound-attenuated room with lights on. Participants were seated in front of a 24-inch screen (1920 × 1080 pixels, ca. 41.54 × 24.09 degrees of visual angle, 60 Hz refresh rate) at a distance of 70 cm, and their heads were stabilized by a chin rest.
Binocular eye data were recorded using an SMI RED250mobile system (SensoMotoric Instruments, Germany) with a temporal resolution of 250 Hz, spatial resolution of 0.03°, and gaze position accuracy of 0.4° visual angle. The stimulus presentation program was written in PsychoPy (Peirce, 2007) using the Software Development Kit by SMI. There was a 9-point calibration procedure at the beginning of the practice and main blocks and a drift check before each task.
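The reported screen extent in degrees of visual angle follows from screen geometry and viewing distance via the standard conversion. A sketch, assuming approximate physical dimensions of 53.1 × 29.9 cm for a 24-inch 16:9 panel (an assumption, not stated in the text):

```python
import math

def visual_angle_deg(size_cm, distance_cm):
    """Full visual angle subtended by an object of size_cm viewed head-on
    from distance_cm: 2 * atan(size / (2 * distance))."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))

# Assumed physical dimensions of a 24-inch 16:9 screen, viewed from 70 cm:
width_cm, height_cm, distance_cm = 53.1, 29.9, 70.0
print(round(visual_angle_deg(width_cm, distance_cm), 2))   # ~41.5 deg horizontally
print(round(visual_angle_deg(height_cm, distance_cm), 2))  # ~24.1 deg vertically
```

These values closely reproduce the reported 41.54 × 24.09 degrees of visual angle.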

Procedure and task
Participants performed six different tasks, which either required an external or internal focus of attention and reflected three types of task modality (i.e., numerical, verbal, and visuo-spatial; see Fig. 1a). This paradigm is an extension of an existing paradigm (Putze et al., 2016) and was used in a previous study focusing on the classification of attentional states based on EEG data and a small set of eye parameters in a small sample (N = 10; Vortmann et al., 2019).
During all tasks, the same continuous sequence of visual stimuli (stimulus screens and masks) was presented to ensure comparable visual stimulation across all tasks and conditions (see Fig. 1b). In the external tasks, the stimulus screens were relevant for task performance, while the internal tasks were performed entirely in mind and did not rely on processing of visually presented information.
In the numerical external task, participants evaluated whether the number comparisons on the stimulus screen were correct (e.g., 4 < 2; see Fig. 1a). In the numerical internal task, participants read an addition at the beginning of a trial (e.g., 22 + 3) and continued to mentally add the second number to the result throughout the trial. In the verbal external task, participants evaluated whether the words on the stimulus screen were names of animals (e.g., Blauwal [blue whale]). In the verbal internal task, participants generated words starting with a letter presented at the beginning of the task (e.g., D: dog, dove, driver). In the visuo-spatial external task, participants evaluated whether there was an L in a circle of T's on the stimulus screen. In the visuo-spatial internal task, participants imagined a given scenery throughout the task (e.g., school, soccer game). All task items of the internal tasks are listed in Table S1, and all stimuli of the external numerical and verbal tasks are listed in Table S2.
At the beginning of each trial, the task name with a short instruction appeared. In the internal tasks, the stimulus for the trial (i.e., the addition for the numerical task, the letter for the verbal task, or the scene for the visuo-spatial task) was also presented on the instruction screen.
After a mouse click by the participant, a drift check was performed, after which a fixation cross appeared for 1,000 ms and the trial started. During trials, a sequence of alternating masks (400 ms) and stimulus screens (800 ms) was presented, starting with a mask (see Fig. 1b). The number of stimulus screens per trial, and thereby the trial duration, varied from 8 to 11 stimulus screens (10-13.6 s) and was counterbalanced across tasks to avoid trial-end expectation effects on eye behavior. Each task comprised eight trials, two each with 8, 9, 10, and 11 stimulus screens in randomized order. Performance checks were randomly presented at the end of 25% of trials to monitor task compliance. In the external tasks, participants reported the number of correct number comparisons (numerical task), the number of animal names (verbal task), or the number of L's (visuo-spatial task) in the given trial. In the internal tasks, participants reported the final result of the serial addition (numerical task), the last word generated (verbal task), or the experienced vividness of their imagination on a scale ranging from 1 = perfectly clear and vivid to 5 = I had no picture in my mind at all (visuo-spatial task).
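The reported trial durations of 10-13.6 s are consistent with a sequence of n + 1 masks interleaved with n stimulus screens (i.e., the sequence both starts and ends with a mask; the trailing mask is an inference from the reported durations, not stated explicitly). A sketch:

```python
MASK_MS, STIM_MS = 400, 800  # mask and stimulus-screen durations

def trial_duration_s(n_screens):
    """Duration of the mask/stimulus sequence for a trial with n stimulus
    screens, assuming n + 1 masks (sequence starts and, consistent with
    the reported 10-13.6 s range, also ends with a mask)."""
    return ((n_screens + 1) * MASK_MS + n_screens * STIM_MS) / 1000

for n in (8, 9, 10, 11):
    print(n, trial_duration_s(n))  # 10.0, 11.2, 12.4, 13.6 s
```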
Participants performed a practice block with one trial per task followed by two task blocks with eight trials per task in randomized order. The whole session, including short breaks between blocks, took 45 min on average.

Stimuli
The background was white and stimuli were black, written in Arial font. Each stimulus screen included a fixation cross in the center (ca. 0.20° visual angle), a number comparison (ca. 0.9° × 0.29° visual angle), a word (ca. 1.58° × 0.29° visual angle), and a circle of Ts and Ls (radius ca. 4.18° visual angle, letters 0.32° × 0.41° visual angle; see Fig. 1b).
The number comparisons included the numbers 1-9 and the comparison sign "<". We used 50 unique number comparisons, half of which were correct. The words comprised 25 animal names and 25 food names, constituting 50 seven-letter words. Numbers were presented ca. 0.33° visual angle above and words ca. 0.33° visual angle below the screen center. We presented each number comparison and each word 6 times per block and 12 times in total. All number comparisons and words are listed in Table S2.
Combining the stimulus parts (number comparison, word, circle of T's), we generated 100 unique stimulus screens. Stimulus screens were randomly assigned to trials for each participant with the restriction that each stimulus screen was presented only three times per block and six times in total. Additional stimulus screens were generated for the practice trials.
Fig. 1. (A) Overview of all experimental tasks for each level of attentional focus (external and internal) and task modality (numerical, verbal, and visuo-spatial). (B) Schematic course of a trial. Stimuli are enlarged for visualization. After task instruction and drift check, a trial started with a 1,000 ms fixation. Then, sequences of 8-11 masks and stimulus screens were presented. The external tasks required continuous processing of the stimulus screens, whereas internal tasks were performed in mind while ignoring the presented visual stimuli. After 25% of trials, performance checks were presented asking participants to vocalize their responses.

Data preprocessing
We used the same data preprocessing procedure as in Annerer-Walcher et al. Data preprocessing and analyses were performed with R (R Version 4.0.2, R Core Team, 2020) in RStudio (Version 1.3.959, RStudio Team, 2020). Blinks were defined by the eye-tracking system as periods where pupil diameter is zero or pupil diameter changes faster than an internal velocity threshold (due to lid closure) (SensoMotoric Instruments, Germany). From the raw pupil and gaze position data, we excluded samples recorded during blinks plus two additional samples (= 8 ms) at the beginning and the end of each blink to additionally control for distorted data due to lid closure (percent data discarded: M = 5.13, SD = 4.64, Max = 25.35). We further excluded samples in which pupil diameter (PD) or fixation disparity (FD) were beyond their natural range (PD > 15 mm or PD < 1.8 mm, |FD| > mean pupil distance = 60 mm) or three standard deviations beyond an individual's mean (percent data discarded: M = 2.51, SD = 3.26, Max = 37.87) to avoid data distortion due to measurement errors. Pupil diameter and fixation disparity were also not analyzed for time periods classified as saccades or microsaccades.
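The exclusion rules above can be expressed as a per-sample filter. A minimal sketch in Python (the actual pipeline was in R); thresholds are as reported, the function signature and sample values are illustrative, and blink samples plus the 8 ms padding are assumed to have been removed beforehand:

```python
def valid_sample(pd_mm, fd_mm, pd_mean, pd_sd, fd_mean, fd_sd):
    """Return True if a sample passes the range and outlier checks.

    pd_mm: pupil diameter (mm); fd_mm: fixation disparity (mm).
    pd_mean/pd_sd and fd_mean/fd_sd: this participant's mean and SD,
    used for the +/- 3 SD individual outlier criterion.
    """
    if not (1.8 <= pd_mm <= 15.0):          # outside natural pupil range
        return False
    if abs(fd_mm) > 60.0:                   # |FD| beyond mean pupil distance
        return False
    if abs(pd_mm - pd_mean) > 3 * pd_sd:    # individual outlier (pupil)
        return False
    if abs(fd_mm - fd_mean) > 3 * fd_sd:    # individual outlier (disparity)
        return False
    return True

print(valid_sample(3.5, 0.2, 3.4, 0.3, 0.0, 0.5))  # True: passes all checks
print(valid_sample(1.2, 0.2, 3.4, 0.3, 0.0, 0.5))  # False: PD below 1.8 mm
```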
We summarized eye tracking data for each trial. As all trials had at least eight stimulus screens (see Section 2.3), we included only the first eight masks and stimulus screens, resulting in an analyzed period of 9.6 s per trial. For each trial, we calculated median pupil diameter and median fixation disparity. To investigate variation in pupil diameter and fixation disparity within a trial, we calculated the standard deviation of those parameters. We calculated mean number of blinks, saccades, and microsaccades per second for each trial. There were many trials with no blinks, saccades, or microsaccades, especially in the external tasks, and some trials with many blinks, saccades, or microsaccades, leading to a problematic distribution for further analyses (see Fig. S1 in the Appendix). Therefore, we decided to dichotomize those three variables, assigning 0 if there was no respective event in this trial and 1 if there was at least one event in this trial.
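The per-trial summarization and the dichotomization of event counts can be sketched as follows (Python illustration of the R procedure; trial data and the helper's interface are hypothetical):

```python
import statistics

def summarize_trial(pupil, disparity, n_blinks, n_saccades, n_microsaccades):
    """Summarize one trial as in the analysis: medians and SDs for the
    continuous parameters, and 0/1 indicators (at least one event vs.
    none) for the dichotomized count parameters."""
    return {
        "pd_median": statistics.median(pupil),
        "pd_sd": statistics.stdev(pupil),
        "fd_median": statistics.median(disparity),
        "fd_sd": statistics.stdev(disparity),
        "blink": int(n_blinks > 0),
        "saccade": int(n_saccades > 0),
        "microsaccade": int(n_microsaccades > 0),
    }

# Hypothetical trial with three valid samples, 2 blinks, 0 saccades, 1 microsaccade:
trial = summarize_trial([3.2, 3.4, 3.3], [0.1, -0.2, 0.0],
                        n_blinks=2, n_saccades=0, n_microsaccades=1)
print(trial["blink"], trial["saccade"], trial["microsaccade"])  # 1 0 1
```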

Data analyses
Our analysis consists of two parts. In the first part, we analyze the effect of attentional focus and task type on eye behavior. In the second part, we apply machine learning methods to our raw data to test the feasibility of extracted eye parameters for classification of attentional focus on a single-trial level.

Analysis of effects of attentional focus and task type on eye behavior
We analyzed the effects of attentional focus and task type on eye behavior using linear mixed effects models because this method can be applied to a trial-level dataset without the information loss that results from aggregating trials. To compare the three task types in a multi-level context, we defined three models per eye parameter, one for each task type comparison (numerical vs. verbal, numerical vs. visuo-spatial, and verbal vs. visuo-spatial), with attentional focus, task type, and the interaction of attentional focus and task type as fixed effects. (This analysis approach was necessary because, in an overall model with all three task types, main effects of or interactions with task type would be less informative, as they would only compare two levels of task type to one other level.) We added random intercepts for participants and trials only, as models with additional random slopes did not converge.
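The three pairwise models amount to fitting the same fixed-effect structure to each two-task-type subset of the trial-level data. A sketch of the subsetting step (Python illustration of the R workflow; the trial records are hypothetical):

```python
from itertools import combinations

# Hypothetical trial-level records (one row per trial):
trials = [
    {"participant": 1, "focus": "internal", "task": "numerical", "pd": 3.6},
    {"participant": 1, "focus": "external", "task": "verbal", "pd": 3.1},
    {"participant": 2, "focus": "internal", "task": "visuo-spatial", "pd": 3.2},
]

task_types = ("numerical", "verbal", "visuo-spatial")
subsets = {
    pair: [t for t in trials if t["task"] in pair]
    for pair in combinations(task_types, 2)
}
# Each subset would then be fed to a mixed model of the form (lme4-style):
#   pd ~ focus * task + (1 | participant) + (1 | trial)
print(len(subsets))  # 3 pairwise datasets, one per model
```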
In addition to standardized regression coefficients as an estimate of the magnitude of effects, we computed Bayes Factors (BF) as an estimate of the weight of evidence in favor of or against our hypotheses given the data (Wetzels et al., 2011). BF10 reflects the ratio of evidence in favor of including a particular effect compared to the model without this effect; BF01 reflects the evidence against the effect (BF01 = 1/BF10). A BF between 3 and 20 is considered moderate, between 20 and 150 strong, and larger than 150 very strong evidence for the respective hypothesis (Wagenmakers, 2007). For the planned pairwise t-tests, we provide Cohen's d as an estimate of the magnitude of effects. Although Cohen's d underestimates effects in within-subject designs like the present one, it is unaffected by study design and therefore better comparable across studies. Several dedicated R packages were used in these analyses.
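The evidence categories used here can be expressed as a small helper (cutoffs as cited in the text; the function itself is only an illustration, not part of the analysis pipeline):

```python
def evidence_label(bf10):
    """Classify a Bayes Factor BF10 using the cited cutoffs
    (3-20 moderate, 20-150 strong, >150 very strong). Values below 1
    favor the null; BF01 = 1 / BF10 is graded on the same scale."""
    bf = bf10 if bf10 >= 1 else 1 / bf10  # evidence strength, either direction
    direction = "for effect" if bf10 >= 1 else "against effect"
    if bf > 150:
        strength = "very strong"
    elif bf > 20:
        strength = "strong"
    elif bf > 3:
        strength = "moderate"
    else:
        strength = "anecdotal"
    return f"{strength} evidence {direction}"

print(evidence_label(45.0))  # strong evidence for effect
print(evidence_label(0.04))  # BF01 = 25 -> strong evidence against effect
```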

Classification of attentional focus using machine learning
In a separate line of analysis, we investigated the feasibility of identifying attentional states from single-trial data. To this end, we applied machine learning techniques to train a classifier of internal versus external attentional focus. A recurrent neural network using long short-term memory (LSTM) cells was trained on the raw output of the eye tracker from a 2,400-sample window (= 9.6 s) per trial, labeled as internal or external attentional focus.
The raw parameter output included left and right pupil diameter, left and right gaze position X, left and right gaze position Y, and pupil confidence (the confidence of the eye tracking software that the pupil was detected). Any derived parameters (e.g., number of fixations) were left to be learned by the algorithm itself. An LSTM is designed to capture patterns in data sequences, even if these reach over longer segments of the input sequence. The model architecture is shown in Fig. 2. Similar topologies for LSTM-based recurrent neural networks have been proposed and successfully evaluated for processing eye tracking sequences to classify gaze patterns (Ahn, Kelton, Balasubramanian, & Zelinsky, 2020; Alghofaili et al., 2019). The LSTM was trained using stochastic gradient descent with mini-batches (batch size of 30). LSTM models were trained across participants, using 135 sessions as training data to adjust the model parameters and the remaining sessions as testing data, pooling data from all tasks. Analyses were performed using the Python programming language (Version 3.6.5; The Python Software Foundation, 2020) and the following packages: SciPy (Version 1.0.1; The SciPy community, 2020), scikit-learn (Version 0.19.1; Pedregosa et al., 2011), Keras (Version 2.1.5; Chollet, 2015), matplotlib (Version 2.2.2; The Matplotlib development team, 2020), numpy (Version 1.14.2; NumPy, 2020), opencv-python (Version 3.4.2.17; Python Software Foundation, 2020), and tensorflow (Version 1.7.0; Martín Abadi et al., 2015).
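The network's input is thus the raw sample stream rather than derived parameters. A sketch of how one trial's samples map onto the 2,400 × 7 input window (feature names follow the list above; the helper and the synthetic data are illustrative, not the authors' code):

```python
FEATURES = ["pd_left", "pd_right", "gaze_x_left", "gaze_x_right",
            "gaze_y_left", "gaze_y_right", "pupil_confidence"]
SAMPLE_RATE_HZ = 250
WINDOW_S = 9.6
WINDOW_SAMPLES = int(SAMPLE_RATE_HZ * WINDOW_S)  # 2,400 samples per trial

def to_window(samples):
    """Crop one trial's raw samples to the fixed-size window fed to the
    LSTM: a (2400, 7) sequence, later labeled internal/external."""
    window = samples[:WINDOW_SAMPLES]
    assert len(window) == WINDOW_SAMPLES, "trial shorter than 9.6 s"
    assert all(len(s) == len(FEATURES) for s in window)
    return window

# Synthetic trial: 2,500 raw samples of 7 features each.
raw = [[0.0] * len(FEATURES) for _ in range(2500)]
print(len(to_window(raw)), len(to_window(raw)[0]))  # 2400 7
```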

Task performance
The high overall performance at performance checks suggests high task compliance and moderate task difficulty. In the external tasks, participants answered correctly in 84% (SD = 36%) of numerical trials (number evaluation), 87% (SD = 34%) of verbal trials (reading), and 87% (SD = 33%) of visuo-spatial trials (visual search) on average. In the internal tasks, all participants correctly named a word beginning with the instructed letter in the verbal task (word generation). In the internal numerical task, all participants reported a mathematically valid result of the serial addition (mental arithmetic). Participants performed on average 10.58 (SD = 3.10) additions across probed trials (min = 3, max = 23), with 2-24 additions in the task 7 + 3 (M = 10.84, SD = 3.76) and 1-23 additions in the task 22 + 4 (M = 10.32, SD = 3.55). In the internal visuo-spatial task, participants rated the vividness of their imagination on average with 2.62 (SD = 1.01).

Effects of attentional focus and task type on eye behavior
We investigated effects of attentional focus and task type on each eye parameter by means of multi-level models comparing two modalities at a time (see Data analyses). Model summaries are given in Table S3 (numerical vs. verbal), Table S4 (numerical vs. visuo-spatial), and Table S5 (verbal vs. visuo-spatial). Means and standard deviations of each eye parameter and planned pairwise comparisons of internal versus external attentional focus within each task type are given in Table 1; planned pairwise comparisons of task types within internal and external attentional focus are given in Table 2. Means and 95% confidence intervals of each eye parameter are visualized in Fig. 3. Correlations between all eye parameters are given in Table 3. Further, we provide graphical analyses depicting the difference in eye behavior between the internal and external condition for each participant and the distribution of these attention effects across participants in the Appendix (Figs. S2-S4).
Additionally, we examined the average time course of all eye parameters within a trial. To this end, we plotted eye tracking data for 1,200 ms periods (400 ms mask plus 800 ms stimulus screen) in Fig. 4. We added the 1,000 ms fixation period at the beginning of each trial as an additional period.

Pupil diameter
For the verbal and numerical tasks, internal attentional focus elicited a larger pupil diameter than external attentional focus. Fig. 4 shows that this difference emerged right after the instruction, already during the fixation period, suggesting that participants may have started task performance immediately after the instruction. However, attentional focus had no effect on pupil diameter in the visuo-spatial task. Together, this indicates that the effect of attentional focus on pupil diameter was moderated by task type.

Note. Mean (standard deviation) eye behavior during 9.6 s trials. d = Cohen's d; BF = Bayes Factor; BF10 = ratio of evidence in favor of the effect; BF01 = ratio of evidence against the effect. N = 157.
The absent effect of attentional focus in the visuo-spatial task type was mainly due to the relatively small pupil diameter during the internal visuo-spatial task compared to the other internal tasks (see Fig. 3). In fact, pupil diameter during the internal visuo-spatial task was numerically closer to the external tasks than to the other internal tasks. Further, within the other internal tasks, the numerical task elicited a larger pupil diameter than the verbal task. Within the external tasks, the verbal task was associated with a smaller pupil diameter than the numerical and visuo-spatial tasks. The numerical and visuo-spatial tasks did not differ in pupil diameter.
Visual inspection of the time course within trials (Fig. 4) showed that in external tasks, pupil diameter slightly increased and then decreased, while in internal tasks, pupil diameter remained elevated throughout the trial.

Fig. 3. Eye parameters as a function of focus and task type. Note. Data from the first 9.6 s of each trial. Error bars depict 95% confidence intervals.

Pupil diameter variation
Across all three task types, internal attentional focus was associated with more within-trial variation in pupil diameter than external attentional focus. Task type had only marginal effects on pupil diameter variation (d's ≤ .08). Fig. 4 shows that pupil diameter variation increased from the fixation period to the first stimulus screen in all tasks. In external tasks, pupil diameter variation then dropped and remained low; in internal tasks, it remained at the increased level.

Fig. 4. Time course of eye behavior during internal and external attentional focus for each task type. Note. The first sequence refers to the 1 s fixation period at the beginning of the trial. Each following sequence consists of one mask (400 ms) and one stimulus screen presentation (800 ms), 1.2 s in total. Error bars depict 95% confidence intervals. * Proportion of trials with at least one blink, saccade, and microsaccade, respectively (note that these variables were dichotomized to account for skewed distributions, see Fig. S1).
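As a sketch of how a within-trial variation measure can be derived from the raw signal, the snippet below computes the standard deviation of pupil samples within each trial. The dispersion measure and the sample values are illustrative assumptions; the study's exact preprocessing may differ.

```python
import numpy as np

def within_trial_variation(trial_samples):
    # Variation of a continuous eye signal per trial, taken here as the
    # sample standard deviation (an assumed dispersion measure).
    return [float(np.std(np.asarray(t, dtype=float), ddof=1)) for t in trial_samples]

# Two hypothetical trials of pupil diameter samples (mm).
trials = [
    [3.1, 3.2, 3.1, 3.3, 3.2],   # external focus: signal tied to the stimulus
    [3.1, 3.6, 2.9, 3.8, 3.0],   # internal focus: more spontaneous variation
]
var_ext, var_int = within_trial_variation(trials)
```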

Fixation disparity
Fixation disparity was similar across all task types and both internal and external attentional focus, except for the internal visuo-spatial task (imagination), where fixation disparity continuously increased over the course of the trial.

Fixation disparity variation
Across all three task types, internal attentional focus was associated with more within-trial variation in fixation disparity than external attentional focus. Additionally, the external visuo-spatial task (visual search) was associated with more variation in fixation disparity than the other two external tasks. Fig. 4 shows that in the internal tasks and the external visuo-spatial task, variation in fixation disparity increased at trial start and remained at an increased level, while in the numerical and verbal external tasks, variation increased only slightly.

Blinks
Across all three task types, blink frequency was greater in internal tasks than in external tasks. Fig. 4 shows that in internal tasks, blink rate increased from the fixation period to the first stimulus screen and slowly decreased during trials. In external tasks, blink rate dropped from the fixation period to the first stimulus screen and slightly increased during trials.
There were also effects of task type. The external visuo-spatial task resulted in fewer blinks than the other external tasks. Within the internal tasks, task type effects were marginal (d ≤ .07).
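The dichotomization mentioned in the Fig. 4 note (proportion of trials with at least one blink) can be sketched as follows; the blink counts are hypothetical.

```python
import numpy as np

def blink_trial_proportion(blink_counts):
    # Dichotomize each trial (any blink vs. none) to handle the skewed
    # count distribution, then average across trials.
    return float(np.mean(np.asarray(blink_counts) >= 1))

# Hypothetical blink counts per trial.
internal_counts = [2, 1, 0, 3, 1, 1]   # blinking relatively frequent
external_counts = [0, 0, 1, 0, 0, 0]   # blinking largely suppressed
p_internal = blink_trial_proportion(internal_counts)
p_external = blink_trial_proportion(external_counts)
```

The same dichotomization applies to saccades and microsaccades, per the Fig. 4 note.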

Saccades
Task type determined whether internal attentional focus was associated with a higher or lower proportion of saccadic activity than external attentional focus. For numerical and verbal tasks, internal attentional focus was associated with more saccades than external attentional focus. In the visuo-spatial tasks, the opposite occurred: External attentional focus was associated with more saccades than internal attentional focus. Overall, the internal and external visuo-spatial tasks were associated with more saccades than their numerical and verbal counterparts. Fig. 4 shows that in the internal tasks, saccade rate increased minimally at trial start and remained stable throughout the trial. In the external numerical and verbal tasks, saccade rate remained low throughout the trial. In the external visuo-spatial task (visual search), saccade rate increased strongly after trial start and remained high. This is unsurprising, given that this task required searching for targets in a circle of letters.

Microsaccades
Similar to saccades, internal attentional focus was associated with more microsaccades than external attentional focus only in the numerical and verbal tasks, and the effect was reversed in the visuo-spatial task type. Fig. 4 shows that in the internal tasks, the microsaccade rate remained stable throughout the trial. In the external verbal and numerical tasks, microsaccade rate dropped at trial start and remained low throughout the trial. In the external visuo-spatial task, microsaccade rate drastically increased at trial start and remained high throughout the trial.

Classification of attentional focus using machine learning
We used classification accuracy as the evaluation metric for identifying attentional focus from single-trial raw data. The evaluation yielded an average classification accuracy of 75.7% (SD = 11.3%), which exceeds the majority baseline. Shorter training windows of 6 and 4 s length yielded accuracy scores of 77.4% and 69.3%, respectively, so performance remained above baseline and, for the 6 s window, was even slightly higher. Given that the analyses in Section 3.2 revealed that the attentional focus effects are not task-independent for some parameters, we further examined whether a trained model transfers to unseen tasks. To this end, we modified the analysis so that the LSTM was trained on only two internal and two external tasks, while the data from the remaining two tasks were used for testing. We systematically compared all possible combinations of four training tasks and two testing tasks, using the same 135 sessions for training and the remaining sessions for testing (see Table 4). Classification accuracy dropped noticeably in the transfer condition, where we observed an average classification accuracy of 61.1% (SD = 15.6%), which was still above chance. This indicates that single-trial classification with data from only a few trials was generally feasible, which is highly relevant for future real-time applications. Interestingly, the accuracies of individual training-test pairings ranged from 47% to 81% (see Table 4). While the type of the internal testing task did not strongly affect classification accuracy (averaging 61% for numerical, 62% for visuo-spatial, and 60% for verbal), the type of the external task had a more pronounced effect: Average accuracy for the verbal external task was 68%, while it was only 54% when predicting attentional focus in the visuo-spatial task (numerical: 59%).

Note. Percentage values below the external and internal testing task reflect the percentage of correctly classified samples within this external or internal task.
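The task-transfer evaluation described above can be sketched by enumerating all train/test splits in which one internal and one external task are held out; with three tasks per focus condition, this yields nine train/test pairings. The task labels follow the three modalities, but the data handling is illustrative, not the study's actual code.

```python
from itertools import product

INTERNAL = ["numerical", "verbal", "visuo-spatial"]
EXTERNAL = ["numerical", "verbal", "visuo-spatial"]

def transfer_splits():
    # Hold out one internal and one external task for testing; train on
    # the remaining two internal and two external tasks.
    splits = []
    for test_int, test_ext in product(INTERNAL, EXTERNAL):
        train = ([("internal", t) for t in INTERNAL if t != test_int]
                 + [("external", t) for t in EXTERNAL if t != test_ext])
        splits.append({"train": train,
                       "test": [("internal", test_int), ("external", test_ext)]})
    return splits

splits = transfer_splits()
```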

Discussion
Eye behavior is increasingly used as an indicator of internal versus external focus of attention in both research and application, but a closer look at the literature reveals mixed evidence on how eye parameters relate to internal versus external attentional focus. Therefore, the present study investigated to what extent the relationship between attentional focus and eye parameters is moderated by task type (i.e., numerical, verbal, and visuo-spatial). We found that blinks, variation in pupil diameter, and variation in fixation disparity showed a robust effect of internal versus external attentional focus across all three task types. Median pupil diameter and median fixation disparity showed an effect of attentional focus in some but not all tasks. And, although saccades and microsaccades differed between internal and external attentional focus in all three tasks, the direction of the difference reversed depending on task type. Below, we discuss these results and derive conclusions on the reliability of eye parameters as indicators of internal versus external attentional focus with respect to different task settings.

Pupil diameter
Pupil diameter is frequently used as an indicator of internal versus external attentional focus (e.g., Franklin et al., 2013; Konishi et al., 2017; Palinko et al., 2010). As expected, in the numerical and verbal tasks, pupil diameter was larger during internal compared to external attentional focus. Surprisingly, however, there was no difference in pupil diameter between internal and external attentional focus in the visuo-spatial tasks. This was mainly a result of the small pupil diameter during the internal visuo-spatial task (imagination). Typically, internal tasks are associated with higher workload and, consequently, larger pupil diameter, as all information needs to be held in working memory. However, internal activities such as imagining scenes mainly involve episodic simulation processes that do not seem to strongly tax working memory and have been shown to draw on different neural networks (Beaty, Thakral, Madore, Benedek, & Schacter, 2018; Schacter, Addis, & Buckner, 2007). This notion is also consistent with the sometimes smaller pupil diameter during mind wandering (Grandchamp et al., 2014). Moreover, the internal visuo-spatial task required imagining only one scene per trial, whereas participants performed several additions and generated several words per trial in the other two internal tasks. In sum, the visuo-spatial imagination task may have involved lower working memory load than the other internal tasks, which may explain the smaller pupil diameter in the internal visuo-spatial task. Overall, our results corroborate the notion that pupil diameter is a good indicator of internal workload. Internal workload is increased in many forms of internally directed cognition (e.g., cued semantic search and mental arithmetic), but not in others (e.g., episodic simulation and mind wandering).
Therefore, pupil diameter will only be a reliable indicator of internal versus external attentional focus when one is well informed about the workload levels in the internal and external activities or when other task characteristics are kept equal (Ceh et al., 2020). In such settings, one can benefit from the fact that pupil diameter can be assessed continuously, allowing for continuous and temporally fine-grained analysis of attentional focus.

Fixation disparity
Attentional focus had no effect on fixation disparity in the numerical and verbal tasks. However, in the internal visuo-spatial task, during mental imagination, fixation disparity continuously increased, suggesting that participants were "staring into space." Fixation disparity adapts not only to real distances but also to imagined distances (Laeng & Sulutvedt, 2014). Therefore, disparity of the eyes might be indicative of internal attentional focus only in tasks involving visual representations that eye behavior can couple to (see also Huang et al., 2019). Especially when the imagined distance differs from the screen distance, fixation disparity should indicate attentional focus. In contrast, performance of the verbal and numerical tasks relied on abstract semantic information and did not require complex visualization. Consequently, gaze might still have been focused on the monitor, leading to no change in fixation disparity.
We conclude that fixation disparity may not be considered a task-general indicator of internal versus external attentional focus, but it can still indicate internal activities involving visuo-spatial imagination. As with pupil diameter, the high temporal resolution and straightforward computability make fixation disparity a powerful tool for future investigations of imagination processes.

Variation in pupil diameter and fixation disparity
As expected, variation in pupil diameter and variation in fixation disparity were larger during internal compared to external attentional focus, and this effect was robust across all three task types. Although the tasks differed in their content modality (i.e., numerical, verbal, and visuo-spatial), all external tasks required continuous encoding of external visual information, leading to a close linkage of pupil diameter and fixation disparity to the physical stimuli on the screen (e.g., luminance and distance of the screen). For internal activities, external visual information is usually irrelevant. Hence, pupil diameter and fixation disparity are less constrained (i.e., perceptual decoupling; Smallwood & Schooler, 2006) and can exhibit more spontaneous variation as well as variation due to coupling to aspects of the mental representations (Laeng & Sulutvedt, 2014). This seems relatively independent of task type, making these variation measures broadly applicable indicators of internal versus external attentional focus. Table 3 shows that a larger pupil diameter was associated with more variation in pupil diameter (r = .49) across the whole dataset, including internal and external tasks. Notably, in the visuo-spatial task type, internal attentional focus was associated with higher variation in pupil diameter although pupil diameter itself did not differ. Hence, the increased variation in pupil diameter during internal attentional focus is not simply an artifact of larger pupil diameter. Further, the two variation measures correlate only marginally or not at all with blink, saccade, and microsaccade rate (see Table 3), further supporting their independence from other eye parameters.
A downside of variation measures is that their assessment relies on the analysis of time windows, leading to a relatively low temporal resolution, which makes them less suited for fine-grained continuous assessment of focus. Future studies could determine how much data is needed to obtain reliable estimates of pupil diameter and fixation disparity variation. We conclude that, according to our results, variation measures appear to be among the most robust, task-insensitive eye-tracking indicators of internal attentional focus. However, as the few existing studies employing variability measures show mixed results, replication by future research is required.
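The open question of how much data a stable variation estimate requires could be probed by comparing estimates from progressively longer windows with the full-trial value, as in this sketch on synthetic pupil data (the sampling rate and signal parameters are assumptions, not values from the study).

```python
import numpy as np

def variation_by_window(signal, window_lengths):
    # SD of the first n samples for each window length n, to see how
    # quickly the variation estimate approaches the full-trial value.
    signal = np.asarray(signal, dtype=float)
    return {n: float(np.std(signal[:n], ddof=1)) for n in window_lengths}

# Synthetic pupil trace: 10 s at an assumed 60 Hz sampling rate,
# mean 3.4 mm with 0.2 mm of random fluctuation.
rng = np.random.default_rng(1)
pupil = rng.normal(loc=3.4, scale=0.2, size=600)
estimates = variation_by_window(pupil, [30, 60, 300, 600])
```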

Blinks
Blink rate was robustly higher during internal versus external attentional focus across all three task types. Temporal analysis further showed that blink rate decreased relative to fixation in external tasks (see Fig. 4). Blinks interrupt visual input and even temporarily suppress visual processing (Berman et al., 2012). Hence, blinks are suppressed when tasks require processing of external visual information (Fukuda, 2001; Shin et al., 2015; Shultz et al., 2011). Interestingly, Fig. 4 shows that blink rate increased during performance of internal tasks relative to the fixation period. Hence, the higher blink rate during internal compared to external attentional focus is not just due to momentary suppression in the external condition. Instead, an increase in blink rate during internal attentional focus could represent visual disengagement from irrelevant external visual information. In sum, this suggests that blink rate provides a fine-grained index of visual disengagement: Blink rate is lowest when relevant external information is expected (external tasks), moderate when visual information is available but does not require extensive processing (fixation), and highest when visual information is irrelevant and potentially distracting (internal tasks).
The robustly higher blink rate during internal compared to external attentional focus makes blinks an especially attractive indicator of attentional focus. A downside of blinks is their relatively rare occurrence, which requires long time windows for analysis. Hence, this parameter may be suited only for assessing attentional focus during longer cognitive activities.

Saccades and microsaccades
Our study showed that saccades and microsaccades differentiate between internal and external attentional focus; however, the direction of the difference depends on the task type. External tasks can either limit or require eye movements depending on the task characteristics. The numerical and verbal external tasks required steady fixation at the position where relevant information was presented, leading to few saccades and microsaccades.

S. Annerer-Walcher et al. / Cognitive Science 45 (2021)

The visuo-spatial external task required eye movements toward the target, leading to a higher rate of saccades and (corrective) microsaccades. In internal tasks, eye movements are less determined by the external visual stimulus, and more spontaneous eye movements and coupling to internal representations and processes can occur. The rate of saccades and microsaccades was quite similar in all internal tasks, but it tended to be higher in the internal visuo-spatial task, suggesting that gaze reflected the exploration of the mental visual scene (Ferreira et al., 2008; Todd & Hills, 2020). Importantly, higher saccade and microsaccade rates were observed for internal attentional focus in the numerical and verbal tasks, while rates were lower in the internal compared to the external visuo-spatial task. Hence, saccades and microsaccades can serve as indicators of attentional focus only when the need for eye movements (especially in the external activity) is known. In fact, given the large effect sizes (especially in the visuo-spatial task), these two eye parameters might be especially effective for indicating internal versus external attentional focus in tasks with well-known task demands.

Classification of attentional focus using machine learning
The differentiation between internal and external attentional focus is of interest for many real-world scenarios, such as driving, classrooms, and many others. In a first line of analysis, we demonstrated that eye behavior measures can be used to differentiate internal and external attentional focus. However, real-world applicability requires classification of attention states based on single events. While the applied mixed-effects models already account for inter-trial variance, effects are still evaluated based on a comparison of mean values for individual features. Therefore, in a complementary line of analysis, we applied machine learning techniques to examine the classification accuracy of attentional focus based on single-trial data.
This analysis yielded a classification accuracy of 75.7%, which is consistent with previous work showing good classification accuracy for eye behavior as an indicator of internal versus external attentional focus (Vortmann et al., 2019). The classification accuracy dropped noticeably for the transfer condition; however, some information remained, as accuracy was still above chance. The drop in the transfer condition is in line with the observation that attention effects are substantially moderated by task demands, suggesting that the choice of external task influences classification performance: Reading in the verbal condition seems to be the most characteristic external task (leading to the highest average accuracy), while the search process associated with the external visuo-spatial task seems to be easily confused with random eye movements during internal attentional focus (leading to the lowest average accuracy). It should be noted that the machine learning approach relied only on raw data from continuous eye parameters and, therefore, did not explicitly consider blink rate, which proved to be an important, robust index of internal attention. While the focus of this study was on the task-specificity of eye behavior for indicating internal versus external attention states, future research may examine more systematically how to optimize the classification of attention states by varying the level of information regarding eye behavior with respect to raw data versus features, length of observation, and different classification approaches.
Together, these results show that the single-trial classification of internal and external attentional focus is possible, but sensitive to variations of task characteristics.

Which eye parameters to use as indicators of internal versus external attentional focus
Our study showed that blink rate, variation in pupil diameter, and variation in fixation disparity discriminated most robustly between internal and external attentional focus across all three task types. Therefore, we expect that these parameters are best suited as an index of attentional focus in conditions in which the specific external and internal task characteristics are unknown, as well as in complex tasks involving diverse cognitive processes. For example, they apply when one wants to detect any kind of internal attentional focus in a real-world setting such as driving, or any kind of external attentional focus while a person solves a math problem in their head. As a downside, their applicability suffers from low temporal resolution.
In settings where more information on the specific tasks is available and generalization across different tasks is secondary, detection of internal and external attentional focus can be improved by additionally considering pupil diameter, saccades, and microsaccades. For example, when internal activity is defined by mental workload in terms of taxing working memory, pupil diameter can be a powerful additional indicator of attentional focus. Further, using saccades and microsaccades can boost identification of internal versus external attentional focus when the specific activities of interest are known to differ in the level of required eye movements, as is the case for most external activities.

Limitations and future directions
In the present study, we aimed to select an internal and an external task for each of the three task types that are relatively comparable in complexity and allow the same stimulus screens to be presented in each task. As suggested by the differences in pupil diameter (Piquado et al., 2010), workload was likely higher in the internal numerical and verbal tasks compared to the other tasks. Therefore, some of the effects on eye parameters could be the result of differences in workload. However, effects of attentional focus on variation in pupil diameter, variation in fixation disparity, and blinks also appeared in the visuo-spatial task type, although there were no differences in pupil diameter. Future research may manipulate the level of workload explicitly and obtain subjective difficulty ratings to explore its influence on eye behavior in more detail.
We aimed to make the internal and external tasks as comparable as possible. While we appear to have achieved a similar workload in the visuo-spatial tasks, as indexed by pupil diameter, the internal and external visuo-spatial tasks differed in their level of constraint. Specifically, the internal visuo-spatial task required imagination of a very realistic, unconstrained scenario, while the external visuo-spatial task used an abstract and constrained stimulus. Hence, the difference in the level of constraint could be partly responsible for differences in eye behavior between internal and external visuo-spatial tasks, for example, the increase in fixation disparity in the internal visuo-spatial task. In future studies, comparable levels of constraint could be achieved by either increasing constraints in the internal task (e.g., imagining a ticking clock) or decreasing constraints in the external task (e.g., presenting pictures of naturalistic scenes). It should also be noted that this study focused on goal-directed forms of internal and external cognition and thus did not include more spontaneous, undirected forms. Task-unrelated internal attentional focus (e.g., mind wandering) and task-unrelated external attentional focus (e.g., distraction) can differ from goal-directed cognition in aspects like workload, intentionality, and relevant cognitive processes and content. It can be assumed that eye behavior during task-unrelated internal cognition is driven by the same underlying mechanisms of visual disengagement, perceptual decoupling, and internal coupling of eye behavior. Still, future investigations need to test whether these findings generalize to spontaneous, task-unrelated forms of internal and external attentional focus.
Finally, when we refer to external attentional focus, we specifically mean focus on visually presented external information. Many research questions and applications investigating internal versus external attentional focus are mainly interested in this form of external attentional focus (e.g., external focus on traffic during driving). However, an external focus of attention may of course also be directed to nonvisual sensory information (e.g., external auditory or tactile information). We expect that an external attentional focus on nonvisual information is also characterized by visual disengagement and, thus, related eye behavior may actually be more similar to forms of internal attentional focus. Future research may extend this work by considering shifts of attentional focus between visual and nonvisual sensory information.

Conclusion
In the present study, we tested how robustly different eye parameters discriminate between internal and external attentional focus across different task types. We identified three eye parameters that showed attention effects independent of task characteristics: Blink rate, variation in pupil diameter, and variation in fixation disparity were increased for tasks with an internal attentional focus. Attentional focus also affected other eye parameters (pupil diameter, fixation disparity, saccades, and microsaccades), but here, effects were moderated by task demands in a predictable way. Differences in eye behavior during internal attentional focus were especially driven by mechanisms of visual disengagement, perceptual decoupling, and increases in workload. Complementary classification analyses demonstrated that attentional focus can be identified with acceptable accuracy across subjects, although transfer to unseen tasks remains challenging. In sum, this study shows that various eye parameters can serve as indicators of internal versus external attentional focus: Some are highly discriminative under well-defined conditions, while others discriminate largely independently of task characteristics.