Visually evoked responses are enhanced when engaging in a video game

Abstract While it is well known that vision guides movement, less appreciated is that the motor cortex also provides input to the visual system. Here, we asked whether neural processing of visual stimuli is acutely modulated during motor activity, hypothesizing that visual evoked responses are enhanced when engaged in a motor task that depends on the visual stimulus. To test this, we told participants that their brain activity was controlling a video game that was in fact the playback of a prerecorded game. The deception, which was effective in half of participants, aimed to engage the motor system while avoiding evoked responses related to actual movement or somatosensation. In other trials, subjects actively played the game with keyboard control or passively watched a playback. The strength of visually evoked responses was measured as the temporal correlation between the continuous stimulus and the evoked potentials on the scalp. We found reduced correlation during passive viewing, but no difference between active and sham play. Alpha‐band (8–12 Hz) activity was reduced over central electrodes during sham play, indicating recruitment of motor cortex despite the absence of overt movement. To account for the potential increase of attention during gameplay, we conducted a second study with subjects counting screen items during viewing. We again found increased correlation during sham play, but no difference between counting and passive viewing. While we cannot fully rule out the involvement of attention, our findings do demonstrate an enhancement of visual evoked responses during active vision.

visual cortex and passing through the parietal lobe (Goodale & Milner, 1992;Milner & Goodale, 2006) before terminating in the motor cortex. Studies of visually guided action have generally adopted a feedforward view, where the relevant information flows from the visual system to the premotor and motor centers. On the other hand, much less attention has been devoted to potential influences of downstream regions, including the motor cortex itself, on visual processing.
Despite this, multiple lines of evidence indicate that the motor system exerts influence over visual processing. First, the visual and motor cortices have reciprocal anatomical connections in the primate brain (Van Essen, 2005;Wise, 1997;Young, 1993). Moreover, numerous behavioral studies have demonstrated that learning of motor actions improves subsequent recognition of congruent visual stimuli (Casile & Giese, 2006;Engel, 2008;Hecht, Vogt, & Prinz, 2001) and that perceptual decisions may be primed by action (Brown, 2007;Helbig, 2010;Wohlschläger, 2000). Human sensitivity to visual motion appears to be higher when that motion matches the observer's own movement patterns (Knoblich & Flach, 2001;Loula, 2005). There is also evidence from neuroimaging studies that objects affording actions enhance early visual evoked potentials (VEPs) via a purported sensory gain mechanism (Adamo & Ferber, 2009;Handy, 2003;Matheson, Newman, Satel, & McMullen, 2014;Wykowska & Schubö, 2012). Neural recordings from visual extinction patients demonstrate that graspable objects bias visual perception in an unconscious manner (Pellegrino, Rafal, & Tipper, 2005). Peripheral visual processing in the human brain is enhanced during locomotion (Cao & Händel, 2019). Based on these findings, we suspected that the presence of a motor task would acutely modulate visual processing. Specifically, we hypothesized that visual evoked responses are enhanced when accompanying engagement of the motor cortex. Testing this hypothesis in the human brain is not straightforward due to the fact that manual actions (e.g., button presses) introduce somatosensory and motor related signals into recordings of brain activity, potentially confounding measures of neural visual responses, particularly because actions are often time-locked to changes in the stimulus.
Here, we developed a "sham" motor task aimed at identifying the online effect of motor engagement on the dynamics of concurrent visual processing. Subjects were under the belief that their brain activity was controlling a car racing video game, when in fact they were viewing a recording. The purpose of this manipulation was to engage the motor cortex while avoiding somatosensory or motor evoked potentials. Neural activity was recorded with the scalp electroencephalogram (EEG) to capture fast neural responses that could then be correlated with rapid stimulus fluctuations without requiring exogenous stimulus labels. We assessed the strength of visual evoked responses by measuring the correlation between a feature of the time-varying visual stimulus (optic flow) and the evoked EEG response: the stimulus-response correlation (SRC) (Cheveigné et al., 2018;Dmochowski, Ki, DeGuzman, Sajda, & Parra, 2017). To test the hypothesis of enhanced visual evoked responses during motor engagement, we compared the SRC obtained during "sham play" with that measured during passive viewing and conventional manual game play ("active play"). To investigate the possible role of increased attention during video game play, we conducted a second study where subjects were additionally asked to view the game while counting the appearance of target items on the screen. In what follows, we report a robust enhancement of visual evoked responses when subjects were engaged in video game play and discuss the roles of motor coupling and attention in our findings.

| Participants
All participants provided written informed consent in accordance with procedures approved by the Institutional Review Board of the City University of New York. In the initial study, 18 healthy human subjects (9 females) aged 20 ± 1.56 years participated. For the follow-up experiment testing the effects of attention, we recruited a new cohort of 24 healthy subjects (14 females, 20 ± 3.01 years).

| Video game stimulus
We employed the open-source car racing game SuperTuxKart, in which participants navigate vehicles around a track against simulated opponents. All experimental trials were conducted on the default course and spanned three laps in "easy" mode. The average trial had a duration of 175.9 ± 5.51 s. In the initial study, we removed several graphical items from the stimulus such that the video stimulus consisted of only the race car, track, and opponents. To generate the stimuli employed during the sham play and passive viewing conditions, we recorded several races for subsequent playback during the experiments. A nonparticipant played 4 races, with 2 serving as stimuli during the sham play condition and the other 2 employed during the passive viewing condition. With the exception of active play, which produces unique stimuli during each trial, all subjects experienced the same stimuli.
The stimulus was presented on a high-definition Dell 24-inch UltraSharp monitor (1,920 by 1,080 pixels) at a frame rate of 60 Hz. Subjects viewed the stimulus in a dark room at a viewing distance of 60 cm. The game's sound was muted during the experiment. The video frame sequence of each race was captured with the open-source Open Broadcaster Software at the native resolution and frame rate. In order to subsequently synchronize the video frame sequence with the recordings of the EEG, a 30-by-30 pixel square was flashed in the top right corner of the display throughout each trial. These flashes were not visible to the subject, but a photodiode registered them and transmitted an electrical pulse to the EEG recorder with low latency.

| EXPERIMENTAL PROCEDURES
Initial study participants experienced two trials of the video game stimulus in each of 3 conditions: "active play," "sham play," and "passive viewing." The ordering of the conditions was randomized and counterbalanced across subjects. Subjects were permitted one practice trial of the video game prior to commencing the experiment. During active play, subjects controlled the game via keyboard presses made with the right hand: the left and right keys controlled steering, while the up and down keys produced acceleration and braking, respectively. During sham play, subjects were shown a previously recorded game but were falsely told that their brain activity would be controlling the video game (details below). During passive viewing trials, subjects were instructed to freely view playback of a previously recorded game. The recordings shown during sham play and passive viewing conditions were distinct and not previously seen by the participants, but were reused across participants (within each condition, all subjects viewed the same two stimuli).

| Sham play protocol and design
In order to emulate the experience of a brain-computer interface (BCI), we implemented protocols for priming the subjects prior to sham play. Participants performed a mock calibration of a BCI, where they were asked to imagine pressing the game controls (steer left, steer right) following a highlighted arrow that appeared on the screen. Additionally, subjects were provided a modified set of instructions and were told that the kart would be automatically accelerating, such that steering left and right were the only degrees of freedom. Given that subjects generally held down the accelerate button during active play, we felt that this was justified and would enhance the deception. Subjects were also instructed that the kart would automatically reposition itself in the middle of the road via "AI" assistance if it veered too far from the track. This instruction aimed to prevent subjects from becoming suspicious of the deception when the steering did not match the intended direction. During sham play, subjects were told to place their hands away from the keyboard in a comfortable position.
Upon completion of the experiment, participants filled out a survey reporting their experienced "engagement" during each condition. Scores ranged from 1 ("not engaged") to 10 ("fully engaged"). Following the survey, subjects were informed of the deception task and were asked whether they had become aware of the fact that their brain activity was not controlling game play.
In the follow-up experiment, we repeated the same three conditions (active play, sham play, and passive viewing) and included a fourth "counting" condition. In this condition, subjects were instructed to view prerecorded playback of the game while simultaneously counting the total number of appearances of a target item (i.e., a gift box) on the track. Upon completion of each trial, subjects were asked to recall the total number of times that the item appeared. The correct number of items was 66, with items appearing regularly over the approximately three-minute trial. As in the initial experiment, the ordering of the conditions was randomized and counterbalanced across subjects, and each condition of the game was repeated twice.

| EEG acquisition and preprocessing
The scalp electroencephalogram (EEG) was acquired with a 96-electrode cap (custom montage with dense coverage of the occipital region) housing active electrodes connected to a Brain Products ActiChamp system and Brain Products DC Amplifier (Brain Vision GmbH, Munich, Germany). The EEG was sampled at 500 Hz, digitized with 24 bits per sample, and transmitted to a recording computer via the Lab Streaming Layer software (Kothe, 2015), which ensured precise temporal alignment between the EEG and video frame sequence.
EEG data were imported into the MATLAB software (Mathworks, Natick, MA) and analyzed with custom scripts. Data were downsampled to 30 Hz in accordance with the Nyquist rate afforded by the 60 Hz frame rate, followed by high-pass filtering at 1 Hz to remove slow drifts. To remove gross artifacts from the data, we employed the robust PCA technique (Candès, Li, Ma, & Wright, 2011), which provides a low-rank approximation to the data and thereby removes sparse noise from the recordings. Due to volume conduction, sparse EEG components are generally artifacts. We employed the robust PCA implementation of Lin et al. (2010) with the default hyperparameter of λ = 0.5. To reduce the contamination of EEG from eye movements, we linearly regressed out the activity of four virtual electrodes constructed via summation or subtraction of appropriately selected frontal electrodes. These virtual electrodes were formed to strongly capture the activity produced by eye blinks and saccades. To further denoise the EEG, we rejected electrodes whose mean power exceeded the mean of all channel powers by four standard deviations. Within each channel, we also rejected time samples (and their adjacent samples) whose amplitude exceeded the mean sample amplitude by four standard deviations. We repeated the channel and sample rejection procedures over three iterations.
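The iterative channel and sample rejection described above can be sketched as follows. This is a minimal illustration in Python with NumPy, not the custom MATLAB scripts used in the study. The function name `reject_outliers`, the use of absolute value as "amplitude," the one-sample neighborhood (`pad=1`), and the choice to return a boolean mask rather than modify the data are all assumptions made for the example.

```python
import numpy as np

def reject_outliers(eeg, n_sd=4.0, n_iter=3, pad=1):
    """Flag outlying channels and samples in a (samples x channels) array.

    Returns a boolean mask of the same shape (True = rejected).
    Assumptions (not from the paper): amplitude is the absolute value,
    and `pad` neighbors on each side of an outlying sample are also flagged.
    """
    x = np.asarray(eeg, dtype=float)
    n_samples, n_channels = x.shape
    bad = np.zeros((n_samples, n_channels), dtype=bool)
    for _ in range(n_iter):
        # Channels that still contain usable samples.
        live = [c for c in range(n_channels) if not bad[:, c].all()]
        # Mean power per channel, computed over samples not yet rejected.
        power = np.array([np.mean(x[~bad[:, c], c] ** 2) for c in live])
        limit = power.mean() + n_sd * power.std()
        for c, p in zip(live, power):
            if p > limit:
                bad[:, c] = True  # reject the whole channel
        # Sample rejection within each remaining channel.
        for c in live:
            if bad[:, c].all():
                continue
            amp = np.abs(x[:, c])
            good = ~bad[:, c]
            thresh = amp[good].mean() + n_sd * amp[good].std()
            hits = np.flatnonzero(good & (amp > thresh))
            for k in range(-pad, pad + 1):
                idx = np.clip(hits + k, 0, n_samples - 1)
                bad[idx, c] = True
    return bad
```

Note that with a four-standard-deviation criterion a single channel can only be flagged when enough channels are present for its power to stand out from the pooled statistics, which is the case for a 96-channel montage.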
During the follow-up experiment, we experienced problems with synchronizing the EEG recordings with the video frame sequence. This was diagnosed by analyzing the temporal alignment between the periodic flashes registered by the EEG recorder's auxiliary channel (triggers) and the corresponding events in the recorded video frame sequence. In 4 of 24 subjects, we observed severe temporal misalignment between the EEG triggers and the video frame sequence: this was detected by comparing the inter-trigger intervals with the interval between corresponding frames (frames during which triggers were issued were identified by noting a white square in the top right corner of the frame). For each subject, we computed the mean absolute difference between these two intervals and found an average misalignment of 300 ms in four of the subjects. These subjects were thus excluded; the other 20 subjects had no appreciable misalignment.
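The alignment check described above amounts to comparing two sequences of intervals. A minimal sketch, assuming trigger times are available in seconds and flash frames as frame indices; the function name `mean_interval_mismatch` and its arguments are hypothetical and do not come from the study's code.

```python
import numpy as np

def mean_interval_mismatch(trigger_times_s, flash_frame_idx, frame_rate_hz=60.0):
    """Mean absolute difference (in seconds) between EEG inter-trigger
    intervals and the intervals between the corresponding flash frames."""
    eeg_intervals = np.diff(np.asarray(trigger_times_s, dtype=float))
    video_intervals = np.diff(np.asarray(flash_frame_idx, dtype=float)) / frame_rate_hz
    # Compare only the intervals present in both sequences.
    n = min(len(eeg_intervals), len(video_intervals))
    return float(np.mean(np.abs(eeg_intervals[:n] - video_intervals[:n])))
```

A subject whose value is close to zero would be retained, whereas a value on the order of 0.3 s would correspond to the severe misalignment reported for the excluded subjects.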

| Stimulus feature extraction
Video frames were downsampled to a resolution of 320-by-180 pixels to reduce data size and then converted to grayscale images. Optical flow was computed with the Horn-Schunck method as implemented in the MATLAB Computer Vision System Toolbox (Horn & Schunck, 1981). For each frame, we computed the mean (across pixels) of the magnitude of the optical flow vector. Temporal contrast was constructed by taking the mean (across pixels) of the frame-to-frame difference of the video sequence (Dmochowski et al., 2017). The resulting time series were z-scored prior to SRC analysis.
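As an illustration of the feature pipeline, the temporal contrast feature and the z-scoring step can be sketched in Python with NumPy. This is a sketch under stated assumptions: the text does not say whether the frame-to-frame difference is taken as an absolute or squared value, so the absolute difference is assumed here, and the first frame is assigned a value of zero. The function names are hypothetical. The Horn-Schunck optical flow computation is not reproduced.

```python
import numpy as np

def temporal_contrast(frames):
    """frames: (n_frames, height, width) grayscale array.

    Returns one value per frame: the mean over pixels of the absolute
    frame-to-frame difference (assumption: absolute value; the first
    frame has no predecessor and is set to 0).
    """
    f = np.asarray(frames, dtype=float)
    diff = np.abs(np.diff(f, axis=0))   # (n_frames - 1, height, width)
    tc = diff.mean(axis=(1, 2))         # average across pixels
    return np.concatenate([[0.0], tc])

def zscore(x):
    """Standardize a time series to zero mean and unit variance."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()
```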

| Stimulus-response correlation (SRC)
To measure the correlation between the time-varying stimulus feature $s(t)$ and the $D$-dimensional evoked neural response $r_i(t)$, $i \in \{1, 2, \ldots, D\}$, we employed the multidimensional SRC technique developed in Dmochowski et al. (2017). This regression approach leveraged Canonical Correlation Analysis and was independently developed in de Cheveigné et al. (2018). The approach consists of temporally filtering the stimulus:

$$u(t) = (h \ast s)(t), \tag{1}$$

and spatially filtering the neural response:

$$v(t) = \sum_{i=1}^{D} w_i \, r_i(t), \tag{2}$$

to produce stimulus component $u(t)$ and response component $v(t)$ that exhibit maximal correlation:

$$(h^{*}, w^{*}) = \arg\max_{h, w} \; \rho_{uv}, \tag{3}$$

where $\ast$ in (1) denotes temporal convolution, $h^{*} = [h(1) \ldots h(L)]^{\top}$ are the optimal temporal filter coefficients of the $L$-length filter and $w^{*} = [w_1 \ldots w_D]^{\top}$ are the optimal spatial filter coefficients, and where $\rho_{uv}$ is the Pearson correlation coefficient between $u(t)$ and $v(t)$. The solution to (3) is given by Canonical Correlation Analysis (Hotelling, 1936) and consists of pairs of projection vectors $\{h_j^{*}, w_j^{*}\}_{j=1}^{K}$ that yield a set of maximally correlated components $u_j(t)$ and $v_j(t)$ with corresponding correlation coefficients that decrease in magnitude: $\rho_1 \geq \rho_2 \geq \cdots \geq \rho_K$. Note here that we regularized the CCA solution by truncating the eigenvalue spectrum of the EEG covariance matrix to K = 11 dimensions, as this value explained over 99% of the variance in the data. Encompassing all components, the total correlation between the stimulus and response is given by:

$$\mathrm{SRC} = \sum_{j=1}^{K} \rho_j. \tag{4}$$

With the exception of the results presented in Figures 4 and 6, the CCA filters were computed after pooling data from all conditions. In this manner, SRCs were computed over a common basis for all conditions.
The filter coefficients $h_j(t)$ are equivalent to the "temporal response function" extracted with conventional multivariate regression (Crosse, Di Liberto, Bednar, & Lalor, 2016). Note that here $j$ represents a specific component, rather than a specific electrode as in Crosse et al. (2016). One can also conceive of $h_j(t)$ as the temporal response function of a virtual electrode or "source" $j$, extracted from the EEG with the spatial filter $w_j$. The corresponding "forward model," $a_j$, can be obtained following conventional approaches in EEG (Haufe et al., 2014; Parra, Spence, Gerson, & Sajda, 2005). In Dmochowski et al. (2017) we show that this forward model $a_j$ is the equivalent of a "spatial response function," in that the total evoked response is the product of the spatial and temporal responses, summed over all components: $\sum_{j=1}^{K} a_j \, h_j(t)$.
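To make the SRC computation concrete, the following is a minimal NumPy sketch of regularized CCA between a time-lagged stimulus and multichannel EEG. It is an illustration under stated assumptions rather than the study's implementation: the function names (`lagged`, `inv_sqrt`, `src_cca`) are hypothetical, the filter length of 30 lags is an assumed default, and the stimulus covariance is inverted at full rank, which presumes a well-conditioned stimulus. The truncation of the EEG covariance to its 11 largest eigenvalues follows the regularization described above.

```python
import numpy as np

def lagged(s, n_lags):
    """(T x n_lags) matrix whose column l holds s delayed by l samples."""
    s = np.asarray(s, dtype=float)
    T = len(s)
    X = np.zeros((T, n_lags))
    for l in range(n_lags):
        X[l:, l] = s[:T - l]
    return X

def inv_sqrt(C, k=None):
    """Inverse square root of a symmetric matrix from its k largest eigenpairs
    (all eigenpairs if k is None)."""
    vals, vecs = np.linalg.eigh(C)
    order = np.argsort(vals)[::-1][:k]
    vals, vecs = vals[order], vecs[:, order]
    return vecs @ np.diag(vals ** -0.5) @ vecs.T

def src_cca(s, r, n_lags=30, k_eeg=11):
    """Return temporal filters H, spatial filters W, and correlations rho."""
    X = lagged(s, n_lags)
    X = X - X.mean(axis=0)
    Y = np.asarray(r, dtype=float)
    Y = Y - Y.mean(axis=0)
    T = len(X)
    Cxx, Cyy, Cxy = X.T @ X / T, Y.T @ Y / T, X.T @ Y / T
    Wx = inv_sqrt(Cxx)          # whitening of the lagged stimulus
    Wy = inv_sqrt(Cyy, k_eeg)   # truncated whitening of the EEG
    U, rho, Vt = np.linalg.svd(Wx @ Cxy @ Wy)
    n = min(n_lags, k_eeg, Y.shape[1])
    H = Wx @ U[:, :n]           # columns are temporal filters h_j
    W = Wy @ Vt.T[:, :n]        # columns are spatial filters w_j
    return H, W, rho[:n]
```

With these outputs, the total SRC is `rho.sum()`, and the component time courses are `lagged(s, n_lags) @ H[:, j]` for the stimulus and `r @ W[:, j]` for the response.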

| Alpha power analysis
To test for differences in alpha power between conditions (Figures 3 and S6), we temporally filtered the EEG response of each electrode $r_i(t)$, $i = 1, \ldots, D$, to the alpha-band (8-12 Hz) using a fourth-order Butterworth filter. We then measured the alpha power at each electrode by computing the temporal mean square of the filter output. Alpha power was averaged across the two trials performed by each participant prior to statistical tests.
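The alpha power measure can be sketched with SciPy as shown below. This is an illustrative example, not the study's MATLAB code: the function name `alpha_power` is hypothetical, the sampling rate is passed in as a parameter, and zero-phase filtering with `sosfiltfilt` is an assumption, since the text does not state whether the filter was applied forward-only or forward and backward.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def alpha_power(eeg, fs, band=(8.0, 12.0), order=4):
    """eeg: (samples x channels) array sampled at fs Hz.

    Returns the temporal mean square of the alpha-band signal per channel.
    Assumption: zero-phase (forward-backward) Butterworth filtering.
    """
    sos = butter(order, band, btype="bandpass", fs=fs, output="sos")
    filtered = sosfiltfilt(sos, np.asarray(eeg, dtype=float), axis=0)
    return np.mean(filtered ** 2, axis=0)
```

For each participant, the per-channel values from the two trials of a condition would then be averaged before testing.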

| Statistical testing
We tested for conditional differences in SRC, self-reported engagement, and alpha power by conducting paired, two-tailed Wilcoxon signed-rank tests on sets of n = 18 (or n = 20 for the follow-up study) samples in each condition, with each exemplar corresponding to a subject.
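A paired, two-tailed Wilcoxon signed-rank test of this kind can be run with SciPy as in the sketch below. The wrapper name `paired_test` is hypothetical. Note that `scipy.stats.wilcoxon` returns the signed-rank statistic and the p-value rather than the z-values reported in the Results, so the numbers it prints are not directly comparable to the reported z.

```python
from scipy.stats import wilcoxon

def paired_test(cond_a, cond_b):
    """Paired two-tailed Wilcoxon signed-rank test across subjects.

    cond_a, cond_b: one value per subject (e.g., total SRC in two conditions),
    in the same subject order.
    """
    res = wilcoxon(cond_a, cond_b, alternative="two-sided")
    return res.statistic, res.pvalue
```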

| Comparing spatial and temporal responses across conditions
To test for spatial and temporal differences in the visual evoked responses across conditions (Figure 4), we computed response functions separately for active play, sham play, passive viewing, and counting. In order to obtain the conditional spatial response functions, we filtered the stimulus with the first three temporal response functions, as computed over data pooled across conditions (shown in Figure 2a, second row). This yielded three distinct (filtered) versions of the optic flow, $u_j(t)$, $j = 1, \ldots, 3$. We then performed a linear regression from this filtered optic flow onto the scalp EEG, but separately for each condition. The resulting regression weights represent the strength of the evoked EEG response $a_j$ to the filtered optic-flow stimulus in each condition (Figure 4a-c). Analogously, to obtain the temporal response function for each condition, we spatially filtered the EEG with the first three CCA-derived filters (shown in Figure 2a, first row). This is equivalent to generating three virtual electrodes $v_j(t)$, $j = 1, \ldots, 3$. For each condition, we then performed a temporal linear regression from the optic-flow stimulus onto these spatially filtered neural responses. The resulting time courses represent the dynamics $h_j(t)$ of the visual evoked response in each condition (Figure 4g-i).
To test for significant differences between conditions (sham play versus passive viewing: Figure 4; sham play versus counting: Figure 6), we computed the difference of the values $a_j$ (or $h_j(t)$) between passive and sham conditions (Figure 4d-f). These differences were measured on group-averaged spatial response or temporal response functions. Statistical significance was assessed with a permutation test. A null distribution of conditional differences was generated by randomly swapping subject assignment between sham play and passive viewing (without replacement) over 1,000 random assignments. To correct for multiple comparisons, we controlled the false discovery rate at 0.05 across the 96 electrodes and 30 time points, respectively.
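The permutation test and false discovery rate correction can be sketched as follows. Several details are assumptions made for the example: the swap is implemented as a random exchange of the two condition labels within each subject, per-subject response functions are assumed to be available as rows of the input arrays, and the Benjamini-Hochberg procedure is used for FDR control (the text does not name the procedure). Function names are hypothetical.

```python
import numpy as np

def paired_permutation_pvalues(cond_a, cond_b, n_perm=1000, seed=0):
    """cond_a, cond_b: (subjects x features) arrays, e.g., features are
    electrodes or time points. Returns a two-sided p-value per feature."""
    rng = np.random.default_rng(seed)
    d = np.asarray(cond_a, dtype=float) - np.asarray(cond_b, dtype=float)
    observed = np.abs(d.mean(axis=0))          # group-average difference
    exceed = np.zeros(d.shape[1])
    for _ in range(n_perm):
        # Swapping a subject's two condition labels flips the sign of d.
        signs = rng.choice([-1.0, 1.0], size=(d.shape[0], 1))
        exceed += np.abs((signs * d).mean(axis=0)) >= observed
    return (exceed + 1) / (n_perm + 1)

def benjamini_hochberg(p, q=0.05):
    """Boolean mask of features declared significant at FDR level q."""
    p = np.asarray(p, dtype=float)
    m = p.size
    order = np.argsort(p)
    passed = p[order] <= q * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        cutoff = np.max(np.flatnonzero(passed))
        reject[order[:cutoff + 1]] = True
    return reject
```

One design consequence is worth noting: with 1,000 permutations and the add-one estimator used here, the smallest attainable p-value is about 0.001, which bounds how many features can survive correction when only a single feature differs.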

| RESULTS
We hypothesized that visual evoked responses are enhanced during stimulus-dependent motor control. To test our hypothesis while ruling out activity associated with actual movement, we informed study participants that their brain activity would be controlling a car racing video game but instead presented them with playback of a previously recorded game ("sham play"). In other trials, subjects controlled the game with keyboard presses ("active play") or passively viewed game playback ("passive viewing"; Figure 1a).
Our dependent measure was the temporal correlation between the time-varying optic flow of the video stream and the evoked brain response captured by the scalp EEG (Figure 1b). To account for the spatial diversity of the 96-channel EEG and varying response latencies, we captured multiple spatial components of the EEG and temporal components of the stimulus following the methodology developed previously (Dmochowski et al., 2017). This approach employs Canonical Correlation Analysis (CCA) to model neural responses to continuous stimuli with "temporal response functions" (Crosse et al., 2016). These evoked responses are analogous to conventional event-related potentials (ERP) but do not require the specification of discrete visual events. The CCA approach differs from multivariate regression in that it decomposes neural activity into components with their own temporal and spatial profile. We measure the overall strength of the visual evoked responses as the summed correlation measured in each component to arrive at the total stimulus-response correlation (SRC; Figure 1c).
When applied to the present data, this approach yielded several visual response components evoked by optic-flow fluctuations (Figure 2a). Notably, the strongest component was marked by a parietal topography centered at electrode CPz (centroparietal midline). The corresponding temporal response function showed a positive peak at 200 ms. The second strongest component exhibited poles over the medial frontal and medial occipital regions and showed a late temporal response with a peak at 400 ms (Figure 2a). Components 3 and 4 showed mirror symmetric spatial response functions with peak expression over right and left frontocentral electrodes, respectively. Collectively, the set of evoked response functions indicates that the visual stimulus drove neural activity over broad scalp regions and included late responses.

| Enhanced stimulus-response correlation during active and sham play
We measured the total SRC separately for each experimental condition and found a significant increase during sham play relative to passive viewing (z = 2.33, p = .02; paired, two-tailed Wilcoxon signed-rank test, n = 18 subjects; Figure 2c). Similarly, SRC was increased during active play (z = 2.33, p = .02, Figure 2c). No significant difference in SRC was found between active and sham play (z = 0.54, p = .58; Figure 2c). To compute the SRC, we employed the optic flow of the video stream because this particular feature drives the EEG more strongly than other low-level visual or auditory features (Dmochowski et al., 2017). However, similar results were obtained with temporal visual contrast (Figure S1). Namely, the spatial response functions are highly congruent (compare Figure S1a with Figure 2a), and we found a significant increase in SRC during sham play compared to passive viewing (z = 2.61, p = .008, Figure S1b), and a numerically higher SRC during active play relative to passive viewing (z = 1.48, p = .138). We therefore continued our analysis with the optic-flow feature.
Following the experiment, participants were asked to rate their engagement with the game in each condition. Analogous to the SRC measure, subjects reported higher engagement scores for active play (z = 3.44, p = 2.9 × 10⁻⁴, paired, two-tailed Wilcoxon signed-rank test, n = 18 subjects) and sham play (z = 2.13, p = .031) relative to passive viewing (Figure 2d). No significant difference in self-reported engagement was observed between active and sham play (z = 1.17, p = .24; Figure 2d).
After completing the post-experiment survey, subjects were informed of the deception in the sham play and were asked whether they had become aware of the fact that their brain activity was not controlling game play. Of the 18 study participants, 13 reported being deceived for the entirety of the experiment. The remaining five subjects noticed the sham, although not immediately. Interestingly, the SRC measured in the deceived participants was significantly higher than that measured in the non-deceived subjects, but only during the sham play condition (Figure S2a; p = .024, one-tailed Wilcoxon rank sum test). This was mirrored by a corresponding increase of self-reported engagement in the deceived subjects during sham play (Figure S2b; p = .026, one-tailed Wilcoxon rank sum test).

FIGURE 1
Measuring visual evoked responses with and without motor engagement. a, Study participants experienced a car racing video game under several conditions: manual control ("active play"), viewing but under the false belief that brain activity was controlling game play ("sham play"), and knowingly viewing game playback ("passive viewing"). In a follow-up study, we asked subjects to count screen items while viewing playback ("count viewing") to control for the possible effects of increased attention. b, Throughout the experiment, we recorded the video stream as well as the evoked scalp EEG. c, The strength of visual evoked responses was assessed by measuring the temporal correlation between the overall optic flow of the video stream and the time-locked neural response. To account for varying response latencies and multiple recording electrodes, we formed multiple spatial components of the EEG and temporal components of the stimulus using Canonical Correlation Analysis (Cheveigné et al., 2018; Dmochowski et al., 2017). The sum of correlations across all components formed the dependent measure, which we term here the total stimulus-response correlation (SRC)

| Alpha desynchronization over motor cortex indicates motor engagement during sham play
By design, there were no overt differences in behavior between sham play and passive viewing: in both conditions, participants viewed the stimulus without performing manual actions. This prevented confounds due to motor or somatosensory evoked responses that could have been present during active play. To test whether our sham condition nevertheless engaged motor cortex, we measured the power of alpha-band (8-12 Hz) oscillations for each condition. Desynchronization of alpha activity has long been observed over the motor cortex ("mu" rhythm) when subjects perform or visualize motor actions (Pineda, 2005). Indeed, we observed a significant reduction in alpha power during both active and sham play relative to passive viewing, with the largest differences observed at bilateral central scalp locations over the motor cortex (Figure 3a-b). On the other hand, alpha power did not significantly differ between active and sham play (Figure 3b). This provides evidence that the motor system was indeed engaged during sham play.

| Sham play elicits stronger late evoked responses over parietal cortex
Thus far, we pooled the data from all conditions in order to form a common set of evoked response components and only evaluated differences in total SRC. To find the origin of these differences, next we computed temporal and spatial response functions separately for each condition. The condition-specific spatial responses were computed by spatially regressing a condition-pooled stimulus component onto the unfiltered EEG of each condition. The resulting regression weights, depicted in Figure 4a-c, represent the strength of the visual evoked response in each condition. Condition-specific temporal responses were constructed by temporally regressing the stimulus time course onto the spatially filtered EEG, where a common spatial filter was used in each condition. The temporal regression weights represent the latency of the evoked response.

FIGURE 2
Enhanced visual evoked responses during active and sham play. a, Spatial and temporal response functions for the four strongest components evoked by the optic flow of the video game stimulus. Time indicates the delay of the EEG evoked response relative to the stimulus presentation time. b, The SRC contributed by each component, where the four components displayed in (a) are indicated in blue. c, The total SRC was measured separately for each condition (bar height depicts mean across n = 18 subjects, while markers denote individual subjects, joined across conditions with gray lines). Passive viewing elicited significantly lower total SRC compared to active play (p = .02, n = 18, paired two-tailed Wilcoxon signed-rank test) and compared to sham play (p = .02, n = 18). No significant difference was found between active and sham play (p = .58, n = 18). d, Participants were asked to rate their engagement with the video game in each condition. Subjects reported significantly higher engagement during active play (p = 2.93 × 10⁻⁴, n = 18, paired two-tailed Wilcoxon signed-rank test) and sham play (p = .016, n = 18) relative to passive viewing. No significant difference in self-reported engagement was found between active and sham play (p = .24, n = 18)

During the active condition, subjects controlled the game with button presses, potentially evoking motor and somatosensory responses that correlate with the stimulus. Such activity would obfuscate the visual stimulus-evoked response. We therefore focused our analysis on the differences between sham play and passive viewing. The results of the comparison between active play and passive viewing were mixed (Figure S3). In the first component, passive viewing exhibited stronger responses compared to active play (Figure S3a, g). On the other hand, the second component, which exhibited activity over the frontocentral electrodes, showed a robust increase of active play relative to passive viewing (Figure S3b, h).
Comparing sham play and passive viewing, the spatial and temporal patterns of evoked responses were largely preserved across conditions (Figure 4a-c, g-i). However, we observed significant differences in the magnitude of the spatial and temporal responses. In particular, the spatial response of the first component, which peaked over the medial centroparietal electrodes, was stronger during sham play compared to passive viewing (Figure 4d). Furthermore, compared to passive viewing, the temporal responses measured during sham play were stronger between 500 and 700 ms in components 1 and 2 (Figure 4g, h). This suggests that motor engagement may amplify late visual evoked responses that are generated downstream from the primary visual cortex.

| No SRC difference between counting task and passive viewing
One interpretation of the increased SRC during sham play is that participants paid more attention to the stimulus, thus enhancing visual evoked responses. To test this hypothesis, we repeated the study with a separate cohort of n = 20 subjects, but this time also asking subjects to view a prerecorded game while counting target items that appeared in the game (n = 66 items were presented during each race), a task that required a high level of attention. This condition aimed to control for attention while removing any effects from engagement of the motor system.
In line with the initial study, we found increased SRC during both active and sham play compared to passive viewing (active versus passive: z = 2.61, p = .009; sham versus passive: z = 1.978, p = .048, n = 20, paired, two-tailed Wilcoxon signed-rank test, Figure 5a). On the other hand, there was no significant difference in SRC between passive viewing and the counting task (z = 1.12, p = .26, n = 20). Although active and sham play elicited greater SRC than the counting task, the difference fell short of reaching significance (active versus count: z = 0.89, p = .37, sham versus count: z = 1.60, p = .11). The components measured during the follow-up study showed a strong resemblance to those found in the initial experiments ( Figure S4).
Self-reported engagement scores indicated that subjects were more engaged during active play, sham play, and the counting task relative to passive viewing (active play: p = .001; sham play: p = .001; counting task: p = 9.23 × 10⁻⁴, n = 20; Figure 5b). Of the 20 participants, only 6 subjects completed the experiment believing that their brain activity was controlling the video game. We did not find a significant difference between deceived and non-deceived subjects in SRC for any of the conditions, including sham play (Figure S5). This was paralleled by a lack of difference in self-reported engagement between deceived and non-deceived subjects. This suggests that the presence of deception did not evoke greater engagement with the game.
FIGURE 3 Alpha desynchronization over motor cortex during active and sham play. a, The power of the EEG in the alpha-band (8-12 Hz) across the scalp, shown for each experimental condition. Note the greater power over left central locations during passive viewing. b, The difference in alpha power between conditions, where significant differences are indicated with "+" markers (p < .05, n = 18, paired two-tailed Wilcoxon signed-rank test, corrected for multiple comparisons over 96 electrodes by controlling the FDR at 0.05). During active and sham play, a significant decrease in alpha power was resolved over broad regions of the scalp, most notably over the left and right central electrodes. This suggests that the motor cortex was indeed engaged during sham play despite the absence of an overt motor task [Colour figure can be viewed at wileyonlinelibrary.com]
Consistent with the findings of the initial study, we found reduced alpha power over the central electrodes during active and sham play relative to passive viewing (p < .05, n = 20, corrected for multiple comparisons by controlling the false discovery rate at 0.05; Figure S6a-b). On the other hand, there were no significant differences in alpha power between passive viewing and the counting task conditions (p > .05, n = 20; Figure S6a-b).
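The electrode-wise comparisons above are corrected by controlling the false discovery rate at 0.05. The text does not name the specific procedure, so the sketch below assumes the widely used Benjamini-Hochberg step-up rule; the function name and the example p values are hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Flag tests as significant while controlling the FDR at level q.

    Assumes the Benjamini-Hochberg step-up procedure: find the largest
    rank k such that the k-th smallest p value is <= (k / m) * q, then
    declare the k smallest p values significant.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank  # keep the largest rank that passes

    significant = [False] * m
    for idx in order[:k_max]:
        significant[idx] = True
    return significant


# Hypothetical per-electrode p values (illustrative only, not study data).
p_per_electrode = [0.001, 0.008, 0.039, 0.041, 0.042,
                   0.060, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(p_per_electrode, q=0.05))
```

Because the rule is step-up, a p value that fails its own threshold is still declared significant if a larger p value passes at a higher rank.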
Finally, we probed differences in the spatial and temporal response functions between sham play and the counting task (Figure 6). Sham play exhibited a stronger spatial response at a bilateral cluster spanning frontal, central, and temporal electrodes in the third component (Figure 6f). Sham play also evoked a stronger temporal response near 250 ms in the second component (Figure 6h). Thus, although less pronounced, the differences between sham play and counting mirrored those between sham play and passive viewing (Figure 4).
FIGURE 4 Sham play elicits stronger late evoked responses over centroparietal cortex. In order to compare the spatial and temporal characteristics of the evoked EEG responses, we computed separate spatial and temporal responses for each experimental condition. a-c, Spatial response functions of the first three components, shown for sham play (top row) and passive viewing (bottom). d-f, Sham play exhibited stronger responses over centroparietal cortex in the first component. The scalp maps indicate the spatial response difference between sham play and passive viewing, where significant effects are marked with dots (corrected for multiple comparisons over 96 electrodes by controlling the FDR at 0.05, n = 18 subjects). g-i, Temporal response functions of the first three components, shown separately for sham play and passive viewing. Time indicates the delay of the evoked EEG response relative to the stimulus presentation time. Sham play exhibited significantly stronger evoked potentials at late times (400-800 ms) in components 1 and 2. Dots indicate times exhibiting a significant difference (corrected for multiple comparisons over 30 time points by controlling the FDR at 0.05, n = 18 subjects)

| DISCUSSION
The findings of this study are consistent with the view that visual evoked responses are enhanced when vision guides motor control. The employment of a sham mitigated confounds from movement and somatosensation, while a reduction of alpha-band activity indicated that the motor cortex was indeed engaged despite the lack of overt actions. By asking subjects to count screen items during viewing, we attempted to control for increased attention in the play conditions relative to passive viewing. Indeed, sham play enhanced visual responses over passive viewing, but counting did not. However, the difference between sham play and counting fell short of reaching significance. Therefore, we cannot fully rule out that heightened attention contributed to the enhanced visual responses during sham play.
The main limitation of our study is thus that we were not able to fully disentangle the effects of an engaged motor system from those of increased top-down attention. The difficulty in deceiving subjects throughout this longer experiment (due to the addition of the fourth condition) likely contributed to this: only 6 of 20 subjects were fully deceived, and these six participants did not report significantly higher engagement during sham play. As a result, it is likely that the visual evoked responses measured during sham play partially reflected brain states consistent with passive viewing. Note that despite this, we still found significant increases in the amplitudes of the spatial and temporal responses measured during sham play relative to counting (Figure 6). Another study limitation relates to possible differences in brain state between the sham play, passive viewing, and counting conditions that are separate from motor activation. For example, deceived subjects may have noticed discrepancies between the car's movement and the intended command. The unpredictability of the stimulus in this case may have evoked stronger visual responses.
The comparison between neural responses measured during active play and passive viewing yielded somewhat mixed results. As expected, the total SRC was significantly higher during active play (Figures 2, 5). However, the SRFs and TRFs did not differ in a uniform direction: in the first component, active play actually showed weaker activity compared to passive viewing, whereas in the second component active play showed stronger activity (Figure S3). This suggests that the overall stronger visual evoked response during active play was dominated by the second component, which indeed showed activity at electrodes over the motor cortex.
Visual stimuli containing objects that afford actions have been shown to increase visual spatial attention and amplify evoked responses, but only when the premotor and prefrontal cortices are activated (Handy, 2003; Humphreys et al., 2010). This implies connectivity between premotor and prefrontal regions and the visual cortex, which has been shown anatomically in the primate brain (Van Essen, 2005; Wise, 1997; Young, 1993). Here, the presence of the race car on the screen may have similarly amplified the evoked response to the optic-flow stimulus. Note, however, that the modulation of visual responses required an active motor plan, in that the same actionable stimulus did not enhance visual responses during passive viewing.
FIGURE 5 No enhancement in stimulus-response correlation during counting task. In order to control for the potential influence of attention on the observed SRC increase, we performed a follow-up study including a condition where subjects were asked to count appearances of a target item on the screen while viewing playback of a prerecorded game ("Counting"). a, Reproducing the initial study, we observed a significant increase in SRC during both active and sham play relative to passive viewing (active play: z = 2.61, p = .009; sham play: z = 1.98, p = .048, n = 20, paired two-tailed Wilcoxon signed-rank test). On the other hand, the counting task did not elicit a significant increase in SRC relative to passive viewing (z = 1.12, p = .26). Although sham play exhibited higher SRC compared to the counting task, the difference fell short of significance (z = 1.60, p = .11). Bar heights depict the mean total SRC across n = 20 subjects, and markers denote individual subjects. b, Subjects reported significantly higher engagement during active play, sham play, and the counting task compared to passive viewing (p < 9.23 × 10⁻⁴, n = 20)
Observing motor actions has been shown to generate imitative motor plans in the observer (Rizzolatti & Craighero, 2004), but the role of these motor plans has been debated (Rizzolatti, Fadiga, Gallese, & Fogassi, 1996). One account is that the function of this motor activation is to generate a prediction of future perceptual input, thus bypassing the delays of sensory processing (Wilson & Knoblich, 2005). During active and sham game play, study participants may have formed a prediction of the evolving optic-flow stimulus, consistent with increased stimulus-driven activity over the central cortex (Figure 4). This interpretation is consistent with the theory that perceived events and planned actions share a common representational domain (Prinz, 1997).
The sham and active play conditions were associated with a large reduction in alpha-band activity, particularly over the central electrodes (Figure 3). When recorded over the central electrodes, 8-12 Hz activity is known as the "mu" rhythm, reflecting the belief that this activity is distinct from sensory alpha activity (Pineda, 2005). Our finding of a modulation in mu power during active and sham play, relative to passive viewing, is consistent with the interpretation of the "mu" rhythm as reflecting the transformation between vision and action (Pineda, 2005). Sustained reductions of mu power have been found during execution of motor tasks primed by visual stimuli (Sabate, Llanos, Enriquez, & Rodriguez, 2012). It is interesting that during sham play, the reduction in mu power was left lateralized. This suggests that subjects may have primed their left motor cortex during mock neural control (active play was performed with the right hand). Although the strongest alpha reductions were expressed over central electrodes, the region of significant change included parietal and occipital electrodes, indicating that both mu and sensory alpha were modulated.
FIGURE 6 Visual evoked responses are stronger during sham play relative to counting. a-c, To probe differences in the evoked responses between sham play and counting, we computed the spatial and temporal response functions separately for each condition. The spatial responses of the first three components are shown. d-f, Sham play exhibited a stronger spatial response at a bilateral cluster of frontal, central, and temporal electrodes in the third component. g-i, Temporal response functions of the first three components, shown separately for sham play and counting. Sham play evoked a stronger temporal response near 250 ms in the second component
In general, active and sham play may have exhibited stimulus-driven neural activity along a broader portion of the brain. For example, it is possible that the optic-flow stimulus evoked correlated activity in dorsal regions downstream from striate visual cortex, such as the parietal or premotor cortices. Indeed, the strongest modulation of the evoked response, as well as alpha power, was seen over the parietal and central cortices (Figures 3 and 4). The first component was expressed over these areas (Figure 2). The posterior parietal cortex (PPC) has been shown to code motor intentions in the macaque (Gnadt & Andersen, 1988), and it is tempting to speculate that a PPC-like component tracked the visual stimulus more faithfully in the sham play condition compared to the passive state. However, a limitation of our study is the poor spatial resolution of the scalp EEG, and the associated difficulties in recovering cortical sources from observed scalp topographies. The ill-posed nature of the EEG inverse problem is exacerbated when averaging scalp topographies over multiple subjects, as was implicitly done here. A natural extension of this work is thus to replicate the experiment with fMRI to glean insight into the brain areas driving the enhancement of visual evoked responses. However, note that the high temporal resolution of the EEG allowed us to measure fast evoked responses to the dynamic stimulus, which may not be feasible with fMRI due to the slowness of the hemodynamic response to neural activation.
The SRC approach taken here (Cheveigné et al., 2018; Dmochowski et al., 2017) permitted the capture of continuous visual evoked responses during a sensorimotor task that more closely mimics real-world settings than conventional event-related designs that employ discrete stimuli. Moreover, we were able to capture several components of the neural response to the optic-flow stimulus. Note that in our framework, the analogs of the classical visual event-related potential (ERP) are the temporal response functions shown in Figure 2a (second row), which are entirely analogous to what is extracted with multivariate regression techniques (Crosse et al., 2016). These time courses indicate the brain's response to an impulse of optic flow. In our framework, the response is expressed over a set of electrodes as depicted by the corresponding "spatial response function" (first row in Figure 2) (Dmochowski et al., 2017). Note that while optic flow is a low-level feature of the visual stimulus, the neural response to it may be modulated by complex brain states such as anticipation, surprise, fear, or arousal. Thus, the neural activity that was measured here potentially captured more than the conventional visual evoked response. Note for example that the effect of an engaged motor cortex was to enhance late responses over central and parietal cortex. While not "visual" in the conventional sense, these evoked responses were nonetheless driven by, and thus correlated with, the dynamic visual stimulus.
Regardless of the neural mechanism underlying the enhancement of visual processing during game play, our results provide an avenue for decoding active versus passive vision from non-invasive measurements of neural activity. By measuring the correlation between neural responses and a time-varying visual stimulus, one can extract an estimate of how active the viewer is. While here we measured SRC at the scale of a 3-min trial, it can also be computed in finer time increments and tracked continuously. We speculate that there is a continuum between passive viewing and active control and that the SRC can place the subjects onto this continuum on a moment-to-moment basis. In the future, wearable devices may be equipped with various sensors for capturing environmental stimuli in real time (e.g., microphones and video cameras). Given the development of unobtrusive techniques for non-invasive sensing of neural activity (Casson, Yates, Smith, Duncan, & Rodriguez-Villegas, 2010), such as that from inside the ear canal (Looney et al., 2012), the SRC represents a natural technique for gleaning information about individual brain state in real time. For example, it may be possible to decode spatial attention (Bae & Luck, 2018) by computing the SRC separately for multiple areas of the visual field or directions of incoming sound. There is already evidence that SRC can be used to determine speech comprehension in the context of hearing aids (Iotzov et al., 2017) or capture a listener's attention (Cheveigné et al., 2018). An advantage of the SRC approach is that it is unsupervised, in that no learning procedure is required to, for example, learn patterns of neural activity that distinguish active from passive viewing.
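To illustrate what tracking the SRC in finer time increments could look like, the sketch below computes a correlation in successive windows. It is a deliberate simplification and not the pipeline used in this study: it assumes a single stimulus feature and a single EEG component that have already been filtered and aligned, whereas the total SRC reported here aggregates correlations over several components. The function names, window length, and example values are all hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def windowed_src(stimulus, response, window, step):
    """Correlate stimulus and response in windows of `window` samples,
    advancing by `step` samples, and return one value per window."""
    if len(stimulus) != len(response):
        raise ValueError("stimulus and response must have the same length")
    starts = range(0, len(stimulus) - window + 1, step)
    return [pearson(stimulus[s:s + window], response[s:s + window])
            for s in starts]


# Toy signals: the response follows the stimulus in the first window
# and runs against it in the second (values are made up).
stimulus = [0, 1, 2, 3, 0, 1, 2, 3]
response = [0, 2, 4, 6, 3, 2, 1, 0]
print(windowed_src(stimulus, response, window=4, step=4))
```

Shorter windows give finer temporal resolution at the cost of noisier correlation estimates, so the window length would need to be chosen with the noise level of the recording in mind.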
Finally, an interesting facet of this work is that we were able to deceive a substantial number of our participants. In total, 19 of 38 participants completed the experiment with the belief that their brain activity was controlling game play during trials in which they actually viewed prerecorded stimuli. It is likely that the car racing video game employed in our study elicited stereotyped manual (and imagined) responses across subjects, thus contributing to the efficacy of deception. It is also notable that sham play evoked strong neural activity over the parietal cortex (Buneo & Andersen, 2006), a region associated with visually guided movement planning and control. This suggests that such visuomotor pathways may be activated with only the perception of control. Aside from being an interesting psychological finding, this opens up new experimental paradigms for probing the brain under active scenarios.