Fluctuations in alpha and beta power provide neural states favourable for contextually relevant anticipatory processes

Cued sensory input occasionally fails to immediately follow its respective trigger. Given that our environments are rich in sensory cues, we often end up processing other contextually relevant information in the meantime. The experimental design of the present study allowed us to investigate how such temporal delays and visual interferences may impact anticipatory processes. Thirty‐four participants were trained to remember an individualised set of eight paired‐up faces. These paired‐up faces were presented pseudorandomly in sequences of unpaired face images. To keep participants engaged throughout the electroencephalography study, they were instructed to classify each face image, according to its sex, as fast as possible without compromising accuracy. We observed dissimilar modulations in alpha and beta power between the 6‐s timeframe encompassing the onsets of predictive and expected images (temporal delay block) and the 6‐s timeframe encompassing the predictive, interference and expected images (visual interference block). Furthermore, an expectation‐facilitated reduction of the face‐sensitive N170 component was only observed if an anticipated face image directly followed its corresponding predictive counterpart. This effect was no longer evident when the expected face was preceded by a distracting face image. Regardless of the block type, behavioural measures confirmed that anticipated faces were classified significantly faster and with fewer erroneous responses than faces not foretold by a predictive face. Collectively, these results demonstrate that whilst the brain continuously adjusts internal hierarchical generative models to account for temporal delays in stimulus onset and visual interferences, the higher levels, and subsequent predictions, fundamental for expectation‐facilitated behaviours remain intact.

The sensory triggers constituting our natural surroundings are exceedingly fluid and tend to change and develop rapidly. Our brains deal with this continual transmission of sensory material by filtering for relevant and dismissing irrelevant information (Ligeza et al., 2017; van Moorselaar et al., 2020). To achieve this, the current context plays a crucial role in assisting the brain to compartmentalise which sensory data to attend to and which to suppress (Kelly et al., 2006; Limanowski et al., 2020; Rihs et al., 2007; Worden et al., 2000). The contextual setting is therefore pivotal for narrowing down and selecting the most 'newsworthy' cues to focus on in order to adapt internal predictive models and behave accordingly. For instance, the sound of an ambulance siren informs us to vigilantly take in our surroundings in case we must make way for the onrushing vehicle. The notion that cues such as this evoke an array of internal predictions to optimise respective behaviours is well established (Clark, 2013; Friston, 2005). Within this predictive processing framework, predictions regarding upcoming events are derived from prior knowledge and are propagated top-down within sensory hierarchies. Incoming sensory information, on the other hand, is conveyed in a bottom-up manner. These complementary pathways are distinguishable by distinct neural signatures, whereby top-down processes are facilitated by alpha/beta frequency ranges (Arnal & Giraud, 2012; Bastos et al., 2015) and bottom-up processes by gamma frequencies (Bastos et al., 2015; also see Kaiser & Lutzenberger, 2005).
Apart from benefiting behavioural performance, having access to prior knowledge downregulates the amount of cognitive resources necessary to process a given stimulus (Blom et al., 2020; Klimesch, 2011). More precisely, the brain can draw information from these pre-activated stimulus-specific neural representations ahead of their respective afferent sensory input. Upon stimulus onset, fewer cognitive resources are required to process this already expected sensory input. In turn, neural activity such as gamma-facilitated bottom-up processes can be minimised, leading to a subsequent reduction in neural expenses (Bauer et al., 2014; also see Gordon et al., 2019). Diminished neural activity in response to a given target can, thus, be seen as a marker reflecting predictiveness. For instance, a reduction in the face-sensitive event-related potential, N170, can indicate if a face was expected and/or familiar (Johnston et al., 2016; Ran et al., 2014). A growing body of evidence has revealed that this pre-activation of already existing knowledge is associated with pre-stimulus enhancements in alpha/beta power (Brodski-Guerniero et al., 2017; Mayer et al., 2016). In a previous study, we even observed that this enhancement in low-frequency power stretched throughout the entire interstimulus interval (ISI) between a cue and its implicitly expected target (Roehe et al., 2021). In a natural setting, however, it is relatively unlikely that a cue is immediately followed by a single, specific anticipated event. More commonly, we are left to process various sensory inputs before the anticipated event occurs. Revisiting the ambulance scenario mentioned previously, sometimes a few seconds go by after first hearing the siren in which we are hastily scanning our surroundings for flashing blue lights. In these cases, we end up processing several afferent sensory inputs before glimpsing the anticipated event. To the best of our knowledge, the notion of how and to what extent contextually relevant interferences impact pre-activated expectations remains underexplored.
The central aim of the present electroencephalography (EEG) study was, therefore, to investigate how different 'interruptions', such as a delay in stimulus onset and distracting visual information, would impact the availability of cued prior knowledge and subsequent sensory predictions. Prior to the EEG study, participants were extensively trained to learn the identity of eight face images that were sorted into four customised pairs and were pseudorandomly embedded in sequences of unpaired faces. To remain engaged throughout the EEG experiment, the participants were instructed to classify all occurring images as either female or male faces. To incorporate both a delay condition and an interference condition, we adapted our previous experimental design (Roehe et al., 2021) by (i) elongating the ISI between two paired-up face images to generate a delayed temporal onset of the expected image, and (ii) inserting a contextually relevant face image in between the paired-up images to act as a visual interference. These temporal delay and visual interference blocks were presented in alternate succession.
Foremost, we expected to replicate findings indicating that expectations boost behavioural responses (Ran et al., 2014; Turk-Browne et al., 2010). In line with previous findings, we hypothesised that early access to a neural template of the anticipated face would permit bottom-up processes to be downregulated and, hence, result in a diminished N170 response for expected images (Johnston et al., 2016; Ran et al., 2014; Roehe et al., 2021). In contrast, because the interference images were contextually relevant, that is, required a specific behavioural response and invariably occurred between the cue and anticipated images, we did not expect neural signatures of distractor suppression for these images (Kelly et al., 2006). This assumption was based on previous research reporting that distractor interference was greatly reduced when their presentation was highly predictable in terms of spatial and temporal occurrences (van Moorselaar et al., 2020). Ultimately, we used time-frequency analyses to investigate to what extent a temporal delay in stimulus onset and visual interferences would influence the augmentation in pre-stimulus alpha/beta power and the predictive impact of the cueing stimulus. Succinctly, we observed that the brain shifts between prioritising neural states favourable for either top-down or bottom-up processes. These fluctuations in alpha/beta power conveyed different spectral patterns depending on whether a temporal delay in stimulus onset occurred or an interfering face image was presented in between the predictive and expected images. Thus, the brain seems to adapt internal predictive models to account for temporal delays in sensory input and visual interferences. Intriguingly, different levels within this hierarchical model seem to be fine-tuned to varying degrees so that contextually relevant predictions can continue to aid expectation-facilitated behavioural responses.

| Participants
A total of 37 participants took part in the study (31 females; 21.57 ± 3.14 years of age [mean ± SD]) after having signed informed consent based on the principles expressed in the Declaration of Helsinki. Two participants had to be excluded because of excessive movement artefacts and one further participant because of extremely delayed response times (3 SD from the mean). Consequently, the final sample size consisted of 34 participants (28 females; 21.62 ± 3.25 years of age [mean ± SD]). All participants were right-handed as assessed by the Edinburgh Handedness Inventory (Oldfield, 1971), had no history of neurological and psychiatric disorders, and reported (corrected-to-) normal visual acuity. The participants were either awarded class credits or were reimbursed (24 Euros) for their participation. The study was approved by the Ethics Committee of the University of Münster (Department of Psychology).

| Stimulus material and experimental design
The participants were presented with sequences of 20 recurring neutral face images (10 female) chosen from the Radboud Faces Database (RaFD; Langner et al., 2010). To limit the amount of eye movement, all images were scaled, using GIMP (GNU Image Manipulation Program), so that salient facial features, that is, eye and mouth regions, aligned across images (Blais et al., 2008). Eight of these images (four male and four female images) were sorted into four reoccurring pairs, covering all possible paired-up combinations. Each individual participant was assigned a unique set of four pairs which were pseudorandomly presented within sequences of randomly reoccurring unpaired images. The face images (W = 9.5 cm, H = 14 cm) were depicted individually for 500 ms in the centre of a grey background (subtending visual angles of approximately 9° vertically and 6° horizontally). The depiction of these face images was immediately followed by a 2.5-s fixation period. A single trial was, therefore, a total of 3 s in length.
The experimental design consisted of two types of blocks, each occurring once during the training session and twice during the EEG experiment. The blocks were shown in an alternating order, commencing with the first temporal delay block, and finishing with the second visual interference block. During the temporal delay blocks, each image was succeeded by an elongated fixation period of an additional 3 s. Hence, the timeframe between the onsets of two consecutive images was 6 s long (Figure 1). In contrast, the timeframe between the onsets of two face images in the visual interference blocks was only 3 s. The hidden face pairs in the visual interference blocks were, however, separated by the depiction of a randomly selected unpaired face image (interference image). Similar to the temporal delay blocks, the interval between the paired images in the visual interference blocks was, consequently, also 6 s in duration (Figure 1).
To balance the occurrences of each individual image, every face image was repeated twice during each of the four sequences making up a single temporal delay block. After each sequence, a small break of 1 min (at most) could be taken after which the next sequence would commence automatically. A longer self-determined break ensued upon completion of each block. With the additional 3-s fixation period within each trial, the temporal delay blocks took approximately 16-19 min to complete, depending on whether participants made use of the entire 1-min break after completing each sequence. For the visual interference blocks, each image was repeated three times. In this case, each unpaired image occurred twice as a random and once as an interference image in each of the four sequences of a single visual interference block. Each visual interference block lasted approximately 12-15 min. This meant that together the two temporal delay blocks comprised 64 predictive, 64 expected, and 192 random trials and the two visual interference blocks comprised 96 predictive, 96 expected, 96 interference, and 192 random trials.
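The trial counts and block durations reported above follow from the design parameters, and the arithmetic can be sketched as below. This is an illustrative bookkeeping example only: the constant and function names are hypothetical and are not taken from the study's presentation scripts. It assumes four predictive and four expected images (the four pairs), twelve unpaired images, four sequences per block and two blocks of each type in the EEG session.

```python
# Hypothetical bookkeeping sketch; names are illustrative, not from the study's code.
N_PAIRS = 4              # face pairs per participant (4 predictive + 4 expected images)
N_UNPAIRED = 12          # 20 images minus the 8 paired ones
SEQUENCES_PER_BLOCK = 4
BLOCKS_PER_TYPE = 2      # EEG session only


def trial_counts(block_type):
    """Trials per image category, summed over both blocks of one type."""
    runs = SEQUENCES_PER_BLOCK * BLOCKS_PER_TYPE
    if block_type == "temporal_delay":
        reps = 2  # every image is shown twice per sequence
        return {
            "predictive": N_PAIRS * reps * runs,
            "expected": N_PAIRS * reps * runs,
            "random": N_UNPAIRED * reps * runs,
        }
    if block_type == "visual_interference":
        # Paired images appear three times per sequence; each unpaired image
        # appears twice as a random and once as an interference image.
        return {
            "predictive": N_PAIRS * 3 * runs,
            "expected": N_PAIRS * 3 * runs,
            "interference": N_UNPAIRED * 1 * runs,
            "random": N_UNPAIRED * 2 * runs,
        }
    raise ValueError(f"unknown block type: {block_type}")


def block_minutes(block_type, onset_interval_s):
    """Pure presentation time of a single block in minutes, excluding breaks."""
    trials_per_block = sum(trial_counts(block_type).values()) // BLOCKS_PER_TYPE
    return trials_per_block * onset_interval_s / 60
```

Under these assumptions, a temporal delay block with 6-s onset intervals takes 16 min and a visual interference block with 3-s onset intervals takes 12 min of presentation time, which matches the lower bounds of the reported ranges; the upper bounds correspond to up to three 1-min breaks between sequences.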
The experiment was programmed and presented using Presentation 18.1 (Neurobehavioral Systems, San Francisco, CA, USA).

| Task
Prior to starting the experiment, the participants were shown four individual face images (two male and two female faces) which they were asked to remember, as these would each be paired-up with a specific face. The participants then engaged in a classification task, in which they were instructed to distinguish between male and female faces as fast and as accurately as possible. All images, regardless of their image category (predictive, expected, interference or random), required a behavioural response. In addition, the participants were required to learn and remember the identity of the faces which immediately followed each of the four remembered faces shown during the induction. For half of the participants, a left button press (left index finger) classified the depicted face as a female face and a right button press (right index finger) as a male face. This classification arrangement was reversed for the other half of the participants. At the end of each experiment (training and EEG), the participants had to correctly identify each of the four image pairs.

| Experimental procedure
The study took place on two consecutive days: a short behavioural training session was scheduled for the first day and the EEG experiment for the following day. The training session allowed participants to become accustomed to the task at hand and explicitly learn the identity of the paired-up faces. During the 15-min behavioural training, one temporal delay block and one visual interference block were shown, each consisting of three sequences. At the end of the training session, the participants had to correctly identify all four of their personally tailored face pairs as a prerequisite to take part in the EEG experiment the following day.
For the EEG experiment, the participants were comfortably seated in front of a response box and screen in a dimly lit, electrically shielded EEG booth. The distance between the seated participant and the screen measured approximately 80 cm. Over the duration of approximately an hour, two temporal delay and two visual interference blocks (each bearing four sequences) were shown in alternating order. Like before, at the end of the EEG experiment, the participants were asked to correctly classify the identity of their paired-up faces. Finally, a general questionnaire was administered, inquiring about participants' wakefulness and awareness.

| EEG data acquisition and preprocessing
Scalp EEG was recorded using Brain Products' actiCAP snap system, combined with the BrainVision Recorder Software (Brain Products, Gilching, Germany). Sixty-two Ag/AgCl electrodes were distributed on the cap according to the 10-20 system. Two further electrodes served as electrooculograms and were placed above and beside the right eye to account for vertical and horizontal eye movements, respectively. Electrodes at FCz and FPz served as the online reference and ground, respectively, and were disregarded from all following analyses. EEG data were recorded at a sampling rate of 1 kHz, with an applied online bandpass filter of .1-1000 Hz. Electrode impedance was kept below 10 kΩ.
FIGURE 1 Schematic illustrations of the behavioural task and the sequential pattern of face images constituting the two types of blocks. The colours of the frames mark different event categories (blue: paired images; green: elongated interstimulus interval; light grey: interfering images).
Recorded EEG data were pre-processed using MATLAB (R2017b) in combination with the EEGLAB toolbox (version 14.1.1b; Delorme & Makeig, 2004). Raw data were downsampled to 500 Hz before applying a Butterworth bandpass filter (12 dB/octave) with cutoffs at .1 Hz and 30 Hz for the ERP data and .5 and 40 Hz for the time-frequency (TF) data, respectively. For the ERP analysis, continuous data were segregated into epochs extending from −250 to 600 ms, locked to stimulus onset. To segregate continuous data into epochs of 3750 ms for the TF analysis, artificial triggers had to be added to the elongated timeframes separating each stimulus in the temporal delay block. Here, the interval was 6000 ms long instead of the 3000 ms in the visual interference block. These artificial triggers were inserted 3000 ms after each image onset, allowing all continuous data points to be separated into 3750 ms epochs extending from −250 to 3500 ms. The Gratton plug-in for EEGLAB was then applied to correct for ocular movement (Gratton et al., 1983). Noisy channels were semi-automatically inspected and interpolated if they exceeded a kurtosis criterion of 6. A total of 1% of electrodes were interpolated in the ERP datasets, whereas 3.4% of electrodes were interpolated in the TF datasets. Artefacts were removed semi-automatically with the criteria that trials were discarded if artefacts exceeded an amplitude threshold of ± 75 μV or conveyed fluctuations in voltage greater than 50 μV relative to the previous sample point. Thereafter, trials were visually inspected and removed if they contained any residual artefacts. Out of the initial 1136 trials (per participant), a mean number of 1081 trials remained in the ERP datasets and 873 trials in the TF datasets. The number of random trials was reduced to match the number of predictive, interference, and expected trials of the ERP datasets (temporal delay block: predictive trials = 61.15 ± 2.15, expected trials = 61.82 ± 2.25, and random trials = 64.12 ± .41; visual interference block: predictive trials = 93.76 ± 2.92, expected trials = 93.32 ± 2.28, random trials = 95.88 ± .69, interference trials = 93.21 ± 2.53; [mean ± SD]) and of the TF datasets (temporal delay block: predictive trials = 46.88 ± 8.24, expected trials = 51.41 ± 7.01, random trials = 63.71 ± 1.71; visual interference block: predictive trials = 78.21 ± 9.57, expected trials = 81.03 ± 9.04, random trials = 84.09 ± 10.29, interference trials = 79.79 ± 8.12). Datasets were then re-referenced to a common average.
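The two semi-automatic rejection criteria described above (an absolute amplitude threshold of 75 μV and a sample-to-sample step larger than 50 μV) can be illustrated with a minimal sketch. The study itself used EEGLAB in MATLAB; the function names and the plain-list data layout below are assumptions made purely for illustration, and the subsequent visual inspection step is not modelled.

```python
def violates_criteria(samples_uv, amp_limit=75.0, step_limit=50.0):
    """Return True if one channel of one trial (in microvolts) breaks either criterion."""
    for i, value in enumerate(samples_uv):
        # Criterion 1: absolute amplitude beyond the +/- threshold.
        if abs(value) > amp_limit:
            return True
        # Criterion 2: voltage jump relative to the previous sample point.
        if i > 0 and abs(value - samples_uv[i - 1]) > step_limit:
            return True
    return False


def keep_trials(trials):
    """Keep trials in which no channel violates a criterion.

    `trials` is assumed to be a list of trials, each a list of channels,
    each a list of voltage samples in microvolts.
    """
    return [trial for trial in trials
            if not any(violates_criteria(channel) for channel in trial)]
```

Both thresholds are treated here as strict inequalities, so a step of exactly 50 μV is retained; whether the original implementation treats the boundary the same way is not stated in the text.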

| Behavioural analysis
Behavioural data were analysed using RStudio (version 3.6.0). For each individual participant, datasets were trimmed so that reaction times (RT) 3 SD slower than the sample's mean were removed from all further analyses (number of trials removed: 5.71 ± 11.70 [mean ± SD]). For the RT analysis, any trials bearing missed or incorrect responses were also removed. The number of trials for each image category was then equalised; that is, the number of random images was reduced to match the number of predictive/interference/expected images for each block type (temporal delay block: n = 64 and visual interference block: n = 96). For the response time, an individual one-way, repeated measures analysis of variance (ANOVA) was implemented for each of the two block types. A Bonferroni correction was applied to all post hoc comparisons. Given that the data for response accuracy were not normally distributed, non-parametric Friedman tests, along with post hoc Wilcoxon signed-rank tests, were applied to analyse the percentage of correct responses.
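The RT trimming rule can be sketched as follows. The original analysis was run in R; this Python version is only an illustration, the function name is hypothetical, and it assumes that the mean and SD are computed over the set of RTs passed in (the text does not specify whether the reference mean is per participant or across the sample).

```python
from statistics import mean, stdev


def trim_slow_rts(rts_ms, n_sd=3.0):
    """Drop reaction times more than `n_sd` sample SDs slower than the mean."""
    cutoff = mean(rts_ms) + n_sd * stdev(rts_ms)
    return [rt for rt in rts_ms if rt <= cutoff]
```

Only the slow tail is trimmed, in line with the wording "3 SD slower than the sample's mean"; fast responses are left untouched in this sketch.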

| ERP analysis
The ERP datasets were averaged across each image category of interest (temporal delay: expected and random; visual interference: expected, interference and random). Given that the ERP analysis was conducted to investigate the predictability of the different image types, the predictive images were disregarded because of their confounding informative, cue-like nature.
To measure the mean amplitude of the N170, the mean voltage within the timeframe of 140-180 ms over electrodes P5/P6 and P7/P8 was calculated relative to the 250 ms pre-stimulus baseline (Ran et al., 2014). The mean amplitude was entered into a 2 × 2 repeated measures ANOVA for the temporal delay block with factors hemisphere (left and right) and image type (expected and random). For the visual interference block, a 2 × 3 repeated measures ANOVA was applied with factors hemisphere (left and right) and image type (expected, interference and random). Where applicable, the degrees of freedom of the F-ratio were amended according to the Greenhouse-Geisser method.
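A baseline-corrected mean amplitude of the kind described above can be sketched as below. The function name and the list-based inputs are hypothetical; the sketch assumes a single averaged waveform (for example, the mean over P5 and P7 for the left hemisphere) with time stamps in milliseconds relative to stimulus onset, and treats the baseline as the interval from −250 ms up to, but not including, 0 ms.

```python
def mean_amplitude(times_ms, volts_uv, window=(140, 180), baseline=(-250, 0)):
    """Mean voltage in `window` minus mean voltage in the pre-stimulus `baseline`."""
    in_window = [v for t, v in zip(times_ms, volts_uv)
                 if window[0] <= t <= window[1]]
    in_baseline = [v for t, v in zip(times_ms, volts_uv)
                   if baseline[0] <= t < baseline[1]]
    return sum(in_window) / len(in_window) - sum(in_baseline) / len(in_baseline)
```

Applying this per participant, hemisphere and image type would yield the cell values entered into the repeated measures ANOVAs.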

| Time-frequency analysis
Spectral analyses were performed using MATLAB (R2020b) and the FieldTrip toolbox (Oostenveld et al., 2011). To estimate spectral power, a fast Fourier transform approach was applied to averaged trials. Here a Hanning taper was used for our frequencies of interest (2-30 Hz), using a 500-ms long sliding window which moved in fixed steps of 50 ms and 1 Hz increments.
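The sliding-window, Hanning-tapered estimate can be illustrated with a small standard-library sketch. It is not the FieldTrip implementation used in the study: the function names are hypothetical, power is computed with a direct discrete Fourier sum at each requested frequency rather than an FFT, and no normalisation of the power values is applied.

```python
import cmath
import math


def hann(n):
    """Symmetric Hann (Hanning) taper of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def sliding_power(signal, fs, freqs, win_s=0.5, step_s=0.05):
    """Hann-tapered power per window and frequency.

    Returns (centres_s, power), where power[w][k] is the squared magnitude of
    the tapered Fourier coefficient of window w at freqs[k], and centres_s[w]
    is the window centre in seconds from the start of `signal`.
    """
    n_win = int(round(win_s * fs))
    n_step = int(round(step_s * fs))
    taper = hann(n_win)
    centres, power = [], []
    for start in range(0, len(signal) - n_win + 1, n_step):
        segment = [x * w for x, w in zip(signal[start:start + n_win], taper)]
        row = []
        for f in freqs:
            coef = sum(x * cmath.exp(-2j * math.pi * f * i / fs)
                       for i, x in enumerate(segment))
            row.append(abs(coef) ** 2)
        power.append(row)
        centres.append((start + n_win / 2) / fs)
    return centres, power
```

With data sampled at 500 Hz, `sliding_power(x, 500, list(range(2, 31)))` would step a 250-sample window through the signal in 25-sample increments, mirroring the 500-ms window, 50-ms step and 1-Hz frequency grid described above.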
As for the statistical analyses, cluster-based permutation tests were used to assess the differences in low-frequency power within the timeframe extending from 0 to 3000 ms (locked to stimulus onset) between the different image conditions within each block type. First, time-frequency power was normalised by calculating the raw differences in power estimates between the predefined contrasts of interest, that is, normalised difference (expected vs. random) = (X − Y)/(X + Y) (Spaak et al., 2016). These normalised data of our predefined region of interest (O1/Oz/O2/PO7/PO8/PO4/PO3/POz) were then used for all statistical analyses and to generate time-frequency representations (Houshmand Chatroudi et al., 2021). For each individual contrast, this normalised power was then subjected to Monte Carlo randomisations using dependent sample t-tests and k = 1000 permutations. Differences in power were deemed significant when a cluster size exceeded the threshold (95th quantile) of the permuted sample. Applying the cluster-based method to the permutation tests offers a robust way to control the family-wise error rate associated with multiple comparisons (Cohen, 2014).
A data-driven, within-participants correlation was carried out to assess the relationship between the coexisting clusters, marking the immediate onset of the expected (in comparison to random) faces of the visual interference block (Figure 4i-j). For each participant (n = 34), the normalised power of the two observed clusters was averaged over channels, frequencies and time. The averaged power of the beta suppression was then correlated with the averaged power of the alpha enhancement.
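The logic of the normalised contrast and of a cluster-based permutation test can be sketched in simplified form. The example below is a deliberately reduced, one-dimensional illustration and not the FieldTrip procedure used in the study: clusters are formed only along a single axis (for example, time), the permutation scheme is a random sign flip of each participant's normalised difference, the cluster statistic is the summed t-value, and the cluster-forming threshold `t_crit` is a placeholder that the caller must choose. All function names are hypothetical.

```python
import math
import random


def normalised_difference(x, y):
    """(X - Y) / (X + Y) for two positive power estimates; bounded by -1 and 1."""
    return (x - y) / (x + y)


def t_values(data):
    """One-sample t-value against zero per point; data[participant][point]."""
    n = len(data)
    out = []
    for column in zip(*data):
        m = sum(column) / n
        var = sum((v - m) ** 2 for v in column) / (n - 1)
        out.append(m / math.sqrt(var / n) if var > 0 else 0.0)
    return out


def max_cluster_stat(tvals, t_crit):
    """Largest absolute summed t over runs of adjacent, same-sign supra-threshold points."""
    best, run, sign = 0.0, 0.0, 0
    for t in tvals:
        s = (t > t_crit) - (t < -t_crit)  # +1, -1 or 0
        if s != 0 and s == sign:
            run += t
        elif s != 0:
            run, sign = t, s
        else:
            run, sign = 0.0, 0
        best = max(best, abs(run))
    return best


def cluster_permutation_p(data, t_crit=2.0, n_perm=1000, seed=0):
    """Monte Carlo p-value of the largest observed cluster under sign flipping."""
    rng = random.Random(seed)
    observed = max_cluster_stat(t_values(data), t_crit)
    exceed = 0
    for _ in range(n_perm):
        flipped = []
        for row in data:
            s = rng.choice((1, -1))  # flip a participant's whole contrast
            flipped.append([v * s for v in row])
        if max_cluster_stat(t_values(flipped), t_crit) >= observed:
            exceed += 1
    return exceed / n_perm
```

Testing the normalised difference against zero with a one-sample t-test is equivalent to a dependent-samples t-test on the contrast. Because only the largest cluster statistic of each permutation enters the null distribution, a single comparison against that distribution controls the family-wise error rate across all points.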
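The correlation between the two per-participant cluster averages can be sketched with a rank correlation. The results section reports Spearman's rho for this analysis; the implementation below is a minimal standard-library illustration with hypothetical function names, and it omits the p-value and confidence interval.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))


def spearman_rho(a, b):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(a), average_ranks(b))
```

Here `a` would hold each participant's averaged beta-cluster power and `b` the corresponding averaged alpha-cluster power, one value per participant.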

| Performance on classification task
The behavioural performance of the classification task was assessed in terms of response time and accuracy. Here we hypothesised that participants would respond and classify images faster and more accurately when the identity of the upcoming face was predictable. For the response time, both one-way, repeated measures ANOVAs yielded a main effect for image types (temporal delay block: F[1.24, 40.97 […] 2). The remaining comparisons showed no substantial differences in accuracy between the image types (p > .05). A Bonferroni correction was applied to all post hoc comparisons.
Hence, a significant reduction in the N170 was only observed when a predictable image immediately followed its corresponding cue.

| Ongoing modulations of pre-and post-stimulus alpha/beta power
In a previous study, we observed that alpha/beta power enhancements, extending from the onset of the predictive until the onset of the expected image, revealed an optimal state that prioritised top-down processes (Roehe et al., 2021). Based on these findings, we now hypothesised that a similar alpha/beta enhancement should be evident within the pre-stimulus timeframe prior to an expected image, regardless of the block type. However, both temporal delays and visual interferences may impede an elongated enhancement in alpha/beta power that extends throughout the entire period, stretching from the onset of the cue until the presentation of the expected image.
For the temporal delay block, the entire 6-s timeframe was analysed as two succeeding 3-s timeframes and contrasted with the 6-s timeframe between two random images. In the first 3-s timeframe, a distinctive negative cluster, encompassing both alpha and beta frequencies, marked the approximate timeframe of the behavioural responses classifying predictive images (p < .001; Figure 4a). Considering the substantial differences in power marked by the first cluster and its subsequent impact on rendering smaller differences non-significant (p = .072), we readjusted the analysed time window so that neural activity occurring during the period at which most behavioural responses occurred was disregarded (0-1000 ms, locked to stimulus onset). This decision was driven by the primary interest to examine neural correlates of face-related expectations rather than those underlying behavioural responses. Analysing the timeframe from 1000 to 3000 ms post-stimulus onset (predictive > random) consequently yielded a significant enhancement in alpha/low beta power for predictive images (relative to random images; p = .046; Figure 4d). The second timeframe showed a late negative cluster just prior to the onset of the expected/random image (p < .001; Figure 4b). A very similar pattern in spectral power was observed for the visual interference block. Here, the onset of the predictive and interference images also evoked an alpha/beta suppression, relative to random images, at the time corresponding to the behavioural responses (predictive: p = .002; interference: p = .002; Figure 4e, f). After resizing the time window to extend from 1000 to 3000 ms post-stimulus onset, significant enhancements in alpha/beta power succeeded the negative clusters (predictive > random: p = .047; interference > random: p = .046; Figure 4h, i). Intriguingly, whilst the alpha/beta enhancement after the onset of the predictive image ended approximately 500 ms before the onset of the interference image, the enhancement after the onset of the interference image remained until the presentation of the expected image. Similar to the temporal delay block, the onset of the expected images was met by a suppression of largely beta power that commenced more than 1000 ms prior to stimulus onset (Figure 4i). Ultimately, we analysed the relationship of the two concurrent clusters leading up to the onset of the expected images (Figure 4i). Interestingly, we observed a correlation between the suppression of high beta and the coinciding augmentation of alpha/low beta activity (Spearman's rho = .36, p = .035, 95% CI [.03, .63]; Figure 4j).
FIGURE 3 N170 amplitudes and topographies for each image type of the two experimental conditions. (a) Grand average waveforms for expected and random images of the temporal delay condition pooled across hemispheres (P5/6 and P7/8). The red line indicates the significant difference in mean amplitude observed in the temporal delay condition. The scalp topographies extend from 140 to 180 ms in 10 ms increments, depicting the period analysed for the N170 component. (b) Grand average waveforms for expected, interference, and random images of the visual interference condition pooled across hemispheres. Again, the scalp topographies extend from 140 to 180 ms, depicting the period analysed for the N170 component.
FIGURE 4 Time-frequency representations (TFRs) of dissimilarities in spectral power amongst the different image conditions of each block type. (a-d) For the temporal delay block, the timeframe depicted stretches from the onset of the predictive image to the onset of the expected image. The elongated interstimulus interval, which is unique to the temporal delay condition, is illustrated in subplot b. (e-i) For the visual interference block, the timeframe extends from the onset of the predictive image to the onset of the expected image with the interference image occurring amid the two. In addition, the scalp map illustrated in the corners represents the predefined parieto-occipital region of interest used to create all TFRs. For all TFRs, the timeframe between the onsets of two random images was used as a comparison. Subplots, representing a smaller timeframe (1000-3000 ms), are depicted below their original 3000 ms counterparts (d, h, i). Clusters of interest, marking considerable differences in power between the chosen contrasts, are outlined in grey. The relationship between the beta suppression and alpha enhancement, reflected in the timeframe extending from 1000 ms after onset of the interference/random images (i), is illustrated in the scatter plot (j).
To ensure that these substantial enhancements in alpha/beta power (1000-3000 ms post-stimulus onset) can indeed be linked to top-down processes reflecting expectations regarding the identity of the expected images, we analysed the timeframe covering the interval between the onset of the expected and the ensuing random image (Figure 4c/g). Like the previously stated results, the onset of the expected images also reflected a substantial suppression of alpha/beta power around the time of the behavioural response, irrespective of the block type (temporal delay: p < .001; visual interference: p < .001). However, diverging from the earlier findings, no significant enhancements in alpha/beta power followed these suppressions. Instead, these suppressions seemed to persist for almost half the ISI (Figure 4c/g). In addition, data-driven scalp maps were created to further examine the topographical distribution of the alpha/beta activity within the timeframes of the observed clusters. These conveyed that enhancements in alpha/beta power were primarily located across posterior regions, whereas the alpha/beta suppressions tended to be strongest across central regions (Figure S2).
Finally, a data-driven correlation (within-participants) was implemented to analyse the relationship between the observed positive cluster (Figure 4d) and the significant N170 attenuation evident in the temporal delay block. For each participant, the magnitude of the N170 reduction was correlated with the power of the positive alpha/beta cluster. Normalised power was averaged across the parieto-occipital region of interest and the significant time and frequency points. Results revealed a non-significant relationship between the enhancement in alpha/beta power and the reduction in the N170 response (Spearman's rho = −.21, p = .229, 95% CI [−.51, .14]; see Figure S1).

| DISCUSSION
Ideally, an expected event would shortly ensue after being foretold by a cue.However, every now and then, we are left waiting for an anticipated stimulus to occur and are sometimes even faced with processing other percepts in the meanwhile.In the current study, we looked at the N170 component in combination with spectral changes in alpha and beta frequencies to investigate how such temporal delays and visual interferences impact face-related expectations.
In the timeframe leading up to the depiction of expected images (relative to random images), we observed enhancements in both alpha and low beta oscillations, suggesting increased inhibition of incoming information, followed by alpha and beta suppressions, suggesting a release from this inhibition.These fluctuations in alpha/beta power conveyed different spectral patterns depending on whether a temporal delay in stimulus onset occurred or an interfering face image was presented in between the predictive and expected images.Note, however, that the way the data were examined and analysed in the present study does not permit us to suggest that these observed fluctuations in alpha/beta power reflect a neural mechanism that gates bottom-up information per se.Instead, our results conveyed that the brain fluctuates between neural states of either alpha/ beta enhancements or suppressions to facilitate either anticipatory or feedforward processes, respectively.Moreover, we observed that time-resolved neural responses for expected images also showed dissimilar expressions depending on the block type.As such, a reduction in the N170 component was observed if the expected images followed the predictive faces, despite a relatively long temporal delay of 6 s, but vanished when preceded by a contextually relevant face image in the visual interference condition.Irrespective of the block type, behavioural measures confirmed that the identity of the cued face images was learned and could be predicted.As hypothesised, this led to a decrease in response time and an increase in overall accuracy.
4.1 | Suppressions in alpha/beta power reflect a neural state optimal for bottom-up processes
Firstly, we observed suppressions encompassing alpha and beta frequencies during the first 1000 ms after image onset for all image categories in contrast to random faces (Figure 4). Given that both the predictive and expected images are informative, albeit in different ways, the observed decline in alpha/beta power is in line with the previously established premise that tasks requiring greater engagement also evoke a greater alpha/beta power decrease (Griffiths et al., 2019; Lebar et al., 2017). Broadly, these studies build upon the notion that low-frequency oscillations, predominantly alpha, are a marker for sensory inhibition and that decreases in alpha power reflect a release from this inhibition (Klimesch, 2011). Increases in alpha power across the hemisphere contralateral to the spatial location of unattended input have, for instance, been linked to the active inhibition or gating of visual-spatial attention (Kelly et al., 2006; Rihs et al., 2007; Worden et al., 2000). In turn, posterior alpha suppressions corresponded to selective attentional deployment towards a cued hemifield (Thut et al., 2006). Under this framework, enhancements in alpha/beta power would be linked to prioritising top-down processes, whilst a decline in these frequency ranges would shift priority to bottom-up processes. Notably, these task-related decreases in alpha/beta power extend across various tasks (Lebar et al., 2017; Pfurtscheller et al., 1994), sensory modalities, such as visual and auditory (Griffiths et al., 2019; Thut et al., 2006) and somatosensory (Lebar et al., 2017), as well as various species, including humans (Griffiths et al., 2019; Pfurtscheller et al., 1994), macaques (Haegens et al., 2011) and rodents (Wiest & Nicolelis, 2003). The omnipresence of this pattern in low-frequency power, therefore, seems to hint towards a more general process beneficial for processing incoming information, rather than reflecting actual sensory information itself (Griffiths et al., 2019). In their study, Griffiths and colleagues (2019) showed that target stimuli that were contextually relevant to the task evoked an augmented alpha/beta suppression. Given that the paired-up images in the present study had to be learned and memorised prior to the EEG recording, they would have stood out from the rest of the randomly occurring images. We thus propose that the alpha and beta suppressions, associated with the predictive and expected images (relative to random images), may signal augmented contextual relevance.
Strikingly, we observed an identical alpha/beta response within a similar timeframe (~0-1000 ms) for interfering images (Figure 4f). One may question the sizable difference in alpha/beta power between interference and random images considering that the interference images are effectively just arbitrarily selected random images. Here we propose that the interfering images become an unintentional temporal cue for the onset of the expected images (Xu et al., 2021). Having learned the sequential structure of the visual interference block, participants had the opportunity to anticipate that the expected image would appear 3 s after the onset of the interference image. Thus, although the interference images show little to no difference in behavioural and time-resolved neural responses when compared to random images (Figures 2 and 3b), their temporal position seems to be contextually relevant for predicting the imminent onset of anticipated faces. Given the distinctive attributes of predictive, expected and interference images, the shared decline in alpha/beta power within the first second after stimulus onset is consistent with the suggestion that this spectral modulation reflects a more generic neural state which boosts the ability to process contextually relevant information (Griffiths et al., 2019).

4.2 | Enhancements in alpha/beta power reflect a neural state optimal for top-down processes
Extending the proposition raised above, an enhancement in alpha/beta power may reflect a contrasting neural state which is optimal for top-down processes, such as the activation of prior knowledge (Brodski-Guerniero et al., 2017; Mayer et al., 2016) and the gating of the gain and precision of neural communication (Lebar et al., 2017; Limanowski et al., 2020). Our findings revealed momentary enhancements in alpha/beta power within the timeframe ranging from the onset of the predictive to the onset of the expected image (in contrast to random images; Figure 4d/h-i). Here we put forth the notion that, like the alpha/beta suppressions, these enhancements do not, in fact, carry specific information regarding the anticipated face but, instead, create a neural condition that is favourable for top-down processes. With this interpretation, we intend to unify some of the widely held theories regarding the role of alpha/beta frequencies. Brodski-Guerniero et al. (2017), for instance, reported that alpha and beta frequencies index the activation of prior face-related knowledge. More specifically, they observed that alpha/beta frequencies correlated with the amount of activated prior knowledge in face-specific brain regions, such as the fusiform face area (FFA). Likewise, Mayer et al. (2016) showed that an increase in pre-stimulus alpha power was associated with the activation of former knowledge about previously observed letters. These observations fit well with the notion that a neural state ideal for top-down processes could promote access to activated neural representations whilst suppressing other less relevant external input. As such, fluctuations of alpha and beta power appear to reflect continuous shifts in gating inhibition of bottom-up processes whilst systematically giving rise to top-down processes. This suggestion ties in with the proposal that alpha/beta power is associated with gating neural communication (Lebar et al., 2017; Limanowski et al., 2020). Findings of these two studies revealed that beta power in occipital regions decreased when vision was task relevant and increased when visual input was ignored. Therefore, these studies suggest that modulations of beta power gate the extent to which a particular visual stimulus is processed at a given moment. The foregoing argument assumes that there is a systematic relationship between beta and alpha power. In other words, one would expect an increase in beta power to be accompanied by an increase in alpha power. In turn, the established neural state would be favourable for the facilitation of top-down processes, that is, the activation of prior knowledge. Our results support such a relationship by revealing a correlation between the co-occurring beta suppression and alpha enhancement just prior to the presentation of expected (relative to random) images in the visual interference block (Figure 4j). These findings would suggest that the beta suppression restricts top-down processes, yielding confined anticipatory processes. Arguably, this could indicate that in the visual interference blocks the pre-activated neural representation of the upcoming expected image is suppressed or dismissed in order to process the contextually relevant interfering face without bias (Blom et al., 2020). Hence, it appears that the underlying predictive hierarchical model is modified to account for the contextually relevant visual interferences.
On a related note, this spectral pattern of a coexisting beta suppression and alpha power enhancement was not observed in the temporal delay block. Instead, the onset of the expected (relative to random) image was met by a drawn-out suppression in both alpha and beta power (Figure 4b). Here we propose that the delayed onset of the anticipated image provided an elongated timeframe in which the brain is left 'waiting' for this upcoming event. Notably, to minimise confounds of surprise, the duration of the delay in stimulus onset was kept consistent throughout the temporal delay blocks. That is, the participants could predict, as a result of statistical learning, that an upcoming image would be depicted 6 s after the onset of its precursor. Thus, rather than keeping the representation of the predicted image active throughout the 6-s interval, which could be relatively taxing, the brain efficiently shifts between neural states optimal for the current contextual setting. Namely, our results suggest that a temporary facilitation of relevant top-down processes shortly after the offset of the predictive images would suffice for the present context (Figure 4d), before shifting to a neural state optimal for visual bottom-up and action-related processes in preparation for the upcoming sensory input (Figure 4b). For a visual representation of the topographical distribution of the alpha/beta enhancements and attenuations, please refer to the scalp maps in Figure S2.
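To make the reasoning about co-occurring alpha enhancements and beta suppressions more concrete, the following minimal Python sketch shows one way in which power changes relative to random images could be expressed and related across participants. It is an illustration under stated assumptions and not the analysis pipeline of the present study: the decibel normalisation, the function names `relative_change_db` and `pearson_r`, and all numerical values are hypothetical and chosen purely for demonstration.

```python
import math


def relative_change_db(power_condition, power_reference):
    """Power of one condition relative to a reference condition, in decibels.

    Negative values indicate a suppression, positive values an enhancement.
    The decibel scale is an assumption made for this sketch.
    """
    return 10.0 * math.log10(power_condition / power_reference)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length (>= 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Toy band power values for four hypothetical participants (arbitrary units).
# These numbers are invented for illustration and are NOT data from the study.
alpha_expected = [1.2, 1.5, 1.1, 1.4]
alpha_random = [1.0, 1.0, 1.0, 1.0]
beta_expected = [0.9, 0.7, 0.95, 0.8]
beta_random = [1.0, 1.0, 1.0, 1.0]

# Power change for expected relative to random images, per participant.
alpha_change = [relative_change_db(e, r) for e, r in zip(alpha_expected, alpha_random)]
beta_change = [relative_change_db(e, r) for e, r in zip(beta_expected, beta_random)]

# Relationship between the alpha and beta changes across participants.
r_alpha_beta = pearson_r(alpha_change, beta_change)
print(f"alpha change (dB): {alpha_change}")
print(f"beta change (dB): {beta_change}")
print(f"Pearson r between alpha and beta change: {r_alpha_beta:.2f}")
```

In this toy example, alpha power is enhanced and beta power is suppressed for expected relative to random images, and the correlation coefficient summarises how closely the two changes covary across the hypothetical participants. The sign and size of the coefficient obtained here follow from the invented values only.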

4.3 | Neural and behavioural signatures of anticipatory processes
Foremost, we observed that behavioural responses were significantly faster for anticipated faces compared to faces which were not foretold by a corresponding predictive face (Figure 2). In addition, significantly fewer errors were made when classifying a predicted in comparison to an unpredicted face. These anticipation-facilitated behavioural responses were evident for both the temporal delay and visual interference conditions. In both cases, prior knowledge could be relied upon to guide precise top-down predictions regarding the expected stimulus. Having explicitly learned the identity of the expected faces, the participants were less likely to accidentally misclassify the sex of these face images. In terms of speed, behavioural responses could be prepared ahead of the depiction of the anticipated stimulus, resulting in an accelerated behavioural response. In line with previous studies, these behavioural effects confirm that top-down activity, such as explicit or implicit expectations, boosts behavioural responses (Ran et al., 2014; Turk-Browne et al., 2010).
Likewise, cued expectations have also been shown to influence time-resolved neural responses (Johnston et al., 2016; Ran et al., 2014). As hypothesised, we observed that the face-sensitive N170 was significantly diminished (reduced in negativity) for the four predictable face images of the temporal delay block (Table 1 and Figure 3a). Notably, this reduction in response to predictable images did not reach significance in the visual interference block (Figure 3b). Previous studies have shown that the N170 diminishes for contiguous depictions of the same face (Caharel et al., 2009; Campanella et al., 2000; Ran et al., 2014). In these cases, the neural correlate of the identity of a particular face was available to be drawn upon to aid visual processing of the succeeding image. As such, fewer cognitive resources were required to process and respond to these predictable faces. In turn, this would be reflected in a diminished neural response, such as a reduction in the N170. The design of our temporal delay block allowed predictive images to pre-activate a representation of the expected images which, in a top-down fashion, would be available to facilitate early processing of the directly ensuing expected image. In contrast, given that in the visual interference block the predictive and expected images were separated by an interfering image, the cue-triggered neural representation of the expected image might be temporarily overwritten when the new sensory information of the interfering face becomes available. This is all the more likely considering that, unlike in other visual 'distractor' paradigms, the interference images in the current study were relevant to the task at hand and required visual processing. This reasoning is supported by findings showing that the brain regions involved in actively upholding face-related neural templates are also the regions processing this context-specific information (Brodski-Guerniero et al., 2017). Contextual relevance, thus, appears to play a fundamental role in selecting the most efficient forthcoming neural state. This line of thought is based on a related suggestion that when multiple target representations are active simultaneously, trial-by-trial changes in environmental context play a considerable role in regulating the attentional weight attributed to each of them (van Driel et al., 2019). The fluctuations in alpha and beta power described above provide confirmation that a goal-directed shift in neural states takes place between top-down and bottom-up processes throughout both block types. In addition, the interplay between the coexisting alpha enhancement and beta suppression in the visual interference condition provides an explanation as to why we observed significant expectation-facilitated behavioural responses but no substantial differences in the N170 component between expected and random images. As mentioned previously, the beta suppression seems to restrict anticipatory processes (Figure 4i). Hence, it appears that lower sensory levels within the predictive hierarchical model are fine-tuned to account for the contextually relevant visual interferences whilst higher levels within this hierarchical predictive model remain stable (Long & Kuhl, 2018). Long and Kuhl (2018), for instance, showed that visual interruptions, in the form of scrambled facial features, predominantly influenced representations within the visual systems. Switches in goal or task relevance, on the other hand, influenced representations higher up in the cognitive hierarchy and involved frontoparietal networks. Similarly, the interfering faces in the present study seem to impede a sustained maintenance of the anticipated face's sensory template. Thus, the predictive model no longer provided a valuable source from which to draw identity-related expectations ahead of the subsequent afferent sensory input, resulting in a non-significant expectation-facilitated N170 response in the visual interference condition. Access to higher cortical levels, representing for instance the learned associations between expected images and their corresponding behavioural responses, would, however, continue to enhance the propagation of specific behavioural predictions despite the lack of an actively maintained face-related representation. As such, lower levels representing face-related templates could be overwritten by incoming visual information without compromising expectation-facilitated behavioural measures. This supports the notion that once a visual input and its corresponding behavioural response have been encoded into working memory, the associated action response can be prepared without waiting for the visual representation to be retrieved first (van Ede et al., 2019). This opens an exciting line of further investigation which could combine M/EEG and multivariate pattern analysis (Barne et al., 2020; Blom et al., 2020) to decode the amount of face-specific information activated ahead of its afferent sensory input after having attended to visual interferences.
Lastly, we corroborated that the observed enhancements in alpha/beta power are indeed neural signatures of a generic neural state that allows face-related neural patterns to emerge, rather than representing face-specific information per se. This interpretation was drawn from the data-driven correlation analysis, which did not provide evidence that the enhancement in alpha/beta power for expected relative to random images (cluster in timeframe predictive > random; Figure 4d) correlated with the significant reduction of the N170 obtained in the temporal delay condition (Figure S1). A question that remains, however, is this: if these enhancements in power signal a general neural state that boosts top-down processes, which neural signatures carry the actual stimulus-specific information? Griffiths et al. (2019) put forth the notion that since the phase and power of a given oscillation are mathematically distinct, they may also serve independent functions. This suggestion is consistent with previous findings revealing that the phase of low-frequency oscillations (~8 Hz) appears to carry information about a given stimulus (Michelmann et al., 2016). Note, however, that more evidence is required to conclusively attribute distinctive, yet complementary, neural purposes to these two oscillation components.
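The statement that the phase and power of an oscillation are mathematically distinct can be illustrated with a short Python sketch. It assumes idealised, noise-free sinusoids at 8 Hz sampled at 256 Hz for 1 s; the helper names `sinusoid`, `fourier_coefficient` and `power_and_phase` are hypothetical and introduced only for this illustration. The complex Fourier coefficient at the frequency of interest can be written as A·exp(iφ), where the power is A² and the phase is φ, so that either quantity can change whilst the other stays fixed.

```python
import cmath
import math


def sinusoid(amplitude, phase_rad, freq_hz, sample_rate_hz, duration_s):
    """Sampled cosine with a given amplitude and phase offset."""
    n = int(round(sample_rate_hz * duration_s))
    return [
        amplitude * math.cos(2 * math.pi * freq_hz * k / sample_rate_hz + phase_rad)
        for k in range(n)
    ]


def fourier_coefficient(signal, freq_hz, sample_rate_hz):
    """Complex Fourier coefficient of `signal` at a single frequency.

    Scaled by 2/n so that a cosine of amplitude A and phase phi, sampled over
    a whole number of cycles, yields A * exp(i * phi).
    """
    n = len(signal)
    total = sum(
        x * cmath.exp(-2j * math.pi * freq_hz * k / sample_rate_hz)
        for k, x in enumerate(signal)
    )
    return 2 * total / n


def power_and_phase(coefficient):
    """Split a complex coefficient into power (A squared) and phase (radians)."""
    return abs(coefficient) ** 2, cmath.phase(coefficient)


FREQ_HZ = 8.0          # assumed frequency of interest
SAMPLE_RATE_HZ = 256.0  # assumed sampling rate
DURATION_S = 1.0        # whole number of 8 Hz cycles

# Same amplitude, different phase: power is unchanged, phase differs.
sig_a = sinusoid(1.0, 0.0, FREQ_HZ, SAMPLE_RATE_HZ, DURATION_S)
sig_b = sinusoid(1.0, math.pi / 2, FREQ_HZ, SAMPLE_RATE_HZ, DURATION_S)
# Same phase as sig_a, larger amplitude: phase is unchanged, power differs.
sig_c = sinusoid(2.0, 0.0, FREQ_HZ, SAMPLE_RATE_HZ, DURATION_S)

power_a, phase_a = power_and_phase(fourier_coefficient(sig_a, FREQ_HZ, SAMPLE_RATE_HZ))
power_b, phase_b = power_and_phase(fourier_coefficient(sig_b, FREQ_HZ, SAMPLE_RATE_HZ))
power_c, phase_c = power_and_phase(fourier_coefficient(sig_c, FREQ_HZ, SAMPLE_RATE_HZ))

print(f"a: power={power_a:.3f}, phase={phase_a:.3f} rad")
print(f"b: power={power_b:.3f}, phase={phase_b:.3f} rad")
print(f"c: power={power_c:.3f}, phase={phase_c:.3f} rad")
```

Signals a and b differ only in phase and signals a and c differ only in power, which is the sense in which the two quantities could, in principle, carry independent information. Real M/EEG data are noisy and non-stationary, so time-resolved estimates would typically rely on wavelet or Hilbert-based methods rather than on this single-frequency sketch.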

5 | CONCLUSION
In summary, we obtained novel findings which demonstrated that the brain shifts between neural states to optimise hierarchical predictive models and subsequent contextually relevant anticipatory processes. In both the temporal delay and visual interference blocks, we found indications of a neural state beneficial for top-down processes, that is, granting early access to cued neural representations. Nevertheless, if the onset of the anticipated face was interrupted by the depiction of a distracting yet relevant image, priority shifted to processing the interfering visual input before giving restricted access to contextually relevant properties of the formerly cued neural representation (in the current study: the gender/sex of the expected face). In line with a growing body of literature, these fluctuating shifts boosting access to either internal representations or external stimulus-specific information were mediated by modulations of alpha and beta power (Benwell et al., 2021; Brodski-Guerniero et al., 2017; Griffiths et al., 2019; Lebar et al., 2017; Limanowski et al., 2020; van Moorselaar et al., 2020). Our observations suggest that lower sensory levels within these predictive models are continuously revised, allowing us to constantly adapt to the fluidity of our surroundings. Notably, neither a temporal delay in stimulus onset nor visual interferences negatively impacted expectation-facilitated behavioural responses. The brain, thus, appears to fine-tune different levels within the hierarchical predictive model to different degrees. Whilst lower levels are revised and overwritten to allow us to have the most contextually adequate representation of our external environment at a given moment, higher levels appear to remain intact to aid higher cognitive functions. Ultimately, our findings fit neatly within the predictive processing framework by corroborating that the brain continuously adapts internal predictive architectures, and subsequent predictions, to optimise contextually relevant behaviours.

FIGURE 2 Behavioural performance for each type of image present in the two blocks. Significant differences (α ≤ .05) in response time (ms) and accuracy (%) are marked accordingly (*).