Is beta in agreement with the relatives? Using relative clause sentences to investigate MEG beta power dynamics during sentence comprehension

This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. © 2023 The Authors. Psychophysiology published by Wiley Periodicals LLC on behalf of Society for Psychophysiological Research. 1Neurobiology of Language Department, Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands 2Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, the Netherlands 3Academy for Leisure and Events, Breda University of Applied Sciences, Breda, the Netherlands 4Department of Cognitive Neuropsychology, School of Social and Behavioural Sciences, Tilburg University, Tilburg, the Netherlands


| INTRODUCTION
Neural oscillations have become a popular tool for uncovering various aspects of the cortical dynamics supporting language comprehension (e.g., Ding et al., 2016; Friederici & Singer, 2015; Giraud & Poeppel, 2012; Meyer, 2018; Prystauka & Lewis, 2019). Whether these neural signatures reflect domain-general systems-level processes or can instead be mapped directly onto specific cognitive functions remains an open question. When it comes to sentence comprehension, one popular proposal, the frequency-based segregation of syntactic and semantic unification hypothesis (Bastiaansen & Hagoort, 2015), links oscillatory activity in the beta frequency range to syntactic unification operations, and oscillatory activity in the gamma frequency range to semantic unification operations. Syntactic and semantic unification refer, respectively, to operations at the cognitive level that integrate lexical building blocks retrieved from memory to form more complex combinatorial representations (for a more detailed explication of syntactic and semantic unification under the memory, unification, and control (MUC) framework see Hagoort, 2005, 2017). Here we will focus on the relationship between modulations of beta power (13-30 Hz) and syntactic processing, for which there is substantial evidence (for review see Prystauka & Lewis, 2019; for alternative perspectives on the role of beta in language processing see Weiss & Mueller, 2012). Several studies have shown that beta power is higher for syntactically acceptable sentences compared to sentences containing various forms of syntactic violation (Bastiaansen et al., 2010; Davidson & Indefrey, 2007; Kielar et al., 2014, 2015, 2018; Lewis, Lemhöfer, et al., 2016; Schneider et al., 2016). Another study showed that beta power increased for long- compared to short-distance subject-verb agreement dependencies at the point in the sentences where the dependency could be resolved (Meyer et al., 2013).
For a comparison between center-embedded relative clauses and their right-branching counterparts, Bastiaansen and Hagoort (2006) reported higher beta power for the syntactically more complex center-embedded variety. These studies together suggest that disrupting syntactic processing leads to a decrease in beta power, while beta is higher when syntactic processing becomes more difficult. Extending these findings, Bastiaansen et al. (2010) showed that the level of beta power increases over the course of a sentence for syntactically acceptable sentences compared to random word lists (for a replication of this finding see Bastiaansen & Hagoort, 2015) and that for sentences containing a syntactic violation beta increases up to the point of the violating word, after which it falls back to baseline levels.
One problem for a strict mapping between beta oscillatory activity and syntactic unification operations is that not every kind of syntactic violation results in a modulation of beta power (Davidson & Indefrey, 2007; Lewis, Lemhöfer, et al., 2016; for review see Prystauka & Lewis, 2019). Moreover, beta power modulations have also been observed for experimental manipulations targeting semantics (Kielar et al., 2014, 2015, 2018; Li et al., 2017; Luo et al., 2010; Vignali et al., 2016; Wang et al., 2012), thematic role reversals (Li et al., 2014), discourse model updating in humor comprehension (Canal et al., 2019), and for disruptions of the rhythmical structure of sentences (Luo et al., 2010). Furthermore, the case of Spanish Unagreement¹ shows a decrease in beta power following a mismatching target word, even though it does not lead to a syntactically ill-formed sentence (Pérez et al., 2012).
These concerns led us to propose that during language comprehension, just as in other more domain-general contexts (Engel & Fries, 2010), oscillatory activity in the beta frequency range might be related to the maintenance or change of the current cognitive set, rather than exclusively to syntactic processing. Under this proposal, whenever the language comprehension system encounters cues in the linguistic input indicating that the representation of the sentence-level meaning (or situation model; cf. Zwaan & Radvansky, 1998) needs to be changed, we should observe a decrease in beta activity in anticipation of the necessary change in the underlying network of regions supporting that representation. Similarly, if the system expects that the current sentence-level meaning needs to be actively maintained, we should observe an increase in beta activity in order to maintain the current network configuration. This proposal (henceforth the beta-maintenance hypothesis) can account for all the evidence reviewed above, where for instance syntactic violations and semantic anomalies (as well as violations of rhythmical structure and unexpected agreement marking) act as cues to the language comprehension system, indicating that the current sentence-level meaning needs to change, and hence beta power decreases (for more extensive discussion see . From the perspective of a neural systems-level explanation, this proposal can be considered domain-general in the sense that beta performs a similar role in terms of the up/down-regulation of cortical regions and the dynamic differential engagement of relevant neural circuits, regardless of any specific function that one might want to assign to those regions or circuits. From the perspective of a cognitive level of explanation, this proposal can be considered domain-specific in the sense that beta's role in coordinating neural circuits at the systems level is now explicitly cashed out in terms of how those functions might be recruited to support the specific case of language comprehension.

¹ In Spanish Unagreement a third-person plural subject mismatches a subsequent plural verb in grammatical person, but for either first or second person verbs a grammatical parse can be recovered. This entails a shift from the default third person plural interpretation of the subject to include either the speaker (first person) or the addressee (second person) within the group indicated by the nominal subject (see Pérez et al., 2012, for examples and further explanation).
The roles for beta in language comprehension proposed by the frequency-based segregation of syntactic and semantic unification hypothesis (henceforth the beta-syntax hypothesis) and the beta-maintenance hypothesis have not yet been directly compared with one another. As pointed out by , the two theories make different predictions about how the beta activity should be modulated when the language comprehension system encounters linguistic input that is unexpected (or less expected), yet does not constitute a grammatical violation. In the present study, we compared the two theories based on this suggestion using the contrast between subject-relative and object-relative clause sentences as the critical test case.
Functional magnetic resonance imaging (fMRI) studies demonstrate consistent and reliable patterns of increased activation when comparing more and less syntactically demanding sentence structures (including the OR-SR asymmetry; Hagoort & Indefrey, 2014; Indefrey, 2012). These regions include left inferior frontal gyrus (LIFG), left posterior superior temporal gyrus (STG) and middle temporal gyrus (MTG), left angular gyrus (AG) and supramarginal gyrus (SMG), left precuneus, right posterior inferior frontal gyrus, and right posterior MTG. Investigating beta activity in these regions may provide important insights into the temporal dynamics of their differential recruitment for more syntactically demanding sentences, at the precise point in the sentence when processing becomes more demanding.
There remains some disagreement regarding exactly which factor(s) result in the OR-SR processing asymmetry, but explanations can be divided into three broad classes: (1) memory/resource-based models; (2) semantic/pragmatic models; (3) frequency-based models (for an excellent review see Gordon & Lowder, 2012). In the context of the present study, it is not so important which of these models turns out to be correct (perhaps many or all of them for different aspects of relative clause processing). More crucial for our purposes, and what is far less controversial about these types of sentences, is that at the point of disambiguation within an object-relative clause, the language comprehension system encounters an unexpected (or less expected) event. This event may indicate that some form of reanalysis is required, that more difficult memory retrieval operations will be engaged, that a less frequent sentence construction is about to be processed, or that the sentence structure implied by the input does not match a predicted structure. In all scenarios, this provides precisely the situation necessary for comparing the predictions of the beta-syntax and beta-maintenance hypotheses, namely, the linguistic input is unexpected (or less expected) based on the contextual sentence-level representation formed up to that point, but does not constitute a grammatical violation.

| The present study
In the present study, participants read Dutch relative clause sentences like those in Table 1 while their magnetoencephalogram (MEG) was recorded. The auxiliary verb at the end of the relative clause could agree in grammatical number with either the antecedent noun phrase in the matrix clause ("vader" in the examples in Table 1) or with the noun phrase within the relative clause ("zonen" in the examples in Table 1), resulting respectively in either a subject-relative (SR condition) or an object-relative (OR condition) clause reading of the sentence. Both the referent of the antecedent noun phrase and that of the relative clause-internal noun phrase were animate. Importantly, Dutch readers show a clear preference for a subject-relative reading of such sentences. The object-relative clause sentences occur less frequently according to a corpus analysis (27.97% of the time overall; 1.41% of the time when the antecedent noun phrase is animate), and they result in processing difficulties at the disambiguating auxiliary within the relative clause (for details see Mak et al., 2002). In addition to the SR and OR conditions, we included a third sentence type where the auxiliary at the end of the relative clause failed to agree in grammatical person with either of the preceding noun phrases, resulting in a grammatical violation at the end of the relative clause (AVR condition). In this way the target word (the auxiliary) in both OR and AVR sentence constructions is relatively less expected for a typical Dutch reader, but only in the AVR condition does it constitute an outright violation of the grammar. We refer to these 3 conditions as the complex relative clause (CRC) conditions because the relative clause is initially ambiguous between subject- and object-relative interpretations.
In addition to the CRC conditions we included 2 simple relative clause (SRC) conditions (Table 2) where the relative clause was unambiguously subject-relative, and the matrix clause verb directly following the relative clause could either agree in grammatical number with the subject (AGR condition) or not, which results in a grammatical violation (AV condition) outside the relative clause. This provided an additional measure of the brain's response to unexpected input, but was less affected by the degree of embeddedness or the complexity of the relative clause sentence constructions.
We performed a time-frequency analysis of power changes relative to a baseline period immediately preceding the target word (TW) of the sentences in a frequency range from 2 to 30 Hz. This allowed us in a first step to isolate beta power modulations related to grammatical violations, by comparing the combined AVR and AV conditions with the combined SR and AGR conditions. In a second step, we then isolated beta responses to unexpected, but still grammatical target words by comparing the OR condition with the SR condition.
If the beta-syntax hypothesis is correct, then beta power should be higher for the OR condition than for the SR condition (more demanding syntactic unification), while it should be lower for the AVR condition than for the SR condition (syntactic unification is disrupted). If on the other hand, the beta-maintenance hypothesis is correct, beta power should be lower for both the OR and the AVR conditions (both provide a cue to a change in sentence-level representation) compared to the SR condition. Based on similar reasoning, beta power at the TW was hypothesized to be lower in the AV compared to the AGR condition.

| Participants
Thirty native speakers of Dutch took part in the experiment, 24 of whom were included in the final analysis (3 males, 21 females; aged 18 to 35). Participants provided informed consent and were paid or equivalently rewarded with course credits for their participation. All participants reported normal or corrected-to-normal vision, and were right-handed. None of the participants reported any neurological impairment. Three participants were excluded from the final analysis due to poor performance on the comprehension questions (less than 65% correct answers overall). One further participant was excluded due to recording problems, and another 2 participants were excluded in order to balance the number of participants who were assigned to each experimental list (for lists with too many participants, those participants with the worst performance on OR comprehension questions were excluded). The study was approved by the local ethics committee (Commissie Mensgebonden Onderzoek Arnhem/Nijmegen) and carried out in accordance with the principles laid out in the Declaration of Helsinki.

| Stimulus materials
All stimuli consisted of Dutch relative clause sentences, each between 11 and 22 words long. Complex relative clause (CRC) conditions comprised subject-relative (SR), object-relative (OR), and agreement violation within the relative clause (AVR) conditions. Simple relative clause (SRC) conditions comprised agreement violation outside the relative clause (AV) and no agreement violation (AGR) conditions.
For the CRC experimental materials (Table 1), the relative clause always consisted of the relative pronoun die (English that), followed by a full noun phrase (NP), then by a prepositional phrase, then by a past participle, and finally by an auxiliary verb. The relative clause was always preceded by an antecedent NP together with some modifier, and followed by at least 3 words to complete the matrix clause of the sentence. Conditions differed in terms of whether the auxiliary at the end of the relative clause (the TW for the CRC experimental conditions) agreed in grammatical number with the antecedent NP in the matrix clause (SR condition), with the NP within the relative clause (OR condition), or did not agree in grammatical person with either preceding NP, thus resulting in a grammatical violation within the relative clause (AVR condition). The referents of both the matrix clause NP and the relative clause-internal NP were animate. Up to the point of the auxiliary (TW) in the relative clause, these sentences are identical. Crucially, up to the TW, the sentences in all three conditions are ambiguous in terms of whether they will eventually turn out to be a subject-relative clause, an object-relative clause, or whether they will constitute a grammatical agreement violation.
Two additional simple relative clause (SRC) conditions where the relative clause was unambiguously subject-relative (no NP was present within the relative clause) were also included (Table 2). The relative clause was always preceded by an antecedent NP in the matrix clause, and followed by at least 3 words to complete the main clause of the sentence. For the SRC conditions, the matrix clause verb (the TW) directly following the relative clause was inflected to either agree (AGR condition) or not agree (resulting in a grammatical violation; AV condition) in grammatical number with the subject of the sentence. These SRC conditions provide a contrast between grammatically acceptable sentences and sentences containing a grammatical violation in the relatively less complex context of unambiguously subject-relative clause sentences.
For the SR condition, 120 sentences were constructed according to the specifications just described. About a quarter of the sentences were taken directly from a self-paced reading and eye-tracking study by Mak et al. (2008), while the remainder were adapted from subject- and object-relative clause sentences used in an unpublished study. For half of the sentences the antecedent matrix clause NP was singular while the NP within the relative clause was plural, and vice versa for the other half. One hundred twenty sentences for the OR condition were constructed by switching the auxiliary in the relative clause from the SR sentences (i.e., heeft became hebben and hebben became heeft) so that it agreed in grammatical number with the relative clause-internal NP rather than with the antecedent matrix clause NP. To create grammatical person agreement violations in the 120 sentences for the AVR condition, the auxiliary within the relative clause (heeft or hebben) was replaced by the Dutch auxiliary hebt, which carries second person singular grammatical marking and therefore does not agree in person with either the matrix clause NP or the relative clause-internal NP.
For the AGR condition 80 sentences were constructed according to the specifications described above for the SRC sentences. The antecedent matrix clause NP for half the sentences was singular (and thus in order for the sentence to be grammatical so was the inflectional marking on the verb in the matrix clause) and for the other half it was plural (again with plural inflectional marking on the matrix clause verb). To create grammatical number agreement violations in the 80 sentences for the AV condition, singular matrix clause verbs from the AGR condition were replaced by verbs with plural inflectional marking, and plural matrix clause verbs were replaced by verbs with singular inflectional marking.
Participants saw 40 sentences from each of the conditions over the course of the experiment. Which of the 120 sentences from each of the CRC conditions and which of the 80 sentences from each of the SRC conditions were presented was separately counterbalanced across participants, such that participants never saw the same sentence more than once throughout the experiment. Across participants, all CRC sentences appeared equally often in each of the three CRC conditions and all SRC sentences appeared equally often in each of the two SRC conditions. Half of the sentences presented from each condition had a singular antecedent matrix clause NP and plural relative clause-internal NP, and vice versa for the other half. Resulting experimental lists were then pseudorandomized according to the following criteria: (1) no more than two consecutive presentations of a sentence from the same experimental condition; (2) repetition of a sequence of 5 or more sentences from any particular sequence of conditions was avoided.
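The first pseudorandomization criterion can be illustrated with a minimal sketch. This is not the procedure actually used to build the experimental lists, which is not described in further detail; the function names are hypothetical, and the helper for criterion (2) implements only one possible reading of that criterion (no window of 5 consecutive condition labels occurs twice), which is an assumption.

```python
import random

CONDITIONS = ["SR", "OR", "AVR", "AGR", "AV"]


def max_run_length(seq):
    """Length of the longest run of identical consecutive items."""
    longest = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest


def has_repeated_window(seq, width=5):
    """Assumed reading of criterion (2): True if any window of `width`
    consecutive condition labels occurs more than once in `seq`."""
    seen = set()
    for i in range(len(seq) - width + 1):
        window = tuple(seq[i:i + width])
        if window in seen:
            return True
        seen.add(window)
    return False


def pseudorandom_order(n_per_condition=40, max_run=2, rng=None):
    """Draw a condition order with at most `max_run` consecutive trials
    from the same condition (criterion 1). Hypothetical helper."""
    rng = rng or random.Random()
    total = n_per_condition * len(CONDITIONS)
    while True:  # restart on the rare dead end
        remaining = {c: n_per_condition for c in CONDITIONS}
        order = []
        while len(order) < total:
            tail = order[-max_run:]
            # Block a condition that already fills the last `max_run` slots.
            blocked = tail[0] if len(tail) == max_run and len(set(tail)) == 1 else None
            options = [c for c in CONDITIONS if remaining[c] > 0 and c != blocked]
            if not options:
                break  # only the blocked condition is left; start over
            weights = [remaining[c] for c in options]
            choice = rng.choices(options, weights=weights)[0]
            order.append(choice)
            remaining[choice] -= 1
        else:
            return order
```

Weighting each draw by the number of remaining sentences keeps the conditions roughly balanced along the list, but the resulting orders are not sampled uniformly from all admissible orders; a list could additionally be screened with `has_repeated_window` if that reading of criterion (2) is adopted.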

| Experimental design and procedure
Participants were tested in a dimly lit, sound-attenuating, magnetically and electrically shielded room. They were seated in front of a display, with a viewing distance of approximately 90 cm. The display consisted of a backprojection screen inside the magnetically shielded room, on which all stimuli were presented using a set of mirrors and an LCD projector positioned outside the magnetically shielded room in order to minimize electrical interference. The text was presented in black on a dark gray background using a 20-point-sized Consolas font type.
Sentences were presented word by word in the center of the screen. For each sentence, the first letter of the first word was capitalized, the word directly preceding the relative clause and the last word of the relative clause were presented followed by a comma, and the final word of the sentence was presented with a period. A single trial consisted of a sentence, a movement cue (see below), and a fixation cross (and sometimes a comprehension question). Words were presented for between 300 and 400 ms (randomly chosen for each word), followed by a blank screen between words presented for between 100 and 200 ms. The stimulus onset asynchrony (SOA) between two words was always 500 ms (e.g., if the word was presented for 325 ms then the blank screen would last for 175 ms). Each trial began with the presentation in the center of the screen of three asterisks two spaces apart for 3000 ms, indicating that participants could move their eyes and blink. This was immediately followed by a fixation cross presented in the center of the screen for 1500 ms, indicating that eye movements and blinking should be avoided and that the sentence was about to start. The first word of the sentence immediately followed the fixation cross. Each sentence lasted between 5500 and 11,000 ms and a single trial lasted between 10,000 and 15,500 ms.
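The timing scheme described above can be checked with a short sketch. The function names are hypothetical, and drawing word durations in whole milliseconds is an assumption; the constants (500 ms SOA, 300 to 400 ms word duration, 3000 ms movement cue, 1500 ms fixation cross) are taken from the text.

```python
import random

SOA_MS = 500  # fixed stimulus onset asynchrony between consecutive words


def word_timings(n_words, rng=None):
    """Return (word_ms, blank_ms) pairs; each pair sums to the fixed SOA."""
    rng = rng or random.Random()
    timings = []
    for _ in range(n_words):
        word_ms = rng.randint(300, 400)  # assumption: integer-ms granularity
        timings.append((word_ms, SOA_MS - word_ms))
    return timings


def trial_duration_ms(n_words, cue_ms=3000, fixation_ms=1500):
    """Movement cue + fixation cross + sentence, without a comprehension question."""
    return cue_ms + fixation_ms + n_words * SOA_MS
```

For the shortest (11-word) and longest (22-word) sentences, `trial_duration_ms` gives 10,000 and 15,500 ms, matching the trial durations reported above.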
Participants were instructed to read all sentences attentively for comprehension. They were told that every once in a while they might notice a grammatical error, but that they should continue reading to the end anyway. They read a total of 200 sentences (40 SR, 40 OR, 40 AVR, 40 AGR, and 40 AV), presented in 20 blocks of 10 sentences each, with self-timed breaks between blocks. After a random 10% of the sentences (4 from each of the conditions) a comprehension question appeared on the screen instead of the next trial. Participants were required to respond with the index finger ('yes' response) or middle finger ('no' response) of their right hand, indicating whether the statement on the screen correctly described the content of the sentence they had just read. The question remained on the screen for 6500 ms or until participants made a response, after which the next trial began. Whether or not a statement correctly described the sentence just read was counterbalanced across participants (2 'yes' and 2 'no' responses to the 4 questions from each condition). Ten training sentences (not used in the main experiment) were presented to participants before the experiment began.

| MEG recordings
Participants were seated upright in the MEG system with their heads as close as possible to the inside of the helmet. MEG signals were recorded from a whole-head MEG system with 275 axial gradiometers (CTF MEG systems, VSM MedTech) at a sampling rate of 1200 Hz and with a 300 Hz low-pass anti-aliasing filter. Participants' head position relative to the helmet was monitored in real-time (Stolk et al., 2013) using 3 localization coils, one placed on participants' nasion and one in each ear canal. After each block participants were asked to reposition their head in case of a deviation from their original head position exceeding 10 mm. Bipolar electrode montages were used to record participants' electrocardiograms, as well as their horizontal (electrodes positioned at outer canthi) and vertical (electrodes positioned above and below the left eye) electrooculograms. Electrode impedance was kept below 20 kΩ.
For 22 of the 24 participants included in the final analyses, a structural magnetic resonance image (sMRI) was obtained using a T1-weighted magnetization-prepared rapid acquisition gradient echo pulse sequence. Vitamin E capsules were placed as fiducial markers to allow for visual identification of left-right consistency and for coregistration with matching fiducial coils in the MEG data.

| Data pre-processing
MEG data were analyzed using the FieldTrip toolbox (Oostenveld et al., 2011) running in a MATLAB environment (R2021a; MathWorks, Inc.). The data were high-pass filtered at 0.1 Hz, and a band-stop filter was applied at 50, 100, and 150 Hz (all using a windowed sinc finite impulse response filter with FieldTrip default settings) in order to minimize the effects of power line noise (50 Hz) and its harmonics. Segments were then created from −1000 to 5500 ms relative to the onset of the first word of the relative clause for all conditions together, and the data were down-sampled to 500 Hz.
The data were temporarily transformed (filtered and/or normalized) to facilitate the detection of various types of well-known artifacts in the data. Detected artefactual data segments were removed from the original data without the above transformations applied. In a first step, we detected and removed segments containing superconducting quantum interference device (SQUID) jump artifacts and segments exhibiting extreme variance (over time) compared to other trials in the data. Next, the data were decomposed into independent components (ICA using EEGLAB's 'runica' implementation in FieldTrip with default settings), requesting the 50 component time courses accounting for the highest variance in the data. Components that captured residual eye blinks, eye movements (including obvious microsaccadic components; Hipp & Siegel, 2013), or cardiac response were removed from the data (Jung et al., 2000;Makeig et al., 1997). Between 3 and 14 components were removed per participant (M = 5.04). The data were then re-segmented from −1000 to 1500 ms relative to the onset of the TW. In a final semi-automatic artifact rejection step, visual inspection was used to remove muscle artifacts, along with any remaining segments still exhibiting extreme values.
The sMRI of each participant was co-registered to the coordinate system of the MEG data defined by coils placed on participants' nasion and peri-auricular points during the MEG recording. This co-registration was refined through a process of matching the scalp surface extracted from the sMRI with a recording of the participant's head shape (Polhemus Fastrak®), based on the Iterative Closest Point algorithm implemented in FieldTrip. A triangulated cortical surface mesh was constructed for use as a source model, based on the automatic surface extraction pipeline in FreeSurfer (http://surfer.nmr.mgh.harvard.edu/fswiki/ReconAllTableStableV6.0). Resultant high-resolution meshes were surface-registered to a common template, and the HCP workbench (Marcus et al., 2011) was used to down-sample the mesh to a resolution of 7842 vertices per hemisphere. The result of this procedure is a participant-specific source model with dipoles located at each vertex that can be directly compared across participants.
A single-shell volume conduction model (Nolte, 2003) was constructed for each participant based on the brain-skull boundary extracted from their sMRI using SPM12 (https://www.fil.ion.ucl.ac.uk/spm/software/spm12/). This was used in combination with their source model and gradiometer definition from the MEG data to compute a participant-specific forward leadfield solution.

| Time-frequency analysis
The high-pass filter was increased to 1 Hz, and single-trial pre-processed data for each participant were DC-offset corrected using a period from −200 to 0 ms relative to TW onset. Next, at each sensor location, the spatial derivatives of the magnetic field in two orthogonal directions were computed using neighboring sensors. TF analyses were carried out on this representation of the data so that in a later step the magnitude of the planar gradient representation of the TF data could be estimated, for easier interpretation of results at the sensor level. Data were then separated into trials from the subject-relative (SR) (M = 33.25, SD = 2.7), object-relative (OR) (M = 32.54, SD = 2.94), agreement violation within the relative clause (AVR) (M = 32.46, SD = 2.81), no agreement violation (AGR) (M = 30, SD = 3.85), and agreement violation outside the relative clause (AV) (M = 30.42, SD = 2.66) conditions.
Time-resolved power spectra of the data between 2 and 30 Hz were computed using a sliding window approach, with the application of a short-time Fourier transform. Hanning tapered sliding windows of 500 ms were applied in frequency steps of 1 Hz (interpolated; implicit frequency precision was 2 Hz) and time steps of 40 ms across the entire time axis from −1000 to 1500 ms relative to TW onset. Single-trial power spectra were then averaged within each condition. Finally, to obtain the magnitude of the planar gradient representation of the TF data from each condition the absolute values of the power spectra for the two spatial derivatives at each sensor were summed (Bastiaansen & Knösche, 2000). This resulted in a condition-specific TF representation of power for each participant. For each condition, these participant averages were then expressed as a relative change (in dB) from a baseline period between 500 and 0 ms prior to the onset of the TW.
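The sliding-window estimate and the baseline normalization can be sketched for a single channel as follows. This is an illustrative NumPy sketch under stated assumptions and not the FieldTrip implementation used for the analyses: the function names are hypothetical, zero-padding each 500 ms window to 1 s is assumed as the means of obtaining 1 Hz frequency steps, and windows are assigned to their center time.

```python
import numpy as np


def hann_tfr(signal, fs, t0, win_s=0.5, step_s=0.04, fmin=2.0, fmax=30.0):
    """Hanning-tapered short-time Fourier power for one channel.

    signal : 1-D array, t0 : time (s) of its first sample.
    Returns power (n_freqs x n_times), freqs (Hz), window-center times (s).
    """
    n_win = int(round(win_s * fs))
    n_step = int(round(step_s * fs))
    n_fft = int(round(fs))  # assumption: pad to 1 s so bins are 1 Hz apart
    taper = np.hanning(n_win)
    starts = np.arange(0, len(signal) - n_win + 1, n_step)
    freqs_all = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    keep = (freqs_all >= fmin - 1e-9) & (freqs_all <= fmax + 1e-9)
    power = np.empty((int(keep.sum()), len(starts)))
    for k, s in enumerate(starts):
        spectrum = np.fft.rfft(signal[s:s + n_win] * taper, n=n_fft)
        power[:, k] = np.abs(spectrum[keep]) ** 2
    times = t0 + (starts + n_win / 2) / fs
    return power, freqs_all[keep], times


def db_change(power, times, baseline=(-0.5, 0.0)):
    """Express power as dB change relative to the mean baseline power."""
    mask = (times >= baseline[0]) & (times <= baseline[1])
    base = power[:, mask].mean(axis=1, keepdims=True)
    return 10.0 * np.log10(power / base)
```

Because the window is only 500 ms long, neighboring 1 Hz bins are not independent: the padded spectrum interpolates between estimates whose intrinsic resolution is 1 / 0.5 s = 2 Hz, which corresponds to the implicit frequency precision mentioned above.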

| Statistical analyses
Statistical significance was evaluated using a cluster-based random permutation approach (Maris & Oostenveld, 2007). We used this approach because of its natural handling of the multiple comparisons problem. Cluster-based random permutation statistics control the family-wise error rate by making use of the spatial, spectral, and temporal autocorrelation in MEG data. In short, a dependent-samples t test is performed for every data point (sensor-frequency-time point), giving uncorrected p values. A pre-set significance level is chosen and any data points not exceeding this level are discarded (set to zero). Clusters are calculated from the remaining non-zero data points based on their adjacency in space (adjacent sensors), frequency, and time. Cluster-level statistics are then calculated by summing the values of the t statistics for all data points in each cluster. A permutation distribution is created by randomly assigning participant averages to one of the two conditions 10,000 times, and each time calculating cluster-level statistics as just described. The highest cluster-level statistic from each randomization is entered into the permutation distribution and the cluster-level statistics calculated for the measured data are compared against this distribution. If any of the clusters in the observed data fell in the highest or lowest 2.5th percentile of the estimated null distribution (highest 5th percentile for single-tailed tests), the statistical test was considered statistically significant (although, when appropriate, p values reported are corrected for the 2 tests performed and are considered significant at p < .05; effects reported as marginal for .05 < p < .1).
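The logic of the procedure can be sketched in a deliberately simplified form. The sketch below is not the FieldTrip implementation used here: it clusters along a single dimension only (e.g., time) rather than over sensors, frequencies, and time points, it takes the critical t value as an argument (about 2.069 for 23 degrees of freedom, two-tailed alpha of .05), and it builds one null distribution from the maximum absolute cluster mass instead of separate positive and negative distributions. All function names are hypothetical.

```python
import numpy as np


def paired_t(diff):
    """Dependent-samples t value per data point (participants x points)."""
    n = diff.shape[0]
    return diff.mean(axis=0) / (diff.std(axis=0, ddof=1) / np.sqrt(n))


def cluster_masses(tvals, t_thresh):
    """Summed t values of contiguous supra-threshold runs (1-D adjacency)."""
    masses = []
    for sign in (1, -1):
        above = np.append(sign * tvals > t_thresh, False)
        start = None
        for i, flag in enumerate(above):
            if flag and start is None:
                start = i
            elif not flag and start is not None:
                masses.append(float(tvals[start:i].sum()))
                start = None
    return masses


def cluster_permutation_test(cond_a, cond_b, t_thresh, n_perm=1000, seed=0):
    """Return observed cluster masses and their permutation p values."""
    rng = np.random.default_rng(seed)
    diff = cond_a - cond_b
    observed = cluster_masses(paired_t(diff), t_thresh)
    null_max = np.zeros(n_perm)
    for p in range(n_perm):
        # Swapping condition labels within a participant flips the sign
        # of that participant's difference.
        flips = rng.choice([-1.0, 1.0], size=(diff.shape[0], 1))
        masses = cluster_masses(paired_t(diff * flips), t_thresh)
        null_max[p] = max((abs(m) for m in masses), default=0.0)
    pvals = [(np.sum(null_max >= abs(m)) + 1) / (n_perm + 1) for m in observed]
    return observed, pvals
```

Only the largest cluster statistic of each permutation enters the null distribution, which is what allows a single comparison per observed cluster to control the family-wise error rate across all data points.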
We used a two-stage approach to test our hypothesis that beta power exhibits a similar pattern of effects just after the disambiguation point (TW) in object-relative (OR) sentences as it does following a grammatical violation. In the first step, we combined the data from the subject-relative (SR) and no agreement violation (AGR) conditions, and separately combined the data from the agreement violation within (AVR) and outside (AV) the relative clause conditions, to create control (SR and AGR) and violation conditions (AVR and AV), irrespective of violation type or whether the TW appeared within or outside the relative clause. These combined conditions were compared (alpha level of 5% two-tailed) for all sensors, in a frequency range from 2 to 28 Hz and time interval from 0 to 1200 ms relative to TW onset, forming clusters in time, frequency, and space. Based on the output of this statistical comparison, a sub-selection of sensors, frequency bins, and time points was made, providing an indication of which data points were likely to contribute to this overall grammaticality effect. In a second step, we then made the comparison between SR and OR conditions (alpha level of 5% single-tailed), within this restricted range of sensors, frequency bins, and time points, again forming clusters in time, frequency, and space. We also followed up with planned comparisons between SR and AVR, and between AGR and AV conditions within this restricted range, in order to check whether there were differences between grammaticality effects that were dependent on violation type and/or whether the violation appeared in the relative clause rather than in the matrix clause.

| Source analysis
Data were separated into trials from the subject-relative (SR), object-relative (OR), agreement violation within the relative clause (AVR), no agreement violation (AGR), and agreement violation outside the relative clause (AV) conditions (trial numbers already reported above). We also created control and violation conditions by combining data from the SR and AGR conditions, and separately combining data from the AVR and AV conditions. This allowed us to follow the same two-stage approach to identifying cortical sources as was used in the statistical analyses.
A frequency-domain adaptive spatial filtering algorithm (dynamic imaging of coherent sources-DICS beamformer; Gross et al., 2001) was used to estimate source power in time-frequency regions that likely contributed to statistically reliable sensor-level effects. With this approach, an optimized spatial filter is constructed for each specified dipole in the participant's cortical surface mesh (source model) based on a cross-spectral density (CSD) matrix obtained from the MEG data and the participant's leadfield matrix. CSD matrices were obtained using a multitaper approach (Mitra & Pesaran, 1999) in combination with a fast Fourier transform. All subsequent steps were performed separately for data from the beta and theta frequency bands. Based on the statistical output from the sensor-level analyses, we selected a center frequency of 19 Hz with 4 Hz smoothing for the beta frequency range, while for the theta range the center frequency was 4 Hz with 2 Hz smoothing.
Separate CSD matrices were obtained for time points corresponding to the baseline period (−500 to 0 ms relative to TW onset) and those corresponding to the period where effects from the sensor-level analyses were most pronounced (500 to 1000 ms relative to TW onset). This was done separately for each condition of interest (SR, OR, AVR, AGR, AV, control, violation). A single CSD matrix was obtained for the combination of all conditions (SR, OR, AVR, AGR, AV), including both baseline and effect periods. This facilitated the computation of common inverse spatial filters, through which the data for each individual condition were subsequently projected. The regularization parameter for the beamformer was set to 25% of the average sensor power. This resulted in spectral power estimates for every participant at each dipole position in the cortical surface mesh. These were obtained for the baseline and effect periods of each condition of interest. Next, decibel power change from baseline was computed separately for each condition by dividing mean spectral power from the effect period by mean spectral power from the corresponding baseline period and performing a log10 transform.
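The common-filter and baseline-normalization steps can be sketched as follows. This is a simplified illustration rather than the pipeline used in the study: it assumes a single dipole with a fixed orientation (so the leadfield is a vector), it takes "25% of the average sensor power" to mean 0.25 times the mean of the diagonal of the CSD matrix, and it applies the conventional factor of 10 when expressing the log10 power ratio in decibels. All function names are hypothetical.

```python
import numpy as np

def regularize_csd(csd, fraction=0.25):
    """Add a fraction of the average sensor power to the diagonal of the CSD."""
    n = csd.shape[0]
    lam = fraction * np.real(np.trace(csd)) / n
    return csd + lam * np.eye(n)

def scalar_dics_filter(leadfield, csd_common, fraction=0.25):
    """Unit-gain spatial filter for one fixed-orientation dipole.

    csd_common is the CSD of all conditions combined (baseline and effect
    periods), so the same filter is applied to every condition.
    """
    c_inv = np.linalg.inv(regularize_csd(csd_common, fraction))
    l = leadfield.reshape(-1, 1)
    return (l.T @ c_inv) / (l.T @ c_inv @ l)   # shape (1, n_sensors)

def source_power(w, csd):
    """Project a condition-specific CSD through the common filter."""
    return float(np.real(w @ csd @ w.conj().T)[0, 0])

def db_change(power_effect, power_baseline):
    """Decibel power change of the effect period relative to baseline."""
    return 10.0 * np.log10(power_effect / power_baseline)
```

For each condition, `source_power` would be evaluated once with the baseline-period CSD and once with the effect-period CSD, and the two values passed to `db_change`. Using one filter for all conditions ensures that condition differences in source power cannot arise from differences between filters.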
Source data from the control and violation conditions were then parcellated into 370 separate parcels based on a refined version of the Conte69 atlas (Van Essen et al., 2012), taking the mean power over dipoles in each parcel. We used a two-step procedure to create descriptive statistical masks for the source data. In a first step, we used cluster-based permutation statistics (as described above but clustering only in space-i.e., by parcel) to identify parcels exhibiting statistically significant differences at a cluster-corrected alpha level of 5% for the comparison between control and violation conditions.³ Similar comparisons were made for the AV versus AGR, AVR versus SR, and OR versus SR contrasts. In a second step, we computed T values for each dipole in the unparcellated source data for the following contrasts of interest: AV versus AGR; AVR versus SR; OR versus SR. A mask was then created for each of these contrasts based on dipoles in the source data corresponding to the detected parcels from the control versus violation contrasts from the first step, where T values from the second step corresponded to an uncorrected alpha level of 5%. These masks were applied to the source estimates for the contrasts of interest for visualization purposes, and the parcellation scheme mentioned earlier was used to identify brain regions in the masks. This allowed us to identify source regions for each contrast contributing most strongly to the observed sensor-level statistical effects.
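The two-step masking logic can be summarized in a short sketch. It assumes that each dipole carries an integer parcel label, that the outcome of the first step is available as one boolean value per parcel, and that the uncorrected 5% threshold of the second step is supplied as a critical t value (`t_crit`). These names and data layouts are illustrative assumptions, not the study's implementation.

```python
import numpy as np

def parcel_means(dipole_power, parcel_of_dipole, n_parcels):
    """Mean power over the dipoles assigned to each parcel."""
    sums = np.bincount(parcel_of_dipole, weights=dipole_power, minlength=n_parcels)
    counts = np.bincount(parcel_of_dipole, minlength=n_parcels)
    return sums / np.maximum(counts, 1)   # avoid division by zero for empty parcels

def descriptive_mask(parcel_significant, parcel_of_dipole, dipole_t, t_crit):
    """Dipoles in a detected parcel whose own t value exceeds the threshold."""
    in_detected_parcel = parcel_significant[parcel_of_dipole]
    return in_detected_parcel & (np.abs(dipole_t) > t_crit)
```

A dipole therefore survives the mask only if it lies in a parcel detected in the control versus violation contrast and also shows an uncorrected difference in the specific contrast of interest, which is why the masks are descriptive rather than inferential.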

| RESULTS
Participants were excluded from further analysis when they answered less than 65% of comprehension questions correctly overall. Those participants included in the final analyses scored on average 77% correct for the comprehension questions (SD = 10%; Range = 65-95%). This suggests that participants were paying attention to the stimuli and understood the sentences they were reading.

| Time-frequency results
The first stage in our analysis of the time-frequency data, comparing power at TWs that were either grammatical or ungrammatical in the sentence context, produced a statistically significant difference (p = .0042), with the dominant cluster (violation < control) extending over the beta frequency range (13-26 Hz) from 40 to 1200 ms after TW onset, including bilateral frontal, temporal, and central sensors. A spatially and spectrally less extensive effect (violation > control) extended over the theta frequency range (2-6 Hz) from 440 to 1200 ms after TW onset, and included bilateral frontal, temporal, and central sensors.
The second-stage comparison between the subject-relative (SR) and object-relative (OR) conditions, as well as the comparisons between SR and agreement violation within the relative clause (AVR), and between no violation (AGR) and agreement violation outside the relative clause (AV) conditions, were carried out separately for the beta and theta effects from the first stage of the analysis. For beta, this produced a statistically significant (p = .0001) negative difference between AV and AGR over the detected beta range (13-26 Hz) from around 100 to 1200 ms after TW onset, with left frontal and temporal, and right central maxima (Figure 1). The comparison between AVR and SR also produced a statistically significant effect (p = .0493), but in a more restricted beta frequency range (16-22 Hz) from around 100 to 1200 ms after TW onset, with a maximum over left frontal sensors (Figure 2). Crucially for our main hypothesis, there was a statistically significant (p = .0222) negative difference between the OR and SR conditions in the beta band (14-23 Hz), with a slightly later onset (around 150 to 1200 ms post-TW) and over left frontal and temporal sensors (Figure 3).
For theta, there was a statistically significant (p = .0028) positive difference between the AV and AGR conditions over the detected theta range (2-6 Hz), extending from about 500 to 1200 ms after the onset of the TW, and with a maximum over mid-frontal sensors (Figure 4). Comparing the AVR and SR conditions also produced a statistically significant (p = .0002) positive difference in the 2-6 Hz range from about 450 to 1200 ms post-TW onset, with a maximum over mid-frontal, but also right temporal sensors (Figure 5). Finally, there was a statistically significant positive effect (p = .0271) when comparing the OR and SR conditions in a slightly more restricted theta range (2-5 Hz), extending from about 450 to 1200 ms after the onset of the TW, and exhibiting a maximal difference over mid-frontal sensors (Figure 6).
³ Throughout the manuscript we interpret the cluster extent purely from the perspective of descriptive statistics, in order to suggest portions of the data that are most likely to contribute to the statistical effects. Statistical inference is performed over the entire distribution of the data under the assumption of exchangeability (see Maris & Oostenveld, 2007), and so no inferential claims can be meaningfully attached to the specifics of the spatial, spectral, or temporal extent of the clusters (see Sassenhagen & Draschkow, 2019 for details).

| Source estimation results
Source-level cluster-based permutation statistics confirmed the sensor-level findings for both beta (violation < control: p = .0443; AV < AGR: p = .0279; AVR < SR: p = .0189; OR < SR: p = .0305) and theta (violation  Table 3 for all regions in the statistical masks). While these source estimates are common to all contrasts, the LIFG difference is clearly broader for the AVR versus SR contrast, and is more restricted to anterior portions of the LIFG for the AV versus AGR contrast, but to posterior portions of the LIFG for the OR versus SR contrast.⁵ Left angular gyrus (AG) and left middle temporal gyrus (MTG) beta power differences are present for the AV versus AGR and for the OR versus SR contrasts, but not for the AVR versus SR contrast. Similarly, left dorsolateral prefrontal regions are differentially engaged for both the AV versus AGR and the AVR versus SR contrasts, but not for the OR versus SR contrast.
⁴ As statistical inference was already carried out on the sensor-level data, the statistical output reported here should be considered strictly confirmatory in the service of probing spatial contrasts for the source-level data.
⁵ It is important to keep in mind the limited spatial precision available with source reconstruction, and so the kinds of fine-grained distinctions between sub-regions referred to here should be interpreted with extra care.

FIGURE 1 Beta power: Agreement violation (AV) versus control (AGR) contrast in simple relative clauses (SRC). (a) TF representations of power for the AV (top left) and AGR (top right) conditions at the target word (onset at 0 ms), and condition difference (AV-AGR) with line plots (AGR: purple; AV: orange) of mean power over time or frequency intervals exhibiting effects. Beta power (13-26 Hz) clearly exhibits a desynchronization in the AV condition, with an onset around 100 ms after the TW. Shaded regions in the waveforms indicate standard error of the mean over participants; TF representations and waveforms depict the mean power over sensors contributing to the first cluster for at least half of the time interval of that cluster; black boxes in the TF representations indicate the extent (spectral and temporal) of the most prominent cluster. (b) Scalp distributions for the mean power in the beta frequency range over the time interval of the most prominent cluster for the AV (left) and AGR (middle) conditions, as well as their difference (AV-AGR). The topography for the difference exhibits maxima over left frontal and right central sensors. Filled black circles indicate electrodes contributing to at least half the time interval of the most prominent cluster; color bar depicts power for both (a) and (b). (c) Source power estimates for the effect contrast (AV vs. AGR) in the beta frequency range (15-23 Hz), and in a time interval from 500 to 1000 ms after TW onset. Both unmasked (top row) and masked data (bottom row) are plotted on an inflated cortical surface from multiple points of view. Most prominent differences in the masked data are observed in left posterior superior temporal gyrus/sulcus and left inferior frontal (BA44 and BA45) regions (see Table 3 for a full list).
Theta effects are primarily driven by theta power differences in lateral and medial frontopolar and prefrontal regions, including anterior cingulate cortex (ACC; Figures 4c, 5c, and 6c; see Table 4 for all regions in the statistical masks). While the medial prefrontal source estimates are common to all contrasts, other regions exhibit more variability. Right pars orbitalis for instance exhibits a difference for both the AVR versus SR and the OR versus SR contrasts, but not for the AV versus AGR contrast. Frontopolar differences are more widespread for the AVR versus SR and the OR versus SR contrasts. Finally, the OR versus SR contrast exhibits more restricted and more anterior theta power differences than the other two contrasts.

| DISCUSSION
The beta-syntax hypothesis (Bastiaansen & Hagoort, 2015) links oscillatory activity in the beta frequency range to syntactic unification operations. On the other hand, the beta-maintenance hypothesis argues that the experimental evidence linking beta to sentence-level language comprehension is better described under the more domain-general proposal that oscillatory activity in the beta frequency range is related to the maintenance or change of the current cognitive set. We pitted these two hypotheses against one another by investigating how participants' MEG-derived beta power is modulated when they are presented with linguistic input that is unexpected (or less expected) but does not constitute a grammatical violation. Under these conditions, the beta-syntax hypothesis predicts that beta power should increase because syntactic unification becomes more demanding, while the beta-maintenance hypothesis predicts that beta power should decrease because the unexpected target word provides a cue to the language comprehension system indicating the need for a change in the sentence-level representation.
We replicated the relatively well-established finding of a beta power decrease following a syntactic violation, in our case both outside and within the relative clause, for number and person agreement violations respectively (Figures 1 and 2). Our key comparison (object-relative vs. subject-relative-OR vs. SR) produced a beta power decrease for the OR condition (Figure 3) in line with the beta-maintenance hypothesis. All effects were left-lateralized and largely restricted to brain regions typically implicated in language comprehension (LIFG, left posterior STG and left SMG). In addition to the beta effects, mid-frontal theta power was higher in all unexpected conditions, whether the target word constituted a grammatical violation or not (Figures 4 and 6). These theta effects were consistently driven by higher medial prefrontal (including ACC) activation for unexpected linguistic input.

| Beta power dynamics in sentence comprehension
Our beta power findings allow us to adjudicate in favor of the beta-maintenance hypothesis when it comes to a role of beta in sentence comprehension. It will be important for future work to home in on which aspects of language comprehension are supported by similar domain-general systems-level beta dynamics, and in which (combination of) brain regions. For instance, one might ask whether these beta dynamics are only relevant for the construction of sentence-level contextual meaning, or instead might support other types of linguistic and non-linguistic information that is encoded into and maintained in short-term or working memory.
FIGURE 3 Beta power: Object-relative (OR) versus subject-relative (SR) contrast in complex relative clauses (CRC). (a) TF representations of power for the OR (top left) and SR (top right) conditions at the target word (onset at 0 ms), and condition difference (OR-SR) with line plots (SR: green; OR: blue) of mean power over time or frequency intervals exhibiting effects. Beta power (14-23 Hz) clearly exhibits a desynchronization in the OR condition, with an onset around 200 ms after the TW. Shaded regions in the waveforms indicate standard error of the mean over participants; TF representations and waveforms depict the mean power over sensors contributing to the first cluster for at least half of the time interval of that cluster; black boxes in the TF representations indicate the extent (spectral and temporal) of the most prominent cluster. (b) Scalp distributions for the mean power in the beta frequency range over the time interval of the most prominent cluster for the OR (left) and SR (middle) conditions, as well as their difference (OR-SR). The topography for the difference exhibits a maximum over left frontal and temporal sensors. Filled black circles indicate electrodes contributing to at least half the time interval of the most prominent cluster; color bar depicts power for both (a) and (b). (c) Source power estimates for the effect contrast (OR vs. SR) in the beta frequency range (15-23 Hz), and in a time interval from 500 to 1000 ms after TW onset. Both unmasked (top row) and masked data (bottom row) are plotted on an inflated cortical surface from multiple points of view. Most prominent differences in the masked data are observed in left posterior superior temporal gyrus and left inferior frontal (BA44 and BA45) regions (see Table 3 for a full list).
With this in mind, a recent proposal regarding a role for beta power in working memory updating and maintenance (Miller et al., 2018) is of particular interest, and may provide a more general systems-level account that could subsume the beta-maintenance hypothesis as one particular instance thereof. This line of work has shown that beta in the prefrontal cortex plays a role in what the authors refer to as 'volitional control' of working memory, where at the neural systems-level beta has the function of inhibiting cortical processing, so that when beta power in a particular part of the cortex decreases that region is released from inhibition. At a cognitive level, this means that whatever processing is carried out by the region that has been released from inhibition now has additional neural resources devoted to it. In this way beta acts as a type of control switch that allows the brain to up- or down-regulate (beta decrease or increase respectively) particular regions in order to efficiently allocate processing resources. Miller et al. (2018) suggest that this may be a common organizing principle throughout the cortex, but that the specific frequency range may shift to the lower part of the beta band, and into the alpha band as one moves away from prefrontal regions to cortical regions typically linked to the processing of less abstract, more sensory information (for a similar proposal for alpha see Jensen & Mazaheri, 2010). This shift may reflect differences in timescale over which it is useful to prioritize more and less abstract information in short-term memory, but that remains to be demonstrated by future empirical work. Returning to our sentence-level beta findings, the fact that the presence of unexpected linguistic input results in a beta power decrease in regions typically implicated in linguistic processing (LIFG, left posterior STG, and left SMG) is consistent with the idea that this unexpected input triggers the language comprehension system to exert additional control over the contents of working memory.
The system anticipates a need to prioritize relevant linguistic information and thus up-regulates processing in associated cortical regions by releasing those regions from inhibition. Accordingly, a typical pattern observed for sentence reading is that participants' beta (and alpha) power decreases after every word, beginning around 100 ms after word onset (with some variability), and rebounds just before or at the onset of the next word (especially when word onset is predictable; see Prystauka & Lewis, 2019). We speculate that this may reflect the encoding of new information into short-term memory, with the precise contents of the memory representation dependent on the regions showing these beta (and alpha) dynamics. On this account, our (and previous) findings showing lower beta power for grammatical violations or for various other types of unexpected linguistic input (OR vs. SR contrast included) reflect additional neural resources devoted to encoding new information when the 'typical' resource allocation will not be sufficient.
FIGURE 4 Theta power: Agreement violation (AV) versus non-violation (AGR) contrast in simple relative clauses (SRC). (a) TF representations of power for the AV (top left) and AGR (top right) conditions at the target word (onset at 0 ms), and condition difference (AV-AGR) with line plots (AGR: purple; AV: orange) of mean power over time or frequency intervals exhibiting effects. Theta power (2-6 Hz) clearly exhibits a synchronization in the AV condition, with an onset around 500 ms after the TW. Shaded regions in the waveforms indicate standard error of the mean over participants; TF representations and waveforms depict the mean power over sensors contributing to the most prominent cluster for at least half of the time interval of that cluster; black boxes in the TF representations indicate the extent (spectral and temporal) of the most prominent cluster. (b) Scalp distributions for the mean power in the theta frequency range over the time interval of the most prominent cluster for the AV (left) and AGR (middle) conditions, as well as their difference (AV-AGR). The topography for the difference exhibits a clear mid-frontal maximum. Filled black circles indicate electrodes contributing to at least half the time interval of the first cluster; color bar depicts power for both (a) and (b). (c) Source power estimates for the effect contrast (AV vs. AGR) in the theta frequency range (2-6 Hz), and in a time interval from 500 to 1000 ms after TW onset. Both unmasked (top row) and masked data (bottom row) are plotted on an inflated cortical surface from multiple points of view. Most prominent differences in the masked data are observed in dorsolateral and medial prefrontal cortex (see Table 4 for a full list).
This leads naturally to the question of whether there are relevant differences between our three experimental contrasts that may lead the language comprehension system to prioritize different types of information or processing resources upon encountering the unexpected linguistic input. Three clear patterns come into focus. First, one obvious difference is in the degree of syntactic complexity between the sentences in the simple relative clause (SRC) conditions (no violation-AGR and agreement violation outside the relative clause-AV conditions) compared to those in the complex relative clause (CRC) conditions (SR, agreement violation within the relative clause-AVR, and OR conditions). For SRC sentences, the TW occurs outside the relative clause at the matrix clause verb, whereas for CRC sentences it occurs within the relative clause, which equates to an increased depth of embedding within the syntactic structure. Moreover, for SRC sentences the auxiliary verb within the relative clause is preceded by only one potential referent, making the sentence unambiguously subject-relative. For CRC sentences on the other hand the auxiliary in the relative clause is preceded by two potential referents, making the sentence ambiguous between subject- and object-relative clause constructions. Since posterior LIFG exhibits beta power differences for the AVR versus SR and the OR versus SR contrasts but not the AV versus AGR contrast, processing in this region appears to be prioritized when sentence complexity increases. This is consistent with findings from the fMRI literature (e.g., Grodzinsky et al., 2021; Hagoort & Indefrey, 2014; Walenski et al., 2019), where left pars opercularis is typically more active for more complex sentences.
Second, another immediately striking pattern is that both contrasts involving grammatical violations (AV vs. AGR and AVR vs. SR), but not unexpected yet grammatical TWs (OR vs. SR), exhibit beta power differences in anterior LIFG and left dorsolateral prefrontal cortex (DLPFC). Increased pars triangularis activation has been observed in many (but not all-for discussion see Hagoort & Indefrey, 2014) fMRI studies investigating syntactic violations (e.g., Petersson et al., 2004; van de Meerendonk et al., 2013), and this is consistent with a recent proposal that this region supports a working memory buffer to preserve the sequence in which morphemes were encountered in the input (Matchin & Hickok, 2020). Both grammatical violation contrasts involve checking the inflectional morphology of the input to attempt repair, whereas the OR versus SR contrast instead involves syntactic reanalysis, which may not necessarily entail checking morphological sequencing in the input. Relatedly, there is evidence that in the context of sentence processing the DLPFC is only recruited when an ongoing process needs to be interrupted, inhibited or slowed down to allow for reanalysis or error repair to take place (Hertrich et al., 2021). This aligns well with our observation of beta power differences in these two regions for grammatical violations. However, one may still ask why we do not observe a beta difference in the DLPFC for the OR versus SR contrast. One possibility is that some brain regions (in this case DLPFC) that are engaged for reanalysis in the case of syntactic ambiguity differ from those engaged for repair in the case of grammatical violations. We acknowledge that this is not a definitive answer, and leave this question to future research.
FIGURE 5 Theta power: Agreement violation (AVR) versus subject-relative (SR) contrast in complex relative clauses (CRC). (a) TF representations of power for the AVR (top left) and SR (top right) conditions at the target word (onset at 0 ms), and condition difference (AVR-SR) with line plots (SR: green; AVR: pink) of mean power over time or frequency intervals exhibiting effects. Theta power (2-6 Hz) clearly exhibits a synchronization in the AVR condition and desynchronization in the SR condition, with an onset for the difference around 450 ms after the TW. Shaded regions in the waveforms indicate standard error of the mean over participants; TF representations and waveforms depict the mean power over sensors contributing to the first cluster for at least half of the time interval of that cluster; black boxes in the TF representations indicate the extent (spectral and temporal) of the most prominent cluster. (b) Scalp distributions for the mean power in the theta frequency range over the time interval of the most prominent cluster for the AVR (left) and SR (middle) conditions, as well as their difference (AVR-SR). The topography for the difference exhibits a clear mid-frontal maximum. Filled black circles indicate electrodes contributing to at least half the time interval of the most prominent cluster; color bar depicts power for both (a) and (b). (c) Source power estimates for the effect contrast (AVR vs. SR) in the theta frequency range (2-6 Hz), and in a time interval from 500 to 1000 ms after TW onset. Both unmasked (top row) and masked data (bottom row) are plotted on an inflated cortical surface from multiple points of view. Most prominent differences in the masked data are observed in dorsolateral and medial prefrontal cortex, as well as bilateral orbitofrontal and pars orbitalis (BA47) regions (see Table 4 for a full list).
Third, both left posterior MTG and left angular gyrus (AG) exhibited beta power differences for the AV versus AGR and the OR versus SR contrasts, but not for the AVR versus SR contrast. An intriguing similarity between the AV versus AGR and the OR versus SR contrasts is that in both cases the parser is faced with unexpected number agreement marking when processing is disrupted at the TW. Recent work on gender agreement processing in Spanish has implicated both left posterior MTG and left AG in the processing of local grammatical agreement relations (Quiñones et al., 2018). This is consistent with increased demands on these two regions for processing local number agreement relations in our AV and OR conditions, for which grammatical number is inconsistent with what was expected based on the sentence parse up to that point. Crucially, the AVR versus SR contrast does not involve a number (but instead a person) agreement mismatch. While number agreement relies on checking of the formal inflectional morphology on dependent elements (a local agreement computation), person agreement is instead thought to be anchored to the representation of the speech act participant(s) (Mancini et al., 2014). Violations of person agreement may thus result in the engagement of discourse-related processing to try to resolve the mismatch at the level of speech act participants, and hence these brain regions related to the processing of local agreement relations (i.e., left posterior MTG and left AG) are not differentially engaged in the AVR versus SR contrast. In our opinion, this demonstrates that tracking beta dynamics during sentence comprehension holds great promise for investigating how the language system prioritizes different types of information to reach an interpretation of the linguistic input. We have argued that beta power can be used to distinguish between brain regions that are differentially recruited when sentence complexity increases, when monitoring is required for reanalysis and/or repair, and even for highly specific local agreement computations. An important avenue for future research will be to investigate whether or not these beta power effects are in fact oscillatory in nature. Experimental and computational modeling work on beta power in the context of perceptual and motor performance (Sherman et al., 2016) suggests that what has typically been considered oscillatory beta may be better explained as transient beta burst (of excitatory synaptic drive) events. This has important consequences for linking beta effects to systems-level biophysical models, and it will be important to work out whether beta power effects observed during sentence comprehension are also of this nature.
FIGURE 6 Theta power: Object-relative (OR) versus subject-relative (SR) contrast in complex relative clauses (CRC). (a) TF representations of power for the OR (top left) and SR (top right) conditions at the target word (onset at 0 ms), and condition difference (OR-SR) with line plots (SR: green; OR: blue) of mean power over time or frequency intervals exhibiting effects. Theta power (2-5 Hz) clearly exhibits a desynchronization in the SR condition, with an onset around 450 ms after the TW. Shaded regions in the waveforms indicate standard error of the mean over participants; TF representations and waveforms depict the mean power over sensors contributing to the first cluster for at least half of the time interval of that cluster; black boxes in the TF representations indicate the extent (spectral and temporal) of the most prominent cluster. (b) Scalp distributions for the mean power in the theta frequency range over the time interval of the most prominent cluster for the OR (left) and SR (middle) conditions, as well as their difference (OR-SR). The topography for the difference exhibits a clear mid-frontal maximum. Filled black circles indicate electrodes contributing to at least half the time interval of the most prominent cluster; color bar depicts power for both (a) and (b). (c) Source power estimates for the effect contrast (OR vs. SR) in the theta frequency range (2-6 Hz), and in a time interval from 500 to 1000 ms after TW onset. Both unmasked (top row) and masked data (bottom row) are plotted on an inflated cortical surface from multiple points of view. Most prominent differences in the masked data are observed in bilateral anterior cingulate cortex, bilateral orbitofrontal regions, and right pars orbitalis (BA47; see Table 4 for a full list).

| Mid-frontal theta power signals conflict and a need for control
Although it was not of primary interest in this study, the finding of higher theta power for grammatical violations has been reported in previous work (Bastiaansen et al., 2002; Kielar et al., 2015; Lewis, Lemhöfer, et al., 2016; Pérez et al., 2012; Regel et al., 2014; Roehm et al., 2004). These previous studies, however, have typically observed higher theta power over left hemisphere sensors and interpreted their findings as a reflection of increased demands on the retrieval of lexical-semantic information from memory when a grammatical violation is encountered. In our study, by contrast, the topography of the effect (Figures 4-6) clearly indicates that we are dealing with mid-frontal theta, which has been linked to error monitoring and cognitive control (e.g., Cavanagh & Frank, 2014). Indeed, mid-frontal theta is typically localized to medial prefrontal regions (including ACC and mid cingulate cortex) and functionally appears to reflect conflict and error detection/monitoring across various domains (see Cavanagh & Frank, 2014). This is consistent with the observation of higher mid-frontal theta power at the TW in our grammatical violation (i.e., error) conditions, and may suggest that the TW in our OR condition is also (at least initially) treated as a conflict.
An influential line of work has suggested that ACC implements a form of gain adjustment in lateral prefrontal cortical regions through local inhibition (Medalla & Barbas, 2009) and that this facilitates set shifting between lateral prefrontal regions for more demanding context representations. We observed theta effects in lateral prefrontal regions for all contrasts, but in right pars orbitalis only for the AVR versus SR and the OR versus SR (i.e., the CRC) contrasts. As we have already argued, this may reflect a difference in the complexity of the relative clause sentences in the CRC compared to the SRC conditions, which aligns well with right pars orbitalis being additionally modulated for more demanding context representations in the case of the CRC contrasts. Moreover, this is consistent with our observation of frontopolar theta power differences for the CRC contrasts and not the SRC contrast, as mid-frontal theta is also thought to play a role in increasing the influence of frontopolar regions (implicated in complex multitask operations) on lateral prefrontal regions (Medalla & Barbas, 2010). The additional engagement of more anterior frontopolar regions for the OR versus SR contrast (compared to the other two contrasts) may thus reflect the fact that on encountering an unexpected TW in the OR sentences some reanalysis is required, which could be reasonably argued to involve more complex multitask operations than the repair that is presumably attempted upon encountering a grammatical violation.

Note: First column indicates parcel label from the adapted Conte69 atlas; L, left hemisphere; R, right hemisphere; first number in first column refers to Brodmann Area (BA) of the parcel; second column indicates approximate brain region of corresponding parcel; third to fifth column indicates proportion of dipoles within the corresponding parcel that were also in the statistical mask for the contrast of interest.
On this account, mid-frontal theta serves the dual purpose of registering an error or conflict during sentence comprehension (but also in domains other than language), and when necessary recruiting frontopolar regions and coordinating set shifting operations in lateral prefrontal regions when the task becomes more demanding or involves greater contextual complexity (as would be the case for ambiguous relative clauses in the CRC conditions). 6 There thus appear to be at least two theta power effects related to sentence comprehension, the first involving the retrieval of lexical-semantic information from long-term memory (for review see Prystauka & Lewis, 2019), and the second involving conflict/error monitoring and the recruitment of additional cognitive control in lateral prefrontal regions when necessary. A similar observation has been made for resting state theta power (Beese et al., 2017) in the context of differences in sentence comprehension abilities across the lifespan, but further research into this distinction is clearly warranted.

| CONCLUSION
In sum, we have shown that beta power neural dynamics upon encountering an unexpected target word that disambiguates towards a less preferred object-relative clause interpretation of a sentence are very similar to those observed when encountering a grammatical violation. These beta-power decreases are predominantly present in the left hemisphere in regions typically associated with sentence comprehension and provide strong evidence in favor of the beta-maintenance hypothesis. Beta signals a need to either maintain or update the sentence-level representation, and a beta power decrease provides an index of release from inhibition in regions responsible for encoding new information. We also showed that mid-frontal theta power signals an error or conflict in the case of grammatical violations or unexpected sentence structure, as well as the recruitment of frontopolar and lateral prefrontal regions when the representational complexity increases. Taken together, these findings suggest that beta and mid-frontal theta power both play a role in exerting control during sentence comprehension. While both should be thought of as domain-general at a neural systems level, at a cognitive level beta appears to exert control over more content-specific representations, while control in the case of mid-frontal theta appears to be more domain-general.