Inter‐individual variability in dorsal stream dynamics during word production

Abstract The current standard model of language production involves a sensorimotor dorsal stream connecting areas in the temporo‐parietal junction with those in the inferior frontal gyrus and lateral premotor cortex. These regions have been linked to various aspects of word production such as phonological processing or articulatory programming, primarily through neuropsychological and functional imaging group studies. Most if not all the theoretical descriptions of this model imply that the same network should be identifiable across individual speakers. We tested this hypothesis by quantifying the variability of activation observed across individuals within each dorsal stream anatomical region. This estimate was based on electrical activity recorded directly from the cerebral cortex with millisecond accuracy in awake epileptic patients clinically implanted with intracerebral depth electrodes for pre‐surgical diagnosis. Each region's activity was quantified using two different metrics—intra‐cerebral evoked related potentials and high gamma activity—at the level of the group, the individual and the recording contact. The two metrics show simultaneous activation of parietal and frontal regions during a picture naming task, in line with models that posit interactive processing during word retrieval. They also reveal different levels of between‐patient variability across brain regions, except in core auditory and motor regions. The independence and non‐uniformity of cortical activity estimated through the two metrics push the current model towards sub‐second and sub‐region explorations focused on individualized language speech production. Several hypotheses are considered for this within‐region heterogeneity.


| INTRODUCTION
In human cognitive neuroscience, linking a processing model with single-case patient data provides a fruitful means to pursue improvements of both the model and the clinical assessment. Constructing 'typical' participants from aggregated (e.g., average co-registered) data has been the primary focus of research on brain function (Seghier & Price, 2018), but there is increasing interest in understanding various kinds of interindividual differences. Stable and interpretable individual variability in resting state neural networks has been repeatedly observed, pointing out the existence of spatially variable 'network variants' (D'Esposito, 2019; Gordon et al., 2017;Seitzman et al., 2019). The potential sources of variability in brain function during task performance have been conceptualized and evidenced across various cognitive domains (Miller et al., 2009;Sanfratello et al., 2014; for a review, see Seghier & Price, 2018).
The neural network involved in language processing is broad and known to be variable across healthy individuals (Mahowald & Fedorenko, 2016) and in neurological disorders (e.g., Berl et al., 2014). This interindividual variability has been frequently studied to understand variations in the hemispheric specialization for language (Tzourio-Mazoyer et al., 2017). It has also been harnessed to distinguish a 'core' part of the language network from regions recruited across other cognitive domains (Fedorenko et al., 2011; or to explore how a given task (e.g., word reading) might be achieved through more than one processing pathway (Seghier et al., 2008). This research has been mostly concerned with perceptual processes (e.g., recognizing words and their meanings) or with higher levels of abstraction (e.g., comprehending sentences). In contrast, language production theories in cognitive neuroscience (Hickok, 2012; e.g., Indefrey, 2011;Roelofs, 2014;Strijkers & Costa, 2016) have been mostly developed and tested under the (often implicit) hypothesis that a roughly similar network, with roughly similar spatial or temporal characteristics, must be present in every individual. Consequently, the interactions between language production models and individual patient data must remain cautious. One representative example is the exploration of the language production network in the context of epilepsy surgery (Arya et al., 2018;Benjamin et al., 2020;Duffau, 2017;Nakai et al., 2019;Trebuchon et al., 2020).
According to current models, producing a word to express a particular thought requires that a single item is selected among alternatives and that its form is encoded to be articulated overtly (Levelt et al., 1999). These cognitive processes are sub-served by diverse functionally specialized regions (Indefrey, 2011). Their network intricacy has been clarified in terms of the dual stream framework for language processing with a distinction between (i) the ventral stream, a temporal-lobe stream that maps semantic knowledge on to lexical representations (e.g., during visual naming), and (ii) the dorsal stream, a parietofrontal stream that interfaces auditory-phonological information with the motor system (Hickok, 2012;Roelofs, 2014;Ueno et al., 2011).
The dorsal stream we focus on here is thought to be left lateralized in most right-handed individuals (Gleichgerrcht et al., 2015;McKinnon et al., 2018;Saur et al., 2008;Schwartz et al., 2012;Warren et al., 2005) (but see bilateral processes reported in Cogan et al., 2014). Between the temporo-parietal and lateral frontal cortices, the dorsal stream is supported by two of the anatomical pathways that form the superior longitudinal fasciculus (SLF). Its deep segment connects the posterior temporo-parietal regions to the Pars Opercularis, whereas the lateral part of SLF (SLF III) links the supra marginal gyrus (SMG) to the ventral premotor cortex (Catani & Mesulam, 2008;Parlatini et al., 2017;Petrides, 2014). Other regions that form the dorsal stream, such as supplementary motor area (SMA) and ventral motor cortex, are also connected to posterior parietal areas via the dorsal branch of the SLF (SLF I).
This network is swiftly activated during word production, with responses typically occurring within a second in experimental settings such as picture naming. The corresponding cognitive operations and neural activations are thus coordinated on a sub-second scale. An influential theoretical proposal combined a cognitive model with a meta-analysis of imaging and electrophysiological data into a framework that estimates a cascade of onsets for the different regional activations and processes (Indefrey, 2011). An alternative proposal hinges on hierarchical processing and top-down modulations, thereby highlighting the sustained reverberatory activity during which diverse processes largely overlap in time (Rapp & Goldrick, 2000;Strijkers & Costa, 2016).
An assessment of inter-individual variability in dorsal stream activation must thus be conducted with subsecond temporal resolution. Magneto-encephalographic (MEG) signals have a fair spatial resolution and a high temporal resolution, but synthetizing them in a unified spatio-temporal model of word production has proved to be challenging (Munding et al., 2016). The cortical dynamics underlying word production are increasingly explored in studies with invasive recordings (e.g., Nakai et al., 2019;Piai et al., 2016;Riès et al., 2017) (for reviews, see Flinker et al., 2018;Llorens et al., 2011). But these studies typically acknowledge the variability across patients without seeking to quantify or interpret it beyond its use for computing statistics at the group level (Flinker et al., 2018;Lachaux et al., 2012). The hypothesis is that such central tendency measures will erase possible pathological idiosyncrasies (Lachaux et al., 2012;Parvizi & Kastner, 2018), with the associated risk of losing important information or distorting data patterns (Seghier & Price, 2018).

| THE CURRENT STUDY
Despite renewed theoretical interest for inter-individual variability in neural functional activity, and its importance for translating between processing theories and beside assessments of the language network, interindividual variability in the language production network remains under-studied. Here, we explored the variability in the activation of fronto-temporo-parietal regions that encompass the dorsal stream (Hickok, 2012), when they were recruited for word production, a task that is typically described to be solved with a single processing pathway (Indefrey, 2011). At stake in our approach was whether the variability described for networks in other contexts (D'Esposito, 2019;Gordon et al., 2017;Seitzman et al., 2019) could be identified at the level of the individual regions implicated.
The quantification of inter-individual variability was performed on a dataset obtained from epilepsy patients involved in pre-surgical diagnosis procedures (stereoelectro-encephalography or SEEG) that required the implantation of intra-cerebral depth electrodes. Recordings from these electrodes gave us access to the variability in time-resolved electrophysiological signals rather than the spatial variability derived from functional magnetic resonance imaging (fMRI) data (Berl et al., 2014). A link between signals in the high gamma frequency band and cognitive processes has been demonstrated in a variety of experimental tasks (Lachaux et al., 2012). However, high gamma activity (HGA) does not capture all of the neurophysiological modulations (Flinker et al., 2018), as shown in intra-cerebral recordings during auditory perception (Edwards et al., 2009;Nourski et al., 2015), or in MEG recordings during our task of interest, picture naming (Laaksonen et al. 2012). Thus, we decided to investigate both HGA modulations and intra-cerebral evoked related potentials (iERPs). We explored variability across patients but also across single-electrode time courses, looking to assess the diversity of activities within each region of interest.

| Patients
Epileptic patients were recruited from the pool of patients assessed by the Cleveland Clinic Epilepsy Center for surgical treatment of medically refractory Epilepsy. They were undergoing an SEEG diagnostic evaluation as part of their pre-surgical assessment (Bancaud et al., 1969;Chauvel et al., 2019). After an anatomo-functional localizing epileptogenic network assumption is formulated, a strategy to implant depth electrodes to specific regions of the brain is defined. Depth electrodes are stereotactically inserted using a robotic surgical implantation platform (ROSA, Zimmer-Biomed, USA), in a three-dimensional arrangement with either orthogonal or oblique orientations, allowing intracranial recordings from lateral, intermediate or deep cortical and subcortical structures (Gonz alez-Martínez, 2015;Gonzalez-Martinez et al., 2014). For each patient, between 8 and 13 stereotactically placed electrodes were implanted. Within each electrode, the recording contacts were 2 mm long, 0.8 mm in diameter and spaced 1.5 mm apart. The insertion of these electrodes is based purely on clinical needs and is made independently of any research related purpose. Part of the procedures seeks to identify functional networks such as those involved in language, to be able to spare them from the surgical procedure (Trebuchon et al., 2020).
We studied 17 epileptic patients who were native speakers of English. Among them, 11 were implanted unilaterally (9 in the left and 2 in the right hemispheres, respectively), and 6 were implanted bilaterally (i.e., they had electrodes implanted in both hemispheres but not necessarily in directly homologous areas). Patients were invited to participate in a picture naming experiment when their implantation included depth electrodes within the parieto-frontal networks known as the dorsal language stream (Hickok, 2012). All patients enrolled voluntarily after giving written informed consent under criteria approved by the Cleveland Clinic Institutional Review Board (N 13-1248). Patient details are listed in Table 1. One patient (P15) was unable to complete the cognitive task properly; thus, the analysis involves data from only 16 patients.
3.2 | Anatomical reconstruction of electrode positions

| MRI acquisition
All MRI scans were acquired from a 3T Siemens Skyra scanner (Siemens, Erlangen, Germany). The T1-weighed Magnetization Prepared Rapid Acquisition with Gradient Echo (MPRAGE) volumetric scan was used for coregistration with stereo-computerized tomography (CT).

| MRI and CT fusion
Immediately after SEEG implantation, a high-resolution stereo-CT was taken to obtain the anatomical location of the implanted electrodes. The post-implantation CT scan was exported into Curry 7 (Compumedics Neuroscan, Hamburg, Germany), and co-registration with the MPRAGE images was performed using an automated fullvolume registration with maximization of mutual information. The accuracy of the co-registration was inspected visually and confirmed in all patients by a neurosurgeon (JGM) to ensure accuracy. The location of each electrode contact, determined by the centre of the highest intensity on the CT, was individually labelled and superimposed on the MRI for visualization of its anatomical location.

| Normalization of MRI and CT
Further processing was performed within the software suite SPM12 (Wellcome Department of Cognitive Neurology, London, UK) in MATLAB 2015a (MathWorks, Natick, Massachusetts) to normalize the locations of the electrodes of interest from all the patients. Three processing steps were performed: (1) co-registration of the CT to the MRI for each individual patient; (2) normalization of the individual MRI with the Montreal Neurological Institute (MNI) 3D common stereotactic space (Collins et al., 1998); (3) normalization of the co-registered CT in Step 1 to the MNI space by applying the same transformation matrix as obtained in Step 2. All the normalized CT and MRI images were then fused in Curry 7, so that the electrodes could all be superimposed on the normalized MRI and visualized on a template according to the Talairach stereotactic coordinate system. Given the focus of this research, our exploration was limited to regions located within the language dorsal stream (see Section 1) and organized in seven anatomical groups ( Figure 1). Each contact (N = 447) was localized to a specific brain region of the Talairach atlas (Talairach & Tournoux, 1988) based on its coordinates and confirmed by individual visual verification. Only regions that were sampled in at least 3 patients were included in the analysis, to allow assessing cross-patient consistency of neural responses, as described below.

| EXPERIMENTAL TASK
Patients were asked to name out loud pictures of common objects from the Snodgrass and Vanderwart (1980) collection. There were 36 common items, chosen from six different semantic categories (Accessories, Buildings, Kitchen Utensils, Fruits, Furniture and Musical Instruments). The pictures were presented in short blocks each involving six items randomly repeated five times, yielding 30 trials per block. A total of 8 blocks was planned for each patient (i.e., 240 trials), but not all patients completed all the blocks due to clinical circumstances (e.g., fatigue or interruptions). The items within a block could either be from a single semantic category (semantically homogeneous blocks) or from six different semantic categories (semantically heterogeneous blocks). This design and materials were reused from a previous research conducted with French speakers (Anders et al., 2019;Llorens et al., 2016), but with different F I G U R E 1 Anatomical sampling for 16 patients. Each horizontal bar represents the number of recording contacts that were identified to lie in each of the anatomically defined brain regions of interest. Each colour is a patient. For example, superior temporal gyrus was sampled 21 times across three patients, each contributing 8, 7 and 6 contacts, respectively. patients completing different numbers of blocks, the homogenous vs. heterogeneous contrast was unbalanced and could not be explored in the current analysis.
The experiment was controlled by the software E-Prime v2.0.1 (Psychology Software Tools, Pittsburgh, USA). The pictures were presented on the centre of the screen within a visual angle of 6 Â 6 . For a subset of the patients (N = 8), naming latencies were recorded with a microphone (audio Technica ATR20) placed about 13 cm in front of the patient. In this case, response times (RTs) were automatically recorded in milliseconds by the software's voice key. A trial consisted of a fixation point (variable duration across trials, between 1400 and 2100 ms), followed by the black and white target picture (presented for fixed duration of 1000 ms). Various pseudo-random orders were created to vary across patients the order of items within a block and the order of the blocks within the experiment.

| Data acquisition
SEEG electrophysiological data was acquired using a video-EEG monitoring data acquisition system (Nihon Kohden 1200, Nihon Kohden America, USA) at a sampling rate of 1000 Hz per channel. During the cognitive task, stimuli and behavioural event data were simultaneously acquired along with the SEEG signal (Johnson et al., 2014) and stored for subsequent analysis. The recordings were online referenced to a scalp-electrode located on vertex (Cz).

| Data analysis
Offline pre-processing was performed in Brainstorm (Tadel et al., 2011), which is freely available for download online under the GNU public licence (http://neuroimage. usc.edu/brainstorm). Evoked potentials and HGA were computed and combined into a group analysis using the Multi-patient Intracerebral data Analysis toolbox (MIA: Dubarry et al., 2022).
For each participant, the continuous monopolar SEEG recording was filtered digitally (finite impulse response high pass filter implemented in Brainstorm; lower cutoff frequency: 0.5 Hz; stop band attenuation: 60 dB). Epochs starting 1500 ms before stimulus onset and lasting 1500 ms post-stimulus were created, for a total of 3000 ms per trial. Only the epochs corresponding to correct behavioural responses were kept for the analysis. Epochs containing interictal epileptic activity (spikes or abnormal sharp waves) were removed manually following visual inspection of all epochs.

| iERPs
The signals recorded with the Cz electrode as reference (monopolar montage) were averaged across trials to yield iERP. iERP arises from synaptic activity of large numbers of neurons distributed over a cortical region. They primarily capture low frequency components and are time and phase-locked to the stimulus. They provide a measure of the synaptic input to, and the local processing within, the area where they are recorded (Lopes da Silva, 1991). Because the signals are time-locked to the stimuli, the recordings provide information about the temporal components of the underlying generators. Any local field potential can be recorded either as positive or negative response, and this polarity depends, among other things, on the location of the electrode contact with respect to the generator. This property of iERP can be conceived of as a limitation because it complicates averaging procedures across contacts and patients (see below Within-region variability across contacts and patients for the adopted strategy). In addition, they do not distinguish excitatory from inhibitory currents (Buzs aki et al., 2012; Lopes da Silva, 1991).

| HGA
The second step of data analysis focused on high-gamma band (HGA), which is characterized as broadband activity measured here between 80 and 120 Hz. To estimate HGA, a bipolar montage was computed in which the activity recorded at adjacent contacts was subtracted to yield bipolar channels (e.g., B2-B1 and B3-B2). Morlet Wavelet transforms were first performed separately at five consecutive frequencies between 80 and 120 Hz with a 10 Hz step (i.e., 80, 90, 100, 110 and 120 Hz). Then, the five resulting time courses were separately normalized with a z score against the baseline (À800 to À50 ms) and finally summed. This procedure was used to compensate for the larger contribution of low frequencies that would otherwise overpower the contribution of higher frequencies, due to the 1/f distribution of signal power (Bédard & Destexhe, 2009). HGA has been linked to BOLD signal of fMRI (Logothetis et al., 2001) as well as neuronal firing rate (Ray & Maunsell, 2011) localized in restricted cortical areas (Lachaux et al., 2012;Mukamel & Fried, 2012). One limitation of HGA is its power, between 100 and 1000 times lower than that of the iERP (Lachaux et al., 2012), which can make it more difficult to detect. Also, partly due to the time-frequency transformation process, it is less sensitive to small variations in the timing of behavioural and neural responses than iERP are. Oscillatory burst may suffer from latency jitters, turning the relationship with the stimulus onset fairly loose (Tallon-Baudry & Bertrand, 1999

| Task-related significant activity
Task-evoked responses were determined to be significant using a t test across trials at each time sample. To address the resulting multiple-comparison problem in the time domain, a minimum duration threshold T was estimated with a bootstrap procedure. Each step of this procedure involved the random selection of the same number of trials as in the original data set, with resampling, and the identification of periods with significant activity within the baseline window (À800 ms to À50 ms) in that sample. At each iteration, the significant period of maximal duration (p < 0.001) was pooled into a bootstrap surrogate distribution constructed through 1000 iterations of the procedure. For each patient, the significance threshold (minimum duration of consecutive significant t values, p < .001) was defined as the right-tail 95% quantile of that bootstrap distribution of durations. This yielded a threshold of significance for each channel (p < 0.001; corrected for multiple comparisons in time), which was used to determine task-related significant activity. Independently of the statistical analysis, an expert neurophysiologist (CLC) examined the responses for each contact and identified no discrepancies between the detections based on human expertise and the statistical method.
The analysis described above was conducted per patient. We also report the time courses of each patient in each region, obtained by averaging together all the contacts of a patient within a region that showed significant activity (any activity on the mask of significance). Finally, a group analysis was performed by computing the grand average of patients' time-courses per region.

| Within-region variability across contacts and patients
The variability of neural activities within each region of interest was estimated with a Pearson correlation metric computed from the recorded signal time courses. Specifically, we computed the pair-wise correlations across all signals within one region, at two different levels: contacts/channels and patients. The signal was first averaged across trials to yield a time course for each contact/ channel. These time courses were correlated pairwise, yielding a covariance matrix of correlations whose values were averaged. The resulting value is the 'correlation across contacts/channel' in the region. Then, the contacts of each patient were grouped, and their time courses averaged to yield a regional time course for each patient. The correlation matrix of these time courses was computed and averaged, as before, to yield the 'correlation across patients' in the region.
Importantly, for iERPs, any aggregation of signals from various contacts into a single metric must consider the arbitrary sign of the evoked potentials. Typically, two contacts recording the same dipolar source activity from opposite sides would show signals of opposite polarity, an inversion that can occur within or across patients. Conversely, opposite polarity alone does not warrant a single underlying source, as the two signals may arise from different regional sources. Here, for simplicity, we did not attempt to solve the SEEG inverse problem, which might have clarified this issue but which is very rarely explored in SEEG (Caune et al., 2014). The iERP matrix of correlation values was thresholded separately for positive and negative values, based on a heuristically determined threshold (here, jthrj = 0.7). We counted the number of highly negative correlations (i.e., below Àthr) and picked the channel with the highest number of anti-correlated channels. All channels highly correlated with this extreme channel, itself included, were sign flipped and the correlation matrix recomputed. Such sign flipping procedure was preferred over working on absolute valued time-courses, to preserve the shape of the signal and the information carried by potential inversions. The same analysis was conducted with other threshold values (0.6, 0.8 and 0.9) for comparison. With threshold 0.7, the number of regions that showed significantly correlated iERP was 11 out of 25 (see Section 5). For the other values, it was 7, 8 or 9 regions out of 25. In other words, our choice of threshold is the most liberal one. It might err on the side of showing within-region correlations, where we wanted to highlight the opposite.
To note, the sign flipping procedure was applied before patient and region averaging described above under Intra-cerebral event-related potentials and only for iERP signals. In HGA, the sign meaningfully indicates activation or possibly deactivation with respect to baseline and hence is informative. Therefore, for HGA, the averaging and the variability metric were computed without applying the sign flipping procedure.
The statistical significance of correlations by contacts/ channels was assessed with a permutation test that preserved the participants' relative contribution to each region. For any given region, we preserved the patients' identities and number of contacts (or channels in HGA) when permutating each signal with a signal from any other brain region. The two correlation indexes described above were computed. After 1000 permutations, we obtained surrogate distributions of correlation values (two for each region, for iERP and for HGA) with which significance was established at an alpha level of 0.05.
For any region, we preserved the number of patient and number of contacts and the statistics consisted in randomly permutating these signals with those of other anatomical regions.

| Overview
All patients except P-15 completed the task correctly. This left the data from 16 patients for the analysis. Because of technical issues, RTs could only be recorded in 8 out of 16 patients. In addition, the experimental software we used (E-Prime 2) detected vocal response onsets automatically based on sound intensity, without generating a recording of the sound wave for offline checking. Such intensity-based automatic detections are considered suboptimal (Schillingmann et al., 2018). For these reasons, we did not perform any detailed analyses based on RTs; in particular, we did not use these estimates to compute response-locked activities. The average RT was 894 ms (SD 105 ms). Mean accuracy was above 90%. Only trials with correct responses were included in the neural signal analysis.
The anatomical sampling criteria lead to the inclusion of 25 different regions, 18 of which were in the left hemisphere and 7 in the right hemisphere. Across patients, there were 88 different electrode probes, within which 362 electrode contacts were in the left hemisphere, and 85 in the right hemisphere ( Figure 1).

| Intra-cerebral responses
The analysis of intra-cerebral evoked response potentials (iERP) for the 447 monopolar contacts revealed 359 contacts for which there was significant post-stimulus taskrelated activity (i.e., 80%) and 88 contacts without significant response. There was a 100% consistency between the active/inactive classification performed visually (and independently) by the expert neurophysiologist (CLC) and by the automated statistical method. For the HGA, the 447 monopolar contacts were re-coded as 367 bipolar channels. Among them, 211 revealed significant poststimulus task-related HGA (i.e., 58%), and 156 contacts were without significant response.
Across the 25 sampled regions, there was a strong similarity between the variability of neural responses estimated in terms of correlations between contact averages vs. correlations between patient averages (Figure 2a). This suggests high within-patient correlations. Conversely, there were substantial variations between regions and, furthermore, between correlation estimates based on iERP or HGA time courses. There were 11 regions that showed significantly correlated iERP, also 11 regions that showed significantly correlated HGA, but one significance did not necessarily imply the other (Figure 2b).
Regions with high correlations (i.e., small variability) are easily understood as being recruited by the task, and the consistently recorded activity can be described as the signature of the cortical region for this task. For regions exhibiting higher variability of responses, more detailed interpretations (e.g., sub-regional distinctions) will be considered. Four groups of regions were distinguished based on the significance level of variability observed with each of the two metrics (colour coded in Figure 2b). Figure 3 illustrates one representative region of each of these groups.
The first group included regions for which both iERPs and HGA time courses showed significant correlations across patients and contacts. These 'low variability group' include sensory regions such as the superior temporal gyrus (STG) and motor regions such as central sulcus and sub-central gyrus. Figure 3a illustrates the highly concordant activation of STG across contacts and patients, presumably time locked to the patients' perception of their own voice. That is most clearly seen on the HGA raster-plots (Pt 9 and Pt16). The iERP components (N600/P1000) provide additional reliable information on the latency range of involvement of the STG. The temporal pattern of the central sulcus showed a sustained activation long before the onset of articulation (see below 8th panel on Figure 5 and 4th panel on Figure 6).
The second and the third groups included regions for which either the HGA or the iERPs time courses, but not both, showed significant correlations across patients and contacts. Consider for example intra parietal sulcus (IPS; Figure 3b). The average time course was similar for two of the patients (P01 and P09), an early bi-phasic followed by a more ample response. The third patient (P10) only showed a late ample response, with the opposite polarity. This opposite polarity might reflect different relative positions of the contact probes and the neural generator; this is despite the heuristic procedure we implemented for optimizing similarity across signals irrespective of their sign (see computations described for iERP in Withinregion variability across contacts and patients). Additionally, the iERP being computed over mono-polar recordings is more sensitive to distant sources that might be present in one patient but not in others. In contrast with the between-patient variability in iERPs, HGA was consistent across patients and even across trials. The singletrial raster plots reveal that IPS activity was linked to stimulus presentation rather than response preparation or execution (Figure 3b right column).
The opposite pattern, consistent iERPs but inconsistent HGAs across patients, was also observed in several regions, for example in Pars Opercularis (Figure 3c). A very systematic iERP response is recorded across the contacts of 11 patients, whereas HGA shows substantial variation across 22 bipolar channels, 18 of which show a significant increase of activity and 6 a significant decrease. HGA has a much smaller power than iERPs, which could contribute to its apparent instability. It is worth noting that the HGA activation starts around 300 ms in all the patients regardless of the RTs.
Finally, the fourth group includes areas showing inconsistent activities for both iERP and HGA time courses. For instance, the SMG presents a range of distinct time-course morphologies and onset times for both neurophysiological measures (Figure 3d). Given the centrality of this region in dorsal stream models of language processing and given that neither of the two metrics show any consistency, a closer examination of the anatomical positions of the contact probes in this region was performed. The post hoc hypothesis was that the heterogeneity in functional activity may reflect systematic differences in the activity across locations within the SMG region. We distinguished the contacts within SMG according to their anatomical location in supra-sylvian vs. retro-sylvian areas. Interestingly, the contacts located in the supra-sylvian aspect of SMG exhibit a postvocalization activity, which is also recorded in the homologous right hemisphere region (Figure 4).
Among the contacts located in the retro-sylvian aspect of SMG, iERPs show substantial variability in their temporal courses, yet a component peaking around 300 ms is recorded from almost all the contacts. This suggests an early involvement of this region that was not captured by HGA.
The complete list of regions sampled for the study is presented on Figures 5 and 6, sorted according to the variability of their iERP and HGA time courses, respectively. Angular gyrus (AG) is activated around 250 ms reliably across the contacts whereas a prominent slow wave peaking between 400 and 600 ms is recorded from the left frontal areas (IFG, IFS, MFG, Pars Opercularis). It is worth noting that the left precentral gyrus does not show a strong consistency across contacts but displays a time course similar to Pars Opercularis's starting in average around 300 ms, long before the average time of overt response (850 ms). Interestingly, scattered iERP patterns are observed in 6 out 15 contacts located in the Pars Triangularis.
The analysis of intra-cerebral HGA completes the picture of the naming process across the regions. The motor and auditory regions are consistently activated across their respective contacts. STG and Posterior Insula are involved around the time of the vocalization along with the subcentral gyrus and the central sulcus. The precentral gyrus is involved very early in the process and its activation lasts throughout the vocalization. The timing of HGA fluctuations is quite different within the usual parcellation of Inferior Frontal Gyrus. Pars triangularis displays very short activation or deactivation around 400 ms, whereas Pars Opercularis shows a substantial response, with inconsistencies for three out 8 sampled patients. Finally, Pars Orbitalis is mainly deactivated or silent.
In summary, iERP and HGA-consistent patterns of activation are primarily recorded from motor and auditory regions. The timing of the activation seen in these regions varies as function of the patients' response latencies. The spatio-temporal activity of most of the areas involved in the parieto-frontal language network is consistently captured either by the iERP or the HGA metric. The most notable exceptions are the left Pars Orbitalis, from which no cortical activity is recorded during this task, as well as the left Pars Triangularis and SMG in which there was no reliable activity across patients. Where available, the comparison with RTs indicates that neural activity is time locked to the stimulus and barely linked to the variation of RTs across trials, except in the suprasylvian region of SMG.
F I G U R E 3 Different levels of inter-patient variability of intracerebral activity during word production in four representative regions of the left hemisphere. Left column: intra-cerebral event-related potential (iERP) time courses and their significance mask in z score against baseline; middle column: high gamma activity (HGA) time courses and their significance mask in z score against baseline; right column: single trial activity in HGA in z score against baseline, black dots represent response times (where available). The colour of the region names boxes corresponds to those in Figure 2b. (a) -Superior temporal gyrus showed highly correlated time-courses both in the iERP and the HGA metrics. (b) Intra parietal sulcus showed uncorrelated iERP time courses, but correlated HGA time courses. (c) -For Pars Opercularis, in the inferior frontal gyrus, the opposite pattern was observed: correlated iERP but uncorrelated HGA. (d) In supramarginal gyrus, neither metric was correlated across patients.

| Comparison of left and right hemisphere activity in a subsample of the regions
Seven patients were implanted in right hemisphere regions, providing an opportunity for inter-hemispheric comparisons of the variability of functional response in eight regions. The correlation values and their statistical significance are summarized on Figure 7. Three broadly distinct correlation patterns are notable. First, central sulcus showed activity correlated across contacts in both hemispheres and both neurophysiological metrics. Secondly, various regions showed correlated activity in the left hemisphere (for one or both metrics) but not the right hemisphere. This was the case for MFG, Pars Orbitalis and Pre-central gyrus. Finally, various regions showed correlations that were significant for one or both metrics in the right but not the left hemisphere; in other words, there was less variability in the right compared to the left hemisphere. This was the case of Pars Triangularis, SMG and Orbito-frontal gyrus.

| DISCUSSION
We report the recruitment of the different areas composing the dorsal stream during word production. The candidate regions in the temporo-parietal frontal network were defined on anatomical criteria, and the most relevant regions of interest were then selected on the basis of previous functional imaging evidence (e.g., Indefrey, 2011;Price, 2012). The activity we observed was extensively distributed across the network.
The correlations computed on the iERPs and HGA time courses revealed contrastive levels of variability. The lowest variability in the patterns of iERP or HGA modulations was observed in auditory cortex (STG) and in some regions of the motor network (central sulcus and subcentral gyrus). These activities were linked to the patients' responses (auditory feedback and motor articulation). Interestingly, the activity in central sulcus started before and was sustained throughout the overt response, which is consistent with a motor preparation occurring bilaterally. A good spatial agreement between iERP and F I G U R E 4 Recordings from the supra sylvian area of supra marginal gyrus (SMG) in the left (Rp electrode) and right hemisphere (R electrode) and the left retro-sylvian area of SMG (Xp electrode) in Patient 16. Note that the activation post vocalization is recorded bilaterally only in the supra-sylvian area and not in the retro-sylvian area.
HGA in STG and the motor cortex is consistent with previous intracranial studies (Edwards et al, 2009;Towle et al, 2008;Sinai et al, 2005;Szurhaj et al, 2006).
Conversely, no consistent iERP and HGA deactivation were recorded from the Pars Orbitalis (i.e., in BA 47). This null result extended to the absence or suppression of activity stands in contrast with some studies that showed activation during lexico-semantic control, syntactic processes and retrieval processes (e.g., Badre et al., 2005; for a review, see Friederici, 2011;Tyler et al., 2011) and the meta-analysis conducted by Binder et al., 2009. This absence of activation can be linked to the simplicity of our task, given item repetitions and the lack of other language processes (e.g., syntactic).
A consistent iERP pattern, implying higher temporal accuracy, is observed in the AG and the Pars Opercularis F I G U R E 5 Consistency of intracerebral event-related potentials (iERPs) recorded during a word naming tasks across regions of the left dorsal stream. The regions are ordered by decreasing iERP correlation across contacts and patients to highlight the gradient of variability. From left to right columns: grand average, patient averages and contact-level statistical masks in z score against baseline as well as other frontal regions: MFG, SFG and SFS. Even if the averaged timing of the engagement of these cortical regions during the naming process must be taken with caution (Dubarry et al., 2017), the semantic processed subserved by the AG activation (Binder et al., 2009;Brownsett & Wise, 2010;Price et al., 2015) and the word selection subserved by the MFG and SFG (Price, 2012;Riès et al., 2016) overlap with the phonological encoding subserved by the Pars Opercularis (also referred to as Brodmann area BA 44) (for review, Friederici, 2011). The timing of the Pars Opercularis activity before response onset has been previously observed in a large cohort of patients (Nakai et al., 2019) and clearly links its functional role to phonological processing that prepares, rather than executes, the oral response (Flinker et al., 2015). F I G U R E 6 Consistency of high gamma activity (HGA) recorded during a word naming tasks across regions of the left dorsal stream. The regions are ordered by decreasing HGA correlation across contacts and patients to highlight the gradient of variability. From left to right columns: grand average, patient averages and contact-level statistical masks in z score against baseline The variability of the HGA observed in these regions stems from the deactivation or absence of response recorded from some contacts, which could be the result of less active neurons underneath these contacts. Field potential power decays for higher frequencies (1/f) are generally sensitive to the number of active neurons. Another contributing factor is noise. Activities in high gamma frequencies could have poor signal to noise ratio due to contacts being positioned near the skull, which attenuates signal changes (Pesaran et al., 2018). Conversely, a subset of regions involved in speech production was identified only by their significant HGA correlation across patients. This is the case of Precentral Gyrus, whose activity started in average 300 ms before articulation, confirming that pre-motor areas are involved in an early stage of language production (Fried et al., 1981). In terms of lateralization, Cogan et al. (2014) argued for a bilateral sub-lexical speech sensory-motor system. Our results show a high reliability in bilateral involvement of central sulcus and left precentral gyrus compared with right Precentral gyrus. These observations were made across patients, and the pattern would need to be explored within patient and thus within trials. With this limitation in mind, the current findings would favour the hypothesis of lateralized sensori-motor transformations (Hickok, 2012;Long et al., 2016). F I G U R E 7 Correlations (values on the xaxis) of neural responses across contacts for regions (y-axis) that were sampled in both hemispheres in the current dataset. L = left; R = right.
F I G U R E 8 Distribution of the depth electrode coverage across patients in the fronto-parietal dorsal network. The colour of each dot indicates whether the corresponding cortical region showed significantly consistent activities in intra-cerebral event-related potentials (iERPs) and high gamma activity (HGA) (green), in HGA only (orange), in iERPs only (blue), or in neither of these measures (red). SMG and Pars Triangularis are two key structures key in the dorsal pathway that belong to the fourth category of regions displaying a high variability across patients in the two metrics. SMG revealed an unexpected heterogeneity of responses, both in iERP and HGA. This disparity can be tentatively linked to the specific location of electrode implantation with a distinction made between its supraor retro-sylvian parts. HGA time-locked to the articulation is observed only in the supra-sylvian part, bilaterally, whereas scattered activity is recorded from the retrosylvian region. The activation of SMG in association with speech production and phonological processing has been frequently reported (Fridriksson et al., 2010;Moser et al., 2009). Oberhuber and colleagues (Oberhuber et al., 2016) have proposed a functional parcellation of SMG whereby different types of phonological processing are assigned to four sub-regions, with the anterior ventral subregion being associated with articulatory sequencing. Our data demonstrate that this area is involved in articulatory processing since its activation was time locked with the patient's response. In addition, the activation of the homotopic contralateral area (in Patient 16) demonstrates that this processing is bilateral. Applying transcranial magnetic stimulation (TMS), Hartwigsen and colleagues (Hartwigsen et al., 2016) showed that a bilateral functional lesion effect induced by TMS entailed a bigger impairment in phonological decision than unilateral stimulations over the left or right SMG.
The findings observed in the Pars Triangularis illustrate well the inconsistency between iERP and HGA and their variability across the patients. Two patients out of 4 displayed iERP along with increase HGA in one and deactivation in the others. One patient displayed only significant HGA increase. The activation of Pars Triangularis has been reported during (arguably more demanding) retrieval tasks, such as verbal fluency or verb generation in functional imaging or word-production after morphological inflection (Bourguignon, 2014;Sahin et al., 2009). In functional language mapping, picture naming has often been reported as inducing little HGA activation in the IFG (Arya et al., 2017;Babajani-Feremi et al., 2016;Kunii et al., 2013;Nakai et al., 2017).

| Limitations
Some technical limitations of our analysis must be acknowledged. First, we explored the variability of neural responses both within and across patients. This was done on a medium size group (N = 16) although not all patients were recorded in all areas. Future studies based on larger group sizes and focusing on smaller groups of regions of interest should confirm or challenge our conclusions regarding the consistency of activities. As noted in the Section 3, the significance of iERP correlations was dependent of a thresholding sign-flipping procedure, on account of potential differences in the relative positions of the generator and the recording contact. The liberal threshold we adopted might have under-estimated the real extent of within-region variability. Secondly, the temporal variability of our analysis stopped at single contacts per region and did not encompass single trials (although single trials were used to compute the time-frequency decompositions). Our interpretations are thus based on averages per contact, which can blur important features of the timing of neural activities (Dubarry et al., 2017). Given the broad spatial coverage reported here (left and part of right dorsal stream), the analysis at the level single trial was left for future, more focused research. Finally, due to clinical constraints during testing, we did not collect RTs for every patient; for this reason, we restricted our analysis to stimulus-locked activity.

| CONCLUSIONS
Most of the cortical structures known to be involved in word production were identified in this study, either by iERP or HGA. The evidence we report clearly revealed the simultaneous activation of parietal and frontal regions (AG, MFG), which have been linked to semantic processing and word retrieval. Lagging by around 100 ms on the average time courses was the activation of regions more specialized in phonological processing and articulatory preparation, such as Pars Opercularis and the Precentral gyrus. These overall timing properties are suggestive of a temporal overlap between regions dedicated to the different sub-processes of word production; they are in line with models that posit interactive processing and the concurrent recruitment of brain areas. Nevertheless, the high variability observed across patients in either one or both neurophysiological metrics confirms that they reflect two facets of the cortical activation subserved by distinct neural mechanisms. Gamma oscillations reflect a competition between excitation and inhibition in local cortical areas, whereas iERP represents the post synaptic activity resulting from the firing of nearby neurons along with remote neurons with afferent inputs into a region (Nunez & Srinivasan, 2006). HGA is more sensitive to changes in the firing rate of the underlying neuronal population than iERP. Its absence does not necessarily mean that there is no synchronized activity but rather that the threshold of temporally structured spiking is too low to be captured. Our analysis revealed substantial diversity across patients within many of the dorsal stream regions of interest. Figure 8 summarizes the classification into four types of regions, according to the variability observed in the cortical activity. There were sharp distinctions within the left Inferior Frontal Gyrus, with Pars Triangularis and Pars Orbitalis not being systematically recruited during the task. Parietal regions revealed more variability (for either iERP or HGA metrics) than expected based on current cognitive neuroscience models of word production.
Tentative interpretations were provided for some of these variations, based on the fine-grained spatial location of the recording contacts within each region. They call for sub-second and sub-region focused explorations of individualized language maps and for multimodal and individualized language assessments prior to brain surgery involving language cortical areas. Such individualized clinical assessments would benefit from complementing picture naming with alternative experimental tasks designed to maximize the likelihood of activity in the targeted regions.
From such clinical standpoint, the results pose a challenge for standard language localization at the individual level using the picture naming task. The results indicate the need for specific paradigms of language testing when evaluating language function in clinical settings. This highlights the importance of individualized interpretations of language mapping results, especially in the essential function of production. Although our report does not resolve the issue of the inter-individual variability of SEEG signals, it provides a grounding for future developments in neural signal assessment, clinical interpretations and neuroscientific theories.