## INTRODUCTION

The importance of studying interactions between specialized areas in the human brain has been increasingly recognized in recent years [Schnitzler and Gross,2005a,b; Schnitzler et al.,2000; Schoffelen et al.,2005,2008]. Magnetoencephalography (MEG) is particularly suited for connectivity studies as it combines a good spatial resolution with high temporal resolution. The high temporal resolution affords the investigation of transient coupling and is a prerequisite for studying frequency-dependent coupling. A large number of measures for the quantification of neural interactions have been introduced over the years. For these measures it is customary to distinguish between functional and effective connectivity. Functional connectivity measures assess interactions by means of similarities between time series (e.g., correlation and coherence) or transformations of these time series (e.g., phase synchronization and amplitude correlation). In contrast, effective connectivity methods are used to study the causal effect of one brain area on another brain area.

Besides the distinction between functional and effective connectivity, one has to be aware that connectivity analysis can be performed at the sensor level or the source level. In the first case, connectivity measures are evaluated on the time series recorded by MEG/EEG sensors. In the second case, connectivity measures are evaluated on time series that represent the activity of individual brain areas. Unfortunately, the interpretation of sensor connectivity results is difficult because of the complex and often diffuse sensitivity profiles of MEG/EEG sensors [Schoffelen and Gross,2009]. Significant connectivity between (even distant) sensors cannot be easily assigned to underlying brain areas, may be spurious, and can be affected by power modulations of nearby or distant brain areas [Schoffelen and Gross,2009]. These negative effects can be reduced (though not abolished) by performing connectivity analysis in source space. Most MEG/EEG source connectivity methods are based on functional connectivity measures such as coherence or phase synchronization [Gross et al.,2002; Hoechstetter et al.,2004; Jerbi et al.,2007; Lachaux et al.,1999; Lin et al.,2004; Pollok et al.,2004,2005; Timmermann et al.,2003]. Effective connectivity in source space has been studied with dynamic causal modeling (DCM) [David et al.,2006; Kiebel et al.,2009] or Granger causality [Astolfi et al.,2005; Gómez-Herrero et al.,2008].

Here, we present and test a new efficient method for Granger causality analysis in source space. Granger causality is a concept from economics that quantifies the causal effect of one time series on another time series. Specifically, if the past of time series x improves the prediction of the future of time series y, then x is said to Granger-cause y. Classically, Granger causality is defined in the time domain, but a frequency domain extension has been proposed [Geweke,1982]. Granger causality has also been extended from its original pairwise form into a multivariate formulation in both the time and frequency domains, known as conditional Granger causality [Chen et al.,2006,2009; Geweke,1984]. This methodology is comparative in the sense that, in a multivariate system, if one investigates whether y is causing x, then a model of x based on every variable including y is compared with a model of x based on every variable excluding y. In simple terms, if inclusion of y significantly reduces the variance of the model of x as compared to the variance of the model of x when y is excluded, then y is assumed to cause x. Several other multivariate metrics derived from Granger causality have been suggested, such as partial directed coherence (PDC) [Baccalá and Sameshima,2001] and the directed transfer function [Kaminski and Blinowska,1991]. These metrics are estimated in the frequency domain and are thus frequency specific. One of their main differences from conditional Granger causality is that they are not comparative methods; instead, they are computed directly from the multivariate model built on all the variables in the system.
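The comparative idea described above can be illustrated with a minimal pairwise, time-domain sketch. It is not the method proposed in this work: the function names, the simulated two-variable system, and the use of a log ratio of mean squared residuals as the causality index are assumptions made only for illustration.

```python
import numpy as np

def residual_power(target, predictors, order):
    """Least-squares fit of target[t] on the past `order` samples of each
    predictor; returns the mean squared residual of the fit."""
    n = len(target)
    # One column per (predictor, lag) pair; row i corresponds to t = order + i.
    cols = [p[order - k : n - k] for p in predictors for k in range(1, order + 1)]
    X = np.column_stack(cols)
    y = target[order:]
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.mean((y - X @ beta) ** 2)

def granger_index(x, y, order):
    """Log ratio of residual power without / with the past of x.
    Values clearly above zero suggest that x Granger-causes y."""
    restricted = residual_power(y, [y], order)   # model of y excluding x
    full = residual_power(y, [y, x], order)      # model of y including x
    return np.log(restricted / full)

def simulate(n=5000, seed=0):
    """Hypothetical test system in which x drives y but not vice versa."""
    rng = np.random.default_rng(seed)
    x, y = np.zeros(n), np.zeros(n)
    ex, ey = rng.standard_normal(n), rng.standard_normal(n)
    for t in range(1, n):
        x[t] = 0.5 * x[t - 1] + ex[t]
        y[t] = 0.3 * y[t - 1] + 0.6 * x[t - 1] + ey[t]
    return x, y

x, y = simulate()
gc_x_to_y = granger_index(x, y, order=2)  # expected to be clearly positive
gc_y_to_x = granger_index(y, x, order=2)  # expected to be close to zero
```

Because the restricted model is nested in the full model, the index cannot be negative in this least-squares setting; in practice a statistical test would be needed to decide whether a positive value is significant.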

Source space Granger causality analysis is typically performed in the following way. First, regions of interest (ROIs) are selected. Second, the activation time series are computed for all ROIs. Third, a multivariate autoregressive (MAR) model is computed for these time series and measures of Granger causality are computed. The most significant drawback of this approach is that a large number of potential activation sources corresponds to a large number of projected activation time series. For example, dividing the brain volume into a regular 6 mm grid leads to roughly 10,000 voxels. This is prohibitive for the derivation of numerically robust MAR models without the assumption of sparse connectivity [Haufe et al.,2010; McQuarrie and Tsai,1998; Valdés-Sosa et al.,2005]. In addition, Granger causality computation for a different set of ROIs requires time-consuming computations because Steps 2 and 3 of the procedure above need to be repeated. The computational complexity precludes a tomographic mapping of Granger causality.

To bypass these limitations, we investigate an alternative approach, which entails the derivation of the MAR model directly on MEG sensor data and its projection into source space. In this method the modeling process is performed in sensor space, which has moderate dimensionality compared to the high-dimensional source space. This leads to greater model robustness as well as significantly reduced computation times. The feasibility of a similar approach for EEG data has already been shown by Gómez-Herrero et al. [2008], where the multivariate model was projected onto a small number of locations in source space identified by independent component analysis (ICA) of the residuals of the MAR model and localized by swLORETA [Palmero-Soler et al.,2007]. Causality was inferred using the directed transfer function (DTF) metric. In our work, we demonstrate the feasibility of the methodology when the MAR model is projected onto the entire brain volume without any a priori assumption or estimation of the activity locations.
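As a rough illustration of the idea, the sketch below fits a MAR model on simulated sensor data and projects its coefficients into source space. It rests on assumptions that are not taken from the text: a linear forward model x = L s with lead field L, a linear inverse operator W (here simply the pseudoinverse of L), and the projection rule B_k = W A_k L that follows from substituting s = W x into the sensor-level model. The toy problem has fewer sources than sensors, which is the opposite of the whole-brain case and serves only to make the recovery easy to check. All names (`fit_mar`, `project_mar`, `leadfield`, `inverse_op`) are hypothetical.

```python
import numpy as np

def fit_mar(data, order):
    """Least-squares MAR fit. data has shape (n_samples, n_channels).
    Returns coefficients of shape (order, n_channels, n_channels), where
    entry [k-1, i, j] is the influence of channel j at lag k on channel i."""
    n_samples, n_ch = data.shape
    past = np.hstack([data[order - k : n_samples - k] for k in range(1, order + 1)])
    coef, *_ = np.linalg.lstsq(past, data[order:], rcond=None)
    # Row block k of `coef` holds the transposed lag-(k+1) coefficient matrix.
    return np.stack([coef[k * n_ch : (k + 1) * n_ch].T for k in range(order)])

def project_mar(sensor_coeffs, inverse_op, leadfield):
    """Assumed projection rule: B_k = W @ A_k @ L for every lag k."""
    return inverse_op @ sensor_coeffs @ leadfield

rng = np.random.default_rng(1)
# Hypothetical source-level model: source 0 drives source 1.
B_true = np.array([[0.5, 0.0, 0.0],
                   [0.4, 0.5, 0.0],
                   [0.0, 0.0, 0.5]])
n_samples, n_sens = 20000, 8
sources = np.zeros((n_samples, 3))
innov = rng.standard_normal((n_samples, 3))
for t in range(1, n_samples):
    sources[t] = B_true @ sources[t - 1] + innov[t]

leadfield = rng.standard_normal((n_sens, 3))           # stand-in forward model
sensors = sources @ leadfield.T + 0.05 * rng.standard_normal((n_samples, n_sens))
inverse_op = np.linalg.pinv(leadfield)                  # stand-in inverse solution

A_sens = fit_mar(sensors, order=1)                      # model built in sensor space
B_proj = project_mar(A_sens, inverse_op, leadfield)     # coefficients in source space
```

Note that only the coefficient matrices are projected; the sensor time series themselves never need to be mapped to the sources, and choosing a different set of locations only changes `inverse_op` and `leadfield`, not `A_sens`.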

The main advantage of this approach is that all the voxels inside the brain volume can be investigated in terms of causality, something not practical with the traditional approach. This method also offers benefits in terms of data compression, as the elements that need to be projected are the model coefficients, which are typically far fewer than the data points used to derive them and which would be projected in the traditional case. Another advantage is that the derivation of the MAR model in sensor space is much more robust, because of the moderate number of variables, than the derivation of the MAR model on projected time series at a very large number of brain locations. Additionally, even if different ROIs are repeatedly selected to examine different network topologies in the brain, the sensor space MAR model is always the same and the only thing that changes is the set of locations onto which the model is projected. In the traditional approach, one would each time have to project the sensor time series onto the new set of brain locations and then build the MAR model again. Finally, due to the computational efficiency of this methodology, application of statistical inference methods to whole-brain causality maps from MEG data is feasible.

To infer causality, PDC and the coefficients of the MAR model themselves are used. Although conditional Granger causality would, in theory, be a more robust choice because of its intrinsic normalization, its computational load for a very large number of source locations makes its use problematic. In the traditional approach, if 10,000 voxels are considered, 10,001 multivariate models must be computed: one including all 10,000 projected voxel time series, and 10,000 models each with one voxel time series excluded. In the proposed methodology, where the model is built at the sensor level and only the coefficients are projected, implementing conditional Granger causality again requires 10,001 models to be built at the sensor level: one on the original sensor data, and 10,000 models each with the effect of one voxel extracted through the derived inverse solution. The coefficients of these 10,001 models must then be projected into source space. This imposes a heavy computational load.

Moreover, because in each of the 10,000 models the effect of one voxel is extracted through the derived inverse solution, and because the number of sensors is much smaller than the number of voxels, the projected activity will be diffused around the voxels of actual activity. This means that even if one voxel's activation effect is excluded from the sensor data, in the computation of Geweke's measures the causal pairing will still be modeled by the effect of the neighboring voxels.

Another issue with conditional Granger causality in the frequency domain is that it is based on the transfer function of the model, which is the inverse of the *z*-transform of the MAR coefficients across model order. For each of the 10,000 models the size of this matrix is 9,999 × 9,999 (10,000 × 10,000 for the entire brain model). Inversion of such a large matrix, given also the collinearities introduced by the projection through the inverse solution, can be very problematic because the matrix may be singular or nearly so.

PDC has the implementational advantage that it is computed directly from the coefficients of one MAR model with all the variables included and that it does not require any inversion. Thus, only one model needs to be built at the sensor level, and after the coefficients are projected in source space, PDC can be efficiently computed for a wide range of frequencies. However, due to its semiarbitrary normalization it can only confidently be used to compare causality between voxel pairs that have the same causal voxel [Baccalá and Sameshima,2001].
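A minimal sketch of this computation is given below, using the column-normalized PDC definition of Baccalá and Sameshima [2001]. The function name, the array layout of the coefficients, and the two-channel example are assumptions made for illustration only.

```python
import numpy as np

def pdc(coeffs, freqs, fs):
    """Partial directed coherence from MAR coefficients.

    coeffs: array (order, n, n); coeffs[k-1][i, j] is the influence of
            channel j at lag k on channel i (assumed layout).
    freqs:  frequencies in Hz; fs: sampling frequency in Hz.
    Returns an array (n_freqs, n, n); entry [f, i, j] is the PDC from j to i.
    """
    order, n, _ = coeffs.shape
    lags = np.arange(1, order + 1)
    out = np.empty((len(freqs), n, n))
    for idx, f in enumerate(freqs):
        phase = np.exp(-2j * np.pi * f * lags / fs)
        # A_bar(f) = I - sum_k A_k exp(-i 2 pi f k / fs); no inversion needed.
        a_bar = np.eye(n) - np.tensordot(phase, coeffs, axes=(0, 0))
        magnitude = np.abs(a_bar)
        # Normalize each column j (the causal channel) to unit norm.
        out[idx] = magnitude / np.sqrt((magnitude ** 2).sum(axis=0, keepdims=True))
    return out

# Hypothetical order-1 model in which channel 0 drives channel 1.
coeffs = np.array([[[0.5, 0.0],
                    [0.4, 0.5]]])
freqs = np.linspace(0.0, 50.0, 26)
p = pdc(coeffs, freqs, fs=100.0)
```

The normalization runs over each column, that is, over all targets of one causal channel. This is why values are directly comparable only between pairs that share the same causal channel, as noted above.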

Because of this drawback of PDC, the projected MAR model coefficients are also examined directly, without any normalization. The absence of normalization means that in this approach causality is not bounded. Moreover, when a continuous linear system with coefficient matrix A is periodically sampled with sampling frequency *f*, the resulting discrete linear coefficients are approximated as $e^{A/f}$. This means that the discrete coefficients change in amplitude according to the sampling frequency. Nevertheless, the aim of examining the MAR model coefficients is to determine whether, within the same dataset, the causal information is correctly represented in the coefficients when they are derived at the sensor level and then projected to a very large number of voxels inside the brain. This examination of the coefficients is performed only in the time domain. This approach is used to identify areas within the entire brain that are involved in causal interactions. These specific brain areas could then be separately examined with theoretically more robust causality metrics such as conditional Granger causality.
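The dependence of the discrete coefficients on the sampling frequency can be illustrated numerically. The sketch below assumes a noise-free continuous system dx/dt = A x and the standard exact discretization exp(A/f) for a sampling frequency f; both the example matrix and the helper `discretize` are hypothetical, and the eigendecomposition-based matrix exponential is valid only for diagonalizable A.

```python
import numpy as np

def discretize(A, fs):
    """One-step matrix exp(A / fs) of dx/dt = A x sampled at fs,
    computed by eigendecomposition (assumes A is diagonalizable)."""
    eigvals, eigvecs = np.linalg.eig(A)
    stepped = eigvecs @ np.diag(np.exp(eigvals / fs)) @ np.linalg.inv(eigvecs)
    return stepped.real

# Hypothetical continuous system in which variable 0 drives variable 1.
A = np.array([[-10.0, 0.0],
              [8.0, -20.0]])

# Discrete coupling coefficient (0 -> 1) at two sampling frequencies.
coupling = {fs: discretize(A, fs)[1, 0] for fs in (100.0, 1000.0)}

# For high sampling frequencies, exp(A / fs) is close to I + A / fs.
first_order = np.eye(2) + A / 1000.0
```

In this example the same continuous coupling yields a markedly smaller discrete coefficient at the higher sampling frequency, which is why unnormalized coefficients should only be compared within a dataset recorded at a single sampling frequency.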

First, our proposed approach is investigated theoretically. Subsequently, the method is validated by simulations in which pseudo-MEG data with added noise, both uncorrelated and spatiotemporally correlated, are produced from simulated neural activity at a small number of predefined locations inside the brain with a specified causality structure. We show that the PDC reconstructed from the source projection of the MAR model coefficients is very similar to the PDC extracted directly from the simulated source signals.

The second part of this work is concerned with the causality information that can be derived when a very large number of voxels are considered as potential sources. First, the causality information recovered by PDC is investigated. Then the causality information recovered directly from the MAR model coefficients is investigated. The motivation for the latter comes from the fact that PDC, due to the way it is normalized, is very sensitive to the signal-to-noise ratio and may not be suitable for applications with very large numbers of voxels [Baccalá and Sameshima,2006; Faes et al.,2010; Schelter et al.,2006,2009]. Here the feasibility of using the model coefficients directly is investigated, and it is demonstrated that causality information can be extracted more precisely than with PDC when a very large number of voxels is considered. Within this context, a preliminary evaluation of this methodology is performed with real data from a simple motor planning experiment.