### 1. Introduction


Magnetoencephalography (MEG) is a technique for mapping brain activity by measuring the magnetic fields produced by electrical currents in the brain, using arrays of superconducting quantum interference devices (SQUIDs) (Hamalainen et al., 1993). Applications of MEG include basic research into perceptual and cognitive brain processes, localizing regions affected by pathology, and determining the functions of various parts of the brain. In this article, we propose a novel method for analyzing MEG data and apply it to identify face-perception regions in a human brain.

While MEG offers a direct measurement of neural activity with very high temporal resolution, its spatial resolution is relatively low. Indeed, improving that resolution by means of source reconstruction lies at the heart of the entire MEG-based brain mapping enterprise. Reconstructing neural activity from measurements taken outside the brain is an ill-posed inverse problem, since the observed magnetic field could result from an infinite number of possible neuronal sources. To be concrete, let $y_i(t_j)$ be the measurement recorded by MEG sensor *i* at time $t_j$, and let $\mathbf{y}(t_j) = (y_1(t_j), \ldots, y_n(t_j))^T$ be the measurements from all *n* sensors at time $t_j$, where the time points satisfy $0 \le t_1 < \cdots < t_J \le b$, the number of time instants *J* is determined by the time window *b* and the sampling rate per second, and the number of sensors *n* is of the order of hundreds. Sarvas (1987) showed that the contribution of an individual source to $\mathbf{y}(t_j)$ can be numerically calculated by use of a forward model based on Maxwell's equations and that the contributions of multiple sources sum linearly. Accordingly, $\mathbf{y}(t_j)$ can be written as

- $\mathbf{y}(t_j) = \int_{\Omega} L(r)\,\eta(r)\,s(r, t_j)\,dr + \varepsilon(t_j)$ (1)

where $\Omega$ is the source space (i.e., the space inside the brain), $s(r, t_j)$ is the source magnitude at location *r* with unknown orientation $\eta(r)$, and $L(r)\eta(r)$ is a linear function of the orientation, with $L(r)$ being an $n \times 3$ matrix (called the lead field matrix) at location *r*. The columns $L_x(r)$, $L_y(r)$, and $L_z(r)$ of $L(r)$ are the noiseless outputs of the *n* sensors when a unit-magnitude source at location *r* is directed along the $x$, $y$, and $z$ axes, respectively. The lead field matrix is known in the sense that it can be calculated by solving a set of Maxwell's equations (Sarvas, 1987). If the source locations and orientations were known, then the coefficient function in model (1) would be known, and the model would reduce to a functional regression coefficient model, which has been studied extensively in the literature (e.g., Chapter 16, Ramsay and Silverman, 2005). Unfortunately, as both the locations and the orientations are unknown, model (1) cannot be treated directly as a standard functional regression coefficient model.
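To fix ideas, a sieve (discretized) form of the linear forward model can be sketched numerically. The snippet below uses randomly generated arrays in place of the Maxwell-equation-based lead field; all sizes and names are illustrative, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
n, q, J = 100, 3, 200        # sensors, sources, sampled time points

# Stand-in for the Maxwell-equation-based lead field: L[k] is the n x 3
# lead field matrix at the k-th source location (random here, for illustration).
L = rng.standard_normal((q, n, 3))
eta = rng.standard_normal((q, 3))
eta /= np.linalg.norm(eta, axis=1, keepdims=True)   # unit source orientations

# Composite lead field vectors: one n-vector per source, x_k = L_k @ eta_k.
x = np.einsum("knd,kd->kn", L, eta)

s = rng.standard_normal((q, J))            # source magnitudes over time
noise = 0.1 * rng.standard_normal((n, J))

# Contributions of the q sources sum linearly (Sarvas, 1987).
y = x.T @ s + noise                        # n x J array of sensor measurements
print(y.shape)                             # (100, 200)
```

The key point mirrored from the text is the last line: the sensor data are a linear superposition of per-source contributions plus noise.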

To obtain a good approximation to model (1), the size *p* of the sieve should be substantially larger than the number of sensors *n*. However, when *p* is considerably larger than *n*, the estimation problem becomes highly ill-defined, as there is a diverging number of candidate models in the MEG model space. This has led to a surge of interest in developing new methods and theories to cope with this situation (e.g., Friston et al., 2008; Henson et al., 2010; Sekihara and Nagarajan, 2010). It is necessary to impose certain regularizations on the above model to counteract these adverse effects. One frequently used regularization is the so-called sparsity condition, under which the observed magnetic field is assumed to depend on a much smaller number of latent sources than the number of available sensors *n*. Under this condition, the problem reduces to that of finding the true sources among a very large number of candidates. The central issue, which poses a challenge to modern statistics, is how to overcome the effects of the diverging spectra of sources as well as noise accumulation in an infinite-dimensional source space.

The existing methods in the literature fall roughly into two categories, global approaches and local approaches, of which Bayesian and beamforming-based methods are respectively special cases (e.g., Friston et al., 2008; Sekihara and Nagarajan, 2010). In a global approach, we directly fit the candidate model to the data, and the sieve size is required to be known in advance. In contrast, a local approach involves a list of local models, each tailored to a particular candidate region, so the sieve size can be arbitrary. Global approaches often need to specify parametric models for the sources and the noise process, while local approaches are model-free. When the sieve size is small or moderate compared to the number of available sensors *n*, we may use a Bayesian method to infer the latent sources, with the help of computationally intensive algorithms (e.g., Friston et al., 2008). However, when the sieve size is large, these global methods may be ineffective or computationally intractable, and local approaches become more attractive. Sensor-covariance-based beamforming represents a popular and simple solution to the above large-*p*-small-*n* problem. The basic premise behind beamforming is to scan through the source space with a series of filters, each tailored to a particular area of the source space (called the pass-band) and resistant to confounding effects originating from other areas (called the stop-band) (Robinson and Vrba, 1998). The scalar minimum variance beamformer aims to estimate the theoretical power at a location $r$ by minimizing the sample variance of the projected data $w^T\mathbf{y}(t_j)$ with respect to the weighting vector *w*, subject to the unit-gain constraint $w^T L(r)\eta = 1$. In a scalar minimum variance beamformer, the pass-band is defined by linearly weighting the sensor arrays under this constraint, while the stop-band is realized by minimizing the variance of the projected data.
The estimated power can be used to produce a signal-to-noise ratio (SNR) map over a given temporal window, while the projected data provide time-course information at each location. We rank the candidate sources by their SNRs and filter out noisy ones by thresholding. There are other beamforming methods, such as vector minimum variance beamformers and minimum-norm beamformers (Sekihara and Nagarajan, 2010). Like the scalar version, the weighting vectors in the former are adaptive to the sensor observations, while those in the latter are not.
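The scan-and-rank recipe above can be written in a few lines of numpy. This is a sketch of the classical minimum variance beamformer, not the paper's exact estimator, with simulated composite lead field vectors:

```python
import numpy as np

def mv_power(C, x):
    """Minimum-variance power at a candidate location with composite lead
    field vector x: minimize w' C w subject to w' x = 1. The minimizer is
    w = C^{-1} x / (x' C^{-1} x), with minimized variance 1 / (x' C^{-1} x)."""
    Cinv_x = np.linalg.solve(C, x)
    return 1.0 / (x @ Cinv_x)

rng = np.random.default_rng(2)
n, J = 60, 2000
x_true = rng.standard_normal(n)                 # the true source's lead field
s = rng.standard_normal(J)                      # source time course
Y = np.outer(x_true, s) + 0.1 * rng.standard_normal((n, J))
C = Y @ Y.T / J                                 # sample sensor covariance

# Scan a handful of candidate locations and rank them by estimated power;
# the candidates here are random stand-ins for a grid over the source space.
candidates = [x_true] + [rng.standard_normal(n) for _ in range(9)]
powers = [mv_power(C, x) for x in candidates]
print(int(np.argmax(powers)))                   # 0: the true source ranks first
```

Thresholding the resulting power (or SNR) map then filters out the noisy candidates, as described in the text.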

Although significant progress has been made in assessing the performance of beamforming through simulations (e.g., Brookes et al., 2008), no rigorous statistical theory is available for examining the scope of a beamformer, namely what can and cannot be inferred about neuronal activity from beamforming. For example, although the sensor measurements are known to be linearly linked to the underlying neuronal activity via the high-dimensional lead field matrix, subject to random fluctuations, there is no general and mathematically sound framework for examining the extent to which the spatial dimension (i.e., the lead field matrix) and the temporal dimension (i.e., the temporal correlations of the sensor measurements) of a beamformer affect its accuracy in source localization and estimation. In particular, when there are multiple sources, the accuracy of beamforming is compromised by their confounding effects. The closer these sources are, the harder it is for the beamformer to localize and estimate them. It is natural to ask when a beamformer will break down in the presence of multiple nearby sources and how this effect is determined by the spatial and temporal dimensions of the beamformer.

Here, to address these issues, a more flexible beamformer is proposed based on a thresholded estimator of the sensor covariance. The proposed procedure reduces to the so-called unit-noise-gain minimum-variance beamformer of Sekihara and Nagarajan (2010) when the thresholding level is set to zero. A general framework is provided for the theoretical analysis of the proposed beamformers, and an asymptotic theory is developed on how the performance of the beamformer mapping is affected by its spatial and temporal dimensions. Simulation studies and an MEG data analysis are conducted to assess the sensitivity of the proposed procedure to the thresholding level. In particular, the proposed method is applied to a human MEG data set derived from a face-perception experiment. Two clusters of latent sources are predicted, which reacted differently to face and scrambled-face stimuli, as shown in Figure 1.
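The recipe can be sketched as follows: hard-threshold the entries of the sample sensor covariance, then form a unit-noise-gain minimum-variance index. The sketch below is our own reading of that recipe, assuming entrywise hard thresholding with the diagonal kept; it is not the paper's exact estimator:

```python
import numpy as np

def hard_threshold(C, tau):
    """Entrywise hard thresholding of a covariance estimate: off-diagonal
    entries with magnitude below tau are zeroed, the diagonal is kept."""
    T = np.where(np.abs(C) >= tau, C, 0.0)
    np.fill_diagonal(T, np.diag(C))
    return T

def ung_index(C, x):
    """Unit-noise-gain minimum-variance index at a location with composite
    lead field vector x: (x' C^{-1} x) / (x' C^{-2} x)."""
    Cinv_x = np.linalg.solve(C, x)
    return (x @ Cinv_x) / (Cinv_x @ Cinv_x)

rng = np.random.default_rng(3)
Y = rng.standard_normal((20, 500))
C = Y @ Y.T / 500                      # sample sensor covariance (toy data)
x = rng.standard_normal(20)            # an illustrative lead field vector

# With tau = 0 no entry is removed, so the procedure reduces to the plain
# unit-noise-gain minimum-variance beamformer, as stated in the text.
assert np.allclose(hard_threshold(C, 0.0), C)
print(np.isclose(ung_index(hard_threshold(C, 0.0), x), ung_index(C, x)))  # True
```

A positive thresholding level `tau` sparsifies the covariance estimate before inversion, which is the source of the extra flexibility.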

The rest of the article is organized as follows. The details of the proposed procedure are given in Section 2. The asymptotic properties of the proposed procedure are investigated in Section 3 and in the On-line Supplementary Material. The simulation studies and the real data analysis are presented in Section 4, and conclusions are drawn in Section 5. The proofs of the theorems and lemmas are deferred to Web Appendix A in the On-line Supplementary Material.

### 3. Theory


To make model (2) identifiable, we assume the following condition.

This assumption, which holds approximately in many applications (Sekihara and Nagarajan, 2010), is made for simplicity. Under model (2) and condition (A1), if the noises are white and uncorrelated across the sensors, then the sensor covariance matrix $\Sigma$ can be expressed as

- $\Sigma = \sum_{k=1}^{q} \sigma^2(r_k)\, L(r_k)\eta_k \{L(r_k)\eta_k\}^{T} + \sigma_\varepsilon^2 I_n$ (5)

where $\sigma^2(r_k)$ denotes the theoretical power at location $r_k$, $\sigma_\varepsilon^2$ is the background noise level, and $I_n$ is an $n \times n$ identity matrix. To simplify the theory, assuming that $\|L(r_k)\eta_k\| > 0$, we reparametrize model (2) as follows:

- $\mathbf{y}(t_j) = \sum_{k=1}^{q} \tilde{x}(r_k, \eta_k)\, \tilde{s}(r_k, t_j) + \varepsilon(t_j)$ (6)

where $\tilde{x}(r, \eta) = L(r)\eta / \|L(r)\eta\|$ and $\tilde{s}(r, t_j) = \|L(r)\eta\|\, s(r, t_j)$. Here, $\|\cdot\|$ stands for the Euclidean norm of a vector. For notational simplicity, we let $\tilde{x}_k$ and $\tilde{s}_k(t_j)$ stand for $\tilde{x}(r_k, \eta_k)$ and $\tilde{s}(r_k, t_j)$, respectively. Note that the SAM index is invariant under the above reparametrization. Although the original time-course and power are not invariant under the reparametrization, they can be recovered by multiplying by the appropriate power of the scaling factor $\|L(r)\eta\|$. We often find that the scaling factor, suitably normalized, tends to a limit for large *n*.
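The invariance just stated is easy to verify numerically: rescaling the composite lead field vector leaves a SAM-type (unit-noise-gain) index unchanged, while the minimum-variance power changes by the squared scaling factor. A small check, using a generic positive definite covariance and a generic vector rather than real MEG quantities:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40
A = rng.standard_normal((n, n))
C = A @ A.T + n * np.eye(n)      # a generic positive definite covariance
x = rng.standard_normal(n)       # stand-in for the composite vector L(r) eta
c = np.linalg.norm(x)            # scaling factor ||L(r) eta||
x_tilde = x / c                  # reparametrized, unit-norm lead field vector

def sam_index(C, v):
    Cinv_v = np.linalg.solve(C, v)
    return (v @ Cinv_v) / (Cinv_v @ Cinv_v)

def mv_power(C, v):
    return 1.0 / (v @ np.linalg.solve(C, v))

# The SAM-type index is invariant under the reparametrization ...
print(np.isclose(sam_index(C, x), sam_index(C, x_tilde)))        # True

# ... while the power is not: the original power is recovered from the
# reparametrized one via the squared scaling factor c**2.
print(np.isclose(mv_power(C, x), mv_power(C, x_tilde) / c**2))   # True
```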

In this section, we present an asymptotic analysis of the proposed SAM index when both *n* and *J* are sufficiently large. In practice, the number of sensors is fixed at a few hundred; allowing *n* to vary is an analytic device for identifying the spatial factors that affect the performance of a beamformer. We will show that the values of the proposed SAM index are much higher at source locations than at non-source locations. This implies that screening based on the proposed beamformers can eventually identify the latent sources if *n* and *J* are sufficiently large. We proceed in two steps. In the first step, we focus on the ideal situation where the sensor covariance matrix is known. In the second step, we investigate the asymptotic behavior of the proposed SAM index when the sensor covariance matrix is unknown and estimated from the sensor measurements at a finite number of time instants.
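The first step can be previewed with a toy calculation in which the population covariance is available in closed form. With a single source whose lead field norm grows like $\sqrt{n}$, a unit-noise-gain SAM-type index at the source grows linearly in *n*, while at a separated null location it stays at the noise floor. All numbers below are illustrative choices, not quantities from the paper:

```python
import numpy as np

def sam_index(C, v):
    """Unit-noise-gain SAM-type index (v' C^{-1} v) / (v' C^{-2} v)."""
    Cinv_v = np.linalg.solve(C, v)
    return (v @ Cinv_v) / (Cinv_v @ Cinv_v)

for n in (50, 200):
    sigma2, noise2 = 2.0, 1.0                  # source power, noise level
    x = np.ones(n)                             # composite lead field, ||x||^2 = n
    x0 = np.zeros(n); x0[:2] = (1.0, -1.0)     # a null location with x0 ⟂ x
    C = sigma2 * np.outer(x, x) + noise2 * np.eye(n)   # known covariance, as in (5)

    # At the source the index equals sigma2 * n + noise2 (growing with n);
    # at the separated null location it stays at the noise level noise2.
    print(np.isclose(sam_index(C, x), sigma2 * n + noise2),
          np.isclose(sam_index(C, x0), noise2))        # True True
```

The widening gap between the two values as *n* grows is exactly the source/non-source contrast that the asymptotic analysis formalizes.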

#### 3.1. Beamforming with Known Sensor Covariance

##### 3.1.1. Single-source case

##### 3.1.2. Multiple sources

We now turn to multiple sources, where there exist *q* unknown sources located at $r_1, \ldots, r_q$ with orientations $\eta_1, \ldots, \eta_q$, respectively. To show the consistency of the beamforming estimation, additional notation and regularity conditions on the composite lead field vectors are introduced as follows.

First, we introduce a notion of source separateness to describe the estimability of multiple sources. Let $\tilde{x}_k$ and $\tilde{x}_0$ denote the (scaled) composite lead field vectors at locations $r_k$ and $r_0$ with orientations $\eta_k$ and $\eta_0$, respectively. For simplicity of notation, write $\tilde{X} = (\tilde{x}_1, \ldots, \tilde{x}_q, \tilde{x}_0)$. To describe the spatial relationships among the composite lead field vectors, for each $k$, we define the partial coherence factor between $\tilde{x}_k$ and its predecessors, given by iteratively performing the so-called sweep operation on the Gram matrix $\tilde{X}^T \tilde{X}$ as follows:

where the dependence of the above notation on *n* has been suppressed. Note that this quantity gives the partial self-coherence of $\tilde{x}_k$ given the preceding lead field vectors (Goodnight, 1979); see A2 of Web Appendix A in the On-line Supplementary Material. We then define

- (7)

We impose the condition that the quantity in (7) stays bounded away from zero as $n \to \infty$, which is equivalent to requiring that, for each $k$, the maximum partial coherence of $\tilde{x}_k$ with its predecessors is bounded above by a constant less than one for large *n*. Under this condition, the partial self-coherences are all positive when *n* is large. Therefore, $\tilde{x}_1, \ldots, \tilde{x}_q$ are linearly independent for large *n*, because the inverse of the matrix $\tilde{X}^T \tilde{X}$ can be obtained by iteratively performing sweep operations on this matrix. We say that the source locations $r_1, \ldots, r_q$ are asymptotically separable if this condition holds as $n \to \infty$, and that a non-source location $r_0$ with orientation $\eta_0$ is asymptotically separable from the sources if the corresponding partial coherence condition holds at $r_0$. Secondly, to regularize the lead field matrix, we introduce further notation by letting

It follows from the definition that this quantity is bounded whenever, given the preceding lead field vectors, the partial regression coefficients of $\tilde{x}_k$ and $\tilde{x}_0$ with respect to them are bounded. In particular, if the minimum partial self-coherence is bounded away from zero as *n* tends to infinity, then the above partial regression coefficients are bounded. The following theorem shows that, when these partial regression coefficients are bounded, asymptotic separability is a necessary condition for the source powers to be consistently estimated. We state our general mapping theorem as follows.

1. Suppose that conditions (A1) and (A2) hold, together with the boundedness condition above. Then, as $n \to \infty$, the power estimator is asymptotically larger than the lower bound defined in equation (7); that is, the power estimator is not consistent. As $n \to \infty$, we have:

- For any non-null source location $r_k$, the power estimator and the SAM index at $r_k$ with orientation $\eta_k$ admit the expansions given in the On-line Supplementary Material, whose leading constants are bounded when *n* is sufficiently large.
- For any null-source location $r_0$ which is asymptotically separable from the sources and satisfies the regularity condition above, the power estimator and the SAM index at $r_0$ with orientation $\eta_0$ can be respectively expressed as

Note that Theorem 1 holds under the reparametrized model (6) and continues to hold under the original model (2) after adjusting the power estimators by the corresponding scaling factors. The theorem can also be extended to the setting with two stimuli. It follows from Theorem 1 that, at each separable non-null source location, the SAM index is asymptotically equal to the product of the number of sensors *n*, the source power, and the source coherence factor. In contrast, at a null-source location that is asymptotically separated from the non-null sources, the SAM index equals the lower bound up to the first asymptotic order. These facts suggest that the greater the number of sensors employed, the larger the contrast between non-null sources and null sources, and therefore the easier it is for the beamformer to localize the sources. In particular, when *n* is large, the localization bias is of smaller order in terms of the lead field distance.
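The partial coherence factors driving the separability conditions of Theorem 1 are obtained by iteratively sweeping a Gram matrix. A minimal sketch of the sweep operation of Goodnight (1979), and of reading off partial self-coherences for unit-norm lead field vectors, is given below (function names are ours, not the paper's):

```python
import numpy as np

def sweep(A, k):
    """One sweep operation (Goodnight, 1979) on a symmetric matrix, pivot k."""
    A = np.asarray(A, dtype=float).copy()
    d = A[k, k]
    row = A[k, :].copy()
    A -= np.outer(A[:, k], row) / d      # A[i,j] -= A[i,k] * A[k,j] / d
    A[k, :] = row / d
    A[:, k] = row / d
    A[k, k] = -1.0 / d
    return A

def partial_self_coherences(X):
    """For unit-norm columns x_1, ..., x_m, entry k is the residual squared
    norm of x_k after regressing on x_1, ..., x_{k-1}; positive entries
    certify that the columns are linearly independent."""
    G = X.T @ X
    out = [G[0, 0]]
    for k in range(1, X.shape[1]):
        G = sweep(G, k - 1)
        out.append(G[k, k])
    return np.array(out)

# Two unit-norm vectors with correlation 0.8: the second retains a partial
# self-coherence of 1 - 0.8**2 = 0.36 given the first.
X = np.array([[1.0, 0.8],
              [0.0, 0.6]])
print(partial_self_coherences(X))        # [1.   0.36]
```

Sweeping all pivots in turn inverts the Gram matrix, which is the fact used in the linear-independence argument above.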

#### 3.2. Beamforming with Estimated Sensor Covariance

In addition to conditions (A1) and (A2), we need the following two conditions to extend the asymptotic analysis to the case of unknown sensor covariance. The first is imposed to regularize the tail behavior of the sensor processes; the Gaussian iid processes considered in Bickel and Levina (2008) satisfy the following condition.