Comparison of three methods for generating group statistical inferences from independent component analysis of functional magnetic resonance imaging data

Abstract

Purpose

To evaluate the relative effectiveness of three previously proposed methods of performing group independent component analysis (ICA) of functional magnetic resonance imaging (fMRI) data.

Materials and Methods

Data were generated via computer simulation. Components were added to a varying number of subjects between 1 and 20, and intersubject variability was simulated for both the added sources and their associated time courses. Three methods of group ICA analysis were performed: across-subject averaging, subject-wise concatenation, and row-wise concatenation (i.e., concatenation across time courses).

Results

Concatenating across subjects provided the best overall performance in terms of accurate estimation of the sources and associated time courses. Averaging across subjects provided accurate estimation (R > 0.9) of the time courses when the sources were present in a sufficient fraction (about 15%) of 100 subjects. Concatenating across time courses was shown not to be a feasible method when unique sources were added to the data from each subject, simulating the effects of motion and susceptibility artifacts.

Conclusion

Subject-wise concatenation should be used when computationally feasible. For studies involving a large number of subjects, across-subject averaging provides an acceptable alternative and reduces the computational load. J. Magn. Reson. Imaging 2004;19:365–368. © 2004 Wiley-Liss, Inc.

INDEPENDENT COMPONENT ANALYSIS (ICA) has been previously proposed as a data-driven methodology for the analysis of functional magnetic resonance imaging (fMRI) data (1). It offers the advantages of not requiring accurate specification of the hemodynamic response function (HRF) and of making no assumptions about the nature of the noise. Hence it is an extremely flexible method, well adapted to real-life fMRI conditions of autocorrelated (red) noise and intersubject variability in the exact form of the HRF. While, as a data-driven method, ICA does not provide any built-in means of testing hypotheses and generating statistical inferences, methods have been proposed to test for significant active voxels (2) and task-related time courses (3).

Methods have also been proposed to perform group ICA analyses and to generate random-effects statistical inferences. The simplest and most intuitive method is to perform ICA on the data from each subject separately, and then perform random-effects analyses on the results (4). The difficulty is that an ad hoc, subjective matching of the component maps across subjects must be performed, and mismatched component maps will cause a severe loss of sensitivity. To overcome this difficulty, three methods have been proposed, listed here in order of increasing computational complexity: across-subject averaging (5), subject-wise concatenation (6), and row-wise concatenation (across time courses) (7). These methods are described in more detail below.

Assuming that there are n voxels in the brain, m points in each voxel time course, and N subjects, in the (single-subject) noisy ICA model the data for each subject, X, are assumed to be a linear mixing of p (p < m) independent components: X = S A + E, where X is the n × m data matrix, S is an n × p matrix whose columns are the p spatially independent sources, A is a p × m mixing matrix, and E is an n × m matrix of residuals, assumed to be noise. The ith row of A is the time course associated with the ith column of S. The ICA algorithm is explained in more detail elsewhere (1, 8–10). Typically the data are preprocessed by reducing the dimensionality to an n × p matrix via principal component analysis (PCA). Using either a gradient-ascent or a fixed-point iterative algorithm, the unmixing matrix W = A^+ (the pseudoinverse of A, so that S = X W) is then found, which corresponds, equivalently, to maximizing the likelihood (11), minimizing the mutual information (9), or minimizing the entropy (10).
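To make the notation concrete, a minimal sketch of this single-subject model and its recovery is given below. This is not the authors' code (the study used IDL routines); it uses NumPy and scikit-learn's FastICA, whose internal whitening performs the PCA reduction, and the matrix sizes and super-Gaussian source construction are illustrative choices.

```python
# Minimal sketch of the single-subject spatial ICA model X = S A + E and its
# recovery with FastICA (NumPy/scikit-learn; illustrative, not the study's IDL code).
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
n, m, p = 10_000, 100, 5                     # voxels, time points, sources

# Spatially independent, super-Gaussian sources: signed square of a Gaussian.
g = rng.standard_normal((n, p))
S_true = np.sign(g) * g**2                   # n x p source maps (columns)

A_true = rng.standard_normal((p, m))         # p x m mixing matrix (rows = time courses)
E = rng.standard_normal((n, m))              # Gaussian noise background
X = S_true @ A_true + E                      # n x m data matrix

# FastICA's whitening step performs the PCA reduction to p dimensions; the rows
# (voxels) are the samples, so the estimated components are spatial maps.
ica = FastICA(n_components=p, random_state=0, max_iter=1000)
S_est = ica.fit_transform(X)                 # n x p estimated spatial maps
A_est = ica.mixing_.T                        # p x m estimated associated time courses
```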

For across-subject averaging (5), the data are averaged across the N subjects, and the PCA reduction and ICA decomposition performed. While the component maps may be used directly, they do not provide any statistical inferences. However, the associated time courses found may be subsequently used in a conventional general linear model (GLM) approach (on a separate data set from the one used to find the time courses) in order to generate voxel-wise random-effects statistical inferences.
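A sketch of this procedure, under the same conventions as above, is given below (NumPy/scikit-learn rather than IDL; the function name and the final GLM step are illustrative, and X_list is assumed to hold the N spatially registered n × m subject data matrices).

```python
# Sketch of across-subject averaging: average the registered data over subjects,
# run one spatial ICA, and keep the group time courses for a later GLM.
import numpy as np
from sklearn.decomposition import FastICA

def group_ica_average(X_list, p, seed=0):
    X_mean = np.mean(X_list, axis=0)          # n x m averaged data
    ica = FastICA(n_components=p, random_state=seed, max_iter=1000)
    S = ica.fit_transform(X_mean)             # n x p group component maps
    A = ica.mixing_.T                         # p x m group time courses
    return S, A

# The rows of A can then serve as regressors in a per-subject GLM on a separate
# data set, e.g. beta_i = X_i @ np.linalg.pinv(A), for random-effects inference.
```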

For subject-wise concatenation (6), the data from the ith subject are reduced via PCA to an n × p_i matrix (the number of retained components may vary between subjects), and the data from all subjects are concatenated into an n × Σ p_i matrix. A second PCA reduction further reduces the data to an n × p matrix, and the ICA decomposition is then performed. To generate random-effects inferences, individual subject maps are back-reconstructed by partitioning the second PCA decomposition matrix into submatrices corresponding to each subject. The time courses for each subject, A_i, may be obtained via the least-squares estimate A_i = (S^T S)^(-1) S^T X_i, where X_i is the (unreduced) data matrix from the ith subject. Random-effects inferences may thus be generated for the time courses as well as for the voxel activation maps. The second PCA decomposition may be quite computationally intensive for studies with larger numbers of subjects; iterative PCA algorithms (12) may be useful.
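The two-stage reduction and the back-reconstruction of the time courses can be sketched as follows (again NumPy/scikit-learn with an illustrative function name, assuming a common number of retained components p_i per subject; back-reconstruction of the subject maps from the partitioned second-stage PCA matrix is omitted for brevity).

```python
# Sketch of subject-wise concatenation: per-subject PCA, concatenation along the
# reduced dimension, spatial ICA, then least-squares time courses
# A_i = (S^T S)^(-1) S^T X_i for each subject.
import numpy as np
from sklearn.decomposition import PCA, FastICA

def group_ica_concat_subjects(X_list, p_i, p, seed=0):
    # First-stage PCA: reduce each subject's time dimension to p_i components.
    reduced = [PCA(n_components=p_i).fit_transform(X) for X in X_list]   # n x p_i each
    X_cat = np.hstack(reduced)                # n x (N * p_i) concatenated matrix

    # Second-stage reduction to p components is handled by FastICA's whitening.
    ica = FastICA(n_components=p, random_state=seed, max_iter=1000)
    S = ica.fit_transform(X_cat)              # n x p group component maps

    # Back-reconstruct each subject's time courses from the unreduced data.
    pinv_S = np.linalg.pinv(S)                # (S^T S)^(-1) S^T, p x n
    A_subj = [pinv_S @ X for X in X_list]     # each p x m
    return S, A_subj
```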

For row-wise concatenation (7), the data from all subjects are row-wise concatenated into an nN × m matrix, PCA reduced to an nN × p matrix, and the ICA decomposition performed. A single set of associated time courses is generated, and individual subject maps are obtained by row-wise partitioning of the nN × p source matrix into N submatrices of n rows each, so that random-effects analyses may be performed. The method is highly computationally intensive; however, stochastic (9) or batch-mode (10) versions of ICA algorithms may be used to ease the computational load.
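A corresponding sketch of row-wise concatenation is given below (illustrative function name; the stacked ICA is written naively and would be the computational bottleneck in practice).

```python
# Sketch of row-wise concatenation: stack the subjects' voxels as additional rows,
# run one spatial ICA, and partition the stacked source matrix by subject.
import numpy as np
from sklearn.decomposition import FastICA

def group_ica_concat_rows(X_list, p, seed=0):
    n = X_list[0].shape[0]                    # voxels per subject
    X_cat = np.vstack(X_list)                 # (n*N) x m stacked data
    ica = FastICA(n_components=p, random_state=seed, max_iter=1000)
    S_cat = ica.fit_transform(X_cat)          # (n*N) x p stacked spatial maps
    A = ica.mixing_.T                         # p x m single set of time courses
    S_subj = [S_cat[i*n:(i+1)*n] for i in range(len(X_list))]   # per-subject maps
    return S_subj, A
```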

MATERIALS AND METHODS

All simulations were performed via routines written in IDL (Research Systems, Inc., Boulder, CO). Group fMRI data were simulated for a data set of 20 subjects, 10,000 voxels in the brain, and 100 time points. A zero-mean Gaussian noise background with unity standard deviation was generated. Simulated independent components were generated using the square of a Gaussian distribution, keeping the original sign. The simulated source distributions were thus highly super-Gaussian, corresponding to the empirically observed super-Gaussian nature of sources present in fMRI data. The source distributions were normalized to a standard deviation of 0.5. The associated time courses for all sources were generated from a zero-mean Gaussian distribution with a standard deviation of unity. The scaling factors were chosen so that at the tails of the source distributions (P < 0.01), the strength of the simulated blood oxygenation level-dependent (BOLD) signal intensity was approximately twice the background noise level, an empirically observed typical level for fMRI scans performed at 3 T. To model intersubject variability, Gaussian noise with a standard deviation of 0.25 was added to each source and associated time course. Twenty sources were simulated and added to a varying number of subjects between 1 and 20. Each method of group ICA analysis was then performed. The data were reduced to 25 components via PCA before the fastICA algorithm (10) was employed; for the subject-wise concatenation, the data were also reduced to 25 components per subject prior to concatenation. A total of 25 components were kept at each PCA reduction stage, rather than the 20 known to be present in the data, in order to simulate real-life conditions that might dictate retaining more principal components than the (unknown) number of sources present. The simulation was repeated 200 times. The effectiveness of each method was scored by matching each original component to the found component with the lowest mean-squared error (after appropriate scaling), and then calculating, as a function of the number of subjects containing the given independent component, the average mean-squared error between the original and found components and the average cross-correlation between the original and found associated time courses.
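The data-generation step described above can be sketched as follows (NumPy rather than the original IDL routines; the numerical values follow the text, while the assignment of sources to subjects and the variable names are illustrative).

```python
# Sketch of the simulated group data: Gaussian noise background plus super-Gaussian
# sources with Gaussian time courses and intersubject variability (SD 0.25).
import numpy as np

rng = np.random.default_rng(0)
N, n, m, K = 20, 10_000, 100, 20              # subjects, voxels, time points, sources

def supergaussian_map(size, sd=0.5, rng=rng):
    g = rng.standard_normal(size)
    s = np.sign(g) * g**2                     # square of a Gaussian, keeping the sign
    return sd * s / s.std()                   # normalize to the requested SD

S_group = np.column_stack([supergaussian_map(n) for _ in range(K)])   # n x K maps
A_group = rng.standard_normal((K, m))                                 # K x m time courses

# Illustrative assignment: source k is added to the first k+1 subjects, so the
# number of subjects containing a given component ranges from 1 to 20.
data = []
for i in range(N):
    X = rng.standard_normal((n, m))           # unit-SD Gaussian noise background
    for k in range(K):
        if i <= k:
            S_ik = S_group[:, k] + 0.25 * rng.standard_normal(n)   # subject-specific map
            A_ik = A_group[k] + 0.25 * rng.standard_normal(m)      # subject-specific time course
            X += np.outer(S_ik, A_ik)
    data.append(X)
```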

The simulation was then repeated, but with five additional components generated and added to the data from each subject in the same manner as above. These components were added in order to model a different source of intersubject variability, namely, individual subjects having components not present in the others; empirically, such unique sources are often present in fMRI data because of subject-specific motion and susceptibility artifacts. Finally, in order to simulate the performance of the group ICA algorithms in a study with a larger N, the simulation including the unique sources was repeated with N = 100, for across-subject averaging and subject-wise concatenation only.

RESULTS

For the simulation without any individual unique components added (Fig. 1), the performance of all three methods was generally comparable, although when the sources were present in fewer subjects (<10), subject-wise concatenation estimated the component maps significantly better. When the simulation was repeated with the individual unique components (Fig. 2), however, the results were dramatically different. Row-wise concatenation performed very poorly even for the independent component present in all 20 subjects, while subject-wise concatenation performed almost as well as in the simulation without unique components. While significant degradation was seen in the performance of across-subject averaging, the time courses were still estimated very accurately (R > 0.9) for the components present in 10 or more subjects. This high degree of accuracy in the estimation of the time courses would make across-subject averaging a feasible method, since the time courses would then be used in a subsequent random-effects GLM analysis to determine activated voxels. When the number of subjects was increased to 100 (Fig. 3), significant degradation in the performance of subject-wise concatenation, relative to the previous simulation with 20 subjects, was seen only for components present in very few (<5) subjects. The performance of across-subject averaging was also degraded relative to the previous simulation; however, the time courses were still estimated accurately (R > 0.9) for the components present in 15 or more subjects.

Figure 1.

Comparison of the accuracy of three methods of group ICA of simulated fMRI data as a function of the number of subjects containing the given component. MSE, mean-squared error between original and estimated sources; average CC, average cross-correlation value between original and estimated associated time courses.

Figure 2.

Comparison of the accuracy of three methods of group ICA of simulated fMRI data as a function of the number of subjects containing the given component, with five unique sources added to the data from each subject. MSE, mean-squared error between original and estimated sources; average CC, average cross-correlation value between original and estimated associated time courses.

Figure 3.

Comparison of the accuracy of across-subject averaging with subject-wise concatenation of group ICA of simulated fMRI data as a function of the number of subjects containing the given component, with five unique sources added to the data from each subject, and with data generated from 100 total subjects. MSE, mean-squared error between original and estimated sources; average CC, average cross-correlation value between original and estimated associated time courses.

DISCUSSION

The results of the simulations are not surprising when the details of each procedure are taken into account. Subject-wise concatenation should perform the best in most cases in terms of estimating the component maps, since that algorithm allows a component to be estimated effectively from only the subset of subjects in which the source is present (i.e., the unmixing matrix elements corresponding to the data from the other subjects are close to zero). Furthermore, the performance of the algorithm is not greatly affected by the presence of unique sources in the individual subjects. For across-subject averaging, however, data from the other subjects have been averaged into the data set on which ICA is performed, so more noise is present in the data. Without the unique sources present, this noise mainly affects the accuracy of estimating the source maps, since the increased noise is averaged into the final result; the accuracy of estimating the time courses, however, is not significantly degraded, since the increased noise does not change the optimum value of the unmixing matrix in terms of maximizing the spatial independence (or non-Gaussianity) of the sources. Similar considerations apply to row-wise concatenation, although there the data are averaged to estimate the sources after the ICA decomposition rather than before it. The situation changes drastically, however, when the unique individual sources are added. For across-subject averaging, the added sources will, to some extent by the central limit theorem, average each other out and merely be present as added Gaussian noise. This appears to be the case for the results in Fig. 2 when the source is present in >10 subjects, as the accuracy in estimating the time courses is comparable to that of subject-wise concatenation. The accuracy in estimating the time courses falls rapidly as the number of subjects with a given source falls below 10, as the non-Gaussianity of the given source begins to compete with the residual non-Gaussianity of the unique sources. The performance of row-wise concatenation is extremely poor because the non-Gaussianity of the unique sources is not averaged across subjects, and thus each of the unique sources is able to contribute to the total non-Gaussianity of the found source.

It is possible that the presence of unique sources may not pose quite as serious a problem for group ICA analyses of fMRI data as it does in the simulations. For instance, activated voxels due to motion artifacts may be present in similar locations across subjects, e.g., at boundaries with areas outside the head or with cerebrospinal fluid (CSF). Moreover, the magnitude of the spurious activation might not be as large as that of real BOLD activation. However, even for the simulation without the unique sources, subject-wise concatenation performed better than row-wise concatenation. The high computational demand, coupled with the lack of robustness to the presence of unique sources, makes row-wise concatenation an infeasible method in practice for performing group ICA analyses.

Subject-wise concatenation performed the best overall and should be the preferred method when the computational load is manageable. It should be noted, however, that the intersubject variability in the associated time courses may render the interpretation of the results more difficult. For studies with large numbers of subjects, for which the computational load may be intractable, across-subject averaging provides an attractive alternative. For the simulation with 100 subjects, accurate (R > 0.9) estimation of the associated time courses was obtained even for sources present in as few as 15 subjects, or 15% of all subjects. Since the same degree of accuracy was obtained in the simulation with 20 subjects for sources present in 10, or 50%, of the subjects, it is reasonable to assume that the fraction of subjects with a given source necessary to provide accurate estimation of the time course will decrease as a function of the number of subjects in the study. Another advantage of across-subject averaging is that the associated time courses do not vary between subjects, making across- and between-subject random-effects analyses readily feasible using standard GLM procedures.

In conclusion, three methods of performing group ICA analyses of fMRI data were compared using computer simulation. Subject-wise concatenation produced the best overall performance, although that procedure potentially poses a large computational demand in studies with many subjects. For such studies, across-subject averaging estimated the associated time courses with high accuracy when the sources were present in a sufficient fraction of subjects, and is thus a viable method, since the time courses may then be used in a subsequent GLM. Row-wise concatenation was shown to be an infeasible method, due both to its very large computational demand and to its lack of robustness when unique sources are present in individual subjects.
