Presurgical Language fMRI in Patients with Drug-resistant Epilepsy: Effects of Task Performance


Address correspondence and reprint requests to Dr. B. Weber at Department of Epileptology, University of Bonn, Sigmund-Freud Str. 25, 53105 Bonn, Germany. E-mail:


Summary: Purpose: To determine whether language functional magnetic resonance imaging (fMRI) before epilepsy surgery can be similarly interpreted in patients with greatly different performance levels.

Methods: An fMRI paradigm using a semantic decision task with performance control and a perceptual control task was applied to 226 consecutive patients with drug-resistant localization-related epilepsy during their presurgical evaluations. The volume of activation and lateralization in an inferior frontal and a temporoparietal area was assessed in correlation with individual performance levels.

Results: We observed differential effects of task performance on the volume of activation in the inferior frontal and the temporoparietal region of interest, but performance measures did not correlate with the lateralization of activation.

Conclusions: fMRI, as applied here, can be interpreted in a similar way with regard to language lateralization in patients with a wide range of cognitive abilities.

Presurgical language lateralization and mapping based on functional magnetic resonance imaging (fMRI) has become a routine clinical tool (1–3). Nevertheless, only a few studies have investigated the reliability and validity of language fMRI in large samples (4–9). For instance, it is well known from studies in healthy subjects that a systematic relation exists between cortical activation and task performance or task difficulty in language-related paradigms.

One study by Booth and colleagues (10) showed a positive correlation of task performance and activation levels in areas of the neurocognitive processing route required for the task (i.e., stronger activation in the fusiform gyrus for a visual word-spelling task and stronger activation in the superior temporal gyrus for an auditory rhyming task). A study by Chee and colleagues (11) examined cortical activation levels in Chinese–English bilinguals with differing proficiencies in each language. In this study, better-performing subjects showed a decrease in inferior frontal language area activation in addition to a decrease in left parietal activation. It was hypothesized that better-tuned neuronal networks underlie the enhanced performance, and therefore less neuronal activity is necessary to perform the task. Dräger and colleagues (12) applied a word-retrieval task and examined the effect of task difficulty on activation levels. No effect of task difficulty was observed in language-related areas; instead, parietal activation increased with increased task demands.

Other studies examined the effect of performance on cortical activation in children and adults. Schlaggar and colleagues (13) found a negative correlation of cortical activity with performance in left inferior frontal regions in children unmatched for performance measures with adults. Brown and colleagues (14) found performance-related activity mainly in the medial parietal cortex, posterior cingulate, and occipital cortex in the comparison of performance-matched and nonmatched children and adults. Additionally, a study by Wood and colleagues (15) showed a decreased activation in left frontal regions in children compared with adults in a verb-generation and a lexical retrieval task, which was interpreted as differing cognitive abilities underlying the task performance. Interestingly, no effect of performance on the lateralization was observed. Hence, performance is a key determinant of the activation pattern, and fMRI results of differentially performing patients might not be comparable in their clinical consequence.

We tackled this issue here by correlating performance measures and brain activity in 195 consecutive patients from our epilepsy surgery program, a patient group with a high variability of cognitive performance levels (16). By doing so, we aimed at answering the following questions: Do performance measures correlate with cortical activity in our protocol? If reliable correlations exist, are they clinically relevant? Our hypotheses were as follows: (a) If the activation in the inferior frontal and temporoparietal areas is related to task difficulty, we would expect less activation in the respective areas in better-performing patients, owing to better-tuned neuronal networks. Should, however, the demand on the processing increase, we would expect a stronger activation in the respective area. (b) Stronger demands could lead to a compensatory recruitment of contralateral homologues of left hemispheric language areas, which would lead to a decrease of lateralization; conversely, demands could increase the activation in the ipsilateral language areas and thereby enhance lateralization. (c) The third possibility would be that activation in areas outside of classic language areas would increase with stronger demands, leaving activation in classic language areas unchanged.


Between 1999 and 2004, 226 consecutive patients with drug-resistant epilepsy were investigated with language fMRI during their presurgical evaluations. Twenty-eight patients were excluded from this study because of excessive movement artifacts (more than twice the voxel size), seizures during scanning, large lesions interfering with normalization, or a mean T-value of the activation <2. Another three patients were excluded because they apparently misunderstood the task instructions, as they gave more than twice as many incorrect as correct answers, leaving 195 patients included in the study (104 men, 91 women). Table 1 gives an overview of their demographic and clinical data. Written informed consent was obtained according to the Medical Ethics Committee of the University of Bonn and the Declaration of Helsinki (1991).

Table 1. Demographic and clinical data

Demographic variables       All (n = 195)   WGRT (n = 49)   BGRT (n = 49)   WGACC (n = 48)   BGACC (n = 33)
Handedness (EHI)            64.74           63.87           73.41           76.72            74.61
Age (yr)                    35.91           35.60           38.10           31.29            38.52
Age at first seizure (yr)   15.21           14.24           17.24           11.69            19.79
Education (yr)              11.51           10.11           12.89           9.71             13.85
                            (3.62)          (2.98)          (3.38)          (3.17)           (3.07)
Site of lesion
Left hemisphere             117 (60%)       33 (67%)        22 (45%)        33 (67%)         18 (55%)
Right hemisphere            51 (26%)        9 (18%)         20 (41%)        11 (23%)         11 (33%)
Both hemispheres            6 (3%)          2 (4%)          2 (4%)          0                2 (6%)
No MR lesion                21 (11%)        5 (10%)         5 (10%)         4 (10%)          2 (6%)
Temporal lobe               120 (62%)       28 (57%)        33 (67%)        27 (56%)         27 (82%)
Extratemporal               54 (27%)        12 (24%)        11 (22%)        13 (27%)         3 (9%)
Unknown/no lesion           21 (11%)        9 (18%)         5 (10%)         8 (17%)          3 (9%)

  1. Means and standard deviation, number and percentage are shown, respectively.

  2. WGRT, worst-performing group, BGRT, best-performing group according to reaction times; WGACC, worst-performing group, and BGACC, best-performing group according to accuracy; EHI, Edinburgh Handedness Inventory.

Subjects lay in a supine position with their heads stabilized by an individually molded vacuum cushion. Stimuli were back-projected onto a translucent screen positioned opposite the magnet bore by using an LCD projector. Subjects viewed the stimuli by way of a mirror mounted on the head coil. While undergoing fMRI, they were presented with a series of item pairs, either word pairs or consonant string pairs. Both constituents of each pair were simultaneously presented for 4 s, above and below a central fixation cross. During the activation scan lasting 12 min 30 s, an active condition (semantic decision task) and a control condition (letter-matching task) alternated every 25 s, so that six item pairs were presented for each of 30 half-cycles, one every 4.125 s, with an additional 0.25 s of fixation between each half-cycle. The verbal stimuli of the semantic condition were 180 common German nouns, ranging in length from five to 11 letters. Words were selected to form 45 word pairs comprising words with identical or highly similar meanings (synonyms) and 45 word pairs comprising semantically unrelated words. The unpronounceable consonant strings of the control condition were developed by using a pseudorandom algorithm to represent 90 pairs, 45 pairs of two identical strings and 45 pairs in which one letter was different between the two constituents of one pair. Strings were matched with words with regard to the number of letters. Pairs of synonyms and unrelated words as well as identical and different consonant strings were randomly intermixed across subjects within each condition. In the semantic condition, subjects were required to decide whether the words of a pair were synonyms or semantically unrelated. In the control condition, subjects matched the pair of consonant strings. 
They were asked to push a button of a fiberoptic control pad with the index finger of the dominant hand if they saw a pair with two synonyms (semantic condition) or with two identical letter strings (control condition).
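The timing figures given above can be checked against one another with a few lines of arithmetic. The following is a minimal sketch in Python; the variable names are illustrative, and the assumption that the 30 half-cycles are split evenly between the two conditions follows from the alternating design.

```python
# Timing constants taken from the task description.
PAIRS_PER_HALF_CYCLE = 6
PAIR_ONSET_INTERVAL_S = 4.125   # one item pair every 4.125 s
EXTRA_FIXATION_S = 0.25         # fixation added between half-cycles
N_HALF_CYCLES = 30
N_PAIRS_PER_CONDITION = 90      # 45 + 45 pairs in each condition

# Duration of one half-cycle and of the whole activation scan.
half_cycle_s = PAIRS_PER_HALF_CYCLE * PAIR_ONSET_INTERVAL_S + EXTRA_FIXATION_S
total_s = N_HALF_CYCLES * half_cycle_s
minutes, seconds = divmod(total_s, 60)

# Assumption: conditions alternate, so each gets half of the half-cycles.
pairs_shown_per_condition = (N_HALF_CYCLES // 2) * PAIRS_PER_HALF_CYCLE

print(half_cycle_s)               # 25.0
print(int(minutes), seconds)      # 12 30.0
print(pairs_shown_per_condition)  # 90
```

Under these assumptions, the half-cycle length (25 s), the scan duration (12 min 30 s), and the number of item pairs per condition (90) are mutually consistent.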

For fMRI, we acquired 248 T2*-weighted, gradient-echo EPI-scans with 16 axial slices (slice thickness, 6 mm; interslice gap, 0.6 mm; matrix size, 64 × 64; field of view, 220 mm; echo time, 50 ms; repetition time, 3.125 s) oriented along the AC–PC line and, for structural MRI, a T1-weighted 3D-FLASH sequence (number of slices, 120; slice thickness, 1.5 mm (no interslice gap); matrix size, 256 × 256; field of view, 230 mm; echo time, 4 ms; repetition time, 11 ms).

MRI data were analyzed by using SPM99. The processing was performed with the following steps: (a) registration of the motion-correction parameters; (b) calculation of the parameters for the normalization onto the MNI atlas based on the first EPI scan using the EPI template with the default values for nonlinear corrections; (c) realignment and normalization using the sinc interpolation algorithm; (d) smoothing of the normalized images with a Gaussian kernel (FWHM, 7 mm); (e) modeling of the expected hemodynamic response with the appropriate block design, convolved with the hemodynamic response function (hrf) as the basic approach; (f) filtering of the time series with the hrf as a low-pass filter and the suggested value of 106 s for the cut-off period of a high-pass filter; (g) correction for global signal drifts by intensity normalization; and (h) application of individual thresholds to the estimated t-test maps and suppression of activation clusters with <10 pixels. Processing of the anatomic images required the following steps: (a) normalization of the anatomic volume onto the MNI atlas by using the T1 template; (b) segmentation of the resulting data set; and (c) brain-surface rendering using the white- and gray-matter compartments. Finally, semantic > control activation maps were overlaid onto individual brain surfaces.
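The block-design modeling in step (e) can be illustrated with a minimal Python sketch. It is not the SPM99 implementation: the double-gamma response shape and its parameters, the 32-s kernel length, and the choice of starting with a control block are all assumptions made for illustration. The sketch covers the 240 scans spanned by 30 half-cycles of 25 s at a repetition time of 3.125 s; how the remaining scans of the 248 acquired were handled is not stated in the text.

```python
import math

TR_S = 3.125            # repetition time from the acquisition protocol
HALF_CYCLE_S = 25.0     # duration of one task or control block
N_HALF_CYCLES = 30
SCANS_PER_BLOCK = int(HALF_CYCLE_S / TR_S)   # 8 scans per block


def gamma_pdf(t, shape):
    """Gamma density with unit scale; zero for t <= 0."""
    if t <= 0:
        return 0.0
    return t ** (shape - 1) * math.exp(-t) / math.gamma(shape)


def hrf(t):
    """Assumed double-gamma response: peak near 5 s, small late undershoot."""
    return gamma_pdf(t, 6.0) - gamma_pdf(t, 16.0) / 6.0


# Boxcar: 0 during control blocks, 1 during semantic blocks
# (assumption: the run starts with a control block).
boxcar = [
    float(block % 2)
    for block in range(N_HALF_CYCLES)
    for _ in range(SCANS_PER_BLOCK)
]

# Response function sampled once per TR over an assumed 32-s window.
kernel = [hrf(i * TR_S) for i in range(int(32 / TR_S) + 1)]

# Discrete convolution, truncated to the length of the scan series.
regressor = [
    sum(kernel[k] * boxcar[n - k] for k in range(len(kernel)) if n - k >= 0)
    for n in range(len(boxcar))
]
```

The resulting `regressor` is the predicted signal time course that would be fitted to each voxel's time series; it stays at zero during the first control block and rises a few seconds after the first semantic block begins.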

To quantify individual activation maps and to derive laterality indices, we first applied a statistical threshold for each patient individually, which was determined objectively by adjusting the threshold to the individual activation level (17). This was achieved by first calculating a mean maximal t value, defined as the mean of those 5% of voxels showing the highest level of activation. Voxels with a t value >50% of this mean maximal t value were included in the calculation of the laterality indices (LIs) and the volume of activation. To increase functional specificity, we assessed these measures within two functionally defined regions of interest (ROIs) associated with the clinically most relevant language areas: a left inferior frontal ROI and a left temporoparietal ROI (Fig. 1). The ROIs were based on statistical parametric maps of 12 healthy subjects with a region-growing algorithm starting from the local activation maximum. The resulting ROIs encompassed 15.9 cc for the inferior frontal (Broca) area (including parts of the Brodmann areas 44, 45, 46, and 47) and 30.0 cc for the temporoparietal area (including parts of the Brodmann areas 22, 39, and 40). The symmetrical masks used were generated by adding mirror images. For a detailed description of the procedure, see Fernández et al. (18). For LI assessment, we compared the activation in these left hemispheric ROIs with the activation in homologous ROIs in the right hemisphere by using the formula:


\[ \mathrm{LI} = \frac{\sum_{V} X_L - \sum_{V} X_R}{\sum_{V} X_L + \sum_{V} X_R} \]

where $V$ is the set of activated voxels, $X_L$ is the t value of left hemispheric voxels, and $X_R$ is the t value of right hemispheric voxels.
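The individual thresholding and the laterality index can be sketched together in Python. This is a minimal illustration, not the authors' code. It assumes that the LI is the normalized difference of summed suprathreshold t values in the left and right ROIs (positive values indicating left-hemispheric dominance) and that the top 5% of voxels are taken over all voxels passed in as `all_t`; the function names and the toy data are hypothetical.

```python
def mean_max_t(t_values, top_fraction=0.05):
    """Mean of the top `top_fraction` of t values (at least one voxel)."""
    ranked = sorted(t_values, reverse=True)
    n_top = max(1, int(len(ranked) * top_fraction))
    return sum(ranked[:n_top]) / n_top


def laterality_index(left_t, right_t, all_t, fraction_of_max=0.5):
    """LI from summed suprathreshold t values in homologous left/right ROIs."""
    # Individual threshold: 50% of the mean maximal t value.
    threshold = fraction_of_max * mean_max_t(all_t)
    sum_left = sum(t for t in left_t if t > threshold)
    sum_right = sum(t for t in right_t if t > threshold)
    if sum_left + sum_right == 0:
        return None  # no activated voxels in either ROI
    return (sum_left - sum_right) / (sum_left + sum_right)


# Toy data: t values in a left ROI, its right homologue, and elsewhere.
left_roi = [6.0, 5.0, 4.0, 1.0]
right_roi = [4.0, 2.0, 1.0, 0.5]
elsewhere = [0.2] * 12
whole_brain = left_roi + right_roi + elsewhere   # 20 voxels

li = laterality_index(left_roi, right_roi, whole_brain)
print(round(li, 3))  # 0.579
```

In this toy example the mean maximal t value is 6.0, so the threshold is 3.0; three left voxels (sum 15) and one right voxel (sum 4) survive, giving an LI of 11/19. Because the threshold scales with each patient's own activation level, a patient with uniformly weaker activation would obtain the same LI.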

Figure 1.

Histogram of the patients according to performance levels (percentage of correct responses in the semantic condition).

For further analysis of the effect of poor and good performance, patients were divided into four groups of approximately equal size selected according to their respective reaction times or accuracy, each consisting of 33 to 49 patients. We always analyzed performance effects based on both accuracy and reaction time to avoid a possible bias. However, we did not expect large differences, because accuracy and reaction time were closely correlated, with slower responses going along with lower accuracy. The two groups with fastest (n = 49 for BGRT; RT < 1,652 ms) and slowest performance (n = 49 for WGRT; RT > 2,747 ms) as well as with worst and best accuracy (n = 48 for WGACC, <30 correct responses; n = 33 for BGACC, >41 correct responses) were directly compared in further analyses. This division made it possible to use statistical tests for a direct comparison of patients with poor and good performance. Because of nonnormal distribution, we chose the Mann–Whitney U test and Spearman's correlation coefficient instead of parametric tests.
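Both nonparametric statistics named above operate on ranks rather than raw values, which is why they suit nonnormally distributed data. The following Python sketch computes Spearman's coefficient and the Mann–Whitney U statistic from scratch; it is an illustration only, it does not compute p values, and the reaction times and voxel counts are hypothetical values, not patient data.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def mann_whitney_u(group_a, group_b):
    """U for group_a: number of (a, b) pairs with a > b, ties counting 0.5."""
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[: len(group_a)])
    return rank_sum_a - len(group_a) * (len(group_a) + 1) / 2


# Hypothetical values: reaction times (ms) and activated voxel counts.
reaction_ms = [1500, 1700, 2100, 2600, 2900, 3100]
voxel_count = [40, 35, 55, 50, 70, 65]

rho = spearman_rho(reaction_ms, voxel_count)
# Fastest three patients versus slowest three patients.
u = mann_whitney_u(voxel_count[:3], voxel_count[3:])
print(round(rho, 3), u)  # 0.829 1.0
```

A U value close to zero, as here, means that almost every patient in the first group has a smaller value than every patient in the second; a significance test would compare U against its distribution under the null hypothesis.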


Behavioral data

Table 2 displays the behavioral data in the semantic and the perceptual control condition for all patients and additionally for the groups of patients with best and worst performance. As can be seen from Fig. 1, our patients covered a large spectrum of performance levels, including patients performing at or close to ceiling as well as at chance level (i.e., hits minus false alarms ≈ 0). Reaction times and accuracy were negatively correlated in both the semantic (r = −0.564; p < 0.001) and the perceptual condition (r = −0.494; p < 0.001). In addition, the behavioral data in the semantic and in the control condition were positively correlated with each other regarding reaction times (r = 0.346; p < 0.001) and the number of correct responses (r = 0.416; p < 0.001).

Table 2. Behavioral data

                                 All           WGRT          BGRT          WGACC         BGACC
Number of patients               195           49            49            48            33
Semantic task
 RT ± SD (ms)                    2,140 ± 449   2,748 ± 292   1,653 ± 129   2,486 ± 445   1,804 ± 256
 No. of correct responses ± SD   35.9 ± 9.3    30.1 ± 9.7    41.6 ± 5.0    23.2 ± 8.5    44.5 ± 0.5
Perceptual control task
 RT ± SD (ms)                    2,782 ± 402   3,035 ± 460   2,640 ± 333   2,734 ± 495   2,828 ± 293
 No. of correct responses ± SD   34.2 ± 10.4   28.6 ± 10.1   37.6 ± 8.3    29.6 ± 11.5   39.0 ± 5.6

  1. WGRT, slowest-performing group, BGRT, fastest-performing group according to reaction times; WGACC, worst-performing group, and BGACC, best-performing group according to accuracy.

Correlation of laterality indices and performance

A correlation analysis of the performance level (accuracy and reaction time) with the regional LIs both in the inferior frontal and the temporoparietal ROIs revealed no reliable linear association (max r= 0.008; min p = 0.388). Direct comparison of the group of patients with best performance with the patients with poorest performance showed no difference in laterality indices for the inferior frontal (BGRT, 0.666 ± 0.555; WGRT, 0.744 ± 0.428; p = 0.731) or the temporoparietal (BGRT, 0.571 ± 0.530; WGRT, 0.587 ± 0.537; p = 0.677) area. Analyzing the groups according to accuracy measures showed similar results (inferior frontal area: BGACC, 0.601 ± 0.611; WGACC, 0.767 ± 0.448; p = 0.188; temporoparietal area: BGACC, 0.545 ± 0.560; WGACC, 0.688 ± 0.488; p = 0.187).

Correlation of regional activity volumes and performance

Figure 2 shows the correlation of both reaction times and the number of correct responses in the semantic condition with the number of significantly activated voxels in the inferior frontal and temporoparietal ROIs. With increasing reaction times, the number of activated voxels in the inferior frontal ROI increased (Fig. 2A; r= 0.141; p = 0.049). The opposite effect was observed in the temporoparietal ROI (Fig. 2B; r=−0.156; p = 0.030). Given that reaction time and accuracy were negatively correlated, it is not surprising that the accuracy in the semantic decision task was negatively correlated with the number of activated voxels in the inferior frontal ROI (Fig. 2C; r=−0.121; p = 0.034) and positively correlated with the number of activated voxels in the temporoparietal ROI (Fig. 2D; r= 0.165; p = 0.022). Direct comparison of the number of activated voxels for WGRT with the number of voxels for BGRT showed that the volume of activation in the inferior frontal ROI was significantly smaller for the latter group (p = 0.015), whereas the temporoparietal ROI exhibited the opposite effect (p = 0.042). The same holds true for groups divided according to accuracy measures (inferior frontal, p = 0.013; temporoparietal, p = 0.023).

Figure 2.

Regions of interest depicted on a three-dimensional rendering of a normalized high-resolution brain image overlaid with a box illustrating the approximate position of the acquired functional MR images. Scatterplots of task performance and volume of activations with linear fit lines. Correlation of reaction times with the number of significantly activated voxels in the inferior frontal (A) and temporoparietal region of interest (ROI) (B). Correlation of accuracy (percentage correct responses) with the number of significantly activated voxels in the inferior frontal (C) and temporoparietal ROI (D).

Effects of clinical and demographic data on laterality indices

A regression analysis of the demographic and clinical data (i.e., age, age at onset of epilepsy, years of education, sex, side of lesion, handedness) revealed a strong effect of handedness on the laterality index in the Broca area (r= 0.313; p < 0.001) and the temporoparietal area (r= 0.317; p < 0.001), with left-handedness being associated with more right-sided language dominance. The side of the lesion exhibited a tendency to influence the laterality index, with left-sided lesions being associated with more right-hemispheric dominance in the Broca area (r= 0.150; p = 0.058). No reliable association of the side of the lesion with the temporoparietal ROI (r= 0.145; p = 0.133) was observed. Neither the other demographic nor the clinical data revealed a significant effect on the laterality indices (min p = 0.149).

Effects of demographic and clinical data on regional activity

A correlation analysis of age, age at onset of epilepsy, duration of epilepsy, and handedness revealed only a negative correlation of the number of activated voxels in the inferior frontal ROI and age (r = −0.176; p = 0.016). None of the other factors influenced the volumes of activation in either ROI (min p = 0.156). The same analysis was performed to evaluate the influence of the demographic factors on performance levels. Age was positively correlated with the number of correct answers in the semantic task (r = 0.242; p < 0.001); education also was positively correlated with correctness (r = 0.401; p < 0.001) and negatively correlated with reaction times (r = −0.313; p < 0.001). None of the other factors correlated with the performance levels in the semantic decision task. In the perceptual control task, no significant influence of any demographic factor was observed. Only age showed a tendency toward a positive correlation with reaction times in the control task (r = 0.126; p = 0.08).

A comparison of BGRT to WGRT showed a significant difference with respect to the level of education (p < 0.001). In addition, the fraction of left-hemispheric lesions was higher in the group of patients with weaker performance than in the best-performing group (χ² = 6.372; p = 0.01). A comparison of the groups according to accuracy revealed, in addition to education (p < 0.001), a difference with respect to age (p < 0.001). Moreover, the number of patients with temporal lesions was significantly higher in the BGACC group in comparison to the WGACC group (χ² = 4.922; p = 0.021).


By using fMRI, we evaluated the influences of task performance on the size of the activated volume in language-related areas in a large clinical sample. We found a correlation of task performance and cortical activity in both the inferior frontal and the temporoparietal areas.

This correlation is not unexpected because several studies have described effects of task performance in language fMRI of healthy subjects (10–12,19). Interestingly, the effect of task performance differs in the two ROIs under study. As in the study by Booth and colleagues (10), we found an increase in activation levels with increased performance in the temporoparietal ROI. The temporoparietal area is thought to be involved in the semantic and phonologic analysis (20,21). Stronger activation of this area in well-performing patients might be related to a more extensive conceptual processing associated with the larger amount of retrieved semantic information (22). In contrast, we found decreased activation in the inferior frontal area with increased performance. This result is in line with the study by Chee and colleagues (11), who described fewer neuronal demands with increased performance. The inferior frontal lobe has been assumed to be involved in the search for and retrieval of semantic information (23,24). Stronger activation of this area might thus indicate that poorly performing patients have a greater demand for such search-and-retrieval operations. An alternative view could be a greater demand for decision-making processes, which has recently been shown to be correlated to inferior frontal activation, in a study using an auditory decision paradigm (25). This greater demand could be explained by less information on the semantic properties of individual items being provided by temporoparietal areas.

Despite these correlations between performance and activation measures, performance levels did not show any correlation with laterality indices; neither was a difference found in lateralization between the WGRT and BGRT or between WGACC and BGACC. Hence, the measures of laterality and regional laterality, which have shown their clinical usefulness by appropriate validation and reliability testing (26), seem to be independent of performance, at least as long as individual thresholding is used and sufficient compliance can be assumed from the relation between reaction times and error rate. This assumption seems reasonable, because patients who made more errors took longer to make a decision, which could be interpreted as effort and thus good compliance. Regarding language mapping (i.e., the precise delineation of a particular language region), the situation is different. With the same procedure and a similar, although much smaller, sample of patients, we found roughly a 50% overlap of activation maps on a voxel-by-voxel basis in a test–retest within-subject design, rendering this procedure unsuited for cortical mapping, which requires high spatial resolution and reliability (26).

The comparison of two subgroups of patients with regard to their performance shows a higher percentage of left-sided lesions in the group of patients with worst performance. It is well known that patients with a left-sided pathology in general have greater difficulty in language-related tasks (27). Epilepsies of the temporal lobe in particular have been studied extensively for their detrimental effect on cognitive abilities (28,29). The higher percentage of patients with left-sided pathologies in the group of patients with weaker performance is in line with these observations. The semantic decision task, as performed in our study, is of course not suited to detect subtle neuropsychological differences. The higher percentage of temporal in comparison with extratemporal lesions in the better-performing patients suggests a stronger impairment of cognition in patients with extratemporal lesions, probably due to slower executive functions (30). It is not surprising that the level of education is a positive predictor of the performance measures. The same holds true for age. One may speculate that older patients have more semantic information available for the presented items and thereby less difficulty in performing the task (31). The finding that the effect of age is seen only in accuracy and not in reaction times may be due to an overall slower performance of older patients (32), because they also tended to be slower in the perceptual control task, whereas the effect of age on task performance strongly depends on the kind of task performed (33).

Many factors have been shown to influence language lateralization in health and disease. Handedness exhibits a strong correlation with language dominance, in that left-handedness is associated with stronger right-hemispheric involvement (34,35). Our results are in line with these studies, in that left-handedness is associated with a more negative laterality index (i.e., a stronger right-hemispheric activation). Furthermore, in epilepsy patients, the side of the lesion is known to influence language lateralization (36,37). A tendency of patients with left-hemispheric lesions to show a more right-sided language lateralization also was observed in our study. The fact that the effect in our sample does not reach significance may be due to the variety of different lesion locations in the left as well as in the right hemisphere, which may affect language lateralization in different ways. The same may hold true for the influence of age at onset of epilepsy, which exhibited no effect on language lateralization in our study. The effect of early-onset epilepsy in comparison to late-onset epilepsy has been described diversely, in that some studies found an effect, whereas others did not, probably because of differing sample sizes (1,6,38). Age at study did not show any effect on the laterality of activation in our sample of subjects. It has been hypothesized that the involvement of right-hemispheric structures would increase with age, leading to a more bilateral activation, known as the HAROLD (hemispheric asymmetry reduction in old adults) model, whereas others describe a stronger decline in right-hemispheric activation with age, which would lead to stronger asymmetry (39). Our sample of patients was probably too young to show any of these effects, with a mean age of ∼35 years. A different explanation may be that the underlying disease affects the LI in a way that the effect of age is no longer obvious. 
In accordance with this view, several studies that examined the effects of different demographic factors on language lateralization found an effect only in healthy subjects but not in patients with epilepsy (1,34).

The application of an individual rather than a uniform threshold for every patient has the advantage of controlling for interindividual variability in overall brain activation and vascular differences. Studies examining the effects of different statistical thresholds on the laterality of activations show a relation of specificity and sensitivity with the applied threshold, in that the specificity decreases with lower statistical thresholds, but none of these studies applied an individually adjusted threshold based on the overall brain activation. This method has proven to display activated language-related areas reliably (26) and has been shown to improve intersubject reliability in nonlanguage tasks (40). To ascertain the specificity of the observed activation, we excluded patients with very low overall activation (T value, <2). Statistical maps were examined here because they are routinely used in clinical settings (41–44) and because this enables comparison of the results with those of other studies on language fMRI.

Although we have shown that the level of performance has no clinically relevant effect on lateralization, we do not at all argue that response recordings in clinical language fMRI are obsolete. Recording patients' responses not only enables compliance monitoring, but it might also increase compliance and effort by social pressure.

In conclusion, language fMRI, as used here, can be performed and interpreted in patients with widely varying cognitive abilities. Nevertheless, performance control should be used to foster and monitor compliance.