Language lateralization by fMRI and Wada testing in 229 patients with epilepsy: Rates and predictors of discordance


Address correspondence to Julie Janecek, Medical College of Wisconsin; Department of Neurology, Division of Neuropsychology; 9200 West Wisconsin Avenue; Milwaukee, WI 53226, U.S.A. E-mail:



To more definitively characterize Wada/functional magnetic resonance imaging (fMRI) language dominance discordance rates with the largest sample of patients with epilepsy to date, and to examine demographic, clinical, and methodologic predictors of discordance.


Two hundred twenty-nine patients with epilepsy underwent both a standardized Wada test and a semantic decision fMRI language protocol in a prospective research study. Language laterality indices were computed for each test using automated and double-blind methods, and Wada/fMRI discordance rates were calculated using objective criteria for discordance. Regression analyses were used to explore a range of variables that might predict discordance, including subject variables, Wada quality indices, and fMRI quality indices.

Key Findings

Discordant results were observed in 14% of patients. Discordance was highest among those categorized by either test as having bilateral language. In a multivariate model, the only factor that predicted discordance was the degree of atypical language dominance on fMRI.


fMRI language lateralization is generally concordant with Wada testing. The degree of rightward shift of language dominance on fMRI testing is strongly correlated with Wada/fMRI discordance, suggesting that fMRI may be more sensitive than Wada to right hemisphere language processing, although the clinical significance of this increased sensitivity is unknown. The relative accuracy of fMRI versus Wada testing for predicting postsurgical language outcome in discordant cases remains a topic for future research.

Although intracarotid amobarbital (Wada) testing has been used for many years with epilepsy surgery candidates to assess preoperative language lateralization and predict postoperative language outcome (Wada & Rasmussen, 1960; Loring, 1992), it has been suggested that functional magnetic resonance imaging (fMRI) language lateralization is a potential replacement for the Wada test (Binder, 2011). Replacement of the Wada test with fMRI appears to have been widely accepted in clinical practice, as fMRI is less costly than the Wada test, is noninvasive, may be repeated if necessary, and provides information about intrahemispheric localization as well as lateralization of language processes. In 1993, >95% of epilepsy surgery centers worldwide were using the Wada test to assess all surgical candidates (Rausch et al., 1993). However, a more recent survey revealed a trend among epilepsy centers over the last 15 years to replace the standard Wada test with fMRI for the assessment of language lateralization (Baxendale et al., 2008).

As a first step in establishing fMRI as an alternative to Wada testing, numerous studies have assessed concordance between the two tests, although rates of concordance have ranged from as high as 100% to as low as 56% (Desmond et al., 1995; Binder et al., 1996; Bahn et al., 1997; Hertz-Pannier et al., 1997; Worthington et al., 1997; Yetkin et al., 1998; Benson et al., 1999; Lehericy et al., 2000; Baciu et al., 2001; Carpentier et al., 2001; Spreer et al., 2001; Gaillard et al., 2002; Liegeois et al., 2002; Rutten et al., 2002; Adcock et al., 2003; Sabbah et al., 2003; Woermann et al., 2003; Deblaere et al., 2004; Gaillard et al., 2004; Benke et al., 2006; Chlebus et al., 2007; Szaflarski et al., 2008; Wellmer et al., 2008; Arora et al., 2009). A number of factors could account for this variability. Concordance appears to be better with some fMRI task paradigms than with others (Lehericy et al., 2000; Gaillard et al., 2002). In addition to a variety of task paradigms, investigators have used a variety of techniques for converting the Wada and fMRI data into dominance categories, and a variety of methods for classifying the results as concordant or discordant. Finally, most studies have depended on small samples of fewer than 20 patients, usually including very few individuals with atypical language dominance. Only two studies included >30 patients. The largest (n = 94) observed a concordance rate of 91% (Woermann et al., 2003), whereas the other (n = 67) observed a concordance rate of only 69–78%, depending on region of interest (Benke et al., 2006). Therefore, the actual rate of concordance between the two tests remains unclear.

The reasons for discordant Wada and fMRI results are also not well understood. Discordance could arise from a number of factors that lead to individual measurement error or poor data quality. For the Wada test, these include such factors as inadequate or excessive anesthesia, anomalous vasculature, short duration of drug effect, and interhemispheric crossflow of the anesthetic. For fMRI, poor data quality can result from excessive movement, poor task performance, and scanner artifacts. Whether or not any of these variables, alone or in combination, are sufficient to explain the observed rates of discordance is currently unknown. No studies have systematically investigated methodologic or subject factors that may predict Wada/fMRI discordance, though several have observed a higher rate of discordance in patients with bilateral language representation (Benke et al., 2006; Arora et al., 2009).

The present study uses a large sample of patients prospectively studied with a standardized Wada test and fMRI language mapping protocol to more definitively characterize the degree of concordance between these tests and to examine demographic, clinical, and methodologic predictors of discordance.



A consecutive series of 249 adults (ages ≥18) were enrolled prospectively in a research protocol and underwent standardized outpatient Wada testing, fMRI language mapping, and preoperative neuropsychological assessment at the Medical College of Wisconsin Comprehensive Epilepsy Program between 1993 and 2009. Ten were excluded due to invalid or incomplete fMRI data, including four who were unable to perform one or both activation tasks above chance levels, one with excessive movement resulting in absence of any activation, and five in whom scanning was discontinued due to seizure (2), arm pain (1), claustrophobia (1), or scanner malfunction (1). Eleven individuals were excluded due to invalid Wada testing (i.e., no language laterality index [LI] was calculated due to obtundation), two of whom also had lost fMRI data. In addition, one individual was excluded because he had a previous temporal resection. The final sample comprised 229 patients. Relevant demographic, medical, and neuropsychological characteristics of the patient sample are provided under Results. All patients provided written informed consent prior to fMRI under a protocol approved by the institutional review board.
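
Because two patients appear in both exclusion groups, the final sample size is easiest to check with a short tally. The sketch below uses only the counts stated above. The variable names are illustrative, and it assumes that the two patients with invalid Wada and fMRI data are among the ten fMRI exclusions, which is the reading that reconciles with the reported total.

```python
# Participant flow, using only the counts reported in the text.
enrolled = 249
invalid_fmri = 4 + 1 + 5   # chance-level task performance, movement, discontinued scans
invalid_wada = 11          # no Wada LI calculated due to obtundation
both_invalid = 2           # assumed to be counted in both groups, so subtracted once
prior_resection = 1

excluded = invalid_fmri + invalid_wada - both_invalid + prior_resection
final_sample = enrolled - excluded
print(excluded, final_sample)  # 20 229
```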

Wada testing

Wada testing was always performed blind to the fMRI results. The Wada test was modeled after the procedure developed at the Medical College of Georgia (Loring, 1992). Baseline testing was performed 2 h before the procedure. Amobarbital (75–125 mg) was injected into the internal carotid artery ipsilateral to the seizure focus, and language functions of the contralateral cerebral hemisphere were tested. All patients were initially given 75 mg of amobarbital followed by a saline flush. If they did not develop hemiplegia and delta slowing on electroencephalography (EEG) they were administered 1–2 additional 25 mg boluses until hemiplegia was obtained and delta slowing occurred. Therefore, we used the minimal dose necessary to produce hemianesthesia for the purpose of avoiding invalid test data due to obtundation. The procedure was then repeated on the hemisphere contralateral to the seizure focus. Counting disruption was numerically rated, as well as ability to follow two simple midline commands just after injection. Language was assessed using measures of counting, comprehension of commands, object naming, phrase repetition, sentence reading, and a rating of paraphasic errors during the period of hemianesthesia. Return of motor function and cessation of delta slowing on EEG were used to determine the duration of hemianesthesia. Only language trials obtained during the period prior to any motor return in the contralateral upper extremity or resolution of delta on EEG (whichever occurred first) were included in the language lateralization score. The scores for each language task ranged from 0 to 3, with lower scores indicating a greater degree of impairment. The total possible, or maximal obtainable, score therefore varied depending on the duration of hemianesthesia. 
LIs were calculated as the difference between the percent of maximal obtainable score in the inject right/test left condition and the percent of maximal obtainable score in the inject left/test right condition. LIs ranged from +100 (indicating complete left hemisphere dominance) to −100 (indicating complete right hemisphere dominance). These quantitative LIs were used to define language dominance in subsequent clinical decision-making.
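
The Wada LI calculation described above can be sketched as follows. This is a minimal illustration rather than the scoring software used in the study: the function names and the example scores are hypothetical, and the only rule taken from the text is that each hemisphere's score is expressed as a percentage of its own maximal obtainable score before the difference is taken.

```python
def percent_of_max(score: float, max_obtainable: float) -> float:
    """Score as a percentage of the maximal obtainable score for that injection."""
    if max_obtainable <= 0:
        raise ValueError("max_obtainable must be positive")
    return 100.0 * score / max_obtainable


def wada_li(left_score: float, left_max: float,
            right_score: float, right_max: float) -> float:
    """Percent of max (inject right/test left) minus percent of max (inject left/test right)."""
    return percent_of_max(left_score, left_max) - percent_of_max(right_score, right_max)


# Hypothetical patient: near-intact language when the left hemisphere is tested,
# marked impairment when the right hemisphere is tested.
print(round(wada_li(20, 21, 3, 18), 1))  # 78.6
```

Normalizing by the maximal obtainable score matters because the number of scorable trials differs between injections, depending on how long hemianesthesia lasted.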

Functional magnetic resonance imaging

The language activation protocol was a semantic decision/tone decision contrast developed by Binder et al. (1995). During the semantic decision task, individuals listened to animal names and were instructed to press a button if the animal was both found in the United States and used by humans. During the tone decision task, individuals listened to brief sequences of high (750 Hz) and low (500 Hz) tones and were instructed to press a button if they heard a sequence containing two high tones. Tasks were alternated in a block design. The contrast of the semantic decision task with the tone decision task isolates speech perception and semantic language processes while controlling for attention, working memory, auditory, and motor processes. This contrast produces left-lateralized language activation in frontal, temporal, and parietal areas in healthy right-handed controls (Binder et al., 1997; Frost et al., 1999).

As described elsewhere (Binder et al., 1997; Frost et al., 1999), imaging was conducted on commercial 1.5T and 3T scanners (General Electric Medical Systems, Milwaukee, WI, U.S.A.). High-resolution, T1-weighted anatomic reference images were obtained using a three-dimensional spoiled-gradient echo sequence. Functional imaging used a gradient-echo T2*-weighted echoplanar sequence. Echoplanar image volumes were acquired as contiguous sagittal or axial slices covering the whole brain.

Image processing and statistical analyses were performed using Analysis of Functional NeuroImages (AFNI) software. All analyses were performed at the individual subject level. Volumetric image registration was used to reduce the effects of head movement. Task-related changes in MRI signal were identified using a multivariable general linear model. The predicted task effect was modeled by convolving a gamma function with a time series of impulses representing each task trial. Movement vectors (computed during image registration) and a second-order linear trend were included as covariates of no interest. Regions of interest (ROIs) used for automated measurement of language lateralization were based on activated regions in the left hemisphere in 100 healthy right-handed adults (Frost et al., 1999). A “lateral” ROI was created by combining temporal, frontal, and parietal activations in the lateral two thirds of the hemisphere, excluding medial regions because they tend to be more bilaterally activated and can include midline voxels containing tissue from both hemispheres (Binder et al., 2008a). Corresponding right hemisphere ROIs were created by reflecting the left hemisphere ROIs symmetrically across the midline. Voxels passing an uncorrected activation threshold of p < 0.001 were counted for each patient. LIs reflecting the interhemispheric difference between voxel counts in the left and right homologous ROIs were calculated using the formula LI = 100 × (L − R)/(L + R), where L equals the number of activated voxels in the left hemisphere ROI and R equals the number of activated voxels in the right hemisphere ROI. The scores range from +100 (complete left hemisphere dominance) to −100 (complete right hemisphere dominance). All fMRI analyses were fully automated and performed by a technician without knowledge of the Wada test results.
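
The voxel-count formula above translates directly into code. The sketch below is illustrative only: the function name and voxel counts are hypothetical, and raising an error when neither ROI contains suprathreshold voxels is an assumption, since the text does not say how that case was handled.

```python
def fmri_li(left_voxels: int, right_voxels: int) -> float:
    """LI = 100 * (L - R) / (L + R) from suprathreshold voxel counts in homologous ROIs."""
    total = left_voxels + right_voxels
    if total == 0:
        # Assumption: with no activated voxels the index is undefined.
        raise ValueError("no activated voxels in either ROI; LI is undefined")
    return 100.0 * (left_voxels - right_voxels) / total


print(fmri_li(900, 300))  # 50.0
```

Because the index is a ratio of counts, it becomes unstable when the total number of activated voxels is very small: a shift of a few voxels can move the LI across a category boundary.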

Operational definition of discordance

There is no standard, validated definition of Wada/fMRI language lateralization discordance. Clinical judgment is often used to determine left, right, or bilateral language dominance, and arbitrary cutoffs are frequently applied in studies investigating discordance. We defined discordance conservatively, using a method that accounts for inherent differences in the distributions of Wada and fMRI LIs (see Fig. 1). In a previous study of language lateralization in 100 neurologically normal right-handed individuals, Springer et al. (1999) found that 94% of the sample was left language dominant as defined by LI scores greater than 20. In the Springer et al. sample, there were no cases with LIs between 20 and 30. Therefore, for consistency with this and other studies (Seghier, 2008), we chose an fMRI LI cut score of ±25, yielding the following dominance categories: left (LI ≥ 25), right (LI ≤ −25), and bilateral (LI between −25 and 25). Using this cut score, 80% of the current sample was left language dominant, consistent with left language dominance rates (67–81%) reported in other epilepsy samples (Springer et al., 1999; Woermann et al., 2003; Szaflarski et al., 2008). Because Wada language lateralization estimates are not available for neurologically normal individuals, we set the Wada cut score to yield similar proportions of left, bilateral, and right dominant cases as fMRI. Of note, the LI distributions for Wada and fMRI differ in kurtosis. Accordingly, Wada language dominance was categorized using a cut score of 50: left (LI ≥ 50), right (LI ≤ −50), and bilateral (LI between −50 and 50). Using this cut score, 80% of our sample was categorized as left language dominant by the Wada test, as with fMRI.
To avoid the possibility that similar LIs on either side of the arbitrary cut scores could be categorized as discordant (e.g., an fMRI LI of 40 and a Wada LI of 40 being defined as discordant), we also required that discordant cases have Wada and fMRI LI values differing by >50 units (i.e., |Wada LI − fMRI LI| > 50). “Discordance” was thus defined as follows: the Wada and fMRI LIs must (1) be in different categories as defined above, and (2) differ by more than 50 units.
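
The dual criterion can be written as a small classifier. This is a sketch under the definitions given above (cut scores of 25 for fMRI and 50 for Wada, and a required difference of more than 50 units); the function and constant names are hypothetical, and the example LIs are not patient data.

```python
FMRI_CUT = 25        # fMRI dominance cut score
WADA_CUT = 50        # Wada dominance cut score
MIN_DIFFERENCE = 50  # LIs must differ by more than this many units


def dominance(li: float, cut: float) -> str:
    """Left if LI >= cut, right if LI <= -cut, otherwise bilateral."""
    if li >= cut:
        return "left"
    if li <= -cut:
        return "right"
    return "bilateral"


def is_discordant(wada_li: float, fmri_li: float) -> bool:
    """Discordant only if the categories differ AND the LIs differ by more than 50 units."""
    different_category = dominance(wada_li, WADA_CUT) != dominance(fmri_li, FMRI_CUT)
    large_gap = abs(wada_li - fmri_li) > MIN_DIFFERENCE
    return different_category and large_gap


print(is_discordant(40, 40))   # False: categories differ, but the LIs are identical
print(is_discordant(90, -40))  # True: left vs right, 130 units apart
print(is_discordant(45, 80))   # False: bilateral vs left, but only 35 units apart
```

The first call reproduces the example in the text: an LI of 40 is bilateral by the Wada cut score and left dominant by the fMRI cut score, yet the pair is not counted as discordant.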

Figure 1.

Frequency distributions of Wada and fMRI language laterality indices, with scores ranging from +100 (complete left hemisphere dominance) to −100 (complete right hemisphere dominance). Compared to the fMRI distribution, the Wada distribution is more strongly skewed to the left.

Predictors of discordance

Subject factors examined for a relationship with discordance included age, sex, handedness, education, IQ, age at seizure onset, location of seizure focus, number of anticonvulsant medications, and presence of mesial temporal sclerosis (MTS) on MRI. Several factors are known to compromise the integrity of Wada test results and were hypothesized to predict discordance. These included absence of anterior cerebral artery filling due to atresia of the A1 segment, posterior cerebral artery filling, interhemispheric crossflow (degree to which angiography showed crossflow through the anterior communicating artery to the contralateral hemisphere on hand injection of contrast), obtundation, amobarbital dose, use of carbonic anhydrase inhibitors at the time of Wada testing, and duration of drug effect (number of trials completed prior to return of motor functioning in the contralateral upper extremity). Duration of drug effect indicates Wada examinations that may have been compromised by early motor return (few trials completed) or a prolonged drug effect (potentially indicating excessive sedation). Several fMRI factors were selected as possible indicators of a poor-quality fMRI study. These included behavioral performance on the fMRI tasks (percent correct on semantic decision and tone decision tasks), signal to fluctuation noise ratio (SFNR) (Friedman et al., 2006) averaged over the brain volume, degree of motion artifact (head movement), number of image volumes contaminated by artifact, and mean residual error in the regression analysis averaged over the brain volume.

Possible relationships between these variables and discordant language lateralization results were examined in two ways. First, the group of patients with discordant results was compared to the group with concordant results on each measure using t-tests or chi-square tests. Second, Pearson correlation was conducted for each of the continuous variables to test for correlation with the absolute value of the Wada-fMRI LI difference score.


The numbers of left, bilateral, and right dominant cases grouped by language lateralization method and language dominance category are displayed in Table 1, along with a breakdown of the number of patients with each possible combination of results from the two tests. Most of the disagreements in categorization (38 of 42) involved patients who were labeled as bilateral by one of the tests but not by the other. In four extreme cases, Wada indicated left dominance, whereas fMRI indicated right dominance. There were no cases with the reverse pattern.

Table 1. Number of left, bilateral, and right dominant cases based on a Wada categorization cut score of 50 and an fMRI categorization cut score of 25 (N = 229)
                                Wada Left      Wada Bilateral   Wada Right
All                             184 (80.3%)    30 (13.1%)       15 (6.6%)
fMRI Left        182 (79.5%)    167            15               0
fMRI Bilateral   28 (12.2%)     13             10               5
fMRI Right       19 (8.3%)      4              5                10

Tables 2 and 3 show the Wada/fMRI LI discordance rates, using the dual criteria requiring discordant categorization and an LI difference score of more than 50 units. Discordance was observed in 32 patients, or 14% of the total sample. Using Wada as the measure of reference, discordance rates ranged from 8 to 40%, depending on language dominance category (see Table 2). The difference in rate of discordance as a function of Wada dominance category was highly significant (χ² = 26.8, p < 0.0001). Rates for both the bilateral category (p < 0.0001, Fisher's exact test) and the right dominant category (p = 0.0083, Fisher's exact test) were higher than for the left dominant category. The discordance rate was numerically higher in patients with bilateral language than in those with right dominance, although this difference did not reach statistical significance.

Table 2. Language discordance rates when Wada is left, right, and bilateral using a Wada categorization cut score of 50 and an fMRI categorization cut score of 25 (Wada as reference)
Wada dominance   Discordant/total (%)
Left             15/184 (8%)
Bilateral        12/30 (40%)
Right            5/15 (33%)
Total            32/229 (14%)

Table 3. Language discordance rates when fMRI is left, right, and bilateral using a Wada categorization cut score of 50 and an fMRI categorization cut score of 25 (fMRI as reference)
fMRI dominance   Discordant/total (%)
Left             11/182 (6%)
Bilateral        16/28 (57%)
Right            5/19 (26%)
Total            32/229 (14%)

Using fMRI as the measure of reference, discordance rates ranged from 6 to 57%, depending on language dominance category (see Table 3). Differences between the dominance categories were highly significant (χ² = 55.3, p < 0.0001). The discordance rate was higher when fMRI indicated bilateral language (p < 0.0001, Fisher's exact test) or right dominance (p = 0.0085, Fisher's exact test) than when fMRI indicated left dominance. The rate was also higher when fMRI indicated bilateral language than when fMRI indicated right dominance (p = 0.028, Fisher's exact test).

Overall, these data indicate high levels of concordance (92–94%) when a result indicates left dominant language. Discordance was much more likely when either test indicated bilateral language. The highest rates of discordance were observed when fMRI LIs indicated bilateral language.
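
The omnibus chi-square statistics reported above can be recomputed from the counts in Tables 2 and 3. The sketch below is a plain Pearson chi-square on a 3 × 2 table (dominance category by discordant/concordant) and is not necessarily the software procedure the authors used. The concordant counts are derived here as each category total minus its discordant count.

```python
def chi_square(rows):
    """Pearson chi-square for a table given as (discordant, concordant) rows."""
    total = sum(d + c for d, c in rows)
    col_totals = [sum(r[0] for r in rows), sum(r[1] for r in rows)]
    stat = 0.0
    for row in rows:
        row_total = sum(row)
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / total
            stat += (observed - expected) ** 2 / expected
    return stat


# (discordant, concordant) counts for left, bilateral, and right dominance
wada_reference = [(15, 169), (12, 18), (5, 10)]  # Table 2
fmri_reference = [(11, 171), (16, 12), (5, 14)]  # Table 3

print(round(chi_square(wada_reference), 1))  # 26.8
print(round(chi_square(fmri_reference), 1))  # 55.3
```

Several expected counts in the bilateral and right dominant rows fall below 5, which is a common reason to prefer an exact test, such as the Fisher's exact test reported above, for the pairwise comparisons.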

Predictors of discordance

Summary statistics for the concordant and discordant groups are shown in Table 4. t-tests and chi-square analyses were performed to compare discordant and concordant groups. The discordant and concordant groups did not differ with regard to any subject variables, Wada quality indices, or fMRI quality indices after Bonferroni correction for multiple comparisons (all p values > 0.04). Significant differences were observed between the concordant group and the discordant group with regard to Wada LIs and fMRI LIs, with much lower mean LIs on both tests in the discordant group.

Table 4. Comparisons between concordant and discordant groups
                                    Concordant group (n = 197)   Discordant group (n = 32)   p-value
                                    Mean (SD) or %               Mean (SD) or %
Age at surgery, y                   37.4 (11.1)                  39.6 (9.4)                  0.29
Sex, % female                       50%                          59%                         0.34
Education, y                        13.1 (2.5)                   13.6 (2.3)                  0.29
Handedness, % right-handed          82.7%                        68.8%                       0.06
Age at epilepsy onset, y            17.7 (12.2)                  16.5 (11.3)                 0.63
Epilepsy duration, y                20.4 (14.0)                  23.6 (13.6)                 0.24
No. of AEDs                         2.1 (0.9)                    1.8 (0.9)                   0.20
CAIs (topiramate or zonisamide)     14%                          9%                          0.50
Seizure focus RT/LT/ET              81/82/34                     10/15/7                     0.55
MTS or HA                           46%                          44%                         0.80
Full scale IQ                       93.0 (12.8)                  94.2 (13.8)                 0.65
Amobarbital dose, right             82.6                         88.2                        0.14
Amobarbital dose, left              86.1                         89.2                        0.33
Wada total possible, left           21.0 (4.6)                   19.3 (4.1)                  0.05
Wada total possible, right          21.4 (6.2)                   20.6 (7.5)                  0.50
Crossflow left to right             40%                          50%                         0.29
Crossflow right to left             36%                          17%                         0.05
Proximal ACA atresia                5%                           0%                          0.22
PCA filling                         42%                          34%                         0.41
% Correct, Semantic task            81.2 (10.3)                  79.4 (12.1)                 0.39
% Correct, Tones task               90.3 (12.0)                  92.4 (10.4)                 0.35
Head motion (RMS mean)              44.9 (70.0)                  50.8 (100.6)                0.69
Mean residual error                 122.1 (213.3)                109.2 (195.9)               0.75
Signal-to-fluctuation noise ratio   78.0 (26.8)                  76.2 (22.8)                 0.73
Artifact-contaminated volumes       5.0 (11.6)                   8.5 (15.0)                  0.15
Wada LI                             71.2 (45.5)                  24.3 (64.9)                 <0.0001
fMRI LI                             57.2 (40.0)                  14.9 (50.1)                 <0.0001
ACA, anterior cerebral artery; AED, antiepileptic drug; CAIs, carbonic-anhydrase inhibitors; RT, right temporal; LT, left temporal; ET, extratemporal; MTS, mesial temporal sclerosis; HA, hippocampal atrophy; y, years; PCA, posterior cerebral artery; RMS, root mean square; SD, standard deviation.

Categorical comparisons between concordant and discordant groups may not be optimally sensitive, given that discordance varies along a continuum. We therefore conducted Pearson correlations with selected continuous variables to test for relationships between these variables and the absolute values of the Wada-fMRI LI difference score. No significant relationships were observed except for the Wada and fMRI LIs (Table 5). In both cases, the more atypical the LI, the larger the absolute Wada-fMRI LI difference.

Table 5. Correlations between continuous variables and Wada-fMRI LI difference scores
                                      Pearson r   p-value    Spearman ρ   p-value
Subject variables
  Age at onset                        0.02        0.78       0.01         0.89
  Age at surgery                      0.05        0.50       −0.04        0.54
  Duration of epilepsy                0.02        0.78       −0.03        0.70
  Full scale IQ                       0.
Wada variables
  Wada total possible left            −0.05       0.42       −0.04        0.51
  Wada total possible right           −0.05       0.44       −0.09        0.16
fMRI variables
  Percent correct semantic task       −0.09       0.17       −0.08        0.24
  Percent correct tones task          0.06        0.35       0.11         0.10
  Head motion (RMS mean)
  Mean residual error                 0.03        0.64       0.11         0.11
  Signal to fluctuation noise ratio   0.04        0.52       0.11         0.09
  Artifact-contaminated volumes       0.06        0.40       −0.02        0.78
Wada Laterality Index                 −0.22       0.001      −0.10        0.15
fMRI Laterality Index                 −0.51       <0.0001    −0.62        <0.0001

Given the non-Gaussian LI distributions, we also examined the same variables with Spearman correlations (Table 5, right columns). The only significant relationship was observed for the fMRI LI. Again, the more atypical the LI, the larger the absolute Wada-fMRI difference.
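
To illustrate why a rank-based coefficient is a reasonable check when distributions are skewed, the sketch below computes Pearson and Spearman correlations in plain Python. Spearman's ρ is implemented as the Pearson correlation of average ranks. The helper names are hypothetical and the data are invented for illustration; they are not study data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical LIs and absolute Wada-fMRI differences (not study data).
li = [95, 90, 80, 60, 30, 0, -40, -90]
diff = [2, 3, 5, 9, 20, 45, 70, 150]
print(round(pearson(li, diff), 2), round(spearman(li, diff), 2))
```

In this invented example the relationship is perfectly monotonic but not linear, so Spearman's ρ is exactly −1 while Pearson's r is weaker; a few extreme values can pull the two coefficients apart in either direction.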

Next, a series of multiple regression analyses were performed to examine the relationship between selected groups of predictor variables and the absolute value of the LI difference scores. Although most variables of interest were not correlated with the LI difference score, we hypothesized that they might collectively explain a significant amount of variance in the LI difference score. First, we entered subject variables that were hypothesized to have a relationship with discordance due to their association with atypical language organization in previous studies (handedness, age at seizure onset, mesial temporal sclerosis or hippocampal atrophy, full scale IQ). This model did not account for any significant variance in the LI difference score (adjusted R² = 0.01, F(4, 221) = 1.82, p = 0.13). In a separate regression analysis, we entered the Wada quality indices described above, which also did not account for a significant amount of the variance in LI difference scores (adjusted R² = −0.01, F(7, 199) = 0.74, p = 0.64). Likewise, the fMRI quality indices did not predict a significant amount of the variance in LI difference scores (adjusted R² = 0.02, F(6, 208) = 1.77, p = 0.11).

Finally, because the Wada LI and the fMRI LI were both significantly correlated with LI difference score, we used a hierarchical regression analysis to explore the relative predictive value of each LI. When the Wada LI was entered in block one followed by the fMRI LI in block two, the Wada LI accounted for 5% of the variance in Wada-fMRI LI difference scores (R² change = 0.05, p = 0.001), and fMRI LI accounted for an additional 22% of the variance (R² change = 0.22, p < 0.0001). When the fMRI LI was entered first in block one followed by the Wada LI in block two, the fMRI LI accounted for 26% of the variance in Wada-fMRI LI difference scores (R² change = 0.26, p < 0.0001) and the Wada LI accounted for only an additional 2% of the variance (R² change = 0.02, p = 0.03). Therefore most of the variance in LI difference scores is accounted for by the fMRI LI. The negative sign of this relationship (Table 5) indicates that as the fMRI LI becomes more negative (meaning more atypical language), the LI discrepancy tends to increase.
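
The logic of entering the two LIs in both orders can be sketched with the standard two-predictor identity R² = (r₁² + r₂² − 2·r₁·r₂·r₁₂)/(1 − r₁₂²), where r₁ and r₂ are each predictor's correlation with the outcome and r₁₂ is the correlation between the predictors. The code below is an illustration on invented data, not a reanalysis; the function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def r2_two_predictors(y, x1, x2):
    """R^2 of a linear model with intercept and two predictors, from pairwise correlations."""
    r1, r2, r12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    return (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)


def hierarchical_steps(y, first, second):
    """Return (R^2 for block one, R^2 change when block two is added)."""
    block_one = pearson(y, first) ** 2
    return block_one, r2_two_predictors(y, first, second) - block_one


# Hypothetical LIs (not study data); the outcome is the absolute LI difference.
wada = [100, 95, 90, 80, 70, 60, 20, -50]
fmri = [90, 60, 85, 40, 75, 10, -20, -60]
gap = [abs(w - f) for w, f in zip(wada, fmri)]

wada_first = hierarchical_steps(gap, wada, fmri)
fmri_first = hierarchical_steps(gap, fmri, wada)
print([round(v, 3) for v in wada_first], [round(v, 3) for v in fmri_first])
```

Whichever predictor is entered first, the two steps sum to the same total R²; what changes is how the variance shared by the two LIs is credited. A predictor that still adds a large R² change when entered second, as the fMRI LI does in the analysis above, carries information the other predictor does not.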

Cases of extreme discordance

As mentioned above, there were four extreme cases of discordance, in which Wada indicated left dominance and fMRI indicated right dominance. Although comprising <2% of the sample, these cases are potentially informative with respect to causes of discordance. These patients were all right-handed with right hemisphere seizure foci. Close examination of the Wada results for these patients did not reveal any potential quality issues. However, when the fMRI data were examined, one of the patients had an extremely small number of activated voxels (76 total voxels, in contrast to a median activation of 12,299 voxels in the entire patient sample). In a post hoc analysis, we did not find a systematic statistical relationship between discordance and total activation volume. There was a numerically greater rate of discordance in the decile of the sample with the smallest activation volume (22% discordant) compared to the rest of the sample (13% discordant), but this difference did not approach significance (χ² = 1.28, p = 0.26).


This study provides further clarification regarding Wada/fMRI concordance in a large sample that included a substantial number of patients with atypical language dominance. The data show that these two methods of determining language dominance have a relatively high level of concordance. An average of the previous comparison studies shown in Figure 2, weighted by sample size, indicates an overall discordance rate of 15%, which is comparable to our rate of 14%. A few prior studies reported 0% discordance, although with one exception (Binder et al., 1996), all of these studies had 8 or fewer participants (Desmond et al., 1995; Bahn et al., 1997; Hertz-Pannier et al., 1997; Baciu et al., 2001; Liegeois et al., 2002). On the other hand, several studies reported discordance rates of 25% or more. These variable results are paralleled by substantial variability in methodology. Numerous fMRI language tasks have been used, including covert fluency (Bahn et al., 1997; Hertz-Pannier et al., 1997; Worthington et al., 1997; Yetkin et al., 1998; Benson et al., 1999; Lehericy et al., 2000; Liegeois et al., 2002; Rutten et al., 2002; Adcock et al., 2003; Sabbah et al., 2003; Woermann et al., 2003; Deblaere et al., 2004; Gaillard et al., 2004; Arora et al., 2009), abstract versus concrete word identification (Desmond et al., 1995), rhyming (Baciu et al., 2001), syntactic or semantic judgments (Carpentier et al., 2001; Arora et al., 2009), sentence repetition (Lehericy et al., 2000), sentence comprehension (Rutten et al., 2002; Arora et al., 2009), story listening (Lehericy et al., 2000), semantic decision (Binder et al., 1996; Spreer et al., 2001; Benke et al., 2006; Szaflarski et al., 2008), and naming (Rutten et al., 2002; Gaillard et al., 2004). These language tasks have been contrasted with a wide variety of different control tasks or, in some cases, “passive” or “resting” baseline conditions. 
In addition, nonstandardized Wada administrations, varied fMRI regions of interest (e.g., whole brain, frontal, temporal), variable methods of quantifying or categorizing asymmetry, and variable definitions of discordance (e.g., visual rating, varied cut scores) all likely contribute to the variability in discordance rates reported across studies.

Figure 2.

Reported rates of discordance in language dominance classification by Wada and fMRI testing. The studies are arranged from top to bottom in order of increasing sample size. The rates of discordance in this figure may differ from some of the rates of discordance reported in the original manuscripts, because individual manuscripts often reported multiple analyses of language lateralization data, using multiple regions of interest or multiple tasks. When multiple analyses were performed, the data in this figure reflect the discordance rate that is most similar to the parameters used in the present study.

A novel aspect of the current study is that it is based on a semantic decision fMRI contrast that was validated previously to be predictive of cognitive outcomes. Language lateralization measured with this fMRI method is predictive of both naming (Sabsevitz et al., 2003) and verbal memory (Binder et al., 2008a) change after left anterior temporal lobectomy surgery. In both studies, fMRI was also more strongly predictive of outcomes than Wada language or memory asymmetry. This fMRI contrast has also been compared directly to similar measures using “passive” and “resting” baseline conditions and shown to produce both more consistent left lateralization and a greater volume of activation in healthy control subjects (Binder et al., 2008b). The use of tasks with overt responses allows the level of engagement in the tasks to be continuously monitored and encouraged through verbal feedback when necessary. Therefore, this fMRI paradigm offers numerous advantages that make it optimal for comparison against the Wada test.

Predictors of discordance

We observed the highest rates of discordance in patients who had bilateral language representation on fMRI (57%), followed by the group that had bilateral language on Wada testing (40%). Bilateral language representation has been associated with discordance in two previous studies (Benke et al., 2006; Arora et al., 2009). Of the 69 discordant cases reported in previous studies (Fig. 2), 28 had bilateral language representation on fMRI and 26 had bilateral language representation on Wada testing, suggesting further that discordance rates are high when either test indicates bilateral language representation. In addition, the current multiple regression analysis identified fMRI LI as the strongest predictor of the difference between Wada and fMRI LIs. Although we acknowledge that there may be an element of circularity in the analysis, the results are clinically informative, indicating that the lower the fMRI LI (i.e., the more rightward shift of the fMRI LI), the larger the difference between Wada and fMRI LIs. One likely explanation for this relationship is that fMRI is more sensitive than the Wada test to right hemisphere language processing. In some patients with partial right hemisphere language representation, for example, the right hemisphere component may be inadequate to sustain even minimal performance when the left hemisphere component is anesthetized, making the patient appear to be entirely left-lateralized on the Wada test. Left hemisphere anesthetization during the Wada could also interfere with right hemisphere processing through a diaschisis effect (Andrews, 1991).

Somewhat surprisingly, there were no systematic relationships between measures of test quality and LI discordance. Some of these quality factors (such as head motion and excessive sedation) almost certainly affect the results in individual cases, but cases with extreme movement were removed because the data were considered invalid. In the remaining sample, these effects were apparently either too infrequent or too small to consistently skew the test results in a particular direction.

It is difficult to draw conclusions about the four “extreme” cases of discordance in which fMRI showed right hemisphere dominance and Wada testing indicated left hemisphere dominance. All four patients had a right hemisphere seizure focus and were right-handed. Therefore, one would expect the left hemisphere to be the language dominant hemisphere as indicated by Wada testing. The mechanism for this pattern of extreme discordance is unclear, although for one case, very low levels of activation were noted across both hemispheres. Another possible mechanism to explain some cases of extreme discordance is the rare finding of interhemispheric dissociation of language functions, which is estimated to occur in 3% of epilepsy patients (Kurthen et al., 1992). One of the four extreme cases showed strongly right lateralized activation in the angular gyrus and left lateralized activation in the temporal lobe. This is consistent with other rare cases of interhemispheric dissociation of language functions that have been reported in individuals with discordant Wada and fMRI results (Lee et al., 2008). In such cases, the anterior language functions shift to the side opposite the seizure focus, whereas posterior language functions remain ipsilateral to the seizure focus (Kurthen et al., 1992).


As with all comparisons of fMRI and Wada language testing, the limitations of this study include a relatively small number of discordant cases and a somewhat arbitrary definition of LI discordance. The small sample size limits power to detect consistent group differences or individual or methodologic predictors of discordance. However, this is by far the largest sample of patients with Wada and fMRI language testing reported to date, and also the largest sample of discordant cases. Regarding the definition of LI discordance, a plethora of different methods have been suggested for comparing Wada and fMRI results, and different methods for calculating language lateralization have been used, including qualitative categorization methods and a variety of quantitative methods. We used a categorical criterion combined with difference scores between the LIs to operationally define discordance. We attempted to account for the unique differences inherent in each method by examining previously published language lateralization estimates from neurologically normal and epilepsy samples, and choosing cut scores that would yield similar numbers of patients in each language dominance group when assessed with Wada and fMRI. An additional caveat is that a small percentage of individuals (about 4% on each test) who had clearly invalid Wada or fMRI evaluations due to excessive sedation, excessive movement, or inability to perform the fMRI tasks were excluded from the study. These and other measures of test quality are critical for determining the validity of the tests and are likely to cause discordant results if not monitored.
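As a rough illustration of how a categorical criterion can be combined with an LI difference score, consider the sketch below. It is not the study's implementation. The LI formula is the commonly used (left - right) / (left + right) convention, both indices are assumed to lie on a -1 to +1 scale, and the cut scores (`FMRI_CUT`, `WADA_CUT`), the minimum difference (`MIN_DIFF`), and the rule that both conditions must hold are hypothetical placeholders, since the actual values and rule are not given in this section.

```python
# Hypothetical thresholds for illustration only; the study's actual cut
# scores are not stated in this section.
FMRI_CUT = 0.2   # |LI| at or below this is labeled "bilateral" on fMRI
WADA_CUT = 0.5   # |LI| at or below this is labeled "bilateral" on Wada
MIN_DIFF = 0.5   # minimum |Wada LI - fMRI LI| required for discordance


def laterality_index(left: float, right: float) -> float:
    """Conventional LI: +1 is fully left-lateralized, -1 fully right."""
    total = left + right
    if total == 0:
        raise ValueError("LI is undefined when both hemispheres score zero")
    return (left - right) / total


def dominance(li: float, cut: float) -> str:
    """Map an LI onto a dominance category using a symmetric cut score."""
    if li > cut:
        return "left"
    if li < -cut:
        return "right"
    return "bilateral"


def is_discordant(wada_li: float, fmri_li: float) -> bool:
    """Assumed rule: categories differ AND the LI gap is large enough."""
    categories_differ = dominance(wada_li, WADA_CUT) != dominance(fmri_li, FMRI_CUT)
    large_gap = abs(wada_li - fmri_li) >= MIN_DIFF
    return categories_differ and large_gap


if __name__ == "__main__":
    fmri_li = laterality_index(left=55.0, right=45.0)  # mildly left: 0.1
    print(dominance(fmri_li, FMRI_CUT))                # bilateral
    print(is_discordant(wada_li=0.9, fmri_li=fmri_li)) # True
```

Requiring both conditions keeps a patient whose two LIs fall just on either side of a category boundary from being counted as discordant, which is one plausible motivation for pairing a categorical criterion with a difference score.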

Finally, it should be emphasized that both Wada and fMRI methods vary considerably across centers, and discordance rates may depend on a variety of methodologic factors, including the fMRI tasks used (both language and control conditions), Wada testing and scoring methods, and methods for computing lateralization. Therefore, the results reported here reflect a specific combination of Wada and fMRI methods and may not generalize to studies using very different methods.


Wada/fMRI LI discordance rates were relatively low and comparable to rates obtained by averaging across previous smaller studies, indicating that results obtained from these two language lateralization methods are more similar than they are different. Discordance was predicted by atypical language dominance on either test, most strongly by atypical dominance on fMRI. Taken together, these findings suggest that presurgical epilepsy patients who successfully complete fMRI for language lateralization may not require Wada testing, as the two methods are generally concordant, particularly if fMRI indicates left language dominance. Although fMRI more frequently demonstrated atypical language dominance in discordant cases, this does not imply that fMRI was less accurate in these cases. The Wada test has been used for many more years and is often considered a clinical standard, but whether the Wada test or fMRI is more likely to be “correct” in discordant cases is unclear. The aim of language lateralization is to determine whether the hemisphere targeted for surgery is dominant for language, based on the assumption that operating on the language-dominant hemisphere is associated with an increased risk of language impairment. A simple and clinically relevant means of comparing the accuracy of the two tests, therefore, is to assess their ability to predict postsurgical language outcome. The evidence on this point is so far limited to one small study composed mostly of concordant cases, which found the fMRI LI to be more predictive of outcome than a Wada LI (Sabsevitz et al., 2003). No data are yet available specifically comparing accuracy of outcome prediction in discordant cases. Gathering such data will require not only identifying a large number of discordant cases but also acquiring postsurgical cognitive testing on a sufficient number who undergo dominant hemisphere surgery.


Thanks to Linda Allen, Patrick Bellgowan, Julie Frost, Dongwook Lee, George Morris, Wade Mueller, Conrad Nievera, Edward Possing, Thomas Prieto, Jane Springer, and Scott Winstanley for assistance with patient recruitment and collecting and coding the data.


None of the authors has any conflict of interest to disclose. We confirm that we have read the Journal's position on issues involved in ethical publication and affirm that this report is consistent with those guidelines.


This work was supported by the National Institute of Neurological Disorders and Stroke (R01 NS35929), National Institutes of Health General Clinical Research Center (M01 RR00058), National Research Service Award Fellowship (F32 MH11921), and the Charles A. Dana Foundation.