Addressing reverse inference in psychiatric neuroimaging: Meta-analyses of task-related brain activation in common mental disorders


  • Conflict of Interest: None of the authors have any conflict of interest to declare.


Functional magnetic resonance imaging (fMRI) studies in psychiatry use various tasks to identify case-control differences in the patterns of task-related brain activation. Differently activated regions are often ascribed disorder-specific functions in an attempt to link disease expression and brain function. We undertook a systematic meta-analysis of data from task-fMRI studies to examine the effect of diagnosis and study design on the spatial distribution and direction of case-control differences on brain activation. We mapped to atlas regions coordinates of case-control differences derived from 537 task-fMRI studies in schizophrenia, bipolar disorder, major depressive disorder, anxiety disorders, and obsessive compulsive disorder comprising observations derived from 21,427 participants. The fMRI tasks were classified according to the Research Domain Criteria (RDoC). We investigated whether diagnosis, RDoC domain or construct and use of regions-of-interest or whole-brain analyses influenced the neuroanatomical pattern of results. When considering all primary studies, we found an effect of diagnosis for the amygdala and caudate nucleus and an effect of RDoC domains and constructs for the amygdala, hippocampus, putamen and nucleus accumbens. In contrast, whole-brain studies did not identify any significant effect of diagnosis or RDoC domain or construct. These results resonate with prior reports of common brain structural and genetic underpinnings across these disorders and caution against attributing undue specificity to brain functional changes when forming explanatory models of psychiatric disorders. Hum Brain Mapp 38:1846–1864, 2017. © 2017 Wiley Periodicals, Inc.


Functional magnetic resonance imaging (fMRI) is widely used in psychiatric neuroimaging because of its excellent safety profile, high patient acceptability, good spatial resolution and acceptable temporal resolution. To date, the majority of fMRI studies have examined the spatial distribution and level of blood oxygenation-level dependent (BOLD) signal associated with performance of different tasks. The ultimate aim of this line of research is the identification of abnormalities in task-related neural activity that are associated with a specific psychiatric disorder or symptom dimension. Much of the recent scientific impetus for task-fMRI studies in psychiatry can be attributed to the Research Domain Criteria (RDoC) project [Cuthbert, 2014; Insel et al., 2010; Sanislow et al., 2010]. The RDoC project posits that psychiatric diagnoses result from disruption in brain circuits that underpin domains of mental function which are important for adaptive behavior. These comprise circuits associated with reward (positive valence systems), threat sensitivity (negative valence systems), cognitive processes, interpersonal interactions (social processes), and biological activation (arousal and regulation) [Sanislow et al., 2010]. In this scheme, task-fMRI is an important tool in mapping mental processes to neural circuits. Case-control differences in task-related brain activity are then used to make inferences about abnormalities in domain circuits relating to disease processes.

The literature on task-fMRI in psychiatric disorders is extensive and lends itself to quantitative synthesis of the findings from the primary studies. Neuroimaging studies typically report the locations of peak statistical effects using anatomical coordinates referenced to a stereotactic system, most commonly the Talairach [Talairach and Tournoux, 1988] or the Montreal Neurological Institute (MNI) coordinate space [Evans et al., 1993]. Because of this reporting convention, quantitative synthesis of task-fMRI data utilizes coordinate-based analyses that are concerned with the consistency and specificity of the spatial convergence of the results of the primary studies [Fox et al., 1998; Wager et al., 2007]. To date, the task-fMRI literature has been synthesized in multiple coordinate-based meta-analyses that have mainly focused on a single psychiatric disorder [e.g., Chen et al. 2011; Crossley et al., 2016; Del Casale et al., 2015; Gentili et al., 2016; Graham et al., 2013; Ipser et al., 2013; Miller et al., 2015; Minzenberg et al., 2009]. The few meta-analyses that have compared two or more disorders suggest commonalities in the patterns of fMRI activation that transcend current nosological categorizations [e.g., Delvecchio et al., 2013; Etkin and Wager, 2007]. Evidence of trans-diagnostic overlap has also been reported in brain structural [e.g., Arnone et al., 2009; Goodkind et al., 2015; Kempton et al., 2011] and in genetic studies [e.g., Cross-Disorder Group of the Psychiatric Genomics Consortium, 2013; Network and Pathway Analysis Subgroup of Psychiatric Genomics Consortium, 2015]. This has led to proposals that psychiatric disorders may arise from perturbations in the functional and structural organization of common large-scale brain networks [Menon, 2011]. It is, therefore, timely to interrogate the task-fMRI literature to characterize the spatial pattern of case-control differences in brain activation across multiple RDoC domains and psychiatric diagnoses.

Given the size and complexity of the relevant literature, we chose to focus on task-fMRI data comparing healthy adults to adult patients diagnosed with schizophrenia (SCZ), major depressive disorder (MDD), bipolar disorder (BD), anxiety disorders (ANX), and obsessive compulsive disorder (OCD). These psychiatric conditions are particularly amenable to joint examination because they are often comorbid at the syndromal level and show significant overlap at the level of symptom dimensions [Buckley et al., 2009; DeVylder et al., 2014; Gorun et al., 2015; Kessler et al., 1994; Markon, 2010; Vaidyanathan et al., 2012]. We therefore conducted a quantitative analysis of neuroimaging studies of these disorders and mapped peak coordinates of case-control differences in regional brain activation to predefined cortical and subcortical regions within a canonical atlas space. We classified each task according to its corresponding RDoC domain and construct. The coordinates of each case-control difference were then coded according to the corresponding RDoC domain and construct, the diagnosis of the patient group in the primary study, the level of inference (region of interest or whole-brain) and the direction of signal change (hypoactivation or hyperactivation in patients compared to healthy participants). The main aims of the analysis were to test for diagnostic specificity of the reported case-control differences and to investigate whether any diagnostic specificity (if present) could be attributed to dysfunction within RDoC-specified circuits. The issue of specificity is of primary concern in task-fMRI studies, where there is often a tendency “to reason backward from patterns of activation to infer the engagement of specific mental processes” [Poldrack, 2006, 2011] and by extension to infer abnormalities of specific mental processes in relation to different disorders [Paulus, 2015]. Addressing the issue of specificity in connection to task-fMRI is a fundamental first step in moving the field forward conceptually and in identifying new directions for methodological refinement and improvement.


Literature Search and Study Selection

We included original, peer-reviewed fMRI studies that compared healthy adults (age range 18–65 years) to adult patients with SCZ, BD, MDD, OCD, or Anxiety Disorders (Generalized Anxiety Disorders, Panic Disorder, Post-Traumatic Stress Disorder, Specific Phobias and Social Anxiety Disorder) and reported case-control differences in stereotactic space. Among anxiety disorders, we examined only those for which we could identify a minimum of seven primary studies. This was not a consideration for the other diagnostic categories where the literature was more extensive. We excluded case reports, case series, reviews, studies that combined patients with different diagnoses into a single group and duplicate citations. We interrogated databases available through the National Center for Biotechnology Information up to December 2013 using relevant expanded subject headings and free text searches. A total of 12,037 unique publications were examined and 537 were included in the analyses based on the Preferred Reporting Items for Systematic reviews and Meta-Analyses Statement (PRISMA). PRISMA diagrams and full citations of the included studies are available in Supporting Information Methods 1 and 2a-e. Supporting Information Figure S1 shows the number of studies and their mean sample size per year of publication.

Database Construction

The following were recorded from each study separately for patient and control groups: number of participants, age [mean and standard deviation (SD)], sex (% male), diagnostic classification system (e.g., DSM-IV), and diagnostic ascertainment (i.e., structured interview or clinical assessment). The following were recorded, when available, from each study for each patient group: current and lifetime comorbidity with alcohol or substance abuse/dependence, medication status (% receiving any psychotropic medication), and medication type (% atypical antipsychotics, % typical antipsychotics, % antidepressants, % anticonvulsants and % lithium). For each study, we recorded the year of publication, the field strength of the MRI scanner, the activation paradigm used during fMRI data acquisition, the direction of activation changes in patients compared to healthy individuals for each contrast and the level of inference (region of interest, small-volume correction, and whole-brain). For brevity, the term ROI will be used hereafter to denote both region-of-interest and small-volume correction analyses. We followed the RDoC project in classifying fMRI paradigms according to RDoC domains and constructs. The number of studies included per diagnosis, level of inference and RDoC domain and demographic details of the samples are shown in Tables 1 and 2. Details of the study samples, tasks and task-contrasts used can be found in Supporting Information Methods 3.

Table 1. Overview of database structure
Diagnosis | Studies (n) | Patients (n) | Controls (n) | Region of interest analyses (n of studies) | Whole brain analyses (n of studies) | RDoC social processes (n of studies) | RDoC cognitive systems (n of studies) | RDoC negative valence (n of studies) | RDoC positive valence (n of studies) | Medication status (n of studies)
  1. Each cell presents the number of studies for each variable of interest; the total number of studies is not always the sum of the rows or columns, because some studies report coordinates that fit multiple categories; Anxiety disorders comprise Generalized Anxiety Disorder (n = 7), Specific or Social Phobia (n = 30), Panic Disorder (n = 11) and Post-traumatic Stress Disorder (n = 44); Medication status = number of studies reporting information about medication; RDoC = Research Domain Criteria.

Anxiety Disorders88151615385830342624483
Major Depressive Disorder84159115303747323841183
Bipolar Disorder7315491669314220490572
Obsessive Compulsive Disorder4186485224177208740

Table 2. Demographic information

Diagnosis | Total studies (n) | Patients (n) | Controls (n) | Patients age, mean (SD) | Controls age, mean (SD) | Patients % male | Controls % male
Schizophrenia | 251 | 4925 | 5393 | 33.8 (8.5) | 32.8 (7.9) | 69.9 | 61.5
Anxiety Disorders | 88 | 1516 | 1538 | 33.6 (8.3) | 32.7 (7.9) | 39.3 | 39.8
Major Depressive Disorder | 84 | 1591 | 1530 | 37.0 (9.7) | 35.1 (9.6) | 37.3 | 36.8
Bipolar Disorder | 73 | 1549 | 1669 | 36.7 (9.7) | 34.9 (9.3) | 46.4 | 47.2
Obsessive Compulsive Disorder | 41 | 864 | 852 | 32.9 (8.5) | 31.6 (7.5) | 47.8 | 49.2
Total | 537 | 10445 | 10982 | 34.6 (8.8) | 33.3 (8.3) | 55.2 | 51.9

Age is shown in years, mean (standard deviation); Anxiety Disorders includes studies on patients with Generalized Anxiety Disorders, Panic Disorder, Post-Traumatic Stress Disorder, Specific Phobias, and Social Anxiety Disorder.

Quantification of Studies Per Anatomical Region

We extracted the coordinates of the anatomical locations of the peak statistical effects for case-control differences from each primary study. Following the general convention in meta-analyses, we accepted the results reported as significant in the primary studies. A small minority of primary studies used two fMRI tasks, in which case we used the coordinates of the case-control differences for each task. When both ROI and whole-brain results were reported in the same study, we only used the coordinates from the whole-brain analyses. Coordinates were transformed to MNI, if reported in Talairach space, using the “tal2icbm_fsl” transform. All coordinates were mapped to the Harvard-Oxford cortical and subcortical atlases [Desikan et al., 2006; Frazier et al., 2005], with a probability threshold of 10% for each region. This threshold accommodates uncertainty about region localization across individuals and studies and recognizes that activation clusters extend beyond the location of the peak coordinates. The number of unique primary studies reporting one or more coordinates within each region was calculated. Studies that reported two or more coordinates within the same anatomical region were only counted once. Study-counts were summed across hemispheres for each region.
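
The counting rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: it assumes each peak has already been assigned an atlas label, and the function name, tuple layout, and example peaks are hypothetical. It also assumes that a study reporting peaks in both hemispheres of a region is counted once for that region; if hemisphere-specific counts were instead added together, such a study would contribute twice.

```python
from collections import defaultdict


def count_studies_per_region(peaks):
    """Count unique studies per atlas region, pooling hemispheres.

    `peaks` is an iterable of (study_id, hemisphere, region) tuples, one per
    reported peak coordinate that already carries an atlas label.
    """
    studies_by_region = defaultdict(set)
    for study_id, _hemisphere, region in peaks:
        # A set keeps each study once per region, however many peaks it
        # reports there (assumption: hemispheres are pooled per study).
        studies_by_region[region].add(study_id)
    return {region: len(ids) for region, ids in studies_by_region.items()}


# Hypothetical peaks: study "s1" reports left and right amygdala peaks.
example_peaks = [
    ("s1", "L", "Amygdala"),
    ("s1", "R", "Amygdala"),
    ("s2", "R", "Amygdala"),
    ("s2", "L", "Thalamus"),
]
counts = count_studies_per_region(example_peaks)
```

Here `counts` holds one entry per region, with "s1" contributing a single count to the amygdala despite its two peaks.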

Statistical Analyses

Independent variables of interest were diagnosis, level of inference (ROI or whole-brain), RDoC domain, RDoC construct, and direction of signal change (hypoactivation or hyperactivation in patients compared to healthy participants). The dependent variable was the number of unique studies implicating each Harvard-Oxford cortical and subcortical atlas region. To test the effect of diagnosis, we created diagnosis-by-region cross-tables and performed χ² tests with Yates' correction. The same approach was then used to investigate the effect of each of the other variables of interest. Then, we investigated the effect of the variables of interest separately for each atlas region using Fisher-exact tests. We tested for the similarity of the anatomical distributions of results for each variable of interest using Spearman's rank correlation coefficients of the numbers of studies contributing to each atlas region. For example, to test whether studies of SCZ report similar results to studies of BD, we calculated the correlation between the number of SCZ studies and the number of BD studies by atlas region.
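
As an illustration of the similarity analysis, the sketch below computes Spearman's rank correlation between two vectors of per-region study counts. It is a plain-Python illustration rather than the authors' code; the helper names and the counts for the two hypothetical diagnoses are invented for the example, and ties receive their average rank.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical study counts for five atlas regions in two diagnoses.
scz_counts = [12, 30, 7, 18, 25]
bd_counts = [10, 22, 9, 15, 28]
rho = spearman_rho(scz_counts, bd_counts)
```

A high `rho` indicates that regions frequently implicated in one diagnosis are also frequently implicated in the other, which is the sense in which the anatomical distributions are called similar.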

We performed Kruskal-Wallis tests to compare the values of each continuous, quantitative demographic and clinical variable (sample size, age [mean and SD], % male, % of patients on medication and % on each medication type) between studies that reported an effect in each region and studies that did not. For discrete, categorical variables (substance abuse exclusion, scanner field strength) we performed chi-square tests per region.

To meaningfully compare the distribution of studies across anatomical regions and diagnoses, study counts are represented as a percentage of the total number of studies for each diagnosis divided by the volume of the anatomical region, as larger brain regions would be statistically more likely to contain activation loci simply by virtue of their size. This yielded a score for each diagnosis, for each atlas region, quantified as the percentage of studies for the respective diagnosis, per cm³ within the respective region. To test whether this score was randomly (i.e., normally) distributed across regions, we performed Kolmogorov-Smirnov tests of the score distribution across all regions and across cortical regions and subcortical regions separately.
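
The volume-adjusted score amounts to a two-step normalization, sketched below. The function name and all numbers are hypothetical and serve only to show the arithmetic; region volumes would in practice come from the atlas.

```python
def studies_per_cm3(study_count, total_studies, region_volume_cm3):
    """Percentage of a diagnosis's studies implicating a region, per cm^3."""
    percent_of_studies = 100.0 * study_count / total_studies
    return percent_of_studies / region_volume_cm3


# Hypothetical example: a large region reported by many studies can score
# lower than a small region reported by fewer studies.
large_region_score = studies_per_cm3(20, 80, 4.0)  # 25% of studies over 4 cm^3
small_region_score = studies_per_cm3(5, 80, 0.5)   # 6.25% of studies over 0.5 cm^3
```

In this toy case the small region scores higher (12.5 versus 6.25), which illustrates why small subcortical structures can rank highly once region size is taken into account.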


Anatomical Distribution of Results Depending on Inference Level

When all reported coordinates were considered together (regardless of whether the primary studies used a whole-brain or ROI approach), case-control differences in BOLD signal were reported in all atlas regions between 4 and 239 times, as shown in Supporting Information Figures S2 and S3. Coordinates reported by ROI versus whole-brain studies were not equally distributed across all atlas regions (χ² = 99.59, P < 10⁻³). The same was observed when subcortical (χ² = 25.13, P < 10⁻³) and cortical regions (χ² = 64.07, P = 0.04) were considered separately. When considering case-control differences from studies using whole-brain analyses only, a Kolmogorov-Smirnov test examining the distribution of number of studies per cm³ was no longer significant (D = 0.10, P = 0.58), indicating that subcortical regions were not significantly overrepresented among whole-brain studies alone. Following Fisher-exact tests comparing each region to the total numbers of ROI and whole-brain studies, 3 out of the 8 subcortical regions and 16 out of the 48 cortical areas showed at least a nominal effect of level of inference (Table 3). Whole-brain studies were significantly overrepresented among those studies contributing to the thalamus and the brain stem, whereas ROI studies have been chiefly responsible for results in the amygdala (Table 3). Regarding cortical regions, ROI studies tended to focus on frontal and temporal regions and less on parietal and occipital regions (Table 3). Nevertheless, the frequency of ROI and whole-brain results correlated highly across regions (ρ = 0.78, P < 10⁻¹¹, adjusted for region volume). Regions that were significantly supported by ROI studies tended to be among the top regions showing case-control differences even when considering whole-brain studies alone. This lends indirect support to the a priori ROI selection. However, the posterior parahippocampal gyrus and the thalamus appear to be remarkably under-selected among studies using ROI analyses despite the high frequency of case-control differences in these regions at the whole-brain level (Table 3). Among whole-brain studies, the top 10 regions across all disorders (ranked by frequency of case-control difference adjusted for region size) were the nucleus accumbens, anterior insula, posterior parahippocampal gyrus, globus pallidus, amygdala, hippocampus, caudate, thalamus, paracingulate gyrus, and putamen (Fig. 1; Supporting Information Table S1).
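
A per-region comparison of ROI and whole-brain studies can be illustrated with a two-sided Fisher exact test on a 2×2 table. The sketch below is a pure-Python illustration, not the authors' R analysis: the table layout (studies that do or do not report the region, split by level of inference) is an assumption, and the counts are hypothetical. The two-sided P-value sums the probabilities of all tables that are no more likely than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denominator = comb(total, col1)

    def table_probability(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    observed = table_probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance so that ties with the observed table count.
    p_value = sum(
        table_probability(x)
        for x in range(lowest, highest + 1)
        if table_probability(x) <= observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


# Hypothetical layout: rows = region reported / not reported,
# columns = ROI studies / whole-brain studies.
p_region = fisher_exact_two_sided(30, 10, 70, 90)
```

A small `p_region` would suggest that the region is reported disproportionately often under one level of inference; with 56 regions tested, such P-values are nominal unless corrected for multiple comparisons.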

Figure 1.

The top 10 regions among whole-brain studies across all disorders (ranked by frequency of reported case-control difference, adjusted for region size).

Table 3. Fisher-exact P-values for the effects of variables of interest on the anatomical distribution of the results

Lobe | Atlas region | Diagnosis | ROI vs. WB | Direction of signal change | RDoC domain (a) | RDoC construct (b)
Subcortical | Putamen | 0.69 | 0.91 | 0.01 (c) | 2.72E−03 | 0.01
Subcortical | Accumbens | 0.23 | 0.59 | 2.65E−03 (c) | 2.01E−03 | 0.01
Subcortical | Amygdala | 0.01 | 2.84E−03 | 0.01 (d) | 5.11E−05 | 2.15E−06 (e)
Subcortical | Hippocampus | 0.23 | 0.69 | 0.44 | 0.01 | 4.46E−03
Subcortical | Caudate | 4.26 E −
Frontal | Anterior Cingulate Gyrus | 0.84 | 0.59 | 0.31 | 0.43 | 0.82
Frontal | Frontal Medial Cortex | 0.01 | 0.78 | 0.50 | 1.55E−03 | 0.02
Frontal | Frontal Orbital Cortex | 0.17 | 0.51 | 0.12 | 0.85 | 0.71
Frontal | Frontal Pole | 0.39 | 0.49 | 0.13 | 0.90 | 0.96
Frontal | Inferior Frontal Gyrus pars opercularis | 0.62 | 0.76 | 0.77 | 0.27 | 0.54
Frontal | Inferior Frontal Gyrus pars triangularis | 0.95 | 0.74 | 0.09 | 0.66 | 0.74
Frontal | Middle Frontal Gyrus | 0.84 | 0.24 | 0.58 | 0.14 | 0.66
Frontal | Paracingulate Gyrus | 0.07 | 0.44 | 0.52 | 0.50 | 0.62
Frontal | Precentral Gyrus | 0.84 | 0.02 | 0.08 | 0.96 | 1.00
Frontal | Subcallosal Cortex | 0.01 | 0.39 | 0.07 | 0.03 | 0.08
Frontal | Superior Frontal Gyrus | 0.68 | 0.01 | 0.35 | 0.91 | 0.70
Frontal | Supplementary Motor Cortex | 0.90 | 0.58 | 0.90 | 0.48 | 0.24
Insula | Central Opercular Cortex | 0.
Insula | Frontal Operculum Cortex | 0.66 | 0.06 | 0.69 | 4.38E−03 | 0.01
Insula | Insular Cortex | 0.22 | 0.11 | 0.76 | 0.33 | 0.49
Insula | Parietal Operculum Cortex | 0.92 | 0.
Occipital | Cuneal Cortex | 0.19 | 7.62E−04 | 0.31 | 0.53 | 0.70
Occipital | Intracalcarine Cortex | 0.52 | 0.26 | 0.11 | 0.88 | 0.99
Occipital | Lateral Occipital Cortex inferior division | 0.99 | 4.76E−03 | 0.92 | 0.67 | 0.21
Occipital | Lateral Occipital Cortex superior division | 0.99 | 0.01 | 0.37 | 0.68 | 0.74
Occipital | Lingual Gyrus | 0.45 | 0.07 | 1.00 | 0.86 | 0.48
Occipital | Occipital Fusiform Gyrus | 0.33 | 0.01 | 0.47 | 0.51 | 0.78
Occipital | Occipital Pole | 0.89 | 1.21E−03 | 0.56 | 0.79 | 0.94
Occipital | Supracalcarine Cortex | 0.32 | 0.11 | 0.09 | 0.55 | 0.29
Parietal | Posterior cingulate Gyrus | 0.
Parietal | Postcentral Gyrus | 0.99 | 0.04 | 0.43 | 0.78 | 0.98
Parietal | Precuneous Cortex | 0.71 | 0.02 | 0.85 | 0.66 | 0.48
Parietal | Angular Gyrus | 0.93 | 0.01 | 0.50 | 0.91 | 0.84
Parietal | Superior Parietal Lobule | 0.92 | 0.12 | 0.54 | 0.49 | 0.94
Parietal | Anterior supramarginal gyrus | 0.40 | 0.12 | 0.52 | 0.58 | 0.86
Parietal | Posterior supramarginal gyrus | 0.68 | 0.01 | 0.26 | 0.51 | 0.86
Temporal | Heschl's Gyrus | 0.77 | 1.00 | 0.11 | 0.81 | 0.75
Temporal | Inf Temporal Gyrus (temporooccipital) | 0.68 | 0.16 | 0.65 | 0.87 | 0.48
Temporal | Anterior inf. temporal gyrus | 0.04 | 0.31 | 0.05 (d) | 0.17 | 0.89
Temporal | Posterior inf. temporal gyrus | 0.76 | 0.25 | 0.19 | 0.36 | 0.44
Temporal | Middle Temp Gyrus (temporooccipital) | 0.70 | 0.45 | 1.00 | 0.86 | 0.31
Temporal | Anterior middle temporal gyrus | 0.47 | 0.35 | 0.34 | 0.42 | 0.65
Temporal | Posterior middle Temporal Gyrus | 0.68 | 0.38 | 0.71 | 0.70 | 0.46
Temporal | Anterior parahippocampal gyrus | 0. E − 036.73 E − 04
Temporal | Posterior parahippocampal gyrus | 0.55 | 0.02 | 0.40 | 0.18 | 0.12
Temporal | Planum polare | 0.62 | 0.11 | 0.03 (d) | 0.97 | 1.00
Temporal | Planum temporale | 0.70 | 0.31 | 0.12 | 0.43 | 0.25
Temporal | Anterior superior temporal gyrus | 0.64 | 0.24 | 0.26 | 0.45 | 0.70
Temporal | Posterior superior temporal gyrus | 0.91 | 0.90 | 1.00 | 0.24 | 0.90
Temporal | Anterior temporal fusiform cortex | 0.14 | 1.00 | 0.56 | 1.00 | 0.61
Temporal | Posterior temporal fusiform cortex | 0.54 | 0.04 | 0.67 | 0.32 | 0.27
Temporal | Temporal occipital fusiform cortex | 0.75 | 4.59E−03 | 0.62 | 0.46 | 0.23
Temporal | Temporal pole | 0.16 | 0.48 | 0.26 | 0.22 | 0.57

ROI: Region of Interest, also includes studies using small volume correction; WB: Whole Brain. Direction of signal change is always referenced to controls; case-control differences are therefore coded as hypoactivation or hyperactivation if patients show respectively less or more activation than controls.
P-values in bold indicate nominally significant effects (uncorrected P-values < 0.05).
(a) RDoC domains with sufficient observations: cognitive systems, negative valence, positive valence, social processes.
(b) RDoC constructs with sufficient observations: response to threat, working memory, cognitive constructs, social processes, declarative memory, motivation, attention, perception.
(c) Regions more likely to be hypoactive in patients.
(d) Regions more likely to be hyperactive in patients.
(e) For the amygdala, the effect of RDoC construct was tested using the χ² test because of memory limitations in R 3.1.3.

Anatomical Distribution of Results Depending on Diagnosis

Overall, there was no significant effect of diagnosis on the spatial distribution of the reported case-control differences (χ² = 232, P = 0.27). Pairwise contrasts of study-counts across all regions yielded nominal results for the contrasts of SCZ and MDD (P = 0.01) and of SCZ and anxiety disorders (P = 0.05). The general lack of diagnostic specificity is also apparent in the Spearman's rank correlations (Table 4 and Supporting Information Tables 2 and 3): for each pair of diagnoses, correlation coefficients across regions ranged between 0.42 and 0.82 and were highly significant (10⁻¹² < P < 0.001). When cortical and subcortical regions were examined separately, a significant effect of diagnosis on the anatomical distribution of reported locations of case-control differences was found for subcortical (χ² = 52.75, P = 0.003), but not cortical regions. This effect was driven by the amygdala (P = 0.01) and the caudate nucleus (P < 0.01; Table 3), but disappeared when considering only whole-brain studies (Table 5). In contrast, among whole-brain studies the only region that showed a nominally significant effect of diagnosis was the nucleus accumbens (P = 0.004). This effect was primarily driven by an increased frequency of results reported in the nucleus accumbens from OCD studies compared to SCZ (P = 0.017) and MDD (P = 0.004) studies. Considering each region separately, none of the 56 regions showed an effect of diagnosis that was significant under a Bonferroni-corrected α of 0.05/56 = 0.0009 (Tables 3 and 5). The inference-dependence of the diagnostic effects is also illustrated in Figures 2-8.

Figure 2.

Percentage of studies within each diagnostic category reporting one or more coordinates within each subcortical structure.

Figure 3.

Percentage of studies across all diagnoses reporting one or more coordinates within each cortical structure.

Figure 4.

Percentage of studies of schizophrenia reporting one or more coordinates within each cortical structure.

Figure 5.

Percentage of studies of major depression reporting one or more coordinates within each cortical structure.

Figure 6.

Percentage of studies of bipolar disorder reporting one or more coordinates within each cortical structure.

Figure 7.

Percentage of studies of anxiety disorders reporting one or more coordinates within each cortical structure.

Figure 8.

Percentage of studies of obsessive-compulsive disorder reporting one or more coordinates within each cortical structure.

Table 4. Pair-wise Spearman's rank correlations of study counts per region between diagnoses, adjusted for region volume
Diagnostic-pair contrast | Correlation coefficient | P-value
  1. Both whole-brain and region of interest studies were considered. ANX = Anxiety Disorders, includes studies on patients with Generalized Anxiety Disorders, Panic Disorder, Post-Traumatic Stress Disorder, Specific Phobias and Social Anxiety Disorder; BD= Bipolar Disorder; MDD = Major Depressive Disorder; OCD = Obsessive Compulsive Disorder; SCZ = Schizophrenia.

Table 5. Fisher-exact P-values for the effects of variables of interest on the anatomical distribution of results, for whole-brain studies only

Lobe | Region | Diagnosis | Direction of signal change | RDoC domain (a) | RDoC construct (b)
Frontal | Anterior Cingulate Gyrus | 0.88 | 1.00 | 0.81 | 0.40
Frontal | Frontal Medial Cortex | 0.89 | 0.83 | 0.12 | 0.42
Frontal | Frontal Orbital Cortex | 0.55 | 0.18 | 0.93 | 0.94
Frontal | Frontal Pole | 0.84 | 0.02 | 0.89 | 0.90
Frontal | Inferior Frontal Gyrus pars opercularis | 0.98 | 0.42 | 0.72 | 0.48
Frontal | Inferior Frontal Gyrus pars triangularis | 0.09 | 0.11 | 0.34 | 0.84
Frontal | Middle Frontal Gyrus | 0.77 | 0.92 | 0.40 | 0.46
Frontal | Paracingulate Gyrus | 0.73 | 0.83 | 0.35 | 0.67
Frontal | Precentral Gyrus | 0.87 | 0.04 | 0.99 | 0.95
Frontal | Subcallosal Cortex | 0.44 | 0.06 | 0.80 | 0.85
Frontal | Superior Frontal Gyrus | 0.47 | 0.19 | 0.90 | 0.95
Frontal | Supplementary Motor Cortex | 0.70 | 0.33 | 0.37 | 0.15
Insula | Central Opercular Cortex | 0.37 | 0.42 | 0.22 | 0.24
Insula | Frontal Operculum Cortex | 0.61 | 0.87 | 0.09 | 0.11
Insula | Insular Cortex | 0.17 | 0.50 | 0.15 | 0.27
Insula | Parietal Operculum Cortex | 0.67 | 0.69 | 0.99 | 0.98
Occipital | Cuneal Cortex | 0.40 | 0.49 | 0.90 | 0.93
Occipital | Intracalcarine Cortex | 0.06 | 0.14 | 0.69 | 0.93
Occipital | Lateral Occipital Cortex inferior division | 0.93 | 0.44 | 0.69 | 0.88
Occipital | Lateral Occipital Cortex superior division | 0.98 | 0.76 | 0.98 | 0.83
Occipital | Lingual Gyrus | 0.77 | 0.80 | 0.80 | 0.62
Occipital | Occipital Fusiform Gyrus | 0.79 | 0.65 | 0.93 | 0.97
Occipital | Occipital Pole | 0.98 | 0.41 | 0.99 | 0.98
Occipital | Supracalcarine Cortex | 0.56 | 0.12 | 0.46 | 0.83
Parietal | Angular Gyrus | 0.85 | 0.81 | 0.96 | 0.93
Parietal | Cingulate Gyrus posterior division | 0.14 | 0.16 | 0.67 | 0.86
Parietal | Postcentral Gyrus | 0.97 | 0.61 | 0.78 | 0.99
Parietal | Precuneous Cortex | 0.94 | 0.64 | 0.27 | 0.30
Parietal | Superior Parietal Lobule | 0.88 | 0.63 | 0.94 | 0.77
Parietal | Supramarginal Gyrus anterior division | 0.87 | 0.17 | 0.75 | 0.77
Parietal | Supramarginal Gyrus posterior division | 0.68 | 0.70 | 0.59 | 0.66
Temporal | Heschl's Gyrus | 0.95 | 1.00 | 0.51 | 0.73
Temporal | Inf Temporal Gyrus temporooccipital part | 0.42 | 0.34 | 0.71 | 0.79
Temporal | Inferior Temporal Gyrus anterior division | 0.88 | 0.13 | 0.27 | 0.66
Temporal | Inferior Temporal Gyrus posterior division | 0.41 | 0.30 | 0.44 | 0.17
Temporal | Middle Temp Gyrus temporooccipital part | 0.59 | 1.00 | 0.86 | 0.51
Temporal | Middle Temporal Gyrus anterior division | 0.21 | 0.75 | 0.30 | 0.37
Temporal | Middle Temporal Gyrus posterior division | 0.25 | 0.74 | 0.43 | 0.75
Temporal | Parahippocampal Gyrus anterior division | 0.68 | 0.25 | 0.77 | 0.43
Temporal | Parahippocampal Gyrus posterior division | 0.86 | 0.27 | 0.70 | 0.78
Temporal | Planum Polare | 0.48 | 1.00 | 0.91 | 0.89
Temporal | Planum Temporale | 0.51 | 0.24 | 0.18 | 0.32
Temporal | Superior Temporal Gyrus anterior division | 0.46 | 0.44 | 0.73 | 0.98
Temporal | Superior Temporal Gyrus posterior division | 0.68 | 0.70 | 0.69 | 0.94
Temporal | Temporal Fusiform Cortex anterior division | 0.19 | 0.68 | 0.65 | 0.16
Temporal | Temporal Fusiform Cortex posterior division | 0.98 | 1.00 | 0.58 | 0.87
Temporal | Temporal Occipital Fusiform Cortex | 0.61 | 0.36 | 0.15 | 0.05
Temporal | Temporal Pole | 0.66 | 0.75 | 0.26 | 0.65

(a) RDoC domains with sufficient observations: cognitive systems, negative valence, positive valence, social processes.
(b) RDoC constructs with sufficient observations: response to threat, working memory, cognitive constructs, social processes, declarative memory, motivation, attention, perception.

To examine the robustness of these results, we calculated the observed effect size for the overall chi-square test of diagnosis. The chi-square test for the effect of diagnosis had 97% power to detect a modest effect size of w = 0.2 (df = 220, alpha = 0.05, 2,267 observations, calculated using G*Power), and >99.99% power to detect a moderate effect size of w = 0.30, even when considering only whole-brain studies. Next, we calculated the number of additional studies required to potentially change the current findings by simulating multiples of our cross-tables in R. The effect size of diagnosis was Cramér's V = 0.11 when considering both region-of-interest and whole-brain studies. Therefore, 500 new studies would be required to detect a diagnostic difference at 80% power and alpha < 0.05. When considering only whole-brain studies, 1,054 additional studies would be required to detect a significant effect of diagnosis at 80% power and alpha < 0.05. We conducted the same analyses at the level of brain regions. Given the observed effect sizes, the number of studies required for a significant effect of diagnosis increased further. For example, for the amygdala and middle frontal gyrus, two regions that are commonly included in different disease models, we would require approximately double the size of the current database. A review of the literature published while the meta-analysis was being conducted showed that 47 potentially eligible studies were published for SCZ and the numbers for the other conditions were substantially smaller. Therefore, inclusion of these studies would be unlikely to change the results.

Anatomical Distribution of Results Across RDoC Domains and Constructs

When all primary studies were considered together (irrespective of diagnosis), the choice of task, as classified by RDoC domains and constructs, had an effect on the location of case-control differences in subcortical structures (domains: χ² = 66.71, P = 1.17 × 10⁻⁶; constructs: χ² = 130.79, P = 1.46 × 10⁻⁵; Fig. 9). Following Fisher exact tests per region, effects were significant for the amygdala, hippocampus, putamen, and nucleus accumbens (Table 3; all uncorrected P ≤ 0.01). In the cortex, the overall effect was not significant (domains: χ² = 156.4, P = 0.10; constructs: χ² = 397.61, P = 0.78); only a nominally significant effect could be detected in the frontal medial cortex, frontal operculum and the anterior parahippocampal gyrus (Table 3; all uncorrected P ≤ 0.01). However, after excluding ROI studies, no regions were significantly associated with RDoC domains or constructs (Table 5; all uncorrected P > 0.08). The frequency of the RDoC domains tested in the primary studies was significantly different across diagnoses (χ² = 87.24; P < 10⁻¹², Table 1). Therefore, we tested whether classifying studies by RDoC domain may uncover diagnosis-specific patterns of results. Fisher-exact tests per region within each RDoC domain, however, did not yield any significant results beyond what could be expected by chance (all P > 0.01).

Figure 9.

For each region, the contribution of studies that used tasks engaging domains defined by the RDoC project is shown as a proportion of the total number of studies showing case-control differences in that region. Regional distributions can be compared to the overall RDoC distribution shown in the bars on the right of each figure.

Anatomical Distribution of Results Depending on Direction of Signal Change

Across all studies, the pattern of “hyperactive” and “hypoactive” foci in patients compared to controls was highly correlated across all regions (ρ = 0.79, P < 2.2 × 10⁻¹⁶). Nevertheless, this effect was not distributed equally across all regions (χ² = 98.51, P = 0.0001), cortical regions (χ² = 65.93, P = 0.03), or subcortical regions (χ² = 29.90, P < 10⁻⁴). The amygdala, planum polare, and the anterior part of the inferior temporal gyrus were significantly more likely to be “hyperactive” in patients, whereas the putamen and nucleus accumbens were more likely to be “hypoactive” (Table 3, Supporting Information Figures S4a and S4b).

Effect of Potential Moderating Demographic and Clinical Variables

The following P-values are Bonferroni-corrected for the number of regions tested, but not for the number of moderator variables investigated, so that potentially meaningful moderator effects can be considered in future study designs. We found no evidence for an effect of medication status (medicated vs. unmedicated patients; all P > 0.14), medication type (all P > 0.99), study sample size (all P > 0.5), year of publication (P > 0.17), and mean age and SD (all P > 0.64) for any region. A significant effect of alcohol and substance abuse was found only for the angular gyrus (χ² = 16.96; P = 0.01). Of the studies that included patients regardless of substance abuse status, 50% reported case-control differences in this region compared to 23% of studies that excluded patients with a lifetime history of substance abuse, and compared to 14% of studies that excluded patients with both current and lifetime substance abuse. Studies that reported results in the frontal pole tended to include more female patients (W = 39,607; P = 0.04) and controls (W = 39,351; P = 0.06). Studies reporting results in the frontal medial cortex also included on average a higher proportion of female patients (W = 15,862, P = 0.03), but not controls. Finally, higher field-strengths were more likely to contribute to results in the frontal pole (χ² = 21.84; P = 0.001).
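The correction scheme described above (Bonferroni correction across regions, applied separately for each moderator variable) can be sketched as follows. This is a minimal illustration under that assumption; the function name, region labels, and uncorrected P-values are hypothetical.

```python
def bonferroni_across_regions(p_values):
    """Multiply each regional P-value by the number of regions, capped at 1."""
    n_regions = len(p_values)
    return {region: min(1.0, p * n_regions) for region, p in p_values.items()}


if __name__ == "__main__":
    # Hypothetical uncorrected P-values for one moderator variable.
    uncorrected = {
        "angular gyrus": 0.0002,
        "frontal pole": 0.02,
        "amygdala": 0.40,
    }
    # The correction is applied within this moderator only; no further
    # adjustment is made for the number of moderators examined.
    for region, p in bonferroni_across_regions(uncorrected).items():
        print(f"{region}: corrected P = {p:.4f}")
```

Because the adjustment is made within each moderator only, the family-wise error rate across all moderators taken together is not controlled, which is consistent with the exploratory intent stated above.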


This study was motivated by the need to interrogate the large task-fMRI literature for evidence of specificity between regional activation patterns and diagnosis for common psychiatric disorders. A further aim was to test whether any evidence of specificity could be accounted for by abnormalities in task-related activations pertaining to mental function domains specified in the RDoC project. We also investigated which regions are most influenced, and possibly biased, by level of inference by comparing the results of ROI studies to data-driven whole-brain studies. Study results are based on the quantitative synthesis of task-fMRI findings from 547 studies comprising observations derived from 21,692 participants.

Similarity in the Spatial Distribution of Task-fMRI Foci of Case-Control Differences Across Psychiatric Disorders

We found nominal effects of diagnosis in several regions when considering both ROI and whole-brain studies (Table 3); these results were driven by ROI studies alone and none survived Bonferroni correction. Thus, great caution is required before interpreting these subtle and potentially biased results as indicative of meaningful diagnosis-specific effects. Rather, our results indicate that the anatomical distribution of case-control differences in fMRI studies is largely diagnosis-general, as indicated by the high similarity between case-control fMRI results across disorders (Figs. 2 and 4–8, Table 4, Supporting Information Tables S2 and S3). The similarity in the cortical regions implicated in MDD and anxiety disorders, shown in Figures 5 and 7, is particularly striking.

It is theoretically possible that misdiagnosis, medication, symptom profile or disease severity may have influenced the analyses of case-control differences. Given the differences in these variables between diagnostic groups, we consider a systematic bias resulting in increased similarity between diagnoses to be implausible. For example, on average 92% of patients with SCZ were prescribed antipsychotics while this was the case for only 15% of patients with OCD. Therefore, we consider alternative interpretations for the similarities in the spatial distribution of task-fMRI case-control differences. It is possible that the disorders examined here arise from largely overlapping neural network dysfunction. This possibility is supported by a recent meta-analysis of brain structural case-control differences across multiple disorders that also failed to identify diagnosis-specific effects [Goodkind et al., 2015]. A common biological substrate provides an explanation for the symptomatic overlap in the disorders examined here [Beesdo et al., 2010; Buckley et al., 2009; Eisen and Rasmussen, 1993; Kim et al., 2015; O'Conghaile and DeLisi, 2015; Pearlson, 2015; Rosen et al., 2012]. The transdiagnostic overlap in brain activation abnormalities also resonates with behavioral observations of shared transdiagnostic cognitive deficits that are present at or before disease onset [Shanmugan et al., 2016; Koenen et al., 2009], and with transdiagnostic overlap of genetic risk factors [Cross-disorder group of the PGC, 2013; Doherty and Owen, 2014]. Thus, it appears that transdiagnostic overlap is consistently observed at multiple scales of enquiry involving genetic factors, symptoms, cognitive function, brain morphology, and brain activity. This encourages a cross-diagnostic approach to the investigation of the biological underpinnings of mental disorders. However, there are significant differences in the relative prevalence of symptoms across the disorders examined here.
For example, OCD symptoms may be present in about 18% of patients with SCZ [Kim et al., 2015] and psychotic experiences are reported by about 14% of patients with OCD [Eisen and Rasmussen, 1993]. Our findings suggest that the relationship between abnormalities in task-related networks and symptoms is both complex and unclear. This relationship has mostly been examined in SCZ, where cognitive dysfunction shows no correlation with positive symptoms and only a moderate correlation with negative symptoms [Ventura et al., 2009, 2010]. It is therefore likely that the abnormalities in brain networks and network-regions we can observe with fMRI reflect disorder-general conditions that facilitate the emergence and persistence of symptoms but are insufficient for explaining symptomatic variability across disorders. Task-related fMRI cannot identify the nature of computations at the neuronal level; it can only detect the corresponding BOLD signal, which suggests that such computations occur within neuronal assemblies. It is possible that the variability in the observed and reported symptoms across disorders is linked to the exact nature of neuronal computations, which may not be detectable by the task-related fMRI studies included in this meta-analysis.

Regions Implicated Across Disorders

Although case-control differences were widely distributed, the dorsal and ventral striatum, the amygdala and hippocampus and cortical regions within the frontal operculum/anterior insula, posterior parahippocampal gyrus and paracingulate gyrus were relatively overrepresented (Fig. 1). Meta-analyses of anatomical MRI studies have confirmed the involvement of subcortical pathology in SCZ [van Erp et al., 2016], MDD [Schmaal et al., 2016] and BD [Hibar et al., 2016], while a meta-analysis of brain volumetric studies identified cortical grey matter reductions in the insula and the anterior cingulate cortex as the most robust deficits across disorders [Goodkind et al., 2015].

We found some evidence for diagnostic specificity, supported by whole-brain studies, regarding the nucleus accumbens: case-control differences in this region were more frequently reported in OCD, particularly compared to MDD and SCZ. This observation does not imply that the nucleus accumbens is not involved in disorders other than OCD. Rather, it indicates the relative importance of the nucleus accumbens in OCD compared to other disorders, which is congruent with current neurocognitive theories of OCD that focus on cortico-striatal-thalamic loops [Graybiel and Rauch, 2000; Milad and Rauch, 2012]. The role of the nucleus accumbens in this framework lies in the integration of affective information with motor selection [Fineberg et al., 2010; Wood and Ahmari, 2015]. However, given the difficulty of imaging a small structure like the nucleus accumbens with high confidence, studies that specifically focus on nucleus accumbens anatomy and function are necessary to verify and understand the present finding.

Similarity in the Spatial Distribution of Task-fMRI Foci of Case-Control Differences Across RDoC Domains and Constructs

The results of our study suggest that tasks assigned to different RDoC domains or constructs result in a similar neuroanatomical distribution of case-control differences. The RDoC framework specifies different mental processes that are grouped together under distinct domains which are assumed to map onto discrete neural circuits. However, the degree to which mental processes engage specific, common or partially overlapping regions remains a topic of debate [Pessoa, 2014; Price and Friston, 2005]. In fMRI experiments, multiple brain areas are co-activated during a given task; conversely, a single brain area may be activated by disparate tasks that may not always share cognitive components [Price and Friston, 2005]. The relationship between brain structure and function has been described both as pluripotent (one-to-many) and degenerate (many-to-one). Price and Friston [2005] argue that although brain regions are engaged by multiple dissimilar processes, there is a “common denominator” that encapsulates the core functionality of each region. Conversely, given the pluripotent nature of brain organization, such common denominator labels can be attributed to multiple brain regions that are bound to overlap significantly with brain regions defined by other “common denominator” labels. Our data suggest that RDoC domains and constructs operate at the level of “common denominator” labels, which could account for the similarity in the spatial pattern of case-control differences across the diagnoses examined. Alternatively, our results may reflect the fact that standard univariate MRI analyses may not be ideally suited for the examination of brain structure-function relationships relevant to disease processes, as they do not provide information about the interactions between the brain regions engaged during a task.
It has been argued that patterns of brain connectivity, rather than activation, may lead to increased specificity in localizing cognitive processing and the effects of psychiatric disorders [Menon, 2011; Muldoon and Bassett, 2014]. To date, however, examination of the connectomic architecture of the brain has contributed further to the idea that disease expression impacts shared brain regions and networks across psychiatric disorders [Crossley et al., 2014].

The foci identified in task-fMRI studies reflect the contrast between baseline and task-related brain activity. Individual differences in baseline activation patterns, such as inherent low frequency fluctuations, occur independently of task and may depend on intrinsic connectivity and morphology, synaptic organization and neuronal density, and physiological parameters such as heart rate and vascular factors. These inherent activation patterns are detectable during resting-state fMRI, and predict those that can be observed during tasks [Zou et al., 2013]. However, this is a line of enquiry for future studies, as task-fMRI meta-analyses of published data cannot address the contribution of baseline activity.

ROI Analysis and Confirmation Bias

The use of ROIs has the advantage of restricting analyses to specific brain regions, thus reducing the burden of correction for multiple testing. We found a correlation between case-control frequencies derived from ROI analyses and those derived from whole-brain analyses, which generally supports this practice. However, on a region-by-region basis we also found that ROI studies resulted in the over-representation of the amygdala and the caudate nucleus, which was not supported when whole-brain studies were considered. Conversely, several regions, particularly the thalamus and parahippocampal gyrus, are not commonly selected in analyses based on a priori hypotheses despite data-driven support from whole-brain studies for their involvement in psychiatric disorders. The pre-selection of ROIs, possibly in combination with the difficulty of publishing negative results, seems to bias the literature and may indirectly lead to oversimplification and over-localization of neurobiological models of behavior and symptoms. Our data caution that overreliance on ROI analysis may hamper data-driven discovery and artificially exaggerate the role of some regions in psychiatric disorders while ignoring crucial contributions from others.


As this is a coordinate-based meta-analysis, we did not include studies that did not find case-control differences. Although it is statistically possible to model negative studies, such practice would rely on the assumption that negative studies were indeed sufficiently powered. Given the wide variability of tasks, sample characteristics and analysis methods, meaningful retrospective assessment of power at the level of individual studies is infeasible. Moreover, our aim was to test whether it is possible to infer diagnostic specificity for any brain region implicated by available task-fMRI data.

We did not examine the effect of symptom severity as there was little overlap in scales within diagnosis and particularly across diagnoses. We consider that symptom severity is unlikely to have changed our results given the lack of a significant effect of diagnosis on the topology of case-control differences despite the variable clinical presentation of the patient samples. The studies we considered varied widely in the details of their task designs and within-subject condition contrasts. We were guided by the RDoC project in assigning the different tasks to their purported cognitive domains as we were interested in identifying associations between RDoC domains and diagnosis on case-control differences in patterns of task-fMRI activation. It is theoretically possible, although unlikely given the general lack of diagnosis-specific findings, that other approaches to task classification may have yielded different results. However, we consider the RDoC scheme to represent the best approximation to a gold standard classification for fMRI tasks.

We used a standard atlas to map the reported coordinates as this facilitates interpretation of the results. As foci could only be mapped to regions included in the atlas, we ensured that our chosen atlas included all main subcortical regions, such as the nucleus accumbens, which proved to be important. As larger brain regions are by default more likely to include more coordinates, we accounted for this by normalizing the number of studies by region volume, by covarying for region volume, or by comparing the frequency distributions against predicted frequency distributions under the null hypothesis given the same regions.
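The adjustment for region size described above can be sketched in two of its forms: a density (number of studies per unit volume) and a comparison of observed counts against counts expected under a null hypothesis. The sketch assumes that, under the null, foci fall in regions in proportion to region volume; the exact null model used in this study is not specified here, and the function names, volumes, and counts are hypothetical.

```python
def study_density(observed, volumes):
    """Number of studies per unit volume for each region."""
    return {region: observed[region] / volumes[region] for region in observed}


def expected_counts(volumes, total):
    """Counts expected if foci were distributed in proportion to volume."""
    total_volume = sum(volumes.values())
    return {region: total * v / total_volume for region, v in volumes.items()}


def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic comparing observed and expected counts."""
    return sum(
        (observed[region] - expected[region]) ** 2 / expected[region]
        for region in observed
    )


if __name__ == "__main__":
    # Hypothetical region volumes (cm^3) and numbers of studies per region.
    volumes = {"amygdala": 2.0, "putamen": 5.0, "frontal pole": 60.0}
    observed = {"amygdala": 30, "putamen": 20, "frontal pole": 84}

    expected = expected_counts(volumes, sum(observed.values()))
    print("Density:", study_density(observed, volumes))
    print("Expected under the volume-proportional null:", expected)
    print("Chi-square statistic:", chi_square_statistic(observed, expected))
```

Under this assumption, a small region with many reported foci stands out in both the density and the χ² contribution, whereas a large region with the same raw count does not.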


The findings of this study suggest that case-control differences in task-fMRI activation reveal a shared topography for SCZ, BD, MDD, anxiety disorders, and OCD. This shared topography explains common deficits in cognitive circuits but does not fully account for variability in clinical presentation and cannot be assumed to imply shared etiological or pathogenic mechanisms. Our findings encourage studies that cross diagnostic boundaries, emphasize the importance of whole-brain studies and urge the careful interpretation and consideration of ROI studies.