Evaluation of automated brain MR image segmentation and volumetry methods

Authors

  • Frederick Klauschen,

    Corresponding author
    1. Department of Biomedicine, Neuroinformatics and Image Analysis Laboratory, University of Bergen, Bergen, Norway
    • National Institutes of Health, 9000 Rockville Pike, 10-11N311, Bethesda, MD 20892-1892, USA
  • Aaron Goldman,

    1. National Institutes of Health, National Institute of Mental Health, Neuroimaging Core Facility, Genes, Cognition and Psychosis Program, Bethesda, Maryland
  • Vincent Barra,

    1. LIMOS, UMR CNRS 6158, Blaise Pascal University Campus des Cezeaux, Aubiere, France
  • Andreas Meyer-Lindenberg,

    1. National Institutes of Health, National Institute of Mental Health, Neuroimaging Core Facility, Genes, Cognition and Psychosis Program, Bethesda, Maryland
    2. Zentralinstitut f. Seelische Gesundheit, J5, Mannheim, Germany
  • Arvid Lundervold

    1. Department of Biomedicine, Neuroinformatics and Image Analysis Laboratory, University of Bergen, Bergen, Norway

Abstract

We compare three widely used brain volumetry methods available in the software packages FSL, SPM5, and FreeSurfer and evaluate their performance using simulated and real MR brain data sets. We analyze the accuracy of gray and white matter volume measurements and their robustness against changes in image quality using the BrainWeb MRI database. These images are based on “gold-standard” reference brain templates, which allows us to assess both between-segmenter (same data set, different method) and within-segmenter (same method, varying image quality) comparability. For both, we find pronounced variations in the segmentation results for gray and white matter volumes. The calculated volumes deviate by up to >10% from the reference values for gray and white matter, depending on method and image quality. Sensitivity is best for SPM5; volumetric accuracy for gray and white matter is similar in SPM5 and FSL and better than in FreeSurfer. For BrainWeb data of constant image quality, FSL shows the highest stability for white matter (<5%) and FreeSurfer for gray matter (6.2%). Between-segmenter comparisons show discrepancies of up to >20% for the simulated data and 24% on average for the real data sets, whereas within-method performance analysis reveals volume differences of up to >15%. Because the discrepancies between results reach the same order of magnitude as volume changes observed in disease, these effects limit the usability of the segmentation methods for following volume changes in individual patients over time and should be taken into account during the planning and analysis of brain volume studies. Hum Brain Mapp, 2009. © 2008 Wiley-Liss, Inc.
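
The percentages quoted in the abstract can be read as relative volume differences. The sketch below is only an illustration of that kind of arithmetic. The abstract does not define the exact metrics, so both formulas are assumptions: deviation is taken relative to the reference volume, and between-method discrepancy is taken relative to the mean of each pair. The method names and volumes are hypothetical and are not results from the study.

```python
def percent_deviation(measured_ml, reference_ml):
    """Signed deviation of a measured volume from a reference volume, in percent.

    Assumed definition: 100 * (measured - reference) / reference.
    """
    if reference_ml <= 0:
        raise ValueError("reference volume must be positive")
    return 100.0 * (measured_ml - reference_ml) / reference_ml


def max_pairwise_discrepancy(volumes_ml):
    """Largest relative difference between any two methods, in percent.

    Assumed definition: absolute difference divided by the mean of the pair.
    """
    names = list(volumes_ml)
    worst = 0.0
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            va, vb = volumes_ml[first], volumes_ml[second]
            worst = max(worst, 100.0 * abs(va - vb) / ((va + vb) / 2.0))
    return worst


# Hypothetical gray matter volumes in millilitres (illustrative only).
reference_gm_ml = 900.0
measured_gm_ml = {"method_a": 945.0, "method_b": 880.0, "method_c": 1010.0}

for name, volume in measured_gm_ml.items():
    deviation = percent_deviation(volume, reference_gm_ml)
    print(f"{name}: {deviation:+.1f}% vs reference")

discrepancy = max_pairwise_discrepancy(measured_gm_ml)
print(f"max between-method discrepancy: {discrepancy:.1f}%")
```

Deviation from a reference is only available for the simulated data, where a template volume exists. For real data sets without ground truth, a pairwise measure such as the second function is one way to compare methods against each other.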
