Purpose:
To develop an automated lesion-filling technique (LEAP; LEsion Automated Preprocessing) that would reduce lesion-associated brain tissue segmentation bias (which is known to affect automated brain gray [GM] and white matter [WM] tissue segmentations in people who have multiple sclerosis), and a WM lesion simulation tool with which to test it.
Materials and Methods:
Simulated lesions with differing volumes and signal intensities were added to volumetric brain images from three healthy subjects and then automatically filled with values approximating normal WM. We tested the effects of simulated lesions and lesion-filling correction with LEAP on SPM-derived tissue volume estimates.
Results:
GM and WM tissue volume estimates were affected by the presence of WM lesions. With simulated lesion volumes of 15 mL at 70% of normal WM intensity, the effect was to increase GM fractional (relative to intracranial) volumes by ≈2.3%, and reduce WM fractions by ≈3.6%. Lesion filling reduced these errors to ≈0.1%.
SEPARATION OF BRAIN GRAY MATTER (GM) and white matter (WM) tissues by segmentation of high-resolution T1-weighted structural images has become a key element of many magnetic resonance imaging (MRI) analysis protocols, and is used to determine tissue volumes and regions of a given tissue type for subsequent quantitative parameter extraction. Near fully automated methods are now usually preferred (for example, SPM (1), SIENAX (2), or FreeSurfer (3)). However, they can be influenced by the presence of WM lesions, leading to misclassifications of GM and WM tissues (4). This can be problematic in multiple sclerosis (MS), where WM lesions are a cardinal feature of the disease, but where important pathological changes also occur in GM (for example, Refs. 5, 6) and normal-appearing WM (for example, Refs. 7, 8) that in turn cause a real disease-induced change in GM and WM tissue volumes. A clear awareness of a given segmentation method's limitations and robustness in the presence of WM lesions is required to achieve better segmentation solutions, develop practical methods for correcting the effect of lesions on tissue volume estimates, and enable robust interpretation of results derived from segmented GM and WM.
To address these issues we performed a study in two parts: the first to quantify the potential effect of simulated WM lesions on GM and WM atrophy measures, and the second to develop methods to reduce lesion-induced segmentation errors.
There has previously been relatively little work on the simulation of WM lesions (4, 9–11), their effect on GM and WM tissue volumes (4), and methods to limit their potential influence on image segmentation; Sdika and Pelletier (11) have recently looked at the effects of WM lesions on image registration—a key stage in many segmentation algorithms—and developed a lesion filling technique to reduce lesion-associated registration errors.
In the first part of the study we developed a technique to simulate focal WM lesions and then quantified their effect on tissue volumes estimated using one frequently employed method, SPM.
In the second part of the study, which aimed to limit lesion-associated segmentation biases, we tested existing methods that either exclude lesions during segmentation or reclassify lesions as WM after segmentation (for example, Ref. 4). We also developed and tested an automated technique (LEAP; LEsion Automated Preprocessing) that fills lesions with normal WM values before segmentation, to ascertain whether it would lead to better segmentation results, as assessed by measures of tissue volumes in the healthy brain.
MATERIALS AND METHODS
Structural images were acquired in healthy volunteers with a 3D inversion-prepared fast spoiled gradient recall sequence (3D FSPGR, repetition time = 13.3 msec, echo time = 4.2 msec, inversion time = 450 msec, final resolution 1.2 × 1.2 × 1.5 mm³) on a 1.5 T GE Signa scanner (General Electric, Milwaukee, WI) using the manufacturer's standard birdcage head radiofrequency coil. This work had approval from the National Hospital for Neurology and Neurosurgery and UCL Institute of Neurology Joint Research Ethics Committee; subjects gave written informed consent to participate.
Artificial Lesion Generation
A method was developed to:
1. Add predetermined lesion loads to a healthy control structural brain scan;
2. With known intensity profiles;
3. While retaining natural biological (small random differences in the intensities of neighboring WM voxels) and image inhomogeneity variation in WM intensities throughout the brain;
4. With a lesion distribution capable of fulfilling the MRI elements of the diagnostic criteria for MS (12), ie, place lesions in infratentorial, periventricular, and juxtacortical WM locations;
5. Using tools readily available to and usable by other groups.
ImageJ (Rasband, ImageJ, US National Institutes of Health, Bethesda, MD; http://rsb.info.nih.gov/ij/), a Java-based image-processing toolkit, was used to manipulate the brain images, with an ImageJ tool developed in-house to allow the placing of 3D spheroid “lesions” (available for download from the NMR Research Unit website, along with step-by-step usage instructions: http://www.nmrgroup.ion.ucl.ac.uk/analysis/synthlesions.html).
The processing pipeline is as follows:
1. A blank lesion mask image is first generated, with dimensions matching the target healthy-control structural image; all voxels are set to an intensity of 0;
2. The mask image is overlaid on the control structural image, allowing lesions to be added to the mask with direct reference to (but without altering) the structural image;
3. Spheroid regions of fixed volume (in our tests a 5 × 5 × 5 voxel spheroid, with six such lesions equivalent to 1 mL), with an intensity of 1, are manually added to the mask. This is saved for use in the volume estimation and lesion-filling stage;
4. The lesion mask is scaled so that the lesion intensity is equivalent to the intended drop in WM intensity, ie, if lesions are to be 10% darker than normal WM, the lesion intensity is set to 0.1. This is subtracted from a further image containing the value 1 everywhere, yielding a lesion template image representing the desired output image intensity as a proportion of the input image intensity;
5. The structural target image is multiplied by the lesion template to yield an image with simulated lesions.
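The arithmetic behind steps 1 to 5 can be sketched in a few lines of Python. This is an illustrative sketch only, not the ImageJ tool itself: the function names, the dictionary-based image representation, the toy uniform image, and the definition of a "5 × 5 × 5 voxel spheroid" as all voxels within 2.5 voxels of the centre are assumptions made for the example.

```python
# Sketch of the lesion-simulation arithmetic (steps 1-5), standard library only.
# All names are hypothetical; a 3D image is a dict mapping (x, y, z) -> intensity.

def blank_mask(shape):
    """Step 1: mask with the same dimensions as the structural image, all zeros."""
    nx, ny, nz = shape
    return {(x, y, z): 0.0 for x in range(nx) for y in range(ny) for z in range(nz)}

def add_spheroid(mask, centre, diameter=5):
    """Step 3: set voxels within diameter/2 voxels of the centre to 1 (assumed shape)."""
    radius = diameter / 2.0
    cx, cy, cz = centre
    for (x, y, z) in mask:
        if (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 <= radius ** 2:
            mask[(x, y, z)] = 1.0

def lesion_template(mask, drop):
    """Step 4: scale the mask by the intended fractional drop and subtract it from 1."""
    return {voxel: 1.0 - drop * value for voxel, value in mask.items()}

def apply_template(image, template):
    """Step 5: multiply the structural image by the template, voxel by voxel."""
    return {voxel: image[voxel] * template[voxel] for voxel in image}

shape = (9, 9, 9)
image = {voxel: 100.0 for voxel in blank_mask(shape)}   # toy uniform "WM" image
mask = blank_mask(shape)
add_spheroid(mask, centre=(4, 4, 4))
simulated = apply_template(image, lesion_template(mask, drop=0.30))  # lesions at 70% of WM
```

Because the template is multiplicative, any voxel-to-voxel variation present in the input image is scaled rather than overwritten, which is how the approach retains natural WM intensity variation inside the simulated lesions.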
A set of test images was then generated with known lesion loads and intensities relative to normal WM. Test images sought to simulate lesion loads of 10 and 20 mL (with 60 and 120 artificial lesions, respectively, placed throughout the WM, including in the posterior fossa, periventricular, and subcortical regions), which are similar to lesion loads seen in people with MS of about 20 years duration (13), with intensities ranging from 90% of WM, through the GM range (about 60%–70% of WM intensity in this imaging sequence), to CSF (30%–40% of WM intensity). Within a given test image, all lesions were scaled to the same intensity. MS lesions visible using this imaging sequence usually fell in the 60%–80% of WM intensity range (tested by measuring the mean intensity in a single-slice ovoid region of interest sampling the core of nine lesions from three people with MS; mean intensity values from seven of the nine lesions fell within this range). An example of a simulated lesion image is shown in Fig. 1.
We developed LEAP to: 1) fill a WM lesion mask with simulated normal appearing WM; 2) replicate biological and imaging related variability in WM intensity over the brain; 3) avoid operator intervention, other than that required for the initial lesion segmentation; and 4) use tools readily available to and usable by other groups.
The method requires two 3D input images: the original structural image and a lesion mask. Processing of these images is fully automated, resulting in a single 3D image with filled lesions. The software for this is available for download from the NMR Research Unit website, along with step-by-step usage instructions: http://www.nmrgroup.ion.ucl.ac.uk/analysis/lesionfill.html.
2. Nonparametric nonuniform intensity normalization (N3) (14) is applied with a characteristic distance of 35 mm to correct for spatial variation in both image and WM intensity; the resulting image has a similar WM intensity throughout the brain;
3. The intensity variation profile is found by taking the ratio of the original image to the nonuniformity-corrected image. This profile is used in step 6;
4. A histogram of image intensities is generated for the skull-stripped and nonuniformity-corrected image from step 2. Voxels with zero intensity, and those within the MS lesion mask, are excluded, and the histogram range is limited to intensities within 2.5 standard deviations of the mean value;
5. The histogram is modeled numerically as the sum of four Gaussian components representing GM, WM, cerebrospinal fluid (CSF), and partial-volume voxels (15). Values for the peak height, location, and half-width at half-maximum (HWHM) giving the best fit to the histogram data are found for each Gaussian component using the Fityk peak-fitting software (http://sourceforge.net/projects/fityk) (Fig. 2);
6. A 3D image containing WM signal intensity values is generated:
   i) Simulated image noise values are generated randomly from a Gaussian distribution with a mean of zero and standard deviation equal to the normal-appearing WM peak HWHM;
   ii) To introduce spatial correlation between adjacent voxels, the image noise values are smoothed with a 2D Gaussian kernel of radius 0.6 pixels, and scaled to maintain the standard deviation;
   iii) The signal intensity of the normal-appearing WM peak location is then added;
   iv) These values are multiplied on a voxel-by-voxel basis by the 3D intensity nonuniformity profile obtained in step 3 to reintroduce the original spatial variation in WM signal intensity;
7. The MS lesion volume in the original image is replaced with intensity values taken from corresponding locations in the simulated WM image.
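Steps 6 and 7 can be illustrated with a short Python sketch operating on a single 2D slice. It is a simplified sketch under stated assumptions, not the released LEAP software: all function names are hypothetical, the "radius" of 0.6 pixels is assumed to be the Gaussian sigma, the kernel is truncated at two pixels, image edges are handled by clamping, and the toy slice uses a flat nonuniformity profile.

```python
import math
import random
import statistics

def gaussian_kernel(sigma, half_width=2):
    """Normalised 1D Gaussian weights for offsets -half_width..half_width."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-half_width, half_width + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_2d(img, kernel):
    """Separable 2D smoothing; coordinates are clamped at the image edges."""
    h, w = len(img), len(img[0])
    k = len(kernel) // 2
    rows = [[sum(kernel[j + k] * img[y][min(max(x + j, 0), w - 1)]
                 for j in range(-k, k + 1)) for x in range(w)] for y in range(h)]
    return [[sum(kernel[j + k] * rows[min(max(y + j, 0), h - 1)][x]
                 for j in range(-k, k + 1)) for x in range(w)] for y in range(h)]

def simulate_wm(shape, wm_peak, wm_hwhm, bias, sigma=0.6, seed=0):
    """Step 6: noise (i), smoothing and rescaling (ii), WM peak (iii), nonuniformity (iv)."""
    rng = random.Random(seed)
    h, w = shape
    noise = [[rng.gauss(0.0, wm_hwhm) for _ in range(w)] for _ in range(h)]
    noise = smooth_2d(noise, gaussian_kernel(sigma))
    sd = statistics.pstdev([v for row in noise for v in row])
    scale = wm_hwhm / sd if sd > 0 else 0.0   # restore the pre-smoothing spread
    return [[(noise[y][x] * scale + wm_peak) * bias[y][x]
             for x in range(w)] for y in range(h)]

def fill_lesions(image, lesion_mask, simulated_wm):
    """Step 7: copy simulated WM values into lesion voxels, leave the rest untouched."""
    return [[simulated_wm[y][x] if lesion_mask[y][x] else image[y][x]
             for x in range(len(image[0]))] for y in range(len(image))]

# Toy slice: uniform WM of 100 with a 3 x 3 lesion at 70% of WM intensity.
shape = (8, 8)
lesion_mask = [[1 if 2 <= x <= 4 and 2 <= y <= 4 else 0 for x in range(8)] for y in range(8)]
image = [[70.0 if lesion_mask[y][x] else 100.0 for x in range(8)] for y in range(8)]
bias = [[1.0] * 8 for _ in range(8)]          # flat nonuniformity profile (assumption)
simulated = simulate_wm(shape, wm_peak=100.0, wm_hwhm=3.0, bias=bias)
filled = fill_lesions(image, lesion_mask, simulated)
```

Rescaling after smoothing matters because smoothing alone would shrink the noise spread, which would make filled regions look more uniform than native WM.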
An example of a test image with synthetic lesions filled with simulated normal-appearing WM is shown in Fig. 1.
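The histogram preparation and model used in the lesion-filling procedure (voxel exclusion, range limiting, and a sum of Gaussian peaks described by height, location, and HWHM) can be sketched as follows. The function names and the example component values are hypothetical, and the fitting itself (performed with Fityk in this work) is not reproduced here. For a Gaussian, HWHM = σ√(2 ln 2) ≈ 1.18σ.

```python
import math
import statistics

def select_histogram_voxels(values, lesion_flags, n_sd=2.5):
    """Drop zero-intensity and lesion voxels, then keep values within n_sd SDs of the mean."""
    kept = [v for v, is_lesion in zip(values, lesion_flags) if v != 0 and not is_lesion]
    mean = statistics.mean(kept)
    sd = statistics.pstdev(kept)
    return [v for v in kept if abs(v - mean) <= n_sd * sd]

def gaussian_peak(x, height, centre, hwhm):
    """Gaussian parameterised by peak height, location, and half-width at half-maximum."""
    return height * math.exp(-math.log(2.0) * ((x - centre) / hwhm) ** 2)

def histogram_model(x, components):
    """Sum of Gaussian components, each given as (height, centre, hwhm)."""
    return sum(gaussian_peak(x, *component) for component in components)

# Hypothetical (height, centre, hwhm) values for CSF, GM, WM, and partial-volume peaks.
components = [(200.0, 35.0, 8.0), (900.0, 65.0, 7.0), (1200.0, 100.0, 4.0), (150.0, 82.0, 10.0)]
model_at_wm_peak = histogram_model(100.0, components)
```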
The effect of synthetic MS lesions on GM and WM volumes was assessed using SPM5 (1). Images from three healthy subjects were used to assess the accuracy of the segmentation by initially producing GM and WM volumes using each dataset with no lesions added (these volumes were assumed to be the gold standard) and subsequently performing the segmentation on the same datasets with different volumes of artificial lesions added. For each artificial lesion-load trial, gray matter fractions (GMF) and white matter fractions (WMF) were estimated. Two methods for reducing lesion-associated volume estimate errors were tested: 1) tissue segmentation without masking, followed by reclassification of lesions as WM (as per Ref. 4, in which lesion contours are generated using a semiautomated technique, and all voxels falling within these regions are classified as being WM regardless of the segmentation results from SPM); and 2) lesion filling with LEAP prior to tissue segmentation.
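The first correction method (reclassification after segmentation) and the fraction calculation can be sketched as below. This is a minimal sketch under assumptions: tissue maps are flat lists of per-voxel probabilities, intracranial volume is taken as the sum of the GM, WM, and CSF segments, and the function names are hypothetical rather than part of SPM.

```python
def tissue_fractions(gm, wm, csf):
    """Return (GMF, WMF): tissue volume relative to intracranial volume (GM + WM + CSF)."""
    gm_vol, wm_vol, csf_vol = sum(gm), sum(wm), sum(csf)
    icv = gm_vol + wm_vol + csf_vol
    return gm_vol / icv, wm_vol / icv

def reclassify_lesions_as_wm(gm, wm, csf, lesion_mask):
    """After segmentation, force every voxel inside the lesion mask to be WM."""
    gm_out = [0.0 if m else g for g, m in zip(gm, lesion_mask)]
    wm_out = [1.0 if m else w for w, m in zip(wm, lesion_mask)]
    csf_out = [0.0 if m else c for c, m in zip(csf, lesion_mask)]
    return gm_out, wm_out, csf_out

# Toy example: the fourth voxel is a lesion that was mostly segmented as GM.
gm, wm, csf = [1.0, 0.0, 0.0, 0.8], [0.0, 1.0, 0.0, 0.2], [0.0, 0.0, 1.0, 0.0]
lesions = [0, 0, 0, 1]
gmf_before, wmf_before = tissue_fractions(gm, wm, csf)
gmf_after, wmf_after = tissue_fractions(*reclassify_lesions_as_wm(gm, wm, csf, lesions))
```

Reclassification only changes voxels inside the mask; it cannot undo any shift in tissue boundaries elsewhere in the image, which is the limitation that filling before segmentation is intended to address.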
On the MRI sequence used in this work, our experience has shown that visible MS lesions usually have intensities that fall between 60% and 80% of normal WM values; from the simulation element of this work, segmentation errors appeared to be greater with lesions set at 70% rather than 60% or 80% of normal WM intensity, and we therefore tested the lesion correction methods with lesions set to 70% of normal WM values and total volumes of 15 mL (similar to the lesion load found in people with long-established relapsing-remitting MS (13)). Test images, with simulated lesions, were generated using scans from three healthy control subjects (one female and two males; mean age 34 years, range 34–35 years).
Estimated tissue volume errors were calculated as:
$$E = 100 \times \frac{V_T - V_O}{V_O}$$
where $E$ is the percentage error, $V_T$ is the volume estimated from the test image with synthetic lesions added to it, and $V_O$ is the volume estimated from the original lesion-free image.
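As a small illustration, the error measure can be computed as follows, assuming it is the difference between the test and original volumes expressed as a percentage of the original volume. The function name and the example fractions are hypothetical.

```python
def percent_error(v_test, v_original):
    """Percentage error of the test-image volume relative to the lesion-free volume."""
    if v_original == 0:
        raise ValueError("original volume must be non-zero")
    return 100.0 * (v_test - v_original) / v_original

# Hypothetical values: a GM fraction of 0.5000 without lesions and 0.5115 with lesions.
gm_error = percent_error(0.5115, 0.5000)   # about +2.3 (%)
```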
The effects of different artificial lesion volumes and intensities on estimated tissue volumes derived from a single subject are illustrated in Table 1. The volumes presented in this table have not been corrected for WM lesion misclassification as GM or CSF.
Table 1. Effects of Differing Artificial Lesion Volumes on Tissue Segmentations in a Single Subject
[Table body not recovered; column headings: GMF % difference, WMF % difference.]
Lesion volumes of 10 and 20 mL equate to about 0.6% and 1.2% of total intracranial volume, 2.0% and 4.1% of WM volume, or 1.1% and 2.2% of GM volume. SPM-derived gray matter (GMF) and white matter fractions (WMF) are presented as % change relative to results from the original lesion-free image. Intensities of 30% WM approximate CSF, 60% to 70% approximate GM, and 60% to 80% approximate most MS lesions. Tissue volumes have not been corrected for WM lesion misclassification as GM or CSF.
The effects of correcting for lesions before and after brain tissue segmentation are illustrated in Table 2. Using SPM, both lesion correction methods provide GM and WM fractions that are closer to the original measures obtained using lesion-free scans, with the least residual error seen using the LEAP technique. In regions where artificial lesions were placed, in the original (lesion-free) images 99.6% of voxels were classified by SPM as being WM, whereas when filled with simulated WM values this was 100.0%, ie, natural WM was slightly less likely to be classified as WM by SPM than was simulated WM.
Table 2. Effect of Lesion Correction Techniques on Estimated Tissue Volumes
Gray matter fraction (GMF), white matter fraction (WMF), and the sum of these (brain parenchymal fraction; BPF) values derived from three healthy control scans, to which 15 mL of lesions set to 70% WM intensity were added.
Correction: nil – no adjustment for lesion misclassification made; after – following segmentation, regions falling within lesions reclassified as WM; LEAP – lesions filled with simulated normal WM values. Values are percentage mean (standard deviation) errors.
This study demonstrates that GM and WM tissue volume measures derived using SPM are substantially affected by the presence of focal WM lesions (Table 1), and that the LEAP technique most effectively reduces lesion-associated segmentation biases, particularly those affecting GM, when compared with a technique that accounts for lesions after tissue segmentation. SPM was chosen for this study because it is one of the software tools most commonly used for segmenting GM and WM volumes, but the purpose of this article is not to characterize SPM errors, rather to present a robust method for dealing with WM lesions that can be used as a preprocessing step with any suitable software tool used for GM-WM segmentation.
As can be seen from Fig. 3, the effects of WM lesions on tissue segmentation are complex, reflecting both tissue misclassification within lesions and subtle shifts in tissue boundaries due to apparent changes in tissue intensities. In the images used in this work, WM is brighter than GM, which is brighter than CSF. Lesions that are slightly darker than normal WM may remain classed as WM, but the darker they are the more likely it is that they are classified as GM or CSF. The average intensity of WM, GM, and CSF segments will also be influenced by (WM) lesions that are included within them. If a lesion remains classified as WM, the mean intensity of WM appears to drop, which in turn shifts WM boundaries into darker (GM) regions, ie, WM volumes increase and GM volumes decrease (as is seen with lesions set to 90% of normal WM values in Table 1). Once lesions are dark enough not to be classified as WM, they may be included in the GM segments. This will directly increase GM and decrease WM volumes, but also increase the apparent mean GM value, so causing GM boundaries to shift toward lighter (WM) regions, further increasing GM volumes (as seen in Table 1 with lesions between 60% and 80% of normal WM values). If lesions are sufficiently dark, they will be classified as CSF, which in turn could make the apparent mean CSF intensity increase, leading CSF boundaries to shift toward GM and WM, so reducing both GM and WM volumes (Table 1, lesions at 30% of normal WM values).
Although individual MS WM lesions will exhibit a wide range of signal intensities, and their overall effect on GM and WM volume measures will reflect a combination of the mechanisms discussed above, most lesions have signal intensities in the 60%–80% range compared to nonlesion WM. The potential effect of such “typical” lesions on SPM-derived GM and WM volumes in MS is illustrated by considering the following two observations: 1) data from Fisniku et al (13, 16) indicate that in people with relapsing–remitting MS, disease durations of about 20 years, and an average T2-weighted lesion load of 13 mL, GM atrophy is ≈6% relative to healthy controls, and WM atrophy ≈3%; 2) in the present simulation study, the effects of lesions with a “typical” load (15 mL) and signal intensity (70% WM) were estimated to be ≈+2.3% on GM volumes and ≈−3.6% on WM volumes. These observations suggest that the effects of WM lesions (if uncorrected) on tissue segmentations may lead to disease-associated GM atrophy being underestimated by up to a third and WM atrophy being overestimated by a factor of two.
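The arithmetic behind these proportions can be made explicit with a short worked example. It assumes, as a first-order approximation, that the lesion-induced bias simply adds to the true atrophy and that the atrophy figures quoted above are free of lesion bias; the function name is hypothetical.

```python
def apparent_atrophy(true_atrophy_pct, lesion_bias_pct):
    """Apparent volume loss (%) when a lesion-induced volume bias (%) is superimposed."""
    return true_atrophy_pct - lesion_bias_pct

gm_true, wm_true = 6.0, 3.0        # atrophy relative to controls (%), from the text
gm_bias, wm_bias = 2.3, -3.6       # simulated lesion effects on volume (%), from the text

gm_apparent = apparent_atrophy(gm_true, gm_bias)      # about 3.7%
wm_apparent = apparent_atrophy(wm_true, wm_bias)      # about 6.6%
gm_underestimate = 1.0 - gm_apparent / gm_true        # about 0.38, roughly a third
wm_overestimate = wm_apparent / wm_true               # about 2.2, roughly a factor of two
```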
Current volume correction techniques target the misclassification of lesions during (as for example in SIENAX) or after segmentation (as per Ref. 4), and there has been little work seeking to reduce the effects of lesions prior to presenting images for further processing (11). It was with this in mind that we developed LEAP, a straightforward automated method for filling lesions with simulated normal-appearing WM values, and set about testing the performance of this against a method accounting for lesions after segmentation. For WM volumes, both methods of dealing with lesions perform similarly. For GM volumes, however, there is a clear advantage of the presegmentation lesion filling technique; in MS, filling lesions before segmentation would be expected to reduce biases to less than one-fiftieth of the observed atrophy (16).
It is to be expected that the performance of the LEAP method proposed here will be similar regardless of lesion intensities. While we tested the technique with lesions set to a “typical lesion” value of 70% of normal WM intensity (Table 2), the residual segmentation error would be the same regardless of the lesions' precorrection intensities. This would not be true of methods correcting for lesion misclassification after tissue segmentation, where the effects of lesions on overall tissue signal intensity (and thus potential shifts in tissue boundaries) would not have been addressed. In addition, because the LEAP technique is independent of the eventual segmentation method, it is a robust candidate for use with a variety of different algorithms.
With regard to the lesion simulation method proposed here, the solution presented represents a pragmatic compromise between several competing factors. We chose to develop a method that would allow the main elements of lesion variability (their volume, intensity, and location) to be varied in a controlled fashion. Using this method, it is possible to simulate more complex lesion shapes by adding overlapping spheroids, and more heterogeneous intensity profiles by adding lesions in several rounds, each with different scaling factors applied. With regard to the lesion filling method, visual inspection reveals subtle residual differences between filled and native WM (Fig. 1), and this is mirrored by small residual differences in tissue volumes after lesion filling compared to the gold-standard results. Given that the LEAP method requires a lesion mask, errors defining this mask could also result in lesions being underfilled, or in the filled region encroaching on GM or CSF; it is likely that this also explains the apparent increase in the WM content of regions in which lesions were placed, ie, artificial lesions next to GM and CSF will have included a few voxels that were actually GM or CSF, but after lesion filling with simulated WM values they fell into the WM segments. Recognizing that some lesion contouring methods or operators may systematically draw conservative lesion masks, an option to dilate the masks before filling is provided in the software toolkit. Compared to the method proposed by Sdika and Pelletier (11), the LEAP technique uses the whole white matter histogram and an established nonuniformity correction algorithm to overcome possible biases related to the use of local signal intensities, which may be advantageous in some situations.
However, the nonuniformity correction may perform with variable efficiency, dependent on the head coil used, which counsels some caution when extrapolating the present results not only to other image sequences, but also the same sequences implemented using different MR equipment and parameters.
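The mask dilation option mentioned above could, for illustration, take the form of a one-voxel binary dilation. The sketch below is an assumption about what such an option might do, not the toolkit's implementation: the 6-connected neighbourhood, the set-based mask representation, and the function name are all hypothetical.

```python
def dilate_mask(mask, shape):
    """One-voxel, 6-connected binary dilation; mask is a set of (x, y, z) lesion voxels."""
    nx, ny, nz = shape
    offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    grown = set(mask)
    for (x, y, z) in mask:
        for dx, dy, dz in offsets:
            neighbour = (x + dx, y + dy, z + dz)
            # Ignore neighbours that fall outside the image volume.
            if 0 <= neighbour[0] < nx and 0 <= neighbour[1] < ny and 0 <= neighbour[2] < nz:
                grown.add(neighbour)
    return grown

dilated = dilate_mask({(1, 1, 1)}, shape=(3, 3, 3))
```

Dilation trades one error for another: it reduces the chance of leaving a rim of unfilled lesion, but increases the chance of overwriting adjacent GM or CSF voxels with WM values.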
Although the present report is focused on development of the LEAP method per se, and understanding its potential utility through simulated lesion experiments, the method has been applied in a preliminary investigation of 30 scans from MS patients (Bodini B, personal communication); LEAP was successful in filling 100% of lesions in 94% of subjects; it failed to lesion-fill in only 6% of subjects, in whom the WM signal peak was indistinct in the intensity histogram. This suggests that the technique should be robust in clinical studies in MS, although further investigation in MS and other patient cohorts is warranted. As for the actual magnitude of the correction effect of LEAP, as may be seen from the results of the synthetic lesion work, this will depend on the original lesion intensity characteristics and volume, and is likely to be influenced by scan acquisition factors (such as scan sequences and instrument characteristics). It would be useful to know how large the effect of lesion correction really is in scans obtained from people with MS; however, in the absence of a “gold standard” true lesion-free volume, this cannot be quantified. The results of the synthetic lesion tests do provide an objective measure; however, because synthetic lesions are not a perfect facsimile of real lesions, results derived from such tests can show how various correction methods compare but should be treated with due caution when estimating how much lesion-associated bias they remove from data obtained in people with MS.
In conclusion, this work highlights the importance of testing GM and WM segmentation techniques if they are to be used to assess tissue-specific volumes in MS, or in other diseases in which focal WM lesions occur. Lesion-associated segmentation biases may significantly affect results, reducing sensitivity to real disease effects while simultaneously introducing spurious changes that may be incorrectly considered to be disease-related. The artificial lesion method presented here provides a straightforward way to characterize a segmentation technique's performance when WM lesions are present, while LEAP provides a fully automated and effective lesion-filling technique to eliminate most of the lesion-associated segmentation errors.
The NMR Research Unit is supported by the MS Society of Great Britain and Northern Ireland. This work was undertaken at University College London Hospital and University College London, who received a proportion of funding from the Department of Health's National Institute for Health Research Biomedical Research Centres funding scheme. Lesion filling failure rates in people with MS were estimated from data kindly provided by Dr. Benedetta Bodini.