MP2RAGE multispectral voxel‐based morphometry in focal epilepsy

Abstract We assessed the applicability of MP2RAGE for voxel‐based morphometry. To this end, we analyzed its brain tissue segmentation characteristics in healthy subjects and the potential for detecting focal epileptogenic lesions (previously visible and nonvisible). Automated results and expert visual interpretations were compared with conventional VBM variants (i.e., T1 and T1 + FLAIR). Thirty‐one healthy controls and 21 patients with focal epilepsy were recruited. 3D T1‐, T2‐FLAIR, and MP2RAGE images (consisting of INV1, INV2, and MP2 maps) were acquired on a 3T MRI scanner. Brain tissue segmentation effects and lesion detection rates were analyzed among single‐ and multispectral VBM variants. MP2‐single‐contrast gave better delineation of deep subcortical nuclei but was prone to misclassification of dura/vessels as gray matter, even more than conventional‐T1. The addition of multispectral combinations (INV1, INV2, or FLAIR) could markedly reduce such misclassifications. MP2 + INV1 yielded generally clearer gray matter segmentation, allowing better differentiation of white matter and neighboring gyri. Different models detected known lesions with a sensitivity between 60 and 100%. In nonlesional cases, MP2 + INV1 was found to be best, with a concordant rate of 37.5%, specificity of 51.6%, and concordant to discordant ratio of 0.60. In summary, we show that multispectral MP2RAGE VBM (e.g., MP2 + INV1, MP2 + INV2) can improve brain tissue segmentation and lesion detection in epilepsy.

the common causes associated with refractory focal epilepsy that are identifiable via an MRI. Typical imaging features are blurred gray-white matter (GM-WM) junctions, cortical thinning/thickening, and hypo- or hyperintense MR signal (Blumcke et al., 2011). However, a relevant proportion of lesions escape visual detection. Approximately 30-50% of patients undergoing surgery without an MRI-visible lesion eventually have cortical dysplasia/FCD upon histological investigation (Bernasconi, Bernasconi, Bernhardt, & Schrader, 2011; Wang et al., 2013). Potential reasons for missing an FCD are its subtlety and, at times, its small size as well as variable location. Scans with missed FCDs are mostly considered as normal MRI (MRI-negative patients). Consequently, these patients have worse surgical outcomes compared to cases with visible lesions on MRI (Tellez-Zenteno, Hernandez Ronquillo, Moien-Afshari, & Wiebe, 2010).
Particularly at higher field strengths of ≥3 T, image bias due to static magnetic (B0) and radio-frequency field (transmission B1+ and reception B1−) inhomogeneities is problematic for segmentation algorithms (Focke et al., 2011). One option to reduce signal intensity inhomogeneities is to acquire two MPRAGE images (hence MP2RAGE) with different inversion times while keeping the other sequence parameters identical (Marques et al., 2010; Van de Moortele et al., 2009). The resultant images show enhanced contrast-to-noise ratio (especially GM-WM contrast), are independent of T2*, proton density, and B1−, and have reduced B1+ inhomogeneities. Thus, acquired images are, at least partially, corrected for image bias intrinsically and have been called "self-bias corrected images." Newer studies have shown promise of MP2RAGE in improving visualization of lesions (Beck et al., 2018; Pittau et al., 2018). Moreover, reduced intensity inhomogeneities should also improve tissue segmentation, which facilitates VBM analysis for lesion detection (Ashburner & Friston, 2005). However, MP2RAGE and the multispectral MP2RAGE variants have not been systematically analyzed for detecting subtle epileptogenic lesions in a VBM approach. Moreover, it was previously shown that the performance of lesion detection is strongly influenced by the choice of smoothing and statistical thresholds or t-scores (Martin et al., 2017). The ideal parameters for an MP2RAGE VBM are as yet unclear in focal epilepsy.
To this end, we first assessed systematic differences of MP2RAGE versus T1 and T1 + FLAIR multispectral tissue segmentation in healthy controls. Subsequently, the diagnostic performance of MP2RAGE-VBM in focal epilepsy with and without visible lesions (MRI-negative) was quantified using a lobar hypothesis. These results will offer guidance in applications of MP2RAGE-VBM (and its multispectral combinations) in contrast to conventional T1 and T1 + FLAIR VBM.
(Ashburner & Friston, 2005) was applied with default settings of bias regularization 0.0001 and bias cutoff FWHM 60 mm. Next, GM, WM, and CSF tissue probability maps were spatially normalized to an isotropic resolution of 1 mm³ using Diffeomorphic Anatomical Registration Through Exponentiated Lie Algebra (DARTEL), based on the respective native space GM and WM maps and building a custom template in MNI space for each multispectral combination (Ashburner, 2007). During the normalization process, images were either modulated to preserve tissue quantity (for group level analysis) or left unmodulated to preserve tissue concentration (for individual subject vs. group analysis) (Good et al., 2001). As a final step, images were smoothed using Gaussian kernels of 4-16 mm (step size of 2 mm) full width at half maximum (FWHM).
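The smoothing grid described above is specified in FWHM, whereas many Gaussian filter implementations expect a standard deviation. The following is a minimal sketch of the standard conversion, sigma = FWHM / (2·sqrt(2·ln 2)); the function name `fwhm_to_sigma` and the variable names are illustrative assumptions and not part of the original pipeline, which used SPM.

```python
import math

def fwhm_to_sigma(fwhm_mm: float) -> float:
    """Convert a Gaussian kernel's FWHM (mm) to its standard deviation (mm)."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))

# Kernel sizes stated in the text: 4-16 mm FWHM in steps of 2 mm.
kernels_mm = list(range(4, 17, 2))

# Hypothetical lookup table: FWHM (mm) -> sigma (mm).
sigma_by_fwhm = {k: fwhm_to_sigma(k) for k in kernels_mm}

for fwhm, sigma in sigma_by_fwhm.items():
    print(f"FWHM {fwhm:2d} mm -> sigma {sigma:.2f} mm")
```

With 1 mm isotropic voxels, the sigma in millimetres equals the sigma in voxels.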

| Comparisons for absolute tissue volumes
Native segmented GM, WM, and CSF were used to calculate the tissue volumes for healthy controls across segmentation models. For this purpose, voxel values (ranging from 0 to 1) were summed and multiplied by the voxel volume (1 mm³) to yield a tissue class specific volume. To estimate the total intracranial volume (TIV), the absolute volumes of GM, WM, and CSF were added. To assess significant differences between segmentation models, one-way repeated measures ANOVA was conducted in SPSS (IBM SPSS Statistics 22) with adjustment for multiple comparisons (Bonferroni).
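The volume computation described above can be sketched as follows. This is an illustration on toy data, assuming each tissue probability map has already been loaded and flattened into a list of values between 0 and 1; the function names and the conversion to mL (1 mL = 1000 mm³) are choices made for this example.

```python
def tissue_volume_ml(prob_map, voxel_volume_mm3=1.0):
    """Sum voxel-wise tissue probabilities and scale by voxel volume.

    prob_map: flat iterable of probabilities in [0, 1] (assumed input format).
    Returns the tissue volume in millilitres (1 mL = 1000 mm^3).
    """
    return sum(prob_map) * voxel_volume_mm3 / 1000.0

def total_intracranial_volume_ml(gm, wm, csf, voxel_volume_mm3=1.0):
    """TIV as the sum of the absolute GM, WM, and CSF volumes."""
    return sum(tissue_volume_ml(m, voxel_volume_mm3) for m in (gm, wm, csf))

# Toy example with four voxels per map (not real data).
gm = [1.0, 0.5, 0.0, 0.5]
wm = [0.0, 0.5, 0.0, 0.5]
csf = [0.0, 0.0, 1.0, 0.0]

print(tissue_volume_ml(gm))
print(total_intracranial_volume_ml(gm, wm, csf))
```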

| Voxel-based comparison of multispectral variants
Smoothed, normalized, modulated GM, WM, and CSF images of healthy controls were used to perform group comparisons across different segmentation models. T1 + FLAIR segmentation was used as a reference, since this had the best overall segmentation quality in our previous work. Each multispectral combination was then compared against T1 + FLAIR using a paired t-test in SPM, at 4 mm smoothing and a statistical threshold of p < .05 FWE (family wise error). In the individual analysis, we compared each patient against all controls (patient comparison) and each control against the rest of the controls (after removing the control in question, that is, control comparison) in SPM12 using statistical cutoffs from 2.5 to 6 (step size of 0.1). This analysis was repeated for all smoothing levels (4-16 mm in steps of 2 mm).
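To illustrate the individual analysis, the sketch below computes, for a single voxel, the t-score of one subject against a control group and applies the leave-one-out scheme used for the control comparison. It assumes a plain two-sample t-test with equal variances and no covariates, in which case the statistic reduces to (x − mean) / (sd · sqrt(1 + 1/n)). The actual SPM12 design used in the study may differ (e.g., in covariates), and all function names here are hypothetical.

```python
import math
import statistics

def single_vs_group_t(value, controls):
    """t-score of one observation against a control sample (one voxel).

    Two-sample t-test with n1 = 1: the pooled variance equals the control
    variance, and the degrees of freedom are len(controls) - 1.
    """
    n = len(controls)
    mean_c = statistics.mean(controls)
    sd_c = statistics.stdev(controls)  # sample SD (n - 1 in the denominator)
    return (value - mean_c) / (sd_c * math.sqrt(1.0 + 1.0 / n))

def leave_one_out_t(controls):
    """Compare each control against the remaining controls."""
    return [
        single_vs_group_t(c, controls[:i] + controls[i + 1:])
        for i, c in enumerate(controls)
    ]

# Statistical cutoffs stated in the text: 2.5 to 6 in steps of 0.1.
t_cutoffs = [round(2.5 + 0.1 * i, 1) for i in range(36)]

# Toy voxel values (not real data).
controls = [0.40, 0.45, 0.50, 0.55, 0.60]
patient_t = single_vs_group_t(0.90, controls)
suprathreshold = [t for t in t_cutoffs if patient_t > t]
print(round(patient_t, 2), len(suprathreshold))
```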

| Automated individual lobar analysis
We used the MNI structural lobar atlas provided with FSL version 5.0 (Collins, Holmes, Peters, & Evans, 1995; Mazziotta et al., 2001) for the automated lobar analysis. The atlas comprised bilateral masks for the frontal, parietal, temporal, and occipital lobes. For each patient, we extracted lobar regions-of-interest from the atlas based on the clinical hypothesis (see hypothesis lobes in Data S1, Table S1). These lobes were considered as "concordant lobes," that is, concordant to the clinical hypothesis, while the remaining lobes were labeled "discordant lobes" individually for each patient. Since we do not expect controls to have epileptogenic findings, all lobes (bilateral: frontal, temporal, parietal, and occipital lobes) were defined as discordant lobes for all controls. Specificity was defined as the percentage of controls with no VBM findings relative to the total number of controls (findings with less than one third of voxels inside the brain were considered as no finding):

$$S_P = \frac{\text{Number of controls with no findings}}{\text{Total number of controls}} \times 100$$

$C_R$, $D_R$, and $S_P$ were calculated for each smoothing level (4-16 mm), across statistical cutoffs 2.5-6, for all VBM models.
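The specificity definition above can be sketched in a few lines. The data layout is an assumption made for illustration: each control is represented by a list of findings, and each finding by a pair (voxels inside the brain, total voxels). The one-third rule is interpreted here as discarding any finding whose in-brain fraction is below 1/3.

```python
from fractions import Fraction

def is_valid_finding(voxels_in_brain, voxels_total):
    """A finding counts only if at least one third of its voxels lie in the brain."""
    if voxels_total <= 0:
        return False
    return Fraction(voxels_in_brain, voxels_total) >= Fraction(1, 3)

def specificity_percent(findings_per_control):
    """S_P = (controls with no valid findings / total controls) * 100."""
    n_without = sum(
        1
        for findings in findings_per_control
        if not any(is_valid_finding(inside, total) for inside, total in findings)
    )
    return 100.0 * n_without / len(findings_per_control)

# Toy cohort of 31 controls: 16 without findings, 15 with one valid finding.
cohort = [[] for _ in range(16)] + [[(90, 100)] for _ in range(15)]
print(round(specificity_percent(cohort), 1))
```

With 16 of 31 controls free of findings, this toy cohort yields 51.6%, which matches the specificity reported for MP2 + INV1 in the abstract; the split itself is illustrative and not taken from the study data.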

| Estimating smoothing and T-threshold
The method to estimate the diagnostically ideal smoothing and T-threshold has already been explained in detail in our previous work. In brief, a single-channel VBM variant is considered as a reference, that is, MP2 in our case. Receiver operating characteristic (ROC) curves are plotted for each smoothing level, using $C_R$ (concordant rate) and 100 − $S_P$ (the false positive rate, with $S_P$ the specificity) values generated across statistical cutoffs. For the MP2 VBM, the smoothing level yielding the highest AUC was then selected for further analysis. At this reference smoothing, the optimal T-threshold was determined by the intersection between $C_R$ and $S_P$. If multiple T-thresholds fulfilled the criterion, the point with the highest $C_R$ was chosen. As such, the selected T-threshold can be seen as a balanced trade-off between $C_R$ and $S_P$. Euclidean distances in the ROC plots are calculated between the obtained $C_R$/$S_P$ point and the ideal $C_R$/$S_P$ point (i.e., 100, 100) as an overall performance marker that integrates both $C_R$ and $S_P$. Thus, lower Euclidean distances signify a better diagnostic test performance. To further validate our choice of smoothing and statistical cutoffs, we additionally used these parameters on visible epileptogenic lesions (suspected MCDs, n = 5).
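The selection procedure above can be sketched as follows. Several details are assumptions made for this illustration: the area under the ROC points is computed with the trapezoidal rule over the observed points only, the "intersection" of the concordant rate and specificity on a discrete cutoff grid is taken as the cutoff with the smallest absolute difference between the two (ties broken by the higher concordant rate), and all function names are hypothetical.

```python
import math

def roc_auc(cr, sp):
    """Trapezoidal area under (100 - S_P, C_R) points, scaled to [0, 1]."""
    pts = sorted((100.0 - s, c) for c, s in zip(cr, sp))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area / (100.0 * 100.0)

def select_threshold(cutoffs, cr, sp):
    """Cutoff where C_R and S_P are closest; ties go to the higher C_R."""
    best = min(zip(cutoffs, cr, sp), key=lambda r: (abs(r[1] - r[2]), -r[1]))
    return best[0]

def distance_to_ideal(cr, sp):
    """Euclidean distance from (C_R, S_P) to the ideal point (100, 100)."""
    return math.hypot(100.0 - cr, 100.0 - sp)

# Toy curves across three cutoffs (percent values, not real data).
cutoffs = [3.0, 3.3, 3.6]
cr = [50.0, 40.0, 20.0]
sp = [30.0, 40.0, 70.0]

print(select_threshold(cutoffs, cr, sp))
print(round(roc_auc(cr, sp), 3))
# Values reported in the abstract for MP2 + INV1: C_R = 37.5, S_P = 51.6.
print(round(distance_to_ideal(37.5, 51.6), 2))
```

For the reported MP2 + INV1 values the distance evaluates to roughly 79.05; smaller values indicate a point closer to the ideal corner.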

| Visual interpretation
Visual cross-verification of VBM findings was done by one certified neuroradiologist with more than 10 years of experience. For group level analysis, structural differences were visually interpreted on native space across individual subjects, to assess if the VBM differences were evident on the native tissue maps. For individual level analysis, at the estimated smoothing level and T-threshold for each subject, the VBM findings across models were combined and inverse transformed to native space using the deformation utility in SPM12.
These native combined VBM findings were smoothed with a 1 mm Gaussian kernel and were provided to the reviewer for scoring over- For controls, each finding was rated as 1: visible and likely nonepileptogenic, 2: unclear/nonvisible, or 3: artifact. The nonepileptogenic label was given when a finding was visible on one or more structural images but was not likely related to epilepsy (e.g., perivascular spaces or microangiopathy). Unclear labels were used whenever the findings were not visible clearly enough to be confirmed as epileptogenic, nonepileptogenic, or an artifact.

| Group level differences across segmentation models
We found significant differences among segmentation combinations for absolute volumes of GM (p < .05), WM (p < .05), CSF (p < .05), and TIV (p < .05; Figure 1, Data S1- Table S2). The voxel-based group level comparisons in healthy subjects using a paired t-test (p < .05

| Individual VBM based on MP2 and multispectral variants in focal epilepsy
The best performing smoothing level with single-contrast MP2 VBM as the reference method was found at 14 mm, (AUC = 0.24, Figure 4a, c and Data S1- For 14 mm, we found the optimal T-threshold at 3.3 (Figure 4b).

| Differences among MP2 and its multispectral combinations
We found that the absolute GM volumes were significantly higher in both single-contrast approaches: T1 (55 mL) and MP2 (161 mL inversion recovery, facilitating enhanced visualization of tissue boundaries of interests (Costagli et al., 2014;Mougin et al., 2016).
This effect is likely due to similar longitudinal magnetizations of GM and WM but with opposite polarities (Pitiot, Totman, & Gowland, 2007). An underlying biological reason could be the dependency of local T1 values on the density of myelin (Stuber et al., 2014

| Smoothing and statistical cutoffs
It has been previously shown that the selection of smoothing (12 mm with T1 as reference) as well as statistical threshold (t-score = 3.7) affected the detection rates in an MRI-negative cohort. In the current study, a smoothing of 14 mm at a liberal T-threshold of 3.3 gave the optimal trade-off between specificity and concordant rate for the MP2 contrast. This is similar to our previous results, where an increased smoothing level is compensated by a decreased T-score. The worst performance was found at 4 mm smoothing, reflected by minimal AUC across all models. This is also in line with previous studies on single patient comparisons, which show that reducing kernel size to 4 or 8 mm reduces experimental design robustness and makes results prone to more false positives (Salmond et al., 2002). In the study by Salmond et al. (2002), 12 mm smoothing was suggested for single patient comparisons. This is close to our obtained smoothing of 14 mm, and the difference in performance from 12 to 14 mm in our study is only 0.01 in AUC for MP2. Therefore, a smoothing of 12 mm could also have been considered, but at a higher statistical cutoff of 4.2 to yield comparable results. As a further validation step for the obtained parameters, we found the expected VBM sensitivity for patients with visible lesions within 60-100%, which is in line with previous studies based on lesional cohorts (Martin et al., 2015). Furthermore, our results can provide guidance for maximizing the performance (concordant rate or specificity) of VBM models at a range of smoothing levels and statistical cutoffs.

| Visual interpretation of VBM findings
It can be expected that visual interpretation improves the specificity by eliminating false positive findings through expert knowledge.
As predicted, specificity across models was higher after visual inspection, but this also resulted in a decreased concordant rate, which is in line with previous studies ( had a relevant high number of discordant findings, which reduced after the visual analysis. However, it should be noted that the clinical hypothesis was based on noninvasive data in most patients, which is limited by propagation of seizure activity (Alarcon et al., 2001; Spencer et al., 1985). Hence, the discordant findings may still hold clinical significance, though this cannot be resolved at this point. It is also possible that some of these patients have multifocal lesions.

| Diagnostic significance of MP2 and multispectral MP2 combinations
In a recent qualitative assessment in lesional epilepsy cases at 7 T, epileptogenic characteristics (cortical thickening, cortical-subcortical atrophy, and blurred GM-WM junction phenomena) were well appreciated on MP2RAGE images (6/7 cases = visual sensitivity 85.7%; Pittau et al., 2018). This is similar to our study, where MP2 (80%) and MP2 VBM variants showed a sensitivity between 80 and 100% in the lesional cohort (n cases = 5). One such example case is presented in Figure 5, for a patient with histopathologically proven FCD Type IIb in the right frontal lobe. The patient underwent surgery and has been seizure free for the last 2.5 years. All models segment the affected area as GM, likely due to the isointensity with normal cortex. In this lesional/MRI-positive case, MP2RAGE did not offer any substantial benefit over conventional sequences in terms of VBM lesion detection. One case from the MRI-negative cohort is represented in Figure 6. Seizure onset was presumed in the right temporal lobe, supported by noninvasive EEG recordings and neuropsychological evaluations. Though an intracranial EEG was indicated, it has not been pursued to date. In this case, MP2 + INV1 alone revealed a gray matter increase in the area around the right amygdala, which is concordant with the clinical hypothesis.

| Technical and diagnostic considerations
The approach and results demonstrated in our study should be assessed carefully, keeping in mind the technical and diagnostic considerations.
A better separation of tissues of interest, that is, GM-WM tissue classes, was achieved using the MP2 + INV1 multispectral segmentation. As the TI increases, the "dark rim" of hypointense voxels will shift toward the GM-CSF border (Pitiot et al., 2007). In our study, the inversion time of INV1 was 700 ms, which results in the "dark rim" appearing at the GM-WM border. For epileptogenic lesions that are often characterized by GM-WM junction smearing/expansion, such a contrast enhancement seems to aid lesion detection. One might also speculate that a thinner cortical segmentation might also be sensitive toward lesions with a blurred GM-WM junction.
The hypointense GM-WM boundary rim has been observed at inversion times between 200 and 1,000 ms with different sequence parameter optimizations at variable field strengths of 3 and 7 T (Costagli et al., 2014;Marques et al., 2010;Mougin et al., 2016;Pitiot et al., 2007)