Grey matter atrophy in patients with benign multiple sclerosis

Abstract

Background
Brain atrophy appears during the progression of multiple sclerosis (MS) and is associated with the disability caused by the disease.

Methods
We investigated global and regional grey matter (GM) and white matter (WM) volumes, WM lesion load, and corpus callosum index (CCI), in benign relapsing‐remitting MS (BRRMS, n = 35) with and without any treatment and compared those to aggressive relapsing‐remitting MS (ARRMS, n = 46). Structures were analyzed by using an automated MRI quantification tool (cNeuro®).

Results
The total brain and cerebral WM volumes were larger in BRRMS than in ARRMS (p = .014, p = .017 respectively). In BRRMS, total brain volumes, regional GM volumes, and CCI were found similar whether or not disease‐modifying treatment (DMT) was used. The total (p = .033), as well as subcortical (p = .046) and deep WM (p = .041) lesion load volumes were larger in BRRMS patients without DMT. Cortical GM volumes did not differ between BRRMS and ARRMS, but the volumes of total brain tissue (p = .014) and thalami (p = .003) were larger in patients with BRRMS compared to ARRMS. A positive correlation was found between CCI and whole‐brain volume in both BRRMS (r = .73, p < .001) and ARRMS (r = .80, p < .01).

Conclusions
Thalamic volume is the most prominent measure to differentiate BRRMS and ARRMS. Validation of automated quantification of CCI provides an additional applicable MRI biomarker to detect brain atrophy in MS.

atrophy (Coles et al., 2017; De Stefano et al., 2016; Gaetano et al., 2018; Yousuf et al., 2017), while most other DMTs have shown only minimal or controversial results. MRI examinations have traditionally focused on detecting and characterizing WM lesions to follow disease activity and treatment outcome.
Whole-brain and regional grey matter (GM) atrophy measurements have recently become an essential part of the imaging domain in MS, indicating that the degenerative neuroaxonal component plays a significant role in the irreversible physical and cognitive disability in MS (Bjartmar et al., 2003). Reducing the rate of brain atrophy is therefore also an essential target of MS treatments to minimize permanent disability. GM volume loss results from a slowly ongoing neurodegenerative process. Brain atrophy occurs in normal aging at a rate of 0.1-0.3% per year, but in MS the annual rate is higher, 0.5-1.3%, at all stages of the disease (De Stefano et al., 2014; Giorgio et al., 2010). Pronounced GM atrophy can be seen already at the early stages of relapse-onset MS (Bergsland et al., 2012; Calabrese et al., 2007) and primary-progressive MS (PPMS) (Sastre-Garriga et al., 2004).
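To make the annual rates quoted above more tangible, the following sketch compounds a yearly percentage loss over several years. It assumes, purely for illustration, that the rate stays constant and compounds once per year; the function name `remaining_fraction` is hypothetical and is not part of the study.

```python
def remaining_fraction(annual_rate_percent: float, years: int) -> float:
    """Fraction of the initial brain volume left after `years`.

    Assumption (not from the study): the annual loss is constant
    and compounds once per year.
    """
    if annual_rate_percent < 0 or years < 0:
        raise ValueError("rate and years must be non-negative")
    return (1.0 - annual_rate_percent / 100.0) ** years


if __name__ == "__main__":
    # Annual rates quoted in the text: 0.1-0.3% (normal aging), 0.5-1.3% (MS).
    scenarios = [("aging, low", 0.1), ("aging, high", 0.3),
                 ("MS, low", 0.5), ("MS, high", 1.3)]
    for label, rate in scenarios:
        lost = 100.0 * (1.0 - remaining_fraction(rate, 10))
        print(f"{label}: about {lost:.1f}% of volume lost over 10 years")
```

Under this simplifying assumption, the gap between the aging and MS ranges widens with every additional year, which is why even small differences in annual rate matter over a long disease course.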
In addition to global GM atrophy, regional deep GM atrophy has been associated with the evolution of definite MS and disability progression in early relapsing-remitting MS (RRMS), and with the evolution of PPMS (Mesaros et al., 2011; Zivadinov et al., 2013). Thalamic atrophy in particular seems to appear early in MS and is associated with cognitive decline (Houtchens et al., 2007; Schoonheim et al., 2015).
The thalamus is a vital relay nucleus with cortical and subcortical connections and, thus, a critical location in MS. MRI studies have strengthened the previous histopathologic findings of axonal disconnection in major thalamic tracts and thalamic demyelinating lesions (Cifelli et al., 2002; Harrison et al., 2015). Thalamic volume decline has been reported to be present consistently across MS subtypes and throughout the disease, correlating with whole-brain atrophy (Azevedo et al., 2018).
The corpus callosum (CC) contains millions of axons, which are mainly myelinated. It is significantly affected by focal demyelination and Wallerian degeneration in the pathogenesis of MS (Evangelou et al., 2000).
Corpus callosum atrophy is associated with the level of disability in MS and correlates with other measures of brain atrophy and GM atrophy (Klawiter et al., 2015; Vaneckova et al., 2012; Yaldizli et al., 2010). At the same time, the CC is resistant to age-related changes in healthy individuals (Pozzilli et al., 1994; Sullivan et al., 2001). As a sharply demarcated WM structure, the CC can be readily identified in conventional MRI.
Although different visual rating scales are used in clinical work to quantify brain atrophy, they are relatively coarse and subject to interrater variability. Automated quantification tools providing brain and lesion load volumetry may help to evaluate the prognosis and activity of MS and to monitor drug responses.
The clinical course of MS is variable. A proportion of MS patients show minimal disability even decades after the onset of MS symptoms, and this entity of so-called benign MS has been debated since the 1950s (Ramsaransing & De Keyser, 2006). There are no clinical prognostic markers to predict the benign course of MS. Paradoxically, a proportion of patients with a clinically benign MS phenotype have a large WM T2 lesion load (Strasser-Fuchs et al., 2008). Greater brain volume loss has also been reported in benign MS patients compared to healthy subjects. The reduction of brain volume in benign MS has even been comparable to that in secondary-progressive MS (SPMS) when the course of the disease has been long. Reduction of thalamic volume and of GM volumes in subcortical and frontoparietal regions (Mesaros et al., 2008) in benign MS compared to healthy controls has been reported, but no studies have validated the CC as an atrophy marker in benign MS. The effect of DMT on brain atrophy in benign MS has not been studied.
In this study, we used an automated MRI quantification tool (cNeuro®) to evaluate global and regional GM volumes and WM lesion load in benign MS. Also, the CCI was evaluated as a marker of brain atrophy.
A patient was classified to have BRRMS when the Expanded Disability Status Scale (EDSS) score was ≤ 3 and disease duration ≥ 10 years, a commonly used definition for benign MS (Glad et al., 2010
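The threshold rule stated above can be written as a simple check. This is only a minimal sketch of the two stated criteria; the function name `is_benign_rrms` and its parameter names are hypothetical, and the criteria for the aggressive group are not given in this passage and are therefore not modelled.

```python
def is_benign_rrms(edss: float, disease_duration_years: float) -> bool:
    """Return True when both criteria quoted in the text are met:
    EDSS score <= 3 and disease duration >= 10 years."""
    return edss <= 3.0 and disease_duration_years >= 10.0


if __name__ == "__main__":
    # Illustrative inputs only, not patient data.
    print(is_benign_rrms(edss=2.0, disease_duration_years=18.0))
    print(is_benign_rrms(edss=4.5, disease_duration_years=18.0))
```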

MRI acquisition and analysis
A set of 328 different volumetry and voxel-based morphometry imaging biomarkers was extracted from T1-weighted and FLAIR images using the cNeuro® MRI quantification tool (Combinostics Oy, Tampere, Finland) (Lotjonen et al., 2010). Images were segmented into 133 brain regions (102 cortical and 31 subcortical regions) using the multiatlas segmentation method (Hanninen et al., 2019; Koikkalainen et al., 2016; Lotjonen et al., 2010). In this study, results for 27 imaging biomarkers are reported. The WM lesions were segmented as previously described (Koikkalainen et al., 2016; Wang et al., 2012), and the lesion volume is reported globally and regionally for the following brain regions: periventricular, subcortical, deep white matter, pons, and cerebellum (Figure 2). The method uses a state-of-the-art lesion-filling technique, which removes lesions from the images before T1 segmentation.
All the quantified variables were normalized for age, gender, and head size (Buckner et al., 2004; Cole & Green, 1992). The extraction of the CCI was not available in cNeuro®. For the automated computation of the CCI (Goncalves et al., 2018; Yaldizli et al., 2010), six landmarks were first manually located on a mean anatomical template. The T1 image was then registered with the mean anatomical template, first affinely and then nonrigidly, and the landmarks were propagated accordingly to the T1 image for the automated computation of the individual CCI (Figure 3).
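As a rough illustration of the final step, the sketch below computes a CCI from six landmark coordinates on a midsagittal slice. It assumes the commonly described definition, in which the anterior, posterior, and middle thicknesses of the corpus callosum are summed and divided by its greatest anteroposterior length. The landmark names, their roles, and the function `corpus_callosum_index` are hypothetical; the text does not specify the exact landmark convention used, and the registration step is not shown.

```python
from math import dist
from typing import Tuple

Point = Tuple[float, float]  # (x, y) in mm on the midsagittal slice


def corpus_callosum_index(a: Point, a_prime: Point,
                          b: Point, b_prime: Point,
                          c: Point, c_prime: Point) -> float:
    """CCI under the assumed definition:
    (anterior + posterior + middle thickness) / anteroposterior length.

    Assumed landmark roles (hypothetical naming):
      a, b             most anterior and most posterior points of the CC
      a-a', b-b', c-c' anterior, posterior, and middle thickness segments
    """
    length = dist(a, b)
    if length == 0:
        raise ValueError("landmarks a and b coincide")
    thickness_sum = dist(a, a_prime) + dist(b, b_prime) + dist(c, c_prime)
    return thickness_sum / length


if __name__ == "__main__":
    # Made-up coordinates for illustration only.
    cci = corpus_callosum_index(a=(0.0, 0.0), a_prime=(10.0, 0.0),
                                b=(70.0, 0.0), b_prime=(62.0, 0.0),
                                c=(35.0, 20.0), c_prime=(35.0, 26.0))
    print(f"CCI = {cci:.3f}")
```

Because the index is a ratio of lengths, it does not change when all coordinates are scaled uniformly.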

Statistical analysis
Statistical analyses were performed with IBM SPSS Statistics for Windows version 24 (IBM Corp, Armonk, NY). Baseline demographics were expressed as means with ranges or frequencies with percentages. Differences between groups were tested with the Mann-Whitney U test and the chi-square test. Volumetric results of the brain MRI segmentation were compared between groups using an analysis of covariance (ANCOVA) model, in which age, disease duration, and Gd-enhancement (with or without Gd) served as adjusting covariates. Means and standard deviations were reported. In addition, the regression coefficients with p-values and standardized Betas were reported to express the effect size of the difference between the study groups.
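The adjusted group comparison described above can be sketched as an ordinary least-squares regression in which the coefficient of a group indicator is the covariate-adjusted difference. The code below is not the SPSS procedure used in the study; it is a minimal pure-Python illustration, and the function names, the 0/1 coding (1 = BRRMS, 0 = ARRMS, 1 = Gd used), and the synthetic demo data are all assumptions. It returns only the coefficient, not p-values or standardized Betas.

```python
from typing import Dict, List, Sequence


def solve_linear_system(a: Sequence[Sequence[float]],
                        b: Sequence[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(design: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Least-squares coefficients via the normal equations (X'X) beta = X'y."""
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def adjusted_group_difference(records: Sequence[Dict[str, float]]) -> float:
    """Group coefficient adjusted for age, disease duration, and Gd use.

    Each record needs the keys: volume, group, age, duration, gd.
    """
    design = [[1.0, r["group"], r["age"], r["duration"], r["gd"]]
              for r in records]
    volumes = [r["volume"] for r in records]
    return ols(design, volumes)[1]


if __name__ == "__main__":
    # Synthetic rows (group, age, duration, gd) built from a known model,
    # so the recovered group effect can be checked; not study data.
    rows = [(1, 50, 18, 0), (1, 45, 12, 1), (1, 60, 25, 0), (1, 38, 11, 1),
            (0, 40, 10, 1), (0, 35, 8, 0), (0, 55, 15, 1), (0, 48, 20, 0)]
    demo = [{"group": g, "age": a, "duration": d, "gd": e,
             "volume": 1000 + 30 * g - 2 * a - d + 5 * e}
            for g, a, d, e in rows]
    print(f"adjusted group difference: {adjusted_group_difference(demo):.2f} ml")
```

Solving the normal equations directly keeps the example dependency-free; for real data a statistics package with a numerically stabler solver and proper inference would be the natural choice.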

Clinical characteristics
Patients in the BRRMS group were older at the time of MRI than those in the ARRMS group (mean age 51.0 years, range 32-70, vs. 43.2 years, range 21-69; p < .001), had a longer disease duration (18.2 vs. 12.6 years, p < .001), and had had fewer relapses (median 4.0 vs. 5.0, p = .004). Onset symptoms did not differ between the BRRMS and ARRMS groups (Table 1). The majority of patients with ARRMS were using fingolimod or natalizumab (n = 25, 54.3%) at the time of MRI examination (Table 1).

Brain imaging was done after the initiation of fingolimod or natalizumab in 41 (89.1%) patients with ARRMS (Figure 1). The time of MRI examination in relation to the initiation of highly effective DMT varied due to the retrospective nature of the study, and there was variation in the MRI protocols: in five patients, applicable MRI scans with 3D T1-w images were available only from the time before the start of fingolimod or natalizumab (Figure 1).

Whole-brain volume, GM and WM volumes and regional GM volumes in BRRMS and ARRMS
Total brain tissue volume was larger in patients with BRRMS (mean 1098.42 ml, SD 52.82) compared to ARRMS (mean 1069.4 ml, SD 60.09) (p = .014). Both the cerebral (mean 369.82 ml, SD 37.76) (p = .017) and cerebellar (mean 22.12 ml, SD 3.58) (p = .015) WM volumes were larger in patients with BRRMS, while cortical GM volumes did not differ between the groups. Thalamic volume was larger in BRRMS (mean 12.94 ml, SD 1.9) than in the ARRMS group (mean 11.82 ml, SD 1.82) (p = .003). Total and regional volumes are given in Table 2.

DMT use, brain volumes and WM lesion volumes in BRRMS
There were no differences between the treated and nontreated BRRMS patients in total brain volumes or in regional GM volumes. CCI did not differ between these subgroups either. The total WM lesion volumes (p = .033), as well as regional WM lesion volumes in the subcortical area (p = .046) and deep white matter (p = .041), were larger in the subgroup of patients without DMT use (Table 3).

Whole-brain and regional volumes, WM lesion volumes and CCI in subgroups of ARRMS compared to BRRMS
Because the time of MRI in relation to the initiation of highly effective DMT varied within the ARRMS group, we performed a subgroup analysis between BRRMS and the three different subgroups of ARRMS given in Figure 1. The volumetric results are given as supplementary data (Table 4). Smaller thalamic volumes and periventricular WM lesion volumes compared to BRRMS were detected in the subgroups of ARRMS scanned before and those scanned for more than 12 months after the initiation of highly effective DMT.

DISCUSSION
In this study, we focused on brain atrophy measures, global and regional GM volumes and WM lesion load, in benign MS. Further, we evaluated CCI as a measure of atrophy not reported earlier in benign MS using an automated MRI quantification tool (cNeuro®).
Total brain volumes, regional GM volumes, and CCI measures were found similar between treated and nontreated BRRMS patients.
Within the BRRMS group, patients who had never been treated with DMT had larger WM lesion volumes, even though they had had slightly fewer relapses than patients treated with DMT. Our results support the assumption that subclinical inflammatory disease activity also occurs in seemingly benign and mild MS. There is thus justification for DMT use also in the benign course of the disease, regardless of the clinical relapse rate (Montalban et al., 2018; Ziemssen et al., 2016; Zivadinov et al., 2016

Note: B = coefficient B in regression analysis for group difference. Difference from ARRMS to BRRMS adjusted with duration of disease and Gd-enhancement. Abbreviations: ARRMS, aggressive MS; BRRMS, benign MS; CCI, corpus callosum index; p = p-value for group difference, adjusted with time from onset symptoms and Gd-enhancement; WM, white matter.

FIGURE 4 Correlation between corpus callosum index (CCI) and total brain volume
FIGURE 5 Correlation between total white matter (WM) lesion volume and total brain volume

with second-line DMT compared to first-line DMT but did not report specifically GM atrophy (Branger et al., 2016). In a 2-year follow-up study, patients treated with fingolimod showed milder GM atrophy than nontreated patients (Yousuf et al., 2017). A longitudinal study with follow-up MRI scanning covering a long enough period would be much more sensitive to differences and give more information on the effect of DMT on brain atrophy in a benign clinical course of MS.
Because benign MS is defined retrospectively, on the basis of the past disease course, prognostic MRI markers of the disease course are worth searching for. So far, there are no established predictors of long-term outcomes. It seems that a significant proportion of patients with benign MS develop cognitive decline, overall disability, and brain atrophy after a long follow-up period (i.e., more than 20 years), even though there are no clinical relapses and neurological signs remain mild (Correale et al., 2012; Hirst et al., 2008; Mesaros et al., 2009; Portaccio et al., 2009; Rovaris et al., 2008; Zivadinov et al., 2016). The extent of brain atrophy in benign MS compared with an age-matched healthy control group has scarcely been investigated; the available studies report reduced subcortical and cortical GM in benign MS patients compared to healthy controls (Mesaros et al., 2008; Rovaris et al., 2008). Reduction of thalamic volume in benign MS compared to healthy subjects has been reported, but it may be a typical characteristic of MS itself, purely reflecting the vulnerability of the thalamus to specific damage in MS pathology. Interestingly, we found that thalamic volume was larger in BRRMS than in ARRMS, contrary to previous findings in a smaller patient study (Ceccarelli et al., 2008).
We used a set of volumetric biomarkers that were extracted from routine MRI examinations. Earlier studies have demonstrated that GM atrophy progresses despite clinically highly effective DMT, such as natalizumab (Koskimaki et al., 2018). Our study was designed to compare two clinically different phenotypes of MS, BRRMS and ARRMS, in a cross-sectional setting, not to evaluate the effect of highly effective DMTs on the rate of brain atrophy in a longitudinal setting. Thus, we included the five patients in the ARRMS group who were scanned before the initiation of fingolimod or natalizumab treatment. In the subgroup analysis, these patients had brain volume patterns similar to those of patients who had been treated with highly effective DMT for at least 1 year at the time of MRI.
To our knowledge, this is the first study using CCI as a parameter in an automated MRI quantification tool in benign MS patients.
Our results are in line with the few previous reports on the negative correlation of CCI and GM atrophy (Klawiter et al., 2015) as well as on CCI in benign MS (Mesaros et al., 2009). Thus, CCI seems to be an easily assessable MRI marker for brain atrophy in MS patients, and applicable in an automated tool, considering that manual CCI analysis is time-consuming and imprecise (Yaldizli et al., 2010). In a recent study with early relapsing MS and secondary progressive MS patients, thalamic atrophy and whole-brain atrophy were identified as possible disease progression markers measured with the same automated MRI quantification method as used in our study (Hanninen et al., 2019).
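The positive correlation between CCI and whole-brain volume reported in this study (for example, r = .73 in BRRMS) is the kind of statistic sketched below. The code assumes Pearson's product-moment coefficient; the text does not state which correlation coefficient was used, and the function name `pearson_r` and the demo values are hypothetical.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]
    sxy = sum(p * q for p, q in zip(dx, dy))
    sxx = sum(p * p for p in dx)
    syy = sum(q * q for q in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("a sample with zero variance has no correlation")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Made-up CCI and brain volume (ml) values for illustration only.
    cci = [0.30, 0.34, 0.36, 0.41]
    volume = [1040.0, 1075.0, 1068.0, 1110.0]
    print(f"r = {pearson_r(cci, volume):.2f}")
```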
An acceleration of volume reduction after initiation of DMTs, also referred to as pseudoatrophy, is associated mainly with natalizumab (Koskimaki et al., 2018; Miller et al., 2007). It is thought to be a consequence of the resolution of inflammation after therapy initiation, probably due to fluid shifts (i.e., resolution of brain edema) and changes in inflammatory cells, and mostly due to white matter volume changes.
However, the exact mechanism is poorly understood. The pseudoatrophy effect does not seem to occur for GM (Prinster et al., 2006). In our study, five patients in the ARRMS group had initiated fingolimod or natalizumab within 1 year before MRI scanning. These patients did not have smaller whole-brain or WM volumes compared to the other patients within the ARRMS group. Because the results for CCI and thalamic volume are similar to those for whole-brain volume, we assume that pseudoatrophy alone does not explain the smaller whole-brain volumes in the entire ARRMS group.
The strengths of our study include detailed clinical characteristics for each patient and thorough EDSS evaluation. The duration of the disease in patients with BRRMS clearly exceeds 10 years, which is a commonly used criterion for benign MS (Glad et al., 2010). The lack of cognitive testing may be counted as a weakness of our study. We used only EDSS as a clinical measure, and it emphasizes motor functions. Some patients with a benign clinical phenotype and minimal motor disability suffer from notable cognitive decline and depression, which should be taken into account in the overall disability (Correale et al., 2012; Gonzalez-Rosa et al., 2006; Mesaros et al., 2009).
Due to the retrospective nature of the study, the imaging protocols, scanners, and voxel sizes were variable. This might have had some impact on the imaging results, especially on the cortical GM measures rather than on other volume measures. Both 1.5T and 3T images were analyzed, with a predominance of 3T images in ARRMS. There is a possibility of bias due to this imbalance of scanners, but the normalization of the structures and previous studies with the same algorithm suggest that this bias has not affected the results significantly (Koikkalainen et al., 2016; Lotjonen et al., 2010). On the other hand, in a large sample, even small differences become significant when a single scanner and sequence with a single defined voxel size are used. In previous studies with the FreeSurfer structural tool, the use of multiple different MRI scanners and pulse sequences did not appear to have a significant effect on cortical thickness measurements (Govindarajan et al., 2014; Potvin et al., 2017). The test-retest difference of the cNeuro® MRI quantification tool between different scanners is two- to threefold that obtained with a single scanner, which is comparable to other methods. Also, voxel size variation does not seem to affect the results of the cNeuro® MRI quantification tool. Nevertheless, we consider that our results are logical and suggest that the methodology is quite robust (Kaipainen et al., 2021).
Another weakness of our study is its reliance on a single MRI time point and thus the lack of longitudinal analysis. Longitudinal volumetric analysis requires that a specific MRI scanning protocol is repeated with the same scanner, which was impossible to achieve in real-life retrospective data. Also, in this study, we did not have a healthy age-matched control group. Almost half of the 3D T1-w sequences were done with Gd-enhancement, which was taken into account in the analysis. We were not able to combine information about Gd-enhancing lesions in this volumetric study. However, we consider that this does not confound the interpretation of our results. We excluded patients who had had a clinical relapse or cortisone treatment within 1 month before MRI scanning to avoid the possible effect of evident inflammation.

CONCLUSIONS
We conclude that thalamic volume was the most prominent GM measure to differentiate BRRMS and ARRMS. Patients with BRRMS had larger whole-brain and thalamic volumes than patients with an aggressive disease course. CCI has been suggested as a marker of brain atrophy, and we conclude that an automatically quantified CCI seems to be an accessible and applicable MRI marker of brain atrophy, to be used in combination with other measures, such as whole-brain volume and thalamic volume.

This work was supported by research grants from The Finnish Cultural Foundation and Kuopio University Hospital.

CONFLICT OF INTEREST
JK and JL are employees and shareholders in Combinostics. JL has given educational presentations for Merck and Sanofi, paid to his institution. Other authors have no conflicts of interest.

DATA AVAILABILITY STATEMENT
Data available on request from the authors.