Diffusion Histology Imaging to Improve Lesion Detection and Classification in Multiple Sclerosis

Background: Diagnosing MS through magnetic resonance imaging (MRI) requires extensive clinical experience and tedious work. Furthermore, MRI-indicated MS lesion locations rarely align with the patients’ symptoms and often contradict with pathology studies. Our lab has developed and modified a novel diffusion basis spectrum imaging (DBSI) technique to address the shortcomings of MRI-based MS diagnoses. Although primary DBSI metrics have been demonstrated to be associated with axonal injury/loss, demyelination and inflammation, a more detailed analysis using multiple DBSI-structural metrics to improve the accuracy of MS lesion detection and differentiation is still needed. Here we report that Diffusion Histology Imaging (DHI), an improved approach that combines a deep neural network (DNN) algorithm with improved DBSI analyses, accurately detected and classified various MS lesion types. Methods: Thirty-eight multiple sclerosis patients were scanned with T2-weighted imaging (T2WI) using fluid attenuated inversion recovery (FLAIR), T1-weighted imaging (T1WI) using magnetization-prepared rapid acquisition with gradient echo (MPRAGE), magnetization transfer contrast (MTR) imaging and diffusion-weighted imaging. The imaging results identified 43,261 voxels from 91 persistent black hole (PBH) lesions, 89 persistent gray hole (PGH) lesions, 16 acute gray hole (AGH) lesions, 189 non-black hole (NBH) lesions and 113 normal-appearing white matter (NAWM) areas. Data extracted from these lesions were randomly split into training, validation, and testing groups with an 8:1:1 ratio. The DNN was constructed with 10 fullyconnected hidden layers using TensorFlow 2.0 in Python. Batch normalization and dropout regularization were used for model optimization. Results: Each MS lesion type had unique DBSI derived diffusion metric profiles. Based on these DBSI diffusion metric profiles, DHI achieved a 93.6% overall concordance with neurologist . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. not certified by peer review) (which was The copyright holder for this preprint this version posted October 22, 2019. ; https://doi.org/10.1101/19009126 doi: medRxiv preprint


Introduction
Multiple sclerosis (MS) is a common inflammatory CNS disorder that affects around 900,000 people in the United States 1 . MS usually begins with patients having intermittent "attacks" (i.e. While standard cMRI is sensitive in detecting various MS lesions, it lacks pathological specificity and requires extensive clinical experience and large workload 6 . Accurate identification of MS lesions using cMRI becomes extremely difficult due to lesion variability in location, size and shape; the anatomical variability between subject further complicates diagnoses 7 . In addition, the association of MS lesions with clinical presentation and pathologies may come in discordance.

MS Lesion Identification
All the lesions were identified by an experienced neurologist with 20 years of clinical experience. Amira 6.0.1 visualization and analysis software (FEI, Hillsboro, OR) was used to quantify intensity values for each hypointense lesion on all scans. Hypointense areas of white matter (WM) on T1WI were defined as "black holes" (BH). Less hypointense BHs were defined as "gray holes" (GH).
Note that the lesion intensity assessment requires establishing the baseline intensity of each scan to account for scan-to-scan variations.
The BH and GH that have persisted for at least 12 months are markers of focal tissue injury in MS and are known as "persistent black holes" (PBH) and "persistent gray holes" (PGH), respectively 12 . BHs and GHs that have persisted for less than 12 months are markers for inflammation and are known as "acute black holes" (ABH) and "acute gray holes" (AGH), respectively. Other T2WI hyperintense non-black/gray hole lesions were defined as non-black hole (NBH) T2W lesions. ABH were excluded in this study because the number of identified ABHs were insufficient for model training.

Diffusion Basis Spectrum Imaging
DBSI metric maps were estimated on pre-processed DW images using an in-house software developed using MATLAB ® (Mathwork, MA). DBSI models the diffusion-weighted MRI signals as a linear combination of multiple tensors describing both the discrete anisotropic axonal fibers . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. not certified by peer review) (which was The copyright holder for this preprint this version posted October 22, 2019. ; https://doi.org/10.1101/19009126 doi: medRxiv preprint and an isotropic diffusion spectrum encompassing the full range of diffusivities 9 , as seen in Eq. [1]. can also be defined as apparent axonal density in WM. DBSI-derived AD and RD retain the pathological specificity for axonal injury and demyelination as previously proposed models but contain fewer confounds from coexisting MS pathologies. The DBSI-derived "restricted" isotropic diffusion fraction (ADC ≤ 0.3 µm 2 /ms) has been shown to reflect cellularity 9 . Cellular and axonal packing plays a crucial role in extracellular and extra-axonal diffusion characteristics. Less restricted or non-restricted isotropic diffusion components represent water molecules in less densely packed environments, such as areas of tissue disintegration or edema, or non-restricted water within cerebrospinal fluid (CSF) 9,13,14 .

Image Processing
. CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. not certified by peer review) (which was The copyright holder for this preprint this version posted October 22, 2019. ; https://doi.org/10.1101/19009126 doi: medRxiv preprint Whole-brain voxel-wise DTI and DBSI analyses were performed by an in-house software developed using MATLAB ® . To control for scan-to-scan variation within individual scans, cerebral spinal fluid (CSF), which is unaffected by MS pathologies, was used as the baseline for individual scans and to asses signal intensities of MS lesions under T1WI and T2WI modalities.
Around one hundred representative CSF voxels in the center of the ventricle were selected to ensure the exclusion of any voxels containing choroid plexus or partial volume effect from the ventricular edges. The median intensity was defined as "CSF Intensity" for that scan. For each voxel in MS lesions, the voxel intensity was divided by the CSF intensity to normalize b0, T1WI, and T2WI intensities.

DNN Model Development and Optimization
Our complete data sample consisted of 43261 imaging voxels from 499 MS lesions coming from 38 patients. The collected voxels were split into training, validation, and testing sets with a ratio of 8:1:1 respectively. We compared four different models, each containing their own set of features. was also adopted to optimize the number of nodes per layer in DNN models. The neural networks were pruned to having between 0 and 200 nodes in each layer in increments of 10 nodes. Pruned neural networks were tested against validation accuracy over 150 epochs. Ten sets of 10-fold crossvalidation were performed at unique random states to ensure the statistical relevance of the results.
Batch normalization was performed with a mini-batch size of 200 before feeding data to the next hidden layer to improve model optimization and prevent overfitting. Exponential linear units (ELU) were used to activate specific functions in each hidden layer. The final layer was a fully connected softmax layer that produces a likelihood distribution over the five output classes. The network was trained with random initialization of the weights as described in He et al 16

Statistical Analysis
Confusion matrices were calculated and used to illustrate the specific examples of MS lesion classes where the DNN prediction contradicts the neurologist's diagnoses. The one-versus-rest strategy was implemented to perform ROC analysis and area under curve (AUC) was calculated to assess model discrimination of each lesion type. Sensitivity and specificity values were calculated at the optimal cut off points, which were computed by maximizing the differences between true positive rates and false positive rates. The precision-recall curve was calculated to show the relationship between precision (positive predictive value) and recall (sensitivity/true . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. not certified by peer review) AGH lesions were also largely incorrectly predicted to be NBH and NAWM with percentages of 55.1% and 38.3%, respectively. The true prediction rate of AGH was only 3.2%.
The one-versus-rest classification strategy was used to calculate ROC and precision-recall curves to test each DNN model's detection performance on individual lesion/region type. ROC  (Fig. 5A). However, as ROC analysis is insensitive to class imbalance, the ROC AUC values could overestimate the model performances.
Precision-recall curve could effectively address this issue and provide complimentary informaton to ROC. These three models didn't perform as well on precision-recall analyses. For example, the precision-recall AUC values for PGH and AGH in these three models were all lower than 0.650 (Fig. 5B). We used bootstrap method with 1000 itertations to calculate ROC AUC, sensitivity and positive correlations with increased MS-induced disabilities 35,36 . Additionally, several other studies have shown that PBH numbers or volumes correlate with worse clinical test scores 35 . This shows that PBHs have significant relevance to clinical outcomes and disease progression.
Compared to PBHs, PGHs reflect a lower degree of axonal loss. Comparisons of imaging and neuropathology in over 100 MS lesions found with a certain degree of hypointensity was strongly associated with axonal density 37 . MS lesion burden has often been reported as the sum of lesion volumes, but the degree of tissue destruction may vary between lesions 38 . A quantitative method to distinguish the degree of hypointensity in individual MS lesions could improve patient monitoring, allow for better correlations to clinical measures, and be useful as an outcome measure in clinical trials of potential reparative therapies 37 . In contrast with PBH and PGH lesions (which are associated with axonal loss), ABH and AGH lesions are likely caused by inflammation and edema, since most ABH/AGH lesions resolve within months of contrast resolution 22 . Although these lesion types reveal different pathological characteristics, they all show hypointensity in precontrast T1W images. Clinically it is challenging to distinguish these lesions solely by expert raters consistently. In contrast, our DNN based tool could automatically and accurately predict different lesion types, showing great promise to effectively and efficiently assist clinicians.
MS lesion identification and segmentation based on manual tracings of structural boundaries and signal intensity remained the most commonly used MS diagnostic method. This method not only requires experienced radiologists or neurologists, but is also time-consuming, expensive, and subject to operator bias, which often causes considerable inter-and intra-rater variability 39 . In addition, it is difficult for a human expert to combine information from various slices and channels when multispectral MRI data are examined 40 . Computer-aided analysis can assist in identifying brain structures and lesions, extract quantitative and qualitative properties and . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. not certified by peer review)  44 have been explored to segment MS lesions from normal brain regions.
However, classifications for various types of MS lesions using deep neural networks have not been reported. Therefore, we constructed a DNN model employing DBSI metrics that showed high capability in distinguishing different MS lesion types.
There are some limitations of this study. First, the subject number is relatively small with only 38 MS patients included in our study cohort. However, a total of 43,261 imaging voxels from 499 MS lesions were identified by expert neurologist from three subtypes of MS patients, which could effectively remedy the small cohort size of the study. Additionally, DBSI derived metrics were modeled and calculated on a single voxel basis, which would address the heterogeneity of MS lesions. Second, the data distribution was imbalanced among different lesion\region types, which could compromise the performance of a DNN model. While the class imbalance is a common issue in most clinical studies, DBSI provided pathology-specific features that significantly improve the overall performance for the DNN model.

Conclusions
DHI significantly improves the accuracy of MS detection and classification for five representative MS lesion types, especially when the overall prediction accuracy is 91.2%, which significantly outperformed other models. The improved DHI framework could greatly assist neurologists and neuroradiologists in making clinical decisions. The efficiency and effectiveness of this DNN . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. not certified by peer review) (which was The copyright holder for this preprint this version posted October 22, 2019. ; https://doi.org/10.1101/19009126 doi: medRxiv preprint model demonstrates great promise for clinical application and the potential to become a marker of lesion severity and possibly lesion recovery for clinical trials and in practice. Additional longitudinal studies with larger cohorts, different scanners, and multiple centers are imperative for further explore the possibilities of applying DHI on a broader scope; this DHI may aid in assesing therapeutic efficacy with regards to axonal degeneration in trials. . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.  . CC-BY-NC-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. not certified by peer review)

Patient/Lesion Characteristics
No.