Multivariate white matter alterations are associated with epilepsy duration

Previous studies investigating associations between white matter alterations and duration of temporal lobe epilepsy (TLE) have shown differing results, and were typically limited to univariate analyses of tracts in isolation. In this study we apply a multivariate measure (the Mahalanobis distance), to capture the distinct ways white matter may differ in individual patients, and relate this to epilepsy duration. Diffusion MRI, from a cohort of 94 subjects (28 healthy controls, 33 left-TLE and 33 right-TLE), was used to assess associations between tract fractional anisotropy (FA) and epilepsy duration. Using ten white matter tracts, we analysed associations using traditional univariate analyses (z-scores) and a complementary multivariate approach (Mahalanobis distance), incorporating multiple white matter tracts into a single unified analysis. In patients with right-TLE, FA was not significantly associated with epilepsy duration for any tract studied in isolation. In patients with left-TLE, the FA of two limbic tracts (ipsilateral fornix, contralateral cingulum gyrus) was significantly negatively associated with epilepsy duration (Bonferonni corrected p<0.05). Using a multivariate approach we found significant ipsilateral positive associations with duration in both left, and right-TLE cohorts (left-TLE: Spearman's rho=0.487, right-TLE: Spearman's rho=0.422). Extrapolating our multivariate results to duration equals zero (i.e. at onset) we found no significant difference between patients and controls. Associations using the multivariate approach were more robust than univariate methods. The multivariate distance measure provides non-overlapping and more robust results than traditional univariate analyses. Future studies should consider adopting both frameworks into their analysis in order to ascertain a more complete understanding of epilepsy progression, regardless of laterality.


Introduction
Epilepsy affects over 50 million people worldwide, with around 60% of patients presenting with focal seizures, most commonly being of temporal lobe origin (Téllez-Zenteno and Hernández-Ronquillo 2012). Temporal lobe epilepsy (TLE) can have varying aetiologies, laterality, and onset at different ages. This heterogeneity makes it challenging to study the onset and progression of TLE. Evidence that epilepsy may be associated with progressive cerebral damage has been reported in experimental and human longitudinal studies (Pitkänen and Sutula 2002;R. S. N. Liu et al. 2003;Galovic et al. 2019). An improved understanding of the progressive nature of epilepsy would be beneficial as this may assist in measuring where a patient may be in their disease progression, and help identify early onset pre-symptomatic biomarkers of epilepsy risk.
Whilst longitudinal data have advantages, these are difficult to obtain. Cross-sectional data can infer epilepsy progression at a group level by analysing associations between neuroimaging properties and duration of epilepsy. Indeed, several studies have investigated the relationship between grey matter properties and duration (Tasch et al. 1999;Keller et al. 2002;Seidenberg et al. 2005;Bonilha et al. 2006;Bernhardt et al. 2009;Whelan et al. 2018). In contrast, the relationship between subcortical limbic white matter and epilepsy duration is less well understood, with only a handful of studies reporting partially conflicting results (Table 1).
In a multi-modal analysis investigating the inter-relationships between measures of grey matter volume, and white matter FA, Keller et al. (2012) analysed associations with epilepsy duration in a cohort of patients with TLE and hippocampal sclerosis. Widespread associations between duration and fractional anisotropy (FA) beyond the effects of natural aging were reported. Significant correlations were found in eight white matter structures located both ipsilateral and contralateral to the epileptogenic zone, and in remote tracts beyond the temporal lobe. Investigating differences based on patient laterality, Chiang et al. (2016) correlated FA reductions in each tract with epilepsy duration. For patients with left-TLE there were no significant correlations with duration. However, for patients with right-TLE, significant correlations were identified in the ipsilateral hippocampus and ipsilateral external capsule prior to multiple comparison corrections.
Additional studies investigating the associations between white matter properties and epilepsy duration are listed in Table 1, with varying results. In the uncinate fasciculus, for example, there was evidence of significant correlations between the FA reduction and epilepsy duration in some, (Kemmotsu et al. 2001;Kreilkamp et al. 2017;Tsuda et al. 2018;Hatton et al. 2020) and no significant correlations in others (Lin et al. 2008;Chiang et al. 2016;Kreilkamp et al. 2019). Each study differs in the selection of tracts analysed and the method used for reconstruction. However, all studies use a univariate framework for analysis, correlating epilepsy duration with each individual tract independently. ROI (manual) Hipp Regression analysis No significant correlations (Lin et al. 2008) 10 HC   (Arfanakis et al. 2002;Gross, Concha, and Beaulieu 2006;Govindan et al. 2008;Andrade et al. 2014;Slinger et al. 2016;Park et al. 2018;Ashraf-Ganjouei et al. 2019). These studies are not included in Table 1 as they do not focus specifically on FA of limbic system tracts in adult TLE patients.
Univariate analyses have the advantage of being clear, interpretable and simple to implement. There are, however, limitations. First, univariate analyses are susceptible to outliers within a dataset which increase the probability of inconsistent results between different studies. Secondly, multiple comparison corrections are required when analysing multiple white matter tracts in isolation, to mitigate the chance of a false positive (Type 1 error). However, this correction has the effect of inflating the false negative rate (Type 2 error) leading to the increased probability of overlooking genuine relationships. Thirdly, univariate analyses do not account for the natural covariance between tracts in individuals (Wahl et at. 2010;Westlye et al. 2010;Cox et al. 2016), nor spatial colocalisation of tract segments. Accounting for this covariation is important because if multiple tracts are affected by the same process then a univariate approach does not correct for this in the statistical analysis, and can lead to erroneous conclusions (Wang et al. 2020). Furthermore, if different tracts are affected in different patients then the overall effect for each individual tract will be less than if using a multivariate approach which accounts for this .
In this study we use a multivariate measure -the Mahalanobis distance -complementing the univariate approach, by analysing the associations between white matter FA and duration of epilepsy using numerous white matter tracts simultaneously. We hypothesised that patients with a longer epilepsy duration would be associated with greater abnormalities ipsilateral to the epileptic focus. This approach has been fruitful in studies of autism and traumatic brain injury (Dean et al. 2017;Taylor et al. 2020). Applications of the Mahalanobis distance include analysing individual tracts by integrating multiple diffusion metrics into a single measure, or by pooling numerous metrics from a number of different modalities. In our study we analyse a cohort of subjects using a single diffusion metric (FA), combining multiple white matter tracts to create patient specific measures of hemispheric distance from healthy control subjects. First, (A) using z-scores derived from individual tracts, and second, (B) using Mahalanobis distances derived from all ipsilateral tracts and all contralateral tracts separately. Spearman correlations are corrected for multiple comparisons using Bonferroni corrections and assessed for significance. Finally, robustness of the results are ascertained (C). Subsamples of the patient data are selected N times and used to calculate N correlations. Proportion of samples achieving significance are reported as a measure of consistency. Note that the univariate approach results in five consistency values per hemisphere (one per tract), whereas only one consistency values is produced per hemisphere for the multivariate approach.

Patients
Our cohort consists of 28 healthy controls, and 66 individuals with unilateral TLE (33 left and 33 right). The individuals with TLE were recruited from the National Hospital for Neurology and Neurosurgery epilepsy surgery programme, with diagnoses made by consultant neurologists specialising in epilepsy on the basis of clinical history, seizure semiology, and prolonged video-EEG telemetry with ictal and interictal EEG, high resolution MRI, neuropsychological and neuropsychiatric assessments. Where applicable, " tests were performed to identify group differences in categorical variables: sex, surgery outcome. Two-tailed t-tests were conducted to check for group differences in age, age at epilepsy onset, and epilepsy duration after correspondence to normality was identified using Lilliefors tests. Epilepsy duration was estimated by subtracting the seizure onset age from the age at diffusion imaging scan. Cohort demographics and results of the statistical tests are summarised in Table 2.

Diffusion MRI acquisition
All subjects underwent diffusion weighted MRI acquisition on the same scanner, 3T GE Signa Excite HDx, as described previously (Winston et al. 2013;Taylor et al. 2018;Sinha et al. 2019). Diffusion MRI data were acquired using a cardiac-triggered single-shot spin-echo planar imaging sequence (Wheeler-Kingshott et al. 2002) with echo time = 73 ms. Sets of 60 contiguous 2.4 mmthick axial slices were obtained covering the whole brain, with diffusion sensitising gradients applied in each of 52 noncollinear directions (b value of 1,200 mm2 s−1 [δ = 21 ms, Δ = 29 ms, using a full gradient strength of 40 mT m−1]) along with 6 non-diffusion weighted scans. The gradient directions were calculated and ordered as described elsewhere (Cook et al. 2007). The field of view was 24 cm, and the acquisition matrix size was 96 × 96, zero filled to 128 × 128 during reconstruction, giving a reconstructed voxel size of 1.875 × 1.875 × 2.4 mm. The DTI acquisition time for a total of 3480 image slices was approximately 25 min (depending on subject heart rate).

Image preprocessing
Diffusion images were initially corrected for signal drift (Vos et al. 2017), followed by eddy correction using the FSL tool 'eddy_correct' (Andersson and Sotiropoulos 2016), and rotation of the b vectors using the tool 'fdt_rotate_bvecs' (Jenkinson et al. 2012;Leemans and Jones 2009). Reconstruction and registration were performed with DSI-Studio (http://dsi-studio.labsolver.org) using a Q-space diffeomorphic reconstruction (QSDR) (Yeh and Tseng 2011) with an unweighted diffusion sampling length ratio of 1.25. Diffusion maps were registered to standard space using the HCP1021 template and white matter volumetric regions of interest (ROI) were derived using atlases. The bilateral anterior thalamic radiation (ATR), cingulum in the cingulate cortex area (cingulum gyrus: CG), cingulum in the hippocampal area (cingulum hippocampus: CH), and uncinate fasciculus (UF) were defined using the JHU atlas (Hua et al. 2008). The structure of the bilateral fornix (F) was defined using the HCP842_tractography atlas native to DSI-Studio. See figure 2 for visual representation of tracts analysed. These structures were chosen as per their description by Catani, Dell'Acqua, and Thiebaut de Schotten (2013). Mean FA values for each white matter region were extracted from the ten white matter structures and analysed using R (https://www.r-project.org/).

Figure 2 All white matter tracts reconstructed in DSI-Studio. Colours correspond to each pair of homologous tracts. The
anterior thalamic radiation (ATR; blue), cingulum in the cingulate cortex area (CG; green), cingulum in the hippocampal area (CH; orange), and uncinate fasciculus (UF; pink) were reconstructed using the JHU white matter atlas. The fornix (F; red) was reconstructed using the HCP842_tractography atlas native to DSI-Studio.

Statistical correction of covariates
The effects of sex and healthy aging in each tract were removed using a robust linear model (rlm function from MASS package in R (Venables and Ripley 2002)). In place of fitting the model using an ordinary least squares (OLS) estimator, the rlm function uses an M-estimator with Huber weightings. Standard model fitting using OLS estimators are susceptible to outliers as all data points are assigned a weight of 1. The M-estimator using Huber weightings overcomes this limitation by assigning weights of 1 to residuals of small magnitude and progressively smaller weights to residuals of increasing magnitude. Iterated reweighted least squares is used to solve the estimator as calculation of the weights requires the residuals and calculation of the residuals requires the weights (Fox 2016). The remaining residuals ( 5 ) are used throughout the paper to calculate the univariate and multivariate distances.

Univariate analysis
We quantified the univariate distance of each subject from the control distribution. Robust z-score values were calculated to mitigate the effects of outliers in the control data. First, we selected a subject and a random subset of 24 controls from the total control population (n=28). If the selected subject was a healthy control, we removed them from the population prior to random subset selection, to mitigate the bias of that control subject when calculating their distance. A subset of size 24 was selected as this is the maximum size in which at least 1000 unique sub-samples can be inferred from the population of (n-1) controls. A z-score distance is calculated for the subject by subtracting the mean of the sample control distribution from the subject FA residual and scaling by the sample distribution standard deviation. For each subject, the process is repeated 1000 times to create a distribution of z-score distances. Finally, median z-scores for each subject are reported, providing a robust, single valued measure of univariate distance from the control distribution.

Multivariate analysis
Following the univariate analysis, we calculated subject specific distances from the control distribution, using multiple white matter tracts simultaneously. To achieve this, we used the Mahalanobis distance. An extension to the multivariate z-score distances (Euclidean distance), the Mahalanobis distance is a measure of distance from a reference distribution in multiple dimensions whilst accounting for the covariance structure. Penalising data points that fail to adhere to the natural structure, two subjects with similar z-scores in each individual dimension can have vastly different Mahalanobis distances (Supplementary figure S1, and , fig. 1).
We derived the Mahalanobis distance from the population of healthy controls. Distances are calculated using equation 1, where x represents a vector of subject FA residual values, µ the average tract FA residual values calculated from the healthy controls, and C a matrix representing the natural covariance structure exhibited in the healthy control population.
The Mahalanobis distance assumes normality in the reference distribution. Therefore, univariate assessments of normality in the control subject distribution were conducted using the Lilliefors test (R package; nortest (Juergen Gross and Uwe Ligges 2015)) and multivariate assessments were conducted using a Mardia test (R package; MVN (Korkmaz, Goksuluk, and Zararsiz 2014)). No significant p-values were found after Bonferroni correction, suggesting a good correspondence to normality.
Robust measures of multivariate distances for each subject were calculated by taking 1000 random subsamples of size (24) from the control data (28), calculating a Mahalanobis distance for each sample and reporting the median value. Non-linear shrinkage estimators of the covariance matrix were used in place of the sample covariance matrix to minimise the estimation error of the inverted matrix (Ledoit and Wolf, n.d.). This technique was applied previously by .
Two Mahalanobis distances are calculated per subject. One measure unifies all left hemisphere tracts into a single value and the other unifies all right hemisphere tracts. We interpret these multivariate distances as the overall abnormality associated with TLE in each hemisphere. For both patient groups, we hypothesised a positive association with duration in the ipsilateral hemisphere, i.e. larger distances relate to longer duration.

Associations with duration of epilepsy
We investigated the association of the computed uni-and multi-variate measures with epilepsy duration. Analysing patients with left and right TLE separately, we used Spearman correlations ( ) to quantify the association observed between the univariate and multivariate distances and epilepsy duration. A non-parametric alternative to the Pearson correlation, the Spearman correlation is a measure of the monotonic relationship between two variables which is more robust to outliers in the dataset. Hypothesising a more negative z-score and more positive Mahalanobis distance with a greater epilepsy duration, significant correlations were assessed using a onetailed test. Per patient group, ten univariate correlations were computed (one per tract) and two multivariate correlations (ipsilateral and contralateral). Reporting significance at the = 0.05 threshold a Bonferroni correction was applied to account for h multiple comparisons (univariate h=10, multivariate h=2). Significant correlation thresholds for samples of size n were approximated using a Student's t distribution with (n-2) degrees of freedom and test statistic (t) shown in equation 2.

= E F@"
$@G H (2) Assessments into the effects of laterality on the correlational analysis were conducted by combining the Mahalanobis distances of all patients into a single ipsilateral and contralateral measure of distance and correlating these with epilepsy duration.
We also investigated if the ipsilateral and contralateral Mahalanobis distances calculated in patients exhibit white matter deviations from the healthy population at onset (i.e. where duration equals zero years). Using robust linear regression models of all patients' ipsilateral and contralateral Mahalanobis distances, estimates of the Mahalanobis distance at duration zero were calculated by regressing out the effects of duration and considering the intercept ( J from equation 3). Robust z-scores were computed using the estimated distances at duration zero as points of interest and the 56 Mahalanobis distances for control subjects as the reference distribution (56; 28 left hemisphere, 28 right hemisphere). Similar to the calculation of univariate and multivariate distances, 1000 random subsamples of size (54) taken from the control distribution (56) were used to calculate z-scores with the median value reported. Samples of size 54 were chosen as it is the maximum size in which at least 1000 unique sub-samples can be inferred from the population of 56 control distances. Given that the Mahalanobis distances, by definition, are positively skewed, the control distribution and points of interest were log transformed to ensure normality prior to calculating each z-score.

Relationship with surgical outcome
Finally, we investigated associations between the z-scores and log(Mahalanobis distances) calculated and surgical outcomes. Hypothesising that larger distances would relate to poorer outcomes, we used one-tailed, two sample t-tests to see if the z-scores, and ipsilateral and contralateral Mahalanobis distances adequately separated patients who were, and were not, completely seizure free following surgery.

Robustness
To assess the robustness of the univariate and multivariate correlational analyses to outliers in the data and potentially explain some of the variability seen in the literature we used Jackknife resampling. A random subsample of 30 left-TLE and 30 right-TLE patients was taken and the association of each of the ten tract FA values and the ipsilateral/contralateral Mahalanobis distances with duration were calculated. Samples of size 30 were chosen as it is the maximum size in which at least 1000 unique sub-samples can be inferred from the population of 33 patients.
Repeating 1000 times and reporting the proportion of samples yielding significant correlations (consistency ) provides a measure of the robustness the data from each tract has to outliers in the dataset. An ideal measure would either always show a significant result ( = 100%) or never show a significant result ( = 0%). Where deviates far from the extremities we interpret this as being inconsistent and therefore has the potential to lead to different results depending on the sample chosen or specific methodology. This thus leads to varied reporting in the literature of (non)significant results. In order to assess the stability of our robustness analysis to cohorts of various sizes we also repeated the analysis for subsample sizes, ranging from 20 to 30 patients per group. Figure 3a highlights the association between epilepsy duration and z-scores for the bilateral uncinate fasciculus in left and right-TLE patients using Spearman correlation. In left-TLE, lower FA, bilaterally, was associated with longer duration of epilepsy in all 10 white matter tracts ( Figure  3a; upper, Figure 3b). In two tracts this was statistically significant after multiple comparisons correction (ipsilateral fornix ⍴=-0.493, p=0.002, and contralateral cingulum gyrus ⍴=-0.460, p=0.004). In right-TLE, there was no significant association between FA and duration in any tract ( Figure 3a; lower panels). All ⍴ and p values are shown in supplementary Table S1.

Outcome of surgery
Evaluation of associations between univariate z-score distances and surgery outcome revealed significant results for right-TLE patients only. After correction for multiple comparisons the contralateral uncinate fasciculus remained significant, showing that FA values which deviate the least from healthy controls relate to a better seizure free outcome (T=2.785, p=0.005).

Multivariate associations between Mahalanobis distance and duration of epilepsy
In left-TLE patients, a significant association was present between the Mahalanobis distances and duration of epilepsy in the ipsilateral hemisphere only (⍴=0.493, p=0.002), with increased distance as duration progresses (Figure 4; upper panels). In contrast to the univariate approach, right-TLE patients showed significant association with duration ipsilaterally (⍴=0.412, p=0.009) ( Figure 4; lower panels). All ⍴ and p values are shown in supplementary Table S3.
Combining all patients into a single unified analysis ( Figure 5; upper panels) reveals a strong significant correlation between Mahalanobis distance and duration of epilepsy in the ipsilateral hemisphere (⍴ =0.482, p=1e-05) and a weaker non-significant correlation in the contralateral hemisphere (⍴ =0.195, p=0.058). Intercepts of the ipsilateral and contralateral progression ( J = 2.164 and J = 2.096 respectively) appeared to originate from the control distribution mean ( 7 = 2.137) ( Figure 5; lower panels). Analysis of the intercept revealed that estimates of the ipsilateral (z=0.239, p= 0.811) and contralateral (z=0.169, p=0.866) Mahalanobis distances of patients at epilepsy onset (i.e. at duration equal zero; the intercept of the regression) were not significantly different from healthy controls.
In left-TLE patients, no significant associations with surgery outcome were present using the ipsilateral or contralateral Mahalanobis distances. However, for right-TLE patients, the contralateral Mahalanobis distance was significantly associated with surgery outcome, surviving Bonferroni correction (T=-2.810, p=0.004), with larger distances in those who did not become seizure free. No significant associations with surgery outcome were found using the ipsilateral Mahalanobis distance (supplementary Table S4).

Robustness
In patients with left-TLE ( Figure 6; upper panels), a univariate analysis of the ipsilateral fornix gives a significant association with duration 70% of the time, depending on the subsample of patients chosen (κ=70%). Other white matter tracts are also varied such as the contralateral cingulum gyrus (κ=50%), and contralateral uncinate fasciculus (κ=21%). All other white matter tracts used in the univariate analyses show good consistencies with values of near 0%. The Mahalanobis approach yields very consistent results, showing a significant association with duration regardless of the subsample ipsilaterally, and never showing a significant association contralaterally (κ=100% and κ=0% respectively). Univariate analyses of right-TLE patients show strong robustness to outliers with all white matter tracts showing consistencies of (κ=0%) indicating that significant correlations are never reported. For the analysis of multivariate robustness in right-TLE patients we see a strong robustness to outliers in the contralateral Mahalanobis distance (κ=0%) and a relatively strong level of robustness for the ipsilateral Mahalanobis distance analysis (κ=82%).
Stability of the robustness analysis over a range of other subsample sizes is reported in Figure S2 and are consistent with those in figure 6. As expected, better performance (i.e. ability to consistently detect a significant effect) was found with larger subsample sizes. Associations with low consistency are seen over the whole subsample range. Those with high consistency values at the maximal subsample size decline rapidly as the subsample size decreases.

Discussion
We report a multivariate analysis of FA and assessed the robustness of traditional and novel approaches to outliers. Our key findings were first, significant correlations between ipsilateral Mahalanobis distances and duration were present regardless of patient laterality, contrary to univariate findings. Secondly, by extrapolating our data to duration equal to zero (i.e. disease onset). we found no significant difference to controls. Thirdly, our robustness analysis revealed that a number of univariate associations were susceptible to outliers in the dataset, whereas results obtained using multivariate Mahalanobis distances were more stable.
Previous studies have typically conducted univariate analyses, correlating the associations between white matter FA and epilepsy duration, with varied findings reported in the literature. We showed that univariate z-scores were significantly correlated with epilepsy duration in two white matter regions after Bonferroni correction (p<0.05, h=10), with an additional seven regions showing associations prior to correction. Significant associations were present in 1) ipsilateral fornix and 2) contralateral cingulum gyrus of left-TLE patients. Similar associations were shown previously, with a whole brain TBSS analysis by (Tsuda et al. 2018) also showing significant associations between epilepsy duration and FA of the fornix in adult TLE patients after multiple comparisons correction. However, contrary to our findings, analyses by (Concha et al. 2009;Kemmotsu et al. 2011;M. Liu et al. 2012;Chiang et al. 2016;Hatton et al. 2020) found no significant associations between epilepsy duration and FA of the fornix. Keller et al. (2012) identified a significant relationship between duration and FA in the ipsilateral cingulum gyrus. No significant association with duration of epilepsy was found using the contralateral structure. A white matter ROI study by (Hatton et al. 2020) investigated associations in left and right-TLE patients taking account of presence, or not, of hippocampal sclerosis (HS). Significant associations after multiple comparisons correction were only observed in left-TLE patients with evidence of HS. In contrast to our univariate analyses of left TLE patients, we found no significant association with duration in right TLE for any tract.
Our multivariate assessment of associations between hemispheric Mahalanobis distances and epilepsy duration showed stronger associations in the ipsilateral hemisphere, regardless of laterality. The strength of the association between duration of epilepsy and the ipsilateral Mahalanobis distance surpassed all univariate z-score associations. The lack of any significant univariate associations with duration in the ipsilateral hemisphere suggests that additional information is gained in a multivariate analysis, that is not captured using traditional methods.
Previous studies have shown that the Mahalanobis distance provides information over and above a univariate analysis. In a study of patients with traumatic brain injury,  demonstrated that the Mahalanobis distance derived from FA values of 22 white matter tracts better distinguished patients from controls (AUC=0.82) than any individual univariate tract z-score (AUC=0.72) and was associated with a level of cognitive impairment. A study of patients with autism demonstrated that the Mahalanobis distance could better distinguish patients from controls (Dean et al. 2017). Interestingly, that study found that combining the FA, MD, RD, and AD of 19 white matter ROIs into a single Mahalanobis distance yielded the best patient control separation, with zero overlap between the two groups. A similar approach could be used in future work applied to TLE. Together, these findings suggest that the Mahalanobis distance provides information complementary to univariate approaches in terms of assessing white matter damage.
We extrapolated our data to estimate the mean Mahalanobis distance at epilepsy duration zero, which we interpret as epilepsy onset, and suggest that white matter abnormality may not precede the onset of TLE, but progresses with the course of the condition. A longitudinal study by Liu et al. (2005) found that a cohort of patients with newly diagnosed and chronic TLE showed a reduction in hippocampal volume at baseline scan relative to healthy controls, with small reductions as time progressed. The rate of decline was comparable to those of healthy controls and thus that study concluded that initial reduction was likely attributed to a precipitating insult, with further declines attributed to healthy aging. Conversely, a study focussing on new onset seizures in children (Widjaja et al. 2012) reported no significant differences in the hippocampal volume of patients compared to controls. Conflicting results may be attributed to a number of factors, including patient selection, age, and the presence of lesions.
Diffusion MRI is sensitive to alterations in the microstructural architecture Soares et al. 2013) and therefore has the potential to reveal early deviations from healthy controls that are not captured by T1 weighted MRI. In our cross-sectional analysis of multivariate white matter alterations, we find that patients with TLE do not deviate from controls at duration zero both ipsilateral and contralateral to the epileptogenic focus ( Figure 5). The lack of difference from controls suggests that gross alterations to limbic system white matter may not be present prior to the start of the epilepsy, however longitudinal studies of new onset patients are needed to confirm.
Assessment of framework robustness showed that association with duration obtained using univariate z-scores were more susceptible to generate variable results than associations calculated using multivariate Mahalanobis distances. Three univariate associations between epilepsy duration and z-scores show poor consistency when subsampling the dataset. All associations pertain to the left-TLE patients, namely the ipsilateral fornix, contralateral cingulum gyrus, and contralateral uncinate fasciculus. Situated near the significant correlation threshold after multiple comparisons correction, it is unsurprising that these three tracts show poor consistency as small deviations from the monotonic relationship would easily alter the state of significance. As we have used a Bonferroni correction (which is dependent on the number of comparisons) it should also be noted that different consistency values would be observed if the number of tracts studied varied. Based on this and the differences in inter-study sample sizes, it is likely that the Bonferroni correction accounts for some of the inconsistencies which exist in the literature. Consistency values associated with the correlations between epilepsy duration and Mahalanobis distances show a strong robustness to outliers. Bonferroni correction of the Mahalanobis distance analysis based on number of tracts is also not required and is a distinct advantage of the multivariate approach.
All patients studied here later underwent anterior temporal lobe surgery. This therefore offered the opportunity to investigate the relationship to post-surgical seizure-freedom. We found significant differences between outcome groups for right TLE patients in a univariate approach (for the contralateral uncinate fasciculus) and the multivariate approach (contralateral Mahalanobis distance) - Table S2,S4. Our finding that patients with poorer surgical outcomes were significantly further from controls than patients with seizure-free outcomes suggests a predisposing factor to surgical treatment success. This agrees with a large number of recent studies suggesting that pre-operative diffusion metrics may be predictive of post-surgical outcomes (Bonilha et al. 2015;Bonilha and Keller 2015;Munsell et al. 2015;Keller et al. 2017;Sinha et al. 2017;Taylor et al. 2018;Sinha et al. 2019).
Our univariate analysis of associations between epilepsy duration and z-score distances in ten white matter structures revealed notable differences based on patient laterality. We found stronger and more widespread correlations in left-TLE, with two significant correlations surviving multiple comparisons correction. No significant correlations were seen in the right-TLE patient group (Table S1). Consistent with our findings, Kemmotsu et al. (2011) found significant correlations between duration and alterations in white matter FA for patients with left-TLE only. They reported significant Pearson correlations in the ipsilateral cingulum hippocampus (r=-0.775) and ipsilateral uncinate fasciculus (r=-0.682). In contrast, Chiang et al. (2016) found no significant correlations between white matter FA alterations and epilepsy duration in left-TLE patients. These observed differences based on laterality could be attributed to multiple factors including the reconstruction method, the multiple comparisons correction used and the sample sizes of those studies.
Associations between epilepsy duration and the multivariate Mahalanobis distances revealed similarities regardless of laterality ( Figure 4). Stronger correlations were observed using ipsilateral hemisphere white matter structures compared to the contralateral structures. Given that the Mahalanobis distance is a measure of the overall hemispheric FA alteration, the results are convincing given that univariate analyses have previously shown stronger white matter alterations in TLE patients ipsilateral to the epileptogenic zone with fewer abnormalities contralaterally (Ahmadi et al. 2009;Otte et al. 2012;Besson et al. 2014).
Our cross-sectional analysis of epilepsy progression has limitations. In order to compare patients, we removed the effects of healthy aging using regression. This procedure assumes FA alterations in all subjects follow a similar natural linear trajectory. It is likely that FA alterations in some patients are underestimated whereas others are overestimated. These residuals may have an effect of altering the magnitude of associations between FA and duration. Secondly, the use of cross-sectional data only provides associations with epilepsy onset and progression, rather than giving direct causal evidence. Thirdly, although FA is the most widely used diffusion MRI metric in the literature, it nonspecific in its measurement which can be influenced by various different factors including axonal density and myelination (Concha et al, 2010). Additionally, a limitation of the multivariate approach in this study is the loss of spatial specificity. By combining multiple white matter tracts into a single analysis, it becomes difficult to interpret which regions contribute most to the observed associations.
Collectively, our results show that the Mahalanobis distance can be used alongside the traditional univariate analyses for a more complete understanding of the progressive nature of epilepsy and its association with white matter abnormalities. More robust to outliers than the traditional univariate z-score approach, the Mahalanobis distance is a complementary method which can be used to compare the overall epilepsy burden in a given hemisphere with clinical variables. Future studies with large cohorts and multi-site data (Hatton et al. 2020) could confirm if the Mahalanobis distance provides consistent results when merging multiple white matter tracts into a single analysis. Additionally, future studies could focus on using robust Mahalanobis distances to explore localised changes within cohorts of patients living with epilepsy, either by combining multiple diffusion measures of individual tracts into a single analysis, or by pooling measures from different modalities.    Table S2 Results of one-tailed, two sample t-tests assessing the associations between univariate z-scores and post-operative seizure freedom. Reported are the t-statistics and corresponding p-values; T (p-value). Positive t-statistics indicate that larger negative z-scores pertain to ILAE2+ patients relative to ILAE1 patients. Conversely, negative t-statistics indicate the inverse relationship, that larger negative z-scores pertain to ILAE1 patients. ATR: Anterior thalamic radiation, CG: Cingulum gyrus, CH: Cingulum hippocampus, F: Fornix, UF: Uncinate Fasciculus. L and R correspond to the left and right hemisphere respectively. Significance levels; * indicates p<0.05, ** indicates p<0.01, *** indicates p<0.001. Bold represents significance after multiple comparisons correction.   Table S4 Results of one-tailed, two sample t-tests assessing the associations between Mahalanobis distances and clinical variables. Reported are the t-statistics and corresponding p-values; T (p-value). Positive t-statistics indicate that larger Mahalanobis distances pertain to ILAE2+ patients relative to ILAE1 patients. Conversely, negative t-statistics indicate the inverse relationship, that larger Mahalanobis distances belong to ILAE1 patients relative to ILAE2+ patients. Ipsilateral: Mahalanobis distance calculated using all ipsilateral ROI's. Contralateral: Mahalanobis distance calculated using all contralateral ROI's. Significance levels; * indicates p<0.05, ** indicates p<0.01, *** indicates p<0.001. Bold indicates significance after multiple comparisons correction.

Figure S1
A schematic example demonstrating the importance of accounting for the covariance structure in measures of distance. A population of healthy controls (blue points) covary in their FA values of two tracts (A). If we consider the FA values of two patients (red and green points) we can see that they do not deviate from the controls when considering each tract in isolation (histograms). An extension to the univariate z-score we use the Euclidean distance to discover the shortest path from each point to the center of the control population (B). Here the red and green points deviate from the distribution the same amount and are again within a normal range from the controls. (C) Accounting for the covariance structure and penalising points that deviate from it we see a clearer difference between patients and controls. Additionally we see that the red point deviates further from controls than the green point. The approach presented here is visualised in two dimensions for two tracts but can be extended to higher dimensions.