Correlation between the clinical disability and T1 hypointense lesions’ volume in cerebral magnetic resonance imaging of multiple sclerosis patients: A systematic review and meta‐analysis

Abstract. Background: To evaluate the correlation between T1 hypointense lesions’ mean volume on cerebral MRI and the disability level of patients with multiple sclerosis. Methods: We included studies testing the desired outcome in adult patients diagnosed with RRMS or SPMS. In February 2021, we searched PubMed, Embase, CENTRAL, and Web of Science to find relevant studies. All included studies were assessed for the risk of bias using a tailored version of the Quality in Prognosis Studies (QUIPS) tool. Extracted correlation coefficients were converted to the Fisher's z scale, and a meta‐analysis using a random‐effects model was performed on the results. Results: We included 27 studies (1919 participants). Meta‐analysis revealed a correlation coefficient of 0.32 (95% CI 0.26–0.37) between T1 hypointense lesions’ mean volume and EDSS score. Discussion: The correlation between T1 hypointense lesions’ mean volume and EDSS was interpreted as low to slightly moderate. The certainty of the evidence was judged to be high.

Magnetic resonance imaging (MRI) is a sensitive paraclinical test for diagnosis and assessment of disease progression in MS and is often used to evaluate therapeutic efficacy. White matter lesions on T2-weighted imaging (T2WI) are hyperintense and could indicate several different histopathological changes such as edema, inflammation, demyelination, gliosis, and axonal loss. 3 On the other hand, hypointense white matter lesions on T1-weighted imaging (T1WI) mostly correspond to axonal loss, white matter destruction, and irreversible clinical outcome. 4,5 As the role of neurodegeneration in the pathophysiology of MS has become more prominent, the formation and evolution of these lesions have been used to measure disease activity. These lesions result from an expansion of the extracellular space due to either an increase in water content or a deterioration of structural components. 6 This reaction may be the consequence of tissue destruction or increased water influx through the impaired blood-brain barrier. Some variations in these lesions have been identified. As Adusumilli et al. showed using spin-echo (SE) sequence images, these lesions can be classified by their level of hypointensity into gray holes (less hypointense) and black holes (more hypointense). 7 These classes are correlated with different clinical and cognitive measures. 7,8 Sahraian et al. define a black hole as "an area that is hypointense compared with the white matter in a T1WI and is concordant with a hyperintense lesion on a T2WI". 5 They also propose that black holes can be divided into two groups: acute (when the black hole coincides with a contrast-enhancing lesion) and chronic or persistent (lesions that appear hypointense in T1WI but do not enhance after contrast injection). Some other authors consider a black hole persistent if it persists for more than 6 months after its first appearance on MRI. 4,9 The evolution of T1WI hypointense lesions is also of importance. While some remain unchanged over time, others convert to isointense lesions (probably due to extensive or partial remyelination). 4 In a longitudinal study by van Waesberghe et al. on T1WI, 55% of hypointense lesions converted to isointense lesions by six months, while the rest remained unchanged. 10 It is also worth mentioning that 25% of the isointense lesions in their patients converted to hypointense lesions after that time interval.
Truyen et al. were the first to describe an association between T1WI hypointense lesions and the clinical state of MS patients. 11 In their study, which used MRI to evaluate clinical disability in patients with RRMS or SPMS, T1 hypointense lesion load showed greater cross-sectional and longitudinal correlations than T2 lesion load with Expanded Disability Status Scale (EDSS) 12 scores (a scale extensively used in studies to assess disability in patients with MS). In contrast, some other studies reported an absence of correlation between T1 hypointensity and EDSS in patients with SPMS 13-17 and RRMS. 17 O'Riordan et al. even reported a negative correlation between these measures. 16 Considering the importance of these lesions and the controversial findings on their association with the clinical disability of patients, in this review we aimed to systematically assess the findings available on this matter to date.

| Objectives
To evaluate the correlation between T1 hypointense lesions' mean volume on cerebral MRI and the disability level of patients with RRMS or SPMS.

| MATERIAL AND METHODS
Design and methods used for this review comply with CRD's Guidance for Undertaking Reviews in Healthcare 18 and are reported in line with Preferred Reporting Items for Systematic Reviews and Meta-Analyses 2020 (PRISMA 2020). 19

| Eligibility criteria
Eligibility criteria were informed using the PICOTS system: (S) Setting: any.

| Online databases
The search employed sensitive topic-based strategies designed for each database with no time frame, language, or geographical restrictions. On the 10th of February 2021, AV performed the search in the following databases: PubMed, Embase, CENTRAL, and Web of Science.

| Citation searching
We also examined the forward and backward citations of the included studies on the 25th of February 2021 using Scopus.

| Search strategy
Our search was designed in line with PRISMA-S guideline 21 and is presented in (Appendix S1).

| Selection process
AV and MM independently screened the titles and abstracts of identified studies for inclusion. Disagreements in this stage were resolved through discussion. Full text of potentially eligible studies was retrieved. Each study was included when both reviewers independently assessed it as satisfying the inclusion criteria from the full text. MF acted as arbiter in the event of disagreement following discussion.

| Data collection process
Using a standardized form, AV and MM extracted the data independently. We resolved any disagreements by discussion. When a study also included PPMS or CIS patients, we included it only if CIS and PPMS patients constituted less than 15% of the study's sample size.

| Outcomes
The main outcome of interest was the correlation between the T1 hypointense lesions' mean volume on cerebral MRI and the EDSS score of participants. Some studies reported measurements at multiple time points for the same participants. In these cases, we averaged the presented correlation coefficients using the formula presented by Alexander 22 :

| Other variables
Other variables of interest that we extracted from the included studies were the following:

| Study risk of bias assessment
AV and MM assessed the risk of bias of each included study. We resolved any disagreements by consensus.

We used a tailored version of the Quality in Prognosis Studies (QUIPS) tool, 23 presented in (Appendix S2). Our tailored version of the tool consisted of six domains: study participation, study attrition, prognostic factor measurement, outcome measurement, study confounding, and statistical analysis and reporting.

| Summary measures
The summary measure used for our review was Spearman's rank correlation coefficient (r).

| Synthesis methods
We used R version 4 with the "meta" package 24 and the "rob.summary" package 25 for our data synthesis.

| Eligibility for synthesis
All studies that met our eligibility criteria and reported our outcome of interest were assessed to be eligible for quantitative synthesis.

| Preparing for synthesis
Given that our effect measure of interest should have been reported directly in the included studies, no data transformation or conversion was deemed necessary. The only exception was the conversion of Pearson's correlation coefficients to Spearman's, which is described above, although we did not encounter any situation that required this conversion. Imputation of missing data was not applicable for our effect measure either.

| Tabulation and graphical methods
We planned to present the results of each included study with the 95% confidence interval for the effect measure, in conjunction with the synthesized effect estimate, in a forest plot.

| Statistical synthesis methods
We used correlation coefficients (r) as our summary measure. Most meta-analysts do not perform syntheses on the correlation coefficient itself because its variance depends strongly on the correlation. Rather, the correlation is converted to the Fisher's z scale and all analyses are performed using the transformed values. We performed a random-effects meta-analysis on the Fisher's z values. Finally, we converted the pooled Fisher's z back to r for the sake of presentation.
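The transformation and pooling described above can be sketched as follows. This is a minimal illustration in Python, not the R "meta" code used for the review. It assumes the usual approximation Var(z) = 1/(n - 3) and the DerSimonian-Laird estimator for the between-study variance, neither of which is stated in the text. The function name `pool_correlations` is hypothetical.

```python
import math

def pool_correlations(rs, ns):
    """Random-effects pooling of correlations on the Fisher z scale.

    rs: per-study correlation coefficients; ns: per-study sample sizes (n > 3).
    Needs at least two studies. Assumes Var(z) = 1/(n - 3) and the
    DerSimonian-Laird tau^2 estimator (both are assumptions of this sketch).
    """
    zs = [math.atanh(r) for r in rs]           # Fisher's z = atanh(r)
    vs = [1.0 / (n - 3) for n in ns]           # approximate sampling variance of z
    w = [1.0 / v for v in vs]                  # inverse-variance (fixed-effect) weights
    z_fixed = sum(wi * zi for wi, zi in zip(w, zs)) / sum(w)
    q = sum(wi * (zi - z_fixed) ** 2 for wi, zi in zip(w, zs))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(rs) - 1)) / c)   # between-study variance
    w_re = [1.0 / (v + tau2) for v in vs]      # random-effects weights
    z_re = sum(wi * zi for wi, zi in zip(w_re, zs)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    lo, hi = z_re - 1.96 * se, z_re + 1.96 * se
    # Back-transform the pooled z and its 95% CI limits to the r scale
    return math.tanh(z_re), math.tanh(lo), math.tanh(hi), tau2

# Hypothetical inputs, not data from the included studies
r, ci_low, ci_high, tau2 = pool_correlations([0.20, 0.35, 0.45], [30, 50, 80])
print(round(r, 3), round(ci_low, 3), round(ci_high, 3), round(tau2, 4))
```

Because the confidence limits are computed on the z scale and then back-transformed, the resulting interval for r is not symmetric around the pooled estimate.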

| Methods to explore heterogeneity
We expected some heterogeneity between studies because of methodological diversity. We evaluated the range of the effects of the random-effects meta-analyses using prediction intervals, χ² statistics, and I² statistics. Heterogeneity was considered substantial if either τ² was greater than zero or the χ² test yielded a p-value < 0.10. The I² statistic quantifies inconsistency across studies to assess the impact of heterogeneity on the meta-analysis 26 and is interpreted as:
• 0%-40%: might not be important;
• 30%-60%: moderate;
• 50%-90%: substantial;
• 75%-100%: considerable.
To investigate the possible sources of heterogeneity, we performed subgroup analyses based on classification of MS (RRMS or SPMS), the diagnostic criteria used for the diagnosis (Poser or McDonald), and the static magnetic field (SMF) used for the MRI (≥1.5T or <1.5T).
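As an illustration of the heterogeneity statistics named above, the sketch below computes Cochran's Q (the χ² statistic), I², and τ² from Fisher's z values and their variances. It is a hedged example, not the authors' code: the function name `heterogeneity` is hypothetical, and the DerSimonian-Laird τ² estimator is an assumption. The p-value for Q, which comes from a χ² distribution with k - 1 degrees of freedom, is not computed here.

```python
def heterogeneity(zs, vs):
    """Return Cochran's Q, its degrees of freedom, I^2 (in %), and tau^2.

    zs: study effects on the Fisher z scale; vs: their sampling variances.
    tau^2 uses the DerSimonian-Laird estimator (an assumption of this sketch).
    """
    w = [1.0 / v for v in vs]                              # inverse-variance weights
    z_bar = sum(wi * zi for wi, zi in zip(w, zs)) / sum(w)
    q = sum(wi * (zi - z_bar) ** 2 for wi, zi in zip(w, zs))
    df = len(zs) - 1
    # I^2: share of total variability attributed to between-study differences
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    return q, df, i2, tau2

# Hypothetical inputs: three studies with equal variances
q, df, i2, tau2 = heterogeneity([0.1, 0.3, 0.5], [0.02, 0.02, 0.02])
print(q, df, i2, tau2)
```

Subgroup analyses such as those described above would apply the same calculation within each subset of studies.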

| Sensitivity analyses
We performed sensitivity analyses on studies with the following factors:
• Component-based analyses for studies with at least one domain at high or unclear risk of bias
• Very large studies, to establish the extent to which they dominate the results.

| Reporting bias assessment
To evaluate the risk of reporting bias across studies, a contour-enhanced funnel plot was generated using Fisher's z transformed correlation for visual inspection of potential publication bias. This plot was designed to have contour lines corresponding to perceived milestones of statistical significance (p-value = 0.01, 0.05, and 0.1).
A test for funnel plot asymmetry was conducted.
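A rank correlation test for funnel plot asymmetry can be sketched as below. This is an illustrative, standard-library implementation of the Begg and Mazumdar approach (Kendall's tau between standardised effects and their variances), assuming no tied values and a normal approximation for the p-value. It is not the routine from the R "meta" package, and the function name `begg_rank_test` is hypothetical.

```python
import math

def begg_rank_test(effects, variances):
    """Rank correlation test for funnel plot asymmetry (no tie correction).

    effects: study effects (e.g. Fisher's z); variances: their sampling variances.
    Returns Kendall's tau and a two-sided p-value from a normal approximation.
    """
    w = [1.0 / v for v in variances]
    pooled = sum(wi * e for wi, e in zip(w, effects)) / sum(w)  # fixed-effect mean
    var_pooled = 1.0 / sum(w)
    # Standardised deviates: (effect - pooled) / sqrt(v_i - Var(pooled))
    dev = [(e - pooled) / math.sqrt(v - var_pooled)
           for e, v in zip(effects, variances)]
    k = len(effects)
    s = 0  # concordant minus discordant pairs
    for i in range(k):
        for j in range(i + 1, k):
            prod = (dev[i] - dev[j]) * (variances[i] - variances[j])
            s += (prod > 0) - (prod < 0)
    tau = s / (k * (k - 1) / 2)
    z = s / math.sqrt(k * (k - 1) * (2 * k + 5) / 18)  # Var(S) without ties
    p = math.erfc(abs(z) / math.sqrt(2))               # two-sided p-value
    return tau, p

# Hypothetical data in which less precise studies report larger effects
tau, p = begg_rank_test([0.1, 0.2, 0.3, 0.4, 0.5],
                        [0.01, 0.02, 0.03, 0.04, 0.05])
print(tau, round(p, 4))
```

A tau near zero with a large p-value is consistent with a symmetrical funnel plot. With few studies the test has low power, so it complements visual inspection rather than replacing it.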

| Certainty assessment
The strength of the overall body of evidence was assessed using an adapted version of the Grading of Recommendations, Assessment, Development and Evaluation (GRADE) framework for prognostic factor research, 27 which takes into account five considerations: study limitations, inconsistency, indirectness, imprecision, and publication bias. We considered a moderate/large effect size as a criterion for upgrading the certainty of evidence. AV and MM rated the certainty of the evidence for the outcome as "high," "moderate," "low," or "very low". We resolved any discrepancies by consensus.

| Excluded studies
We excluded 59 studies in the full-text assessment phase. For a full description of the reasons for the exclusion of these 59 studies, see (Appendix S3).

| Study characteristics
We included 27 studies with 1919 participants. For a detailed summary of the characteristics of the included studies, see (Appendix S4).

| Study participation
Eight studies were at high risk of bias for this domain, 3,10,15,16,34,37,40,43 mostly due to inadequate information about the diagnostic criteria used for the diagnosis of MS. The only exception was the study of Giugni et al., 13 which did not report the diagnostic criteria used; however, because it described its participants adequately, we judged that this omission did not put the study at a higher risk of bias.

| Study attrition
Only one study was judged to be at a considerable risk of bias in this domain. Masek et al. 15 did not provide any information about the patients that were lost to follow-up or the reason for the loss.

| Prognostic factor measurement
Two studies were at risk of bias for this domain. Garrido et al. 43 did not provide any information about the prognostic factor measurement techniques used and thus was at high risk of bias, while Masek et al. 15 provided too little information on this matter for the review authors to judge it and thus was at unclear risk of bias.

| Outcome measurement
Five studies were judged to be at unclear risk of bias in this domain 10,13,15,29,34 because they did not report if the outcome assessors were trained physicians and if the same outcome assessors assessed all the participants.

| Study confounding
Most of the included studies were at high risk of bias regarding this domain, due to not reporting or inadequate reporting of the potential confounding factors in their studies. Six studies 17,31,33,36,40,47 reported potential confounders or the ways used to account for the confounders and thus were judged to be at low risk of bias in this domain.

| Statistical analysis and reporting
Twenty-six studies reported the appropriate effect size (Spearman's rank correlation coefficient) directly. The only exception was the study of Masek et al., 15 which did not provide enough statistical information about the participants, so we had to impute the effect size based on a presented graph.

| Results of individual studies
All studies reported a positive correlation coefficient, with the only exception being the study of O'Riordan et al. 16 which reported a negative correlation. In most of the studies, the correlation was also statistically significant. Nine studies concluded that the correlation was not statistically significant. 10,15,16,30,38,40,43,45,46 For a detailed summary of the results of individual studies, see the forest plot in

| Characteristics of contributing studies
A summary of the characteristics of the contributing studies is provided in Table 1.

Sample sizes
The median sample size was 41 participants (interquartile range 28.5-69.5). The smallest sample size was 11 15

Participants
All studies were conducted on adults. Most studies presented a detailed summary report of the age and sex of the participants.
Only two studies 36,41 did not provide such a detailed summary report.

Diagnostic criteria
Eleven studies used the Poser criteria for the diagnosis of MS. 11 45 One study used Magnetization Prepared-

| Results of statistical syntheses
Twenty-seven studies provided data sufficient for quantitative synthesis. Results of the meta-analysis are presented in Figure 4.
A positive r corresponds to a higher EDSS score in participants with larger lesion loads and vice versa.
The pooled sample size was 1919. The pooled estimated Spearman's r was 0.32, with a 95% CI of 0.26-0.37 and a p-value of <0.0001. Thus, we conclude that there is a significant positive correlation between the T1 lesion load and EDSS score. As a rule of thumb, this r is judged to represent a weak to slightly moderate correlation.

| Results of investigations of heterogeneity
The study of O'Riordan et al. 16 appeared to be an outlier, which can be a reason for statistical heterogeneity. The prediction interval for the summary effect was 0.11-0.50. τ² was 0.0102 with a p-value < 0.01. I² was 43%, which is considered a moderate degree of heterogeneity.
The forest plots for the subgroup analyses are presented in (Appendix S5).
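As a rough consistency check on the reported prediction interval, the following worked calculation applies the common formula for a 95% prediction interval on the Fisher's z scale. It is a sketch under assumptions not stated in the text: the standard error of the pooled z is back-calculated from the rounded 95% CI (0.26-0.37), and the interval uses a t distribution with k - 2 = 25 degrees of freedom.

```latex
\begin{align*}
\hat{z} &= \operatorname{atanh}(0.32) \approx 0.332, \qquad
\mathrm{SE}(\hat{z}) \approx \frac{\operatorname{atanh}(0.37) - \operatorname{atanh}(0.26)}{2 \times 1.96}
  \approx \frac{0.388 - 0.266}{3.92} \approx 0.031 \\
\text{PI}_z &= \hat{z} \pm t_{25,\,0.975}\,\sqrt{\hat{\tau}^2 + \mathrm{SE}(\hat{z})^2}
  \approx 0.332 \pm 2.060\,\sqrt{0.0102 + 0.031^2}
  \approx 0.332 \pm 0.218 \\
\text{PI}_r &= \bigl(\tanh(0.114),\ \tanh(0.549)\bigr) \approx (0.11,\ 0.50)
\end{align*}
```

Under these assumptions the calculation agrees with the reported interval of 0.11-0.50.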

| Results of sensitivity analyses
The sensitivity analysis on studies with a large sample size (>90) included five studies. 17,29,31,32,36 The results of this synthesis were not much different from our overall results (n = 1026, r = 0.29, 95% CI 0.22-0.36).
We also performed component-based analysis on the bias domains. Eight studies were at risk of bias for the participation domain. 3,10,16,34,37,40,43 The results of synthesis on these studies indicated the direction of r was not different in these studies (n = 256). Five studies were at risk of bias in the outcome measurement domain. 10,13,15,29,34 The results of synthesis on these studies indicated the direction of r was not different in these studies (n = 261, r = 0.29, 95% CI 0.17-0.40).
Twenty-one studies were at risk of bias in the study confounding domain. The results of synthesis on these studies indicated the direction of r was not different in these studies (n = 954, r = 0.33). The forest plots for all sensitivity analyses are presented in (Appendix S5). These analyses underline the robustness of the results.

| Reporting biases
The contour-enhanced funnel plot is presented in Figure 5.
Visual inspection confirmed that the plot was symmetrical, indicating a low risk for publication bias. Also, a rank correlation test for funnel plot asymmetry 48 was performed, which yielded nonsignificant results (p = 0.83), confirming there is no significant risk of publication bias in our review.

| Study limitations
Twenty-two studies were at risk of bias for at least one domain.
Five studies were at risk of bias for two domains. Three studies were at risk of bias in three domains. One study was at risk of bias in all six domains. The most common sources of potential bias in these studies were as follows: not appropriately accounting for potential confounders in the design and the analysis, not presenting sufficient data to assess the validity of the diagnosis of participants, and not presenting enough data for validating the outcome measurement methods used. Moreover, we sometimes observed no description of inclusion/exclusion criteria. In general, we considered the included studies to have some serious study limitations.
Thus, we decided to downgrade the certainty of evidence by one level.

| Inconsistency
The I² was 43%, while τ² was 0.0102 with a p-value of less than 0.01. These results indicated moderate inconsistency among our included studies.
Subgroup and sensitivity analyses revealed no difference in the direction of effect in various subsets of studies. Therefore, we decided not to downgrade the certainty of evidence of the results because of this domain.

Indirectness in population
In this review, we were interested in adult patients with a diagnosis of either RRMS or SPMS. All of our included studies evaluated this population, and we did not encounter any serious indirectness in this domain.

Indirectness in prognostic factor
Our prognostic factor of interest was the cerebral MRI T1 hypointense lesion volume. Considering the specificity of our prognostic factor, there is no indirect way of measuring it. Thus, there are no considerable issues in this domain.

Indirectness in outcome
The outcome of interest for our study was the disability measure using EDSS. All included studies used this scale, and thus, there was no issue in this domain either.

| Imprecision
The pooled sample size for our review was 1919, which is considered an adequate sample size. The 95% CI for the pooled effect size in our meta-analysis was 0.26-0.37, which represents a weak to slightly moderate correlation. Also, it is worth mentioning that the pooled 95% CI did not overlap the value of no effect. In conclusion, there was no reason for downgrading the certainty of evidence regarding this domain.

| Publication bias
We evaluated the risk of publication bias using a contour-enhanced funnel plot, visual assessment of the funnel plot for asymmetry, and a statistical test of the symmetry of the funnel plot. All of our investigations indicated that there was a low risk for publication bias, so we decided not to downgrade the quality of this evidence for publication bias.

| Moderate/large effect size
Our synthesis reported a weak to slightly moderate effect size for the correlation between MRI T1 hypointense lesion load and EDSS score. So, we decided not to upgrade the quality of evidence for the moderate/large effect size domain.

| Overall assessment of the confidence in cumulative evidence
We present the results of the assessments of each domain evaluated for the confidence in cumulative evidence, together with the overall level of confidence in cumulative evidence, in Table 2.

| Limitations of evidence
A large proportion of the included studies were found to be at risk

| Limitations of review processes
We encountered a considerable number of studies that probably evaluated the prognostic factor and outcome related to our review, but unfortunately, did not report the results. We tried to reach the authors for the data, but were not successful or were not provided with the data. Including those studies could have a substantial effect on our results. Also, a large proportion of the included studies did not mention or account for the potential confounding factors which resulted in downgrading the confidence in our results.

| Implications
Our results indicate that the cerebral MRI T1 hypointense lesion load is a weak to moderate prognostic factor in practice for estimating the disability level of RRMS and SPMS patients. Future studies should evaluate the validity of other imaging markers for this purpose, such as cerebral MRI T2 lesion load or the T1 to T2 ratio.

PROTOCOL
The protocol is published elsewhere. 53

AMENDMENTS
As a post hoc decision, we decided to assess the heterogeneity of the included studies by also using I² and τ². We also encountered some studies that reported our outcome of interest at several time points for the same participants. So, as another post hoc decision, we decided to average the results of different time points using the formula provided in the methods section. We also found that a large proportion of the studies that reported our prognostic factor and outcome of interest used the Poser criteria instead of the McDonald criteria for the diagnosis of patients. Thus, we decided to add the Poser criteria as another acceptable method in the population section of our eligibility criteria. We also performed subgroup analyses based on the diagnostic criteria used in the primary studies (Poser or McDonald) and based on the SMF of the MRI in the primary studies (≥1.5T or <1.5T). The rest of the review was done according to our published protocol's methods.

CONFLICTS OF INTERESTS
None declared.

AUTHOR CONTRIBUTIONS
AV was involved in coordination of the review, designing the review, designing the protocol, performing the search, study selection, data extraction, assessing the risk of bias in included studies, analysis of data, interpretation of the results, assessing the confidence in cumulative evidence, and writing the review. EB was involved in analysis of data, interpretation of the results, and writing the review. MS was involved in conception of the review, designing the review, designing the protocol, and writing the review. FA was involved in interpretation of the results and writing the review. MF was involved in conception of the review, designing the review, designing the protocol, and writing the review. MM was involved in correspondence, coordination of the review, designing the review, designing the protocol, study selection, data extraction, assessing the risk of bias in included studies, assessing the confidence in cumulative evidence, and writing the review.

DATA AVAILABILITY STATEMENT
All the data that were used in the conduction of this review are pub-