Instruments measuring change in cognitive function in multiple sclerosis: A systematic review

Abstract Background Multiple sclerosis (MS) is a chronic demyelinating/neurodegenerative disease associated with change in cognitive function (CF) over time. This systematic review aims to describe the instruments used to measure change in CF over time in people with MS (PwMS). Methods PubMed, OVID, Web of Science, and Scopus databases were searched in English until May 2021. Articles were included if they had at least 100 participants and at least a 1‐year interval between baseline and last follow‐up measurement of CF. Results were quantitatively synthesized and presented in tables, and risk of bias was assessed with the Newcastle–Ottawa Scale. Results Fifty‐seven articles met the inclusion criteria (41,623 PwMS and 1105 controls). An intervention (drug/rehabilitation) was assessed in 22 articles. In the studies that used a test battery, visual and verbal learning and memory were the most frequently measured domains, but when studies that used a test battery or a single test were combined, information processing speed was the most measured. The Symbol Digit Modalities Test (SDMT) was the most frequently used test, whether as a single test or as part of a test battery. Most studies assessed “change in CF” as cognitive decline, defined as 1 or more tests measured as ≥ 1.5 SD from the study control or normative mean in a test battery at baseline and follow‐up. Meta‐analysis of change in SDMT scores with seven articles indicated a statistically nonsignificant –0.03 (95% CI –0.14, 0.09) decrease in mean SDMT score per year. Conclusion This study highlights the slow rate of measured change in cognition in PwMS and emphasizes the lack of a gold standard test and consistency in measuring cognitive change at the population level.
More sensitive testing utilizing multiple domains and longer follow‐up may define subgroups where CF change follows different trajectories, thus allowing targeted interventions to directly support those where CF is at greatest risk of becoming a clinically meaningful issue.


INTRODUCTION
Multiple sclerosis (MS) is an autoimmune disease that affects the central nervous system with associated demyelination, inflammation, and irreversible axonal loss seen early in the disease (Oh et al., 2018).

MS can present with protean clinical features including dysfunction from the involvement of any part of the CNS with wide intra- and interindividual variation (Katz Sand, 2015). Females are significantly more susceptible to MS but the progression of the disease is worse in males, with males reaching a specific disability level faster than females (Golden & Voskuhl, 2017).
People with MS (PwMS) can develop impairment of cognitive function in the early stages of the disease (Patti, 2009). Cognitive impairment has been observed in up to 65% of PwMS throughout the disease course (Amato et al., 2006). While cognitive impairment may be present in all types of MS, it is more common in primary and secondary progressive MS (Amato et al., 2006; Brochet & Ruet, 2019). Attention, delayed memory and executive functions are mostly affected, but language, short-term memory and general intelligence are not typically affected (Rao, 1995). In MS, cognitive change may not be linear and may involve different domains at different points in the MS disease course. Cognitive impairment due to MS is also usually multidomain and so has multidimensional adverse effects on a person's life, which include unemployment, and problems with communication and education (Bose et al., 2022).
Most studies of cognitive functioning in MS have examined a single time point, which does not provide information about how cognition may change over time (Sumowski et al., 2018). Examining change over time is important since cognitive decline has been observed in PwMS (Eijlers et al., 2018). A decline in cognition can also be an early symptom of MS progression, secondary to potentially treatable inflammatory disease activity (Shanmugarajah et al., 2017). Therefore, when screening indicates cognitive impairment, a comprehensive assessment by highly trained clinicians (neuropsychologists) may be needed to monitor change in cognitive function. This is often not feasible, however, because specialist clinicians are in short supply and because such assessment is expensive, complex and time-consuming to undertake (Longley & Honan, 2022). Nevertheless, early detection is essential given the potential for a "brain healthy lifestyle" to protect against further decline and the availability of rehabilitation to improve cognitive functioning or assist with the development of cognitive compensation strategies (Longley & Honan, 2022; Meca-Lallana et al., 2021).
Several clinical tools have been developed to assess cognitive function [...] (Sumowski et al., 2018), and MSReactor (Merlo et al., 2019). Screening tools are easy to administer, easy to interpret with the use of cut-off scores, reflect the cognitive functioning of PwMS at a general level, and can be administered repeatedly to monitor change in CF, though cognitive impairment in people with above-average baseline levels of intellectual functioning could be missed (Longley & Honan, 2022). Practice effects (more common between the first and second assessments) can be an issue and can be minimized through dual baseline assessment before further regular assessments (Duff et al., 2001). The heterogeneity of these batteries, which includes different tests measuring similar domains, makes the comparison of change in cognitive function problematic.
Although each cognitive test is usually regarded as measuring a particular cognitive domain, they also tap into abilities in other functional domains. For example, while the Symbol Digit Modalities Test (SDMT) is primarily considered a test of information processing speed, it also requires abilities in visual scanning and incidental memory (Sumowski et al., 2018). The SDMT, written or oral form, nonetheless has been observed to be a reliable and valid test for the measurement of information processing speed in PwMS, and slowed processing speed is considered a hallmark characteristic of MS due to the loss of myelin on neurons that facilitate neural communication (Sivakolundu et al., 2020). Experts convened by the US National MS Society recommended that the SDMT, or a similarly validated test, be used for an annual assessment of cognitive function in PwMS as it is sensitive to cognitive function changes and the best rapid clinical assessment tool for cognitive function (Kalb et al., 2018). Therefore, the SDMT may be referred to as the current gold standard to assess cognitive function in MS.
This systematic review aims to evaluate how changes in cognitive function occur in PwMS, over a minimum 1-year observation period, and how this change was measured and quantified, in studies that included samples of at least 100 participants (PwMS and healthy controls inclusive). The rationale in choosing these studies is to understand how change in cognitive function in PwMS occurs, and how it is measured, at a population or epidemiological level. The review will include details of the assessment tools used, the cognitive domains these tools measure, and the extent of change over time reported in studies using these tools. A meta-analysis examining change in SDMT scores over time is also presented.

Study selection
Title and abstract screening was completed by one rater (CE) in Covidence using the following exclusion criteria: the study did not measure cognitive function in MS, measured cognitive function at only one time point, measured cognitive function at less than a 1-year interval between baseline and follow-up, had fewer than 100 participants (study controls inclusive), or was a duplicate, case report, brief report, letter to the editor, review, or study protocol. Full-text screening was completed independently by two researchers (CE and AZ) using the stated criteria. Conflicts were resolved by consensus or, when the first and second researchers could not reach a consensus, by a third researcher (BT).

Data extraction
Data were extracted by one reviewer (CE) with the following information [...] mean score and standard deviation at baseline and follow-up and sample size at baseline and follow-up were used for the meta-analysis.
Fifteen authors were contacted by email for information missing for the meta-analysis, but only five responded. Only papers with complete data were used for the meta-analysis. The SDMT was chosen because it is sensitive to cognitive function changes and has been recommended as the best clinical practice tool for the assessment of cognitive function in PwMS (Kalb et al., 2018).
The quality and risk of bias of the included studies were assessed independently by two reviewers (CE and AZ) using the Newcastle-Ottawa Scale (NOS). Stars were assigned to the included publications based on three criteria: selection of cases/controls or cohorts, comparability of cases/controls or cohorts, and outcome assessment, with a maximum of four, two and three stars, respectively.
For each publication, total stars ranging from 0 to 3, 4 to 6, and 7 to 9 were considered as low (high risk of bias), moderate (moderate risk of bias), and high quality (low risk of bias), respectively.
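The star bands described above can be expressed as a short sketch. The class and function names are hypothetical and serve only to illustrate the scoring rule stated in the text; this is not the instrument's official scoring software.

```python
from dataclasses import dataclass

# Maximum stars per Newcastle-Ottawa Scale criterion, as stated in the text.
MAX_STARS = {"selection": 4, "comparability": 2, "outcome": 3}


@dataclass
class NosRating:
    """Stars awarded to one publication (hypothetical container)."""
    selection: int
    comparability: int
    outcome: int

    def total(self) -> int:
        # Validate each criterion against its stated maximum before summing.
        for name, cap in MAX_STARS.items():
            value = getattr(self, name)
            if not 0 <= value <= cap:
                raise ValueError(f"{name} must be between 0 and {cap}, got {value}")
        return self.selection + self.comparability + self.outcome


def quality_band(total_stars: int) -> str:
    """Map a total star count (0 to 9) to the quality band used in this review."""
    if not 0 <= total_stars <= 9:
        raise ValueError("total stars must be between 0 and 9")
    if total_stars <= 3:
        return "low quality (high risk of bias)"
    if total_stars <= 6:
        return "moderate quality (moderate risk of bias)"
    return "high quality (low risk of bias)"
```

For instance, a publication awarded 3, 1 and 2 stars on the three criteria totals 6 stars and falls in the moderate band.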

Statistical analysis
A meta-analysis was completed using STATA version 17 to examine change over time in SDMT scores. The SDMT has been recommended as one of the most reliable and valid tools for measuring cognitive function in MS. Cohen's d of SDMT between baseline and follow-up was calculated for all studies. A weighted average was applied depending on the sample size, and a random-effects model was used due to study heterogeneity. The analysis was reported with a forest plot (with 95% confidence intervals). A negative effect size indicated a decline in cognitive function whereas a positive value indicated improvement. A p-value of < .05 was considered statistically significant. The heterogeneity of the studies used for meta-analysis was assessed with I², with values of 25%, 50%, and 75% considered as low, moderate, and high, respectively (Higgins et al., 2003). Publication bias was assessed with Egger's test.
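As an illustration of the kind of calculation described above, the following is a minimal sketch and not the STATA procedure used in this review. It assumes a pooled-standard-deviation form of Cohen's d, a large-sample variance that treats baseline and follow-up as independent groups, and the DerSimonian-Laird estimator of between-study variance; the text does not state which random-effects estimator was used. All names are hypothetical.

```python
from __future__ import annotations

import math
from typing import NamedTuple, Sequence


class SdmtSummary(NamedTuple):
    """Summary SDMT statistics reported by one study (hypothetical container)."""
    mean_base: float
    sd_base: float
    n_base: int
    mean_follow: float
    sd_follow: float
    n_follow: int


def cohens_d(s: SdmtSummary) -> tuple[float, float]:
    """Return (d, variance of d); a negative d means a lower follow-up mean."""
    df = s.n_base + s.n_follow - 2
    pooled_var = ((s.n_base - 1) * s.sd_base ** 2
                  + (s.n_follow - 1) * s.sd_follow ** 2) / df
    d = (s.mean_follow - s.mean_base) / math.sqrt(pooled_var)
    n_sum = s.n_base + s.n_follow
    # Assumption: independent-groups variance; repeated measures on the same
    # participants would also need the baseline/follow-up correlation.
    var_d = n_sum / (s.n_base * s.n_follow) + d ** 2 / (2 * n_sum)
    return d, var_d


def random_effects_pool(effects: Sequence[float],
                        variances: Sequence[float]) -> dict[str, float]:
    """DerSimonian-Laird random-effects pooling with a 95% CI and I-squared."""
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")
    w = [1.0 / v for v in variances]            # inverse-variance weights
    sum_w = sum(w)
    fixed = sum(wi * di for wi, di in zip(w, effects)) / sum_w
    q = sum(wi * (di - fixed) ** 2 for wi, di in zip(w, effects))  # Cochran's Q
    df = len(effects) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * di for wi, di in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {"pooled": pooled, "ci_low": pooled - 1.96 * se,
            "ci_high": pooled + 1.96 * se, "tau2": tau2, "i2_percent": i2}
```

In use, `cohens_d` would be applied to each study's baseline and follow-up summary, and the resulting effects and variances passed to `random_effects_pool`; the returned `i2_percent` can then be read against the 25%, 50%, and 75% benchmarks.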

RESULTS
The primary search yielded 11,023 publications. After duplicates were removed, 5964 remained. Title and abstract screening yielded 97 papers. After full-text screening, 57 papers were included (Figure 1) of which 35 were observational studies (Table 1) and 22 were interventional studies (Figure 1 and Table 2). All the studies were published between 1995 and 2021. North America contributed 22 papers, Europe 32, Oceania 2, and Asia 1. The sample size of PwMS and controls in the articles ranged from 100 (Raimo et al., 2020) to 11,222 (Crielaard et al., 2019) and included a total of 41,623 PwMS and 1105 controls. The interval between cognitive function measurement at baseline and follow-up ranged from 1 to 30 years (Crielaard et al., 2019). Five (Demakis & Buchanan, 2010; Demakis et al., 2009; Hughes et al., 2018; Lincoln et al., 2020; McKay et al., 2019) out of the 57 included studies did not specify the Expanded Disability Status Scale (EDSS) of their study participants. Among studies that specified the EDSS mean or median, all had a mean or median of less than 6.5 except for one study that reported participants' EDSS in the range of 6.0 to 9.5 (Ytterberg et al., 2008). The mean age ranged from 12.5 to 57.5 years (Demakis & Buchanan, 2010) for all participants (PwMS and controls). Mean disease duration at baseline in the MS groups ranged from 6 months (Johnen et al., 2019) to 21 years (Chan et al., 2017). After assessment of the risk of bias for all the 57 included publications, five were of high quality (7 to 9 stars/low risk of bias) while the rest were of moderate quality (4 to 6 stars/moderate risk of bias) (Supplementary Tables S1 and S2).

Change in cognitive function
Most observational studies focused predominantly on cognitive impairment at baseline and follow-up or cognitive decline. However, in some (interventional) studies the focus was on improvement in cognitive function or both improvement and decline from baseline to follow-up (Benedict et al., 2021; DeLuca et al., 2021). Many papers defined cognitive impairment as failure in 1 or more tests with a cut-off score of < −1.5 SD from the mean (normative or study healthy control). However, there was heterogeneity in the cut-off score used (−2.0, −1.5, or −1.0 SD from the normative or study healthy control mean) and in the number of failed tests required (1 to 5 or more) to define "impairment." Decline or improvement was shown by comparing the percentage/proportion of cognitively impaired to improved participants between baseline and follow-up (Amato & Ponziani, 1998; Andravizou et al., 2020; Chruzander et al., 2014; Johnen et al., 2019), and the reliable change index was used to report meaningful change as it corrects for measurement error and practice effects (Andravizou et al., 2020). For some papers that used one cognitive test, a 10% or 20% change in mean score (Bsteh et al., 2019) or a 4-point change in the score (mostly for the SDMT) (DeLuca et al., 2021; Koch et al., 2021) was used.
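The definitions summarized in this section can be sketched as follows. This is an illustrative sketch under stated assumptions, not any included study's scoring code: the function names are hypothetical, the default cut-off and number of failed tests reflect the most common definition reported here, and the reliable change index is written in one common practice-adjusted form (observed change minus the expected practice effect, divided by the standard error of the difference).

```python
from typing import Mapping, Sequence


def z_score(raw: float, reference_mean: float, reference_sd: float) -> float:
    """Standardize a raw test score against a normative or control-group mean."""
    return (raw - reference_mean) / reference_sd


def is_impaired(z_scores: Mapping[str, float],
                cutoff: float = -1.5, min_failed: int = 1) -> bool:
    """Impaired if at least `min_failed` tests fall below `cutoff` SD."""
    failed = sum(1 for z in z_scores.values() if z < cutoff)
    return failed >= min_failed


def proportion_impaired(cohort: Sequence[Mapping[str, float]],
                        cutoff: float = -1.5, min_failed: int = 1) -> float:
    """Share of participants classified as impaired at one time point."""
    if not cohort:
        raise ValueError("cohort must not be empty")
    return sum(is_impaired(p, cutoff, min_failed) for p in cohort) / len(cohort)


def classify_point_change(baseline: float, follow_up: float,
                          threshold: float = 4.0) -> str:
    """Label a raw-score change using a fixed point threshold (4 for the SDMT)."""
    delta = follow_up - baseline
    if delta <= -threshold:
        return "decline"
    if delta >= threshold:
        return "improvement"
    return "stable"


def reliable_change_index(baseline: float, follow_up: float,
                          practice_effect: float, se_diff: float) -> float:
    """Practice-adjusted RCI (assumed form); se_diff comes from reference data."""
    return ((follow_up - baseline) - practice_effect) / se_diff
```

Comparing `proportion_impaired` at baseline and at follow-up reproduces the proportion-based definition of change, and varying `cutoff` (−2.0, −1.5, −1.0) or `min_failed` shows how strongly the reported prevalence of impairment depends on the chosen definition.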

The cognitive function domains measured, and the tests and batteries used
Visual learning and memory (21/57) and verbal learning and memory (20/57) were the most frequently measured domains when a battery was used (Table 3). Information processing speed and complex attention (34/57) was the most measured domain when cumulatively a single test or a battery was used. Rao's BRNB and BICAMS were the predominant batteries used in the measurement of change in cognitive function in PwMS (19/57). Other studies used Randt's memory battery, Birt Memory and Information Processing Battery (BMIPB), and Visual Object and Space Perception (VOSP) battery (Chan et al., 2017). In some cases, it was a combination of cognitive tests from different batteries that were used (Jonsson et al., 2006).

TABLE 3
Cognitive function domains and the cognitive tests and batteries that were used in included studies.

Results related to change in cognitive function from the included papers
Most longitudinal studies without any intervention reported cognitive decline between baseline and follow-up, and in some, decline occurred without associated clinical neurological deterioration based on EDSS and relapse (Motyl et al., 2021). Other findings include that the trend of cognitive decline in PwMS depended on the associated comorbidities: psychiatric conditions such as depression, bipolar disorder, schizophrenia, or neurological conditions such as Alzheimer's and Parkinson's diseases, or both (Demakis et al., 2009). Cognitive decline in PwMS is associated with poorer work outcomes (Amato & Ponziani, 1998), poorer health-related quality of life (Chruzander et al., 2014) and personality changes (Raimo et al., 2020). A decline in cognitive function was theorized to occur gradually over time in PwMS rather than in a step-wise manner (as would occur, for example, in stroke) (Demakis & Buchanan, 2010; Healy et al., 2021).
Decline was also greater in progressive MS than in RRMS (Eijlers et al., 2018), in older PwMS, and in males with MS (de Groot et al., 2009; Wallach et al., 2020). Pediatric onset (i.e., < 18 years of age) is also associated with greater cognitive decline than adult-onset MS (McKay et al., 2019). Cognitive dysfunction is also more likely to occur in PwMS with an EDSS of more than 3.5 compared to those with an EDSS of 1 to 3.5 (Ytterberg et al., 2008).

Meta-analysis of SDMT
In the meta-analysis of SDMT, 2594 participants were pooled from 7 observational studies (Figure 2).

DISCUSSION
This systematic review found that changes in cognitive function in MS are more often measured at the population level using a battery of tests rather than a single test. The domains most measured using a test battery were visual learning and memory, and verbal learning and memory.
Many papers defined cognitive impairment as failure in 1 or more tests with a cut-off score of < −1.5 SD from study control or normative mean and change in cognitive function as the proportion with cognitive impairment or improvement between baseline and follow-up. This illustrates that there was no uniformity in the assessment of change in cognitive function in MS. The SDMT was the most frequently used cognitive test, either as a sole test or as part of a battery in all the papers.
The meta-analysis did not find a clinically or statistically significant change in SDMT scores over a period of three years in PwMS.
The findings suggest that there is no uniform method for measuring change in cognitive function over time. Nor is there a uniform or agreed method to denote the presence of cognitive impairment. Most papers included in this review defined cognitive impairment as a test score more than 1.5 standard deviations below a study or normative sample mean in one or more cognitive function tests, and a decline was defined by a comparison of the proportion that was impaired at baseline and follow-up. However, cut-off scores varied, with some papers using a criterion of more than 2 standard deviations and others one as low as 1 standard deviation. Other studies used a 10% or 20% change (decrease for impairment) in the mean score of a cognitive function test. Any consistent definition, however, needs to consider the heterogeneity that exists between PwMS in the profile of cognitive impairments that they experience and whether the assessment is for screening purposes or is part of a more comprehensive assessment of cognitive functioning. The lack of consistency may be problematic insofar as developing or progressive cognitive impairment may be missed or, conversely, over-interpreted.
This study observed that the SDMT is popularly used in studies and that there is more emphasis in the literature on assessing information processing speed in PwMS. This may be because it has been reported to be a reliable measure (Sumowski et al., 2018). By implication, the use of reliable and sensitive tools to assess change in other domains of cognitive function has become important, particularly for the clinician who is conducting a more comprehensive assessment of cognitive function. However, while poor processing speed is often thought to underlie poor functioning in other domains of cognition (e.g., it may impair learning processes), this is not always the case (Chiaravalloti & DeLuca, 2008). Hence, one cannot infer dysfunction in everyday living based on poor processing speed alone. It may also be that specific impairments in learning and memory occur without the presence of poor processing, i.e., as would occur if there are specific lesions in the brain in areas that govern these learning and memory processes.
This study pooled and evaluated longitudinal change in SDMT scores in PwMS across studies which spanned from 1 year to 3 years.
No significant change in the SDMT score was observed over three years. This meta-analysis suggests that, to detect a clinically meaningful change in cognitive function using the SDMT, researchers may need to aim for an interval longer than three years. This raises two important considerations. First, it may be that cognitive change in MS proceeds at so slow a rate that meaningful change is not detected over a short-to-moderate period. Second, it may be that the SDMT is not a sensitive marker of change in cognition. Such a slow rate of change may not be easily differentiated from cognitive decline in normal aging, though older PwMS decline faster than younger PwMS (Wallach et al., 2020).
An alternative explanation for the lack of change over 3 years is that there is likely to be considerable heterogeneity in the trajectories of cognitive change over time. While some PwMS may experience no change (i.e., cognition remains stable with no excess changes compared to background normal cognitive aging), others may experience cognitive decline over time. Indeed, there is now emerging evidence that early cognitive impairments (assessed using the SDMT) are an important prognostic marker of future cognitive decline and cortical thinning (Healy et al., 2021). It is also possible that future decline may be minimized in response to changes in disease-modifying therapies, or that the lack of change reflects practice effects on the tests in the published studies or natural fluctuations. As supported by other studies, decline in SDMT scores among PwMS is very slow: a 1-point change in SDMT score in 10 years (Chruzander et al., 2014) and a decline of 0.22 in SDMT score per year have been reported, although neither is as slow as our meta-analysis suggests, which found that the mean SDMT score decreased by 0.03 per year.
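To give a sense of scale, the rates quoted above can be converted into the time needed to accumulate a 4-point SDMT change, the threshold some included papers used for a meaningful change. The sketch below is a back-of-envelope illustration only: it assumes a constant linear decline, which may not hold since cognitive change in MS may not be linear, and it takes each quoted figure at face value as raw SDMT points per year. The function name is hypothetical.

```python
def years_to_threshold(points_per_year: float, threshold_points: float = 4.0) -> float:
    """Years needed to accumulate `threshold_points` at a constant yearly decline."""
    if points_per_year <= 0:
        raise ValueError("points_per_year must be positive")
    return threshold_points / points_per_year


# Rates of decline quoted in the text, in SDMT points per year.
quoted_rates = {
    "1 point per 10 years": 1 / 10,
    "0.22 points per year": 0.22,
    "0.03 points per year (this meta-analysis)": 0.03,
}

if __name__ == "__main__":
    for label, rate in quoted_rates.items():
        print(f"{label}: about {years_to_threshold(rate):.0f} years to a 4-point change")
```

Under these assumptions, a 4-point change would take roughly 40 years at 1 point per decade, about 18 years at 0.22 points per year, and well over a century at 0.03 points per year, all far beyond a three-year follow-up.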
There are several limitations of this study. First, only studies with 100 or more participants were evaluated and thus the present results may not be entirely representative of the broader population of people with MS. This was done to evaluate the use of cognitive tests and batteries in large population groups. After the full-text screening, data were extracted by one rater though the risk of bias assessment was done by two independent raters. There was no analysis of how change occurred across groups of PwMS (i.e., according to the type of MS, measures of cognitive reserve, sex-specific or age-related changes or extent of baseline impairments). This is owing to this diversity not being characterized in the larger samples. Only studies in English were included due to convenience though studies with a different language could be informative. However, no article was screened out in full-text screening based on not being written in English.
A meta-analysis was conducted for the SDMT as it has been recommended as one of the most reliable and valid tools for measuring cognitive function in MS. A meta-analysis was not conducted for other domains of cognitive functioning because of the high heterogeneity (in tests used) in the studies included. Change in SDMT could not be reported between RRMS and the progressive types of MS because most papers either did not define the MS group or reported a combined SDMT without differentiation.

CONCLUSION
This study highlights the slow rate of change of cognition in PwMS at the population level, the plethora of tests used for measuring cognition [...] practice effects also need to be accounted for. It is likely that clinically meaningful change in cognitive function in a population of people with MS will take more than 3 years to manifest when using the SDMT, although the rate of this change will vary widely between individuals.
Identifying subgroups with a more rapid trajectory of change may be the best strategy to assess the effectiveness of treatments and interventions that are designed to slow, halt, or improve, cognition in PwMS.
There is also a place for developing more sensitive tests that are better than SDMT. In population-based and intervention studies, larger sample sizes, longer duration of follow-up, and the use of cognitive batteries are recommended to allow meaningful outcomes to be reported.
Based on the reviewed papers, there is no battery of tests that appears to perform better, although further work is required to determine the optimal measurement of cognitive change in this complex field.

FUNDING
No funds were received for this review.

CONFLICT OF INTEREST STATEMENT
There is no conflict of interest to declare. There was no financial support for this review.