A 30‐Year Clinical and Magnetic Resonance Imaging Observational Study of Multiple Sclerosis and Clinically Isolated Syndromes

Objective: Clinical outcomes in multiple sclerosis (MS) are highly variable. We aim to determine the long‐term clinical outcomes in MS, and to identify early prognostic features of these outcomes.
Methods: One hundred thirty‐two people presenting with a clinically isolated syndrome were prospectively recruited between 1984 and 1987, and followed up clinically and radiologically 1, 5, 10, 14, 20, and now 30 years later. All available notes and magnetic resonance imaging scans were reviewed, and MS was defined according to the 2010 McDonald criteria.
Results: Clinical outcome data were obtained in 120 participants at 30 years. Eighty were known to have developed MS by 30 years. Expanded Disability Status Scale (EDSS) scores were available in 107 participants, of whom 77 had MS; 32 (42%) remained fully ambulatory (EDSS scores ≤3.5), all of whom had relapsing–remitting MS (RRMS), 3 (4%) had RRMS and EDSS scores >3.5, 26 (34%) had secondary progressive MS (all had EDSS scores >3.5), and MS contributed to death in 16 (20%). Of those with MS, 11 received disease‐modifying therapy. The strongest early predictors (within 5 years of presentation) of secondary progressive MS at 30 years were presence of baseline infratentorial lesions and deep white matter lesions at 1 year.
Interpretation: Thirty years after onset, in a largely untreated cohort, there was a divergence of MS outcomes; some people accrued substantial disability early on, whereas others ran a more favorable long‐term course. These outcomes could, in part, be predicted by radiological findings from within 1 year of first presentation.
ANN NEUROL 2020;87:63–74

Multiple sclerosis (MS) is a highly variable condition.
Some people with MS accrue little or no neurological disability over decades, 1,2 whereas others have their life significantly shortened. 3 With a view to preventing long-term disability, there is growing interest in the early use of MS disease-modifying therapies (DMTs) capable of inducing sustained remission, albeit with the caveat that these may themselves be associated with life-changing side effects. 4,5 Given this, it is important that the approach to DMTs should, as far as possible, involve a personalized risk-benefit analysis, ideally early in the disease course.
About 85% of people with MS initially develop the relapsing-remitting form, and first present with a clinically isolated syndrome (CIS), an episode of neurological symptoms that at least partially resolves. 6 Features associated with more favorable longer-term outcomes in MS include an early age at symptom onset, an initially relapsing-remitting multiple sclerosis (RRMS) course, optic neuritis (ON) or a predominantly sensory CIS, complete remission after a CIS, and a longer interval between the first and second relapse. 1,7 Following a CIS, a higher initial brain lesion load and greater accrual of lesions over the first 5 years, measured on magnetic resonance imaging (MRI) scans, are associated with an increased likelihood of developing disability within 20 years. 8 Lesions within the brainstem and spinal cord also appear to be associated with a greater risk of subsequent disability. 9,10 Natural history studies have demonstrated that in the majority of people with RRMS, it took over a decade for their mobility to become limited, and over 2 decades before they were immobile without aids. 7 Given this, assessment of the relationships between early prognostic features and later outcomes ideally requires clinical follow-up of 2 decades or more.
In this study, we considered 2 main questions: How diverse are clinical outcomes at 30 years following a CIS? Can we identify early on (within 5 years) those who will develop progressive MS or have their life shortened by MS? We addressed these questions using data from a unique cohort of people recruited prospectively following a CIS between 1984 and 1987. 11,12 The group was followed up clinically and had MRI at 1, 5, 10, 14, 20, and now 30 years. As recruitment predated the DMT era, the cohort was largely untreated.

Participants
One hundred forty people with a CIS were prospectively recruited between 1984 and 1987 at the National Hospital of Neurology and Neurosurgery, and Moorfields Eye Hospital. Eight were subsequently found to have alternative diagnoses. 7 The cohort has previously been followed up on 5 occasions since their baseline assessment. Participants underwent clinical assessment and an MRI brain scan at baseline, with subsequent follow-up at 1, 5, 10, 14, and 20 years. 8,11-15 This is an updated 30-year follow-up of the cohort. At 1 year, radiological data without clinical data were obtained; at all other time points both were acquired. The numbers of participants and their demographic characteristics at each time point are detailed in Table 1. CISs were classified as being an ON, transverse myelitis (TM), or brainstem syndrome based on clinical features. This study was approved by our institutional ethics committee and the National Research Ethics Service (15/LO/0650). All participants gave informed consent, written if they attended in person, or verbal if they provided clinical information by telephone only. For the deceased members of the cohort, death certificates were obtained where possible (27 death certificates obtained out of 29).

Clinical Assessment
Expanded Disability Status Scale (EDSS) 16 scores were used to measure disability, retrospectively from notes and participant recall at baseline and at nadir, where clinical improvement had plateaued or at 1 year, whichever was earlier, and prospectively by examination or by telephone 17 at later time points. Baseline EDSS scores could not be determined in 14 participants due to the absence of notes and unclear recall. In participants not assessed at a given time point, EDSS scores were determined retrospectively from later records and scores from adjacent time points. At 30 years, the Paced Auditory Serial Addition Test (PASAT), an assessment of information processing speed, and Brief International Cognitive Assessment for MS (BICAMS) scores were also obtained for those who attended for review. 18,19 The BICAMS has 3 components: the Brief Visuospatial Memory Test-Revised (BVMTR), the Symbol Digit Modalities Test (SDMT), and the California Verbal Learning Test (CVLT).

Clinical Outcomes
Participants were classified as having either a CIS or MS based on the McDonald 2010 criteria. 20 Those with MS were further subclassified as having this either on clinical (a further relapse or clinical progression) or radiological grounds (new lesions seen on MRI), and RRMS or secondary progressive multiple sclerosis (SPMS). 21 Death due to MS was determined by consensus review of death certificates or notes (where available) by K.K.C. and D.T.C., where MS was either given as the cause of death or a clear contributing factor to death, for example, aspiration pneumonia in someone with advanced MS or a pulmonary embolus secondary to chronic immobility. A working definition of nondisabling MS at 30 years was an EDSS score of ≤3.5 (fully ambulatory, with or without abnormal neurological findings on examination). 16 At 30 years, disease course was classified as CIS, MS with EDSS scores ≤3.5, MS with EDSS scores >3.5, or death relating to MS. Recognizing that the EDSS scores take little account of cognition, 22 we determined the proportion in each group who were found to have cognitive impairment on the BICAMS. We also determined the proportion who remained in employment or who had retired at the national state pension age of 60 years.
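The four-way grouping described above can be illustrated with a minimal sketch. The function name, argument names, and the label for a missing EDSS score are hypothetical and chosen for illustration only; the EDSS cutoff of 3.5 and the four groups come from the text.

```python
from typing import Optional


def classify_30_year_outcome(
    has_ms: bool,
    edss: Optional[float],
    ms_related_death: bool,
) -> str:
    """Assign one of the 30-year disease-course groups named in the text.

    The "MS, EDSS unknown" label is an assumption added here to handle
    participants with MS but no available EDSS score.
    """
    if ms_related_death:
        # Deaths to which MS contributed were scored as EDSS = 10.
        return "death relating to MS"
    if not has_ms:
        return "CIS"
    if edss is None:
        return "MS, EDSS unknown"
    # EDSS <= 3.5 was the working definition of nondisabling MS.
    return "MS, EDSS <=3.5" if edss <= 3.5 else "MS, EDSS >3.5"
```

For example, `classify_30_year_outcome(True, 6.0, False)` falls in the "MS, EDSS >3.5" group, whereas a participant who never met the McDonald 2010 criteria stays in the "CIS" group regardless of EDSS.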

MRI Analysis
Film prints from baseline, 1, 5, and 10 years were redigitized using a VIDAR Diagnostic Pro Advantage film digitizer (VIDAR Systems, Herndon, VA), and processed to reconstruct a digital image stack comparable with native stacks (see Table 1). 23 For each participant, all available scans were reviewed side by side, using 3D Slicer version 4.4. 24 White matter (WM) lesions were marked by consensus (K.K.C. with F.B., D.T.C., or both), with reference to preceding or subsequent scans, and then counted by K.K.C. Whole brain, juxtacortical (JC), periventricular (PV), infratentorial (IT), and deep white matter (DWM) lesions were counted separately. DWM lesions were defined as supratentorial lesions that were neither JC nor PV.
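The regional counting rule can be expressed as a short sketch. The `Lesion` record and its boolean flags are hypothetical (in the study, lesions were marked manually by consensus), and the precedence given to JC over PV when both flags are set is an assumption made only so that each lesion is counted once.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Lesion:
    """Hypothetical per-lesion annotation; not the study's data format."""
    infratentorial: bool
    juxtacortical: bool = False
    periventricular: bool = False


def lesion_region(lesion: Lesion) -> str:
    """Return the region label for a single lesion."""
    if lesion.infratentorial:
        return "IT"
    if lesion.juxtacortical:
        return "JC"
    if lesion.periventricular:
        return "PV"
    # Supratentorial and neither JC nor PV: deep white matter.
    return "DWM"


def count_lesions(lesions: Iterable[Lesion]) -> Dict[str, int]:
    """Count lesions per region and over the whole brain."""
    counts = {"IT": 0, "JC": 0, "PV": 0, "DWM": 0, "total": 0}
    for lesion in lesions:
        counts[lesion_region(lesion)] += 1
        counts["total"] += 1
    return counts
```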

Statistics
Early prediction models were fitted from the perspective of earlier time points, when future or final outcomes and diagnostic groups were unknown, and therefore, unless otherwise stated, include all available subjects, including those who remained classified as CIS. Univariate and multivariate logistic regression was used to identify early (baseline, 1 year, and 5 year) predictors of the following 3 binary 30-year outcomes: (1) 30-year EDSS scores ≤3.5 versus 30-year EDSS scores >3.5 (including deaths due to MS [ie, EDSS scores = 10 by 30 years]); this EDSS cutoff was chosen a priori as more clinically meaningful and objective than the >3.0 versus ≤3.0 threshold; (2) SPMS diagnosis by 30 years, including SPMS deaths, versus CIS and RRMS at 30 years; and (3) death due to MS by 30 years versus all still alive at 30 years. Independent variables analyzed are listed in the Results section. Additionally, for MS-associated death, a Cox proportional hazards model was used to identify the best predictors. All deceased participants, regardless of MS status and cause of death, contributed to the Cox survival analysis, censored at the time of death. Individuals whose deaths were unrelated to MS were not included in the models for 30-year outcomes. Early EDSS, EDSS change, and lesion count predictors were categorized to generate approximately equal frequencies; binary lesion variables, where possible, were dichotomized a priori as 1+ versus 0 lesions, or to equalize frequencies if 0/1+ resulted in a very unequal distribution. Resulting ordered categorical variables were naturally coded so that when entered into a model, the coefficient gave a linear test for monotonic trend across the increasing category levels, assuming equal steps between adjacent categories. When the ordinal lesion variables did not predict materially better than binary, models with binary lesion predictors were reported.
For multivariate logistic and Cox models, manual backward stepwise elimination of variables with p > 0.05 was used to identify the best subset of independent predictors. Age-adjusted comparisons of cognitive outcomes between groups at 30 years were performed using multiple regression of the cognitive measure on group indicators, with age as covariate. Analyses were performed using Stata 15.1 (StataCorp, College Station, TX), 25 and statistical significance is reported at p < 0.05.
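The rule for binary lesion variables described in this section (1+ versus 0 lesions, falling back to a frequency-equalizing split when that is very unequal) might be sketched as below. This is an illustration only, not the authors' Stata code: the function name is hypothetical, the text does not state what counted as "very unequal" (the 20% threshold here is an assumption), and a median split is used as one simple way to equalize frequencies.

```python
from statistics import median
from typing import List, Sequence, Tuple


def dichotomize(
    counts: Sequence[int],
    min_minor_fraction: float = 0.2,  # assumed threshold, not from the study
) -> Tuple[float, List[int]]:
    """Return (cutoff, flags), where a flag is 1 when the count exceeds the cutoff.

    The default split is 1+ versus 0 lesions (cutoff 0). If the smaller of
    the two resulting groups holds less than `min_minor_fraction` of the
    subjects, the cutoff falls back to the median count.
    """
    n = len(counts)
    with_lesions = sum(1 for c in counts if c >= 1)
    minor_fraction = min(with_lesions, n - with_lesions) / n
    cutoff = 0 if minor_fraction >= min_minor_fraction else median(counts)
    return cutoff, [1 if c > cutoff else 0 for c in counts]
```

With a median cutoff of 5, for instance, the flag separates subjects with >5 lesions from those with ≤5 lesions, which is the form of threshold reported for DWM lesions at 5 years.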

Whole Cohort
At the 30-year follow-up, outcome data (including deaths) were obtained in 120 out of the original 132 participants.
Twelve individuals declined or were not traceable. Twenty-nine individuals were deceased, of whom 19 had MS, and 10 died with last known classification as CIS. Of these 10 participants, 3 were last assessed at 20 years, 1 at 10 years, 2 at 5 years, 2 at 1 year, and 2 at baseline. The mean follow-up duration was 30.9 years. Table 1 summarizes the number of participants with a known outcome at each time point. In those alive at 30 years, the mean (standard deviation) age was 61.6 years (7.4 years), with 59 (65%) female and 32 (35%) male. In the 91 alive individuals, 30 remained classified as having had a CIS, and 61 had MS. In total, 80 were known to have MS (61 alive and 19 deceased). BICAMS scores were obtained in 61 participants, 41 with MS and 20 with CIS. Table 2 shows baseline demographic and clinical features for all participants, based on the 30-year outcome.

MS Cohort
Of the 80 people known to have MS by 30 years, 19 were deceased. Sixteen died of complications relating to advanced MS (EDSS scores = 10), 2 died of unrelated causes, and for 1 the cause of death was unknown. All 3 of the latter were assessed and documented to have RRMS at 20 years, with EDSS scores of 2.5, 3.0, and 6.0. Of the 61 who were alive, 26 had SPMS and 35 RRMS. BICAMS scores were obtained in 41 (26 with EDSS scores ≤3.5, 15 with EDSS scores >3.5), and BICAMS z scores (adjusted for age, sex, and years of education) were available in 31 subjects who were ≤65 years of age. Subjects who did not complete the cognitive tests [...]
At 30 years, EDSS score peaks were observed at 0, 2.0, 6.0, and 10, with the lowest points at 4.0 and 9.5 (Fig 2). All of the 26 with SPMS (34%) had EDSS scores >3.5. Of the 35 (45%) with RRMS, 32 (42%) had EDSS scores ≤3.5. Six people fulfilled 2010 MS diagnostic criteria on radiological rather than clinical grounds, and they all had EDSS scores ≤3.5.
With regard to cognition, of the 32 with EDSS scores ≤3.5, 21 had validated BICAMS z scores, of whom 2 had a z score of <−1.5 in 1 or more modalities. None of the 32 had retired early for medical reasons, and all remained in employment (full-time or part-time), or retired at the national state pension age (Fig 3). Age-adjusted cognitive measures in the group with MS and EDSS scores ≤3.5 were not significantly different from the CIS group: for the PASAT, the MS with EDSS scores ≤3.5 group (adjusted mean = 42.32) was 9% worse than the CIS group (adjusted mean = 46.31, difference = [...] Table.
FIGURE 2: Expanded Disability Status Scale (EDSS) scores at 30 years. EDSS scores were obtained from 107 individuals at 30 years. An EDSS score of 10 was only assigned to those where multiple sclerosis (MS) was known to have contributed to death. In the 3 other people with MS who had died, the cause of death was either unrelated to MS or unknown, and no EDSS score was assigned. CIS = clinically isolated syndrome; RRMS = relapsing-remitting multiple sclerosis; SPMS = secondary progressive multiple sclerosis.
There was no association of gender or disease duration with any of the 30-year outcome groups. People presenting with a brainstem CIS were at greater risk of MS-related death than those presenting with either ON or TM (hazard ratio [HR] = 2.87, p = 0.04). This was consistent with the higher proportion of brainstem subjects with baseline IT lesions present (41%), compared to TM (19%) and ON (11%; χ² test, p = 0.009). The change in EDSS scores from nadir to 5 years was also largest in the brainstem CIS group (mean = 1). Older people at presentation were at greater risk of MS-related mortality (HR = 1.07 per year, p = 0.04).
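To give a sense of scale for a per-year hazard ratio, the sketch below converts it to the ratio implied for a larger age difference. It assumes the usual Cox model form, in which a continuous covariate acts log-linearly on the hazard, so the ratio for a difference of k years is the per-year ratio raised to the power k. The function name is hypothetical, and the extrapolated value is an illustration rather than a result reported by the study.

```python
import math


def hazard_ratio_for_difference(hr_per_unit: float, difference: float) -> float:
    """Scale a per-unit hazard ratio to a covariate difference of `difference` units.

    Under a log-linear covariate effect, log(HR) adds up across units,
    so HR(difference) = exp(difference * log(hr_per_unit)).
    """
    return math.exp(difference * math.log(hr_per_unit))


# Per-year HR of 1.07 for age at presentation, applied to a 10-year gap.
ten_year_ratio = hazard_ratio_for_difference(1.07, 10)
print(f"Implied HR for a 10-year age difference: {ten_year_ratio:.2f}")
```

Under this assumption, a 10-year difference in age at presentation corresponds to roughly a doubling of the hazard of MS-related death.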
For predicting 30-year EDSS score ≤ 3.5 versus EDSS score > 3.5 outcome, EDSS scores at nadir and 5 years were significant, more so than EDSS changes between these time points, and the predictive value of EDSS scores at 5 years was, unsurprisingly, greater than at earlier time points.
Combined Predictive Models. Variables entered into multivariate predictive models, to determine the best predictors, included: age at onset, gender, CIS type, disease duration, early EDSS scores and interval EDSS changes between time points, number of relapses within the first 5 years, early total lesion count, changes in total lesion count, and location-specific lesion counts. Overall, MRI-detected brain lesions proved more effective predictors of 30-year outcomes than EDSS: in multivariate models including both lesion and EDSS variables, the latter no longer contributed significantly, with their coefficients substantially reduced. Tables 3 and 4 show the results for early prediction of 30-year EDSS scores >3.5 and 30-year SPMS. For each of the 2 outcomes, 2 models are shown: up to 1 year and up to 5 years. IT and DWM lesions were the best predictors, with the addition of nadir-to-5-years EDSS change in the SPMS prediction model. The up-to-1-year models show that subjects with neither baseline IT nor 1-year DWM lesions had a 13% probability of 30-year EDSS scores >3.5 (87% probability of 30-year EDSS scores ≤3.5), whereas subjects with at least 1 lesion of both types had a 94% probability (95% CI = 83 to 100%) of 30-year EDSS scores >3.5, and a 94% probability of SPMS by 30 years. The up-to-5-years models show that subjects with ≤5 DWM lesions at 5 years and an EDSS change of <2 from nadir to 5 years had an 11% probability of SPMS by 30 years; conversely, subjects with >5 DWM lesions and an EDSS change of ≥2 had a 96% probability (95% CI = 86 to 100%) of SPMS.
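Because logistic models are linear on the log-odds scale, the predicted probabilities quoted above can be re-expressed as odds to show how far apart the two extreme risk profiles are. The sketch below is an illustration using the rounded published probabilities; the helper names are hypothetical, and the resulting odds ratio is not a figure reported in the study or its tables.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    """Probability corresponding to a log-odds value."""
    return 1.0 / (1.0 + math.exp(-x))


def odds_ratio(p_high: float, p_low: float) -> float:
    """Ratio of the odds implied by two probabilities."""
    return (p_high / (1.0 - p_high)) / (p_low / (1.0 - p_low))


# Rounded probabilities of 30-year EDSS > 3.5 from the up-to-1-year model:
# neither lesion type present (0.13) versus both present (0.94).
p_neither, p_both = 0.13, 0.94
print(f"log-odds, neither lesion type: {logit(p_neither):.2f}")
print(f"log-odds, both lesion types:   {logit(p_both):.2f}")
print(f"implied odds ratio:            {odds_ratio(p_both, p_neither):.0f}")
```

Since the inputs are rounded to whole percentages, the implied odds ratio (on the order of 100) should be read only as an indication of magnitude.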

Discussion
This cohort provides a unique perspective on the long-term clinical and MRI evolution of relapse-onset MS. As MRI first became available in the 1980s and DMTs in the 1990s, it is highly unlikely that such long-term, essentially natural history, data can be obtained again. The results from this study suggest that 30 years following symptom onset, there are 3 distinct MS outcomes: an RRMS group with little accrued disability (EDSS scores ≤3.5), an SPMS group who all had impaired mobility (EDSS scores ≥4.0), and a group who have had their lives shortened by MS (all of whom had SPMS). The results also suggest that, at 30 years, cognitive assessment scores in the EDSS scores ≤3.5 group were not significantly different from the CIS group, whereas in the EDSS scores >3.5 group, they were worse. Thirty-year outcomes could, in part, be predicted by early EDSS scores and more robustly by MRI-derived regional lesion counts.
After allowing for other factors, 30-year outcomes were not independently associated with age at onset, gender, baseline EDSS score, or CIS type. MRI lesion counts proved to be better predictors than EDSS scores, and lesion location was more important than lesion number. There was more missing data for changes in lesion count than for absolute lesion counts, and this may be a factor in why new early lesions were not as predictive. Interestingly, although PV and JC lesions are highly relevant in the diagnosis of MS, 20 it was early IT and DWM lesions that had the greatest long-term prognostic value. For example, in people with baseline IT and DWM lesions by 1 year, the chances of having SPMS were 94%, whereas those with 1 or more IT lesions by 1 year were 5 times more likely to have died due to MS than the rest of the cohort. Conversely, absence of both baseline IT and 1-year DWM lesions gave an 87% probability of EDSS scores ≤3.5 at 30 years. IT lesions have previously been linked with less favorable outcomes in people with MS, after a mean follow-up of 7.7 years. 9 Considering the potential application of these results, because treatment decisions are often made prospectively, and increasingly early in the disease course, prognostic factors identified within a year of symptom onset may prove more useful than those identified within 5 years. However, favorable prognostic features at 1 year may also not impact significantly on choices; instead, the emergence of markers suggestive of more disabling outcomes may carry more weight.
Since this cohort was first recruited, diagnostic criteria for MS have changed, and most significantly an MS diagnosis can now be made in people who have had only a single episode of symptoms but who fulfil MRI criteria for dissemination of lesions in space and time. 20 In the present study, 6 individuals had MS diagnosed on radiological grounds (all had EDSS scores ≤3.5 at 30 years, compared with 37% if diagnosed on clinical grounds). Thus, they appear to represent a clinically silent end of the MS spectrum, who would previously have been overlooked. With the routine use of DMTs, the long-term evolution of MS may be changing: in a large cohort of RRMS patients, where 62% were treated with DMTs, only 11.3% transitioned to SPMS after a 17-year period. 26
There are several study limitations. First, this study used well-established clinical outcome measures. This is less controversial for physically disabling outcomes such as SPMS or MS-related deaths, but what is considered a nondisabling outcome may differ substantially depending on whose perspective it is from, and patient-reported outcomes have not been assessed. 27,28 For example, we have not included detailed assessments of fatigue and visual impairment, which may significantly affect functional outcomes in people with MS. With this in mind, the nondisabling MS group identified in this study may more pragmatically be considered to be people with MS who have consistently low levels of disability with no progression, and less to gain from DMTs, rather than those who have no ill effects from MS. Second, at the inception of this cohort, MRI was a new technique, and image quality was not as good as is achievable now; given this, analyses of the earlier images will be less reliable than later ones.
Postgadolinium sequences were not obtained at any time point, and only limited T1-weighted images were obtained at 14 and 20 years, and as such we have not been able to assess active lesion inflammation at the time of scanning or assess the early relevance of T1 "black holes" for longer-term outcomes. Third, symptoms attributable to spinal cord involvement were not systematically assessed early on in this cohort, and spinal cord imaging was not routinely obtained. Additionally, we were not able to obtain outcome data for 12 of the 132 original participants, nor were we able to obtain MRI scans in 9 of the 30 participants classified as CIS; it is possible that some of these individuals would fulfil MS diagnostic criteria, although this is unlikely to change the main findings of this study. Twelve participants did not contribute to early MRI information, of whom 7 were lost to 30-year follow-up; however, the similarity in their baseline demographic features to the rest of the cohort suggests it is unlikely our main results are materially affected. It should also be noted that the cohort originated from 1 neurosciences center, and therefore there may be limitations in generalizability.
With regard to EDSS data, particularly early in the study, these were not captured consistently. To minimize inaccuracies, data from adjacent time points and from clinical records, where available, were used. However, it is worth noting that EDSS scores ≤3.5 are derived from symptoms and examination findings, whereas scores from 4 upward represent thresholds of mobility impairment. Although a >3.0 versus ≤3.0 threshold has been proposed in the literature on benign MS, our use of an EDSS threshold of ≤3.5 versus >3.5 at 30 years is more objectively interpretable, should minimize the impact of any inter- or intrarater variabilities, and be more reliable in the predictive models. Furthermore, only 1 participant in our sample would have been reclassified if a >3.0 versus ≤3.0 dichotomy was used, with little impact on the main results. With regard to cognition, it is worth noting that of those who did not complete cognitive assessment, when compared to those who did, a significantly higher proportion were more neurologically disabled. Given this, it is likely that we have underestimated the true magnitude of cognitive deficits in those who are more physically disabled, and it is also possible that the small differences observed between the EDSS scores ≤3.5 and CIS groups have been underestimated due to incomplete data.
In our statistical analyses, as the main focus of our study is on early (within 5 years) predictors of late outcomes (30 years), we have confined ourselves to investigating only the associations between these time points, and not associations at all other time points. Analyses of the intermediate time points would be of interest, but would answer different questions and lie outside the scope of the present article. A further caveat is that some subgroups were small, resulting in estimated odds ratios that, although statistically significant, should be interpreted with caution, particularly where the ORs' confidence intervals are very wide. Classification properties of our multivariate models might be improved with probability cutoffs different from 0.5; however, we believed this optimization may not be reliably generalizable and preferred not to screen for the best classification.
Lastly, it is worth noting that ON as a presenting CIS has been associated with a more favorable outcome when compared with other presentations. 1 Fifty-two percent of the participants in this study presented with ON, whereas in another large prospective cohort study, 37% presented with ON. 29 Although there was no evidence of association between ON and 30-year outcome in this study, there may still be some bias toward more favorable outcomes.
In conclusion, the results of this study suggest a divergence of natural outcomes in people with MS 30 years after symptom onset: those with SPMS, who have developed greater disability and have a significant risk of their life being shortened by MS; and those classified as having RRMS, who remained fully ambulatory, with no significant cognitive impairment, and who remained employed or retired at the expected age. The results also indicate that for less favorable outcomes, the die may be cast early. This suggests that there are people with MS who have more to gain from earlier use of higher efficacy DMTs, but it also counsels caution when considering the blanket use of DMTs in early MS or following a CIS. The predictive models developed by this study include features that can be obtained in clinical practice, and so hopefully may inform risk-benefit analyses when considering DMTs. A key goal of future research is to determine what pathologically differentiates progressive from persistently nonprogressive MS, with a view to targeting treatments that would substantially increase the chances that a person with MS follows a less-disabling clinical course.