Development of a reliable clinical assessment tool for meningoencephalitis in dogs: The neurodisability scale

Abstract

Background: Meningoencephalitis of unknown origin (MUO) comprises a group of debilitating inflammatory diseases affecting the central nervous system of dogs. Currently, no validated clinical scale is available for the objective assessment of MUO severity.

Objectives: Design a neurodisability scale (NDS) to grade clinical severity and determine its reliability and whether or not the score at presentation correlates with outcome.

Animals: One hundred dogs with MUO were included for retrospective review and 31 dogs were subsequently enrolled for prospective evaluation.

Methods: Medical records were retrospectively reviewed for 100 dogs diagnosed with MUO to identify the most frequent neurological examination findings. The NDS was designed based on these results and evaluated for prospective and retrospective use in a new population of MUO patients (n = 31) by different groups of independent blinded assessors, including calculation of interobserver agreement and association with outcome.

Results: The most common clinical signs in MUO patients were used to inform categories for scoring in the NDS: seizure activity, ambulatory status, posture and cerebral, cerebellar, brainstem, and visual functions. The intraclass correlation coefficient (ICC) for prospective use of the NDS was 0.83 (95% confidence interval [CI], 0.68‐0.91) indicating good agreement, and moderate agreement was found between prospective and retrospective assessors (ICC, 0.71; 95% CI, 0.56‐0.83). No association was found between NDS score and long‐term outcome.

Conclusions and Clinical Importance: The NDS is a novel clinical measure for objective assessment of neurological dysfunction and showed good reliability when used prospectively in MUO patients but, in this small population, no association with outcome could be identified.

meningoencephalomyelitis, and necrotizing leukoencephalitis. [1][2][3] The etiology of MUO is unknown, but it is considered most likely a group of immune-mediated diseases considering their generally positive response to immunosuppressive treatment. 2 The incidence of MUO in the canine population is unknown, but it has been reported as the most common inflammatory disease affecting the central nervous system (CNS) in dogs in a referral hospital population. 4 Meningoencephalitis of unknown origin in dogs is a debilitating disease and, despite appropriate treatment, 25% to 33% of dogs die within a week of diagnosis. 5,6 Several studies have evaluated short and long-term outcome in dogs with MUO, but all have focused on survival at different time points. [5][6][7][8][9][10][11][12][13][14] The use of survival as an outcome measure for MUO patients is flawed because affected dogs often are euthanized, and owners may elect for euthanasia at different time points based on a variety of complex underlying factors including financial constraints, difficulties managing chronic disease and views on what constitutes acceptable quality of life. We are therefore in need of an objective scoring tool specific for MUO that describes the general status of each patient, is easy to use and is repeatable. Such an instrument could improve our ability to monitor this condition, especially when studying the effects of different treatment protocols, and may assist in early prognostication. One previous study attempted to create an outcome score based on neurological deficits, but it was generated retrospectively and no attempts were made to validate or assess reliability of this score. 15 The importance of developing objective scales that can guide physicians' daily practice as well as facilitate development of newer therapeutic options has been recognized in human patients with inflammatory diseases affecting the CNS. 16,17 Diseases such as multiple sclerosis (MS) and autoimmune encephalitis (AE) share some similarities with MUO. 18,19 In recent decades, several outcome mea- for AE. 17,20,21 Our aims were to: (1) design a disability scale that could be used to describe disease status; (2)
Dogs were excluded if no pleocytosis was found on CSF analysis, except for dogs with signs of increased intracranial pressure (ICP) on imaging studies because CSF was not collected in these cases. The findings of the neurological examination performed on initial presentation were retrieved from the medical records and spectrum and frequencies of the different clinical signs in patients with MUO were determined.
The NDS was designed by a veterinary neurology diplomate (RG) and included only signs of neurological dysfunction that had been identified in the initial cohort of MUO patients. The Expanded Disability Status Scale (EDSS) and the Clinical Assessment Scale in Autoimmune Encephalitis (CASE) used in humans were consulted because they are widely used instruments to assess disease progression in MS and AE and have been validated as useful primary outcome indicators in clinical trials. All members of the neurology team (including 4 veterinary neurology diplomates and 3 veterinary neurology residents in training) then were consulted, and minor changes were made after review and recommendations.
The NDS (Table S1) was designed by attributing a numerical rating of dysfunction (0-3, with the higher number denoting more dysfunction) for the following categories: seizures, ambulatory status, cerebral functions, cerebellar functions, brainstem functions, visual functions, and postural abnormalities. We assigned specific clinical signs identified most commonly in the initial population to the different degrees of dysfunction in each category so as to minimize subjectivity in the grading. The degree of disability and perceived effect on quality of life were taken into account when deciding on the degree of dysfunction to attribute to each deficit, similar to the use of the human scales, which usually describe whether or not a deficit affects ability to perform daily activities. Two binary categories also were added to include abnormalities identified in the initial cohort: presence or absence of hyperesthesia and presence or absence of proprioceptive deficits. The NDS score was calculated as the sum of the individual scores for each category. Interobserver reliability was calculated using Cohen's kappa for binary categories (classified as absent or present), weighted kappa for ordinal categories (ranked 0-3) of the scale, and the intraclass correlation coefficient (ICC) for the total scores. Intraclass correlation coefficient estimates and their 95% CI were calculated based on absolute agreement, 2-way random effects models as a measure of prospective and retrospective interobserver agreement. All scores (2 prospective and 2 retrospective) then were used to calculate the ICC as measurement of interobserver agreement between prospective and retrospective use of the NDS.
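The reliability statistics described above can be illustrated with a short sketch. This is not the authors' analysis code. It assumes the single-measure form of the absolute-agreement, 2-way random effects ICC (often written ICC(A,1) or ICC(2,1)), because the text does not state whether single or average measures were reported, and it shows only unweighted Cohen's kappa. The function names and the example scores are hypothetical.

```python
from statistics import mean


def icc_absolute_agreement(scores):
    """Single-measure ICC, 2-way random effects, absolute agreement.

    `scores` is a list of rows, one row per dog; each row holds one
    total score per assessor (hypothetical layout).
    """
    n = len(scores)       # number of dogs
    k = len(scores[0])    # number of assessors
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    # Two-way ANOVA sums of squares (no replication)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                 # between-dog mean square
    msc = ss_cols / (k - 1)                 # between-assessor mean square
    mse = ss_error / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same dogs.

    Undefined (division by zero) when chance agreement equals 1.
    """
    n = len(rater_a)
    labels = set(rater_a) | set(rater_b)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    expected = sum(
        (rater_a.count(lab) / n) * (rater_b.count(lab) / n) for lab in labels
    )
    return (observed - expected) / (1 - expected)


# Hypothetical data: assessor 2 always scores one point higher than assessor 1.
offset_scores = [[1, 2], [2, 3], [3, 4]]
icc_offset = icc_absolute_agreement(offset_scores)

# Hypothetical binary category (1 = present, 0 = absent) for four dogs.
kappa_example = cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0])
```

In the first example the two assessors rank the dogs identically, yet the ICC is below 1 because an absolute-agreement model penalizes the systematic one-point offset between them; a consistency-type ICC would not.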

| Reliability of the NDS
A Kruskal-Wallis test was used to evaluate the association between the NDS score and outcome. The Mann-Whitney test was used to assess the associations between the NDS score and survival to discharge and the NDS score and relapse. The same test was used to evaluate the association between time to relapse and response to treatment at time of relapse. The correlation between NDS score and time spent in the intensive care unit (ICU) was assessed using the Pearson correlation coefficient.
Table 1 and were used to design the NDS (Table S1). In 6 of these dogs, CSF analysis was not performed because herniation through the foramen magnum was identified on MR images.
This category included positional strabismus (n = 16), reduced physiological nystagmus (n = 8), anisocoria (n = 3), incontinence (n = 1), and Horner syndrome (n = 1).
Note: The scale relies on attributing a numerical rating of dysfunction (0-3) in 7 categories giving an overall score of between 0 (normal) and a theoretical maximum of 21 (severe disability).
Table 2.
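The Mann-Whitney test named in the statistical analysis compares two groups by ranking all observations together. The sketch below shows how the U statistic can be computed with midranks for tied scores; it is an illustration only, not the authors' code, it stops at the U statistic without deriving a P value, and the function names and example scores are hypothetical.

```python
def midranks(values):
    """Rank values from 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b); U_a counts pairs where a > b, ties counted as 0.5."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = midranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Hypothetical NDS scores for two small groups of dogs.
u_separated = mann_whitney_u([1, 2, 3], [4, 5, 6])
u_with_tie = mann_whitney_u([1, 2], [2, 3])
```

When the groups do not overlap at all, one of the two U values is 0 and the other equals the product of the group sizes, which is the most extreme result the test can produce.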

| Reliability of the NDS
The ICC values for prospective and retrospective use of the NDS (after removal of the binary categories) are presented in compared to those that did not (median, 5 days; range, 2-7). The NDS score also was associated with ICU hospitalization time (r = 0.41, P = .02).
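As a rough consistency check on the reported correlation, a Pearson coefficient can be converted to a t statistic with n - 2 degrees of freedom. The sketch below assumes that all 31 prospectively enrolled dogs contributed to the ICU analysis, which the text does not state explicitly; the function names and the toy data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def t_from_r(r, n):
    """t statistic (n - 2 degrees of freedom) for testing r against zero."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Reported r = 0.41; n = 31 is an assumption (size of the prospective cohort).
t_reported = t_from_r(0.41, 31)

# Toy data with a perfect linear relationship, to exercise pearson_r.
r_toy = pearson_r([1, 2, 3], [2, 4, 6])
```

Under that assumption the t statistic is about 2.42 with 29 degrees of freedom, which exceeds the two-sided 5% critical value of roughly 2.05 and is in line with the reported P = .02.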

| DISCUSSION
The NDS was designed, consistent with clinical outcome scales used in inflammatory CNS diseases in humans and based on our retrospective study of neurological examination findings, as an objective clinician-administered measure of neurological impairment in dogs with MUO. The scale relies on attributing a numerical rating of dysfunction (0-3) in 7 categories, giving an overall score of between 0 (normal) and a theoretical maximum of 21 (severe disability). We found good agreement between assessors when using the NDS prospectively, but only moderate agreement was found between prospective and retrospective use. No association was found between the NDS score at admission and long-term outcome, but the score was associated with survival to discharge and days spent in the ICU.
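The scoring arithmetic described above (7 categories, each rated 0-3, summed to a total of 0-21) can be sketched as follows. The category identifiers are hypothetical labels for the categories named in the text, and the sketch does not encode the clinical signs assigned to each grade, which are defined in Table S1 and not reproduced here.

```python
# Hypothetical identifiers for the 7 ordinal categories of the final NDS.
NDS_CATEGORIES = (
    "seizures",
    "ambulatory_status",
    "cerebral_functions",
    "cerebellar_functions",
    "brainstem_functions",
    "visual_functions",
    "postural_abnormalities",
)


def nds_total(ratings):
    """Sum the per-category ratings (each an integer 0-3) into a total score."""
    missing = [c for c in NDS_CATEGORIES if c not in ratings]
    if missing:
        raise ValueError(f"missing categories: {missing}")
    total = 0
    for category in NDS_CATEGORIES:
        grade = ratings[category]
        if not isinstance(grade, int) or not 0 <= grade <= 3:
            raise ValueError(f"{category}: grade must be an integer from 0 to 3")
        total += grade
    return total


# A dog rated normal in every category, and one with maximal dysfunction in all.
normal_dog = {category: 0 for category in NDS_CATEGORIES}
worst_case = {category: 3 for category in NDS_CATEGORIES}
normal_total = nds_total(normal_dog)
worst_total = nds_total(worst_case)
```

Rejecting missing or out-of-range grades, rather than silently treating them as 0, keeps an incomplete examination from being mistaken for a low score.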
No association was found between the NDS score and relapse, but patients that relapsed earlier were less likely to respond to treatment.
16,23,24 Reliability is the degree of consistency exhibited when a measurement is repeated under identical conditions. 23,24 Our results indicated high interobserver reliability for the prospective use of the total NDS score and most of the categories that compose it, except for the 2 binary categories (proprioceptive deficits and hyperesthesia), which therefore were withdrawn from the final NDS. Some of the discrepancies identified might be a result of variation in experience
TABLE 3 Intraclass correlation coefficients (with 95% confidence intervals in parentheses) for interobserver agreement of the neurodisability scale (NDS) total score recorded by 2 independent assessors and for the retrospective assessor.
Responsiveness measures an instrument's ability to capture change. [23][24][25] Future studies with larger numbers of dogs are required to further validate the NDS and, most importantly, to evaluate its responsiveness. Such studies will provide further support as to whether the NDS can be a useful tool in monitoring disease progression in patients with MUO and to assess the effectiveness of therapeutic interventions in clinical trials.
When using clinical assessment scales, methods should be used to increase reliability, including training of investigators, assessment by the same rater during the study, standardized protocols for neurological examination, and precise definitions of all requirements. 17 Most raters in our study used the scale on several occasions but some (mainly the rotating interns) were less experienced in performing neurological examinations and only used the NDS on a single occasion.
Nonetheless, inter-rater reliability was high, suggesting that the NDS is a robust COM. Increasing reliability further by using the same assessor throughout a trial should be considered in future studies. Using the same COM (such as the NDS) in different studies of MUO may overcome the limitations of survival as an outcome measure and could allow for easier comparison of results.
In our initial study, outcome (defined as good, fair, or poor) was not associated with NDS score. The NDS may not discriminate sufficiently among different severities of clinical signs, but the lack of association also may reflect the different pathologies included within a clinical diagnosis of MUO and the absence of a standardized treatment protocol.
Small sample size is another likely explanation. Post hoc power analysis based on the effect size identified by our study using 3 outcome groups (with a significance level of .