To determine the validity of the Montreal Cognitive Assessment (MoCA) and the Mini–Mental State Examination (MMSE) as screening tools for cognitive impairment after stroke.
Cognitive assessments were administered over 2 sessions (1 week apart) at 3 months post-stroke. Scores on the MoCA and MMSE were evaluated against a diagnosis of cognitive impairment derived from a comprehensive neuropsychological battery (the criterion standard).
Sixty patients participated in the study [mean age 72.1 years (SD = 13.9), mean education 10.5 years (SD = 3.9), median acute NIHSS score 5 (IQR 3–7)]. The MoCA yielded lower scores (median = 21, IQR = 17–24; mean = 20.0, SD = 5.4) than the MMSE (median = 26, IQR = 22–27; mean = 24.2, SD = 4.5). MMSE data were more skewed towards ceiling than MoCA data (skewness = −1.09 vs −0.73). Area under the receiver operator curve was higher for MoCA than for MMSE (0.87 vs 0.84), although this difference was not significant (χ² = 0.48, P = 0.49). At their optimal cut-offs, the MoCA had better sensitivity than the MMSE (0.92 vs 0.82) but poorer specificity (0.67 vs 0.76).
The MoCA is a valid screening tool for post-stroke cognitive impairment; it is more sensitive but less specific than the MMSE. Contrary to the prevailing view, the MMSE also exhibited acceptable validity in this setting.
Cognitive impairment is common after stroke and is independently associated with death and disability, institutionalization and costs of care. Cognitive deficits may be substantial even after minor stroke, lessening chances for return to work and lowering quality of life. Yet cognitive status is often overlooked by both clinicians and researchers. A review of 190 stroke trials identified only 3 that included a specific cognitive outcome measure. The most widely used screening tool remains the Mini–Mental State Examination (MMSE), but it may lack sensitivity for cognitive impairment after stroke [10-12]. In addition, the MMSE is no longer available free of charge, so researchers have started to seek out alternative measures.
The Montreal Cognitive Assessment (MoCA) has been recommended for vascular cognitive impairment. Early validation studies indicated that the MoCA was more sensitive to mild cognitive impairment than the MMSE, and findings in stroke populations were seemingly consistent [15-17]. Only recently, however, have the MoCA and MMSE been directly compared with a comprehensive neuropsychological battery – the ‘criterion standard’ for defining cognitive impairment – in stroke. Godefroy and colleagues reported that the MoCA had superior sensitivity but lower specificity than the MMSE. The substantial delay between testing sessions at a time of rapid spontaneous recovery (session 1, mean 7 days post-stroke; session 2, mean 24 days post-stroke), however, means these results may reflect prognostic value of the screening tools rather than concurrent validity against the battery. Pendlebury and colleagues identified comparable sensitivity and specificity of the two tools at their optimal cut-offs. The sample characteristics in this study, though, limit generalizability: TIA patients were included, testing took place beyond 1 year post-event, and demented patients were excluded.
In this study, we compared performance on MoCA and MMSE (tested at 3 months post-stroke) with a diagnosis of cognitive impairment derived from a comprehensive neuropsychological battery (tested 1 week later). We hypothesized that, against the criterion standard, the MoCA would have greater predictive value (area under the curve) and greater agreement than the MMSE.
Patients who were admitted to the acute stroke unit at the Austin Hospital with completed stroke (ischaemic or intracerebral haemorrhage) were eligible. Patients were excluded if they (i) were younger than 18 years of age, (ii) were unconscious on admission to hospital, (iii) required an interpreter or (iv) had major visual, hearing or language impairments, which would likely prevent completion of the cognitive assessments. Screening for recruitment took place in the acute stage.
All patients provided informed consent to participate: some when first approached in the acute stage, others immediately prior to 3-month testing. Ethical approval for this study was obtained from the Austin Health Human Research Ethics Committee.
At 3 months post-stroke, a researcher visited the participant in their place of residence for the first testing session. During this session, the cognitive screening tools (MoCA and MMSE) were completed. Over the course of the study, 2 different researchers conducted this initial session. Both were psychology graduates who were trained in administration and scoring of the tools. The order in which the 2 screening tools were presented was counterbalanced across participants. Basic demographic information such as living arrangement, level of education and native language was also collected. One week later, a different researcher (blind to session 1 testing results) visited the patient for the second testing session. In this session, the patient completed a full neuropsychological testing battery, which took between 60 and 90 min. The researcher who conducted this testing (TC) had many years of experience in administering neuropsychological tests.
The MMSE and the MoCA are both screening tools for cognitive impairment that are scored out of 30 and take approximately 10 min to complete. The MMSE includes items on orientation to time and place (10 questions), registration (immediate verbal recall of 3 words), serial subtraction (from 100 by 7s), memory (delayed verbal recall of 3 words), naming (pencil, watch), language (repeat a phrase, follow a written instruction, follow a 3-step command, write a sentence) and drawing (copy a line drawing of overlapping pentagons). The MoCA includes sections on visuospatial/executive function (alternating trail-making, cube copy, clock drawing), naming (lion, rhinoceros, camel), attention (forward and backward digit span, tapping to the letter A, subtracting 7s from 100), language (sentence repetition, letter fluency), abstraction (similarities between train and bicycle, watch and ruler), memory (delayed verbal recall of 5 words) and orientation to time and place (6 questions). As two MoCA tasks (subtracting 7s and orientation questions) overlapped with identical items on the MMSE, these items were tested only once. Mood disorder was also assessed during the first 3-month session; the Hospital Anxiety and Depression Scale (HADS) yielded scores for both anxiety and depressive symptoms.
The neuropsychological battery administered in the second session was designed to be similar to those used in previous studies requiring a cognitive criterion standard. Our battery included widely used cognitive tests that have established age-specific norms: the Rey Complex Figure, WAIS-R subtests of block design, digit span and digit symbol, the Hopkins Verbal Learning Test–Revised, the Trail-Making Test, letter and animal fluency, Star Cancellation, the Token Test and the Boston Naming Test. These tasks were grouped into the separate but interdependent domains of Visuospatial, Memory, Executive, Language, Attention and Visual neglect (see Table S1 for full details of the battery).
The criterion standard classification of cognitive impairment was determined on the basis of the neuropsychological battery. Scores on each test were translated to z-scores using age- and education-specific normative data. These z-scores were then averaged across the contributing tests within each domain to yield an overall domain z-score. The only exception was the Visual neglect domain, where a standard cut-off of <51 on the Star Cancellation test was used to indicate a deficit. A patient was classified as cognitively impaired if he or she had domain z-scores of <−1 (i.e. >1 standard deviation below the mean) in 2 or more domains. This domain-based z-score approach to classification has strong endorsement. We used the >1 standard deviation threshold in our primary analysis to capture the mild end of the cognitive impairment spectrum; there is a precedent for using this threshold as a cut-point for post-stroke cognitive impairment.
The primary outcome measure was agreement with the criterion standard. Using the method of calculating sample size for reliability studies, we set a baseline agreement (minimally acceptable level) at 0.50 and a target agreement (desirable level) at 0.80. Using α = 0.05 and β = 0.20, we required a sample size of 22. To correct for 2 independent comparisons (MoCA and MMSE), our conservative estimate of required sample size was 60 patients.
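The classification rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's analysis code: the function and variable names are hypothetical, the input is assumed to already hold norm-referenced z-scores for each test, and a Visual neglect deficit is assumed to count as one of the 2 required domains.

```python
from statistics import mean

# Scores below this value on Star Cancellation indicate a Visual neglect deficit.
STAR_CANCELLATION_CUTOFF = 51


def domain_z_scores(test_z_by_domain):
    """Average the contributing test z-scores within each domain."""
    return {domain: mean(zs) for domain, zs in test_z_by_domain.items()}


def is_cognitively_impaired(test_z_by_domain, star_cancellation,
                            threshold=-1.0, min_domains=2):
    """Return True if deficits are present in at least `min_domains` domains.

    Assumption: a Star Cancellation score below the cut-off counts as one
    deficient domain alongside the z-score domains.
    """
    domain_z = domain_z_scores(test_z_by_domain)
    deficits = sum(1 for z in domain_z.values() if z < threshold)
    if star_cancellation < STAR_CANCELLATION_CUTOFF:
        deficits += 1
    return deficits >= min_domains


# Hypothetical patient: Memory and Executive domains both average below -1.
patient = {
    "Memory": [-1.5, -0.9],
    "Executive": [-1.2, -1.4],
    "Language": [0.2],
}
print(is_cognitively_impaired(patient, star_cancellation=54))
```

Passing a stricter `threshold` (for example −1.5 or −2) gives a more conservative classification from the same inputs.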
Data analysis involved plotting receiver operator characteristic curves and estimating the area under the curve for the 2 screening tools against the criterion standard classification of cognitive impairment. The chi-square statistic was used to assess whether the MoCA and MMSE had different receiver operator characteristic curves.
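For readers unfamiliar with the area under the curve, it can be computed directly as the probability that a randomly chosen impaired patient scores lower on the screening tool than a randomly chosen unimpaired patient, with ties counted as half. The sketch below uses that rank-based definition; the function name and the example scores are hypothetical, and the chi-square comparison of the two correlated curves is not implemented here.

```python
def auc_lower_is_impaired(scores, impaired):
    """Area under the ROC curve when lower scores indicate impairment.

    Computed as the proportion of (impaired, unimpaired) pairs in which the
    impaired patient has the lower score; tied pairs count as 0.5.
    """
    cases = [s for s, flag in zip(scores, impaired) if flag]
    controls = [s for s, flag in zip(scores, impaired) if not flag]
    if not cases or not controls:
        raise ValueError("need at least one impaired and one unimpaired patient")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score < control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical screening scores and criterion standard classifications.
example_scores = [18, 20, 24, 26, 28]
example_impaired = [True, True, False, True, False]
print(auc_lower_is_impaired(example_scores, example_impaired))
```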
The data from the screening tools were dichotomized at a range of different cut-points, and sensitivity and specificity against the criterion standard were calculated at each cut-point. Positive and negative predictive values were computed. Once the optimal cut-points for MoCA and MMSE were established, agreement with the criterion standard was estimated using kappa and further validated using both intraclass and concordance correlation coefficients (ICC, CCC).
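The cut-point search can be illustrated as follows. This is a sketch under stated assumptions: a cut-off written as 26/27 is taken to mean that scores of 26 or below screen positive, and the "optimal" cut-off is chosen here by Youden's index (sensitivity + specificity − 1), which is one common criterion and not necessarily the one used in this study. All names and data are hypothetical.

```python
def sens_spec_at_cutoff(scores, impaired, cutoff):
    """Sensitivity and specificity when score <= cutoff screens positive."""
    tp = fn = tn = fp = 0
    for score, is_impaired in zip(scores, impaired):
        screen_positive = score <= cutoff
        if is_impaired and screen_positive:
            tp += 1
        elif is_impaired:
            fn += 1
        elif screen_positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one impaired and one unimpaired patient")
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff_by_youden(scores, impaired, candidates):
    """Pick the candidate cut-off with the largest Youden index (assumption)."""
    def youden(cutoff):
        sens, spec = sens_spec_at_cutoff(scores, impaired, cutoff)
        return sens + spec - 1
    return max(candidates, key=youden)


# Hypothetical data: three impaired and three unimpaired patients.
demo_scores = [18, 22, 25, 27, 29, 30]
demo_impaired = [True, True, True, False, False, False]
print(sens_spec_at_cutoff(demo_scores, demo_impaired, cutoff=26))
print(best_cutoff_by_youden(demo_scores, demo_impaired, range(17, 30)))
```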
The influence of lesion side and mood disorder on the cognitive screening tools was evaluated using descriptive statistics, area under the curve analysis and correlation.
To divide the group into 3 levels of severity, we split the group of patients classified as cognitively impaired into moderate–severe (domain z-scores <−2 in at least 2 domains) and mild. Mann–Whitney U-tests were then used to examine whether the 2 screening tools could differentiate between the moderate–severe and mild groups and the mild and no cognitive impairment groups.
In follow-up analyses, criterion standard classification was varied to include only those patients with domain z-scores of <−1.5 (i.e. >1.5 standard deviations below the mean) in 2 or more domains, then varied again to include only those patients with 2 or more domain z-scores of <−2.
To determine agreement between cognitive measures within our sample, we employed a different way of calculating agreement that reduced the need for transformative steps prior to data analysis (i.e. no calculation of z-scores in reference to population norms). While this analysis is uninformative regarding existence of cognitive impairment, it is useful for providing direct within-sample measures of agreement. For each individual test in the neuropsychological battery, the 60 patients were placed in rank order. These ranks were then averaged across the contributing tests within each domain to yield an overall rank order for that domain. Total scores on the MMSE and MoCA were also used to order the patients according to ranks. Agreement between the rank order of the domains and the screening tools was then analysed using both ICC and CCC.
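The rank-based procedure can be sketched as below. Several details are assumptions made for illustration: tied scores receive the mean of their ranks, every test is oriented so that a higher score means better performance, the averaged ranks are re-ranked to give the final domain order, and agreement is summarized with Lin's concordance correlation coefficient only (the ICC is omitted). Function names are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def domain_rank(test_scores):
    """Average each patient's rank across a domain's tests, then re-rank."""
    per_test = [average_ranks(scores) for scores in test_scores]
    averaged = [mean(patient_ranks) for patient_ranks in zip(*per_test)]
    return average_ranks(averaged)


def concordance_ccc(x, y):
    """Lin's concordance correlation coefficient for two paired sequences."""
    mx, my = mean(x), mean(y)
    n = len(x)
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


# Hypothetical domain with two tests and three patients.
print(domain_rank([[5, 7, 9], [1, 3, 2]]))
```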
Recruitment took place between August 2008 and March 2011. The total number of patients approached in the acute stage was 104; 60 patients were recruited and they completed the 3-month sessions (Fig. 1). The first testing session took place at a mean of 98.3 days (SD = 12.0) post-stroke, and the second session took place at a mean of 8.1 days (SD = 2.4) after the first session. There were no dropouts between the first and second sessions, and each patient was able to complete both screening tools and all neuropsychological tests. Demographic and clinical characteristics of the sample are outlined in Table 1.
|Age – mean (SD), range||72.1 (13.9), 30–95|
|Years of education – mean (SD), range||10.5 (3.9), 5–21|
|English as native language||40 (67)|
|Acute NIHSS score – median (IQR)||5 (3–7)|
|Mild (1–7)||44 (76)|
|Moderate (8–15)||11 (19)|
|Severe (16+)||3 (5)|
|Oxfordshire Stroke Classification|
|Not visible on acute scan||7 (12)|
The MoCA yielded lower scores (median = 21, IQR = 17–24; mean = 20.0, SD = 5.4) than the MMSE (median = 26, IQR = 22–27; mean = 24.2, SD = 4.5). MMSE data were more skewed towards ceiling than MoCA data (skewness = −1.09 vs −0.73). According to the criterion standard classification, 39 of 60 (65%) patients were cognitively impaired. Area under the receiver operator curves of the MMSE and MoCA, shown in Fig. 2, was not significantly different (χ² = 0.48, P = 0.49). Sensitivity and specificity of the 2 screening tools at a range of cut-offs are outlined in Table 2.
|Measure||MMSE||MoCA|
|AUC (95% CI)||0.84 (0.73–0.95)||0.87 (0.78–0.97)|
|Optimal cut-off||26/27||23/24|
|PPV at optimal cut-off||0.86||0.84|
|NPV at optimal cut-off||0.70||0.82|
At their respective optimal cut-offs, the MMSE classified 37 of 60 (62%) and the MoCA classified 43 of 60 patients (72%) as cognitively impaired. Of the 12 patients misclassified by the MMSE, 5 were false positives and 7 were false negatives. Of the 10 patients misclassified by the MoCA, 7 were false positives and only 3 were false negatives. Agreement was higher for the MoCA (κ = 0.62, 95% CI 0.41–0.83; ICC = 0.62; CCC = 0.62) than the MMSE (κ = 0.57, 95% CI 0.36–0.79; ICC = 0.57; CCC = 0.57).
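The reported accuracy figures can be reproduced from the counts in this paragraph. With 39 impaired and 21 unimpaired patients, the MMSE's 7 false negatives and 5 false positives imply 32 true positives and 16 true negatives; the MoCA's 3 false negatives and 7 false positives imply 36 true positives and 14 true negatives. The short sketch below (function name hypothetical) works through the arithmetic, including Cohen's kappa.

```python
def screening_metrics(tp, fp, fn, tn):
    """Accuracy statistics for a 2x2 table of screen result vs criterion."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement: product of marginal totals for each category.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "kappa": (observed - expected) / (1 - expected),
    }


# Counts derived from the text: 39 impaired, 21 unimpaired.
mmse = screening_metrics(tp=32, fp=5, fn=7, tn=16)
moca = screening_metrics(tp=36, fp=7, fn=3, tn=14)

for name, result in (("MMSE", mmse), ("MoCA", moca)):
    print(name, {key: round(value, 2) for key, value in result.items()})
```

Rounded to 2 decimal places, these values match the sensitivity, specificity, predictive values and kappa reported for each tool.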
Mean screening tool score did not significantly differ by lesion side, for either the MMSE (left lesion mean 24.7 vs right lesion mean 23.5) or the MoCA (left lesion mean 20.7 vs right lesion mean 19.2). For the MoCA, lesion side did have some influence on predictive validity, with area under the curve higher in those with right hemisphere lesions (0.94) than those with left hemisphere lesions (0.77). For the MMSE, there was no laterality difference in area under the curve (right 0.84 vs left 0.85). Neither depression nor anxiety, as measured on the HADS, was significantly correlated with total score on the MMSE or MoCA (all r < 0.20, all P > 0.30).
With the group of patients classified into 3 levels of severity, there were 25 (42%) moderate–severe patients, 14 (23%) mild patients and 21 (35%) patients with no cognitive impairment. Mean scores on the MMSE and MoCA for these 3 groups are presented in Fig. 3. Mann–Whitney U-tests indicated that both screening tools could differentiate between mild and moderate–severe cognitive impairment [MMSE z = −3.17, P = 0.002; MoCA z = −2.85, P = 0.004], and between mild and no cognitive impairment [MMSE z = −2.18, P = 0.029; MoCA z = −2.94, P = 0.003].
Altering the criterion standard threshold to require 2 or more domain z-scores of <−1.5 resulted in a classification of 32 of 60 (53%) patients as cognitively impaired. In this analysis, the MMSE had greater area under the curve (0.89) than the MoCA (0.87). Both tools were sensitive at their (unchanged) optimal cut-offs (both = 0.94), but the MMSE was more specific than the MoCA (0.75 vs 0.54). Agreement was greater for the MMSE (κ = 0.70, 95% CI 0.51–0.88; ICC = 0.70; CCC = 0.70) than the MoCA (κ = 0.49, 95% CI 0.28–0.69; ICC = 0.47; CCC = 0.49). Using the classification that required 2 or more domain z-scores of <−2, 25 of 60 (42%) patients were cognitively impaired. In this analysis, the MMSE again had slightly greater area under the curve (0.87) than the MoCA (0.86).
Analyses of agreement between rankings indicated that the MoCA had superior agreement with memory performance, while the MMSE had superior agreement with language performance (Table 3).
|Domain||MMSE ICC (95% CI)||MoCA ICC (95% CI)|
|Visuospatial||0.71 (0.55–0.81)||0.68 (0.52–0.80)|
|Memory||0.64 (0.46–0.77)||0.69 (0.53–0.80)|
|Executive||0.72 (0.58–0.82)||0.70 (0.55–0.81)|
|Language||0.67 (0.50–0.79)||0.57 (0.38–0.72)|
|Attention||0.74 (0.60–0.84)||0.73 (0.58–0.83)|
|Total||0.77 (0.64–0.85)||0.74 (0.61–0.84)|
The MoCA exhibited acceptable validity in identifying cognitive impairment post-stroke, with slightly greater area under the curve and agreement with a criterion standard classification than the MMSE. The most important quality in a screening tool is sensitivity – so as not to miss patients who need further assessment – and the MoCA had good sensitivity even at the mild end of the cognitive impairment spectrum. The MMSE was also a valid tool, with good area under the curve and higher specificity (proportion of true negatives) than the MoCA. When the criterion standard classification was altered to detect only more severely cognitively impaired patients, the MoCA's superiority in sensitivity was lost. Our results indicate that the shortcomings of the MMSE in stroke may have been overstated.
We replicated previous findings that the MoCA generates lower scores than the MMSE [15-17, 25]. Only 9 of 60 patients were classified as cognitively ‘normal’ at the previously proposed 25/26 cut-off, adding weight to the argument that this cut-off is overly strict. While MoCA data were less skewed towards ceiling than the MMSE data, this does not necessarily imply greater sensitivity. With the criterion standard threshold set at 1 SD, the MoCA did have greater sensitivity but lower specificity than the MMSE, matching a previous result from acute stroke. With the threshold set at 1.5 SD, however, the MMSE performed better than the MoCA, with equivalent sensitivity but substantially higher specificity. Accumulated evidence from this and the other 2 studies to feature an external criterion standard [18, 19] indicates that the MoCA and MMSE have similar predictive validity after stroke, with the MoCA perhaps having a slight advantage in sensitivity but a slight disadvantage in specificity. This similarity is notable given that these 3 studies took place at very different time points post-stroke (within 1 month, at 3 months, beyond 1 year).
These results challenge the idea that the MMSE is insensitive to stroke-related cognitive impairment. Authors of a previous study concluded that the MMSE was extremely poor at detecting cognitive impairment after stroke, but the a priori cut-off they used was ≤23, and thus, only 4 of 71 patients were classified as cognitively impaired. Others have reported a low MMSE sensitivity of 0.62 in stroke, but this study also used the low ≤23 cut-off. In our data, the MMSE had sensitivity of 0.82 and specificity of 0.76 at its optimal cut-off of 26/27, strikingly similar to the MMSE sensitivity of 0.80 and specificity of 0.77 reported at the same cut-off in acute stroke.
Our data showed that the MoCA had greater predictive validity for patients with right hemisphere stroke compared with left hemisphere stroke; MMSE data had no such laterality effect. While the small numbers dictate that this should be seen as a preliminary finding, it supports the idea that the MoCA can detect the attentional and visuospatial deficits that are common after right hemisphere stroke. This has the potential to be clinically important, as recognizing so-called hidden cognitive problems in right-sided stroke, such as agnosia and inattention, remains a challenge. Contrary to previous work [27, 28], we did not find a correlation between symptoms of depression or anxiety and cognitive function.
Secondary analysis using agreement between rankings eliminated the need for norm-based z-score transformations. While this cannot be used to determine levels of cognitive impairment, it is the best way to evaluate internal agreement between measures. This analysis revealed that the MMSE had slightly better overall agreement with the neuropsychological testing than the MoCA, and this was particularly pronounced in the Language domain. Although the MoCA is widely regarded as being more heavily loaded with frontal and executive measures than the MMSE, agreement with the Attention and Executive domains was marginally in favour of the MMSE. The MoCA had superior agreement with the Memory domain, which is consistent with suggestions that the MMSE memory task might be too easy to be discriminating.
The current investigation had several shortcomings. Using receiver operator curves to derive optimal cut-points that are then used as the basis for analysis of agreement is somewhat circular. Studies in different stroke samples that use these as a priori cut-offs are now required. A related point is the loss of detail that occurs when scales are dichotomized. When the criterion standard and screening tools were both split into a simple yes/no for cognitive impairment, agreement was approximately 0.60. Yet, when the detail was preserved by using rank order, agreement was approximately 0.70. It must be acknowledged that both the MoCA and MMSE are subject to effects of age and education [26, 29]. Part of the attractiveness of these screening tools, however, is their simplicity, and our results indicate that their raw total scores have good predictive validity. When we did z-transform the MoCA data, area under the curve against the criterion standard was not increased. As with many studies of post-stroke cognitive impairment, we could not include patients with moderate to severe dysphasia. The make-up of our sample, however, does allow results to be generalized to populations with substantial numbers of non-native English speakers.
The clinical cognitive screening tool of choice for stroke patients may depend on the intended use, whether as a global cognition outcome measure for a research trial or a quality registry, a clinical screen for the purposes of triage and referral, or the sole cognitive assessment of the patient given in lieu of more time-consuming neuropsychological evaluation. The MoCA has considerable strengths: its sensitivity of 0.92 and specificity of 0.67 at a cut-off of 23/24 in our primary analysis show it is a valid screen. A ceiling effect does not appear to be a major problem. It has good depth of information, with clinically familiar items like clock drawing, memory recall, verbal fluency and abstraction providing opportunities for more detailed interpretation beyond assigning points towards a total score. This aspect is particularly important in situations where a single screening tool will be the extent of cognitive assessment for the patient. The MoCA is also easily accessible, free to use and has been translated into many languages. Contrary to prevailing wisdom, the MMSE also exhibited robust validity in detecting cognitive impairment after stroke, with acceptable sensitivity and specificity at its optimal cut-off of 26/27, irrespective of how the criterion standard was defined. Our results indicate that the MoCA and MMSE are both good clinical indicators of cognition after stroke. As such, lack of appropriate tools is not a reasonable excuse for overlooking cognitive assessment in these patients.
We thank Debbie Hansen and Karen Moss for helping to collect the data.
None of the authors have conflicts to report. This work was supported by research grants from the National Stroke Foundation and Equity Trustees Preston & Loui Geduld Trust Fund. Dr Cumming is funded by a National Heart Foundation Postdoctoral Research Fellowship. Dr Linden was supported by grants from the Per-Olof Ahl cerebrovascular research fund and the Västra Götaland FoU office. The Florey Institute of Neuroscience and Mental Health acknowledges the strong support from the Victorian Government and in particular the funding from the Operational Infrastructure Support Grant.