Mini-Mental State Examination (MMSE) for the detection of Alzheimer's disease and other dementias in people with mild cognitive impairment (MCI)

  • Protocol
  • Diagnostic

Authors

  • Ingrid Arevalo-Rodriguez,

    Corresponding author
    1. Fundación Universitaria de Ciencias de la Salud, Hospital de San José/Hospital Infantil de San José, Division of Research, Bogotá D.C., Colombia
    • Ingrid Arevalo-Rodriguez, Division of Research, Fundación Universitaria de Ciencias de la Salud, Hospital de San José/Hospital Infantil de San José, Bogotá D.C., 11001, Colombia. inarev7@yahoo.com. iarevalo@fucsalud.edu.co.

  • Nadja Smailagic,

    1. University of Cambridge, Institute of Public Health, Cambridge, UK
  • Agustín Ciapponi,

    1. Southern American Branch of the Iberoamerican Cochrane Centre, Argentine Cochrane Centre IECS, Institute for Clinical Effectiveness and Health Policy, Buenos Aires, Capital Federal, Argentina
  • Erick Sanchez-Perez,

    1. Hospital Infantil Universitario de San José-FUCS, Neurosciences, Bogotá, Colombia
  • Antri Giannakou,

    1. Bristol University, School of Social and Community Medicine, Bristol, UK
  • Marta Roqué i Figuls,

    1. CIBER Epidemiología y Salud Pública (CIBERESP), Iberoamerican Cochrane Centre, Biomedical Research Institute Sant Pau (IIB Sant Pau), Barcelona, Catalunya, Spain
  • Olga L Pedraza,

    1. Hospital Infantil Universitario de San José-FUCS, Neurosciences, Bogotá, Colombia
  • Xavier Bonfill Cosp,

    1. CIBER Epidemiología y Salud Pública (CIBERESP), Spain - Universitat Autònoma de Barcelona, Iberoamerican Cochrane Centre, Biomedical Research Institute Sant Pau (IIB Sant Pau), Barcelona, Catalonia, Spain
  • Sarah Cullum

    1. Bristol University, Centre for Mental Health, Addiction and Suicide Research, School of Community Medicine, Bristol, UK

Abstract

This is a protocol for a Cochrane Review (Diagnostic test accuracy). The objectives are as follows:

To determine the diagnostic accuracy of the MMSE at various thresholds for detecting individuals with MCI at baseline who would clinically convert to Alzheimer’s disease dementia or other forms of dementia at follow-up

To assess the heterogeneity of test accuracy by population (e.g. memory clinics, community settings) and MMSE thresholds, amongst other factors.

Background

Dementia is a progressive global cognitive impairment syndrome. In 2009, more than 34 million people worldwide were estimated to be living with dementia—a number that will increase to more than 81 million by 2040 (Ferri 2005; Wimo 2010). Dementia encompasses a group of neurodegenerative disorders that are characterised by progressive loss of both cognitive function and the ability to perform daily living activities. It can be accompanied by neuropsychiatric symptoms and challenging behaviours of varying type and severity. Its underlying pathology is usually degenerative, and subtypes of dementia include Alzheimer’s disease dementia (ADD), vascular dementia, dementia with Lewy bodies and frontotemporal dementia, among others. Considerable overlap may be noted in the clinical and pathological presentations of dementia (MRC CFAS 2001), and ADD and vascular dementia often coexist (Matthews 2009; Savva 2009).

Recently, a new type of predementia phase called mild cognitive impairment (MCI) was brought to light. MCI refers to a heterogeneous condition, and currently 16 different classifications are used to define it (Matthews 2008; Petersen 1999; Petersen 2004; Winblad 2004). Prevalence of MCI varies widely (between 0.1% and 42%) according to the criteria applied, with most systems including memory impairment and absence of cognitive decline as basic conditions for diagnosis (Stephan 2007). As part of ADAMS assessment, Plassman et al estimated the prevalence of cognitive impairment without dementia as 22% in people aged 71 years or older (Plassman 2008). MCI may be classified as amnestic or non-amnestic, according to the presence of clinically significant memory impairment that does not meet the criteria for dementia or a subtle decline in other functions not related to memory (Petersen 2011).

Over time, people with MCI may experience a gradually progressive cognitive decline and changes in personality and behaviour. When the cognitive impairment in memory, reasoning, language and visuospatial abilities interferes with daily function, individuals are diagnosed with dementia. Research studies indicate that an annual average of 10% to 15% of individuals with MCI may progress to dementia, in particular ADD, but with wide variation, depending upon the source of study participants, with self-selected clinic attendees having the highest conversion rates (Bruscoli 2004; Mitchell 2008). Information on long-term cohorts suggests that annual conversion rates range from 4.2% (95% confidence interval (CI) 3.9% to 4.6%) for any dementia to 5.8% (95% CI 5.5% to 6.5%) for ADD (Mitchell 2008).

Establishing a definitive diagnosis of MCI in the presence of subtle symptoms can be challenging. In these cases, it is necessary to document the cognitive decline from the patient's medical history and corroborate it by means of neuropsychological testing, among other suggested tools (Petersen 2001). The American Academy of Neurology recommended in 2001 that patients with MCI should be evaluated and monitored in accordance with their risk of progression to dementia by means of general or brief cognitive screening tools (Petersen 2001). Likewise, the National Institute on Aging and the Alzheimer's Association remarked in 2011 that longitudinal evidence of progressive decline in cognition could support the diagnosis of MCI due to AD and could allow assessment of the potential benefits of early treatment (Albert 2011).

Usually, recognition and assessment of people with suspected dementia in any setting (community, primary care or secondary care) requires a brief test of cognitive function and/or the use of informant questionnaires (Arevalo-Rodriguez 2013). The brief cognitive evaluations needed are usually paper-and-pencil tests that are easy to administer, take no longer than 10 minutes to complete, involve major executive functions and yield an objective score. This final score is useful in determining which individuals need a more comprehensive evaluation (usually identified by low scores) (Boustani 2003). One of these brief cognitive tests is the Mini-Mental State Examination (MMSE) (Folstein 1975), which has become the best-known and the most often used short cognitive screening test for dementia in clinical, research and community settings, although it is now the subject of copyright issues (Nieuwenhuis-Mark 2010).

Systematic assessments of the diagnostic accuracy of brief cognitive tests such as MMSE are scarce. In 1992, Tombaugh et al presented a narrative review of MMSE studies that emphasised psychometric properties such as reliability and construct validity without evaluating the quality of the included evidence (Tombaugh 1992). Later, Mitchell published a systematic review and meta-analysis of cross-sectional studies of MMSE and reported different estimations of sensitivity and specificity according to the setting and population (Mitchell 2009). Until now, the relationship between MMSE scores and conversion from MCI to ADD or other dementias has not been evaluated in a systematic fashion.

The aim of this diagnostic test accuracy (DTA) review is therefore to evaluate the ability of the MMSE to identify those people with MCI who will progress to the full clinical syndrome of dementia in settings such as community residences, primary care facilities and memory clinics.

Target condition being diagnosed

In general, dementia as diagnosed is defined by a deficit in more than two cognitive domains of sufficient degree to impair functional activities. Symptoms are usually progressive over a period of at least several months and should not be attributable to any other brain disease (American Psychiatric Association 1994). Dementia develops over a trajectory of several years, and it is presumed that during some portion of this time, people are asymptomatic and pathology is accumulating (Jack 2011). Individuals or their relatives may notice subtle impairments of recent memory during this time. Gradually, more cognitive domains become involved, and difficulty planning complex tasks becomes increasingly apparent. Subtypes of dementia include Alzheimer's disease dementia (McKhann 1984; McKhann 2011), vascular dementia (Roman 1993), fronto-temporal dementia (Lund and Manchester Groups 1994) and Lewy body dementia (McKeith 1996), among others. Some dementia subtypes are related to other neurological diseases such as Parkinson's disease (Goetz 2008).

This review will focus on conversion from MCI to Alzheimer's disease dementia and on conversion from MCI to other forms of dementia, both of which will be assessed at follow-up. As previously noted, several studies have shown that most patients with MCI are at increased risk of developing dementia (Petersen 2011). Several medications have been evaluated for reducing the risk of progression or delaying it, but none has been adopted for extended clinical use (Farina 2012; Russ 2012; Yue 2012).

Index test(s)

The Folstein Mini-Mental State Examination (MMSE) is a 30-question assessment of cognitive function that evaluates attention and orientation, memory, registration, recall, calculation, language and ability to draw a complex polygon (Folstein 1975). The MMSE has recently been subject to copyright restrictions (de Silva 2010).

Advantages of the MMSE include rapid administration, availability of multiple language translations and high level of acceptance as a diagnostic instrument amongst health professionals and researchers (Nieuwenhuis-Mark 2010). The presence of cognitive decline is determined by the total score. Traditionally, a 23/24 cut-off has been used to select patients with suspected cognitive impairment or dementia (Tombaugh 1992). However, several studies have shown that sociocultural variables affect individual scores (Bleecker 1988; Brayne 1990; Crum 1993); therefore local standards must be developed for each population and setting evaluated (Diniz 2007; Kulisevsky 2009; Shiroky 2007; Trenkle 2007).

Clinical pathway

Dementia develops over a trajectory of several years. It is presumed that during some portion of this time, people are asymptomatic and pathology is accumulating. Individuals or their relatives may notice subtle impairments of recent memory during this time. Gradually, more cognitive domains become involved, and difficulty planning complex tasks becomes increasingly apparent. People with memory complaints usually present to their general practitioner (primary care), who may administer one or more brief cognitive tests and potentially refer the individual to a memory clinic (secondary care). However, many people with dementia do not present until much later in the course of the disease and follow a different pathway to diagnosis. In community settings, screening tests are usually administered to estimate epidemiological figures of dementia, identify cases to be included in clinical trials or even establish a follow-up to detect incident cases or changes in cognitive performance (Brayne 2011). In all cases, a follow-up period is mandatory to detect cognitive changes in populations and conversion of mild cases to dementia (delayed verification).

Standard assessment of dementia includes a history and clinical examination (including neurological, mental state and cognitive examinations); laboratory tests such as thyroid-stimulating hormone, serum folic acid, serum vitamin B12 and blood count; an interview with a relative or other informant; and neuroradiological evaluation (Feldman 2008; Hort 2010). Before dementia is diagnosed, other physical and mental disorders (e.g. hypothyroidism, depression) that might be contributing to cognitive impairment should be excluded or treated. Neuropsychological examination includes full assessment of major cognitive domains, including memory, executive functions, language, attention and visuospatial skills. A neuroradiological examination (computed tomography (CT) or magnetic resonance imaging (MRI) scan of the brain) is also recommended in most recent consensus guidelines (McKhann 2011), although the use of cerebrospinal fluid (CSF) biomarkers is controversial (Dubois 2010). Sometimes the diagnosis is made on the basis of history and presentation alone.

Prior test(s)

Most tests (e.g. neuroimaging, CSF analysis) are usually performed after a cognitive deficit has been identified. However, it is conceivable that patients with abnormalities on brain imaging—performed for any number of reasons—are likely to be tested subsequently for cognitive deficits.

Role of index test(s)

Accurate diagnosis leads to opportunities for treatment. At the present time, no “cure” for dementia is known, but some treatments can slow cognitive and functional decline or reduce associated behavioural and psychiatric symptoms of dementia (Birks 2006; Clare 2003; McShane 2006). Furthermore, diagnosis of ADD (and other dementias) at an early stage will help people with dementia, their families and potential carers in making timely plans for the future. Coupled with appropriate contingency planning, proper recognition of the disease may also help to prevent inappropriate and potentially harmful admissions to hospital or institutional care. In addition, accurate early identification of dementia may increase opportunities for the use of newly evolving interventions designed to delay or prevent progression to more debilitating stages of dementia.

Alternative test(s)

The Cochrane Dementia and Cognitive Improvement Group (CDCIG) is in the process of conducting a series of DTA reviews of biomarkers and other tests to determine their sensitivity and specificity for the diagnosis of Alzheimer’s disease dementia and other dementias. These include the following:

  • 18F-FDG PET (positron emission tomography-fluorodeoxyglucose).

  • PET-PiB (positron emission tomography-Pittsburgh compound B).

  • sMRI (structural magnetic resonance imaging).

  • CSF (cerebrospinal fluid analysis of Abeta and tau).

  • APOE e4 (apolipoprotein E e4, a major genetic risk factor for cognitive decline).

  • FP-CIT SPECT (2β-carbomethoxy-3β-(4-iodophenyl)-N-(3-fluoropropyl)nortropane single photon emission computed tomography).

  • Informant interviews (IQCODE (Informant Questionnaire on Cognitive Decline in the Elderly); AD8 (a brief informant interview to detect dementia)).

Rationale

The public health burden of cognitive and functional impairment due to dementia is of growing concern. With the changing age structure of populations in both high- and low-income countries, the prevalence of dementia is increasing (Ferri 2005). At the population level, this event has major implications for service provision and planning, given that the condition leads to progressive functional dependence over several years. Accurate diagnosis leads to opportunities for treatment and appropriate care, but it is also crucial to identify participants for clinical trials of sufficient power to demonstrate the effectiveness of potential treatments.

At the present time, no “cure” for dementia is known, but some treatments can slow cognitive and functional decline or reduce associated behavioural and psychiatric symptoms of dementia (Birks 2006; Clare 2003; McShane 2006). Furthermore, diagnosis of ADD (and other dementias) at an early stage (i.e. MCI) will help people with dementia, their families and potential carers in making timely plans for the future. Coupled with appropriate contingency planning, proper recognition of the disease may also help prevent inappropriate and potentially harmful admissions to hospital or institutional care. In addition, accurate early identification of dementia may increase opportunities for the use of newly evolving interventions designed to delay or prevent progression to more debilitating stages of disease.

The Cochrane Dementia and Cognitive Improvement Group is undertaking a series of DTA systematic reviews, including three on the accuracy of the MMSE for diagnosing dementia (Davis 2013). In particular, this review will focus on evaluation of the MMSE and will include delayed-verification studies that assess conversion from MCI to dementia.

Objectives

To determine the diagnostic accuracy of the MMSE at various thresholds for detecting individuals with MCI at baseline who would clinically convert to Alzheimer’s disease dementia or other forms of dementia at follow-up

Secondary objectives

To assess the heterogeneity of test accuracy by population (e.g. memory clinics, community settings) and MMSE thresholds, amongst other factors.

Methods

Criteria for considering studies for this review

Types of studies

We will consider longitudinal studies in which results of the MMSE administered to MCI participants are obtained at baseline and the reference standard is obtained by follow-up over time (at least 12 months). We will include case-control studies (nested or not) with delayed verification of conversion to dementia only if longitudinal prospective studies are not available. We will not consider randomised controlled trials unless a specific analysis of the diagnostic test accuracy of the MMSE is included (RCT for diagnostic performance of MMSE); participants in the control groups of intervention studies might also be considered. We will exclude cross-sectional studies, before-after studies and case reports.

Participants

We will include participants recruited and clinically classified as individuals with MCI at baseline from community, primary care and secondary care settings. We will establish the diagnosis of MCI using Petersen and revised Petersen criteria (Petersen 1999; Petersen 2004), Matthews criteria (Matthews 2008) and/or Clinical Dementia Rating (CDR) = 0.5 (Morris 1993). These criteria include subjective complaints, decline in memory objectively verified by neuropsychological testing in combination with patient history, decline in other cognitive domains, minimal or no impairment in activities of daily living and not meeting the criteria for dementia. We will include all subtypes of MCI participants (amnestic single domain, amnestic multiple domain, non-amnestic single domain and non-amnestic multiple domain). We will exclude studies of participants with a secondary cause of cognitive impairment, namely, current or past alcohol/drug abuse, central nervous system (CNS) trauma (e.g. subdural haematoma), tumour and infection, amongst others.

Index tests

The Mini-Mental State Examination (Folstein 1975), or MMSE, is a simple pen-and-paper test of cognitive function based on a total possible score of 30 points; it includes tests of orientation, concentration, attention, verbal memory, naming and visuospatial skills. In follow-up studies, participants with MCI are evaluated by the MMSE to obtain a baseline score and then are followed for several months to allow identification of new cases of dementia. Its utility as a predictor can be evaluated at several thresholds, some specified in advance and others derived by statistical methods (e.g. logistic regression); optimal cut-offs are established according to sensitivity and specificity figures, amongst others.

Target conditions

The target condition is conversion at follow-up from MCI to Alzheimer's disease dementia or other forms of dementia. We expect to find most studies focused on AD dementia, vascular dementia, Lewy body dementia and fronto-temporal dementia. We will appraise findings separately if studies examining dementias of differing origins have been identified and are included in the review.

Reference standards

Currently, no in vivo gold standard is used for the diagnosis of dementia, and even the value of diagnoses based on neuropathological criteria has been questioned (Scheltens 2011). However, we will use acceptable and commonly used reference standards. Clinical diagnosis after follow-up will include all-cause (unspecified) dementia, according to any recognised diagnostic criteria, for example, the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) and the International Classification of Diseases, Tenth Revision (ICD-10). National Institute of Neurological and Communicative Disorders and Stroke (NINCDS)-Alzheimer’s Disease and Related Disorders Association (ADRDA) criteria (McKhann 1984; McKhann 2011) are the best antemortem clinical consensus gold standard for Alzheimer’s dementia, defining three antemortem groups: probable, possible and unlikely Alzheimer’s dementia. DSM and ICD definitions are also acceptable classifications for diagnosis of eventual Alzheimer’s dementia. The reference standard for Lewy body dementia is the McKeith criteria (McKeith 1996; McKeith 2005), for fronto-temporal dementia the Lund-Manchester criteria (Lund and Manchester Groups 1994) and for vascular dementia the National Institute of Neurological Disorders and Stroke (NINDS)-Association Internationale pour la Recherche et l'Enseignement en Neurosciences (AIREN) criteria (Roman 1993).

Search methods for identification of studies

Electronic searches

We will search MEDLINE (Ovid SP, 1966 to date), EMBASE (Ovid SP, 1982 to date), BIOSIS (Ovid, inception to date), Science Citation Index (ISI Web of Knowledge, inception to date), PsycINFO (Ovid SP, inception to date) and LILACS (BIREME, 1982 to date). The Science Citation Index database includes conference abstracts. See Appendix 1 for a proposed draft strategy to be run in MEDLINE. We will design similarly structured search strategies using search terms appropriate for each database. We will use controlled vocabulary such as MeSH terms and EMTREE where appropriate. We will not use search filters (collections of terms aimed at reducing the number needed to screen) as an overall limiter because those published have not proved sensitive enough (Whiting 2011). We will not apply any language restriction to the electronic searches. We will request a search of the Cochrane Register of Diagnostic Test Accuracy Studies (hosted and maintained by the Cochrane Renal Group) and the specialised register of the CDCIG, ALOIS, which includes both intervention and diagnostic test accuracy studies in dementia. A single researcher with extensive experience of systematic reviews will perform the initial searches.

Searching other resources

We will check the reference lists of all relevant papers for additional studies. We will also search:

  • MEDION database (Meta-analyses van Diagnostisch Onderzoek): www.mediondatabase.nl.

  • DARE (Database of Abstracts of Reviews of Effects): www.york.ac.uk/inst/crd/crddatabases.html.

  • HTA Database (Health Technology Assessments Database, The Cochrane Library).

  • ARIF database (Aggressive Research Intelligence Facility): www.arif.bham.ac.uk.

We will use the 'Related Articles' feature in PubMed to search for additional studies related to the relevant studies already identified. We will examine key studies in citation databases such as the Science Citation Index and Scopus to ascertain further relevant studies. We will identify grey literature through the Science Citation Index, which now includes conference proceedings. We will aim to access theses and PhD abstracts from institutions known to be involved in prospective dementia studies. We will also attempt to contact researchers involved in studies with possibly relevant but unpublished data. We will not perform handsearching, as there is little published evidence of the benefits of handsearching for reports of DTA studies (Glanville 2012).

Data collection and analysis

Selection of studies

We will select studies on the basis of title and abstract screening undertaken by the review authors or by teams of experienced assessors. We will then locate the full paper for each potentially eligible study identified by the search, and two review authors will independently evaluate each study for inclusion or exclusion. We will resolve disagreements by discussion. If this does not prove conclusive, the default position will be to include the study. We will present the study selection process in a PRISMA flow diagram.

Data extraction and management

We will extract data on study characteristics to a study-specific proforma and will include data on assessment of quality and investigation of heterogeneity, as described in Appendix 2. The proforma will have been piloted against ten primary diagnostic studies. Two review authors will extract data. We will dichotomise the results if necessary and cross-tabulate in 2 × 2 tables the index test results (positive or negative) against the target disorder (positive or negative) and will show results in RevMan tables.
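As an illustration only, the dichotomisation and cross-tabulation step described above could be sketched in Python as follows. The names `TwoByTwo` and `cross_tabulate` are hypothetical, the example data are invented, and the rule that scores at or below the cut-off count as test positive is an assumption (consistent with the traditional 23/24 cut-off) rather than something this protocol specifies.

```python
from typing import Iterable, NamedTuple, Tuple


class TwoByTwo(NamedTuple):
    tp: int  # MMSE positive, converted to dementia at follow-up
    fp: int  # MMSE positive, did not convert
    fn: int  # MMSE negative, converted to dementia at follow-up
    tn: int  # MMSE negative, did not convert


def cross_tabulate(records: Iterable[Tuple[int, bool]], cutoff: int) -> TwoByTwo:
    """Dichotomise baseline MMSE scores and cross-tabulate against conversion.

    records: (baseline MMSE score, converted at follow-up) pairs.
    cutoff:  scores at or below this value are treated as test positive
             (assumption; cutoff=23 corresponds to a 23/24 split).
    """
    tp = fp = fn = tn = 0
    for score, converted in records:
        if not 0 <= score <= 30:
            raise ValueError(f"MMSE score out of range: {score}")
        positive = score <= cutoff
        if positive and converted:
            tp += 1
        elif positive:
            fp += 1
        elif converted:
            fn += 1
        else:
            tn += 1
    return TwoByTwo(tp, fp, fn, tn)


# Invented participants, for illustration only.
example = [(22, True), (23, False), (24, True), (27, False), (29, False), (21, True)]
table = cross_tabulate(example, cutoff=23)
```

Running the same records through several cut-offs would give one 2 × 2 table per threshold, which is the form in which multi-threshold studies are usually tabulated.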

Assessment of methodological quality

We will assess the methodological quality of each study by using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) (Whiting 2011a), as recommended by The Cochrane Collaboration. This tool is made up of four domains: patient selection, index test, reference standard and patient flow (see Appendix 3). Each domain is assessed in terms of risk of bias, and the first three domains are also considered in terms of applicability. We will thus report the QUADAS-2 methodological assessment of studies using bespoke tables. Operational definitions describing the use of QUADAS-2 are detailed in Appendix 4.

Statistical analysis and data synthesis

The target condition comprises two categories: (1) dementia (not otherwise specified) and (2) dementia subtypes (Alzheimer’s, vascular, Lewy body, etc.). Studies may detail one or both outcomes. Each of these target conditions will merit a separate meta-analysis.

We will use for all included studies the data in the 2 × 2 tables (showing binary test results cross-classified with the binary reference standard) to calculate sensitivities and specificities, with their 95% confidence intervals. We will present individual study results graphically by plotting estimates of sensitivities and specificities both in a forest plot and in receiver operating characteristic (ROC) space. We will consider these findings in the light of the previous systematic assessment (using QUADAS-2) of the methodological quality of individual studies. We will use RevMan software to document these descriptive analyses and to produce summary ROC curves. If more than one threshold is reported in an individual study, we will present the graphical findings for all thresholds reported. However, to avoid including data from the same study (in the same setting) more than once in the calculation of a summary statistic, we will use only the threshold that is considered to be “standard practice” for the target population in question. We will pool studies only if they are conducted in the same or similar settings. When this has been determined, we will perform a meta-analysis on pairs of sensitivity and specificity. Once all relevant studies have been identified, it will be clear whether most of the studies have reported results with consistent thresholds. If no agreed standard practice is available, we will use the optimal threshold for the index test (the threshold nearest to the upper left corner of the ROC curve).
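The calculations in this paragraph can be sketched as below. This is a minimal illustration, not the RevMan implementation: the protocol does not state which interval method will be used, so the Wilson score interval is an assumption, all function names are hypothetical, and the per-threshold counts are invented. "Nearest to the upper left corner" is read here as the smallest Euclidean distance from the point (1 − specificity, sensitivity) to (0, 1).

```python
import math
from typing import Dict, Tuple


def wilson_ci(successes: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (approximately 95% when z = 1.96)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


def sens_spec(tp: int, fp: int, fn: int, tn: int) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


def closest_to_top_left(tables: Dict[int, Tuple[int, int, int, int]]) -> int:
    """Return the cut-off whose ROC point lies nearest to (FPR = 0, sensitivity = 1)."""
    def distance(cutoff: int) -> float:
        sens, spec = sens_spec(*tables[cutoff])
        return math.hypot(1 - spec, 1 - sens)
    return min(tables, key=distance)


# Invented (tp, fp, fn, tn) counts per MMSE cut-off, for illustration only.
tables = {
    23: (20, 5, 20, 55),
    26: (30, 15, 10, 45),
    28: (36, 36, 4, 24),
}
best = closest_to_top_left(tables)
sens, spec = sens_spec(*tables[best])
sens_low, sens_high = wilson_ci(tables[best][0], tables[best][0] + tables[best][2])
```

Other optimality criteria (for example, Youden's index) can select a different threshold from the same data, so the criterion used should be reported alongside the chosen cut-off.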

We recognise that variation in thresholds may exist and may make the summary points used for plotting the ROC curve in RevMan imprecise; therefore more sophisticated approaches may be used, such as the bivariate random-effects model and the hierarchical summary ROC (HSROC) model. With these methods, the average accuracy of the index test, the average sensitivity and the average specificity can be estimated and adjusted for any covariates that may introduce heterogeneity across studies. In addition, these methods enable incorporation of unexplained variation between studies into the estimates (Leeflang 2008; Reitsma 2005). Categorised covariates can be incorporated in the bivariate model to allow examination of the effects of potential sources of bias and variation across subgroups of studies, as outlined in The Cochrane Handbook for DTA Reviews, Chapter 10. We will assess model fit by using likelihood ratio tests. We will use Stata software, version 12.1 (StataCorp, College Station, Texas), to carry out additional analyses using a bivariate or HSROC approach.

Investigations of heterogeneity

According to Davis et al (Davis 2013), neuropsychological tests for the diagnosis of ADD and other dementias have a common framework for heterogeneity sources.

Index test

  • Thresholds

  • Technical features (including different versions of the test)

  • Operator characteristics (e.g. training)

Target disorder

  • Reference standards used: DSM definition, ICD definition, NINCDS-ADRDA or other classification, including pathological definitions, and operationalisation of these classifications (e.g. individual clinician/algorithm/consensus group)

  • Spectrum of target disorder (may depend on study design)

Target population

  • Age, sex, education, sociocultural variables (social network/social engagement)

  • Other characteristics (e.g. APOE status, definition and duration of MCI at baseline (if applicable))

  • Prevalence in different settings

  • Treatment: previous or current interventions

Study quality

  • Types of studies

  • Prior clinical information to increase the accuracy of the index test

  • Duration of follow-up (measured in years for delayed-verification studies)

  • Loss to follow-up: We will consider separately those studies that have more than 20% drop-outs. If possible, we will perform subgroup analyses
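The loss-to-follow-up rule in the last item above could be applied with a small helper such as the following sketch. The function name and the example counts are hypothetical, and treating exactly 20% as not exceeding the limit is an assumption based on the wording "more than 20%".

```python
def exceeds_dropout_limit(enrolled: int, completed: int, limit: float = 0.20) -> bool:
    """Return True when the proportion lost to follow-up is strictly above the limit."""
    if enrolled <= 0 or not 0 <= completed <= enrolled:
        raise ValueError("need enrolled > 0 and 0 <= completed <= enrolled")
    return (enrolled - completed) / enrolled > limit


# Invented studies as (enrolled, completed follow-up), for illustration only.
studies = {"Study A": (100, 79), "Study B": (100, 85), "Study C": (250, 180)}
flagged = sorted(name for name, (n, done) in studies.items()
                 if exceeds_dropout_limit(n, done))
```

Studies in `flagged` would be the ones considered separately or placed in their own subgroup.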

It is likely that only a handful of studies will be sufficiently robust for inclusion in the meta-analyses, which will allow only one or two sources of heterogeneity to be explored (because of insufficient data). We will investigate heterogeneity in the first instance through visual examination of forest plots of sensitivities and specificities and through visual examination of the ROC plot of the raw data. Main sources of heterogeneity will include index test thresholds, reference standards used for dementia and duration of follow-up, amongst others. We will first investigate their effects by conducting subgroup analyses in RevMan and by including each of these as covariates in the regression analyses. If we identify further likely sources of heterogeneity, we will investigate these by subgroup analyses and, if data allow, will include these as covariates in the regression model, with assistance from the DTA UK Support Unit.

Sensitivity analyses

We will perform sensitivity analyses to determine the effect of excluding studies that are deemed to be at high risk of bias, according to the QUADAS-2 checklist. Additionally, we will perform sensitivity analyses to determine the effect of excluding studies that were flagged as possibly less appropriate for inclusion (when disagreement between authors could not be resolved). The primary analysis will include all studies; the sensitivity analysis will exclude studies of low quality (high likelihood of bias) to determine whether results have been influenced by inclusion of the lower-quality studies.

Assessment of reporting bias

Quantitative methods for exploring reporting bias are not well established for studies of DTA. Therefore, we will not consider funnel plots of the diagnostic odds ratio (DOR) versus the standard error of this estimate.

Acknowledgements

Ingrid Arévalo-Rodríguez is a PhD student at the Department of Pediatrics, Obstetrics and Gynecology and Preventive Medicine of the Universitat Autònoma de Barcelona.

Appendices

Appendix 1. Search strategy (MEDLINE Ovid SP) run for specialised register (ALOIS)

Search narrative: this is a single concept search using only the index test. This was felt to be the simplest and most sensitive approach.

1. MMSE*.ti,ab.

2. sMMSE.ti,ab.

3. Folstein*.ti,ab.

4. MiniMental.ti,ab.

5. "mini mental stat*".ti,ab.

6. 3MS.ti,ab.

7. *mini mental state examination/

8. or/1-7

The MMSE is very widely used, so some studies from which 2 × 2 data can be extracted may not have been designed specifically (or primarily) as diagnostic test accuracy studies and may therefore not mention the test in the bibliographic citation. To counter this as far as possible, we will also draw on the generic searches run for CDCIG’s Specialised Register, ALOIS, to help identify those studies. The strategy used is given below:

The MEDLINE search uses the following concepts:

A Specific neuropsychological tests

B General terms (both free text and MeSH) for tests/testing/screening

C Outcome: dementia diagnosis (unfocused MeSH with diagnostic sub-headings)

D Condition of interest: Dementia (general dementia terms both free text and MeSH – exploded and unfocused)

E Methodological filter: not used to limit the whole search

Concept combination:

1. (A OR B) AND C

2. (A OR B) AND D AND E

3. A AND E

= 1 OR 2 OR 3
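
The concept combination above is plain Boolean set logic. As a minimal sketch, assuming each concept block (A to E) has already been reduced to a set of record identifiers, the final result set could be formed as follows. The function name `combine_concepts` and the record IDs are hypothetical and serve only as illustration; they are not part of the Ovid strategy.

```python
def combine_concepts(a, b, c, d, e):
    """Combine concept result sets as: 1 OR 2 OR 3.

    1. (A OR B) AND C
    2. (A OR B) AND D AND E
    3. A AND E
    """
    tests = a | b              # A OR B: specific tests or general testing terms
    line1 = tests & c          # ... with a dementia diagnosis outcome
    line2 = tests & d & e      # ... with the condition AND the methodological filter
    line3 = a & e              # specific tests AND the methodological filter
    return line1 | line2 | line3


# Hypothetical record IDs retrieved by each concept block (illustrative only).
a = {1, 2, 3}
b = {3, 4, 5}
c = {2, 4, 9}
d = {1, 5, 6}
e = {1, 3, 7}

result = combine_concepts(a, b, c, d, e)
print(sorted(result))  # [1, 2, 3, 4]
```

Note that the methodological filter E restricts only combinations 2 and 3, which is why it is described as not limiting the whole search.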

The search strategy

1. "word recall".ti,ab.

2. ("7-minute screen" OR "seven-minute screen").ti,ab.

3. ("6 item cognitive impairment test" OR "six-item cognitive impairment test").ti,ab.

4. "6 CIT".ti,ab.

5. "AB cognitive screen".ti,ab.

6. "abbreviated mental test".ti,ab.

7. "ADAS-cog".ti,ab.

8. AD8.ti,ab.

9. "inform* interview".ti,ab.

10. "animal fluency test".ti,ab.

11. "brief alzheimer* screen".ti,ab.

12. "brief cognitive scale".ti,ab.

13. "clinical dementia rating scale".ti,ab.

14. "clinical dementia test".ti,ab.

15. "community screening interview for dementia".ti,ab.

16. "cognitive abilities screening instrument".ti,ab.

17. "cognitive assessment screening test".ti,ab.

18. "cognitive capacity screening examination".ti,ab.

19. "clock drawing test".ti,ab.

20. "deterioration cognitive observee".ti,ab.

21. ("Dem Tect" OR DemTect).ti,ab.

22. "object memory evaluation".ti,ab.

23. "IQCODE".ti,ab.

24. "mattis dementia rating scale".ti,ab.

25. "memory impairment screen".ti,ab.

26. "minnesota cognitive acuity screen".ti,ab.

27. "mini-cog".ti,ab.

28. "mini-mental state exam*".ti,ab.

29. "mmse".ti,ab.

30. "modified mini-mental state exam".ti,ab.

31. "3MS".ti,ab.

32. "neurobehavio?ral cognitive status exam*".ti,ab.

33. "cognistat".ti,ab.

34. "quick cognitive screening test".ti,ab.

35. "QCST".ti,ab.

36. "rapid dementia screening test".ti,ab.

37. "RDST".ti,ab.

38. "repeatable battery for the assessment of neuropsychological status".ti,ab.

39. "RBANS".ti,ab.

40. "rowland universal dementia assessment scale".ti,ab.

41. "rudas".ti,ab.

42. "self-administered gerocognitive exam*".ti,ab.

43. ("self-administered" and "SAGE").ti,ab.

44. "self-administered computerized screening test for dementia".ti,ab.

45. "short and sweet screening instrument".ti,ab.

46. "sassi".ti,ab.

47. "short cognitive performance test".ti,ab.

48. "syndrome kurztest".ti,ab.

49. ("six item screener" OR "6-item screener").ti,ab.

50. "short memory questionnaire".ti,ab.

51. ("short memory questionnaire" and "SMQ").ti,ab.

52. "short orientation memory concentration test".ti,ab.

53. "s-omc".ti,ab.

54. "short blessed test".ti,ab.

55. "short portable mental status questionnaire".ti,ab.

56. "spmsq".ti,ab.

57. "short test of mental status".ti,ab.

58. "telephone interview of cognitive status modified".ti,ab.

59. "tics-m".ti,ab.

60. "trail making test".ti,ab.

61. "verbal fluency categories".ti,ab.

62. "WORLD test".ti,ab.

63. "general practitioner assessment of cognition".ti,ab.

64. "GPCOG".ti,ab.

65. "Hopkins verbal learning test".ti,ab.

66. "HVLT".ti,ab.

67. "time and change test".ti,ab.

68. "modified world test".ti,ab.

69. "symptoms of dementia screener".ti,ab.

70. "dementia questionnaire".ti,ab.

71. "7MS".ti,ab.

72. ("concord informant dementia scale" or CIDS).ti,ab.

73. (SAPH or "dementia screening and perceived harm*").ti,ab.

74. or/1-73

75. exp Dementia/

76. Delirium, Dementia, Amnestic, Cognitive Disorders/

77. dement*.ti,ab.

78. alzheimer*.ti,ab.

79. AD.ti,ab.

80. ("lewy bod*" or DLB or LBD or FTD or FTLD or "frontotemporal lobar degeneration" or "frontaltemporal dement*").ti,ab.

81. "cognit* impair*".ti,ab.

82. (cognit* adj4 (disorder* or declin* or fail* or function* or degenerat* or deteriorat*)).ti,ab.

83. (memory adj3 (complain* or declin* or function* or disorder*)).ti,ab.

84. or/75-83

85. exp "sensitivity and specificity"/

86. "reproducibility of results"/

87. (predict* adj3 (dement* or AD or alzheimer*)).ti,ab.

88. (identif* adj3 (dement* or AD or alzheimer*)).ti,ab.

89. (discriminat* adj3 (dement* or AD or alzheimer*)).ti,ab.

90. (distinguish* adj3 (dement* or AD or alzheimer*)).ti,ab.

91. (differenti* adj3 (dement* or AD or alzheimer*)).ti,ab.

92. diagnos*.ti.

93. di.fs.

94. sensitivit*.ab.

95. specificit*.ab.

96. (ROC or "receiver operat*").ab.

97. Area under curve/

98. ("Area under curve" or AUC).ab.

99. (detect* adj3 (dement* or AD or alzheimer*)).ti,ab.

100. sROC.ab.

101. accura*.ti,ab.

102. (likelihood adj3 (ratio* or function*)).ab.

103. (conver* adj3 (dement* or AD or alzheimer*)).ti,ab.

104. ((true or false) adj3 (positive* or negative*)).ab.

105. ((positive* or negative* or false or true) adj3 rate*).ti,ab.

106. or/85-105

107. exp dementia/di

108. Cognition Disorders/di [Diagnosis]

109. Memory Disorders/di

110. or/107-109

111. *Neuropsychological Tests/

112. *Questionnaires/

113. Geriatric Assessment/mt

114. *Geriatric Assessment/

115. Neuropsychological Tests/mt, st

116. "neuropsychological test*".ti,ab.

117. (neuropsychological adj (assess* or evaluat* or test*)).ti,ab.

118. (neuropsychological adj (assess* or evaluat* or test* or exam* or battery)).ti,ab.

119. Self report/

120. self-assessment/ or diagnostic self evaluation/

121. Mass Screening/

122. early diagnosis/

123. or/111-122

124. 74 or 123

125. 110 and 124

126. 74 or 123

127. 84 and 106 and 126

128. 74 and 106

129. 125 or 127 or 128

130. exp Animals/ not Humans.sh.

131. 129 not 130

The searches will identify a large number of citations. We will therefore use a team of trained screeners to work through them.

Appendix 2. Information for extraction to proforma

Bibliographic details of primary paper.

  • Author, title of study, year and journal.

Details of index test.

  • Method of MMSE administration, including who administered and interpreted the test and their training.

  • Thresholds used to define positive and negative tests.

Reference standard.

  • Reference standard used.

  • Method of reference standard administration, including who administered the test and their training.

Study population.

  • Number of participants.

  • Age.

  • Gender.

  • Other characteristics (e.g. APOE status).

  • Settings: (i) community; (ii) primary care; (iii) secondary care outpatients; (iv) secondary care inpatients and residential care.

  • Participant recruitment.

  • Sampling procedures.

  • Time between index test and reference standard.

  • Proportion of people in sample with dementia.

  • Subtype and stage of dementia if available.

  • MCI definition used (if applicable).

  • Duration of follow-up.

  • Attrition and missing data.

Results of the 2 × 2 tables cross-classifying index test results against the reference standard.

Table 1: Conversion from MCI to Alzheimer’s disease dementia

| Index test information | Reference standard information: ADD present | Reference standard information: ADD absent |
| Index test positive | MMSE+ who convert to ADD (TP) | MMSE+ who remain MCI (FP) & MMSE+ who convert to non-ADD (FP) |
| Index test negative | MMSE- who convert to ADD (FN) | MMSE- who remain MCI (TN) & MMSE- who convert to non-ADD (TN) |

Table 2: Conversion from MCI to non-Alzheimer’s disease dementia

| Index test information | Reference standard information: Non-ADD present | Reference standard information: Non-ADD absent |
| Index test positive | MMSE+ who convert to non-ADD (TP) | MMSE+ who remain MCI (FP) & MMSE+ who convert to ADD (FP) |
| Index test negative | MMSE- who convert to non-ADD (FN) | MMSE- who remain MCI (TN) & MMSE- who convert to ADD (TN) |

Table 3: Conversion from MCI to any form of dementia

| Index test information | Reference standard information: Dementia present (any form of dementia) | Reference standard information: Dementia absent |
| Index test positive | MMSE+ who convert to any form of dementia (TP) | MMSE+ who remain MCI (FP) |
| Index test negative | MMSE- who convert to any form of dementia (FN) | MMSE- who remain MCI (TN) |
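
To illustrate how Tables 1 to 3 classify the same participants differently depending on the target condition, the following is a minimal sketch. It assumes each participant is recorded as a pair of MMSE result and follow-up outcome; the outcome labels "ADD", "non-ADD" and "MCI", the function names and the example cohort are hypothetical and are not data from any included study.

```python
from collections import Counter


def condition_present(outcome, target):
    """True if the follow-up outcome counts as the target condition.

    target is "ADD" (Table 1), "non-ADD" (Table 2) or "any" (Table 3).
    """
    if target == "any":
        return outcome in ("ADD", "non-ADD")
    return outcome == target


def two_by_two(participants, target):
    """Cross-classify (mmse_positive, outcome) pairs into TP, FP, FN and TN."""
    counts = Counter()
    for mmse_positive, outcome in participants:
        present = condition_present(outcome, target)
        if mmse_positive:
            counts["TP" if present else "FP"] += 1
        else:
            counts["FN" if present else "TN"] += 1
    return {cell: counts[cell] for cell in ("TP", "FP", "FN", "TN")}


def sensitivity(table):
    # Undefined (division by zero) if no participant has the target condition.
    return table["TP"] / (table["TP"] + table["FN"])


def specificity(table):
    # Undefined (division by zero) if every participant has the target condition.
    return table["TN"] / (table["TN"] + table["FP"])


# Hypothetical cohort of 30 people with MCI at baseline (illustrative only).
participants = (
    [(True, "ADD")] * 6 + [(True, "non-ADD")] * 2 + [(True, "MCI")] * 4
    + [(False, "ADD")] * 2 + [(False, "non-ADD")] * 1 + [(False, "MCI")] * 15
)

for target in ("ADD", "non-ADD", "any"):
    table = two_by_two(participants, target)
    print(target, table, round(sensitivity(table), 2), round(specificity(table), 2))
```

In this sketch, a participant who converts to non-ADD counts towards the "absent" column in Table 1 but towards the "present" column in Tables 2 and 3, which is why the three tables can yield different accuracy estimates from the same cohort.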

Appendix 3. Assessment of methodological quality QUADAS-2

| Domain | Patient selection | Index test | Reference standard | Flow and timing |
| Description | Describe methods of patient selection. Describe included participants (prior testing, presentation, intended use of index test and setting) | Describe the index test and how it was conducted and interpreted | Describe the reference standard and how it was conducted and interpreted | Describe any participants who did not receive the index test(s) and/or reference standard or who were excluded from the 2 × 2 table (refer to flow diagram). Describe the time interval and any interventions between index test(s) and reference standard |
| Signalling questions (yes/no/unclear) | Was a consecutive or random sample of participants enrolled? Was a case-control design avoided? Did the study avoid inappropriate exclusions? | Were the index test results interpreted without knowledge of results of the reference standard? If a threshold was used, was it prespecified? | Is the reference standard likely to correctly classify the target condition? Were the reference standard results interpreted without knowledge of results of the index test? | Did all participants receive a reference standard? Did all participants receive the same reference standard? Were all participants included in the analysis? |
| Risk of bias (high/low/unclear) | Could the selection of participants have introduced bias? | Could the conduct or interpretation of the index test have introduced bias? | Could the reference standard, its conduct or its interpretation have introduced bias? | Could the participant flow have introduced bias? |
| Concerns regarding applicability (high/low/unclear) | Are there concerns that included participants do not match the review question? | Are there concerns that the index test, its conduct or its interpretation differs from the review question? | Are there concerns that the target condition as defined by the reference standard does not match the review question? | |

Appendix 4. Anchoring statements for quality assessment of MMSE diagnostic studies

We provide some core anchoring statements for quality assessment of diagnostic test accuracy of MMSE. These statements are designed for use with the QUADAS-2 tool and were derived during a 2-day, multidisciplinary focus group in 2010. If a QUADAS-2 signalling question for a specific domain is answered “yes”, then the risk of bias can be judged to be “low”. If a question is answered “no”, this indicates risk of potential bias.

The focus group was tasked with judging the extent of the bias for each domain. During this process, it became clear that certain issues were key to assessing quality, whilst others were important to record but were less important for assessing overall quality. To assist, we describe a “weighting” system. When an item is weighted “high risk”, that section of the QUADAS-2 results table is judged to have a high potential for bias if a signalling question is answered “no”. For example, in dementia diagnostic test accuracy studies, ensuring that clinicians performing the dementia assessment are blinded to results of the index test is fundamental. If this blinding was not present, then the item on the reference standard should be scored “high risk of bias”, regardless of the other contributory elements. When an item is weighted “low risk”, it is judged to have a low potential for bias if a signalling question for that section of the QUADAS-2 results table is answered “no”. Overall bias will be judged on whether other signalling questions (with a high risk of bias) for the same domain are also answered “no”. In assessing individual items, the score of “unclear” should be given only if there is genuine uncertainty. In these situations, review authors will contact the relevant study teams for additional information.
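
The weighting rule can be sketched as a small decision function. This is an illustration under stated assumptions rather than the review authors' procedure: the function name and the data layout are hypothetical, and the handling of "unclear" answers (the domain is rated "unclear" when no high-weighted question is answered "no") is an assumption, because the text does not define how "unclear" answers combine.

```python
def domain_risk_of_bias(answers, weights):
    """Derive a domain-level judgement from signalling-question answers.

    answers: question -> "yes", "no" or "unclear"
    weights: question -> "high" or "low" (the weighting given to a "no" answer)
    """
    # A "no" on any high-weighted question makes the whole domain high risk,
    # regardless of the other contributory elements.
    if any(ans == "no" and weights[q] == "high" for q, ans in answers.items()):
        return "high"
    # Assumption (not stated in the text): genuine uncertainty on any question
    # leaves the domain "unclear" pending contact with the study team.
    if any(ans == "unclear" for ans in answers.values()):
        return "unclear"
    # All "yes", or "no" only on low-weighted questions.
    return "low"


# Hypothetical reference-standard domain for one study (illustrative only).
weights = {"acceptable_criteria": "high", "blinded_to_mmse": "high"}
answers = {"acceptable_criteria": "yes", "blinded_to_mmse": "no"}
print(domain_risk_of_bias(answers, weights))  # high
```

In the anchoring statements that follow, every signalling question happens to carry a "high risk" weighting, so in practice any "no" answer would rate its domain as high risk under this sketch.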

Anchoring statements to assist with assessment for risk of bias

Domain 1: Patient selection
Risk of bias: Could the selection of patients have introduced bias? (high/low/unclear)

Was a consecutive or random sample of patients enrolled?
When sampling is used, the methods least likely to cause bias are consecutive sampling and random sampling, which should be stated and/or described. Non-random sampling or sampling based on volunteers is more likely to be at high risk of bias.
Weighting: high risk of bias (no)

Was a case-control design avoided?

Case-control study designs have a high risk of bias, but sometimes they are the only studies available, especially if the index test is expensive and/or invasive. Nested case-control designs (systematically selected from a defined population cohort) are less prone to bias, but they will still narrow the spectrum of participants who receive the index test. Other study designs (both cohort and case-control) that may increase bias are those designs for which the study team deliberately increases or decreases the proportion of participants with the target condition, for example, a population study may be enriched with extra participants with dementia from a secondary care setting.
Weighting: high risk of bias (no)

Did the study avoid inappropriate exclusions?
The study will be automatically graded as unclear if exclusions are not detailed (pending contact with study authors). When exclusions are detailed, the study will be graded as “low risk” if exclusions are believed by the review authors to be appropriate. Exclusions common to many studies of dementia include the following: medical instability; terminal disease; alcohol/substance misuse; concomitant psychiatric diagnosis; and other neurodegenerative condition. However, if “difficult to diagnose” groups are excluded, this may introduce bias, so exclusion criteria must be justified. For a community sample, we would expect relatively few exclusions. Post hoc exclusions will be labelled “high risk” of bias.
Weighting: high risk of bias (no)

Applicability: Are there concerns that included patients do not match the review question? (high/low/unclear)

Included patients should match the intended population as described in the review question. If not already specified in the review inclusion criteria, the setting will be particularly important—the review authors should consider population in terms of symptoms, pretesting and potential disease prevalence. Studies that use very selected participants or subgroups will be classified as having low applicability, unless they are intended to represent a defined target population, for example, people with memory problems referred to a specialist and investigated by lumbar puncture.

Domain 2: Index test
Risk of bias: Could the conduct or interpretation of the index test have introduced bias? (high/low/unclear)

Were MMSE results interpreted without knowledge of the reference standard?
Terms such as “blinded” or “independently and without knowledge of” are sufficient, and full details of the blinding procedure are not required. This item may be scored as “low risk” if explicitly described, or if a clear temporal pattern to the order of testing precludes the need for formal blinding (e.g. all MMSE assessments were performed before the dementia assessment). As most neuropsychological tests are administered by a third party, knowledge of the dementia diagnosis may influence ratings; tests that are self-administered, for example, use of a computerised version, may be associated with less risk of bias.
Weighting: high risk (no)

Were MMSE thresholds prespecified?
For neuropsychological scales, there is usually a score beyond which participants are classified as “test positive” (for the MMSE, lower scores indicate greater impairment); this may be referred to as the threshold, the clinical cut-off or the dichotomisation point. Different thresholds are used in different populations. A study is classified as having higher risk of bias if the study authors define the optimal cut-off post hoc on the basis of their own study data. Some papers use an alternative methodology for analysis that does not use thresholds; these papers should be classified as not applicable.
Weighting: high risk (no)

Were sufficient data on MMSE application given for the test to be repeated in an independent study?
Particular points of interest include method of administration (e.g. self-completed questionnaire vs direct questioning interview); nature of the informant; and language of the assessment. If a novel form of the index test is used, for example, a translated questionnaire, details of the scale should be included and a reference given to an appropriate descriptive text; evidence of validation should be provided.
Weighting: high risk (no)

Applicability: Are there concerns that the index test, its conduct or its interpretation may differ from the review question? (high/low/unclear)

Variations in length, structure, language and/or administration of the index test may affect applicability if they vary from those specified in the review question.

Domain 3: Reference standard
Risk of bias: Could the reference standard, its conduct or its interpretation have introduced bias? (high/low/unclear)

Is the assessment used for clinical diagnosis of dementia acceptable?
Commonly used international criteria to assist with clinical diagnosis of dementia include those detailed in DSM-IV and ICD-10. Criteria specific to dementia subtypes include but are not limited to NINCDS-ADRDA criteria for Alzheimer’s dementia, McKeith criteria for Lewy body dementia, Lund-Manchester criteria for fronto-temporal dementia and NINDS-AIREN criteria for vascular dementia. When the criteria used for assessment are not familiar to the review authors and the Cochrane Dementia and Cognitive Improvement Group, this item should be classified as “high risk of bias”.
Weighting: high risk (no)

Was clinical assessment for dementia performed without knowledge of the MMSE results?
Terms such as “blinded” and “independent” are sufficient, and full details of the blinding procedure are not required. Interpretation of results of the reference standard may be influenced by knowledge of results of the index test.
Weighting: high risk (no)

Applicability: Are there concerns that the target condition as defined by the reference standard does not match the review question? (high/low/unclear)

Some methods of dementia assessment, although valid, may diagnose a far smaller or larger proportion of individuals with the disease than in usual clinical practice. For example, currently the reference standard for vascular dementia may underdiagnose compared with usual clinical practice. In this instance, the item should be rated as having poor applicability.

Domain 4: Participant flow and timing (n.b. refer to, or construct, a flow diagram)
Risk of bias: Could the participant flow have introduced bias? (high/low/unclear)

Was there an appropriate interval between MMSE and the reference standard?
As we test the accuracy of the MMSE test for MCI conversion to dementia, a delay will always be noted between the index test and the reference standard assessments. The time between reference standard and index test will influence the accuracy, and therefore we will note time as a separate variable (both within and between studies) and will test its influence on diagnostic accuracy. We have set a minimum mean time to follow-up assessment of 1 year. If more than 16% of participants undergo assessment for MCI conversion before 9 months, this item will score "no".
Weighting: high risk (no)
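
As a minimal sketch of the interval rule above, assuming each participant's time from MMSE to conversion assessment is available in months, the signalling question could be scored as follows. The function names and follow-up times are hypothetical. The 1-year minimum mean follow-up is checked separately here, because the text does not say that it changes the answer to this item.

```python
def interval_item(follow_up_months):
    """Score the interval question: "no" if more than 16% of participants
    were assessed for MCI conversion before 9 months, otherwise "yes"."""
    if not follow_up_months:
        raise ValueError("at least one follow-up time is required")
    early = sum(1 for months in follow_up_months if months < 9)
    # Integer comparison avoids floating-point rounding at exactly 16%.
    return "no" if early * 100 > 16 * len(follow_up_months) else "yes"


def meets_minimum_mean_follow_up(follow_up_months, minimum_months=12):
    """Check the separately stated minimum mean follow-up of 1 year."""
    return sum(follow_up_months) / len(follow_up_months) >= minimum_months


# Hypothetical follow-up times in months (illustrative only).
study_a = [6, 8, 12, 12, 14, 18, 24, 24, 30, 36]   # 2 of 10 assessed before 9 months
study_b = [8, 12, 12, 12, 14, 18, 24, 24, 30, 36]  # 1 of 10 assessed before 9 months

print(interval_item(study_a), interval_item(study_b))  # no yes
```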

Did all participants receive the same reference standard?
In some scenarios, participants who score “test positive” on the index test may undergo a more detailed assessment for the target condition. When dementia assessment (or reference standard) differs between participants, this should be classified as high risk of bias.
Weighting: high risk (no)

Were all participants included in the final analysis?
If the number of participants enrolled differs from the number of participants included in the 2 × 2 table, the potential for bias exists. If participants lost to follow-up differ systematically from those who remain, then estimates of test performance may differ. If drop-outs are present, these should be accounted for; the maximum proportion of drop-outs for low risk of bias has been specified as 20%. Details of the causes of study drop-outs are crucial, and if such data are missing, the reliability of the conclusions must be questioned.
Weighting: high risk (no)
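
The 20% drop-out limit can be sketched in the same way. This assumes that drop-outs are the difference between the number of participants enrolled and the number included in the 2 × 2 table; the function name and the counts are hypothetical, and in practice the reported causes of drop-out would also inform the judgement.

```python
def all_participants_item(enrolled, analysed):
    """Score the final-analysis question: "yes" if drop-outs are at most 20%
    of those enrolled, otherwise "no"."""
    if enrolled <= 0 or not 0 <= analysed <= enrolled:
        raise ValueError("expected 0 <= analysed <= enrolled and enrolled > 0")
    dropouts = enrolled - analysed
    # Integer comparison so that exactly 20% still counts as within the limit.
    return "yes" if dropouts * 100 <= 20 * enrolled else "no"


# Hypothetical study sizes (illustrative only).
print(all_participants_item(120, 100))  # yes (20 of 120 lost, about 16.7%)
print(all_participants_item(100, 75))   # no (25% lost)
```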

Contributions of authors

All authors contributed to the writing of this protocol.

Declarations of interest

None known.

Sources of support

Internal sources

  • Fundación Universitaria de Ciencias de la Salud, Hospital San José/Hospital Infantil de San José, Bogotá D.C., Colombia.

  • Institute for Clinical Effectiveness and Health Policy IECS, Buenos Aires, Argentina.

  • Iberoamerican Cochrane Centre, Barcelona, Spain.

External sources

  • Agencia de Calidad del Sistema Nacional de Salud, Ministry of Health, Madrid, Spain.
